WO2021072301A1 - Search and recommendation process for identifying useful boundaries in virtual interaction settings - Google Patents


Info

Publication number
WO2021072301A1
Authority
WO
WIPO (PCT)
Prior art keywords
participant
environment
boundaries
space
participants
Prior art date
Application number
PCT/US2020/055122
Other languages
French (fr)
Inventor
Mohammad Keshavarzi
M. Luisa G. CALDAS
Original Assignee
The Regents Of The University Of California
Priority date
Filing date
Publication date
Application filed by The Regents Of The University Of California filed Critical The Regents Of The University Of California
Publication of WO2021072301A1 publication Critical patent/WO2021072301A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304Detection arrangements using opto-electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • H04N7/157Conference systems defining a virtual conference space and using avatars or agents

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A method of providing spatial relationships for a virtual interaction includes analyzing a position of each call participant and the environment for each participant, directing each participant to reposition themselves and any objects as needed for the virtual interaction, searching the environment of each participant for shared space within spatial constraints of each environment of each participant, and providing each participant with a virtual arrangement of other participants in the virtual interaction.

Description

SEARCH AND RECOMMENDATION PROCESS FOR IDENTIFYING USEFUL BOUNDARIES IN VIRTUAL INTERACTION SETTINGS
RELATED APPLICATION
[0001] This application claims priority to and the benefit of US Provisional Patent Application No. 62/913,938 filed October 11, 2019, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
[0002] This disclosure relates to virtual communication systems, more particularly to managing spatial constraints in virtual communication systems.
BACKGROUND
[0003] With the rapid increase in demand for remote communication platforms in workplaces, households, and educational institutes, more forms of effective communication technology have emerged in the past two decades. Advances in consumer-grade Augmented, Virtual, and Mixed Reality (AR/VR/MR) headsets and displays, such as Magic Leap 1®, Microsoft's HoloLens®, VIVE Pro, and Oculus Quest®, have introduced alternative and affordable systems for immersive communication and remote collaboration. While much work has been done in 3D tele-immersion, real-life avatars, and virtual social platforms, there is still one key factor that can cause a major bottleneck in future AR/VR/MR communications: the limited surrounding space of the user in the real world.
[0004] Users of AR/VR technologies are introduced to unlimited spatial data. Within virtual reality setups, the user can virtually locomote across infinite spatial ranges and unlimited virtual environments. In Augmented Reality, unlimited spatial data can be imported into the user's current surroundings. Many of these virtual objects carry no spatial limitations of their own and are restricted only by the user's surrounding constraints. They can be visualized, augmented, and placed anywhere necessary in the space, as long as they remain within the user's environmental boundaries.
[0005] However, this one-way spatial limitation between virtual and real objects does not always apply in communication applications, where two or more users, all having spatial constraints, interact with each other in a spatial setting. All parties of a remote telepresence (or any other virtual communication) setting hold spatial limitations such as room size, furniture settings, etc., and their virtual doubles, or avatars, may not be able to maintain the same spatial relationship and arrangement between the real-world spaces and their corresponding boundaries for all parties. This results in misalignment of head and body gestures, errors in line of sight and spatial sound, and other micro-expression inaccuracies due to the incorrect positioning of each member of the virtual call.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] Figure 1 shows a flowchart of an embodiment of a method to define a virtual environment having a shared space.
[0007] Figure 2 shows an initial setting of participants in a virtual communications system call.
[0008] Figure 3 shows an embodiment of repositioning objects and call participants.
[0009] Figure 4 shows an embodiment of the space of each participant and a search for mutual space.
[00010] Figure 5 shows an embodiment of a virtual arrangement of participants for a virtual communications system call.
[00011] Figures 6-7 show a comparison between available standing-only and standing-and-sitting areas in rooms.
[00012] Figures 8-11 show examples of standable, non-standable, sittable and workable spaces.
[00013] Figures 12-15 show mutual space boundaries for different generations of an embodiment of a search mechanism.
[00014] Figure 16 shows a representation of a furniture rearrangement process.
[00015] Figures 17-18 show the results of an iterative process to increase mutual space with minimal effort.
[00016] Figures 19-22 show views through a mixed reality head-mounted display with mutual shared spaces.
[00017] Figure 23 shows a system in which users can set up a virtual environment for calls.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[00018] Motivated by the challenges discussed above, the embodiments here introduce a search and recommendation process which can identify mutually accessible boundaries for all the parties of a communication setting, such as AR conference calls, virtual calls, tele-immersion, etc., referred to here as 'virtual interactions', and provide each user the exact location to position themselves and where to move surrounding objects so that all parties of the interaction can hold a similar spatial relationship to each other with minimum effort. Such a process would allow all members of the virtual interaction to augment the other members in their own spaces, by considering the spatial limitations of all remote participants in the virtual/augmented/mixed reality interaction. The term 'virtual' as used here encompasses virtual, augmented, and mixed reality environments.
[00019] The process can strongly promote remote communication at all consumer levels, in both commercial and personal settings. It would also benefit remote workplace procedures, allowing workers and employees to communicate efficiently together without needing access to large commercial spaces. Preserving micro-gestures and expressions is another main outcome of this process, maintaining the different attributes of social interaction and effective communication. The cost-benefit of applying this process can be seen in decreased real-estate requirements for communication applications and also in decreased transportation and relocation costs.
[00020] The embodiments can be employed in any augmented reality communication application and can also benefit virtual reality collaborative applications such as games, collaborative design applications, etc. These communication applications may be developed as native programs from the AR platform providers or as third-party cross-platform applications. The process can be implemented as one of the main setup functions of these communication platforms.
[00021] The embodiments of this process can also be applied in tele-training and education applications, where different objects/users with specific spatial limitations need to communicate with a constant spatial arrangement. The system can be scaled for holding large meetings, presentation settings, conferences, and virtual/augmented classrooms, which may require a spatial search to identify boundaries for mutual positioning of each participant. The process can be useful in any communication scenario where natural body interactions and facial expressions play a vital role in achieving higher task performance of participants.
[00022] Figure 1 shows a flowchart of an embodiment of a process to provide a virtual interaction environment. At 10, the embodiments generally begin with an analysis of the participants' positions and their environments, typically using cameras or sensors of the AR/VR system and/or each user's individual headset. The system captures the participants' positions and the objects in their environments and may generate a map of each environment, where the term map could mean any type of space projection used to locate spaces and objects around the users.
[00023] At 12, the system may direct participants to reposition themselves and any objects to make more space available for shared space. The system searches the environments of all the participants of the interaction to find shared space within the spatial constraints of each environment at 14. The repositioning and searching may occur in either order. For example, the system may first search for shared space and then direct the user to move objects and/or reposition themselves. The process may be iterative: the system directs a user to move an object and/or reposition themselves, then searches for shared space. If the shared space is too constrained, the system may direct the user to move an object, the same or a different object, and/or to reposition themselves again.
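The following is a minimal sketch of this reposition-and-search loop, assuming each environment has already been reduced to a 2D free-floor polygon. The helper names, the use of the shapely library, and the 2 m² area threshold are illustrative assumptions only, not the disclosed implementation.

```python
from shapely.geometry import box

def search_mutual_space(spaces):
    """Intersect every participant's free-floor polygon (assumed already aligned)."""
    shared = spaces[0]
    for s in spaces[1:]:
        shared = shared.intersection(s)
    return shared

def negotiate(spaces, suggest_move, min_area=2.0, max_rounds=5):
    """Alternate searching for shared space (14) and directing repositioning (12)."""
    shared = search_mutual_space(spaces)
    for _ in range(max_rounds):
        if shared.area >= min_area:
            break
        # suggest_move is a callback that directs one participant to reposition
        # themselves or move an object, and returns the updated free polygons.
        spaces = suggest_move(spaces, shared)
        shared = search_mutual_space(spaces)
    return shared

# Example: two already-aligned free-floor rectangles overlap in a 2 m x 3 m region.
print(round(search_mutual_space([box(0, 0, 3, 4), box(1, 0, 4, 3)]).area, 2))  # 6.0
```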
[00024] As shown in Figure 2, this particular example involves four users/participants, 20, 30, 40, and 50, each in slightly different environments 28, 38, 48, and 58, respectively, with different objects around them. User 40, for example, has a table 42 in front of the user. User 30 is facing away from the open space in user 30's environment 38.
[00025] In Figure 3, one can see that the system has recommended that users 20 and 30 move to different positions, and that user 40 should move the table 42 out of the way. In Figure 4, the mapping regions 24, 34, 44, and 54 show results of a search process that searches for mutual space between the participants in their environments. While the mapping in each space is oriented differently, one can see that the shape is consistent between mappings. As discussed above, the search process may occur before, after or simultaneous with object and participant repositioning. While the claims may imply a particular order, no order is intended and should not be implied.
[00026] The amount of time a given spacing process takes may depend upon the user settings. If the settings specify no moving of furniture, the process may only involve single-criterion optimization. This will typically produce a faster result with no physical effort, but the shared space may be smaller. If the user settings allow moving of furniture, the process involves multi-criteria optimization. It will more than likely take longer but result in more shared space. The settings may allow the users to make incremental steps, such as adding a percentage of shared space. The settings may also allow for the designation of 'sittable' areas and 'workable' areas, instead of only standing areas. If the activity being performed allows sitting, then the system may add the edge of a sittable object such as a chair, couch, or bed to the mutual space. Similarly, working areas can be defined by adding the surfaces of tables, desks, or coffee tables.
[00027] In Figure 5, having identified mutual space within the users’ spatial constraints and boundaries, the process gives each participant an arrangement of the other participants for the interaction at 16 of Figure 1. This arrangement will be referred to as a ‘virtual arrangement’ because the other 3 participants in this particular example are not actually located in the user's environment. The arrangement for each user is in that space of the user that is mutual with the other users’ spaces. Each participant may have a slightly different arrangement of the participants, which may result from the different orientations of the shared space shown in Figure 3.
[00028] For example, in user 20's environment 28 and in user 50's environment 58, user 20 is in the top center of the arrangement of participants, but in user 30's arrangement in environment 38 user 20 is in the lower left, and in user 40's environment 48 user 20 is at the bottom center. In some instances, a user may have the same position relative to the other participants in every environment; in other instances, not. In the example given, user 40 is in a different position relative to the other users in each environment.
[00029] The shared space boundaries have limits. The boundaries are shown as the overlying colored region on the floor of the virtual space. The system may send a participant a notification or render a cautionary visualization if that participant approaches the edge of the shared space, similar to visualizations used in VR games when the user approaches a limit of the play area. If the call participants need more mutual space, for example to examine a large object or play a game, the users can re-run the spacing process to search for a larger space and may have the ability to input the dimensions of the desired larger space. The users may also change the settings discussed above. In one embodiment, the shared space may take the form of a 3D 'cage' within the display to show call participants the boundaries.
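As one illustration of such a boundary check (not the disclosed implementation), the warning can be triggered from the distance between a participant's tracked floor position and the shared-space polygon; the 0.5 m margin and the message strings below are assumptions.

```python
from shapely.geometry import Point, Polygon

def boundary_warning(shared: Polygon, position_xy, margin=0.5):
    """Return a cautionary message when a participant nears the shared-space edge."""
    p = Point(position_xy)
    if not shared.contains(p):
        return "outside shared space"
    if shared.exterior.distance(p) < margin:   # distance to the boundary polyline
        return "approaching shared-space boundary"
    return None  # well inside the mutual space; no warning needed
```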
[00030] Referring back to Figure 1, the process of analyzing the position of participants and the environment at 10 may include semantic segmentation of the surrounding environment. This may then be used in generating a topological scene graph for mutual space identification. This process may include moving objects around to maximize space, discussed in more detail later.
[00031] Given a closed 3D room space in R3, one can project its enclosure, such as floors, ceilings, and walls, via an orthographic projection to form a 2D projection, commonly known as the floor plan of the space. If one assigns the (x, y) coordinates to the floor-plan plane and the z coordinate perpendicular to the floor-plan plane, simplifying the optimization problems onto the (x, y) plane significantly reduces the complexity of the algorithms. It also implies an assumption that no two objects overlap on the (x, y) plane at different z values. Nevertheless, such simplification is reasonable for analyzing the majority of room structures and thus does not compromise the generality of the analysis provided herein.
[00032] The process then defines, for each user i, their own room space expressed as a 2D floor plan Ri. Each k-th object, such as furniture, in Ri is denoted as Oi,k. The collection of all ni objects in Ri is denoted as Oi = {Oi,1, ..., Oi,ni}; ∂Oi,k represents the boundary of the object Oi,k. Similarly, ∂Ri represents the boundary of the room Ri. Finally, the process defines the area function as K(·).
[00033] Given the measurement of the surrounding physical environments as large sets of point cloud data, one can take advantage of the semantic segmentation methods widely investigated in computer vision literature to segment their spatial boundaries and obtain their geometric properties, such as dimensions, position and orientation, object classification, functional shapes, and their weights. In doing so, one can convert the 3D point cloud data to labeled objects Oi,k with a bounding box as
Oi,k = {(x, y) : xmin ≤ x ≤ xmax, ymin ≤ y ≤ ymax}.
[00034] The discussion here excludes lightweight objects, such as pillows, alarm clocks, laptops, etc., positioned on larger furniture. This allows simplification of calculations in the next steps as the process assumes these lightweight objects can be easily moved by the users and do not need to be considered in the optimization criteria. Such classification is dependent on the output labeled object categories above.
[00035] The implementation of a computer vision algorithm is beyond the scope of the discussion here. The experiments below may employ a modified version of Matterport 3D®. The process and system here can employ any robust semantic segmentation algorithms, as long as they provide bounding box coordinates for each object category.
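As one illustration (not the disclosed implementation), the segmentation output can be held in a simple labeled bounding-box structure on the (x, y) floor-plan plane. The class name, field names, example dimensions, and the use of the shapely library below are assumptions made for the sketch.

```python
from dataclasses import dataclass
from shapely.geometry import box, Polygon

@dataclass
class SceneObject:
    label: str            # e.g. "table", "sofa", "bed"
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    movable: bool = True  # lightweight objects are excluded upstream

    @property
    def footprint(self) -> Polygon:
        """Floor-plan projection O_i,k of the object's bounding box."""
        return box(self.x_min, self.y_min, self.x_max, self.y_max)

# Example room R_i: a 4 m x 5 m floor plan containing two pieces of furniture.
room = box(0.0, 0.0, 4.0, 5.0)
objects = [SceneObject("bed", 0.0, 3.0, 2.0, 5.0),
           SceneObject("table", 2.8, 0.0, 4.0, 1.2)]
```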
[00036] After identifying the bounding box, orientation, and category type of each object in the scene Ri, a topological graph is readily generated that describes the relationships and constraints of the objects with one another within Ri. This allows the process to identify usable spatial functions, such as standing in virtual immersion, located between the objects. The process categorizes this type of function as standalone spatial functions, and their spaces are called standalone spaces.
[00037] A topological scene graph will also allow identification of other spatial functions on the objects themselves, such as sitting on a chair and working at a table. Note that functions such as sitting or working are also constrained by the distances between the object that performs the function and its adjacent objects. For example, a side of a table cannot be utilized for working purposes if that side is adjacent to other furniture or building elements such as walls, doors, etc. The process categorizes this type of function as auxiliary spatial functions, and their spaces are called auxiliary spaces.
[00038] The embodiments use two spatial functions, standable and sittable, as an example to demonstrate how to integrate both standalone spatial functions and auxiliary spatial functions in the optimization of contextual mutual spaces for multi-user interaction in AR/VR.
[00039] One should note that standalone spaces and auxiliary spaces are not mutually exclusive. For example, the embodiments assume that a standable space is sittable as well. However, the converse may not be true. For example, a portion of a sittable space may involve part of a bed object, which is not assumed to be standable. Such contextual constraints can be highly customizable based on the content of the AR/VR application. But the framework of the embodiments is general enough to accommodate other contextual interpretations of the standalone spatial functions and auxiliary spatial functions.
[00040] One embodiment uses a doubly-linked data structure to construct the graph. For each side face of an object's bounding box, the process defines the closest adjacent objects to that face and calculates the distance between the object and the specified face. This information is stored at the object level, where topological distances and constraints are referenced using pointers.
[00041] Mathematically, for each object Oi,k, the process defines the function δXmax(Oi,k) as the shortest distance between the points in Oi,k that have the maximal x value and the other objects, including the boundary of Ri. Similarly, the process defines the functions δXmin(·), δYmax(·), and δYmin(·).
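These topological distances can be sketched as face-to-neighbour clearances computed directly from the bounding boxes. This continues the SceneObject example above; the dictionary layout is an illustrative assumption and stands in for the disclosed doubly-linked pointer structure.

```python
from shapely.geometry import LineString

def face_clearances(obj, all_objects, room):
    """Clearance from each bounding-box face to the nearest other object or wall."""
    x0, y0, x1, y1 = obj.footprint.bounds
    faces = {
        "dXmax": LineString([(x1, y0), (x1, y1)]),
        "dXmin": LineString([(x0, y0), (x0, y1)]),
        "dYmax": LineString([(x0, y1), (x1, y1)]),
        "dYmin": LineString([(x0, y0), (x1, y0)]),
    }
    neighbours = [o.footprint for o in all_objects if o is not obj]
    neighbours.append(room.exterior)          # the room walls also bound each face
    return {name: min(face.distance(n) for n in neighbours)
            for name, face in faces.items()}

# A flat scene graph keyed by object label (assumed unique labels for brevity).
scene_graph = {o.label: face_clearances(o, objects, room) for o in objects}
```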
[00042] Figures 6 and 7 show more detailed views of bounding boxes for standable and sittable spaces. In Figure 6, user 30 in space 38 and user 40 in space 48 have standable space anywhere but in the bounding boxes 31 and 41. In Figure 7, the space inside the bounding boxes 31 and 41 identified by the additional boundaries 33 and 43 is sittable space. These bounding boxes are then used in the mutual space determinations.
[00043] Mutual space determinations identify the geometrical boundaries of available spaces in each room and then align the calculated boundaries of all rooms to achieve maximum consensus on mutual spaces.
[00044] Using the geometrical and topological properties extracted in previous steps, the process can calculate available spaces in each room based on two categories, namely, the standalone spaces and auxiliary spaces. In one embodiment, the process will formulate the calculation of the two most typical spatial functions as examples again, standable and sittable spaces.
[00045] Standing spaces consist of the volume of the room in which no object located within a human user's height range is present. In such spaces, user movement can be performed freely without any risk of colliding with an object in the surrounding physical environment. Activities such as intense gaming or performative arts can be safely executed within these boundaries. Such spaces are also suitable for virtual reality experiences, where users may not be aware of their physical surroundings.
[00046] One embodiment calculates the available standing space S(Ri) for room Ri as follows:

S(Ri) = Ri \ (Oi,1 ∪ Oi,2 ∪ ... ∪ Oi,ni).   (1)
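A minimal sketch of equation (1), continuing the example scene above; the use of shapely's polygon difference and union is an implementation assumption.

```python
from shapely.ops import unary_union

def standable_space(room, objects):
    """Equation (1): the room footprint minus the union of all object footprints."""
    occupied = unary_union([o.footprint for o in objects])
    return room.difference(occupied)

S = standable_space(room, objects)
print(round(S.area, 2))  # free floor area of the example room, in square metres
```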
[00047] The calculation of maximal sittable spaces is more involved than that of the standable spaces above. As mentioned before, sittable spaces normally extend the standable spaces by adding areas on which humans are able to sit. Furniture types such as sofas, chairs, and beds include sitting areas that can extend the usable spaces of a room for social functions such as general meetings, design reviews, and conference calls.
[00048] To start, the process defines a sittable threshold ε(Oi,k) to calculate the sittable area within the bounding box of the object Oi,k. In other words, ε(Oi,k) is the maximum distance inward from an edge of the object's bounding box that can be comfortably sat on. One can use measurements from architectural graphic standards to define the ε of each furniture type. If object O is classified as non-sittable, then ε(O) = 0.
[00049] Therefore, one can first calculate the non-sittable area of an object O as

N(O) = { p ∈ O : B(p, ε(O)) ⊆ O },   (2)

where B(p, ε(O)) is a ball in R2 centered at p and with radius ε(O).
[00050] One should note that sittable spaces do not necessarily comprise only objects to be sat on, but rather describe an area in which a sittable object can be placed. For example, while an individual may not be able to comfortably sit on the top of a table, the foot space below the table can be considered sittable space. Therefore, in such a context the sittable area of the room is always larger than its standable area.
[00051] Moreover, the sittable areas of each object in the room are constrained by the topological positioning of the object. If any of the object's boundaries is adjacent to a non-sittable object (such as a wall, bookshelf, etc.) or does not contain enough standable area between itself and a non-sittable object, the sittable area of that side should be excluded. For instance, if a table is positioned in the center of a room, with no other non-sittable object around it, the sittable area would be calculated by applying the sittable threshold to all four sides of the table's boundaries. However, if the table is positioned in the corner of the room, then there will be no sittable area accumulated for the sides that are adjacent to the wall.
[00052] To simplify calculations, one can define a surrounding boundary threshold ρ(O) for object O, which measures the distance outward from any of the object's boundary points that allows that point to remain part of the sittable space of the object. In other words, if a boundary point is within distance ρ of other objects or the room boundary, then that point cannot be sat on. C(Oi,k), defined below, collects all such points for exclusion from Oi,k in room Ri:

C(Oi,k) = { p ∈ Oi,k : B(p, ρ(Oi,k)) ∩ Oi,l ≠ ∅ for some l ≠ k, or B(p, ρ(Oi,k)) ∩ ∂Ri ≠ ∅ },   (3)

where ∅ denotes the empty set. Therefore, the sittable space of each object O is simply defined as

A(O) ≐ O \ (N(O) ∪ C(O)).   (4)

Finally, the total sittable space A(Ri) for the room Ri is

A(Ri) = S(Ri) ∪ A(Oi,1) ∪ ... ∪ A(Oi,ni).   (5)
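A sketch of equations (2)-(5) using polygon buffers, reusing the standable_space helper and example scene above. The single uniform ε and ρ values are simplifying assumptions; the disclosure assigns ε per furniture type, with ε = 0 for non-sittable objects.

```python
from shapely.ops import unary_union

def sittable_space(room, objects, eps=0.70, rho=0.40):
    """Equations (2)-(5) with one ε and ρ for every object (a simplification)."""
    total = standable_space(room, objects)            # S(R_i), reused from above
    for o in objects:
        fp = o.footprint
        ring = fp.difference(fp.buffer(-eps))         # fp minus N(O): the ε-deep edge band
        blockers = [p.footprint.buffer(rho) for p in objects if p is not o]
        blockers.append(room.exterior.buffer(rho))    # points within ρ of the walls
        ring = ring.difference(unary_union(blockers)) # remove C(O)
        total = total.union(ring)                     # A(R_i) = S(R_i) ∪ ⋃_k A(O_i,k)
    return total

print(round(sittable_space(room, objects).area, 2))   # larger than the standable area
```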
[00053] Now one can consider an immersive experience where there are m subjects and therefore m room spaces (R1, R2, ..., Rm), respectively. Then, in the (x, y)-coordinates, one can define a rigid-body motion in R2 as G(F, θ), where θ describes a translation and a rotation.
[00054] To maximize a mutual standable space, one can apply one G(Si, θi) to each individual standable space Si for the i-th user. The optimal rigid-body motions then maximize the area of the intersection space:

(θ1*, ..., θm*) = argmax over (θ1, ..., θm) of K( G(S1, θ1) ∩ G(S2, θ2) ∩ ... ∩ G(Sm, θm) ).   (6)

Then the maximal mutual standable space can be calculated as

MS(R1, ..., Rm) = G(S1, θ1*) ∩ G(S2, θ2*) ∩ ... ∩ G(Sm, θm*).   (7)

Similarly, one can calculate the maximal mutual sittable space MA(R1, ..., Rm) by substituting the sittable spaces A(Ri) for the standable spaces in (6) and applying the resulting rigid-body motions as in (7).
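A sketch of the alignment search in (6)-(7): random sampling of rotations and translations stands in here for the genetic search the disclosure actually uses, and the sampling ranges, iteration count, and fixing of the first room are arbitrary assumptions.

```python
import random
from shapely import affinity

def apply_motion(space, theta):
    """Rigid-body motion G(., θ): rotate about the centroid, then translate."""
    angle, dx, dy = theta
    return affinity.translate(affinity.rotate(space, angle, origin="centroid"), dx, dy)

def maximal_mutual_space(spaces, iterations=2000, seed=0):
    rng = random.Random(seed)
    best_area, best = -1.0, None
    for _ in range(iterations):
        thetas = [(0.0, 0.0, 0.0)] + [(rng.uniform(0.0, 360.0),     # first room stays fixed
                                       rng.uniform(-3.0, 3.0),
                                       rng.uniform(-3.0, 3.0))
                                      for _ in spaces[1:]]
        moved = [apply_motion(s, t) for s, t in zip(spaces, thetas)]
        shared = moved[0]
        for m in moved[1:]:
            shared = shared.intersection(m)
        if shared.area > best_area:
            best_area, best = shared.area, shared
    return best                                      # approximation of MS(R_1, ..., R_m)
```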
[00055] In the event where individual spaces Ri include movable furniture, additional optimization can be considered to potentially increase the maximal mutual spaces. Diverging from merely considering rigid-body motions to transform just the coordinate representation of the spaces, the embodiments consider moving furniture objects in space, which has an additional cost of human effort. Consequently, the embodiments may formulate this effort as part of the optimization objective.
[00056] More specifically, given a rigid-body motion G, one can define ||T(G)|| as the Euclidean distance of its translation vector. Then one can define

E(O, G) = w(O) · ||T(G)||,   (8)

where w is a given parameter that approximates the weight of each object. Note that such a weight estimate can be looked up using architecture standards or extracted from any other public datasets. Hence, if a room space Ri has ni objects, then the total effort to re-arrange the space is:

E(Ri) = E(Oi,1, Gi,1) + E(Oi,2, Gi,2) + ... + E(Oi,ni, Gi,ni),   (9)

where Θi = {Gi,1, ..., Gi,ni} denotes the collection of ni rigid-body motion parameters.
[00057] Since solving for the optimal object transformation is an NP-hard problem, the embodiments demonstrate a heuristic-based but practical algorithm to optimize it in a step-by-step greedy fashion:

Ks = max over Θ1(s), ..., Θm(s) of K( G(S1(Θ1(s)), θ1) ∩ ... ∩ G(Sm(Θm(s)), θm) ), subject to Ks > Ks-1,   (10)

where Ks indicates the area value at the s-th step with respect to the transformation coefficients Θ1(s), ..., Θm(s). The iteration stops if the optimization cannot further increase the area of the mutual space.
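A sketch of a greedy, effort-aware rearrangement loop in the spirit of (8)-(10), reusing the SceneObject, standable_space, and maximal_mutual_space helpers above. The candidate move offsets, the gain-per-effort acceptance rule, and the omission of object-to-object collision checks are simplifying assumptions, not the disclosed algorithm.

```python
import math
from dataclasses import replace

def greedy_rearrange(room, objects, other_spaces, weights, step=0.25, max_steps=10):
    """Accept, one at a time, the object move with the best area gain per unit effort."""
    area = maximal_mutual_space([standable_space(room, objects)] + other_spaces).area
    for _ in range(max_steps):
        best = None
        for k, o in enumerate(objects):
            for dx, dy in [(step, 0), (-step, 0), (0, step), (0, -step)]:
                moved = replace(o, x_min=o.x_min + dx, x_max=o.x_max + dx,
                                   y_min=o.y_min + dy, y_max=o.y_max + dy)
                if not room.contains(moved.footprint):
                    continue                          # the object must stay inside the room
                trial = objects[:k] + [moved] + objects[k + 1:]
                gain = maximal_mutual_space([standable_space(room, trial)]
                                            + other_spaces).area - area
                effort = weights[o.label] * math.hypot(dx, dy)   # E(O, G) from (8)
                if gain > 0 and (best is None or gain / effort > best[0]):
                    best = (gain / effort, trial, gain)
        if best is None:
            break                                     # no move enlarges the mutual space
        _, objects, gain = best
        area += gain
    return objects, area
```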
[00058] To comprehensively observe how the search and recommendation system performs given various room types with different spatial organizations, the experiment takes advantage of available 3D datasets to experiment with large quantities of real-world case studies. The embodiment uses the Matterport 3D® dataset, randomly samples subsets of varying sizes of 3D scanned scenes, and performs the search and recommendation process on each subset to observe how the mutual spaces are identified and maximized with the algorithm. Matterport 3D® is a large-scale RGB-D (Red Green Blue - Depth) dataset containing 90 building-scale scenes. The dataset consists of various building types with diverse architectural styles, each including numerous spatial functionalities and furniture layouts. Annotations of building elements and furniture are provided with surface reconstructions as well as 2D and 3D semantic segmentation. The experiment initially excludes spaces that are not generally used for multi-user interaction, such as bathrooms, small corridors, stairs, closets, etc. Furthermore, the experiment randomly groups the available rooms into groups of 2, 3, and 4. The experiment utilizes the object category labels provided in the dataset as the ground truth for semantic labeling purposes.
[00059] The experiments implement the framework using the Rhinoceros3D (R3D) software and its development libraries. For each room, the experiment converts the labeling data structure provided by the dataset to the proposed topological scene graph. This provides the system with bounding boxes for each object and the topological constraints for their potential rearrangement. Using such a structure, the experiment was able to extract the standable and sittable spaces for each room based on the proposed methodology. Figures 8-11 illustrate the available standable, non-standable, sittable, and workable boundaries for a sample room processed by the system. A constant ε(Oi,k) = 70 cm can be used for all sittable objects.
[00060] Figure 8 shows standable space in the shading and Figure 9 shows non-standable space for the same room. Figure 10 shows the sittable space shaded, and Figure 11 shows the workable space for the same room.
[00061] Next, the experiment integrates the algorithm with the robust Strength Pareto Evolutionary Algorithm 2 (SPEA 2) available through the Octopus multi-objective optimization tool in R3D. The fitness function (6) above is used to maximize the mutual space for the calculated standable spaces. The genotype comprises the transformation parameters G(F, θ) of each room, allowing free movement and orientation to achieve maximum spatial consensus. Therefore, a total of 3(n−1) genes are allocated for the search process. This process results in the shape, position, and orientation of the maximum mutual boundary of the assigned rooms. The experiment uses a population size of 100, mutation probability of 10%, mutation rate of 50%, and a crossover rate of 80% for the search. As the solution integrates a genetic search, one expects the result to gradually converge to the global optimum. Figures 12-15 show how the mutual space boundary is progressively expanded as the generations of the search increase.
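For illustration only, a generic stand-in for this fitness evaluation (this is not the Octopus/SPEA2 implementation): a flat genome of 3(n−1) genes encodes the rotation and translation of every room but the first, and the fitness is the resulting mutual standable area. It reuses apply_motion from the earlier sketch; the gene scaling ranges are assumptions.

```python
def decode_genome(genome, n_rooms):
    """genome: flat list of 3(n-1) genes in [0, 1] for rooms 2..n (room 1 stays fixed)."""
    thetas = [(0.0, 0.0, 0.0)]
    for i in range(n_rooms - 1):
        a, dx, dy = genome[3 * i: 3 * i + 3]
        thetas.append((a * 360.0, (dx - 0.5) * 6.0, (dy - 0.5) * 6.0))
    return thetas

def fitness(genome, standable_spaces):
    """Objective of (6): area of the intersection of the transformed free spaces."""
    thetas = decode_genome(genome, len(standable_spaces))
    moved = [apply_motion(s, t) for s, t in zip(standable_spaces, thetas)]
    shared = moved[0]
    for m in moved[1:]:
        shared = shared.intersection(m)
    return shared.area   # to be maximized by the evolutionary search
```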
[00062] In Figures 12-15, the mutual space boundaries 66 (blue) overlay the standable areas (green) and the non-standable spaces (red). The three rooms represented are a living room 60 and two different bedrooms 62 and 64. Each figure shows a successive generation of the processing: Figure 12 shows generation 5, Figure 13 generation 12, Figure 14 generation 21, and Figure 15 generation 32.
[00063] Expanding further, the experiment extends the search by manipulating the scene with alternative furniture arrangements. As the objective is to achieve an increased mutual spatial boundary area with minimum effort, one calculates E based on the transformation parameters assigned to each object present in the room. However, in the current implementation, the genetic algorithm integrated in the solution is not capable of handling dynamic genotype values, and therefore cannot update the topological values of each object (δXmax, δXmin, δYmax, δYmin) during the search process. Hence, to avoid transformations which result in physical conflicts of manipulated furniture, the system penalizes phenotypes that contain intersecting furniture within the scene. This penalty is added to the E value, lowering the probability of such phenotypes being selected or surviving throughout the genetic generations.
[00064] The optimization can either be (i) triggered in separate attempts for each step s, where the mutual area value K is constrained based on the resulting step value, or (ii) executed in a single attempt where minimizing E and maximizing K are both set as objective functions. In the latter, Ms is defined as the solution which holds the largest K while E = 0. Executing the optimization as a one-time event is also likely to require additional computational cost due to the added complexity of the solution space. Figure 16 shows a graphical representation of weight determinations compared to free space.
[00065] Figure 16 shows a graphical example of how the process determines which objects to move. A room 70 has a bed 72 and a chair 74, shown on the left side. On the right, the bed is moved, freeing up space 76, and the chair is moved freeing up space 78. The weight of the bed is greater than the weight of the chair and frees up more space.
[00066] Figure 17 illustrates the results for a furniture manipulation optimization task applied to three example rooms. A total of 34 objects are located in the rooms. To shorten the gene length, the process does not apply rotation transformations to objects. The process uses a population size of 250, mutation probability of 10%, mutation rate of 50%, and crossover rate of 80% for the scene manipulation search. The process visualizes the standable, sittable, and mutual boundaries for each spatial expansion step. Each iteration has the corresponding E for each room in the alternative furniture layout. The results in this example indicate the process can identify solutions which increase the maximum mutual boundary area by up to 65% over its initial state before furniture movement.
[00067] The optimization process was able to generate a well-defined Pareto front, as seen in Figure 18, locating both the two extreme points and numerous intermediate trade-off points representing non-dominated solutions. The bottom region of the curve is flat, indicating that for a similar amount of effort, a significant increase in mutual standable area can be achieved. The trade-off frontier thus starts at point Ms, becoming very densely populated in its initial soft slope. This shows that for each modest increase in physical effort, that is, in moving furniture, there can be extensive gains in mutual shareable area, which is an interesting result. After s = 4, the Pareto front becomes increasingly steep, signalling that the user would now have to significantly increase physical effort levels for modest gains in shareable area. Point 4Gs thus seems to indicate a breaking point of diminishing returns.
[00068] Similar to the Ms search, in smaller furniture optimization steps the algorithm seeks solutions which are highly dependent on the transformation parameters G(F, θ) of the room itself, whereas in larger steps one can observe the algorithm correctly moving the objects to the more populated side of the room in order to increase the available empty space. In rooms where objects face the center and empty areas are initially located in the middle portion of the space, one can see the objects being pushed towards the corners or outer perimeter of the room in order to increase the initial unoccupied areas.
[00069] Due to the smaller gene size, calculating the optimal Ms, the maximum mutual space without furniture manipulation, executes much faster than the furniture-manipulation optimization, where the complexity of the search mechanism radically increases due to the additional object transformation parameters. The speed of that optimization is also highly dependent on the transformation range of each object, meaning that objects in larger rooms have more movement options to choose from than those in small, constrained rooms. One can observe an example of this effect in the augmented reality experiment discussed below, where the smaller kitchen space dominates the search process, causing the final mutual outcome between the rooms to maintain a very similar shape to the open boundaries of the smaller space. While such an effect would still provide a well-constrained problem for medium-sized rooms with multiple objects, such as the conference room, there are many possible ways of fitting the smaller space in larger rooms with open spaces, such as the robotics laboratory, resulting in an under-constrained optimization problem.
[00070] To explore the usability aspect of the embodiments solution in real-world scenarios, an experiment deployed the resulting spatial segmentation in augmented reality using the Microsoft Hololens®, a mixed reality head mounted display. In this experiment, three types of rooms were defined as potential telecommunication spaces, shown in Figure 19: a conventional meeting room 84, where a large conference table is placed in the middle of the room and unused spaces are located around the table; a robotics laboratory 82, where working desks and equipment are mainly located around the perimeter of the room, while some larger equipment and a few tables are disorderly positioned around the central section of the lab; and a kitchen space 80, where surrounding appliances and cabinets are present in the scene.
[00071] Within each environment are positions numbered 1-9. Figure 20 shows the views from positions 1 and 2. Figure 21 shows the views from positions 3, 7, and 9. Figure 22 shows the views from positions 4, 5, 6, and 8.
[00072] After the initial scan of the surrounding environment by the user of each room, the geometrical mesh data is sent to a central server for processing. This process happens in an offline manner, as the current Hololens hardware is incapable of processing the computations that the solution would require. In addition, the system scans the space using a Matterport camera and performs the semantic segmentation step using Matterport classifications to locate the bounding boxes of all the furniture located in the room. The bounding box data is then fed to the algorithm for the mutual boundary search. The implementation outputs spatial coordinates for standable and sittable areas which are automatically updated in the Unity Game Engine to be rendered in the Hololenses.
[00073] Figures 19-22 show how the spatial boundary properties are visualized within the Hololens AR experience. The red spaces indicate non-standable objects, the green spaces indicate standable boundaries, and the blue spaces indicate mutual boundaries that are accessible to all users. The visualized boundaries are positioned slightly above the floor level, allowing users to identify the mutual accessible ground between their local surroundings and the remote participants' spatial constraints.
[00074] Visualizing the mutual ground within the space itself using HoloLens allows the understanding of how complex the problem can be when executed in a manual fashion. Some corner spaces that are not typically used as default social areas of a certain room may become the only required common ground for interaction with other rooms. Overcoming this spatial bias is easily executed within the algorithm; meanwhile, this may not happen so easily and instantly when individuals are left to deal with it on their own.
[00075] The system transmits these images to the users through the computing devices with which they connect to the system. Figure 23 shows an embodiment of such a system 90. The user's computing device may consist of anything from a personal computer with a webcam 92 to a virtual, augmented, or mixed reality headset 94, as mentioned above, such as a Microsoft HoloLens® or an Oculus Rift®. The user device will typically have a processor 96, memory 98, display 100, sensors 102, user interface 104, and a network connection 106. The display may consist of a laptop screen, a see-through holographic display, or a head-mounted display as in a virtual reality headset. The sensors could consist of a single webcam, a microphone, multiple cameras, both visible light and infrared, depth sensors, accelerometers, gyroscopes, magnetometers, etc. The headset 94 would have all of the components set out in 92, not shown for simplicity.
[00076] The processor of the user device may execute software code that can perform particular functions based upon input from the sensors. This may include eye tracking, gesture tracking, voice controls, and biometric security such as face or iris recognition, fingerprint ID, etc. The virtual calling system may provide data back to the systems depending upon their capabilities to process the data, such as the type of display, etc.
[00077] When the user sets up or 'calls' the system 110, typically a server or cluster of servers having their own processors such as 102 and memories such as 104, the user inputs will include information from the sensors, possibly in the form of the geometrical mesh data mentioned above, provided to the system to allow the different virtual environments to be configured and the information sent back to the user devices. The system servers such as 110 then gather the information associated with each user to analyze each user's environment. The server then communicates back with each user, depending on that user's settings, as to whether the user needs to reposition themselves, reposition themselves and move objects, and whether the user wants sittable objects in their environment, as examples. When any repositionings and moves have been completed, the user or user device notifies the virtual interaction system, which then sends the virtual environment data back to the users.
[00078] The user devices will then use the information from the system to display the individual users' environments within the capabilities of their computing devices. This includes interactions with the users to provide directions on movement of objects and repositioning, the spatial mapping of the multiple users' spaces, and then the data to allow each user's device to render the virtual space.
[00079] In this manner, the embodiments identify mutually accessible boundaries for all of the participants in a communications setting, provide each user a position to assume during the interaction, and indicate where to move surrounding objects so that all of the participants can hold a similar spatial relationship with each other. This improves the effectiveness and efficiency of communication between the participants, while reducing the amount of space needed to allow the users to meaningfully interact.
[00080] It will be appreciated that variants of the above-disclosed and other features and functions, or alternatives thereof, may be combined into many other different systems or applications. Various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements therein may be subsequently made by those skilled in the art which are also intended to be encompassed by the embodiments.

Claims

WHAT IS CLAIMED IS:
1. A method of providing spatial relationships for a virtual interaction, comprising: analyzing a position of each participant and the environment for each participant based upon user inputs from at least two participant computing devices; directing each participant to reposition themselves and any objects as needed for the virtual interaction by sending direction data to each of the at least two participant computing devices; searching the environment of each participant for mutual space within spatial constraints of each environment of each participant; and providing each participant with a virtual arrangement of other participants in the virtual interaction within boundaries of the mutual space to be displayed on a display of the computing device of each participant.
2. The method as claimed in claim 1, further comprising receiving the user inputs from the at least two participant computing devices, the user inputs based upon data gathered from at least one sensor on each of the at least two participant computing devices.
3. The method as claimed in claim 1, wherein the searching is performed prior to the directing.
4. The method as claimed in claim 1, wherein the directing is performed prior to the searching.
5. The method as claimed in claim 1, wherein the directing and searching are performed in an iterative process.
6. The method as claimed in claim 1, wherein analyzing the position of each participant and the environment for each participant comprises performing semantic segmentation of the environment for each participant to obtain spatial boundaries and geometric properties.
7. The method as claimed in claim 6, further comprising using the spatial boundaries and geometric properties to generate a topological graph of the environment for each participant.
8. The method as claimed in claim 1, wherein searching the environment of each participant for mutual space comprises identifying the geometrical boundaries of available spaces in each environment and aligning the boundaries of all environments to maximize the mutual space.
9. The method as claimed in claim 1, wherein directing the participants to reposition themselves and any objects comprises directing repositioning of objects based upon analyzing an effort to be made to reposition the object against any gained mutual space.
10. The method as claimed in claim 1, further comprising notifying any of the participants when the participant approaches the boundaries of the mutual space.
11. The method as claimed in claim 1, further comprising: receiving a notification from a participant that additional space outside the boundaries is required; and repeating the directing and the searching to identify a new shared space with additional space.
12. The method as claimed in claim 11, wherein receiving the notification comprises receiving a user input of a setting change.
13. The method as claimed in claim 1, wherein the boundaries of the mutual space display as a shaded region on a floor of the virtual environment.
14. The method as claimed in claim 1, wherein the boundaries of the shared space display as a three-dimensional cage.
15. A computing device comprising: a network interface to allow the device to communicate with at least two participants simultaneously; a processor configured to execute code that causes the processor to: receive user inputs containing information from sensors in an environment for each of the at least two participants; determine settings of each of the at least two participants; analyze the user inputs and the settings to determine a mutual space; send messages to the at least two users to direct each of the at least two participants to at least one of re-position or move objects in the environment of each participant; and provide each of the at least two participants with information usable to depict a virtual environment having the mutual space with defined boundaries.
16. The device as claimed in claim 15, wherein the code that causes the processor to analyze the position of each participant and the environment for each participant causes the processor to perform semantic segmentation of the environment for each participant to obtain spatial boundaries and geometric properties.
17. The device as claimed in claim 16, wherein the processor further executes code to cause the processor to use the spatial boundaries and geometric properties to generate a topological graph of the environment for each participant.
18. The device as claimed in claim 15, wherein the code that causes the processor to search the environment of each participant for mutual space comprises identifying the geometric boundaries of available spaces in each environment and aligning the boundaries of all environments to maximize the mutual space.
19. The device as claimed in claim 15, wherein directing the participants to reposition themselves and any objects comprises directing repositioning of objects based upon analyzing an effort to be made to reposition the object against any gained mutual space.
20. The device as claimed in claim 15, wherein the code further causes the processor to notify any of the participants when the participant approaches the boundaries of the mutual space.
21. The device as claimed in claim 15, wherein the code executed by the processor further causes the processor to: receive a notification from a participant that additional space outside the boundaries is required; and identify a new shared space with additional space.
PCT/US2020/055122 2019-10-11 2020-10-09 Search and recommendation process for identifying useful boundaries in virtual interaction settings WO2021072301A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962913938P 2019-10-11 2019-10-11
US62/913,938 2019-10-11

Publications (1)

Publication Number Publication Date
WO2021072301A1 true WO2021072301A1 (en) 2021-04-15

Family

ID=75437760

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/055122 WO2021072301A1 (en) 2019-10-11 2020-10-09 Search and recommendation process for identifying useful boundaries in virtual interaction settings

Country Status (1)

Country Link
WO (1) WO2021072301A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11612817B1 (en) 2021-09-28 2023-03-28 Sony Group Corporation Method for predefining activity zones in an extended reality (XR) environment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180060333A1 (en) * 2016-08-23 2018-03-01 Google Inc. System and method for placement of virtual characters in an augmented/virtual reality environment
WO2018226508A1 (en) * 2017-06-09 2018-12-13 Pcms Holdings, Inc. Spatially faithful telepresence supporting varying geometries and moving users
US20190253667A1 (en) * 2015-08-14 2019-08-15 Pcms Holdings, Inc. System and method for augmented reality multi-view telepresence

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190253667A1 (en) * 2015-08-14 2019-08-15 Pcms Holdings, Inc. System and method for augmented reality multi-view telepresence
US20180060333A1 (en) * 2016-08-23 2018-03-01 Google Inc. System and method for placement of virtual characters in an augmented/virtual reality environment
WO2018226508A1 (en) * 2017-06-09 2018-12-13 Pcms Holdings, Inc. Spatially faithful telepresence supporting varying geometries and moving users

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11612817B1 (en) 2021-09-28 2023-03-28 Sony Group Corporation Method for predefining activity zones in an extended reality (XR) environment
WO2023052859A1 (en) * 2021-09-28 2023-04-06 Sony Group Corporation Method for predefining activity zones in an extended reality (xr) environment

Similar Documents

Publication Publication Date Title
Keshavarzi et al. Optimization and manipulation of contextual mutual spaces for multi-user virtual and augmented reality interaction
KR102601622B1 (en) Contextual rendering of virtual avatars
de Belen et al. A systematic review of the current state of collaborative mixed reality technologies: 2013–2018
US10650106B2 (en) Classifying, separating and displaying individual stories of a three-dimensional model of a multi-story structure based on captured image data of the multi-story structure
US9911232B2 (en) Molding and anchoring physically constrained virtual environments to real-world environments
Wang et al. Mixed reality in architecture, design, and construction
Billinghurst et al. Advanced interaction techniques for augmented reality applications
CN105981076B (en) Synthesize the construction of augmented reality environment
CN107240151B (en) Scene layout saving and reproducing method based on parent-child constraint
EP3769509B1 (en) Multi-endpoint mixed-reality meetings
US9774653B2 (en) Cooperative federation of digital devices via proxemics and device micro-mobility
US9898860B2 (en) Method, apparatus and terminal for reconstructing three-dimensional object
KR20190134030A (en) Method And Apparatus Creating for Avatar by using Multi-view Image Matching
Radu et al. A survey of needs and features for augmented reality collaborations in collocated spaces
Dong et al. Tailored reality: Perception-aware scene restructuring for adaptive vr navigation
WO2021072301A1 (en) Search and recommendation process for identifying useful boundaries in virtual interaction settings
Kim et al. Mutual space generation with relative translation gains in redirected walking for asymmetric remote collaboration
KR20220026186A (en) A Mixed Reality Telepresence System for Dissimilar Spaces Using Full-Body Avatar
Keshavarzi et al. Mutual scene synthesis for mixed reality telepresence
CN107103646B (en) Expression synthesis method and device
KR20210101570A (en) Method and Apparatus of Avatar Placement in Remote Space from Telepresence
Wang et al. Designing interaction for multi-agent cooperative system in an office environment
Forte et al. Teleimmersive archaeology: simulation and cognitive impact
Arora et al. Introduction to 3d sketching
Keshavarzi et al. Synthesizing Novel Spaces for Remote Telepresence Experiences

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20874068

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20874068

Country of ref document: EP

Kind code of ref document: A1