WO2021072301A1 - Search and recommendation process for identifying useful boundaries in virtual interaction settings - Google Patents


Info

Publication number
WO2021072301A1
Authority
WO
WIPO (PCT)
Prior art keywords
participant
environment
boundaries
space
participants
Prior art date
Application number
PCT/US2020/055122
Other languages
French (fr)
Inventor
Mohammad Keshavarzi
M. Luisa G. CALDAS
Original Assignee
The Regents Of The University Of California
Priority date
Filing date
Publication date
Application filed by The Regents Of The University Of California filed Critical The Regents Of The University Of California
Publication of WO2021072301A1 publication Critical patent/WO2021072301A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304Detection arrangements using opto-electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • H04N7/157Conference systems defining a virtual conference space and using avatars or agents

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A method of providing spatial relationships for a virtual interaction includes analyzing a position of each call participant and the environment for each participant, directing each participant to reposition themselves and any objects as needed for the virtual interaction, searching the environment of each participant for shared space within spatial constraints of each environment of each participant, and providing each participant with a virtual arrangement of other participants in the virtual interaction.

Description

SEARCH AND RECOMMENDATION PROCESS FOR IDENTIFYING USEFUL BOUNDARIES IN VIRTUAL INTERACTION SETTINGS
RELATED APPLICATION
[0001] This application claims priority to and the benefit of US Provisional Patent Application No. 62/913,938 filed October 11, 2019, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
[0002] This disclosure relates to virtual communication systems, more particularly to managing spatial constraints in virtual communication systems.
BACKGROUND
[0003] With the rapid increase in demand for remote communication platforms in workplaces, households, and educational institutes, more forms of effective communication technology have emerged in the past two decades. Advances in consumer-grade Augmented, Virtual, and Mixed Reality (AR/VR/MR) headsets and displays, such as Magic Leap 1®, Microsoft's HoloLens®, VIVE Pro, and Oculus Quest®, have introduced alternative and affordable systems for immersive communication and remote collaboration. While much work has been done in 3D tele-immersion, real-life avatars, and virtual social platforms, there is still one key factor that can cause a major bottleneck in future AR/VR/MR communications: the limited surrounding space of the user in the real world.
[0004] Users of AR/VR technologies are introduced to unlimited spatial data. Within virtual reality setups, the user can virtually locomote across infinite spatial ranges and unlimited virtual environments. In Augmented Reality, unlimited spatial data can be imported into the user's current surroundings. Many of these virtual objects carry no spatial limitations of their own and are restricted only by the user's surrounding constraints. They can be visualized, augmented, and placed anywhere necessary in the space, as long as they remain within the user's environmental boundaries.
[0005] However, this one-way spatial limitation between virtual and real objects does not always apply in communication applications, where two or more users, all having spatial constraints, interact with each other in a spatial setting. All parties of a remote telepresence (or any other virtual communication) setting hold spatial limitations such as room size, furniture settings, etc., and their virtual doubles, or avatars, may not be able to maintain the same spatial relationship and arrangement between the real-world spaces and their corresponding boundaries for all parties. This results in misalignment of head and body gestures, errors in line of sight and spatial sound, and other micro-expression inaccuracies due to the incorrect positioning of each member of the virtual call.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] Figure 1 shows a flowchart of an embodiment of a method to define a virtual environment having a shared space.
[0007] Figure 2 shows an initial setting of participants in a virtual communications system call.
[0008] Figure 3 shows an embodiment of repositioning objects and call participants.
[0009] Figure 4 shows an embodiment of the space of each participant and a search for mutual space.
[00010] Figure 5 shows an embodiment of a virtual arrangement of participants for a virtual communications system call.
[00011] Figures 6-7 show a comparison between available standing-only and standing-and-sitting areas in rooms.
[00012] Figures 8-11 show examples of standable, non-standable, sittable and workable spaces.
[00013] Figures 12-15 show mutual space boundaries for different generations of an embodiment of a search mechanism.
[00014] Figure 16 shows a representation of a furniture rearrangement process.
[00015] Figures 17-18 show the results of an iterative process to increase mutual space with minimal effort.
[00016] Figures 19-22 show views through a mixed reality head-mounted display with mutual shared spaces.
[00017] Figure 23 shows a system in which users can set up a virtual environment for calls.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[00018] Motivated by the challenges discussed above, the embodiments here introduce a search and recommendation process which can identify mutually accessible boundaries for all the parties of a communication setting, such as AR conference calls, virtual calls, tele-immersion, etc., referred to here as 'virtual interactions', and provide each user the exact location to position themselves and where to move surrounding objects so that all parties of the interaction can hold a similar spatial relationship to each other with minimum effort. Such a process would allow all members of the virtual interaction to augment the other members in their own spaces, by considering the spatial limitations of all remote participants in the virtual/augmented/mixed reality interaction. The term 'virtual' as used here encompasses virtual, augmented, and mixed reality environments.
[00019] The process can strongly promote remote communication at all consumer levels, in both commercial and personal settings. It would also benefit remote workplace procedures, allowing workers and employees to communicate efficiently together without needing access to large commercial spaces. Preserving micro-gestures and expressions is another main outcome of this process, maintaining the different attributes of social interaction and effective communication. The cost-benefit of applying this process can be seen in decreased real-estate requirements for communication applications and also in decreased transportation and relocation costs.
[00020] The embodiments can be employed in any augmented reality communication application and can also benefit virtual reality collaborative applications such as games, collaborative design applications, etc. These communication applications may be developed as native programs from the AR platform providers or as third-party cross-platform applications. The process can be implemented as one of the main setup functions of these communication platforms.
[00021] The embodiments of this process can also be applied in tele-training and education applications, where different objects/users with specific spatial limitations need to communicate with a constant spatial arrangement. The system can be scaled for holding large meetings, presentation settings, conferences, and virtual/augmented classrooms, which may require a spatial search to identify boundaries for mutual positioning of each participant. The process can be useful in any communication scenario where natural body interactions and facial expressions play a vital role in achieving higher task performance of participants.
[00022] Figure 1 shows a flowchart of an embodiment of a process to provide a virtual interaction environment. At 10, the embodiments generally begin with an analysis of the participants' positions and their environments, typically using cameras or sensors of the AR/VR system and/or each user's individual headset. The system captures the participants' positions and the objects in their environments and may generate a map of each environment, where the term map could mean any type of space projection used to locate spaces and objects around the users.
[00023] At 12, the system may direct participants to reposition themselves and any objects to make more space available for shared space. The system searches the environments of all the participants of the interaction to find shared space within the spatial constraints of each environment at 14. The repositioning and searching may occur in either order. For example, the system may first search for shared space and then direct the user to move objects and/or reposition themselves. The process may be iterative: the system directs a user to move an object and/or reposition themselves, then searches for shared space. If the shared space is too constrained, the system may direct the user to move an object, the same or a different object, and/or to reposition themselves again.
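The following is a minimal sketch of this reposition-and-search loop, assuming each environment has already been reduced to a 2D free-floor polygon. The helper names, the use of the shapely library, and the 2 m² area threshold are illustrative assumptions only, not the disclosed implementation.

```python
from shapely.geometry import box

def search_mutual_space(spaces):
    """Intersect every participant's free-floor polygon (assumed already aligned)."""
    shared = spaces[0]
    for s in spaces[1:]:
        shared = shared.intersection(s)
    return shared

def negotiate(spaces, suggest_move, min_area=2.0, max_rounds=5):
    """Alternate searching for shared space (14) and directing repositioning (12)."""
    shared = search_mutual_space(spaces)
    for _ in range(max_rounds):
        if shared.area >= min_area:
            break
        # suggest_move is a callback that directs one participant to reposition
        # themselves or move an object, and returns the updated free polygons.
        spaces = suggest_move(spaces, shared)
        shared = search_mutual_space(spaces)
    return shared

# Example: two already-aligned free-floor rectangles overlap in a 2 m x 3 m region.
print(round(search_mutual_space([box(0, 0, 3, 4), box(1, 0, 4, 3)]).area, 2))  # 6.0
```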
[00024] As shown in Figure 2, this particular example involves four users/participants, 20, 30, 40, and 50, each in slightly different environments 28, 38, 48, and 58, respectively, with different objects around them. User 40, for example, has a table 42 in front of the user. User 30 is facing away from the open space in user 30's environment 38.
[00025] In Figure 3, one can see that the system has recommended that users 20 and 30 move to different positions, and that user 40 should move the table 42 out of the way. In Figure 4, the mapping regions 24, 34, 44, and 54 show results of a search process that searches for mutual space between the participants in their environments. While the mapping in each space is oriented differently, one can see that the shape is consistent between mappings. As discussed above, the search process may occur before, after or simultaneous with object and participant repositioning. While the claims may imply a particular order, no order is intended and should not be implied.
[00026] The amount of time a given spacing process takes may depend upon the user settings. If the settings specify no moving of furniture, the process may only involve single-criterion optimization. This will typically produce a faster result with no physical effort, but the shared space may be smaller. If the user settings allow moving of furniture, the process involves multi-criteria optimization. It will more than likely take longer but result in more shared space. The settings may allow the users to make incremental steps, such as adding a percentage of shared space. The settings may also allow for the designation of 'sittable' areas and 'workable' areas, instead of only standing areas. If the activity being performed allows sitting, then the system may add the edge of a sittable object such as a chair, couch, or bed to the mutual space. Similarly, working areas can be defined by adding the surfaces of tables, desks, or coffee tables.
[00027] In Figure 5, having identified mutual space within the users’ spatial constraints and boundaries, the process gives each participant an arrangement of the other participants for the interaction at 16 of Figure 1. This arrangement will be referred to as a ‘virtual arrangement’ because the other 3 participants in this particular example are not actually located in the user's environment. The arrangement for each user is in that space of the user that is mutual with the other users’ spaces. Each participant may have a slightly different arrangement of the participants, which may result from the different orientations of the shared space shown in Figure 3.
[00028] For example, in user 20's environment 28 and in user 50's environment 58, user 20 is in the top center of the arrangement of participants, but in user 30's arrangement in environment 38 user 20 is in the lower left, and in user 40's environment 48 user 20 is at the bottom center. In some instances, a user may have the same position relative to the other participants in every environment; in other instances, not. In the example given, user 40 is in a different position relative to the other users in each environment.
[00029] The shared space boundaries have limits. The boundaries are shown as the overlying colored region on the floor of the virtual space. The system may send a participant a notification or render a cautionary visualization if that participant approaches the edge of the shared space, similar to visualizations used in VR games when the user approaches a limit of the play area. If the call participants need more mutual space, for example to examine a large object or play a game, the users can re-run the spacing process to search for a larger space and may have the ability to input the dimensions of the desired larger space. The users may also change the settings discussed above. In one embodiment, the shared space may take the form of a 3D 'cage' within the display to show call participants the boundaries.
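As one illustration of such a boundary check (not the disclosed implementation), the warning can be triggered from the distance between a participant's tracked floor position and the shared-space polygon; the 0.5 m margin and the message strings below are assumptions.

```python
from shapely.geometry import Point, Polygon

def boundary_warning(shared: Polygon, position_xy, margin=0.5):
    """Return a cautionary message when a participant nears the shared-space edge."""
    p = Point(position_xy)
    if not shared.contains(p):
        return "outside shared space"
    if shared.exterior.distance(p) < margin:   # distance to the boundary polyline
        return "approaching shared-space boundary"
    return None  # well inside the mutual space; no warning needed
```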
[00030] Referring back to Figure 1, the process of analyzing the position of participants and the environment at 10 may include semantic segmentation of the surrounding environment. This may then be used in generating a topological scene graph for mutual space identification. This process may include moving objects around to maximize space, discussed in more detail later.
[00031] Given a closed 3D room space in R3, one can project its enclosure, such as floors, ceilings, and walls, via an orthographic projection to form a 2D projection, commonly known as the floor plan of the space. If one assigns the (x, y) coordinates to the floor-plan plane and the z coordinate perpendicular to the floor-plan plane, simplifying the optimization problems onto the (x, y) plane significantly reduces the complexity of the algorithms. It also implies an assumption that no two objects overlap on the (x, y) plane at different z values. Nevertheless, such simplification is reasonable for analyzing the majority of room structures and thus does not compromise the generality of the analysis provided herein.
[00032] The process then defines, for each user i, their own room space expressed as a 2D floor plan Ri. Each k-th object, such as furniture, in Ri is denoted as Oi,k. The collection of all ni objects in Ri is denoted as Oi = {Oi,1, ..., Oi,ni}; ∂Oi,k represents the boundary of the object Oi,k. Similarly, ∂Ri represents the boundary of the room Ri. Finally, the process defines the area function as K(·).
[00033] Given the measurement of the surrounding physical environments as large sets of point cloud data, one can take advantage of the semantic segmentation methods widely investigated in computer vision literature to segment their spatial boundaries and obtain their geometric properties, such as dimensions, position and orientation, object classification, functional shapes, and their weights. In doing so, one can convert the 3D point cloud data to labeled objects Oi,k with a bounding box as
Oi,k = {(x, y) : xmin ≤ x ≤ xmax, ymin ≤ y ≤ ymax}.
[00034] The discussion here excludes lightweight objects, such as pillows, alarm clocks, laptops, etc., positioned on larger furniture. This allows simplification of calculations in the next steps as the process assumes these lightweight objects can be easily moved by the users and do not need to be considered in the optimization criteria. Such classification is dependent on the output labeled object categories above.
[00035] The implementation of a computer vision algorithm is beyond the scope of the discussion here. The experiments below may employ a modified version of Matterport 3D®. The process and system here can employ any robust semantic segmentation algorithms, as long as they provide bounding box coordinates for each object category.
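As one illustration (not the disclosed implementation), the segmentation output can be held in a simple labeled bounding-box structure on the (x, y) floor-plan plane. The class name, field names, example dimensions, and the use of the shapely library below are assumptions made for the sketch.

```python
from dataclasses import dataclass
from shapely.geometry import box, Polygon

@dataclass
class SceneObject:
    label: str            # e.g. "table", "sofa", "bed"
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    movable: bool = True  # lightweight objects are excluded upstream

    @property
    def footprint(self) -> Polygon:
        """Floor-plan projection O_i,k of the object's bounding box."""
        return box(self.x_min, self.y_min, self.x_max, self.y_max)

# Example room R_i: a 4 m x 5 m floor plan containing two pieces of furniture.
room = box(0.0, 0.0, 4.0, 5.0)
objects = [SceneObject("bed", 0.0, 3.0, 2.0, 5.0),
           SceneObject("table", 2.8, 0.0, 4.0, 1.2)]
```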
[00036] After identifying the bounding box, orientation, and category type of each object in the scene Ri, a topological graph is readily generated that describes the relationships and constraints of the objects with one another within Ri. This allows the process to identify usable spatial functions, such as standing in virtual immersion, located between the objects. The process categorizes this type of function as standalone spatial functions, and their spaces are called standalone spaces.
[00037] A topological scene graph will also allow identification of other spatial functions on the objects themselves, such as sitting on a chair and working at a table. Note that functions such as sitting or working are also constrained by the distances between the object that performs the function and its adjacent objects. For example, a side of a table cannot be utilized for working purposes if that side is adjacent to other furniture or building elements such as walls, doors, etc. The process categorizes this type of function as auxiliary spatial functions, and their spaces are called auxiliary spaces.
[00038] The embodiments use two spatial functions, standable and sittable, as an example to demonstrate how to integrate both standalone spatial functions and auxiliary spatial functions in the optimization of contextual mutual spaces for multi-user interaction in AR/VR.
[00039] One should note that standalone spaces and auxiliary spaces are not mutually exclusive. For example, the embodiments assume that a standable space is sittable as well. However, the converse may not be true. For example, a portion of a sittable space may involve part of a bed object, which is not assumed to be standable. Such contextual constraints can be highly customizable based on the content of the AR/VR application. But the framework of the embodiments is general enough to accommodate other contextual interpretations of the standalone spatial functions and auxiliary spatial functions.
[00040] One embodiment uses a doubly-linked data structure to construct the graph. For each side face of an object's bounding box, the process defines the closest adjacent objects to that face and calculates the distance between the object and the specified face. This information is stored at the object level, where topological distances and constraints are referenced using pointers.
[00041] Mathematically, for each object Oi,k, the process defines the function δXmax(Oi,k) as the shortest distance between the points in Oi,k that have the maximal x value and the other objects, including the boundary of Ri. Similarly, the process defines the functions δXmin(·), δYmax(·), and δYmin(·).
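These topological distances can be sketched as face-to-neighbour clearances computed directly from the bounding boxes. This continues the SceneObject example above; the dictionary layout is an illustrative assumption and stands in for the disclosed doubly-linked pointer structure.

```python
from shapely.geometry import LineString

def face_clearances(obj, all_objects, room):
    """Clearance from each bounding-box face to the nearest other object or wall."""
    x0, y0, x1, y1 = obj.footprint.bounds
    faces = {
        "dXmax": LineString([(x1, y0), (x1, y1)]),
        "dXmin": LineString([(x0, y0), (x0, y1)]),
        "dYmax": LineString([(x0, y1), (x1, y1)]),
        "dYmin": LineString([(x0, y0), (x1, y0)]),
    }
    neighbours = [o.footprint for o in all_objects if o is not obj]
    neighbours.append(room.exterior)          # the room walls also bound each face
    return {name: min(face.distance(n) for n in neighbours)
            for name, face in faces.items()}

# A flat scene graph keyed by object label (assumed unique labels for brevity).
scene_graph = {o.label: face_clearances(o, objects, room) for o in objects}
```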
[00042] Figures 6 and 7 show more detailed views of bounding boxes for standable and sittable spaces. In Figure 6, user 30 in space 38 and user 40 in space 48 have standable space anywhere but in the bounding boxes 31 and 41. In Figure 7, the space inside the bounding boxes 31 and 41 identified by the additional boundaries 33 and 43 is sittable space. These bounding boxes are then used in the mutual space determinations.
[00043] Mutual space determinations identify the geometrical boundaries of available spaces in each room and then align the calculated boundaries of all rooms to achieve maximum consensus on mutual spaces.
[00044] Using the geometrical and topological properties extracted in previous steps, the process can calculate available spaces in each room based on two categories, namely, the standalone spaces and auxiliary spaces. In one embodiment, the process will formulate the calculation of the two most typical spatial functions as examples again, standable and sittable spaces.
[00045] Standing spaces consist of the volume of the room in which no object located within a human user's height range is present. In such spaces, user movement can be performed freely without any risk of colliding with an object in the surrounding physical environment. Activities such as intense gaming or performative arts can be safely executed within these boundaries. Such spaces are also suitable for virtual reality experiences, where users may not be aware of their physical surroundings.
[00046] One embodiment calculates the available standing space S(Ri) for room Ri as follows:

S(Ri) = Ri \ (Oi,1 ∪ Oi,2 ∪ ... ∪ Oi,ni).   (1)
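A minimal sketch of equation (1), continuing the example scene above; the use of shapely's polygon difference and union is an implementation assumption.

```python
from shapely.ops import unary_union

def standable_space(room, objects):
    """Equation (1): the room footprint minus the union of all object footprints."""
    occupied = unary_union([o.footprint for o in objects])
    return room.difference(occupied)

S = standable_space(room, objects)
print(round(S.area, 2))  # free floor area of the example room, in square metres
```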
[00047] The calculation of maximal sittable spaces is more involved than that of the standable spaces above. As mentioned before, sittable spaces normally extend the standable spaces by adding areas on which humans are able to sit. Furniture types such as sofas, chairs, and beds include sitting areas that can extend the usable spaces of a room for social functions such as general meetings, design reviews, and conference calls.
[00048] To start, the process defines a sittable threshold ε(Oi,k) to calculate the sittable area within the bounding box of the object Oi,k. In other words, ε(Oi,k) is the maximum distance inward from an edge of the object's bounding box that can be comfortably sat on. One can use measurements from architectural graphic standards to define the ε of each furniture type. If object O is classified as non-sittable, then ε(O) = 0.
[00049] Therefore, one can first calculate the non-sittable area of an object O as

N(O) = { p ∈ O : B(p, ε(O)) ⊆ O },   (2)

where B(p, ε(O)) is a ball in R2 centered at p and with radius ε(O).
[00050] One should note that sittable spaces do not necessarily comprise only objects to be sat on, but rather describe an area in which a sittable object can be placed. For example, while an individual may not be able to comfortably sit on the top of a table, the foot space below the table can be considered sittable space. Therefore, in such a context the sittable area of the room is always larger than its standable area.
[00051] Moreover, the sittable areas of each object in the room are constrained by the topological positioning of the object. If any of the object's boundaries is adjacent to a non-sittable object (such as a wall, bookshelf, etc.) or does not contain enough standable area between itself and a non-sittable object, the sittable area of that side should be excluded. For instance, if a table is positioned in the center of a room, with no other non-sittable object around it, the sittable area would be calculated by applying the sittable threshold to all four sides of the table's boundaries. However, if the table is positioned in the corner of the room, then there will be no sittable area accumulated for the sides that are adjacent to the wall.
[00052] To simplify calculations, one can define a surrounding boundary threshold ρ(O) for object O, which measures the distance outward from any of the object's boundary points that allows that point to remain part of the sittable space of the object. In other words, if a boundary point is within distance ρ of other objects or the room boundary, then that point cannot be sat on. C(Oi,k), defined below, collects all such points for exclusion from Oi,k in room Ri:

C(Oi,k) = { p ∈ Oi,k : B(p, ρ(Oi,k)) ∩ Oi,l ≠ ∅ for some l ≠ k, or B(p, ρ(Oi,k)) ∩ ∂Ri ≠ ∅ },   (3)

where ∅ denotes the empty set. Therefore, the sittable space of each object O is simply defined as

A(O) ≐ O \ (N(O) ∪ C(O)).   (4)

Finally, the total sittable space A(Ri) for the room Ri is

A(Ri) = S(Ri) ∪ A(Oi,1) ∪ ... ∪ A(Oi,ni).   (5)
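A sketch of equations (2)-(5) using polygon buffers, reusing the standable_space helper and example scene above. The single uniform ε and ρ values are simplifying assumptions; the disclosure assigns ε per furniture type, with ε = 0 for non-sittable objects.

```python
from shapely.ops import unary_union

def sittable_space(room, objects, eps=0.70, rho=0.40):
    """Equations (2)-(5) with one ε and ρ for every object (a simplification)."""
    total = standable_space(room, objects)            # S(R_i), reused from above
    for o in objects:
        fp = o.footprint
        ring = fp.difference(fp.buffer(-eps))         # fp minus N(O): the ε-deep edge band
        blockers = [p.footprint.buffer(rho) for p in objects if p is not o]
        blockers.append(room.exterior.buffer(rho))    # points within ρ of the walls
        ring = ring.difference(unary_union(blockers)) # remove C(O)
        total = total.union(ring)                     # A(R_i) = S(R_i) ∪ ⋃_k A(O_i,k)
    return total

print(round(sittable_space(room, objects).area, 2))   # larger than the standable area
```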
[00053] Now one can consider an immersive experience where there are m subjects and therefore m room spaces (R1, R2, ..., Rm), respectively. Then, in the (x, y)-coordinates, one can define a rigid-body motion in R2 as G(F, θ), where θ describes a translation and a rotation.
[00054] To maximize a mutual standable space, one can apply one G(Si, θi) to each individual standable space Si for the i-th user. The optimal rigid-body motions then maximize the area of the intersection space:

(θ1*, ..., θm*) = argmax over (θ1, ..., θm) of K( G(S1, θ1) ∩ G(S2, θ2) ∩ ... ∩ G(Sm, θm) ).   (6)

Then the maximal mutual standable space can be calculated as

MS(R1, ..., Rm) = G(S1, θ1*) ∩ G(S2, θ2*) ∩ ... ∩ G(Sm, θm*).   (7)

Similarly, one can calculate the maximal mutual sittable space MA(R1, ..., Rm) by substituting the sittable spaces A(Ri) for the standable spaces in (6) and applying the resulting rigid-body motions as in (7).
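A sketch of the alignment search in (6)-(7): random sampling of rotations and translations stands in here for the genetic search the disclosure actually uses, and the sampling ranges, iteration count, and fixing of the first room are arbitrary assumptions.

```python
import random
from shapely import affinity

def apply_motion(space, theta):
    """Rigid-body motion G(., θ): rotate about the centroid, then translate."""
    angle, dx, dy = theta
    return affinity.translate(affinity.rotate(space, angle, origin="centroid"), dx, dy)

def maximal_mutual_space(spaces, iterations=2000, seed=0):
    rng = random.Random(seed)
    best_area, best = -1.0, None
    for _ in range(iterations):
        thetas = [(0.0, 0.0, 0.0)] + [(rng.uniform(0.0, 360.0),     # first room stays fixed
                                       rng.uniform(-3.0, 3.0),
                                       rng.uniform(-3.0, 3.0))
                                      for _ in spaces[1:]]
        moved = [apply_motion(s, t) for s, t in zip(spaces, thetas)]
        shared = moved[0]
        for m in moved[1:]:
            shared = shared.intersection(m)
        if shared.area > best_area:
            best_area, best = shared.area, shared
    return best                                      # approximation of MS(R_1, ..., R_m)
```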
[00055] In the event where individual spaces Ri include movable furniture, additional optimization can be considered to potentially increase the maximal mutual spaces. Diverging from merely considering rigid-body motions to transform just the coordinate representation of the spaces, the embodiments consider moving furniture objects in space, which has an additional cost of human effort. Consequently, the embodiments may formulate this effort as part of the optimization objective.
[00056] More specifically, given a rigid-body motion G, one can define ||T(G)|| as the Euclidean distance of its translation vector. Then one can define

E(O, G) = w(O) · ||T(G)||,   (8)

where w is a given parameter that approximates the weight of each object. Note that such a weight estimate can be looked up using architecture standards or extracted from any other public datasets. Hence, if a room space Ri has ni objects, then the total effort to re-arrange the space is:

E(Ri) = E(Oi,1, Gi,1) + E(Oi,2, Gi,2) + ... + E(Oi,ni, Gi,ni),   (9)

where Θi = {Gi,1, ..., Gi,ni} denotes the collection of ni rigid-body motion parameters.
[00057] Since solving for the optimal object transformation is an NP-hard problem, the embodiments demonstrate a heuristic-based but practical algorithm to optimize it in a step-by-step greedy fashion:

Ks = max over Θ1(s), ..., Θm(s) of K( G(S1(Θ1(s)), θ1) ∩ ... ∩ G(Sm(Θm(s)), θm) ), subject to Ks > Ks-1,   (10)

where Ks indicates the area value at the s-th step with respect to the transformation coefficients Θ1(s), ..., Θm(s). The iteration stops if the optimization cannot further increase the area of the mutual space.
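A sketch of a greedy, effort-aware rearrangement loop in the spirit of (8)-(10), reusing the SceneObject, standable_space, and maximal_mutual_space helpers above. The candidate move offsets, the gain-per-effort acceptance rule, and the omission of object-to-object collision checks are simplifying assumptions, not the disclosed algorithm.

```python
import math
from dataclasses import replace

def greedy_rearrange(room, objects, other_spaces, weights, step=0.25, max_steps=10):
    """Accept, one at a time, the object move with the best area gain per unit effort."""
    area = maximal_mutual_space([standable_space(room, objects)] + other_spaces).area
    for _ in range(max_steps):
        best = None
        for k, o in enumerate(objects):
            for dx, dy in [(step, 0), (-step, 0), (0, step), (0, -step)]:
                moved = replace(o, x_min=o.x_min + dx, x_max=o.x_max + dx,
                                   y_min=o.y_min + dy, y_max=o.y_max + dy)
                if not room.contains(moved.footprint):
                    continue                          # the object must stay inside the room
                trial = objects[:k] + [moved] + objects[k + 1:]
                gain = maximal_mutual_space([standable_space(room, trial)]
                                            + other_spaces).area - area
                effort = weights[o.label] * math.hypot(dx, dy)   # E(O, G) from (8)
                if gain > 0 and (best is None or gain / effort > best[0]):
                    best = (gain / effort, trial, gain)
        if best is None:
            break                                     # no move enlarges the mutual space
        _, objects, gain = best
        area += gain
    return objects, area
```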
[00058] To comprehensively observe how the search and recommendation system performs given various room types with different spatial organizations, the experiment takes advantage of available 3D datasets to experiment with large quantities of real-world case studies. The embodiment uses the Matterport 3D® dataset, randomly samples subsets of varying sizes of 3D scanned scenes, and performs the search and recommendation process on each subset to observe how the mutual spaces are identified and maximized with the algorithm. Matterport 3D® is a large-scale RGB-D (Red Green Blue - Depth) dataset containing 90 building-scale scenes. The dataset consists of various building types with diverse architectural styles, each including numerous spatial functionalities and furniture layouts. Annotations of building elements and furniture are provided with surface reconstructions as well as 2D and 3D semantic segmentation. The experiment initially excludes spaces that are not generally used for multi-user interaction, such as bathrooms, small corridors, stairs, closets, etc. Furthermore, the experiment randomly groups the available rooms into groups of 2, 3, and 4. The experiment utilizes the object category labels provided in the dataset as the ground truth for semantic labeling purposes.
[00059] The experiments implement the framework using the Rhinoceros3D (R3D) software and its development libraries. For each room, the experiment converts the labeling data structure provided by the dataset to the proposed topological scene graph. This provides the system with bounding boxes for each object and the topological constraints for their potential rearrangement. Using such a structure, the experiment was able to extract the standable and sittable spaces for each room based on the proposed methodology. Figures 8-11 illustrate the available standable, non-standable, sittable, and workable boundaries for a sample room processed by the system. A constant ε(Oi,k) = 70 cm can be used for all sittable objects.
[00060] Figure 8 shows standable space in the shading and Figure 9 shows non-standable space for the same room. Figure 10 shows the sittable space shaded, and Figure 11 shows the workable space for the same room.
[00061] Next, the experiment integrates the algorithm with the robust Strength Pareto Evolutionary Algorithm 2 (SPEA 2) available through the Octopus multi-objective optimization tool in R3D. The fitness function (6) above is used to maximize the mutual space for the calculated standable spaces. The genotype comprises the transformation parameters G(F, θ) of each room, allowing free movement and orientation to achieve maximum spatial consensus. Therefore, a total of 3(n−1) genes are allocated for the search process. This process results in the shape, position, and orientation of the maximum mutual boundary of the assigned rooms. The experiment uses a population size of 100, mutation probability of 10%, mutation rate of 50%, and a crossover rate of 80% for the search. As the solution integrates a genetic search, one expects the result to gradually converge to the global optimum. Figures 12-15 show how the mutual space boundary is progressively expanded as the generations of the search increase.
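For illustration only, a generic stand-in for this fitness evaluation (this is not the Octopus/SPEA2 implementation): a flat genome of 3(n−1) genes encodes the rotation and translation of every room but the first, and the fitness is the resulting mutual standable area. It reuses apply_motion from the earlier sketch; the gene scaling ranges are assumptions.

```python
def decode_genome(genome, n_rooms):
    """genome: flat list of 3(n-1) genes in [0, 1] for rooms 2..n (room 1 stays fixed)."""
    thetas = [(0.0, 0.0, 0.0)]
    for i in range(n_rooms - 1):
        a, dx, dy = genome[3 * i: 3 * i + 3]
        thetas.append((a * 360.0, (dx - 0.5) * 6.0, (dy - 0.5) * 6.0))
    return thetas

def fitness(genome, standable_spaces):
    """Objective of (6): area of the intersection of the transformed free spaces."""
    thetas = decode_genome(genome, len(standable_spaces))
    moved = [apply_motion(s, t) for s, t in zip(standable_spaces, thetas)]
    shared = moved[0]
    for m in moved[1:]:
        shared = shared.intersection(m)
    return shared.area   # to be maximized by the evolutionary search
```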
[00062] In Figures 12-15, the mutual space boundaries 66 (blue) overlay the standable areas (green) and the non-standable spaces (red). The three rooms represented are a living room 60 and two different bedrooms 62 and 64. Each figure shows a successive generation of the processing: Figure 12 shows generation 5, Figure 13 generation 12, Figure 14 generation 21, and Figure 15 generation 32.
[00063] Expanding further, the experiment extends the search by manipulating the scene with alternative furniture arrangements. As the objective is to achieve an increased mutual spatial boundary area with minimum effort, one calculates E based on the transformation parameters assigned to each object present in the room. However, in the current implementation, the genetic algorithm integrated in the solution is not capable of handling dynamic genotype values, and therefore cannot update the topological values of each object (δXmax, δXmin, δYmax, δYmin) during the search process. Hence, to avoid transformations which result in physical conflicts of manipulated furniture, the system penalizes phenotypes that contain intersecting furniture within the scene. This penalty is added to the E value, lowering the probability of such phenotypes being selected or surviving throughout the genetic generations.
[00064] The optimization can either be (i) triggered in separate attempts for each step s, where the mutual area value K is constrained based on the resulting step value, or (ii) executed in a single attempt where minimizing E and maximizing K are both set as objective functions. In the latter, Ms is defined as the solution which holds the largest K while E = 0. Executing the optimization as a one-time event is also likely to require additional computational cost due to the added complexity of the solution space. Figure 16 shows a graphical representation of weight determinations compared to free space.
[00065] Figure 16 shows a graphical example of how the process determines which objects to move. A room 70 has a bed 72 and a chair 74, shown on the left side. On the right, the bed is moved, freeing up space 76, and the chair is moved freeing up space 78. The weight of the bed is greater than the weight of the chair and frees up more space.
[00066] Figure 17 illustrates the results for a furniture manipulation optimization task applied to three example rooms. A total of 34 objects are located in the rooms. To shorten the gene length, the process does not apply rotation transformations to objects. The process uses a population size of 250, mutation probability of 10%, mutation rate of 50%, and crossover rate of 80% for the scene manipulation search. The process visualizes the standable, sittable, and mutual boundaries for each spatial expansion step. Each iteration has the corresponding E for each room in the alternative furniture layout. The results in this example indicate the process can identify solutions which increase the maximum mutual boundary area by up to 65% over its initial state before furniture movement.
[00067] The optimization process was able to generate a well-defined Pareto front, as seen in Figure 18, locating both the two extreme points and numerous intermediate trade-off points representing non-dominated solutions. The bottom region of the curve is flat, indicating that for a similar amount of effort, a significant increase in mutual standable area can be achieved. The trade-off frontier thus starts at point Ms, becoming very densely populated in its initial soft slope. This shows that for each modest increase in physical effort, that is, in moving furniture, there can be extensive gains in mutual shareable area, which is an interesting result. After s = 4, the Pareto front becomes increasingly steep, signalling that the user would now have to significantly increase physical effort levels for modest gains in shareable area. Point 4Gs thus seems to indicate a breaking point of diminishing returns.
[00068] Similar to the Ms search, in smaller furniture optimization steps the algorithm seeks solutions which are highly dependent on the transformation parameters G(F, θ) of the room itself, whereas in larger steps one can observe the algorithm correctly moving the objects to the more populated side of the room in order to increase the available empty space. In rooms where objects face the center and empty areas are initially located in the middle portion of the space, one can see the objects being pushed towards the corners or outer perimeter of the room in order to increase the initial unoccupied areas.
[00069] Due to the smaller gene size, calculating the optimal Ms, the maximum mutual space without furniture manipulation, executes much faster than the furniture-manipulation optimization, where the complexity of the search mechanism radically increases due to the additional object transformation parameters. The speed of that optimization is also highly dependent on the transformation range of each object, meaning that objects in larger rooms have more movement options to choose from than those in small, constrained rooms. One can observe an example of this effect in the augmented reality experiment discussed below, where the smaller kitchen space dominates the search process, causing the final mutual outcome between the rooms to maintain a very similar shape to the open boundaries of the smaller space. While such an effect would still provide a well-constrained problem for medium-sized rooms with multiple objects, such as the conference room, there are many possible ways of fitting the smaller space in larger rooms with open spaces, such as the robotics laboratory, resulting in an under-constrained optimization problem.
[00070] To explore the usability aspect of the embodiments solution in real-world scenarios, an experiment deployed the resulting spatial segmentation in augmented reality using the Microsoft Hololens®, a mixed reality head mounted display. In this experiment, three types of rooms were defined as potential telecommunication spaces, shown in Figure 19: a conventional meeting room 84, where a large conference table is placed in the middle of the room and unused spaces are located around the table; a robotics laboratory 82, where working desks and equipment are mainly located around the perimeter of the room, while some larger equipment and a few tables are disorderly positioned around the central section of the lab; and a kitchen space 80, where surrounding appliances and cabinets are present in the scene.
[00071] Within each environment are positions numbered 1-9. Figure 20 shows the views from positions 1 and 2. Figure 21 shows the views from positions 3, 7, and 9. Figure 22 shows the views from positions 4, 5, 6, and 8.
[00072] After the initial scan of the surrounding environment by the user of each room, the geometrical mesh data is sent to a central server for processing. This process happens in an offline manner, as the current Hololens hardware is incapable of processing the computations that the solution would require. In addition, the system scans the space using a Matterport camera and performs the semantic segmentation step using Matterport classifications to locate the bounding boxes of all the furniture located in the room. The bounding box data is then fed to the algorithm for the mutual boundary search. The implementation outputs spatial coordinates for standable and sittable areas which are automatically updated in the Unity Game Engine to be rendered in the Hololenses.
[00073] Figures 19-22 show how the spatial boundary properties are visualized within the Hololens AR experience. The red spaces indicate non-standable objects, the green spaces indicate standable boundaries, and the blue spaces indicate mutual boundaries that are accessible to all users. The visualized boundaries are positioned slightly above the floor level, allowing users to identify the mutual accessible ground between their local surroundings and the remote participants' spatial constraints.
[00074] Visualizing the mutual ground within the space itself using HoloLens allows the understanding of how complex the problem can be when executed in a manual fashion. Some corner spaces that are not typically used as default social areas of a certain room may become the only required common ground for interaction with other rooms. Overcoming this spatial bias is easily executed within the algorithm; meanwhile, this may not happen so easily and instantly when individuals are left to deal with it on their own.
[00075] The system transmits these images to the users through the computing devices with which they connect to the system. Figure 23 shows an embodiment of such a system 90. The user's computing device may consist of anything from a personal computer with a webcam 92 to a virtual, augmented, or mixed reality headset 94, as mentioned above, such as a Microsoft HoloLens® or an Oculus Rift®. The user device will typically have a processor 96, memory 98, display 100, sensors 102, user interface 104, and a network connection 106. The display may consist of a laptop screen, a see-through holographic display, or a head-mounted display as in a virtual reality headset. The sensors could consist of a single webcam, a microphone, multiple cameras, both visible light and infrared, depth sensors, accelerometers, gyroscopes, magnetometers, etc. The headset 94 would have all of the components set out in 92, not shown for simplicity.
[00076] The processor of the user device may execute software code that can perform particular functions based upon input from the sensors. This may include eye tracking, gesture tracking, voice controls, and biometric security such as face or iris recognition, fingerprint ID, etc. The virtual calling system may provide data back to the systems depending upon their capabilities to process the data, such as the type of display, etc.
[00077] When the user sets up or 'calls' the system 110, typically a server or cluster of servers having their own processors such as 102 and memories such as 104, the user inputs will include information from the sensors, possibly in the form of the geometrical mesh data mentioned above, provided to the system to allow the different virtual environments to be configured and the information sent back to the user devices. The system servers such as 110 then gather the information associated with each user to analyze each user's environment. The server then communicates back with each user, depending on that user's settings, as to whether the user needs to reposition themselves, reposition themselves and move objects, and whether the user wants sittable objects in their environment, as examples. When any repositionings and moves have been completed, the user or user device notifies the virtual interaction system, which then sends the virtual environment data back to the users.
[00078] The user devices will then use the information from the system to display the individual users' environments within the capabilities of their computing devices. This includes interactions with the users to provide directions on movement of objects and repositioning, the spatial mapping of the multiple users' spaces, and then the data to allow each user's device to render the virtual space.
[00079] In this manner, the embodiments identify mutually accessible boundaries for all of the participants in a communications setting, provide each user a position to assume during the interaction, and indicate where to move surrounding objects so that all of the participants can hold a similar spatial relationship with each other. This improves the effectiveness and efficiency of communication between the participants, while reducing the amount of space needed to allow the users to meaningfully interact.
[00080] It will be appreciated that variants of the above-disclosed and other features and functions, or alternatives thereof, may be combined into many other different systems or applications. Various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements therein may be subsequently made by those skilled in the art which are also intended to be encompassed by the embodiments.

Claims

WHAT IS CLAIMED IS:
1. A method of providing spatial relationships for a virtual interaction, comprising: analyzing a position of each participant and the environment for each participant based upon user inputs from at least two participant computing devices; directing each participant to reposition themselves and any objects as needed for the virtual interaction by sending direction data to each of the at least two participant computing devices; searching the environment of each participant for mutual space within spatial constraints of each environment of each participant; and providing each participant with a virtual arrangement of other participants in the virtual interaction within boundaries of the mutual space to be displayed on a display of the computing device of each participant.
2. The method as claimed in claim 1, further comprising receiving the user inputs from the at least two participant computing devices, the user inputs based upon data gathered from at least one sensor on each of the at least two participant computing devices.
3. The method as claimed in claim 1, wherein the searching is performed prior to the directing.
4. The method as claimed in claim 1, wherein the directing is performed prior to the searching.
5. The method as claimed in claim 1, wherein the directing and searching are performed in an iterative process.
6. The method as claimed in claim 1, wherein analyzing the position of each participant and the environment for each participant comprises performing semantic segmentation of the environment for each participant to obtain spatial boundaries and geometric properties.
7. The method as claimed in claim 6, further comprising using the spatial boundaries and geometric properties to generate a topological graph of the environment for each participant.
8. The method as claimed in claim 1, wherein searching the environment of each participant for mutual space comprises identifying the geometrical boundaries of available spaces in each environment and aligning the boundaries of all environments to maximize the mutual space.
9. The method as claimed in claim 1, wherein directing the participants to reposition themselves and any objects comprises directing repositioning of objects based upon analyzing an effort to be made to reposition the object against any gained mutual space.
10. The method as claimed in claim 1, further comprising notifying any of the participants when the participant approaches the boundaries of the mutual space.
11. The method as claimed in claim 1, further comprising: receiving a notification from a participant that additional space outside the boundaries is required; and repeating the directing and the searching to identify a new shared space with additional space.
12. The method as claimed in claim 11, wherein receiving the notification comprises receiving a user input of a setting change.
13. The method as claimed in claim 1, wherein the boundaries of the mutual space display as a shaded region on a floor of the virtual environment.
14. The method as claimed in claim 1, wherein the boundaries of the shared space display as a three-dimensional cage.
15. A computing device comprising: a network interface to allow the device to communicate with at least two participants simultaneously; a processor configured to execute code that causes the processor to: receive user inputs containing information from sensors in an environment for each of the at least two participants; determine settings of each of the at least two participants; analyze the user inputs and the settings to determine a mutual space; send messages to the at least two users to direct each of the at least two participants to at least one of re-position or move objects in the environment of each participant; and provide each of the at least two participants with information usable to depict a virtual environment having the mutual space with defined boundaries.
16. The device as claimed in claim 15, wherein the code that causes the processor to analyze the position of each participant and the environment for each participant causes the processor to perform semantic segmentation of the environment for each participant to obtain spatial boundaries and geometric properties.
17. The device as claimed in claim 16, wherein the processor further executes code to cause the processor to use the spatial boundaries and geometric properties to generate a topological graph of the environment for each participant.
18. The device as claimed in claim 15, wherein the code that causes the processor to search the environment of each participant for mutual space comprises identifying the geometric boundaries of available spaces in each environment and aligning the boundaries of all environments to maximize the mutual space.
19. The device as claimed in claim 15, wherein directing the participants to reposition themselves and any objects comprises directing repositioning of objects based upon analyzing an effort to be made to reposition the object against any gained mutual space.
20. The device as claimed in claim 15, wherein the code further causes the processor to notify any of the participants when the participant approaches the boundaries of the mutual space.
21. The device as claimed in claim 15, wherein the code executed by the processor further causes the processor to: receive a notification from a participant that additional space outside the boundaries is required; and identify a new shared space with additional space.
PCT/US2020/055122 2019-10-11 2020-10-09 Search and recommendation process for identifying useful boundaries in virtual interaction settings WO2021072301A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962913938P 2019-10-11 2019-10-11
US62/913,938 2019-10-11

Publications (1)

Publication Number Publication Date
WO2021072301A1 true WO2021072301A1 (en) 2021-04-15

Family

ID=75437760

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/055122 WO2021072301A1 (en) 2019-10-11 2020-10-09 Search and recommendation process for identifying useful boundaries in virtual interaction settings

Country Status (1)

Country Link
WO (1) WO2021072301A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11612817B1 (en) 2021-09-28 2023-03-28 Sony Group Corporation Method for predefining activity zones in an extended reality (XR) environment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180060333A1 (en) * 2016-08-23 2018-03-01 Google Inc. System and method for placement of virtual characters in an augmented/virtual reality environment
WO2018226508A1 (en) * 2017-06-09 2018-12-13 Pcms Holdings, Inc. Spatially faithful telepresence supporting varying geometries and moving users
US20190253667A1 (en) * 2015-08-14 2019-08-15 Pcms Holdings, Inc. System and method for augmented reality multi-view telepresence

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190253667A1 (en) * 2015-08-14 2019-08-15 Pcms Holdings, Inc. System and method for augmented reality multi-view telepresence
US20180060333A1 (en) * 2016-08-23 2018-03-01 Google Inc. System and method for placement of virtual characters in an augmented/virtual reality environment
WO2018226508A1 (en) * 2017-06-09 2018-12-13 Pcms Holdings, Inc. Spatially faithful telepresence supporting varying geometries and moving users

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11612817B1 (en) 2021-09-28 2023-03-28 Sony Group Corporation Method for predefining activity zones in an extended reality (XR) environment
WO2023052859A1 (en) * 2021-09-28 2023-04-06 Sony Group Corporation Method for predefining activity zones in an extended reality (xr) environment

Similar Documents

Publication Publication Date Title
Keshavarzi et al. Optimization and manipulation of contextual mutual spaces for multi-user virtual and augmented reality interaction
KR102601622B1 (en) Contextual rendering of virtual avatars
de Belen et al. A systematic review of the current state of collaborative mixed reality technologies: 2013–2018
US10650106B2 (en) Classifying, separating and displaying individual stories of a three-dimensional model of a multi-story structure based on captured image data of the multi-story structure
US9911232B2 (en) Molding and anchoring physically constrained virtual environments to real-world environments
Wang et al. Mixed reality in architecture, design, and construction
Billinghurst et al. Advanced interaction techniques for augmented reality applications
CN105981076B (en) Synthesize the construction of augmented reality environment
CN107240151B (en) Scene layout saving and reproducing method based on parent-child constraint
EP3769509B1 (en) Multi-endpoint mixed-reality meetings
US9774653B2 (en) Cooperative federation of digital devices via proxemics and device micro-mobility
US9898860B2 (en) Method, apparatus and terminal for reconstructing three-dimensional object
KR20190134030A (en) Method And Apparatus Creating for Avatar by using Multi-view Image Matching
Radu et al. A survey of needs and features for augmented reality collaborations in collocated spaces
Dong et al. Tailored reality: Perception-aware scene restructuring for adaptive vr navigation
WO2021072301A1 (en) Search and recommendation process for identifying useful boundaries in virtual interaction settings
Kim et al. Mutual space generation with relative translation gains in redirected walking for asymmetric remote collaboration
KR20220026186A (en) A Mixed Reality Telepresence System for Dissimilar Spaces Using Full-Body Avatar
Keshavarzi et al. Mutual scene synthesis for mixed reality telepresence
CN107103646B (en) Expression synthesis method and device
KR20210101570A (en) Method and Apparatus of Avatar Placement in Remote Space from Telepresence
Wang et al. Designing interaction for multi-agent cooperative system in an office environment
Forte et al. Teleimmersive archaeology: simulation and cognitive impact
Arora et al. Introduction to 3d sketching
Keshavarzi et al. Synthesizing Novel Spaces for Remote Telepresence Experiences

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20874068

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20874068

Country of ref document: EP

Kind code of ref document: A1