CN115576427A - XR-based multi-user online live broadcast and system - Google Patents

XR-based multi-user online live broadcast and system

Info

Publication number
CN115576427A
Authority
CN
China
Prior art keywords
participant
information
space
data
virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211322357.2A
Other languages
Chinese (zh)
Inventor
印眈峰
吴梅荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Longtai Medical Technology Co ltd
Original Assignee
Intuitive Vision Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intuitive Vision Co ltd filed Critical Intuitive Vision Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/14 Digital output to display device; Cooperation and interconnection of the display device with other functional units
    • G06F 3/1454 Digital output to display device; Cooperation and interconnection of the display device with other functional units involving copying of the display data of a local workstation or window to a remote workstation or window so that an actual copy of the data is displayed simultaneously on two or more displays, e.g. teledisplay
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the specification provides an XR-based multi-user online live broadcast method and system, and the method comprises the following steps: establishing communication connection with terminals of at least two participants; creating a virtual space in which a virtual character corresponding to each of at least two participants is created; determining the position information of a virtual character corresponding to a participant in a virtual space based on the acquired position data of the participant in the actual space through a preset 3D coordinate position algorithm; displaying a virtual character in the virtual space based on the position information of the virtual character; and acquiring the shared data uploaded by the participants, and displaying the shared data in the virtual space.

Description

XR-based multi-user online live broadcast and system
Description of the divisional application
This application is a divisional application of the Chinese application filed on 28/09/2022 with application number 2022111915615 and entitled "An XR-based multi-person collaboration method and system".
Technical Field
The specification relates to the technical field of communication, in particular to an XR-based multi-user online live broadcast method and system.
Background
In current multi-person communication scenarios, time and travel costs often prevent users in different locations from attending important meetings at any time. However, the multi-user remote communication schemes currently on the market can only use computers and mobile phones to present flat pictures of the objects being explained and discussed.
Therefore, it is desirable to provide an XR-based multi-user collaboration method and system that can provide a more direct and efficient way for participants in different locations to communicate and collaborate in depth remotely.
Disclosure of Invention
One embodiment of the present specification provides an XR-based multi-user online live broadcast method, including: establishing communication connection with terminals of at least two participants; creating a virtual space in which a virtual character corresponding to each of at least two participants is created; determining the position information of a virtual character corresponding to the participant in the virtual space based on the acquired position data of the participant in the actual space by a preset 3D coordinate position algorithm; displaying a virtual character in the virtual space based on the position information of the virtual character; and acquiring the shared data uploaded by the participants, and displaying the shared data in the virtual space.
One of the embodiments of the present specification provides an XR-based multi-user online live broadcast system, including: the connection module is used for establishing communication connection with the terminals of at least two participants; a positioning module for creating a virtual space in which a virtual character corresponding to each of at least two participants is created; the positioning module is further used for determining the position information of a virtual character corresponding to the participant in the virtual space based on the acquired position data of the participant in the actual space through a preset 3D coordinate position algorithm; and displaying the virtual character in the virtual space based on the position information of the virtual character; the downloading module is used for acquiring the shared data uploaded by the participants; and the display module is used for displaying the shared data in the virtual space.
One of the embodiments of this specification provides an XR-based multi-user online live broadcast device, which includes: at least one storage medium storing computer instructions; and at least one processor that executes the computer instructions to implement the XR-based multi-user online live broadcast method.
One embodiment of the present specification provides a computer-readable storage medium storing computer instructions, which when read by a computer, cause the computer to execute the XR-based multi-user online live broadcast method.
Drawings
The present description will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a schematic diagram illustrating an application scenario of an XR-based multi-person collaboration system, in accordance with some embodiments of the invention;
FIG. 2 is an exemplary block diagram of an XR-based multi-person collaboration system in accordance with some embodiments of the present description;
FIG. 3 is an exemplary flow diagram of an XR-based multi-person collaboration method in accordance with some embodiments of the present description;
FIG. 4 is a flow diagram illustrating an exemplary method for determining location information of a participant in a virtual space according to some embodiments of the present description;
FIG. 5 is an exemplary flow diagram of an XR-based multi-person online live method, shown in accordance with some embodiments of the present description;
FIG. 6 is an exemplary flow diagram illustrating real-time updating of location information according to some embodiments of the present description;
FIG. 7 is an exemplary flow diagram illustrating determining presentation priorities of sub-action information in accordance with some embodiments of the present description;
FIG. 8 is an exemplary flow diagram of a data processing method for XR according to some embodiments described herein;
FIG. 9 is an exemplary flow diagram of presenting content to be marked, according to some embodiments of the present description;
FIG. 10 is an exemplary flow diagram illustrating the determination of a predicted presentation according to some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "apparatus", "unit" and/or "module" as used herein is a method for distinguishing different components, elements, parts, portions or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the terms "a," "an," "the," and/or "said" do not refer specifically to the singular and may also include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; the steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flowcharts are used in this specification to illustrate the operations performed by the system according to embodiments of the present specification. It should be understood that the operations may not necessarily be performed exactly in the order shown. Rather, the various steps may be processed in reverse order or simultaneously. Meanwhile, other operations may be added to the processes, or one or more steps may be removed from the processes.
Fig. 1 is a schematic diagram illustrating an application scenario 100 of an XR-based multi-person collaboration system, in accordance with some embodiments of the present description. XR (Extended Reality) is a general term for various new immersive technologies such as Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR). XR combines the real and the virtual through computers to create a virtual space in which humans and machines can interact.
As shown in fig. 1, an application scenario 100 of an XR-based multi-person collaborative system may include a processing device 110, a network 120, a storage device 130, a terminal 140, and a data acquisition device 150. The components in the application scenario 100 may be connected in one or more different ways. For example, the data acquisition device 150 may be connected to the processing device 110 through the network 120. As another example, as shown in FIG. 1, the data acquisition device 150 may be directly connected to the processing device 110.
In some embodiments, the application scenario 100 of the XR-based multi-person collaboration system may include any scenario where multiple persons who are not in the same physical space need to collaborate. For example, the application scenario 100 may include academic conferences, remote consultation, instructional training, surgical guidance, live broadcasting, and the like. The XR-based multi-person collaboration system may create a virtual space, and the application scenario 100 may be implemented through the virtual space. For example, in a surgical guidance scene, the medical staff participating in a surgery can communicate interactively in the virtual space, share patient information recorded by medical equipment, and live broadcast to an expert in the virtual space so that the expert can provide remote surgical guidance. Further, 3D models, such as a heart model, can be shared and disassembled in the virtual space, and relevant video and image data of the operation can also be presented in the virtual space; the operating personnel can then share data information by wearing terminal devices such as VR and AR equipment.
The data acquisition device 150 may be a device that obtains audio and video data relating to the participants and the space in which the participants are located. The data acquisition device 150 may include a panoramic camera 151, a general camera 152, a motion sensor (not shown in the figures), and the like.
Processing device 110 may process data and/or information obtained from storage device 130, terminal 140, and/or data acquisition device 150. The processing device 110 may include a server data center. In some embodiments, the processing device 110 may host a simulated virtual world, or metaverse, for the terminal 140. For example, the processing device 110 may generate the participant's position data based on images of the participant collected by the data acquisition device 150. As another example, the processing device 110 may generate position information of the participant in the virtual space based on the position data of the participant.
In some embodiments, the processing device 110 may be a computer, a user console, a single server or group of servers, or the like. The server group may be centralized or distributed. For example, a designated area of the metaverse may be emulated by a single server. In some embodiments, processing device 110 may include multiple simulation servers dedicated to physical simulation to manage interactions and to handle collisions between characters and objects in the metaverse.
In some embodiments, the processing device 110 may be implemented on a cloud platform. For example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, the like, or any combination thereof.
In some embodiments, the processing device 110 may include a storage device dedicated to storing data related to objects and characters in the metaverse. The data stored in the storage device may include object shapes, avatar shapes and appearances, audio clips, metaverse-related scripts, and other metaverse-related objects. In some embodiments, the processing device 110 may be implemented by a computing device having a processor, memory, input/output (I/O), communication ports, and the like. In some embodiments, the processing device 110 may be implemented on a processing circuit (e.g., a processor, a CPU) of the terminal 140.
The terminal 140 may be a device that allows a user to participate in a virtual reality experience. In some embodiments, terminal 140 may include a VR headset, VR glasses, VR patches, stereoscopic head-mounted displays, or the like, personal computers (PCs), cell phones, or any combination thereof. For example, the terminal 140 may include Google Glass™, Oculus Rift™, Gear VR™, and the like. In particular, the terminal 140 may include a display device 141 on which virtual content may be presented and displayed. The user may view virtual content (e.g., content to be marked, marking information, etc.) through the display device 141.
In some embodiments, a user may interact with the virtual content through display device 141. For example, head movements and/or gaze directions of the user may be tracked while the user is wearing the display device 141, thereby presenting virtual content in response to changes in the user's position and/or orientation, providing an immersive and convincing virtual reality experience that reflects changes in the user's perspective.
In some embodiments, the terminal 140 may further include an input component 142. Input component 142 may enable user interaction between a user and virtual content displayed on display device 141. The virtual content may include data information uploaded by the participants. For example, the input component 142 may include a touch sensor, microphone, or the like configured to receive user inputs that may be provided to the terminal 140 and control the virtual world by changing visual content presented on the display device. In some embodiments, the user input received by the input component may include, for example, touch, voice input, and/or gesture input, and may be sensed by any suitable sensing technique (e.g., capacitive, resistive, acoustic, optical). In some embodiments, the input components 142 may include a handle, gloves, a stylus, a game console, and the like.
In some embodiments, display device 141 (or processing device 110) may track input component 142 and present virtual elements based on the tracking of input component 142. The virtual element may include a representation of the input component 142 (e.g., an image of a user's hand, fingers). The virtual element may be rendered in a 3D position in the virtual reality experience that corresponds to the real position of the input component 142.
For example, one or more sensors may be used to track the input component 142. Display device 141 may receive signals collected by one or more sensors from input component 142 over a wired or wireless network. The signal may include any suitable information capable of tracking the input assembly 142, such as the output of one or more inertial measurement units (e.g., accelerometers, gyroscopes, magnetometers) in the input assembly 142, a Global Positioning System (GPS) sensor in the input assembly 142, or the like, or a combination thereof.
The signal may indicate a position (e.g., in the form of a three-dimensional coordinate) and/or an orientation (e.g., in the form of a three-dimensional rotational coordinate) of the input assembly 142. In some embodiments, the sensors may include one or more optical sensors for tracking the input component 142. For example, the sensor may use a visible light and/or depth camera to position the input component 142.
In some embodiments, input component 142 may include a haptic component that may provide haptic feedback to a user. For example, the haptic assembly may include a plurality of force sensors, motors, and/or actuators. The force sensor may measure the magnitude and direction of the force applied by the user and input these measurements to the processing device 110.
Processing device 110 may translate the entered measurements into movements of one or more virtual elements (e.g., virtual fingers, virtual palm, etc.) that may be displayed on display device 141. Processing device 110 may then calculate one or more interactions between the one or more virtual elements and at least a portion of the participants and output these interactions as computer signals (i.e., signals representing feedback forces). The motors or actuators in the haptic components may apply feedback forces to the user based on the computer signals received from the processing device 110, so that the participants feel the actual tactile sensation of the objects in the surgical guidance. In some embodiments, the magnitude of the feedback force may be based on a default setting of the XR-based multi-person collaboration system, or may be preset by a user or operator through, for example, a terminal device (e.g., terminal 140).
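The haptic loop described above can be pictured with a short sketch. This is a minimal illustration only, assuming a simple linear gain for translating measured forces into virtual movement and a capped spring model for the feedback force; neither the model nor the names come from this specification.

```python
# Minimal sketch of the haptic feedback loop (illustrative assumptions:
# linear gain for input translation, capped spring model for feedback).
def virtual_displacement(measured_force_n, gain=0.001):
    """Translate a measured force vector (N) into a virtual element movement (m)."""
    return tuple(component * gain for component in measured_force_n)

def feedback_force(penetration_depth_m, stiffness=300.0, max_force_n=5.0):
    """Feedback force for a virtual contact, capped by the preset magnitude
    mentioned in the text (the cap here is an assumed default)."""
    return min(stiffness * max(penetration_depth_m, 0.0), max_force_n)

# Example: a 2 mm penetration yields a 0.6 N feedback force.
print(virtual_displacement((1.0, 0.0, 2.0)))  # (0.001, 0.0, 0.002)
print(feedback_force(0.002))                  # 0.6
```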
In some embodiments, the application scenario 100 of the XR-based multi-person collaboration system may further include an audio device (not shown) configured to provide audio signals to a user. For example, an audio device (e.g., a speaker) may play sounds made by the participants. In some embodiments, the audio device may include an electromagnetic speaker (e.g., a moving coil speaker, a moving iron speaker, etc.), a piezoelectric speaker, an electrostatic speaker (e.g., a condenser speaker), or the like, or any combination thereof. In some embodiments, the audio device may be integrated into the terminal 140. In some embodiments, terminal 140 may include two audio devices located on the left and right sides of terminal 140, respectively, to provide audio signals to the left and right ears of a user.
Storage device 130 may be used to store data and/or instructions. For example, storage device 130 may be used to store relevant information and/or data collected by the data acquisition device 150. Storage device 130 may obtain data and/or instructions from, for example, processing device 110. In some embodiments, storage device 130 may store data and/or instructions that processing device 110 uses to perform the exemplary methods described in this specification. In some embodiments, the storage device 130 may be integrated on the processing device 110.
Network 120 may provide a conduit for information and/or data exchange. In some embodiments, information may be exchanged between processing device 110, storage device 130, terminal 140, and data acquisition device 150 via the network 120. For example, the terminal 140 may acquire data information and the like transmitted from the processing device 110 through the network 120.
It should be noted that the above description of the application scenario 100 of an XR-based multi-person collaboration system is for illustrative purposes only and is not intended to limit the scope of the present disclosure. For example, the components and/or functionality of the application scenario 100 may vary or change depending on the specific implementation scenario. In some embodiments, the application scenario 100 may include one or more additional components (e.g., storage devices, networks, etc.) and/or may omit one or more of the components described above. Additionally, two or more components of the application scenario 100 may be integrated into one component, and one component of the application scenario 100 may be implemented as two or more subcomponents.
FIG. 2 illustrates an exemplary block diagram of an XR-based multi-person collaboration system 200 in accordance with some embodiments of the present description. In some embodiments, the XR-based multi-person collaboration system 200 may include a connection module 210, a location module 220, a download module 230, a presentation module 240, and a generation module 250.
The connection module 210 may be used to establish a communication connection with the terminals of at least two participants.
The positioning module 220 may be configured to determine the position information of the at least two participants in the virtual space through a preset 3D coordinate position algorithm.
In some embodiments, the position information of the at least two participants in the virtual space is related to the position data of the at least two participants in the real space, and the position data of the at least two participants in the real space is obtained by the terminals of the at least two participants.
In some embodiments, the location module 220 may be further operable to create a virtual space; create a virtual character in the virtual space corresponding to each of the at least two participants, wherein the virtual character has initial position information in the virtual space; acquire position data of a participant in the real space and associate the position data with the position information of the corresponding virtual character in the virtual space; acquire movement data of the participant in the actual space based on the position data of the participant; and update the initial position information based on the movement data through a preset 3D coordinate position algorithm and determine the updated position information.
In some embodiments, the positioning module 220 may be configured to create a virtual space and to create a virtual character in the virtual space corresponding to each of the at least two participants. In some embodiments, the positioning module 220 is further configured to determine, through a preset 3D coordinate position algorithm, position information of a virtual character corresponding to the participant in the virtual space based on the acquired position data of the participant in the real space. In some embodiments, the positioning module 220 may be configured to display the virtual character in the virtual space based on the position information of the virtual character.
In some embodiments, the positioning module 220 may be further configured to scan the actual space where the participant is located, and perform spatial positioning on the participant; for the participant who finishes scanning, determining real-time position data of the participant in the actual space; determining first movement information of the participant in the real space based on the real-time location data; determining initial position information of the virtual character in the virtual space; acquiring first action information of a participant in a real space; the first action information comprises sub-action information of each part of the body of the participant; and synchronously updating the second movement information and/or the second action information of the virtual character based on the first movement information and/or the first action information through a preset 3D coordinate position algorithm.
In some embodiments, the positioning module 220 may be further configured to determine at least one core body part of the participant based on the current scene; determining presentation priorities of sub-action information for various parts of a participant's body based on at least one core body part; determining display parameters of the action information based on the display priority of the sub-action information, wherein the display parameters comprise display frequency and display precision; second action information of the avatar corresponding to the participant is synchronized based on the presentation parameters.
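As a rough illustration of this priority scheme, the sketch below maps an assumed current scene to core body parts and derives display frequency and precision from whether a sub-action stream belongs to a core part. The scene names and the numeric frequency/precision values are assumptions for illustration, not values given in this specification.

```python
# Illustrative only: the scene-to-core-part mapping and parameter values are assumed.
CORE_PARTS_BY_SCENE = {
    "surgical_guidance": {"hand", "head"},
    "teaching_training": {"head", "arm"},
}

def presentation_parameters(scene: str, body_part: str) -> dict:
    """Display parameters for one sub-action stream: core body parts get
    higher display frequency and precision than peripheral parts."""
    is_core = body_part in CORE_PARTS_BY_SCENE.get(scene, set())
    return {
        "display_frequency_hz": 60 if is_core else 10,
        "display_precision": "full_joint" if is_core else "coarse_pose",
    }

print(presentation_parameters("surgical_guidance", "hand"))
# {'display_frequency_hz': 60, 'display_precision': 'full_joint'}
```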
The download module 230 may be configured to store data information uploaded by at least two participants and provide a data download service to the at least two participants, where the data download service includes at least one of creating a data download channel and providing a download resource.
In some embodiments, the download module 230 may be used to obtain shared data uploaded by the participants.
The presentation module 240 may be used to simultaneously present data information on the terminals of at least two participants.
In some embodiments, the terminal includes at least one of a VR display device, an AR display device, a mobile phone, and a PC computer.
In some embodiments, the presentation module 240 may be used to present shared data within a virtual space.
In some embodiments, the presentation module 240 may be configured to create at least one second space and/or second window in the virtual space, wherein each of the at least one second space and/or second window corresponds to one participant; the shared data of the corresponding participant is presented through the second space and/or the second window.
In some embodiments, the presentation module 240 may be configured to present the content to be marked on the canvas, where the content to be marked is marked data and/or unmarked raw data; acquiring marking information created on a canvas by a marking requester by using a ray interaction system, wherein the marking information comprises marking content and a marking path; and sharing the content to be marked and the marking information to terminals of other participants for displaying.
In some embodiments, the content to be marked is content displayed in any window and any position in a plurality of windows on the terminal of the annotation requester.
In some embodiments, the presentation module 240 may be further configured to obtain presentation settings of the annotation requester, where the presentation settings include real-time mark presentation and mark-completed presentation; and sharing the content to be marked and the marking information thereof to terminals of other participants for displaying based on the display setting.
In some embodiments, presentation module 240 may be further operable to determine perspective information for each of the other participants based on the location information for each of the other participants; and determining the display content of each participant based on the view angle information of each participant, and displaying, wherein the display content comprises the content to be marked and/or the marking information under the view angle information.
The generation module 250 can be used to create a canvas within the virtual space in response to a request by an annotation requestor.
It should be noted that the above description of the system and its modules is merely for convenience of description and should not limit the present application to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given the teachings of the present system, the modules may be combined in any manner or connected to other modules as sub-systems without departing from such teachings. For example, the connection module 210, the positioning module 220, the download module 230, and the presentation module 240 may be combined to form an XR-based multi-user online live broadcast system. As another example, the presentation module 240 and the generation module 250 may, in combination, constitute a data processing system for XR. Such variations are within the scope of the present application.
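For orientation, the module split of FIG. 2 can be sketched as plain classes. This is only a structural outline under the assumption of a Python implementation; the method names are paraphrases of the responsibilities listed above, not identifiers from this specification.

```python
# Structural sketch of the XR-based multi-person collaboration system 200.
class ConnectionModule:
    def connect(self, terminals):
        """Establish communication connections with at least two participant terminals."""

class PositioningModule:
    def create_virtual_space(self, participants):
        """Create the virtual space and one virtual character per participant."""

    def update_position(self, participant_id, real_space_position):
        """Map real-space position data to virtual-space position information
        via the preset 3D coordinate position algorithm."""

class DownloadModule:
    def get_shared_data(self, participant_id):
        """Store uploaded data information and serve download requests."""

class PresentationModule:
    def present(self, shared_data):
        """Display shared data synchronously on participant terminals."""

class GenerationModule:
    def create_canvas(self, annotation_request):
        """Create a canvas in the virtual space for an annotation requester."""
```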
Fig. 3 is an exemplary flow diagram of a multi-person collaboration method in accordance with some embodiments shown herein. As shown in fig. 3, the process 300 may include the following steps.
In step 310, a communication connection is established with the terminals of at least two participants. In some embodiments, step 310 may be performed by connection module 210.
A participant may refer to a person participating in a collaboration. The collaboration scenarios are different, and the participants may be different. For example, in an operating room VR scenario, the participants may include a worker (e.g., a doctor participating in an operation), a remote specialist, and an operator. Wherein, the operator can manage the user authority and archive the guidance process.
Multi-person collaboration can be used for academic conferences, remote consultation, medical training, live surgical broadcasting, medical instrument training, and the like, and can realize multi-user real-time live broadcasting, real-time annotation sharing, and so on. Details of multi-user online live broadcasting can be found in the description of other parts of the specification, for example, FIGS. 5, 6, and 7. Details of real-time annotation sharing can be found in the description of other parts of the specification, for example, FIGS. 8, 9, and 10.
In some embodiments, a server data center may be established, and the participants are connected with the server data center through terminals of the participants to realize communication connection.
The server data center can be a platform that carries multi-person collaborative instant communication. The server data center may include at least one server whose performance meets the requirements of multi-user collaborative operation; it can be accessed by multiple users and multiple terminals, can ensure the stability and real-time performance of multi-terminal access, and can ensure the security and integrity of the server data.
In some embodiments, the communication connection may be used for audio instant messaging, video instant messaging, multi-person virtual space communication, and the like. Audio instant messaging may include the recording, transmission, and reception of audio information. Video instant messaging may include the recording, decoding, transmission, and reception of video information. Multi-person collaboration can be realized through multi-person virtual space communication. For example, a specialist in a different location may be invited to perform remote assistance during the live surgical broadcast. As another example, an invited participant may view and communicate with other participants and give help regarding the live view of the inviting participant. For another example, a participant can also use the marking function to perform local marking and can display the marked content to other participants in real time.
The participant's terminal may refer to a device used by the participant to participate in the collaboration. In some embodiments, the participant's terminal may include a device for enabling the participant to connect with the server data center and a data collection device. The data acquisition device is a device for acquiring data such as audio and video of an actual space where a participant is located, and examples of the data acquisition device include a panoramic camera, a general camera, AR glasses, a mobile phone, a motion sensor, a depth camera, and the like. The participant terminal may further include a display device that may display data obtained from the server data center.
In some embodiments, the terminal includes at least one of a VR display device, an AR display device, a mobile phone, and a PC computer. For example, the devices that enable the participants to connect with the server data center may include AR glasses, VR helmets, PCs, cell phones, and the like.
Step 320, determining the position information of at least two participants in the virtual space through a preset 3D coordinate position algorithm. In some embodiments, step 320 may be performed by the positioning module 220.
A virtual space may refer to a space in which virtual objects are presented. The virtual space can be created based on the information of the actual space or based on preset virtual information; for details on creating the virtual space, reference may be made to the description elsewhere in this specification, for example, fig. 4.
In some embodiments, the virtual space may correspond to different scenarios, for example, may include academic conferences, teaching and training, case inquiries, operating room VR scenarios, surgical procedure detail scenarios, pathology sharing scenarios, surgical navigation information scenarios, patient vital sign data scenarios, and the like. In the virtual space, the participant's location information and data information may be presented. For details of the location information and the data information, reference may be made to the description of the rest of the present specification, for example, step 340 in fig. 3.
By way of example only, in a surgical procedure detail scene, an operator can wear a terminal device and connect to the server data center; the operation picture of the actual space seen by the operator can be projected into the virtual space and live broadcast to remote experts and scholars in real time; the remote experts and scholars can view the on-site close-range operation details in real time by connecting to the server data center; and the operator can also communicate with the remote experts by audio and video and obtain remote guidance.
As another example, in a teaching and training scene, a teacher and students can join the virtual space through their participant terminals; the teacher can give a live training broadcast in the virtual space and import and share data such as three-dimensional models, images, and text in the virtual space; the teacher and students can walk and interact in the virtual space; and the shared data can be edited and marked.
In some embodiments, a spatial coordinate system may be provided in the virtual space, and the spatial coordinate system may be used to represent a spatial position relationship of the virtual object in the virtual space. A plurality of participants can communicate and interact in the same virtual space through participant terminals.
In some embodiments, the virtual object may include a spatial background, a virtual character, a virtual window, a canvas, data information, and the like. In some embodiments, the virtual space may include a spatial background, which may be a live image or other preset image of the actual space. In some embodiments, a virtual character corresponding to each participant may be included in the virtual space. For details of the avatar, reference may be made to the description elsewhere in this specification, e.g., FIG. 4.
In some embodiments, the virtual space may include a plurality of second windows and/or a plurality of second spaces. For details of the second window and/or the second space, reference may be made to the description of the rest of this specification, for example, fig. 5. In some embodiments, the virtual space may include a canvas. For details of the canvas, reference may be made to the description of the remainder of this specification, e.g., FIG. 8.
Location information may refer to information relating to the participant's position and/or motion in virtual space. The location information may include initial location information and real-time location information of the participant in the virtual space. The initial location information may refer to the initial location of each participant in the virtual space. For details of the initial position information, reference may be made to the description of the rest of the present description, for example, fig. 4.
In some embodiments, the location information may include motion information of the virtual character corresponding to the participant in the virtual space. The motion information may refer to physical motion information generated by the participant in real space. For details of the action information, reference may be made to the description elsewhere in this specification. For example, fig. 6.
In some embodiments, the motion information may also include head motion information of the virtual character in the virtual space corresponding to the actual motion of the participant, from which perspective information of the participant in the virtual space may be determined. For details of the viewing angle information, reference may be made to the description elsewhere in this specification, for example, fig. 9.
In some embodiments, the position information of the at least two participants in the virtual space is related to the position data of the at least two participants in the real space, and the position data of the at least two participants in the real space is obtained by the terminals of the at least two participants.
The real space may refer to the space where the participant is actually located. For example, the real space may refer to an office, a study, an outdoor location, etc. where the participant is located.
Location data may refer to data relating to the participant's position and/or motion in real space. In some embodiments, the location data may include the participant's location and/or actions in real space. Where the position may be represented by coordinates in real space. For example, the coordinates may be represented by coordinates consisting of longitude and latitude or coordinate information based on other preset coordinate systems. In some embodiments, the position data may include the participant's coordinate position, speed of movement, acceleration, motion of body parts, orientation of the participant's terminal (i.e., the participant's orientation), and the like. The location data may include real-time location data.
The participant's position data may be determined by a positioning device and/or a data acquisition device (e.g., a camera, a sensor, etc.) in the physical space in which the participant is located, for example, by receiving data transmitted by the positioning device, the data acquisition device, and the like. For example, the position of the user may be determined based on the received positioning-device data. As another example, the participant's actions may be determined by a camera and sensors. The position data may be acquired by connecting to the positioning device and the data acquisition device in the physical space in which the participant is located.
Exemplary positioning devices may include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou navigation system, the Galileo positioning system, the Quasi-Zenith Satellite System (QZSS), base station positioning systems, and Wi-Fi positioning systems.
In some embodiments, the participant's position information in the virtual space may be determined based on the participant's position data in the real space. For example, a database may be preset in the server data center in which the participants' position data are mapped to the position information of the virtual characters. The database may be established based on the correspondence between position data and position information in historical data. The correspondence between position data and position information may be determined by a 3D coordinate position algorithm.
In some embodiments, the position information may be updated based on the position data of the participant in the real space, enabling synchronization of the participant's position information in the real space and the virtual space. For details on updating the location information, reference may be made to the contents of the rest of the description, for example, fig. 4.
In some embodiments, the position data (e.g., 3D coordinates) in the real space may be transformed into position information in the virtual space by a 3D coordinate position algorithm via a projective transformation matrix. For example, the coordinates of the real space may be converted into coordinates in a virtual space coordinate system.
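A minimal sketch of this conversion step follows, assuming the preset 3D coordinate position algorithm can be represented by a 4x4 homogeneous projective transformation matrix; the matrix itself would come from space scanning or calibration and is not specified in this description.

```python
import numpy as np

def real_to_virtual(point_xyz, transform_4x4):
    """Convert a real-space 3D point into virtual-space coordinates using a
    homogeneous projective transformation matrix (assumed representation)."""
    p = np.array([*point_xyz, 1.0])
    q = transform_4x4 @ p
    return (q[:3] / q[3]).tolist()

# Example: an identity matrix maps real-space coordinates to themselves.
print(real_to_virtual((1.0, 2.0, 0.5), np.eye(4)))  # [1.0, 2.0, 0.5]
```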
In some embodiments, determining the position information of the at least two participants in the virtual space by a preset 3D coordinate position algorithm includes: creating a virtual space; creating a virtual character in the virtual space corresponding to each of the at least two participants, wherein the virtual character has initial position information in the virtual space; acquiring the position data of the participant in the actual space, and associating the position data with the position information of the corresponding virtual character in the virtual space; acquiring movement data of the participant in the actual space based on the position data of the participant; and updating the initial position information based on the movement data through a preset 3D coordinate position algorithm, and determining the updated position information. For details on determining the location information, reference may be made to the contents of the rest of the description, for example, fig. 4.
In some embodiments, determining, by a preset 3D coordinate location algorithm, location information of the virtual character corresponding to the participant in the virtual space based on the acquired location data of the participant in the real space includes: scanning the actual space where the participant is located, and carrying out space positioning on the participant; for the participant who finishes scanning, determining real-time position data of the participant in the actual space; determining first movement information of the participant in the real space based on the real-time location data; determining initial position information of the virtual character in the virtual space; acquiring first action information of a participant in a real space; the first action information comprises sub-action information of each part of the body of the participant; and synchronously updating the second movement information and/or the second action information of the virtual character corresponding to the participant based on the first movement information and/or the first action information through a preset 3D coordinate position algorithm. Further, location information is determined based on the second movement information and/or the second motion information. For details on determining the location information, reference may be made to the contents of the rest of the description, for example, fig. 6.
Step 330, storing the data information uploaded by the at least two participants, and providing a data download service to the at least two participants, wherein the data download service includes at least one of creating a data download channel and providing a download resource. In some embodiments, step 330 may be performed by the download module 230.
The data information may refer to information shared in a virtual space. For example, the data information may include 3D models, videos, documents, operation manuals, and the like. In some embodiments, the data information may include content to be tagged and tagging information. For details of the content to be marked and the marking information, reference may be made to the description of other contents of the present specification, for example, fig. 8.
In some embodiments, the data information may be data uploaded by the participants or data retrieved from other platforms (e.g., a network cloud platform). The data information may be stored in a storage device of the server data center. In response to a participant's data request, the server data center may connect to other platforms to retrieve the corresponding data information, retrieve data information uploaded to the server data center by participants, or retrieve data information stored in the storage device of the server data center.
The data download service may refer to a service that downloads data information by connecting to a corresponding communication network (e.g., a 4G network) through a communication module (e.g., an LTE communication module). The participant may obtain the data information through the data download service.
In some embodiments, download channels may be created at the server data center and the participant terminals. There may be multiple download channels, one corresponding to each participant. The participants can acquire the required data information through the data download channels.
In some embodiments, the data information may be stored in a storage device of the server data center in categories (e.g., by data type), each data information type corresponding to one data download channel, and the corresponding data information may be obtained from a different data download channel in response to the type of data request by the participant.
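The sketch below illustrates the type-keyed download channels described above, under the assumption that a channel is simply a per-type store on the server data center; the class and method names are illustrative rather than taken from this specification.

```python
from collections import defaultdict

class DataDownloadService:
    """One download channel per data-information type (illustrative)."""

    def __init__(self):
        self._channels = defaultdict(list)   # data type -> stored items

    def store(self, data_type, item):
        """Store uploaded data information in its type-specific channel."""
        self._channels[data_type].append(item)

    def download(self, data_type):
        """Serve a participant's request from the matching channel."""
        return list(self._channels[data_type])

service = DataDownloadService()
service.store("3d_model", "heart_model.glb")
print(service.download("3d_model"))  # ['heart_model.glb']
```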
Step 340, synchronously displaying data information on the terminals of at least two participants. In some embodiments, step 340 may be performed by presentation module 240.
In some embodiments, the data information may be displayed in a virtual space, and the participant terminal may establish a connection with the server data center to obtain the data information, and synchronously display the data information through a display device of the participant terminal.
In some embodiments, different presentation modes may be determined according to different participant terminals. For example, a PC or a mobile phone may display the data information on its screen, while AR glasses and/or a VR helmet may display the data information on a screen projected inside the device.
By establishing the 3D virtual space and synchronously sharing data information within it, synchronization between local and remote sites is realized, which solves problems of offline multi-person conferences such as limits on the number of attendees and on the venue. Through the virtual space, participants can interact with the scene in a face-to-face manner and perceive the object information in the virtual space more intuitively; more platforms are supported, so participants can join the discussion with different devices anytime and anywhere, greatly reducing time costs and allowing collaboration teams to be formed efficiently and quickly. Meanwhile, records can be formed through the virtual space, which makes it convenient for others to learn from them later, summarize experience, and even investigate and obtain evidence.
FIG. 4 is a flow diagram illustrating an exemplary method for determining location information of a participant in a virtual space according to some embodiments of the present description. In some embodiments, the flow 400 may be performed by the positioning module 220.
Step 410, creating a virtual space; creating a virtual character in a virtual space corresponding to each of the at least two participants, wherein the virtual character has initial position information in the virtual space.
In some embodiments, a coordinate system may be established in an arbitrary real space; model data of a real-space model may be created based on the real-space coordinate system and real-space scan data, and a real-space coordinate system corresponding to the real-space model may be established; a virtual-space model corresponding to the real-space model may then be established by mapping according to the model data of the real-space model, and a virtual-space coordinate system corresponding to the virtual-space model may be established.
In some embodiments, the virtual space may be created based on the design, e.g., the virtual space may be a designed virtual operating room or the like.
An avatar may refer to a character image corresponding to a participant in the virtual space. Participants may be assigned corresponding avatars by default when they join the server data center, or they may be offered a plurality of already-created candidate avatars from which each participant selects one as the avatar corresponding to themselves. The position information of the corresponding participant can be displayed synchronously through the virtual character. For example, if participant 1 clicks on avatar 1 and then moves to the left in the real space, the corresponding avatar 1 also moves to the left in the virtual space.
In some embodiments, the initial position information of the virtual character may be determined according to a preset rule. For example, each avatar is pre-set with an initial position. When the participant selects the virtual character corresponding to the participant, the initial position of the participant can be determined, or the participant can select the initial position of the virtual character corresponding to the participant in the virtual space.
Step 420, obtaining the position data of the participant in the real space, and associating the position data with the position information of the corresponding virtual character in the virtual space.
In some embodiments, the position data of the participant in the real space can be acquired through connection with a positioning device and a data acquisition device.
In some embodiments, a storage device for each avatar may be provided in the server data center. When the position data of the participant is acquired, the position data may be stored in a storage device corresponding to the participant, and the server data center may convert the position data into the position information of the virtual character through a preset 3D coordinate position algorithm.
Step 430, based on the position data of the participant, the movement data of the participant in the real space is obtained.
Movement data may refer to data relating to the movement of the participant in real space. The movement data may include the direction and distance the participant moved, etc.
In some embodiments, the participant's movement data may be determined based on the direction of the participant's movement and the coordinate points before and after the movement. The distance moved may be determined based on the participant's coordinates before and after the movement and a distance formula. For example, if the participant's position data indicates a leftward movement, with coordinates (1,2) before the movement and (1,3) after the movement, the movement distance of 1 meter can be calculated from the coordinates before and after the movement, and the movement data is therefore a leftward movement of 1 meter.
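The example above can be reproduced in a few lines; using the Euclidean distance is an assumption here, since the text only says the distance is determined from the coordinates and "a distance formula".

```python
import math

def movement_data(before_xy, after_xy, direction):
    """Movement data = reported direction + distance from the distance formula."""
    return {"direction": direction, "distance_m": math.dist(before_xy, after_xy)}

print(movement_data((1, 2), (1, 3), "left"))
# {'direction': 'left', 'distance_m': 1.0}
```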
Step 440, updating the initial position information based on the movement data by a preset 3D coordinate position algorithm, and determining the updated position information.
Specifically, in some embodiments, the virtual space needs to obtain the spatial position information of each participant. The participants are at their initial positions when entering the virtual space; after a user performs a relative displacement, the movement data is uploaded to the server data center through the participant's terminal. The server data center converts the movement data into movement information in the virtual space through a projection transformation matrix using the 3D coordinate position algorithm, updates the position of the virtual character, and synchronizes it to the other participants' terminals, so that the other participants can see the real-time movement of the participant's virtual character in the virtual space.
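A compact sketch of this update-and-synchronize step is given below. The in-memory subscriber callbacks stand in for the other participants' terminals, and the 4x4 matrix conversion repeats the assumption used in the earlier sketch; neither detail is prescribed by this specification.

```python
import numpy as np

def _to_virtual(point_xyz, transform_4x4):
    q = transform_4x4 @ np.array([*point_xyz, 1.0])
    return (q[:3] / q[3]).tolist()

class VirtualSpaceServer:
    """Server-data-center side of position synchronization (illustrative)."""

    def __init__(self, transform_4x4):
        self.transform = transform_4x4
        self.avatar_positions = {}     # participant id -> virtual-space position
        self.subscribers = []          # callbacks standing in for other terminals

    def on_movement(self, participant_id, real_space_position):
        """Convert uploaded movement data and push the avatar update to all terminals."""
        virtual_pos = _to_virtual(real_space_position, self.transform)
        self.avatar_positions[participant_id] = virtual_pos
        for notify in self.subscribers:
            notify(participant_id, virtual_pos)

server = VirtualSpaceServer(np.eye(4))
server.subscribers.append(lambda pid, pos: print(f"update {pid}: {pos}"))
server.on_movement("participant_1", (1.0, 3.0, 0.0))  # update participant_1: [1.0, 3.0, 0.0]
```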
Based on the position data of the participant in the actual space and the position information of the participant's virtual character in the virtual space, a more vivid, all-around, and multi-level rendering and display effect can be provided for the participants, creating a sense of realism approaching face-to-face communication and increasing the effectiveness of the communication.
Fig. 5 is an exemplary flow diagram of an XR-based multi-person online live method, shown in some embodiments herein. As shown in fig. 5, the process 500 may include the following steps.
Step 510, establishing communication connection with terminals of at least two participants. In some embodiments, step 510 may be performed by connection module 210.
For the definition and description of the participants and the terminals, and the method for establishing the communication connection, reference may be made to fig. 3 and its related description.
Step 520, a virtual space is created, and a virtual character corresponding to each of the at least two participants is created in the virtual space. In some embodiments, step 520 may be performed by the positioning module 220.
With respect to the definition and illustration of the virtual space and virtual character, and the method of creating the virtual space and virtual character, reference may be made to FIG. 4 and its associated description.
Step 530, determining the position information of the virtual character corresponding to the participant in the virtual space based on the acquired position data of the participant in the actual space by a preset 3D coordinate position algorithm. In some embodiments, step 530 may be performed by the positioning module 220.
For the definition and description of the 3D coordinate location algorithm, location data and location information, reference may be made to fig. 3 and its associated description. With respect to the method of determining and updating location information in real time, reference may be made to fig. 4 and its associated description.
Step 540, displaying the virtual character in the virtual space based on the position information of the virtual character. In some embodiments, step 540 may be performed by the positioning module 220.
In some embodiments, the created virtual character may be displayed at the coordinates corresponding to the position information according to the position information of the virtual character. When the position information changes, the display of the virtual character changes in real time as the position information changes. For a detailed description of the avatar, reference may be made to FIG. 4 and its associated description.
Step 550, acquiring the shared data uploaded by the participants, and displaying the shared data in the virtual space. In some embodiments, step 550 may be performed by the download module 230 and the presentation module 240.
Shared data refers to data uploaded to the virtual space by participants. The shared data has various expression forms such as video, audio, images, models and the like, and the shared data in different application scenes can be different.
For example, when presenting an operating room VR scene, the shared data may include panoramic (e.g., spatial design of the operating room, position orientation, instrument placement, etc.) data for the operating room. As another example, in presenting surgical details, the shared data may include data of a close-up view of the surgical procedure (e.g., surgeon hand manipulation, instrument manipulation, patient surgical site, etc.). For another example, when sharing pathological data, the shared data may include a three-dimensional image model, pathological pictures, videos, and the like of the patient. For another example, when the surgical navigation information is displayed, the shared data may include a screen of the surgical robot (e.g., a surgical planning screen), and the like. For another example, when presenting patient vital sign data, the shared data can include patient intraoperative vital sign monitoring data (e.g., vital signs, blood pressure, heart rate, electrocardiogram, blood oxygen saturation, etc.). For another example, when a remote expert video screen is displayed, the shared data may include video data, audio data, and the like of an expert photographed by a camera. For another example, in an interactive scenario, the shared data may include model manipulation, spatial labeling, group chat message boards, and private chat dialog boxes.
In some embodiments, step 550 further comprises creating at least one second space and/or second window in the virtual space, wherein each of the at least one second space and/or second window corresponds to one participant; the shared data of the corresponding participant is presented through the second space and/or the second window.
The second space and/or the second window refer to a space and/or a window created in the virtual space for presenting the shared data. In some embodiments, the second space and/or the second window may be visible only to the corresponding participant, e.g., a private chat window between two participants. In some embodiments, the layout of the second space and/or the second window may be preset by the system, or the participant may drag it to change its position.
In some embodiments, the participant may create the second space and/or the second window as desired, for example, the participant may select to create the second space and/or the second window on a creation interface of the terminal. In some embodiments, the second space and/or the second window may also be created by the system by default.
In some embodiments, different second spaces and/or second windows may correspond to different participants and present different shared data. In some embodiments, the second space and/or second window may achieve a one-to-one correspondence with the participants through dynamic allocation; e.g., when a participant enters the virtual space, the system automatically creates a corresponding second space and/or second window for that participant. As another example, a second space and/or second window created by a participant corresponds to that participant.
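By way of illustration only, a minimal sketch of such dynamic one-to-one allocation is given below, assuming Python; the class and field names are hypothetical and not part of the claimed system.

```python
class VirtualSpace:
    """Minimal sketch of dynamically allocating one second window per participant."""

    def __init__(self):
        self.second_windows = {}  # participant id -> window description

    def on_participant_enter(self, participant_id):
        # Automatically create a corresponding second window when a participant enters
        self.second_windows.setdefault(
            participant_id, {"owner": participant_id, "shared_data": None})

    def share(self, participant_id, data):
        # Present the participant's shared data through his or her own window
        self.second_windows[participant_id]["shared_data"] = data

space = VirtualSpace()
space.on_participant_enter("doctor_A")
space.share("doctor_A", "pathology_image_001")
print(space.second_windows["doctor_A"])
```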
In some embodiments, the participant may upload the data to be shared received or stored by the terminal to a server of the system, and the other participants may download the shared data from the server according to the requirement. For details of the method of sharing data, reference may be made to fig. 3 and its associated description.
The XR-based multi-person online live broadcast method can realize first-person-perspective immersive interactive operation for participants in the virtual space, increase the participants' interest in learning and their proficiency in mastering skills, and solve problems such as the inability to achieve the best guidance effect due to limitations of the actual space and the number of people. In addition, the data shared by the participants can be visually displayed in the virtual space, which facilitates information synchronization among the participants, thereby improving the efficiency and effect of discussion, guidance, and the like.
Fig. 6 is an exemplary flow diagram illustrating real-time updating of location information according to some embodiments of the present description. In some embodiments, the flow 600 may be performed by the positioning module 220.
In some embodiments, the participant terminal may scan the actual space where the participant is located, and perform spatial positioning on the participant; for the participant who completed the scan, the participant terminal may determine the participant's real-time location data in real space 610.
In some embodiments, the participant terminal may scan the actual space in which the participant is located based on a variety of ways, for example, the participant may hold the terminal in his hand and scan the environment surrounding the actual space using the depth camera of the terminal.
For another example, the participant terminal may obtain an anchor point of the actual space and perform multi-point spatial scanning on a specific plane of the actual space. If the anchor point of the actual space is successfully located, the spatial positioning is successful; if the scanning fails, the participant is reminded that the spatial scanning is not finished.
In some embodiments, the participant terminal may determine the real-time location data 610 of the participant in real space based on a variety of ways. For example, the participant terminal may draw a spatial profile after the actual spatial scan is completed, and determine the real-time location data 610 of the participant according to the relative location information of the participant and the spatial reference object. For another example, the participant terminal may also directly obtain the real-time location data 610 of the participant according to a positioning method such as GPS.
In some embodiments, the participant terminal may determine first movement information 620 of the participant in the real space based on the real-time location data 610.
The first movement information 620 refers to information generated as the participant moves in a real space. The first movement information may include movement information of the participant's position, distance, altitude, etc. in real space.
In some embodiments, the first movement information 620 may be obtained in a variety of ways, for example, the first movement information may be determined based on real-time location data of the participant in the real space, and when the real-time location data of the participant changes, the participant terminal may calculate corresponding first movement information according to the changed data.
As another example, the determination may be made by the positioning of the anchor point of the participant in real space, i.e., may be determined by the movement information of the anchor point. For more on obtaining the first movement information, reference may be made to fig. 4 and its related description.
In some embodiments, the server may determine the initial position information 660 of the virtual character in the virtual space.
For example, after scanning the real space in which the participant is located, the participant terminal may determine the position data of the current participant in the real space, and the server may acquire the position data and map it to the initial position information of the virtual character in the virtual space. For another example, the initial position information may be preset by the server. For further explanation regarding determining the initial position information of the virtual character in the virtual space, reference may be made to fig. 4 and its associated description.
In some embodiments, the participant terminal may obtain first action information 630 of the participant in the real space. The first action information 630 refers to the body action information generated by the participant in the real space, and may include limb action information (e.g., extending the arms, shaking the body, walking, squatting), facial expression information (e.g., blinking, opening the mouth), and so on. In some embodiments, the first action information 630 includes sub-action information of various parts of the participant's body.
The sub-motion information refers to specific motion information of each part of the body of the participant, for example, leg motion information and arm motion information in a running motion. In some embodiments, the participant terminal may divide the participant's motion into sub-motions of a plurality of body parts, thereby obtaining sub-motion information of the respective body parts.
In some embodiments, when a participant performs an action, at least one body part participates in or constitutes the action, for example, when the participant performs a running action, the feet, legs, arms, etc. of the participant generate corresponding actions, wherein the feet and legs can be used as core parts in the running action. The core parts of the participants may be different for different scenes and different actions. For more description of the different scenarios and core regions, reference may be made to fig. 6 and its associated description.
In some embodiments, the first action information and the sub-action information may be obtained in a variety of ways. For example, the information may be obtained by a camera, a wearable device, a sensor, or the like. Specifically, the camera can capture real-time image information of the participant, and the change of each part of the body of the participant can be obtained by processing the real-time image so as to obtain first action information and sub-action information; the wearable device can be fixedly connected with elements such as a displacement sensor and an angle sensor at the corresponding joint moving part and is used for acquiring the change information of each part of the body of the participant and converting the change information into first action information and sub-action information.
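By way of illustration only, the sketch below shows one way per-joint sensor changes might be grouped into sub-action information per body part, assuming Python; the sensor reading format, the joint-to-part mapping, and the function name are hypothetical assumptions.

```python
# Assumed raw readings from wearable displacement/angle sensors, keyed by joint
sensor_readings = {
    "right_wrist": {"displacement_cm": 12.0, "angle_deg": 15.0},
    "left_knee":   {"displacement_cm": 3.0,  "angle_deg": 5.0},
}

# Assumed mapping from joints to body parts
JOINT_TO_PART = {"right_wrist": "arm", "left_knee": "leg"}

def to_sub_action_info(readings):
    """Convert per-joint sensor changes into sub-action information per body part."""
    sub_actions = {}
    for joint, change in readings.items():
        part = JOINT_TO_PART.get(joint, "other")
        sub_actions.setdefault(part, []).append({joint: change})
    return sub_actions

first_action_info = to_sub_action_info(sensor_readings)
print(first_action_info)  # {'arm': [...], 'leg': [...]}
```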
In some embodiments, the participant terminal may synchronously update the second movement information 670 and/or the second motion information 680 of the virtual character corresponding to the participant based on the first movement information 620 and/or the first motion information 630 through a preset 3D coordinate position algorithm.
In some embodiments, synchronously updating the second movement information and/or the second action information of the virtual character corresponding to the participant based on the first movement information and/or the first action information may include updating the second movement information based on the first movement information, updating the second movement information and the second action information based on the first movement information, updating the second action information based on the first action information, updating the second movement information and the second action information based on the first movement information and the first action information, and so on.
In some embodiments, the participant terminal may synchronously update the second movement information and/or the second action information of the virtual character corresponding to the participant based on the first movement information and/or the first action information through various methods. For example, the participant terminal may scan the real space in real time to obtain first movement information and/or first motion information of the participant and transmit the first movement information and/or the first motion information to the server, and perform coordinate conversion by using a preset 3D coordinate position algorithm to combine the first movement information and/or the first motion information of the participant with the data of the virtual character, thereby obtaining second movement information and/or second motion information of the corresponding virtual character.
For example, if the participant terminal acquires first movement information indicating that the participant moves forward by two meters in the real space and first action information indicating that the participant walks with a step length of 70 cm while swinging the hanging arms by 15°, the above data may be combined with the virtual character through coordinate conversion using the preset 3D coordinate position algorithm, to obtain second movement information indicating that the virtual character moves forward by two meters in the virtual space and second action information indicating that the virtual character walks with a step length of 70 cm while swinging the hanging arms by 15°.
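By way of illustration only, a minimal sketch of this synchronization step is given below, assuming Python with numpy. The dictionary fields, the identity matrix, and the function name sync_avatar are assumptions for illustration rather than the actual synchronization protocol.

```python
import numpy as np

def sync_avatar(first_movement, first_action, projection_matrix):
    """Combine the participant's real-space movement and action information
    with the virtual character through coordinate conversion (names assumed)."""
    second_movement = {
        # Convert the real-space displacement into virtual-space movement
        "displacement": (projection_matrix
                         @ np.asarray(first_movement["displacement"], dtype=float)).tolist()
    }
    # Action information (step length, joint angles, ...) is replayed on the avatar as-is
    second_action = dict(first_action)
    return second_movement, second_action

move, act = sync_avatar(
    {"displacement": [0.0, 2.0, 0.0]},            # two meters forward
    {"step_length_cm": 70, "arm_swing_deg": 15},  # walking gait from the example above
    np.eye(3),
)
print(move, act)
```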
In some embodiments, the server may determine at least one core body part of the participant based on the current scene; determining presentation priorities 640 of sub-action information for various parts of the participant's body based on the at least one core body part; determining a display parameter 650 of the action information based on the display priority of the sub-action information, wherein the display parameter 650 comprises display frequency and display precision; second action information 680 for the avatar corresponding to the participant is synchronized based on the presentation parameters.
The current scene refers to a scene in the current virtual space, such as an academic conference, a remote consultation, and the like, and more description about the scene can be given with reference to fig. 1 and the related content thereof.
The core body part refers to the body part that is most important to the participant's current action. For example, in a surgical guidance scene, the core part of the surgeon performing the surgical operation may be the hand; for another example, during questioning, the core part of the person being questioned may be the face.
In some embodiments, the core body part may be determined in various ways. For example, a comparison table of the core body parts corresponding to different stages of different scenes may be preset, and the core body part in the current scene is determined based on the preset comparison table. For another example, the core body part may be determined according to the continuous moving time of each part, and the part with the longer continuous moving time may be considered the core body part carrying the participant's current main action.
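By way of illustration only, the sketch below combines the two approaches just described, assuming Python; the table contents, scene names, and function name are hypothetical assumptions.

```python
# Assumed preset comparison table: scene -> core body parts
CORE_PART_TABLE = {
    "surgical_guidance": ["hand"],
    "questioning": ["face"],
}

def core_body_parts(scene, moving_time_by_part=None):
    """Determine the core body part(s) for the current scene."""
    if scene in CORE_PART_TABLE:
        return CORE_PART_TABLE[scene]
    # Fallback: the part with the longest continuous moving time carries the main action
    if moving_time_by_part:
        return [max(moving_time_by_part, key=moving_time_by_part.get)]
    return []

print(core_body_parts("surgical_guidance"))                  # ['hand']
print(core_body_parts("lecture", {"arm": 12.0, "leg": 3.5}))  # ['arm']
```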
Presentation priority 640 refers to the priority with which the sub-action information of the various parts of the participant's body is presented. The presentation priority 640 may be represented by a ranking or a level. For example, a numerical value of 1–10 may reflect the ranking of the presentation priority, where a smaller value indicates a higher ranking and the corresponding sub-action information is presented preferentially. For another example, a numerical value of 1–10 may reflect the level of the presentation priority, where a larger value indicates a higher level and the corresponding sub-action information is presented preferentially.
In some embodiments, the server may preset the display priority of each part in different scenes, and in actual application the sub-action information may be displayed based on the preset display priority comparison table. For example, in a live surgery scene, the display priority of the hand sub-action information may be preset to be the highest, that of the arm sub-action information second, and that of the leg sub-action information the lowest, so that during the actual surgical live broadcast the doctor's actions are displayed based on the preset priority comparison table. In some embodiments, the presentation priority may also be determined based on scene information and the action information of various parts of the body. For details of determining the presentation priority, reference may be made to fig. 7 and its description.
The presentation parameter 650 refers to a parameter related to the sub-action presentation, for example, the presentation parameter may include a presentation frequency and a presentation precision of the action.
The presentation frequency refers to the update frequency of the sub-actions. For example, the display frequency range may be set to 30-60 Hz for low display frequency, 60-90 Hz for medium display frequency, and 90-120 Hz for high display frequency. In some embodiments, the presentation frequency may be a fixed numerical option or may be freely varied within the presentation frequency range. In some embodiments, the presentation frequency may be preset by the server, and may also be determined based on the presentation priority of the sub-action information, such as the greater the presentation priority, the higher the presentation frequency.
The display precision refers to the display precision of the sub-action, and the display precision can be expressed by pixels. For example, 1280 × 720 pixels may be used for smooth display accuracy, 1920 × 1080 pixels may be used for standard display accuracy, and 2560 × 1440 and above pixels may be used for high definition display accuracy. In some embodiments, the presentation precision may be preset by the server, and may also be determined based on the presentation priority of the sub-action information, for example, the greater the presentation priority, the higher the presentation precision.
In some embodiments, the presentation parameters may be determined based on the presentation priority of the sub-action information, and sub-actions with higher priority may be presented with higher frequency and precision. For example, during an operation, the changes of the doctor's hand actions are displayed with higher display frequency and precision, while the actions of other parts, such as shaking of the body, may be displayed with a lower display frequency; for another example, the changes of facial expression of the person being questioned during questioning may be presented with higher presentation frequency and precision.
In some embodiments, the server may preset a parameter table corresponding to different display priorities; for example, the display frequency corresponding to the first display priority may be preset to 120 Hz with a display precision of 2560 × 1440 pixels, and the display frequency corresponding to the second display priority to 90 Hz with a display precision of 1920 × 1080 pixels. In some embodiments, the presentation parameters may also be set by the participant himself or herself.
In some embodiments, after the display parameters of each part of the participant are determined based on the display priority of the sub-action information, the server may obtain the display parameter data of each part and then synchronously update the second action information of the virtual character according to the display parameters of the different parts. For example, the second action information of the corresponding part is collected and updated according to the display frequency of the different parts (e.g., for the same participant, the hand display frequency is 120 Hz and the leg display frequency is 60 Hz); for another example, the second action information is displayed according to the display precision of the different parts (e.g., for the same participant, the hand display precision is 2560 × 1440 pixels and the leg display precision is 1280 × 720 pixels).
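By way of illustration only, a minimal sketch of looking up per-part presentation parameters from such a preset table is given below, assuming Python; the table values and function name are illustrative assumptions drawn from the examples above.

```python
# Assumed preset table: presentation priority level -> (frequency in Hz, precision in pixels)
PRIORITY_PARAMS = {
    1: {"frequency_hz": 120, "precision": (2560, 1440)},
    2: {"frequency_hz": 90,  "precision": (1920, 1080)},
    3: {"frequency_hz": 60,  "precision": (1280, 720)},
}

def presentation_params(priority_by_part):
    """Map each body part's presentation priority to its presentation parameters."""
    return {part: PRIORITY_PARAMS[p] for part, p in priority_by_part.items()}

# Example for a surgical live scene: hands first, arms second, legs last
params = presentation_params({"hand": 1, "arm": 2, "leg": 3})
print(params["hand"]["frequency_hz"], params["leg"]["precision"])  # 120 (1280, 720)
```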
In some embodiments, the position information may be updated based on the second movement information through a preset 3D coordinate position algorithm, and the detailed description may refer to fig. 4 and its related contents.
By determining the display parameters according to the display priority to synchronize the second action information, more important action changes are displayed with high frequency and high precision while less important action changes are displayed with lower frequency and precision, so that the action display effect is ensured and server resources are effectively saved.
According to the position updating method shown in fig. 6, the action information and the movement information of the participant in the actual space can be accurately mapped in the virtual space in real time, so that the remote participant can watch and know the operation details of the operator in the actual space in real time, real-time guidance information is provided, interference caused by unnecessary actions is avoided, and the immersive experience of the participant is improved.
Fig. 7 is an exemplary diagram of determining presentation priorities of sub-action information according to some embodiments of the present description. In some embodiments, the flow 700 may be performed by the positioning module 220.
In some embodiments, the presentation priority of sub-action information may be implemented based on a processing model.
In some embodiments, the processing model may be used to determine the presentation priority of the sub-action information. The processing model may be a machine learning model; for example, the processing model may include a convolutional neural network (CNN) model and a deep neural network (DNN) model.
In step 710, motion trajectories of various parts of the body and motion feature vectors of various parts of the body can be determined through a convolutional neural network model based on the motion images.
A motion image may refer to an image of the sub-actions of various parts of the participant's body, acquired by a data acquisition device in the actual space where the participant is located. For example, the motion image may be an action video or picture of participant A taken by a panoramic camera.
In some embodiments, the convolutional neural network model may be used to process the at least one motion image to determine at least one motion trajectory and motion feature vector corresponding to the motion image.
The motion trajectory may refer to the motion trajectory of a participant's body part. The motion trajectory may be represented by a sequence or matrix of position coordinates of the corresponding body part at successive time points, where each sequence or matrix element may represent the position coordinates of the center position of a body part at the corresponding time instant. For example, the motion trajectory sequence may be ((1, 0), (1, 1), (1, 2)), where (1, 0), (1, 1), and (1, 2) are the position coordinates of participant A's right hand at three consecutive time points, respectively.
The motion feature vector may refer to a feature vector of the action of each part of the body. The elements of the motion feature vector may include the name of the part, the importance of the part's action in each scene, the frequency with which the part's action occurs, and so on. A motion image may yield multiple parts, that is, there may be multiple part names. The action of each part may be preset with different degrees of importance according to different scenes. The frequency of the part's action may be represented by the number of times the action occurs within a preset time period. For example, in a training course, the teacher's finger actions and facial actions may be set to a higher degree of importance. For example only, the motion feature vector may be (1, 40, 3), where 1 may represent the hand, 40 may represent the importance of the hand in the current scene, and 3 may represent that the hand has moved 3 times.
In some embodiments, the deep neural network model may be used to process the motion trajectory, motion feature vector, and scene information to determine presentation priority of the sub-motion information.
The scene information may be represented in a variety of forms, for example, by a vector. Each element in the scene information may correspond to one scene according to a preset correspondence between scenes and numbers and/or letters. For example, a 1 in the scene vector (1) may represent a training scene.
In step 720, the display priority of the sub-action information can be determined through the deep neural network model based on the scene information, the action tracks of the parts of the body and the action characteristic vectors of the parts of the body.
In some embodiments, the deep neural network model may be configured to process at least one motion trajectory, motion feature vector, and scene information corresponding to the motion image to determine a presentation priority of the sub-motion information. Details regarding the presentation priority of the sub-action information may be found in the description of other parts of this specification, for example, fig. 6.
In some embodiments, the processing model may be obtained by jointly training the convolutional neural network model and the deep neural network model. For example, a training sample, i.e., a historical motion image, is input into the initial convolutional neural network model to obtain at least one historical motion trajectory and at least one historical motion feature vector corresponding to the historical motion image; the output of the initial convolutional neural network model and the historical scene information corresponding to the historical motion image are then used as the input of the initial deep neural network model. During training, a loss function is established based on the labels of the training samples and the output of the initial deep neural network model, and the parameters of the initial convolutional neural network model and the initial deep neural network model are iteratively updated at the same time based on the loss function until a preset condition is met and the training is completed. The parameters of the convolutional neural network model and the deep neural network model in the trained processing model are thereby determined.
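By way of illustration only, a minimal sketch of such a jointly trained CNN + DNN processing model is given below, assuming Python with PyTorch. The network sizes, trajectory length, feature dimension, number of priority levels, and the dummy data are all assumptions for illustration, not the model parameters of the present disclosure.

```python
import torch
import torch.nn as nn

class TrajectoryCNN(nn.Module):
    """Maps a motion image to a motion trajectory and a motion feature vector."""
    def __init__(self, traj_len=8, feat_dim=3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.traj_head = nn.Linear(32, traj_len * 2)   # (x, y) per time point
        self.feat_head = nn.Linear(32, feat_dim)       # part / importance / frequency

    def forward(self, image):
        h = self.backbone(image)
        return self.traj_head(h), self.feat_head(h)

class PriorityDNN(nn.Module):
    """Maps trajectory + feature vector + scene information to a presentation priority."""
    def __init__(self, traj_len=8, feat_dim=3, scene_dim=1, n_levels=10):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(traj_len * 2 + feat_dim + scene_dim, 64), nn.ReLU(),
            nn.Linear(64, n_levels),
        )

    def forward(self, traj, feat, scene):
        return self.mlp(torch.cat([traj, feat, scene], dim=-1))

cnn, dnn = TrajectoryCNN(), PriorityDNN()
optimizer = torch.optim.Adam(list(cnn.parameters()) + list(dnn.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One joint training step on dummy data (historical images, scenes, priority labels)
images = torch.randn(4, 3, 64, 64)
scenes = torch.ones(4, 1)
labels = torch.randint(0, 10, (4,))
traj, feat = cnn(images)
loss = loss_fn(dnn(traj, feat, scenes), labels)  # loss built on the DNN output and labels
optimizer.zero_grad()
loss.backward()                                  # gradients flow through both models
optimizer.step()
```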
In some embodiments, the training samples may be obtained based on historical motion images acquired by the data acquisition device and historical scene information corresponding thereto. The label of the training sample may be a historical presentation priority of the corresponding sub-action information. The labels may be manually labeled.
Determining the display priority of the sub-action information through a machine learning model can improve both the speed and the accuracy of determining the display priority.
In some embodiments, the presentation priority of sub-action information may be implemented by a vector database. Specifically, a scene action vector may be constructed based on the scene information and the sub-action information of each part of the body of the participant, then a reference vector is retrieved in the vector database based on the scene action vector, and the display priority of the sub-action information corresponding to the reference vector is taken as the current priority.
The sub-action information may be represented by a sub-action information vector. The elements in the sub-action information vector may represent body part names and the corresponding actions, and different actions may be represented by different numbers or letters. For example, in the sub-action information vector (1, 2), 1 represents the hand and 2 represents that the hand action is making a fist. In some embodiments, the scene information and the sub-action information may be combined to determine a scene action vector. The scene action vector may be a multi-dimensional vector; for example, in the scene action vector (a, b), a may represent a consultation scene and b may represent the sub-action information vector.
In some implementations, the scene action vector may be obtained by an embedding layer. The embedding layer may be a machine learning model, for example, the embedding layer may be a Recurrent Neural Network (RNN) model or the like. The input of the embedding layer can be scene information, sub-action information of each part of the body of the participant, and the output can be a scene action vector.
A vector database may refer to a database containing historical scene action vectors. In some embodiments, the preset database includes the historical scene action vectors and the presentation priorities of the sub-action information corresponding to the historical scene action vectors.
The reference vector may refer to a historical scene motion vector having a similarity to the scene motion vector exceeding a preset threshold. For example, if the preset threshold is 80%, and the similarity between the historical scene motion vector 1 and the scene motion vector in the vector database is 90%, the historical scene motion vector 1 is a reference vector. In some embodiments, the reference vector may be a historical scene action vector that is most similar to the scene action vector.
The similarity between a historical scene action vector and the scene action vector may be determined based on the vector distance between the two vectors. The vector distance may include a Manhattan distance, a Euclidean distance, a Chebyshev distance, a cosine distance, a Mahalanobis distance, and the like, and may be calculated by substituting the vector values into the formula corresponding to the chosen distance type.
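By way of illustration only, the sketch below retrieves a reference vector by cosine similarity against a toy vector database, assuming Python with numpy; the threshold value, database contents, and function names are illustrative assumptions.

```python
import numpy as np

def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def lookup_priority(scene_action_vec, vector_db, threshold=0.8):
    """Retrieve the most similar historical scene action vector above the
    threshold and reuse its presentation priority."""
    best = None
    for hist_vec, priority in vector_db:
        sim = cosine_similarity(scene_action_vec, hist_vec)
        if sim >= threshold and (best is None or sim > best[0]):
            best = (sim, priority)
    return best[1] if best else None

# Toy vector database: (historical scene action vector, presentation priority)
db = [([1.0, 2.0, 0.0], 1), ([0.0, 1.0, 3.0], 2)]
print(lookup_priority([1.0, 2.1, 0.1], db))  # 1
```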
In some embodiments, the embedding layer may be obtained by training jointly with a deep neural network model. A training sample is input into the initial embedding layer to obtain a scene action vector; the output of the initial embedding layer is then used as the input of the initial deep neural network model. During training, a loss function is established based on the labels and the output of the initial deep neural network model, and the parameters of the initial embedding layer and the initial deep neural network model are iteratively updated at the same time based on the loss function until a preset condition is met and the training is completed. The parameters of the embedding layer and the deep neural network model are determined when the training is finished.
In some embodiments, the training samples may be historical scene information, historical sub-action information for various parts of the participant's body. The label of the training sample may be a historical presentation priority of the corresponding sub-action information. The labels may be manually labeled.
Obtaining the parameters of the embedding layer through this training mode solves the problem that labels are difficult to obtain when the embedding layer is trained alone, and allows the embedding layer to produce scene action vectors that well reflect the scene information and sub-action information.
Presetting a vector database based on historical data and using it to determine the display priority of the sub-action information makes the determined display priority better match the actual situation.
Fig. 8 is a flow diagram of an exemplary flow 800 of a data processing method for XR, in accordance with some embodiments of the present description. The flow 800 may be performed by the presentation module 240 and the generation module 250.
Step 810, creating a canvas within the virtual space in response to a request of an annotation requester.
The annotation requester refers to a participant who makes an annotation request.
In some embodiments, the annotation requester may send an annotation request at the terminal and be received by the server.
The canvas refers to a canvas showing contents to be marked in a virtual scene. The canvas may take a variety of forms, for example, a three-dimensional canvas, and the like.
In some embodiments, upon receiving the request of the annotation requester, the server can create a default canvas in the virtual space. In some embodiments, the annotation requester can change the size and shape of the canvas through manual operation or preset options, and in some embodiments, the annotation requester can also drag the canvas to move as needed.
Step 820, displaying the content to be marked on the canvas, wherein the content to be marked is marked data and/or unmarked original data.
In some embodiments, the content to be tagged may be derived from a variety of data. For example, the content to be tagged may be data that the participant has prepared in advance. For another example, the content to be marked may further include shared data corresponding to the scene, for example, a three-dimensional image model, a pathological image, a video, and the like of the patient when pathological data is shared. The shared data in different scenes are different, and the content to be marked is also different, and for more description of the content to be marked in different scenes, reference may be made to the related content in fig. 5.
The marked data refers to data with history marks. In some embodiments, the annotation can continue for the data that was annotated. In some embodiments, when data is labeled secondarily, whether to display the history labeling condition can be selected.
The unmarked raw data refers to data without history marks, such as raw data downloaded by a participant from a server, real-time data in live broadcast, and the like.
In some embodiments, the content to be marked may be content displayed in any window, at any position, among the plurality of windows on the terminal of the annotation requester.
In some embodiments, different presentation modes can be adopted for different contents to be marked. For example, pictures can be presented statically, and videos can be presented dynamically using a video source. In some embodiments, different display modes may also be adopted for different terminals; for example, a VR device may perform 3D display through different pictures for the two eyes and an appropriate interpupillary distance, a mobile terminal device may display through its screen, and a computer terminal may display through a monitor.
Step 830, obtaining the marking information created on the canvas by the annotation requester using the ray interaction system, wherein the marking information includes the marking content and the marking path.
The marking information is information generated by marking the content to be marked. In some embodiments, the marking information may further include the marking time, the marking position information corresponding to the marking time, and the like. For example, the marking time is nine a.m., and at that time the mark is located at the virtual space coordinates (20, 30, 40).
The marked content refers to specific content marked on the content to be marked, and the marked content can include content drawn by a brush, an inserted picture, operation of adjusting size and the like.
The marking path refers to the stroke path of the mark. For example, if the annotation requester writes the Chinese character "人" (person) on the canvas, the marking path consists of its left-falling stroke and right-falling stroke.
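By way of illustration only, a minimal sketch of a marking information record is given below, assuming Python; the field names and example values are hypothetical and not a definition of the data format used by the system.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MarkInformation:
    """Marking information created on the canvas via the ray interaction system."""
    requester_id: str
    mark_content: str                                    # e.g. brush drawing, inserted picture
    mark_path: List[Tuple[float, float, float]] = field(default_factory=list)  # stroke points
    mark_time: str = ""                                  # e.g. "09:00"
    mark_position: Tuple[float, float, float] = (0.0, 0.0, 0.0)  # virtual-space coordinates

mark = MarkInformation(
    requester_id="expert_01",
    mark_content="brush_stroke",
    mark_path=[(20.0, 30.0, 40.0), (20.5, 30.2, 40.0)],
    mark_time="09:00",
    mark_position=(20.0, 30.0, 40.0),
)
print(mark.mark_time, mark.mark_position)
```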
The ray interaction system refers to a system used for marking. In some embodiments, through the ray interaction system, the participant may point a ray at the content to be marked presented on the canvas and mark it through gesture operations such as touching and pressing.
In some embodiments, the terminal can automatically save the annotation information of the annotation requester to the local in real time. In some embodiments, the annotation requester can actively select to save the annotation information by touching the canvas, clicking a button, and the like.
Step 840, sharing the content to be marked and the marking information to the terminals of the other participants for display.
In some embodiments, the terminal may collect the marking information of the corresponding participant and upload it to the server; the server then sends the marking information to the other participant terminals, and a display window is created in each of the other participant terminals, so that the content to be marked and the marking information are shared to the terminals of the other participants for display.
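By way of illustration only, a minimal in-memory sketch of this fan-out is given below, assuming Python; the data structures, identifiers, and function name are hypothetical and stand in for the actual terminal-server communication.

```python
def share_marks(server_store, sender_id, content_to_mark, mark_information, terminals):
    """Upload the annotation from one terminal and fan it out to all other
    participant terminals, which create a presentation window for it."""
    server_store.append((sender_id, content_to_mark, mark_information))
    for terminal_id, inbox in terminals.items():
        if terminal_id != sender_id:
            # Each receiving terminal creates a display window for the shared annotation
            inbox.append({"window": "new", "content": content_to_mark,
                          "marks": mark_information})

server_data_center = []
participant_terminals = {"expert_01": [], "surgeon_02": [], "student_03": []}
share_marks(server_data_center, "expert_01", "heart_model_v2",
            ["brush_stroke"], participant_terminals)
print(len(participant_terminals["surgeon_02"]))  # 1
```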
In some embodiments, the content to be marked and the marking information thereof may be shared to terminals of other participants for displaying based on the display setting, and personalization of displaying may be achieved by displaying based on the display setting, so as to meet requirements of the participants, which may specifically refer to fig. 9 and related content thereof.
The data processing method for XR shown in fig. 8 can realize real-time annotation of shared data, and meanwhile, participants can perform operations such as drawing insertion, brush color change, size adjustment, cancellation, emptying and the like on a canvas, and store operation results to the local, so that reference, summary and comparison can be performed in the future. In addition, the annotation information can be shared with other participants, so that the participants can conveniently discuss complex problems.
Fig. 9 is a schematic diagram of an exemplary flow 900 of presenting content to be marked, shown in some embodiments herein. The flow 900 may be performed by the presentation module 240.
Step 910, obtaining a display setting of the annotation requester, where the display setting includes real-time mark display and display after the mark is completed.
The display setting refers to setting related to displaying the content to be marked and the marking information. The presentation settings may also include 3D position information of the presentation window, size, color, precision, etc. of the presentation picture.
Real-time mark display means that the marking process of the annotation requester is synchronized to the other participants in real time, i.e., the other participants can see the creation process of the marking information. Display after the marking is completed means that only the final result of the marking is shared with the other participants, i.e., the other participants obtain the result after the marking is finished but cannot see the creation process of the marking information.
In some embodiments, the presentation settings may be determined by the terminal by default, e.g., the terminal marks the presentation after the default is completed. In some embodiments, the presentation setting may also be determined by the participant's selection of an option of the terminal presentation setting window, for example, the participant may click on the option of the live markup presentation by clicking, touching, or the like. In some embodiments, the terminal may record the participant's presentation settings and transmit the setting data to the server.
Step 920, sharing the content to be marked and its marking information to the terminals of the other participants for display based on the display setting.
In some embodiments, the server may share the content to be marked and the marking information thereof to terminals of other participants for display according to the display setting. For example, the server may obtain the presentation setting selected by the participant as a marker presentation, may also obtain the content to be marked of the participant and the marker information thereof, and may further transmit the real-time data of the content to be marked and the marker information thereof to the terminals of other participants for real-time presentation according to the presentation setting.
In some embodiments, the presentation manner may be different for different scenes. For example, during surgical guidance, in order to avoid affecting the surgical procedure, the display of the content to be marked and its marking information needs to avoid the patient's surgical site, the device's display screen, and the like. For another example, in training and teaching, in order to ensure that every participant can clearly see the content to be marked and its marking information, a display window may be created in front of each participant; in an academic lecture scene, a single large display window may be created for the presentation.
In some embodiments, sharing the content to be marked and the marking information thereof to terminals of other participants for display includes: determining perspective information for each of the other participants based on the location information for each of the participants; and determining the display content of each participant based on the visual angle information of each participant, and displaying, wherein the display content comprises the content to be marked and/or the marking information under the visual angle information.
The visual angle information refers to the visual angle information of the participant relative to the content to be marked and the marking information thereof. The perspective information may include azimuth, angle, altitude, distance, etc. The participants are located at different positions, and the corresponding visual angle information is different. For example, for the same presentation model, the perspective information of the participant located at the right front of the presentation model mainly includes partial right view information and partial front view information of the presentation model; the perspective information of the participants positioned at the upper left of the display model mainly comprises partial left view information and partial top view information of the display model.
In some embodiments, the server may obtain the position information of the participant from the terminal, compare it with the display position information of the content to be marked and its marking information, determine the relative position between the participant and the display position, and thereby determine the perspective information. For example, three-dimensional space coordinates (x, y, z) may be constructed in the virtual space. When a three-dimensional image such as a model is displayed, suppose a participant stands facing the y direction at position coordinates (1, 1, 1) and the position coordinates (e.g., center coordinates) of the displayed model are (1, 2, 2); the relative position between the displayed model and the participant can be obtained from the two sets of coordinates, from which it follows that the participant can see part of the front view of the model and information at the bottom of the model, and the server can determine the specific perspective information through an algorithm. For another example, if another participant has position coordinates (1, 0, 1), the distance to the displayed model in that participant's perspective information is greater than that of the participant at (1, 1, 1), i.e., the model appears smaller to that participant than to the participant at (1, 1, 1).
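By way of illustration only, a minimal sketch of deriving coarse perspective information from the two sets of coordinates is given below, assuming Python with numpy; the angle convention and function name are illustrative assumptions rather than the algorithm used by the server.

```python
import numpy as np

def view_angle_info(participant_pos, display_pos):
    """Derive coarse perspective information (distance, direction) of a participant
    relative to a presented model from the two position coordinates."""
    rel = np.asarray(display_pos, dtype=float) - np.asarray(participant_pos, dtype=float)
    distance = float(np.linalg.norm(rel))
    # Azimuth in the horizontal (x, y) plane and elevation towards z, in degrees
    azimuth = float(np.degrees(np.arctan2(rel[0], rel[1])))
    elevation = float(np.degrees(np.arcsin(rel[2] / distance))) if distance else 0.0
    return {"distance": distance, "azimuth_deg": azimuth, "elevation_deg": elevation}

# Example from the text: participant at (1, 1, 1) facing +y, model centred at (1, 2, 2)
print(view_angle_info((1, 1, 1), (1, 2, 2)))  # model straight ahead and 45 degrees above
```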
In some embodiments, the presentation content visible to each participant at his or her viewing angle can be calculated based on that participant's perspective information, and that content is presented. For example, if the perspective information indicates that a participant is located to the right of the presented 3D model, it is determined that the participant can see the right view of the model, and the right view is presented to the participant. In some embodiments, the content a participant can see is also related to distance; for example, for a participant who is farther away, the corresponding presented content may occupy a smaller proportion of the view than for a participant who is closer.
Displaying the content to be marked and its marking information to the other participants according to the display setting facilitates discussion of complex problems; adopting different display settings for different content to be marked and marking information achieves the best display effect and further improves the effect of discussion and guidance. Different display settings also allow personalized presentation that meets the participants' requirements. Meanwhile, determining the display content according to the participants' perspective information provides a more vivid, all-around, and multi-level rendering and display effect and enhances the participants' immersive experience.
Fig. 10 is a schematic diagram of an exemplary flow 1000 of determining predicted presentation content, in accordance with some embodiments of the present description. In some embodiments, the flow 1000 may be performed by the presentation module 240.
Step 1010, predicting a future motion trajectory of the participant.
The future motion profile refers to the motion profile of the participant after the current time. The motion trajectory may comprise time information and corresponding position information, etc. In some embodiments, the time length of the time period corresponding to the future motion trajectory may be set according to requirements. In some embodiments, the time period corresponding to the future motion trajectory may be a time period from the current time or a time period spaced from the current time. For example, the future movement trajectory may be a movement trajectory within 5 seconds in the future, or may be a movement trajectory within 10 minutes in the future. For another example, the future movement trajectory may be a movement trajectory within 5 seconds from the current time, or may be a movement trajectory within 10 minutes after 5 minutes from the current time (i.e., within 5-15 minutes from the current time).
In some embodiments, the future motion trajectory of a participant may be predicted based on the scene and the participant's current sub-action information. For example, in a live surgery scene, when the surgeon picks up an instrument or a suture from the surgical suture package, it can be predicted that the surgeon will suture the surgical site next. In some embodiments, the participant's future motion trajectory may also be predicted based on historical scenes and the historical time elapsed after entering the scene. For example, for teaching scenes with the same subject, if the teacher in historical teaching scenes showed a demonstration action 30 minutes after entering the scene, it can be predicted that the teacher will show a demonstration action 30 minutes into the current teaching scene. For another example, for surgical guidance scenes with the same theme, if in historical surgical guidance scenes the guided person gathered at the bedside to watch the guidance 10 minutes after entering the scene, it can be predicted that the guided person will move around the hospital bed 10 minutes after entering the current surgical guidance scene.
In some embodiments, predicting the future motion trajectory of the participant may be implemented based on a prediction model. The structure of the prediction model is a recurrent neural network model. The input of the prediction model is the positioning data sequence of the participant over a preset historical time period up to the current time, and the output of the prediction model is a sequence of predicted position data for a preset future time period.
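By way of illustration only, a minimal sketch of such a recurrent prediction model is given below, assuming Python with PyTorch. The GRU cell, hidden size, number of future steps, and sampling rate are illustrative assumptions; the disclosure only specifies a recurrent neural network structure.

```python
import torch
import torch.nn as nn

class TrajectoryPredictor(nn.Module):
    """Recurrent prediction model: a positioning data sequence over a preset
    historical time period -> predicted positions for a preset future time period."""
    def __init__(self, hidden_size=64, future_steps=5):
        super().__init__()
        self.future_steps = future_steps
        self.rnn = nn.GRU(input_size=3, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, future_steps * 3)

    def forward(self, positions):           # positions: (batch, history_len, 3)
        _, h = self.rnn(positions)          # h: (1, batch, hidden_size)
        out = self.head(h[-1])              # (batch, future_steps * 3)
        return out.view(-1, self.future_steps, 3)

model = TrajectoryPredictor()
history = torch.randn(1, 120, 3)            # e.g. 120 samples at 1-second intervals
predicted = model(history)                  # (1, 5, 3): positions at 5 future time points
print(predicted.shape)
```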
The preset historical time period refers to a period of time up to the current time. In some embodiments, the preset historical time period may be the period from the time the participant enters the scene to the current time; for example, if the participant enters the virtual scene at nine o'clock and the current time is ten o'clock, then nine to ten o'clock is the preset historical time period. In some embodiments, the preset historical time period may also be the period from the beginning of an action to the current time; for example, if the participant performs a squat action, the preset historical time period is the time from the moment the participant begins to squat to the current time.
In some embodiments, a predetermined historical period of time may also include a plurality of time point information. In some embodiments, the point-in-time information may be information of a single point in time. For example, if the collection interval of the time points is 1 second, and the preset historical time period is 2 minutes in the past, the preset historical time period includes 120 pieces of time point information. For another example, if the collection interval of the time points is 1 minute and the preset historical time period is 1 hour in the past, the preset historical time period includes 60 pieces of time point information.
In some embodiments, one piece of time point information may also correspond to one sub-period. For example, if the preset historical time period is from nine to ten o'clock, the period may be further divided into three sub-periods (e.g., three consecutive twenty-minute sub-periods), that is, the preset time period includes three pieces of time point information. In some embodiments, the collection interval of the time points and the length of the sub-periods may be preset by the server or set by the participant.
The positioning data sequence refers to the positioning data sequence of the participant in the virtual space. The sequence of positioning data may reflect movement of the participant over a preset historical period of time. Each element value in the positioning data sequence corresponds to the position data of the participant at a point in time. For example, in a set of sequences ((1,1,1), (2,2,2), (1,2,1), (1,2,2)), if each coordinate element value corresponds to a single point in time that is one second apart, (1,2,1) indicates the participant's location data within the virtual space at the third second within the preset time period. For another example, in the above sequence, if the time point corresponding to each coordinate element value is a sub-period, (1,2,1) indicates the position data of the participant in the virtual space corresponding to the third sub-period in the preset time period.
In some embodiments, the training data of the prediction model may be multiple groups of labeled training samples, where each training sample may be a historical positioning data sequence of a participant over a preset historical time period up to a certain historical time. The training samples may be derived from historical data stored by the server. The label of a training sample may be the actual position data sequence of the participant in the corresponding preset future time period, and the labels may be obtained by manual labeling.
In some embodiments, a loss function is constructed from the labels and the results of the initial prediction model, and the parameters of the prediction model are iteratively updated by gradient descent or other methods based on the loss function. When a preset condition is met, the model training is finished and a trained prediction model is obtained. The preset condition may be that the loss function converges, that the number of iterations reaches a threshold, and the like.
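By way of illustration only, a minimal training-loop sketch for the prediction model is given below, assuming Python with PyTorch and reusing the TrajectoryPredictor class from the earlier sketch; the dummy data, learning rate, epoch count, and convergence threshold are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Dummy labelled training batch: historical positioning sequences and the actual
# position sequences over the corresponding (historical) future time period.
histories = torch.randn(8, 120, 3)
future_truth = torch.randn(8, 5, 3)

model = TrajectoryPredictor()                       # defined in the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(100):                            # iterate until a preset condition is met
    optimizer.zero_grad()
    loss = loss_fn(model(histories), future_truth)  # loss between labels and model output
    loss.backward()                                 # gradient-descent style update
    optimizer.step()
    if loss.item() < 1e-3:                          # e.g. loss convergence threshold
        break
```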
Step 1020, determining the perspective information at the future time point based on the future motion trajectory.
The future point in time refers to a time after the current time. The future time point is included in the time period corresponding to the future motion trajectory.
The visual angle information refers to the visual angle information of the participant relative to the content to be marked and the marking information thereof. See fig. 9 and its associated description for more about perspective information.
In some embodiments, based on the future motion trajectory of the participant, the position information of the participant at the future time point can be determined; by comparing the position information of the participant at the future time point with the position information of the display window of the content to be marked and its marking information, the relative position between the participant and the display window can be determined, so as to determine the perspective information. For more description of how to determine the perspective information, see fig. 9 and its related content.
Step 1030, determining corresponding predicted display content based on the view information of the future time point.
In some embodiments, the predicted presentation content may include the content to be marked in the virtual scene, the marked content and its marking information, and the like. In some embodiments, when the predicted presentation content is determined, the presentation content that each participant can see at his or her viewing angle may be calculated based on that participant's perspective information at the future time point. For example, when the predicted presentation content is a marked three-dimensional heart model and its marking information, and the perspective information at the future time point indicates that the display window of the content to be marked and its marking information will be located 30° to the front-right of the participant's predicted position at that time, it can be predicted that the participant's presentation content at the future time point will be a side perspective view of the marked content and its marking information at that angle.
In some embodiments, when the predicted presentation content changes in real time, the acquisition of the presentation content at the future time point may be prepared in advance based on each participant's perspective information at that time point. For example, if the content to be marked in the virtual scene is a surgeon's operation and the operation corresponding to the future time point is a chest procedure, the camera at the position corresponding to the participant's perspective information can be put on standby for chest shooting in advance.
By predicting the future motion trajectory of the participant, the presentation content at the corresponding future time point can be prepared in advance, or its acquisition can be prepared in advance, which increases the loading speed and optimizes the participant's experience.
It is to be noted that different embodiments may produce different advantages, and in different embodiments, any one or combination of the above advantages may be produced, or any other advantages may be obtained.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, this specification uses specific words to describe its embodiments. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of this specification. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment," "one embodiment," or "an alternative embodiment" in various places throughout this specification do not necessarily all refer to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of this specification may be combined as appropriate.
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While various presently contemplated embodiments have been discussed in the foregoing disclosure by way of example, it should be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the foregoing description of embodiments of this specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Rather, claimed subject matter may lie in less than all features of a single disclosed embodiment.
Where numerals describing quantities of components or attributes are used in some embodiments, it is to be understood that such numerals are in some instances modified by the terms "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that the stated number allows a variation of ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable.
Each patent, patent application, patent application publication, and other material cited in this specification, such as articles, books, specifications, publications, and documents, is hereby incorporated by reference into this specification in its entirety, except for any application history document that is inconsistent with or conflicts with the contents of this specification, and except for any document (whether now or later attached to this specification) that would limit the broadest scope of the claims of this specification. If the descriptions, definitions, and/or use of terms in the materials accompanying this specification are inconsistent with or contrary to those set forth in this specification, the descriptions, definitions, and/or use of terms in this specification shall prevail.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments described herein. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (10)

1. An XR-based multi-user online live broadcast method comprises the following steps:
establishing communication connection with terminals of at least two participants;
creating a virtual space in which a virtual character corresponding to each of the at least two participants is created;
determining the position information of the virtual character corresponding to the participant in the virtual space based on the acquired position data of the participant in the actual space by a preset 3D coordinate position algorithm;
displaying the virtual character in the virtual space based on the position information of the virtual character;
and acquiring shared data uploaded by the participants, and displaying the shared data in the virtual space.
2. The method of claim 1, wherein the determining, by a preset 3D coordinate location algorithm, the location information of the virtual character corresponding to the participant in the virtual space based on the acquired location data of the participant in the real space comprises:
scanning the actual space where the participant is located, and carrying out space positioning on the participant;
for the participant who completes the scanning, determining real-time position data of the participant in the real space;
determining first movement information of the participant in the real space based on the real-time location data;
determining initial position information of the virtual character in the virtual space;
acquiring first action information of the participant in the actual space; the first action information comprises sub-action information of various parts of the participant's body;
and synchronously updating second movement information and/or second action information of the virtual character corresponding to the participant based on the first movement information and/or the first action information through the preset 3D coordinate position algorithm.
3. The method of claim 2, wherein the step of synchronously updating second movement information and/or second motion information of the avatar corresponding to the participant based on the first movement information and/or the first motion information through the preset 3D coordinate position algorithm comprises:
determining at least one core body part of the participant based on a current scene;
determining presentation priorities for the sub-action information for the participant's body parts based on the at least one core body part;
determining display parameters of the action information based on the display priority of the sub-action information, wherein the display parameters comprise display frequency and display precision;
synchronizing the second action information of the avatar corresponding to the participant based on the presentation parameters.
4. The method of claim 1, further comprising:
creating at least one second space and/or second window in the virtual space, wherein each of the at least one second space and/or second window corresponds to one of the participants;
and displaying the shared data of the corresponding participant through the second space and/or a second window.
5. An XR-based multi-user online live broadcast system, comprising:
the connection module is used for establishing communication connection with the terminals of at least two participants;
a positioning module for creating a virtual space in which a virtual character corresponding to each of the at least two participants is created;
the positioning module is further used for determining the position information of the virtual character corresponding to the participant in the virtual space based on the acquired position data of the participant in the actual space through a preset 3D coordinate position algorithm; and displaying the virtual character in the virtual space based on the position information of the virtual character;
the downloading module is used for acquiring the shared data uploaded by the participants;
and the display module is used for displaying the shared data in the virtual space.
6. The system of claim 5, wherein the location module is further configured to:
scanning the actual space where the participant is located, and carrying out space positioning on the participant;
for the participant who completes the scanning, determining real-time position data of the participant in the real space; determining first movement information of the participant in the real space based on the real-time location data;
determining initial position information of the virtual character in the virtual space;
acquiring first action information of the participant in the actual space; the first action information comprises sub-action information of various parts of the participant's body;
and synchronously updating second movement information and/or second action information of the virtual character based on the first movement information and/or the first action information through the preset 3D coordinate position algorithm.
7. The system of claim 6, wherein the location module is further configured to:
determining at least one core body part of the participant based on a current scene;
determining presentation priorities of the sub-action information for the participant's body parts based on the at least one core body part;
determining display parameters of the action information based on the display priority of the sub-action information, wherein the display parameters comprise display frequency and display precision;
synchronizing the second action information of the avatar corresponding to the participant based on the presentation parameters.
8. The system of claim 5, wherein the presentation module is further configured to:
creating at least one second space and/or second window in the virtual space, wherein each of the at least one second space and/or second window corresponds to one of the participants;
and displaying the shared data of the corresponding participant through the second space and/or a second window.
9. An XR-based multi-user online live broadcast device, the device comprising:
at least one storage medium storing computer instructions;
at least one processor that executes the computer instructions to implement the XR-based multi-user online live broadcast method of any one of claims 1 to 4.
10. A computer readable storage medium storing computer instructions which, when read by a computer, cause the computer to perform the XR-based multi-user online live broadcast method of any one of claims 1 to 4.
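For illustration only, the position mapping and action synchronization recited in claims 1 to 3 might be sketched as below; the linear scale-and-offset mapping and the priority table are assumptions, since the claims do not fix a particular 3D coordinate position algorithm or priority scheme.

import numpy as np

def real_to_virtual(position_real, scale=1.0, offset=(0.0, 0.0, 0.0)):
    # Hypothetical mapping from a participant's real-space position to
    # virtual-space coordinates (an assumed linear transform, for illustration).
    return scale * np.asarray(position_real, dtype=float) + np.asarray(offset, dtype=float)

def presentation_params(body_part, core_parts):
    # Assign display frequency (Hz) and precision per body part, giving the core
    # body parts of the current scene higher priority; the values are assumptions.
    if body_part in core_parts:
        return {"frequency_hz": 60, "precision": "full"}
    return {"frequency_hz": 15, "precision": "coarse"}

# Example: in a surgical teaching scene the hands are treated as core parts and get
# high-frequency, full-precision synchronization; other parts are downsampled.
core = {"left_hand", "right_hand"}
print(real_to_virtual((1.2, 0.0, 3.5), scale=0.5))
print(presentation_params("right_hand", core), presentation_params("torso", core))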
CN202211322357.2A 2022-09-28 2022-09-28 XR-based multi-user online live broadcast and system Pending CN115576427A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211191561.5A CN117826976A (en) 2022-09-28 2022-09-28 XR-based multi-person collaboration method and system

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202211191561.5A Division CN117826976A (en) 2022-09-28 2022-09-28 XR-based multi-person collaboration method and system

Publications (1)

Publication Number Publication Date
CN115576427A (en) 2023-01-06

Family

ID=84980721

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202211191561.5A Pending CN117826976A (en) 2022-09-28 2022-09-28 XR-based multi-person collaboration method and system
CN202211408599.3A Pending CN117111724A (en) 2022-09-28 2022-09-28 Data processing method and system for XR
CN202211322357.2A Pending CN115576427A (en) 2022-09-28 2022-09-28 XR-based multi-user online live broadcast and system

Family Applications Before (2)

Application Number Title Priority Date Filing Date
CN202211191561.5A Pending CN117826976A (en) 2022-09-28 2022-09-28 XR-based multi-person collaboration method and system
CN202211408599.3A Pending CN117111724A (en) 2022-09-28 2022-09-28 Data processing method and system for XR

Country Status (1)

Country Link
CN (3) CN117826976A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117041474A (en) * 2023-09-07 2023-11-10 腾讯烟台新工科研究院 Remote conference system and method based on virtual reality and artificial intelligence technology

Also Published As

Publication number Publication date
CN117111724A (en) 2023-11-24
CN117826976A (en) 2024-04-05

Similar Documents

Publication Publication Date Title
US11663789B2 (en) Recognizing objects in a passable world model in augmented or virtual reality systems
US10672288B2 (en) Augmented and virtual reality simulator for professional and educational training
US20220277506A1 (en) Motion-based online interactive platform
US11928384B2 (en) Systems and methods for virtual and augmented reality
CN109035415B (en) Virtual model processing method, device, equipment and computer readable storage medium
CN115576427A (en) XR-based multi-user online live broadcast and system
Sereno et al. Point specification in collaborative visualization for 3D scalar fields using augmented reality
CN111881807A (en) VR conference control system and method based on face modeling and expression tracking
CN118155465B (en) Immersive virtual simulation experiment platform and method
EP4280226A1 (en) Remote reproduction method, system, and apparatus, device, medium, and program product
He et al. vConnect: Connect the real world to the virtual world
US20240135617A1 (en) Online interactive platform with motion detection
JP2019512173A (en) Method and apparatus for displaying multimedia information
Schäfer Improving Essential Interactions for Immersive Virtual Environments with Novel Hand Gesture Authoring Tools
Lala et al. Enhancing communication through distributed mixed reality
Wu et al. VRAS: A Virtual Rehearsal Assistant System for Live Performance
Li et al. A Method for Transmitting Real Human Skeletal Points to the Virtual Reality Character
Wang et al. Research on Tai Chi APP Simulation System Based on Computer Virtual Reality Technology
Holobar et al. A distributed virtual reality‐based system for neonatal decision‐making training
Andersen Effective User Guidance Through Augmented Reality Interfaces: Advances and Applications
Maia Interactive collaboration platform in augmented reality
CN114967931A (en) Method and device for controlling motion of virtual object and readable storage medium
CN118394215A (en) Digital transmission method for national dance based on virtual digital man technology
Camporesi Immersive virtual human training systems based on direct demonstration
Prabhakaran et al. Message from the Chairpersons

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231211

Address after: No. 2555 Yinzhou Avenue, Yinzhou District, Ningbo City, Zhejiang Province, 315100

Applicant after: NINGBO LONGTAI MEDICAL TECHNOLOGY Co.,Ltd.

Address before: 17 / F, Zhaoying commercial building, 151-155 Queen's Road Central, Hong Kong, China

Applicant before: Intuitive Vision Co.,Ltd.