CN113515193B - Model data transmission method and device - Google Patents

Model data transmission method and device

Info

Publication number
CN113515193B
CN113515193B
Authority
CN
China
Prior art keywords
dimensional model
user
motion state
state information
terminal device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110532603.6A
Other languages
Chinese (zh)
Other versions
CN113515193A (en)
Inventor
刘帅
陈春朋
吴连朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Juhaokan Technology Co Ltd
Original Assignee
Juhaokan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Juhaokan Technology Co Ltd filed Critical Juhaokan Technology Co Ltd
Priority to CN202110532603.6A priority Critical patent/CN113515193B/en
Publication of CN113515193A publication Critical patent/CN113515193A/en
Application granted granted Critical
Publication of CN113515193B publication Critical patent/CN113515193B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4092Image resolution transcoding, e.g. by using client-server architectures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/139Format conversion, e.g. of frame-rate or size
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30Image reproducers
    • H04N13/332Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H04N13/344Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g 3D video
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application provides a model data transmission method and device for solving problems such as long transmission times and rendering delays caused by the large data volume of current three-dimensional models. The method comprises the following steps: receiving motion state information from a second terminal device, wherein the motion state information characterizes the head pose change speed of a user of the second terminal device; adjusting the resolution of the three-dimensional model to be sent to the second terminal device according to the motion state information; and sending the resolution-adjusted three-dimensional model to the second terminal device.

Description

Model data transmission method and device
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for transmitting model data.
Background
In current AR- or VR-based remote communication, users expect ever higher image quality from the display end. As a result, when the acquisition end performs model reconstruction, the resolution of the model keeps increasing and its data volume grows accordingly, so transmission of the data through the cloud takes longer and the display end may receive it with delay. How to reduce the transmission time while guaranteeing rendering quality is therefore worth researching.
Disclosure of Invention
The embodiments of the present application provide a model data transmission method and device, which reconstruct the three-dimensional model according to the actual needs of the display-end user, improving the transmission rate while ensuring rendering quality.
In a first aspect, an embodiment of the present application provides a method for transmitting model data, which is applied to a first terminal device, where the first terminal device and a second terminal device establish video communication, and the method includes:
receiving motion state information from the second terminal device, wherein the motion state information characterizes the head pose change speed of a user of the second terminal device;
adjusting the resolution of the three-dimensional model to be sent to the second terminal device according to the motion state information;
and sending the resolution-adjusted three-dimensional model to the second terminal device.
Based on this scheme, the first terminal device adjusts the resolution of the generated three-dimensional model according to how quickly the position and posture of the head of the user of the second terminal device change, and transmits the resolution-adjusted model. The adjusted model better matches the viewing needs of the user of the second terminal device, reducing dizziness caused by viewpoint changes and movement; at the same time, the data volume of the transmitted model can be controlled, improving transmission efficiency and reducing rendering delay.
In a second aspect, an embodiment of the present application provides a method for transmitting model data, which is applied to a second terminal device, where the second terminal device establishes video communication with a first terminal device, and the method includes:
acquiring motion state information of a user and sending the motion state information to the first terminal device, wherein the motion state information is used for adjusting the resolution of the three-dimensional model to be sent to the second terminal device;
receiving, from the first terminal device, a three-dimensional model whose resolution has been adjusted according to the motion state information;
and rendering the resolution-adjusted three-dimensional model.
Based on this scheme, the second terminal device acquires the user's motion state information in real time, transmits it to the first terminal device, receives the three-dimensional model whose resolution the first terminal device adjusted according to that information, and renders the adjusted model. The rendered content meets the viewing needs of the user of the second terminal device, while the data volume of the three-dimensional model can be controlled, improving data transmission efficiency, reducing rendering delay, and improving the user experience.
In a third aspect, an embodiment of the present application provides a first terminal device, where the first terminal device establishes video communication with a second terminal device, and the first terminal device includes:
a communicator, configured to receive motion state information from the second terminal device, the motion state information characterizing the head pose change speed of a user of the second terminal device;
a processor, configured to adjust the resolution of the three-dimensional model to be sent to the second terminal device according to the motion state information;
the communicator is further configured to send the resolution-adjusted three-dimensional model to the second terminal device.
In a fourth aspect, an embodiment of the present application provides a second terminal device, where the second terminal device establishes video communication with the first terminal device, and the second terminal device includes:
a processor, configured to acquire the motion state information of the user;
a communicator, configured to transmit the motion state information to the first terminal device, the motion state information being used for adjusting the resolution of the three-dimensional model to be sent to the second terminal device;
the communicator is further configured to receive, from the first terminal device, the three-dimensional model whose resolution has been adjusted according to the motion state information;
the processor is further configured to render the resolution-adjusted three-dimensional model to a display screen;
and the display screen is configured to display the resolution-adjusted three-dimensional model.
In a fifth aspect, an embodiment of the present application further provides a model data transmission apparatus, which is applied to a first terminal device, including:
a communication unit, configured to receive motion state information from the second terminal device, the motion state information characterizing the head pose change speed of a user of the second terminal device;
a processing unit, configured to adjust the resolution of the three-dimensional model to be sent to the second terminal device according to the motion state information;
the communication unit is further configured to send the resolution-adjusted three-dimensional model to the second terminal device.
In a sixth aspect, an embodiment of the present application provides another model data transmission apparatus, which is applied to a second terminal device, including:
a processing unit, configured to acquire the motion state information of the user;
a communication unit, configured to transmit the motion state information to the first terminal device, the motion state information being used for adjusting the resolution of the three-dimensional model to be sent to the second terminal device;
the communication unit is further configured to receive, from the first terminal device, the three-dimensional model whose resolution has been adjusted according to the motion state information;
the processing unit is further configured to render the resolution-adjusted three-dimensional model to the display unit;
and the display unit is configured to display the resolution-adjusted three-dimensional model.
In a seventh aspect, embodiments of the present application also provide a computer storage medium having stored therein computer program instructions which, when run on a computer, cause the computer to perform the method as set forth in the first aspect.
For the technical effects of any implementation of the third to seventh aspects, reference may be made to the technical effects of the corresponding implementations of the first and second aspects, which are not repeated here.
Drawings
In order to describe the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments are briefly introduced below; obviously, the drawings described below illustrate only some embodiments of the present application.
FIG. 1 is a schematic diagram of a model data transmission system according to an embodiment of the present application;
fig. 2 is a schematic diagram of an application scenario provided in an embodiment of the present application;
FIG. 3 is a flowchart of a method for transmitting model data according to an embodiment of the present application;
fig. 4A is a schematic diagram of a correspondence relationship between a speed range and an angular speed range configured in a second terminal device and an adjustment level of resolution according to an embodiment of the present application;
Fig. 4B is a schematic diagram of a correspondence between an adjustment level of a resolution configured in a first terminal device and a target resolution according to an embodiment of the present application;
fig. 4C is a schematic diagram of a correspondence between an adjustment level of a resolution configured in a first terminal device and a percentage of resolution adjustment according to an embodiment of the present application;
FIG. 5 is a flowchart of another method for transmitting model data according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a first terminal device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a second terminal device according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a model data transmission device according to an embodiment of the present application.
Detailed Description
To make the objects, embodiments, and advantages of the present application clearer, exemplary embodiments of the present application are described below with reference to the accompanying drawings, in which exemplary embodiments are shown. It should be understood that the exemplary embodiments described are only some, not all, of the embodiments of the application.
Based on the exemplary embodiments described herein, all other embodiments obtained by a person of ordinary skill in the art without inventive effort fall within the scope of the appended claims. Furthermore, while the present disclosure is described in terms of one or more exemplary embodiments, it should be understood that each aspect of the disclosure can be practiced separately from the others.
It should be noted that the brief description of the terminology in the present application is for the purpose of facilitating understanding of the embodiments described below only and is not intended to limit the embodiments of the present application. Unless otherwise indicated, these terms should be construed in their ordinary and customary meaning.
Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to those elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" as used in this disclosure refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the function associated with that element.
A fundamental challenge of current VR- or AR-based three-dimensional remote communication is that rendering images at the extremely high resolution required for strong immersion places great demands on the rendering engine and the data transfer process. A good tele-immersive experience requires low latency, a high frame rate, and high rendering quality, so making effective use of the computing power of the display end's graphics processor while providing high-quality VR or AR content that matches human visual perception is a key issue. When a user wearing a VR or AR headset browses a scene quickly, that is, when the user's head rotates rapidly or the user changes position, the scene rendered in real time must change just as quickly, and the picture may shake or even tear. This happens mainly because the operations required of the display end's graphics processor are complex and the computational load exceeds its capacity, and because the three-dimensional data volume is so large that transmission from the sending end is delayed and the scene cannot be updated in real time, resulting in a very poor user experience.
At present, the core technologies of remote three-dimensional communication systems include real-time three-dimensional reconstruction, three-dimensional data encoding and decoding, and immersive VR or AR display. The three-dimensional model data transmitted by the sending end strongly influences the quality of dynamic three-dimensional reconstruction and the final image at the display end: the higher the resolution of the dynamic three-dimensional reconstruction, the more sharply the amount of model data to be transferred grows. For example, a resolution of 192×192×128 requires a transmission bit rate of 256 Mbps, while a resolution of 384×384×384 requires 1120 Mbps (taking 30 FPS as an example). How to ensure good three-dimensional reconstruction quality while reducing transmission pressure is therefore a problem to be solved.
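The voxel counts behind the two bit-rate examples above can be checked directly. The following small sketch is not part of the patent; it only illustrates how quickly the raw grid size grows with reconstruction resolution:

```python
def voxel_count(resolution):
    """Number of voxels in a grid with per-axis resolution (nx, ny, nz)."""
    nx, ny, nz = resolution
    return nx * ny * nz

low = voxel_count((192, 192, 128))    # 4,718,592 voxels
high = voxel_count((384, 384, 384))   # 56,623,104 voxels
print(high / low)  # → 12.0: the grid grows 12x between the two examples
```

Note that the quoted bit rates (256 Mbps vs. 1120 Mbps) grow less than 12x, so the encoding is evidently not a fixed number of bits per voxel; the sketch bounds only the raw grid growth.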
In view of this, embodiments of the present application provide a model data transmission method and apparatus: the head pose change speed of the user at the display end is monitored to determine the user's motion state information, that information is transmitted back to the acquisition end, and the acquisition end controls the data volume of the three-dimensional model it generates according to the motion state information, adjusting the resolution of the model to be transmitted. This reduces the transmission pressure at the sending end, improves data reception efficiency, saves transmission time, reduces delay, and improves the VR or AR remote communication experience.
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 schematically illustrates a model data transmission system architecture according to an embodiment of the present application. As shown in fig. 1, the system includes an acquisition end device 101, a transmission network device 102, and a rendering display end device 103.
The acquisition end device 101 includes an acquisition unit 1011 for capturing color RGB images and depth information of a user. As an example, the acquisition unit 1011 may include one or more cameras; a camera may be an RGBD camera, such as an Azure Kinect depth camera or a RealSense depth camera, which collects the user's depth information and RGB images. The acquisition end device further includes a processor 1012, configured to receive the RGB images and depth information captured by the acquisition unit 1011 and perform the relevant computation to obtain the geometric data and motion pose data of the three-dimensional model, accurately reconstructing a three-dimensional model of the human body from the body's motion and position data. The geometric data may consist of the three-dimensional position data, color data, normal data, and triangular patch index data of the geometric vertices; its source is the RGB image and depth information (which contains the depth of each pixel in the RGB image), from which it can be obtained by point cloud Poisson reconstruction. The motion pose data comprises the three-dimensional position data and posture data of the human (skeletal) joint points, where the position data are three-dimensional spatial coordinates (x, y, z) and the posture data can be expressed in several ways: an Euler angle or an axis angle is a vector of three components, a rotation matrix is a vector of nine components, and a quaternion is a vector of four components.
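The representations listed above (positions as (x, y, z) coordinates; rotations as three-, four-, or nine-component vectors) can be sketched as follows. `JointPose` and the conversion function are an illustrative sketch, not an interface defined by the patent; the conversion shows how a nine-component rotation matrix carries the same rotation as a four-component quaternion:

```python
from dataclasses import dataclass

@dataclass
class JointPose:
    position: tuple     # (x, y, z) spatial coordinates, 3 components
    quaternion: tuple   # (w, x, y, z) rotation, 4 components

def quat_to_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix
    (9 components), using the standard conversion formula."""
    w, x, y, z = q
    return [
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ]

# A head-height joint with no rotation: the identity quaternion (1, 0, 0, 0)
# maps to the identity matrix.
pose = JointPose(position=(0.0, 1.6, 0.0), quaternion=(1.0, 0.0, 0.0, 0.0))
assert quat_to_matrix(pose.quaternion) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```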
The transmission end device 102 is configured to transmit the three-dimensional human body model data determined by the acquisition end device 101 to the rendering display end device 103. The transmission end device may be a cloud server and may further be configured to encode, decode, and distribute the three-dimensional model data transmitted by the acquisition end device 101.
The display end device 103 is configured to receive the data of the three-dimensional model, reconstruct the three-dimensional model of the human body from that data, and perform three-dimensional immersive rendering. The pre-constructed three-dimensional model is built from the different body parameters and posture parameters of the human body collected by the acquisition end, using motion pose data and geometric data in the pre-construction process. The display end device is also configured to acquire the user's motion state information while the user moves and send it to the acquisition end device, which adjusts the resolution of the three-dimensional model according to that information. Display end devices include televisions, mobile phones, VR/AR head-mounted display devices, and the like. It should be noted that if the display end device 103 is an AR head-mounted display (head display), it must additionally render the three-dimensional model in combination with the scene in which the viewing user is located.
It should be noted that the architecture diagram shown in fig. 1 is only an example, and the number of the collection end device, the transmission end device, and the display end device in the embodiment of the present application is not specifically limited.
Based on the system architecture shown in fig. 1, fig. 2 schematically shows an application scenario provided by an embodiment of the present application. As shown in fig. 2, user ends 1 to 4 carry out real-time remote three-dimensional communication, each deploying an acquisition end device (including a camera and a processor) and a rendering display end device (a television, a mobile phone, a VR head display, an AR head display, or any subset of these). During remote three-dimensional communication, the three-dimensional reconstruction model of user end 1 can be uploaded to a cloud server, and user ends 2 to 4 download it from the cloud server and display it synchronously; likewise, user ends 1, 3, and 4 can synchronously display the three-dimensional reconstruction model of user end 2.
It should be noted that fig. 2 is only an example of the three-dimensional communication of multiple people, and the number of clients in the three-dimensional communication of the embodiment of the present application is not limited.
The model data transmission scheme of the embodiments of the present application is described below with reference to the architecture diagrams of fig. 1 and fig. 2. Fig. 3 shows a flow chart of a model data transmission method, in particular the interaction between a first terminal device and a second terminal device. Note that the first terminal device in fig. 3 may be either the display end device or the acquisition end device in the architecture diagram of fig. 1, and likewise for the second terminal device: when the first terminal device is the display end device, the second terminal device is the acquisition end device, and vice versa. In fig. 3 the first terminal device is taken to be the acquisition end device and the second terminal device the display end device. The method specifically includes:
301, in the course of the user's movement, the second terminal device acquires the user's motion state information and sends it to the first terminal device.
The motion state information characterizes the speed at which the pose of the head of the user of the second terminal device changes. The pose change includes a change in the user's position, that is, the second terminal device acquires the speed of the user's displacement. It also includes a change in the posture of the user's head; for example, when the user rotates the head, the second terminal device acquires the angular speed of the rotation. The second terminal device then transmits the motion state information characterizing the user's pose change speed to the first terminal device.
Optionally, the second terminal device may also forward the motion state information to the first terminal device via the transmission end device shown in fig. 1.
302, the first terminal device receives the motion state information and adjusts the resolution of the three-dimensional model to be sent according to it.
The resolution of the three-dimensional model comprises its resolutions in three directions: the number of volume elements (voxels) making up the model along a given direction is the model's resolution in that direction.
303, the first terminal device sends the resolution-adjusted three-dimensional model to the second terminal device.
The first terminal device may forward the resolution-adjusted three-dimensional model to the second terminal device through the transmission end device shown in fig. 1, or may transmit it to the second terminal device directly.
304, the second terminal device receives the resolution-adjusted three-dimensional model and renders it.
Optionally, if the second terminal device is a display device such as a mobile phone or a television, two-dimensional rendering may be performed.
If the second terminal device is a VR head display, three-dimensional rendering is performed.
If the second terminal device is an AR head display, three-dimensional rendering is performed in combination with the scene in which the user of the second terminal device is located.
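Steps 301 to 304 can be sketched as one round trip. Everything below is a hypothetical illustration, not code or values from the patent: the sensor read and the model payload are stubs, and the speed/angular-speed thresholds and resolutions are assumptions.

```python
def step_301_measure():
    """Second device: sample the user's motion state (stubbed sensor read)."""
    return {"speed_mps": 1.5, "angular_speed_dps": 10.0}

def step_302_adjust(state):
    """First device: pick a voxel resolution from the reported motion state.
    Thresholds and resolutions are illustrative assumptions."""
    if state["speed_mps"] > 1.0 or state["angular_speed_dps"] > 60.0:
        return (192, 192, 128)   # fast motion: coarser model, less data
    return (384, 384, 384)       # slow motion: full detail

def step_303_send(resolution):
    """First device: package the (stubbed) model at the chosen resolution."""
    return {"resolution": resolution, "voxels": b""}  # payload stub

def step_304_render(model):
    """Second device: hand the received model to the renderer (stubbed)."""
    return f"rendering at {model['resolution']}"

model = step_303_send(step_302_adjust(step_301_measure()))
print(step_304_render(model))  # → rendering at (192, 192, 128)
```

The point of the round trip is visible in `step_302_adjust`: while the viewer moves fast, the sender can cut the voxel grid (and hence the transmitted data volume) without the viewer perceiving the loss of detail.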
To further aid understanding of the model data transmission method proposed by the present application, specific scenarios are described below. For ease of description, the first terminal device is taken as the acquisition end device and the second terminal device as the display end device. The scheme provided by the present application can be used in many scenarios, such as conference, live streaming, and game scenarios; it is introduced below taking a conference scenario and a live streaming scenario as examples.
Scene one: conference scene
It should be noted that the conference scenario can take many forms. For example: user A, corresponding to the first terminal device, wears a VR head display, and user B, corresponding to the second terminal device, wears an AR head display. The first terminal device serves as the acquisition end: it captures user A's RGB images and depth information, generates user A's three-dimensional model data, and transmits it to the second terminal device, which renders the data in three dimensions in combination with the scene in which user B is located. Another form: user A, corresponding to the first terminal device, wears a VR head display, and user B, corresponding to the second terminal device, also wears a VR head display. The first terminal device serves as the acquisition end, captures user A's three-dimensional model data, and transmits it to the second terminal device, that is, to the VR head display worn by user B, which renders the received data in three dimensions. These are only two examples among many possible conference scenarios, which are not enumerated here; the description below takes the first as an example.
The second terminal device may acquire information such as a moving speed of the user B and an angular speed at which the head of the user B rotates in response to the movement operation of the user B. As an example, the second terminal device may be provided with an inertial sensor (Inertial Measurement Unit, IMU) and a camera for acquiring the posture change amount and the position change amount of the head of the user B. The second terminal device calculates according to the attitude change amount and the position change amount, and is used for acquiring information such as the movement speed of the user B and the angular speed of the head rotation. That is, in the present application, the second terminal device may acquire not only the change in the position of the user B but also the change in the posture of the user B. And jointly determining the motion state information of the user B according to the change of the position and the gesture. Optionally, an optical tracking method, such as VR head display with tracking function, may be used when acquiring the position change of the user B. The second terminal device sends the motion state information to the first terminal device, which may be the current acquired speed or angular speed or speed and angular speed of the user B, or may be the speed range in which the current speed of the user B is located, or the angular speed range in which the angular speed of the user B is located, or the speed range and angular speed range in which the current user is located. Still alternatively, the second terminal device may be configured with a correspondence between the speed range and the adjustment level of the resolution, or configured with a correspondence between the angular speed range and the adjustment level of the resolution, or configured with a correspondence between the speed range and the angular speed range and the adjustment level of the resolution. 
In this way, the second terminal device can determine the range in which the acquired speed or angular speed (or both) lies and directly send the adjustment level corresponding to that range to the first terminal device as the motion state information, so that the amount of data transmitted is small and the transmission speed is high. The resolution adjustment level is used by the first terminal device to adjust the resolution of the three-dimensional model. For ease of understanding, reference may be made to fig. 4A, which illustrates an example correspondence between the speed ranges and angular speed ranges configured in the second terminal device and the resolution adjustment levels.
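As a hedged illustration of such a configured correspondence, the lookup performed on the second terminal device might look as follows. The concrete speed ranges, angular speed ranges and levels below are invented for the sketch; they are not the values of fig. 4A.

```python
# Illustrative range tables: (upper bound of range, resolution adjustment level).
# All numeric values are assumptions, not values from the patent figures.
SPEED_LEVELS = [          # speed in m/s
    (0.5, 0),
    (1.5, 1),
    (3.0, 2),
    (float("inf"), 3),
]

ANGULAR_LEVELS = [        # angular speed in rad/s
    (0.5, 0),
    (2.0, 1),
    (float("inf"), 2),
]

def adjustment_level(speed: float, angular_speed: float) -> int:
    """Return the resolution adjustment level for the current motion state.

    Takes the coarser (higher) of the two levels, so that fast head rotation
    alone is enough to trigger a resolution reduction.
    """
    speed_level = next(lvl for bound, lvl in SPEED_LEVELS if speed < bound)
    ang_level = next(lvl for bound, lvl in ANGULAR_LEVELS if angular_speed < bound)
    return max(speed_level, ang_level)
```

Sending only this small integer, rather than raw speed and angular speed samples, is what keeps the transmitted data volume low.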
In some embodiments, the IMU used to collect the speed and angular speed information may suffer from temperature drift, scale-factor errors and installation errors. These problems make the sensor readings inaccurate, so the data acquired by the IMU need to be corrected afterwards. For example, the second terminal device may apply a complementary filtering algorithm to the acquired angular speed, speed and similar data, so that it can output more accurate motion state information.
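A complementary filter blends the short-term accuracy of gyroscope integration with the long-term stability of an absolute reference such as the accelerometer, cancelling gyro drift. The following minimal sketch shows one correction step of the kind the paragraph describes; the blending weight `alpha` is an assumed tuning value.

```python
def complementary_filter(angle_prev: float, gyro_rate: float,
                         accel_angle: float, dt: float,
                         alpha: float = 0.98) -> float:
    """One step of a complementary filter fusing gyroscope and accelerometer.

    angle_prev  -- previous filtered angle estimate (rad)
    gyro_rate   -- angular rate from the gyroscope (rad/s), drifts long-term
    accel_angle -- absolute tilt angle derived from the accelerometer (rad),
                   noisy short-term but stable long-term
    dt          -- sampling interval (s)
    alpha       -- assumed blending weight favouring the gyro short-term
    """
    return alpha * (angle_prev + gyro_rate * dt) + (1.0 - alpha) * accel_angle
```

Run repeatedly, the filter tracks fast rotations via the gyro term while the small accelerometer term slowly pulls the estimate back to the true absolute angle.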
Further, the second terminal device may transmit the acquired motion state information directly to the first terminal device, or may transmit it to the transmission end device shown in fig. 1, which forwards it to the first terminal device. After receiving the motion state information, the first terminal device can adjust the resolution of the three-dimensional model of user A according to that information. To better understand how the first terminal device adjusts the resolution of the three-dimensional model, the process of constructing the model is first described. Many methods currently exist for constructing three-dimensional models; the present application takes a model construction method based on the truncated signed distance function (TSDF) as an example. In a practical application scene, a cubic region of three-dimensional space is set aside as the TSDF field, and this region is then split along the x, y and z directions. The resulting small cubes are called voxels; the number of voxels in each direction is the resolution of the space in that direction, and the centre point of each voxel is a sampling point of the TSDF function. In a TSDF field an iso-surface is typically set: all function values on the iso-surface are 0, the function value of a point in the free space outside the iso-surface is positive, with magnitude proportional to the distance from the point to the iso-surface, and the function value of a point inside the iso-surface (the space enclosed by it) is negative, with magnitude proportional to the distance from the point to the iso-surface. All function sampling points serve as sparse sampling points in space.
After sampling is complete, the three-dimensional model is extracted from the TSDF field, for example using the Marching Cubes algorithm. The process of constructing the three-dimensional model has been described above; the process by which the first terminal device adjusts the resolution of the three-dimensional model according to the motion state information is described below.
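The TSDF construction described above can be sketched for a simple analytic surface (a sphere). The voxel grid size, truncation distance and sphere parameters below are illustrative only, and a real system would integrate depth images rather than evaluate an analytic distance; the sketch is meant to show that the per-axis voxel count ("resolution") is exactly the quantity the first terminal device later adjusts.

```python
import math

def build_tsdf(resolution: int, radius: float = 0.5,
               size: float = 2.0, trunc: float = 0.1) -> list:
    """Build a TSDF field for a sphere centred in a cube of side `size`.

    The cube is split into resolution**3 voxels; each voxel centre stores
    the truncated signed distance to the sphere surface (positive outside,
    zero on the iso-surface, negative inside).
    """
    voxel = size / resolution
    field = []
    for i in range(resolution):
        for j in range(resolution):
            for k in range(resolution):
                # voxel centre coordinates, cube centred at the origin
                x = (i + 0.5) * voxel - size / 2
                y = (j + 0.5) * voxel - size / 2
                z = (k + 0.5) * voxel - size / 2
                d = math.sqrt(x * x + y * y + z * z) - radius  # signed distance
                field.append(max(-trunc, min(trunc, d)))      # truncate
    return field

# Halving the per-axis resolution divides the voxel count (and hence the
# data volume of the field) by eight:
tsdf_hi = build_tsdf(16)
tsdf_lo = build_tsdf(8)
```

This cubic relationship between per-axis resolution and voxel count is why adjusting the TSDF field resolution is an effective lever on the transmitted data amount.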
Optionally, in one case, the first terminal device may adjust the amount of data it collects for the three-dimensional model of user A, thereby adjusting the resolution of that model. The motion state information received by the first terminal device may include one or more of: the movement speed of user B; the angular speed of head rotation of user B; both the movement speed and the angular speed; the speed range in which the movement speed of user B lies; the angular speed range in which the angular speed of head rotation of user B lies; both ranges; the resolution adjustment level corresponding to the speed range; the resolution adjustment level corresponding to the angular speed range; or the resolution adjustment level corresponding to the combination of the speed range and the angular speed range. The following describes, with specific embodiments, the procedure by which the first terminal device adjusts the resolution of the three-dimensional model of user A according to the motion state information.
In the first embodiment, the motion state information is the current movement speed of user B. The first terminal device may store the resolution of the three-dimensional model of user A at its highest setting; for convenience of description, this stored resolution is referred to as M. The first terminal device may store a correspondence between speed ranges and the target resolution to be adjusted to, or a correspondence between speed ranges and resolution adjustment percentages, where the adjustment percentage expresses the target resolution as a percentage of M. After receiving the current movement speed of user B, the first terminal device first determines the speed range in which that speed lies, and then either determines the target resolution from the stored correspondence between speed ranges and target resolutions, or determines the adjustment percentage from the stored correspondence between speed ranges and percentages and takes the product of that percentage and M as the target resolution. The resolution of the three-dimensional model of user A is then adjusted to the determined target resolution. For example, taking the TSDF-based model construction described above, the first terminal device may adjust the resolution of the three-dimensional model by controlling the resolution of the TSDF field according to the determined target resolution.
That is, the first terminal device may adjust the number of voxels into which the field is split in each direction according to the determined target resolution, thereby adjusting the resolution of the three-dimensional model.
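A minimal sketch of embodiment one follows. The maximum resolution `M`, the speed ranges and the percentages are all assumed example values, not values from the patent figures.

```python
M = 512  # assumed maximum per-axis resolution stored in the first terminal device

# (upper bound of speed range in m/s, percentage of M to use) -- assumed values
SPEED_TO_PERCENT = [
    (0.5, 1.00),
    (1.5, 0.75),
    (3.0, 0.50),
    (float("inf"), 0.25),
]

def target_resolution(speed: float) -> int:
    """Map the received movement speed of user B to a target resolution.

    Determines the speed range the speed falls into, then takes the product
    of the configured percentage and the stored maximum resolution M.
    """
    for bound, percent in SPEED_TO_PERCENT:
        if speed < bound:
            return int(M * percent)
    raise ValueError("unreachable: last bound is infinite")
```

The returned value would then be used as the per-axis voxel count when splitting the TSDF field.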
In the second embodiment, the motion state information is the angular speed of head rotation of user B. The first terminal device may store a correspondence between angular speed ranges and the target resolution to be adjusted to, or a correspondence between angular speed ranges and resolution adjustment percentages, where the adjustment percentage expresses the target resolution as a percentage of the resolution M mentioned in the first embodiment. After receiving the angular speed of head rotation of user B, the first terminal device first determines the angular speed range in which that angular speed lies, and then either determines the target resolution from the stored correspondence between angular speed ranges and target resolutions, or determines the adjustment percentage from the stored correspondence between angular speed ranges and percentages and takes the product of that percentage and M as the target resolution. Finally, the resolution of the three-dimensional model of user A is adjusted to the determined target resolution.
In the third embodiment, the motion state information is both the current movement speed of user B and the angular speed of head rotation. The first terminal device may store a correspondence between speed-range and angular-speed-range pairs and the target resolution to be adjusted to, or a correspondence between such pairs and resolution adjustment percentages, where the adjustment percentage expresses the target resolution as a percentage of the resolution M mentioned above. On receiving the current movement speed of user B and the angular speed of head rotation, the first terminal device first determines the speed range and the angular speed range in which they lie, and then adjusts the resolution of the three-dimensional model of user A to the target resolution corresponding to that pair of ranges.
In the fourth embodiment, the motion state information is the speed range in which the current movement speed of user B lies. The first terminal device may store a correspondence between speed ranges and the target resolution to be adjusted to, or a correspondence between speed ranges and resolution adjustment percentages, where the adjustment percentage expresses the target resolution as a percentage of the resolution M mentioned in the first embodiment. After receiving the speed range, the first terminal device can determine the corresponding target resolution and adjust the resolution of the three-dimensional model accordingly.
In the fifth embodiment, the motion state information is the angular speed range in which the current angular speed of head rotation of user B lies. The first terminal device may store a correspondence between angular speed ranges and the target resolution to be adjusted to, or a correspondence between angular speed ranges and resolution adjustment percentages, where the adjustment percentage expresses the target resolution as a percentage of the resolution M mentioned above. After receiving the angular speed range, the first terminal device can determine the corresponding target resolution and adjust the resolution of the three-dimensional model accordingly.
In the sixth embodiment, the motion state information is both the speed range in which the current movement speed of user B lies and the angular speed range in which the angular speed of head rotation lies. The first terminal device may store a correspondence between such range pairs and the target resolution to be adjusted to, or a correspondence between such pairs and resolution adjustment percentages, where the adjustment percentage expresses the target resolution as a percentage of the resolution M mentioned above. After receiving the two ranges, the first terminal device can determine the corresponding target resolution and adjust the resolution of the three-dimensional model of user A accordingly.
In the seventh embodiment, the motion state information is the resolution adjustment level corresponding to the speed range in which the current movement speed of user B lies, or to the angular speed range in which the current angular speed of head rotation lies, or to the combination of the two ranges. The first terminal device may store a correspondence between resolution adjustment levels and target resolutions, for example as shown in fig. 4B; or it may store a correspondence between resolution adjustment levels and resolution adjustment percentages, for example as shown in fig. 4C, where the adjustment percentage expresses the target resolution as a percentage of the resolution M mentioned above. After receiving the adjustment level, the first terminal device may determine the target resolution from that level and adjust the resolution of the three-dimensional model of user A accordingly.
In another case, where the first terminal device determines that the three-dimensional model of user A has already been constructed, it may perform downsampling on the constructed model, for example simplifying it with a Levels of Detail (LOD) technique. When LOD is used to simplify the three-dimensional model, details of the model are drawn at a quality proportional to their importance: important details are drawn at higher quality, and unimportant details at lower quality. For example, when user A makes a certain gesture and the model is simplified with LOD, the hand of user A is drawn with emphasis. The simplified model can largely retain the geometric characteristics and key features of the original model, and the operation speed of the first terminal device can be improved. The importance of a detail of the three-dimensional model may be judged by criteria such as distance (the distance from the three-dimensional model to the first terminal device), size (the size of the three-dimensional model), or specific user settings. LOD technology also offers many simplification algorithms for three-dimensional models, such as geometric element deletion, region merging and vertex clustering.
In one possible implementation, taking the case where the motion state information received by the first terminal device is a resolution adjustment level for the three-dimensional model, the first terminal device may be configured with a correspondence between adjustment levels and downsampling levels; it determines the downsampling level from the received adjustment level and simplifies the three-dimensional model accordingly, thereby reducing its resolution.
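One of the simplification algorithms listed above, vertex clustering, can be sketched as follows: vertices falling into the same grid cell are merged into one, so a coarser grid (a higher downsampling level) yields a lower-resolution model. The cell sizes per level are assumed values for illustration.

```python
# Assumed grid cell size (in model units) for each downsampling level.
CELL_SIZE_PER_LEVEL = {0: 0.01, 1: 0.05, 2: 0.10, 3: 0.25}

def cluster_vertices(vertices: list, level: int) -> list:
    """Merge vertices sharing a grid cell; return the simplified vertex list.

    Each cluster of vertices is replaced by its centroid, which preserves
    the rough geometry while reducing the vertex count.
    """
    cell = CELL_SIZE_PER_LEVEL[level]
    clusters = {}
    for x, y, z in vertices:
        key = (round(x / cell), round(y / cell), round(z / cell))
        clusters.setdefault(key, []).append((x, y, z))
    return [
        tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for pts in clusters.values()
    ]
```

A full implementation would also remap the triangle indices onto the merged vertices and drop degenerate faces; the sketch covers only the vertex-merging step.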
Still further, after adjusting the resolution of the three-dimensional model of user A, the first terminal device sends the adjusted model to the second terminal device. Alternatively, the model may be forwarded by the transmission end device shown in fig. 1; the present application does not specifically limit this.
After receiving the resolution-adjusted three-dimensional model of user A, the second terminal device renders it into the scene in which user B is currently located. In some embodiments, if the pose of user A changes, the second terminal device may further predict the pose of user A, so as to reduce rendering delay and ensure that an accurate scene can be rendered in time. As an example, when user A moves horizontally (the head does not rotate and only the position changes), starting from position S1 with movement speed v and acceleration a, the position S2 of user A at time t may be predicted by the following formula (1) or formula (2):
S2=S1+v*t (1)
or
S2=S1+[v+(v+a*t)]*t/2 (2)
After the second terminal device predicts the position S2 that user A will reach, it prepares to render the corresponding scene at S2.
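Formulas (1) and (2) can be combined into one small helper, since formula (2) reduces to formula (1) when the acceleration is zero:

```python
def predict_position(s1: float, v: float, t: float, a: float = 0.0) -> float:
    """Predict the position reached at time t, per formulas (1) and (2).

    With a == 0 this reduces to formula (1), S2 = S1 + v*t; otherwise it is
    formula (2), S2 = S1 + [v + (v + a*t)] * t / 2, i.e. displacement under
    the average of the initial velocity v and the final velocity v + a*t.
    """
    return s1 + (v + (v + a * t)) * t / 2.0
```

The sketch treats position as one-dimensional for brevity; applying it per axis gives the three-dimensional prediction.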
Scene II: live scene
In the live broadcast scene, the first terminal device may be the acquisition device corresponding to the anchor, and the second terminal device is the VR or AR head display worn by user C, who is watching the live broadcast. In this scene, the case where user C wears a VR head display is taken as an example.
The second terminal device acquires the motion state information of user C in response to the motion operation of user C. For details, refer to the description in the previous scenario of the second terminal device acquiring the motion state information of user B; they are not repeated here. After obtaining the motion state information of user C, the second terminal device sends it directly to the first terminal device, or forwards it to the first terminal device via the transmission end device shown in fig. 1.
After the first terminal device receives the motion state information of user C, it may adjust the resolution of the three-dimensional model of the anchor according to that information. For the specific adjustment process, refer to the resolution adjustment described in the previous scenario; it is not repeated here. The first terminal device may send the resolution-adjusted three-dimensional model of the anchor to the second terminal device, and the second terminal device, that is, the VR head display of user C, renders it. With this scheme, adjusting the resolution of the three-dimensional model according to the motion state of user C reduces the data volume of the model, improves transmission efficiency and reduces rendering delay. The rendered picture also matches the viewing needs of the human eye more closely, because a high-resolution view is not required while user C is moving or rotating quickly; if the picture viewed during fast motion or rotation remained at high resolution, user C might become dizzy. Therefore, reducing the resolution of the three-dimensional model while user C moves makes viewing the rendered picture consistent with the experience of viewing a natural scene, narrows the gap between the model and reality, and reduces the dizziness caused by viewpoint changes and movement.
The scheme provided by the application has been introduced above in combination with specific scenes. To make the model data transmission scheme clearer, it is described below through a specific embodiment, continuing to take the first terminal device as the acquisition end and the second terminal device as the display end. Referring to fig. 5, a flowchart of a model data transmission method according to an embodiment of the present application includes:
501. The second terminal device determines motion state information of the user in response to a motion operation of the user.
Specifically, the second terminal device may be configured with an IMU and obtain information such as the speed and angular speed of the user through it. The motion state information sent by the second terminal device includes one or more of: the speed, the angular speed, both the speed and the angular speed, the speed range, the angular speed range, or both the speed range and the angular speed range.
502. The second terminal device corrects the motion state information of the user.
Because an IMU may suffer from problems such as temperature drift and installation errors, the data it acquires need to be corrected to obtain more accurate motion state information.
503. The second terminal device predicts the pose according to the motion state information of the user.
When the pose of the user changes, the scene that the second terminal device needs to render also changes; therefore the scene to be rendered next is predicted from the motion state information of the user, and its rendering is prepared in advance, avoiding rendering delay. The prediction process is described in scenario one above and is not repeated here. For convenience of description, the scene in which the user is predicted to be located is referred to in this embodiment as scene P.
504. The second terminal device sends the motion state information of the user to the transmission end device.
505. The transmission end device sends the motion state information of the user to the first terminal device.
506. The first terminal device adjusts the resolution of the three-dimensional model to be transmitted according to the motion state information.
For the specific adjustment process, refer to the related description in scenario one; it is not repeated here.
507. The first terminal device sends the resolution-adjusted three-dimensional model to the transmission end device.
508. The transmission end device sends the resolution-adjusted three-dimensional model to the second terminal device.
509. The second terminal device renders the resolution-adjusted three-dimensional model.
Specifically, the second terminal device may render the resolution-adjusted three-dimensional model in combination with scene P.
It should be noted that fig. 5 is merely an example; the order of steps 503 and 504 may be exchanged, and the present application does not limit the order of the steps.
Based on the same concept as the above method, referring to fig. 6, an embodiment of the present application provides a first terminal device 600. The first terminal device 600 is capable of performing the steps of the above method; to avoid repetition, details are not repeated here. The first terminal device 600 comprises a communicator 601, a processor 602 and a camera 603.
A communicator 601 for receiving motion state information from the second terminal device, the motion state information being used to characterize a head pose change speed of a user of the second terminal device;
a processor 602, configured to adjust a resolution of a three-dimensional model to be sent to the second terminal device according to the motion state information;
the communicator 601 is further configured to send the three-dimensional model after resolution adjustment to the second terminal device.
Based on the same concept as the above method, referring to fig. 7, an embodiment of the present application provides a second terminal device 700, and the second terminal device 700 is capable of performing the steps of the above method, and in order to avoid repetition, details will not be described here. The second terminal device 700 comprises a communicator 701, a processor 702, a display 703.
A processor 702 for acquiring motion state information of a user;
a communicator 701 for transmitting the motion state information to a first terminal device; the motion state information is used for adjusting the resolution of the three-dimensional model to be sent to the second terminal equipment;
the communicator 701 is further configured to receive the three-dimensional model sent by the first terminal device after resolution adjustment according to the motion state information;
the processor 702 is further configured to render the three-dimensional model after resolution adjustment to a display screen;
the display 703 is configured to display the three-dimensional model after resolution adjustment.
Based on the same concept as the above method, referring to fig. 8, a model data transmission apparatus 800 according to an embodiment of the present application is provided, where the apparatus 800 is capable of performing each step in the above method, and in order to avoid repetition, details are not described herein. The apparatus 800 comprises a communication unit 801, a processing unit 802, a display unit 803.
In one scenario:
a communication unit 801, configured to receive motion state information from the second terminal device, where the motion state information is used to characterize a head pose change speed of a user of the second terminal device;
a processing unit 802, configured to adjust a resolution of a three-dimensional model to be sent to the second terminal device according to the motion state information;
The communication unit 801 is further configured to send the three-dimensional model after resolution adjustment to the second terminal device.
In another scenario:
a processing unit 802, configured to obtain motion state information of a user;
a communication unit 801 for transmitting the motion state information to a first terminal device; the motion state information is used for adjusting the resolution of the three-dimensional model to be sent to the second terminal equipment;
the communication unit 801 is further configured to receive a three-dimensional model sent by the first terminal device after resolution adjustment according to the motion state information;
the processing unit 802 is further configured to render the three-dimensional model after resolution adjustment to the display unit 803;
the display unit 803 is configured to display the three-dimensional model after resolution adjustment.
The embodiments of the present application also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of any of the methods described above.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be performed by hardware under the direction of program instructions. The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs steps including those of the method embodiments described above. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks or optical disks.
While specific embodiments of the application have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and the scope of the application is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the principles and spirit of the application, but such changes and modifications fall within the scope of the application. While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A model data transmission method, characterized by being applied to remote three-dimensional communication, the method comprising:
The method comprises the steps that a first terminal device collects depth information and RGB image information of a first user, a three-dimensional model is determined according to the depth information and the RGB image information, and the three-dimensional model is sent to a second terminal device; the first terminal device and the second terminal device are terminal devices which participate in remote three-dimensional communication and are located in different places; the first user is a user of the first terminal device;
the second terminal equipment responds to the motion operation of the second user, acquires motion state information of the second user and sends the motion state information to the first terminal equipment; the second user is a user of the second terminal equipment, and the motion state information is used for representing the head pose change of the second user;
after the first terminal equipment receives the motion state information, reducing the resolution of the three-dimensional model sent to the second terminal equipment according to the motion state information so as to reduce the data size of the three-dimensional model in the transmission process; the resolution is the number of voxels of the three-dimensional model in the appointed direction, and different resolutions correspond to different motion state information;
the first terminal device sends the three-dimensional model with reduced resolution to the second terminal device; the second terminal equipment is configured with a corresponding relation between a preset speed range and an adjustment level of the resolution;
The second terminal device obtains current motion state information of a second user, and the method comprises the following steps:
acquiring the current speed of the second user, and determining an adjustment level corresponding to a preset speed range to which the current speed belongs; taking the resolution adjustment level as the motion state information; the current speed is the current moving speed of the head of the second user or the rotational angular speed of the head.
2. The method of claim 1, wherein reducing the resolution of the three-dimensional model to be transmitted to the second terminal device based on the motion state information comprises:
the motion state information comprises the current moving speed of the head of the second user, and when the first terminal equipment collects data for generating the three-dimensional model, the data amount of the collected three-dimensional model is reduced according to the data amount of the three-dimensional model corresponding to the speed range where the current moving speed is located, and the three-dimensional model is built according to the collected data of the three-dimensional model; or,
the motion state information comprises the rotation angular speed of the head of the second user, and when the first terminal equipment collects data for generating the three-dimensional model, the data amount of the collected three-dimensional model is reduced according to the data amount of the three-dimensional model corresponding to the angular speed range where the rotation angular speed is located, and the three-dimensional model is built according to the collected data of the three-dimensional model; or alternatively, the process may be performed,
And when the first terminal equipment collects data for generating the three-dimensional model, reducing the data volume of the collected three-dimensional model according to the data volume of the three-dimensional model corresponding to the speed range in which the current moving speed is located and the angular speed range in which the angular speed is located, and constructing the three-dimensional model according to the collected data of the three-dimensional model.
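The three branches of claim 2 can be sketched as choosing an acquisition-side data budget from whichever ranges the received measurements fall in. The ranges, keep factors, and function names below are invented for illustration; the patent does not specify concrete values.

```python
# (upper bound, keep factor) tables; all thresholds are hypothetical.
MOVE_FACTORS = [(0.5, 1.0), (1.0, 0.5), (float("inf"), 0.25)]  # m/s
ROT_FACTORS  = [(0.5, 1.0), (2.0, 0.5), (float("inf"), 0.25)]  # rad/s

def _factor(value, table):
    """Return the keep factor of the range containing `value`."""
    for upper, factor in table:
        if value < upper:
            return factor
    return table[-1][1]

def target_data_amount(base_points, move_speed=None, ang_speed=None):
    """Return how many model points to collect, reduced according to
    whichever measurements the motion state information contains.
    When both are present, the stricter (smaller) factor wins."""
    factor = 1.0
    if move_speed is not None:
        factor = min(factor, _factor(move_speed, MOVE_FACTORS))
    if ang_speed is not None:
        factor = min(factor, _factor(ang_speed, ROT_FACTORS))
    return int(base_points * factor)
```

The key design point of claim 2 is that the reduction happens at acquisition time, before the model is constructed, so the savings apply to capture, reconstruction, and transmission alike.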
3. The method of claim 1, wherein reducing the resolution of the three-dimensional model to be sent to the second terminal device according to the motion state information comprises:
the motion state information comprises first indication information used for indicating the degree of change of the moving speed of the head of the second user, and when collecting data for generating the three-dimensional model, the first terminal device reduces the amount of collected data according to the data amount of the three-dimensional model corresponding to the first indication information, and constructs the three-dimensional model from the collected data; or,
the motion state information comprises second indication information used for indicating the degree of change of the rotation angular speed of the head of the second user, and when collecting data for generating the three-dimensional model, the first terminal device reduces the amount of collected data according to the data amount of the three-dimensional model corresponding to the second indication information, and constructs the three-dimensional model from the collected data; or,
the motion state information comprises the first indication information and the second indication information, and when collecting data for generating the three-dimensional model, the first terminal device reduces the amount of collected data according to the data amount of the three-dimensional model corresponding to the first indication information and the second indication information, and constructs the three-dimensional model from the collected data.
4. The method of claim 1, wherein reducing the resolution of the three-dimensional model to be sent to the second terminal device according to the motion state information comprises:
performing downsampling processing on the constructed three-dimensional model when the motion state information satisfies one or more of the following set conditions:
the motion state information comprises the current moving speed of the second user, and the current moving speed is greater than a first set threshold; or,
the motion state information comprises the current rotation angular speed of the head of the second user, and the current rotation angular speed is greater than a second set threshold; or,
the motion state information comprises the current moving speed of the second user and the current rotation angular speed of the head of the second user, the current moving speed is greater than the first set threshold, and the current rotation angular speed is greater than the second set threshold.
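Claim 4's threshold conditions can be sketched as follows. This is a simplified illustration: the threshold values, the fixed stride, and all names are assumptions, and "downsampling" is reduced here to keeping every `step`-th vertex of an already constructed model.

```python
def should_downsample(move_speed=None, ang_speed=None,
                      first_threshold=0.5, second_threshold=1.0):
    """Evaluate claim 4's set conditions: when both measurements are
    present, both must exceed their thresholds; when only one is
    present, it alone must exceed its threshold."""
    if move_speed is not None and ang_speed is not None:
        return move_speed > first_threshold and ang_speed > second_threshold
    if move_speed is not None:
        return move_speed > first_threshold
    if ang_speed is not None:
        return ang_speed > second_threshold
    return False

def maybe_downsample(vertices, step=2, **motion):
    """Downsample the constructed model (keep every `step`-th vertex)
    only when the motion state information satisfies the condition."""
    return vertices[::step] if should_downsample(**motion) else vertices
```

Unlike claims 2 and 3, this variant reduces data after model construction, so the full-resolution model still exists locally and only the transmitted copy shrinks.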
5. The method of claim 4, wherein the motion state information comprises the current moving speed of the second user, and the sampling rate used in the downsampling processing has a mapping relationship with the speed range in which the current moving speed is located; or,
the motion state information comprises the rotation angular speed of the head of the second user, and the sampling rate used in the downsampling processing has a mapping relationship with the angular speed range in which the rotation angular speed is located; or,
the motion state information comprises the current moving speed of the second user and the rotation angular speed of the head of the second user, and the sampling rate used in the downsampling processing has a mapping relationship with both the speed range in which the current moving speed is located and the angular speed range in which the rotation angular speed is located.
6. The method of claim 1, wherein reducing the resolution of the three-dimensional model to be sent to the second terminal device according to the motion state information comprises:
performing downsampling processing on the constructed three-dimensional model when the motion state information satisfies one or more of the following set conditions:
the motion state information comprises first indication information, the value of the first indication information is greater than a first set value, and the greater the value of the first indication information, the higher the degree of change of the moving speed of the head of the second user; or,
the motion state information comprises first indication information, the value of the first indication information is smaller than the first set value, and the greater the value of the first indication information, the lower the degree of change of the moving speed of the head of the second user; or,
the motion state information comprises second indication information, the value of the second indication information is greater than a second set value, and the greater the value of the second indication information, the higher the degree of change of the rotation angular speed of the head of the second user; or,
the motion state information comprises second indication information, the value of the second indication information is smaller than the second set value, and the greater the value of the second indication information, the lower the degree of change of the rotation angular speed of the head of the second user; or,
the motion state information comprises first indication information and second indication information, the value of the first indication information is greater than the first set value and the value of the second indication information is greater than the second set value, and the greater the values of the first indication information and the second indication information, the higher the degree of change of the moving speed of the head of the second user and the higher the degree of change of the rotation angular speed of the head of the second user; or,
the motion state information comprises first indication information and second indication information, the value of the first indication information is smaller than the first set value and the value of the second indication information is smaller than the second set value, and the greater the values of the first indication information and the second indication information, the lower the degree of change of the moving speed of the head of the second user and the lower the degree of change of the rotation angular speed of the head of the second user.
7. The method of claim 6, wherein the motion state information comprises the first indication information, and the sampling rate used in the downsampling processing has a mapping relationship with the first indication information; or,
the motion state information comprises the second indication information, and the sampling rate used in the downsampling processing has a mapping relationship with the second indication information; or,
the motion state information comprises the first indication information and the second indication information, and the sampling rate used in the downsampling processing has a mapping relationship with both the first indication information and the second indication information.
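Claim 7's mapping from indication values to a sampling rate can be sketched as a table lookup. The sketch assumes claim 6's first encoding (a larger indication value means a higher degree of change); the tables, values, and the max-of-both rule for combining the two indications are illustrative assumptions.

```python
# Hypothetical indication-value -> sampling step tables. Step 1 means
# no downsampling; larger steps discard more of the model.
FIRST_RATE  = {0: 1, 1: 2, 2: 4}   # moving-speed change indication
SECOND_RATE = {0: 1, 1: 2, 2: 4}   # rotation-angular-speed change indication

def sampling_step(first=None, second=None):
    """Map whichever indication values are present to a sampling step;
    when both are present, take the coarser (larger) step. Unknown
    values fall back to the coarsest step in the table."""
    steps = [1]
    if first is not None:
        steps.append(FIRST_RATE.get(first, max(FIRST_RATE.values())))
    if second is not None:
        steps.append(SECOND_RATE.get(second, max(SECOND_RATE.values())))
    return max(steps)
```

Sending a small discrete indication value instead of raw speeds (claims 3, 6, 7) trades a little precision for an even smaller and more stable feedback message.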
8. A first terminal device, wherein the first terminal device establishes video communication with a second terminal device, the first terminal device comprising:
a communicator, configured to receive motion state information from the second terminal device, the motion state information being used for characterizing a head pose change of a user of the second terminal device;
a processor, configured to collect depth information and RGB image information of a first user, determine a three-dimensional model according to the depth information and the RGB image information, and send the three-dimensional model to the second terminal device; the first terminal device and the second terminal device are terminal devices participating in remote three-dimensional communication and located in different places; the first user is a user of the first terminal device;
the processor is further configured to reduce, after the motion state information is received, the resolution of the three-dimensional model to be sent to the second terminal device according to the motion state information, so as to reduce the data amount of the three-dimensional model during transmission; the resolution is the number of voxels of the three-dimensional model in a specified direction, and different resolutions correspond to different motion state information; the second terminal device is configured with a correspondence between preset speed ranges and resolution adjustment levels; the second terminal device acquires the current speed of a second user, determines the adjustment level corresponding to the preset speed range to which the current speed belongs, and sends the resolution adjustment level to the first terminal device as the motion state information; the current speed is the current moving speed of the head of the second user or the rotation angular speed of the head of the second user; the second user is a user of the second terminal device;
the communicator is further configured to send the resolution-adjusted three-dimensional model to the second terminal device.
9. A second terminal device, wherein the second terminal device establishes video communication with a first terminal device, and the second terminal device is configured with a correspondence between preset speed ranges and resolution adjustment levels; the second terminal device comprising:
a processor, configured to acquire, in response to a motion operation of a second user, the current speed of the second user, determine the adjustment level corresponding to the preset speed range to which the current speed belongs, and take the resolution adjustment level as motion state information; the second user is a user of the second terminal device, and the current speed is the current moving speed of the head of the second user or the rotation angular speed of the head of the second user;
a communicator, configured to send the motion state information to the first terminal device; the motion state information is used for adjusting the resolution of the three-dimensional model to be sent to the second terminal device, so as to reduce the data amount of the three-dimensional model during transmission; the three-dimensional model is obtained by the first terminal device by collecting depth information and RGB image information of a user of the first terminal device; the resolution is the number of voxels of the three-dimensional model in a specified direction;
the communicator is further configured to receive the three-dimensional model, sent by the first terminal device, whose resolution has been adjusted according to the motion state information;
the processor is further configured to render the resolution-adjusted three-dimensional model to a display screen;
the display screen is configured to display the resolution-adjusted three-dimensional model.
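The feedback loop between the two claimed devices (claims 8 and 9) can be simulated end to end in a few lines. Both devices are modeled in-process; the class names, base voxel count, and halving-per-level rule are illustrative assumptions, not details from the patent.

```python
class FirstDevice:
    """Model sender (claim 8): reduces resolution per the received level."""
    BASE_VOXELS = 256  # voxel count along the specified direction (assumed)

    def model_for(self, adjustment_level):
        # Halve the voxel count once per adjustment level (assumed rule).
        return self.BASE_VOXELS >> adjustment_level

class SecondDevice:
    """Model receiver (claim 9): maps head speed to an adjustment level."""
    LEVELS = [(0.2, 0), (0.5, 1), (1.0, 2), (float("inf"), 3)]  # (m/s upper bound, level)

    def motion_state(self, head_speed):
        for upper, level in self.LEVELS:
            if head_speed < upper:
                return level
        return self.LEVELS[-1][1]

first, second = FirstDevice(), SecondDevice()
level = second.motion_state(0.7)   # moderate head motion -> level 2
voxels = first.model_for(level)    # 256 voxels reduced to 64
```

The design rationale is that during fast head motion the rendered model is blurred by motion anyway, so a coarser model is perceptually acceptable while substantially cutting transmission bandwidth.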
CN202110532603.6A 2021-05-17 2021-05-17 Model data transmission method and device Active CN113515193B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110532603.6A CN113515193B (en) 2021-05-17 2021-05-17 Model data transmission method and device


Publications (2)

Publication Number Publication Date
CN113515193A CN113515193A (en) 2021-10-19
CN113515193B true CN113515193B (en) 2023-10-27

Family

ID=78064235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110532603.6A Active CN113515193B (en) 2021-05-17 2021-05-17 Model data transmission method and device

Country Status (1)

Country Link
CN (1) CN113515193B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116055670B (en) * 2023-01-17 2023-08-29 深圳图为技术有限公司 Method for collaborative checking three-dimensional model based on network conference and network conference system

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109660818A (en) * 2018-12-30 2019-04-19 广东彼雍德云教育科技有限公司 A kind of virtual interactive live broadcast system
CN109712224A (en) * 2018-12-29 2019-05-03 青岛海信电器股份有限公司 Rendering method, device and the smart machine of virtual scene
CN110166758A (en) * 2019-06-24 2019-08-23 京东方科技集团股份有限公司 Image processing method, device, terminal device and storage medium
CN110850977A (en) * 2019-11-06 2020-02-28 成都威爱新经济技术研究院有限公司 Stereoscopic image interaction method based on 6DOF head-mounted display
CN111540055A (en) * 2020-04-16 2020-08-14 广州虎牙科技有限公司 Three-dimensional model driving method, device, electronic device and storage medium
CN111641841A (en) * 2020-05-29 2020-09-08 广州华多网络科技有限公司 Virtual trampoline activity data exchange method, device, medium and electronic equipment
CN112037090A (en) * 2020-08-07 2020-12-04 湖南翰坤实业有限公司 Knowledge education system based on VR technology and 6DOF posture tracking
CN112446939A (en) * 2020-11-19 2021-03-05 深圳市中视典数字科技有限公司 Three-dimensional model dynamic rendering method and device, electronic equipment and storage medium
CN112468830A (en) * 2019-09-09 2021-03-09 阿里巴巴集团控股有限公司 Video image processing method and device and electronic equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110279453A1 (en) * 2010-05-16 2011-11-17 Nokia Corporation Method and apparatus for rendering a location-based user interface
US9946963B2 (en) * 2013-03-01 2018-04-17 Layar B.V. Barcode visualization in augmented reality
US11188146B2 (en) * 2015-10-17 2021-11-30 Arivis Ag Direct volume rendering in virtual and/or augmented reality
US10979663B2 (en) * 2017-03-30 2021-04-13 Yerba Buena Vr, Inc. Methods and apparatuses for image processing to optimize image resolution and for optimizing video streaming bandwidth for VR videos




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant