WO2023236815A1 - Three-dimensional model transmission method and apparatus, and storage medium and program product - Google Patents

Three-dimensional model transmission method and apparatus, and storage medium and program product

Info

Publication number
WO2023236815A1
WO2023236815A1 PCT/CN2023/097180 CN2023097180W
Authority
WO
WIPO (PCT)
Prior art keywords
model
three-dimensional model
physical
three-dimensional scene
Prior art date
Application number
PCT/CN2023/097180
Other languages
French (fr)
Chinese (zh)
Inventor
王志强 (WANG Zhiqiang)
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2023236815A1 publication Critical patent/WO2023236815A1/en


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 — Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 — Manipulating 3D models or images for computer graphics
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 — Image analysis
    • G06T 7/10 — Segmentation; Edge detection
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 65/00 — Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L 65/60 — Network streaming of media packets
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 — Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 — Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 — Processing image signals
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 — Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 — Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/194 — Transmission of image signals
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 — Television systems
    • H04N 7/14 — Systems for two-way working

Definitions

  • Embodiments of the present application relate to, but are not limited to, the field of communication technology, and in particular to a three-dimensional model transmission method and apparatus, a storage medium, and a program product.
  • Embodiments of the present application provide a three-dimensional model transmission method and apparatus, a storage medium, and a program product.
  • Embodiments of the present application provide a three-dimensional model transmission method, including: acquiring multiple video images from the design end, where the video images are obtained by the design end by performing frame extraction on the video stream of an augmented reality video call,
  • and the augmented reality video call is established between the design end and the client; performing segmentation processing on the multiple video images to obtain a physical-object image set and an environment image set; performing modeling processing on the physical-object image set to obtain a physical three-dimensional model; performing modeling processing on the environment image set to obtain an environment three-dimensional model; generating a first three-dimensional scene model according to the physical three-dimensional model and the environment three-dimensional model; and sending the first three-dimensional scene model to the client.
  • Embodiments of the present application also provide a three-dimensional model transmission method, including: establishing an augmented reality video call with the design end; and receiving a first three-dimensional scene model sent by the server, where the first three-dimensional scene model
  • is obtained by the server based on a physical three-dimensional model and an environment three-dimensional model.
  • The physical three-dimensional model is obtained by the server by performing modeling processing on a physical-object image set.
  • The environment three-dimensional model is obtained by the server by performing modeling processing on an environment image set.
  • The physical-object image set and the environment image set are obtained by the server by segmenting multiple video images, and the video images are obtained by the design end from the augmented reality video call's
  • video stream by performing frame extraction processing.
  • Embodiments of the present application also provide a three-dimensional model transmission apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • The processor, when executing the computer program, implements the three-dimensional model transmission method described above.
  • Embodiments of the present application also provide a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are used to execute the three-dimensional model transmission method described above.
  • Embodiments of the present application further provide a computer program product, which includes a computer program or computer instructions.
  • The computer program or computer instructions are stored in a computer-readable storage medium.
  • A processor of a computer device reads the computer program or computer instructions from the computer-readable storage medium.
  • The processor executes the computer program or computer instructions, so that the computer device performs the three-dimensional model transmission method described above.
  • Figure 1 is a flow chart of a three-dimensional model transmission method provided by an embodiment of the present application.
  • Figure 2 is a flow chart of the method of step S150 in Figure 1;
  • Figure 3 is a flow chart of a three-dimensional model transmission method provided by another embodiment of the present application.
  • Figure 4 is a flow chart of the method of step S130 in Figure 1;
  • Figure 5 is a flow chart of the method of step S140 in Figure 1;
  • Figure 6 is a flow chart of a three-dimensional model transmission method provided by another embodiment of the present application.
  • Figure 7 is a flow chart of a three-dimensional model transmission method provided by another embodiment of the present application.
  • Figure 8 is a flow chart of a three-dimensional model transmission method provided by another embodiment of the present application.
  • Figure 9 is a flow chart of a three-dimensional model transmission method provided by an embodiment of the present application.
  • Figure 10 is a schematic structural diagram of a three-dimensional model transmission device provided by an embodiment of the present application.
  • Figure 11 is a schematic structural diagram of a three-dimensional model transmission device provided by another embodiment of the present application.
  • This application provides a three-dimensional model transmission method and apparatus, a storage medium, and a program product.
  • The server obtains multiple video images from the design end.
  • The video images are obtained by the design end by performing frame extraction on the video stream of the augmented reality video call.
  • The augmented reality video call is established between the design end and the client; the multiple video images are then segmented to obtain a physical-object image set and an environment image set, and the physical-object image set is modeled to obtain a physical three-dimensional model.
  • The environment image set is modeled to obtain an environment three-dimensional model.
  • A first three-dimensional scene model is generated based on the physical three-dimensional model and the environment three-dimensional model.
  • The first three-dimensional scene model is sent to the client.
  • That is, the design end performs frame extraction processing on the video stream of the augmented reality video call to obtain multiple video images.
  • The server processes the multiple video images to finally obtain the first three-dimensional scene model, and sends the first three-dimensional scene model to the client. Therefore, embodiments of the present application can realize transmission of a three-dimensional model in an augmented reality scene.
  • Figure 1 is a flow chart of a three-dimensional model transmission method provided by an embodiment of the present application.
  • the three-dimensional model transmission method may include but is not limited to step S110, step S120, step S130, step S140, step S150 and step S160.
  • Step S110: Obtain multiple video images from the design end.
  • The video images are obtained by the design end by performing frame extraction processing on the video stream of the augmented reality video call, and the augmented reality video call is established between the design end and the client.
  • The design end can capture a single clear frame from the video, or capture multiple sets of images; that is, the design end can perform frame extraction processing on the video stream of the augmented reality video call to obtain multiple video images.
  • In some embodiments, the design end sends a video conference invitation to the client.
  • The design end and the client can successfully establish an augmented reality video call after server-side authentication and confirmation; alternatively, the client sends a video conference invitation to the design end.
  • After server-side authentication and confirmation, the design end and the client can successfully establish the augmented reality video call.
  • In addition, switching between audio and video can be supported.
  • For example, the client can switch to the playback interface of application A and set the augmented reality video call as a background program so that the augmented reality video call runs in the background.
  • Application A can be audio or video; for another example, while the client is in an augmented reality video call with the design end, the client can switch the video call to a voice call. No specific restrictions are made here.
  • In some embodiments, the video stream of the augmented reality video call can adopt YUV encoding and the H.264 video protocol, where YUV is a color encoding format: "Y" represents luminance, that is, the grayscale value, while "U" and "V" represent chrominance; "U" and "V" together describe the image's color and saturation and can be used to specify the color of a pixel.
  • The H.264 video protocol is a digital video compression coding standard.
  • The video stream of the augmented reality video call can be the video stream on the client side, the video stream on the design side, or both the client-side and design-side video streams, determined according to the actual situation; no specific restrictions are made here.
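  • As an illustrative aside, the YUV-to-RGB relationship described above can be sketched as follows. The BT.601 "video range" conversion constants below are an assumption for illustration; the variant actually used by an H.264 stream is signalled in the bitstream and is not specified by this application.

```python
def yuv_to_rgb(y, u, v):
    """Convert one BT.601 video-range YUV sample to RGB.

    Y (roughly 16..235) carries brightness, i.e. the grayscale value;
    U and V (roughly 16..240) carry chroma. Illustrative only: the
    constants assume the common BT.601 video-range convention.
    """
    c, d, e = y - 16, u - 128, v - 128
    clamp = lambda x: max(0, min(255, round(x)))
    r = clamp(1.164 * c + 1.596 * e)
    g = clamp(1.164 * c - 0.392 * d - 0.813 * e)
    b = clamp(1.164 * c + 2.017 * d)
    return r, g, b
```

With neutral chroma (U = V = 128) the result is a pure gray level, matching the statement that "Y" alone is the grayscale value.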
  • After the design end and the client establish an augmented reality video call, the design end can extract frames from the design-side video stream to obtain multiple video images; or the design end can perform frame extraction processing on the client-side video stream to obtain multiple video images; or the design end can perform
  • frame extraction processing on both the design-side video stream and the client-side video stream to obtain multiple video images.
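  • The frame extraction described above amounts to sampling frames from the decoded call stream. A minimal sketch, assuming the stream has already been decoded into an iterable of frames (in practice a decoder, e.g. one built on OpenCV or ffmpeg, would supply them):

```python
def extract_frames(stream, interval=30):
    """Pick every `interval`-th frame from a decoded video stream.

    `stream` is any iterable of frames; here plain Python objects
    stand in for decoded video frames. The sampled frames are what
    would be sent on for segmentation and modeling. The fixed
    sampling interval is an illustrative assumption; the application
    also mentions capturing single clear frames.
    """
    return [frame for i, frame in enumerate(stream) if i % interval == 0]
```

For a 100-frame stream sampled every 30 frames, this yields four video images.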
  • Step S120: Perform segmentation processing on the multiple video images to obtain a physical-object image set and an environment image set.
  • The physical-object image set includes multiple physical-object images.
  • The environment image set includes multiple environment images.
  • In some embodiments, edge contour scanning may be used to segment the multiple video images to obtain the physical-object image set and the environment image set; alternatively, a deep learning algorithm can be used to recognize the physical-object images and environment images in the multiple video images, and the segmentation areas corresponding to the recognized physical-object images and environment images are determined according to composition rules.
  • The physical-object image set and the environment image set are then obtained from the segmentation areas corresponding to the physical-object images and the environment images, where the composition rules may include how the physical-object image or environment image is divided within the segmentation area.
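  • As a toy stand-in for this segmentation step (the application names edge contour scanning and deep learning; the threshold rule below is purely illustrative), a video image can be split into complementary physical-object and environment masks:

```python
def segment(image, threshold=128):
    """Split a grayscale image into a physical-object mask and an
    environment mask.

    A crude illustrative substitute for the edge-contour or
    deep-learning segmentation described above: pixels brighter than
    `threshold` are treated as the physical object, the rest as the
    environment. `image` is a list of rows of pixel intensities;
    two boolean masks of the same shape are returned.
    """
    physical = [[p > threshold for p in row] for row in image]
    environment = [[not f for f in row] for row in physical]
    return physical, environment
```

Every pixel lands in exactly one of the two sets, mirroring how each video image contributes to both the physical-object image set and the environment image set.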
  • Step S130: Perform modeling processing on the physical-object image set to obtain a physical three-dimensional model.
  • Step S140: Perform modeling processing on the environment image set to obtain an environment three-dimensional model.
  • Step S150: Generate a first three-dimensional scene model based on the physical three-dimensional model and the environment three-dimensional model.
  • Step S160: Send the first three-dimensional scene model to the client.
  • In some embodiments, the server can also send the first three-dimensional scene model to the design end; no specific restrictions are made here.
  • In this way, the server obtains multiple video images from the design end, where the video images are obtained by the design end by performing frame extraction processing on the video stream of the augmented reality video call
  • and the augmented reality video call is established between the design end and the client; the multiple video images are then segmented to obtain a physical-object image set and an environment image set, and the physical-object image set is modeled to obtain a physical three-dimensional model.
  • That is, the design end performs frame extraction processing on the video stream of the augmented reality video call to obtain multiple video images, and the server processes the multiple video images to finally obtain the first three-dimensional scene model and sends the first three-dimensional scene model to the client. Therefore, embodiments of the present application can construct a three-dimensional model in an augmented reality scene and realize transmission of the three-dimensional model.
  • the design end, the server end and the client can all perform three-dimensional rendering processing on the first three-dimensional scene model, which is not specifically limited here.
  • For example, the client performs three-dimensional rendering processing on the first three-dimensional scene model sent by the server to obtain a rendered first three-dimensional scene model, and then displays it; or the server sends the first
  • three-dimensional scene model to the design end, and the design end performs three-dimensional rendering processing on it to obtain a rendered first three-dimensional scene model, and then displays it;
  • or the server first performs three-dimensional rendering processing on the first three-dimensional scene model to obtain a rendered first three-dimensional scene model, and then sends the rendered first three-dimensional scene model to the design end or the client.
  • In addition, the first three-dimensional scene model can be subjected to three-dimensional rendering processing multiple times, and whether multiple renderings are necessary can be determined according to the model scale of the first three-dimensional scene model.
  • For example, when the first three-dimensional scene model simulates an amusement park,
  • the first three-dimensional scene model can be rendered multiple times: the server first performs three-dimensional rendering processing on the first three-dimensional scene model to obtain a rendered first three-dimensional scene model.
  • The rendered scene model is then sent to the design end (or client), and the design end (or client) performs three-dimensional rendering processing again on the rendered first three-dimensional scene model sent by the server.
  • This yields a re-rendered first three-dimensional scene model, which is then displayed. Alternatively, when the first three-dimensional scene model simulates a bedroom, the first three-dimensional scene model may be rendered only once, and that single rendering pass can be performed by the design end, the server, or the client; no specific restrictions are made here.
  • Three-dimensional rendering is essentially the process by which a computer converts a three-dimensional model into a two-dimensional image.
  • Rendering involves technologies such as scanline rendering, ray tracing, and photon mapping, which can simulate the interaction between light and various substances (such as materials and surface textures). Rendering therefore requires the support of 3D plug-ins, 3D software, and hardware.
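  • The 3D-to-2D conversion at the heart of rendering can be illustrated by the pinhole-camera perspective divide; real renderers layer rasterization, shading, ray tracing, and so on over this geometric core. The helper below is a hypothetical illustration, not part of the application:

```python
def project(point, focal=1.0):
    """Project a 3D point in camera space onto the 2D image plane.

    A pinhole-camera perspective divide: the camera looks down the
    negative z-axis and `focal` is the focal length. This is only
    the geometric step of converting a 3D model into a 2D image;
    visibility, lighting, and materials are handled by the rendering
    techniques named above.
    """
    x, y, z = point
    if z >= 0:
        raise ValueError("point must be in front of the camera (z < 0)")
    return (focal * x / -z, focal * y / -z)
```

A point on the optical axis projects to the image center, and points twice as far away appear half as large, which is the depth cue perspective rendering provides.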
  • step S150 is further described.
  • This step S150 may include but is not limited to step S210, step S220 and step S230.
  • Step S210: Obtain a preset three-dimensional model library, where the three-dimensional model library includes an environment model library.
  • The environment model library may include multiple preset environment models.
  • The multiple preset environment models may be created by model makers using three-dimensional modeling tools (such as 3D Studio MAX, a three-dimensional animation rendering and production application), may be obtained by modifying existing model data and using the modified model data for modeling processing, or may be existing models; no specific restrictions are made here.
  • The preset three-dimensional model library may include target three-dimensional scene models, model-related data, basic model components, model materials, and so on, where the model-related data includes the model scale, the location information of the real scene simulated by the model, and the like; no specific restrictions are made here.
  • For example, the server determines the target three-dimensional scene model corresponding to a mountain peak from the preset three-dimensional model library based on the location information, and sends the target three-dimensional scene model to the design end.
  • The design end plans a route up the mountain based on the target three-dimensional scene model, marks the landmark attractions, shops, restrooms, and so on along the route, and pushes the route planning and the related marked information to the AR client, thereby improving the user experience; here the AR device is the client.
  • Step S220: Determine a target environment three-dimensional model from the environment model library according to the physical three-dimensional model and the environment three-dimensional model.
  • For example, the physical three-dimensional model and the environment three-dimensional model can be matched against the preset environment models in the environment model library to determine the target environment three-dimensional model, thereby improving modeling efficiency and shortening modeling time.
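  • The matching in step S220 could, for example, be realized as tag-overlap scoring against the preset environment models. The tag-set representation is an assumption made for illustration; the application only states that the models are matched:

```python
def match_environment(env_tags, library):
    """Pick the preset environment model whose descriptive tags best
    match the reconstructed environment's tags.

    `library` maps a preset model's name to its set of tags. Returns
    the name with the largest tag overlap, or None if nothing
    overlaps at all. A sketch of step S220 under the assumption that
    models are described by tag sets.
    """
    best, best_score = None, 0
    for name, tags in library.items():
        score = len(env_tags & tags)
        if score > best_score:
            best, best_score = name, score
    return best
```

Reusing a matched preset model in place of modeling from scratch is what yields the stated gain in modeling efficiency.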
  • Step S230: Generate the first three-dimensional scene model based on the physical three-dimensional model and the target environment three-dimensional model.
  • In this way, the server obtains a preset three-dimensional model library that includes an environment model library, then determines the target environment three-dimensional model from the environment model library based on the physical three-dimensional model and the environment three-dimensional
  • model, and finally generates the first three-dimensional scene model based on the physical three-dimensional model and the target environment three-dimensional model. Therefore, embodiments of the present application can determine the target environment three-dimensional model from the environment model library based on the physical three-dimensional model and the environment three-dimensional model, improving modeling efficiency and shortening modeling time.
  • the three-dimensional model transmission method may also include but is not limited to step S310, step S320, step S330, step S340 and step S350.
  • Step S310: Receive annotation information sent by the client.
  • The annotation information may include coordinate information, text information, annotation models, and so on.
  • The annotation information is obtained by the client performing virtual annotation processing on the first three-dimensional scene model. For example, when the first three-dimensional scene model simulates a living room with a sofa placed at coordinates (8, 4, 0), and the sofa needs to be moved to the position at coordinates (4, 4, 0), the coordinate information may indicate that the physical three-dimensional model at coordinates (8, 4, 0) is to be moved to the position at coordinates (4, 4, 0).
  • Step S320: Determine an annotation area in the first three-dimensional scene model according to the annotation information.
  • The annotation area can be the area before modification, the area after modification, or both the area before and the area after modification.
  • For example, when the annotation information indicates that the physical three-dimensional model at coordinates (8, 4, 0)
  • is to be moved to coordinates (4, 4, 0), the annotation area can be the area at coordinates (8, 4, 0) or the area at coordinates (4, 4, 0); no specific restrictions are made here.
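  • The coordinate annotation in the sofa example can be sketched as follows. The dictionary representation of the scene, and the annotation's field names, are hypothetical stand-ins for the real model format:

```python
def apply_annotation(scene, annotation):
    """Apply one coordinate annotation to a scene model.

    `scene` maps an object name to its (x, y, z) position;
    `annotation` names the object to move and its target position.
    Returns a new scene plus the marked areas (positions before and
    after modification), leaving the input scene untouched. A sketch
    of steps S310 to S330 over a hypothetical representation.
    """
    obj, new_pos = annotation["object"], annotation["to"]
    old_pos = scene[obj]
    updated = dict(scene)
    updated[obj] = new_pos
    marked = {"before": old_pos, "after": new_pos}
    return updated, marked
```

Keeping both the before and after areas matches the statement that the annotation area may cover either or both positions.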
  • Step S330: Nest the annotation information into the annotation area to obtain a second three-dimensional scene model.
  • Step S340: Superimpose the first three-dimensional scene model and the second three-dimensional scene model to obtain a third three-dimensional scene model.
  • Step S350: Send the third three-dimensional scene model to the client.
  • In some embodiments, the server can also send the third three-dimensional scene model to the design end; no specific restrictions are made here.
  • In this way, the server receives the annotation information sent by the client, determines the annotation area in the first three-dimensional scene model according to the annotation information, and then
  • nests the annotation information into the annotation area to obtain the second three-dimensional scene model.
  • The first three-dimensional scene model and the second three-dimensional scene model are superimposed to obtain the third three-dimensional scene model.
  • The third three-dimensional scene model is sent to the client.
  • Therefore, embodiments of this application can receive annotation information fed back by the client and modify the first three-dimensional scene model based on the annotation information, reducing the communication costs caused by professional barriers, achieving rapid iteration and delivery, helping to meet customer needs, and improving the user experience.
  • In some embodiments, the design end, the server, and the client can all perform three-dimensional rendering processing on the third three-dimensional scene model; no specific restrictions are made here.
  • For example, the client performs three-dimensional rendering processing on the third three-dimensional scene model sent by the server to obtain a rendered third three-dimensional scene model, and then displays it; or the server sends the third
  • three-dimensional scene model to the design end, and the design end performs three-dimensional rendering processing on it to obtain a rendered third three-dimensional scene model, and then displays it;
  • or the server first performs three-dimensional rendering processing on the third three-dimensional scene model to obtain a rendered third three-dimensional scene model, and then sends it to the design end or the client.
  • In addition, the third three-dimensional scene model can be subjected to three-dimensional rendering processing multiple times, and whether multiple renderings are necessary can be determined according to the model scale of the third three-dimensional scene model.
  • For example, the third three-dimensional scene model can be rendered multiple times: the server first performs three-dimensional rendering processing on the third three-dimensional scene model to obtain a rendered third three-dimensional scene model, then sends it to the design end (or client), and the design end (or client) performs three-dimensional rendering processing again on the rendered third three-dimensional scene model sent by the server.
  • When the three-dimensional model library includes a physical model library, step S130 is further described.
  • Step S130 may include but is not limited to step S410, step S420, step S430, and step S440.
  • Step S410: Parse the physical-object image set to obtain graphic information.
  • The graphic information may include the line outline of the physical-object image, the position of the physical-object image, the size of the physical-object image, the resource address of the physical-object image, and so on; no specific restrictions are made here.
  • Step S420: Obtain physical-image element information from the physical model library according to the graphic information.
  • The physical model library may include multiple preset physical models.
  • The multiple preset physical models may be created by model makers using three-dimensional modeling tools (such as 3D Studio MAX, a three-dimensional animation rendering and production application), may be obtained by modifying existing model data and using the modified model data for modeling processing, or may be existing models; no specific restrictions are made here.
  • Step S430: Use the physical-image element information to construct a physical-image base map.
  • The physical-image base map may be the basic framework of the physical three-dimensional model; no specific restrictions are made here.
  • Step S440: Perform modeling processing on the physical-image base map to obtain the physical three-dimensional model.
  • For example, when the physical object is a vase, the vase is modeled; that is, material veneering, lighting adjustment, texturing, and concave and convex processing are applied to the vase, finally yielding a physical three-dimensional model of the vase.
  • In this way, the server parses the physical-object image set to obtain graphic information, obtains the physical-image element information from the physical model library according to the graphic information, then uses the physical-image element information to construct the physical-image base map, and finally models the physical-image base map to obtain the physical three-dimensional model. Therefore, embodiments of the present application can obtain the physical-image element information from the physical model library through the graphic information and use it to construct the physical-image base map, shortening modeling time and improving modeling efficiency.
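  • Steps S420 to S440 can be sketched as a small pipeline. The field names ("outline", "elements", "detailed") are hypothetical, and the detailing of step S440 (material veneering, lighting, texturing) is stubbed as a flag:

```python
def build_physical_model(graphic_info, physical_library):
    """Sketch of steps S420-S440 over a hypothetical representation.

    `graphic_info` is the parsed output of step S410: a list of
    dicts, each with an 'outline' key naming the recognized contour.
    `physical_library` maps an outline name to its element
    information. The base map (step S430) attaches the looked-up
    elements; the final modeling pass (step S440) is stubbed.
    """
    base_map = []
    for info in graphic_info:
        elements = physical_library.get(info["outline"], {})  # step S420
        base_map.append({**info, "elements": elements})        # step S430
    # Step S440: detailing (material, lighting, texture) stubbed as a flag.
    return [{**part, "detailed": True} for part in base_map]
```

The library lookup is where modeling time is saved: element information is reused instead of being rebuilt from the images.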
  • step S140 is further described.
  • This step S140 may include but is not limited to step S510, step S520 and step S530.
  • Step S510: Parse the environment image set to obtain label information.
  • The label information may include material attribute information, color attribute information, light-source attribute information, and other attributes of the environment's appearance; no specific restrictions are made here.
  • Step S520: Obtain environment-image element information from the environment model library according to the label information.
  • The environment-image element information may include index information such as the material attribute information, color attribute information, and light-source attribute information of the environment's appearance; no specific restrictions are made here.
  • Step S530: Use the environment-image element information to perform modeling processing on the environment image set to obtain the environment three-dimensional model.
  • For example, using the environment-image element information to model the environment image set may include material veneering, lighting adjustment, texturing, concave and convex processing, collapse processing, and so on, so that the model has a three-dimensional appearance; no specific restrictions are made here.
  • In this way, the server parses the environment image set to obtain label information, then obtains the environment-image element information from the environment model library according to the label information, and finally uses the environment-image element information to model the environment image set to obtain the environment three-dimensional model. Therefore, embodiments of the present application obtain the environment-image element information from the environment model library through the label information and use it to model the environment image set, shortening modeling time and improving modeling efficiency.
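  • Step S520 can likewise be sketched as resolving the parsed labels against the environment model library; the string-keyed representation is a hypothetical stand-in:

```python
def resolve_environment_elements(labels, environment_library):
    """Resolve parsed appearance labels against the environment
    model library (step S520, sketched).

    `labels` are the material/color/light-source labels parsed in
    step S510; `environment_library` maps a label to its element
    information. Labels with no library entry are kept with None so
    step S530 can still model them from scratch.
    """
    return {label: environment_library.get(label) for label in labels}
```

Only the labels found in the library skip fresh modeling, which is the source of the claimed time saving.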
  • the three-dimensional model transmission method may also include but is not limited to step S610 and step S620.
  • Step S610: Determine a target environment three-dimensional model from the environment model library according to the physical three-dimensional model.
  • The environment model library may include multiple preset environment models.
  • The multiple preset environment models may be created by model makers using three-dimensional modeling tools (such as 3D Studio MAX, a three-dimensional animation rendering and production application), may be obtained by modifying existing model data and using the modified model data for modeling processing, or may be existing models; no specific restrictions are made here.
  • For example, the physical three-dimensional model can be matched against the preset environment models in the environment model library to determine the target environment three-dimensional model without modeling the environment image set, which shortens the modeling time.
  • The embodiments of the present application do not specifically limit this.
  • Step S620 Nest the physical three-dimensional model and the target environment three-dimensional model to generate a first three-dimensional scene model.
  • the server determines the target environment three-dimensional model from the environment model library based on the physical three-dimensional model, and finally nests the physical three-dimensional model and the target environment three-dimensional model to generate the first three-dimensional scene model. That is to say, modeling processing may be performed only on the physical image collection to obtain the physical three-dimensional model; the target environment three-dimensional model is determined from the environment model library based on the physical three-dimensional model, and the physical three-dimensional model is adapted to a preset environment three-dimensional model in the environment model library, thereby achieving the effect of remote design, meeting the diverse needs of users, and improving user experience.
  • for example, when designing the home layout of a living room, the client can send a physical image collection including sofas, refrigerators, seats, etc. to the server, and the server performs modeling processing on the physical image collection to obtain multiple physical three-dimensional models.
  • the server determines the target environment three-dimensional model from the environment model library based on the multiple physical three-dimensional models, nests the physical three-dimensional models and the target environment three-dimensional model, and generates the first three-dimensional scene model. Therefore, the embodiment of the present application does not need to perform modeling processing on the environment image collection, which shortens the modeling time.
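  • The selection-and-nesting step above can be sketched as follows; this is an illustrative simplification (the coverage heuristic, the `supports` field, and the library entries are assumptions of this rewrite, not the patent's adaptation method):

```python
def choose_target_environment(physical_models, environment_library):
    # Pick the preset environment whose supported object tags best
    # cover the given physical three-dimensional models.
    def coverage(env):
        return len(set(env["supports"]) & set(physical_models))
    return max(environment_library, key=coverage)

def nest(physical_models, environment):
    # Nesting processing: embed the physical models into the chosen
    # environment to form the first three-dimensional scene model.
    return {"environment": environment["name"], "objects": list(physical_models)}

library = [
    {"name": "living_room", "supports": ["sofa", "refrigerator", "seat"]},
    {"name": "office", "supports": ["desk", "seat"]},
]
scene = nest(["sofa", "refrigerator"],
             choose_target_environment(["sofa", "refrigerator"], library))
```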
  • step S150 is further described.
  • Step S150 may include but is not limited to the following steps:
  • the physical three-dimensional model and the environmental three-dimensional model are nested to generate a first three-dimensional scene model.
  • the server side can obtain multiple video images from the design side.
  • the video images are obtained by the design side by performing frame extraction processing on the video stream of the augmented reality video call.
  • the augmented reality video call is established by the design side and the client.
  • multiple video images are segmented to obtain a collection of physical images and a collection of environmental images.
  • the physical image collection is modeled to obtain a three-dimensional model of the physical object.
  • the environment image collection is analyzed and processed to obtain label information and modeled to obtain the environment three-dimensional model, and then the physical three-dimensional model and the environment three-dimensional model are nested to generate the first three-dimensional scene model. Therefore, the embodiment of the present application can model the environment and physical objects in the scene where the user is located, and obtain the first three-dimensional scene model by nesting the physical three-dimensional model and the environment three-dimensional model, restoring the real scene and achieving the effect of remote customization.
  • when the server obtains multiple video images from the design end, it performs segmentation processing on the multiple video images to obtain a physical image set and an environment image set, then performs modeling processing on the physical image set to obtain the physical three-dimensional model, parses the environment image set to obtain label information, and adapts the label information to the preset environment models in the environment model library. When the label information does not match any preset environment model, modeling processing is performed on the environment image set to obtain the environment three-dimensional model, and the physical three-dimensional model and the environment three-dimensional model are nested to obtain the first three-dimensional scene model.
  • the environment three-dimensional model can be stored in the environment model library, and this application does not impose specific restrictions on this.
  • the server acquires the environment image set sent by the client and performs modeling processing on it to obtain the environment three-dimensional model. The server then acquires multiple video images from the design end; the video images are obtained by the design end by performing frame extraction processing on the video stream of the augmented reality video call, and the augmented reality video call is established between the design end and the client. The multiple video images are then segmented to obtain a physical image set, modeling processing is performed on the physical image set to obtain the physical three-dimensional model, the first three-dimensional scene model is generated from the physical three-dimensional model and the environment three-dimensional model, and finally the first three-dimensional scene model is sent to the client. Therefore, the embodiment of the present application performs modeling processing on the environment image set sent by the client, meeting customer needs and achieving the effect of remote customization.
  • FIG. 7 shows a three-dimensional model transmission method provided by another embodiment of the present application.
  • the three-dimensional model transmission method may include but is not limited to step S710 and step S720.
  • Step S710 Establish an augmented reality video call with the design end.
  • the design end sends a video conference invitation to the client.
  • the design end can establish an augmented reality video call with the client only after confirmation by the server-side queuing system and the server-side authentication system; alternatively, the client sends a video conference invitation to the design end, and the client can establish an augmented reality video call with the design end after confirmation by the server-side queuing system and the server-side authentication system; no specific restriction is made here.
  • Step S720 Receive the first three-dimensional scene model sent by the server.
  • the first three-dimensional scene model is obtained by the server based on the physical three-dimensional model and the environmental three-dimensional model.
  • the physical three-dimensional model is obtained by the server by performing modeling processing on the physical image collection, and the environment three-dimensional model is obtained by the server by performing modeling processing on the environment image collection. Both the physical image collection and the environment image collection are obtained by the server by segmenting multiple video images, and the video images are obtained by the design end by performing frame extraction processing on the video stream of the augmented reality video call.
  • edge contour scanning may be used to segment the multiple video images to obtain a physical image set and an environment image set; alternatively, a deep learning algorithm may be used to recognize the physical images and environment images in the multiple video images, the segmentation areas corresponding to the recognized physical images and environment images are determined respectively according to composition rules, and the physical image set and the environment image set are then obtained based on those segmentation areas, where the composition rules may include rules for the placement position and occupied area of a physical image or environment image within a segmentation area; no specific restriction is made here.
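  • One very simplified way to picture a composition rule is shown below; the area-fraction threshold, the region fields, and the classification heuristic are all assumptions of this rewrite, not the patent's actual rules:

```python
def split_regions(regions, frame_area, max_object_fraction=0.3):
    # Composition-rule sketch: regions occupying a small fraction of the
    # frame are treated as physical objects, large ones as environment.
    physical, environment = [], []
    for r in regions:
        frac = (r["w"] * r["h"]) / frame_area
        (physical if frac <= max_object_fraction else environment).append(r["id"])
    return physical, environment

phys, env = split_regions(
    [{"id": "sofa", "w": 100, "h": 80}, {"id": "wall", "w": 600, "h": 400}],
    frame_area=640 * 480,
)
```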
  • the design end can intercept a single clear frame from the video or intercept multiple sets of images; that is, the design end can perform frame extraction processing on the video stream of the augmented reality video call to obtain multiple video images.
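  • Frame extraction of this kind can be sketched as sampling one frame out of every fixed interval; the interval value and the stream representation below are illustrative assumptions, not the disclosed implementation:

```python
def extract_frames(frame_stream, interval=30):
    # Keep one frame out of every `interval` frames of the call's
    # video stream (e.g. one per second at 30 fps).
    return [f for i, f in enumerate(frame_stream) if i % interval == 0]

frames = extract_frames(list(range(90)), interval=30)
```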
  • audio and video switching can be supported.
  • the client can switch to the playback interface of application A and set the augmented reality video call as a background program so that the augmented reality video call runs in the background.
  • Application A can be an audio or video application.
  • when the client is in an augmented reality video call with the design end, the client can switch the video call to a voice call; no specific restriction is made here.
  • the video stream of the augmented reality video call adopts YUV encoding and the H.264 video protocol, where YUV is a color encoding format: "Y" represents luminance, that is, the grayscale value, while "U" and "V" represent chrominance; together, "U" and "V" describe image color and saturation and can be used to specify the color of a pixel.
  • the H.264 video protocol is a digital video compression encoding standard.
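  • The relationship between RGB pixels and the Y, U, V components described above can be illustrated with the standard BT.601 conversion formulas (the use of BT.601 coefficients here is an assumption of this rewrite; the patent does not specify which YUV variant is used):

```python
def rgb_to_yuv(r, g, b):
    # BT.601 full-range conversion: Y is the luminance (grayscale value),
    # U and V are the chrominance components described in the text.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.14713 * r - 0.28886 * g + 0.436 * b
    v = 0.615 * r - 0.51499 * g - 0.10001 * b
    return y, u, v
```

For a pure gray pixel the chroma components vanish, which is why "Y" alone carries the grayscale value.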
  • the video stream of the augmented reality video call can be the video stream on the client side, the video stream on the design side, or both the client-side and design-side video streams, depending on the actual situation; no specific restriction is made here.
  • after the design end and the client establish an augmented reality video call, the design end can perform frame extraction processing on the design-side video stream to obtain multiple video images; alternatively, the design end can perform frame extraction processing on the client-side video stream to obtain multiple video images; alternatively, the design end can perform frame extraction processing on both the design-side and client-side video streams to obtain multiple video images.
  • by adopting the three-dimensional model transmission method including the above steps S710 to S720, the client first establishes an augmented reality video call with the design end and then receives the first three-dimensional scene model sent by the server, where the first three-dimensional scene model is obtained by the server based on the physical three-dimensional model and the environment three-dimensional model, the physical three-dimensional model is obtained by the server by modeling the physical image collection, the environment three-dimensional model is obtained by the server by modeling the environment image collection, both image collections are obtained by the server by segmenting multiple video images, and the video images are obtained by the design end by extracting frames from the video stream of the augmented reality video call. That is to say, in the scenario where an augmented reality video call is established between the design end and the client, the design end performs frame extraction processing on the video stream of the call to obtain multiple video images, the server processes these video images to obtain the first three-dimensional scene model, and the client finally receives the first three-dimensional scene model sent by the server. Therefore, the embodiment of the present application can realize transmission of a three-dimensional model in an augmented reality scene.
  • the three-dimensional model transmission method may include but is not limited to step S810 and step S820.
  • Step S810 Perform virtual annotation processing on the first three-dimensional scene model to obtain annotation information.
  • the annotation information may include coordinate information, text information, annotated models, etc., and is not specifically limited here.
  • the server may send the first three-dimensional scene model to the design end, and the design end may store and process the first three-dimensional scene model.
  • the client can send the annotation information to the design end in real time. For example, the client can switch the perspective of the first three-dimensional scene model.
  • the client can switch the first three-dimensional scene model to the bedroom area and perform virtual annotation processing on the bedroom area, switch it to the living room area and perform virtual annotation processing on the living room area, or switch it to the bathroom area and perform virtual annotation processing on the bathroom area, and finally send the annotation information of all areas to the design end (or server end). The design end can receive the annotation information of all areas in real time and modify the design of the first three-dimensional scene model stored at the design end based on the annotation information, realizing remote modification of the three-dimensional model and improving the user experience; no specific restriction is made here.
  • Step S820 Send the annotation information to the server.
  • the client can also send the annotation information to the design end, and the design end modifies the design of the first three-dimensional scene model stored at the design end based on the annotation information. Since different designers may be responsible for different modules of the first three-dimensional scene model, multiple designers may make concurrent modifications to it, and concurrency conflicts may occur. To avoid concurrency conflicts, each designer at the design end modifies the source data files corresponding to their own module based on their own design copy; finally, the design end submits all modified source data files to the server, and the server collects the source data files and delivers them to the user in a unified manner. Therefore, in this embodiment of the present application, the first three-dimensional scene model is directly modified based on the annotation information, thereby reducing communication costs and achieving rapid delivery.
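  • The conflict-free collection step can be pictured as a simple union of per-module design copies; the file names and the dict representation below are illustrative assumptions (real source data files would of course be richer):

```python
def merge_design_copies(copies):
    # Each designer's copy touches only the source files of their own
    # module, so a plain union is conflict-free by construction.
    merged = {}
    for copy in copies:
        merged.update(copy)
    return merged

delivered = merge_design_copies([
    {"bedroom.src": "v2"},      # designer A's copy
    {"living_room.src": "v3"},  # designer B's copy
])
```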
  • the annotation information is saved in a relevant file, packaged into a data packet, and sent to the server; after receiving the data packet, the server parses it to obtain the annotation information; no specific restriction is made here.
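  • A minimal sketch of such a round trip is shown below, using JSON as the packet format; the field names (`coords`, `text`, `model`) and the use of JSON are assumptions of this rewrite, since the patent does not specify the packet layout:

```python
import json

def pack_annotation(coords, text, model_id):
    # Client side: save annotation info (coordinate information, text
    # information, annotated model id) into a data packet.
    return json.dumps({"coords": coords, "text": text, "model": model_id}).encode("utf-8")

def unpack_annotation(packet):
    # Server side: parse the data packet to recover the annotation info.
    return json.loads(packet.decode("utf-8"))

packet = pack_annotation([1.0, 2.0, 0.5], "move sofa left", "scene-1")
info = unpack_annotation(packet)
```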
  • the client can perform virtual annotation processing on the first three-dimensional scene model, obtain annotation information, and send the annotation information to the server, so that the The server determines the annotation area in the first three-dimensional scene model based on the annotation information, nests the annotation information on the annotation area, and obtains the second three-dimensional scene model.
  • the first three-dimensional scene model and the second three-dimensional scene model are superimposed to obtain the third three-dimensional scene model. That is, the client can feed back the annotation information to the server, and the server can modify the first three-dimensional scene model based on the annotation information, thereby reducing communication costs while also helping to meet user needs and improve user experience.
  • For example, the user marks building B in the AR device and sends building B to the server through the AR device; the server parses the physical image collection to obtain graphic information, obtains physical image element information from the physical model library according to the graphic information, uses the physical image element information to construct a physical image base map, performs modeling processing on the base map to obtain the physical three-dimensional model, and finally sends the physical three-dimensional model to the client. The user also sends building B to the design end through the AR device.
  • the design end obtains the historical data related to building B from the data storage module based on building B and sends the historical data to the AR device.
  • the design end can send the historical data to the AR device through voice, video or text.
  • the user can also enlarge or reduce the physical three-dimensional model in the AR device through voice control or peripheral input, and switch the viewing angle of the physical three-dimensional model in the AR device to view model details.
  • the AR device serves as the client; no specific restriction is made here.
  • the design end and the client establish an augmented reality video call; the design end then performs frame extraction processing on the video stream of the augmented reality video call to obtain multiple video images, and the multiple video images are segmented to obtain a physical image set and an environment image set, which are then analyzed and processed separately. For the physical image set, analysis yields graphic information, physical image element information is obtained from the physical model library according to the graphic information, the physical image element information is used to construct a physical image base map, and modeling processing is performed on the base map to obtain the physical three-dimensional model. For the environment image set, analysis yields label information, environment image element information is obtained from the environment model library according to the label information, and the environment image element information is used to model the environment image set to obtain the environment three-dimensional model.
  • A first three-dimensional scene model is then generated based on the physical three-dimensional model and the environment three-dimensional model and sent to the client, which performs three-dimensional rendering processing on it; alternatively, the server performs three-dimensional rendering processing on the first three-dimensional scene model and sends the rendered first three-dimensional scene model to the client, and the client may render it again to obtain the re-rendered first three-dimensional scene model. The client performs virtual annotation processing on the rendered or re-rendered first three-dimensional scene model to obtain annotation information, and the first three-dimensional scene model is modified concurrently based on the annotation information: multiple designers at the design end modify the source data files corresponding to their modules based on their own design copies, the design end submits all modified source data files to the server, and the server sends the source data files to the client in a unified manner to achieve remote delivery.
  • an embodiment of the present application also provides a three-dimensional model transmission device.
  • the three-dimensional model transmission device includes a design end 100, a client 300, and a server end 200.
  • the design end 100 includes a first display module 101, a first three-dimensional rendering module 102, a first audio and video processing module 103, a data storage module 104 and an agent integration interface 105.
  • the first display module 101 may be used to display the rendered first three-dimensional scene model or third three-dimensional scene model, or to display the re-rendered first three-dimensional scene model or third three-dimensional scene model, where the display module includes a computer display screen or an AR screen, etc.
  • the first three-dimensional rendering module 102 can be used to perform three-dimensional rendering processing on the first or third three-dimensional scene model to obtain the rendered first or third three-dimensional scene model, and can also be used to perform three-dimensional rendering processing again on the rendered first or third three-dimensional scene model to obtain the re-rendered first or third three-dimensional scene model.
  • the first audio and video processing module 103 can be used to receive the video stream of the augmented reality video call sent to the agent integration interface 105 and perform frame extraction processing on it to obtain multiple video images, and can also be used to receive the first or third three-dimensional scene model, or the rendered first or third three-dimensional scene model, from the agent integration interface 105. The data storage module 104 may be used to store the video stream of the augmented reality video call from the agent integration interface 105, annotation information, model-related data, the first and third three-dimensional scene models, and the rendered first and third three-dimensional scene models, etc. The agent integration interface 105 can be used to obtain multiple video images from the first audio and video processing module 103 and send them to the server 200, and can also obtain data returned by the server 200.
  • the first audio and video processing module 103 can support multiple audio and video encoding formats; can call the data storage module 104; can sort video conference invitations according to priority and select video or audio responses for them; supports multi-stream media negotiation mode; and can transmit the first or third three-dimensional scene model to the first three-dimensional rendering module 102.
  • the agent integration interface 105 can be used to receive an access request from the message processing module 203 of the server 200 and parse the access request, and can also obtain relevant information from the first audio and video processing module 103, package it into a data packet, and send the data packet to the server 200; no specific restriction is made here.
  • the server 200 includes a file storage 201, a message processing module 203, a message cache module 204, a data flow module 205, an image parsing module 209, a model adaptation module 210, a three-dimensional model constructor 211, a model storage module 208 and a second three-dimensional rendering module 206.
  • the file storage 201 can be used to store multiple video images from the agent integration interface 105; the message processing module 203 can be used to authenticate and confirm video conference invitations; the message caching module 204 can be used to receive data such as user information and agent addresses from the message processing module 203; the data flow module 205 can be used to transmit the video stream of the augmented reality video call from the agent integration interface 105 of the design end 100, and to send video invitations to join the meeting and to exit the augmented reality video call; the image parsing module 209 can be used to parse the physical image set to obtain graphic information and to parse the environment image set to obtain label information; the model adaptation module 210 can be used to construct the physical image base map using the physical image element information and to perform modeling processing on the base map to obtain the physical three-dimensional model; the model storage module 208 can be used to obtain environment image element information from the environment model library according to the label information, and to obtain physical image element information from the physical model library; the three-dimensional model constructor 211 can be used to perform modeling processing on the environment image set using the environment image element information to obtain the environment three-dimensional model; and the second three-dimensional rendering module 206 can be used to perform three-dimensional rendering processing on the first or third three-dimensional scene model from the three-dimensional model constructor 211 to obtain the rendered first or third three-dimensional scene model.
  • the file storage 201 can also receive a data package from the agent integration interface 105 through the upload and download module 202, and store it in a designated directory, where the data package includes a three-dimensional model information file.
  • the server 200 also includes an upload and download module 202.
  • the upload and download module can be used to detect the number of directory levels of files stored in the file storage 201, detect whether the directory on the server 200 exists, and detect whether the file name or format is correct and whether the token is legal; after confirming correctness, it generates a URL (Uniform Resource Locator) address and sends the URL address to the design end 100 and the client 300.
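  • These upload checks can be sketched as a single validation function; the depth limit, the allowed extensions, the token check, and the host name below are all illustrative assumptions of this rewrite, not values disclosed in the embodiments:

```python
import os

def validate_and_make_url(path, token, max_depth=5, allowed_ext=(".obj", ".glb")):
    parts = [p for p in path.split("/") if p]
    if len(parts) - 1 > max_depth:            # directory level check
        return None
    name, ext = os.path.splitext(parts[-1])
    if not name or ext not in allowed_ext:    # file name/format check
        return None
    if not token:                             # token legality check (stub)
        return None
    # All checks passed: generate the URL address to send back.
    return "https://files.example.com/" + "/".join(parts)
```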
  • the message processing module 203 can forward account login requests, can distribute query operations initiated by the client 300 to other modules and send messages to other modules, and, when there are too many request messages to be processed, can store data such as user information and agent addresses in the message cache module 204.
  • the image parsing module 209 can also apply image processing algorithms and computer vision algorithms, and can also be used for image recognition and image segmentation.
  • For example, for an object from the design end 100 marked with a red frame, the image parsing module 209 parses out the object's graphic information, such as the object's sub-model and the marked coordinate information, and then notifies the client 300 through the message processing module 203 and the message caching module 204 to capture the object in the video stream.
  • the image analysis module 209 can obtain multiple video images from the file storage 201 and perform segmentation processing on the multiple video images to obtain a set of physical images and a set of environmental images.
  • the model storage module 208 includes a preset three-dimensional model library, and the preset three-dimensional model library includes a physical model library and an environment model library.
  • the first audio and video processing module 103 of the design end 100 can obtain the video stream of the augmented reality video call from the agent integration interface 105 or from the data storage module 104; the first audio and video processing module 103 performs frame extraction processing on the video stream of the augmented reality video call to obtain multiple video images and then sends the multiple video images to the server 200 through the agent integration interface 105.
  • the file storage 201 can obtain multiple video images from the agent integration interface 105 of the design end 100 through the upload and download module 202, and can also obtain multiple video images from the client integration interface 302 of the client 300 through the upload and download module 202; no specific restriction is made here.
  • the file storage 201 stores multiple video images from the agent integration interface 105 of the design end 100, or stores multiple video images from the client integration interface 302 of the client 300.
  • the image parsing module 209 can acquire multiple video images from the file storage 201, segment them to obtain a physical image set and an environment image set, parse the physical image set to obtain graphic information, parse the environment image set to obtain label information, and send the graphic information and label information to the model adaptation module 210. The model adaptation module 210 uses the graphic information to obtain physical image element information from the physical model library in the model storage module 208, uses the physical image element information to construct a physical image base map, and performs modeling processing on the base map to obtain the physical three-dimensional model; the model adaptation module 210 also uses the label information to obtain environment image element information from the environment model library in the model storage module 208 and uses the environment image element information to model the environment image set to obtain the environment three-dimensional model.
  • the model adaptation module 210 sends the physical three-dimensional model and the environmental three-dimensional model to the three-dimensional model constructor 211.
  • the three-dimensional model constructor 211 generates the first three-dimensional scene model from the physical three-dimensional model and the environment three-dimensional model and sends it to the second three-dimensional rendering module 206; the second three-dimensional rendering module 206 performs three-dimensional rendering processing on the first three-dimensional scene model to obtain the rendered first three-dimensional scene model, sends the rendered first three-dimensional scene model to the video stream module, and through the video stream module to the agent integration interface 105 of the design end 100. Alternatively, the three-dimensional model constructor 211 sends the first three-dimensional scene model through the upload and download module 202 to the file storage 201, and the file storage 201 stores it; the first three-dimensional scene model can also be sent to the agent integration interface 105, or to the client integration interface of the client 300.
  • The agent integration interface 105 of the design end 100 can obtain the rendered first three-dimensional scene model from the video stream module and send it to the data storage module 104. The first audio and video processing module 103 obtains the rendered first three-dimensional scene model from the data storage module 104 and sends it directly to the first display module 101, and the first display module 101 displays the rendered first three-dimensional scene model; no specific restriction is made here.
  • Alternatively, the agent integration interface 105 of the design end 100 can obtain the rendered first three-dimensional scene model and send it to the data storage module 104. The first audio and video processing module 103 obtains the rendered first three-dimensional scene model from the data storage module 104 and sends it to the first three-dimensional rendering module 102, which performs three-dimensional rendering on the rendered first three-dimensional scene model again. The first display module 101 receives the re-rendered first three-dimensional scene model from the first three-dimensional rendering module 102 and displays it; no specific restriction is made here.
  • The agent integration interface 105 of the design end 100 can obtain the first three-dimensional scene model from the file storage 201 of the server 200 through the upload and download module 202 of the server 200 and send it to the first audio and video processing module 103. The first audio and video processing module 103 sends the first three-dimensional scene model to the first three-dimensional rendering module 102, which performs three-dimensional rendering on the first three-dimensional scene model. The first display module 101 receives the rendered first three-dimensional scene model from the first three-dimensional rendering module 102 and displays it; no specific restriction is made here.
  • The agent integration interface 105 of the design end 100 can also obtain the first three-dimensional scene model from the file storage 201 of the server 200 through the upload and download module 202 of the server 200 and send it to the data storage module 104. The first audio and video processing module 103 obtains the first three-dimensional scene model from the data storage module 104 and sends it to the first three-dimensional rendering module 102, which performs three-dimensional rendering on the first three-dimensional scene model. The first display module 101 receives the rendered first three-dimensional scene model from the first three-dimensional rendering module 102 and displays it; no specific restriction is made here.
  • The client 300 can click a cloud rendering button in the local terminal's 3D program through web software, or directly access the resources through high-speed Internet access: rendering instructions are issued from the user terminal, the server executes the corresponding rendering tasks according to the instructions, and the rendered result picture is sent back to the user terminal for display. Providing remote rendering capability to terminal devices in this way makes up for the terminal's limited rendering capability.
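The cloud-rendering round trip described above (an instruction goes up, a rendered frame comes back) can be sketched as a request/response loop. This is an illustrative sketch only; the message fields and the stand-in renderer below are assumptions for illustration, not details from the embodiment:

```python
from dataclasses import dataclass

@dataclass
class RenderRequest:
    scene_id: str        # which first three-dimensional scene model to render
    camera_angle: float  # hypothetical viewing parameter sent by the terminal

def server_render(req: RenderRequest) -> bytes:
    """Stand-in for the server-side rendering task; returns an encoded frame."""
    frame = f"frame:{req.scene_id}@{req.camera_angle:.1f}"
    return frame.encode("utf-8")

def terminal_display(frame: bytes) -> str:
    """Stand-in for the terminal decoding and displaying the returned frame."""
    return frame.decode("utf-8")

# The terminal issues an instruction; the server renders; the result comes back.
request = RenderRequest(scene_id="scene-1", camera_angle=45.0)
assert terminal_display(server_render(request)) == "frame:scene-1@45.0"
```

The design choice this models is the one the passage names: the heavy rendering work stays on the server, and the terminal only encodes instructions and decodes result frames.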
  • The server 200 also includes an information database 207, which is the storage database for the server 200's important information data. The message cache module 204 can store important data such as user information and agent addresses in the information database 207, and the information database 207 can be used to resume interrupted services.
  • The client 300 includes a client integration interface 302, a camera acquisition module 303, a second display module 306, a third three-dimensional rendering module 305, and a second audio and video processing module 304.
  • The client integration interface 302 can be used to obtain the multiple video images from the second audio and video processing module 304 and send them to the server 200. The second display module 306 can be used to display the rendered first three-dimensional scene model or the third three-dimensional scene model, where the display module includes a computer display screen or an AR screen. The third three-dimensional rendering module 305 is used to perform three-dimensional rendering on the first three-dimensional scene model or the third three-dimensional scene model from the server 200 to obtain the rendered first three-dimensional scene model or third three-dimensional scene model. The second audio and video processing module 304 can be used to receive the video stream of the augmented reality video call sent to the client integration interface 302 and perform frame extraction on it to obtain multiple video images. The camera acquisition module 303 is used to capture the video stream of the augmented reality video call.
  • The client 300 also includes a local storage module 301, which is used to upload files to or download files from the file storage 201 of the server 200: according to the URL address sent by the upload and download module 202, files from the server 200 are saved to a local address, or local files are uploaded to the server 200. The local storage module 301 can also convert the format of files, for example converting pictures into binary files.
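As a hedged illustration of the format conversion mentioned above (turning a picture into a binary file for upload), a minimal Python sketch might read the image bytes and, optionally, base64-encode them for text-safe transport. The file name and the encoding choice here are assumptions, not part of the embodiment:

```python
import base64
from pathlib import Path

def image_to_binary(path: str) -> bytes:
    """Read an image file as raw bytes (the 'binary file' form)."""
    return Path(path).read_bytes()

def binary_to_text(blob: bytes) -> str:
    """Optionally base64-encode the bytes for text-safe transport."""
    return base64.b64encode(blob).decode("ascii")

# Hypothetical usage: round-trip a small payload standing in for an image.
payload = bytes([0x89, 0x50, 0x4E, 0x47])  # PNG magic bytes as sample data
encoded = binary_to_text(payload)
assert base64.b64decode(encoded) == payload
```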
  • The client integration interface 302 can be used to receive an access request from the message processing module 203 of the server 200 and parse the access request; it can also obtain relevant information from the first audio and video processing module 103 for packaging, obtain a data packet, and send the data packet to the server 200; no specific restriction is made here.
  • The local storage module 301 of the client 300 can download the first three-dimensional scene model from the file storage 201 of the server 200 through the upload and download module 202 of the server 200.
  • The camera acquisition module 303 captures the video stream of the augmented reality video call and sends the video stream to the second audio and video processing module 304. The second audio and video processing module 304 performs frame extraction on the video stream of the augmented reality video call to obtain multiple video images, and sends the multiple video images to the client integration interface 302. The client integration interface 302 sends the multiple video images to the server 200. Alternatively, the multiple video images are stored in the local storage module 301, and the local storage module 301 sends them to the file storage 201 of the server 200 through the upload and download module 202 of the server 200.
  • The client integration interface 302 of the client 300 can obtain the rendered first three-dimensional scene model from the video stream module and send it to the second audio and video processing module 304. The second audio and video processing module 304 sends the rendered first three-dimensional scene model directly to the second display module 306, and the second display module 306 displays it; no specific restriction is made here.
  • Alternatively, the client integration interface 302 of the client 300 can obtain the rendered first three-dimensional scene model from the video stream module and send it to the second audio and video processing module 304. The second audio and video processing module 304 sends the rendered first three-dimensional scene model to the third three-dimensional rendering module 305, which performs three-dimensional rendering on it again. The second display module 306 receives the re-rendered first three-dimensional scene model from the third three-dimensional rendering module 305 and displays it; no specific restriction is made here.
  • The client integration interface 302 of the client 300 can also obtain the first three-dimensional scene model from the local storage module 301 and send it to the second audio and video processing module 304. The second audio and video processing module 304 sends the first three-dimensional scene model to the third three-dimensional rendering module 305, which performs three-dimensional rendering on the first three-dimensional scene model. The second display module 306 receives the rendered first three-dimensional scene model from the third three-dimensional rendering module 305 and displays it; no specific restriction is made here.
  • It should be noted that before user-related data is collected and processed in the embodiments of this application, the user's permission or consent is obtained, and the collection, use, and processing of this data comply with the relevant laws, regulations, and standards of the relevant countries and regions. When an embodiment of this application needs to obtain a user's sensitive personal information, it obtains the user's separate permission or separate consent through a pop-up window or a jump to a confirmation page; only after clearly obtaining the user's separate permission or separate consent does it acquire the user-related data necessary for the normal operation of the embodiment.
  • The three-dimensional model transmission device 400 includes a memory 402, a processor 401, and a computer program stored in the memory 402 and executable on the processor 401. The processor 401 and the memory 402 may be connected through a bus or by other means.
  • The memory 402 can be used to store non-transitory software programs and non-transitory computer-executable programs.
  • The memory 402 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device.
  • The memory 402 may include memory located remotely relative to the processor 401, and these remote memories may be connected to the processor 401 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The non-transitory software programs and instructions required to implement the three-dimensional model transmission method in the above embodiments are stored in the memory 402. When they are executed by the processor 401, the three-dimensional model transmission method in the above embodiments is performed, for example, the method steps S110 to S160 in Figure 1, the method steps S210 to S230 in Figure 2, the method steps S310 to S350 in Figure 3, the method steps S410 to S440 in Figure 4, the method steps S510 to S530 in Figure 5, the method steps S610 to S620 in Figure 6, the method steps S710 to S720 in Figure 7, and the method steps S810 to S820 in Figure 8.
  • The device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the embodiments.
  • An embodiment of the present application also provides a computer-readable storage medium that stores computer-executable instructions. When the computer-executable instructions are executed by a processor or controller, for example by a processor in the above device embodiment, they cause the processor to execute the three-dimensional model transmission method in the above embodiments, for example, the method steps S110 to S160 in Figure 1, the method steps S210 to S230 in Figure 2, the method steps S310 to S350 in Figure 3, the method steps S410 to S440 in Figure 4, the method steps S510 to S530 in Figure 5, the method steps S610 to S620 in Figure 6, the method steps S710 to S720 in Figure 7, and the method steps S810 to S820 in Figure 8.
  • An embodiment of the present application also provides a computer program product, including a computer program or computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer program or computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the three-dimensional model transmission method in the above embodiments, for example, the method steps S110 to S160 in Figure 1, the method steps S210 to S230 in Figure 2, the method steps S310 to S350 in Figure 3, and the method steps S410 to S440 in Figure 4.
  • To sum up, the embodiments of this application include the following: first, the server obtains multiple video images from the design end, where the video images are obtained by the design end by performing frame extraction on the video stream of an augmented reality video call established between the design end and the client; the server then performs segmentation on the multiple video images to obtain a physical image collection and an environment image collection; it performs modeling on the physical image collection to obtain a physical three-dimensional model and performs modeling on the environment image collection to obtain an environmental three-dimensional model; a first three-dimensional scene model is generated according to the physical three-dimensional model and the environmental three-dimensional model; and finally the first three-dimensional scene model is sent to the client.
  • That is to say, in the scenario of an augmented reality video call established between the design end and the client, the design end performs frame extraction on the video stream of the augmented reality video call to obtain multiple video images, and the server processes the multiple video images to finally obtain a first three-dimensional scene model, which it sends to the client. Therefore, the embodiments of the present application can realize the transmission of three-dimensional models in augmented reality scenarios.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Provided in the present application are a three-dimensional model transmission method and apparatus, and a storage medium and a program product. The method comprises: acquiring a plurality of video images, wherein the video images are obtained by a design end by means of performing frame extraction processing on a video stream of an augmented reality video call which is established with a client by the design end (S110); performing segmentation processing on the plurality of video images to obtain a real object image set and an environment image set (S120); respectively performing modeling processing on the real object image set and the environment image set, so as to obtain a real object three-dimensional model and an environment three-dimensional model (S130-S140); generating a first three-dimensional scene model according to the real object three-dimensional model and the environment three-dimensional model (S150); and sending the first three-dimensional scene model to the client (S160).

Description

Three-dimensional model transmission method and device, storage medium, and program product
Cross-reference to related applications
This application is filed based on the Chinese patent application with application number 202210640757.1, filed on June 8, 2022, and claims priority to that Chinese patent application, the entire content of which is hereby incorporated into this application by reference.
Technical field
Embodiments of the present application relate to, but are not limited to, the field of communication technology, and in particular to a three-dimensional model transmission method and device, a storage medium, and a program product.
Background
In the related art, in AR (Augmented Reality) scenarios, only files such as videos, text, and pictures can be transmitted between the client and the design end, but three-dimensional (3D) models cannot. Therefore, how to realize the transmission of three-dimensional models in augmented reality scenarios is a problem that urgently needs to be solved.
Summary
Embodiments of the present application provide a three-dimensional model transmission method and device, a storage medium, and a program product.
In a first aspect, an embodiment of the present application provides a three-dimensional model transmission method, including: acquiring multiple video images from a design end, where the video images are obtained by the design end by performing frame extraction on the video stream of an augmented reality video call, and the augmented reality video call is established between the design end and a client; performing segmentation on the multiple video images to obtain a physical image collection and an environment image collection; performing modeling on the physical image collection to obtain a physical three-dimensional model; performing modeling on the environment image collection to obtain an environmental three-dimensional model; generating a first three-dimensional scene model according to the physical three-dimensional model and the environmental three-dimensional model; and sending the first three-dimensional scene model to the client.
In a second aspect, an embodiment of the present application provides a three-dimensional model transmission method, including: establishing an augmented reality video call with a design end; and receiving a first three-dimensional scene model sent by a server, where the first three-dimensional scene model is obtained by the server according to a physical three-dimensional model and an environmental three-dimensional model, the physical three-dimensional model is obtained by the server by modeling a physical image collection, the environmental three-dimensional model is obtained by the server by modeling an environment image collection, both the physical image collection and the environment image collection are obtained by the server by segmenting multiple video images, and the video images are obtained by the design end by performing frame extraction on the video stream of the augmented reality video call.
In a third aspect, an embodiment of the present application also provides a three-dimensional model transmission device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the three-dimensional model transmission method described above when executing the computer program.
In a fourth aspect, an embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are used to execute the three-dimensional model transmission method described above.
In a fifth aspect, an embodiment of the present application also provides a computer program product, including a computer program or computer instructions stored in a computer-readable storage medium, where a processor of a computer device reads the computer program or computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the three-dimensional model transmission method described above.
Description of the drawings
Figure 1 is a flow chart of a three-dimensional model transmission method provided by an embodiment of the present application;
Figure 2 is a flow chart of a method for step S150 in Figure 1;
Figure 3 is a flow chart of a three-dimensional model transmission method provided by another embodiment of the present application;
Figure 4 is a flow chart of a method for step S130 in Figure 1;
Figure 5 is a flow chart of a method for step S140 in Figure 1;
Figure 6 is a flow chart of a three-dimensional model transmission method provided by another embodiment of the present application;
Figure 7 is a flow chart of a three-dimensional model transmission method provided by another embodiment of the present application;
Figure 8 is a flow chart of a three-dimensional model transmission method provided by another embodiment of the present application;
Figure 9 is a flow chart of a three-dimensional model transmission method provided by an embodiment of the present application;
Figure 10 is a schematic structural diagram of a three-dimensional model transmission device provided by an embodiment of the present application;
Figure 11 is a schematic structural diagram of a three-dimensional model transmission device provided by another embodiment of the present application.
Detailed description
In order to make the purpose, technical solutions, and advantages of the present application clearer, the present application is further described in detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here are only used to explain the present application and are not intended to limit it.
It should be noted that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that in the flowcharts. In the description, the claims, and the above drawings, "multiple" (or "a plurality of") means two or more; terms such as "greater than", "less than", and "exceeding" are understood to exclude the stated number, while terms such as "above", "below", and "within" are understood to include it. Terms such as "first" and "second", if used, are only for the purpose of distinguishing technical features and cannot be understood as indicating or implying relative importance, implicitly indicating the number of the indicated technical features, or implicitly indicating a sequence of the indicated technical features.
The present application provides a three-dimensional model transmission method and device, a storage medium, and a program product. First, the server obtains multiple video images from the design end, where the video images are obtained by the design end by performing frame extraction on the video stream of an augmented reality video call established between the design end and the client. The server then performs segmentation on the multiple video images to obtain a physical image collection and an environment image collection, performs modeling on the physical image collection to obtain a physical three-dimensional model, performs modeling on the environment image collection to obtain an environmental three-dimensional model, generates a first three-dimensional scene model according to the physical three-dimensional model and the environmental three-dimensional model, and finally sends the first three-dimensional scene model to the client. That is to say, in the scenario of an augmented reality video call established between the design end and the client, the design end performs frame extraction on the video stream of the call to obtain multiple video images, and the server processes these images to finally obtain a first three-dimensional scene model and sends it to the client. Therefore, the embodiments of the present application can realize the transmission of three-dimensional models in augmented reality scenarios.
The embodiments of the present application are further described below with reference to the accompanying drawings.
Referring to Figure 1, Figure 1 is a flow chart of a three-dimensional model transmission method provided by an embodiment of the present application. The three-dimensional model transmission method may include, but is not limited to, steps S110, S120, S130, S140, S150, and S160.
Step S110: obtain multiple video images from the design end.
In one implementation, the video images are obtained by the design end by performing frame extraction on the video stream of an augmented reality video call, and the augmented reality video call is established between the design end and the client.
It can be understood that, since a video is in essence a sequence of consecutive picture frames, the design end can capture one clear frame from the video or capture multiple image sets; that is, the design end performs frame extraction on the video stream of the augmented reality video call to obtain multiple video images.
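The frame-extraction step described above can be sketched as periodic sampling of a decoded frame sequence. This is an illustrative sketch only; the sampling interval and the in-memory frame representation are assumptions, not details from the embodiment:

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")

def extract_frames(frames: Sequence[Frame], fps: int, every_n_seconds: float) -> List[Frame]:
    """Sample one frame every `every_n_seconds` from a decoded stream at `fps`."""
    step = max(1, int(round(fps * every_n_seconds)))
    return [frames[i] for i in range(0, len(frames), step)]

# Hypothetical usage: a 3-second, 30 fps stream sampled once per second.
stream = list(range(90))  # stand-in for decoded video frames
samples = extract_frames(stream, fps=30, every_n_seconds=1.0)
assert samples == [0, 30, 60]
```

In practice the decoded frames would come from the call's H.264 video stream; here a list of integers stands in for them.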
In one implementation, the design end sends a video conference invitation to the client; after the client confirms acceptance and server-side authentication is confirmed, the design end and the client can successfully establish the augmented reality video call. Alternatively, the client sends a video conference invitation to the design end; after the design end confirms acceptance, and after confirmation by the server-side queuing system and the server-side authentication system, the design end and the client can successfully establish the augmented reality video call. No specific restriction is made here.
In one implementation, audio and video switching can be supported during the augmented reality video call. For example, during the augmented reality video call, the client can switch to the playback interface of an application A and set the augmented reality video call as a background program so that it runs in the background, where application A can be an audio application or a video application. For another example, during an augmented reality video call with the design end, the client can switch the video call to a voice call. No specific restriction is made here.
In one implementation, the video stream of the augmented reality video call can use YUV encoding and the H.264 video protocol, where YUV is a color encoding format: "Y" represents luminance, that is, the grayscale value, while "U" and "V" represent chrominance, describing image color and saturation and specifying the color of a pixel. The H.264 video protocol is a digital video compression coding standard.
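For illustration, the luminance/chrominance split that YUV encoding refers to can be computed from RGB with the widely used BT.601 full-range conversion. This is a general formula given as a sketch, not a detail taken from the embodiment:

```python
def rgb_to_yuv(r: int, g: int, b: int) -> tuple:
    """BT.601 full-range RGB -> YUV: Y is luma, U/V are chroma offsets around 128."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128
    return round(y), round(u), round(v)

# White has maximum luma and neutral chroma; black has zero luma.
assert rgb_to_yuv(255, 255, 255) == (255, 128, 128)
assert rgb_to_yuv(0, 0, 0) == (0, 128, 128)
```

Separating luma from chroma this way is what allows video codecs such as H.264 to subsample the chroma channels with little visible loss.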
In one implementation, the video stream of the augmented reality video call can be the video stream on the client side, the video stream on the design-end side, or both the client-side and design-end video streams; it can be obtained according to the actual situation, and no specific restriction is made here. For example, after the design end and the client establish the augmented reality video call, the design end can perform frame extraction on the design-end video stream to obtain multiple video images; or the design end can perform frame extraction on the client-side video stream to obtain multiple video images; or the design end can perform frame extraction on both the design-end video stream and the client-side video stream to obtain multiple video images.
Step S120: Segment the multiple video images to obtain a physical-object image set and an environment image set.
It can be understood that the physical-object image set includes multiple physical-object images, and the environment image set includes multiple environment images.
In one embodiment, the multiple video images may be segmented in many ways. For example, edge-contour scanning may be used to segment the multiple video images into a physical-object image set and an environment image set. Alternatively, a deep learning algorithm may be used to recognize the physical-object images and the environment images within the multiple video images, segmentation regions corresponding to the recognized physical-object images and environment images may be determined in those video images according to composition rules, and the physical-object image set and the environment image set may then be obtained from those segmentation regions. The composition rules may include rules that set the position and the area occupied by a physical-object image or an environment image within a segmentation region. No specific limitation is imposed here.
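Whichever recognition method produces the segmentation (edge-contour scanning or a deep learning model), the final splitting step can be sketched as applying a per-pixel foreground mask. The mask source and the 2D-list image layout are assumptions made for illustration:

```python
def split_image(image, mask):
    """Split one video image into a physical-object part and an
    environment part using a per-pixel foreground mask (True = object).
    `image` and `mask` are equal-sized 2D lists; pixels outside a
    region are replaced with None."""
    objects, environment = [], []
    for row, mrow in zip(image, mask):
        objects.append([px if m else None for px, m in zip(row, mrow)])
        environment.append([px if not m else None for px, m in zip(row, mrow)])
    return objects, environment
```

Running this over every extracted frame yields the physical-object image set and the environment image set described in step S120.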
Step S130: Model the physical-object image set to obtain a physical-object three-dimensional model.
Step S140: Model the environment image set to obtain an environment three-dimensional model.
Step S150: Generate a first three-dimensional scene model from the physical-object three-dimensional model and the environment three-dimensional model.
Step S160: Send the first three-dimensional scene model to the client.
In one embodiment, the server may also send the first three-dimensional scene model to the design end; no specific limitation is imposed here.
In this embodiment, by adopting the three-dimensional model transmission method comprising steps S110 to S160, the server first obtains multiple video images from the design end, where the video images are obtained by the design end through frame extraction on the video stream of an augmented reality video call established between the design end and the client. The server then segments the multiple video images to obtain a physical-object image set and an environment image set, models the physical-object image set to obtain a physical-object three-dimensional model, models the environment image set to obtain an environment three-dimensional model, generates a first three-dimensional scene model from the two models, and finally sends the first three-dimensional scene model to the client. That is, in the scenario of an augmented reality video call between the design end and the client, the design end extracts frames from the call's video stream to obtain multiple video images, and the server processes these images to obtain the first three-dimensional scene model, which it sends to the client. Therefore, the embodiments of the present application can construct a three-dimensional model in an augmented reality scenario and realize transmission of the three-dimensional model.
In one embodiment, the design end, the server, and the client can each perform three-dimensional rendering on the first three-dimensional scene model; no specific limitation is imposed here. For example, the client renders the first three-dimensional scene model sent by the server and then displays the rendered model; or, when the server sends the first three-dimensional scene model to the design end, the design end renders the model sent by the server and then displays the rendered result; or, the server first renders the first three-dimensional scene model and then sends the rendered model to the design end or the client.
In one embodiment, the first three-dimensional scene model can be rendered multiple times, and whether multiple rendering passes are needed can be determined from the model's scale. As an example, when the first three-dimensional scene model simulates an amusement park, it can be rendered multiple times: the server first renders the model, sends the rendered model to the design end (or the client), and the design end (or the client) renders the received model again and displays the re-rendered result. Alternatively, when the first three-dimensional scene model simulates a bedroom, it may be rendered only once, and that single rendering pass may be performed by the design end, the server, or the client; no specific limitation is imposed here.
It can be understood that three-dimensional rendering is essentially the process by which a computer converts a three-dimensional model into a two-dimensional image. Rendering involves techniques such as scanline rendering, ray tracing, and photon mapping, which can simulate the interaction of light with various substances (such as materials and surface textures); rendering therefore requires the support of three-dimensional plug-ins, three-dimensional software, and hardware.
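The geometric core of that model-to-image conversion is the projection of 3D points onto a 2D image plane. The following pinhole-camera sketch shows only this projection step, under the assumption of camera-space coordinates with the camera looking along +z; it omits lighting, materials, and rasterization:

```python
def project_point(point, focal_length=1.0):
    """Pinhole perspective projection of a 3D camera-space point
    (x, y, z with z > 0) onto the 2D image plane: the farther a point
    is, the closer its image lands to the optical axis."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_length * x / z, focal_length * y / z)

print(project_point((2.0, 4.0, 2.0)))  # (1.0, 2.0)
```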
In one embodiment, as shown in Figure 2, step S150 is further described. Step S150 may include, but is not limited to, steps S210, S220, and S230.
Step S210: Obtain a preset three-dimensional model library, where the three-dimensional model library includes an environment model library.
In one embodiment, the environment model library may include multiple preset environment models. These preset environment models may be obtained by model makers using a three-dimensional modeling tool (such as 3D Studio MAX, a three-dimensional animation rendering and production application that runs on desktop operating systems) to modify existing model data and then performing modeling with the modified data, or they may be existing models; no specific limitation is imposed here.
In one embodiment, the preset three-dimensional model library may include target three-dimensional scene models, model-related data, basic model components, model material assets, and the like, where the model-related data includes the model's scale, location information of the real scene the model simulates, etc.; no specific limitation is imposed here. For example, in a route-planning scenario, when the user is at the foot of a mountain, the user sends location information to the server through an AR device; the server determines the target three-dimensional scene model corresponding to that mountain from the preset three-dimensional model library based on the location information and sends the model to the design end; the design end plans a route over the mountain based on the target three-dimensional scene model, marks the landmark attractions, shops, restrooms, etc. along the route, and pushes the route plan and the marked information to the AR client, thereby improving the user experience. Here, the AR device is the client.
Step S220: Determine a target environment three-dimensional model from the environment model library based on the physical-object three-dimensional model and the environment three-dimensional model.
In one embodiment, the physical-object three-dimensional model and the environment three-dimensional model can be matched against the preset environment models in the environment model library to determine the target environment three-dimensional model, which improves modeling efficiency and shortens modeling time.
Step S230: Generate the first three-dimensional scene model from the physical-object three-dimensional model and the target environment three-dimensional model.
In this embodiment, by adopting the three-dimensional model transmission method comprising steps S210 to S230, the server first obtains the preset three-dimensional model library, where the library includes an environment model library; it then determines the target environment three-dimensional model from the environment model library based on the physical-object three-dimensional model and the environment three-dimensional model, and finally generates the first three-dimensional scene model from the physical-object three-dimensional model and the target environment three-dimensional model. The embodiments of the present application can therefore improve modeling efficiency and shorten modeling time by determining the target environment three-dimensional model from the environment model library based on the physical-object and environment three-dimensional models.
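The library-matching step of S220 can be sketched as a best-overlap lookup. The tag-set representation of models and the overlap score are hypothetical stand-ins; the document does not specify the matching criterion:

```python
def match_environment_model(scene_tags, model_library):
    """Pick the preset environment model whose descriptive tags overlap
    most with the tags derived from the physical-object and environment
    three-dimensional models.

    `model_library` maps model name -> set of descriptive tags."""
    best_name, best_score = None, -1
    for name, tags in model_library.items():
        score = len(scene_tags & tags)  # size of the tag overlap
        if score > best_score:
            best_name, best_score = name, score
    return best_name

library = {"living_room": {"sofa", "tv", "table"},
           "bedroom": {"bed", "wardrobe"}}
print(match_environment_model({"sofa", "tv"}, library))  # living_room
```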
In one embodiment, as shown in Figure 3, after step S160 is executed, the three-dimensional model transmission method may further include, but is not limited to, steps S310, S320, S330, S340, and S350.
Step S310: Receive annotation information sent by the client.
In one embodiment, the annotation information may include coordinate information, text information, annotated models, and the like, and the annotation information is obtained by the client performing virtual annotation on the first three-dimensional scene model. For example, when the first three-dimensional scene model simulates a living room with a sofa placed at coordinates (8, 4, 0), and the sofa needs to be moved to coordinates (4, 4, 0), the coordinate information may specify moving the physical-object three-dimensional model at (8, 4, 0) to the position (4, 4, 0).
Step S320: Determine an annotation region in the first three-dimensional scene model according to the annotation information.
It can be understood that the annotation region may be the region before modification, the region after modification, or both. For example, if the annotation information specifies moving the physical-object three-dimensional model at coordinates (8, 4, 0) to the position (4, 4, 0), the annotation region may be the region at coordinates (8, 4, 0) or the region at coordinates (4, 4, 0); no specific limitation is imposed here.
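The sofa example above can be sketched as applying one coordinate annotation to a scene. The coordinate-keyed dict is a hypothetical stand-in for the scene model's internal structure, used here only to make the before/after regions concrete:

```python
def apply_coordinate_annotation(scene, annotation):
    """Apply one coordinate annotation: move the object model recorded
    at `annotation['from']` to `annotation['to']`.

    `scene` maps an (x, y, z) coordinate to a model name. Both the
    source and target coordinates are annotation regions in the sense
    of step S320."""
    updated = dict(scene)  # leave the incoming scene untouched
    updated[annotation["to"]] = updated.pop(annotation["from"])
    return updated

living_room = {(8, 4, 0): "sofa"}
note = {"from": (8, 4, 0), "to": (4, 4, 0)}
print(apply_coordinate_annotation(living_room, note))  # {(4, 4, 0): 'sofa'}
```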
Step S330: Nest the annotation information onto the annotation region to obtain a second three-dimensional scene model.
Step S340: Superimpose the first three-dimensional scene model and the second three-dimensional scene model to obtain a third three-dimensional scene model.
Step S350: Send the third three-dimensional scene model to the client.
In one embodiment, the server may also send the third three-dimensional scene model to the design end; no specific limitation is imposed here.
In this embodiment, by adopting the three-dimensional model transmission method comprising steps S310 to S350, the server first receives the annotation information sent by the client, determines the annotation region in the first three-dimensional scene model according to the annotation information, nests the annotation information onto the annotation region to obtain the second three-dimensional scene model, superimposes the first three-dimensional scene model and the second three-dimensional scene model to obtain the third three-dimensional scene model, and finally sends the third three-dimensional scene model to the client. The embodiments of the present application can therefore receive the annotation feedback from the client and modify the scene model according to that feedback, reducing the communication costs caused by professional barriers, enabling rapid iteration and delivery, helping to meet customer needs, and improving the user experience.
In one embodiment, the design end, the server, and the client can each perform three-dimensional rendering on the third three-dimensional scene model; no specific limitation is imposed here. For example, the client renders the third three-dimensional scene model sent by the server and then displays the rendered model; or, when the server sends the third three-dimensional scene model to the design end, the design end renders the model sent by the server and then displays the rendered result; or, the server first renders the third three-dimensional scene model and then sends the rendered model to the design end or the client.
In one embodiment, the third three-dimensional scene model can be rendered multiple times, and whether multiple rendering passes are needed can be determined from the model's scale. As an example, when the third three-dimensional scene model simulates an amusement park, it can be rendered multiple times: the server first renders the model, sends the rendered model to the design end (or the client), and the design end (or the client) renders the received model again and displays the re-rendered result. Alternatively, when the third three-dimensional scene model simulates a bedroom, it may be rendered only once, and that single rendering pass may be performed by the design end, the server, or the client; no specific limitation is imposed here.
In one embodiment, as shown in Figure 4, in the case where the three-dimensional model library includes a physical-object model library, step S130 is further described. Step S130 may include, but is not limited to, steps S410, S420, S430, and S440.
Step S410: Parse the physical-object image set to obtain graphic information.
In one embodiment, the graphic information may include the line contours of the physical-object images, the positions and sizes of the physical-object images, the resource addresses of the physical-object images, and the like; no specific limitation is imposed here.
Step S420: Obtain physical-object image element information from the physical-object model library according to the graphic information.
In one embodiment, the physical-object model library may include multiple preset physical-object models. These preset physical-object models may be obtained by model makers using a three-dimensional modeling tool (such as 3D Studio MAX, a three-dimensional animation rendering and production application that runs on desktop operating systems) to modify existing model data and then performing modeling with the modified data, or they may be existing models; no specific limitation is imposed here.
Step S430: Construct a physical-object image base map using the physical-object image element information.
In one embodiment, the physical-object image base map may be the basic framework of the physical-object three-dimensional model; no specific limitation is imposed here.
Step S440: Model the physical-object image base map to obtain the physical-object three-dimensional model.
In one embodiment, when the physical-object image base map is the outline of a vase, modeling the vase means applying material surfacing, lighting adjustment, texturing, and recess and protrusion processing, etc., to the vase, finally obtaining the physical-object three-dimensional model of the vase.
In this embodiment, by adopting the three-dimensional model transmission method comprising steps S410 to S440, the server first parses the physical-object image set to obtain graphic information, obtains physical-object image element information from the physical-object model library according to the graphic information, constructs the physical-object image base map using the element information, and finally models the base map to obtain the physical-object three-dimensional model. The embodiments of the present application can therefore obtain element information from the model library via the graphic information and use it to construct the base map, shortening modeling time and improving modeling efficiency.
In one embodiment, as shown in Figure 5, step S140 is further described. Step S140 may include, but is not limited to, steps S510, S520, and S530.
Step S510: Parse the environment image set to obtain label information.
In one embodiment, the label information may include material attribute information, color attribute information, light source attribute information, etc., of the environment's appearance; no specific limitation is imposed here.
Step S520: Obtain environment image element information from the environment model library according to the label information.
In one embodiment, the environment image element information may include index information for the material attributes, color attributes, light source attributes, etc., of the environment's appearance; no specific limitation is imposed here.
Step S530: Model the environment image set using the environment image element information to obtain the environment three-dimensional model.
In one embodiment, modeling the environment image set with the environment image element information may include material surfacing, lighting adjustment, texturing, recess and protrusion processing, collapse processing, and the like, giving the model a three-dimensional appearance; no specific limitation is imposed here.
In this embodiment, by adopting the three-dimensional model transmission method comprising steps S510 to S530, the server first parses the environment image set to obtain label information, then obtains environment image element information from the environment model library according to the label information, and finally models the environment image set using the element information to obtain the environment three-dimensional model. The embodiments of the present application therefore obtain environment image element information from the environment model library via the label information and use it in modeling, shortening modeling time and improving modeling efficiency.
In one embodiment, as shown in Figure 6, the three-dimensional model transmission method may further include, but is not limited to, steps S610 and S620.
Step S610: Determine the target environment three-dimensional model from the environment model library based on the physical-object three-dimensional model.
In one embodiment, the environment model library may include multiple preset environment models. These preset environment models may be obtained by model makers using a three-dimensional modeling tool (such as 3D Studio MAX, a three-dimensional animation rendering and production application that runs on desktop operating systems) to modify existing model data and then performing modeling with the modified data, or they may be existing models; no specific limitation is imposed here.
In one embodiment, the physical-object three-dimensional model can be adapted to the preset environment models in the environment model library to determine the target environment three-dimensional model, without needing to model the environment image set, which shortens modeling time; the embodiments of the present application impose no specific limitation here.
Step S620: Nest the physical-object three-dimensional model into the target environment three-dimensional model to generate the first three-dimensional scene model.
In this embodiment, by adopting the three-dimensional model transmission method comprising steps S610 and S620, the server first determines the target environment three-dimensional model from the environment model library based on the physical-object three-dimensional model, and then nests the physical-object three-dimensional model into the target environment three-dimensional model to generate the first three-dimensional scene model. That is, only the physical-object image set needs to be modeled to obtain the physical-object three-dimensional model; the target environment three-dimensional model is determined from the library based solely on that model, and the two are nested to form the first three-dimensional scene model without modeling the environment image set. This shortens modeling time and, at the same time, allows physical objects in the scene where the client or the design end is located to be adapted, in the form of three-dimensional models, to the preset environment three-dimensional models in the library, achieving remote design, meeting users' diverse needs, and improving the user experience.
In one embodiment, when designing the home layout of a living room, the client can send a physical-object image set containing a sofa, a refrigerator, seats, and the like to the server; the server models the set to obtain multiple physical-object three-dimensional models, determines the target environment three-dimensional model from the environment model library based on those models, and nests the physical-object three-dimensional models into the target environment three-dimensional model to generate the first three-dimensional scene model. The embodiments of the present application therefore need not model an environment image set, shortening modeling time.
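The nesting step of S620 can be sketched as placing each physical-object model into the environment model at its position. The dict structure is a hypothetical stand-in for the scene model's actual representation:

```python
def nest_models(environment_model, object_models):
    """Nest physical-object 3D models into a target environment 3D
    model to form a scene model.

    `object_models` is a sequence of (model_name, position) pairs;
    each object is recorded inside the environment at its position."""
    return {
        "environment": environment_model,
        "objects": [
            {"model": name, "position": position}
            for name, position in object_models
        ],
    }

# The living-room example: sofa and refrigerator nested into a
# preset living-room environment model (names are illustrative).
scene = nest_models("living_room_v1",
                    [("sofa", (8, 4, 0)), ("refrigerator", (0, 2, 0))])
print(len(scene["objects"]))  # 2
```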
In one embodiment, step S150 is further described. Step S150 may include, but is not limited to, the following step:
Nest the physical-object three-dimensional model into the environment three-dimensional model to generate the first three-dimensional scene model.
In this embodiment, the server may obtain multiple video images from the design end, where the video images are obtained by the design end through frame extraction on the video stream of an augmented reality video call established between the design end and the client. The server then segments the multiple video images to obtain a physical-object image set and an environment image set, models the physical-object image set to obtain the physical-object three-dimensional model, models the environment image set to obtain the environment three-dimensional model, and nests the physical-object three-dimensional model into the environment three-dimensional model to generate the first three-dimensional scene model. The embodiments of the present application can therefore model both the environment and the physical objects in the scene where the user is located and nest the resulting models to obtain the first three-dimensional scene model, restoring the real scene and achieving remote customization.
In one embodiment, the server may obtain multiple video images from the design end, segment the multiple video images to obtain a physical-image set and an environment-image set, model the physical-image set to obtain a physical three-dimensional model, and parse the environment-image set to obtain label information. The label information is matched against the preset environment models in the environment model library; when the label information does not match any preset environment model, the environment-image set is modeled to obtain an environment three-dimensional model, and the physical three-dimensional model is nested into the environment three-dimensional model to obtain the first three-dimensional scene model. In addition, the environment three-dimensional model may be stored in the environment model library; this embodiment of the present application does not specifically limit this.
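To make the matching-with-fallback step above concrete, the following is a minimal Python sketch (not part of the patent text; the in-memory library keyed by label and the dictionary model representation are illustrative assumptions):

```python
# Hypothetical environment model library keyed by label information.
PRESET_ENVIRONMENT_MODELS = {
    "living_room": {"name": "living_room", "source": "library"},
    "bedroom": {"name": "bedroom", "source": "library"},
}

def build_environment_model(label, environment_images):
    """Return a preset model when the label matches; otherwise model the
    environment-image set and optionally cache the result in the library."""
    model = PRESET_ENVIRONMENT_MODELS.get(label)
    if model is not None:
        return model
    # Fallback: model the environment-image set (placeholder for real modeling).
    model = {"name": label, "source": "modeled", "images": len(environment_images)}
    PRESET_ENVIRONMENT_MODELS[label] = model  # optional caching, per the text
    return model
```

The cache write mirrors the optional storing of a newly built environment three-dimensional model back into the environment model library.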
In one embodiment, the server first obtains the environment-image set sent by the client and models it to obtain an environment three-dimensional model. The server then obtains multiple video images from the design end, where the video images are obtained by the design end through frame extraction from the video stream of an augmented reality video call established between the design end and the client. The server segments the multiple video images to obtain a physical-image set, models the physical-image set to obtain a physical three-dimensional model, generates the first three-dimensional scene model according to the physical three-dimensional model and the environment three-dimensional model, and finally sends the first three-dimensional scene model to the client. This embodiment of the present application thus models the environment-image set sent by the client, meeting customer needs and achieving the effect of remote customization.
In addition, FIG. 7 shows a three-dimensional model transmission method provided by another embodiment of the present application. The method may include, but is not limited to, step S710 and step S720.
Step S710: establishing an augmented reality video call with the design end.
In one implementation, the design end sends a video conference invitation to the client; after the client confirms acceptance, the design end can establish an augmented reality video call with the client only after confirmation by the server-side queuing system and the server-side authentication system. Alternatively, the client sends a video conference invitation to the design end; after the design end confirms acceptance, the client can establish an augmented reality video call with the design end only after confirmation by the server-side queuing system and the server-side authentication system. No specific limitation is imposed here.
Step S720: receiving the first three-dimensional scene model sent by the server.
In one implementation, the first three-dimensional scene model is obtained by the server according to a physical three-dimensional model and an environment three-dimensional model; the physical three-dimensional model is obtained by the server by modeling a physical-image set, and the environment three-dimensional model is obtained by the server by modeling an environment-image set. Both the physical-image set and the environment-image set are obtained by the server by segmenting multiple video images, and the video images are obtained by the design end through frame extraction from the video stream of the augmented reality video call.
In one implementation, the multiple video images may be segmented in many ways. For example, edge-contour scanning may be used to segment the multiple video images to obtain a physical-image set and an environment-image set. Alternatively, a deep learning algorithm may be used to recognize the physical images and the environment images in the multiple video images, segmentation regions corresponding to the recognized physical images and to the environment images may be determined in the multiple video images according to composition rules, and the physical-image set and the environment-image set may then be obtained from those segmentation regions. The composition rules may include rules that set the position and the area occupied by a physical image or an environment image within a segmentation region. No specific limitation is imposed here.
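The splitting of one frame into a physical-object part and an environment part can be sketched as follows (a minimal illustration only; a real system would produce the mask with edge-contour scanning or a deep learning segmenter, which is stubbed out here as a given 0/1 mask):

```python
def split_frame(frame, mask):
    """frame: 2-D list of pixel values; mask: 2-D list of 0/1 flags of the
    same shape (1 = physical object, 0 = environment). Returns
    (physical_image, environment_image); pixels outside each region are None."""
    physical, environment = [], []
    for row, mask_row in zip(frame, mask):
        physical.append([p if m == 1 else None for p, m in zip(row, mask_row)])
        environment.append([p if m == 0 else None for p, m in zip(row, mask_row)])
    return physical, environment
```

Collecting the per-frame results over all extracted video images yields the physical-image set and the environment-image set described above.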
It can be understood that, because a video is in essence a sequence of consecutive picture frames, the design end may capture a single clear frame from the video or capture multiple sets of images; that is, the design end performs frame extraction on the video stream of the augmented reality video call to obtain multiple video images.
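A minimal sketch of this frame-extraction step, treating the decoded video stream as an iterable of frames (the sampling interval is an illustrative assumption; the patent does not specify one):

```python
def extract_frames(stream, interval=30):
    """Sample every `interval`-th frame from a decoded stream of frames,
    i.e. frames 0, interval, 2*interval, ..."""
    return [frame for index, frame in enumerate(stream) if index % interval == 0]
```

For a 30 fps stream, `interval=30` would keep roughly one video image per second of the call.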
In one implementation, audio and video switching may be supported during the augmented reality video call. For example, during the call the client may switch to the playback interface of application A and place the augmented reality video call in the background so that it runs as a background program, where application A may be an audio application or a video application. As another example, while in an augmented reality video call with the design end, the client may switch the video call to a voice call. No specific limitation is imposed here.
In one implementation, the video stream of the augmented reality video call uses YUV encoding and the H.264 video protocol. YUV is a color encoding format in which "Y" represents luminance, that is, the grayscale value, while "U" and "V" represent chrominance; both "U" and "V" describe image color and saturation and can be used to specify the color of a pixel. The H.264 video protocol is a digital video compression coding standard.
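The relationship between the luminance and chrominance components can be illustrated with the common BT.601 RGB-to-YUV conversion formulas (an assumption for illustration; the patent does not specify which YUV variant is used):

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 full-range coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b        # Y: luminance (grayscale value)
    u = -0.14713 * r - 0.28886 * g + 0.436 * b   # U: chrominance (blue projection)
    v = 0.615 * r - 0.51499 * g - 0.10001 * b    # V: chrominance (red projection)
    return y, u, v
```

Note that for a gray pixel (r = g = b) the U and V components are close to zero, which is why "Y" alone carries the grayscale value.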
In one implementation, the video stream of the augmented reality video call may be the client-side video stream, the design-end-side video stream, or both the client-side and the design-end-side video streams; it may be obtained according to the actual situation, and no specific limitation is imposed here. For example, after the design end and the client establish an augmented reality video call, the design end may extract frames from the design-end-side video stream to obtain multiple video images; or the design end may extract frames from the client-side video stream to obtain multiple video images; or the design end may extract frames from both the design-end-side video stream and the client-side video stream to obtain multiple video images.
In this embodiment, by adopting the three-dimensional model transmission method including the above steps S710 to S720, the client first establishes an augmented reality video call with the design end and then receives the first three-dimensional scene model sent by the server. The first three-dimensional scene model is obtained by the server according to a physical three-dimensional model and an environment three-dimensional model; the physical three-dimensional model is obtained by the server by modeling a physical-image set, the environment three-dimensional model is obtained by the server by modeling an environment-image set, both image sets are obtained by the server by segmenting multiple video images, and the video images are obtained by the design end through frame extraction from the video stream of the augmented reality video call. That is, in the scenario of an augmented reality video call established between the design end and the client, the design end extracts frames from the video stream of the call to obtain multiple video images, the server processes the multiple video images to finally obtain the first three-dimensional scene model, and the client receives the first three-dimensional scene model sent by the server. This embodiment of the present application can therefore realize three-dimensional model transmission in an augmented reality scenario.
In one embodiment, as shown in FIG. 8, the three-dimensional model transmission method may include, but is not limited to, step S810 and step S820.
Step S810: performing virtual annotation processing on the first three-dimensional scene model to obtain annotation information.
In one implementation, the annotation information may include coordinate information, text information, annotated models, and so on; no specific limitation is imposed here.
In one implementation, the server may send the first three-dimensional scene model to the design end, and the design end stores it. When the client performs virtual annotation on the first three-dimensional scene model, the client may send the annotation information to the design end in real time. For example, the client may switch the viewing angle of the first three-dimensional scene model: when the first three-dimensional scene model is a house including a bedroom, a living room, and a bathroom, the client may switch to the bedroom area and virtually annotate it, switch to the living room area and virtually annotate it, or switch to the bathroom area and virtually annotate it, and finally send the annotation information of all areas to the design end (or the server). The design end can receive the annotation information of all areas in real time and modify the first three-dimensional scene model stored at the design end according to that information, realizing remote modification of the three-dimensional model and improving the user experience. No specific limitation is imposed here.
Step S820: sending the annotation information to the server.
In one implementation, the client may also send the annotation information to the design end, and the design end modifies the first three-dimensional scene model stored at the design end according to the annotation information. Because the individual modules of the first three-dimensional scene model may be the responsibility of different designers, multiple designers may modify the first three-dimensional scene model concurrently, so concurrency conflicts may occur. To avoid concurrency conflicts, the designers at the design end may each modify the source data files of their own modules based on their own design copies; the design end then submits all modified source data files to the server, and the server collects the source data files and delivers them to the user in a unified manner. By modifying the first three-dimensional scene model directly according to the annotation information in this way, this embodiment of the present application reduces communication costs and achieves rapid delivery.
In one implementation, the annotation information is saved in a relevant file and packaged into a data packet that is sent to the server; after receiving the data packet, the server parses it to obtain the annotation information. No specific limitation is imposed here.
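Packaging and parsing the annotation data packet could look like the following minimal sketch (JSON over UTF-8 is an assumed wire format; the patent only says the information is packaged and parsed):

```python
import json

def pack_annotations(annotations):
    """Client side: wrap annotation items (coordinates, text, region ids, ...)
    into a typed data packet."""
    return json.dumps({"type": "annotation", "items": annotations}).encode("utf-8")

def parse_annotations(packet):
    """Server side: decode the data packet and recover the annotation items."""
    payload = json.loads(packet.decode("utf-8"))
    if payload.get("type") != "annotation":
        raise ValueError("not an annotation packet")
    return payload["items"]
```

The `type` field lets the server route annotation packets separately from, for example, model-file uploads.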
In this embodiment, by adopting the three-dimensional model transmission method including the above steps S810 to S820, the client may perform virtual annotation on the first three-dimensional scene model to obtain annotation information and send the annotation information to the server, so that the server determines an annotation area in the first three-dimensional scene model according to the annotation information, nests the annotation information on the annotation area to obtain a second three-dimensional scene model, and finally superimposes the first three-dimensional scene model and the second three-dimensional scene model to obtain a third three-dimensional scene model. That is, the client can feed the annotation information back to the server, and the server modifies the first three-dimensional scene model according to the annotation information, reducing communication costs while helping to meet user needs and improve the user experience.
In one embodiment, in a scenario of on-site tourist guiding, a user marks building B on an AR device and sends building B to the server through the AR device. The server parses the physical-image set to obtain graphic information, obtains physical-image element information from the physical model library according to the graphic information, constructs a physical-image base map from the physical-image element information, models the physical-image base map to obtain a physical three-dimensional model, and finally sends the physical three-dimensional model to the client. The user also sends building B to the design end through the AR device; the design end obtains historical material related to building B from the data storage module according to building B and sends that material to the AR device, where the design end may send the historical material in the form of voice, video, or text. In addition, the user may enlarge or shrink the physical three-dimensional model on the AR device by voice control or peripheral input, and switch the viewing angle of the physical three-dimensional model on the AR device to examine model details. The AR device here is the client, and no specific limitation is imposed.
The three-dimensional model transmission method provided by the above embodiments is described in detail below with reference to an embodiment:
In one embodiment, referring to FIG. 9, the design end and the client first establish an augmented reality video call. The design end extracts frames from the video stream of the call to obtain multiple video images, which are segmented to obtain a physical-image set and an environment-image set; the two sets are then parsed separately. For the physical-image set, parsing yields graphic information, physical-image element information is obtained from the physical model library according to the graphic information, a physical-image base map is constructed from the physical-image element information, and the base map is modeled to obtain a physical three-dimensional model. For the environment-image set, parsing yields label information, environment-image element information is obtained from the environment model library according to the label information, and the environment-image set is modeled using the environment-image element information to obtain an environment three-dimensional model. The first three-dimensional scene model is then generated from the physical three-dimensional model and the environment three-dimensional model and sent to the client, which performs three-dimensional rendering on it; alternatively, the server renders the first three-dimensional scene model, sends the rendered first three-dimensional scene model to the client, and the client renders it again to obtain the re-rendered first three-dimensional scene model. The client then performs virtual annotation on the rendered (or re-rendered) first three-dimensional scene model to obtain annotation information, according to which the model is modified concurrently: multiple designers at the design end each modify the source data files of their modules based on their own design copies, the design end submits all modified source data files to the server, and the server sends the source data files to the client in a unified manner, achieving remote delivery.
In addition, referring to FIG. 10, an embodiment of the present application further provides a three-dimensional model transmission apparatus, which includes a design end 100, a client 300, and a server 200.
The design end 100 includes a first display module 101, a first three-dimensional rendering module 102, a first audio/video processing module 103, a data storage module 104, and an agent integration interface 105. The first display module 101 may be used to display the rendered first or third three-dimensional scene model, or the re-rendered first or third three-dimensional scene model; the display module includes a computer display screen, an AR screen, or the like. The first three-dimensional rendering module 102 may be used to perform three-dimensional rendering on the first or third three-dimensional scene model to obtain the rendered first or third three-dimensional scene model, and may also render the rendered first or third three-dimensional scene model again to obtain the re-rendered first or third three-dimensional scene model. The first audio/video processing module 103 may be used to receive the video stream of the augmented reality video call sent to the agent integration interface 105 and to extract frames from that video stream to obtain multiple video images; it may also receive the first or third three-dimensional scene model, or the rendered first or third three-dimensional scene model, from the agent integration interface 105. The data storage module 104 may be used to store the video stream of the augmented reality video call from the agent integration interface 105, annotation information, model-related data, and the first three-dimensional scene model, the third three-dimensional scene model, the rendered first three-dimensional scene model, the rendered third three-dimensional scene model, and so on. The agent integration interface 105 may be used to obtain the multiple video images from the first audio/video processing module 103 and send them to the server 200, and may also obtain from the server 200 the first three-dimensional scene model, the third three-dimensional scene model, the rendered first three-dimensional scene model, the rendered third three-dimensional scene model, and so on.
In one implementation, the first audio/video processing module 103 can support audio and video encoding formats; can call the data storage module 104; can sort video conference invitations by priority and answer a video conference invitation with video or with audio; supports a multi-stream media negotiation mode; and can transmit the first or third three-dimensional scene model to the first three-dimensional rendering module 102.
In one implementation, the agent integration interface 105 may be used to receive an access request from the message processing module 203 of the server 200 and parse the access request; it may also obtain the relevant information from the first audio/video processing module 103, package it into a data packet, and send the data packet to the server 200. No specific limitation is imposed here.
The server 200 includes a file storage 201, a message processing module 203, a message cache module 204, a data stream module 205, an image parsing module 209, a model adaptation module 210, a three-dimensional model constructor 211, a model storage module 208, and a second three-dimensional rendering module 206. The file storage 201 may be used to store the multiple video images from the agent integration interface 105; the message processing module 203 may be used to authenticate and confirm video conference invitations; the message cache module 204 may be used to receive data such as user information and agent addresses from the message processing module 203; the data stream module 205 may be used to transmit the video stream of the augmented reality video call from the agent integration interface 105 of the design end 100, and to send video conference invitations and exit the augmented reality video call; the image parsing module 209 may be used to parse the physical-image set to obtain graphic information and to parse the environment-image set to obtain label information; the model adaptation module 210 may be used to construct a physical-image base map from physical-image element information and to model the base map to obtain a physical three-dimensional model; the model storage module 208 may be used to obtain environment-image element information from the environment model library according to the label information, and also to obtain physical-image element information from the physical model library according to the graphic information; the three-dimensional model constructor 211 may be used to model the environment-image set using the environment-image element information to obtain an environment three-dimensional model; and the second three-dimensional rendering module 206 may be used to perform three-dimensional rendering on the first or third three-dimensional scene model from the three-dimensional model constructor 211 to obtain the rendered first or third three-dimensional scene model.
In one implementation, the file storage 201 may also receive a data packet from the agent integration interface 105 through the upload/download module 202 and store it in a designated directory, where the data packet includes a three-dimensional model information file.
In one implementation, the server 200 further includes an upload/download module 202. The upload/download module may be used to check, according to the directory of a file stored in the file storage 201, the number of directory levels, whether the directory exists on the server 200, whether the file name and format are correct, and whether the token is legal. After confirming that everything is correct, a URL (Uniform Resource Locator) address is generated and sent to the design end 100 and the client 300.
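The checks performed by the upload/download module can be sketched as follows (a minimal illustration; the depth limit, allowed formats, token store, and URL pattern are all assumptions not stated in the patent):

```python
# Illustrative policy values for the upload/download checks.
ALLOWED_EXTENSIONS = {".obj", ".gltf", ".glb"}
MAX_DIRECTORY_DEPTH = 5
VALID_TOKENS = {"token-123"}

def register_upload(path, token, base="https://example.invalid/files"):
    """Validate directory depth, file format, and token; on success return
    the URL address to be sent to the design end and the client."""
    parts = [p for p in path.split("/") if p]
    if len(parts) > MAX_DIRECTORY_DEPTH:
        raise ValueError("directory nesting too deep")
    name = parts[-1]
    if not any(name.endswith(ext) for ext in ALLOWED_EXTENSIONS):
        raise ValueError("unsupported file name or format")
    if token not in VALID_TOKENS:
        raise PermissionError("invalid token")
    return f"{base}/{'/'.join(parts)}"
```

Each check maps to one condition in the text: directory levels, file name/format, and token legality, with the URL generated only after all checks pass.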
In one implementation, the message processing module 203 can forward account login requests; can distribute query operations initiated by the client 300 to other modules and send messages to other modules; and, when there are too many request messages to process, can store part of the data, such as user information and agent addresses, in the message cache module 204.
In one implementation, the image parsing module 209 may also apply image-processing algorithms and computer-vision algorithms, and may be used for image recognition and image segmentation. For example, when an object marked with a red frame is sent from the design end 100 and processed by the image parsing module 209, the graphic information of the object is parsed out, such as the object's sub-model and the annotated coordinate information, and the client 300 is then notified through the message processing module 203 and the message cache module 204 to capture the object in the video stream.
In one embodiment, the image parsing module 209 can obtain multiple video images from the file storage 201 and perform segmentation processing on the multiple video images to obtain a physical-object image set and an environment image set.
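The split into a physical-object image set and an environment image set can be illustrated with a deliberately simplified sketch, in which an annotated bounding box stands in for a real segmentation mask. This is an assumption made purely for illustration; the application does not prescribe a particular segmentation algorithm.

```python
def segment_images(video_images, boxes):
    """Split each image into a physical-object part and an environment part.

    video_images: list of 2D pixel grids (list of rows).
    boxes: one (top, left, bottom, right) annotation box per image, standing in
    for the segmentation mask a real computer-vision model would produce.
    """
    physical_set, environment_set = [], []
    for image, (top, left, bottom, right) in zip(video_images, boxes):
        # Pixels inside the box form the physical-object image.
        physical = [row[left:right] for row in image[top:bottom]]
        # Everything outside the box forms the environment image (rows may be ragged).
        environment = [
            [px for c, px in enumerate(row) if not (top <= r < bottom and left <= c < right)]
            for r, row in enumerate(image)
        ]
        physical_set.append(physical)
        environment_set.append(environment)
    return physical_set, environment_set
```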
In one embodiment, the model storage module 208 includes a preset three-dimensional model library, and the preset three-dimensional model library includes a physical model library and an environment model library.
In one embodiment, after the design end 100 and the client 300 establish an augmented reality video call, the first audio/video processing module 103 of the design end 100 can obtain the video stream of the augmented reality video call either from the agent integration interface 105 or from the data storage module 104. The first audio/video processing module 103 performs frame extraction on the video stream of the augmented reality video call to obtain multiple video images, and then sends the multiple video images to the server 200 through the agent integration interface 105.
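The frame-extraction step performed by the first audio/video processing module 103 can be sketched as periodic sampling of the call's frame sequence. The fixed sampling interval below is an assumed parameter, and the stream is modelled as a plain sequence of frames rather than a real media pipeline.

```python
def extract_frames(video_stream, interval: int = 30):
    """Return every `interval`-th frame of the stream, starting with the first.

    A real implementation would decode the call's media stream; here the stream
    is any iterable of frames, which is sufficient to show the sampling logic.
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return [frame for i, frame in enumerate(video_stream) if i % interval == 0]
```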
In one embodiment, the file storage 201 can obtain multiple video images from the agent integration interface 105 of the design end 100 through the upload/download module 202, or obtain multiple video images from the client integration interface 302 of the client 300 through the upload/download module 202; no specific limitation is imposed here.
In one embodiment, the file storage 201 stores multiple video images from the agent integration interface 105 of the design end 100, or multiple video images from the client integration interface 302 of the client 300. The image parsing module 209 can obtain the multiple video images from the file storage 201, segment them to obtain a physical-object image set and an environment image set, parse the physical-object image set to obtain graphic information, parse the environment image set to obtain label information, and send the graphic information and the label information to the model adaptation module 210. The model adaptation module 210 then uses the graphic information to obtain physical image element information from the physical model library in the model storage module 208, constructs a physical image base map from the physical image element information, and performs modeling on the physical image base map to obtain a physical three-dimensional model; the model adaptation module 210 also uses the label information to obtain environment image element information from the environment model library in the model storage module 208 and uses the environment image element information to model the environment image set, obtaining an environment three-dimensional model. The model adaptation module 210 sends the physical three-dimensional model and the environment three-dimensional model to the three-dimensional model constructor 211, which uses the physical three-dimensional model and the environment three-dimensional model to generate a first three-dimensional scene model and sends it to the second three-dimensional rendering module 206. The second three-dimensional rendering module 206 performs three-dimensional rendering on the first three-dimensional scene model to obtain a rendered first three-dimensional scene model, and then sends the rendered first three-dimensional scene model to the video stream module, which forwards it to the agent integration interface 105 of the design end 100. Alternatively, the three-dimensional model constructor 211 sends the first three-dimensional scene model to the file storage 201 through the upload/download module 202, and the file storage 201 stores the first three-dimensional scene model; the first three-dimensional scene model can also be sent to the agent integration interface 105, or sent to the agent interface of the client 300.
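The server-side flow described in this paragraph, from the parsed image sets through the model adaptation module 210 and the three-dimensional model constructor 211 to the second three-dimensional rendering module 206, can be summarized in a minimal sketch. All class names and library contents below are invented stand-ins, and only the ordering of the steps follows the text.

```python
from dataclasses import dataclass

@dataclass
class SceneModel:
    """Stand-in for the first three-dimensional scene model."""
    physical_model: str
    environment_model: str
    rendered: bool = False

# Stand-ins for the physical and environment model libraries in the model
# storage module 208; the keys and meshes are invented for illustration.
PHYSICAL_MODEL_LIBRARY = {"chair": "chair_mesh"}
ENVIRONMENT_MODEL_LIBRARY = {"living_room": "room_mesh"}

def build_first_scene_model(graphic_info: str, label_info: str) -> SceneModel:
    # Model adaptation module 210: look up element information in the libraries.
    physical_elements = PHYSICAL_MODEL_LIBRARY[graphic_info]
    environment_elements = ENVIRONMENT_MODEL_LIBRARY[label_info]
    # Three-dimensional model constructor 211: combine both into the scene model.
    return SceneModel(physical_elements, environment_elements)

def render(scene: SceneModel) -> SceneModel:
    # Second three-dimensional rendering module 206, reduced to a flag here.
    scene.rendered = True
    return scene
```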
In one embodiment, the agent integration interface 105 of the design end 100 can obtain the rendered first three-dimensional scene model from the video stream module and send it to the data storage module 104; the first audio/video processing module 103 can obtain the rendered first three-dimensional scene model from the data storage module 104 and send it directly to the first display module 101, and the first display module 101 displays the rendered first three-dimensional scene model; no specific limitation is imposed here.
In another embodiment, the agent integration interface 105 of the design end 100 can obtain the rendered first three-dimensional scene model from the data storage module 104; the first audio/video processing module 103 can obtain the rendered first three-dimensional scene model from the data storage module 104 and send it to the first three-dimensional rendering module 102, which performs three-dimensional rendering on the rendered first three-dimensional scene model again; finally, the first display module 101 receives the re-rendered first three-dimensional scene model from the first three-dimensional rendering module 102 and displays it; no specific limitation is imposed here.
In one embodiment, the agent integration interface 105 of the design end 100 can obtain the first three-dimensional scene model from the file storage 201 of the server 200 through the upload/download module 202 of the server 200 and send it to the first audio/video processing module 103. The first audio/video processing module 103 sends the first three-dimensional scene model to the first three-dimensional rendering module 102, which performs three-dimensional rendering on it; finally, the first display module 101 receives the rendered first three-dimensional scene model from the first three-dimensional rendering module 102 and displays it; no specific limitation is imposed here.
In another embodiment, the agent integration interface 105 of the design end 100 can obtain the first three-dimensional scene model from the file storage 201 of the server 200 through the upload/download module 202 of the server 200 and send it to the data storage module 104. The first audio/video processing module 103 obtains the first three-dimensional scene model from the data storage module 104 and sends it to the first three-dimensional rendering module 102, which performs three-dimensional rendering on it; finally, the first display module 101 receives the rendered first three-dimensional scene model from the first three-dimensional rendering module 102 and displays it; no specific limitation is imposed here.
In one embodiment, the client 300 can click a cloud-rendering button through Web software or directly in a 3D program on the local terminal and access resources via high-speed Internet access. Instructions are issued from the user terminal, the server executes the corresponding rendering task according to the instructions, and the rendered result picture is transmitted back to the user terminal for display. Providing remote rendering capability to terminal devices can compensate for the terminal's shortcomings in rendering capability.
In one embodiment, the server 200 further includes an information database 207, which is a database storing important information data of the server 200. To prevent system abnormalities from causing loss of data such as user information and agent addresses in the message cache module 204, the message cache module 204 can store important user information, agent addresses and other data in the information database 207, and the information database 207 can be used to resume interrupted services.
The client 300 includes a client integration interface 302, a camera acquisition module 303, a second display module 306, a third three-dimensional rendering module 305, and a second audio/video processing module 304. The client integration interface 302 can be used to obtain multiple video images from the second audio/video processing module 304 and send them to the server 200. The second display module 306 can be used to display the rendered first three-dimensional scene model or third three-dimensional scene model, where the display module includes a computer display screen, an AR screen, or the like. The third three-dimensional rendering module 305 is used to perform three-dimensional rendering on the first three-dimensional scene model or the third three-dimensional scene model from the server 200 to obtain the rendered first or third three-dimensional scene model. The second audio/video processing module 304 can be used to receive the video stream of the augmented reality video call sent to the client integration interface 302 and perform frame extraction on it to obtain multiple video images. The camera acquisition module 303 is used to capture the video stream of the augmented reality video call.
In one embodiment, the client 300 further includes a local storage module 301, which is used, when uploading files to or downloading files from the file storage 201 of the server 200, to save files from the server 200 to a local address or upload them to the server 200 according to the URL address sent by the upload/download module 202; the format of a file can also be converted, for example, a picture converted into a binary file.
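The format conversion mentioned above, such as converting a picture into a binary file, might be sketched as a base64 round-trip. The choice of base64 is an assumption made here for illustration only; the application does not name a specific encoding.

```python
import base64

def picture_to_binary(picture_bytes: bytes) -> bytes:
    """Encode raw picture bytes as a text-safe binary payload for transport."""
    return base64.b64encode(picture_bytes)

def binary_to_picture(payload: bytes) -> bytes:
    """Restore the original picture bytes from the encoded payload."""
    return base64.b64decode(payload)
```

Encoding before upload and decoding after download leaves the original picture bytes unchanged.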
In one embodiment, the client integration interface 302 can be used to receive an access request from the message processing module 203 of the server 200 and parse the access request; it can also obtain relevant information from the first audio/video processing module 103, package it into a data packet, and send the data packet to the server 200; no specific limitation is imposed here.
In one embodiment, the local storage module 301 of the client 300 can download the first three-dimensional scene model from the file storage 201 of the server 200 through the upload/download module 202 of the server 200.
In one embodiment, the camera acquisition module 303 captures the video stream of the augmented reality video call and sends it to the second audio/video processing module 304, which performs frame extraction on the video stream to obtain multiple video images and sends them to the client integration interface 302. The client integration interface 302 sends the multiple video images to the server 200; for example, the client integration interface 302 stores the multiple video images in the local storage module 301, and the local storage module 301 sends them to the file storage 201 of the server 200 through the upload/download module 202 of the server 200.
In one embodiment, the client integration interface 302 of the client 300 can obtain the rendered first three-dimensional scene model from the video stream module and send it to the second audio/video processing module 304, which sends the rendered first three-dimensional scene model directly to the second display module 306, and the second display module 306 displays it; no specific limitation is imposed here.
In another embodiment, the client integration interface 302 of the client 300 can obtain the rendered first three-dimensional scene model from the video stream module and send it to the second audio/video processing module 304, which sends the rendered first three-dimensional scene model to the third three-dimensional rendering module 305; the third three-dimensional rendering module 305 performs three-dimensional rendering on the rendered first three-dimensional scene model again; finally, the second display module 306 receives the re-rendered first three-dimensional scene model from the third three-dimensional rendering module 305 and displays it; no specific limitation is imposed here.
In one embodiment, the client integration interface 302 of the client 300 can obtain the first three-dimensional scene model from the local storage module 301 and send it to the second audio/video processing module 304, which sends the first three-dimensional scene model to the third three-dimensional rendering module 305; the third three-dimensional rendering module 305 performs three-dimensional rendering on it; finally, the second display module 306 receives the rendered first three-dimensional scene model from the third three-dimensional rendering module 305 and displays it; no specific limitation is imposed here.
It is worth noting that, in the specific implementations of the present application, whenever processing needs to be performed on data related to user identity or characteristics, such as user information, user behavior data, user history data and user location information, the user's permission or consent is obtained first, and the collection, use and processing of such data comply with the relevant laws, regulations and standards of the relevant countries and regions. In addition, when an embodiment of the present application needs to obtain sensitive personal information of a user, the user's separate permission or separate consent is obtained through a pop-up window or a jump to a confirmation page, and only after the user's separate permission or separate consent has been explicitly obtained is the user-related data necessary for the normal operation of the embodiment acquired.
In addition, referring to Figure 11, an embodiment of the present application further provides another three-dimensional model transmission apparatus. The three-dimensional model transmission apparatus 400 includes a memory 402, a processor 401, and a computer program stored in the memory 402 and executable on the processor 401.
The processor 401 and the memory 402 may be connected through a bus or by other means.
As a non-transitory computer-readable storage medium, the memory 402 can be used to store non-transitory software programs and non-transitory computer-executable programs. In addition, the memory 402 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 402 may include memory located remotely from the processor 401, and such remote memory may be connected to the processor 401 through a network. Examples of the network include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and instructions required to implement the three-dimensional model transmission method of the above embodiments are stored in the memory 402; when executed by the processor 401, they perform the three-dimensional model transmission method of the above embodiments, for example, method steps S110 to S160 in Figure 1, method steps S210 to S230 in Figure 2, method steps S310 to S350 in Figure 3, method steps S410 to S440 in Figure 4, method steps S510 to S530 in Figure 5, method steps S610 to S620 in Figure 6, method steps S710 to S720 in Figure 7, and method steps S810 to S820 in Figure 8 described above.
The device embodiments described above are merely illustrative; units described as separate components may or may not be physically separate, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, an embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions. When the computer-executable instructions are executed by a processor or controller, for example by a processor in the above device embodiment, they cause the processor to perform the three-dimensional model transmission method of the above embodiments, for example, method steps S110 to S160 in Figure 1, method steps S210 to S230 in Figure 2, method steps S310 to S350 in Figure 3, method steps S410 to S440 in Figure 4, method steps S510 to S530 in Figure 5, method steps S610 to S620 in Figure 6, method steps S710 to S720 in Figure 7, and method steps S810 to S820 in Figure 8 described above.
In addition, an embodiment of the present application further provides a computer program product, including a computer program or computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer program or computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the three-dimensional model transmission method of the above embodiments, for example, method steps S110 to S160 in Figure 1, method steps S210 to S230 in Figure 2, method steps S310 to S350 in Figure 3, method steps S410 to S440 in Figure 4, method steps S510 to S530 in Figure 5, method steps S610 to S620 in Figure 6, method steps S710 to S720 in Figure 7, and method steps S810 to S820 in Figure 8 described above.
The embodiments of the present application include the following. First, the server obtains multiple video images from the design end; the video images are obtained by the design end by performing frame extraction on the video stream of an augmented reality video call, and the augmented reality video call is established between the design end and the client. The multiple video images are then segmented to obtain a physical-object image set and an environment image set; the physical-object image set is modeled to obtain a physical three-dimensional model, and the environment image set is modeled to obtain an environment three-dimensional model; a first three-dimensional scene model is generated from the physical three-dimensional model and the environment three-dimensional model; finally, the first three-dimensional scene model is sent to the client. That is, in the scenario of an augmented reality video call established between the design end and the client, the design end performs frame extraction on the video stream of the call to obtain multiple video images, and the server processes these images to obtain the first three-dimensional scene model and sends it to the client. Therefore, the embodiments of the present application can realize the transmission of three-dimensional models in augmented reality scenarios.
Those of ordinary skill in the art can understand that all or some of the steps and systems in the methods disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof. Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, it is well known to those of ordinary skill in the art that communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery medium.

Claims (12)

  1. A three-dimensional model transmission method, comprising:
    acquiring a plurality of video images from a design end, wherein the video images are obtained by the design end by performing frame extraction processing on a video stream of an augmented reality video call, and the augmented reality video call is established between the design end and a client;
    performing segmentation processing on the plurality of video images to obtain a physical-object image set and an environment image set;
    performing modeling processing on the physical-object image set to obtain a physical three-dimensional model;
    performing modeling processing on the environment image set to obtain an environment three-dimensional model;
    generating a first three-dimensional scene model according to the physical three-dimensional model and the environment three-dimensional model;
    sending the first three-dimensional scene model to the client.
  2. The three-dimensional model transmission method according to claim 1, wherein the generating a first three-dimensional scene model according to the physical three-dimensional model and the environment three-dimensional model comprises:
    acquiring a preset three-dimensional model library, wherein the three-dimensional model library comprises an environment model library;
    determining a target environment three-dimensional model from the environment model library according to the physical three-dimensional model and the environment three-dimensional model;
    generating the first three-dimensional scene model according to the physical three-dimensional model and the target environment three-dimensional model.
  3. The three-dimensional model transmission method according to claim 1, wherein after the sending the first three-dimensional scene model to the client, the three-dimensional model transmission method further comprises:
    receiving annotation information sent by the client;
    determining an annotation area in the first three-dimensional scene model according to the annotation information;
    nesting the annotation information on the annotation area to obtain a second three-dimensional scene model;
    superimposing the first three-dimensional scene model and the second three-dimensional scene model to obtain a third three-dimensional scene model;
    sending the third three-dimensional scene model to the client.
  4. The three-dimensional model transmission method according to claim 2, wherein the three-dimensional model library includes a physical model library, and the modeling processing on the physical image collection to obtain a physical three-dimensional model comprises:
    Parse the physical image collection to obtain graphic information;
    Obtain physical image element information from the physical model library according to the graphic information;
    Construct a physical image base map using the physical image element information;
    Perform modeling processing on the physical image base map to obtain a physical three-dimensional model.
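The four steps of claim 4 form a parse → look up → arrange → model pipeline, sketched below. Every function name, the library keyed by shape label, and the pose tuples are hypothetical placeholders, not the patent's API.

```python
# Hedged sketch of the claim-4 pipeline: parse images into graphic info,
# fetch matching element templates from the physical model library, arrange
# them into a base map, then "lift" the base map into a toy 3D model.

def parse_images(images):
    """Parse the physical-image collection into per-image graphic info."""
    # Assumed: each image record already carries a detected shape label.
    return [{"shape": img["shape"], "pose": img["pose"]} for img in images]

def lookup_elements(model_library, graphic_info):
    """Fetch matching element templates from the physical model library."""
    return [model_library[g["shape"]] for g in graphic_info]

def build_base_map(elements, graphic_info):
    """Arrange the elements into a 2D base map using each image's pose."""
    return [{"template": e, "pose": g["pose"]}
            for e, g in zip(elements, graphic_info)]

def model_from_base_map(base_map):
    """Turn the base map into a (toy) 3D model description."""
    return {"parts": [b["template"] for b in base_map],
            "poses": [b["pose"] for b in base_map]}

library = {"cylinder": "cylinder_mesh", "box": "box_mesh"}
images = [{"shape": "box", "pose": (0, 0)},
          {"shape": "cylinder", "pose": (1, 0)}]

info = parse_images(images)
elements = lookup_elements(library, info)
base_map = build_base_map(elements, info)
model = model_from_base_map(base_map)
print(model["parts"])  # ['box_mesh', 'cylinder_mesh']
```

The design choice the claim implies — composing a model from library elements instead of reconstructing geometry from scratch — is what keeps the transmitted data small.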
  5. The three-dimensional model transmission method according to claim 2, wherein the modeling processing on the environment image collection to obtain an environment three-dimensional model comprises:
    Parse the environment image collection to obtain tag information;
    Obtain environment image element information from the environment model library according to the tag information;
    Perform modeling processing on the environment image collection using the environment image element information to obtain an environment three-dimensional model.
  6. The three-dimensional model transmission method according to claim 2, wherein the three-dimensional model transmission method further comprises:
    Determine a target environment three-dimensional model from the environment model library according to the physical three-dimensional model;
    Nest the physical three-dimensional model with the target environment three-dimensional model to generate the first three-dimensional scene model.
  7. The three-dimensional model transmission method according to claim 1, wherein generating the first three-dimensional scene model according to the physical three-dimensional model and the environment three-dimensional model comprises:
    Nest the physical three-dimensional model with the environment three-dimensional model to generate the first three-dimensional scene model.
  8. A three-dimensional model transmission method, comprising:
    Establish an augmented reality video call with the design end;
    Receive a first three-dimensional scene model sent by the server, the first three-dimensional scene model being obtained by the server according to a physical three-dimensional model and an environment three-dimensional model; the physical three-dimensional model is obtained by the server by performing modeling processing on a physical image collection, and the environment three-dimensional model is obtained by the server by performing modeling processing on an environment image collection; both the physical image collection and the environment image collection are obtained by the server by performing segmentation processing on a plurality of video images, and the video images are obtained by the design end by performing frame extraction processing on the video stream of the augmented reality video call.
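The chain described in claim 8 (frame extraction at the design end, then segmentation and modeling at the server) can be traced end to end with a toy sketch. The frame format, the label-based segmentation, and the region-counting "model" are all invented here for illustration; the patent does not prescribe them.

```python
# Illustrative end-to-end data flow for claim 8: extract frames from the AR
# call's video stream, segment each frame into physical and environment
# image sets, model each set, and combine them into the first scene model.

def extract_frames(video_stream, step=2):
    """Design end: keep every `step`-th frame of the stream."""
    return video_stream[::step]

def segment(frame):
    """Server: split one frame into a physical part and an environment part."""
    # Assumed convention: each frame is a dict of region-label pairs.
    physical = {k: v for k, v in frame.items() if v == "object"}
    environment = {k: v for k, v in frame.items() if v == "background"}
    return physical, environment

def model_from(image_set):
    """Server: toy 'modeling' that just counts the regions it saw."""
    return {"regions": sum(len(img) for img in image_set)}

stream = [{"a": "object", "b": "background"},
          {"a": "object", "c": "background"},
          {"d": "background"}]

frames = extract_frames(stream, step=1)
physical_set, env_set = zip(*(segment(f) for f in frames))
physical_model = model_from(physical_set)
env_model = model_from(env_set)
first_scene = {"physical": physical_model, "environment": env_model}
print(first_scene["physical"]["regions"])  # 2
```

Note the division of labor the claim fixes: frame extraction happens at the design end (reducing uplink bandwidth), while the heavier segmentation and modeling stay on the server.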
  9. The three-dimensional model transmission method according to claim 8, wherein the three-dimensional model transmission method further comprises:
    Perform virtual annotation processing on the first three-dimensional scene model to obtain annotation information;
    Send the annotation information to the server.
  10. A three-dimensional model transmission apparatus, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the three-dimensional model transmission method according to any one of claims 1 to 9.
  11. A computer-readable storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to execute the three-dimensional model transmission method according to any one of claims 1 to 9.
  12. A computer program product, comprising a computer program or computer instructions, wherein the computer program or the computer instructions are stored in a computer-readable storage medium; a processor of a computer device reads the computer program or the computer instructions from the computer-readable storage medium and executes them, so that the computer device executes the three-dimensional model transmission method according to any one of claims 1 to 9.
PCT/CN2023/097180 2022-06-08 2023-05-30 Three-dimensional model transmission method and apparatus, and storage medium and program product WO2023236815A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210640757.1A CN117240831A (en) 2022-06-08 2022-06-08 Three-dimensional model transmission method and device, storage medium and program product thereof
CN202210640757.1 2022-06-08

Publications (1)

Publication Number Publication Date
WO2023236815A1

Family

ID=89083145

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/097180 WO2023236815A1 (en) 2022-06-08 2023-05-30 Three-dimensional model transmission method and apparatus, and storage medium and program product

Country Status (2)

Country Link
CN (1) CN117240831A (en)
WO (1) WO2023236815A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105933637A (en) * 2016-04-26 2016-09-07 上海与德通讯技术有限公司 Video communication method and system
US20180197341A1 (en) * 2016-06-10 2018-07-12 Dirtt Environmental Solutions, Ltd. Mixed-reality architectural design environment
CN112446939A (en) * 2020-11-19 2021-03-05 深圳市中视典数字科技有限公司 Three-dimensional model dynamic rendering method and device, electronic equipment and storage medium
CN112907751A (en) * 2021-03-23 2021-06-04 中德(珠海)人工智能研究院有限公司 Virtual decoration method, system, equipment and medium based on mixed reality
CN113242398A (en) * 2021-04-16 2021-08-10 杭州易现先进科技有限公司 Three-dimensional labeled audio and video call method and system

Also Published As

Publication number Publication date
CN117240831A (en) 2023-12-15

Similar Documents

Publication Publication Date Title
CN109327727B (en) Live stream processing method in WebRTC and stream pushing client
US9990760B2 (en) Generating a 3D interactive immersive experience from a 2D static image
WO2018033138A1 (en) Data processing method, apparatus and electronic device
US20210264139A1 (en) Creating videos with facial expressions
US8878897B2 (en) Systems and methods for sharing conversion data
US20130057642A1 (en) Video conferencing system, method, and computer program storage device
WO2017101487A1 (en) Method and device for submitting transcoding attribute information
CN111464828A (en) Virtual special effect display method, device, terminal and storage medium
AU2015298291A1 (en) Optimized rendering of shared documents on client devices with document raster representations
WO2019193364A1 (en) Method and apparatus for generating augmented reality images
WO2022088834A1 (en) Dynamic photograph album generation method, server, display terminal and readable storage medium
KR20120099814A (en) Augmented reality contents service system and apparatus and method
CN113965773A (en) Live broadcast display method and device, storage medium and electronic equipment
CN114139491A (en) Data processing method, device and storage medium
WO2023236815A1 (en) Three-dimensional model transmission method and apparatus, and storage medium and program product
CN114546577B (en) Data visualization method and system
US11615167B2 (en) Media creation system and method
TWI790560B (en) Side by side image detection method and electronic apparatus using the same
CN112651801B (en) Method and device for displaying house source information
TWI765230B (en) Information processing device, information processing method, and information processing program
JP6623905B2 (en) Server device, information processing method and program
US20230195856A1 (en) Method for media creation, sharing, and communication and associated system
CN117437342B (en) Three-dimensional scene rendering method and storage medium
EP4376385A1 (en) System and method enabling live broadcasting sessions in virtual environments
CN117793481A (en) Video stream generation method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23818980

Country of ref document: EP

Kind code of ref document: A1