WO2023174209A1 - Virtual shooting method, apparatus and device - Google Patents

Virtual shooting method, apparatus and device

Info

Publication number
WO2023174209A1
WO2023174209A1 (application PCT/CN2023/081078)
Authority
WO
WIPO (PCT)
Prior art keywords
scene file
target
scene
user
cloud computing
Application number
PCT/CN2023/081078
Other languages
English (en)
French (fr)
Inventor
陈普
Original Assignee
华为云计算技术有限公司
Priority claimed from CN202210802065.2A (external priority; see CN116800737A)
Application filed by 华为云计算技术有限公司 (Huawei Cloud Computing Technologies Co., Ltd.)
Publication of WO2023174209A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00: Details of television systems
    • H04N 5/222: Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/265: Mixing

Definitions

  • the present application relates to the field of computers, and in particular to a virtual shooting method, device and equipment.
  • Computer rendering technology refers to calculating and outputting pictures containing the same model and lighting conditions in the real world based on three-dimensional model data (including object models, surface materials, etc.) and light data (including light source position, color, intensity, etc.).
  • Virtual shooting is an emerging video creation method.
  • the members participating in the shooting do not need to be in a real shooting scene; instead, they shoot against a background such as a green screen.
  • This application provides a virtual shooting method, which can improve virtual shooting efficiency.
  • the first aspect of this application provides a virtual shooting method, which is applied to a cloud computing platform.
  • the method includes: providing a first configuration interface, the first configuration interface being used to receive a scene file uploaded by a first user, where the scene file includes parameters of at least one three-dimensional model; providing a second configuration interface for receiving a target image uploaded by a second user; and fusing the scene file and the target image to obtain a composite image.
  • the method receives the scene file uploaded by the first user for rendering and provides it according to the second user's request, so that the second user can fuse the scene file with the target image to obtain a composite image.
  • the target image is provided by the second user.
  • the first configuration interface is also used to receive description information and/or charging methods of the scene file configured by the first user.
  • the second user can select the required scene file more conveniently and targetedly for image fusion.
  • the target image is included in the video file.
  • the target image may be a single image provided by the second user, or a frame of image in the video stream. That is, the target image may include one or more frames of images to be fused. In other words, this method can also be applied to image fusion of video streams.
  • the scene file and the target image are fused through the cloud rendering service of the cloud computing platform to obtain the composite image.
  • the second configuration interface is also used to receive the fusion parameters uploaded by the second user.
  • the fusion parameters include one or more of the following: duration, fusion position, and orientation, where the duration indicates the time period during which the scene file and the target image are fused, and the orientation indicates the orientation of the acquisition device that captured the target image; fusing the scene file and the target image to obtain a composite image includes: fusing the scene file and the target image based on the fusion parameters to obtain the composite image.
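  • As a minimal, non-limiting sketch (the field names and example values below are illustrative assumptions, not part of this application), such fusion parameters could be represented as follows:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class FusionParams:
    # time period during which the scene file and the target image are fused
    start_s: float
    end_s: float
    # fusion position of the target image inside the rendered scene picture
    position_xy: Tuple[int, int]
    # orientation of the acquisition device that captured the target image
    orientation_deg: Tuple[float, float]

# example values a second user might submit through the second configuration interface
params = FusionParams(start_s=196.0, end_s=600.0,
                      position_xy=(640, 360), orientation_deg=(30.0, 10.0))
```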
  • the scene file includes a live broadcast background, and the target image includes a portrait in the live video stream; fusing the scene file and the target image to obtain a composite image includes: based on the fusion parameters, using the live broadcast background as the background of the live video stream and merging it with the portrait in the live video stream to obtain a new live video stream.
  • This method can also be applied in the field of live broadcast. After fusing the scene file uploaded by the first user, used as the live broadcast background, with the live video stream, a new live video stream can be obtained. It saves the second user (the live broadcaster) from having to produce a live broadcast background, and also provides a variety of scene files for the second user to choose from, which greatly simplifies the image fusion steps for the second user and improves the second user's experience.
  • the rendering result of the scene file is displayed to the second user; adjustment information is received from the second user, the adjustment information indicating that the parameters of the at least one three-dimensional model are adjusted; and the rendering result of the adjusted scene file is obtained according to the adjustment information and the rendering result of the scene file.
  • the rendering result of the adjusted scene file is displayed to the second user.
  • one or more of the following operations are performed on the target three-dimensional model: copy, paste, delete, and edit, where editing includes text editing and color editing; text editing indicates that the text information contained in the target three-dimensional model is modified, and color editing indicates that the color of the target three-dimensional model is modified.
  • the at least one frame of the composite image is sent to a third user designated by the second user according to the instruction information of the second user.
  • the cloud computing platform is an edge computing cluster.
  • when this method is executed by an edge computing cluster, it can effectively reduce the bandwidth pressure on the central computing nodes in the network, save data transmission time, and further improve the efficiency of image fusion.
  • the composite image and the indication information are sent to the edge node, so that the edge node sends the composite image to the third user specified by the second user.
  • a second aspect of the application also provides a virtual shooting method, which method includes: receiving a rendering result of a target scene sent by the cloud computing platform.
  • the target scene is obtained by rendering a scene file.
  • the scene file is provided by the first user on the cloud computing platform.
  • the scene file includes parameters of at least one three-dimensional model; the rendering result of the target scene and the target image are fused to obtain a composite image, wherein the target image is collected by the second user.
  • the second user obtains the scene file provided by the first user for rendering, and based on the target image produced/obtained by the second user, can achieve the fusion of the scene file and the target image to obtain a composite image.
  • the production of scene files and the production of target images are effectively separated, avoiding the situation where the second user spends a lot of time and computing resources producing scene files, and saving a great deal of time and computing resource overhead.
  • the rendering result of the target scene frame also includes depth information corresponding to the target scene frame.
  • the target image is included in the video file.
  • a configuration interface is provided, and the configuration interface is used to receive the fusion parameters uploaded by the second user.
  • the fusion parameters include one or more of the following: duration, fusion position, and orientation, where the duration indicates the time period during which the scene file and the image are fused, and the orientation indicates the orientation of the acquisition device that captured the target image; fusing the scene file and the target image to obtain a composite image includes: fusing the scene file and the target image based on the fusion parameters to obtain the composite image.
  • the live broadcast background is used as the background of the live video stream and fused with the portrait in the live video stream to obtain a new live video stream.
  • adjustment information is sent to the cloud computing platform, and the adjustment information indicates that parameters of the at least one three-dimensional model included in the scene file are adjusted; and a rendering result of the adjusted scene file is obtained.
  • one or more of the following operations are performed on the target 3D model: copy, paste, delete, edit; wherein the editing includes text editing and color editing, and the text editing indicates that the target 3D model The text information contained in the model is modified, and the color editing indicates that the color of the target 3D model is modified.
  • the adjusted scene file is uploaded to the cloud computing platform.
  • description information and/or charging methods for the adjusted scene file are uploaded to the cloud computing platform.
  • configuration information is sent to the cloud computing platform, and the configuration information indicates selecting the scene file from at least one scene file provided by the cloud computing platform.
  • the third aspect of the present application provides a cloud computing platform.
  • the cloud computing platform includes a communication unit for providing a first configuration interface.
  • the first configuration interface is used for receiving a scene file uploaded by a first user.
  • the scene file includes parameters of at least one three-dimensional model; a second configuration interface is provided, the second configuration interface being used to receive the target image uploaded by the second user; and the processing unit is used to fuse the scene file and the target image to obtain a composite image.
  • the first configuration interface is also used to receive description information and/or charging methods of the scene file configured by the first user.
  • the target image is included in the video file.
  • the processing unit is also used to fuse the scene file and the target image through the cloud rendering service of the cloud computing platform to obtain the composite image.
  • the second configuration interface is also used to receive fusion parameters uploaded by the second user.
  • the fusion parameters include one or more of the following: duration, fusion position, and orientation, where the duration indicates the time period during which the scene file and the image are fused, and the orientation indicates the orientation of the acquisition device that captured the target image; the processing unit is also used to fuse the scene file and the target image based on the fusion parameters to obtain at least one frame of the composite image.
  • the scene file includes a live broadcast background
  • the target image includes portraits in the live video stream
  • the processing unit is also used to, based on the fusion parameters, use the live broadcast background as the background of the live video stream and merge it with the portrait in the live video stream to obtain a new live video stream.
  • the communication unit is also used to display the rendering result of the scene file to the second user, and to receive adjustment information from the second user, the adjustment information indicating that the parameters of the at least one three-dimensional model are adjusted;
  • the processing unit is also used to obtain the rendering result of the adjusted scene file based on the adjustment information and the rendering result of the scene file; the communication unit is also used to display the rendering result of the adjusted scene file to the second user.
  • the at least one three-dimensional model includes a target three-dimensional model
  • the processing unit is also used to perform one or more of the following operations on the target three-dimensional model: copy, paste, delete, edit; wherein,
  • the editing includes text editing and color editing.
  • text editing indicates that the text information contained in the target three-dimensional model is modified, and color editing indicates that the color of the target three-dimensional model is modified.
  • the processing unit is also configured to send the composite image to a third user specified by the second user according to the second user's instruction information.
  • the cloud computing platform is an edge computing cluster.
  • the processing unit is also configured to send the composite image and the indication information to the edge node, so that the edge node sends the composite image to the third user specified by the second user.
  • the fourth aspect of the present application provides a virtual shooting device.
  • the device includes an interactive unit for receiving a rendering result of a target scene sent by a cloud computing platform.
  • the target scene is obtained by rendering a scene file.
  • the scene file is provided by the first user on the cloud computing platform, and the scene file includes parameters of at least one three-dimensional model; the computing unit is used to fuse the rendering result of the target scene with the target image to obtain a composite image, wherein the target image is collected by the second user.
  • the rendering result of the target scene also includes depth information corresponding to the target scene.
  • the target image is included in the video file.
  • the interaction unit is also used to provide a configuration interface.
  • the configuration interface is used to receive the fusion parameters uploaded by the second user.
  • the fusion parameters include one or more of the following: duration, fusion position, and orientation, where the duration indicates the time period during which the scene file and the image are fused, and the orientation indicates the orientation of the acquisition device that captured the target image; the computing unit is also used to fuse the scene file and the target image based on the fusion parameters to obtain the composite image.
  • the scene file includes a live broadcast background
  • the target image includes portraits in the live video stream
  • the computing unit is also used to, based on the fusion parameters, use the live broadcast background as the background of the live video stream and merge it with the portrait in the live video stream to obtain a new live video stream.
  • the interaction unit is also used to send adjustment information to the cloud computing platform, where the adjustment information indicates that the parameters of the at least one three-dimensional model included in the scene file are adjusted, and to obtain the rendering result of the adjusted scene file.
  • the at least one three-dimensional model includes a target three-dimensional model
  • the computing unit is also used to perform one or more of the following operations on the target three-dimensional model: copy, paste, delete, edit; wherein,
  • the editing includes text editing and color editing.
  • the text editing instructs the text information contained in the target three-dimensional model to be modified.
  • the color editing instructs the color of the target three-dimensional model to be modified.
  • the interaction unit is also used to upload the adjusted scene file to the cloud computing platform.
  • the interaction unit is also used to upload description information and/or charging methods for the adjusted scene file to the cloud computing platform.
  • the interaction unit is also used to send configuration information to the cloud computing platform, where the configuration information indicates selecting the scene file from at least one scene file provided by the cloud computing platform.
  • a fifth aspect of the present application provides a computing device cluster, including at least one computing device, each computing device including a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device executes the method provided by the first aspect or any possible design of the first aspect.
  • a sixth aspect of the present application provides a computing device cluster, including at least one computing device, each computing device including a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device executes the method provided by the second aspect or any possible design of the second aspect.
  • a seventh aspect of the present application provides a computer program product containing instructions that, when executed by a cluster of computer equipment, cause the cluster of computer equipment to perform the method provided by the first aspect or any possible design of the first aspect.
  • An eighth aspect of the present application provides a computer program product containing instructions that, when executed by a cluster of computer equipment, cause the cluster of computer equipment to perform the method provided by the second aspect or any possible design of the second aspect.
  • a ninth aspect of the present application provides a computer-readable storage medium, including computer program instructions.
  • when the computer program instructions are executed by a computing device cluster, the computing device cluster performs the method provided by the first aspect or any possible design of the first aspect.
  • a tenth aspect of the present application provides a computer-readable storage medium, including computer program instructions.
  • when the computer program instructions are executed by a computing device cluster, the computing device cluster performs the method provided by the second aspect or any possible design of the second aspect.
  • Figure 1 is an architectural diagram of a virtual shooting involved in this application
  • Figure 2 is a flow chart of a virtual shooting method involved in this application
  • Figure 3 is an interactive interface for uploading scene files involved in this application
  • Figure 4 is an interactive interface for configuring fusion information involved in this application
  • Figure 5 is a schematic diagram of a virtual viewpoint orientation involved in this application.
  • Figure 6 is a flow chart of another virtual shooting method involved in this application.
  • Figure 7 is a schematic diagram of a scene file editing interactive interface involved in this application.
  • Figure 8 is a schematic diagram of another scene file editing interactive interface involved in this application.
  • Figure 9 is a structural diagram of a cloud computing platform involved in this application.
  • Figure 10 is a structural diagram of a virtual shooting device involved in this application.
  • Figure 11 is a schematic diagram of a computing device involved in this application.
  • Figure 12 is a schematic diagram of a computing device cluster involved in this application.
  • Figure 13 is a schematic diagram of another computing device cluster involved in this application.
  • the terms "first" and "second" in the embodiments of this application are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Therefore, features defined as "first" and "second" may explicitly or implicitly include one or more of these features.
  • the more common virtual shooting method is to carry out scene construction and data collection separately, and then combine the above two through computer synthesis technology to obtain a composite image.
  • Data collection refers to using the camera function of a collection device (such as a camera) to collect people or objects to obtain multiple frames of images (usually collected with a green screen). Optionally, it can also be achieved by collecting point clouds using lidar.
  • Scenes include real scenes, virtual scenes, and combinations of real and virtual scenes. Among them, the construction of virtual scenes is relatively complicated. Generally speaking, a virtual scene includes light sources and multiple three-dimensional models. As the number of three-dimensional models increases or the requirements for image quality rise, it becomes increasingly difficult to construct virtual scenes. Furthermore, even when real scenes are used, virtualization operations still need to be performed on them.
  • the existing virtual shooting mode places high technical requirements on the staff and also incurs considerable expense; it is characterized by high difficulty, high cost, and low efficiency.
  • this application proposes a virtual shooting method.
  • the scene file can be uploaded to the cloud computing platform by a third-party user.
  • the target user who needs to create a composite image can select a scene file on the cloud computing platform and further combine it with the image data collected by the target user to obtain the composite image.
  • the synthesis operation can also be performed by the cloud computing platform. This method greatly reduces the difficulty and cost of video production by decoupling image data collection and scene construction, and effectively improves the efficiency of virtual shooting and video production.
  • Figure 1 shows an architecture diagram of a virtual shooting.
  • the target users are those who need to create composite images.
  • image data is collected by the target user using a collection device such as a camera.
  • the image data includes one or more frames of images.
  • the image data may be collected by the target user using multiple collection devices. That is, multiple acquisition devices are used to obtain multiple streams of image data to produce composite images.
  • third-party users can use virtualization software, image production software and other software to create scene files, and upload them to the scene file resource library in the cloud computing platform 100 .
  • the scene file resource library contains different types of scene files.
  • the scene file resource library includes at least scene file 1, scene file 2 and scene file 3.
  • scene file 1 is uploaded by third-party user 1
  • scene file 2 is uploaded by third-party user 2.
  • the process by which third-party users upload scene files will be introduced later. It should be noted that there is no fixed sequence between the operation of third-party users uploading scene files and the operation of target users collecting image data.
  • the target user can select the appropriate one or more scene files through the cloud computing platform 100 for producing the composite image.
  • the creation operation of the composite image may be performed by the cloud computing platform 100 .
  • it can also be executed by the target user.
  • the virtual shooting method proposed in this application can be used to produce offline composite images, and can also be used to produce online composite images. Among them, after synthesizing images online, the synthesized images can be used to provide online services, such as party live broadcasts, outdoor live broadcasts, etc.
  • Figure 2 shows a flow chart of a virtual shooting method 200.
  • S200 The third-party user client uploads the scene file to the cloud computing platform 100.
  • the scene file can be uploaded through the interactive interface shown in Figure 3 on the client.
  • the scene file includes parameter information of multiple three-dimensional models.
  • the parameter information includes one or more of the following: material information, texture map information, normal map information, position information, size information, etc.
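  • A minimal sketch of how such per-model parameter information might be organized in a scene file (the key names and values are illustrative assumptions, not a format defined by this application):

```python
# illustrative scene-file structure: one entry per three-dimensional model,
# plus light data as described earlier (light source position, color, intensity)
scene_file = {
    "models": [
        {
            "id": "model_001",
            "material": {"base_color": [0.8, 0.8, 0.8], "roughness": 0.4},
            "texture_map": "textures/wall_albedo.png",   # texture map information
            "normal_map": "textures/wall_normal.png",    # normal map information
            "position": [1.5, 0.0, -3.2],                # position in the 3D scene
            "size": [2.0, 3.0, 0.2],                     # length / width / height
        },
    ],
    "lights": [
        {"type": "point", "position": [0.0, 5.0, 0.0],
         "color": [1.0, 1.0, 1.0], "intensity": 800.0},
    ],
}
```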
  • the scene file may be unrendered. Further, after receiving the scene file uploaded by the third-party user, the cloud computing platform 100 uses a rendering engine to render it and obtains the rendered scene file.
  • the interactive interface shown in Figure 3 includes:
  • Scene file selection control 301 through which a third-party user can select one or more scene files to be uploaded to the cloud computing platform 100 from multiple locally stored files. That is to say, this control supports third-party users to upload multiple scene files at one time.
  • Description information input control 302 third-party users can use this control to input description information for the uploaded scene file.
  • the description information includes the subject and/or content of the scene file.
  • the theme can be a sports theme, animation theme or outdoor theme, etc.
  • Third-party users can select one or more themes to describe the scene file according to the actual situation. It should be noted that the above description does not constitute a limitation on the type of subject matter, and this application does not limit the type of subject matter.
  • the target user can quickly select a scene file for producing a composite image according to the theme type of the scene file, which can effectively improve the production efficiency of the composite image.
  • the description information also includes content.
  • the content can be a textual description of the scene file or a descriptive file about key parameters.
  • the key parameters include one or more of the following: the position, orientation, field of view (fov) and visual plane position of the virtual viewpoint/camera in the scene file, etc.
  • the field of view angle is also called the field of view (FOV) in optical engineering.
  • the size of the field of view angle determines the field of view of an optical instrument (such as an acquisition device).
  • with the lens of the optical instrument as the vertex, the angle formed by the two edges of the maximum range through which the object image of the measured target can pass through the lens is called the field of view angle.
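  • For reference, under a simple pinhole/thin-lens approximation the field of view angle can be estimated from the sensor dimension and focal length; this is a general optics relation rather than a formula given in this application:

```python
import math

def field_of_view_deg(sensor_size_mm: float, focal_length_mm: float) -> float:
    """FOV = 2 * arctan(d / (2 * f)) under a thin-lens / pinhole approximation."""
    return math.degrees(2.0 * math.atan(sensor_size_mm / (2.0 * focal_length_mm)))

# e.g. a 36 mm wide sensor with a 50 mm lens gives roughly a 39.6 degree horizontal FOV
print(round(field_of_view_deg(36.0, 50.0), 1))
```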
  • the charging method setting control 303 is used to set the charging method for the scene file.
  • the charging methods include at least the following two: charging by duration and charging per use.
  • charging based on duration indicates charging based on the duration the target user uses the scene file. For example, when using a scene file in a live broadcast event, the usage duration is the live broadcast duration.
  • if third-party users choose the duration-based charging method, they can further set the unit price, for example, the price charged per minute.
  • Charging per use indicates that the target user is charged according to the number of times the scene file is used; generally, each creation of a composite image counts as one use of the scene file.
  • if third-party users choose the per-use charging method, they can further set the unit price, for example, the price charged per use (not shown in Figure 3); a minimal sketch of both charging calculations is given below.
  • the cloud computing platform 100 when setting the unit price, can provide a reference price to third-party users for reference.
  • Determination control 304 is used to upload scene files, description information and charging methods to the cloud computing platform 100.
  • this application uses the interactive interface provided by the cloud computing platform 100 to easily upload scene files and related information.
  • the interactive interface provided by the cloud computing platform 100 for uploading scene files can be presented in a variety of ways, which is not limited by this application.
  • S201 The target user client collects image data.
  • the image data collected by the target user using the collection device includes one or more frames of images of people and/or objects.
  • the image data may be video stream data collected by the target user, and the video stream data includes multiple frames of images.
  • the image data may also include collection device information.
  • the collection device information includes one or more of the following: relative position information and orientation of the collection device relative to the person or object during the shooting process.
  • the target user uses multiple acquisition devices to acquire multiple channels of image data.
  • the multiple channels of image data can be jointly used to create composite images.
  • the multi-channel image data can be converted into the same coordinate system according to the relative position information and orientation of the acquisition device relative to the person or object in each channel of image data, thereby realizing the synthesis of image data.
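  • A minimal sketch of converting points observed by one acquisition device into a shared coordinate system using the device's relative position and orientation (a standard rigid-body transform, simplified here to a rotation about the vertical axis for illustration):

```python
import numpy as np

def to_world(points_cam: np.ndarray, cam_position: np.ndarray, cam_yaw_deg: float) -> np.ndarray:
    """Map Nx3 points from a device's local frame into a shared world frame,
    assuming the device is rotated by cam_yaw_deg about the vertical (z) axis
    and translated to cam_position."""
    yaw = np.radians(cam_yaw_deg)
    rot_z = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                      [np.sin(yaw),  np.cos(yaw), 0.0],
                      [0.0,          0.0,         1.0]])
    return points_cam @ rot_z.T + cam_position

# two devices observing the same person can be brought into one coordinate system
pts_cam1 = np.array([[0.0, 2.0, 1.7]])
pts_world = to_world(pts_cam1, cam_position=np.array([1.0, -3.0, 0.0]), cam_yaw_deg=90.0)
```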
  • the third-party user's operation of uploading the scene file may precede the target user's operation of collecting image data, and the third-party user's operation of uploading the scene file may also occur after the target user's operation of collecting image data.
  • the operation of the third-party user to upload the scene file can also be performed at the same time as the operation of the target user to collect image data.
  • S202 The target user selects a scene file for producing a composite image from the cloud computing platform 100.
  • the third-party user uploads one or more scene files and related information to the cloud computing platform 100 in S200. Therefore, the target user can select scene files for making composite images as needed. Specifically, target users can make selections based on the theme type, charging method, unit price and other information of the scene file.
  • S203 The target user uploads the collected image data to the cloud computing platform 100 through the target user client.
  • the collected image data can be uploaded to the cloud computing platform 100 .
  • S204 The cloud computing platform 100 renders based on the image data and the selected scene file to obtain a composite image.
  • the composite image can be obtained using the rendering method.
  • the method of obtaining the composite image may also be to manually or automatically call a computing service on the cloud (such as a cloud rendering service).
  • the computing service is used to perform the calculation steps of rendering; after the rendering result is obtained and the images are composited, the composite image is obtained.
  • the provider of the above computing service may be the same as the provider of the cloud computing platform, or may be different.
  • the live background can be used as the background of the live video stream and fused with the portrait in the live video stream to obtain a new live video stream.
  • Common rendering methods include raster method, ray tracing method, reverse ray tracing method, etc.
  • ray tracing comes from a general technique of geometrical optics: it tracks the rays that interact with optical surfaces to obtain a model of the rays' paths, and is used in the design of optical systems such as camera lenses, microscopes, telescopes, and binoculars.
  • in rendering, rays are traced from the eye rather than from the light source through a mathematical model of the scene; the results are similar to those of ray casting and scanline rendering, but this method produces better optical effects.
  • the ray tracing method first calculates the distance, direction, and new position that a ray travels through the medium before it is absorbed or changes direction, then spawns a new ray from this new position and applies the same processing, finally obtaining a complete path of the light propagating through the medium. Since the algorithm is a complete simulation of the imaging system, complex images can be simulated. It should be noted that this application does not limit the rendering method.
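  • As a generic illustration of the ray-propagation step described above (tracing a ray until it meets a surface, from which a new ray can be spawned), the sketch below computes a simple ray-sphere intersection; it is a textbook ray tracing primitive, not code from this application:

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance along the ray to the first sphere intersection, or None.
    origin/direction/center are 3-tuples; direction is assumed normalized."""
    oc = [origin[i] - center[i] for i in range(3)]
    b = 2.0 * sum(direction[i] * oc[i] for i in range(3))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None                      # the ray misses the sphere
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0.0 else None

# a ray from the eye along +z hits a sphere centered 5 units away at distance 4
t = ray_sphere_hit((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0)
hit_point = tuple(t * d for d in (0, 0, 1))   # new position from which a new ray could be spawned
```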
  • Figure 4 shows an interactive interface for configuring fusion information. As shown in Figure 4, the interactive interface includes,
  • the screen display frame 401 is used to display the screen of the scene file. Specifically, it corresponds to a certain frame (a certain moment) in a scene file.
  • pictures at different times in the scene file can be selected.
  • the picture display box 401 displays the picture at 3 minutes and 16 seconds in the scene file, and the picture includes at least two three-dimensional models 402 .
  • the target user can also use the playback control 406 to implement operations such as playing, pausing, fast forwarding, and rewinding the picture.
  • the preview of the composite image can be realized through the operations of playback, fast forward and rewind, and the picture selection of the scene file can be realized through the pause operation.
  • the screen display frame 401 also includes an orientation adjustment control 404.
  • by operating the orientation adjustment control 404, the picture of the scene file presented in the picture display frame 401 can be adjusted. Specifically, because the scene file is modeled in a three-dimensional space, placing the virtual viewpoint at different locations in the three-dimensional space may produce different content/pictures. That is, by operating the orientation adjustment control 404, the position of the virtual viewpoint in the three-dimensional space can be adjusted, thereby adjusting the picture of the scene file presented in the picture display frame 401. It should be noted that for the same scene file, the positions of the virtual viewpoints corresponding to different frames (or moments) may be different.
  • the scene file information box 407 displays scene file information. Specifically, it includes one or more of the following: scene file identification, scene file type, virtual viewpoint coordinates, virtual viewpoint orientation, and field of view angle. Among them, the scene file identifier can be a file number or file name, or it can be a file storage path.
  • the coordinates and orientation of the virtual viewpoint refer to the coordinates and orientation of the virtual viewpoint in the three-dimensional space in the frame displayed in the screen display frame 401 . Specifically, the orientation of the virtual viewpoint indicates the angle between the observation direction of the virtual viewpoint and the coordinate axis in the three-dimensional space.
  • Figure 5 shows a schematic diagram of the orientation of the virtual viewpoint.
  • the observation direction of the virtual viewpoint is S1
  • the projection of S1 on the xoy plane is S1'
  • the angle between S1' and the x-axis is α
  • the angle between S1' and the y-axis is β
  • the orientation of the virtual viewpoint can then be expressed by these angles, for example as (α, β).
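  • A minimal sketch of computing such orientation angles from an observation direction vector S1 (the exact angle convention is an assumption for illustration):

```python
import math

def viewpoint_orientation(s1):
    """Given the observation direction S1 = (x, y, z), return (alpha, beta):
    alpha is the angle between the projection S1' (onto the xoy plane) and the x-axis,
    beta is the angle between S1' and the y-axis."""
    x, y, z = s1
    alpha = math.degrees(math.atan2(y, x))    # angle of S1' with the x-axis
    beta = math.degrees(math.atan2(x, y))     # angle of S1' with the y-axis
    return alpha, beta

print(viewpoint_orientation((1.0, 1.0, 0.5)))  # (45.0, 45.0)
```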
  • the screen display box 401 also includes a fusion identifier 403, which is used to indicate the location where the image data and the scene file are fused.
  • the target user can set the position of the fusion mark 403 using the mouse or based on the touch screen.
  • Image data information is displayed in the image data information box 408 .
  • it includes one or more of the following: image data identification, duration, collection device coordinates, collection device orientation and field of view angle, etc.
  • the duration refers to how long the image data remains at its current position (the fusion position) in the picture display frame 401. For example, a duration of 3:16-10:00 means that the image data will appear at the fusion mark 403 from 3 minutes 16 seconds to 10 minutes.
  • the target user can configure and modify the duration in the image data information box 408.
  • the image data identifier can be a file number or file name, or it can be a file storage path.
  • the coordinates and orientation of the collection device refer to the coordinates and orientation, in three-dimensional space, of the collection device that captured the image data.
  • the orientation of the collection device is similar to the orientation of the virtual viewpoint.
  • the target user can adjust the scene file by modifying the coordinates, orientation and/or field of view angle of the virtual viewpoint in the scene file information box 407.
  • the target user can adjust the image data by modifying the coordinates, orientation and/or field of view angle of the collection device in the image data information box 408 .
  • the target user can also use the orientation adjustment control 404 to adjust the image data. Specifically, after the fusion mark 403 is selected, the coordinates, orientation, and field of view angle of the collection device in the image data can be controlled by operating the orientation adjustment control 404. That is, by operating the orientation adjustment control 404, the three-dimensional space contained in the image data can be observed from different positions in space, thereby presenting different pictures in the picture display frame 401.
  • after determining the position of the fusion identifier 403, the parameters in the scene file information box 407, and the parameters in the image data information box 408, the target user can start the fusion of the image data and the scene file by clicking the determination control 409 to obtain a composite image.
  • Image fusion can be divided into pixel-level, feature-level, decision-level fusion, etc. from low to high intelligence.
  • Pixel-level fusion refers to splicing and fusion based on image pixels, which is the fusion of two or more images into a whole (a minimal sketch of this is given after this list).
  • Feature-level fusion performs image splicing and fusion based on the obvious features of graphics, such as lines, buildings and other features.
  • Decision-level fusion uses mathematical algorithms such as the Bayesian method and Dempster-Shafer (D-S) evidence theory to make probabilistic decisions, based on which image fusion is performed; it is better suited to subjective requirements. It should be understood that this application does not limit the method used for image fusion.
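  • A minimal sketch of pixel-level fusion, implemented here as simple alpha compositing of a foreground cut-out onto the rendered scene picture at the fusion position (the array shapes and mask convention are illustrative assumptions):

```python
import numpy as np

def pixel_level_fuse(background: np.ndarray, foreground: np.ndarray,
                     mask: np.ndarray, top_left: tuple) -> np.ndarray:
    """Composite an HxWx3 foreground onto the background at top_left = (row, col),
    using an HxW mask in [0, 1] (1 = keep foreground pixel, 0 = keep background)."""
    out = background.copy().astype(np.float32)
    r, c = top_left
    h, w = foreground.shape[:2]
    region = out[r:r + h, c:c + w]
    out[r:r + h, c:c + w] = mask[..., None] * foreground + (1.0 - mask[..., None]) * region
    return out.astype(background.dtype)

# e.g. place a portrait cut-out at row 200, column 340 of the rendered scene frame
# fused = pixel_level_fuse(scene_frame, portrait, portrait_mask, (200, 340))
```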
  • the image data and scene files also need to be preprocessed before image fusion.
  • Preprocessing technology is mainly used to perform geometric correction, noise elimination, color adjustment, brightness adjustment, image registration, etc. on images.
  • Image registration refers to finding the maximum correlation between the image and the three-dimensional virtual scene in order to eliminate the information differences of the image in the directions of space, phase and resolution, so as to achieve the purpose of more realistic fusion and more accurate information.
  • for some image data, the people and/or objects need to be separated from the data collected by the acquisition device.
  • image processing software can be used to cut out specified people and/or objects in the picture.
  • the operation of separating people and/or objects from the data collected by the collection device can be completed by the target user before S203, or can be completed by the cloud computing platform 100. In the latter case, the image data uploaded by the target user in S203 has not been preprocessed.
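  • A minimal sketch of one common way to separate a person shot against a green screen from a collected frame (a simple chroma-key threshold; practical matting pipelines are more elaborate):

```python
import numpy as np

def green_screen_mask(frame_rgb: np.ndarray, dominance: float = 1.3) -> np.ndarray:
    """Return an HxW foreground mask (1.0 = person/object, 0.0 = green background)
    by marking pixels whose green channel strongly dominates red and blue."""
    r = frame_rgb[..., 0].astype(np.float32)
    g = frame_rgb[..., 1].astype(np.float32)
    b = frame_rgb[..., 2].astype(np.float32)
    background = (g > dominance * r) & (g > dominance * b)
    return (~background).astype(np.float32)

# mask = green_screen_mask(captured_frame)   # can then be fed to the fusion step above
```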
  • the cloud computing platform 100 can perform a rendering operation on the scene file to obtain the rendered scene file. That is, this operation can be performed in S204, or can be performed before S204.
  • the interactive interface for configuring fusion information provided by the cloud computing platform 100 may be an application interface running on the target user client, or it may run on a web page based on hypertext transfer protocol (HTTP).
  • the composite image can also be sent directly to the target user's customers according to the target user's instructions.
  • when the target user is a live broadcast platform, the composite image can be sent directly to the audience based on the audience's access status and access information. Therefore, this method can effectively improve the speed and efficiency of composite image distribution.
  • the synthetic image can also be distributed according to the parameter information of the target user or the target user's client.
  • the parameter information includes one or more of the following: coordinates, orientation, field of view angle, etc. of the virtual viewpoint.
  • the cloud computing platform 100 can generate multiple channels of image data and send them to the multiple target users or customers of the target users respectively.
  • the cloud computing platform 100 can generate multiple channels of downlink image data according to the player's parameter information to meet the needs of the audience.
  • this application provides a virtual shooting method.
  • This method provides an interactive interface for scene file upload and an interactive interface for image fusion, allowing target users to select and use scene files uploaded by third-party users for image fusion.
  • It saves the target user from operations such as setting up a shooting venue for scene files, purchasing shooting equipment, and operating scene file production software. It effectively simplifies the virtual shooting steps on the target user side, reduces the cost of equipment purchase and venue rental, and greatly improves the efficiency of virtual shooting on the target user side.
  • the above-mentioned cloud computing platform 100 may also be an edge node cluster.
  • an edge node cluster refers to a cluster of computing nodes in the network that are close to third-party users and/or target users.
  • when this method is executed by an edge computing cluster, it can effectively reduce the bandwidth pressure on the central computing nodes in the network, save data transmission time, and further improve the efficiency of image fusion.
  • the edge node can also send the synthesized image directly to the target user's customers according to the target user's instructions, further improving efficiency, while alleviating the target user's bandwidth pressure and reducing bandwidth overhead.
  • Figure 6 shows a flow chart of another virtual shooting method 200'.
  • S200 The third-party user client uploads the scene file to the cloud computing platform 100.
  • S201 The target user client collects image data.
  • S202 The target user selects a scene file for producing a composite image from the cloud computing platform 100.
  • S200-S202 are basically the same as S200-S202 in Figure 2, so details are not repeated here.
  • the target user client runs a virtual shooting device 600.
  • the virtual shooting device 600 may be an application software (such as a virtual shooting client provided by a cloud service provider) running on the target user client. Further, the virtual shooting device 600 may be the subject that executes the virtual shooting method 200' on the target user client side.
  • S203' The cloud computing platform 100 sends the rendering result of the selected scene file to the target user.
  • the cloud computing platform 100 performs rendering based on the scene files stored in the cloud computing platform 100, where those scene files are unrendered.
  • the difference in the method of Figure 6 is that the target user client performs rendering based on the received rendering result of the scene file and the image data to obtain a composite image.
  • the rendering result of the scene file includes the rendering picture of each frame (moment) corresponding to the scene file.
  • the rendering result of the scene file also includes depth information corresponding to each frame of the rendered picture.
  • the rendering result of the scene file may also be encoded.
  • S204' The target user client performs rendering based on the received rendering result of the scene file and the information of the image data to obtain a composite image.
  • the rendering result of the scene file includes the rendering picture of each frame (moment).
  • the virtual shooting device 600 on the client side of the target user can fuse the rendering result of the scene file with the image data according to parameters such as fusion time and fusion position set by the target user.
  • unlike the case where the scene file containing the coordinates of each three-dimensional model is used, in S204' the rendering result of the scene file is used, that is, the rendered picture of each frame. Therefore, in the fusion process, mainly the relevant parameters of the image data (the coordinates and orientation of the acquisition device, etc.) are configured.
  • when the rendering result of the scene file does not contain depth information, the target user client can use artificial intelligence (AI) inference models, deep neural networks, and other methods to infer the depth information of each frame of the rendered picture.
  • the target user client when the rendering result of the scene file contains the depth information of each frame of the rendered image, the target user client can fuse the synthetic image based on the rendered image of each frame and its depth information.
  • both of the above methods establish a three-dimensional space for the scene file based on the depth information (received or inferred). Combined with the coordinates and orientation of the collection device in the image data and the fusion position in the video file, the image data and the scene file are placed in the same coordinate system. Furthermore, the composite image can be obtained using rendering methods such as rasterization and ray tracing.
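  • A minimal sketch of fusing a rendered scene frame with image data when per-pixel depth is available for both, keeping the closer surface at each pixel (the array layout is an assumption for illustration):

```python
import numpy as np

def depth_composite(scene_rgb, scene_depth, subject_rgb, subject_depth):
    """Per-pixel composite of two aligned HxWx3 images using HxW depth maps:
    at each pixel, keep the color whose depth is smaller (closer to the viewpoint).
    Pixels where the subject has no data should carry depth = +inf."""
    subject_in_front = subject_depth < scene_depth
    return np.where(subject_in_front[..., None], subject_rgb, scene_rgb)

# composite = depth_composite(rendered_frame, rendered_depth, person_rgb, person_depth)
```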
  • after the target user client receives the rendering result of the scene file, it also needs to decode the rendering result of the scene file.
  • target users can also use computing services provided by cloud service providers to obtain composite images.
  • the cloud service vendor and the cloud computing platform 100 vendor may be the same or different.
  • the target user client can distribute the composite image to the target user's clients.
  • the synthesized image can be distributed through a content delivery network (CDN).
  • the composite image can be sent directly to the audience based on the audience's access status and access information.
  • the composite image can also be distributed according to the parameter information of the target user's customer.
  • the parameter information includes one or more of the following: coordinates, orientation, field of view angle, etc. of the virtual viewpoint.
  • the target user client can generate multiple channels of image data and send them to the clients of the multiple target users respectively.
  • the live broadcast platform provides the perspective of each player for the audience (customers of the target user) to watch. Therefore, in the same three-dimensional space, different players correspond to different parameter information. Then the target user client can generate multiple downlink video stream data based on the player's parameter information to meet the needs of the audience.
  • S205’ is an optional step. That is, after obtaining the composite image, the target user client can store the composite image in the target user client.
  • this application provides a virtual shooting method.
  • This method provides an interactive interface for scene file upload and an interactive interface for image fusion, allowing target users to select and use scene files uploaded by third-party users for image fusion.
  • the image fusion operation is performed by the target user client, making maximum use of the computing power of the target user client.
  • it also saves the target user from operations such as setting up a shooting venue for scene files, purchasing shooting equipment, and operating scene file production software. It effectively simplifies the steps of virtual shooting on the target user's side, reduces the cost of equipment purchase and venue rental, and greatly improves the efficiency of virtual shooting on the target user side.
  • the target user client can select scene files for composite images as required on the cloud computing platform 100 .
  • since the scene file is produced and uploaded by a third-party user, the scene file in the cloud computing platform 100 may not be directly usable by the target user.
  • after the target user makes certain modifications or adjustments to the scene file, it may be able to meet the needs of the target user.
  • Figure 7 provides a schematic diagram of a scene file editing interactive interface.
  • the interactive interface 500 includes multiple three-dimensional models 501 and 502, wherein the three-dimensional model 502 includes text.
  • the interactive interface 500 is used to display part or all of the three-dimensional scene included in a scene file.
  • the target user can change the perspective by dragging/rotating the interactive interface to observe the three-dimensional scene from a new perspective.
  • the three-dimensional scene shown in Figure 7 includes at least two three-dimensional models 501 and one three-dimensional model 502 that includes text. All 3D models in the 3D scene can be edited by the target user.
  • the operation bar can be opened.
  • the operation bar includes functions such as copy, paste, delete, and edit.
  • the copy instruction is to copy the parameters (such as length, width, and height) of the three-dimensional model to the clipboard
  • the paste instruction is to generate a corresponding three-dimensional model at a specified location according to the parameters of the three-dimensional model in the clipboard.
  • Delete indicates to delete the 3D model from the 3D scene.
  • Editing functions include at least text editing and color editing functions. Among them, text editing indicates that when the three-dimensional model includes text, the text can be modified and/or adjusted, while color editing indicates that the color of the three-dimensional model is adjusted.
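  • A minimal sketch of how such editing operations on the three-dimensional models of a scene might be represented (the class and method names are hypothetical, not an interface defined by this application):

```python
import copy
from dataclasses import dataclass

@dataclass
class Model3D:
    name: str
    size: tuple = (1.0, 1.0, 1.0)         # length, width, height
    position: tuple = (0.0, 0.0, 0.0)
    color: tuple = (255, 255, 255)
    text: str = ""

class SceneEditor:
    def __init__(self, models):
        self.models = list(models)
        self.clipboard = None

    def copy(self, model: Model3D):                 # copy the model's parameters to a clipboard
        self.clipboard = copy.deepcopy(model)

    def paste(self, position: tuple) -> Model3D:    # create a model from the clipboard parameters
        pasted = copy.deepcopy(self.clipboard)
        pasted.position = position
        self.models.append(pasted)
        return pasted

    def delete(self, model: Model3D):               # remove the model from the 3D scene
        self.models.remove(model)

    def edit_text(self, model: Model3D, new_text: str):
        model.text = new_text                       # e.g. change "ABC" to "DEF"

    def edit_color(self, model: Model3D, new_color: tuple):
        model.color = new_color
```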
  • when the target user performs a copy operation on the three-dimensional model 501_1, performs a paste operation to the right of it, and edits the text of the three-dimensional model 502 (modifying the original text "ABC" to "DEF"), the target user will see a picture as shown in Figure 8 in the interactive interface 500.
  • a rendering operation still needs to be performed. That is, rendering is performed based on parameters such as the number and position of the adjusted three-dimensional model, as well as parameters such as light sources, thereby obtaining the picture shown in Figure 8 .
  • the above rendering may be pre-rendering, that is, rendering with a set virtual viewpoint and perspective.
  • the visible three-dimensional model is rendered according to the perspective relationship of the object, while the invisible three-dimensional model is not rendered, thereby reducing the amount of rendering calculations and increasing the rendering speed.
  • the target user can add a logo to the modified scene file. For example, rename or add a version number to facilitate querying when using this modified scene file later.
  • the target user can choose whether to upload the modified scene file to the cloud computing platform for use by other target users, or it can be set to be accessible only to the target user himself.
  • the method for the target user to upload the modified scene file may refer to Figure 3 and the corresponding embodiment of Figure 3 .
  • the modified scene file may be stored in the cloud computing platform 100, or may be stored in a system designated by the target user (such as the target user client) in the form of images and/or depth information.
  • the scene file editing interactive interface 500 may be built based on a web user interface (WebUI), or may be built based on other applications.
  • the Web program running in the browser can receive the image rendered by the local/remote server through web real-time communication (WebRTC).
  • this application provides a method for editing scene files. After the scene file is uploaded to the cloud computing platform 100 by a third-party user, the target user can edit the scene file as needed to obtain a scene file that meets the needs of the target user.
  • This method avoids the situation where the scene files uploaded by third-party users do not fully meet the needs of the target users, improves the applicable scenarios of the scene files, and further improves the efficiency of virtual shooting for the target users.
  • This application provides a cloud computing platform 100, as shown in Figure 9, including:
  • the communication unit 101 is configured to receive scene files uploaded by third-party users in S200, and store the scene files in the storage unit 103 in the cloud computing platform 100.
  • the storage unit may not be included in the cloud computing platform 100, but may be provided in the form of a cloud service.
  • the communication unit 101 is also used to receive a scene file selected by the target user through the target user client for making the composite image.
  • the communication unit 101 is also used to receive image data uploaded by the target user.
  • the communication unit 101 is also used to send the synthesized image to the target user.
  • the communication unit is also used to send the image data and the scene file selected by the target user to a computing service on the cloud (such as a cloud rendering service), and receive the calculation results returned by the computing service, that is, the composite image.
  • the processing unit 102 is configured to perform rendering based on the image data and the scene file selected by the target user to obtain a composite image in S204.
  • the storage unit 103 is used to store the scene files uploaded by third-party users received in S200.
  • the storage unit 103 is also used to store image data uploaded by the target user.
  • the composite image obtained in S205 will also be stored in the storage unit 103.
  • the three units (communication unit 101, processing unit 102 and storage unit 103) included in the above-mentioned cloud computing platform 100 can be used to jointly execute the virtual shooting method 200.
  • This application also provides another cloud computing platform 100.
  • the structure can be referred to Figure 9, including:
  • the communication unit 101 is configured to receive scene files uploaded by third-party users in S200, and store the scene files in the storage unit 103 in the cloud computing platform 100.
  • the storage unit may not be included in the cloud computing platform 100, but may be provided in the form of a cloud service.
  • the communication unit 101 is also used to receive a scene file selected by the target user through the target user client for making the composite image.
  • the communication unit 101 is also used to receive image data uploaded by the target user.
  • the communication unit 101 is also used to send the information of the scene file selected by the target user to the target user in S203'.
  • the scene file information includes the rendering result of the scene file.
  • the scene file information may also include depth information of the rendered image.
  • the processing unit 102 is configured to obtain, in S203, the rendering result of the scene file selected by the target user. Optionally, it is also used in S203 to calculate the depth information of the rendering result of the scene file selected by the target user.
  • the storage unit 103 is used to store the scene files uploaded by third-party users received in S200. In S203, the storage unit 103 is also used to store information about the scene file selected by the target user.
  • the three units (communication unit 101, processing unit 102 and storage unit 103) included in the above-mentioned cloud computing platform 100 can be used to jointly execute the virtual shooting method 200'.
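  • In this variant the processing unit does not perform fusion; it only prepares the scene file information (rendered frames and, optionally, a depth map per frame) that the communication unit sends to the target user in S203'. A minimal sketch of such packaging, under assumed data structures (the function name and dictionary keys are hypothetical), could look like this:
```python
from typing import Dict, List, Optional
import numpy as np

def build_scene_file_info(rendered_frames: List[np.ndarray],
                          depth_maps: Optional[List[np.ndarray]] = None) -> Dict:
    """Package the rendering result (and optional per-frame depth) for delivery in S203'."""
    info: Dict = {"frames": rendered_frames}
    if depth_maps is not None:
        assert len(depth_maps) == len(rendered_frames)
        info["depth"] = depth_maps            # one depth map per rendered frame
    # in practice the frames would be encoded before transmission and decoded
    # again by the target user client
    return info
```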
  • the communication unit 101, the processing unit 102 and the storage unit 103 in the above two cloud computing platforms 100 can all be implemented by software, or can be implemented by hardware.
  • the implementation of the communication unit 101 is introduced next.
  • the implementation of the processing unit 102 and the storage unit 103 may refer to the implementation of the communication unit 101.
  • the communication unit 101 may include code running on a computing instance.
  • the computing instance may be at least one of a physical host (computing device), a virtual machine, a container, or another type of computing device; further, there may be one or more such computing devices.
  • the communication unit 101 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code can be distributed in the same region or in different regions, and can be distributed in the same availability zone (AZ) or in different AZs, where each AZ includes one data center or multiple geographically close data centers. Usually, a region includes multiple AZs.
  • the multiple hosts/VMs/containers used to run the code can be distributed in the same virtual private cloud (VPC) or across multiple VPCs.
  • usually, a VPC is set up within one region.
  • communication between two VPCs in the same region, as well as cross-region communication between VPCs in different regions, requires a communication gateway to be set up in each VPC, and the interconnection between the VPCs is realized through the communication gateways.
  • the communication unit 101 may include at least one computing device, such as a server.
  • alternatively, the communication unit 101 may be a device implemented using an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or the like.
  • the above-mentioned PLD can be implemented by a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • Multiple computing devices included in the communication unit 101 may be distributed in the same region or in different regions. Multiple computing devices included in the communication unit 101 may be distributed in the same AZ or in different AZs. Similarly, multiple computing devices included in the communication unit 101 may be distributed in the same VPC or in multiple VPCs.
  • the plurality of computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
  • This application also provides a virtual shooting device 600, as shown in Figure 10, including:
  • the interaction unit 601 is used to obtain the scene file selected by the target user in S201 for synthesizing images, and to receive the information of the scene file sent by the cloud computing platform 100 in S203'. The interaction unit 601 is also used to distribute the composite image to the target user, the target user's customers, or designated edge nodes in S205'.
  • the computing unit 602 is configured to render and obtain a composite image based on the received scene file and image data information in S204'.
  • Storage unit 603 is used to store the image data collected in S201.
  • the composite image obtained by rendering in S204' will also be stored in the storage unit 603.
  • the three units (interaction unit 601, computing unit 602 and storage unit 603) included in the above-mentioned virtual shooting device 600 can be used to jointly execute the virtual shooting method 200'.
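  • On the client side, the computing unit 602 fuses each received rendered frame with the locally captured image in S204'. The sketch below assumes that the scene file information carries a color frame and a depth map per frame (the dictionary keys match the hypothetical build_scene_file_info sketch above), and it uses a single assumed subject depth for a per-pixel occlusion test; it is an illustrative sketch under those assumptions, not the device's actual renderer.
```python
from typing import Tuple
import numpy as np

def fuse_frame(scene_rgb: np.ndarray, scene_depth: np.ndarray,
               subject_rgb: np.ndarray, subject_mask: np.ndarray,
               subject_depth: float, fusion_position: Tuple[int, int]) -> np.ndarray:
    """Composite the captured subject into one rendered scene frame.

    scene_rgb / scene_depth : HxWx3 color image and HxW depth map of the rendered scene
    subject_rgb / subject_mask : hxwx3 captured image and hxw boolean foreground mask
    subject_depth : assumed distance of the subject from the virtual viewpoint
    fusion_position : (row, col) of the subject's top-left corner in the scene frame
    """
    out = scene_rgb.copy()
    y, x = fusion_position
    h, w = subject_rgb.shape[:2]
    patch_depth = scene_depth[y:y + h, x:x + w]
    # draw a subject pixel only where it lies in front of the scene geometry
    visible = subject_mask & (subject_depth < patch_depth)
    out[y:y + h, x:x + w][visible] = subject_rgb[visible]
    return out

def fuse_sequence(scene_info: dict, subject_frames, subject_masks,
                  fusion_position: Tuple[int, int] = (100, 200),
                  subject_depth: float = 2.0) -> list:
    """Apply fuse_frame over the frames covered by the configured duration."""
    composites = []
    for scene_rgb, scene_depth, rgb, mask in zip(scene_info["frames"],
                                                 scene_info["depth"],
                                                 subject_frames, subject_masks):
        composites.append(fuse_frame(scene_rgb, scene_depth, rgb, mask,
                                     subject_depth, fusion_position))
    return composites   # the interaction unit 601 then distributes these in S205'
```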
  • the interaction unit 601, the computing unit 602 and the storage unit 603 in the above virtual shooting device 600 can all be implemented by software, or can be implemented by hardware. As an example, the implementation of the interaction unit 601 is introduced next. Similarly, the implementation of the computing unit 602 and the storage unit 603 may refer to the implementation of the interaction unit 601.
  • the interaction unit 601 may include code running on a computing instance.
  • the computing instance may be at least one of a physical host (computing device), a virtual machine, a container, or another type of computing device; further, there may be one or more such computing devices.
  • the interaction unit 601 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code can be distributed in the same region or in different regions, and can be distributed in the same AZ or in different AZs, where each AZ includes one data center or multiple geographically close data centers. Usually, a region includes multiple AZs.
  • the multiple hosts/VMs/containers used to run the code can be distributed in the same VPC or across multiple VPCs.
  • usually, a VPC is set up within one region.
  • communication between two VPCs in the same region, as well as cross-region communication between VPCs in different regions, requires a communication gateway to be set up in each VPC, and the interconnection between the VPCs is realized through the communication gateways.
  • the interaction unit 601 may include at least one computing device, such as a server.
  • alternatively, the interaction unit 601 may also be a device implemented using an ASIC or a PLD.
  • the above-mentioned PLD can be implemented by CPLD, FPGA, GAL or any combination thereof.
  • Multiple computing devices included in the interaction unit 601 may be distributed in the same region or in different regions. Multiple computing devices included in the interaction unit 601 may be distributed in the same AZ or in different AZs. Similarly, multiple computing devices included in the interaction unit 601 may be distributed in the same VPC or in multiple VPCs.
  • the plurality of computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
  • This application also provides a computing device 700. As shown in Figure 11, the computing device 700 includes: a bus 702, a processor 704, a memory 706, and a communication interface 708.
  • the processor 704, the memory 706 and the communication interface 708 communicate through the bus 702.
  • Computing device 700 may be a server or a terminal device. It should be understood that this application does not limit the number of processors and memories in the computing device 700.
  • the bus 702 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one line is used in Figure 11, but it does not mean that there is only one bus or one type of bus.
  • the bus 702 may include a path that carries information between the various components of the computing device 700 (e.g., the memory 706, the processor 704, the communication interface 708).
  • the processor 704 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
  • Memory 706 may include volatile memory, such as random access memory (RAM).
  • the memory 706 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
  • the memory 706 stores executable program code, and the processor 704 executes the executable program code to respectively realize the functions of the communication unit 101, the processing unit 102 and the storage unit 103, thereby realizing the virtual shooting method 200. That is, the memory 706 stores instructions for executing the virtual shooting method 200 .
  • alternatively, executable code is stored in the memory 706, and the processor 704 executes the executable code to realize the functions of the aforementioned interaction unit 601, computing unit 602 and storage unit 603 respectively, thereby realizing the virtual shooting method 200'. That is, the memory 706 stores instructions for executing the virtual shooting method 200'.
  • the communication interface 708 uses a transceiver unit such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 700 and other devices or communication networks.
  • An embodiment of the present application also provides a computing device cluster.
  • the computing device cluster includes at least one computing device.
  • the computing device may be a server, such as a central server, an edge server, or a local server in a local data center.
  • the computing device may also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.
  • as shown in Figure 12, the computing device cluster includes at least one computing device 700.
  • the memory 706 of one or more computing devices 700 in the computing device cluster may store the same instructions for executing the virtual shooting method 200.
  • the memory 706 of one or more computing devices 700 in the computing device cluster may also store part of the instructions for executing the virtual shooting method 200 respectively.
  • in other words, a combination of one or more computing devices 700 may collectively execute the instructions for performing the virtual shooting method 200.
  • the memory 706 in different computing devices 700 in the computing device cluster can store different instructions, respectively used to execute some functions of the cloud computing platform 100. That is, the instructions stored in the memory 706 in different computing devices 700 can implement the functions of one or more units among the communication unit 101, the processing unit 102, and the storage unit 103.
  • one or more computing devices in a cluster of computing devices may be connected through a network.
  • the network may be a wide area network or a local area network, etc.
  • Figure 13 shows a possible implementation. As shown in Figure 13, two computing devices 700A and 700B are connected through a network. Specifically, the connection to the network is made through a communication interface in each computing device.
  • in this type of implementation, instructions for performing the functions of the communication unit 101 and the storage unit 103 are stored in the memory 706 of the computing device 700A, while the memory 706 in the computing device 700B stores instructions for performing the functions of the processing unit 102.
  • the connection method of the computing device cluster shown in Figure 13 may take into account that the virtual shooting method 200 provided in this application requires a large amount of computation, so the functions implemented by the processing unit 102 are assigned to the computing device 700B for execution.
  • it should be understood that the functions of the computing device 700A shown in Figure 13 may also be performed by multiple computing devices 700; likewise, the functions of the computing device 700B may also be performed by multiple computing devices 700.
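  • The division of labor suggested by Figure 13 (communication and storage on the computing device 700A, the computation-heavy processing on the computing device 700B) can be approximated with any remote-procedure-call mechanism. The sketch below uses Python's standard xmlrpc modules purely as an illustration; the host name, port and method name are hypothetical and a real cluster would use its own protocol between the communication interfaces 708.
```python
# device 700B: expose the processing unit's function as a network service
from xmlrpc.server import SimpleXMLRPCServer

def make_composite(scene_id, capture_id, fusion_position):
    # placeholder for the rendering/fusion step of the processing unit 102 (S204)
    return f"composite({scene_id},{capture_id})@{fusion_position}"

server = SimpleXMLRPCServer(("0.0.0.0", 8700), allow_none=True)
server.register_function(make_composite)
# server.serve_forever()        # uncomment on device 700B

# device 700A: handle communication/storage and delegate the heavy computation
import xmlrpc.client

processing_unit = xmlrpc.client.ServerProxy("http://device-700b:8700", allow_none=True)
# composite = processing_unit.make_composite("scene-1", "capture-7", [120, 300])
```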
  • the embodiment of the present application also provides another computing device cluster.
  • the connection relationship between the computing devices in the computing device cluster can be similar to the connection method of the computing device cluster described in Figure 12 and Figure 13.
  • the difference is that the same instructions for executing the virtual shooting method 200' may be stored in the memory 706 of one or more computing devices 700 in the computing device cluster.
  • the memory 706 of one or more computing devices 700 in the computing device cluster may also respectively store part of the instructions for executing the virtual shooting method 200'.
  • in other words, a combination of one or more computing devices 700 may collectively execute the instructions for performing the virtual shooting method 200'.
  • the memory 706 in different computing devices 700 in the computing device cluster can store different instructions for executing part of the functions of the virtual shooting device 600 . That is, the instructions stored in the memory 706 in different computing devices 700 can implement the functions of one or more units among the interaction unit 601, the computing unit 602, and the storage unit 603.
  • An embodiment of the present application also provides a computer program product containing instructions.
  • the computer program product may be software or a program product that contains instructions and can run on a computing device or be stored in any available medium.
  • when the computer program product runs on at least one computing device, the at least one computing device is caused to execute the virtual shooting method 200 or the virtual shooting method 200'.
  • An embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be any available medium on which a computing device can store data, or a data storage device, such as a data center, that contains one or more available media.
  • the available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives), etc.
  • the computer-readable storage medium includes instructions that instruct the computing device to perform the virtual shooting method 200 or the virtual shooting method 200'.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

本申请提供了一种虚拟拍摄方法,所述方法应用于云计算平台,所述方法包括:提供第一配置接口,所述第一配置接口用于接收第一用户上传的场景文件,所述场景文件包括至少一个三维模型的参数;提供第二配置接口,所述第二配置接口用于接收第二用户上传的目标图像;将所述场景文件和所述目标图像融合,获得合成图像。该方法通过将场景文件放置于云上进行出售/出租,有效地实现了场景文件的制作与目标图像的制作的分离,避免了第二用户为了制作场景文件耗费大量时间和计算资源的情况。

Description

一种虚拟拍摄方法、装置及设备 技术领域
本申请涉及计算机领域,特别涉及一种虚拟拍摄方法、装置及设备。
背景技术
计算机渲染技术是指根据三维模型数据(包括物体模型、表面材质等等),和光线数据(包括光源位置、颜色、强度等),计算并输出包含真实世界中相同模型和光照条件下的图片。
虚拟拍摄是正在兴起的一种视频创作方法,在视频制作过程中,参与拍摄的成员无需进行真实的拍摄场景,而是在绿幕等背景下进行拍摄。通过将拍摄成果与虚拟场景进行处理后,即可获得完整的视频。
发明内容
本申请提供了一种虚拟拍摄方法,该方法可以提升虚拟拍摄效率。
本申请的第一方面提供了一种虚拟拍摄方法,该方法应用于云计算平台,该方法包括:提供第一配置接口,该第一配置接口用于接收第一用户上传的场景文件,该场景文件包括至少一个三维模型的参数;提供第二配置接口,该第二配置接口用于接收第二用户上传的目标图像;将该场景文件和该目标图像融合,获得合成图像。
该方法通过接收第一用户上传的、用于渲染的场景文件,并根据第二用户的请求提供该场景文件,以使得该第二用户可以利用该场景文件和目标图像进行融合,获得合成图像。其中,目标图像由该第二用户提供。通过将场景文件放置于云上进行出售/出租,有效地实现了场景文件的制作与目标图像的制作的分离,避免了第二用户为了制作场景文件耗费大量时间和计算资源的情况。并且,通过将图像融合的操作交由云计算平台来执行,进一步地提高了渲染效率,为该第二用户节省了大量的时间和计算资源开销。
在一些可能的设计中,该第一配置接口,还用于接收该第一用户配置的该场景文件的描述信息和/或收费方式。
基于第一用户在上传场景文件时同步上传的描述信息和/或收费方式,第二用户可以更加方便地、有针对性地选取所需要地场景文件,用于进行图像融合。
在一些可能的设计中,该目标图像包括在视频文件中。
目标图像可以是第二用户提供的单一图像,也可以是视频流中的一帧图像。也即,该目标图像可以包括一帧或多帧待融合的图像。换言之,该方法也可以应用于视频流的图像融合。
在一些可能的设计中,通过该云计算平台的云渲染服务,将该场景文件和该目标图像融合,获得该合成图像。
通过人为地或自动地调用云上的计算服务(如云渲染服务),可以实现场景文件和目标图像的融合,以获得合成图像。通过调用云上专门的计算服务,可以进一步地提升渲染效率。
在一些可能的设计中,第二配置接口,还用于接收该第二用户上传的融合参数,该 融合参数包括下述的一种或多种:持续时间、融合位置、朝向;其中,该持续时间指示该场景文件与该图像融合的时间段,该朝向指示该目标图像中采集设备的朝向;该将该场景文件和该目标图像融合,获得合成图像,包括:基于该融合参数,将该场景文件和该目标图像融合,获得该合成图像。
在一些可能的设计中,该场景文件包括直播背景,该目标图像包括直播视频流中的人像;该将该场景文件和该目标图像融合,获得合成图像,包括:基于该融合参数,将该直播背景作为该直播视频流的背景与该直播视频流中的该人像融合,获得新的直播视频流。
该方法还可以应用在直播领域,将第一用户上传的场景文件作为直播背景与直播视频流进行融合后,可以获得新的直播视频流。避免了第二用户(直播方)制作直播背景,并且还可以提供多种场景文件以供第二用户选择,极大地简化了第二用户进行图像融合地步骤,提高了第二用户的使用体验。
在一些可能的设计中,向该第二用户展示该场景文件的渲染结果;接收该第二用户发出调整信息,该调整信息指示对该至少一个三维模型的参数进行调整;根据该调整信息和该场景文件的渲染结果,向该第二用户展示调整后的场景文件的渲染结果。
在一些可能的设计中,对该目标三维模型执行下述操作的一种或多种:复制、粘贴、删除、编辑;其中,该编辑包括文字编辑和颜色编辑,该文字编辑指示对该目标三维模型包含的文本信息进行修改,该颜色编辑指示对该目标三维模型的颜色进行修改。
在一些可能的设计中,根据该第二用户的指示信息,将该至少一帧合成图像发送至该第二用户指定的第三用户。
在一些可能的设计中,该云计算平台为边缘计算集群。
当该方法由边缘计算集群执行时,可以有效地降低网络中中心计算节点的带宽压力,同时节省数据传输的时间,进一步地提升图像融合的效率。
在一些可能的设计中,将该合成图像和该指示信息发送至边缘节点,以使得该边缘节点将该合成图像发送至该第二用户指定的该第三用户。
本申请的第二方面还提供了一种虚拟拍摄方法,该方法包括:接收云计算平台发送的目标场景的渲染结果,该目标场景是对场景文件进行渲染获得的,该场景文件由第一用户在该云计算平台提供,该场景文件包括至少一个三维模型的参数;将该目标场景的渲染结果和目标图像融合,获得合成图像,其中,该目标图像由第二用户采集获得。
该方法中第二用户通过获取第一用户提供的、用于渲染的场景文件,并基于第二用户自行制作/获取的目标图像,可以实现场景文件和目标图像的融合,获得合成图像。通过将场景文件放置于云上进行出售/出租,有效地实现了场景文件的制作与目标图像的制作的分离,避免了第二用户为了制作场景文件耗费大量时间和计算资源的情况,节省了大量的时间和计算资源开销。
在一些可能的设计中,该目标场景帧的渲染结果还包括该目标场景帧对应的深度信息。
在一些可能的设计中,该目标图像包括在视频文件中。
在一些可能的设计中,提供配置接口,该配置接口用于接收该第二用户上传的融合参数,该融合参数包括下述的一种或多种:持续时间、融合位置、朝向;其中,该持续时间指示该场景文件与该图像融合的时间段,该朝向指示该目标图像中采集设备的朝向; 该将该场景文件和该目标图像融合,获得合成图像,包括:基于该融合参数,将该场景文件和该目标图像融合,获得该合成图像。
在一些可能的设计中,基于该融合参数,将该直播背景作为该直播视频流的背景与该直播视频流中的该人像融合,获得新的直播视频流。
在一些可能的设计中,向该云云计算平台服务平台发送调整信息,该调整信息指示对该场景文件包括的该至少一个三维模型的参数进行调整;获取调整后的场景文件的渲染结果。
在一些可能的设计中,对该目标三维模型执行下述操作的一种或多种:复制、粘贴、删除、编辑;其中,该编辑包括文字编辑和颜色编辑,该文字编辑指示对该目标三维模型包含的文本信息进行修改,该颜色编辑指示对该目标三维模型的颜色进行修改。
在一些可能的设计中,向该云计算平台服务平台上传该调整后的场景文件。
在一些可能的设计中,向该云计算平台服务平台上传针对该调整后的场景文件的描述信息和/或收费方式。
在一些可能的设计中,向该云计算平台服务平台发送配置信息,该配置信息指示从该云计算平台服务平台提供的至少一个场景文件中选择该场景文件。本申请的第三方面提供了一种云计算平台,该云计算平台包括通信单元,用于提供第一配置接口,该第一配置接口用于接收第一用户上传的场景文件,该场景文件包括至少一个三维模型的参数;提供第二配置接口,该第二配置接口用于接收第二用户上传的目标图像;处理单元,用于将该场景文件和该目标图像融合,获得合成图像。
在一些可能的设计中,该第一配置接口,还用于接收该第一用户配置的该场景文件的描述信息和/或收费方式。
在一些可能的设计中,该目标图像包括在视频文件中。
在一些可能的设计中,该处理单元,还用于通过该云计算平台的云渲染服务,将该场景文件和该目标图像融合,获得该合成图像。
在一些可能的设计中,该第二配置接口,还用于接收该第二用户上传的融合参数,该融合参数包括下述的一种或多种:持续时间、融合位置、朝向;其中,该持续时间指示该场景文件与该图像融合的时间段,该朝向指示该目标图像中采集设备的朝向;该处理单元,还用于基于该融合参数,将该场景文件和该目标图像融合,获得该至少一帧合成图像。
在一些可能的设计中,该场景文件包括直播背景,该目标图像包括直播视频流中的人像;该处理单元,还用于基于该融合参数,将该直播背景作为该直播视频流的背景与该直播视频流中的该人像融合,获得新的直播视频流人像。
在一些可能的设计中,该通信单元,还用于向该第二用户展示该场景文件的渲染结果;接收该第二用户发出调整信息,该调整信息指示对该至少一个三维模型的参数进行调整;该处理单元,还用于根据该调整信息和该场景文件的渲染结果,获得调整后的场景文件的渲染结果;该通信单元,还用于向该第二用户展示调整后的场景文件的渲染结果。
在一些可能的设计中,该至少一个三维模型包括目标三维模型,该处理单元,还用于对该目标三维模型执行下述操作的一种或多种:复制、粘贴、删除、编辑;其中,该编辑包括文字编辑和颜色编辑,该文字编辑指示对该目标三维模型包含的文本信息进行 修改,该颜色编辑指示对该目标三维模型的颜色进行修改。
在一些可能的设计中,该处理单元,还用于根据该第二用户的指示信息,将该合成图像发送至该第二用户指定的第三用户。
在一些可能的设计中,该云计算平台为边缘计算集群。
在一些可能的设计中,该处理单元,还用于将该合成图像和该指示信息发送至边缘节点,以使得该边缘节点将该合成图像发送至该第二用户指定的该第三用户。
本申请的第四方面提供了一种虚拟拍摄装置,该装置包括交互单元,用于接收云计算平台发送的目标场景的渲染结果,该目标场景是对场景文件进行渲染获得的,该场景文件由第一用户在该云计算平台提供,该场景文件包括至少一个三维模型的参数;计算单元,用于将该目标场景的渲染结果和目标图像融合,获得合成图像,其中,该目标图像由第二用户采集获得。
在一些可能的设计中,该目标场景的渲染结果还包括该目标场景对应的深度信息。
在一些可能的设计中,该目标图像包括在视频文件中。
在一些可能的设计中,该交互单元,还用于提供配置接口,该配置接口用于接收该第二用户上传的融合参数,该融合参数包括下述的一种或多种:持续时间、融合位置、朝向;其中,该持续时间指示该场景文件与该图像融合的时间段,该朝向指示该目标图像中采集设备的朝向;该计算单元,还用于基于该融合参数,将该场景文件和该目标图像融合,获得该合成图像。
在一些可能的设计中,该场景文件包括直播背景,该目标图像包括直播视频流中的人像,该计算单元,还用于基于该融合参数,将将该直播背景作为该直播视频流的背景与该直播视频流中的该人像融合,获得新的直播视频流。
在一些可能的设计中,该交互单元,还用于向该云云计算平台发送调整信息,该调整信息指示对该场景文件包括的该至少一个三维模型的参数进行调整;获取调整后的场景文件的渲染结果。
在一些可能的设计中,该至少一个三维模型包括目标三维模型,该计算单元,还用于对该目标三维模型执行下述操作的一种或多种:复制、粘贴、删除、编辑;其中,该编辑包括文字编辑和颜色编辑,该文字编辑指示对该目标三维模型包含的文本信息进行修改,该颜色编辑指示对该目标三维模型的颜色进行修改。
在一些可能的设计中,该交互单元,还用于向该云计算平台上传该调整后的场景文件。
在一些可能的设计中,该交互单元,还用于向该云计算平台上传针对该调整后的场景文件的描述信息和/或收费方式。
在一些可能的设计中,该交互单元,还用于向该云计算平台发送配置信息,该配置信息指示从该云计算平台提供的至少一个场景文件中选择该场景文件。
本申请的第五方面提供了一种计算设备集群,包括至少一个计算设备,每个计算设备包括处理器和存储器;至少一个计算设备的处理器用于执行至少一个计算设备的存储器中存储的指令,以使得该计算设备执行如第一方面或第一方面的任意可能的设计提供的方法。
本申请的第六方面提供了一种计算设备集群,包括至少一个计算设备,每个计算设备包括处理器和存储器;至少一个计算设备的处理器用于执行至少一个计算设备的存储 器中存储的指令,以使得该计算设备执行如第二方面或第二方面的任意可能的设计提供的方法。
本申请的第七方面提供了一种包含指令的计算机程序产品,当该指令被计算机设备集群运行时,使得该计算机设备集群执行如第一方面或第一方面的任意可能的设计提供的方法。
本申请的第八方面提供了一种包含指令的计算机程序产品,当该指令被计算机设备集群运行时,使得该计算机设备集群执行如第二方面或第二方面的任意可能的设计提供的方法。
本申请的第九方面提供了一种计算机可读存储介质,包括计算机程序指令,当该计算机程序指令由计算设备集群执行时,该计算设备集群执行如第一方面或第一方面的任意可能的设计提供的方法。
本申请的第十方面提供了一种计算机可读存储介质,包括计算机程序指令,当该计算机程序指令由计算设备集群执行时,该计算设备集群执行如第一方面或第一方面的任意可能的设计提供的方法。
附图说明
为了更清楚地说明本申请实施例的技术方法,下面将对实施例中所需使用的附图作以简单地介绍。
图1是本申请涉及的一种虚拟拍摄的架构图;
图2是本申请涉及的一种虚拟拍摄的方法流程图;
图3是本申请涉及的一种上传场景文件的交互界面;
图4是本申请涉及的一种配置融合信息的交互界面;
图5是本申请涉及的一种虚拟视点朝向的示意图;
图6是本申请涉及的另一种虚拟拍摄的方法流程图;
图7是本申请涉及的一种场景文件编辑交互界面的示意图;
图8是本申请涉及的另一种场景文件编辑交互界面的示意图;
图9是本申请涉及的一种云计算平台的结构图;
图10是本申请涉及的一种虚拟拍摄装置的结构图;
图11是本申请涉及的一种计算设备的示意图;
图12是本申请涉及的另一种计算设备集群的示意图;
图13是本申请涉及的另一种计算设备集群的示意图。
具体实施方式
本申请实施例中的术语“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。
目前比较常见的虚拟拍摄的方法是分别进行场景构建和数据采集,然后通过计算机合成技术将上述二者合成以获得合成图像。
数据采集指的是利用采集设备(如相机)的摄像功能采集人或物体获得多帧图像(通常可以结合绿幕进行采集)。可选的,也可以通过激光雷达采集点云的方式实现。
场景包括真实场景、虚拟场景和真实场景与虚拟场景结合等类型。其中,虚拟场景的构建相对复杂。通常来说,虚拟场景中包括光源和多个三维模型。随着三维模型的数量的增多、或者对画质要求的提高,虚拟场景构建的难度会越来越大。此外,即使是真 实场景,也需要对真实场景进行虚拟化操作。
实际中,虚拟拍摄要求负责图像数据采集工作的人员还同时掌握场景构建的技术。进一步地,为了构建合成图像所用的场景文件,还需要制作虚拟场景、采购虚拟化制作软件以及操作这些软件的人员。在制作得到了场景文件后,还需要结合图像数据获得合成图像。
需要说明的是,对于大型视频(如晚会节目、电视剧等)的制作而言,本地计算设备可能无法提供充足的计算能力。因此,还需要购买硬件(计算设备)或者云计算服务以支撑合成图像的制作。
总的来说,现有的虚拟拍摄模式对工作人员提出了较高的技术要求,同时也带来了不菲的花销,具有难度大、花销大、效率低的特点。
有鉴于此,本申请提出一种虚拟拍摄方法,该方法中场景文件可以由第三方用户上传至云计算平台,需要制作合成图像的目标用户可以在云计算平台中选择场景文件,并进一步地结合目标用户采集到的图像数据以获得合成图像。其中,合成操作也可以由云计算平台执行。该方法通过将图像数据采集和场景构建两个部分解耦,极大地降低了视频制作的难度和成本,有效地提升了虚拟拍摄制作视频的效率。
图1示出了一种虚拟拍摄的架构图。如图1所示,目标用户为需要制作合成图像的用户。如前所述,图像数据是目标用户利用如相机等采集设备采集到的。其中,图像数据包括一帧或多帧图像。可选的,图像数据可以是目标用户利用多个采集设备采集得到的。也即,利用多个采集设备可以获得多路图像数据流以制作合成图像。
与此同时,第三方用户可以利用虚拟化软件、图像制作软件等软件制作场景文件,并上传至云计算平台100中的场景文件资源库中。该场景文件资源库中包含不同类型的场景文件。例如,场景文件资源库中至少包括场景文件1、场景文件2和场景文件3。而这些不同类型的场景文件可以是由不同的第三方用户上传的。例如,场景文件1是由第三方用户1上传,而场景文件2是由第三方用户2上传。具体的,关于第三方用户如何上传场景文件将在后文展开介绍。需要说明的是,第三方用户上传场景文件的操作与目标用户采集图像数据的操作没有固定的先后顺序。
进一步地,在场景文件资源库中上传了至少一个场景文件后,目标用户可以通过云计算平台100选择合适的一个或多个场景文件,用于制作合成图像。其中,合成图像的制作操作可以是由云计算平台100来执行。可选的,也可以是由目标用户来执行。
本申请提出的虚拟拍摄方法可以用于离线合成图像的制作,也可以用于在线合成图像的制作。其中,在线合成图像后可以将合成图像用于提供在线服务,比如晚会直播、户外直播等。
图2示出了一种虚拟拍摄的方法200流程图。
S200:第三方用户客户端上传场景文件至云计算平台100。
在第三方用户利用虚拟化软件、视频制作软件等软件完成场景文件的制作后,可以通过客户端中如图3所示的交互界面上传场景文件。其中,场景文件中包括多个三维模型的参数信息。具体地,参数信息包括下述的一种或多种:材质信息、纹理贴图信息、法 线贴图信息、位置信息和尺寸信息等。
其中,场景文件可以是未经渲染的。进一步地,云计算平台100在接收到第三方用户上传的场景文件后利用渲染引擎进行渲染,获得渲染后的场景文件。
图3示出的交互界面中包括,
场景文件选择控件301,第三方用户可以通过该控件从本地存储的多个文件中选择上传至云计算平台100的一个或多个场景文件。也即,该控件支持第三方用户一次性上传多个场景文件。
描述信息输入控件302,第三方用户可以通用该控件输入针对上传的场景文件的描述信息。具体地,描述信息包括场景文件的主题和/或内容。其中,主题可以是体育主题、动画主题或户外主题等。第三方用户可以根据实际情况选择一个或多个主题用于描述场景文件。需要说明的是,上述描述不构成对主题类型的限制,本申请不对主题的类型进行限制。
在一些可能的实现方式中,目标用户可以根据场景文件的主题类型快速地选择用于制作合成图像的场景文件,可以有效地提升合成图像的制作效率。
在一些可能的实现方式中,描述信息中还包括内容。具体地,内容可以是对场景文件的文字性描述,也可以是关于关键参数的描述性文件。其中,关键参数包括下述的一种或多种:场景文件中虚拟视点/相机的位置、朝向、视场角(field of view,fov)和视平面位置等。
其中,视场角在光学工程中又称视场,视场角的大小决定了光学仪器(如采集设备)的视野范围。在光学仪器中,以光学仪器的镜头为顶点,以被测目标的物像可通过镜头的最大范围的两条边缘构成的夹角,称为视场角。
当上传的场景文件存在多个时,第三方用户需要分别为每一场景文件配置描述信息。可选的,也可以在描述信息输入控件302中集中地进行配置。
收费方式设置控件303,用于为场景文件设置收费方式。具体地,收费方式至少包括以下两种:按时长收费和按次收费。其中,按时长收费指示的是按照目标用户使用场景文件的时长进行收费。例如,在直播活动中使用场景文件时,使用时长为直播时长。当第三方用户选择了按时长收费的方式后,可以进一步地设置收费的单价。例如,每分钟的收费价格。
按次收费指示的是按照目标用户使用场景文件的次数进行收费。通常可以认为制作一个合成图像需要使用场景文件一次。当第三方用户选择了按次收费的方式后,可以进一步地设置收费的单价。例如,每次的收费价格(未在图3中示出)。
在一些可能的实现方式中,在设置收费单价时,云计算平台100可以提供一个参考价给第三方用户以供参考。
确定控件304,用于上传场景文件、描述信息以及收费方式至云计算平台100。
在场景文件制作和数据采集行为解耦的情况下,本申请利用由云计算平台100提供的交互界面,可以十分方便地实现场景文件及其相关信息的上传。
需要说明的是,云计算平台100提供的用于上传场景文件的交互界面可以有多种呈现方式,本申请不对其进行限制。
S201,目标用户客户端采集图像数据。
目标用户利用采集设备采集到的图像数据包括人和/或物体的一帧或多帧图像。例如, 图像数据可以是目标用户采集到的视频流数据,该视频流数据包括多帧图像。进一步地,所述图像数据中还可以包括采集设备信息。其中,采集设备信息包括下述的一种或多种:拍摄过程中采集设备相对于人或物体的相对位置信息和朝向等参数。
在一些可能的实现方式中,目标用户利用多个采集设备采集可以获得多路图像数据。而所述多路图像数据可以共同用于制作合成图像。而对于多路图像数据而言,根据每一路图像数据中采集设备相对于人或物体的相对位置信息和朝向等参数可以将多路图像数据转换到同一坐标系下,从而实现图像数据的合成。
需要说明的是,S200和S201的执行不存在固定的先后顺序。也即,第三方用户上传场景文件的操作可以先于目标用户采集图像数据的操作,第三方用户上传场景文件的操作也可以后于目标用户采集图像数据的操作。可选的,第三方用户上传场景文件的操作还可以与目标用户采集图像数据的操作同时被执行。
S202,目标用户从云计算平台100中选择用于制作合成图像的场景文件。
如前所述,第三方用户在S200中上传了一个或多个场景文件及其相关信息至云计算平台100。因此,目标用户可以根据需要选择用于制作合成图像的场景文件。具体的,目标用户可以根据场景文件的主题类型、收费方式、单价等信息进行选择。
S203,目标用户通过目标用户客户端上传采集到的图像数据至云计算平台100。
在目标用户采集了图像数据以及选择了用于制作合成图像的场景文件后,可以将采集的图像数据上传至云计算平台100。
S204,云计算平台100基于图像数据和选择的场景文件进行渲染获得合成图像。
根据图像数据中采集设备的采集设备信息和场景文件中的关键参数,利用渲染方法,可以获得合成图像。
需要说明的是,获得合成图像的方法还可以是人为地或自动地调用云上的计算服务(例如云渲染服务)。利用所述计算服务来执行渲染的计算步骤,并在获得渲染结果,也即合成图像后,获取所述合成图像。
此外,上述的计算服务的提供商可以与云计算平台的提供商相同,也可以不同。
例如,当场景文件包括直播背景,目标图像包括直播视频流中的人像时,可以基于S200中的融合参数,将直播背景作为直播视频流的背景与直播视频流中的人像融合,获得新的直播视频流。
其中,采集设备信息包括下述的一种或多种:拍摄过程中采集设备相对于人或物体的相对位置信息和朝向等参数。而关键参数包括下述的一种或多种:场景文件中虚拟视点/相机的位置、朝向、视野范围(fov)和视平面位置等。
常见的渲染方法有光栅法、光线追踪法和反向光线追踪方法等等。其中,光线追踪又称为光迹跟踪或光线追迹,来自于几何光学的一项通用技术,它通过跟踪与光学表面发生交互作用的光线从而得到光线经过路径的模型。它用于光学系统设计,如照相机镜头、显微镜、望远镜以及双目镜等。当用于渲染时,跟踪从眼睛发出的光线而不是光源发出的光线,通过这样一项技术生成编排好的场景的数学模型显现出来。这样得到的结果类似于光线投射与扫描线渲染方法的结果,但是这种方法有更好的光学效果。例如对于反射与透射有更准确的模拟效果,并且效率非常高,所以当追求这样高质量结果时候经常使用这种方法。具体地,光线追踪方法首先计算一条光线在被介质吸收,或者改变方向前,光线在介质中传播的距离、方向以及到达的新位置。然后从这个新的位置产生 出一条新的光线,使用同样的处理方法,最终计算出一个完整的光线在介质中传播的路径。由于该算法是成像系统的完全模拟,所以可以模拟生成复杂的图片。需要说明的是,本申请不对渲染方法进行限定。
在融合图像数据和场景文件时,需要确定图像数据和场景文件的融合时间以及融合位置等参数。
图4示出了一种配置融合信息的交互界面。如图4所示,该交互界面包括,
画面显示框401,用于展示场景文件的画面。具体地,对应的是一段场景文件中某一帧(某一时刻)的画面。通过控制时间轴405可以选取该场景文件中不同时刻的画面。例如,画面显示框401中展示了该场景文件中在第3分16秒时的画面,该画面中包括至少两个三维模型402。进一步地,目标用户还可以通过播放控件406实现画面的播放、暂停、快进和快退等操作。其中,通过播放、快进和快退的操作可以实现合成图像的预览,而通过暂停的操作可以实现场景文件的画面选取。
可选的,画面显示框401中还包括方位调节控件404。通过操作方位调节控件404,可以调整场景文件在画面显示框401中呈现的画面。具体地,因为场景文件是在三维空间中建模,因此选择在三维空间中的不同位置放置虚拟视点,所看到的内容/画面可能会有不同。也即,通过操作方位调节控件404可以调整三维空间中虚拟视点的位置,从而调整场景文件在画面显示框401中呈现的画面。需要说明的是,对于同一个场景文件,不同的帧画面(或时刻)对应的虚拟视点的位置可以不同。
场景文件信息框407中展示有场景文件的信息。具体地,包括下述的一种或多种:场景文件标识、场景文件类型、虚拟视点坐标、虚拟视点朝向和视场角。其中,场景文件标识可以是文件编号或者文件名,也可以是文件存储路径。而虚拟视点的坐标和朝向是指在画面显示框401中显示的帧画面中,虚拟视点在三维空间中的坐标和朝向。具体地,虚拟视点的朝向指示的是虚拟视点的观测方向分别与三维空间中坐标轴之间的夹角。
图5示出了虚拟视点朝向的示意图,如图所示,虚拟视点的观测方向为S1,S1在xoy平面上的投影为S1’,S1’与x轴的夹角为α,S1’与y轴的夹角为β,而S1与z轴的夹角为σ。因此该虚拟视点的朝向为(α,β,σ)。
画面显示框401中还包括融合标识403,所述融合标识403用于指示图像数据与场景文件融合的位置。目标用户可以利用鼠标或基于触控屏设置融合标识403的位置。
图像数据信息框408中展示有图像数据的信息。具体地,包括下述的一种或多种:图像数据标识、持续时间、采集设备坐标、采集设备朝向和视场角等。其中,持续时间是指图像数据在当前时刻在画面显示框401的位置(或融合位置)持续的时间。例如,持续时间为3:16—10:00代表着图像数据将于3分16秒至10分之间的时间内,一直处于融合标识403处。可选的,目标用户可以在图像数据信息框408中配置和修改持续时间。
图像数据标识可以是文件编号或者文件名,也可以是文件存储路径。而采集设备的坐标和朝向是指书品数据中,采集设备在图像数据的三维空间中的坐标和朝向。其中,采集设备的朝向与虚拟视点的朝向类似。
在一些可能的实现方式中,目标用户可以通过修改场景文件信息框407中虚拟视点的坐标、朝向和/或视场角实现对场景文件的调整。同样,目标用户可以通过修改图像数据信息框408中采集设备的坐标、朝向和/或视场角实现对图像数据的调整。
在一些可能的实现方式中,目标用户可以利用方位调节控件404实现对图像数据的调 整。具体的,在选中融合标识403后,可以通过操作方位调节控件404来控制图像数据中采集设备的坐标、朝向和视场角。也即,通过操作方位调节控件404,可以从空间中不同的位置观察图像数据所包括的三维空间,从而在画面显示框401中呈现不同的画面。
在确定了融合标识403的位置、场景文件框407中的参数和图像数据信息框408中的参数后,目标用户可以通过点击确定控件409启动图像数据和场景文件的融合,以获得合成图像。
图像融合由智能度由低向高可以分为像素级、特征级、决策级融合等。像素级融合指基于图像像素进行拼接融合,是两个或两个以上的图像融合成为一个整体。特征级融合以图形的明显特征,如线条、建筑等特征为基础进行图像的拼接与融合。决策级融合使用贝叶斯法、D-S证据法(dempster-shafer evidence theory)等数学算法进行概率决策,依此进行图像融合,更适应于主观要求。应理解,本申请不对图像融合使用的方法进行限制。
在一些可能的实现方式中,在进行图像融合前,还需要对图像数据和场景文件进行预处理。预处理技术主要用来对图像进行几何校正、噪声消除、色彩调整、亮度调整及图像配准等等。图像配准是指找到图像与三维虚拟场景的最大相关,以消除图像在空间、相位和分辨率等方向的信息差异,达到融合更真实,信息更准确的目的。
如前所述,图像数据存在需要将人和/或物体从采集设备采集到的数据中分离出来。例如,利用图像处理软件可以将画面中指定的人和/或物体截取出来。具体地,将人和/或物体从采集设备采集到的数据中分离出来这一操作可以由目标用户在S203之前完成,也可以由云计算平台100完成。也即,目标用户在S203中上传的图像数据未经预处理。
需要说明的是,在S200中当第三方用户上传场景文件后,云计算平台100可以对所述场景文件进行渲染操作,以获得渲染后的场景文件。也即,这一操作可以在S204被执行,也可以先于S204被执行,
云计算平台100提供的配置融合信息的交互界面可以是运行在目标用户客户端上应用(application)的界面,也可以运行在基于超文本传输协议(hyper text transfer protocol,HTTP)的网页上。
S205,发送合成图像至目标用户。
可选的,还可以根据目标用户的指示直接将合成图像发送至目标用户的客户。例如,目标用户为直播平台,那么就可以根据观众的接入情况和接入信息将合成图像直接发送至观众处。因此,这一方法可以有效地提升合成图像分发的速度和效率。
进一步地,还可以根据目标用户或目标用户的客户的参数信息进行合成图像的分发。以目标用户为例,参数信息包括下述的一种或多种:虚拟视点的坐标、朝向、视场角等。
此外,当目标用户或目标用户的客户存在多个时,云计算平台100可以产生多路图像数据,分别发送至所述多个目标用户或目标用户的客户。
以游戏直播场景为例,一局游戏中同时存在多个玩家,直播平台(目标用户)提供了每一玩家的视角供观众(目标用户的客户)观看。因此,在同一个三维空间中,不同玩家对应着不同的参数信息。那么云计算平台100可以根据玩家的参数信息产生多路下行图像数据,以满足观众的需要。
基于上述内容描述,本申请提供了一种虚拟拍摄方法。该方法提供了场景文件上传的交互界面以及图像融合的交互界面,让目标用户可以选择并使用第三方用户上传的场 景文件进行图像融合。避免了目标用户搭建场景文件拍摄场地、采购拍摄设备、操作场景文件制作软件等操作,有效地简化了目标用户侧进行虚拟拍摄的步骤,减少了设备采购和场地租赁带来的花销,极大地提升了目标用户侧进行虚拟拍摄的效率。
需要说明的是,上述云计算平台100还可以是边缘节点集群。例如,在网络中属于距离第三方用户和/或目标用户较近的计算节点集群。当该方法由边缘计算集群执行时,可以有效地降低网络中的中心计算节点的带宽压力,同时节省数据传输的时间,进一步地提升图像融合的效率。
具体地,边缘节点还可以根据目标用户的指示,将合成图像直接发送至目标用户的客户,进一步地提升效率,同时减轻目标用户的带宽压力,降低带宽开销。
图5提供的虚拟拍摄方法中执行图像融合操作的是云计算平台100,而目标用户客户端仅需要上传采集到的图像数据和接收合成图像,然而目标用户客户端可能也具有一定的计算能力。因此,为了充分利用目标用户客户端的计算能力,图6示出了另一种虚拟拍摄的方法200’流程图。
S200:第三方用户客户端上传场景文件至云计算平台100。
S201,目标用户客户端采集图像数据。
S202,目标用户从云计算平台100中选择用于制作合成图像的场景文件。
需要说明的是,S200-S202与图5中的S200-S202基本一致,故不再赘述。但是目标用户客户端侧运行有虚拟拍摄装置600,虚拟拍摄装置600可以是运行在目标用户客户端上的一款应用软件(例如云服务厂商提供的虚拟拍摄客户端)等。进一步地,虚拟拍摄装置600可以是在目标用户客户端侧执行虚拟拍摄方法200’的主体。
S203’,云计算平台100将选择的场景文件的渲染结果发送至目标用户。
相较于图5中S204云计算平台100基于存储于云计算平台100中的场景文件进行渲染,且其中的场景文件未经渲染。图6中的方法是由目标用户客户端根据接收到的场景文件的渲染结果和图像数据进行渲染获得合成图像。其中,所述场景文件的渲染结果包括场景文件对应的每一帧(时刻)的渲染画面。可选的,所述场景文件的渲染结果还包括每一帧渲染画面对应的深度信息。
可选的,在云计算平台100将选择的场景文件的信息发送至目标用户之前,还可以对所述场景文件的渲染结果进行编码。
S204’:目标用户客户端基于接收的场景文件的渲染结果和图像数据的信息进行渲染获得合成图像。
如前所述,场景文件的渲染结果包括每一帧(时刻)的渲染画面。进一步地,目标用户客户端侧的虚拟拍摄装置600可以根据目标用户设置的融合时间和融合位置等参数,将场景文件的渲染结果和图像数据进行融合。
其中,在融合的过程中涉及对图像数据放置的位置以及采集设备的坐标、朝向等参数的设置可以参考S204中的描述,故不再赘述。
需要说明的是,不同于S204中用于融合得到合成图像的是包含各个三维模型坐标的场景文件,S204’中使用的是场景文件的渲染结果。也即,每一帧的渲染画面。因此,在融合的过程中,主要配置的是图像数据的相关参数(采集设备的坐标和朝向等)。
其中,在场景文件的渲染结果中不包含每一帧渲染画面的深度信息时,目标用户客户端可以利用人工智能(artificial intelligence)推理模型、深度神经网络等方法,为每一 帧渲染画面推理得到每一帧渲染画面的深度信息。
在一些可能的实现方式中,当场景文件的渲染结果中包含每一帧渲染画面的深度信息时,目标用户客户端可以根据每一帧的渲染画面及其深度信息进行合成图像的融合。
具体地,上述两种方法均可以基于深度信息(接收到的和推理得到的)针对场景文件建立三维空间,结合图像数据中采集设备的坐标和朝向以及视频文件的融合位置,可以将图像数据和场景文件构建在同一坐标系下。进一步地,利用光栅化、光线追踪等渲染方法即可获得合成图像。
可选的,在目标用户客户端接收到场景文件的渲染结果后,还需要对所述场景文件的渲染结果进行解码。
需要说明的是,本申请不对目标用户获得合成图像的方式进行限定。例如,目标用户还可以利用云服务厂商提供的计算服务来获得合成图像。其中,所述云服务厂商与云计算平台100的厂商可以相同,也可以不同。
S205’:目标用户客户端分发合成图像。
在获得了合成图像之后,目标用户客户端可以将合成图像分发至目标用户的客户。具体地,可以通过内容分发网络(content delivery network,CDN)将合成图像进行分发。
例如,目标用户为直播平台,那么就可以根据观众的接入情况和接入信息将合成图像直接发送至观众处。
进一步地,还可以根据目标用户的客户的参数信息进行合成图像的分发。其中,参数信息包括下述的一种或多种:虚拟视点的坐标、朝向、视场角等。
此外,当目标用户的客户存在多个时,目标用户客户端可以产生多路图像数据,分别发送至所述多个目标用户的客户。
以游戏直播场景为例,一局游戏中同时存在多个玩家,直播平台(目标用户)提供了每一玩家的视角供观众(目标用户的客户)观看。因此,在同一个三维空间中,不同玩家对应着不同的参数信息。那么目标用户客户端可以根据玩家的参数信息产生多路下行视频流数据,以满足观众的需要。
需要说明的是,S205’是可选的步骤。也即,目标用户客户端在获得了合成图像之后,可以将合成图像存储至目标用户客户端中。
基于上述内容描述,本申请提供了一种虚拟拍摄方法。该方法提供了场景文件上传的交互界面以及图像融合的交互界面,让目标用户可以选择并使用第三方用户上传的场景文件进行图像融合。其中,图像融合的操作由目标用户客户端执行,最大限度地利用了目标用户客户端的计算能力。同时,还避免了目标用户搭建场景文件拍摄场地、采购拍摄设备、操作场景文件制作软件等操作,有效地简化了目标用户侧进行虚拟拍摄的步骤,减少了设备采购和场地租赁带来的花销,极大地提升了目标用户侧进行虚拟拍摄的效率。
如前所述,目标用户客户端可以在云计算平台100上按照需求选择用于合成图像的场景文件。但是由于场景文件是由第三方用户制作并上传的,因此云计算平台100中的场景文件可能无法直接被目标用户所使用,但是在目标用户对场景文件进行一定的修改或调整后,可能可以满足目标用户的需要。
因此,图7提供了一种场景文件编辑交互界面的示意图。如图7所示,该交互界面500 中包括多个三维模型501和502,其中,三维模型502中包括文本。
交互界面500用于展示一份场景文件包括的三维场景的部分或全部。可选的,目标用户可以通过拖动/旋转交互界面以实现视角的转换,从而在新的视角中观察该三维场景。
图7中示出的三维场景中至少包括两个三维模型501和一个包括文件的三维模型502。对于三维场景中的所有三维模型,目标用户均可以进行编辑。
以三维模型501为例,在目标用户选中三维模型501后,可以打开操作栏。其中,操作栏包括复制、粘贴、删除以及编辑等功能。具体地,复制指示的是将该三维模型的参数(例如长、宽和高)复制至剪贴板,而粘贴指示的是根据剪贴板中的三维模型的参数在指定位置生成相应的三维模型。删除指示的是将该三维模型从该三维场景中删除。编辑功能则至少包括文字编辑和颜色编辑功能。其中,文字编辑指示的是当该三维模型包括文本时,可以对该文本进行修改和/或调整,而颜色编辑则指示的是对该三维模型的颜色进行调整。
例如,当目标用户对三维模型501_1执行了复制操作,并在其右侧执行了粘贴操作,以及对三维模型502的文本进行编辑(将原始文本“ABC”修改为“DEF”)后,目标用户将在交互界面500中看到如图8所示的画面。
需要说明的是,在目标用户对场景文件进行修改之后,展示调整后的场景文件之前,还需要执行渲染操作。也即,基于调整后的三维模型的数量和位置等参数以及光源等参数进行渲染,从而获得图8中示出的画面。
可选的,上述渲染可以是预渲染,即以设定虚拟视点和视角进行渲染。换言之,在指定虚拟视点以及指定视角下,根据物体的透视关系,对于可见的三维模型进行渲染,而对不可见的三维模型不进行渲染操作,从而减少渲染的计算量,从而提升渲染的速度。
在目标用户对场景文件进行修改之后,目标用户可以为修改后的场景文件添加标识。例如,进行重命名或者添加版本号,以便于在后续使用这一修改后的场景文件时进行查询。
进一步地,目标用户可以选择是否将修改后的场景文件上传至云计算平台,供其他目标用户使用,也可以设置为仅限目标用户自身访问。具体地,目标用户上传所述修改后的场景文件的方法可以参考图3以及图3对应的实施例。
修改后的场景文件可以存储在云计算平台100中,也可以以图像和/或深度信息的方式存储在目标用户指定的系统(如目标用户客户端)中。
此外,场景文件编辑交互界面500可以是基于网络产品界面设计(website user interface,WebUI)方式来构建的,也可以是基于其他应用程序构建的。在WebUI方式下,浏览器运行的Web程序可通过网页即时通信(web real-time communication,WebRTC)方式接收本地/远程服务器渲染后的图像。
综上所述,本申请提供了一种编辑场景文件的方法。场景文件由第三方用户上传至云计算平台100后,目标用户可以根据需要对场景文件进行编辑,以获得符合目标用户需要的场景文件。这一方法避免了因第三方用户上传的场景文件不完全符合目标用户需求的情况,提高了场景文件的适用场景,进一步地提升了目标用户进行虚拟拍摄的效率。
上文中结合图1至图8,详细描述了本申请所提供的两种虚拟拍摄方法,下面将结合图9至图12,描述根据本申请所提供的装置和计算设备。
本申请提供一种云计算平台100,如图9所示,包括:
通信单元101,用于在S200中接收第三方用户有上传的场景文件,并将所述场景文件存储至云计算平台100中的存储单元103中。可选地,该存储单元也可以不包括在所述云计算平台100中,而是以云服务的形式提供。在S202中,通信单元101还用于接收目标用户通过目标用户客户端选择的用于制作合成图像的场景文件。而通信单元101还用于接收目标用户上传的图像数据。在S205中,通信单元101还用于将合成图像发送至目标用户。
可选的,通信单元还用于将图像数据和目标用户选择的场景文件发送至云上的计算服务(例如云渲染服务),并接收所述计算服务返回的计算结果,也即,合成图像。
处理单元102,用于在S204中,基于图像数据和目标用户选择的场景文件进行渲染获得合成图像。
存储单元103,用于存储在S200中接收到的由第三方用户上传的场景文件。在S203中,存储单元103还用于存储目标用户上传的图像数据。可选的,在S205中获得的合成图像也将被存储至存储单元103中。
总的来说,上述云计算平台100包括的三个单元(通信单元101、处理单元102和存储单元103)可以用于共同执行虚拟拍摄方法200。
本申请还提供另一种云计算平台100,结构可以参考图9,包括:
通信单元101,用于在S200中接收第三方用户有上传的场景文件,并将所述场景文件存储至云计算平台100中的存储单元103中。可选地,该存储单元也可以不包括在所述云计算平台100中,而是以云服务的形式提供。在S202中,通信单元101还用于接收目标用户通过目标用户客户端选择的用于制作合成图像的场景文件。而通信单元101还用于接收目标用户上传的图像数据。需要注意的是,通信单元101还用于在S203’中将目标用户选择的场景文件的信息发送至所述目标用户。其中,场景文件的信息包括场景文件的渲染结果。可选的,场景文件的信息还可以包括所述渲染图像的深度信息。
处理单元102,用于在S203中,获得目标用户选择的场景文件的渲染结果。可选的,还用于在S203中,计算目标用户选择的场景文件的渲染结果的深度信息。
存储单元103,用于存储在S200中接收到的由第三方用户上传的场景文件。在S203中,存储单元103还用于存储目标用户选择的场景文件的信息。
总的来说,上述云计算平台100包括的三个单元(通信单元101、处理单元102和存储单元103)可以用于共同执行虚拟拍摄方法200’。
需要说明的是,以上两种云计算平台100中的通信单元101、处理单元102和存储单元103均可以通过软件实现,或者可以通过硬件实现。示例性的,接下来介绍通信单元101的实现方式。类似的,处理单元102和存储单元103的实现方式可以参考通信单元101的实现方式。
单元作为软件功能单元的一种举例,通信单元101可以包括运行在计算实例上的代码。其中,计算实例可以是物理主机(计算设备)、虚拟机、容器等计算设备中的至少一种。进一步地,上述计算设备可以是一台或者多台。例如,通信单元101可以包括运行在多个主机/虚拟机/容器上的代码。需要说明的是,用于运行该应用程序的多个主机/虚拟机/容器可以分布在相同的region中,也可以分布在不同的region中。用于运行该代码的多个主 机/虚拟机/容器可以分布在相同的AZ中,也可以分布在不同的AZ中,每个AZ包括一个数据中心或多个地理位置相近的数据中心。其中,通常一个region可以包括多个AZ。
同样,用于运行该代码的多个主机/虚拟机/容器可以分布在同一个VPC中,也可以分布在多个VPC中。其中,通常一个VPC设置在一个region内。同一region内两个VPC之间,以及不同region的VPC之间跨区通信需在每个VPC内设置通信网关,经通信网关实现VPC之间的互连。
单元作为硬件功能单元的一种举例,通信单元101可以包括至少一个计算设备,如服务器等。或者,通信单元101也可以是利用ASIC实现、或PLD实现的设备等。其中,上述PLD可以是CPLD、FPGA、GAL或其任意组合实现。
通信单元101包括的多个计算设备可以分布在相同的region中,也可以分布在不同的region中。通信单元101包括的多个计算设备可以分布在相同的AZ中,也可以分布在不同的AZ中。同样,通信单元101包括的多个计算设备可以分布在同一个VPC中,也可以分布在多个VPC中。其中,所述多个计算设备可以是服务器、ASIC、PLD、CPLD、FPGA和GAL等计算设备的任意组合。
本申请还提供一种虚拟拍摄装置600,如图10所示,包括:
交互单元601,获取目标用户在S201中选择的用于合成图像的场景文件。并接收云计算平台100在S203’中发送的场景文件的信息。交互单元601,还用于在S205’中分发合成图像至目标用户、目标用户的客户或指定的边缘节点。
计算单元602,用于在S204’中基于接收的场景文件和图像数据的信息进行渲染获得合成图像。
存储单元603,用于存储在S201中采集的图像数据。可选的,在S204’中通过渲染获得的合成图像也将被存储在存储单元603中。
总的来说,上述虚拟拍摄装置600包括的三个单元(交互单元601、计算单元602和存储单元603)可以用于共同执行虚拟拍摄方法200’。
需要说明的是,以上虚拟拍摄装置600中的交互单元601、计算单元602和存储单元603均可以通过软件实现,或者可以通过硬件实现。示例性的,接下来介绍交互单元601的实现方式。类似的,计算单元602和存储单元603的实现方式可以参考交互单元601的实现方式。
单元作为软件功能单元的一种举例,交互单元601可以包括运行在计算实例上的代码。其中,计算实例可以是物理主机(计算设备)、虚拟机、容器等计算设备中的至少一种。进一步地,上述计算设备可以是一台或者多台。例如,交互单元601可以包括运行在多个主机/虚拟机/容器上的代码。需要说明的是,用于运行该应用程序的多个主机/虚拟机/容器可以分布在相同的region中,也可以分布在不同的region中。用于运行该代码的多个主机/虚拟机/容器可以分布在相同的AZ中,也可以分布在不同的AZ中,每个AZ包括一个数据中心或多个地理位置相近的数据中心。其中,通常一个region可以包括多个AZ。
同样,用于运行该代码的多个主机/虚拟机/容器可以分布在同一个VPC中,也可以分布在多个VPC中。其中,通常一个VPC设置在一个region内。同一region内两个VPC之间,以及不同region的VPC之间跨区通信需在每个VPC内设置通信网关,经通信网关实现VPC之间的互连。
单元作为硬件功能单元的一种举例,交互单元601可以包括至少一个计算设备,如服务器等。或者,交互单元601也可以是利用ASIC实现、或PLD实现的设备等。其中,上述PLD可以是CPLD、FPGA、GAL或其任意组合实现。
交互单元601包括的多个计算设备可以分布在相同的region中,也可以分布在不同的region中。交互单元601包括的多个计算设备可以分布在相同的AZ中,也可以分布在不同的AZ中。同样,通信单元101包括的多个计算设备可以分布在同一个VPC中,也可以分布在多个VPC中。其中,所述多个计算设备可以是服务器、ASIC、PLD、CPLD、FPGA和GAL等计算设备的任意组合。
本申请还提供一种计算设备700。如图11所示,计算设备700包括:总线702、处理器704、存储器706和通信接口708。处理器704、存储器706和通信接口708之间通过总线702通信。计算设备700可以是服务器或终端设备。应理解,本申请不限定计算设备700中的处理器、存储器的个数。
总线702可以是外设部件互连标准(peripheral component interconnect,PCI)总线或扩展工业标准结构(extended industry standard architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图11中仅用一条线表示,但并不表示仅有一根总线或一种类型的总线。总线704可包括在计算设备700各个部件(例如,存储器706、处理器704、通信接口708)之间传送信息的通路。
处理器704可以包括中央处理器(central processing unit,CPU)、图形处理器(graphics processing unit,GPU)、微处理器(micro processor,MP)或者数字信号处理器(digital signal processor,DSP)等处理器中的任意一种或多种。
存储器706可以包括易失性存储器(volatile memory),例如随机存取存储器(random access memory,RAM)。处理器704还可以包括非易失性存储器(non-volatile memory),例如只读存储器(read-only memory,ROM),快闪存储器,机械硬盘(hard disk drive,HDD)或固态硬盘(solid state drive,SSD)。
存储器706中存储有可执行的程序代码,处理器704执行该可执行的程序代码以分别实现前述通信单元101、处理单元102和存储单元103的功能,从而实现虚拟拍摄方法200。也即,存储器706上存有用于执行虚拟拍摄方法200的指令。
或者,存储器706中存储有可执行的代码,处理器704执行该可执行的代码以分别实现前述交互单元601、计算单元602和存储单元603装置的功能,从而实现虚拟拍摄方法200’。也即,存储器706上存有用于执行虚拟拍摄方法200’的指令。
通信接口703使用例如但不限于网络接口卡、收发器一类的收发单元,来实现计算设备700与其他设备或通信网络之间的通信。
本申请实施例还提供了一种计算设备集群。该计算设备集群包括至少一台计算设备。该计算设备可以是服务器,例如是中心服务器、边缘服务器,或者是本地数据中心中的本地服务器。在一些实施例中,计算设备也可以是台式机、笔记本电脑或者智能手机等终端设备。
如图12所示,所述计算设备集群包括至少一个计算设备700。计算设备集群中的一个或多个计算设备700中的存储器706中可以存有相同的用于执行虚拟拍摄方法200的 指令。
在一些可能的实现方式中,该计算设备集群中的一个或多个计算设备700的存储器706中也可以分别存有用于执行虚拟拍摄方法200的部分指令。换言之,一个或多个计算设备700的组合可以共同执行用于执行虚拟拍摄方法200的指令。
需要说明的是,计算设备集群中的不同的计算设备700中的存储器706可以存储不同的指令,分别用于执行云计算平台100的部分功能。也即,不同的计算设备700中的存储器706存储的指令可以实现通信单元101、处理单元102和存储单元103中的一个或多个单元的功能。
在一些可能的实现方式中,计算设备集群中的一个或多个计算设备可以通过网络连接。其中,所述网络可以是广域网或局域网等等。图13示出了一种可能的实现方式。如图13所示,两个计算设备700A和700B之间通过网络进行连接。具体地,通过各个计算设备中的通信接口与所述网络进行连接。在这一类可能的实现方式中,计算设备700A中的存储器706中存有执行通信单元101和存储单元103的功能的指令。同时,计算设备700B中的存储器706中存有执行处理单元102的功能的指令。
图13所示的计算设备集群之间的连接方式可以是考虑到本申请提供的虚拟拍摄方法200需要进行大量的计算,因此考虑将处理单元102实现的功能交由计算设备700B执行。
应理解,图13中示出的计算设备700A的功能也可以由多个计算设备700完成。同样,计算设备700B的功能也可以由多个计算设备700完成。
本申请实施例还提供了另一种计算设备集群。该计算设备集群中各计算设备之间的连接关系可以类似的参考图12和图13所述计算设备集群的连接方式。不同的是,该计算设备集群中的一个或多个计算设备700中的存储器706中可以存有相同的用于执行虚拟拍摄方法200’的指令。
在一些可能的实现方式中,该计算设备集群中的一个或多个计算设备700的存储器706中也可以分别存有用于执行虚拟拍摄方法200’的部分指令。换言之,一个或多个计算设备700的组合可以共同执行用于执行虚拟拍摄方法200’的指令。
需要说明的是,计算设备集群中的不同的计算设备700中的存储器706可以存储不同的指令,用于执行虚拟拍摄装置600的部分功能。也即,不同的计算设备700中的存储器706存储的指令可以实现交互单元601、计算单元602和存储单元603中的一个或多个单元的功能。
本申请实施例还提供了一种包含指令的计算机程序产品。所述计算机程序产品可以是包含指令的,能够运行在计算设备上或被储存在任何可用介质中的软件或程序产品。当所述计算机程序产品在至少一个计算设备上运行时,使得至少一个计算设备执行虚拟拍摄方法200,或虚拟拍摄方法200’。
本申请实施例还提供了一种计算机可读存储介质。所述计算机可读存储介质可以是计算设备能够存储的任何可用介质或者是包含一个或多个可用介质的数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘)等。该计算机可读存储介质包括指令,所述指令指示计算设备执行虚拟拍摄方法200,或虚拟拍摄方法200’。
最后应说明的是:以上实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的保护范围。

Claims (48)

  1. 一种虚拟拍摄方法,其特征在于,所述方法应用于云计算平台,所述方法包括:
    提供第一配置接口,所述第一配置接口用于接收第一用户上传的场景文件,所述场景文件包括至少一个三维模型的参数;
    提供第二配置接口,所述第二配置接口用于接收第二用户上传的目标图像;
    将所述场景文件和所述目标图像融合,获得合成图像。
  2. 如权利要求1所述的方法,其特征在于,所述第一配置接口,还用于接收所述第一用户配置的所述场景文件的描述信息和/或收费方式。
  3. 如权利要求1或2所述的方法,其特征在于,所述目标图像包括在视频文件中。
  4. 如权利要求1至3中任一所述的方法,其特征在于,所述将所述场景文件和所述目标图像融合,获得合成图像,包括:
    通过所述云计算平台的云渲染服务,将所述场景文件和所述目标图像融合,获得所述合成图像。
  5. 如权利要求1至4中任一所述的方法,其特征在于,所述第二配置接口,还用于接收所述第二用户上传的融合参数,所述融合参数包括下述的一种或多种:
    持续时间、融合位置、朝向;
    其中,所述持续时间指示所述场景文件与所述图像融合的时间段,所述朝向指示所述目标图像中采集设备的朝向;
    所述将所述场景文件和所述目标图像融合,获得合成图像,包括:
    基于所述融合参数,将所述场景文件和所述目标图像融合,获得所述合成图像。
  6. 如权利要求5所述的方法,其特征在于,所述场景文件包括直播背景,所述目标图像包括直播视频流中的人像;
    所述将所述场景文件和所述目标图像融合,获得合成图像,包括:
    基于所述融合参数,将所述直播背景作为所述直播视频流的背景与所述直播视频流中的所述人像融合,获得新的直播视频流。
  7. 如权利要求1至6中任一所述的方法,其特征在于,所述方法还包括:
    向所述第二用户展示所述场景文件的渲染结果;
    接收所述第二用户发出调整信息,所述调整信息指示对所述至少一个三维模型的参数进行调整;
    根据所述调整信息和所述场景文件的渲染结果,向所述第二用户展示调整后的场景文件的渲染结果。
  8. 如权利要求7中所述的方法,其特征在于,所述至少一个三维模型包括目标三维模型,所述对所述至少一个三维模型的参数进行调整,包括:
    对所述目标三维模型执行下述操作的一种或多种:
    复制、粘贴、删除、编辑;
    其中,所述编辑包括文字编辑和颜色编辑,所述文字编辑指示对所述目标三维模型包含的文本信息进行修改,所述颜色编辑指示对所述目标三维模型的颜色进行修改。
  9. 如权利要求1至8中任一所述的方法,其特征在于,所述方法还包括:
    根据所述第二用户的指示信息,将所述合成图像发送至所述第二用户指定的第三用户。
  10. 如权利要求1至9中任一所述的方法,其特征在于,所述云计算平台为边缘计算集群。
  11. 如权利要求9所述的方法,其特征在于,所述将所述合成图像发送至所述第二用户指定的第三用户,包括:
    将所述合成图像和所述指示信息发送至边缘节点,以使得所述边缘节点将所述合成图像发送至所述第二用户指定的所述第三用户。
  12. 一种虚拟拍摄方法,其特征在于,所述方法包括:
    接收云计算平台云计算平台发送的目标场景的渲染结果,所述目标场景是对场景文件进行渲染获得的,所述场景文件由第一用户在所述云计算平台云计算平台提供,所述场景文件包括至少一个三维模型的参数;
    将所述目标场景的渲染结果和目标图像融合,获得合成图像,其中,所述目标图像由第二用户采集获得。
  13. 如权利要求12所述的方法,其特征在于,所述目标场景的渲染结果还包括所述目标场景对应的深度信息。
  14. 如权利要求12或13所述的方法,其特征在于,所述目标图像包括在视频文件中。
  15. 如权利要求12至14中任一所述的方法,其特征在于,所述方法还包括:
    提供配置接口,所述配置接口用于接收所述第二用户上传的融合参数,所述融合参数包括下述的一种或多种:
    持续时间、融合位置、朝向;
    其中,所述持续时间指示所述场景文件与所述图像融合的时间段,所述朝向指示所述目标图像中采集设备的朝向;
    所述将所述场景文件和所述目标图像融合,获得合成图像,包括:
    基于所述融合参数,将所述场景文件和所述目标图像融合,获得所述合成图像。
  16. 如权利要求12至14中任一所述的方法,其特征在于,所述场景文件包括直播背景,所述目标图像包括直播视频流中的人像,所述将所述场景文件和所述目标图像融合, 获得合成图像,包括:
    基于所述融合参数,将所述直播背景作为所述直播视频流的背景与所述直播视频流中的所述人像融合,获得新的直播视频流。
  17. 如权利要求12至16中任一所述的方法,其特征在于,所述接收云云计算平台发送的目标场景帧的渲染结果前,所述方法还包括:
    向所述云云计算平台发送调整信息,所述调整信息指示对所述场景文件包括的所述至少一个三维模型的参数进行调整;
    获取调整后的场景文件的渲染结果。
  18. 如权利要求17所述的方法,其特征在于,所述至少一个三维模型包括目标三维模型,所述对所述至少一个三维模型的参数进行调整,包括:
    对所述目标三维模型执行下述操作的一种或多种:
    复制、粘贴、删除、编辑;
    其中,所述编辑包括文字编辑和颜色编辑,所述文字编辑指示对所述目标三维模型包含的文本信息进行修改,所述颜色编辑指示对所述目标三维模型的颜色进行修改。
  19. 如权利要求17或18所述的方法,其特征在于,所述方法还包括:
    向所述云计算平台上传所述调整后的场景文件。
  20. 如权利要求19所述的方法,其特征在于,所述方法还包括:
    向所述云计算平台上传针对所述调整后的场景文件的描述信息和/或收费方式。
  21. 如权利要求12至20中任一所述的方法,其特征在于,所述接收云计算平台发送的目标场景帧的渲染结果前,所述方法包括:
    向所述云计算平台发送配置信息,所述配置信息指示从所述云计算平台提供的至少一个场景文件中选择所述场景文件。
  22. 一种云计算平台,其特征在于,包括:
    通信单元,用于提供第一配置接口,所述第一配置接口用于接收第一用户上传的场景文件,所述场景文件包括至少一个三维模型的参数;提供第二配置接口,所述第二配置接口用于接收第二用户上传的目标图像;
    处理单元,用于将所述场景文件和所述目标图像融合,获得合成图像。
  23. 如权利要求22所述的云计算平台,其特征在于,所述第一配置接口,还用于接收所述第一用户配置的所述场景文件的描述信息和/或收费方式。
  24. 如权利要求22或23所述的云计算平台,其特征在于,所述目标图像包括在视频文件中。
  25. 如权利要求22至24中任一所述的云计算平台,其特征在于,所述处理单元,还用于通过所述云计算平台的云渲染服务,将所述场景文件和所述目标图像融合,获得所述合成图像。
  26. 如权利要求22至25中任一所述的云计算平台,其特征在于,所述第二配置接口,还用于接收所述第二用户上传的融合参数,所述融合参数包括下述的一种或多种:
    持续时间、融合位置、朝向;
    其中,所述持续时间指示所述场景文件与所述图像融合的时间段,所述朝向指示所述目标图像中采集设备的朝向;
    所述处理单元,还用于基于所述融合参数,将所述场景文件和所述目标图像融合,获得所述至少一帧合成图像。
  27. 如权利要求26所述的云计算平台,其特征在于,所述场景文件包括直播背景,所述目标图像包括直播视频流中的人像;所述处理单元,还用于基于所述融合参数,将所述直播背景作为所述直播视频流的背景与所述直播视频流中的所述人像融合,获得新的直播视频流人像。
  28. 如权利要求22至27中任一所述的云计算平台,其特征在于,所述通信单元,还用于向所述第二用户展示所述场景文件的渲染结果;接收所述第二用户发出调整信息,所述调整信息指示对所述至少一个三维模型的参数进行调整;所述处理单元,还用于根据所述调整信息和所述场景文件的渲染结果,获得调整后的场景文件的渲染结果;所述通信单元,还用于向所述第二用户展示调整后的场景文件的渲染结果。
  29. 如权利要求28中所述的云计算平台,其特征在于,所述至少一个三维模型包括目标三维模型,所述处理单元,还用于对所述目标三维模型执行下述操作的一种或多种:
    复制、粘贴、删除、编辑;
    其中,所述编辑包括文字编辑和颜色编辑,所述文字编辑指示对所述目标三维模型包含的文本信息进行修改,所述颜色编辑指示对所述目标三维模型的颜色进行修改。
  30. 如权利要求22至29中任一所述的云计算平台,其特征在于,所述处理单元,还用于根据所述第二用户的指示信息,将所述合成图像发送至所述第二用户指定的第三用户。
  31. 如权利要求22至30中任一所述的云计算平台,其特征在于,所述云计算平台为边缘计算集群。
  32. 如权利要求30所述的云计算平台,其特征在于,所述处理单元,还用于将所述合成图像和所述指示信息发送至边缘节点,以使得所述边缘节点将所述合成图像发送至所述第二用户指定的所述第三用户。
  33. 一种虚拟拍摄装置,其特征在于,包括:
    交互单元,用于接收云计算平台发送的目标场景的渲染结果,所述目标场景是对场景文件进行渲染获得的,所述场景文件由第一用户在所述云计算平台提供,所述场景文件包括至少一个三维模型的参数;
    计算单元,用于将所述目标场景的渲染结果和目标图像融合,获得合成图像,其中,所述目标图像由第二用户采集获得。
  34. 如权利要求33所述的装置,其特征在于,所述目标场景的渲染结果还包括所述目标场景对应的深度信息。
  35. 如权利要求33或34所述的装置,其特征在于,所述目标图像包括在视频文件中。
  36. 如权利要求33至35中任一所述的装置,其特征在于,所述交互单元,还用于提供配置接口,所述配置接口用于接收所述第二用户上传的融合参数,所述融合参数包括下述的一种或多种:
    持续时间、融合位置、朝向;
    其中,所述持续时间指示所述场景文件与所述图像融合的时间段,所述朝向指示所述目标图像中采集设备的朝向;
    所述计算单元,还用于基于所述融合参数,将所述场景文件和所述目标图像融合,获得所述合成图像。
  37. 如权利要求33至36中任一所述的装置,其特征在于,所述场景文件包括直播背景,所述目标图像包括直播视频流中的人像,所述计算单元,还用于基于所述融合参数,将将所述直播背景作为所述直播视频流的背景与所述直播视频流中的所述人像融合,获得新的直播视频流。
  38. 如权利要求33至37中任一所述的装置,其特征在于,所述交互单元,还用于向所述云云计算平台发送调整信息,所述调整信息指示对所述场景文件包括的所述至少一个三维模型的参数进行调整;获取调整后的场景文件的渲染结果。
  39. 如权利要求38所述的装置,其特征在于,所述至少一个三维模型包括目标三维模型,所述计算单元,还用于对所述目标三维模型执行下述操作的一种或多种:
    复制、粘贴、删除、编辑;
    其中,所述编辑包括文字编辑和颜色编辑,所述文字编辑指示对所述目标三维模型包含的文本信息进行修改,所述颜色编辑指示对所述目标三维模型的颜色进行修改。
  40. 如权利要求38或39所述的装置,其特征在于,所述交互单元,还用于向所述云计算平台上传所述调整后的场景文件。
  41. 如权利要求40所述的装置,其特征在于,所述交互单元,还用于向所述云计算 平台上传针对所述调整后的场景文件的描述信息和/或收费方式。
  42. 如权利要求33至41中任一所述的装置,其特征在于,所述交互单元,还用于向所述云计算平台发送配置信息,所述配置信息指示从所述云计算平台提供的至少一个场景文件中选择所述场景文件。
  43. 一种计算设备集群,其特征在于,包括至少一个计算设备,每个计算设备包括处理器和存储器;
    所述至少一个计算设备的处理器用于执行所述至少一个计算设备的存储器中存储的指令,以使得所述计算设备集群执行如权利要求1至11中任一所述的方法。
  44. 一种计算设备集群,其特征在于,包括至少一个计算设备,每个计算设备包括处理器和存储器;
    所述至少一个计算设备的处理器用于执行所述至少一个计算设备的存储器中存储的指令,以使得所述计算设备集群执行如权利要求12至21中任一所述的方法。
  45. 一种包含指令的计算机程序产品,其特征在于,当所述指令被计算机设备集群运行时,使得所述计算机设备集群执行如权利要求的1至11中任一所述的方法。
  46. 一种包含指令的计算机程序产品,其特征在于,当所述指令被计算机设备集群运行时,使得所述计算机设备集群执行如权利要求的12至21中任一所述的方法。
  47. 一种计算机可读存储介质,其特征在于,包括计算机程序指令,当所述计算机程序指令由计算设备集群执行时,所述计算设备集群执行如权利要求1至11中任一所述的方法。
  48. 一种计算机可读存储介质,其特征在于,包括计算机程序指令,当所述计算机程序指令由计算设备集群执行时,所述计算设备集群执行如权利要求12至21中任一所述的方法。
PCT/CN2023/081078 2022-03-18 2023-03-13 一种虚拟拍摄方法、装置及设备 WO2023174209A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210271138 2022-03-18
CN202210271138.X 2022-03-18
CN202210802065.2A CN116800737A (zh) 2022-03-18 2022-07-07 一种虚拟拍摄方法、装置及设备
CN202210802065.2 2022-07-07

Publications (1)

Publication Number Publication Date
WO2023174209A1 true WO2023174209A1 (zh) 2023-09-21

Family

ID=88022246

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/081078 WO2023174209A1 (zh) 2022-03-18 2023-03-13 一种虚拟拍摄方法、装置及设备

Country Status (1)

Country Link
WO (1) WO2023174209A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130336640A1 (en) * 2012-06-15 2013-12-19 Efexio, Inc. System and method for distributing computer generated 3d visual effects over a communications network
CN108650523A (zh) * 2018-05-22 2018-10-12 广州虎牙信息科技有限公司 直播间的显示及虚拟物品选取方法、服务器、终端和介质
US10665037B1 (en) * 2018-11-28 2020-05-26 Seek Llc Systems and methods for generating and intelligently distributing forms of extended reality content
CN113781660A (zh) * 2021-09-04 2021-12-10 上海白兔网络科技有限公司 一种用于直播间在线渲染加工虚拟场景的方法及装置
CN113941147A (zh) * 2021-10-25 2022-01-18 腾讯科技(深圳)有限公司 画面生成方法、装置、设备及介质


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23769712

Country of ref document: EP

Kind code of ref document: A1