WO2023231793A1 - Method for virtualizing a physical scene, electronic device, computer-readable storage medium, and computer program product - Google Patents

Method for virtualizing a physical scene, electronic device, computer-readable storage medium, and computer program product

Info

Publication number
WO2023231793A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
scene
virtual
entity
physical
Prior art date
Application number
PCT/CN2023/094999
Other languages
English (en)
French (fr)
Inventor
张哲
朱丹枫
Original Assignee
京东方科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司 filed Critical 京东方科技集团股份有限公司
Publication of WO2023231793A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2200/00 Indexing scheme for image data processing or generation, in general
    • G06T 2200/04 Indexing scheme for image data processing or generation, in general involving 3D image data

Definitions

  • the present disclosure relates to the fields of virtual reality and digital twins, and more specifically to a method for virtualizing a scene, an electronic device, a computer-readable storage medium, and a computer program product.
  • Digital twins make full use of data such as physical models, sensor updates, and operating history, integrating multi-disciplinary, multi-physics, multi-scale, and multi-probability simulation processes to complete a mapping in virtual space that reflects the whole life-cycle process of the corresponding physical entity.
  • A digital twin is a concept that transcends physical reality and can be regarded as a digital mapping system of one or more important, interdependent equipment systems.
  • Extended reality technology specifically includes virtual reality technology (VR, Virtual Reality), augmented reality technology (AR, Augmented Reality), mixed reality technology (MR, Mixed Reality), etc.
  • Digital twin technology has been widely used in the field of engineering construction, especially in the field of three-dimensional scene modeling.
  • Visualized 3D scene applications based on 3D scene models have become widely popular.
  • Various 3D engines can assist the development of visualized 3D scene applications.
  • However, due to the virtualized nature of three-dimensional scenes, such applications often involve the simultaneous operation of scene modeling applications and virtual reality applications.
  • Moreover, the model generation process of current 3D scene modeling solutions is not only complex and time-consuming but also requires the collection of a large amount of data in advance. Therefore, in actual applications, lag often occurs and the realism of the simulated virtual scene is too low.
  • The present disclosure therefore proposes a method, an electronic device, a computer-readable storage medium, and a computer program product for virtualizing a scene, to address the technical problems of high computational complexity and long processing time in the scene virtualization process.
  • Embodiments of the present disclosure provide a method for virtualizing a physical scene, including: determining a scene boundary based on interaction information used to indicate the scene boundary; determining, based on the scene boundary, a physical entity within the scene boundary and capturing video data corresponding to the physical entity; determining, based on the video data corresponding to the physical entity, model data of a virtual entity corresponding to the physical entity; and creating, based on the model data corresponding to the virtual entity, a virtual scene corresponding to the physical scene.
  • the video data includes a plurality of video frames, and different video frames among the plurality of video frames correspond to different lighting conditions, shooting positions or shooting angles.
  • Determining the model data of the virtual entity corresponding to the physical entity based on the video data corresponding to the physical entity further includes: extracting a plurality of discrete points from each video frame in the video data; generating, based on the plurality of discrete points of each video frame, three-dimensional model data represented by Thiessen polygons as the three-dimensional model data of that video frame; and determining the model data of the virtual entity corresponding to the physical entity based on the three-dimensional model data of each video frame.
  • Determining the model data of the virtual entity corresponding to the physical entity based on the video data corresponding to the physical entity further includes: obtaining one or more of a building information model, global geographical location information, and building positioning spatial data; and determining the model data of the virtual entity corresponding to the physical entity using the video data corresponding to the physical entity, based on the one or more of the building information model, the global geographical location information, and the building positioning spatial data.
  • Determining the model data of the virtual entity corresponding to the physical entity based on the video data corresponding to the physical entity further includes: obtaining one or more of urban traffic data, urban planning data, and urban municipal data; and determining the model data of the virtual entity corresponding to the physical entity using the video data corresponding to the physical entity, based on the one or more of the urban traffic data, the urban planning data, and the urban municipal data.
  • the method further includes: based on the virtual scene corresponding to the physical scene, displaying relevant information of the virtual scene.
  • Displaying the relevant information of the virtual scene further includes: selecting multiple video frames from the video data; performing texture compression and/or texture scaling on the multiple video frames to generate texture map data; and rendering the virtual scene corresponding to the physical scene based on the texture map data and displaying the rendered virtual scene.
  • Performing texture compression and/or texture scaling on the plurality of video frames to generate texture map data further includes: performing texture compression on the plurality of video frames to generate texture-compressed map data; determining, based on the texture-compressed map data, the material resource data and texture resource data corresponding to the map data; determining the parameters corresponding to the texture scaling process based on the material resource data and texture resource data corresponding to the map data; and performing, based on the parameters corresponding to the texture scaling process, texture scaling on the texture-compressed map data to generate texture-scaled map data.
  • Some embodiments of the present disclosure provide an electronic device, including: a processor; and a memory.
  • the memory stores computer instructions, and when the computer instructions are executed by the processor, the above method is implemented.
  • Some embodiments of the present disclosure provide a computer-readable storage medium on which computer instructions are stored, and when the computer instructions are executed by a processor, the above-mentioned method is implemented.
  • Some embodiments of the present disclosure provide a computer program product, which includes computer-readable instructions. When executed by a processor, the computer-readable instructions cause the processor to perform the above method.
  • Various embodiments of the present disclosure use video data to realize scene virtualization, which helps to solve the technical problem that the scene model generation process is highly complex and time-consuming.
  • FIG. 1 is an example schematic diagram illustrating an application scenario according to an embodiment of the present disclosure.
  • Figure 2 is a flowchart illustrating an example method of virtualizing a physical scene according to an embodiment of the present disclosure.
  • Figure 3 is a schematic diagram illustrating a physical scene, interaction information, and physical entities according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram showing an example interface change when a terminal obtains interaction information according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram illustrating obtaining interaction information according to an embodiment of the present disclosure.
  • Figure 6 is a schematic diagram illustrating processing of video frames according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram illustrating processing of video frames in combination with building information according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram illustrating processing of video frames in combination with geographical information according to an embodiment of the present disclosure.
  • FIG. 9 is an architectural schematic diagram illustrating a scene modeling application and/or a virtual reality application according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic diagram illustrating the operation of a rendering engine according to an embodiment of the present disclosure.
  • Figure 11 shows a schematic diagram of an electronic device according to an embodiment of the present disclosure.
  • Figure 12 shows a schematic diagram of the architecture of an exemplary computing device in accordance with an embodiment of the present disclosure.
  • Figure 13 shows a schematic diagram of a storage medium according to an embodiment of the present disclosure.
  • The first data may be referred to as second data, and similarly, the second data may be referred to as the first data.
  • Both the first data and the second data may be data, and in some cases, may be separate and different data.
  • the term "at least one" in this application means one or more, and the term “plurality” in this application means two or more.
  • Multiple audio frames means two or more audio frames.
  • The size of the sequence numbers of the processes does not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation process of the embodiments of the present application.
  • It should also be understood that determining B according to (based on) A does not mean determining B according to (based on) A alone; B can also be determined according to (based on) A and/or other information.
  • the term “if” may be interpreted to mean “when” or “upon” or “in response to determining” or “in response to detecting.”
  • the phrase “if it is determined" or “if [the stated condition or event] is detected” may be interpreted to mean “when it is determined" or “in response to the determination... ” or “on detection of [stated condition or event]” or “in response to detection of [stated condition or event].”
  • FIG. 1 shows a schematic diagram of an application scenario 100 of an embodiment of the present disclosure, in which a server 110 and a plurality of terminals 120 are schematically shown.
  • the terminal 120 and the server 110 can be connected directly or indirectly through wired or wireless communication methods, and this disclosure is not limited here.
  • The embodiments of the present disclosure adopt Internet technology, especially Internet of Things (IoT) technology.
  • the Internet of Things can be used as an extension of the Internet. It includes the Internet and all resources on the Internet, and is compatible with all applications of the Internet. With the application of IoT technology in various fields, various new smart IoT application fields have emerged, such as smart homes, smart transportation, and smart health.
  • scene data may be data related to Internet of Things technology.
  • Scene data includes XX.
  • the present disclosure is not limited to this.
  • methods according to some embodiments of the present disclosure may be fully or partially mounted on the server 110 to process scene data, for example, scene data in the form of pictures.
  • the server 110 will be used to analyze scene data and determine model data based on the analysis results.
  • The server 110 here may be an independent server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN, Content Delivery Network), location services, big data, and artificial intelligence platforms; the embodiments of this disclosure do not specifically limit this.
  • the server 110 is also referred to as the cloud.
  • the method according to the embodiment of the present disclosure may also be fully or partially mounted on the terminal 120 to process scene data.
  • the terminal 120 will be used to collect the above scene data in the form of pictures.
  • the terminal 120 will be used to present scene data so that the user can interact with the constructed three-dimensional model in the virtual scene.
  • the terminal 120 can be an interactive device that can provide 3D digital virtual objects and include a display device of a user interface. The 3D digital virtual objects can be displayed through the user interface, and the user can interact with the interactive device.
  • the terminal 120 will also be used to analyze the above-mentioned building data. This disclosure does not limit this.
  • Each terminal of the plurality of terminals 120 may be a fixed terminal such as a desktop computer, or a mobile terminal with network functions such as a smartphone, a tablet computer, a portable computer, a handheld device, a personal digital assistant, a smart wearable device (e.g., smart glasses), a smart head-mounted device, a camera, or a vehicle-mounted terminal, or any combination thereof; this is not specifically limited in the embodiments of the present disclosure.
  • Each terminal in the plurality of terminals 120 may also include various sensors or data collection devices, such as the temperature sensor shown in FIG. 1 and so on.
  • the scene data is related to lighting conditions, so the terminal can also be a brightness sensor.
  • the terminal 120 may also be a camera (such as an infrared camera) or a distance detector.
  • Augmented reality technology is a technology that integrates virtual scene data with real scenes. It widely uses multimedia, three-dimensional modeling, real-time tracking and registration, intelligent interaction, sensing, and other technical means to simulate computer-generated text, images, three-dimensional models, music, videos, and other virtual information and apply it to the real world. The two types of information complement each other, thereby achieving 'augmentation' of the real world.
  • Virtual reality uses computers to simulate real scenes to generate a three-dimensional virtual world, providing users with simulations of vision and other senses, making users feel as if they are immersed in the scene and can observe things in the three-dimensional space in real time and without restrictions. When the user moves his position, the computer can immediately perform complex calculations and send back accurate three-dimensional images of the world to create a sense of presence.
  • Smart glasses not only include various optical components and support components of conventional glasses, but also include display components for displaying the above-mentioned augmented reality information and/or virtual reality information.
  • Smart glasses also include corresponding battery components, sensor components, network components, etc.
  • the sensor component may include a depth camera (for example, a Kinect depth camera), which captures depth information in a real scene through the principle of amplitude modulated continuous wave (AMCW) time-of-flight (TOF), and uses near-infrared light (NIR) to generate real-life images.
  • Sensor components can also include various acceleration sensors, gyroscope sensors, geomagnetic field sensors, etc., to detect the user's attitude and position information, thereby providing reference information for scene data processing.
  • Smart glasses may also be integrated with various eye-tracking accessories to build a bridge between the real world, the virtual world and the user through the user's eye movement data, thereby providing a more natural user experience.
  • embodiments of the present disclosure may further involve artificial intelligence services to intelligently provide the above-mentioned virtual scenes.
  • the artificial intelligence service may not only be executed on the server 110, but also on the terminal 120, or may be executed jointly by the terminal and the server. There are no restrictions on this.
  • The device that applies the artificial intelligence service of the embodiments of the present disclosure to analyze and reason about the scene data may be a terminal, a server, or a system composed of a terminal and a server.
  • The virtual three-dimensional scenes generated by such a solution, in which six pictures are simply attached to the spatial scene model in the form of a cube, often have poor realism.
  • For example, if the six pictures each correspond to different lighting conditions, the actually generated virtual scene is often unable to simulate real lighting conditions, resulting in distortion of the virtual scene.
  • Moreover, because these six pictures are simply attached to the spatial scene model in the form of a cube, accurately determining information that meets the needs of the scene modeling application often requires a large amount of pre-collected information and a large amount of computing resources, making it difficult for scene modeling applications to run simultaneously with virtual reality applications.
  • Embodiments of the present disclosure provide a method for virtualizing a physical scene, including: determining physical entities within a scene boundary based on interaction information indicating the scene boundary, and capturing video data corresponding to the physical entities; determining, based on the video data, model data of virtual entities corresponding to the physical entities; and creating a virtual scene corresponding to the physical scene based on the model data corresponding to the virtual entities. Therefore, in response to the needs of application business visualization and scene virtualization, various embodiments of the present disclosure use video data to realize scene virtualization, which helps to solve the technical problem that the scene model generation process is highly complex and time-consuming.
  • FIG. 2 is a flowchart illustrating an example method 20 of virtualizing a physical scene according to an embodiment of the present disclosure.
  • Figure 3 is a schematic diagram illustrating a physical scene, interaction information and physical entities according to an embodiment of the present disclosure.
  • the example method 20 may include one or all of operations S201-S203, and may also include more operations.
  • operations S201 to S203 are performed by the terminal 120/server 110 in real time, or performed offline by the terminal 120/server 110.
  • This disclosure does not limit the execution subject of each operation of the example method 20, as long as the purpose of the disclosure can be achieved.
  • Various steps in the example methods may be performed in whole or in part by a virtual reality application and/or a scene modeling application.
  • Virtual reality applications and scene modeling applications can be integrated into one large application.
  • Virtual reality applications and scene modeling applications can also be two independent applications that exchange interaction information, video data, model data, and the like through mutually open interfaces.
  • This disclosure is not limited in this regard.
  • In operation S201, the scene boundary is determined based on interaction information indicating the scene boundary.
  • In operation S202, based on the scene boundary, a physical entity within the scene boundary is determined, and video data corresponding to the physical entity is captured.
  • the interaction information may be collected through the terminal 120 in FIG. 1 , which indicates which physical entities in the physical scene need to be further virtualized.
  • Figure 3 shows an example of a physical scene, interaction information, and physical entities; it schematically shows a physical scene including physical entities such as a sofa, curtains, the moon, a desk lamp, a locker, and books.
  • interaction information shown in a circular frame can be obtained, which indicates that only the physical entities and physical scenes in the circular frame need to be virtualized. That is, in the example of FIG. 3 , it can be correspondingly determined that the physical entities in the scene only include desk lamps, lockers, and books.
  • video data corresponding to desk lamps, lockers, and books can be captured.
  • the scene boundary is shown in the form of a circular frame in FIG. 3 , those skilled in the art should understand that the present disclosure is not limited thereto. Specifically, the scene boundary can also be indicated in any connected shape. Various examples of interactive information will be described in detail later with reference to FIGS. 4 to 5 , and the disclosure will not be repeated here.
  • the video data corresponding to the physical entity refers to a continuous image sequence, which is essentially composed of a group of continuous images. Each image in this image sequence is also called a video frame, which is the smallest visual unit that makes up a video.
  • the video data can be collected using various terminals 120 described with reference to FIG. 1 .
  • Smart glasses, mobile phone terminals, depth cameras, and other devices can be used to collect the video data. Since the video data captures images (video frames) of the physical entity over a period of time, different video frames among the multiple video frames correspond to different lighting conditions, shooting positions, or shooting angles; therefore, each video frame in the video data includes various information about the physical entity. According to various experiments using embodiments of the present disclosure, enough information to characterize a physical entity can be extracted from video data including about 300 frames, thereby enabling the modeling of a highly realistic virtual entity.
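  • As an illustrative aid (not part of the disclosed embodiments), the following Python sketch shows one way such video data might be sampled into a bounded number of video frames using OpenCV; the file name and the roughly 300-frame budget mentioned above are assumptions for illustration.

```python
import cv2

def sample_frames(video_path: str, max_frames: int = 300):
    """Read a captured clip and return an evenly spaced subset of its video frames."""
    capture = cv2.VideoCapture(video_path)
    total = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
    step = max(1, total // max_frames) if total > 0 else 1

    frames = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0 and len(frames) < max_frames:
            frames.append(frame)  # one BGR video frame
        index += 1
    capture.release()
    return frames

# "physical_entity.mp4" is a placeholder file name for the captured video data.
frames = sample_frames("physical_entity.mp4")
print(f"collected {len(frames)} video frames")
```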
  • model data of the virtual entity corresponding to the physical entity is determined based on the video data corresponding to the physical entity.
  • The analysis and processing of the video data may be performed by the server 110.
  • For example, the terminal 120 can transmit the video data to the server through streaming, and the server 110 can then process the video data corresponding to the physical entity (for example, through image processing) to obtain the model data of the virtual entity corresponding to the physical entity.
  • the server 110 can also combine various known information or connect to public or non-public databases through various interfaces to obtain information related to the physical entity as model data of the virtual entity.
  • The model data of the virtual entity indicates any data that can be used to build the virtual entity in the virtual scene. For example, edge information, position information, depth information, vertex information, height information, width information, length information, and the like of the virtual entity can be extracted from each video frame of the video data.
  • the model data of the virtual entity may also be the environmental information of the virtual entity extracted from each video frame of the video data, such as lighting information, relative position relationship information, etc.
  • the model data of the virtual entity may also include Internet of Things related information, such as network status, registration request information, registration entity information, device operation information, and so on.
  • any data related to the physical entity can be pulled from the Internet/database based on the analysis of the video data. This disclosure does not limit this.
  • Various examples of determining the model data will be described in detail later with reference to FIG. 6 and will not be repeated here.
  • a virtual scene corresponding to the physical scene is created based on the model data corresponding to the virtual entity.
  • the virtual scene is a three-dimensional virtual scene, which is a virtualization of a real physical scene.
  • a three-dimensional virtual model corresponding to the virtual entity is placed in the three-dimensional virtual scene.
  • Three-dimensional virtual models are also called 3D models, which can be produced through various 3D software.
  • The software for making 3D models in the present disclosure is, for example, CAD (Computer Aided Design) software.
  • the 3D model file in STL format can be obtained through the software; then, the STL format file can be imported into the slicing processing pipeline in the 3D software that can perform slicing to obtain the 3D virtual model.
  • model data can be structurally optimized before constructing the 3D virtual model to save computing resources and improve processing efficiency.
  • this disclosure does not limit the type of 3D software.
  • For example, it can be software for 3D model analysis, 3D software for visual art creation, 3D software for 3D printing, and so on. In addition, three-dimensional models can also be generated through computer graphics libraries (that is, graphics libraries used in custom programming), for example, OpenGL (Open Graphics Library), DirectX (Direct eXtension), and so on.
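  • As a hedged illustration of the STL import and slicing workflow mentioned above, the following Python sketch loads an STL file and takes a few horizontal cross-sections; the use of the trimesh library and the file name are assumptions, not requirements of the disclosure.

```python
import numpy as np
import trimesh  # assumed library choice for this illustration

# "virtual_entity.stl" is a placeholder for a 3D model file exported from CAD software.
mesh = trimesh.load("virtual_entity.stl")
print(f"{len(mesh.vertices)} vertices, {len(mesh.faces)} faces")

# Take a few horizontal cross-sections, loosely analogous to a slicing pipeline.
z_min, z_max = mesh.bounds[0][2], mesh.bounds[1][2]
for z in np.linspace(z_min, z_max, num=5):
    section = mesh.section(plane_origin=[0.0, 0.0, z], plane_normal=[0.0, 0.0, 1.0])
    if section is not None:
        print(f"slice at z={z:.3f}: {len(section.entities)} curve entities")
```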
  • the method 20 may further include operation S205.
  • In operation S205, relevant information of the virtual scene is displayed based on the virtual scene corresponding to the physical scene. For example, the virtual scene is displayed in three dimensions.
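  • The overall flow of method 20 can be summarized by the following minimal, runnable Python skeleton; every helper and data structure in it is a trivial placeholder standing in for the operations described above, not the actual implementation.

```python
def determine_scene_boundary(interaction_info):
    # Operation S201: determine the scene boundary from the interaction information.
    return interaction_info["gesture_polygon"]

def find_entities_and_capture_video(scene_boundary, camera_feed):
    # Operation S202: determine physical entities within the boundary and capture video.
    # The "inside" flag is a placeholder for the geometric boundary test.
    entities = [e for e in camera_feed["entities"] if e["inside"]]
    return entities, camera_feed["video"]

def determine_model_data(video_data, entity):
    # Determine model data of the virtual entity from the video data (placeholder).
    return {"entity": entity["name"], "frames_used": len(video_data)}

def create_virtual_scene(model_data):
    # Create the virtual scene corresponding to the physical scene (placeholder).
    return {"virtual_entities": model_data}

def display(virtual_scene):
    # Operation S205: display relevant information of the virtual scene (placeholder).
    print("rendering", len(virtual_scene["virtual_entities"]), "virtual entities")

interaction_info = {"gesture_polygon": [(0, 0), (1, 0), (1, 1)]}
camera_feed = {
    "entities": [{"name": "desk lamp", "inside": True}, {"name": "sofa", "inside": False}],
    "video": ["frame"] * 300,
}

boundary = determine_scene_boundary(interaction_info)
entities, video = find_entities_and_capture_video(boundary, camera_feed)
models = [determine_model_data(video, e) for e in entities]
display(create_virtual_scene(models))
```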
  • various three-dimensional rendering engines can be used to visualize the virtual scene.
  • the 3D rendering engine can generate displayable 2D images from digital 3D scenes.
  • the generated two-dimensional images can be realistic or non-realistic.
  • the three-dimensional rendering process relies on a 3D rendering engine.
  • Example rendering engines in this disclosure may use 'ray tracing' technology, which generates an image by tracing rays from a camera through a virtual plane of pixels and simulating the effects of their encounters with objects.
  • Example rendering engines in this disclosure may also use 'rasterization' technology, which collects information about the various surface elements (primitives) in the scene to determine the value of each pixel in the two-dimensional image. This disclosure does not limit the types of 3D rendering engines or the technologies they use.
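  • As a self-contained toy illustration of the 'ray tracing' idea (not the engine used by the disclosure), the following sketch traces one ray per pixel from a camera through a virtual pixel plane and shades pixels where the ray hits a single sphere.

```python
import numpy as np

WIDTH, HEIGHT = 160, 120
camera = np.array([0.0, 0.0, 0.0])
sphere_center, sphere_radius = np.array([0.0, 0.0, -3.0]), 1.0
light = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)   # fixed light direction
image = np.zeros((HEIGHT, WIDTH))

for y in range(HEIGHT):
    for x in range(WIDTH):
        # Ray direction through the pixel on a virtual plane at z = -1.
        u = (x + 0.5) / WIDTH * 2 - 1
        v = 1 - (y + 0.5) / HEIGHT * 2
        direction = np.array([u, v, -1.0])
        direction /= np.linalg.norm(direction)

        # Ray-sphere intersection: solve |o + t*d - c|^2 = r^2 for t (unit d, so a = 1).
        oc = camera - sphere_center
        b = 2.0 * np.dot(direction, oc)
        c = np.dot(oc, oc) - sphere_radius ** 2
        disc = b * b - 4 * c
        if disc >= 0:
            t = (-b - np.sqrt(disc)) / 2.0
            if t > 0:
                hit = camera + t * direction
                normal = (hit - sphere_center) / sphere_radius
                # Simple Lambertian shading of the hit point.
                image[y, x] = max(0.0, float(np.dot(normal, light)))

print("brightest pixel value:", image.max())
```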
  • Therefore, method 20 uses video data to realize scene virtualization, which helps to solve the technical problem that the scene model generation process is highly complex and time-consuming.
  • Examples of operations S201 to S202 are further described next with reference to FIGS. 4 and 5.
  • FIG. 4 is a schematic diagram showing an example interface change when the terminal obtains interaction information according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram illustrating obtaining interaction information according to an embodiment of the present disclosure.
  • the terminal 120 may be equipped with a scene modeling application and/or a virtual reality application.
  • the terminal 120 may trigger a "gesture circle" related function for obtaining interactive information indicating the scene boundary.
  • With the terminal 120 being smart glasses or a smartphone, the 7 example physical entities in the left picture of Figure 4 can be seen through the smart glasses or via the camera of the smartphone.
  • the smart glasses or smartphone will capture the user's gestures. For example, a user may draw an irregular area in the air with his hand in front of smart glasses.
  • the user may hold a smartphone with one hand and use the other hand to mark an irregular area in the area that can be captured by the camera of the smartphone.
  • The smart glasses or smartphone will recognize the gesture to obtain a scene boundary that can be described by a continuous vector.
  • a convex polygonal closed area can be generated as shown in Figures 4 and 5.
  • The camera component (for example, the camera of smart glasses or a smartphone) determines the distances from multiple points on the convex polygon closed area to the vertical plane where the starting point is located, and from these distances the shortest distance is selected as the shortest distance corresponding to the convex polygon closed area.
  • The first vertical plane is determined based on the shortest distance corresponding to the convex polygon closed area. For example, the first vertical plane is perpendicular to the horizontal plane, and the horizontal distance between the first vertical plane and the camera component is the shortest distance corresponding to the convex polygon closed area.
  • a circular planar area is determined based on the first vertical plane. The circular planar area is used to assist in determining whether a certain physical entity is located within the scene boundary.
  • For example, the highest point and the lowest point on the convex polygon closed area can be projected onto the first vertical plane, the line connecting the projection of the highest point and the projection of the lowest point on the first vertical plane is used as the diameter, and the center of that connecting line is used as the center of the circle to determine the circular planar area.
  • Alternatively, the leftmost point and the rightmost point on the convex polygon closed area can be projected onto the first vertical plane, the line connecting the projections of the leftmost point and the rightmost point on the first vertical plane is used as the diameter, and the center of that line is used as the center of the circle to determine the circular planar area.
  • Alternatively, the longest diagonal line of the convex polygon closed area can be projected onto the first vertical plane, the projection of the longest diagonal line is used as the diameter, and the center of that projection is used as the center of the circle to determine the circular planar area.
  • the present disclosure does not further limit the manner of determining the circular planar area.
  • The camera component determines the distances from multiple points on the edge of the physical entity to the vertical plane where the starting point is located. Based on these distances, the shortest distance corresponding to the physical entity is selected. Based on the shortest distance corresponding to the physical entity, the second vertical plane is determined. For example, the second vertical plane is perpendicular to the horizontal plane, and the horizontal distance between the second vertical plane and the camera component is the shortest distance corresponding to the physical entity.
  • a circular planar area that is proportionally expanded is determined on the second vertical plane.
  • The ratio between the diameter of the circular planar area and the diameter of the proportionally expanded circular planar area is equal to the ratio of the shortest distance corresponding to the convex polygon closed area to the shortest distance corresponding to the physical entity, and the center of the circular planar area and the center of the proportionally expanded circular planar area are on the same horizontal line.
  • If the projections of the physical entity onto the proportionally enlarged circular planar area all fall within that area, it can be determined that the physical entity is within the scene boundary. As shown in Figures 4 and 5, it can be determined that the physical entities marked in gray are within the scene boundary, while the physical entities marked in white are outside the scene boundary. Therefore, determining the first vertical plane and the second vertical plane based on the shortest horizontal distance corresponding to the convex polygon closed area can achieve smaller errors. Of course, the present disclosure is not limited to this.
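  • The boundary test described above can be sketched as follows; the sketch assumes, purely for illustration, that the camera component sits at the origin looking along the +x axis, so that a vertical plane is a plane of constant x and a point's horizontal distance is simply its x coordinate.

```python
import numpy as np

def circle_from_gesture(polygon_pts):
    """Project the convex polygon closed area onto the first vertical plane and fit a circle."""
    polygon_pts = np.asarray(polygon_pts, dtype=float)
    d1 = polygon_pts[:, 0].min()                   # shortest horizontal distance to the polygon
    proj = polygon_pts[:, 1:]                      # (y, z) projections onto the plane x = d1
    top, bottom = proj[proj[:, 1].argmax()], proj[proj[:, 1].argmin()]
    center = (top + bottom) / 2.0                  # midpoint of the highest/lowest projections
    diameter = float(np.linalg.norm(top - bottom))
    return d1, center, diameter

def entity_within_boundary(polygon_pts, entity_edge_pts):
    """Scale the circle onto the second vertical plane and test the entity's projection."""
    d1, center, diameter = circle_from_gesture(polygon_pts)
    entity_edge_pts = np.asarray(entity_edge_pts, dtype=float)
    d2 = entity_edge_pts[:, 0].min()               # shortest distance corresponding to the entity
    radius = (diameter * d2 / d1) / 2.0            # proportionally enlarged circular area
    offsets = entity_edge_pts[:, 1:] - center      # entity projection relative to the circle center
    return bool((np.linalg.norm(offsets, axis=1) <= radius).all())

gesture = [(1.0, -0.2, 1.1), (1.1, 0.3, 1.4), (1.05, 0.0, 0.9)]   # gesture polygon points
lamp_edges = [(2.0, 0.1, 1.4), (2.2, -0.1, 1.0)]                  # edge points of a physical entity
print(entity_within_boundary(gesture, lamp_edges))                # True: inside the boundary
```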
  • Figures 4 and 5 are only an example solution of using a hand tracking solution to obtain interaction information indicating scene boundaries and determine physical entities within the scene boundaries, and the disclosure is not limited thereto.
  • For example, the virtual reality application can first determine, through infrared sensing or dynamic image recognition, multiple physical entities that the camera component can capture, and prompt the user through a voice or text dialog box to select from the multiple physical entities. In such a case, the information about the user's selection from the plurality of physical entities serves as the interaction information indicating the scene boundary.
  • As another example, the virtual reality application can first capture a static image, perform edge extraction on the static image, and draw buttons covering the captured physical entities on the static image. The user can trigger such a button by clicking, touching, gesture instructions, or other methods to select the physical entity that needs to be virtualized from the multiple physical entities. In this case, the information that the user triggered the button can also be used as the interaction information indicating the scene boundary.
  • the camera component will capture video data corresponding to the physical entities within the scene boundaries.
  • The camera component can continuously adjust the shooting parameters automatically or manually during the shooting period, for example, adjusting the focus, focal length, and position of the camera component, intermittently turning on the flash, intermittently turning on the high beam, or intermittently turning on the low beam, so as to capture video data corresponding to the physical entity such that the video data includes more information.
  • Alternatively, the camera component may not make any adjustments to the shooting parameters during the shooting period. Since changes in ambient light that can be captured by the device often occur during the operation of virtual reality applications, the captured video data often includes enough information to provide sufficient model data for the virtual entity.
  • In this way, various aspects of the present disclosure use rich human-computer interaction methods to provide, through virtual reality applications, interaction information for indicating scene boundaries, which makes it easy to determine the physical entities within the scene boundaries and provides sufficient model data for the subsequent creation of virtual scenes.
  • Processing of video frames is further described next with reference to FIGS. 6 to 8. FIG. 6 is a schematic diagram illustrating processing of video frames according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram illustrating processing of video frames in combination with building information according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram illustrating processing of video frames in combination with geographical information according to an embodiment of the present disclosure.
  • Operation S202 includes: extracting a plurality of discrete points from each video frame in the video data; generating, based on the plurality of discrete points of each video frame, three-dimensional model data represented by Thiessen polygons as the three-dimensional model data of that video frame; and determining the model data of the virtual entity corresponding to the physical entity based on the three-dimensional model data of each video frame.
  • FIG. 6 shows an example of a scene modeling application and/or a virtual reality application for one video frame in video data.
  • the video data captures a physical entity shown in the form of a cup.
  • FIG. 6 is only a schematic diagram for illustrating the solution of the present disclosure, and real video data may also include more or less pixels and information in a single video frame.
  • a scene modeling application and/or a virtual reality application will extract video frames marked as 601 from the video data.
  • a plurality of discrete points marked with black dots in the image marked as 602 can be extracted from the video frame marked as 601.
  • Each of the plurality of discrete points indicates information associated with a physical entity. Examples of discrete points can be the apex of the cup, the center point, the feature point, and the point where the light and dark changes are the most dramatic. As an example, 20 to 30 discrete points can be extracted in a single video frame. Of course, the embodiments of the present disclosure are not limited to this.
  • Discrete points can be extracted in various ways, and the present disclosure does not limit the way of extracting discrete points.
  • a grayscale image can be generated from the video frame to determine the change in light and dark of each pixel from the grayscale image. Then, a heat map is generated based on the light and dark changes of each pixel to obtain the light and dark change distribution of the video frame. Based on the light and dark change distribution, the coordinates of a plurality of discrete points are determined, and these discrete points all indicate the light and dark change information of the video frame.
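  • A minimal sketch of this brightness-change route is shown below; the gradient magnitude stands in for the light/dark change heat map, the 25-point budget follows the 20-to-30-point example above, and the frame file name is a placeholder.

```python
import cv2
import numpy as np

def discrete_points_from_frame(frame_bgr, max_points: int = 25):
    """Pick the coordinates with the strongest light/dark changes in a video frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    heat = cv2.magnitude(gx, gy)                       # light/dark change heat map

    # Keep the max_points strongest responses as discrete points (x, y).
    flat = np.argpartition(heat.ravel(), -max_points)[-max_points:]
    ys, xs = np.unravel_index(flat, heat.shape)
    return np.stack([xs, ys], axis=1)

# "frame_601.png" is a placeholder for a single video frame such as the one marked 601.
frame = cv2.imread("frame_601.png")
points = discrete_points_from_frame(frame)
print(f"extracted {len(points)} discrete points")
```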
  • a neural network can be used to intelligently identify multiple discrete points in the video frame, and each discrete point can be a feature point in the video frame.
  • Various neural network models can be used to determine these discrete points; for example, a deep neural network (DNN) model, a factorization machine (FM) model, and the like can be used. These neural network models can be implemented as acyclic graphs, in which neurons are arranged in different layers.
  • a neural network model includes an input layer and an output layer, which are separated by at least one hidden layer. The hidden layer transforms the input received by the input layer into a representation useful for generating the output in the output layer.
  • Network nodes are fully connected to nodes in adjacent layers via edges, and there are no edges between nodes within each layer.
  • Data received at the nodes of the input layer of the neural network are propagated to the nodes of the output layer via any of hidden layers, activation layers, pooling layers, convolutional layers, etc.
  • the input and output of the neural network model can take various forms, and this disclosure does not limit this.
  • Three-dimensional model data represented by Thiessen polygons can be generated based on the extracted discrete points. For example, a discrete point can be randomly selected from these discrete points as the first discrete point; the point closest to it is found as the second discrete point, and the line connecting the first and second discrete points serves as the first baseline. The point closest to the first baseline is then found as the third discrete point; the line connecting the first and third discrete points serves as the second baseline, and the line connecting the second and third discrete points serves as the third baseline.
  • the first baseline, the second baseline and the third baseline form the triangle marked in box 603.
  • Thiessen polygon generation method is used to form a three-dimensional model structure.
  • Thiessen polygon generation takes any discrete point as the center point, connects the center point to multiple surrounding discrete points, and then constructs the perpendicular bisectors of these connecting lines.
  • The polygon formed by the intersections of these perpendicular bisectors (which is therefore called the adjacent range of the center point) is a Thiessen polygon.
  • a three-dimensional model structure represented by Thiessen polygons can be generated.
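  • The following sketch illustrates the construction just described using SciPy: the Delaunay triangulation corresponds to the baseline triangles, and the Voronoi diagram gives the Thiessen polygons; the point coordinates are arbitrary illustrative values.

```python
import numpy as np
from scipy.spatial import Delaunay, Voronoi

# Discrete points of one video frame (arbitrary illustrative pixel coordinates).
points = np.array([[12, 40], [55, 42], [33, 80], [60, 75],
                   [20, 65], [48, 20], [70, 50], [15, 15]], dtype=float)

tri = Delaunay(points)    # baseline triangles connecting nearby discrete points
vor = Voronoi(points)     # perpendicular bisectors of those connections bound each cell

print("triangles:", tri.simplices.tolist())
region = vor.regions[vor.point_region[0]]          # Thiessen cell around the first point
cell = [vor.vertices[i].round(1).tolist() for i in region if i != -1]
print("Thiessen polygon of point 0:", cell)
```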
  • The similarity between discrete points extracted from different video frames can be used to determine the same discrete point across multiple video frames. Combined with the principle that nearer objects appear larger and farther objects appear smaller, the depth information at each discrete point can be calculated. The depth information at each discrete point serves as an example of the model data of the virtual entity corresponding to the physical entity.
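  • One possible (non-limiting) way to relate discrete points across frames and estimate a relative depth change from the 'near large, far small' principle is sketched below; ORB features and a brute-force matcher are assumptions for illustration, and the depth relation is a simplified pinhole-camera approximation.

```python
import cv2
import numpy as np

def relative_depth_change(frame_a, frame_b):
    """Match discrete feature points across two frames and compare their apparent size."""
    orb = cv2.ORB_create(nfeatures=500)
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    kp_a, des_a = orb.detectAndCompute(gray_a, None)
    kp_b, des_b = orb.detectAndCompute(gray_b, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)[:50]
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])

    # Under a simple pinhole model, apparent size is inversely proportional to depth,
    # so the size ratio approximates the relative depth change between the two frames.
    size_a = pts_a.std(axis=0).mean()
    size_b = pts_b.std(axis=0).mean()
    return size_a / size_b   # > 1 suggests the entity is farther away in frame_b

# Example usage with two frames sampled earlier, e.g.:
# print(relative_depth_change(frames[0], frames[-1]))
```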
  • The model data of the virtual entity may also be a building information model (BIM), whose full English name is Building Information Modeling.
  • a BIM model not only contains the three-dimensional model of the building, but also can set the building's material properties, color, designer, manufacturer, builder, inspector, date and time, area, volume and other information.
  • Each monitored virtual entity can be set in the BIM model as an entity object, which correspondingly includes the object identification, the object's geometric data, the object's reference geometric data, the object's real-time collected data, and so on. This disclosure is not limited in this regard.
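  • A minimal sketch of such an entity object is given below; the field names are illustrative placeholders rather than a standard BIM schema.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class BimEntityObject:
    object_id: str                               # object identification
    geometry: dict[str, Any]                     # the object's geometric data
    reference_geometry: dict[str, Any]           # the object's reference geometric data
    realtime_data: dict[str, Any] = field(default_factory=dict)  # real-time collected data
    attributes: dict[str, Any] = field(default_factory=dict)     # material, color, designer, ...

desk_lamp = BimEntityObject(
    object_id="lamp-001",
    geometry={"height_m": 0.45, "vertices": []},
    reference_geometry={"room": "study", "anchor": (1.2, 0.3, 0.0)},
    attributes={"material": "aluminium", "manufacturer": "unknown"},
)
desk_lamp.realtime_data["illuminance_lux"] = 320
print(desk_lamp)
```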
  • the global geographical location information corresponding to the large building can be further combined to determine the model data of the virtual entity corresponding to the physical entity.
  • the global geographical location information may be information found in a map database based on some characteristics of the physical entity. For example, the longitude and latitude information corresponding to the physical entity can be found through various navigation map applications as global geographical location information.
  • the location of the physical entity within a certain range from the mobile phone can be further determined based on the location data of the terminal 120 determined by the positioning module of the terminal 120 (eg, GPS positioning module, Beidou system positioning module). This disclosure does not further limit global geographical location information.
  • the model data of the virtual entity corresponding to the physical entity can be further combined with the building positioning spatial data corresponding to the large building.
  • the terminal 120 can be used to retrieve the building positioning space data of the corresponding building from the building positioning space database, which includes the building's length, width and height data, wall data, various design data of the building when submitting for approval, and so on. This disclosure does not further limit the building positioning spatial data.
  • lighting information can be extracted from the above-mentioned video data, and then the lighting information can be combined with the above-mentioned building information model to determine the model data of the virtual entity corresponding to the physical entity.
  • For example, the method described in Figure 6 can be used to generate the three-dimensional model data of each video frame in the video data, and one or more of the three-dimensional model data, the building information model, the global geographical location information, and the building positioning spatial data can be combined to determine the model data of the virtual entity corresponding to the physical entity, thereby enabling the presentation of virtual scenes under different lighting conditions. This disclosure does not limit this.
  • FIG. 9 is an architectural schematic diagram illustrating a scene modeling application and/or a virtual reality application according to an embodiment of the present disclosure.
  • video data can be obtained from a data collection module (such as a camera), and then the video data can be initially parsed through the underlying functional module.
  • The supporting components of the data collection module can include any hardware device SDK or WebSocket client, while the underlying functional modules include a serialization function module that generates a serialized Xml/Json file summary based on the video data, an activity monitoring function module that determines the activity of each program/service, a file format conversion module, and so on.
  • the I/O module can also be used to process the video data into a transferable file.
  • the I/O module may include multiple service modules, such as a file monitoring module that provides file monitoring services, a file transfer module that is used to transfer files via FTP, and so on.
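  • As an illustration of the serialization and file-transfer services described above, the following sketch writes a Json summary of the captured video data and uploads it over FTP; the host, credentials, and file names are placeholder assumptions.

```python
import json
from ftplib import FTP

# Placeholder summary of captured video data to be serialized.
summary = {
    "terminal_id": "terminal-120",
    "video_file": "physical_entity.mp4",
    "frame_count": 300,
    "entities": ["desk lamp", "locker", "book"],
}
with open("capture_summary.json", "w", encoding="utf-8") as fp:
    json.dump(summary, fp, ensure_ascii=False, indent=2)

# Transfer the serialized file to the server via FTP (placeholder host and credentials).
with FTP("server.example.com") as ftp:
    ftp.login("user", "password")
    with open("capture_summary.json", "rb") as fp:
        ftp.storbinary("STOR capture_summary.json", fp)
```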
  • the scene modeling application and/or the virtual reality application installed on the terminal 120 transmits the video data in file form to the server 110 for further analysis.
  • Similarly, the server 110 also includes a communication module.
  • Its support components can likewise include any hardware device SDK or WebSocket client, and, to increase the transmission speed, a pipeline transmission module can also be included.
  • the server 110 also includes various databases, such as a model database, a material database, and a texture database. The server 110 may use its analysis module to perform the above operation S202 in combination with various databases, and then return the model data of the virtual entity to the scene modeling application and/or the virtual reality application.
  • The scene modeling application and/or the virtual reality application will use the rule conversion module to convert the rules of the physical world into the rules of the virtual scene (for example, by performing coordinate conversion), and combine the rules of the virtual scene to create the virtual scene corresponding to the physical scene.
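  • A simple example of such a rule conversion is a coordinate conversion from physical-world coordinates to virtual-scene coordinates, sketched below; the scale factor, offset, and axis convention are illustrative assumptions.

```python
import numpy as np

SCALE = 100.0                          # e.g. 1 m in the physical world -> 100 scene units
OFFSET = np.array([500.0, 0.0, 500.0]) # shift into the virtual scene's coordinate range

def physical_to_virtual(point_xyz):
    """Map a physical-world point (metres) to virtual-scene coordinates."""
    transform = np.eye(4)
    transform[:3, :3] *= SCALE
    transform[:3, 3] = OFFSET
    p = np.append(np.asarray(point_xyz, dtype=float), 1.0)   # homogeneous coordinates
    return (transform @ p)[:3]

# e.g. scene coordinates of the desk lamp anchor point used earlier.
print(physical_to_virtual([1.2, 0.3, 0.0]))
```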
  • the terminal that receives the model data of the virtual entity is not necessarily the terminal that sends the video data file.
  • terminal A can be used to collect video data and send it to the server, and then the server sends the model data to terminal B, thereby realizing remote multi-location collaborative operations.
  • the scene modeling application and/or the virtual reality application may also include a rendering process and a control process to realize the visualization process of the virtual scene.
  • the rendering process and the control process can communicate with each other to realize the visualization of the virtual scene.
  • the rendering process also provides simulation feedback information to the control process to indicate the above-mentioned comparison information between the virtual scene and the physical scene.
  • the present disclosure is not limited to this.
  • Various embodiments of the present disclosure are highly scalable: they can not only be combined with various gesture recognition algorithms for in-depth vertical development to provide model data and auxiliary data to ordinary users of the terminal 120, but can also be horizontally expanded to provide scene supervision services to supervisors in certain special industries and realize real-time scene detection through real-scene restoration.
  • various embodiments of the present disclosure can also be output as a JAR package/dynamic link library that can be used by the corresponding platform for integration with multiple systems.
  • FIG. 10 is a schematic diagram illustrating the operation of a rendering engine according to an embodiment of the present disclosure.
  • Operation S204 includes: selecting multiple video frames from the video data; performing texture compression and/or texture scaling on the multiple video frames to generate texture map data; and rendering the virtual scene corresponding to the physical scene based on the texture map data and displaying the rendered virtual scene.
  • The OpenGL ES interface glCompressedTexImage2D(..., format, ..., data) can be used to perform texture compression on the multiple video frames. It is worth noting that this disclosure does not limit the format of the texture data, which can be converted into any format according to the supplier's SDK or documentation. For example, assume that the display screen of the terminal 120 is adapted to 32 MB of display memory; a 2 MB single video frame image can be texture-compressed to generate map data in ETC (Ericsson Texture Compression) format, so that map data for more than 16 textures can be held.
  • The map data obtained after texture compression may be distorted in scale; therefore, texture scaling can be used in the 3D rendering engine to further adjust the map data. For example, material (Material) resource data (for example, the Material A to Material C parameters shown in Figure 10) can be generated based on the map data. Based on the material resource data, the rendering engine will correspondingly generate texture (Texture) resource data (for example, the color, highlight, metal, and other parameters shown in Figure 10).
  • Then, the parameters corresponding to the texture scaling process can be determined (for example, the pixel data in some maps can be directly characterized by texture scaling parameters); based on the parameters corresponding to the texture scaling process, texture scaling can be further performed on the map data to further reduce the file size of the map data and ensure the running speed of the virtual reality application.
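  • The texture budget logic described above can be sketched as follows; Pillow is used here only for resizing, while actual ETC compression and upload would go through the supplier's SDK or glCompressedTexImage2D, which is not reproduced. The memory budget, texture count, and file name are illustrative assumptions.

```python
from PIL import Image

DISPLAY_MEMORY_BUDGET = 32 * 1024 * 1024   # e.g. 32 MB of texture memory (assumption)
TEXTURE_COUNT = 16                         # keep room for at least 16 textures
BYTES_PER_PIXEL = 4                        # uncompressed RGBA; ETC would be several times smaller

def scaled_texture(path: str) -> Image.Image:
    """Downscale a frame so that it fits its share of the display-memory budget."""
    image = Image.open(path).convert("RGBA")
    budget = DISPLAY_MEMORY_BUDGET // TEXTURE_COUNT
    size_bytes = image.width * image.height * BYTES_PER_PIXEL
    if size_bytes <= budget:
        return image
    # Texture-scaling parameter: shrink both dimensions by the same factor.
    factor = (budget / size_bytes) ** 0.5
    new_size = (max(1, int(image.width * factor)), max(1, int(image.height * factor)))
    return image.resize(new_size, Image.LANCZOS)

# "frame_601.png" is a placeholder for one of the selected video frames.
tile = scaled_texture("frame_601.png")
print(tile.size)
```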
  • Various embodiments of the present disclosure use video data to realize scene virtualization, which helps to solve the technical problem that the scene model generation process is highly complex and time-consuming.
  • A device for virtualizing a physical scene includes: a first module configured to determine the scene boundary based on interaction information used to indicate the scene boundary, determine a physical entity within the scene boundary, and capture video data corresponding to the physical entity; a second module configured to determine model data of a virtual entity corresponding to the physical entity based on the video data corresponding to the physical entity; and a third module configured to create a virtual scene corresponding to the physical scene based on the model data corresponding to the virtual entity.
  • the video data includes a plurality of video frames, and different video frames among the plurality of video frames correspond to different lighting conditions, shooting positions or shooting angles.
  • The second module is further configured to: extract multiple discrete points from each video frame in the video data; generate, based on the multiple discrete points of each video frame, three-dimensional model data represented by Thiessen polygons as the three-dimensional model data of that video frame; and determine the model data of the virtual entity corresponding to the physical entity based on the three-dimensional model data of each video frame.
  • The second module is further configured to: obtain one or more of a building information model, global geographical location information, and building positioning spatial data; and determine the model data of the virtual entity corresponding to the physical entity using the video data corresponding to the physical entity, based on the one or more of the building information model, the global geographical location information, and the building positioning spatial data.
  • The second module is further configured to: obtain one or more of urban traffic data, urban planning data, and urban municipal data; and determine the model data of the virtual entity corresponding to the physical entity using the video data corresponding to the physical entity, based on the one or more of the urban traffic data, the urban planning data, and the urban municipal data.
  • the device further includes a fourth module configured to: display relevant information of the virtual scene based on the virtual scene corresponding to the physical scene.
  • Displaying the relevant information of the virtual scene further includes: selecting multiple video frames from the video data; performing texture compression and/or texture scaling on the multiple video frames to generate texture map data; and rendering the virtual scene corresponding to the physical scene based on the texture map data and displaying the rendered virtual scene.
  • Performing texture compression and/or texture scaling on the plurality of video frames to generate texture map data further includes: performing texture compression on the plurality of video frames to generate texture-compressed map data; determining, based on the texture-compressed map data, the material resource data and texture resource data corresponding to the map data; determining the parameters corresponding to the texture scaling process based on the material resource data and texture resource data corresponding to the map data; and performing, based on the parameters corresponding to the texture scaling process, texture scaling on the texture-compressed map data to generate texture-scaled map data.
  • an electronic device is also provided for implementing the method according to the embodiment of the present disclosure.
  • Figure 11 shows a schematic diagram of an electronic device 2000 according to an embodiment of the present disclosure.
  • the electronic device 2000 may include one or more processors 2010 and one or more memories 2020 .
  • The memory 2020 stores computer-readable code, and when the computer-readable code is run by the one or more processors 2010, the method described above can be performed.
  • the processor in the embodiment of the present disclosure may be an integrated circuit chip with signal processing capabilities.
  • The above-mentioned processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • Each method, operation, and logical block diagram disclosed in the embodiments of the present disclosure may be implemented or executed.
  • the general-purpose processor can be a microprocessor or the processor can be any conventional processor, etc., which can be of X86 architecture or ARM architecture.
  • The various example embodiments of the present disclosure may be implemented in hardware or special purpose circuits, software, firmware, logic, or any combination thereof. Some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software that may be executed by a controller, microprocessor, or other computing device. While aspects of embodiments of the present disclosure are illustrated or described as block diagrams, flowcharts, or using some other graphical representation, it will be understood that the blocks, devices, systems, techniques, or methods described herein may be implemented, as non-limiting examples, in hardware, software, firmware, special purpose circuitry or logic, general purpose hardware or controllers, or other computing devices, or some combination thereof.
  • The computing device 3000 may include a bus 3010, one or more CPUs 3020, a read-only memory (ROM) 3030, a random access memory (RAM) 3040, a communication port 3050 connected to a network, an input/output component 3060, a hard disk 3070, and so on.
  • the storage device in the computing device 3000 such as the ROM 3030 or the hard disk 3070, may store various data or files used for processing and/or communication of the methods provided by the present disclosure, as well as program instructions executed by the CPU.
  • Computing device 3000 may also include user interface 3080.
  • the architecture shown in FIG. 12 is only exemplary, and when implementing different devices, one or more components of the computing device shown in FIG. 12 may be omitted according to actual needs.
  • Figure 13 shows a schematic diagram of a storage medium 4000 according to the present disclosure.
  • Computer readable instructions 4010 are stored on the computer storage medium 4020. When the computer readable instructions 4010 are executed by a processor, the methods according to the embodiments of the present disclosure described with reference to the above figures may be performed.
  • Computer-readable storage media in embodiments of the present disclosure may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • Non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • by way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DR RAM).
  • Embodiments of the present disclosure also provide a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the method according to the embodiment of the present disclosure.
  • each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

本公开提供一种对物理场景进行虚拟化的方法、电子设备、计算机可读存储介质和计算机程序产品。所述方法包括基于用于指示场景边界的交互信息,确定所述场景边界内的物理实体,并捕获所述物理实体对应的视频数据;基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据;以及基于所述虚拟实体对应的模型数据,创建所述物理场景对应的虚拟场景。

Description

对物理场景进行虚拟化的方法、电子设备、计算机可读存储介质和计算机程序产品
相关申请的交叉引用
本申请要求于2022年5月31日提交的中国专利申请第202210614156.3的优先权,该中国专利申请的全文通过引用的方式结合于此以作为本申请的一部分。
技术领域
本公开涉及虚拟现实以及数字孪生领域,更具体地涉及一种对场景进行虚拟化的方法、电子设备、计算机可读存储介质和计算机程序产品。
背景技术
数字孪生(英语:Digital Twins)是充分利用物理模型、传感器更新、运行历史等数据,集成多学科、多物理量、多尺度、多概率的仿真过程,在虚拟空间中完成映射,从而反映相对应的物理实体的全生命周期过程。数字孪生是一种超越现实的概念,可以被视为一个或多个重要的、彼此依赖的装备系统的数字映射系统。
数字孪生技术还可以与扩展现实技术(XR,Extended Reality)进行结合。扩展现实技术具体包含虚拟现实技术(VR,Virtual Reality)、增强现实技术(AR,Augmented Reality)、混合现实技术(MR,Mixed Reality)等。
数字孪生技术已经广泛应用于工程建设领域,尤其是三维场景建模领域。基于三维场景模型的可视化三维场景应用已经广泛流行。目前存在三维引擎可以助力可视化三维场景应用研发。此外,由于三维场景的虚拟化属性,往往涉及到场景建模应用与虚拟现实应用同时运行的情况。然而,当前的三维场景建模方案的模型生成过程不仅复杂度高耗时长,还需要提前采集大量的数据,因此在实际应用的过程中往往出现卡顿和模拟的虚拟场景的真实度太低的情况。
为此,本公开提出了一种对场景进行虚拟化的方法、电子设备、计算机可读存储介质和计算机程序产品,以解决场景虚拟化过程中的计算复杂度高并且耗时长的技术问题。
发明内容
本公开的实施例提供了一种对物理场景进行虚拟化的方法,包括:基于用于指示场景边界的交互信息,确定所述场景边界;基于所述场景边界,确定所述场景边界内的物理实体,并捕获所述物理实体对应的视频数据;基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据;以及基于所述虚拟实体对应的模型数据,创建所述物理场景对应的虚拟场景。
例如,所述视频数据包括多个视频帧,所述多个视频帧中不同的视频帧对应于不同的光照条件、拍摄位置或拍摄角度。
例如,所述基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据还包括:从所述视频数据中的每个视频帧提取多个离散点;基于每个视频帧的多个离散点,生成以泰森多边形表征的立体模型数据作为所述视频帧的立体模型数据;基于各个视频帧的立体模型数据,确定与所述物理实体对应的虚拟实体的模型数据。
例如,所述基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据还包括:获取建筑信息模型、全球地理位置信息和建筑定位空间数据中的一项或多项;基于所述建筑信息模型、所述全球地理位置信息和所述建筑定位空间数据中的一项或多项,利用所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据。
例如,所述基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据还包括:获取城市交通数据、城市规划数据、城市市政数据中的一项或多项;基于所述城市交通数据、城市规划数据、城市市政数据中的一项或多项,利用所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据。
例如,所述的方法还包括:基于所述物理场景对应的虚拟场景,显示所述虚拟场景的相关信息。
例如,所述显示所述虚拟场景的相关信息还包括:从所述视频数据中,选择多个视频帧;对所述多个视频帧进行纹理压缩和/或纹理缩放处理,以生成贴图数据;基于所述贴图数据,对所述物理场景对应的虚拟场景进行渲染,显示渲染后的虚拟场景。
例如,所述对所述多个视频帧进行纹理压缩和/或纹理缩放处理,以生成贴图数据还包括:对所述多个视频帧进行纹理压缩,以生成纹理压缩后的贴图数据;基于纹理压缩后的贴图数据,确定所述贴图数据对应的材质资源数据和材料资源数据;基于所述贴图数据对应的材质资源数据和材料资源数据,确定纹理缩放处理对应的参数;基于纹理缩放处理对应的参数,对纹理压缩后的贴图数据进行纹理缩放处理,以生成纹理缩放处理后的贴图数据。
本公开的一些实施例提供了一种电子设备,包括:处理器;存储器,存储器存储有计算机指令,该计算机指令被处理器执行时实现上述的方法。
本公开的一些实施例提供了一种计算机可读存储介质,其上存储有计算机指令,所述计算机指令被处理器执行时实现上述的方法。
本公开的一些实施例提供了一种计算机程序产品,其包括计算机可读指令,所述计算机可读指令在被处理器执行时,使得所述处理器执行上述的方法。
由此,针对应用业务可视化与场景虚拟化的需求,本公开的各个实施例利用视频数据来实现场景的虚拟化,有助于解决场景模型生成过程复杂度高并且耗时长的技术问题。
附图说明
为了更清楚地说明本公开实施例的技术方案,下面将对实施例的附图作简单地介绍,显而易见地,下面描述的附图仅仅涉及本公开的一些实施例,而非对本公开的限制。
图1是示出根据本公开实施例的应用场景的示例示意图。
图2是示出根据本公开实施例的对物理场景进行虚拟化的示例方法的流程图。
图3是示出根据本公开实施例的物理场景、交互信息和物理实体的示意图。
图4是示出根据本公开的实施例的终端在获取交互信息时的示例界面变化示意图。
图5是示出根据本公开实施例的获取交互信息的示意图。
图6是示出根据本公开的实施例的对视频帧的处理的示意图。
图7是示出根据本公开的实施例的结合建筑信息的对视频帧的处理的示意图。
图8是示出根据本公开的实施例的结合地理信息对视频帧的处理的示意图。
图9是示出根据本公开的实施例的场景建模应用和/或虚拟现实应用的架构示意图。
图10是示出根据本公开的实施例的渲染引擎的操作示意图。
图11示出了根据本公开实施例的电子设备的示意图。
图12示出了根据本公开实施例的示例性计算设备的架构的示意图。
图13示出了根据本公开实施例的存储介质的示意图。
具体实施方式
为了使得本公开的目的、技术方案和优点更为明显,下面将参照附图详细描述根据本公开的示例实施例。显然,所描述的实施例仅仅是本公开的一部分实施例,而不是本公开的全部实施例,应理解,本公开不受这里描述的示例实施例的限制。
在本说明书和附图中,具有基本上相同或相似操作和元素用相同或相似的附图标记来表示,且对这些操作和元素的重复描述将被省略。同时,在本公开的描述中,术语“第一”“第二”等字样用于对作用和功能基本相同的相同项或相似项进行区分,应理解,“第一”、“第二”、“第n”之间不具有逻辑或时序上的依赖关系,也不对数量和执行顺序进行限定。还应理解,尽管以下描述使用术语第一、第二等来描述各种元素,但这些元素不应受术语的限制。这些术语只是用于将一元素与另一元素区别分开。例如,在不脱离各种示例的范围的情况下,第一数据可以被称为第二数据,并且类似地,第二数据可 以被称为第一数据。第一数据和第二数据都可以是数据,并且在某些情况下,可以是单独且不同的数据。本申请中术语“至少一个”的含义是指一个或多个,本申请中术语“多个”的含义是指两个或两个以上,例如,多个音频帧是指两个或两个以上的音频帧。
应理解,在本文中对各种示例的描述中所使用的术语只是为了描述特定示例,而并非旨在进行限制。如在对各种示例的描述和所附权利要求书中所使用的那样,单数形式“一个(“a”“an”)”和“该”旨在也包括复数形式,除非上下文另外明确地指示。
还应理解,本文中所使用的术语“和/或”是指并且涵盖相关联的所列出的项目中的一个或多个项目的任何和全部可能的组合。术语“和/或”,是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本申请中的字符“/”,一般表示前后关联对象是一种“或”的关系。
还应理解,在本申请的各个实施例中,各个过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。还应理解,根据(基于)A确定B并不意味着仅仅根据(基于)A确定B,还可以根据(基于)A和/或其它信息来确定B。
还应理解,术语“包括”(也称“includes”、“including”、“Comprises”和/或“Comprising”)当在本说明书中使用时指定存在所陈述的特征、整数、操作、操作、元素、和/或部件,但是并不排除存在或添加一个或多个其他特征、整数、操作、操作、元素、部件、和/或其分组。
还应理解,术语“如果”可被解释为意指“当...时”(“when”或“upon”)或“响应于确定”或“响应于检测到”。类似地,根据上下文,短语“如果确定...”或“如果检测到[所陈述的条件或事件]”可被解释为意指“在确定...时”或“响应于确定...”或“在检测到[所陈述的条件或事件]时”或“响应于检测到[所陈述的条件或事件]”。
为便于描述本公开,以下介绍与本公开有关的概念。
首先,参照图1描述本公开的各个方面的应用场景。图1示出了根据本 公开实施例的应用场景100的示意图,其中示意性地示出了服务器110和多个终端120。终端120以及服务器110可以通过有线或无线通信方式进行直接或间接地连接,本公开在此不做限制。
如图1所示,本公开实施例采用互联网技术,尤其是物理网技术。物联网可以作为互联网的一种延伸,它包括互联网及互联网上所有的资源,兼容互联网所有的应用。随着物联网技术在各个领域的应用,出现了诸如智能家居、智能交通、智慧健康等各种新的智慧物联的应用领域。
根据本公开的一些实施例用于处理场景数据。这些场景数据可能是物联网技术相关的数据。场景数据包括XX。当然,本公开并不以此为限。
例如,根据本公开的一些实施例的方法可以全部或部分地搭载在服务器110上以对场景数据进行处理,例如,对图片形式的场景数据。例如,服务器110将用于分析场景数据,并基于分析结果确定模型数据。这里的服务器110可以是独立的服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统,还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、内容分发网络(CDN,Content Delivery Network)、定位服务以及大数据和人工智能平台等基础云计算服务的云服务器,本公开实施例对此不作具体限制。以下,又将服务器110称为云端。
例如,根据本公开实施例的方法还可以全部或部分地搭载在终端120上以对场景数据进行处理。例如,终端120将用于采集上述图片形式的场景数据。又例如,终端120将用于呈现场景数据以使得用户可在虚拟场景中与构建的三维模型进行交互。例如,终端120可以是一种交互装置,其能提供3D数字虚拟对象且包括用户界面的显示装置,可通过用户界面对3D数字虚拟对象进行显示,用户可以与交互装置进行信息交互,又例如,终端120还将用于分析上述的建筑数据。本公开对此并不进行限定。
例如,多个终端120中的每个终端可以是诸如台式计算机等的固定终端,诸如,智能手机、平板电脑、便携式计算机、手持设备、个人数字助理、智能可穿戴设备(例如,智能眼镜)、智能头戴设备、摄像机、车载终端等具有网络功能的移动终端,或者它们的任意组合,本公开实施例对此不作具体限制。 所述多个终端120中的每个终端还可以包括各种传感器或者数据采集装置,例如图1中所示的温度传感器等。在一些示例中,场景数据与光照条件相关,因此所述终端还可以是亮度传感器。在又一些示例中,终端120还可以是摄像机(例如红外摄像机)或者距离探测仪。
上述的各种终端120均可以结合增强现实(AR)技术和虚拟现实(VR)技术。其中,增强现实技术是一种将虚拟场景数据与真实场景进行融合的技术,广泛运用了多媒体、三维建模、实时跟踪及注册、智能交互、传感等多种技术手段,将计算机生成的文字、图像、三维模型、音乐、视频等虚拟信息模拟仿真后,应用到真实世界中,两种信息互为补充,从而实现对真实世界的“增强”。虚拟现实是利用计算机针对真实场景仿真产生一个三维空间的虚拟世界,提供用户关于视觉等感官的仿真,让使用者感觉彷佛身历其境,可以实时、没有限制地观察三维空间内的事物。用户进行位置移动时,计算机可以立即进行复杂的运算,将精确的三维世界影像传回产生临场感。
以图1中示出的智能眼镜为例对结合有增强现实技术和虚拟现实技术的终端120进行进一步说明。智能眼镜不仅包括常规眼镜的各种光学组件和支撑组件,其还包括显示组件,用于显示上述的增强现实信息和/或虚拟现实信息。智能眼镜还包括对应的电池组件、传感器组件和网络组件等等。其中,传感器组件可以包括深度相机(例如,Kinect深度相机),其通过调幅连续波(AMCW)时差测距(TOF)原理来捕捉真实场景中的深度信息,利用近红外光(NIR)来生成真实场景对应的深度图。传感器组件还可以包括各种加速度传感器、陀螺仪传感器和地磁场传感器等,用于检测用户的姿态和位置信息,从而为场景数据的处理提供参考信息。智能眼镜上还可能集成由各种眼球追踪配件,以通过用户的眼动数据来在现实世界、虚拟世界和用户之间搭建桥梁,从而提供更为自然的用户体验。本领域技术人员应当理解,虽然以智能眼镜为例对终端120进行了进一步的说明,但是本公开并未对终端的种类进行任何限制。
可以理解的是,本公开的实施例还可以进一步涉及人工智能服务以智能地提供上述的虚拟场景。人工智能服务不仅可以是在服务器110上执行的,也可以是在终端120上执行的,还可以是由终端和服务器共同执行的,本公 开对此不进行限制。此外,可以理解的是,应用本公开的实施例的人工服务来对场景数据进行分析推理的装置既可以是终端,也可以是服务器,还可以是由终端和服务器组成的系统。
目前,数字孪生技术已经广泛应用于工程建设领域,尤其是三维场景建模领域。基于三维场景模型的可视化三维场景应用已经广泛流行。目前存在很多三维引擎可以助力可视化三维场景应用研发。此外,由于三维场景的虚拟化属性,往往涉及到场景建模的应用与虚拟现实应用同时运行的情况。然而,当前的三维场景建模方案的模型生成过程不仅复杂度高耗时长,还需要提前采集大量的数据,因此在实际应用的过程中往往出现卡顿和模拟的虚拟场景的真实度太低的情况。
例如,目前存在这样的技术方案:从一个固定点,从上下左右前后六个固定的角度拍摄某个场景的六张图片,然后将这六张图片通过贴图方案贴至立方体形式的空间场景模型。
由于在实际展示的过程中,需要对贴图的数据进行拉伸形变,这样的方案生成的虚拟三维场景往往真实度差。此外,由于这六张图片的拍摄时间往往存在差异,导致六张图片均对应不同的光照场景。由此,实际生成的虚拟场景往往难以模拟真实的光照情况,导致虚拟场景的失真。更进一步地,由于这六张图片仅简单的贴至立方体形式的空间场景模型,其往往需要提前采集的大量信息并使用大量的计算资源才能准确地确定符合场景建模应用需求的信息,导致场景建模应用难以与虚拟现实应用同时运行。
因此,本公开的实施例提供了一种对物理场景进行虚拟化的方法,包括:基于指示场景边界的交互信息,确定所述场景边界内的物理实体,并捕获所述物理实体对应视频数据;基于所述视频数据,确定与所述物理实体对应的虚拟实体的模型数据;以及基于所述虚拟实体对应的模型数据,创建所述物理场景对应的虚拟场景。由此,针对应用业务可视化与场景虚拟化的需求,本公开的各个实施例利用视频数据来实现场景的虚拟化,有助于解决场景模型生成过程复杂度高并且耗时长的技术问题。
以下,参考图2至图12以对本公开实施例进行进一步的描述。
作为示例，图2是示出根据本公开实施例的对物理场景进行虚拟化的示例方法20的流程图。图3是示出根据本公开实施例的物理场景、交互信息和物理实体的示意图。
参见图2,示例方法20可以包括操作S201-S203之一或全部,也可以包括更多的操作。本公开并不以此为限。如上所述,操作S201至S203是由终端120/服务器110实时执行的,或者由终端120/服务器110离线执行。本公开并不对示例方法200各个操作的执行主体进行限制,只要其能够实现本公开的目的即可。示例方法中的各个步骤可以全部或部分地由虚拟现实应用和/或场景建模应用执行。虚拟现实应用和场景建模应用可以集成成一个大型应用,虚拟现实应用和场景建模应用可以是两个独立的应用,但是通过二者互相开放的接口传输交互信息、视频数据、模型数据等等。本公开并不以此为限。
例如,在操作S201中,基于指示场景边界的交互信息,确定所述场景边界。在操作S202中,基于所述场景边界,确定所述场景边界内的物理实体,并捕获所述物理实体对应视频数据。
例如,所述交互信息可以是通过图1中的终端120采集的,其指示需要对物理场景中的哪些物理实体进行进一步虚拟化处理。例如,如图3所示,其示出了一种物理场景、交互信息和物理实体的示例,其示意地示出了包括沙发、窗帘、月亮、台灯、置物柜和书籍等物理实体的物理场景的示例。针对这样的物理场景,可以获取以圆形框示出的交互信息,其指示仅需要对圆形框中的物理实体及物理场景进行虚拟化。也即在图3的示例中,其可以对应地确定所述场景中的物理实体仅包括台灯、置物柜和书籍。接着,可以捕获台灯、置物柜和书籍对应的视频数据。虽然在图3以圆形框的形式示出了场景边界,本领域技术人员应当理解本公开并不以此为限,具体地,还可以任意的连通形状指示场景边界。之后将参考图4至图5来详细描述交互信息的各种示例,本公开在此不再赘述。
作为一个示例,所述物理实体对应的视频数据是指连续的图像序列,其实质是由一组组连续的图像构成的。该图像序列中的每个图像又称为视频帧,其是组成视频的最小视觉单位。可以以参考图1描述的各种终端120来采集该视频数据,例如可以使用智能眼镜、手机终端、深度相机等设备来采集该视频数据。由于视频数据捕获的物理实体在一段时间内的图像(视频帧),所述多 个视频帧中不同的视频帧对应于不同的光照条件、拍摄位置或拍摄角度。因此,视频数据中的每个视频帧包括关于该物理实体的各种信息。根据采用本公开实施例的各种实验,可以确定能够从包括300帧的视频数据中提取到能够表征物理实体的足够多的信息,从而实现真实度高的虚拟实体的建模处理。
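The paragraph above treats the captured video as an ordered sequence of frames, of which roughly 300 are enough to characterize a physical entity. A minimal Python sketch of that frame-sampling step is shown below; it is illustrative only, and the file name, the OpenCV-based reading, and the even sampling stride are assumptions rather than details taken from the disclosure.

```python
import cv2

def sample_frames(video_path, target_count=300):
    """Read a captured clip and return an evenly spaced list of frames.

    target_count=300 follows the observation above that about 300 frames
    carry enough information about a physical entity.
    """
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    stride = max(1, total // target_count)
    frames = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % stride == 0:
            frames.append(frame)   # one BGR video frame
        index += 1
    cap.release()
    return frames[:target_count]

# Hypothetical usage: frames = sample_frames("entity_capture.mp4")
```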
在操作S203中,基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据。
可选地,所述视频数据虽然是利用终端120采集的,但是视频数据的分析处理可以是服务器110处理的。例如,终端120可以通过流式传输将视频数据传输至服务器,然后服务器110可以对所述物理实体对应的视频数据进行处理(例如,图像处理等)以获取与所述物理实体对应的虚拟实体的模型数据。此外,服务器110还可以结合各种已知信息或通过各类接口连接至公开或非公开数据库,以获取与所述物理实体相关的信息作为所述虚拟实体的模型数据。
例如,所述虚拟实体的模型数据指示可以用于在虚拟场景中搭建虚拟实体有关的任意数据。例如,其可以从所述视频数据的各个视频帧中提取的虚拟实体的边缘信息、位置信息、深度信息、顶点信息、高度信息、宽度信息、长度信息等等。所述虚拟实体的模型数据也可以是从所述视频数据的各个视频帧中提取的虚拟实体所处的环境信息,例如,光照信息、相对位置关系信息等等。甚至在物理实体为物联网设备的情况下,所述虚拟实体的模型数据还可以包括物联网相关信息,例如网络状态、注册请求信息、注册实体信息、设备运行信息等等。又或者,还可以基于对所述视频数据的分析,从互联网/数据库中拉取与所述物理实体相关的任意数据。本公开对此不进行限制。之后将参考图6来详细描述交互信息的各种示例,本公开在此不再赘述。
在操作S204中,基于所述虚拟实体对应的模型数据,创建所述物理场景对应的虚拟场景。
可选地，所述虚拟场景是三维虚拟场景，其是真实物理场景的虚拟化。在该三维虚拟场景内放置有所述虚拟实体对应的三维虚拟模型。三维虚拟模型又称为3D模型，其可通过各种3D软件制作。结合以下详述的本公开的各个实施例，本公开中制作3D模型的软件例如是CAD（Computer Aided Design，计算机辅助设计）软件。在这些示例中，可以通过该软件得到STL格式的3D模型文件；然后，再将STL格式文件导入到可进行切片的3D软件中的切片处理流程管线，以获得该三维虚拟模型。此外，还可在创建三维虚拟模型之前，对模型数据进行结构性优化，以节省计算资源，提高处理效率。值得注意的是，本公开并不对3D软件的类型进行限制，例如，可为3D模型剖析的软件，可为进行视觉艺术创作的3D软件，还可为3D打印的3D软件，等等；此外，还可通过计算机图形库（即自编程时用到的图形库）制作生成三维模型；例如，OpenGL（Open Graphics Library，开放图形库）、DirectX（Direct eXtension）等等。
可选地,方法20还可以进一步包括操作S205。在操作S205中,基于所述物理场景对应的虚拟场景,显示所述虚拟场景的相关信息。例如,以三维的形式显示该虚拟场景。
可选地,可以使用各类三维渲染引擎来将所述虚拟场景进行可视化。三维渲染引擎能够实现从数字三维场景中生成可显示的二维影像。所生成的二维影像可以是写实的也可以是非写实的。而三维渲染的过程需要依靠3D渲染引擎来生成。结合以下详述的本公开的各个实施例,本公开中的示例渲染引擎可以使用“光线追踪”技术,其通过追踪来自摄影机的光线穿过像素的虚拟平面并模拟其与物体相遇的效果来生成影像。本公开中的示例渲染引擎还可以使用“光栅化”技术,其通过收集各种面元的相关信息来确定二维影像中各个像素的值。本公开并不对3D渲染引擎的种类以及采用的技术进行限制。
由此,针对应用业务可视化与场景虚拟化的需求,方法20利用视频数据来实现场景的虚拟化,有助于解决场景模型生成过程复杂度高并且耗时长的技术问题。
接下来参考图4和图5来进一步描述操作S201至S202的示例。其中,图4是示出根据本公开的实施例的终端在获取交互信息时的示例界面变化示意图。图5是示出根据本公开实施例的获取交互信息的示意图。
如图4所示,终端120上可能搭载有场景建模应用和/或虚拟现实应用。响应于场景建模应用和/或虚拟现实应用开启,终端120可以触发“手势圈选”相关功能,用于获取指示场景边界的交互信息。具体地,响应于终端120为智 能眼镜或智慧手机,透过所述智能眼镜或利用所述智慧手机的摄像头可能看到图4的左图中的7个示例物理实体。通过触发显示屏上的对话框,智能眼镜或智慧手机将对用户的手势进行捕捉。例如,用户可能在智能眼镜前用手在空中比划一个不规则区域范围。又例如,用户可能一手握着智慧手机,并用另一只手在智慧手机的摄像头可拍摄的区域中比划一个不规则的区域范围。智能眼镜或智慧手机将对该手势进行识别,以获得一个可以用矢量性的接续向量描述的场景边界,其在首尾定向闭合时,可以生成如图4和图5所示的凸多边形闭合区域。
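The gesture described above is recognized as a closed convex polygonal region. As a hedged illustration of one way such a region could be derived from sampled hand-trajectory points, the sketch below simply takes their convex hull; the sample coordinates are hypothetical, and the original text does not prescribe this particular construction.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Hypothetical 2D fingertip samples captured while the user traces a region
gesture_points = np.array([
    [0.10, 0.20], [0.35, 0.05], [0.70, 0.15],
    [0.85, 0.50], [0.60, 0.80], [0.25, 0.70], [0.05, 0.45],
])

hull = ConvexHull(gesture_points)
# hull.vertices lists the boundary point indices in counter-clockwise order;
# closing the loop yields the convex polygonal scene boundary.
boundary_polygon = gesture_points[hull.vertices]
print(boundary_polygon)
```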
进一步地,如图5所示,以摄像组件(例如,智能眼镜或智慧手机的摄像头)为起点,基于从上述的凸多边形闭合区域的边缘上的多个点到起点所在的垂直面的距离。基于所述多个点到起点所在的垂直面的距离,从中选取最短距离作为凸多边形闭合区域对应的最短距离。基于凸多边形闭合区域对应的最短距离,确定第一垂直面。例如,第一垂直面垂直于水平面,并且第一垂直面与摄像组件之间的水平距离为凸多边形闭合区域对应的最短距离。接着,基于所述第一垂直面确定圆形平面区域。所述圆形平面区域用于辅助确定某个物理实体是否位于场景边界内。
例如,可以将所述凸多边形闭合区域上的最高点和最低点投影到所述第一垂直面上,并将第一垂直面上的所述最高点的投影和所述最低点的投影之间连线作为直径,以该连线的中心作为圆心,确定所述圆形平面区域。又例如,可以将所述凸多边形闭合区域上的最左点和最右点投影到所述第一垂直面上,并将第一垂直面上的所述最左点的投影和所述最右点的投影之间连线作为直径,以该连线的中心作为圆心,确定所述圆形平面区域。又例如,还可以将所述凸多边形闭合区域的最长对角线投影到所述第一垂直面上,以所述最长对角线的投影作为直径,以所述最长对角线的投影的中心作为圆心,确定所述圆形平面区域。本公开并不对确定圆形平面区域的方式进行进一步的限定。
类似地,以摄像组件为起点,确定从物理实体的边缘上的多个点到起点所在的垂直面的距离。基于所述物理实体的边缘上的多个点到起点所在的垂直面的距离,选取物理实体对应的最短距离。基于该物理实体对应的最短距离,确定第二垂直面。例如,第二垂直面垂直于水平面,并且第二垂直面与摄像组 件之间的水平距离为物理实体对应的最短距离。基于凸多边形闭合区域对应的最短距离与物理实体对应的最短距离的比值,在第二垂直面上确定等比扩大的圆形平面区域。圆形平面区域的直径与等比扩大的圆形平面区域的直径之间的比值等于凸多边形闭合区域对应的最短距离与物理实体对应的最短距离的比值,并且圆形平面区域的圆心与等比扩大的圆形平面区域的圆心在同一条水平线上。
如果该物理实体在该等比扩大的圆形平面区域上的投影全部在所该等比扩大的圆形平面区域内,那么可以确定该物理实体在所述场景边界内部。如图4和图5所示,可以确定灰色标记的物理实体在所述场景边界内,而白色标记的物理实体在所述场景边界外。由此,基于凸多边形闭合区域对应的最短水平距离确定第一垂直面和第二垂直面能够实现更小的误差。当然本公开并不以此为限。
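The membership test above can be summarized as: find the shortest distance from the polygon (and from the entity) to the vertical plane through the camera, build a circle on the nearer plane, enlarge it proportionally onto the entity's plane, and check that the entity's projection stays inside. The sketch below is a simplified reading of that procedure. It assumes the camera sits at the origin looking along +z (so the distance to the camera's vertical plane is simply the z coordinate), uses the longest-diagonal variant for the circle, and projects the circle's center together with its radius; none of these simplifications is mandated by the text.

```python
import numpy as np
from itertools import combinations

def boundary_test(polygon_pts, entity_pts):
    """Return True if the entity lies inside the gesture-defined scene boundary."""
    polygon_pts = np.asarray(polygon_pts, dtype=float)   # (n, 3) boundary points
    entity_pts = np.asarray(entity_pts, dtype=float)     # (m, 3) entity edge points

    d_poly = polygon_pts[:, 2].min()      # shortest distance for the boundary region
    d_ent = entity_pts[:, 2].min()        # shortest distance for the entity

    # Project the polygon onto the first vertical plane (keep x, y only).
    proj = polygon_pts[:, :2]
    # The longest diagonal of the projection defines the circle's diameter.
    a, b = max(combinations(proj, 2), key=lambda p: np.linalg.norm(p[0] - p[1]))
    center = (a + b) / 2.0
    radius = np.linalg.norm(a - b) / 2.0

    # Enlarge the circle proportionally onto the entity's plane.
    scale = d_ent / d_poly
    center_scaled = center * scale
    radius_scaled = radius * scale

    # The entity is inside the boundary if all of its projected edge points
    # fall within the enlarged circle.
    dists = np.linalg.norm(entity_pts[:, :2] - center_scaled, axis=1)
    return bool(np.all(dists <= radius_scaled))
```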
图4和图5仅为利用手势追踪方案来获取指示场景边界的交互信息以及确定场景边界内的物理实体的一种示例方案,本公开并不以此为限。例如,虚拟现实应用还可以先通过红外线感测或动态图像识别的方案,先确定摄像组件能够拍摄的多个物理实体,并通过语音或文字对话框提示用户从所述多个物理实体中进行选择。在这样的情况下,用户从所述多个物理实体中进行选择的信息将作为指示场景边界的交互信息。又例如,虚拟现实应用还可以先拍摄一张静态的图像,对该静态的图像进行边缘提取,以在静态的图像上绘制覆盖所拍摄的物理实体的按钮,用户通过点击/触摸/手势指示等方式触发该按钮以实现从所述多个物理实体中选择需要虚拟化的物理实体。在这样的情况下,用户对于该按钮进行触发的信息也可以作为指示场景边界的交互信息。
接着,摄像组件将捕获场景边界内的物理实体对应的视频数据。例如,摄像组件可以在拍摄时段内不断地自动/手动调整拍摄参数,例如,调整焦点、焦距、摄像组件的位置、间断地开启闪光灯、间断地开启远光灯、间断地开启近光灯等方式捕获所述物理实体对应的视频数据,以使得视频数据中包括更多的信息。当然在一些示例中,摄像组件也可以在拍摄时段内不对拍摄参数进行任何调整。由于在虚拟现实应用的运行过程中,环境光往往存在设备可捕捉变化,所捕获的视频数据往往也包括了足够多的信息,足以提供足够的虚拟实 体的模型数据。
由此,本公开的各个方面通过虚拟现实应用,采用丰富的人机交互方式提供用于指示场景边界的交互信息,能够便捷地确定场景边界内的物理实体,为后续的虚拟场景的创建提供足够多的模型数据。
接下来,参考图6至图8来进一步描述操作S202的示例。其中,图6是示出根据本公开的实施例的对视频帧的处理的示意图。图7是示出根据本公开的实施例的结合建筑信息的对视频帧的处理的示意图。图8是示出根据本公开的实施例的结合地理信息对视频帧的处理的示意图。
可选地,操作S202包括从所述视频数据中的每个视频帧提取多个离散点;基于每个视频帧的多个离散点,生成以泰森多边形表征的立体模型数据作为所述视频帧的立体模型数据;基于各个视频帧的立体模型数据,确定与所述物理实体对应的虚拟实体的模型数据。
图6示出了场景建模应用和/或虚拟现实应用对于视频数据中的一个视频帧的示例。该视频数据拍摄以杯子形态示出的物理实体。本领域技术人员应当理解,图6仅是用于说明本公开的方案的示意图,真实的视频数据还可能在单个视频帧中包括更多或更少的像素和信息。
作为一个示例,场景建模应用和/或虚拟现实应用将从视频数据中提取如601标记的视频帧。接着,可以从以601标记的视频帧中提取如602标记的图像中以黑色圆点标记的多个离散点。所述多个离散点中的每个离散点均指示物理实体关联的信息。离散点的示例可以是杯子的顶点、中心点、特征点、以及明暗变化最剧烈的点。作为一个示例,可以在单个视频帧中提取20个至30个离散点。当然,本公开的实施例并不以此为限。
可以以各种方式来提取离散点,本公开对提取离散点的方式不进行限制。例如,可以该视频帧生成灰度图以从灰度图中确定每个像素的明暗变化情况。然后,基于每个像素的明暗变化情况生成热力图,以获取视频帧的明暗变化分布。基于所述明暗变化分布,确定多个离散点的坐标,这些离散点均指示该视频帧的明暗变化信息。
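As a hedged sketch of the grayscale/heat-map route just described, the following snippet uses the gradient magnitude of the grayscale frame as the brightness-variation map and keeps the 20-30 strongest, mutually separated responses as discrete points; the Sobel operator and the minimum-separation constraint are implementation assumptions, not requirements of the disclosure.

```python
import cv2
import numpy as np

def extract_discrete_points(frame, num_points=25, min_separation=15):
    """Pick the pixels with the strongest brightness variation as discrete points."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    heat = np.hypot(gx, gy)                 # brightness-variation "heat map"

    points = []
    flat_order = np.argsort(heat, axis=None)[::-1]   # strongest response first
    for idx in flat_order:
        y, x = np.unravel_index(idx, heat.shape)
        # Enforce a minimum spacing so the points spread over the entity.
        if all((x - px) ** 2 + (y - py) ** 2 >= min_separation ** 2
               for px, py in points):
            points.append((int(x), int(y)))
            if len(points) == num_points:
                break
    return points
```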
又例如,可以利用神经网络来智能地识别所述视频帧中的多个离散点,每个离散点可以是该视频帧中的特征点。可以使用各种神经网络模型来确定这 些离散点,例如可以采用深度神经网络(DNN)模型、因子分解机(FM)模型等等。这些神经网络模型可以被实现为无环图,其中神经元布置在不同的层中。通常,神经网络模型包括输入层和输出层,输入层和输出层通过至少一个隐藏层分开。隐藏层将由输入层接收到的输入变换为对在输出层中生成输出有用的表示。网络节点经由边全连接至相邻层中的节点,并且每个层内的节点之间不存在边。在神经网络的输入层的节点处接收的数据经由隐藏层、激活层、池化层、卷积层等中的任意一项被传播至输出层的节点。神经网络模型的输入输出可以采用各种形式,本公开对此不作限制。
接续该示例,可以基于所提取的各个离散点生成以泰森多边形表征的立体模型数据。例如,可以从这些离散点中任意选取一个离散点作为第一离散点,然后查找距离此点最近的点作为第二离散点,连接第一离散点和第二离散点作为第一基线。查找距离该第一基线最近的点作为第三离散点,连接第一离散点和第三离散点作为第二基线并连接第二离散点和第三离散点作为第三基线。第一基线、第二基线和第三基线组成框603中标记的三角形。接着,再查找距离第二基线和第三基线最近的离散点,重复生成多个三角形,直至生成框604中标记的三角网。基于该三角网,利用泰森多边形生成的方式,形成立体的模型结构。泰森多边形生成是将任意一个离散点作为中心点,然后将中心点分别同周围多个离散点相连,然后分别做直线的垂直平分线,这些垂直平分线相交组成的多边形(由此,由称为该中心点的临近范围),该多边形即为泰森多边形。由此,针对每个视频帧,均能生成以泰森多边形进行表征的立体模型结构。
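The triangulated network and Thiessen-polygon construction described above correspond to a Delaunay triangulation and its dual Voronoi diagram, in which each cell is bounded by the perpendicular bisectors to the neighboring points. A minimal sketch using SciPy is shown below; the incremental nearest-point construction in the text is replaced here by the library's standard algorithms, which produce a comparable triangular network and the same Thiessen cells.

```python
import numpy as np
from scipy.spatial import Delaunay, Voronoi

def thiessen_model(points_2d):
    """Build the triangular network and Thiessen (Voronoi) cells for one frame.

    points_2d: (n, 2) array of discrete points extracted from a video frame.
    Returns the triangle vertex indices and, for each input point, the vertices
    of its Thiessen polygon (unbounded border cells are returned as None).
    """
    points_2d = np.asarray(points_2d, dtype=float)
    tri = Delaunay(points_2d)          # triangles built from neighbouring points
    vor = Voronoi(points_2d)           # perpendicular bisectors -> Thiessen cells

    cells = []
    for point_index in range(len(points_2d)):
        region = vor.regions[vor.point_region[point_index]]
        if -1 in region or len(region) == 0:
            cells.append(None)         # cell extends to infinity at the border
        else:
            cells.append(vor.vertices[region])
    return tri.simplices, cells
```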
由于同一个物理实体,其的物理结构、物理表面均难以在短时间(例如视频数据捕获的时段内)中变化,因此,针对时间上相邻或相近的视频帧,可以根据视频帧中提取的离散点之间的相似度来确定多个视频帧中的相同离散点。结合近大远小的原理,可以算出各个离散点处的深度信息。各个离散点处的深度信息将作为所述物理实体对应的虚拟实体的模型数据的示例。
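The "near appears large, far appears small" rule used above has a simple pinhole-camera reading: the pixel distance between two fixed points on the entity is inversely proportional to their depth, so matched point pairs across frames yield relative depth ratios. The sketch below illustrates only that ratio computation; the example coordinates are hypothetical, and recovering absolute depth would additionally require known camera motion or intrinsics, which the text does not detail.

```python
import numpy as np

def relative_depth_ratio(pair_frame_a, pair_frame_b):
    """Estimate depth_a / depth_b for the same two physical points seen in two frames.

    Under a pinhole model the pixel span between two fixed points is inversely
    proportional to depth, so the ratio of apparent spans gives the inverse
    ratio of depths. Inputs are ((x1, y1), (x2, y2)) pixel pairs.
    """
    span_a = np.linalg.norm(np.subtract(*pair_frame_a))
    span_b = np.linalg.norm(np.subtract(*pair_frame_b))
    return span_b / span_a   # > 1 means the points were farther away in frame A

# Hypothetical usage: the same edge spans 60 px in frame A and 120 px in frame B,
# so the points were about twice as far from the camera in frame A.
ratio = relative_depth_ratio(((100, 80), (160, 80)), ((100, 80), (220, 80)))
```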
如图7所示,如果场景建模应用和/或虚拟现实应用需要对包括大型建筑物的场景进行虚拟化(其中大型建筑物将作为一个物理实体),那么可以进一步地结合该大型建筑物的建筑信息模型(BIM模型)来确定该物理实体对应 的虚拟实体的模型数据。BIM模型也即建筑信息化模型,其英文全称是Building Information Modeling。一个BIM模型中不仅有建筑的三维模型,还可以设置建筑的材料特性、颜色、设计者、制造者、施作者、检验者、日期时间、面积、体积等信息。各个监测虚拟实体可以作为实体对象被设置于BIM模型中,其对应地包括对象标识、对象的几何数据、对象的基准几何数据、对象实时采集到的数据等等。本公开并不以此为限。
此外,还可以进一步结合该大型建筑物对应的全球地理位置信息,来确定该物理实体对应的虚拟实体的模型数据。其中,全球地理位置信息可以是在地图数据库中根据该物理实体的部分特征查找到的信息。例如可以通过各种导航地图应用查找到该物理实体对应的经纬度信息作为全球地理位置信息。又例如,还可以基于终端120的定位模块(例如GPS定位模块、北斗系统定位模块)确定的终端120的位置数据,来进一步确定距离手机一定范围内的物理实体的位置。本公开并不对全球地理位置信息进行进一步的限定。
此外,还可以进一步结合该大型建筑物对应的建筑定位空间数据,来确定该物理实体对应的虚拟实体的模型数据。例如,可以通过终端120从建筑定位空间数据库中拉取对应建筑的建筑定位空间数据,其包括建筑的长宽高数据、墙体数据、建筑在报批时的各项设计数据等等。本公开并不对建筑定位空间数据进行进一步的限定。
例如,可以从上述的视频数据中提取光照信息,然后将该光照信息与上述的建筑信息模型进行结合,以确定与所述物理实体对应的虚拟实体的模型数据。又例如,可以结合图6中描述的方法,从视频数据中的各个视频帧中生成所述视频帧的立体模型数据,结合立体模型数据、建筑信息模型、全球地理位置信息和建筑定位空间数据中的一项或多项,确定与所述物理实体对应的虚拟实体的模型数据,进而使得能够呈现不同光照条件下的虚拟场景。本公开对此不进行限制。
如图8所示,如果场景建模应用和/或虚拟现实应用需要对远景进行虚拟化(该远景中包括多个大型建筑物,每个大型建筑物都将作为一个物理实体),那么还可以进一步地结合城市交通数据、城市规划数据、城市市政数据等来确定该物理实体对应的虚拟实体的模型数据。城市交通数据、城市规划数据、城 市市政数据可以从城市相关的网页信息中直接获取,或从相关数据库中拉取。本公开对此不进行限定。城市交通数据、城市规划数据、城市市政数据均为示例性的地理信息,本公开在此就不再赘述。
接下来参考图9来进一步描述操作S203的示例。其中,图9是示出根据本公开的实施例的场景建模应用和/或虚拟现实应用的架构示意图。
如图9所示场景建模应用和/或虚拟现实应用中可以从数据采集模块(例如摄像头)中获得视频数据,然后通过底层功能模块对该视频数据进行初步解析。其中数据采集模块的支持组件可以包括任意的硬件设备SDK或WebSocket客户端,而底层功能模块则包括:基于视频数据生成序列化的Xml/Json的文件摘要的序列化功能模块、确定各个程序/服务的活动性的监听功能模块、文件格式转换模块等等。
根据上述对于该视频数据的初步解析,还可以使用I/O模块来将视频数据处理为可传输的文件。例如,I/O模块可以包括多个服务模块,例如,提供文件监听服务的文件监听模块以及用于将文件进行FTP传输的文件传输模块等等。
然后,终端120上搭载的场景建模应用和/或虚拟现实应用将文件形式的视频数据传输至服务器110进行进一步解析。具体地,服务器110上也类似地包括通信模块。该通信模块也类似支持组件可以包括任意的硬件设备SDK或WebSocket客户端。甚至为提高传输速度,还可以对应地包括管道传输模块。服务器110上还包括各种数据库,例如模型数据库、材质数据库和纹理数据库。服务器110可以使用其分析模块,结合各种数据库中的执行上述操作S202,然后将虚拟实体的模型数据返回至场景建模应用和/或虚拟现实应用。
接着场景建模应用和/或虚拟现实应用将利用规则转换模块,将物理世界中的规则转换为虚拟场景中的规则(例如,进行坐标转换),并结合该虚拟场景中的规则创建所述物理场景对应的虚拟场景。值得注意的是,接收虚拟实体的模型数据的终端不一定是发送视频数据文件的终端。例如,可以利用终端A采集视频数据发送至服务器,然后服务器将模型数据发送至终端B,从而实现远端多地协同操作。为物理场景之外的用户提供相应动态参考,以助于该用户进行虚拟场景的异地分析和虚拟场景还原。
此外,场景建模应用和/或虚拟现实应用中还可以包括渲染进程和控制进程以实现虚拟场景的可视化过程。例如,渲染进程和控制进程可以互相通信,以实现虚拟场景的可视化,此外渲染进程还向控制进程提供仿真反馈信息,以指示上述的虚拟场景与物理场景之间的比较信息。当然本公开并不限于此。
本公开的各个实施例的扩展性强,不仅能够结合各种手势识别算法,进行深度纵向开发,以向终端120的普通用户提供模型数据和辅助数据,还可以进行横向扩展开发,以向某些特殊行业的监管者提供场景监管服务,通过真实场景还原,实现实时场景检测。此外,本公开的各个实施例还可以输出为对应平台可使用的JAR包/动态链接库,供多系统进行集成。
接下来参考图10来进一步描述操作S204的示例,其中,图10是示出根据本公开的实施例的渲染引擎的操作示意图。
作为一个示例,操作S204包括:从所述视频数据中,选择多个视频帧;对所述多个视频帧进行纹理压缩和/或纹理缩放处理,以生成贴图数据;基于所述贴图数据,对所述物理场景对应的虚拟场景进行渲染,显示渲染后的虚拟场景。
例如,可以利用OpenGL ES的接口glCompressedTexImage2D(…,format,…,data)来对所述多个视频帧进行纹理压缩。值得注意地是,本公开并不限制纹理数据的格式,其可以根据供应商的SDK或文档,将纹理数据转换为任意的格式。例如,假设终端120的显示屏幕适配有32MB的显示内存。可以将2MB的单个视频帧图像进行纹理压缩以生成ECT(Ericsson Texture Compression)格式的贴图数据,以保证16张贴图以上的贴图数据。
在一些情况下,经纹理压缩后得到的贴图数据可能在比例上出现失真,因此,可以在三维渲染引擎中利用纹理缩放进一步调整贴图数据。例如,可以针对贴图数据生成材质(Texture)资源数据(例如,图10所示的材质A至材质C参数)。基于材质资源数据,渲染引擎将对应地生成材料(Material)资源数据(例如,图10所示的颜色、高光、金属等参数)。结合从视频数据中获取的与所述物理实体对应的虚拟实体的模型数据,基于所述材质资源数据和材料资源数据,可以确定纹理缩放处理对应的参数(例如,部分贴图中的像素数据可以直接以纹理缩放参数进行表征),基于所述纹理缩放处理对应的参 数,可以进一步地对贴图数据进行纹理缩放处理,以进一步减少贴图数据的文件大小,保证虚拟现实应用的运行速度。
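The memory-budget reasoning above (a 32 MB texture budget shared by at least 16 maps, i.e. roughly 2 MB per compressed map) can be turned into a concrete rule for the texture-scaling parameter. The sketch below assumes ETC1-style compression at 4 bits per pixel, which is the standard ETC1 rate, and derives a uniform linear scale factor that brings one frame under its share of the budget; the resolution in the usage lines is hypothetical.

```python
import math

ETC1_BITS_PER_PIXEL = 4          # ETC1 packs each 4x4 pixel block into 64 bits

def texture_scale_factor(width, height, memory_budget_bytes=32 * 1024 * 1024,
                         min_texture_count=16):
    """Choose a uniform scale so each ETC1-compressed map fits its share of memory."""
    per_texture_budget = memory_budget_bytes // min_texture_count   # about 2 MB
    compressed_size = width * height * ETC1_BITS_PER_PIXEL // 8
    if compressed_size <= per_texture_budget:
        return 1.0                                # already within budget
    # Compressed size scales with area, i.e. with the square of the linear scale.
    return math.sqrt(per_texture_budget / compressed_size)

# Hypothetical usage: a 4096x4096 map is scaled to 2048x2048 (about 2 MB in ETC1).
scale = texture_scale_factor(4096, 4096)
new_size = (int(4096 * scale), int(4096 * scale))
```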
由此,针对应用业务可视化与场景虚拟化的需求,本公开的各个实施例利用视频数据来实现场景的虚拟化,有助于解决场景模型生成过程复杂度高并且耗时长的技术问题。
此外根据本公开的又一方面,还提供了一种对物理场景进行虚拟化的装置,所述装置包括:第一模块,被配置为基于用于指示场景边界的交互信息,确定所述场景边界内的物理实体,并捕获所述物理实体对应的视频数据;第二模块,被配置为基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据;以及第三模块,被配置为基于所述虚拟实体对应的模型数据,创建所述物理场景对应的虚拟场景。
例如,所述视频数据包括多个视频帧,所述多个视频帧中不同的视频帧对应于不同的光照条件、拍摄位置或拍摄角度。
例如,所述第二模块还被配置为:从所述视频数据中的每个视频帧提取多个离散点;基于每个视频帧的多个离散点,生成以泰森多边形表征的立体模型数据作为所述视频帧的立体模型数据;基于各个视频帧的立体模型数据,确定与所述物理实体对应的虚拟实体的模型数据。
例如,所述第二模块还被配置为:获取建筑信息模型、全球地理位置信息和建筑定位空间数据中的一项或多项;基于所述建筑信息模型、所述全球地理位置信息和所述建筑定位空间数据中的一项或多项,利用所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据。
例如,所述第二模块还被配置为:获取城市交通数据、城市规划数据、城市市政数据中的一项或多项;基于所述城市交通数据、城市规划数据、城市市政数据中的一项或多项,利用所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据。
例如,所述装置还包括第四模块,被配置为:基于所述物理场景对应的虚拟场景,显示所述虚拟场景的相关信息。
例如,所述显示所述虚拟场景的相关信息还包括:从所述视频数据中,选择多个视频帧;对所述多个视频帧进行纹理压缩和/或纹理缩放处理,以生成 贴图数据;基于所述贴图数据,对所述物理场景对应的虚拟场景进行渲染,显示渲染后的虚拟场景。
例如,所述对所述多个视频帧进行纹理压缩和/或纹理缩放处理,以生成贴图数据还包括:对所述多个视频帧进行纹理压缩,以生成纹理压缩后的贴图数据;基于纹理压缩后的贴图数据,确定所述贴图数据对应的材质资源数据和材料资源数据;基于所述贴图数据对应的材质资源数据和材料资源数据,确定纹理缩放处理对应的参数;基于纹理缩放处理对应的参数,对纹理压缩后的贴图数据进行纹理缩放处理,以生成纹理缩放处理后的贴图数据。
此外根据本公开的又一方面,还提供了一种电子设备,用于实施根据本公开实施例的方法。图11示出了根据本公开实施例的电子设备2000的示意图。
如图11所示，所述电子设备2000可以包括一个或多个处理器2010和一个或多个存储器2020。其中，所述存储器2020中存储有计算机可读代码，所述计算机可读代码当由所述一个或多个处理器2010运行时，可以执行如上所述的对物理场景进行虚拟化的方法。
本公开实施例中的处理器可以是一种集成电路芯片,具有信号的处理能力。上述处理器可以是通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现成可编程门阵列(FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本公开实施例中的公开的各方法、操作及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等,可以是X86架构或ARM架构的。
一般而言,本公开的各种示例实施例可以在硬件或专用电路、软件、固件、逻辑,或其任何组合中实施。某些方面可以在硬件中实施,而其他方面可以在可以由控制器、微处理器或其他计算设备执行的固件或软件中实施。当本公开的实施例的各方面被图示或描述为框图、流程图或使用某些其他图形表示时,将理解此处描述的方框、装置、系统、技术或方法可以作为非限制性的示例在硬件、软件、固件、专用电路或逻辑、通用硬件或控制器或其他计算设备,或其某些组合中实施。
例如，根据本公开实施例的方法或装置也可以借助于图12所示的计算设备3000的架构来实现。如图12所示，计算设备3000可以包括总线3010、一个或多个CPU 3020、只读存储器（ROM）3030、随机存取存储器（RAM）3040、连接到网络的通信端口3050、输入/输出组件3060、硬盘3070等。计算设备3000中的存储设备，例如ROM 3030或硬盘3070，可以存储本公开提供的方法的处理和/或通信使用的各种数据或文件以及CPU所执行的程序指令。计算设备3000还可以包括用户界面3080。当然，图12所示的架构只是示例性的，在实现不同的设备时，根据实际需要，可以省略图12示出的计算设备中的一个或多个组件。
根据本公开的又一方面,还提供了一种计算机可读存储介质。图13示出了根据本公开的存储介质4000的示意图。
如图13所示，所述计算机存储介质4020上存储有计算机可读指令4010。当所述计算机可读指令4010由处理器运行时，可以执行参照以上附图描述的根据本公开实施例的方法。本公开实施例中的计算机可读存储介质可以是易失性存储器或非易失性存储器，或可包括易失性和非易失性存储器两者。非易失性存储器可以是只读存储器（ROM）、可编程只读存储器（PROM）、可擦除可编程只读存储器（EPROM）、电可擦除可编程只读存储器（EEPROM）或闪存。易失性存储器可以是随机存取存储器（RAM），其用作外部高速缓存。通过示例性但不是限制性说明，许多形式的RAM可用，例如静态随机存取存储器（SRAM）、动态随机存取存储器（DRAM）、同步动态随机存取存储器（SDRAM）、双倍数据速率同步动态随机存取存储器（DDRSDRAM）、增强型同步动态随机存取存储器（ESDRAM）、同步连接动态随机存取存储器（SLDRAM）和直接内存总线随机存取存储器（DR RAM）。应注意，本文描述的方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
本公开的实施例还提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行根据本公开实施例的方法。
需要说明的是,附图中的流程图和框图,图示了按照本公开各种实施例的 系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,所述模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
一般而言,本公开的各种示例实施例可以在硬件或专用电路、软件、固件、逻辑,或其任何组合中实施。某些方面可以在硬件中实施,而其他方面可以在可以由控制器、微处理器或其他计算设备执行的固件或软件中实施。当本公开的实施例的各方面被图示或描述为框图、流程图或使用某些其他图形表示时,将理解此处描述的方框、装置、系统、技术或方法可以作为非限制性的示例在硬件、软件、固件、专用电路或逻辑、通用硬件或控制器或其他计算设备,或其某些组合中实施。
在上面详细描述的本公开的示例实施例仅仅是说明性的,而不是限制性的。本领域技术人员应该理解,在不脱离本公开的原理和精神的情况下,可对这些实施例或其特征进行各种修改和组合,这样的修改应落入本公开的范围内。

Claims (11)

  1. 一种对物理场景进行虚拟化的方法,包括:
    基于用于指示场景边界的交互信息,确定所述场景边界;
    基于所述场景边界,确定所述场景边界内的物理实体,并捕获所述物理实体对应的视频数据;
    基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据;以及
    基于所述虚拟实体对应的模型数据,创建所述物理场景对应的虚拟场景。
  2. 如权利要求1所述的方法,其中,所述视频数据包括多个视频帧,所述多个视频帧中不同的视频帧对应于不同的光照条件、拍摄位置或拍摄角度。
  3. 如权利要求1所述的方法,其中,所述基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据还包括:
    从所述视频数据中的每个视频帧提取多个离散点;
    基于每个视频帧的多个离散点,生成以泰森多边形表征的立体模型数据作为所述视频帧的立体模型数据;
    基于各个视频帧的立体模型数据,确定与所述物理实体对应的虚拟实体的模型数据。
  4. 如权利要求1所述的方法,其中,所述基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据还包括:
    获取建筑信息模型、全球地理位置信息和建筑定位空间数据中的一项或多项;
    基于所述建筑信息模型、所述全球地理位置信息和所述建筑定位空间数据中的一项或多项，利用所述物理实体对应的视频数据，确定与所述物理实体对应的虚拟实体的模型数据。
  5. 如权利要求1所述的方法,其中,所述基于所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据还包括:
    获取城市交通数据、城市规划数据、城市市政数据中的一项或多项;
    基于所述城市交通数据、城市规划数据、城市市政数据中的一项或多项,利用所述物理实体对应的视频数据,确定与所述物理实体对应的虚拟实体的模型数据。
  6. 如权利要求1所述的方法,还包括:
    基于所述物理场景对应的虚拟场景,显示所述虚拟场景的相关信息。
  7. 如权利要求6所述的方法,其中,所述显示所述虚拟场景的相关信息还包括:
    从所述视频数据中,选择多个视频帧;
    对所述多个视频帧进行纹理压缩和/或纹理缩放处理,以生成贴图数据;
    基于所述贴图数据,对所述物理场景对应的虚拟场景进行渲染,
    显示渲染后的虚拟场景。
  8. 如权利要求7所述的方法,其中,所述对所述多个视频帧进行纹理压缩和/或纹理缩放处理,以生成贴图数据还包括:
    对所述多个视频帧进行纹理压缩,以生成纹理压缩后的贴图数据;
    基于纹理压缩后的贴图数据,确定所述贴图数据对应的材质资源数据和材料资源数据;
    基于所述贴图数据对应的材质资源数据和材料资源数据,确定纹理缩放处理对应的参数;
    基于纹理缩放处理对应的参数,对纹理压缩后的贴图数据进行纹理缩放处理,以生成纹理缩放处理后的贴图数据。
  9. 一种电子设备,包括:处理器;存储器,存储器存储有计算机指令,该计算机指令被处理器执行时实现如权利要求1-8中任一项所述的方法。
  10. 一种计算机可读存储介质,其上存储有计算机指令,所述计算机指令被处理器执行时实现如权利要求1-8中任一项所述的方法。
  11. 一种计算机程序产品,其包括计算机可读指令,所述计算机可读指令在被处理器执行时,使得所述处理器执行如权利要求1-8中任一项所述的方法。
PCT/CN2023/094999 2022-05-31 2023-05-18 对物理场景进行虚拟化的方法、电子设备、计算机可读存储介质和计算机程序产品 WO2023231793A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210614156.3 2022-05-31
CN202210614156.3A CN114972599A (zh) 2022-05-31 2022-05-31 一种对场景进行虚拟化的方法

Publications (1)

Publication Number Publication Date
WO2023231793A1 true WO2023231793A1 (zh) 2023-12-07

Family

ID=82960480

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/094999 WO2023231793A1 (zh) 2022-05-31 2023-05-18 对物理场景进行虚拟化的方法、电子设备、计算机可读存储介质和计算机程序产品

Country Status (2)

Country Link
CN (1) CN114972599A (zh)
WO (1) WO2023231793A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972599A (zh) * 2022-05-31 2022-08-30 京东方科技集团股份有限公司 一种对场景进行虚拟化的方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226830A (zh) * 2013-04-25 2013-07-31 北京大学 三维虚实融合环境中视频纹理投影的自动匹配校正方法
CN103500465A (zh) * 2013-09-13 2014-01-08 西安工程大学 基于增强现实技术的古代文物场景快速渲染方法
CN109903129A (zh) * 2019-02-18 2019-06-18 北京三快在线科技有限公司 增强现实显示方法与装置、电子设备、存储介质
WO2021031454A1 (zh) * 2019-08-21 2021-02-25 佳都新太科技股份有限公司 一种数字孪生系统、方法及计算机设备
CN111145236A (zh) * 2019-12-04 2020-05-12 东南大学 一种基于数字孪生的产品拟实物装配模型生成方法及实现框架
CN114972599A (zh) * 2022-05-31 2022-08-30 京东方科技集团股份有限公司 一种对场景进行虚拟化的方法

Also Published As

Publication number Publication date
CN114972599A (zh) 2022-08-30

Similar Documents

Publication Publication Date Title
US11972529B2 (en) Augmented reality system
US9224237B2 (en) Simulating three-dimensional views using planes of content
US9437038B1 (en) Simulating three-dimensional views using depth relationships among planes of content
US11704806B2 (en) Scalable three-dimensional object recognition in a cross reality system
KR20220004607A (ko) 목표 검출방법, 전자장치, 노변장치와 클라우드 컨트롤 플랫폼
US11842514B1 (en) Determining a pose of an object from rgb-d images
AU2022345532B2 (en) Browser optimized interactive electronic model based determination of attributes of a structure
US20130257856A1 (en) Determining a View of an Object in a Three-Dimensional Image Viewer
US9245366B1 (en) Label placement for complex geographic polygons
WO2023185354A1 (zh) 实景导航方法、装置、设备及存储介质、程序产品
US20220319231A1 (en) Facial synthesis for head turns in augmented reality content
US11989900B2 (en) Object recognition neural network for amodal center prediction
CN109741431B (zh) 一种二三维一体化电子地图框架
Bulbul et al. Social media based 3D visual popularity
WO2023231793A1 (zh) 对物理场景进行虚拟化的方法、电子设备、计算机可读存储介质和计算机程序产品
US20230260218A1 (en) Method and apparatus for presenting object annotation information, electronic device, and storage medium
Szabó et al. Data processing for virtual reality
JP2022501751A (ja) 3d幾何抽出のために画像の複数から相補的画像を選択するシステムおよび方法
CN115731370A (zh) 一种大场景元宇宙空间叠加方法和装置
CN115578432B (zh) 图像处理方法、装置、电子设备及存储介质
Bezpalko IMPROVING THE INTEGRATION OF THREE-DIMENSIONAL MODELS IN AUGMENTED REALITY TECHNOLOGY.
CN118071955A (zh) 一种基于Three.JS实现三维地图埋点方法
CN115619985A (zh) 增强现实内容展示方法、装置、电子设备及存储介质
CN115272512A (zh) Gis矢量图生成方法、装置、设备以及计算机存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23814978

Country of ref document: EP

Kind code of ref document: A1