WO2017164923A1 - GPU batch occlusion query with spatial update - Google Patents

GPU batch occlusion query with spatial update

Info

Publication number
WO2017164923A1
Authority
WO
WIPO (PCT)
Prior art keywords
cells
spatial
spatial hierarchy
occlusion
hierarchy
Prior art date
Application number
PCT/US2016/050654
Other languages
English (en)
Inventor
Jeremy S. Bennett
Michael B. Carter
Original Assignee
Siemens Product Lifecycle Management Software Inc.
Priority date
Filing date
Publication date
Application filed by Siemens Product Lifecycle Management Software Inc. filed Critical Siemens Product Lifecycle Management Software Inc.
Publication of WO2017164923A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/40Analysis of texture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/10Geometric effects
    • G06T15/40Hidden part removal
    • G06T15/405Hidden part removal using Z-buffer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/005Tree description, e.g. octree, quadtree

Definitions

  • the present disclosure is directed, in general, to computer-aided design, visualization, and manufacturing systems, product lifecycle management ("PLM") systems, and similar systems, that manage data for products and other items (collectively, "Product Data Management" systems or PDM systems).
  • PLM product lifecycle management
  • PDM systems manage PLM and other data. Improved systems are desirable.
  • a method includes populating a depth buffer on a graphics processing unit (GPU) with depth values of opaque geometries of a three dimensional (3D) geometric model.
  • the method includes executing occlusion queries over cells of a spatial hierarchy that correspond to the 3D geometric model.
  • the method includes determining, based on results from the occlusion queries, if each cell is culled or visible from a current viewpoint.
  • the method includes displaying the 3D geometric model according to the cells that are determined to be visible from the current viewpoint
  • Figure 1 illustrates a block diagram of a data processing system in which an embodiment can be implemented
  • Figure 2 illustrates an example of components that can be included in a massive model visualization system in accordance with disclosed embodiments
  • Figures 3 and 4 demonstrate how a spatial hierarchy can be mapped in accordance with disclosed embodiments
  • Figure 5 illustrates a Multi Draw Elements Indirect buffer, index vertex buffer object, and a vertex vertex buffer object in accordance with disclosed embodiments
  • Figure 6 illustrates a process in accordance with disclosed embodiments.
  • FIGURES 1 through 6, discussed below, and the various embodiments used to describe the principles of the present disclosure in this patent document are by way of illustration only and should not be construed in any way to limit the scope of the disclosure. Those skilled in the art will understand that the principles of the present disclosure may be implemented in any suitably arranged device. The numerous innovative teachings of the present application will be described with reference to exemplary non-limiting embodiments.
  • MMV Massive Model Visualization
  • VGR Visibility-guided rendering
  • Disclosed embodiments include a new graphics processing unit (GPU) based occlusion system that is capable of achieving significant performance gains when compared to prior approaches while still maintaining the ability to handle massive data sets.
  • GPU graphics processing unit
  • FIG. 1 illustrates a block diagram of a data processing system in which an embodiment can be implemented, for example as a PDM system particularly configured by software or otherwise to perform the processes as described herein, and in particular as each one of a plurality of interconnected and communicating systems as described herein.
  • the data processing system depicted includes a processor 102 connected to a level two cache/bridge 104, which is connected in turn to a local system bus 106.
  • Local system bus 106 may be, for example, a peripheral component interconnect (PCI) architecture bus.
  • PCI peripheral component interconnect
  • also connected to local system bus 106 in the depicted example are a main memory 108 and a graphics adapter 110.
  • the graphics adapter 110 may be connected to display 111.
  • Graphics adapter 110 or processor 102 can include a graphics processing unit (GPU) 128. Multiple processors 102 can be included, and may be referred to herein as a "CPU" to distinguish the general-purpose processor from the GPU.
  • Peripherals such as local area network (LAN) / Wide Area Network / Wireless (e.g. WiFi) adapter 112, may also be connected to local system bus 106.
  • Expansion bus interface 114 connects local system bus 106 to input/output (I/O) bus 116.
  • I/O bus 116 is connected to keyboard/mouse adapter 118, disk controller 120, and I/O adapter 122.
  • Disk controller 120 can be connected to a storage 126, which can be any suitable machine usable or machine readable storage medium, including but not limited to nonvolatile, hard-coded type mediums such as read only memories (ROMs) or erasable, electrically programmable read only memories (EEPROMs), magnetic tape storage, and user-recordable type mediums such as floppy disks, hard disk drives and compact disk read only memories (CD-ROMs) or digital versatile disks (DVDs), and other known optical, electrical, or magnetic storage devices.
  • ROMs read only memories
  • EEPROMs electrically programmable read only memories
  • CD-ROMs compact disk read only memories
  • DVDs digital versatile disks
  • Also connected to I/O bus 116 in the example shown is audio adapter 124, to which speakers (not shown) may be connected for playing sounds.
  • Keyboard/mouse adapter 118 provides a connection for a pointing device (not shown), such as a mouse, trackball, trackpointer, touchscreen, etc.
  • pointing device such as a mouse, trackball, trackpointer, touchscreen, etc.
  • a data processing system in accordance with an embodiment of the present disclosure includes an operating system employing a graphical user interface.
  • the operating system permits multiple display windows to be presented in the graphical user interface simultaneously, with each display window providing an interface to a different application or to a different instance of the same application.
  • a cursor in the graphical user interface may be manipulated by a user through the pointing device. The position of the cursor may be changed and/or an event, such as clicking a mouse button, generated to actuate a desired response.
  • One of various commercial operating systems, such as a version of Microsoft Windows™, a product of Microsoft Corporation located in Redmond, Wash., may be employed if suitably modified.
  • the operating system is modified or created in accordance with the present disclosure as described.
  • LAN/ WAN/Wireless adapter 112 can be connected to a network 130 (not a part of data processing system 100), which can be any public or private data processing system network or combination of networks, as known to those of skill in the art, including the Internet.
  • Data processing system 100 can communicate over network 130 with server system 140, which is also not part of data processing system 100, but can be implemented, for example, as a separate data processing system 100.
  • LMV Large Model Visualization
  • MMV Massive Model Visualization
  • a 1080p high definition screen has a resolution of 1920 by 1080, or just over 2 million pixels. If one were to render a relatively large model of 200 million triangles, at most only about 1 percent of those triangles could possibly contribute to the final image.
  • MMV technologies are about creating a system that is bound by screen space rather than data size, which is not the case with systems based purely on LMV technologies.
  • the product structure 204 is subdivided by a partitioner 206 into a spatial hierarchy 210 and a geometric cache 212 in a data cache 208.
  • the data cache 208 can contain anything from occurrences to polygons or voxels, depending upon the level of subdivision that is deemed necessary.
  • a strategy stage 218 is executed over the spatial hierarchy 210 in order to construct visibility data 228 for all data that is expected to contribute to the current frame.
  • This information is then fed into the renderer 220 that generates the final image, the loader 222 that ensures any required data is resident in the geometric cache, and the reaper 224 that ensures any data that has not been used recently is removed from the geometric cache.
  • the operations within render 216 can be executed in parallel.
  • the strategy 218 can generate the visibility data 228 for the next frame while the renderer 220 is still rendering the current frame, and the loader 222 and reaper 224 can run in a constant cycle, executing data loads and unloads as deemed necessary.
  • Renderer 220 produces the 3D model 226 for display as viewed from a specified viewpoint.
  • each of the components can play an important part when it comes to handling extremely large datasets.
  • the spatial hierarchy 210 generated by the partitioner 206 must provide enough spatial coherence between the cells for the strategy to be able to efficiently cull large batches of cells.
  • the partitioner 206 must also ensure that the data contained within the cells is sufficiently coarse so as to minimize the amount of non-contributing geometry being used, as one of the biggest challenges with extremely large datasets is that they contain vastly more geometric information than can possibly be held in main memory.
  • the render 216 components have to work together to manage the amount of data that is resident at any given point in time.
  • the loader 222 is responsible for loading data and needs to be agile enough to ensure data is available as quickly as possible when it is marked as needed.
  • Predictive algorithms can be used by the loader 222 to try and prefetch data that is likely to become visible so as to minimize any potential lag.
  • the reaper 224 is responsible for detecting and unloading data when it is no longer necessary and determining the best candidates for unloading if memory should approach the maximum threshold.
  • the strategy 218's primary responsibility is to construct a list of visible occurrences.
  • the visibility determination process can be designed around GPU-based occlusion queries and use other culling techniques, such as view frustum and screen coverage, as a way of pruning the list of entities for which a query needs to be executed.
  • the strategy 218's secondary responsibility is to prune the list of visible data such that it meets the desired thresholds for both frame rate and memory footprint.
  • Disocclusion artifacts occur whenever a visible shape is not rendered for one or more frames while it is visible, causing a visible popping effect when the shape is finally rendered. This behavior is often the result of the visibility determination algorithm's inability to keep up with the visibility state changes that occur within a tree as the camera is moved through the scene. This behavior can also occur if the loader should fail to load the data before it is needed.
  • GPU-based occlusion tests have been shown to be an effective tool for improving rendering performance in both industry and games.
  • Disclosed embodiments include novel improvements to a GPU based occlusion strategy for improving performance and reducing disocclusion artifacts.
  • GPU-Based Depth Buffer Reprojection: GPU-based occlusion tests require that a depth buffer be prepopulated with the depth values of potential occluders. Most approaches accomplish this by rendering either a potential occluder list or the existing render list into the depth buffer. On large models this can involve rendering millions of triangles, far more than there are pixels on the screen, at significant cost.
  • a commonly used principle with MMV techniques is frame-to-frame coherence, or the notion that the visibility state of occurrences will not change significantly between frames. From this it can be extrapolated that the depth buffer used for occlusion culling will also not change significantly between frames.
  • Disclosed embodiments show how the depth buffer from a previous frame can be reprojected into the current viewpoint through the use of a sub-pixel mesh and a vertex shader to generate an approximation of the current depth buffer at near to no cost.
  • Disclosed embodiments can perform a batch query with spatial update, which is a significant improvement over previous approaches. It demonstrates how buffer-write based occlusion culling can be applied to a spatial hierarchy without sacrificing the inherent benefits of previous front-based approaches. The ability to query all cells in a single draw call allows increased parallelism to be achieved on the GPU while still maintaining the ability to limit the scope of data loads and visibility changes within the hierarchy. Occlusion queries provide a means by which the GPU can be used to determine if a given set of primitives contributes to the final image and therefore frequently serve as the primary visibility test in massive model visibility determination algorithms.
  • GPU-based buffer write has been shown to be a viable alternative to GPU occlusion queries as it allows the visibility of all entities of interest to be obtained with a single draw call in order to significantly increase parallelism on the GPU.
  • Disclosed embodiments show how this can be effectively combined with a spatial hierarchy in order to increase its scalability to arbitrarily large data sets.
  • the spatial hierarchy is represented as a tree structure that stores the spatial hierarchy data for each cell.
  • a disclosed spatial hierarchy is based on a bounding volume hierarchy over occurrences. Each cell within the tree contains its bounding volume information, occurrence(s), and children cell(s). The same occurrence can appear in multiple cells and occurrences contained within a cell can be dynamically determined at run time.
  • the spatial hierarchy can be partitioned using any number of different algorithms, such as Median Cut, Octree-Hilbert, and Outside-In.
  • the bounding volume over occurrence allows the visibility state of a given cell to be directly translated to the visibility state of the occurrence. It also allows the occurrences that are contained within a given cell to be dynamically configured as cells become visible. Both of these features are useful for integrating directly against a PDM and enabling visibility guided interaction.
  • the spatial hierarchy supports having the same occurrence in multiple cells. This allows for a better subdivision while still allowing for cell visibility to be traced back to a specific occurrence.
  • a query representation is dynamically generated for each cell at run time. This allows for the representation to be matched to the visibility determination algorithm being used. For example, the system can dynamically generate a set of triangles representing the bounding volume for use with OpenGL occlusion queries, as sketched below.
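  • As an illustration of generating such a query representation, the sketch below builds the indexed triangle mesh of a cell's axis-aligned bounding box (12 triangles, 36 indices). The CellBounds type and field names are assumptions for illustration, not the patent's data layout.

```cpp
#include <array>
#include <cstdint>

// Hypothetical minimal cell bounds record; field names are illustrative only.
struct CellBounds { float min[3]; float max[3]; };

// The 8 corner vertices and 36 indices (12 triangles) of the cell's
// axis-aligned bounding box, suitable for an occlusion-query draw.
struct BoxMesh {
    std::array<float, 8 * 3>      vertices;
    std::array<std::uint32_t, 36> indices;
};

BoxMesh makeBoundingBoxMesh(const CellBounds& c) {
    BoxMesh m{};
    for (int i = 0; i < 8; ++i) {                 // corner i: bits 0..2 pick min/max per axis
        m.vertices[i * 3 + 0] = (i & 1) ? c.max[0] : c.min[0];
        m.vertices[i * 3 + 1] = (i & 2) ? c.max[1] : c.min[1];
        m.vertices[i * 3 + 2] = (i & 4) ? c.max[2] : c.min[2];
    }
    static const std::uint32_t idx[36] = {
        0,1,3, 0,3,2,   4,6,7, 4,7,5,             // -z and +z faces
        0,4,5, 0,5,1,   2,3,7, 2,7,6,             // -y and +y faces
        0,2,6, 0,6,4,   1,5,7, 1,7,3              // -x and +x faces
    };
    for (int i = 0; i < 36; ++i) m.indices[i] = idx[i];
    return m;
}
```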
  • the renderlist render process renders the list of all visible occurrences as efficiently as possible.
  • the data structure was designed to utilize modern GPU functionality while minimizing the potential L2 cache impact.
  • the current implementation is based around rendering unified vertex buffer objects, multiple shapes in the same VBO, with state information passed into shader through uniform buffer objects.
  • the depth values could be read back to the host in order to generate a traditional texture depth mesh which in turn could be rendered from the current view point in order to populate the depth buffer.
  • the cost of reading the depth buffer back is far too expensive for this to be practical as taking a performance hit to read the depth buffer back immediately would defeat the purpose, and delaying such that the depth buffer is from 2-3 frames prior has a greater potential to introduce artifacts.
  • the depth buffer could be treated as a point cloud that is easily transformed into the new view point as part of a vertex shader.
  • a frame buffer object (FBO) with a depth texture render target can be used to capture the state of the depth buffer after the rendering of all visible opaque geometry has completed by blitting the buffer from the main frame buffer.
  • the sub-pixel mesh described above is rendered using a vertex shader to dynamically transform all the vertices from the previous view point to the current view point using the values from the depth texture as the initial depth offset of the vertices. If, during the fragment shader, a fragment is detected as having had an initial depth value at the depth buffer maximum, it is discarded. This ensures depth values are only propagated for those pixels caused by rendered geometry.
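  • The following is a minimal sketch, in GLSL embedded as C++ string literals, of the reprojection pass described above; the uniform names, matrix conventions, and depth-range handling are assumptions rather than the patent's shader code.

```cpp
// Vertex shader: each vertex of the sub-pixel mesh carries its previous-frame
// screen position; its depth is fetched from the captured depth texture, the point
// is unprojected with the previous view-projection inverse, then reprojected with
// the current view-projection. Uniform names are illustrative.
const char* kReprojectVS = R"(
#version 430 core
layout(location = 0) in vec2 aScreenUV;          // mesh vertex in [0,1] screen space
uniform sampler2D uPrevDepth;                    // depth texture blitted after the opaque pass
uniform mat4 uPrevInvViewProj;
uniform mat4 uCurrViewProj;
out float vPrevDepth;
void main() {
    float d = textureLod(uPrevDepth, aScreenUV, 0.0).r;   // previous-frame depth in [0,1]
    vPrevDepth = d;
    vec4 ndc   = vec4(aScreenUV * 2.0 - 1.0, d * 2.0 - 1.0, 1.0);
    vec4 world = uPrevInvViewProj * ndc;         // back to world space
    world /= world.w;
    gl_Position = uCurrViewProj * world;         // reproject into the current viewpoint
})";

// Fragment shader: discard fragments whose source depth was at the far plane,
// so depth is only propagated for pixels covered by rendered geometry.
const char* kReprojectFS = R"(
#version 430 core
in float vPrevDepth;
void main() {
    if (vPrevDepth >= 1.0) discard;              // depth-buffer maximum: no occluder here
})";
```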
  • Some approaches utilize OpenGL occlusion queries to decide the visibility state of cells within a spatial hierarchy.
  • One basic algorithm is to traverse a spatial hierarchy in a screen depth-first order and execute an individual occlusion query for each cell whose visibility state is in question.
  • These approaches result in the alternative representations of each cell being individually rendered along this front, as well as multiple state transfers in order to read back the results from the queries.
  • Modern GPUs run optimally when processing large batches of data in parallel. In terms of rendering, this means pushing as many triangles as possible in a single draw call, which runs counter to the way that traditional occlusion queries are executed.
  • Disclosed embodiments demonstrate how the visibility determination process utilizes buffer writes instead of occlusion queries.
  • the basic approach includes allocating an integer buffer object for storing occlusion results and executing a render operation that renders the shape as a single batch and populates the buffer with the visibility state of individual cells.
  • the cell alternative representations can be combined into a single draw call that ensures all the representations are rendered as a single batch while still maintaining a means by which fragments can be traced back to the originating cells.
  • the alternative representation can be rendered, for example, using glDrawRangeElements and triangles, or glDrawRangeElements with GL primitive restarts and TriStripSets.
  • Triangle-based rendering does not provide an inherent means to trace the resulting fragments back to the originating cell, so disclosed embodiments can add an additional per-primitive attribute that contains the cell's ID and can be forwarded from the vertex shader into the fragment shader.
  • Encoding each alternative representation as a single TriStripSet allows each cell to be uniquely identified within the fragment shader by using the primitive ID. Both methods can be integrated into the current visibility determination process in lieu of GL occlusion queries.
  • disclosed processes can map the buffer into memory to provide access to the pixel values for all cells in a single operation.
  • all values are initially set to 0.
  • a tree traversal is then used to propagate the values from the buffer to render data associated with each cell. Traversal along a given path is terminated if a cell is not considered visible, the cell has not been configured, or the geometric data associated with the cell has not been loaded.
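  • A sketch of such a propagation traversal is shown below, assuming a depth-first cell vector and a CPU-visible result buffer indexed parallel to the cells; the Cell fields and threshold name are illustrative only.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical cell record; only the fields needed by the traversal are shown.
struct Cell {
    std::int32_t firstChild = -1;   // index of first child, -1 if leaf
    std::int32_t childCount = 0;
    bool configured = false;        // occurrences resolved for this cell
    bool loaded = false;            // geometric data resident
    bool visible = false;
};

// Propagate pixel counts from the mapped occlusion-result buffer into the cells.
// Traversal down a path stops when a cell is culled, unconfigured, or unloaded.
void propagateVisibility(std::vector<Cell>& cells,
                         const std::uint32_t* results,      // mapped buffer, parallel to cells
                         std::uint32_t visibilityThreshold,
                         std::size_t cellIndex = 0) {
    Cell& cell = cells[cellIndex];
    cell.visible = results[cellIndex] > visibilityThreshold;
    if (!cell.visible || !cell.configured || !cell.loaded)
        return;                                             // prune this path
    for (std::int32_t i = 0; i < cell.childCount; ++i)
        propagateVisibility(cells, results, visibilityThreshold,
                            static_cast<std::size_t>(cell.firstChild + i));
}
```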
  • This embodiment queries the visibility state of all cells in the spatial hierarchy eliminating the need to post propagate the visibility state of children cells to their parents as commonly found in approaches based upon GL occlusion queries.
  • Disclosed embodiments include novel improvements that are effective in not only improving the performance of a PDM or CAD system, but also in reducing some of the undesirable artifacts that often occur when using culling techniques.
  • buffer writes provide a viable alternative to traditional GL occlusion queries. Whereas traditional occlusion queries are limited to querying only a single entity at a time, a buffer write can be used to query several entities in one go. Through this approach it was shown that buffer writes can be effectively used to query the visibility state of an entire spatial hierarchy in the time it would normally take to query only a handful of its cells. In order to ensure that only the truly visible geometry is loaded, the traversal of the spatial hierarchy for update is stopped whenever an unloaded cell is encountered. Further, the query set associated with the spatial hierarchy can be split into smaller sets such that higher-level sets can be used to filter whether lower-level sets even need to be queried.
  • the spatial hierarchy visibility front refers to the point at which cells transition from visible to culled along a given path of traversal.
  • occlusion query solutions can be placed into three primary categories: CPU Based culling, GPU Occlusion Queries, and GPU Texture Write.
  • CPU-Based Culling: Systems in this category rely on a CPU-based algorithm as the primary source of occlusion culling. There has been a recent resurgence of this approach in the game industry, as the GPU is often viewed as a scarce resource best left for more important tasks such as rendering. Prime examples of this are the Frostbite 2 and CryEngine 3 game engines. Both of these approaches use a software rasterizer on a sub-thread to execute screen-space culling of objects based upon their AABBs or OBBs.
  • One problem with these approaches is that they assume increases in the number of available CPU cores will help them perform as well as, if not better than, hardware that is optimized for this very problem and whose performance gains far outstrip those of the CPU.
  • Another problem with these approaches is they do not take into account that GPUs and their APIs are fast approaching the point of executing the entire culling and render list generation process on the GPU.
  • GPU Occlusion Queries: Systems in this category rely on GPU occlusion queries as the primary source of occlusion culling.
  • GPU Gems 2 showed that it was possible to implement an algorithm that interleaves the rendering of visible occurrence with GL Occlusion queries over a spatial hierarchy in order to minimize GPU stalls when retrieving occlusion results. This solution was adapted and used with success to increase the render performance.
  • the disclosed MMV solution utilizes an iterative approach in which previous occlusion queries are retrieved and new occlusion queries are executed on the visibility front in the spatial hierarchy every odd frame in order to avoid GPU stalls.
  • the primary problem with using GL occlusion queries is that their parallelism is limited, as entities must be queried one at a time.
  • GPU Buffer Write: Disclosed embodiments include systems and methods for executing batch occlusion queries on the GPU over the cells of a spatial hierarchy.
  • the GPU realizes the most parallelism when the number of triangles rendered or queried by a single draw call is large. This is because the GPU is typically not allowed to overlap computation between successive draw calls.
  • the system operates by generating a single vertex buffer object (VBO) that contains the bounding volumes for all cells within a spatial hierarchy (SH).
  • a multi draw elements indirect (MDEI) buffer is then setup such that the bounding volume of each cell is uniquely referenced.
  • the depth buffer is populated with the potential occluders by either rendering the previous renderlist or reprojecting the depth buffer from the previous frame.
  • a single draw call is then executed using the MDEI buffer and a carefully crafted fragment shader that atomically increments the pixel value in a buffer object associated with the unique ID of a bounding volume whenever one of its fragments passes the depth test.
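  • A hedged sketch of such a shader pair is shown below. It assumes GL 4.6 (or ARB_shader_draw_parameters) so that gl_BaseInstance, which the MDEI setup described later sets to the cell index, is visible in the vertex shader, and it forces early fragment tests so that only fragments passing the depth test are counted; binding points and uniform names are assumptions. In practice, depth writes would also be disabled for this pass so the bounding volumes do not occlude one another.

```cpp
// Vertex shader: forwards the base instance of the current indirect-draw record
// (set to the cell index in the MDEI buffer) to the fragment shader.
const char* kOcclusionVS = R"(
#version 460 core
layout(location = 0) in vec3 aPosition;          // bounding-volume vertex
uniform mat4 uViewProj;
flat out int vCellIndex;
void main() {
    vCellIndex  = gl_BaseInstance;               // BaseInstance == cell index (assumed layout)
    gl_Position = uViewProj * vec4(aPosition, 1.0);
})";

// Fragment shader: every fragment that survives the depth test bumps the
// counter for its cell in a shader storage buffer indexed parallel to the cells.
const char* kOcclusionFS = R"(
#version 460 core
layout(early_fragment_tests) in;                 // depth test runs before the shader, so
                                                 // only passing fragments are counted
flat in int vCellIndex;
layout(std430, binding = 0) buffer CellHitCounts { uint hits[]; };
void main() {
    atomicAdd(hits[vCellIndex], 1u);
})";
```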
  • the resulting buffer is copied into another buffer that has been persistently mapped into a pointer on the host. This pointer is then indexed parallel to the cells in the SH in order to retrieve the current pixel value for each cell. These values are used for generating the render list, loading data, and selecting LODs.
  • the cell list may be split into multiple segments representing different sub regions of the spatial hierarchy.
  • This system is different from occlusion-query based solutions, as it is capable of executing all queries and retrieving all results with a single call. Furthermore, the system can query any subset of the cells in the spatial hierarchy.
  • This solution is different from existing texture-write based approaches in that disclosed embodiments are designed to operate on the cells of a spatial hierarchy, whereas other techniques are designed around occurrences. This achieves better scalability, as the spatial hierarchy reduces the number of occurrences that may be accidentally loaded and significantly reduces the number of bounding volumes that need to be rendered, by breaking the spatial hierarchy into multiple sub-regions and using the results from higher regions to determine whether a sub-region needs to be checked. Additionally, the spatial hierarchy can be leveraged when updating the visibility state so as to minimize the impact when first entering a region in which the current depth information is unknown.
  • the massive model rendering process can be split into two main pipeline stages: rendering and strategy.
  • the render stage is responsible for generating the on-screen image through multi-stage rendering of a render list.
  • the strategy stage is responsible for generating a render list of visible geometry to be used in the render stage.
  • the render stage can be broken into four primary sub-stages.
  • the first stage is the rendering of opaque geometry into both the color and depth buffers.
  • the second stage is the blitting of the current depth buffer into a depth texture for use in the spatial strategy.
  • the third stage is rendering transparent geometry into both the color and depth buffers.
  • the fourth and final stage of rendering is again the blitting of the depth buffer into a depth texture for potential use in the spatial strategy.
  • the strategy stage can be broken into four sub-stages: obtain results, update renderlist, render depth, and execute query. The strategy is responsible for executing occlusion queries.
  • the render depth stage is responsible for populating the depth buffer on the GPU with the depth values for potential occluders. This is accomplished by either rendering the previous render list or by utilizing techniques as described in the provisional patent application incorporated herein.
  • the execute query stage is responsible for executing occlusion queries over the cells of a spatial hierarchy in order to determine if they are culled or visible from the current viewpoint.
  • occlusion queries are executed by rendering the bounding volumes of all cells in the spatial hierarchy in a single draw call and utilizing a fragment shader to atomically increment the pixel value associated with a cell each time one of the fragments produced by its bounding volume is not culled by the depth buffer. This operation results in a buffer that tallies the number of visible fragments for each cell in the spatial hierarchy.
  • the cells may be queried in groups so as to reduce the number of bounding volumes that need to be rendered in order to determine the visibility of all cells. When this occurs, the results from the previous frame for higher-level groups can be used to determine if a lower-level group should be queried.
  • the obtain results stage is responsible for retrieving the results from the queries. It iterates through the cells in the spatial hierarchy and retrieves the visibility results for each cell from an index-parallel buffer object that has been persistently mapped on the CPU. Iteration of child cells may be stopped in order to ensure any occurrences associated with a visible parent are loaded and rendered prior to potentially marking the child as visible. This optimization helps to limit the number of cells that are accidentally marked as visible, and that therefore configure or load their occurrences, in regions of space in which the current depth buffer is unknown.
  • the buffer object used for capturing the per-cell pixel value is made available to the CPU in a form that does not block the GPU. This data structure is designed to be indexed parallel to the cells in the spatial hierarchy.
  • the update renderlist stage is responsible for generating a renderlist based upon the current visibility results to be used in the primary render stage. It iterates through the cells in the spatial hierarchy and, for any cell that is marked as visible, checks whether there are associated occurrences. If there are occurrences and they are loaded, they are inserted into the renderlist. If there are occurrences but they are not loaded, a request for loading may occur here.
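  • A high-level sketch of how these four sub-stages might be sequenced each frame is given below; the types and function names are placeholders for the components described above, not the patent's API.

```cpp
// Placeholder types and sub-stage functions standing in for the components
// described above; only the sequencing is the point of this sketch.
struct Scene;
struct RenderList;
void obtainResults(Scene&);
void updateRenderList(Scene&, RenderList&);
void renderDepth(Scene&);
void executeQuery(Scene&);

// One strategy iteration per frame, ordered so that query results issued in the
// previous frame are consumed first and a new batch query is issued last.
void runStrategyFrame(Scene& scene, RenderList& renderList) {
    obtainResults(scene);                  // read back per-cell pixel counts
    updateRenderList(scene, renderList);   // insert visible, loaded occurrences
    renderDepth(scene);                    // populate depth buffer with potential occluders
    executeQuery(scene);                   // batch occlusion query over cell bounding volumes
}
```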
  • Figures 3 and 4 demonstrate how a spatial hierarchy 300/400 can be mapped for batch query of the entire spatial hierarchy, or in the case of an extremely large hierarchy, multiple sets for batch query.
  • the spatial hierarchy is implemented as a vector of cells in which each cell can be uniquely identified by index.
  • the structure of the hierarchy is established through each cell containing a parent index, a child index, and the number of children.
  • the cells are defined in a depth first order that guarantees that children cells have a larger index and are grouped such that all cells within a sub-tree have contiguous indices as shown in Fig. 3.
  • a breadth-first ordering as illustrated in Fig. 4 results in multi-query groups where not all cells in a sub-tree have contiguous indices, as illustrated by the sub-trees of node two, which would include cells 4, 5, 8, 9, 10 and 11.
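  • A minimal sketch of this vector-of-cells layout is shown below, with illustrative field names; under depth-first numbering every sub-tree occupies a contiguous index range, which is what allows a whole sub-tree to be queried as one segment of the batch.

```cpp
#include <cstdint>
#include <vector>

// One cell of the spatial hierarchy; indices refer back into the same vector.
// Field names are illustrative, not the patent's exact layout.
struct HierarchyCell {
    std::int32_t parent = -1;      // index of parent cell, -1 for the root
    std::int32_t firstChild = -1;  // index of first child, -1 if leaf
    std::int32_t childCount = 0;
    float bboxMin[3] = {0, 0, 0};
    float bboxMax[3] = {0, 0, 0};
};

// With depth-first numbering, the sub-tree rooted at `root` is exactly the
// contiguous index range [root, root + subtreeSize(root)).
std::size_t subtreeSize(const std::vector<HierarchyCell>& cells, std::size_t root) {
    std::size_t size = 1;
    const HierarchyCell& c = cells[root];
    for (std::int32_t i = 0; i < c.childCount; ++i)
        size += subtreeSize(cells, static_cast<std::size_t>(c.firstChild + i));
    return size;
}
```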
  • a buffer object containing integer values is allocated in parallel to the cells of the spatial hierarchy.
  • during occlusion tests, if a fragment associated with the bounding volume of a cell passes the depth test, the value contained within the associated element is incremented, resulting in the buffer containing the pixel hit count for all tested cells.
  • a secondary buffer is allocated and persistently mapped, such that values can be read back from it through a direct pointer access. At the end of each occlusion pass, the values from the primary buffer are copied into this secondary buffer. This subtle enhancement allows the primary buffer to remain only in GPU memory which significantly improves the performance of the write operations.
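  • A sketch of this double-buffer arrangement, assuming OpenGL 4.4 buffer storage with persistent mapping, is shown below; buffer targets, sizes, and the loader header are illustrative choices.

```cpp
#include <GL/glew.h>   // any loader exposing GL 4.4+ entry points
#include <cstdint>

struct OcclusionResultBuffers {
    GLuint deviceBuffer = 0;                 // written by the fragment shader (GPU memory only)
    GLuint readbackBuffer = 0;               // persistently mapped copy for the CPU
    const std::uint32_t* mapped = nullptr;   // direct pointer, indexed parallel to the cells
};

OcclusionResultBuffers createResultBuffers(GLsizeiptr cellCount) {
    OcclusionResultBuffers b;
    const GLsizeiptr bytes = cellCount * sizeof(std::uint32_t);

    glGenBuffers(1, &b.deviceBuffer);
    glBindBuffer(GL_SHADER_STORAGE_BUFFER, b.deviceBuffer);
    glBufferStorage(GL_SHADER_STORAGE_BUFFER, bytes, nullptr, 0);   // device-local, never mapped

    glGenBuffers(1, &b.readbackBuffer);
    glBindBuffer(GL_COPY_WRITE_BUFFER, b.readbackBuffer);
    glBufferStorage(GL_COPY_WRITE_BUFFER, bytes, nullptr,
                    GL_MAP_READ_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT);
    b.mapped = static_cast<const std::uint32_t*>(
        glMapBufferRange(GL_COPY_WRITE_BUFFER, 0, bytes,
                         GL_MAP_READ_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT));
    return b;
}

// At the end of each occlusion pass, copy the device buffer into the mapped one;
// a fence/sync would normally guard the CPU-side read.
void copyResults(const OcclusionResultBuffers& b, GLsizeiptr cellCount) {
    glBindBuffer(GL_COPY_READ_BUFFER, b.deviceBuffer);
    glBindBuffer(GL_COPY_WRITE_BUFFER, b.readbackBuffer);
    glCopyBufferSubData(GL_COPY_READ_BUFFER, GL_COPY_WRITE_BUFFER,
                        0, 0, cellCount * sizeof(std::uint32_t));
}
```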
  • Multi-Draw Indirect: Modern GPUs are optimized for executing work in parallel. The best performance is therefore achieved when processing large batches.
  • the system batches the rendering of the cell bounding volumes to perform occlusion tests. It defines the batches in such a way that the associated cell for any given volume can be uniquely identified in both the vertex and fragment shader.
  • FIG. 5 illustrates a Multi Draw Elements Indirect (MDEI) buffer 502, index VBO 504, and vertex VBO 506 in accordance with disclosed embodiments.
  • MDEI Multi Draw Elements Indirect
  • the MDEI buffer is parallel to the cells and is initialized such that each DrawElementsIndirectBuffer contains the FirstIndex and BaseVertex for the geometric information of the corresponding cell in the VBO, and the BaseInstance is set to the corresponding cell index.
  • This setup allows all cells or a sub-tree of cells to be rendered in a single draw, if so desired, and, more importantly, allows the fragment shader to readily identify the associated cell's bounding volume for a given fragment.
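  • The sketch below shows one way the MDEI buffer could be filled using the standard DrawElementsIndirectCommand layout; the per-cell index and vertex offsets assume each cell packs 36 indices and 8 vertices into the unified VBOs, which is an illustrative simplification.

```cpp
#include <GL/glew.h>
#include <cstdint>
#include <vector>

// Standard layout consumed by glMultiDrawElementsIndirect.
struct DrawElementsIndirectCommand {
    std::uint32_t count;          // indices per bounding volume (36 for a box)
    std::uint32_t instanceCount;  // 1: each cell's volume is drawn once
    std::uint32_t firstIndex;     // offset into the shared index VBO
    std::uint32_t baseVertex;     // offset into the shared vertex VBO
    std::uint32_t baseInstance;   // set to the cell index so shaders can recover it
};

// Build one draw record per cell and upload the buffer.
GLuint buildMdeiBuffer(std::size_t cellCount) {
    std::vector<DrawElementsIndirectCommand> cmds(cellCount);
    for (std::size_t i = 0; i < cellCount; ++i) {
        cmds[i] = { 36u, 1u,
                    static_cast<std::uint32_t>(i * 36),   // FirstIndex
                    static_cast<std::uint32_t>(i * 8),    // BaseVertex
                    static_cast<std::uint32_t>(i) };      // BaseInstance == cell index
    }
    GLuint buffer = 0;
    glGenBuffers(1, &buffer);
    glBindBuffer(GL_DRAW_INDIRECT_BUFFER, buffer);
    glBufferData(GL_DRAW_INDIRECT_BUFFER,
                 cmds.size() * sizeof(DrawElementsIndirectCommand),
                 cmds.data(), GL_STATIC_DRAW);
    return buffer;
}

// All cells (or any contiguous sub-tree) can then be queried in one call:
//   glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT, nullptr,
//                               static_cast<GLsizei>(cellCount), 0);
```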
  • Disclosed embodiments have the potential to significantly increase the rendering performance of MMV systems. Early tests have shown a 2-3x increase in the frame rate of several large data sets. It also improves upon the accuracy of occlusion tests and the responsiveness to changes in the visibility state of the spatial hierarchy cells. The batch query contributes significantly to the increase in performance. Various embodiments eliminate the use of expensive GL Occlusion queries and reduce the number of occlusion tests / result retrievals from the number of cells on the front to 1 or the number of visible batches in the case of a large SH.
  • the batch query over cells significantly contributes to performance, accuracy, and responsiveness.
  • Various embodiments update the pixel value for all cells every frame and improve the level of detail (LOD) selection based upon the cell pixel value.
  • LOD level of detail
  • Various embodiments allow cell-visibility state changes to be reflected almost immediately in the SH.
  • the single draw call for all queries significantly improves performance.
  • Various embodiments reduce CPU overhead and increase parallelism on the GPU.
  • the multiple batch sets for large spatial hierarchies significantly improves performance and scalability.
  • Various embodiments reduce the number of bounding volumes to be rendered in order to determine visibility of all cells.
  • Figure 6 illustrates a process in accordance with disclosed embodiments that can be performed, for example, by one or more data processing systems 100, referred to generically as "the system” below.
  • This figure provides an overview flowchart of a disclosed massive model rendering process in accordance with disclosed embodiments, including a GPU occlusion query with batch query of spatial hierarchy.
  • the system can receive the 3D geometric model (605).
  • "Receiving,” as used herein, can include loading from storage, receiving from another device or process, receiving via an interaction with a user, and otherwise.
  • Receiving the 3D model can be implemented by receiving a product structure of the 3D model, as described above with respect to Fig. 2.
  • the system can populate a depth buffer on a graphics processing unit (GPU) with depth values of opaque geometries of the 3D geometric model (610).
  • GPU graphics processing unit
  • the system can execute occlusion queries over cells of a spatial hierarchy that correspond to the 3D geometric model (615).
  • Each cell of the spatial hierarchy represents a geometric bounding volume that encompasses some portion of the 3D model, and preferably every portion of the 3D model is included in some cell.
  • the cell data is stored in a spatial hierarchy that represents the spatial location of each cell and its respective portions of the 3D model. In this way, the spatial hierarchy can identify the spatial/geometric location of any part, assembly, subassembly, or other portion of the 3D model according to its cell.
  • the cells can be processed in spatial-hierarchy groups so as to reduce the number of bounding volumes that need to be rendered in order to determine the visibility of all cells.
  • executing occlusion queries over cells of a spatial hierarchy includes performing a batch query of a visibility state of all cells in the spatial hierarchy. In some cases, executing occlusion queries over cells of a spatial hierarchy includes splitting a query set associated with the spatial hierarchy into smaller sets such that higher level sets are used to filter on whether lower level sets must be queried. In some cases, executing occlusion queries over cells of a spatial hierarchy does not execute an occlusion query for any cell that has a parent cell that is determined to be culled from the current viewpoint.
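  • The sketch below illustrates, under assumed group bookkeeping, how results from a higher-level set in the previous frame could gate whether a lower-level set is queried at all; the QueryGroup type and field names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// A query group covers one contiguous sub-tree of the depth-first cell vector;
// `parentGroup` is the higher-level group whose result gates this one.
struct QueryGroup {
    std::size_t firstCell = 0;
    std::size_t cellCount = 0;
    std::int32_t parentGroup = -1;   // -1: always queried (top-level set)
    std::uint32_t lastPixelCount = 0;
};

// Decide which groups to submit this frame: a lower-level set is skipped when
// its parent set produced no visible pixels in the previous frame.
std::vector<std::size_t> selectGroupsToQuery(const std::vector<QueryGroup>& groups,
                                             std::uint32_t visibilityThreshold) {
    std::vector<std::size_t> toQuery;
    for (std::size_t g = 0; g < groups.size(); ++g) {
        const QueryGroup& grp = groups[g];
        const bool parentVisible = grp.parentGroup < 0 ||
            groups[static_cast<std::size_t>(grp.parentGroup)].lastPixelCount > visibilityThreshold;
        if (parentVisible)
            toQuery.push_back(g);    // submit this group's MDEI segment
    }
    return toQuery;
}
```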
  • Obtaining results from occlusion queries can include reading a query list of cells to be queried for visibility. The outstanding query list is traversed and, for all queries that are ready, the number of pixels that were hit during rendering is obtained. If the number of pixels is greater than the visibility threshold, the associated cell is marked as visible, and if the cell's previous state was considered indeterministic, a counter is incremented to track the number of indeterministic cells that have been processed.
  • the pixel value can also be used to set the number of frames each cell should delay before executing another query.
  • for visible cells, the delay value can be equal to the pixel value clamped to 254 and right shifted by 5; for invisible cells, the delay value is set to 5 minus the pixel value, clamped to 0.
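  • One plausible reading of this delay rule, expressed as a sketch (the exact clamping interpretation is an assumption):

```cpp
#include <algorithm>
#include <cstdint>

// Number of frames a cell waits before being queried again: visible cells delay
// longer the more pixels they cover, invisible cells are re-queried sooner the
// closer their pixel count was to the visibility threshold.
std::uint32_t queryDelayFrames(std::uint32_t pixelCount, bool visible) {
    if (visible)
        return std::min<std::uint32_t>(pixelCount, 254u) >> 5;       // 0..7 frame delay
    return static_cast<std::uint32_t>(std::max<std::int32_t>(
        5 - static_cast<std::int32_t>(pixelCount), 0));              // "5 minus pixel value, clamped to 0"
}
```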
  • Frame delays for queries can be used to reduce the number of queries that are executed in a given frame, either as a fixed delay value or as a random delay value.
  • the system can perform a spatial update of the spatial hierarchy, including traversing the spatial hierarchy and updating a visibility state of the cells.
  • Performing the spatial update can include one or more processes of pruning the traversal based upon a state of the spatial hierarchy or an indeterminate state of the depth buffer, performing data configuration and data loading based upon a visibility state of cells of the spatial hierarchy, setting a configuration and a load priority based upon a returned pixel coverage, or dynamically selecting a level of detail based on the returned pixel coverage.
  • the spatial hierarchy tree can be traversed starting at the front cell from the current view of the 3D model and then up to the root cell of the tree.
  • the system can mark any cell that does not contribute enough pixels to the current frame as culled.
  • the system can mark any cell whose children are all culled as potentially culled if the cell pixel count from a recent query is not greater than a threshold value.
  • the system can determine, based on results from the occlusion queries, if each cell is culled or visible from a current viewpoint (620).
  • the system can display the 3D geometric model according to the cells that are determined to be visible from the current viewpoint (625).
  • Disclosed embodiments include a method for massive model visualization performed by a data processing system.
  • the method includes executing a rendering stage on a three-dimensional (3D) geometric model.
  • the method includes executing a strategy stage on the 3D geometric model.
  • the method includes displaying the 3D geometric model according to the rendering stage and strategy stage.
  • the rendering stage includes rendering opaque geometry of the 3D geometric model into a color buffer and a depth buffer, blitting the current depth buffer into a depth texture for use in the strategy stage, rendering transparent geometry of the 3D geometric model into the color buffer and the depth buffer, and blitting the depth buffer into a depth texture for use in the spatial strategy.
  • the strategy stage includes an obtain results substage, an update RenderList substage, a render depth substage, and an execute query substage.
  • the render depth substage populates a z-buffer on a graphics processing unit (GPU) with depth values for potential occluders.
  • the execute query substage executes occlusion queries over cells of a spatial hierarchy in order to determine if each cell is culled or visible from a current viewpoint.
  • the obtain results substage retrieves the results from occlusion queries.
  • the update RenderList substage generates a RenderList based upon current visibility results to be used in the rendering stage.
  • MMR an interactive massive model rendering system using geometric and image-based acceleration.
  • machine usable/readable or computer usable/readable mediums include: nonvolatile, hard-coded type mediums such as read only memories (ROMs) or erasable, electrically programmable read only memories (EEPROMs), and user-recordable type mediums such as floppy disks, hard disk drives and compact disk read only memories (CD-ROMs) or digital versatile disks (DVDs).
  • ROMs read only memories
  • EEPROMs electrically programmable read only memories
  • user-recordable type mediums such as floppy disks, hard disk drives and compact disk read only memories (CD-ROMs) or digital versatile disks (DVDs).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Generation (AREA)

Abstract

Methods for massive model visualization, and corresponding systems and computer-readable mediums, are disclosed. A method includes populating (610) a depth buffer on a graphics processing unit (GPU) (128) with depth values of opaque geometries of a three-dimensional (3D) geometric model (226). The method includes executing (615) occlusion queries over cells of a spatial hierarchy (300, 400) that correspond to the 3D geometric model (226). The method includes determining (620), based on results from the occlusion queries, if each cell is culled or visible from a current viewpoint. The method includes displaying (625) the 3D geometric model (226) according to the cells that are determined to be visible from the current viewpoint.
PCT/US2016/050654 2016-03-21 2016-09-08 GPU batch occlusion query with spatial update WO2017164923A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662311067P 2016-03-21 2016-03-21
US62/311,067 2016-03-21

Publications (1)

Publication Number Publication Date
WO2017164923A1 true WO2017164923A1 (fr) 2017-09-28

Family

ID=59899652

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/050654 WO2017164923A1 (fr) 2016-03-21 2016-09-08 GPU batch occlusion query with spatial update

Country Status (1)

Country Link
WO (1) WO2017164923A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108156520A (zh) * 2017-12-29 2018-06-12 珠海市君天电子科技有限公司 视频播放方法、装置、电子设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060164410A1 (en) * 2005-01-27 2006-07-27 Wei Li Invisible space skipping with adaptive granularity for texture-based volume rendering
US20080225048A1 (en) * 2007-03-15 2008-09-18 Microsoft Corporation Culling occlusions when rendering graphics on computers
US8115767B2 (en) * 2006-12-08 2012-02-14 Mental Images Gmbh Computer graphics shadow volumes using hierarchical occlusion culling
US20130076762A1 (en) * 2011-09-22 2013-03-28 Arm Limited Occlusion queries in graphics processing
US20160005216A1 (en) * 2014-07-03 2016-01-07 Center Of Human-Centered Interaction For Coexistence Method, apparatus, and computer-readable recording medium for depth warping based occlusion culling

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108156520A (zh) * 2017-12-29 2018-06-12 珠海市君天电子科技有限公司 视频播放方法、装置、电子设备及存储介质
CN108156520B (zh) * 2017-12-29 2020-08-25 珠海市君天电子科技有限公司 视频播放方法、装置、电子设备及存储介质

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16895713

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16895713

Country of ref document: EP

Kind code of ref document: A1