CN103268342A

CN103268342A - CUDA-based DEM dynamic visualization acceleration system and method

Info

Publication number: CN103268342A
Application number: CN2013101949673A
Authority: CN
Inventors: 郭潇; 刘磊; 李浩然; 高勇; 郁浩
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2013-05-21
Filing date: 2013-05-21
Publication date: 2013-08-28

Abstract

The invention provides a CUDA (Compute Unified Device Architecture)-based method for accelerating the dynamic visualization of a DEM (Digital Elevation Model). The method comprises a data preprocessing method, a dynamic visualization method and a parallel co-scheduling method. The data preprocessing method generates dynamically loadable tile data in a parallel environment. The dynamic visualization method covers both visualization acceleration and computation acceleration: LOD (Level of Detail) is used to accelerate visualization, and CUDA is used to accelerate the computation of terrain changes. The parallel co-scheduling method runs four operations in parallel and in coordination: external storage-memory data exchange, memory-video memory data exchange, CPU computation and CUDA (GPU) computation. The method can meet the requirements of displaying large-scale dynamic DEMs.

Description

CUDA-based DEM dynamic visualization acceleration system and method

Technical field

The invention belongs to the field of high-performance geographic information visualization. It specifically relates to a system and method for accelerating the dynamic visualization of a DEM (Digital Elevation Model) based on CUDA (Compute Unified Device Architecture, a general-purpose parallel computing architecture) technology.

Background art

Geomorphic processes and landform evolution have always been an important aspect of geoscience research. Before the 1960s, most studies of landform evolution focused on theoretical models. Over the past three decades, new techniques such as simulation experiments, precision surveying, geographic information systems and remote sensing have laid a solid foundation for quantitative geomorphology. Statistical methods in geomorphology are now mature; the current research focus, and difficulty, is to build dynamics-based process-response mathematical models of landforms and use them to simulate landform evolution.

A digital elevation model (DEM) is an important kind of data for studying surface processes and tectonic landforms. Dynamic visualization of a DEM is an effective method for simulating landform evolution. The process can be divided into two steps:

First, landform display based on DEM data. A DEM directly describes the most basic terrain information, elevation, and can easily be converted into the polygon vertices needed for display. Combined with texture mapping to improve the visual effect, it can display realistic landforms.

Second, computation of landform change combined with DEM neighborhood analysis. Neighborhood analysis of a DEM (for example slope analysis, aspect analysis, edge detection and filtering) can extract many kinds of basic data. In landform simulation, the results of DEM neighborhood analysis (such as slope and aspect) are usually combined with thematic data (such as rainfall and soil information) to compute the change in terrain elevation. Before each frame is rendered, the newly computed elevation change is added to the existing landform elevation, which produces a dynamic display.

In the field of geographic information systems (GIS), the DEM data to be displayed is very large, which makes both steps difficult. For the first step, landform display based on DEM data, a naive algorithm has to display more polygon vertices at once than existing hardware can handle, so it cannot meet the needs of real-time display. Much productive research has been done here, resulting in a body of methods that combines grid simplification, level-of-detail (LOD) models, optimized scheduling between external storage and memory, and other techniques. These methods solve the display efficiency problem. For the second step, DEM neighborhood analysis, computation of the terrain elevation change, and superposition of the elevation change on the existing landform can all be parallelized. With general-purpose GPU (GPGPU) computing, represented by CUDA (Compute Unified Device Architecture), large-scale computations can be accelerated significantly, which solves the computational efficiency problem.

Although mature techniques exist for accelerating display and for accelerating computation separately, combining the two is not easy. Several problems remain: how to embed CUDA computation in the rendering pipeline, how to arrange execution on the CPU and the GPU so as to reduce waiting, and how to schedule external storage, memory and video memory in a unified way.

Summary of the invention

The present invention proposes a CUDA-based DEM dynamic visualization acceleration system and method. The aim is to provide a general acceleration method that can both display large-scale DEM data using existing efficient DEM visualization techniques and use CUDA to compute dynamic terrain changes efficiently.

To achieve the above aim, the technical solution of the present invention is as follows:

A CUDA-based DEM dynamic visualization acceleration system, comprising a data preprocessing module, a dynamic visualization module and a parallel co-scheduling module, characterized in that:

The data preprocessing module cuts the data used for display and computation into tiles (that is, it divides one large file into many small files that can each be used independently) and so prepares the data for dynamic loading during display. Preprocessing is carried out in a distributed environment, which gives a significant speedup over a single machine. It is implemented as follows. The raw data to be cut consists of DEM data and thematic data; the results of cutting are LODs, tile DEMs and tile thematic data (the latter two are collectively called tile data). The LODs form a hierarchy in which different levels correspond to different display resolutions. Each LOD level is divided into many small files, and every small file represents the same number of grid cells. Tile data has no hierarchy: the large file is simply divided into files of equal size, and the geographic extent of each small file is the same as that of one file in the bottom LOD level. In distributed preprocessing, one computer acts as a server and assigns tasks to multiple clients.

The dynamic visualization module comprises a visualization acceleration unit and a dynamic computation acceleration unit.

The purpose of the visualization acceleration unit is to reduce the number of polygon vertices displayed on screen at once. It uses LODs to achieve this, and it organizes the LODs in a scene graph, which makes the hierarchy clearer and speeds up the rendering pipeline. It is implemented as follows. The scene graph organizes the LODs hierarchically as a quadtree: one low-resolution LOD corresponds to 4 slightly higher-resolution LODs covering the same geographic extent, and these 4 LODs are the child nodes of the low-resolution LOD; each of these 4 LODs in turn has its own child nodes, and so on. This organization removes some unnecessary computation from the scene graph rendering pipeline (update, cull, render) and so speeds it up.

The purpose of the dynamic computation acceleration unit is to quickly compute the change in the DEM between two frames and apply that change to the LODs. It uses CUDA instead of the CPU for this computation, and it integrates the CUDA computation with the scene graph organization so that the display process is both correct and efficient. It is implemented as follows. Based on the tile DEMs and tile thematic data, CUDA is used to compute the elevation change of the DEM. A computation module is added to each scene graph node; during the update traversal of the scene graph, this module uses CUDA to add the elevation change to the LOD. The final rendered result shows the dynamic DEM.

The parallel co-scheduling module allows four operations that can run concurrently, namely external storage (mainly hard disk)-memory data exchange, memory-video memory data exchange, CPU computation and CUDA (GPU) computation, to be processed in parallel without affecting the display result, which improves display efficiency. It is implemented as follows: external storage-memory data exchange is optimized mainly by prefetching; memory-video memory data exchange has no optimization method with a significant effect; CPU computation scheduling is optimized by handling the dynamic LODs first; and CUDA computation scheduling is optimized by distributing the different steps of the CUDA computation across the different steps of rendering.

The present invention also includes a CUDA-based DEM dynamic visualization acceleration method, comprising the following steps:

1) Data preprocessing generates the LODs, tile DEMs and tile thematic data needed for display, which serve as the data basis for the subsequent display.

2) Display of the dynamic DEM is divided into 4 steps:

2.1) DEM elevation change computation:

CUDA begins computing the DEM elevation change for the regions that need to be displayed dynamically. Because CUDA computation can run fully in parallel with CPU computation, much of the elevation change computation is carried out in parallel with the following three steps in order to speed up the rendering pipeline.

2.2) Update traversal of the scene graph:

Traverse the scene graph and select the LODs to be displayed; use CUDA to add the DEM elevation change to the LODs that need to be displayed dynamically.

The update traversal of the scene graph consists of two steps:

Step 1: Use the hierarchy of the scene graph to select efficiently the LODs that need to be displayed. This is implemented as follows: the scene graph organizes the LODs hierarchically as a quadtree; one low-resolution LOD corresponds to 4 slightly higher-resolution LODs covering the same geographic extent, and these 4 LODs are the child nodes of the low-resolution LOD; each of these 4 LODs in turn has its own child nodes, and so on. If a low-resolution LOD is outside the display range, the corresponding higher-resolution LODs must also be outside it, so they need not be tested again.

Step 2: Use CUDA to add the DEM elevation change to the LODs that need to be displayed dynamically. This is implemented as follows: every node of the scene graph is implemented as an LOD node that can call CUDA functions, hereinafter called a CUDA-LOD node. A CUDA-LOD node adds a computation module to the LOD node; the module contains a computation function and references to computation resources. A CUDA-LOD node allows its computation module to access the resources of the node and of all its child nodes. When the node is visited during the update traversal, the computation function in the module is executed, and it calls a CUDA kernel to perform the parallel computation. Because CUDA-LOD nodes at different levels have different resolutions, their grid points do not correspond one-to-one with the elevation change data of the tile DEMs; instead there is a mapping between them, which must be implemented in the computation function.

2.3) Cull traversal of the scene graph:

The cull traversal is a necessary step of scene graph rendering. In the present invention, however, culling has to be done first in order to determine the scene graph and add the DEM elevation change to the LODs, so LOD culling has already been completed in step 2.2). This step performs almost no computation.

2.4) Render traversal of the scene graph:

The GPU performs the final rendering of the LODs that need to be rendered.
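
As an illustration of step 1 of the update traversal in 2.2), the following Python sketch shows how a quadtree lets a whole subtree be skipped once its parent is found to be outside the display range. It is a minimal sketch, not the patented implementation: the `LodNode` class, the rectangle test `intersects`, the helper `build` and the rectangular view are all assumptions made for the example, and a real scene graph library would test bounding volumes against the view frustum instead.

```python
from dataclasses import dataclass, field


@dataclass
class LodNode:
    # Footprint of the node as (xmin, ymin, xmax, ymax); hypothetical layout.
    bounds: tuple
    children: list = field(default_factory=list)


def intersects(a, b):
    """True if two (xmin, ymin, xmax, ymax) rectangles overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def visible_leaves(node, view, visited):
    """Collect visible leaves; a node outside the view prunes its whole subtree."""
    visited.append(node.bounds)
    if not intersects(node.bounds, view):
        return []  # children cover the same extent, so none can be visible
    if not node.children:
        return [node]
    found = []
    for child in node.children:
        found.extend(visible_leaves(child, view, visited))
    return found


def build(bounds, depth):
    """Build a full quadtree of the given depth over `bounds`."""
    node = LodNode(bounds)
    if depth > 0:
        x0, y0, x1, y1 = bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        quads = [(x0, y0, xm, ym), (xm, y0, x1, ym),
                 (x0, ym, xm, y1), (xm, ym, x1, y1)]
        node.children = [build(q, depth - 1) for q in quads]
    return node


root = build((0, 0, 16, 16), depth=2)  # 1 + 4 + 16 = 21 nodes
visited = []
leaves = visible_leaves(root, (0, 0, 3, 3), visited)
print(len(leaves), len(visited))  # 1 visible leaf, 9 of 21 nodes tested
```

In this example only 9 of the 21 nodes are tested, because three of the four second-level nodes fail the test and their 12 children are never visited.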

Beneficial effects of the present invention: the technical solution described here applies the parallel computing power of the GPU to computing the dynamic evolution of a large-scale DEM during display, and achieves efficient data exchange among external storage, memory and video memory as well as parallel computation on the CPU and GPU, meeting the requirements of large-scale dynamic DEM display.

Description of the drawings

Fig. 1 is a flowchart of the method of the present invention.

Fig. 2 is a drawing of the embodiment.

Detailed description

Data preprocessing. In GIS, the storage size of large-scale DEM data often exceeds memory capacity, so the DEM and the related thematic data must be preprocessed (mainly cut into small files) to allow dynamic loading. The preprocessing is parallel and is carried out in a distributed environment. In parallel processing, the task granularity must be set sensibly: when individual files are small, the generation of several files should be bundled into one task.

Preprocessing consists of two operations:

1. Generate the LODs used for display from the DEM, to allow dynamic loading during display. LODs are geometric grids of different resolutions generated from the original DEM. During display, a low- or high-resolution LOD is used depending on whether the viewpoint is far or near, which improves display efficiency with almost no loss of detail. The LODs are organized hierarchically in a quadtree: the first-level LOD represents the whole DEM area, each subsequent level refines the previous one, and each LOD node is stored as one file. For example, the first level has 1 node and represents the whole area at a resolution of 64*64; the second level has 4 nodes and represents the whole area at 128*128, with each node still holding 64*64 grid cells and covering 1/4 of the area; the third level has a resolution of 256*256 and 16 nodes, each again with 64*64 grid cells; and so on, until the bottom level reaches the resolution of the original DEM data.

2. Cut the DEM and the related thematic data, to allow dynamic loading during computation. Both must be cut into blocks that correspond to the bottom LOD level. For example, for a 1024*1024 DEM whose LOD nodes each have a resolution of 64*64, the bottom level has 16*16 = 256 blocks, so the DEM and the thematic data are also cut into 256 corresponding blocks. Each block covers the same geographic extent as its LOD node; the blocks are called tile DEMs and tile thematic data. Adjacent tile DEMs overlap by an amount that depends on the size of the DEM neighborhood operator.
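
The arithmetic behind the two preprocessing operations can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `lod_pyramid` and `tile_windows` are hypothetical, the DEM is assumed square with a side of 64 times a power of two, indices count grid cells, and the overlap (`halo`) is assumed to equal the radius of the neighborhood operator (1 for a 3*3 operator), which the text does not specify.

```python
def lod_pyramid(dem_cells, node_cells=64):
    """Return (level, resolution, nodes_per_side, node_count) for every LOD level."""
    levels = []
    resolution, level = node_cells, 1
    while resolution <= dem_cells:
        per_side = resolution // node_cells
        levels.append((level, resolution, per_side, per_side * per_side))
        resolution *= 2  # each level doubles the resolution of the whole area
        level += 1
    return levels


def tile_windows(dem_cells, tile_cells=64, halo=1):
    """Return {(row, col): (r0, r1, c0, c1)}: half-open cell windows with overlap.

    Each tile is grown by `halo` cells on every side and clamped to the DEM, so
    that a neighborhood operator can be evaluated on the tile's border cells.
    """
    per_side = dem_cells // tile_cells
    windows = {}
    for row in range(per_side):
        for col in range(per_side):
            r0 = max(row * tile_cells - halo, 0)
            r1 = min((row + 1) * tile_cells + halo, dem_cells)
            c0 = max(col * tile_cells - halo, 0)
            c1 = min((col + 1) * tile_cells + halo, dem_cells)
            windows[(row, col)] = (r0, r1, c0, c1)
    return windows


levels = lod_pyramid(1024)
windows = tile_windows(1024)
print(levels[-1])       # (5, 1024, 16, 256): bottom level of the 1024*1024 example
print(len(windows))     # 256 tiles, one per bottom-level LOD node
print(windows[(1, 1)])  # (63, 129, 63, 129): one cell of overlap on each side
```

For the 1024*1024 example the pyramid has five levels, and the number of tile windows matches the 256 bottom-level LOD nodes.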

Visualization acceleration: using a scene graph and LODs. The LODs have a clear hierarchical relationship when they are generated, but this is not reflected in how they are stored as files on disk; a scene graph is needed to exploit the hierarchy during display. The scene graph organizes the multi-level LODs hierarchically as a quadtree, a structure that greatly helps dynamic loading and culling during display. Because an LOD is a coarsened version of the original DEM, every LOD has some geometric error. For display, a threshold on screen error is set. The test starts from the first-level LOD in the scene graph: each LOD computes its screen error from the viewpoint position and its own geometric error; if the screen error is below the threshold, that LOD is displayed, otherwise the process is repeated on the next LOD level. The scene graph both organizes the display resources and implies an efficient rendering pipeline. The scene graph rendering pipeline performs three traversals in sequence:

1. Update: In the update traversal, the scene graph library ensures that all display resources are ready for rendering. This includes changing the scene graph structure (adding, deleting or changing nodes) and updating node data (using CUDA to add the DEM elevation change to the LODs). Because the update traversal allows the program to modify the scene graph, it is the key to implementing dynamic scenes.

2. Cull: In the cull traversal, the scene graph library checks the bounding volumes of all nodes in the scene. If a node is within the viewport, the library adds a reference to that node to the final render list.

3. Draw: In the draw traversal, the scene graph library traverses the render list produced by the cull traversal and calls the underlying API to render the geometry on the GPU.
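
The screen-error test described before this list can be sketched in Python as below. The text does not give the formula that turns geometric error into screen error, so the sketch assumes a simple stand-in in which screen error is the geometric error divided by the distance from the viewpoint to the LOD center. The `Lod` class, the `scale` parameter and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Lod:
    name: str
    center: tuple           # (x, y) center of the footprint
    geometric_error: float  # error relative to the original DEM
    children: list = field(default_factory=list)


def screen_error(lod, eye, scale=1.0):
    """Assumed projection: the error shrinks in proportion to viewing distance."""
    dx, dy = lod.center[0] - eye[0], lod.center[1] - eye[1]
    distance = max((dx * dx + dy * dy) ** 0.5, 1e-6)
    return scale * lod.geometric_error / distance


def select(lod, eye, threshold, out):
    """Display an LOD if its screen error is small enough, else refine it."""
    if screen_error(lod, eye) < threshold or not lod.children:
        out.append(lod.name)  # good enough, or already the finest level
    else:
        for child in lod.children:
            select(child, eye, threshold, out)
    return out


# Three-level example; only the north-west quadrant has a finer level here.
nw = Lod("nw", (4, 4), 2.0, [
    Lod("nw/0", (2, 2), 1.0), Lod("nw/1", (6, 2), 1.0),
    Lod("nw/2", (2, 6), 1.0), Lod("nw/3", (6, 6), 1.0),
])
root = Lod("root", (8, 8), 4.0, [
    nw, Lod("ne", (12, 4), 2.0), Lod("sw", (4, 12), 2.0), Lod("se", (12, 12), 2.0),
])

near = select(root, eye=(0, 0), threshold=0.3, out=[])
far = select(root, eye=(100, 100), threshold=0.3, out=[])
print(near)  # finest LODs near the viewpoint, coarser ones further away
print(far)   # ['root']
```

With the viewpoint at the north-west corner the quadrant nearest to it is refined to its finest level while the other three stay at the second level; from far away the single first-level LOD is sufficient.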

Computation acceleration for dynamic change: using CUDA and embedding the CUDA computation in the scene graph rendering pipeline. Geomorphic processes can be computed with landform evolution equations. A classic landform evolution equation has the form:

$h = R + C$

$\frac{\partial R}{\partial t} = U - w - E$

$\frac{\partial C}{\partial t} = w - \nabla Q_s - \nabla q_m$

Here h is the elevation, R is the bedrock elevation, C is the regolith thickness, U is the tectonic uplift rate, w is the rate at which the bedrock surface is lowered by weathering, E is the erosion rate of fluvial processes in bedrock channels, Q_s is the sediment transport rate in channels, and q_m is the hillslope sediment transport rate per unit width. Some parameters require thematic data (for example, w depends on lithology and q_m on precipitation), and some require terrain factors (for example, q_m involves hillslope processes and must take slope and aspect into account). Computing the terrain factors is DEM neighborhood analysis, which CUDA accelerates well. Evaluating the parameters that depend on thematic data and combining the parameters into an elevation change can also be parallelized, so performing these computations with CUDA gives a good speedup.
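
To make the per-cell nature of these equations concrete, the following Python sketch advances R, C and h by one time step. The text does not say how the equations are integrated, so the explicit (forward Euler) step used here is an assumption, as are the function name `step` and all numeric values. The spatial terms are taken as already evaluated per cell (`grad_Qs`, `grad_qm`), since computing them is the neighborhood analysis described above.

```python
def step(R, C, U, w, E, grad_Qs, grad_qm, dt):
    """One explicit time step of the landform evolution equations, cell by cell.

    R, C          : bedrock elevation and regolith thickness per cell
    U, w, E       : uplift, weathering and channel erosion rates per cell
    grad_Qs/_qm   : pre-computed spatial terms of Q_s and q_m per cell
    """
    R_new = [r + dt * (u - wi - e) for r, u, wi, e in zip(R, U, w, E)]
    C_new = [c + dt * (wi - gq - gm)
             for c, wi, gq, gm in zip(C, w, grad_Qs, grad_qm)]
    h_new = [r + c for r, c in zip(R_new, C_new)]  # h = R + C
    return R_new, C_new, h_new


# Two cells with made-up rates; every cell is independent of the others,
# which is what makes the update suitable for one GPU thread per cell.
R, C, h = step(
    R=[100.0, 90.0], C=[1.0, 2.0],
    U=[0.5, 0.5], w=[0.25, 0.25], E=[0.0, 0.25],
    grad_Qs=[0.0, 0.5], grad_qm=[0.25, -0.5],
    dt=2.0,
)
print(R, C, h)  # [100.5, 90.0] [1.0, 2.5] [101.5, 92.5]
```

Each output cell depends only on that cell's inputs, so in a CUDA version the list comprehensions would become the body of a kernel indexed by cell.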

Dynamic change of the DEM is how geomorphic processes are shown. In the program, the region to be displayed dynamically is determined by the user's selection, and the remaining region is still displayed statically.

To display a dynamic scene, the elevation change of the DEM in the selected region must be computed before the scene graph rendering pipeline runs. Because the DEM is cut into tiles and loaded dynamically, recording the selected region directly is inconvenient for later computation. Instead, based on the selected region, the program records the loaded tile DEMs and thematic data and the dynamic range within each tile. Then, from the tile DEMs, the tile thematic data and the dynamic ranges, CUDA computes the elevation change of every point in each tile DEM (that is, the change in elevation of this frame relative to the previous frame); points outside the dynamic range have an elevation change of 0. The elevation change is then added to the tile DEM to obtain the updated DEM elevation, which prepares the data for the next frame's computation. The total elevation change from the original DEM to the latest DEM must also be recorded; this data is used when an LOD is reloaded (for example, when the LOD resolution to be displayed for a region changes, the corresponding LOD must be reloaded). In short, three kinds of data are computed every frame: the new DEM elevation, the DEM elevation change and the total DEM elevation change. The DEM elevation change and the total DEM elevation change are also stored as tile data corresponding to the tile DEMs.
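
The per-frame bookkeeping for one tile can be sketched in Python as follows. The function name `advance_frame`, the flat-list tile layout and the boolean `dynamic` mask are assumptions made for the illustration; in the described system the same arithmetic would run in CUDA over each tile.

```python
def advance_frame(dem, total, change, dynamic):
    """Apply one frame's elevation change to a tile.

    Returns the three per-frame results named in the text: the new DEM
    elevation, the total change since the original DEM, and this frame's
    change with cells outside the dynamic range forced to 0.
    """
    masked = [d if on else 0.0 for d, on in zip(change, dynamic)]
    new_dem = [z + d for z, d in zip(dem, masked)]
    new_total = [t + d for t, d in zip(total, masked)]
    return new_dem, new_total, masked


original = [10.0, 20.0, 30.0]
dem, total = list(original), [0.0, 0.0, 0.0]
dynamic = [True, True, False]          # the third cell is outside the selection
for _ in range(2):                     # two frames with the same made-up change
    dem, total, change = advance_frame(dem, total, [-0.5, -0.25, -1.0], dynamic)

print(dem)    # [9.0, 19.5, 30.0]
print(total)  # [-1.0, -0.5, 0.0]
```

The invariant worth noting is that the original elevation plus the total change always equals the current elevation, which is what allows a freshly reloaded LOD to be brought up to date in one step.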

Once the DEM elevation change has been computed, adding it to the display resources (the LODs) produces the dynamic display of the DEM. To do this efficiently within the scene graph structure, every node of the scene graph is implemented as an LOD node that can call CUDA functions (hereinafter a CUDA-LOD node). A CUDA-LOD node adds a computation module to the LOD node; the module contains a computation function and references to computation resources. A CUDA-LOD node allows its computation module to access the resources of the node and of all its child nodes. When the node is visited during the update traversal, the computation function in the module is executed, and it calls a CUDA kernel to perform the parallel computation. As noted above, each bottom-level LOD corresponds to one tile DEM. The elevation change data of that tile DEM is attached as a resource (initially empty) to the bottom-level CUDA-LOD node. The computation module of a CUDA-LOD node accesses the elevation change resources of its bottom-level descendant nodes: if all of them are empty, no computation is performed; otherwise the computation function is called and modifies the polygon vertices of the CUDA-LOD node according to the elevation change. Because CUDA-LOD nodes at different levels have different resolutions, their vertices do not correspond one-to-one with the elevation change data of the tile DEMs; instead there is a mapping (which may be one-to-one or one-to-many) that must be implemented in the computation function. When the user navigates the scene so that the LOD resolution within the dynamic display region changes, the newly loaded LOD holds the original elevations of the data, so the total DEM elevation change must be added to it rather than the elevation change between two frames.
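
One possible form of the mapping between LOD vertices and tile elevation change data is sketched below in Python. The text only states that a mapping exists, so the rule used here (each LOD vertex samples the fine cell at a fixed stride) is an assumption, as are the function name `overlay` and the small grids; averaging the covered cells would be another reasonable choice.

```python
def overlay(lod_heights, fine_change, stride):
    """Add elevation change to LOD vertices.

    Vertex (i, j) of the LOD is assumed to sample fine cell (i*stride, j*stride).
    stride == 1 is the one-to-one case of a bottom-level LOD; stride > 1 is a
    coarser LOD whose vertices each stand for several fine cells.
    """
    return [
        [z + fine_change[i * stride][j * stride] for j, z in enumerate(row)]
        for i, row in enumerate(lod_heights)
    ]


# 4*4 change grid at DEM resolution; values are made up.
fine_change = [[-0.5 * (r + c) for c in range(4)] for r in range(4)]

fine_lod = overlay([[10.0] * 4 for _ in range(4)], fine_change, stride=1)
coarse_lod = overlay([[10.0] * 2 for _ in range(2)], fine_change, stride=2)
print(coarse_lod)  # [[10.0, 9.0], [9.0, 8.0]]
```

The same function covers the reload case: passing the total elevation change instead of the per-frame change brings a newly loaded LOD, which holds original elevations, up to the current state.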

Parallel and coordinated scheduling. The rendering of each frame has four steps: DEM elevation change computation, followed by the update, cull and draw traversals of the scene graph. At run time the program must schedule four activities in parallel and in coordination across these four steps: external storage-memory data exchange, memory-video memory data exchange, CPU computation and CUDA (GPU) computation. Parallel scheduling means that, at suitable moments, two or more of the four run at the same time to improve efficiency. Coordinated scheduling means guaranteeing correctness: data must have been fully computed before it is copied, and fully copied before it is used in computation.

Data exchange between external storage and memory includes: 1. reading data; 2. prefetching data; 3. releasing data. Two kinds of data are exchanged: LODs and tile data (tile DEMs and tile thematic data). Tile data is used to compute the elevation change and depends on the dynamic display region selected by the user. Reading and releasing it is simple: it is read once the user has finished selecting a region and released when the user stops the dynamic display of that region, so prefetching does not arise. LODs are read during the update traversal: when an LOD that must be displayed in the current viewport is found to be absent from memory, its data is read from external storage. If too much data has to be read at once, display efficiency cannot be guaranteed, which is why prefetching is needed. Because the viewpoint changes continuously, the program can read in advance the LODs one level higher in resolution than each currently displayed LOD and the first-level LODs, as well as LODs that are outside the current viewport but adjacent or close to it. How many LODs are prefetched depends on the memory available to the program. Releasing LODs means releasing those least likely to be reused, which requires considering the viewport position and the time since each LOD left the viewport. A dynamically displayed LOD must be released as soon as it is no longer displayed: when it is displayed again, the result has to be recomputed from the original LOD and the total elevation change data, so the earlier dynamic result is an intermediate result that can no longer be reused. LOD prefetching and release can be carried out at any step of rendering.
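
The release policy can be illustrated with the following Python sketch. The text names the factors to consider (viewport position and time since leaving the viewport) but not how to combine them, so the ranking used here, longest absence first and then greatest distance, is an assumption. The function name `choose_evictions`, the cache record fields and the `budget` parameter are hypothetical.

```python
def choose_evictions(cache, visible, now, budget):
    """Pick LODs to release from an in-memory cache.

    cache   : {name: {"dynamic": bool, "left_at": float, "distance": float}}
    visible : set of names currently displayed
    budget  : maximum number of LODs to keep in memory
    """
    hidden = [n for n in cache if n not in visible]
    # Dynamic LODs that are no longer displayed are always released: their
    # modified vertices are intermediate results that cannot be reused.
    evict = [n for n in hidden if cache[n]["dynamic"]]
    static = [n for n in hidden if not cache[n]["dynamic"]]
    # Assumed ranking: longest time out of view first, then farthest away.
    static.sort(
        key=lambda n: (now - cache[n]["left_at"], cache[n]["distance"]),
        reverse=True,
    )
    excess = len(cache) - len(evict) - budget
    evict.extend(static[:max(excess, 0)])
    return evict


cache = {
    "a": {"dynamic": False, "left_at": 0.0, "distance": 0.0},  # displayed
    "b": {"dynamic": True, "left_at": 9.0, "distance": 1.0},
    "c": {"dynamic": False, "left_at": 2.0, "distance": 5.0},
    "d": {"dynamic": False, "left_at": 8.0, "distance": 1.0},
    "e": {"dynamic": False, "left_at": 2.0, "distance": 9.0},
}
released = choose_evictions(cache, visible={"a"}, now=10.0, budget=3)
print(released)  # ['b', 'e']
```

Here the hidden dynamic LOD is released unconditionally, and one further static LOD is released to get back within the budget.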

Memory-video memory data exchange includes: 1. copies from memory to video memory; 2. copies from video memory to memory. Video memory holds all data used in CUDA computation, including LOD polygon vertex data and tile data (tile DEMs, tile thematic data, tile elevation change and tile total elevation change). A memory-to-video-memory copy happens only the first time data is used: tile DEMs and tile thematic data are copied once the user has selected the dynamic region, and LOD polygon vertex data is copied during the update traversal, when a dynamic LOD that must be displayed in the current viewport is found to be absent from video memory. Video-memory-to-memory copies happen every frame and involve only LOD polygon vertex data: during the update traversal, when an LOD to be displayed is found to be dynamic, the LOD is combined with the tile elevation change (or total elevation change), and the copy is performed once that computation finishes. When the CUDA elevation change computation finishes quickly, a video-memory-to-memory pre-copy is also possible: assuming that this frame displays the same LODs as the previous frame, the subsequent computation and copy are performed ahead of time. In general, however, pre-copying is not needed.

CUDA calculation comprises: 1. DEM elevation change calculation; 2. overlaying the elevation change on LODs. The DEM elevation change calculation is the first rendering step, and the LOD overlay is completed in the second step, the update traversal. The DEM elevation change calculation proceeds through three steps in order: 1) elevation change calculation; 2) total elevation change calculation; 3) new DEM elevation calculation. CUDA calculations can run fully in parallel with CPU calculations, i.e., the update traversal can start without waiting for any calculation to finish. To keep the results correct, however, before the elevation change is overlaid on an LOD during the update traversal, "1) elevation change calculation" must have finished for the tiles corresponding to that LOD, and sometimes "2) total elevation change calculation" must have finished as well. Consequently, when the update traversal completes, all elevation change calculations are guaranteed to have finished, and some or all of the total elevation change calculations may have finished. The new DEM elevation calculation runs after the update traversal, concurrently with the cull and draw traversals, but must finish before the next frame's calculation begins.
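The three ordered steps can be sketched per tile as below. The sketch assumes that the total elevation change is the running sum of per-frame changes and that the new DEM is the current DEM plus this frame's change; the source does not spell out either formula. Nested Python lists stand in for device arrays, and on the GPU each step would be a separate kernel.

```python
def advance_tile(dem, total_change, change):
    """Apply steps 2) and 3) to one tile, given the result of step 1).

    dem, total_change, change: equally sized 2-D lists of floats.
    `change` is this frame's elevation change, i.e. the output of step 1).
    """
    # Step 2) total elevation change: accumulate this frame's change.
    new_total = [
        [t + c for t, c in zip(total_row, change_row)]
        for total_row, change_row in zip(total_change, change)
    ]
    # Step 3) new DEM elevation: input to the next frame's step 1).
    new_dem = [
        [h + c for h, c in zip(dem_row, change_row)]
        for dem_row, change_row in zip(dem, change)
    ]
    return new_dem, new_total
```

Because step 3) only feeds the next frame, it can be deferred until after the update traversal, which matches the scheduling described in the text.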

CPU calculation covers the update, cull, and draw traversals. In the draw traversal, dynamic LODs are drawn first and static LODs afterwards, so that the four rendering steps of the next frame can begin before the draw traversal has finished.

Example:

The embodiment of the invention is described in detail below, taking as an example the dynamic visualization of surface runoff erosion on a DEM of 256*256 grid cells (257*257 grid points).

1. Data preprocessing:

The raw data comprise DEM data and rainfall data. In preprocessing, 64*64 is chosen as the size of each LOD, so the 256*256 DEM corresponds to three LOD levels: the first level is labeled 0; the second level is labeled 00, 01, 02, 03; and the third level is labeled 000, 001, 002, 003, 010, 011, 012, 013, 020, 021, 022, 023, 030, 031, 032, 033. The DEM and rainfall data are likewise cut into 16 corresponding tiles, labeled 000, 001, 002, 003, 010, 011, 012, 013, 020, 021, 022, 023, 030, 031, 032, 033.
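The labeling scheme in this example can be reproduced with a short function. It is a sketch that assumes square, power-of-two sizes and that each level halves the extent covered by one LOD; the function name and parameters are introduced here for illustration.

```python
def lod_labels(dem_cells, lod_cells):
    """Return the quadtree labels of every LOD level, coarsest level first.

    dem_cells: grid cells per side of the whole DEM (256 in the example).
    lod_cells: grid cells per side of one LOD (64 in the example).
    """
    levels = [["0"]]
    span = dem_cells  # DEM cells per side covered by one LOD at the current level
    while span > lod_cells:
        # Each LOD is split into four children labeled parent + "0".."3".
        levels.append([parent + str(q) for parent in levels[-1] for q in range(4)])
        span //= 2
    return levels
```

For `lod_labels(256, 64)` the result has three levels, and the bottom level holds the same 16 labels that are used for the DEM and rainfall tiles.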

2. Display of static scenes:

The user selects the DEM in the program and displays it. The program then generates, from the configuration file, a scene graph made up of CUDA-LOD nodes, without actually loading data for any node. Using the default viewpoint (usually set fairly far away), the screen error of the first-level CUDA-LOD is computed; it meets the requirement, so LOD-0 is read and displayed. The user roams in the program and reaches a viewpoint at which the screen error of CUDA-LOD-0 no longer meets the display requirement, so the next CUDA-LOD level is entered and a depth-first traversal is performed. CUDA-LOD-02 and 03 meet the screen error threshold and are loaded. CUDA-LOD-00 and 01 do not, so the next level is entered; since this is already the bottom level, no screen error test is needed and 000, 001, 002, 003, 010, 011, 012, 013 are loaded directly. The scene is displayed once all reads have completed.
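The depth-first selection in this walkthrough can be sketched as a recursive function. The screen error test is passed in as a hypothetical predicate `meets_threshold`, since the source does not give the error formula; labels follow the quadtree scheme of the example.

```python
def select_lods(label, meets_threshold, max_depth, depth=1):
    """Depth-first selection of the LODs to load for the current viewpoint.

    label: quadtree label of the node being visited ("0" is the root).
    meets_threshold: callable(label) -> bool, True if the node's screen
        error is acceptable (hypothetical stand-in for the real test).
    max_depth: number of LOD levels (3 in the example).
    """
    # Bottom-level nodes are loaded without testing the screen error.
    if depth == max_depth or meets_threshold(label):
        return [label]
    selected = []
    for q in range(4):
        selected += select_lods(label + str(q), meets_threshold, max_depth, depth + 1)
    return selected
```

With a predicate that accepts only nodes 02 and 03, the function returns 000 to 003, 010 to 013, 02, and 03, the same set of nodes as in the walkthrough.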

3. Mixed dynamic and static display:

The formula for surface runoff erosion can be simplified to

$\frac{\partial h}{\partial t} = -\beta Q^{m} S^{n} - kS$

where Q is the runoff per unit width, S is the slope, and β, m, n, k are model parameters.
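As a sketch of the per-point arithmetic, the function below evaluates the right-hand side of the simplified formula, read here as dh/dt = -β·Q^m·S^n - k·S. That reading of the final term is an assumption based on the listed parameters. The explicit time step `dt` in the second function is also an assumption; the source does not state how the rate is integrated.

```python
def erosion_rate(q, s, beta, m, n, k):
    """Rate of elevation change at one grid point.

    q: runoff per unit width, s: slope, beta/m/n/k: model parameters.
    """
    return -beta * q ** m * s ** n - k * s


def elevation_change(q, s, beta, m, n, k, dt):
    """Elevation change over one step of length dt (explicit step, assumed)."""
    return erosion_rate(q, s, beta, m, n, k) * dt
```

In a CUDA version each thread would evaluate this expression for one grid point of a tile.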

The user selects an area of the scene in which the soil erosion process is to be displayed dynamically, as shown in Figure 2. Geometric coordinates are computed from the selected screen coordinates; the area partially overlaps tiles 000, 001, 002, 003, 020, and 021 of the tile DEM. The corresponding tile DEM and tile rainfall data are read, and CUDA computes the elevation change at each point of each tile DEM from the tile DEM, the tile thematic data, and the extent of the dynamic area. Following the simplified surface runoff erosion formula, the slope S is first computed by DEM neighborhood analysis, the runoff per unit width Q is then computed from S and the rainfall data, and finally Q, S, β, m, n, k are substituted into the formula to obtain the elevation change.

After these steps, rendering of the scene graph begins. The nodes to be displayed are still 000, 001, 002, 003, 010, 011, 012, 013, 02, 03. Nodes 010, 011, 012, 013, and 03, together with their bottom-level child nodes, have no corresponding elevation change resources, so the calculation function is not called for them. Nodes 000, 001, 002, 003 correspond to elevation change resources 000, 001, 002, 003 respectively, and node 02 corresponds to elevation change resources 020 and 021; for these nodes the calculation function must be called. The polygon grids of 000, 001, 002, 003 map one-to-one onto their elevation change resources. The mapping between node 02 and elevation change resource 020 is (x, y) → (2x, 2y) (x∈(0, 32), y∈(0, 32)), and the mapping between node 02 and elevation change resource 021 is (x, y) → (2x-64, 2y) (x∈(32, 64), y∈(0, 32)). Using these mappings, the polygon vertices of all dynamic nodes are updated, and the scene is finally drawn.
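The mapping from a second-level LOD node to its bottom-level elevation change resources can be sketched as below. The two cases given in the text (children 0 and 1) are reproduced exactly. The numbering of the lower two quadrants and the assignment of the shared boundary x = 32 to the right-hand tile are assumptions made here.

```python
def map_to_tile(x, y, lod_size=64):
    """Map grid point (x, y) of a second-level LOD to a bottom-level tile.

    Returns (child_index, tile_x, tile_y), where child_index is appended to
    the node label to name the tile (node "02", child 1 -> tile "021").
    """
    half = lod_size // 2
    col = 1 if x >= half else 0   # left or right half of the LOD
    row = 1 if y >= half else 0   # assumed: children 2 and 3 cover y >= half
    # One LOD cell spans two tile cells, hence the factor 2; subtracting
    # lod_size shifts the coordinate into the right-hand (or lower) tile.
    return row * 2 + col, 2 * x - col * lod_size, 2 * y - row * lod_size
```

For example, point (40, 5) of node 02 falls in tile 021 at (16, 10), which agrees with the (2x-64, 2y) rule in the text.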

The user continues to roam the scene, and because the viewpoint changes, the displayed nodes become 00, 01, 02, 03. Of these, 00 and 01 must be reloaded from external storage. The nodes that need dynamic display (i.e., those with corresponding elevation change resources) are 00 and 02. Node 02 was already displayed dynamically in the previous frame, so its new elevation is obtained by overlaying the "DEM elevation change" data on the LOD; node 00 is a newly loaded LOD, so the "DEM total elevation change" data must be overlaid to compute its new elevation.
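The choice between the two overlay sources can be sketched as follows. Flat lists stand in for LOD vertex heights that are already mapped onto the tile data, and the function name and arguments are hypothetical.

```python
def overlay(current, original, change, total_change, shown_last_frame):
    """Return the new vertex heights of one dynamic LOD.

    current: heights left in video memory by the previous frame
        (only meaningful if the LOD was displayed dynamically last frame).
    original: heights of the freshly loaded, unmodified LOD.
    change: this frame's elevation change at each vertex.
    total_change: accumulated elevation change at each vertex.
    """
    if shown_last_frame:
        # LOD already carries all earlier changes: add this frame's change only.
        return [h + c for h, c in zip(current, change)]
    # Newly loaded LOD: start from the original and add the total change.
    return [h + t for h, t in zip(original, total_change)]
```

Both branches give the same heights when the inputs are consistent, which is why a newly loaded node such as 00 can join the display without replaying earlier frames.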

Claims

1. A CUDA-based DEM dynamic visualization acceleration system, comprising a data preprocessing module, a dynamic visualization module, and a parallel co-scheduling module, characterized in that:

the data preprocessing module cuts the data used for display and calculation into tiles, preparing data for dynamic loading during display; preprocessing is carried out in a distributed environment, which yields a significant speedup over a single-machine environment;

the dynamic visualization module comprises a visualization acceleration module and a dynamic calculation acceleration module;

wherein the purpose of the visualization acceleration module is to reduce the number of polygon vertices displayed on screen; LOD is used to achieve this, and a scene graph is used to organize the LODs, making the hierarchy clearer and accelerating the rendering pipeline;

the purpose of the dynamic calculation acceleration module is to quickly compute the change of the DEM between two frames and apply that change to the LODs; CUDA is used in place of CPU calculation to achieve this, and the CUDA calculation is combined with the scene graph organization so that the display process is correct and efficient;

the parallel co-scheduling module allows four kinds of operations, namely external storage-memory data exchange, memory-video memory data exchange, CPU calculation, and CUDA calculation, to be carried out simultaneously, processing them in parallel without affecting the displayed result and thereby improving display efficiency.

2. The CUDA-based DEM dynamic visualization acceleration system of claim 1, characterized in that the data preprocessing module is implemented as follows: the raw data to be cut are divided into DEM data and thematic data; the results of cutting are LODs, tile DEMs, and tile thematic data; the LODs form a hierarchy in which different levels correspond to different display resolutions, the LODs of each level are divided into multiple small files, and each small file represents the same number of grid cells; the tile data have no hierarchy: the large file is simply split into files of equal size, and the geographic extent of each small file is the same as that of the corresponding file in the bottom-level LOD; in distributed data preprocessing, one computer acts as the server and assigns tasks to multiple clients.

3. The CUDA-based DEM dynamic visualization acceleration system of claim 1, characterized in that the visualization acceleration module is implemented as follows: the scene graph organizes the LODs hierarchically as a quadtree, i.e., one lower-resolution LOD corresponds to four slightly higher-resolution LODs covering the same geographic extent, and these four LODs are the child nodes of the lower-resolution LOD; likewise, each of these four LODs has child nodes of its own; this organization allows some unnecessary calculations to be skipped in the scene graph rendering pipeline, yielding a speedup.

4. The CUDA-based DEM dynamic visualization acceleration system of claim 1, characterized in that the dynamic calculation acceleration module is implemented as follows: based on the tile DEM and tile thematic data, CUDA is used to compute the elevation change of the DEM; a calculation module is added to the nodes of the scene graph, and during the update traversal of the scene graph this calculation module uses CUDA to overlay the elevation change on the LOD; the final rendering result shows the dynamic DEM.

5. The CUDA-based DEM dynamic visualization acceleration system of claim 1, characterized in that the parallel co-scheduling module is implemented as follows: external storage-memory data exchange is optimized mainly through read-ahead; memory-video memory data exchange has no notably effective optimization method; CPU calculation scheduling is optimized by processing dynamic LODs first; and CUDA calculation scheduling is optimized by distributing the different steps of the CUDA calculation across different rendering steps.

6. A CUDA-based DEM dynamic visualization acceleration method, comprising the following steps:

1) generating, through data preprocessing, the LODs, tile DEMs, and tile thematic data needed for display, as the basis for subsequent data display;

2) displaying the dynamic DEM in four steps:

2.1) DEM elevation change calculation: CUDA is used to begin calculating the DEM elevation change in the area that requires dynamic display;

2.2) update traversal of the scene graph: the scene graph is traversed and the LODs to be displayed are chosen; CUDA is used to overlay the DEM elevation change on the LODs that require dynamic display;

2.3) cull traversal of the scene graph: the cull traversal is a necessary step of scene graph rendering; in the present invention, however, the LODs must be selected before the scene graph can be determined and the DEM elevation change overlaid on the LODs, so LOD selection is completed in 2.2);

2.4) draw traversal of the scene graph: the GPU is used to perform the final rendering of the LODs to be rendered.

7. The CUDA-based DEM dynamic visualization acceleration method of claim 6, characterized in that the data preprocessing comprises:

the raw data to be cut are divided into DEM data and thematic data; the results of cutting are LODs, tile DEMs, and tile thematic data; the LODs form a hierarchy in which different levels correspond to different display resolutions, the LODs of each level are divided into multiple small files, and each small file represents the same number of grid cells; the tile data have no hierarchy: the large file is simply split into files of equal size, and the geographic extent of each small file is the same as that of the corresponding file in the bottom-level LOD; in distributed data preprocessing, one computer acts as the server and assigns tasks to multiple clients.

8. The CUDA-based DEM dynamic visualization acceleration method of claim 6, characterized in that the DEM elevation change calculation comprises:

the DEM elevation change calculation proceeds through the following three steps in order: 1) elevation change calculation; 2) total elevation change calculation; 3) new DEM elevation calculation; these three steps need not finish before the update traversal and may run in parallel with it, but to keep the calculation correct, before the elevation change is overlaid on an LOD during the update traversal, "1) elevation change calculation" must be finished for the tiles corresponding to that LOD, and sometimes "2) total elevation change calculation" must be finished as well.

9. The CUDA-based DEM dynamic visualization acceleration method of claim 6, characterized in that the update traversal of the scene graph comprises two steps:

First step: using the hierarchy of the scene graph to efficiently select the LODs to be displayed, implemented as follows: the scene graph organizes the LODs hierarchically as a quadtree; one lower-resolution LOD corresponds to four slightly higher-resolution LODs covering the same geographic extent, and these four LODs are the child nodes of the lower-resolution LOD; likewise, each of these four LODs has child nodes of its own; if a lower-resolution LOD is not within the display range, the corresponding higher-resolution LODs are certainly not within the display range either and need not be tested again;

Second step: using CUDA to overlay the DEM elevation change on the LODs that require dynamic display, implemented as follows: each node of the scene graph is implemented as an LOD node capable of calling CUDA functions, hereinafter referred to as a CUDA-LOD node; the CUDA-LOD node adds a calculation module to the LOD node, the calculation module containing a calculation function and references to calculation resources; the CUDA-LOD node allows the calculation module to access the resources of the node and all of its child nodes, and the calculation function in the calculation module is executed when the node is visited in the update traversal, the calculation function calling CUDA kernel functions to perform the parallel computation; because CUDA-LOD nodes at different levels have different resolutions, their grid points do not correspond one-to-one with the elevation change data of the tile DEM but are related by a mapping, and this mapping must be implemented in the calculation function.