CN103677761A

CN103677761A - Quick satellite remote sensing data processing system

Info

Publication number: CN103677761A
Application number: CN201310676298.3A
Authority: CN
Inventors: 孙业超; 王峰; 徐文; 闵祥军; 朱晓波; 祝令亚
Original assignee: China Center for Resource Satellite Data and Applications CRESDA
Current assignee: China Center for Resource Satellite Data and Applications CRESDA
Priority date: 2013-12-11
Filing date: 2013-12-11
Publication date: 2014-03-26
Anticipated expiration: 2033-12-11
Also published as: CN103677761B

Abstract

A fast processing system for satellite remote sensing data comprises a plurality of computing nodes, an InfiniBand switch and a storage array. The computing nodes comprise one Master node and a plurality of slave nodes; all computing nodes have the same hardware configuration, and the number of slave nodes can be freely expanded. Each computing node carries a multi-core CPU and a plurality of GPU boards, and the number of CPU cores is at least twice the number of GPU boards. Parameter configuration files and the strip data to be processed are stored in the storage array in advance. The InfiniBand switch connects the computing nodes and the storage array into a high-speed network, forming the basic hardware facility for fast processing.

Description

A fast processing system for satellite remote sensing data

Technical field

The present invention relates to the field of satellite remote sensing data processing, and in particular to a fast processing system for satellite remote sensing data.

Background art

Advances in sensor technology have considerably improved China's satellite earth observation capability. Current earth observation satellites are characterized mainly by high spatial resolution, high spectral resolution, short revisit cycles, wide imaging swaths, stereo imaging capability and multiple imaging modes. With more and more observation sensors carried on different platforms, the volume of earth observation data acquired each day is growing rapidly, at rates of terabytes or even petabytes, and this explosive growth in acquired information poses a huge challenge for processing. In addition, remote sensing applications such as emergency response, disaster assessment and environmental monitoring impose timeliness requirements on data processing: a large number of high-precision remote sensing processing operations must be completed in a short time, otherwise the macroscopic, rapid and comprehensive advantages of remote sensing technology cannot be realized. Advanced high-performance processing technology must therefore be adopted in the ground processing of satellite remote sensing data to achieve real-time or near-real-time processing, in order to cope with the increasingly prominent problems of massive data volume and emergency response.

In the field of parallel computing, MPI is a message-passing standard and the mainstream programming model on current distributed computer systems. OpenMP is the de facto industry standard for shared-memory programming and is also widely used. The GPU is a new generation of cost-effective high-performance computing technology that has developed rapidly in recent years. Graphics processors offer powerful computing capability and relatively high parallel throughput in data-parallel computation, exhibit single-instruction, multiple-thread (SIMT) parallel behaviour, and are highly cost-effective for compute-intensive problems. In research on accelerating remote sensing data processing, Balz used GPUs to simulate synthetic aperture radar data; Govett used GPUs to implement the dynamics part of a meteorological model and obtained a 34-fold speedup; GPUs have also been used to implement morphological algorithms for endmember purity in hyperspectral data, obtaining a 15-fold speedup in endmember abundance index extraction and segmentation; in the Chang'e-2 lunar exploration satellite project, GPUs were used to decode the channel downlink data, with an 87-fold speed improvement; in addition, GPUs are used for real-time on-board target recognition and tracking on some satellites.

According to the research literature published to date, existing work uses the MPI, OpenMP or CUDA model to parallelize and optimize individual processing algorithms, but parallel design and study of the entire satellite remote sensing data preprocessing flow is lacking, and no systematic, effective solution exists.

Summary of the invention

The technical problem solved by the present invention is: based on multi-level parallel acceleration technology, to provide a fast processing system for satellite remote sensing data by decomposing the satellite remote sensing data preprocessing flow into parallel data and parallel tasks.

The technical solution of the present invention is: a fast processing system for satellite remote sensing data, comprising a plurality of computing nodes, an InfiniBand switch and a storage array. The computing nodes comprise one Master node and a plurality of slave nodes; all computing nodes have the same hardware configuration and the number of slave nodes can be freely expanded; each computing node carries a multi-core CPU and a plurality of GPU boards, and the number of CPU cores is at least twice the number of GPU boards. Parameter configuration files and the strip data to be processed are stored in the storage array in advance. The InfiniBand switch connects the computing nodes and the storage array into a high-speed network, forming the underlying hardware facility for fast processing;

The Master node reads the parameter configuration file from the storage array and, according to its contents, configures the nodes participating in the computation, the number of processes on each participating node, and the number of data rows parsed by each process. Using the MPI protocol, it invokes the configured nodes to parse the strip data in parallel on the CPUs, and splices the parsing results of the processes to form whole-orbit auxiliary data, which is stored in the storage array;

The Master node uses the whole-orbit auxiliary data to perform WRS scene division, producing a scene division file that is stored in the storage array. According to the number of nodes and the number of GPU boards given in the parameter configuration file, it uses the MPI protocol to invoke the configured nodes to carry out standard scene production on the scenes in parallel, ensuring that each GPU board is assigned one scene of data;

Standard scene production comprises level-1 product generation and systematic geometric correction. Level-1 product generation comprises the image radiometric processing flow and RPC parameter solving, which are run as parallel threads using OpenMP. The image radiometric processing flow comprises three processing units executed in order: radiometric correction, denoising and MTFC (modulation transfer function compensation) processing, all implemented with the CUDA architecture. RPC parameter solving is carried out on the CPU; the resulting RPC parameter file and the image file produced by the image radiometric processing flow together form the level-1 product, which is stored in the storage array. After the level-1 product has been produced, systematic geometric correction is applied to the radiometric processing result on the GPU using the CUDA architecture, yielding the final image product.

The image radiometric processing flow is applied to each scene as a cyclic pipeline, which reduces file I/O operations between the processing units within the flow, as follows:

(1) Each scene is divided into blocks according to the per-pass image memory limit in the parameter configuration file; following the block order, the first block is read into memory, and an output buffer of the same size is allocated;

(2) CUDA-based radiometric correction is applied to the data in memory and the result is written to the output buffer; the output buffer is copied to the input buffer and then re-initialized; CUDA-based denoising and MTFC processing are then carried out in turn in the same way; after MTFC processing, the result data is written from the output buffer to the output file;

(3) Following the block order, the second block is read into memory and step (2) is repeated, and so on until the last block has been processed.
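The buffer handling in steps (1) to (3) can be illustrated with a minimal CPU-only sketch. It is not the patented CUDA implementation: the function names (`split_into_blocks`, `run_pipeline`), the stage signature and the two toy stages are hypothetical, and a Python list stands in for the output file. The point it shows is that three stages run per block with a single write at the end.

```python
from typing import Callable, List

Block = List[List[float]]
Stage = Callable[[Block, Block], None]  # hypothetical: reads src, fills dst in place


def split_into_blocks(total_rows: int, rows_per_block: int) -> List[range]:
    """Step (1): consecutive row ranges sized by the configured memory limit."""
    return [range(start, min(start + rows_per_block, total_rows))
            for start in range(0, total_rows, rows_per_block)]


def run_pipeline(scene: Block, rows_per_block: int,
                 stages: List[Stage], output_file: Block) -> int:
    """Run every stage on each block in memory; return the number of writes."""
    writes = 0
    for rows in split_into_blocks(len(scene), rows_per_block):
        in_buf = [list(scene[r]) for r in rows]          # read block into input memory
        out_buf = [[0.0] * len(row) for row in in_buf]   # output memory, same size
        for i, stage in enumerate(stages):
            stage(in_buf, out_buf)
            if i < len(stages) - 1:
                # Step (2): copy output memory to input memory, then re-initialise it.
                for dst, src in zip(in_buf, out_buf):
                    dst[:] = src
                for row in out_buf:
                    row[:] = [0.0] * len(row)
        output_file.extend(out_buf)                      # one write per block
        writes += 1
    return writes


# Toy stages standing in for radiometric correction, denoising and MTFC.
def double(src: Block, dst: Block) -> None:
    for d, s in zip(dst, src):
        d[:] = [2.0 * v for v in s]


def add_one(src: Block, dst: Block) -> None:
    for d, s in zip(dst, src):
        d[:] = [v + 1.0 for v in s]
```

With three stages and two blocks, the sketch performs two writes, whereas running each stage as a separate file-to-file pass over the same blocks would perform six.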

Compared with the prior art, the beneficial effects of the present invention are:

In hardware, the present invention uses an InfiniBand network and a hybrid CPU-GPU computing architecture to form a high-performance processing platform. In software, it uses a three-level hybrid paradigm based on MPI+OpenMP+CUDA, which makes full use of the performance advantages of coarse-grained data parallelism between nodes, fine-grained data parallelism within nodes, thread parallelism across processing functions, and GPU parallelization over image data, thereby improving processing efficiency. Within the processing flow, on top of the data-parallel decomposition, the processing functions are arranged in order into a pipeline and executed in sequence, reducing file I/O operations. The fast processing system of the present invention can achieve near-real-time processing efficiency.

Brief description of the drawings

Fig. 1 is a schematic diagram of the hardware structure of the present invention;

Fig. 2 is a schematic diagram of the strip data structure;

Fig. 3 is a flow chart of the cyclic pipeline of the present invention;

Fig. 4 is a schematic diagram of the system processing flow of the present invention.

Detailed description of the embodiments

The hardware architecture of the fast processing system of the present invention is shown in Fig. 1 and comprises a plurality of computing nodes, a switch and a storage array.

The computing nodes comprise one Master node and a plurality of slave nodes, and slave nodes can be added freely. All computing nodes are configured identically: each carries a high-performance multi-core CPU and a plurality of GPU boards, with the number of CPU cores at least twice the number of GPU boards. Together, the GPU boards and CPUs form a hybrid GPU+CPU computing architecture responsible for high-intensity parallel computation.

The switch is a high-speed InfiniBand switch that connects the computing nodes and the storage array into a high-speed network, providing high-speed communication of commands, signals and data between servers. Network performance is an important factor affecting parallel computation, data transfer and shared storage access. InfiniBand is currently the highest-performance commercial network and supports multiple upper-layer protocols.

The storage array stores the parameter configuration file, the downlinked satellite strip data and the processing output data. The parameters in the parameter configuration file comprise: (1) hardware resource parameters: the number of computing nodes, the IP address of each computing node, and the number of CPU cores and GPU boards on each computing node; (2) strip parsing parameters: the number of processes participating in parsing and the number of strip rows parsed by each process; (3) processing parameters: the memory limit for each processing pass. The processing output data comprises strip auxiliary data, scene division files, level-1 product data and level-2 product data.
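The three parameter groups listed above can be pictured as a small in-memory structure with a consistency check. This is only a sketch: the patent does not specify the file format, and the class and field names (`NodeSpec`, `ProcessingConfig`, `memory_limit_bytes` and so on) are hypothetical. The check encodes the stated hardware rule that the CPU core count is at least twice the GPU board count.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeSpec:
    """Hardware resource parameters for one computing node (names are assumed)."""
    ip: str
    cpu_cores: int
    gpu_boards: int


@dataclass
class ProcessingConfig:
    nodes: List[NodeSpec]      # (1) hardware resource parameters
    parse_processes: int       # (2) number of processes taking part in strip parsing
    rows_per_process: int      # (2) strip rows parsed by each process
    memory_limit_bytes: int    # (3) memory limit for each processing pass

    def validate(self) -> None:
        """Raise ValueError if the configuration breaks a rule stated in the text."""
        if not self.nodes:
            raise ValueError("at least one computing node is required")
        for node in self.nodes:
            # CPU core count must be at least twice the GPU board count.
            if node.cpu_cores < 2 * node.gpu_boards:
                raise ValueError(f"{node.ip}: too few CPU cores for its GPU boards")
        if min(self.parse_processes, self.rows_per_process,
               self.memory_limit_bytes) <= 0:
            raise ValueError("parsing and memory parameters must be positive")
```

A node with 16 cores and 3 GPU boards, as in the test system described later, passes this check.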

The fast processing system of the present invention is built on the three-level hybrid paradigm based on MPI+OpenMP+CUDA. The Master node is responsible for controlling the whole processing flow, including starting the flow, executing each module and ending the flow. The processing flow (shown in Fig. 4) is as follows:

1. Strip parsing

The downlinked satellite strip data is a structured binary file, as shown in Fig. 2, in which every row of the signal data area has the same structure, making it well suited to parallel parsing. Each process on each node can handle one segment of the data, and the results are spliced together in order to form the auxiliary data for the whole orbit.

The Master node first reads the parameter configuration file from the storage array and, according to its contents, configures the nodes participating in the computation, the number of processes on each participating node and the number of data rows parsed by each process. Using the MPI protocol, it invokes the configured nodes to parse the strip data on the CPUs. Each process writes the auxiliary data it parses (satellite ephemeris, attitude, imaging time, satellite imaging state and so on) to a file named after the imaging time of the first row of its segment. After all processes have finished parsing, the Master node sorts the files by the imaging time in their names and splices them, forming the whole-orbit auxiliary data, which is stored in the storage array.
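The split-and-splice logic of strip parsing can be sketched without MPI as two small functions. This is an illustration under assumptions: the names `partition_rows` and `splice_by_imaging_time` are hypothetical, a dictionary stands in for the per-process output files, and the file names are assumed to be fixed-width timestamps so that lexicographic order equals imaging-time order.

```python
from typing import Dict, List, Tuple


def partition_rows(total_rows: int, rows_per_process: int) -> List[Tuple[int, int]]:
    """Half-open (start, end) row ranges, one per parsing process."""
    return [(start, min(start + rows_per_process, total_rows))
            for start in range(0, total_rows, rows_per_process)]


def splice_by_imaging_time(segments: Dict[str, List[str]]) -> List[str]:
    """Concatenate per-process results in order of the imaging time in each name.

    Assumes names are fixed-width timestamps, so sorting the names sorts by time.
    """
    whole_orbit: List[str] = []
    for name in sorted(segments):
        whole_orbit.extend(segments[name])
    return whole_orbit


# Processes may finish in any order; the splice restores imaging-time order.
parsed = {
    "20130101T000020": ["aux row 200", "aux row 201"],
    "20130101T000000": ["aux row 0", "aux row 1"],
    "20130101T000010": ["aux row 100", "aux row 101"],
}
whole = splice_by_imaging_time(parsed)
```

Naming each file after the imaging time of its first row is what lets the Master node order the segments without tracking which process handled which rows.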

2. Logical scene division

Logical scene division uses the whole-orbit auxiliary data and applies the WRS scene division method to perform standard scene division. Because the computation is simple, it is not parallelized; it runs in a single process on the Master node and produces a scene division file that is stored in the storage array.

3. Standard scene production

According to the configured nodes, the number of nodes and the number of GPU boards per node given in the parameter configuration file, the Master node uses the MPI protocol to invoke the configured nodes to carry out standard scene production on the scenes in parallel. For example, suppose the parameter configuration file in the storage array specifies that a nodes participate in the computation and that each computing node carries m GPU boards. To ensure that each GPU board processes one scene, each node runs m processes producing m scenes, so that m*a scenes are produced simultaneously. Each production process receives a scene number via MPI communication, looks up in the scene division file the start and end rows of that scene in the strip data, and then carries out standard scene production. Standard scene production comprises level-1 product production and systematic geometric correction.
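The allocation rule in the example above (a nodes, m GPU boards per node, m*a scenes at a time) can be sketched as a pure function. The function name `assign_scenes` and the round-based ordering are assumptions made for illustration; the patent only states that each production process receives its scene number over MPI.

```python
from typing import Dict, List, Tuple


def assign_scenes(num_scenes: int, num_nodes: int,
                  gpus_per_node: int) -> List[Dict[int, Tuple[int, int]]]:
    """Group scenes into rounds; each round maps scene number -> (node, gpu).

    One scene per GPU board, so a round holds at most num_nodes * gpus_per_node scenes.
    """
    slots = [(node, gpu)
             for node in range(num_nodes)
             for gpu in range(gpus_per_node)]
    rounds: List[Dict[int, Tuple[int, int]]] = []
    for start in range(0, num_scenes, len(slots)):
        batch = range(start, min(start + len(slots), num_scenes))
        rounds.append({scene: slots[i] for i, scene in enumerate(batch)})
    return rounds


# 10 scenes on 3 nodes with 3 GPU boards each: 9 scenes in round 0, 1 in round 1.
rounds = assign_scenes(10, 3, 3)
```

In this sketch the last round may leave GPU boards idle; a real scheduler could instead hand out the next scene number as soon as any process finishes.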

1) Level-1 product production

Level-1 product production comprises the image radiometric processing flow and RPC parameter solving. The image radiometric processing flow comprises three units (relative radiometric correction, denoising and MTFC processing), all of which operate on image data, whereas RPC parameter solving uses the auxiliary data to compute geometric parameters. The two share no data, and RPC parameter solving runs on the CPU without occupying GPU resources; they are therefore run as parallel threads using OpenMP.
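The task-level parallelism described here, two independent jobs with no shared data running side by side, can be sketched with Python threads standing in for OpenMP threads. Both worker functions are placeholders with hypothetical names and trivial bodies; they do not implement radiometric processing or RPC solving.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def radiometric_chain(image: List[List[float]]) -> List[List[float]]:
    """Placeholder for the GPU-side image radiometric processing flow."""
    return [[float(v) for v in row] for row in image]


def solve_rpc(aux_records: List[str]) -> Dict[str, int]:
    """Placeholder for CPU-side RPC parameter solving from auxiliary data."""
    return {"records_used": len(aux_records)}


def produce_level1(image: List[List[float]], aux_records: List[str]) -> dict:
    """Run both jobs concurrently; they read disjoint inputs, so no locking is needed."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        image_future = pool.submit(radiometric_chain, image)
        rpc_future = pool.submit(solve_rpc, aux_records)
        # The level-1 product pairs the image result with the RPC parameters.
        return {"image": image_future.result(), "rpc": rpc_future.result()}


product = produce_level1([[1, 2], [3, 4]], ["eph", "att", "time"])
```

Because one job waits mostly on the GPU and the other uses only the CPU, running them concurrently hides the RPC solving time behind the image processing.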

In a conventional processing flow, each processing operation (relative radiometric correction, denoising, MTFC processing) is independent: the output file of one operation is the input of the next. Because a whole scene contains a large amount of image data, every operation processes it block by block, so there are many I/O operations. In the present invention the image radiometric processing is pipelined: relative radiometric correction, denoising and MTFC processing form one pipeline. The whole scene is first divided into blocks, each containing a fixed number of rows calculated from the preconfigured memory limit. Because of the limit on the number of GPU devices available, the blocks are not processed in parallel but one after another in a loop, as shown in Fig. 3 and detailed below:

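(1) Each scene is divided into blocks according to the per-pass image memory limit in the parameter configuration file; following the block order, the first block is read into memory, and an output buffer of the same size is allocated;

(2) CUDA-based radiometric correction is applied to the data in memory and the result is written to the output buffer; the output buffer is copied to the input buffer and then re-initialized; CUDA-based denoising and MTFC processing are then carried out in turn in the same way; after MTFC processing, the result data is written from the output buffer to the output file;

(3) Following the block order, the second block is read into memory and step (2) is repeated, and so on until the last block has been processed.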

Relative radiometric correction, denoising and MTFC processing are each implemented with the CUDA architecture. Relative radiometric correction is a normalization that uses the relative radiometric correction coefficients measured with a laboratory integrating sphere: the gain and bias coefficients of all camera detector elements are copied into GPU shared memory, and one GPU thread per pixel performs the normalization. Denoising uses median filtering, with one GPU thread per pixel. MTFC processing uses Wiener filtering for image restoration: the two-dimensional MTF matrix is copied into GPU shared memory, Wiener filtering is applied between the MTF matrix and frequency-domain image blocks of the same size as the MTF matrix, and the result is transformed back to the spatial domain for output. For details, see the patent "An image restoration processing method accelerated on the GPU", application number 201310308418.4.
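The per-pixel work of the first two units can be sketched on the CPU, with each loop iteration corresponding to what one GPU thread would do. Two details are assumptions not stated in the text: the correction is taken to be linear per detector column (`gain * DN + bias`), and the median window is taken to be 3x3, truncated at the image border. The function names are hypothetical.

```python
from statistics import median
from typing import List

Image = List[List[float]]


def relative_radiometric_correction(image: Image, gain: List[float],
                                    bias: List[float]) -> Image:
    """Normalise each pixel with the coefficients of its detector column.

    Assumed form: corrected = gain[col] * DN + bias[col].
    """
    return [[gain[c] * dn + bias[c] for c, dn in enumerate(row)]
            for row in image]


def median_filter_3x3(image: Image) -> Image:
    """Replace each pixel by the median of its (border-truncated) 3x3 window."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(r - 1, 0), min(r + 2, rows))
                      for cc in range(max(c - 1, 0), min(c + 2, cols))]
            out[r][c] = median(window)
    return out


corrected = relative_radiometric_correction([[10.0, 20.0]], [2.0, 0.5], [1.0, -1.0])
noisy = [[10.0, 10.0, 10.0], [10.0, 200.0, 10.0], [10.0, 10.0, 10.0]]
denoised = median_filter_3x3(noisy)
```

Both operations read only a pixel's own value or its immediate neighbourhood, which is why a one-thread-per-pixel mapping suits them.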

2) Systematic geometric correction

Systematic geometric correction resamples the image according to the correspondence between the row and column indices of each image pixel and ground longitude and latitude. The present invention uses an RPC-parameter-based correction method implemented under the CUDA architecture: the RPC parameters of the scene file generated during level-1 product production are copied into GPU shared memory, and on the GPU device each pixel is resampled from image coordinates to ground coordinates based on the RPC parameters, using cubic convolution as the resampling method.
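Cubic convolution resampling can be sketched in one dimension; the two-dimensional case applies the same weights along rows and then columns. The sketch assumes the commonly used kernel parameter a = -0.5 and clamps indices at the border, neither of which is specified in the text, and it leaves out the RPC coordinate mapping entirely. The function names are hypothetical.

```python
import math
from typing import Sequence


def cubic_kernel(x: float, a: float = -0.5) -> float:
    """Cubic convolution weight for a sample at distance x (a = -0.5 is assumed)."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x ** 3 - (a + 3.0) * x ** 2 + 1.0
    if x < 2.0:
        return a * x ** 3 - 5.0 * a * x ** 2 + 8.0 * a * x - 4.0 * a
    return 0.0


def resample_1d(samples: Sequence[float], pos: float) -> float:
    """Interpolate at fractional position pos from the four nearest samples."""
    base = math.floor(pos)
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), len(samples) - 1)  # clamp at the border (assumed)
        total += samples[idx] * cubic_kernel(pos - k)
    return total


ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
midpoint = resample_1d(ramp, 2.5)
on_grid = resample_1d(ramp, 3.0)
```

In the geometric correction step, `pos` would be the fractional image coordinate that the RPC model returns for an output ground pixel.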

The present invention has been applied in the XX satellite data processing system, forming a small high-performance processing system with 3 computing nodes. Node configuration: Intel Xeon E5-2670 x86-64 processors, 2 sockets with 8 cores each, 2.6 GHz clock frequency, 48 GB of 1333 MHz memory, and 3 Tesla 2090 GPU cards per node; the InfiniBand switch has a nominal bandwidth of 40 Gb/s (unidirectional) and 80 Gb/s (bidirectional).

In this hardware environment, trial production was carried out on XX satellite optical remote sensing data. Table 1 lists the processing times for 2 selected groups of strips, each group comprising 1 multispectral data strip and 1 panchromatic data strip. As the table shows, when processing the 2nd group of strips the high-performance processing system of the present invention reaches a throughput of more than 130 MB/s, close to the file I/O speed of this network environment; the system can thus achieve near-real-time processing with very efficient performance.

Table 1. Strip data processing times

Parts of the present invention not described in detail belong to the common knowledge of those skilled in the art.

Claims

1. A fast processing system for satellite remote sensing data, characterized in that it comprises a plurality of computing nodes, an InfiniBand switch and a storage array; the computing nodes comprise one Master node and a plurality of slave nodes, all computing nodes have the same hardware configuration and the number of slave nodes can be freely expanded, each computing node carries a multi-core CPU and a plurality of GPU boards, and the number of CPU cores is at least twice the number of GPU boards; parameter configuration files and the strip data to be processed are stored in the storage array in advance; the InfiniBand switch connects the computing nodes and the storage array into a high-speed network, forming the underlying hardware facility for fast processing;

2. The fast processing system for satellite remote sensing data according to claim 1, characterized in that the image radiometric processing flow is applied to each scene as a cyclic pipeline, reducing file I/O operations between the processing units within the flow, as follows: