CN104270437A - Mass data processing and visualizing system and method of distributed mixed architecture

Info

Publication number: CN104270437A
Application number: CN201410498051.1A
Authority: CN
Inventors: 薛健; 吕科; 潘卫国
Original assignee: University of Chinese Academy of Sciences
Current assignee: University of Chinese Academy of Sciences
Priority date: 2014-09-25
Filing date: 2014-09-25
Publication date: 2015-01-07
Anticipated expiration: 2034-09-25
Also published as: CN104270437B

Abstract

The invention relates to a mass data processing and visualization system and method with a distributed mixed architecture. The hardware system comprises a high-performance graphics workstation, a high-speed disk array, a plurality of computing nodes, a gigabit or 10-gigabit Ethernet switch, and a router. The corresponding data processing and visualization method comprises a front-end workstation method and a computing node method. The front-end workstation method runs on the front-end workstation and, with the cooperation of the computing nodes where necessary, performs data loading, data processing, visualization, interactive operation, and related work. The computing node method runs on the computing nodes as a daemon that listens on a specific port; when it detects that a computing task has been sent, it starts the corresponding out-of-core algorithm to process the data according to the task information and sends computing status information back to the front-end workstation. The invention guarantees network bandwidth utilization to the greatest extent while improving the response speed of user operations on the front-end workstation.

Description

Mass data processing and visualization system and method of distributed mixed architecture

Technical field

The present invention relates to a mass data processing and visualization system, and in particular to a mass data processing and visualization system and method with a distributed mixed architecture.

Background technology

With the development of image acquisition hardware and imaging techniques, the spatial resolution of acquired data keeps increasing, and the data being collected are shifting from static three-dimensional scalar fields to dynamic four-dimensional scalar fields. The volume of data produced by acquisition equipment has therefore risen sharply. This huge data volume poses a severe challenge to traditional scalar field data processing and visualization methods; for some complex visualization methods, reaching real-time interactive processing speed is especially difficult. In recent years, research on mass data processing and visualization has concentrated on three areas:

1. Designing efficient out-of-core algorithms to process mass data. Out-of-core algorithms mainly adopt two computation models. One is the "batch computation model", which applies when access to the original data during processing is sequential or can be converted into sequential access in some way. The other is the "online computation model", whose basic idea is to partition the massive original data into blocks so that each block can be loaded into memory, and to organize the blocks into a structure that supports efficient retrieval, so that the required data can be obtained directly by query during computation.

2. Processing mass data with various parallel techniques. Parallel computing is one of the key technologies for efficient processing and visualization of mass data; it processes mass data simultaneously by adding a large number of hardware computing units. These methods mainly use MPI (Message Passing Interface) or a distributed computation model to process mass data quickly on large networked hardware platforms.

3. Designing efficient mass data visualization algorithms based on the parallel capability of the graphics processing unit (GPU). A high-performance GPU has powerful floating-point capability and a flexible programming interface; given its strong parallelism, the GPU is expected to become a main force in future high-performance floating-point computing.

In the field of massive scalar field data processing and visualization, there is currently a lack of mass data processing and visualization systems that are technically mature, clearly targeted, and convenient to use. Large software packages represented by ParaView offer fairly powerful data processing and visualization functions, but they adopt a distributed storage and computation model and implement cooperation between computing nodes on top of MPI. This makes computing node configuration complex and places high reliability requirements on the computing nodes themselves and on network transmission, which is unfavorable for building a stable and reliable distributed mass data processing and visualization system.

Summary of the invention

In view of the above problems, the object of the present invention is to provide a mass data processing and visualization system and method with a distributed mixed architecture that is efficient, stable, reliable, and has good interactive performance.

To achieve the above object, the present invention adopts the following technical solution: a mass data processing and visualization system with a distributed mixed architecture, characterized in that it comprises a front-end workstation, a data storage center, a number of computing nodes, and a set of network equipment. The front-end workstation is a high-performance graphics workstation used for real-time display of and interaction with mass data; it also monitors the computing nodes available on the network and issues computing tasks to them. The data storage center is a high-speed disk array that stores the massive original data and provides high-speed shared read and write access to the mass data. Each computing node is a hardware device with computing capability; the computing nodes run time-consuming mass data processing algorithms in a distributed parallel manner. The network equipment comprises a gigabit or 10-gigabit Ethernet switch for building a high-speed backbone network; a router for establishing communication links with computing nodes located on external networks; and high-speed network cables supporting gigabit or 10-gigabit transmission rates. The front-end workstation, data storage center, and computing nodes are connected to the Ethernet switch by the high-speed network cables to form the high-speed backbone network; the backbone network is also connected to the external network through the router and a firewall, in order to establish communication links with computing nodes distributed at different physical locations.

The front-end workstation is equipped with a high-performance CPU and GPU, large-capacity high-speed memory, and at least one gigabit or 10-gigabit Ethernet card.

The data storage center is equipped with at least one gigabit or 10-gigabit Ethernet card.

A mass data processing and visualization method with a distributed mixed architecture comprises the following steps: 1) provide a mass data processing and visualization system with a distributed mixed architecture comprising a front-end workstation, a data storage center, computing nodes, and network equipment; 2) the front-end workstation, with the cooperation of the computing nodes, completes data loading, data processing, and the multi-level resampling preprocessing of the original mass data required for visualization and interactive operation; 3) based on the preprocessing result, carry out the following operations: 1. read the next message in the message loop; 2. determine whether the current message is a window-resize message; if so, compute from the current render window size the optimal data level k that produces the sharpest volume rendering projection, set the window refresh mode to "normal refresh", send a window refresh message, and return to step 1; otherwise go to step 3; 3. determine whether the current message is a mouse interaction message; if so, process the mouse message, update the related scene rendering parameters, set the window refresh mode to "mouse interaction refresh", send a window refresh message, and return to step 1; otherwise go to step 4; 4. determine whether the current message is a window refresh message; if so, go to step 5; otherwise return to step 1; 5. determine whether the refresh mode of the current window refresh message is "mouse interaction refresh"; if so, perform GPU-accelerated volume rendering of the level-n data using a ray casting volume rendering algorithm implemented in the OpenGL Shading Language, display the rendering result, and return to step 1; otherwise go to step 6; 6. determine whether the data volume of the level-k data is less than a given threshold T; if so, perform GPU-accelerated volume rendering of the level-k data using the ray casting volume rendering algorithm implemented in the OpenGL Shading Language, exploiting the parallel computing capability of the GPU, display the rendering result, and return to step 1; otherwise start volume rendering of the level-k data using a ray casting volume rendering algorithm implemented with OpenMP, exploiting the parallel computing capability of the multi-core CPU, and go to step 7; 7. check whether a mouse interaction message is present in the message loop; if so, interrupt the current rendering process and return to step 1; otherwise go to step 8; 8. determine whether rendering has finished; if so, display the rendering result and return to step 1; otherwise complete the computation of the next projected pixel and return to step 7.

The data loading process in step 2) comprises the following steps: 1. the front-end workstation calculates the data volume of the data to be loaded; 2. determine whether the data volume of the data to be loaded exceeds a preset threshold; if it does, the data are mass data, so go to step 3; otherwise load the data into the memory of the front-end workstation for subsequent rendering and processing, and go to step 8; 3. traverse the computing node list, find an available computing node, establish a TCP connection with it, and send it the data loading task; 4. listen for status information returned by the connected computing node; 5. examine the status information returned by the computing node: if the computing node returns "error" information, go to step 8; otherwise go to step 6; 6. continue examining the status information returned by the computing node: if the computing node returns "end" information, go to step 7; otherwise return to step 4; 7. the front-end workstation reads the corresponding data from the data storage center according to the information returned by the computing node about where the loaded data are stored, then goes to step 8; 8. end the data loading process and display the loading result.

The data processing procedure in step 2) comprises the following steps: 1. the front-end workstation calculates the data volume of the data to be processed; 2. determine whether the data volume of the data to be processed exceeds a preset threshold; if it does, the data are mass data, so go to step 3; otherwise run the corresponding in-core algorithm directly on the front-end workstation to process the data, and go to step 8; 3. traverse the computing node list, find an available computing node, establish a TCP connection with it, and send it the data processing task; 4. listen for status information returned by the connected computing node; 5. examine the status information returned by the computing node: if the computing node returns "error" information, go to step 8; otherwise go to step 6; 6. continue examining the status information returned by the computing node: if the computing node returns "end" information, go to step 7; otherwise return to step 4; 7. the front-end workstation reads the corresponding data from the data storage center according to the storage location of the processed data returned by the computing node, then goes to step 8; 8. end the data processing procedure and display the processing result.

In step 2), the multi-level resampling preprocessing of the original mass data comprises: defining the original data as level-0 data, of size W × H × D; level-1 data are obtained by resampling the level-0 data, the size becoming W/2 × H/2 × D/2, so that the data volume becomes 1/8 of the level-0 data; level-2 data are obtained by resampling the level-1 data, the size becoming W/4 × H/4 × D/4, so that the data volume becomes 1/8 of the level-1 data and 1/64 of the level-0 data; here W, H, and D denote the side lengths of the three-dimensional scalar field. This process continues until the data volume of the level-n data is less than or equal to a given threshold.

The method for determining the best volume rendering data in sub-step 2 of step 3) is: regard each sample point of the scalar field data as one of a set of closely packed small cuboids of equal size; project a small cuboid onto the render window and require that its projected size be no larger than one pixel; among the levels of the multi-level data that meet this requirement, the one with the largest level number is the required best volume rendering data, and its level number is denoted k.

The threshold T in sub-step 6 of step 3) is determined by the maximum amount of data that the graphics card's three-dimensional texture cache can hold; a data volume below T means that the level-k data can be loaded completely into the graphics card's texture cache.

By adopting the above technical solution, the present invention has the following advantages:

1. The massive original data of the whole system are kept on the data storage center (the high-speed disk array) in shared storage, and the front-end workstation and computing nodes access the data on the data storage center directly over high-speed Ethernet. Bulk data transfer in the system occurs only between the front-end workstation and the data storage center, and between the computing nodes and the data storage center. No bulk data transfer occurs between the front-end workstation and the computing nodes; only small amounts of data are exchanged when the front-end workstation issues a computing task to a computing node and when a computing node returns status information to the front-end workstation. Network bandwidth utilization is therefore guaranteed to the greatest extent, and the response speed of user operations on the front-end workstation is improved.

2. The time-consuming loading and preprocessing are distributed to the computing nodes. After the data have been reorganized by preprocessing, real-time rendering can be achieved on the front-end workstation, giving good interactive performance. The time-consuming data processing tasks produced by user interaction with the front-end workstation are distributed over the network to the many computing nodes; the processed data are stored on the disk array, and the front-end workstation reads the results directly from the disk array. Data are therefore processed efficiently.

3. Only fairly simple task-issuing and status-return communication takes place between the computing nodes and the front-end workstation. A computing node failure does not affect the operation of the front-end workstation software, much less bring down the whole system, so the stability and reliability of the whole system are effectively ensured.

4. During loading, the original mass data are preprocessed and reorganized into multi-level data. During user interaction, lower-resolution data are rendered in real time on the GPU (Graphics Processing Unit); at other times, data suited to the current display window size are rendered quickly on the GPU or the multi-core CPU. Good interactive response speed is thus ensured while a high-quality rendering result is obtained.

5. The computing node software is responsible only for time-consuming data processing tasks and has no display or interaction requirements, so its software structure and data processing flow are fairly simple. It can be deployed on a variety of hardware platforms and operating systems, and existing out-of-core data processing algorithms can easily be integrated into it, continually enriching the data processing functions of the whole system.

The present invention can be widely applied to mass data processing and visualization.

Accompanying drawing explanation

Fig. 1 is a schematic diagram of the logical architecture of the hardware system of the present invention;

Fig. 2 is a schematic diagram of the deployment of the hardware system of the present invention;

Fig. 3 is a flow chart of data loading in the present invention;

Fig. 4 is a flow chart of data processing in the present invention;

Fig. 5 is a flow chart of data visualization and interactive operation in the present invention.

Embodiment

The present invention comprises two parts: the first is a mass data processing and visualization hardware system with a distributed mixed architecture; the second is the corresponding data processing and visualization method, which runs on the first.

As shown in Fig. 1, the mass data processing and visualization hardware system of the present invention comprises a high-performance graphics workstation serving as the front-end workstation 1, a high-speed disk array serving as the data storage center 2, a number of computing nodes 3, and network equipment 4. In the figure, data flow and task flow are drawn with different line styles, with "-" denoting task flow.

The high-performance graphics workstation serves as the front-end workstation 1, where the user mainly operates the whole system. It is used for real-time display of and interaction with mass data; it also monitors the computing nodes 3 available on the network and issues computing tasks to them. To meet these requirements, the front-end workstation 1 needs to be equipped with a high-performance CPU and GPU, large-capacity high-speed memory, and at least one gigabit or 10-gigabit Ethernet card.

The high-speed disk array serves as the data storage center 2. It stores the massive original data and provides high-speed shared read and write access to the mass data, and it needs to be equipped with at least one gigabit or 10-gigabit Ethernet card.

A computing node 3 can be any hardware device with some computing capability, such as a supercomputer, a mainframe, a server cluster built from ordinary PCs, or even a mobile terminal or embedded device. The computing nodes 3 run time-consuming mass data processing algorithms in a distributed parallel manner.

As shown in Fig. 2, the network equipment 4 comprises a gigabit or 10-gigabit Ethernet switch 41 for building a high-speed backbone network; a router 42 for establishing communication links with computing nodes 3 that may exist on external networks; and a number of high-speed network cables supporting gigabit or 10-gigabit transmission rates. The front-end workstation 1, the data storage center 2, and the internal computing nodes 3 are connected to the Ethernet switch 41 by high-speed network cables to form the high-speed backbone network. The backbone network is also connected to the external network (the Internet) through the router 42 and a firewall 43, in order to establish communication links with computing nodes 3 distributed at different physical locations. In this way, the computing capability of the whole system can be raised continually by increasing the number of computing nodes 3, further expanding the available computing resources.

When data are processed and visualized, the massive original scalar field data of the whole system are kept on the data storage center 2 in shared storage, and the front-end workstation 1 and the computing nodes 3 access the data on the data storage center 2 directly through the high-speed Ethernet switch 41. No bulk data transfer occurs between the front-end workstation 1 and the computing nodes 3; only small amounts of data are exchanged when the front-end workstation 1 issues a computing task to a computing node 3 and when a computing node 3 returns status information to the front-end workstation 1. Bulk data transfer in the system therefore occurs only between the front-end workstation 1 and the data storage center 2, and between the computing nodes 3 and the data storage center 2. This guarantees network bandwidth utilization to the greatest extent and improves the response speed of user operations on the front-end workstation 1. The user operates the whole system through the front-end workstation 1 and loads data from the data storage center 2. The time-consuming loading and preprocessing are distributed to the computing nodes 3. After the data have been reorganized by preprocessing, real-time rendering can be achieved on the front-end workstation 1, giving good interactive performance. The time-consuming data processing tasks produced by user interaction with the front-end workstation 1 are distributed over the network to the many computing nodes 3; the processed data are stored in the data storage center 2, and the front-end workstation 1 reads the processing results directly from the data storage center 2, so that data are processed efficiently.

The data processing and visualization method of the present invention comprises two parts: a front-end workstation method and a computing node method. The front-end workstation method runs on the front-end workstation 1, includes a graphical user interface, and, with the cooperation of the computing nodes 3 where necessary, completes data loading, data processing, visualization, interactive operation, and related work. The computing node method runs on the computing nodes 3 as a daemon process that listens on a specific port; when it detects that a computing task has been sent, it starts the corresponding out-of-core algorithm to process the data according to the task information and returns computing status information (such as progress information, error information, and end information) to the front-end workstation 1. For mass data, the loading process is completed on the computing nodes 3. It reorganizes the data storage structure and is a form of preprocessing, but it does not modify the information contained in the data.
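The computing node behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the newline-delimited JSON message format, the field names (`name`, `steps`, `status`, `percent`, `result_location`), the `/shared/results/` path, and port 9000 are all assumptions made for the example, and `run_task` only stands in for a real out-of-core algorithm.

```python
import json
import socketserver


def run_task(task):
    """Yield status dicts for one task (placeholder for an out-of-core algorithm)."""
    try:
        name = task["name"]                   # hypothetical task field
        steps = int(task.get("steps", 3))     # hypothetical number of data blocks
        for i in range(1, steps + 1):
            # A real node would read, process and write back one block here.
            yield {"status": "progress", "percent": round(100 * i / steps)}
        # Report where the result was stored on the shared storage (assumed path).
        yield {"status": "end", "result_location": f"/shared/results/{name}"}
    except (KeyError, ValueError, TypeError) as exc:
        yield {"status": "error", "detail": repr(exc)}


class TaskHandler(socketserver.StreamRequestHandler):
    """Handle one connection: read one task line, stream status lines back."""

    def handle(self):
        line = self.rfile.readline()
        try:
            statuses = run_task(json.loads(line))
        except ValueError:  # malformed JSON (JSONDecodeError is a ValueError)
            statuses = [{"status": "error", "detail": "malformed task"}]
        for status in statuses:
            self.wfile.write((json.dumps(status) + "\n").encode("utf-8"))
            self.wfile.flush()


def serve(host="0.0.0.0", port=9000):
    """Listen on the given port until interrupted (the daemon loop)."""
    with socketserver.TCPServer((host, port), TaskHandler) as server:
        server.serve_forever()
```

Calling `serve()` would start the listener. Only small task and status messages cross this connection, while the bulk data stay on the shared storage, which matches the traffic pattern the description aims for.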

The data processing and visualization method of the present invention comprises the following steps:

1) provide a mass data processing and visualization system with a distributed mixed architecture comprising the front-end workstation 1, the data storage center 2, the computing nodes 3, and the network equipment 4;

2) the front-end workstation 1, with the cooperation of the computing nodes 3, completes data loading, data processing, and the multi-level resampling preprocessing of the original mass data required for visualization and interactive operation;

As shown in Fig. 3, the data loading process comprises the following steps:

1. the front-end workstation 1 calculates the data volume S of the data to be loaded;

2. determine whether the data volume S of the data to be loaded exceeds a preset threshold T (this threshold T is set in advance according to the hardware performance of the front-end workstation 1); if S exceeds T, the data are mass data, so go to step 3; otherwise load the data into the memory of the front-end workstation 1 for subsequent rendering and processing, and go to step 8;

3. traverse the list of computing nodes 3, find an available computing node 3, establish a TCP (Transmission Control Protocol) connection with it, and send it the data loading task, which includes information such as the name and storage location of the data to be loaded and the loading parameters;

4. listen for status information returned by the connected computing node 3;

5. examine the status information returned by the computing node 3: if the computing node 3 returns "error" information, go to step 8; otherwise go to step 6;

6. continue examining the status information returned by the computing node 3: if the computing node 3 returns "end" information, go to step 7; otherwise return to step 4;

7. the front-end workstation 1 reads the corresponding data from the data storage center 2 according to the information returned by the computing node 3 about where the loaded data are stored, then goes to step 8;

8. end the data loading process and display the loading result.
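The loading flow above can be sketched from the front-end side as follows. This is a hedged illustration only: the newline-delimited JSON messages, the status field names, and the 5-second timeout are assumptions of the example, and the function returns the reported storage location rather than reading the data itself.

```python
import json
import socket


def load_data(size_bytes, threshold_bytes, task, nodes,
              connect=socket.create_connection):
    """Return ("local", None), ("remote", location) or ("error", detail)."""
    if size_bytes <= threshold_bytes:
        return ("local", None)          # step 2: small data, load in memory
    for host, port in nodes:            # step 3: traverse the node list
        try:
            conn = connect((host, port), timeout=5)
        except OSError:
            continue                    # node unavailable, try the next one
        with conn:
            conn.sendall((json.dumps(task) + "\n").encode("utf-8"))
            with conn.makefile("r", encoding="utf-8") as stream:
                for line in stream:     # step 4: listen for status lines
                    status = json.loads(line)
                    if status["status"] == "error":    # step 5
                        return ("error", status.get("detail"))
                    if status["status"] == "end":      # step 6, then step 7
                        return ("remote", status.get("result_location"))
        # The node closed the connection without reporting "end".
        return ("error", "connection closed before completion")
    return ("error", "no computing node available")
```

A caller that receives `("remote", location)` would then read the reorganized data from the data storage center at that location (step 7) and display the loading result (step 8).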

For conventional scalar field data (image) processing operations, such as noise reduction, smoothing, and sharpening, both an in-core algorithm and an out-of-core algorithm are implemented. The in-core algorithm runs on the front-end workstation 1 and handles data of ordinary size; the out-of-core algorithm runs on each computing node 3 and handles mass data.

As shown in Fig. 4, the data processing method comprises the following steps:

1. the front-end workstation 1 calculates the data volume S of the data to be processed;

2. determine whether the data volume S of the data to be processed exceeds a preset threshold T (this threshold T is set in advance according to the hardware performance of the front-end workstation 1); if S exceeds T, the data are mass data, so go to step 3; otherwise run the corresponding in-core algorithm directly on the front-end workstation 1 to process the data, and go to step 8;

3. traverse the list of computing nodes 3, find an available computing node 3, establish a TCP connection with it, and send it the data processing task, which includes information such as the name and storage location of the data to be processed and the data processing command and parameters;

4. listen for status information returned by the connected computing node 3;

5. examine the status information returned by the computing node 3: if the computing node 3 returns "error" information, go to step 8; otherwise go to step 6;

6. continue examining the status information returned by the computing node 3: if the computing node 3 returns "end" information, go to step 7; otherwise return to step 4;

7. the front-end workstation 1 reads the corresponding data from the data storage center 2 according to the storage location of the processed data returned by the computing node 3, then goes to step 8;

8. end the data processing procedure and display the processing result.
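The choice between the in-core and out-of-core paths in step 2 can be sketched as below. Everything concrete here is an assumption for illustration: the `IN_CORE` registry, the task dictionary layout, and the 3-point box filter, which merely stands in for whatever smoothing algorithm an implementation would actually use.

```python
def box_smooth(values):
    """In-core 3-point box smoothing of a 1-D sequence (edges clamped)."""
    n = len(values)
    return [
        (values[max(i - 1, 0)] + values[i] + values[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


# Hypothetical registry of in-core algorithms, keyed by command name.
IN_CORE = {"smooth": box_smooth}


def plan_processing(command, data_name, location, size_bytes,
                    threshold_bytes, params=None):
    """Step 2: run in memory if the data are small, else build a node task."""
    if size_bytes <= threshold_bytes:
        return {"mode": "in-core", "run": IN_CORE[command]}
    # Task contents follow step 3: name, storage location, command, parameters.
    return {
        "mode": "out-of-core",
        "task": {
            "name": data_name,
            "location": location,
            "command": command,
            "params": params or {},
        },
    }
```

For example, `plan_processing("smooth", "vol", "/shared/vol.raw", 10, 100)` selects the in-core path, while a size above the threshold yields a task dictionary that could be sent to a computing node.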

For visualization and interactive operation, multi-level resampling preprocessing of the original mass data is needed to reach real-time interactive response speed. This preprocessing can be performed as part of data loading, and its specific method is as follows:

Define the original data as level-0 data, and suppose its size is W × H × D. Level-1 data are obtained by resampling the level-0 data; the size becomes W/2 × H/2 × D/2, so the data volume becomes 1/8 of the level-0 data. Level-2 data are obtained by resampling the level-1 data; the size becomes W/4 × H/4 × D/4, so the data volume becomes 1/8 of the level-1 data and 1/64 of the level-0 data. Here W, H, and D denote the side lengths of the three-dimensional scalar field.

This process continues in the same way until the data volume of the level-n data is less than or equal to a given threshold. This threshold depends on the hardware performance of the front-end workstation 1; the basic requirement is that when data whose volume equals the threshold are volume rendered, the rendering speed reaches at least 15 frames per second, so as to meet the needs of interactive operation. The preprocessing can be completed by the computing nodes 3 as part of data loading, and the resulting multi-level scalar field data are stored in the data storage center 2 for later use. Because the data are preprocessed during loading and reorganized into multi-level data, lower-resolution data can be rendered in real time on the GPU during user interaction, and at other times data suited to the current display window size can be rendered quickly on the GPU or the multi-core CPU. Good interactive response speed is thus ensured while a high-quality rendering result is obtained.
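The level sizes of such a multi-level data set can be worked out with a short sketch like the one below. It assumes that each level halves every dimension of the previous one (so the data volume drops by roughly a factor of eight per level) and that sizes are rounded down with a floor of one sample; the function name and the `bytes_per_sample` parameter are illustrative only.

```python
def pyramid_levels(w, h, d, bytes_per_sample, threshold_bytes):
    """Return [(w, h, d, size_bytes), ...] from level 0 down to level n."""
    levels = []
    while True:
        size = w * h * d * bytes_per_sample
        levels.append((w, h, d, size))
        # Stop once the level fits the threshold (or cannot shrink further).
        if size <= threshold_bytes or (w, h, d) == (1, 1, 1):
            return levels
        # Assumed resampling rule: halve each dimension for the next level.
        w, h, d = max(w // 2, 1), max(h // 2, 1), max(d // 2, 1)
```

With a 512 × 512 × 512 volume of one-byte samples and a threshold of 2 MiB, this yields three levels (n = 2), the last being 128 × 128 × 128.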

3) visualization and interactive operation based on the multi-level scalar field data obtained from the above preprocessing, as shown in Fig. 5, comprise the following steps:

1. read the next message in the message loop;

2. determine whether the current message is a window-resize message; if so, compute from the current render window size the optimal data level k that produces the sharpest volume rendering projection, set the window refresh mode to "normal refresh", send a window refresh message, and return to step 1; otherwise go to step 3;

The best volume rendering data are determined as follows: regard each sample point of the scalar field data as one of a set of closely packed small cuboids of equal size; project a small cuboid onto the render window and require that its projected size be no larger than one pixel; among the levels of the multi-level data that meet this requirement, the one with the largest level number is the required best volume rendering data, and its level number is denoted k;

3. determine whether the current message is a mouse interaction message; if so, process the mouse message, update the related scene rendering parameters, set the window refresh mode to "mouse interaction refresh", send a window refresh message, and return to step 1; otherwise go to step 4;

4. determine whether the current message is a window refresh message; if so, go to step 5; otherwise return to step 1;

5. determine whether the refresh mode of the current window refresh message is "mouse interaction refresh"; if so, perform GPU-accelerated volume rendering of the level-n data (the smallest data) using a ray casting volume rendering algorithm implemented in the OpenGL Shading Language, display the rendering result, and return to step 1; otherwise go to step 6;

6. determine whether the data volume of the level-k data is less than a given threshold T; if so, perform GPU-accelerated volume rendering of the level-k data using the ray casting volume rendering algorithm implemented in the OpenGL Shading Language, exploiting the parallel computing capability of the GPU, display the rendering result, and return to step 1; otherwise start volume rendering of the level-k data using a ray casting volume rendering algorithm implemented with OpenMP, exploiting the parallel computing capability of the multi-core CPU, and go to step 7. Here the threshold T is determined by the maximum amount of data that the graphics card's three-dimensional texture cache can hold; a data volume below T means that the level-k data can be loaded completely into the graphics card's texture cache;

7. check whether a mouse interaction message is present in the message loop; if so, interrupt the current rendering process and return to step 1; otherwise go to step 8;

8. determine whether rendering has finished (that is, whether all pixels of the volume rendering projection have been computed); if so, display the rendering result and return to step 1; otherwise complete the rendering computation of the next projected pixel and return to step 7.
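Steps 5 to 8 above amount to choosing a rendering path per refresh and keeping the slow path interruptible. The sketch below illustrates only that control logic under stated assumptions: the mode strings, the `"gpu"`/`"cpu"` labels, and the callback parameters are hypothetical, and no actual OpenGL or OpenMP rendering is performed.

```python
def choose_renderer(refresh_mode, k, n, level_bytes, texture_budget_bytes):
    """Return (level, backend) for one window refresh (steps 5 and 6)."""
    if refresh_mode == "mouse":
        return n, "gpu"          # step 5: smallest level, GPU, for interaction
    if level_bytes[k] < texture_budget_bytes:
        return k, "gpu"          # step 6: level k fits in 3D texture memory
    return k, "cpu"              # step 6: fall back to multi-core CPU rendering


def render_progressively(pixels, shade, mouse_pending):
    """Steps 7 and 8: compute pixels one by one, abort if input arrives."""
    image = {}
    for p in pixels:
        if mouse_pending():      # step 7: interaction interrupts rendering
            return None
        image[p] = shade(p)      # step 8: compute the next projected pixel
    return image                 # all pixels done, result can be displayed
```

Here `shade` stands in for casting one ray through the level-k data, and `mouse_pending` for peeking at the message queue; returning `None` signals that the partial image should be discarded and the message loop resumed.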

Adopt said method can respond fast user interactive, produce the highest volume drawing rendering result of quality simultaneously, because be less than the scalar field data of k for progression, because its each sampled point cuboid is less than a pixel in the projection of playing up window, its rendering accuracy can not higher than kth DBMS.
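The selection of k can be illustrated with a simplified model. The sketch assumes an orthographic view in which the volume is scaled to fill the render window, so that the projected footprint of one sample at a given level is roughly the window extent divided by the number of samples along that axis; a real implementation would derive the footprint from its actual projection matrix. It then picks the largest level whose footprint is at most one pixel, as the claim wording states.

```python
def best_level(levels, win_w, win_h):
    """levels: [(w, h, d), ...] with level 0 (finest) first. Return k."""
    k = 0  # fall back to the finest level if no level meets the requirement
    for j, (w, h, _d) in enumerate(levels):
        # Assumed footprint model: pixels covered by one sample on screen.
        footprint = max(win_w / w, win_h / h)
        if footprint <= 1.0:
            k = j  # footprints grow with j, so the last match is the largest
    return k
```

For levels of 1024, 512, 256 and 128 samples per side, a 400 × 300 window gives k = 1 under this model: level 2 would already spread each sample over more than one pixel, while level 0 adds no visible detail over level 1.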

Through the above steps, the loading, processing, and interactive visualization of mass data can be completed on the hardware system for mass data processing and visualization of distributed mixed architecture, ensuring efficient operation of the whole system.

The above embodiments are only intended to illustrate the present invention; the arrangement and implementation of the individual components may vary, and all equivalent transformations and improvements made on the basis of the technical solution of the present invention shall not be excluded from the scope of protection of the present invention.
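The loading and processing flows both follow a threshold-and-dispatch pattern: data at or below a preset threshold is handled on the front-end workstation, while larger data is handed to a computing node whose status messages are monitored until it reports "error" or "end". A minimal Python sketch of that control flow is given here. The TCP connection and the disk array are replaced by hypothetical callables (`process_locally`, `send_task`, `poll_status`, `read_from_storage`), so this only illustrates the state handling, not the networking.

```python
def dispatch(data_bytes, threshold, process_locally, send_task,
             poll_status, read_from_storage):
    """Run a load or processing task locally or on a computing node.

    poll_status() is assumed to return a (state, info) pair; on "end",
    info is the storage location of the result.
    """
    if data_bytes <= threshold:
        # Not mass data: handle it in the front-end workstation's memory.
        return process_locally()
    send_task()                          # hand the task to an available node
    while True:
        state, info = poll_status()      # monitor the node's status messages
        if state == "error":
            return None                  # the node reported a failure
        if state == "end":
            # Read the result from the data storage center.
            return read_from_storage(info)
        # Any other state: keep monitoring.
```

A usage example with stubbed callables:

```python
messages = iter([("running", None), ("end", "/array/out.raw")])  # hypothetical path
result = dispatch(10, 5, lambda: "local", lambda: None,
                  lambda: next(messages), lambda loc: "read:" + loc)
```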

Claims

1. A mass data processing and visualization system of distributed mixed architecture, characterized in that it comprises a front-end workstation, a data storage center, a number of computing nodes, and a set of network equipment;

The front-end workstation is a high-performance graphics workstation used for the real-time display of, and interactive operation on, mass data; it also manages the available computing nodes in the network and issues computing tasks to the computing nodes;

The data storage center is a high-speed disk array used to store the original mass data and to provide high-speed shared read/write access to the mass data;

The computing nodes are hardware devices with computing capability, which run time-consuming mass data processing algorithms in a distributed parallel manner;

The network equipment comprises: a gigabit or 10-gigabit Ethernet switch for building a high-speed backbone network; a router for establishing communication links with computing nodes located on external networks; and high-speed network cables supporting gigabit or 10-gigabit transmission rates;

The front-end workstation, the data storage center and the computing nodes are connected to the Ethernet switch by the high-speed network cables to form the high-speed backbone network; in addition, the backbone network is connected to external networks through a router and a firewall, for establishing communication links with computing nodes distributed at different physical locations.

2. The mass data processing and visualization system of distributed mixed architecture according to claim 1, characterized in that the front-end workstation is equipped with a high-performance CPU and GPU, large-capacity high-speed memory, and at least one gigabit or 10-gigabit high-speed Ethernet card.

3. The mass data processing and visualization system of distributed mixed architecture according to claim 1, characterized in that the data storage center is equipped with at least one gigabit or 10-gigabit high-speed Ethernet card.

4. The mass data processing and visualization system of distributed mixed architecture according to claim 2, characterized in that the data storage center is equipped with at least one gigabit or 10-gigabit high-speed Ethernet card.

5. A mass data processing and visualization method of distributed mixed architecture using the system according to any one of claims 1 to 4, comprising the following steps:

1) providing a mass data processing and visualization system of distributed mixed architecture comprising a front-end workstation, a data storage center, computing nodes and network equipment;

2) the front-end workstation, with the cooperation of the computing nodes, completes data loading, data processing, and the multi-level resampling of the original mass data for visualization and interactive operation;

3) based on the preprocessing results, the following operations are performed:

1. read the next message in the message loop;

5. determine whether the refresh mode of the current window refresh message is "mouse interaction refresh"; if so, use the ray casting volume rendering algorithm implemented in the OpenGL Shading Language to perform GPU-accelerated volume rendering of the level-n data, display the rendering result, and return to step 1; otherwise go to step 6;

6. determine whether the data volume of the level-k data is less than a given threshold T; if so, use the ray casting volume rendering algorithm implemented in the OpenGL Shading Language, exploiting the parallel computing capability of the GPU, to perform GPU-accelerated volume rendering of the level-k data, display the rendering result, and return to step 1; otherwise use the ray casting volume rendering algorithm implemented with OpenMP, exploiting the parallel computing capability of the multi-core CPU, to start volume rendering of the level-k data, and then go to step 7;

8. determine whether rendering has finished; if so, display the rendering result and return to step 1; otherwise complete the calculation for the next projection image pixel and return to step 7.

6. The mass data processing and visualization method of distributed mixed architecture according to claim 5, characterized in that the data loading process in step 2) comprises the following steps:

1. the front-end workstation calculates the data volume of the data to be loaded;

2. determine whether the data volume of the data to be loaded exceeds a preset threshold; if it does, the data is mass data, and the process goes to step 3; otherwise load the data into the memory of the front-end workstation for subsequent rendering and processing, and go to step 8;

3. traverse the computing node list to find an available computing node, establish a TCP connection with the available computing node, and send the data loading task to it;

4. monitor the status information returned by the connected computing node;

5. evaluate the status information sent back by the computing node: if the computing node sends back an "error" message, go to step 8; otherwise go to step 6;

6. continue evaluating the status information returned by the computing node: if the computing node sends back an "end" message, go to step 7; otherwise return to step 4;

7. the front-end workstation reads the corresponding data from the data storage center according to the information, sent back by the computing node, about where the loaded data is stored, and then goes to step 8;

8. end the data loading process and display the loading result.

7. The mass data processing and visualization method of distributed mixed architecture according to claim 5 or 6, characterized in that the data processing process in step 2) comprises the following steps:

1. the front-end workstation calculates the data volume of the data to be processed;

2. determine whether the data volume of the data to be processed exceeds a preset threshold; if it does, the data is mass data, and the process goes to step 3; otherwise directly execute the corresponding in-memory algorithm on the front-end workstation to process the data, and go to step 8;

3. traverse the computing node list to find an available computing node, establish a TCP connection with the available computing node, and send the data processing task to it;

4. monitor the status information sent back by the connected computing node;

6. continue evaluating the status information sent back by the computing node: if the computing node sends back an "end" message, go to step 7; otherwise return to step 4;

7. the front-end workstation reads the corresponding data from the data storage center according to the storage location of the processed data sent back by the computing node, and then goes to step 8;

8. end the data processing process and display the processing result.

8. The mass data processing and visualization method of distributed mixed architecture according to claim 5, 6 or 7, characterized in that, in step 2), the multi-level resampling preprocessing of the original mass data comprises:

The original data is defined as level-0 data, with size W × H × D;

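The level-1 data is obtained by resampling the level-0 data: its size becomes W/2 × H/2 × D/2, so its data volume becomes 1/8 of that of the level-0 data. The level-2 data is obtained by resampling the level-1 data: its size becomes W/4 × H/4 × D/4, and its data volume becomes 1/8 of that of the level-1 data, that is, 1/64 of that of the level-0 data;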

Here W, H and D denote the lengths of the data along its three dimensions, respectively. This process continues until the data volume of the level-n data is less than or equal to a given threshold.

9. The mass data processing and visualization method of distributed mixed architecture according to claim 5, 6, 7 or 8, characterized in that the method for generating the optimal volume rendering data in step 2 of step 3) is: each sample point of the scalar field data is regarded as one of a set of tightly packed small cuboids of identical size; a small cuboid is projected onto the rendering window, and its projected size must not exceed one pixel; among the multi-level data satisfying this requirement, the data with the highest level number is the required optimal volume rendering data, and its level is denoted k.

10. The mass data processing and visualization method of distributed mixed architecture according to claim 5, 6, 7, 8 or 9, characterized in that the threshold T in step 6 of step 3) is determined by the maximum amount of data that the 3D texture cache of the graphics card can hold; a data volume below the given threshold T means that the level-k data can be loaded completely into the graphics card's texture cache.