CN101556546A - Method for processing compression program parallelization based on computer clusters

Info

Publication number
CN101556546A
Authority
CN
China
Prior art keywords
data block
module
sent
processor
user
Prior art date
Legal status
Pending
Application number
CNA2009100850217A
Other languages
Chinese (zh)
Inventor
梁军
陈炜钊
鲍泓
张迪
刘振恒
Current Assignee
Beijing Union University
Original Assignee
Beijing Union University
Priority date
Filing date
Publication date
Application filed by Beijing Union University filed Critical Beijing Union University
Priority to CNA2009100850217A
Publication of CN101556546A
Legal status: Pending


Abstract

The invention relates to a method for parallelized compression processing based on computer clusters. The method is characterized by a provider-processor-user framework comprising a provider process, processor processes and a user process. The provider process divides the data to be processed into a number of fixed-length data blocks to be sent and sends them to the processor processes; once all blocks to be sent have gone out, idle processor processes are stopped one by one. Each processor process requests blocks to be sent from the provider process and sends the processed blocks to the user process. After receiving the processed blocks sent by the processor processes, the user process outputs them in order. The processor embeds a number of serial compression algorithms and executes them in a parallelized manner. Whenever the blocks to be sent or the processed blocks are transferred between the provider process, the processor processes and the user process, a unique block sequence number is attached to each.

Description

A method for parallelized compression processing based on computer clusters
Technical field
The present invention relates to a parallel processing method for computers, and in particular to a method for parallelized compression processing based on computer clusters.
Background art
Computing has entered an era of parallelization and high data volumes, and fast compression algorithms are essential for handling mass data. Processing mass data with traditional compression software is very time-consuming, while developing a new compression algorithm is considerably difficult, so parallelizing traditional compression software has great practical value. Some existing serial compression algorithms have already been rewritten for parallel execution, but because the rewritten compression cores are very complicated, fully parallelizing an existing serial compression algorithm is a slow process, and there is no general method that solves this problem.
A cluster is a group of cooperating service entities that provides a service platform with better scalability and availability than a single service entity. To a client, a cluster looks like one service entity, although it is in fact composed of a group of them. Compared with a single computer, a cluster offers higher availability, manageability and scalability, and can take on work that only a supercomputer could otherwise handle. Every computer in a cluster is an independent server running its own processes; these independent servers communicate with one another, cooperate closely to complete computational work, provide applications, system resources and data to users, and are managed as a single system, so that to the user they appear as one system. In most configurations, all nodes in the cluster have equal status; the user need not care how many computers carry out a task, only about the overall processing capacity of the system. With the rise of cluster computing, software written for clusters has become essential; the MPI (Message Passing Interface) standard is an outstanding approach among the many parallel-programming methods, and MPICH2 is a standard implementation of MPI.
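By way of illustration, the following minimal sketch shows MPI-style message passing as provided by implementations such as MPICH2; the tag value and buffer contents are arbitrary examples, not part of the present method:

```cpp
// Minimal MPI message-passing sketch (compiles against any MPI
// implementation, e.g. MPICH2): rank 0 sends a buffer to rank 1.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int TAG_DATA = 0;  // arbitrary message tag
    if (rank == 0 && size > 1) {
        char msg[] = "block payload";
        MPI_Send(msg, sizeof(msg), MPI_CHAR, 1, TAG_DATA, MPI_COMM_WORLD);
    } else if (rank == 1) {
        char buf[64];
        MPI_Status st;
        MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, TAG_DATA, MPI_COMM_WORLD, &st);
        std::printf("rank 1 received: %s\n", buf);
    }
    MPI_Finalize();
    return 0;
}
```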
For general cluster environments, the usual way to compress or decompress mass data is to design a concrete parallel algorithm for the particular problem. The advantage of this approach is that it is highly targeted, so the program runs very efficiently on that specific problem. Its defect is that an algorithm tailored to one problem has no universality; without a general framework, such parallel algorithms cannot be reused when a design is redeveloped or extended.
Summary of the invention
In view of the above problems, the object of the present invention is to provide a parallelized compression processing method based on computer clusters that is built on ordinary serial compression software and performs high-level parallelized compression processing of mass data.
To achieve the above object, the present invention adopts the following technical scheme: a method for parallelized compression processing based on computer clusters, characterized in that it comprises one provider process, a plurality of processor processes and one user process; serial compression algorithms are set in the processor processes, and all processor processes run in parallel; the provider process comprises a MakeBlock module, a SendBlockNum module, a SendFileSize module, a Produce module and a provider Run module; the MakeBlock module divides the data to be processed into a plurality of fixed-length data blocks to be sent, and stores the number of the blocks to be sent and related information in a provider maintenance table; the SendBlockNum module sends the number of blocks to be sent to the user process; the SendFileSize module sends the size of the data to be processed to the user process; the Produce module places blocks that are ready to be sent into the send buffer of the provider process; the provider Run module takes the blocks to be sent out of the send buffer and sends them to the processor process that requests one, and when no blocks to be sent remain in the provider process, the provider Run module terminates all idle processor processes; each processor process comprises a Process module and a processor Run module; the Process module receives a block to be sent and compresses it into a processed block; the processor Run module cyclically requests blocks to be sent from the provider process and sends the processed blocks to the user process; the user process comprises a ReceiveBlockNum module, a ReceiveFileSize module, a Consume module and a user Run module; the ReceiveBlockNum module receives the number of blocks to be sent transmitted by the SendBlockNum module; the ReceiveFileSize module receives the size of the data to be processed transmitted by the SendFileSize module; the Consume module sorts and merges all received processed blocks according to the user maintenance table; the user Run module cyclically receives the processed blocks sent by the processor processes while maintaining the user maintenance table.
The Process module supports multiple serial compression algorithms.
Whenever blocks, before or after processing, are transferred between the provider process, the processor processes and the user process, a unique sequence number is attached to each.
Both the provider maintenance table and the user maintenance table contain a sequence number and a block-descriptor structure; the sequence number records the order of the blocks to be sent, and the block-descriptor structure describes the content or storage location of a block before or after processing.
The provider process sends the blocks to be sent to idle processor processes, and the user process receives the processed blocks sent by the processor process holding the right of use.
During decompression, the provider process reads an index structure of the data to be processed to locate each block to be sent.
Because the present invention adopts the above technical scheme, it has the following advantages. 1) The parallelized compression process adopts a provider-processor-user framework that parallelizes different compression algorithms directly, so existing serial compression algorithms can be applied uniformly and in parallel to massive ordered data blocks; the framework both preserves the characteristics of the original compression algorithms while parallelizing them and can be extended with new compression algorithms. 2) In the provider-processor-user framework, the processor processes all contend, via an inter-process mechanism, for the right to use the provider and the user, so no processor process sits idle during processing, which guarantees the efficiency of the processors' data handling under the framework. 3) The user process restores the block order using the map container of the C++ standard template library, so its sorting speed is much higher than that of general sorting methods. The method of the present invention can be widely applied to computer clusters and is significant both for accelerating the processing of compressed files and for the design of parallel cluster algorithms.
Description of drawings
Fig. 1 is a schematic diagram of the provider-processor-user framework of the present invention;
Fig. 2 is a schematic diagram of the working process of the parallelized compression process of the present invention;
Fig. 3 is a schematic diagram analysing how the present invention guarantees the validity of the recovered data order.
Embodiments
The present invention is described in detail below with reference to the drawings and embodiments.
The present invention addresses the parallelization of serial compression algorithms in compression tasks under a computer cluster environment; since decompression is solved in much the same way as compression, the method of the present invention also applies to decompression. For compression, the present invention implements a general high-level parallelized compression process on top of the original serial compression software. Without modifying the original serial compression algorithm, the mass data is divided into a number of fixed-length blocks, the blocks are distributed to the compute nodes of the cluster, and finally the original serial compression algorithm compresses or decompresses the blocks simultaneously on the different nodes. To realize the ordered pipeline of distributing, computing and recovering blocks in parallel, the present invention builds on the classical producer-consumer pattern and uses, within the parallelized compression process, a Provider-Processor-User framework, abbreviated PPU framework, which neatly solves the transmission and the ordered recovery of file sequences during compression and decompression.
The PPU framework divides the processes opened under MPICH2 into three classes: the provider process (Provider), the processor processes (Processor) and the user process (User); this framework simplifies the design and improves both development and execution efficiency. The main functions of these processes are as follows. The provider process divides the data to be processed into fixed-length blocks, sends a block to whichever processor process requests one and, once all blocks have been sent, gradually stops all idle processor processes. A processor process requests blocks from the provider process, processes the blocks it obtains, and sends the processed blocks to the user process. The user process receives the processed blocks sent by the processor processes while maintaining the ordering information of the processed blocks. With this division of labour among the three classes of processes, different serial compression programs can be parallelized by changing only the processor process, and the precise division of labour enables quick development for different compression tasks.
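As an illustration of how the three classes of processes might be mapped onto MPI processes, the sketch below assigns PPU roles by MPI rank; the particular layout (provider on rank 0, user on the last rank, processors in between) is an assumption for illustration and is not prescribed by the method:

```cpp
// Hypothetical PPU role assignment by MPI rank: rank 0 = Provider,
// last rank = User, everything in between = Processor.
#include <mpi.h>

enum class Role { Provider, Processor, User };

Role role_of(int rank, int size) {
    if (rank == 0)        return Role::Provider;
    if (rank == size - 1) return Role::User;
    return Role::Processor;
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    switch (role_of(rank, size)) {
        case Role::Provider:  /* run provider loop  */ break;
        case Role::Processor: /* run processor loop */ break;
        case Role::User:      /* run user loop      */ break;
    }
    MPI_Finalize();
    return 0;
}
```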
As shown in Fig. 1, when a compression task is handled under the MPICH2 runtime, the user feeds the data A to be processed into the PPU framework of the present invention; after the data A has been compressed or decompressed in parallel, the processed data B corresponding to the data A is output. All processes opened by the parallelized compression process of the present invention are divided into one provider process 1, a plurality of processor processes 2 and one user process 3. Since the interaction between provider process 1, user process 3 and the processor processes 2 consists only of data-stream transfers, the parallelization of any serial compression algorithm is confined to the processing inside processor process 2 and is entirely independent, so this structure allows an original serial compression program to be rewritten into a parallel compression program.
As shown in Fig. 2, provider process 1 of the present invention comprises a MakeBlock module 11, a SendBlockNum module 12, a SendFileSize module 13, a Produce module 14 and a provider Run module 15; processor process 2 comprises a Process module 21 and a processor Run module 22; user process 3 comprises a ReceiveBlockNum module 31, a ReceiveFileSize module 32, a Consume module 33 and a user Run module 34. An Application module 4 starts the whole parallelized compression process, automatically classifies all started processes, and is also responsible for operations such as checking the legality of the user's input parameters.
In provider process 1, MakeBlock module 11 divides the user-supplied data A into a number of fixed-length blocks A' to be sent, then stores the number of the blocks A' and related information in the provider maintenance table M. SendBlockNum module 12 sends the number of blocks A' to user process 3, so that user process 3 can accurately receive the processed blocks A" from the processor processes 2. SendFileSize module 13 sends the size of the user-supplied data A to user process 3, so that user process 3 can accurately build an index structure over the processed blocks A". Produce module 14 places the blocks A' to be sent into the send buffer of provider process 1. Provider Run module 15 takes the blocks A' out of the send buffer and sends them to Process module 21; as long as blocks A' remain in provider process 1, the processor process 2 currently requesting a block A' is always served; and when no blocks A' remain in provider process 1, provider Run module 15 terminates all idle processor processes 2.
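A minimal sketch of what MakeBlock module 11 could look like, assuming the data A is available as an in-memory buffer; the names Block, make_blocks and the block size parameter are hypothetical:

```cpp
// Hypothetical MakeBlock: split a buffer into fixed-length blocks,
// each tagged with its sequence number (the last block may be shorter).
#include <algorithm>
#include <cstddef>
#include <vector>

struct Block {
    int seq;                    // unique block sequence number
    std::vector<char> payload;  // block content
};

std::vector<Block> make_blocks(const std::vector<char>& data,
                               std::size_t block_size) {
    std::vector<Block> blocks;
    for (std::size_t off = 0, seq = 0; off < data.size();
         off += block_size, ++seq) {
        std::size_t len = std::min(block_size, data.size() - off);
        blocks.push_back({static_cast<int>(seq),
                          std::vector<char>(data.begin() + off,
                                            data.begin() + off + len)});
    }
    return blocks;  // count and metadata would go into maintenance table M
}
```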
In processor process 2, Process module 21 compresses a received block A' into a processed block A" using the original serial compression algorithm preset in processor process 2. Since Process module 21 has no dependency on any other processor process 2 during compression, many processor processes 2 can run the parallelized compression simultaneously, which makes it possible to quickly parallelize different serial compression algorithms. Processor Run module 22 cyclically requests blocks A' from provider process 1, passes them through Process module 21 of processor process 2, and finally sends the processed blocks A" to user process 3.
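The per-block step of Process module 21 might be sketched as below, with zlib's compress() standing in for the preset serial compression algorithm; the method leaves the serial algorithm pluggable, so the choice of zlib here is purely an assumption:

```cpp
// Hypothetical Process step: compress one block, with zlib standing in
// for the pluggable serial compression algorithm.
#include <vector>
#include <zlib.h>

// Returns the compressed bytes of one block payload, or an empty
// vector on failure.
std::vector<unsigned char> process_block(const std::vector<unsigned char>& in) {
    uLongf out_len = compressBound(in.size());  // worst-case output size
    std::vector<unsigned char> out(out_len);
    if (compress(out.data(), &out_len, in.data(), in.size()) != Z_OK)
        return {};
    out.resize(out_len);                        // shrink to actual size
    return out;
}
```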
In user process 3, ReceiveBlockNum module 31 corresponds to SendBlockNum module 12 of provider process 1 and receives the number of blocks A' sent by provider process 1. ReceiveFileSize module 32 corresponds to SendFileSize module 13 of provider process 1 and receives the size of the data A sent by provider process 1. Consume module 33 sorts and merges, according to the user maintenance table N, all processed blocks A" recovered by user process 3. User Run module 34 cyclically receives the processed blocks A" sent by the processor processes 2 and maintains the user maintenance table N of user process 3.
The working process of the method of the present invention applied to a compression task is as follows:
1) First, the user starts the parallelized compression process through Application module 4, which initializes the PPU framework and constructs provider process 1, the processor processes 2 and user process 3 in the computer cluster. Within the parallelized compression process, these three classes of processes execute in parallel from beginning to end.
2) After provider process 1 starts, it calls MakeBlock module 11 to divide the user-supplied data A into blocks, and then enters the number of the blocks A' and related information into the provider maintenance table M.
3) After a processor process 2 starts, it runs processor Run module 22.
4) After user process 3 starts, it runs ReceiveBlockNum module 31.
5) After MakeBlock module 11 has finished, the parallelized compression process runs SendBlockNum module 12, which sends the number of blocks A' to ReceiveBlockNum module 31 in user process 3; the provider then needs to execute SendFileSize module 13.
6) After successfully executing the SendBlockNum module, the provider executes SendFileSize module 13, which sends the size of the data A to ReceiveFileSize module 32 in user process 3, and then runs provider Run module 15.
7) Produce module 14 runs and places the blocks A' into the send buffer of provider process 1.
8) Provider Run module 15 answers the requests of the processor Run modules 22: it takes the blocks A' generated by Produce module 14 out of the send buffer and sends them to a processor Run module 22, which passes each received block A' through Process module 21 to compress it into a processed block A".
9) Processor Run module 22 sends the blocks A" processed by Process module 21 to user process 3.
10) User Run module 34 cyclically receives the processed blocks A" sent by the processor processes 2 and, once all of them have been received, enters Consume module 33.
11) Consume module 33 caches and merges the processed blocks A" according to the user maintenance table N and outputs the processed data B; when writing the processed blocks A" into the processed data B, it builds the index structure of the processed blocks A" at the beginning of the output data B according to the user maintenance table N.
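The request-and-reply exchange of steps 7) to 9) could be realized over MPI roughly as in the following sketch of the provider side; the message tags and the simplified termination count are illustrative assumptions:

```cpp
// Hypothetical provider Run loop: hand one block to whichever processor
// asks, then send an empty TAG_STOP message to idle processors once the
// send buffer is exhausted (steps 7-9 above, heavily simplified).
#include <mpi.h>
#include <deque>
#include <vector>

constexpr int TAG_REQUEST = 1;  // processor -> provider: "give me a block"
constexpr int TAG_BLOCK   = 2;  // provider -> processor: seq number + payload
constexpr int TAG_STOP    = 3;  // provider -> processor: no more blocks

void provider_run(std::deque<std::vector<char>>& send_buffer,
                  int num_processors) {
    int seq = 0, stopped = 0;
    while (stopped < num_processors) {
        MPI_Status st;
        MPI_Recv(nullptr, 0, MPI_CHAR, MPI_ANY_SOURCE, TAG_REQUEST,
                 MPI_COMM_WORLD, &st);                 // wait for any request
        if (!send_buffer.empty()) {
            std::vector<char>& blk = send_buffer.front();
            MPI_Send(&seq, 1, MPI_INT, st.MPI_SOURCE, TAG_BLOCK,
                     MPI_COMM_WORLD);                  // block sequence number
            MPI_Send(blk.data(), static_cast<int>(blk.size()), MPI_CHAR,
                     st.MPI_SOURCE, TAG_BLOCK, MPI_COMM_WORLD);
            send_buffer.pop_front();
            ++seq;
        } else {
            MPI_Send(nullptr, 0, MPI_CHAR, st.MPI_SOURCE, TAG_STOP,
                     MPI_COMM_WORLD);                  // terminate idle processor
            ++stopped;
        }
    }
}
```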
To keep the data processing of the processor processes 2 running efficiently, the processor processes 2 of the present invention all contend, via an inter-process mechanism, for the right to use provider process 1 and user process 3. That is, provider process 1 does not send the blocks A' to the processor processes 2 in any fixed order; whenever a processor process 2 becomes idle, a block A' is sent to it. Likewise, user process 3 does not receive the processed blocks A" from the processor processes 2 in order; instead, the right to use user process 3 is offered for the processor processes 2 to contend for, and only when a processor process 2 has won that right does user process 3 accept the processed block A" it sends. As a result, the order in which user process 3 receives the processed blocks A" can differ from the order of the blocks A', and in a compression task this would make the processed data B inconsistent with the data A to be processed.
To let the processor processes 2 keep cycling efficiently while still guaranteeing that user process 3 finally obtains the ordered processed data B, user process 3 must be able to receive the out-of-order processed blocks A" from all processor processes 2 and sort them as they arrive. The present invention therefore maintains the user maintenance table N in user process 3 with the map container of the STL in C++, and at the same time transmits a specific sequence number for each block A' between the different processes, which solves the above problem.
As shown in Fig. 3, the provider maintenance table M is kept in provider process 1 and the user maintenance table N in user process 3 to guarantee the order of the blocks. The provider maintenance table M has two columns: the block sequence number, which records the order of the blocks A' to be sent, and the descriptor structure, which records details such as the content or storage location of a block A'. The user maintenance table N likewise has two columns: the block sequence number, which automatically keeps track of all the scattered processed blocks A" received and whose value corresponds to the sequence number of the block A', and the descriptor structure, which records details such as the content or storage location of a processed block A". In step 8), when provider process 1 sends a block A' to a processor process 2, it also sends the sequence number of the block A'. Then, in step 9), when a processor process 2 sends a processed block A" to user process 3, it also sends the block sequence number, so that Consume module 33 can efficiently sort the processed blocks A" received out of order according to the user maintenance table N.
The concrete efficient sorting method is as follows: the map container of the C++ STL is used in the user process to sort the recovered processed blocks A". Because the map container of the C++ STL is implemented internally as a red-black tree, traversing its elements in order through an iterator is highly efficient. A red-black tree is a self-balancing binary search tree that can search, insert into and delete from the user maintenance table N held in the map container in O(log n) time, where n is the number of nodes in the binary search tree. The present invention therefore attaches and transmits a unique block sequence number whenever blocks are exchanged between provider process 1, the processor processes 2 and user process 3, the sequence number being the serial number of the block A' to be sent. Under these conditions, the map container automatically keeps all the scattered processed blocks A" in order, and user process 3 finally outputs the ordered processed data B.
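A minimal sketch of this recovery sorting, assuming the payloads are held in memory: a std::map keyed by block sequence number plays the role of the user maintenance table N, and an in-order traversal yields the ordered output (the names Table and consume are illustrative):

```cpp
// Hypothetical user-side recovery: the map, keyed by block sequence
// number, stands in for the user maintenance table N; iteration is
// already in ascending key order because std::map is a red-black tree.
#include <cstdio>
#include <map>
#include <vector>

using Table = std::map<int, std::vector<unsigned char>>;  // table N

void consume(const Table& table_n) {
    for (const auto& [seq, payload] : table_n)  // in-order traversal
        std::fwrite(payload.data(), 1, payload.size(), stdout);
}

int main() {
    Table n;
    n[2] = {'C'};   // blocks arrive out of order ...
    n[0] = {'A'};
    n[1] = {'B'};
    consume(n);     // ... but are written out as "ABC"
}
```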
When the method of the present invention handles a decompression task, provider process 1 first reads in the file-header content of the compressed data A to be processed, after which each block A' to be decompressed can be located accurately through the file-header index structure. The index structure of the blocks A' consists of the starting pointer position and the block length of each block A'. The rest of the procedure is just like the compression task: the blocks A' are distributed from provider process 1 to the processor processes 2, and user process 3 recovers all the decompressed processed blocks A", caches and merges them, and outputs the processed data B. However, since the original file recovered by decompression needs no index structure, in a decompression task user process 3 does not build the index structure of the processed blocks A" at the beginning of the output data B when writing the processed blocks A" into the processed data B.
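The per-block index entry described above (starting position plus block length) might be declared as follows; the field names and the lookup helper are assumptions for illustration:

```cpp
// Hypothetical index entry kept at the head of the compressed output:
// one entry per block A', stored in block order.
#include <cstdint>
#include <vector>

struct BlockIndexEntry {
    std::uint64_t offset;  // starting byte position of the block A'
    std::uint64_t length;  // length of the compressed block in bytes
};

// Locate block `seq` in the compressed stream given the decoded index.
BlockIndexEntry locate(const std::vector<BlockIndexEntry>& index, int seq) {
    return index.at(seq);
}
```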
The PPU framework of the present invention is quite versatile for processing ordered data blocks in parallel, so it can be applied very easily to related fields, such as scientific computing where independent computation units exist, graphics and image processing, video processing and so on.

Claims (10)

1. A method for parallelized compression processing based on computer clusters, characterized in that it comprises one provider process, a plurality of processor processes and one user process; serial compression algorithms are set in the processor processes, and all processor processes run in parallel; the provider process comprises a MakeBlock module, a SendBlockNum module, a SendFileSize module, a Produce module and a provider Run module; the MakeBlock module divides the data to be processed into a plurality of fixed-length data blocks to be sent, and stores the number of the blocks to be sent and related information in a provider maintenance table; the SendBlockNum module sends the number of blocks to be sent to the user process; the SendFileSize module sends the size of the data to be processed to the user process; the Produce module places blocks that are ready to be sent into the send buffer of the provider process; the provider Run module takes the blocks to be sent out of the send buffer and sends them to the processor process that requests one; when no blocks to be sent remain in the provider process, the provider Run module terminates all idle processor processes;
each processor process comprises a Process module and a processor Run module; the Process module receives a block to be sent and compresses it into a processed block; the processor Run module cyclically requests blocks to be sent from the provider process and sends the processed blocks to the user process;
the user process comprises a ReceiveBlockNum module, a ReceiveFileSize module, a Consume module and a user Run module; the ReceiveBlockNum module receives the number of blocks to be sent transmitted by the SendBlockNum module; the ReceiveFileSize module receives the size of the data to be processed transmitted by the SendFileSize module; the Consume module sorts and merges all received processed blocks according to the user maintenance table; the user Run module cyclically receives the processed blocks sent by the processor processes while maintaining the user maintenance table.
2. The method for parallelized compression processing based on computer clusters according to claim 1, characterized in that the Process module supports multiple serial compression algorithms.
3. The method for parallelized compression processing based on computer clusters according to claim 1, characterized in that whenever blocks, before or after processing, are transferred between the provider process, the processor processes and the user process, a unique sequence number is attached to each.
4. The method for parallelized compression processing based on computer clusters according to claim 2, characterized in that whenever blocks, before or after processing, are transferred between the provider process, the processor processes and the user process, a unique sequence number is attached to each.
5. The method for parallelized compression processing based on computer clusters according to any one of claims 1 to 4, characterized in that both the provider maintenance table and the user maintenance table contain a sequence number and a block-descriptor structure; the sequence number records the order of the blocks to be sent, and the block-descriptor structure describes the content or storage location of a block before or after processing.
6. The method for parallelized compression processing based on computer clusters according to any one of claims 1 to 4, characterized in that the provider process sends the blocks to be sent to idle processor processes, and the user process receives the processed blocks sent by the processor process holding the right of use.
7. The method for parallelized compression processing based on computer clusters according to claim 5, characterized in that the provider process sends the blocks to be sent to idle processor processes, and the user process receives the processed blocks sent by the processor process holding the right of use.
8. The method for parallelized compression processing based on computer clusters according to any one of claims 1 to 4 and 7, characterized in that during decompression the provider process reads an index structure of the data to be processed to locate each block to be sent.
9. The method for parallelized compression processing based on computer clusters according to claim 5, characterized in that during decompression the provider process reads an index structure of the data to be processed to locate each block to be sent.
10. The method for parallelized compression processing based on computer clusters according to claim 6, characterized in that during decompression the provider process reads an index structure of the data to be processed to locate each block to be sent.
CNA2009100850217A 2009-05-27 2009-05-27 Method for processing compression program parallelization based on computer clusters Pending CN101556546A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2009100850217A CN101556546A (en) 2009-05-27 2009-05-27 Method for processing compression program parallelization based on computer clusters


Publications (1)

Publication Number Publication Date
CN101556546A 2009-10-14

Family

ID=41174669

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2009100850217A Pending CN101556546A (en) 2009-05-27 2009-05-27 Method for processing compression program parallelization based on computer clusters

Country Status (1)

Country Link
CN (1) CN101556546A (en)


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103326730A (en) * 2013-06-06 2013-09-25 清华大学 Data parallelism compression method
CN103326730B (en) * 2013-06-06 2016-05-18 清华大学 Data parallel compression method
CN107168812A (en) * 2017-06-12 2017-09-15 迈普通信技术股份有限公司 Obtain the method and device of process data
CN109375873A (en) * 2018-09-27 2019-02-22 郑州云海信息技术有限公司 The initial method of data processing finger daemon in a kind of distributed storage cluster
CN109375873B (en) * 2018-09-27 2022-02-18 郑州云海信息技术有限公司 Initialization method of data processing daemon in distributed storage cluster
CN112070652A (en) * 2019-06-10 2020-12-11 上海赜睿信息科技有限公司 Data compression method, data decompression method, readable storage medium and electronic device
CN112231113A (en) * 2020-09-18 2021-01-15 苏州浪潮智能科技有限公司 Message transmission method and device
CN112231113B (en) * 2020-09-18 2023-01-06 苏州浪潮智能科技有限公司 Message transmission method and device


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20091014