CN106648451A - Memory-based MapReduce engine data processing method and apparatus - Google Patents
Memory-based MapReduce engine data processing method and apparatus
- Publication number
- CN106648451A (application CN201610305911.4A)
- Authority
- CN
- China
- Prior art keywords
- granularity
- data
- reduce
- data processing
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F3/0611—Improving I/O performance in relation to response time
- G06F16/182—Distributed file systems
- G06F3/0613—Improving I/O performance in relation to throughput
- G06F3/0629—Configuration or reconfiguration of storage systems
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Abstract
The invention discloses a memory-based MapReduce engine data processing method and apparatus. The method comprises: cutting the Map output data of each partition into granules (buckets) and sorting the cut granules; shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory. By pipelining the reduce process of MapReduce purely in software, the method greatly reduces IO accesses and latency; the number of concurrent batches can be adjusted to the capacity of the available memory, improving the throughput and overall performance of the MapReduce engine. Measured results exceed 1.6 to 2 times the original performance.
Description
Technical field
The present invention relates to the field of MapReduce engine data processing, and in particular to a memory-based MapReduce engine data processing method and apparatus.
Background
Since the big data processing technology Hadoop emerged, optimizing the performance of its data processing engine, MapReduce, has been a continuing industry concern.
Taking reduce as an example: the whole partition (P1) in each Map output file (MOF) is copied to the reduce end, then all the copied partitions (P1, P1, P1) are merged into one ordered overall set (P1), and finally reduce is performed. Because the data volume is uncontrollable, memory overflow in the copy phase is inevitable, and the subsequent merge and reduce phases produce many hard-disk IO accesses.
At present, the main industry methods for reducing IO in MapReduce are compression and combination. Mellanox's memory-based MapReduce, UDA (Unstructured Data Accelerator), attempted pipelining, but it relies mainly on hardware acceleration, and because its shuffle protocol design is limited, each shuffle batch carries only a small amount of data and high throughput is not achieved.
Summary of the invention
The main object of the present invention is to provide a memory-based MapReduce engine data processing method and apparatus that achieve high data throughput purely in software.
To achieve the above object, the present invention first provides a memory-based MapReduce engine data processing method, comprising:
cutting the Map output data of each partition into granules, and sorting the cut granules;
shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory.
Further, the step of shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory comprises:
running each processing stage of the reduce process as an independent sub-thread, the sub-threads communicating through shared first-in-first-out asynchronous message queues.
Further, the step of shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory comprises:
running the pipelined data processing as multiple concurrent batches, the number of concurrent batches being controlled by dividing the size of the available memory by the memory size configured for each batch.
Further, before the step of cutting the Map output data of each partition into granules and sorting the cut granules, the method comprises:
synchronizing all reduces once after every given number of pipelined batches;
and after the step of cutting the Map output data of each partition into granules and sorting the cut granules, the method comprises:
storing the MOF file sorted by relative bucket ID, and pre-caching it in the same order.
Further, the step of shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory comprises:
sorting the data inside a granule optionally; if the data inside a granule is unsorted, it is sorted at the reduce end after being copied.
The present invention also provides a memory-based MapReduce engine data processing apparatus, comprising:
a cutting unit, configured to cut the Map output data of each partition into granules and sort the cut granules;
a pipelining unit, configured to shuffle the cut granules of each partition in multiple batches, perform pipelined copy, merge, and reduce processing on each batch of data in turn, and keep the data processed by the reduce process in memory.
Further, the pipelining unit comprises:
a shared communication module, configured to run each processing stage of the reduce process as an independent sub-thread, the sub-threads communicating through shared first-in-first-out asynchronous message queues.
Further, the pipelining unit comprises:
a concurrent running module, configured to run the pipelined data processing as multiple concurrent batches, the number of concurrent batches being controlled by dividing the size of the available memory by the memory size configured for each batch.
Further, the memory-based MapReduce engine data processing apparatus also comprises:
a synchronization unit, configured to synchronize all reduces once after every given number of pipelined batches;
a storage unit, configured to store the MOF file sorted by relative bucket ID and pre-cache it in the same order.
Further, the pipelining unit comprises:
a granule-data processing module, configured to sort the data inside a granule optionally: if the data inside a granule is unsorted, it is sorted at the reduce end after being copied.
In the memory-based MapReduce engine data processing method and apparatus of the present invention, the reduce process of MapReduce is pipelined purely in software: data is processed concurrently in batches, and the shuffle (copy), merge, and reduce of each batch can be kept in memory (In-Memory), greatly reducing IO accesses and latency. The number of concurrent batches can be adjusted according to the amount of available memory, improving the throughput and overall performance of the MapReduce engine; measured results exceed 1.6 to 2 times the original.
Description of the drawings
Fig. 1 is a flow diagram of a memory-based MapReduce engine data processing method according to an embodiment of the invention;
Fig. 2 is a schematic diagram of the pipelined reduce process according to an embodiment of the invention;
Fig. 3 is a flow diagram of a memory-based MapReduce engine data processing method according to an embodiment of the invention;
Fig. 4 is a schematic diagram of zookeeper-based batch synchronization between reduce processes according to an embodiment of the invention;
Fig. 5 is a schematic diagram of the bucket ID mapping relations according to an embodiment of the invention;
Fig. 6 is a structural block diagram of a memory-based MapReduce engine data processing apparatus according to an embodiment of the invention;
Fig. 7 is a structural block diagram of the pipelining unit according to an embodiment of the invention;
Fig. 8 is a structural block diagram of a memory-based MapReduce engine data processing apparatus according to an embodiment of the invention.
The realization of the objects, functional characteristics, and advantages of the invention are further described below with reference to the drawings and the embodiments.
Specific embodiments
It should be understood that the specific embodiments described herein are intended only to explain the present invention, not to limit it.
With reference to Fig. 1, an embodiment of the present invention provides a memory-based MapReduce engine data processing method, comprising the steps:
S1: cutting the Map output data of each partition into granules, and sorting the cut granules;
S2: shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory.
In step S1, the data of each original partition is cut into finer granules (buckets); to enable the subsequent pipelined data processing, the buckets must be ordered. In step S2 of this embodiment, to enable pipelined operation, the new shuffle protocol, unlike the original protocol that shuffles whole partitions, shuffles a partition in multiple batches (passes), each batch copying in order only a subset (bucket) of the whole partition's data. By adjusting the bucket size, the data processed by the reduce process can thus be kept in memory. The reduce process of MapReduce is pipelined purely in software: data is processed in batches, and the shuffle (copy), merge, and reduce of each batch can be kept in memory (In-Memory), greatly reducing IO accesses and latency; the number of concurrent batches can be adjusted according to the amount of available memory, improving the throughput and overall performance of the MapReduce engine.
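The bucket-based multi-batch shuffle described above can be sketched as follows. This is an illustrative simplified model, not the patent's implementation: all function names are invented, and key-range bucketing is an assumption made so that consecutive passes concatenate into a globally ordered result. Each pass copies the same bucket from every map output, merges it in memory, and reduces it immediately, so only one bucket per map output is resident at a time.

```python
def split_into_buckets(records, bucket_width):
    """Cut a partition's (key, value) records into ordered key-range buckets:
    bucket i holds keys in [i*bucket_width, (i+1)*bucket_width)."""
    buckets = {}
    for key, value in records:
        buckets.setdefault(key // bucket_width, []).append((key, value))
    return buckets

def sum_reduce(batch):
    """Toy reduce function: sum the values of each key in a sorted batch."""
    out = []
    for key, value in batch:
        if out and out[-1][0] == key:
            out[-1] = (key, out[-1][1] + value)
        else:
            out.append((key, value))
    return out

def multi_pass_reduce(partitions, bucket_width, reduce_fn):
    """Shuffle in multiple passes: each pass copies one bucket from every
    map output, merges it in memory, and reduces it immediately."""
    split = [split_into_buckets(p, bucket_width) for p in partitions]
    bucket_ids = sorted(set().union(*(s.keys() for s in split)))
    results = []
    for bid in bucket_ids:                    # one batch (pass) per bucket
        batch = []
        for s in split:                       # copy phase: same bucket each
            batch.extend(s.get(bid, []))
        batch.sort(key=lambda kv: kv[0])      # in-memory merge phase
        results.extend(reduce_fn(batch))      # reduce phase, no disk spill
    return results
```

Because the buckets are key-range ordered, the per-pass results concatenate into the same ordered output that a single whole-partition shuffle would produce.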
With reference to Fig. 2, in this embodiment, step S2 of shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory comprises:
S21: running each processing stage of the reduce process as an independent sub-thread, the sub-threads communicating through shared first-in-first-out asynchronous message queues.
In step S21, the copy thread (fetcher) notifies the merge thread (merger) when it completes, and the merge thread notifies the reduce thread (reducer) when it completes. Because order is preserved between batches, the reducer can perform the reduce computation directly without waiting for the remaining data to be copied; throughout the process, data access can be kept in memory, and latency is greatly reduced.
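The fetcher, merger, and reducer hand-off described above can be sketched with ordinary threads linked by shared FIFO queues. This is a minimal illustration using Python's `queue.Queue` as a stand-in for the asynchronous message queues; all function names are hypothetical.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the batch stream

def fetcher(batches, merge_q):
    for batch in batches:            # "copy" each batch of bucket data
        merge_q.put(batch)
    merge_q.put(DONE)                # notify the merger we are done

def merger(merge_q, reduce_q):
    while (batch := merge_q.get()) is not DONE:
        reduce_q.put(sorted(batch))  # in-memory merge of one batch
    reduce_q.put(DONE)               # notify the reducer we are done

def reducer(reduce_q, results):
    while (batch := reduce_q.get()) is not DONE:
        results.append(sum(batch))   # reduce immediately, no waiting

def run_pipeline(batches):
    merge_q, reduce_q, results = queue.Queue(), queue.Queue(), []
    threads = [threading.Thread(target=fetcher, args=(batches, merge_q)),
               threading.Thread(target=merger, args=(merge_q, reduce_q)),
               threading.Thread(target=reducer, args=(reduce_q, results))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each stage runs independently, so batch N can be reduced while batch N+1 is still being copied, which is the overlap that hides the copy latency.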
In this embodiment, step S2 of shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory comprises:
S22: running the pipelined data processing as multiple concurrent batches, the number of concurrent batches being controlled by dividing the size of the available memory by the memory size configured for each batch.
As described in step S22, running concurrently in adaptation to the available memory greatly improves throughput compared with the native method.
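The sizing rule in S22 (number of concurrent batches = available memory divided by per-batch memory) amounts to the following; parameter names are illustrative.

```python
def concurrent_batches(free_memory_bytes, per_batch_bytes):
    """Number of batches to run concurrently: free memory divided by the
    memory configured per batch, floored, with a minimum of one batch."""
    return max(1, free_memory_bytes // per_batch_bytes)
```

For example, 8 GB of free memory with 512 MB configured per batch yields 16 concurrent batches; when memory is scarce the pipeline degrades gracefully to one batch at a time.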
In this embodiment, the step of shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory comprises:
S23: sorting the data inside a granule optionally; if the data inside a granule is unsorted, it is sorted at the reduce end after being copied.
As described in step S23, if the data inside a granule is unsorted, it can be sorted at the reduce end after copying. This transfers some of the CPU cost of sorting from the map end to the reduce end, which relieves pressure on jobs where the map-end CPU becomes the bottleneck.
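The optional in-bucket sort can be sketched as follows (names are hypothetical, and `heapq.merge` stands in for the merge phase): when map-side buckets arrive unsorted, the reduce end sorts each copied bucket first, shifting that CPU cost off the map side.

```python
import heapq

def merge_copied_buckets(buckets, presorted):
    """Merge copied buckets into one ordered run; if the map end skipped the
    in-bucket sort, sort each bucket at the reduce end after the copy."""
    if not presorted:
        buckets = [sorted(b) for b in buckets]  # reduce-end sort, post-copy
    return list(heapq.merge(*buckets))          # streaming merge of runs
```

Either way the merge phase sees sorted runs; the only thing that moves is where the sorting CPU time is spent.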
With reference to Figs. 3, 4, and 5, in this embodiment, before step S1 of cutting the Map output data of each partition into granules and sorting the cut granules, the method comprises:
S11: synchronizing all reduces once after every given number of pipelined batches;
and after step S1, the method comprises:
S12: storing the MOF file sorted by relative bucket ID, and pre-caching it in the same order.
As described in step S11, because reduces start in different orders and nodes differ in computing capability and network conditions, the batch (pass) divergence between running reduces is sometimes obvious: one reduce may be running pass 0 while another is running pass 5. This challenges per-batch pre-caching of the MOF, because memory may be insufficient to cache all the data of batches 0 through 5. A synchronization mechanism is therefore needed to lock batches: only after all reduces complete the processing of the same batch do they proceed together to the next batch, ensuring MOF cache hits. Referring to Fig. 4, because the reduces may be distributed on different nodes, this embodiment uses the node synchronization mechanism of the hadoop ecosystem, zookeeper, to synchronize all reduces once after every given number of batches; that is, the divergence between reduces never exceeds the given number of batches.
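The batch-level lock-step can be illustrated with a local stand-in: a `threading.Barrier` plays the role of the ZooKeeper-based cross-node synchronization, and every `sync_every` batches all reducers wait for one another, bounding how far any reducer can run ahead. All names here are invented for the sketch.

```python
import threading

def run_reducers(num_reducers, num_batches, sync_every, trace):
    """Run reducers in lock-step: all wait at a barrier every sync_every
    batches, so their pass divergence stays bounded (cache stays hot)."""
    barrier = threading.Barrier(num_reducers)

    def reducer(rid):
        for batch in range(num_batches):
            trace.append((rid, batch))      # "process" one batch
            if (batch + 1) % sync_every == 0:
                barrier.wait()              # all reducers re-align here

    threads = [threading.Thread(target=reducer, args=(r,))
               for r in range(num_reducers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With `sync_every = 2`, no reducer can start batch 2 until every reducer has finished batch 1, which is exactly the cache-hit guarantee the text describes.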
With reference to Fig. 5, as described in step S12, because of the multi-batch shuffle, each reduce proceeds in order starting from relative bucket 0. The new approach of storing the MOF file sorted by relative bucket ID and pre-caching it in the same order effectively reduces random IO and increases the cache hit probability.
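The bucket-ID-ordered layout of S12 can be sketched as follows; the on-disk format here is invented for illustration. Buckets are written in ascending relative-bucket-ID order, so a reduce that fetches buckets in order reads the file at monotonically increasing offsets, i.e. sequential IO that a pre-cache (read-ahead) can exploit.

```python
def write_mof(buckets_by_id):
    """Serialize buckets sorted by relative bucket ID; return (blob, index)
    where index maps bucket ID to (offset, length) in the blob."""
    blob, index, off = b"", {}, 0
    for bid in sorted(buckets_by_id):       # ascending relative bucket ID
        data = buckets_by_id[bid]
        index[bid] = (off, len(data))
        blob += data
        off += len(data)
    return blob, index

def sequential_reads(index):
    """Offsets touched when bucket IDs are fetched in order; they increase
    monotonically, so the access pattern is sequential, not random."""
    return [index[bid][0] for bid in sorted(index)]
```

Because the fetch order and the storage order agree, a simple read-ahead cache of the next region of the file is always the right prediction.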
In this embodiment, because computation becomes memory-based after pipelining, memory use and management also greatly affect performance. Memory managed by the Java Virtual Machine (JVM) is constrained by garbage collection performance and is clearly unsuitable for the present invention; the present invention therefore allocates memory directly from the system and manages it itself, without using the JVM.
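The self-managed allocation idea, one large region obtained from the system up front and sliced and recycled by the engine itself rather than by a garbage collector, can be sketched as a toy fixed-size buffer pool. This structure is entirely illustrative; the patent does not specify it.

```python
class BufferPool:
    """Toy free-list pool: one large backing region, fixed-size slices
    handed out and recycled without allocator or GC churn."""

    def __init__(self, num_buffers, buf_size):
        self._store = bytearray(num_buffers * buf_size)  # one big region
        self._free = list(range(num_buffers))            # free slot ids
        self.buf_size = buf_size

    def acquire(self):
        """Take a free slot; return (slot id, writable view of its slice)."""
        slot = self._free.pop()
        view = memoryview(self._store)
        return slot, view[slot * self.buf_size:(slot + 1) * self.buf_size]

    def release(self, slot):
        """Return a slot to the pool; the memory is reused, never freed."""
        self._free.append(slot)
```

Batch buffers acquired from such a pool live outside any collector's heap accounting, which is the property the text is after.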
In one embodiment, a comparison test was carried out and the experimental data comparatively analyzed, as follows:
(1) Test environment: 4 data nodes
- hadoop software: the three major distributions (CDH, HDP, MAPR) give similar results
- CPU: 2x8 cores
- RAM: 128 GB
- Disk: 12x 2 TB
(2) Measured results, as in the table below:
As can be seen from the table, the invention significantly improves the data processing capability of MapReduce itself, to roughly 1.6 to 2 times the original.
In the memory-based MapReduce engine data processing method of this embodiment, the reduce process of MapReduce is pipelined purely in software: data is processed concurrently in batches, and the shuffle (copy), merge, and reduce of each batch can be kept in memory (In-Memory), greatly reducing IO accesses and latency. The number of concurrent batches can be adjusted according to the amount of available memory, improving the throughput and overall performance of the MapReduce engine; measured results exceed 1.6 to 2 times the original.
With reference to Fig. 6, an embodiment of the present invention also provides a memory-based MapReduce engine data processing apparatus, comprising:
a cutting unit 10, configured to cut the Map output data of each partition into granules and sort the cut granules;
a pipelining unit 20, configured to shuffle the cut granules of each partition in multiple batches, perform pipelined copy, merge, and reduce processing on each batch of data in turn, and keep the data processed by the reduce process in memory.
In the cutting unit 10, the data of each original partition is cut into finer granules (buckets); to enable the subsequent pipelined data processing, the buckets must be ordered. In the pipelining unit 20 of this embodiment, to enable pipelined operation, the new shuffle protocol, unlike the original protocol that shuffles whole partitions, shuffles a partition in multiple batches (passes), each batch copying in order only a subset (bucket) of the whole partition's data. By adjusting the bucket size, the data processed by the reduce process can thus be kept in memory. The reduce process of MapReduce is pipelined purely in software: data is processed in batches, and the shuffle (copy), merge, and reduce of each batch can be kept in memory (In-Memory), greatly reducing IO accesses and latency; the number of concurrent batches can be adjusted according to the amount of available memory, improving the throughput and overall performance of the MapReduce engine.
With reference to Fig. 7, in this embodiment, the pipelining unit 20 comprises:
a shared communication module 21, configured to run each processing stage of the reduce process as an independent sub-thread, the sub-threads communicating through shared first-in-first-out asynchronous message queues.
In the shared communication module, the copy thread (fetcher) notifies the merge thread (merger) when it completes, and the merge thread notifies the reduce thread (reducer) when it completes. Because order is preserved between batches, the reducer can perform the reduce computation directly without waiting for the remaining data to be copied; throughout the process, data access can be kept in memory, and latency is greatly reduced.
In this embodiment, the pipelining unit 20 comprises:
a concurrent running module 22, configured to run the pipelined data processing as multiple concurrent batches, the number of concurrent batches being controlled by dividing the size of the available memory by the memory size configured for each batch.
As described above for the concurrent running module, running concurrently in adaptation to the available memory greatly improves throughput compared with the native method.
In this embodiment, the pipelining unit 20 comprises:
a granule-data processing module 23, configured to sort the data inside a granule optionally: if the data inside a granule is unsorted, it is sorted at the reduce end after being copied.
In the granule-data processing module 23, if the data inside a granule is unsorted, it can be sorted at the reduce end after copying. This transfers some of the CPU cost of sorting from the map end to the reduce end, which relieves pressure on jobs where the map-end CPU becomes the bottleneck.
With reference to Fig. 8, in this embodiment, the memory-based MapReduce engine data processing apparatus also comprises:
a synchronization unit 110, configured to synchronize all reduces once after every given number of pipelined batches;
a storage unit 120, configured to store the MOF file sorted by relative bucket ID and pre-cache it in the same order.
In the synchronization unit 110, because reduces start in different orders and nodes differ in computing capability and network conditions, the batch (pass) divergence between running reduces is sometimes obvious: one reduce may be running pass 0 while another is running pass 5. This challenges per-batch pre-caching of the MOF, because memory may be insufficient to cache all the data of batches 0 through 5. A synchronization mechanism is therefore needed to lock batches: only after all reduces complete the processing of the same batch do they proceed together to the next batch, ensuring MOF cache hits. Referring to Fig. 4, because the reduces may be distributed on different nodes, this embodiment uses the node synchronization mechanism of the hadoop ecosystem, zookeeper, to synchronize all reduces once after every given number of batches; that is, the divergence between reduces never exceeds the given number of batches.
With reference to Fig. 5, in the storage unit 120, because of the multi-batch shuffle, each reduce proceeds in order starting from relative bucket 0. The new approach of storing the MOF file sorted by relative bucket ID and pre-caching it in the same order effectively reduces random IO and increases the cache hit probability.
In this embodiment, because computation becomes memory-based after pipelining, memory use and management also greatly affect performance. Memory managed by the Java Virtual Machine (JVM) is constrained by garbage collection performance and is clearly unsuitable for the present invention; the present invention therefore allocates memory directly from the system and manages it itself, without using the JVM.
In one embodiment, a comparison test was carried out and the experimental data comparatively analyzed, as follows:
(1) Test environment: 4 data nodes
- hadoop software: the three major distributions (CDH, HDP, MAPR) give similar results
- CPU: 2x8 cores
- RAM: 128 GB
- Disk: 12x 2 TB
(2) Measured results, as in the table below:
As can be seen from the table, the invention significantly improves the data processing capability of MapReduce itself, to roughly 1.6 to 2 times the original.
In the memory-based MapReduce engine data processing apparatus of this embodiment, the reduce process of MapReduce is pipelined purely in software: data is processed concurrently in batches, and the shuffle (copy), merge, and reduce of each batch can be kept in memory (In-Memory), greatly reducing IO accesses and latency. The number of concurrent batches can be adjusted according to the amount of available memory, improving the throughput and overall performance of the MapReduce engine; measured results exceed 1.6 to 2 times the original.
The foregoing are only preferred embodiments of the present invention and do not thereby limit its scope of claims; every equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
1. A memory-based MapReduce engine data processing method, characterized by comprising:
cutting the Map output data of each partition into granules, and sorting the cut granules;
shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory.
2. The memory-based MapReduce engine data processing method according to claim 1, characterized in that the step of shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory comprises:
running each processing stage of the reduce process as an independent sub-thread, the sub-threads communicating through shared first-in-first-out asynchronous message queues.
3. The memory-based MapReduce engine data processing method according to claim 1, characterized in that the step of shuffling the cut granules of each partition in multiple batches, performing pipelined copy, merge, and reduce processing on each batch of data in turn, and keeping the data processed by the reduce process in memory comprises:
running the pipelined data processing as multiple concurrent batches, the number of concurrent batches being controlled by dividing the size of the available memory by the memory size configured for each batch.
4. The memory-based MapReduce engine data processing method according to claim 3, characterized in that before the step of cutting the Map output data of each partition into granules and sorting the cut granules, the method comprises:
synchronizing all reduces once after every given number of pipelined batches;
and after the step of cutting the Map output data of each partition into granules and sorting the cut granules, the method comprises:
storing the MOF file sorted by relative bucket ID, and pre-caching it in the same order.
5. The memory-based MapReduce engine data processing method according to claim 1, wherein the step of
performing multi-batch shuffle on the cut granules of each partition, and subjecting the data of each batch in turn to pipelined copy, merge and
reduce processing, with the data handled by the reduce process kept in memory, comprises:
optionally sorting the data inside each granule; if the data inside a granule is unsorted, it is sorted at the reduce side
after being copied.
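The optional in-granule sort of claim 5 can be sketched as a flag carried with each granule. The granule representation (a dict with a `sorted` flag) and the function names are illustrative assumptions; the point is only that the sort happens either on the map side or at the reduce side after the copy, never twice:

```python
def make_granule(records, presort=False):
    # Map side: sorting inside a granule is optional.
    data = sorted(records) if presort else list(records)
    return {"sorted": presort, "data": data}

def reduce_side_receive(granule):
    # Reduce side: if the granule arrived unsorted, sort it here,
    # after the copy phase, before merging.
    data = granule["data"]
    return data if granule["sorted"] else sorted(data)

g = make_granule([("b", 2), ("a", 1)], presort=False)
print(reduce_side_receive(g))  # [('a', 1), ('b', 2)]
```

Deferring the sort to the reduce side lets a loaded map side skip the work, at the cost of doing it after the copy instead.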
6. A memory-based MapReduce engine data processing apparatus, comprising:
a cutting unit, configured to perform granularity cutting on the Map output result data of each partition, and to sort the cut
granules;
a pipeline unit, configured to perform multi-batch shuffle on the cut granules of each partition, and to subject the data of each batch in turn
to pipelined copy, merge and reduce processing, keeping the data handled by the reduce process in memory.
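The two units of claim 6 can be sketched end to end in a few lines. Everything here is an illustrative assumption (function names, fixed granule size, summing as the reduce function); it only shows the shape of the flow: cut a partition's map output into sorted granules, then shuffle the granules in batches, with each batch copied, merged and reduced in memory:

```python
def cut_into_granules(partition_records, granule_size):
    # Cutting unit: split one partition's Map output into fixed-size
    # granules, then sort the granules by their first key.
    granules = [partition_records[i:i + granule_size]
                for i in range(0, len(partition_records), granule_size)]
    return sorted(granules, key=lambda g: g[0][0])

def shuffle_in_batches(granules, batch_size):
    # Pipeline unit: group granules into shuffle batches; each batch is
    # copied, merged and reduced in turn, staying in memory throughout.
    for i in range(0, len(granules), batch_size):
        batch = granules[i:i + batch_size]
        merged = sorted(kv for g in batch for kv in g)   # copy + merge
        yield sum(v for _, v in merged)                  # reduce

records = [("d", 4), ("a", 1), ("c", 3), ("b", 2)]
granules = cut_into_granules(records, granule_size=2)
print(list(shuffle_in_batches(granules, batch_size=1)))  # [5, 5]
```

Cutting into granules first is what makes multi-batch shuffle possible: each batch is small enough to be merged and reduced entirely in memory.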
7. The memory-based MapReduce engine data processing apparatus according to claim 6, wherein the
pipeline unit comprises:
a shared communication module, configured so that each processing stage of the reduce process is an independently running sub-thread, the sub-threads
communicating with one another through a shared first-in-first-out asynchronous message queue.
8. The memory-based MapReduce engine data processing apparatus according to claim 6, wherein the
pipeline unit comprises:
a concurrent running module, configured to run the pipelined data processing in multiple concurrent batches, the number of concurrent
batches being controlled by dividing the size of free memory by the memory configured for each batch.
9. The memory-based MapReduce engine data processing apparatus according to claim 6, further
comprising:
a synchronization unit, configured to synchronize all reduces once after a given batch of data has been pipelined;
a storage unit, configured to store the MOF files sorted by their bucket IDs, and to pre-cache them correspondingly in the same order.
10. The memory-based MapReduce engine data processing apparatus according to claim 6, wherein the
pipeline unit comprises:
a granule-internal data processing module, configured to optionally sort the data inside each granule; if the data inside a granule is unsorted,
it is sorted at the reduce side after being copied.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610305911.4A CN106648451B (en) | 2016-05-10 | 2016-05-10 | MapReduce engine data processing method and device based on memory |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106648451A true CN106648451A (en) | 2017-05-10 |
CN106648451B CN106648451B (en) | 2020-09-08 |
Family
ID=58848688
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610305911.4A Active CN106648451B (en) | 2016-05-10 | 2016-05-10 | MapReduce engine data processing method and device based on memory |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106648451B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101197016A (en) * | 2006-12-08 | 2008-06-11 | 鸿富锦精密工业(深圳)有限公司 | Batching concurrent job processing equipment and method |
CN102222092A (en) * | 2011-06-03 | 2011-10-19 | 复旦大学 | Massive high-dimension data clustering method for MapReduce platform |
US20140123115A1 (en) * | 2012-10-26 | 2014-05-01 | Jsmapreduce Corporation | Hybrid local/remote infrastructure for data processing with lightweight setup, powerful debuggability, controllability, integration, and productivity features |
CN103793425A (en) * | 2012-10-31 | 2014-05-14 | 国际商业机器公司 | Data processing method and data processing device for distributed system |
CN103970520A (en) * | 2013-01-31 | 2014-08-06 | 国际商业机器公司 | Resource management method and device in MapReduce framework and framework system with device |
CN104021194A (en) * | 2014-06-13 | 2014-09-03 | 浪潮(北京)电子信息产业有限公司 | Mixed type processing system and method oriented to industry big data diversity application |
CN104598562A (en) * | 2015-01-08 | 2015-05-06 | 浪潮软件股份有限公司 | XML file processing method and device based on MapReduce parallel computing model |
US20150150017A1 (en) * | 2013-11-26 | 2015-05-28 | International Business Machines Corporation | Optimization of map-reduce shuffle performance through shuffler i/o pipeline actions and planning |
Non-Patent Citations (1)
Title |
---|
彭辅权 (PENG Fuquan) et al.: "Shuffle Optimization and Reconstruction in MapReduce" (MapReduce中shuffle优化与重构), China Sciencepaper (中国科技论文) *
Also Published As
Publication number | Publication date |
---|---|
CN106648451B (en) | 2020-09-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zheng et al. | Distdgl: distributed graph neural network training for billion-scale graphs | |
US10831562B2 (en) | Method and system for operating a data center by reducing an amount of data to be processed | |
Verma et al. | Breaking the MapReduce stage barrier | |
Tao et al. | Minimal mapreduce algorithms | |
CN110032442A (en) | Accelerate the framework and mechanism of tuple space search using integrated GPU | |
US10002019B2 (en) | System and method for assigning a transaction to a serialized execution group based on an execution group limit for parallel processing with other execution groups | |
US9986018B2 (en) | Method and system for a scheduled map executor | |
Shatdal et al. | Using shared virtual memory for parallel join processing | |
CN107193898B (en) | The inquiry sharing method and system of log data stream based on stepped multiplexing | |
WO2022083197A1 (en) | Data processing method and apparatus, electronic device, and storage medium | |
Frey et al. | A spinning join that does not get dizzy | |
CN109828790B (en) | Data processing method and system based on Shenwei heterogeneous many-core processor | |
Li et al. | Improving the shuffle of hadoop MapReduce | |
Tseng et al. | Accelerating open vSwitch with integrated GPU | |
US7466716B2 (en) | Reducing latency in a channel adapter by accelerated I/O control block processing | |
CN114253930A (en) | Data processing method, device, equipment and storage medium | |
Guo et al. | Distributed join algorithms on multi-CPU clusters with GPUDirect RDMA | |
CN106648451A (en) | Memory-based MapReduce engine data processing method and apparatus | |
CN110502337A (en) | For the optimization system and method for shuffling the stage in Hadoop MapReduce | |
CN100499564C (en) | Packet processing engine | |
US11582133B2 (en) | Apparatus and method for distributed processing of identical packet in high-speed network security equipment | |
Guo et al. | Handling data skew at reduce stage in Spark by ReducePartition | |
Vinutha et al. | In-Memory Cache and Intra-Node Combiner Approaches for Optimizing Execution Time in High-Performance Computing | |
He et al. | Ds 2: Handling data skew using data stealings over high-speed networks | |
Lu et al. | Improving mapreduce performance by using a new partitioner in yarn |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||