CN104036141B - Open computing language (OpenCL)-based red-black tree acceleration method - Google Patents
- Publication number: CN104036141B
- Application number: CN201410266098.5A
- Authority: CN (China)
- Prior art keywords: node, black, red, data, father
- Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Image Analysis (AREA)
- Processing Or Creating Images (AREA)
- Image Processing (AREA)
Abstract
The invention discloses an open computing language (OpenCL)-based red-black tree acceleration algorithm. Because many of the computations involved in building a red-black tree can be processed in parallel, the method uses an OpenCL heterogeneous platform to build a red-black tree model quickly for big data. Following the idea of graphics processing unit (GPU) acceleration, the data to be processed is divided into multiple data blocks, and the many cores of the GPU perform data insertion operations simultaneously. After the GPU computations have been synchronized, a merge operation completes the construction of the whole red-black tree. The advantage of the OpenCL-based red-black tree acceleration algorithm is that, for big data, the red-black tree can be built in a short time.
Description
Technical field
The present invention relates to the field of GPU-based parallel computing, and in particular to an OpenCL-based red-black tree acceleration algorithm.
Background art
A red-black tree is a self-balancing binary search tree. Its typical use is to implement associative arrays, and it can also be used for big data retrieval. It is complex, but its operations have good worst-case running time and it is efficient in practice: it can search, insert and delete in O(log n) time, where n is the number of elements in the tree. A red-black tree is a binary search tree in which every node carries a color attribute, either red or black. Its statistical performance is better than that of a balanced binary tree, so red-black trees are used in many places. In the C++ STL, many components (currently including set, multiset, map and multimap) use variants of the red-black tree.
The present invention uses an OpenCL-based red-black tree acceleration algorithm. OpenCL is a framework for writing programs for heterogeneous platforms, which may consist of CPUs, GPUs or other types of processors. OpenCL provides a set of interface functions for defining and controlling the platform, and can make effective use of the parallel computing capability of multiple devices to accelerate algorithms.
For big data retrieval applications, the key is to first compute the red-black tree model, and conventional methods require a large amount of computing time to do so. The present invention is proposed to solve the time-consuming problem of building a red-black tree by using the heterogeneous platform framework of OpenCL.
Summary of the invention
The object of the present invention is to provide an OpenCL-based red-black tree acceleration method that exploits the fact that many of the computations involved in building a red-black tree can be processed in parallel, and uses an OpenCL heterogeneous platform to build a red-black tree model quickly for big data.
To achieve this object, the concept of the invention is as follows. In addition to the general requirements imposed on a binary search tree, any valid red-black tree must satisfy the following additional requirements:
Property 1. Every node is either red or black.
Property 2. The root node is black.
Property 3. Every leaf node (NIL node, empty node) is black.
Property 4. Both child nodes of every red node are black. (No path from a leaf to the root can contain two consecutive red nodes.)
Property 5. All paths from any node to each of its leaf nodes contain the same number of black nodes.
These constraints enforce the key property of a red-black tree: the longest possible path from the root to a leaf is no more than twice as long as the shortest possible path. (By property 5 both paths contain the same number of black nodes, and by property 4 red nodes cannot be adjacent, so the longest path can at most alternate red and black nodes.) As a result, the tree is roughly balanced. Because the worst-case time of operations such as inserting, deleting and searching for a value is proportional to the height of the tree, this theoretical upper bound on the height makes red-black trees efficient in the worst case, unlike ordinary binary search trees.
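The five properties above can be checked mechanically. The following is a minimal Python sketch, not part of the patent: the `Node` class and the names `black_height` and `is_red_black` are hypothetical, `None` stands in for a NIL leaf, and the binary search tree ordering of the keys is assumed rather than checked.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"

@dataclass
class Node:
    key: int
    color: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def black_height(node):
    """Return the black height of a subtree, or -1 if a property is violated.

    None stands for a NIL leaf, which counts as black (property 3).
    """
    if node is None:
        return 1
    if node.color not in (RED, BLACK):           # property 1
        return -1
    if node.color == RED:                        # property 4
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1
    lh = black_height(node.left)
    rh = black_height(node.right)
    if lh == -1 or rh == -1 or lh != rh:         # property 5
        return -1
    return lh + (1 if node.color == BLACK else 0)

def is_red_black(root):
    """Check properties 1 to 5 (key ordering is not checked here)."""
    if root is not None and root.color != BLACK:  # property 2
        return False
    return black_height(root) != -1

good = Node(2, BLACK, Node(1, RED), Node(3, RED))
bad = Node(2, BLACK, Node(1, RED, Node(0, RED)), Node(3, BLACK))
print(is_red_black(good), is_red_black(bad))  # True False
```

The second tree fails because a red node has a red child, which is exactly the situation the insertion fix-up has to repair.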
Based on the above concept, the technical solution adopted by the present invention is as follows: the data to be processed is divided into multiple data blocks, and the many cores of the GPU perform data insertion operations simultaneously. After the GPU computations have been synchronized, a final merge operation on the red-black trees completes the construction of the whole red-black tree. A schematic diagram of the overall method is shown in Fig. 2.
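The block division can be illustrated with a short Python sketch. It is only an illustration under assumptions: the function name `split_blocks` is hypothetical, the blocks are contiguous slices, and any remainder is spread over the first blocks, a detail the patent does not specify.

```python
def split_blocks(data, n):
    """Split data into n contiguous blocks of roughly len(data) / n items each."""
    if n <= 0:
        raise ValueError("n must be positive")
    base, extra = divmod(len(data), n)
    blocks, start = [], 0
    for i in range(n):
        # The first `extra` blocks take one additional item (an assumption).
        size = base + (1 if i < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks

blocks = split_blocks(list(range(10)), 4)
print([len(b) for b in blocks])  # [3, 3, 2, 2]
```

Each block would then be handed to one worker, which builds its own subtree independently of the others.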
The technical solution adopted by the present invention can be further refined. Its specific implementation steps are shown in Fig. 1 and can be divided into the following steps:
Step 1: CPU data input and GPU device initialization: find the hardware devices that support OpenCL, create the memory objects required for program execution, and allocate threads according to the number of cores the device supports, among other operations.
Step 2: Data blocking: the original mass data is partitioned according to the threads allocated on the GPU. If the hardware supports n threads and the data volume is m, each thread is individually allocated a data volume of m/n.
Step 3: Distribute the data blocks to the threads, and have each thread perform the following operations:
1) Insert the data value to be inserted directly at the tail of the tree.
2) Mark the color attribute of the node representing this data as red.
3) Adjust the color properties of the tree, which falls into the following three cases:
a) The "uncle" node of the current node is red: in this case, mark both the parent node and the uncle node black and mark the root node of the subtree red. The black height of the subtree then does not change, and the red-black properties are adjusted. Then point the current node at the root node of the subtree and restore the red-black properties recursively upward, as shown in Fig. 2. Here node C, directly above node E, is its parent node, and nodes A, B and D, on the same level as parent node C, are its uncle nodes.
b) The "uncle" node of the current node is black and the current node is the left child of its parent node: perform one right rotation on the parent node and the grandparent node of the current node (that is, the parent of the current node's parent), color the parent node black and color the original grandparent node red. The red-black properties of this subtree are thereby restored, and the black height of the subtree is unchanged. In addition, because the root node of the subtree is already black (so a red parent with a red child cannot occur at this node), no further upward recursion is needed, and the red-black properties of the whole tree are now correct.
c) The "uncle" node of the current node is black and the current node is the right child of its parent node: perform one left rotation on the current node and its parent node, and point the current node at the original parent node. This reduces to case b) above, which is then handled as described for case b).
4) Perform a data lookup on the adjusted red-black tree: start the search from the root node of the tree. If the data to be found is smaller than the current node, continue the search in the left subtree; if it is larger than the current node, continue the search in the right subtree; otherwise the search is complete. If a leaf node is reached without a result having been returned, the lookup is considered to have failed.
Step 4: Merge the subtrees: after the threads have finished the color adjustment of each subtree, transfer the GPU results back to CPU memory and merge the trees on the CPU side. During merging, the merged tree must satisfy the following properties: A, the root node is black; B, every leaf node is black; C, both child nodes of every red node are black; D, all paths from any node to each of its leaf nodes contain the same number of black nodes.
Step 5: Output and store the red-black tree built after the merge.
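The per-thread work of step 3 (insert at a leaf position, color the new node red, repair with cases a) to c), then look up) can be sketched in Python as follows. This is a sequential sketch of the standard red-black insertion, not the patent's OpenCL kernel. The class and method names are hypothetical, and the mirrored configurations, in which the parent is a right child, are handled symmetrically here, which is an assumption since the text above only spells out one orientation.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key):
        self.key, self.color = key, RED          # new nodes start red (item 2)
        self.left = self.right = self.parent = None

class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, left):
        """Rotate x downward; left=True is a left rotation, else a right rotation."""
        y = x.right if left else x.left
        sub = y.left if left else y.right
        if left:
            x.right, y.left = sub, x
        else:
            x.left, y.right = sub, x
        if sub is not None:
            sub.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        x.parent = y

    def insert(self, key):
        node, parent, cur = Node(key), None, self.root
        while cur is not None:                   # plain BST descent (item 1)
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        node.parent = parent
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._fix(node)

    def _fix(self, z):
        while z.parent is not None and z.parent.color == RED:
            p, g = z.parent, z.parent.parent     # a red parent is never the root
            p_is_left = p is g.left
            uncle = g.right if p_is_left else g.left
            if uncle is not None and uncle.color == RED:   # case a): recolor, move up
                p.color = uncle.color = BLACK
                g.color = RED
                z = g
                continue
            if (z is p.right) == p_is_left:      # case c): inner child, rotate the parent
                self._rotate(p, left=p_is_left)
                z, p = p, z
            self._rotate(g, left=not p_is_left)  # case b): rotate the grandparent
            p.color, g.color = BLACK, RED
        self.root.color = BLACK

    def search(self, key):
        """Lookup as in item 4; returns the node, or None if the key is absent."""
        cur = self.root
        while cur is not None:
            if key < cur.key:
                cur = cur.left
            elif key > cur.key:
                cur = cur.right
            else:
                return cur
        return None

tree = RedBlackTree()
for k in [10, 20, 30, 15, 25, 5, 1]:
    tree.insert(k)
print(tree.root.key, tree.root.color)                             # 20 black
print(tree.search(15) is not None, tree.search(99) is not None)   # True False
```

In the example, inserting 30 triggers a rotation (uncle is black), while inserting 15 and 1 only triggers the recoloring of case a).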
The beneficial effects of the present invention are as follows. To address the heavy computation and long running time of building a red-black tree for big data, the invention exploits the GPU's ability to process computations in parallel: the data to be processed is divided into multiple data blocks, the many cores of the GPU perform data insertion operations simultaneously, and the computations are completed in step. This greatly shortens the tree construction time and thus achieves fast red-black tree construction.
Brief description of the drawings
Fig. 1 is a flow chart of the OpenCL-based red-black tree acceleration algorithm;
Fig. 2 is a schematic diagram of the method of the invention;
Fig. 3 compares the measured running time of the original CPU red-black tree construction method with that of the OpenCL-based red-black tree acceleration method.
Detailed description of the embodiments
The preferred embodiments of the present invention are described in detail below with reference to the drawings:
Embodiment one: A preferred embodiment of the OpenCL-based red-black tree acceleration method is described below with reference to the drawings. Its specific implementation can be divided into the following steps:
Step 1: CPU data input and GPU device initialization: find the hardware devices that support OpenCL, create the memory objects required for program execution, and allocate threads according to the number of cores the device supports, among other operations.
Step 2: Data blocking: the original mass data is partitioned according to the threads allocated on the GPU. This experiment uses a GT420 graphics card whose hardware supports 512 threads; the data volume is m = 100M, so each thread is individually allocated 200 KB of data.
Step 3: Distribute the data blocks to the threads, and have each thread perform the following operations:
1) Insert the data value to be inserted directly at the tail of the tree.
2) Mark the color attribute of the node representing this data as red.
3) Adjust the color properties of the tree, which falls into the following three cases:
a) The "uncle" node of the current node is red: in this case, mark both the parent node and the uncle node black and mark the root node of the subtree red. The black height of the subtree then does not change, and the red-black properties are adjusted. Then point the current node at the root node of the subtree and restore the red-black properties recursively upward, as shown in Fig. 2. Here node C, directly above node E, is its parent node, and nodes A, B and D, on the same level as parent node C, are its uncle nodes.
b) The "uncle" node of the current node is black and the current node is the left child of its parent node: perform one right rotation on the parent node and the grandparent node of the current node (that is, the parent of the current node's parent), color the parent node black and color the original grandparent node red. The red-black properties of this subtree are thereby restored, and the black height of the subtree is unchanged. In addition, because the root node of the subtree is already black (so a red parent with a red child cannot occur at this node), no further upward recursion is needed, and the red-black properties of the whole tree are now correct.
c) The "uncle" node of the current node is black and the current node is the right child of its parent node: perform one left rotation on the current node and its parent node, and point the current node at the original parent node. This reduces to case b) above, which is then handled as described for case b).
4) Perform a data lookup on the adjusted red-black tree: start the search from the root node of the tree. If the data to be found is smaller than the current node, continue the search in the left subtree; if it is larger than the current node, continue the search in the right subtree; otherwise the search is complete. If a leaf node is reached without a result having been returned, the lookup is considered to have failed.
Step 4: Merge the subtrees: after the threads have finished the color adjustment of each subtree, transfer the GPU results back to CPU memory and merge the trees on the CPU side. During merging, the merged tree must satisfy the following properties: A, the root node is black; B, every leaf node (NIL node, empty node) is black; C, both child nodes of every red node are black; D, all paths from any node to each of its leaf nodes contain the same number of black nodes.
Step 5: Output and store the red-black tree built after the merge.
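The text does not say how the CPU-side merge of step 4 is carried out, so the following Python sketch shows only one possible approach and should not be read as the patent's procedure. It assumes that each per-thread subtree has been flattened to a sorted key list (an in-order traversal of a red-black tree yields sorted keys), merges the lists with `heapq.merge`, and rebuilds a balanced tree in which only the deepest level is red, which satisfies properties A to D. The names `build_from_sorted`, `merge_blocks` and `inorder` are hypothetical.

```python
import heapq

RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color, self.left, self.right = key, color, left, right

def build_from_sorted(keys):
    """Build a valid red-black tree from sorted keys in O(n).

    Midpoint splitting fills every level except possibly the last, so
    coloring only the deepest level red keeps all black heights equal.
    """
    deepest = len(keys).bit_length() - 1      # depth of the last level (root = 0)

    def build(lo, hi, depth):
        if lo >= hi:
            return None                       # NIL leaf
        mid = (lo + hi) // 2
        color = RED if depth == deepest and depth > 0 else BLACK
        return Node(keys[mid], color,
                    build(lo, mid, depth + 1),
                    build(mid + 1, hi, depth + 1))

    return build(0, len(keys), 0)

def merge_blocks(sorted_blocks):
    """Merge per-thread sorted key lists and rebuild one red-black tree."""
    return build_from_sorted(list(heapq.merge(*sorted_blocks)))

def inorder(node):
    """Return the keys of the tree in ascending order."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

blocks = [[9, 3, 7], [4, 8], [1, 6, 2, 5]]
# sorted(b) stands in for the in-order traversal of one thread's subtree.
root = merge_blocks([sorted(b) for b in blocks])
print(inorder(root))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Rebuilding from the merged sorted sequence avoids re-inserting every key one by one on the CPU, at the cost of discarding the shape of the per-thread subtrees.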
For the 100M data volume used in this example, tree construction takes 751 ms in total, whereas the standard red-black tree algorithm on the CPU takes 1821 ms in total to build the tree. The time is thus reduced by 1070 ms, and the computing efficiency is improved by 58.6%.
Embodiment two:
Step 1: CPU data input and GPU device initialization: find the hardware devices that support OpenCL, create the memory objects required for program execution, and allocate threads according to the number of cores the device supports, among other operations.
Step 2: Data blocking: the original mass data is partitioned according to the threads allocated on the GPU. The hardware supports 512 threads and the data volume is m = 200M, so each thread is allocated 400 KB of data.
The remaining computation follows steps 3 to 5 of embodiment one.
For the 200M data volume used in this example, tree construction takes 1121 ms in total, whereas the standard red-black tree algorithm on the CPU takes 3689 ms in total to build the tree. The time is thus reduced by 2568 ms, and the computing efficiency is improved by 69.6%.
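The figures in the two embodiments can be reproduced with a few lines of Python. This is only an arithmetic check under assumptions: "M" is taken to mean megabytes with 1 MB = 1024 KB, which the text does not state, and the efficiency improvement is taken to be the relative time reduction. The function names are hypothetical.

```python
def per_thread_kb(total_mb, threads):
    """Data volume per thread in KB, assuming 1 MB = 1024 KB."""
    return total_mb * 1024 / threads

def reduction_percent(cpu_ms, gpu_ms):
    """Relative reduction in construction time, in percent."""
    return (cpu_ms - gpu_ms) / cpu_ms * 100

for total_mb, cpu_ms, gpu_ms in [(100, 1821, 751), (200, 3689, 1121)]:
    print(total_mb,
          round(per_thread_kb(total_mb, 512), 1),
          round(reduction_percent(cpu_ms, gpu_ms), 1))
# 100 200.0 58.8
# 200 400.0 69.6
```

Under these assumptions the per-thread volumes match the stated 200 KB and 400 KB, and the second reduction matches the stated 69.6%. The first works out to about 58.8%, slightly above the stated 58.6%.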
Experimental result
The present invention was tested on an OpenCL 1.0 platform. The experimental environment used an Intel Core i5 3.1 GHz CPU, 4 GB of memory and an NVIDIA GT420 graphics card. The experimental data volume ranged from 10M to 500M, and the results are shown in Fig. 3, which compares the red-black tree construction time of the original red-black tree construction method (blue) with that of the OpenCL-based red-black tree acceleration method of the present invention (red). Without reducing search performance, the larger the data volume, the more time is saved. For big data (more than 500 MB), the time required by the OpenCL-based red-black tree construction method is reduced by a factor of 4 compared with the original red-black tree construction method. This greatly improves the efficiency of the search algorithm and makes real-time computation of big data retrieval algorithms possible.
Claims (1)
1. An OpenCL-based red-black tree acceleration method, implemented by the following specific steps:
Step 1: CPU data input and GPU device initialization: find the hardware devices that support OpenCL, create the memory objects required for program execution, and allocate threads according to the number of cores the device supports;
Step 2: Data blocking: the original mass data is partitioned according to the threads allocated on the GPU; if the hardware supports n threads and the data volume is m, each thread is individually allocated a data volume of m/n;
Step 3: Distribute the data blocks to the threads, and have each thread perform the following operations:
1) insert the data value to be inserted directly at the tail of the tree;
2) mark the color attribute of the node representing this data as red;
3) adjust the color properties of the tree, which falls into the following three cases:
a) the "uncle" node of the current node is red: in this case, mark both the parent node and the uncle node black and mark the root node of the subtree red, so that the black height of the subtree does not change and the red-black properties are adjusted; then point the current node at the root node of the subtree and restore the red-black properties recursively upward; here node C, directly above node E, is its parent node, and nodes A, B and D, on the same level as parent node C, are its uncle nodes;
b) the "uncle" node of the current node is black and the current node is the left child of its parent node: perform one right rotation on the parent node and the grandparent node of the current node, color the parent node black and color the original grandparent node red; the red-black properties of this subtree are thereby restored, and the black height of the subtree is unchanged; in addition, because the root node of the subtree is already black, a red parent with a red child cannot occur at this node, so no further upward recursion is needed, and the red-black properties of the whole tree are now correct;
c) the "uncle" node of the current node is black and the current node is the right child of its parent node: perform one left rotation on the current node and its parent node, and point the current node at the original parent node; this reduces to case b) above, which is then handled as described for case b);
4) perform a data lookup on the adjusted red-black tree: start the search from the root node of the tree; if the data to be found is smaller than the current node, continue the search in the left subtree; if it is larger than the current node, continue the search in the right subtree; otherwise the search is complete; if a leaf node is reached without a result having been returned, the lookup is considered to have failed; a leaf node is a node at the bottom level of the tree that has no child nodes, and the nodes on the same level as node E are leaf nodes;
Step 4: Merge the subtrees: after the threads have finished the color adjustment of each subtree, transfer the GPU results back to CPU memory and merge the trees on the CPU side; during merging, the merged tree must satisfy the following properties: A, the root node is black; B, every leaf node is black; C, both child nodes of every red node are black; D, all paths from any node to each of its leaf nodes contain the same number of black nodes;
Step 5: Output and store the red-black tree built after the merge.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410266098.5A CN104036141B (en) | 2014-06-16 | 2014-06-16 | Open computing language (OpenCL)-based red-black tree acceleration method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410266098.5A CN104036141B (en) | 2014-06-16 | 2014-06-16 | Open computing language (OpenCL)-based red-black tree acceleration method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104036141A CN104036141A (en) | 2014-09-10 |
CN104036141B true CN104036141B (en) | 2017-02-15 |
Family
ID=51466911
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410266098.5A Expired - Fee Related CN104036141B (en) | 2014-06-16 | 2014-06-16 | Open computing language (OpenCL)-based red-black tree acceleration method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104036141B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104486387B (en) * | 2014-12-02 | 2018-03-27 | 浪潮(北京)电子信息产业有限公司 | A kind of data synchronizing processing method and system |
CN105389360A (en) * | 2015-11-05 | 2016-03-09 | 浪潮(北京)电子信息产业有限公司 | AVL tree-based data writing method and apparatus |
CN109933327B (en) * | 2019-02-02 | 2021-01-08 | 中国科学院计算技术研究所 | OpenCL compiler design method and system based on code fusion compiling framework |
CN112559532B (en) * | 2020-12-23 | 2024-02-20 | 北京梆梆安全科技有限公司 | Data insertion method and device based on red and black trees and electronic equipment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102880555A (en) * | 2012-07-28 | 2013-01-16 | 福州大学 | Memory algorithm facing real-time system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7069272B2 (en) * | 2002-10-09 | 2006-06-27 | Blackrock Financial Management, Inc. | System and method for implementing dynamic set operations on data stored in a sorted array |
Non-Patent Citations (1)
Title |
---|
Red-black tree algorithm and its applications; Gao Qing et al.; Software Guide (《软件导刊》); 2008-09-30; Vol. 7, No. 9; pp. 40-42 * |
Also Published As
Publication number | Publication date |
---|---|
CN104036141A (en) | 2014-09-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106776963B (en) | The online method for visualizing of light-weighted BIM big data and system | |
CN104036141B (en) | Open computing language (OpenCL)-based red-black tree acceleration method | |
CN110262907A (en) | System and method for unified Application Programming Interface and model | |
CN103942108B (en) | Resource parameters optimization method under Hadoop isomorphism cluster | |
CN103995827B (en) | High-performance sort method in MapReduce Computational frames | |
US11630983B2 (en) | Graph conversion method | |
US9170836B2 (en) | System and method for re-factorizing a square matrix into lower and upper triangular matrices on a parallel processor | |
US10338629B2 (en) | Optimizing neurosynaptic networks | |
CN104331271A (en) | Parallel computing method and system for CFD (Computational Fluid Dynamics) | |
CN113435521A (en) | Neural network model training method and device and computer readable storage medium | |
CN109214512A (en) | A kind of parameter exchange method, apparatus, server and the storage medium of deep learning | |
CN105573726B (en) | A kind of rules process method and equipment | |
CN108009111A (en) | Data flow connection method and device | |
CN104182208A (en) | Method and system utilizing cracking rule to crack password | |
CN114817845B (en) | Data processing method, device, electronic equipment and storage medium | |
CN113034343B (en) | Parameter-adaptive hyperspectral image classification GPU parallel method | |
CN105022896A (en) | Method and device for APDL modelling based on dynamic numbering | |
CN111427857B (en) | FPGA configuration file compression and decompression method based on partition reference technology | |
CN109710314B (en) | A method of based on graph structure distributed parallel mode construction figure | |
CN115346099A (en) | Image convolution method, chip, equipment and medium based on accelerator chip | |
CN111143456B (en) | Spark-based Cassandra data import method, device, equipment and medium | |
CN103678545A (en) | Network resource clustering method and device | |
CN110211132A (en) | Point cloud semantic segmentation innovatory algorithm based on slice network | |
CN110334067A (en) | A kind of sparse matrix compression method, device, equipment and storage medium | |
Nonaka et al. | 2-3-4 decomposition method for large-scale parallel image composition with arbitrary number of nodes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170215; Termination date: 20190616 |