CN104036141B - Open computing language (OpenCL)-based red-black tree acceleration method - Google Patents

Open computing language (OpenCL)-based red-black tree acceleration method

Info

Publication number
CN104036141B
Authority
CN
China
Prior art keywords
node
black
red
data
father
Prior art date
Legal status
Expired - Fee Related
Application number
CN201410266098.5A
Other languages
Chinese (zh)
Other versions
CN104036141A (en)
Inventor
余小清
熊玮
万旺根
杨超
丁玉朴
段石石
Current Assignee
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
2014-06-16
Filing date
2014-06-16
Publication date
2017-02-15
Application filed by University of Shanghai for Science and Technology
Priority to CN201410266098.5A
Publication of CN104036141A
Application granted
Publication of CN104036141B
Status: Expired - Fee Related

Landscapes

  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a red-black tree acceleration method based on the Open Computing Language (OpenCL). Many of the computations involved in building a red-black tree can be processed in parallel, so the method uses an OpenCL heterogeneous platform to build a red-black tree model quickly on big data. Following the idea of graphics processing unit (GPU) acceleration, the data to be processed are divided into multiple data blocks, and multiple GPU cores perform data insertion at the same time. After the operations of all GPU cores have been synchronized, a merge operation completes the whole red-black tree. The advantage of the method is that, for big data, the red-black tree can be built in a short time.

Description

An OpenCL-based red-black tree acceleration method
Technical field
The present invention relates to the field of GPU-based parallel computing, and in particular to an OpenCL-based red-black tree acceleration algorithm.
Background art
A red-black tree is a self-balancing binary search tree. Its typical use is to implement associative arrays, and it can also be used for big data retrieval. It is complicated, but its operations have good worst-case running time and it is efficient in practice: it can search, insert, and delete in O(log n) time, where n is the number of elements in the tree. A red-black tree is a binary search tree in which every node carries a color attribute, either red or black. Its statistical performance is better than that of a balanced binary tree, so red-black trees are used in many places. In the C++ STL, many components (including set, multiset, map, and multimap) currently use a variant of the red-black tree.
The present invention uses an OpenCL-based red-black tree acceleration algorithm. OpenCL is a framework for writing programs for heterogeneous platforms, which may consist of CPUs, GPUs, or other types of processors. OpenCL provides a set of interface functions for defining and controlling the platform, and it can make effective use of the parallel computing capability of multiple devices to accelerate an algorithm.
For big data retrieval, the key step is to build the red-black tree model first, and conventional methods spend a large amount of computing time on this. The present invention is proposed to solve the problem that building a red-black tree is time-consuming, by using the heterogeneous platform framework of OpenCL.
Summary of the invention
The object of the present invention is to provide an OpenCL-based red-black tree acceleration method. It exploits the fact that many of the computations in building a red-black tree can be parallelized, and uses an OpenCL heterogeneous platform to build the red-black tree model quickly in the big-data case.
To achieve this object, the idea of the invention is as follows. In addition to the general requirements imposed on a binary search tree, any valid red-black tree must satisfy the following additional requirements:
Property 1. Every node is either red or black.
Property 2. The root node is black.
Property 3. Every leaf node (NIL node, empty node) is black.
Property 4. Both child nodes of every red node are black. (No path from a leaf to the root may contain two consecutive red nodes.)
Property 5. All paths from any node to each of its leaf nodes contain the same number of black nodes.
These constraints enforce the key property of a red-black tree: the longest possible path from the root to a leaf is no more than twice as long as the shortest possible path. As a result, the tree is roughly balanced. Because the worst-case time of operations such as inserting, deleting, and searching for a value is proportional to the height of the tree, this theoretical upper bound on the height makes a red-black tree efficient even in the worst case, unlike an ordinary binary search tree.
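The five properties above can be checked mechanically. The following is a minimal Python sketch, not part of the patent: the `Node` class, the color constants, and the function names are hypothetical, a missing child (`None`) stands for a black NIL leaf, and only the color properties are checked (the binary-search ordering of keys is assumed to hold).

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"  # hypothetical color encoding


@dataclass
class Node:
    key: int
    color: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def black_height(node: Optional[Node]) -> int:
    """Return the black height of a subtree, or -1 if a property is violated."""
    if node is None:
        return 1  # Property 3: a NIL leaf counts as one black node
    if node.color not in (RED, BLACK):
        return -1  # Property 1
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1  # Property 4: a red node has a red child
    left_h = black_height(node.left)
    right_h = black_height(node.right)
    if left_h == -1 or right_h == -1 or left_h != right_h:
        return -1  # Property 5: black counts differ between paths
    return left_h + (1 if node.color == BLACK else 0)


def is_red_black(root: Optional[Node]) -> bool:
    """Check Properties 1 to 5 for the tree rooted at root."""
    if root is None:
        return True
    if root.color != BLACK:
        return False  # Property 2
    return black_height(root) != -1
```

For example, a black root with two red children passes the check, while a black root with a single black child fails Property 5 because the two sides have different black counts.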
Based on this idea, the technical solution adopted by the present invention is as follows. The data to be processed are divided into multiple data blocks, and multiple GPU cores perform data insertion at the same time. After the computations of all GPU cores have completed and been synchronized, the red-black trees are merged, which completes the construction of the whole red-black tree. A schematic diagram of the overall method is shown in Figure 2.
The technical solution adopted by the present invention can be further refined. Its concrete implementation is shown in Figure 1 and can be divided into the following five steps:
Step 1: CPU data input and GPU device initialization: find a hardware device that supports OpenCL, create the memory objects required for program execution, allocate threads according to the number of cores the device supports, and so on.
Step 2: Data partitioning: divide the original mass data into blocks according to the threads allocated on the GPU. For example, if the hardware supports n threads and the data volume is m, each thread is individually assigned a data volume of m/n.
Step 3: Distribute the data blocks to the threads, and in each thread proceed as follows:
1) Insert the data value to be inserted directly at the tail of the tree.
2) Mark the color attribute of the node representing this data as red.
3) Adjust the color properties of the tree. There are three cases:
a) The "uncle" node of the current node is red. In this case, mark both the parent node and the uncle node black, then color the root node of the subtree red. The black height of the subtree does not change, and the red-black properties of the subtree are adjusted. Then point the current node at the root node of the subtree and restore the red-black properties recursively upward, as shown in Figure 2.
Here node C, directly above node E, is its parent node; node A is the parent of the nodes on node C's level; and B and D are the uncle nodes.
b) The "uncle" node of the current node is black and the current node is the left child of its parent node. Perform one right rotation on the parent node and the grandparent node of the current node (that is, the parent of the parent of the current node), color the parent node black, and color the original grandparent node red. The red-black properties of the subtree are restored and its black height does not change. In addition, because the root node of the subtree is now black (so the problem of a red parent with a red child cannot occur at this node), no further upward recursion is needed; the red-black properties of the whole tree are now correct.
c) The "uncle" node of the current node is black and the current node is the right child of its parent node. Perform one left rotation on the current node and its parent node, and let the current node point to the original parent node. This leads to case b) above, which is then handled as described for case b).
4) Perform a data lookup on the adjusted red-black tree. Start from the root node of the tree. If the data to be found is smaller than the current node, continue in the left subtree; if it is larger than the current node, continue in the right subtree; otherwise the lookup is complete. If a leaf node is reached without a result, the lookup is considered to have failed.
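Sub-steps 1) to 4) can be sketched in sequential Python as follows. This is an illustration under stated assumptions, not the patent's OpenCL kernel: the class and method names are hypothetical, equal keys are sent to the right subtree, and cases b) and c) are also applied in mirrored form when the parent is a right child, which the text above leaves implicit.

```python
RED, BLACK = "red", "black"  # hypothetical color encoding


class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED  # sub-step 2: a new node is marked red
        self.left = self.right = self.parent = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, to_left):
        """Rotate x down; its right child (to_left) or left child moves up."""
        y = x.right if to_left else x.left
        inner = y.left if to_left else y.right
        if to_left:
            x.right, y.left = inner, x
        else:
            x.left, y.right = inner, x
        if inner is not None:
            inner.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x.parent.left is x:
            x.parent.left = y
        else:
            x.parent.right = y
        x.parent = y

    def insert(self, key):
        # sub-step 1: ordinary binary-search descent to an empty position
        parent, cur = None, self.root
        while cur is not None:
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        node = Node(key)
        node.parent = parent
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._fix(node)

    def _fix(self, z):
        # sub-step 3: restore the color properties
        while z.parent is not None and z.parent.color == RED:
            p, g = z.parent, z.parent.parent  # g exists because the root is black
            parent_is_left = g.left is p
            uncle = g.right if parent_is_left else g.left
            if uncle is not None and uncle.color == RED:
                # case a: recolor, then continue from the grandparent
                p.color = uncle.color = BLACK
                g.color = RED
                z = g
                continue
            if z is (p.right if parent_is_left else p.left):
                # case c: rotate the inner child outward, reaching case b
                self._rotate(p, to_left=parent_is_left)
                z, p = p, z
            # case b: rotate parent over grandparent and swap their colors
            p.color, g.color = BLACK, RED
            self._rotate(g, to_left=not parent_is_left)
        self.root.color = BLACK

    def search(self, key):
        # sub-step 4: lookup from the root; None means the lookup failed
        cur = self.root
        while cur is not None and cur.key != key:
            cur = cur.left if key < cur.key else cur.right
        return cur


def inorder(node):
    """Keys of the subtree in ascending order."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)
```

Inserting the keys 1, 2, 3 in that order triggers the mirrored form of case b): the grandparent 1 is rotated down and 2 becomes the black root with two red children.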
Step 4: Merge the subtrees: after the threads have completed the color adjustment for each subtree, transfer the GPU results back to CPU memory and merge the trees on the CPU side. The merged tree must satisfy the following properties: a, the root node is black; b, every leaf node is black; c, both child nodes of every red node are black; d, all paths from any node to each of its leaf nodes contain the same number of black nodes.
Step 5: Output and store the red-black tree built after the blocks have been merged.
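The text does not say how the CPU-side merge in Step 4 is carried out. One possible approach, shown here purely as an assumption and not as the patent's method, is to read each subtree in sorted (in-order) order, merge the sorted sequences, and rebuild a single tree whose bottom level is colored red whenever it is incomplete. All names in this Python sketch are hypothetical.

```python
import heapq

RED, BLACK = "red", "black"  # hypothetical color encoding


class Node:
    def __init__(self, key, color):
        self.key, self.color = key, color
        self.left = self.right = None


def _build(keys, lo, hi, depth, red_depth):
    """Build a balanced subtree from keys[lo..hi]; nodes at red_depth are red."""
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    node = Node(keys[mid], RED if depth == red_depth else BLACK)
    node.left = _build(keys, lo, mid - 1, depth + 1, red_depth)
    node.right = _build(keys, mid + 1, hi, depth + 1, red_depth)
    return node


def merge_blocks(sorted_blocks):
    """Merge per-block sorted key sequences into one valid red-black tree."""
    keys = list(heapq.merge(*sorted_blocks))
    n = len(keys)
    levels = n.bit_length()  # number of levels of the balanced tree
    last_level_full = ((n + 1) & n) == 0  # true when n + 1 is a power of two
    # Midpoint splitting fills every level except possibly the last one, so
    # coloring only an incomplete last level red keeps all black counts equal.
    red_depth = -1 if last_level_full else levels - 1
    return _build(keys, 0, n - 1, 0, red_depth)


def inorder(node):
    """Keys of the subtree in ascending order."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def black_height(node):
    """Black height of a subtree (NIL counts as 1), or -1 on a violation."""
    if node is None:
        return 1
    left_h, right_h = black_height(node.left), black_height(node.right)
    if left_h == -1 or left_h != right_h:
        return -1
    children = (node.left, node.right)
    if node.color == RED and any(c is not None and c.color == RED for c in children):
        return -1
    return left_h + (1 if node.color == BLACK else 0)
```

With this choice the merged tree satisfies properties a to d by construction: the root and all inner levels are black, red nodes appear only on the last level and have NIL children, and every root-to-leaf path crosses the same number of black levels.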
The invention has the following advantages. Building a red-black tree on big data is computationally intensive and time-consuming. The invention exploits the ability of the GPU to process computations in parallel: the data to be processed are divided into multiple data blocks, multiple GPU cores perform data insertion at the same time, and the individual computations are synchronized when they finish. This greatly shortens the tree-building time and thus achieves fast construction of the red-black tree.
Brief description of the drawings
Fig. 1: Flow chart of the OpenCL-based red-black tree acceleration algorithm;
Fig. 2: Schematic diagram of the inventive method;
Fig. 3: Comparison of the time consumed by the original CPU red-black tree construction method and by the OpenCL-based red-black tree acceleration method.
Detailed description of the embodiments
The preferred embodiments of the present invention are described in detail below with reference to the drawings:
Embodiment one: A preferred embodiment of the OpenCL-based red-black tree acceleration method is described below with reference to the drawings. Its concrete implementation can be divided into the following steps:
Step 1: CPU data input and GPU device initialization: find a hardware device that supports OpenCL, create the memory objects required for program execution, allocate threads according to the number of cores the device supports, and so on.
Step 2: Data partitioning: divide the original mass data into blocks according to the threads allocated on the GPU. This experiment uses a GT420 graphics card; the hardware supports 512 threads and the data volume is m = 100M, so each thread is individually assigned a data volume of 200 KB.
Step 3: Distribute the data blocks to the threads, and in each thread proceed as follows:
1) Insert the data value to be inserted directly at the tail of the tree.
2) Mark the color attribute of the node representing this data as red.
3) Adjust the color properties of the tree. There are three cases:
a) The "uncle" node of the current node is red. In this case, mark both the parent node and the uncle node black, then color the root node of the subtree red. The black height of the subtree does not change, and the red-black properties of the subtree are adjusted. Then point the current node at the root node of the subtree and restore the red-black properties recursively upward, as shown in Figure 2.
Here node C, directly above node E, is its parent node; node A is the parent of the nodes on node C's level; and B and D are the uncle nodes.
b) The "uncle" node of the current node is black and the current node is the left child of its parent node. Perform one right rotation on the parent node and the grandparent node of the current node (that is, the parent of the parent of the current node), color the parent node black, and color the original grandparent node red. The red-black properties of the subtree are restored and its black height does not change. In addition, because the root node of the subtree is now black (so the problem of a red parent with a red child cannot occur at this node), no further upward recursion is needed; the red-black properties of the whole tree are now correct.
c) The "uncle" node of the current node is black and the current node is the right child of its parent node. Perform one left rotation on the current node and its parent node, and let the current node point to the original parent node. This leads to case b) above, which is then handled as described for case b).
4) Perform a data lookup on the adjusted red-black tree. Start from the root node of the tree. If the data to be found is smaller than the current node, continue in the left subtree; if it is larger than the current node, continue in the right subtree; otherwise the lookup is complete. If a leaf node is reached without a result, the lookup is considered to have failed.
Step 4: Merge the subtrees: after the threads have completed the color adjustment for each subtree, transfer the GPU results back to CPU memory and merge the trees on the CPU side. The merged tree must satisfy the following properties: a, the root node is black; b, every leaf node (NIL node, empty node) is black; c, both child nodes of every red node are black; d, all paths from any node to each of its leaf nodes contain the same number of black nodes.
Step 5: Output and store the red-black tree built after the blocks have been merged.
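The block sizing in Step 2 of this embodiment can be reproduced with a short Python sketch. It assumes that "100M" denotes 100 MB counted as 100 × 1024 KB, which is the reading under which 512 threads receive exactly 200 KB each; the function name and the contiguous-slice layout are hypothetical, since the text only fixes the per-thread volume m/n.

```python
def split_into_blocks(data, n_threads):
    """Split data into n_threads contiguous blocks of nearly equal size."""
    base, extra = divmod(len(data), n_threads)
    blocks, start = [], 0
    for i in range(n_threads):
        size = base + (1 if i < extra else 0)  # spread any remainder evenly
        blocks.append(data[start:start + size])
        start += size
    return blocks


# Per-thread volume m/n, assuming 100M means 100 MB = 100 * 1024 KB.
total_kb = 100 * 1024
threads = 512
per_thread_kb = total_kb // threads  # 200
```

When the data volume is not an exact multiple of the thread count, the sketch gives the first few blocks one extra element so that no data is dropped.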
For the 100M data volume used in this example, building the tree takes 751 ms in total, whereas the standard red-black tree algorithm on the CPU takes 1821 ms in total to build the tree. The time is reduced by 1070 ms, and the operation efficiency is improved by 58.6%.
Embodiment two:
Step 1: CPU data input and GPU device initialization: find a hardware device that supports OpenCL, create the memory objects required for program execution, allocate threads according to the number of cores the device supports, and so on.
Step 2: Data partitioning: divide the original mass data into blocks according to the threads allocated on the GPU. The hardware supports 512 threads and the data volume is m = 200M, so each thread is assigned a data volume of 400 KB.
The remaining computation follows steps 3 to 5 of embodiment one.
For the 200M data volume used in this example, building the tree takes 1121 ms in total, whereas the standard red-black tree algorithm on the CPU takes 3689 ms in total to build the tree. The time is reduced by 2568 ms, and the operation efficiency is improved by 69.6%.
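The figures reported for the two embodiments can be recomputed from the stated timings. In this Python sketch (the function name is hypothetical), the improvement is taken as the time saved divided by the CPU time, which is an assumption about how the percentages were derived. Under that reading, embodiment two gives about 69.6% as stated, while embodiment one gives about 58.8%, slightly above the stated 58.6%.

```python
def time_reduction(cpu_ms, gpu_ms):
    """Return (time saved in ms, saved share of CPU time in %, speedup factor)."""
    saved = cpu_ms - gpu_ms
    return saved, 100.0 * saved / cpu_ms, cpu_ms / gpu_ms


# Timings taken from embodiments one and two.
saved_1, percent_1, speedup_1 = time_reduction(1821, 751)
saved_2, percent_2, speedup_2 = time_reduction(3689, 1121)
```

The speedup factors are about 2.4 for 100M and about 3.3 for 200M, in line with the trend that larger data volumes save more time.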
Experimental results
The present invention was tested on an OpenCL 1.0 platform. The experimental environment uses an Intel Core i5 3.1 GHz CPU, 4 GB of memory, and an NVIDIA GT420 graphics card. The experimental data volume ranges from 10M to 500M. The results are shown in Figure 3, which compares the red-black tree construction time of the original red-black tree construction method (blue) with that of the OpenCL-based red-black tree acceleration method of the present invention (red). Without reducing search performance, the larger the data volume, the more time is saved. In the big-data case (more than 500 MB), the time required by the OpenCL-based construction method is reduced by a factor of 4 compared with the original construction method. This greatly improves the efficiency of the search algorithm and makes real-time computation of big data retrieval algorithms possible.

Claims (1)

1. An OpenCL-based red-black tree acceleration method, implemented by the following steps:
Step 1: CPU data input and GPU device initialization: find a hardware device that supports OpenCL, create the memory objects required for program execution, and allocate threads according to the number of cores the device supports;
Step 2: Data partitioning: divide the original mass data into blocks according to the threads allocated on the GPU; if the hardware supports n threads and the data volume is m, each thread is individually assigned a data volume of m/n;
Step 3: Distribute the data blocks to the threads, and in each thread proceed as follows:
1) insert the data value to be inserted directly at the tail of the tree;
2) mark the color attribute of the node representing this data as red;
3) adjust the color properties of the tree, in one of the following three cases:
a) the "uncle" node of the current node is red: in this case, mark both the parent node and the uncle node black, then color the root node of the subtree red; the black height of the subtree does not change, and the red-black properties of the subtree are adjusted; then point the current node at the root node of the subtree and restore the red-black properties recursively upward; here node C, directly above node E, is its parent node, node A is the parent of the nodes on node C's level, and B and D are the uncle nodes;
b) the "uncle" node of the current node is black and the current node is the left child of its parent node: perform one right rotation on the parent node and the grandparent node of the current node, color the parent node black, and color the original grandparent node red; the red-black properties of the subtree are restored and its black height does not change; in addition, because the root node of the subtree is now black, the problem of a red parent with a red child cannot occur at this node, so no further upward recursion is needed, and the red-black properties of the whole tree are now correct;
c) the "uncle" node of the current node is black and the current node is the right child of its parent node: perform one left rotation on the current node and its parent node, and let the current node point to the original parent node; this leads to case b) above, which is then handled as described for case b);
4) perform a data lookup on the adjusted red-black tree: start from the root node of the tree; if the data to be found is smaller than the current node, continue in the left subtree; if it is larger than the current node, continue in the right subtree; otherwise the lookup is complete; if a leaf node is reached without a result, the lookup is considered to have failed; the leaf nodes are the nodes at the bottom of the tree, which have no child nodes, and the nodes on the same level as node E are leaf nodes;
Step 4: Merge the subtrees: after the threads have completed the color adjustment for each subtree, transfer the GPU results back to CPU memory and merge the trees on the CPU side; the merged tree must satisfy the following properties: a, the root node is black; b, every leaf node is black; c, both child nodes of every red node are black; d, all paths from any node to each of its leaf nodes contain the same number of black nodes;
Step 5: Output and store the red-black tree built after the blocks have been merged.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410266098.5A CN104036141B (en) 2014-06-16 2014-06-16 Open computing language (OpenCL)-based red-black tree acceleration method

Publications (2)

Publication Number Publication Date
CN104036141A (en) 2014-09-10
CN104036141B (en) 2017-02-15

Family

ID=51466911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410266098.5A Expired - Fee Related CN104036141B (en) 2014-06-16 2014-06-16 Open computing language (OpenCL)-based red-black tree acceleration method

Country Status (1)

Country Link
CN (1) CN104036141B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170215

Termination date: 20190616