CN109213601A - A CPU-GPU-based load-balancing method and device - Google Patents
A CPU-GPU-based load-balancing method and device
- Publication number
- CN109213601A (application CN201811064037.5A / CN201811064037A)
- Authority
- CN
- China
- Prior art keywords
- cpu
- gpu
- duration
- data
- pipeline
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
Abstract
The purpose of the present application is to provide a CPU-GPU-based load-balancing method and device. The application builds a pipeline query execution model on a CPU-GPU heterogeneous database system, enabling the CPU-GPU heterogeneous data analysis system to support query analysis under big-data scenarios; determines the total number of pipelines to be executed; starts the pipeline query execution model to distribute that number of pipelines across the CPU and the GPU and, from the measured execution duration of a single pipeline on the CPU and on the GPU, computes the system execution duration of every load allocation strategy; finally, the load allocation strategy with the minimum system execution duration is determined as the optimal CPU-GPU allocation strategy. Load balancing of the CPU-GPU heterogeneous data analysis system can thus distribute pipeline loads sensibly across the two processors and make full use of their computing resources, which not only improves system performance but also lets the system reach its best overall performance.
Description
Technical field
This application relates to the computer field, and in particular to a CPU-GPU-based load-balancing method and device.
Background art
General-purpose graphics processing units (Graphics Processing Unit, GPU) are widely used in matrix computation, machine learning, and many other fields. In recent years, the rapidly growing demand for data-intensive applications has driven the development of GPU-based heterogeneous online analytical processing platforms. Because a GPU contains many computing units that can run a large number of threads simultaneously, data analysis systems that use the GPU as the primary processor outperform traditional CPU-based analysis systems in most cases, shortening execution time by several orders of magnitude.
In a traditional relational query analysis system, when a client sends a query request, the system creates an analysis job, parses the request, and converts it into a logical query plan; a query-plan optimizer then selects the optimal physical query plan for execution according to some principle (such as minimum cost). A physical query plan is a directed acyclic graph (DAG) containing multiple operators, which are executed in a certain order.
In current CPU-GPU heterogeneous analysis systems, the GPU is the primary processor for query execution and most operators run on the GPU, while the CPU is mainly responsible for data distribution and collection; when a subsequent operation needs the intermediate result output by a previous step, the CPU also performs some processing on that intermediate result.
The analysis workloads handled by data-management and analysis systems are moving toward big-data scenarios: data volume grows exponentially and workloads become heavier. However, because a GPU can only directly process data held in its own storage medium, and video memory capacity is limited, the GPU cannot finish processing a large dataset in a single load. When input data or intermediate results are too large to fit in GPU global memory, analysis efficiency stays low and tasks may even fail. The prior art evades this problem by limiting the size of query tables, or falls back to transferring the computation to the CPU, but neither is an ideal solution.
In summary, although using the GPU to accelerate query analysis on a CPU-GPU heterogeneous platform is effective, the following problems remain: GPU video memory capacity is limited, so large datasets cannot be processed in a single load; and task distribution between the CPU and the GPU is unbalanced, so heterogeneous processor resources are not fully utilized.
Summary of the invention
The purpose of the present application is to provide a CPU-GPU-based load-balancing method and device, to solve the prior-art problems that limited GPU video memory prevents large datasets from being processed in a single load, and that unbalanced task distribution between the CPU and the GPU leaves heterogeneous processor resources underutilized.
According to one aspect of the present application, a CPU-GPU-based load-balancing method is provided, the method comprising:
building a pipeline query execution model on a CPU-GPU heterogeneous database system;
determining the total number of pipelines to be executed;
starting the pipeline query execution model to distribute that number of pipelines across the CPU and the GPU, and computing the system execution duration of every load allocation strategy from the measured execution duration of a single pipeline on the CPU and on the GPU;
determining the load allocation strategy with the minimum system execution duration as the optimal CPU-GPU allocation strategy.
Further, in the above method, determining the total number of pipelines to be executed comprises:
obtaining a query statement, wherein the query statement includes the data to be queried;
dividing the data to be queried according to a preset shard size, obtaining the data shards and their total count;
starting one pipeline for each data shard of the data to be queried, so that the total number of pipelines to be executed is determined by the total count of data shards.
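As a minimal sketch of this step: the total pipeline count is simply one pipeline per shard, i.e. the data size divided by the preset shard size, rounded up. The function and parameter names below are illustrative, not taken from the patent.

```python
import math

def pipeline_count(num_rows: int, shard_rows: int) -> int:
    """Total number of pipelines N: one pipeline per data shard,
    where shard_rows is the preset shard size (a fixed number of
    tuples of the relational table)."""
    return math.ceil(num_rows / shard_rows)

# e.g. 80 million rows split into 20-million-row shards -> 4 pipelines
print(pipeline_count(80_000_000, 20_000_000))
```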
Further, in the above method, starting the pipeline query execution model to distribute the pipelines across the CPU and the GPU, and computing the system execution duration of every load allocation strategy from the measured execution duration of a single pipeline on the CPU and on the GPU, comprises:
Step 1: start the pipeline query execution model and set the initial load allocation strategy: the number of pipelines assigned to the CPU is N_CPU = 0 and the number assigned to the GPU is N_GPU = N, where N is the total number of pipelines and is a positive integer greater than or equal to 1;
Step 2: execute in parallel every pipeline assigned to the CPU and to the GPU, obtaining the CPU execution duration and the GPU execution duration under the current load allocation strategy;
Step 3: if the CPU execution duration equals the GPU execution duration, take that duration as the system execution duration of the current load allocation strategy; if the CPU execution duration and the GPU execution duration are unequal, take the larger of the two as the system execution duration of the current load allocation strategy;
Step 4: update the load allocation strategy: the number of pipelines assigned to the CPU becomes N_CPU = N_CPU + 1 and the number assigned to the GPU becomes N_GPU = N_GPU - 1, where N_CPU + N_GPU = N;
Step 5: repeat Steps 2 to 4 until the system execution duration of every load allocation strategy has been obtained.
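Steps 1 to 5 above can be sketched as a simple enumeration loop. This is a hedged illustration, not the patent's implementation: `measure_cpu` and `measure_gpu` stand in for the actual parallel execution and timing of the assigned pipelines in Step 2.

```python
def best_allocation(measure_cpu, measure_gpu, n_total: int):
    """Enumerate every load allocation strategy (N_CPU = 0..N,
    N_GPU = N - N_CPU) and return (T_system, N_CPU, N_GPU) for the
    strategy whose system execution duration is smallest.

    measure_cpu / measure_gpu are callbacks taking a pipeline count
    and returning the measured execution duration on that processor.
    """
    best = None
    for n_cpu in range(n_total + 1):          # Steps 1, 4, 5
        n_gpu = n_total - n_cpu
        t_cpu = measure_cpu(n_cpu)            # Step 2
        t_gpu = measure_gpu(n_gpu)
        t_system = max(t_cpu, t_gpu)          # Step 3: larger value if unequal
        if best is None or t_system < best[0]:
            best = (t_system, n_cpu, n_gpu)
    return best

# toy example with synthetic linear per-pipeline costs
result = best_allocation(lambda n: 100 * n, lambda n: 40 * n, 5)
print(result)
```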
Further, in the above method, before computing the system execution duration of every load allocation strategy from the measured execution duration of a single pipeline on the CPU and on the GPU, the method further comprises:
determining the execution duration of a single pipeline on the CPU from the data input duration T_IN_C, data execution duration T_EXE_C, and data output duration T_OUT_C of a data shard on the CPU;
determining the execution duration of a single pipeline on the GPU from the data input duration T_IN_G, data execution duration T_EXE_G, and data output duration T_OUT_G of a data shard on the GPU.
Further, in the above method, the formulas in Step 2 for the CPU execution duration and the GPU execution duration under the current load allocation strategy are:
T_CPU = T_IN_C + T_EXE_C + T_OUT_C + Max{T_IN_C, T_EXE_C, T_OUT_C} × (N_CPU - 1),
T_GPU = T_IN_G + T_EXE_G + T_OUT_G + Max{T_IN_G, T_EXE_G, T_OUT_G} × (N_GPU - 1),
where T_CPU is the CPU execution duration under the current load allocation strategy, and Max{T_IN_C, T_EXE_C, T_OUT_C} is the maximum of the data input duration T_IN_C, data execution duration T_EXE_C, and data output duration T_OUT_C of a data shard on the CPU;
T_GPU is the GPU execution duration under the current load allocation strategy, and Max{T_IN_G, T_EXE_G, T_OUT_G} is the maximum of the data input duration T_IN_G, data execution duration T_EXE_G, and data output duration T_OUT_G of a data shard on the GPU.
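The same formula applies to either processor with its own stage durations, so it can be written once as a small helper. A minimal sketch (names are illustrative):

```python
def processor_duration(t_in: float, t_exe: float, t_out: float, n: int) -> float:
    """Execution duration of n overlapped pipelines on one processor:
    T = T_IN + T_EXE + T_OUT + max(T_IN, T_EXE, T_OUT) * (n - 1).
    Adjacent pipelines overlap in all but their longest stage, so the
    longest stage becomes the per-pipeline offset."""
    if n <= 0:
        return 0.0
    return t_in + t_exe + t_out + max(t_in, t_exe, t_out) * (n - 1)

# one pipeline: just the sum of its three stages
print(processor_duration(1.0, 3.0, 2.0, 1))
# four pipelines: first pipeline in full, plus 3 offsets of the longest stage
print(processor_duration(1.0, 3.0, 2.0, 4))
```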
Further, in the above method, the query statement also includes a query condition, wherein a pipeline assigned to the CPU is a thread instance running on the CPU according to the query condition, and a pipeline assigned to the GPU is a kernel function instance running on the GPU according to the query condition.
Further, in the above method, after Step 2 (executing in parallel every pipeline assigned to the CPU and to the GPU and obtaining the CPU and GPU execution durations under the current load allocation strategy), the method further comprises:
obtaining the execution result of every pipeline on the CPU and the GPU;
obtaining the final query execution result of the query statement from the execution results of all pipelines.
Further, in the above method, the data input duration T_IN_C of a data shard on the CPU is the time taken to copy the data shard into the CPU's memory;
the data execution duration T_EXE_C of a data shard on the CPU is the time taken by the thread instance running on the CPU;
the data output duration T_OUT_C of a data shard on the CPU is the time taken to copy the pipeline's execution result within CPU memory;
the data input duration T_IN_G of a data shard on the GPU is the time taken to copy the data shard from the CPU's memory into the GPU's video memory;
the data execution duration T_EXE_G of a data shard on the GPU is the time taken by the kernel function instance running on the GPU;
the data output duration T_OUT_G of a data shard on the GPU is the time taken to copy the pipeline's execution result from the GPU's video memory back into the CPU's memory.
According to another aspect of the present application, a non-volatile storage medium is also provided, storing computer-readable instructions which, when executed by a processor, cause the processor to implement the CPU-GPU-based load-balancing method described above.
According to a further aspect of the present application, a device is also provided, comprising:
one or more processors; and
a non-volatile storage medium storing one or more computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to implement the CPU-GPU-based load-balancing method described above.
Compared with the prior art, the present application builds a pipeline query execution model on a CPU-GPU heterogeneous database system, enabling the CPU-GPU heterogeneous data analysis system to support query analysis under big-data scenarios; determines the total number of pipelines to be executed; starts the pipeline query execution model to distribute that number of pipelines across the CPU and the GPU, and computes the system execution duration of every load allocation strategy from the measured execution duration of a single pipeline on the CPU and on the GPU; finally, the load allocation strategy with the minimum system execution duration is determined as the optimal CPU-GPU allocation strategy. Load balancing of the CPU-GPU heterogeneous data analysis system can thus distribute pipeline loads sensibly across the processors, making full use of their computing resources, which not only improves system performance but also lets the system reach its best overall performance.
Description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 shows a flow diagram of a CPU-GPU-based load-balancing method according to one aspect of the present application;
Fig. 2 shows a schematic diagram of the GPU execution duration and the CPU execution duration under a load allocation strategy in a CPU-GPU-based load-balancing method according to one aspect of the present application;
Fig. 3 shows a schematic diagram of how the query execution duration under a load allocation strategy is calculated in a CPU-GPU-based load-balancing method according to one aspect of the present application;
Fig. 4 shows a schematic diagram of the CPU-GPU load allocation strategies under the pipeline query execution model when the first query statement is applied to 80 million rows of data, according to one aspect of the present application;
Fig. 5 shows a schematic diagram of the CPU-GPU load allocation strategies under the pipeline query execution model when the first query statement is applied to 140 million rows of data, according to one aspect of the present application;
Fig. 6 shows a schematic diagram of the CPU-GPU load allocation strategies under the pipeline query execution model when the second query statement is applied to 80 million rows of data, according to one aspect of the present application;
Fig. 7 shows a schematic diagram of the CPU-GPU load allocation strategies under the pipeline query execution model when the second query statement is applied to 140 million rows of data, according to one aspect of the present application.
The same or similar reference numerals in the drawings represent the same or similar components.
Specific embodiments
The present application is described in further detail below with reference to the accompanying drawings.
In a typical configuration of the present application, the terminal, the device of the service network, and the trusted party each include one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may include non-volatile memory in a computer-readable medium, random access memory (RAM), and/or other forms of non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
As shown in Fig. 1, a CPU-GPU-based load-balancing method in an embodiment of the present application is suited to the data analysis model of a relational database; it can give full play to the characteristics of heterogeneous processors and distribute work tasks reasonably and efficiently, realizing query analysis under big-data scenarios. The method comprises step S11, step S12, step S13, and step S14, specifically:
Step S11: build a pipeline query execution model on the CPU-GPU heterogeneous database system;
Step S12: determine the total number of pipelines to be executed; for example, the total number of pipelines to be executed is N;
Step S13: start the pipeline query execution model to distribute the N pipelines across the CPU and the GPU, and compute the system execution duration of every load allocation strategy from the measured execution duration of a single pipeline on the CPU and on the GPU;
Step S14: determine the load allocation strategy with the minimum system execution duration as the optimal CPU-GPU allocation strategy.
Steps S11 to S14 realize query analysis under large-data-volume scenarios while also balancing the task distribution between the CPU and the GPU, making full use of processor resources and improving system performance.
For executing the load allocation strategies of the CPU-GPU heterogeneous database system, the pipeline query execution model and the load allocation strategies were built on the open-source CPU-GPU heterogeneous database MapD Core, version 3.3.1, and executed on a machine equipped with one NVIDIA Tesla K80 GPU whose video card contains about 22 GB of global memory. The corresponding server was equipped with two ten-core Xeon E5-2630 v4 CPUs and 224 GB of RAM; the modified system ran on a CentOS 7 Linux release with the Linux 3.10.0 kernel.
In this embodiment, step S11 (building a pipeline query execution model on the CPU-GPU heterogeneous database system) proceeds as follows:
obtain a query statement, which includes the data to be queried and one or more query conditions; here, the data to be queried may be, but is not limited to, one or more relational database tables;
divide the data to be queried into N data shards according to a preset division size; when the data to be queried is a relational database table, each data shard is a sub-table of that table, and the preset division size is a preset number of tuples of the table;
start one pipeline for each of the N data shards (sub-query objects), so that N pipelines are started in total; each pipeline may be assigned either to the CPU or to the GPU, and each pipeline assigned to the CPU or the GPU executes the query operation over its corresponding data shard. When each pipeline finishes, its execution result r_i is obtained, where i is the pipeline's number (or the shard number of its data shard), i = 1, 2, ..., N-1, N; combining the execution results r_i of all pipelines yields the final query result R of the query statement. When the query ends, the system execution duration under the current CPU-GPU load allocation strategy is recorded, realizing both the recording of the system execution duration of the current CPU-GPU load allocation strategy and the statistics of the final query execution results.
In this embodiment, step S12 (determining the total number of pipelines to be executed) comprises:
obtaining a query statement, which includes the data to be queried; here the query statement also includes a query condition, so a pipeline assigned to the CPU is a thread instance running on the CPU according to the query condition, and a pipeline assigned to the GPU is a kernel function instance running on the GPU according to the query condition;
dividing the data to be queried according to the preset shard size to obtain the data shards and their total count; for example, total count n = (size of the data to be queried) / (preset shard size);
starting one pipeline for each data shard of the data to be queried, so that the total number of pipelines to be executed is determined by the total count of data shards: with one pipeline started per shard, n data shards correspond to n pipelines, and the total number of pipelines to be executed by the system is N = n.
In this embodiment, step S13 (starting the pipeline query execution model to distribute the pipelines across the CPU and the GPU, and computing the system execution duration of every load allocation strategy from the measured execution duration of a single pipeline on the CPU and on the GPU) comprises:
Step 1: start the pipeline query execution model and set the initial load allocation strategy: the number of pipelines assigned to the CPU is N_CPU = 0 and the number assigned to the GPU is N_GPU = N, where N is the total number of pipelines and is a positive integer greater than or equal to 1;
Step 2: execute in parallel every pipeline assigned to the CPU and the GPU, obtaining the CPU execution duration T_CPU and the GPU execution duration T_GPU under the current load allocation strategy;
Step 3: if the CPU execution duration T_CPU equals the GPU execution duration T_GPU, take that duration as the system execution duration of the current load allocation strategy, T = T_CPU or T = T_GPU; if T_CPU and T_GPU are unequal, take the larger of the two as the system execution duration of the current load allocation strategy, T = Max{T_CPU, T_GPU}. Since the CPU and the GPU can start executing simultaneously, the system reaches top performance only when the execution times of the two processors differ as little as possible; in the optimal scenario, when the difference between the two execution times is 0, the system can be considered to be in the optimal load-balance state, i.e. the current assignment of pipelines to the CPU and the GPU is the best load allocation strategy.
Step 4: update the load allocation strategy: the number of pipelines assigned to the CPU becomes N_CPU = N_CPU + 1 and the number assigned to the GPU becomes N_GPU = N_GPU - 1, where N_CPU + N_GPU = N;
Step 5: repeat Steps 2 to 4 until the system execution duration of every load allocation strategy has been obtained, i.e. until all load allocation strategies have been executed.
In this embodiment, before computing in step S13 the system execution duration of every load allocation strategy from the measured execution duration of a single pipeline on the CPU and the GPU, the method further comprises:
determining the execution duration of a single pipeline on the CPU from the data input duration T_IN_C, data execution duration T_EXE_C, and data output duration T_OUT_C of a data shard on the CPU; that is, the execution duration of a single pipeline on the CPU is composed of three stages: the data input duration T_IN_C of the data shard on the CPU (the time taken to copy the data shard into the CPU's memory), the data execution duration T_EXE_C (the time taken by the thread instance running on the CPU), and the data output duration T_OUT_C (the time taken to copy the pipeline's execution result within CPU memory);
determining the execution duration of a single pipeline on the GPU from the data input duration T_IN_G, data execution duration T_EXE_G, and data output duration T_OUT_G of a data shard on the GPU; that is, the execution duration of a single pipeline on the GPU is composed of three stages: the data input duration T_IN_G of the data shard on the GPU (the time taken to copy the data shard from the CPU's memory into the GPU's video memory), the data execution duration T_EXE_G (the time taken by the kernel function instance running on the GPU), and the data output duration T_OUT_G (the time taken to copy the pipeline's execution result from the GPU's video memory back into the CPU's memory).
Because the multiple pipelines assigned to a processor, whether the CPU or the GPU, overlap one another during parallel execution, and two adjacent pipelines are offset by the execution duration of only one stage, the formulas in Step 2 for the CPU and GPU execution durations under the current load allocation strategy are as follows.
The CPU execution duration T_CPU under the current load allocation strategy is:
T_CPU = T_IN_C + T_EXE_C + T_OUT_C + Max{T_IN_C, T_EXE_C, T_OUT_C} × (N_CPU - 1).
Because the execution durations of the three pipeline stages differ, on the CPU as on the GPU, the longest of the three stages is taken as the execution time offset between two adjacent pipelines, so that the overlap between adjacent pipelines is recorded accurately. For example, from the maximum Max{T_IN_C, T_EXE_C, T_OUT_C} of the data input duration T_IN_C, data execution duration T_EXE_C, and data output duration T_OUT_C of a data shard on the CPU, the execution time offset between two adjacent pipelines is obtained; multiplying this per-stage maximum by the number of remaining pipelines gives the total duration of the overlapped part. Fig. 2 shows the calculation of the GPU execution duration and the CPU execution duration under the current load allocation strategy, with the GPU's pipeline execution on the left of Fig. 2 and the CPU's pipeline execution on the right.
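This offset argument can be cross-checked with a small discrete simulation, under the assumption (made throughout the patent) that every shard has identical stage durations on a given processor: stage k of pipeline i starts only after stage k-1 of pipeline i and stage k of pipeline i-1 have both finished. The simulated finish time then matches the closed-form formula. The code is an illustrative sketch, not from the patent.

```python
def simulated_duration(t_in: float, t_exe: float, t_out: float, n: int) -> float:
    """Event-driven simulation of n overlapped 3-stage pipelines on one
    processor.  Each stage resource (input copy, execution, output copy)
    handles one shard at a time, so stage k of shard i waits for both
    stage k-1 of shard i and stage k of shard i-1."""
    stages = [t_in, t_exe, t_out]
    finish = [0.0] * 3        # finish time of each stage for the previous shard
    for _ in range(n):
        prev = 0.0            # finish time of the previous stage of this shard
        for k, t in enumerate(stages):
            prev = max(prev, finish[k]) + t
            finish[k] = prev
    return finish[2]

def closed_form(t_in: float, t_exe: float, t_out: float, n: int) -> float:
    """T = T_IN + T_EXE + T_OUT + max(T_IN, T_EXE, T_OUT) * (n - 1)."""
    return t_in + t_exe + t_out + max(t_in, t_exe, t_out) * (n - 1)

print(simulated_duration(1.0, 3.0, 2.0, 3), closed_form(1.0, 3.0, 2.0, 3))
```

The agreement holds because, with identical per-shard stage times, the longest stage is the bottleneck resource and every additional pipeline adds exactly one bottleneck period.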
As shown in Fig. 3, the GPU execution duration T_GPU under the current load allocation strategy is:
T_GPU = T_IN_G + T_EXE_G + T_OUT_G + Max{T_IN_G, T_EXE_G, T_OUT_G} × (N_GPU - 1),
where Max{T_IN_G, T_EXE_G, T_OUT_G} is the maximum of the data input duration T_IN_G, data execution duration T_EXE_G, and data output duration T_OUT_G of a data shard on the GPU.
Here, Fig. 3 is a schematic diagram of the query execution durations after the pipelines have been divided between the CPU and the GPU under a load allocation strategy: one marker indicates the end time when all loads are placed on the GPU, while End indicates the end time after the load has been balanced across the two processors, CPU and GPU.
In this embodiment, after Step 2 (executing in parallel every pipeline assigned to the CPU and the GPU and obtaining the CPU and GPU execution durations under the current load allocation strategy), the method further comprises:
obtaining the execution result of every pipeline on the CPU and the GPU; for example, when the i-th pipeline finishes, its execution result is r_i, where i is the pipeline's number (or the shard number of its data shard), i = 1, 2, ..., N-1, N;
obtaining the final query execution result R of the query statement from the execution results of all pipelines, for example R = {r_1, r_2, ..., r_i, ..., r_(N-1), r_N}.
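A minimal sketch of this collection step: gather each pipeline's result keyed by its shard number and assemble R in shard order. The patent describes R simply as the collection of all partial results; a real group-by query would also need a merge/re-aggregation step, which is elided here, and the names below are illustrative.

```python
def final_result(partial_results: dict) -> list:
    """Combine per-pipeline results r_1..r_N into the final query
    result R = [r_1, ..., r_N], ordered by shard number i, regardless
    of whether a given pipeline ran on the CPU or the GPU."""
    return [r for _, r in sorted(partial_results.items())]

# pipelines may finish out of order; R is still assembled by shard number
print(final_result({2: "r2", 1: "r1", 3: "r3"}))
```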
Following the above embodiments of the present application, step S14 takes the load allocation strategy corresponding to the minimum of the system execution durations of all load allocation strategies as the optimal CPU-GPU load allocation strategy, i.e. {OptN_GPU, OptN_CPU} = FindMin(T[]), where OptN_GPU is the GPU load under the optimal load allocation strategy (the number of pipelines assigned to the GPU), OptN_CPU is the CPU load under the optimal allocation plan (the number of pipelines assigned to the CPU), T[] is the array storing the system execution durations under all load allocation strategies, and FindMin finds the minimum among the system execution durations of all load allocation strategies.
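FindMin over T[] is just an argmin over the strategy array. A hedged sketch, assuming T[] is indexed by N_CPU (so index 0 is the all-on-GPU strategy and N_GPU = N - index):

```python
def find_min_strategy(t: list) -> tuple:
    """Return (OptN_CPU, OptN_GPU) for the strategy with the minimum
    system execution duration.  t[i] is the system execution duration
    when i pipelines are assigned to the CPU and N - i to the GPU."""
    n = len(t) - 1                                   # t covers N_CPU = 0..N
    opt_cpu = min(range(len(t)), key=t.__getitem__)  # index of minimum duration
    return opt_cpu, n - opt_cpu

# durations from the toy enumeration example: minimum at N_CPU = 1
print(find_min_strategy([200, 160, 200, 280, 360, 440]))
```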
In one practical application scenario of the load-balancing method based on CPU-GPU provided by the present application, as shown in Fig. 4 and Fig. 5, the first query statement is executed respectively on data to be checked of 80 million rows and of 140 million rows, wherein the first query statement is: select avg(attr1) from tbl1 group by attr2. The left vertical axis in Fig. 4 and Fig. 5 indicates the system execution duration T corresponding to the present load allocation strategy, the right vertical axis GPU Pipeline Number indicates the assembly line quantity N_GPU distributed on the GPU, the horizontal axis CPU Pipeline Number indicates the assembly line quantity N_CPU distributed on the CPU, the total assembly line quantity on the CPU and the GPU remains unchanged (i.e. N_GPU + N_CPU = 3 in Fig. 4 and N_GPU + N_CPU = 5 in Fig. 5), and Pipeline workload partitions represents the allocation strategy of the assembly line load. When the assembly line quantity distributed on the CPU is 0, all loads are allocated to the GPU for execution, i.e. the execution mode of the traditional CPU-GPU heterogeneous processing analysis system.
From Fig. 4 it can be seen that when the CPU load equals 1 and the GPU load equals 2 (1 assembly line is distributed to the CPU and 2 assembly lines are distributed to the GPU), the system execution duration T corresponding to the present load allocation strategy is 587 milliseconds, the shortest (T_min) among the system execution durations corresponding to all load sharing policies for querying the 80-million-row data; that is, CPU load equal to 1 and GPU load equal to 2 is the optimal load allocation strategy for querying the 80-million-row data with the first query statement. From Fig. 5 it can be seen that when the CPU load is 2 and the GPU load is 3 (2 assembly lines are distributed to the CPU and 3 assembly lines are distributed to the GPU), the system execution duration corresponding to the present load allocation strategy is 936 milliseconds, the shortest (T_min) among the system execution durations corresponding to all load sharing policies for querying the 140-million-row data; that is, CPU load equal to 2 and GPU load equal to 3 is the optimal load allocation strategy for querying the 140-million-row data with the first query statement.
Under the first query statement, the system execution durations for querying the 80-million-row data and the 140-million-row data with the GPU undertaking all loads are 881 milliseconds and 1265 milliseconds respectively. Compared with the traditional execution mode, the system performance after applying the load sharing policy is improved by about 33% and 26% respectively under the different data amounts, wherein 33% = (system execution duration with the GPU undertaking all loads for 80 million rows - system execution duration corresponding to the load sharing policy) / (system execution duration with the GPU undertaking all loads for 80 million rows) = (881 milliseconds - 587 milliseconds) / 881 milliseconds, and 26% = (system execution duration with the GPU undertaking all loads for 140 million rows - system execution duration corresponding to the load sharing policy) / (system execution duration with the GPU undertaking all loads for 140 million rows) = (1265 milliseconds - 936 milliseconds) / 1265 milliseconds.
In another practical application scenario of the load-balancing method based on CPU-GPU provided by the present application, as shown in Fig. 6 and Fig. 7, the second query statement is executed respectively on data to be checked of 80 million rows and of 140 million rows, wherein the second query statement is: select count(*) from (select tbl1.attr1 from tbl1 join tbl2 on tbl1.attr1 = tbl2.attr1). Since the second query statement contains a join operation, its assembly line quantity is greater than that of the first query statement. From Fig. 6 it can be seen that under the 80-million-row data, when the CPU load is 2 and the GPU load is 7, the system execution duration reaches the minimum value T_min = 1361 milliseconds. From Fig. 7 it can be seen that under the 140-million-row data, when the CPU load is 7 and the GPU load is 18, the system execution duration reaches the minimum value T_min = 3488 milliseconds and the system performance is best. Under the second query statement, the system execution durations for querying the 80-million-row data and the 140-million-row data with the GPU undertaking all loads are 1750 milliseconds and 4845 milliseconds respectively; the system performance after applying the load sharing policy is improved by about 22% and 28%, wherein 22% = (system execution duration with the GPU undertaking all loads for 80 million rows - system execution duration corresponding to the load sharing policy) / (system execution duration with the GPU undertaking all loads for 80 million rows) = (1750 milliseconds - 1361 milliseconds) / 1750 milliseconds, and 28% = (system execution duration with the GPU undertaking all loads for 140 million rows - system execution duration corresponding to the load sharing policy) / (system execution duration with the GPU undertaking all loads for 140 million rows) = (4845 milliseconds - 3488 milliseconds) / 4845 milliseconds.
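The improvement figures quoted in both scenarios follow a single formula, (all-GPU duration - balanced duration) / all-GPU duration; a quick check of the reported numbers:

```python
def improvement(gpu_only_ms, balanced_ms):
    """Relative gain of the optimal load sharing policy over the
    traditional mode of running all pipelines on the GPU alone."""
    return (gpu_only_ms - balanced_ms) / gpu_only_ms

# (all-GPU duration, balanced duration) in milliseconds, from the text above.
cases = {
    "query 1, 80M rows":  (881, 587),    # ~33%
    "query 1, 140M rows": (1265, 936),   # ~26%
    "query 2, 80M rows":  (1750, 1361),  # ~22%
    "query 2, 140M rows": (4845, 3488),  # ~28%
}
for name, (gpu_only, balanced) in cases.items():
    print(f"{name}: {improvement(gpu_only, balanced):.0%}")
```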
According to another aspect of the application, a non-volatile storage medium is additionally provided, on which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the processor implements the above load-balancing method based on CPU-GPU.
According to another aspect of the application, a device is additionally provided, comprising:
one or more processors; and
a non-volatile storage medium for storing one or more computer-readable instructions,
wherein, when the one or more computer-readable instructions are executed by the one or more processors, the one or more processors implement the above load-balancing method based on CPU-GPU.
Here, for the detailed content of each embodiment of the device, reference can be made to the corresponding part of the method embodiments of the load-balancing method based on CPU-GPU executed on the device, which is not repeated here.
In conclusion the application is made by constructing assembly line query execution model on CPU-GPU heterogeneous database system
CPU-GPU isomeric data analysis system can support the query analysis under big data scene;Determine the total of pending assembly line
Quantity;Start the assembly line query execution model to distribute the corresponding assembly line of the total quantity to the CPU and institute
It states on GPU, and the execution duration of the assembly line according to the determining single on the CPU and the GPU respectively, calculates all
The corresponding system of load sharing policy execute duration;All systems are finally executed to the minimum value pair in duration
The load sharing policy answered is determined as best CPU-GPU allocation strategy, and the load based on CPU-GPU isomeric data analysis system is equal
Weighing apparatus strategy can reasonable distribution assembly line be loaded to different processor, make full use of processor computing resource, not only improve system
Performance, moreover it is possible to make system reach overall performance best.
It should be noted that the application may be implemented in software and/or a combination of software and hardware; for example, it may be implemented using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one embodiment, the software program of the application may be executed by a processor to implement the above steps or functions. Similarly, the software program of the application (including relevant data structures) may be stored in a computer-readable recording medium, for example, RAM memory, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of the application may be implemented in hardware, for example, as a circuit that cooperates with a processor to execute each step or function.
In addition, a part of the application may be applied as a computer program product, such as computer program instructions which, when executed by a computer, can invoke or provide the methods and/or technical solutions according to the application through the operation of the computer. The program instructions invoking the methods of the application may be stored in a fixed or removable recording medium, and/or transmitted through broadcast or a data flow in other signal-bearing media, and/or stored in the working storage of a computer device running according to the program instructions. Here, an embodiment of the application includes a device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the device is triggered to run the methods and/or technical solutions based on the aforementioned multiple embodiments of the application.
It is obvious to a person skilled in the art that the application is not limited to the details of the above exemplary embodiments, and that the application can be realized in other specific forms without departing from the spirit or essential characteristics of the application. Therefore, from whichever point of view, the present embodiments are to be considered as illustrative and not restrictive, and the scope of the application is limited by the appended claims rather than by the above description; it is therefore intended that all variations falling within the meaning and scope of the equivalent elements of the claims be included in the application. Any reference signs in the claims should not be construed as limiting the claims involved. In addition, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or devices stated in a device claim may also be implemented by one unit or device through software or hardware. Words such as first and second are used to indicate names and do not indicate any particular order.
Claims (10)
1. A load-balancing method based on CPU-GPU, wherein the method comprises:
constructing an assembly line query execution model on a CPU-GPU heterogeneous database system;
determining the total quantity of assembly lines to be executed;
starting the assembly line query execution model, distributing the assembly lines of the total quantity to the CPU and the GPU, and calculating the system execution duration corresponding to each load sharing policy according to the execution duration of a single assembly line determined respectively on the CPU and the GPU; and
determining the load sharing policy corresponding to the minimum value among all the system execution durations as the optimal load allocation strategy.
2. The method according to claim 1, wherein determining the total quantity of assembly lines to be executed comprises:
obtaining a query statement, wherein the query statement comprises data to be checked;
dividing the data to be checked according to a preset data fragment size to obtain the data fragmentations of the data to be checked and their total number; and
starting one corresponding assembly line for each data fragmentation of the data to be checked, whereby the total quantity of the assembly lines to be executed is determined by the total number of the data fragmentations.
3. The method according to claim 2, wherein starting the assembly line query execution model, distributing the assembly lines of the total quantity to the CPU and the GPU, and calculating the system execution duration corresponding to each load sharing policy according to the execution duration of a single assembly line determined respectively on the CPU and the GPU comprises:
Step 1: starting the assembly line query execution model and setting the initial load allocation strategy: the assembly line quantity distributed to the CPU is N_CPU = 0 and the assembly line quantity distributed to the GPU is N_GPU = N, wherein N is the total quantity of the assembly lines and N is a positive integer greater than or equal to 1;
Step 2: executing in parallel each of the assembly lines distributed respectively on the CPU and the GPU, and obtaining the CPU execution duration and the GPU execution duration corresponding to the present load allocation strategy;
Step 3: if the CPU execution duration and the GPU execution duration are equal, determining the CPU execution duration as the system execution duration corresponding to the present load allocation strategy; if the CPU execution duration and the GPU execution duration are unequal, determining the larger of the CPU execution duration and the GPU execution duration as the system execution duration corresponding to the present load allocation strategy;
Step 4: updating the load allocation strategy: the assembly line quantity distributed to the CPU becomes N_CPU = N_CPU + 1 and the assembly line quantity distributed to the GPU becomes N_GPU = N_GPU - 1, wherein N_CPU + N_GPU = N;
Step 5: repeating the above Step 2 to Step 4 until the system execution durations corresponding to all load sharing policies are obtained.
4. The method according to claim 3, wherein, before calculating the system execution duration corresponding to each load sharing policy according to the execution duration of a single assembly line determined respectively on the CPU and the GPU, the method further comprises:
determining the execution duration of a single assembly line on the CPU according to the data input duration T_IN_C, the data execution duration T_EXE_C and the data output duration T_OUT_C of the data fragmentation on the CPU; and
determining the execution duration of a single assembly line on the GPU according to the data input duration T_IN_G, the data execution duration T_EXE_G and the data output duration T_OUT_G of the data fragmentation on the GPU.
5. The method according to claim 4, wherein, in Step 2, the formulas for obtaining the CPU execution duration and the GPU execution duration corresponding to the present load allocation strategy are respectively:
T_CPU = T_IN_C + T_EXE_C + T_OUT_C + Max{T_IN_C, T_EXE_C, T_OUT_C} × (N_CPU - 1),
T_GPU = T_IN_G + T_EXE_G + T_OUT_G + Max{T_IN_G, T_EXE_G, T_OUT_G} × (N_GPU - 1),
wherein T_CPU is the CPU execution duration corresponding to the present load allocation strategy, and Max{T_IN_C, T_EXE_C, T_OUT_C} is the maximum value among the data input duration T_IN_C, the data execution duration T_EXE_C and the data output duration T_OUT_C of the data fragmentation on the CPU; T_GPU is the GPU execution duration corresponding to the present load allocation strategy, and Max{T_IN_G, T_EXE_G, T_OUT_G} is the maximum value among the data input duration T_IN_G, the data execution duration T_EXE_G and the data output duration T_OUT_G of the data fragmentation on the GPU.
6. The method according to claim 5, wherein the query statement further comprises a querying condition, wherein the assembly line distributed to the CPU is a thread instance run on the CPU according to the querying condition, and the assembly line distributed to the GPU is a kernel function instance run on the GPU according to the querying condition.
7. The method according to claim 6, wherein, after Step 2 of executing in parallel each of the assembly lines distributed respectively on the CPU and the GPU and obtaining the CPU execution duration and the GPU execution duration corresponding to the present load allocation strategy, the method further comprises:
obtaining the implementing result corresponding to each of the assembly lines on the CPU and the GPU; and
obtaining the final query execution result corresponding to the query statement according to the implementing result corresponding to each assembly line.
8. The method according to claim 7, wherein the data input duration T_IN_C of the data fragmentation on the CPU is the duration used when the data fragmentation is copied to the memory of the CPU;
the data execution duration T_EXE_C of the data fragmentation on the CPU is the duration used by the thread instance run on the CPU;
the data output duration T_OUT_C of the data fragmentation on the CPU is the duration used for copying the implementing result corresponding to the assembly line in the memory of the CPU;
the data input duration T_IN_G of the data fragmentation on the GPU is the duration used for copying the data fragmentation from the memory of the CPU to the video memory of the GPU;
the data execution duration T_EXE_G of the data fragmentation on the GPU is the duration used by the kernel function instance run on the GPU; and
the data output duration T_OUT_G of the data fragmentation on the GPU is the duration used for copying the implementing result corresponding to the assembly line from the video memory of the GPU to the memory of the CPU.
9. A non-volatile storage medium, on which computer-readable instructions are stored, wherein, when the computer-readable instructions are executed by a processor, the processor implements the method according to any one of claims 1 to 8.
10. A device, comprising:
one or more processors; and
a non-volatile storage medium for storing one or more computer-readable instructions,
wherein, when the one or more computer-readable instructions are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811064037.5A CN109213601B (en) | 2018-09-12 | 2018-09-12 | Load balancing method and device based on CPU-GPU |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109213601A true CN109213601A (en) | 2019-01-15 |
CN109213601B CN109213601B (en) | 2021-01-01 |
Family
ID=64984143
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811064037.5A Active CN109213601B (en) | 2018-09-12 | 2018-09-12 | Load balancing method and device based on CPU-GPU |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109213601B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109918141A (en) * | 2019-03-15 | 2019-06-21 | Oppo广东移动通信有限公司 | Thread execution method, device, terminal and storage medium |
CN110069527A (en) * | 2019-04-22 | 2019-07-30 | 电子科技大学 | A kind of GPU and CPU isomery accelerated method of data base-oriented |
CN110096367A (en) * | 2019-05-14 | 2019-08-06 | 宁夏融媒科技有限公司 | A kind of panorama real-time video method for stream processing based on more GPU |
CN110287212A (en) * | 2019-06-27 | 2019-09-27 | 浪潮商用机器有限公司 | A kind of data service handling method, system and associated component |
CN110298437A (en) * | 2019-06-28 | 2019-10-01 | Oppo广东移动通信有限公司 | Separation calculation method, apparatus, storage medium and the mobile terminal of neural network |
CN110490300A (en) * | 2019-07-26 | 2019-11-22 | 苏州浪潮智能科技有限公司 | A kind of operation accelerated method, apparatus and system based on deep learning |
CN111062855A (en) * | 2019-11-18 | 2020-04-24 | 中国航空工业集团公司西安航空计算技术研究所 | Graph pipeline performance analysis method |
CN111240820A (en) * | 2020-01-13 | 2020-06-05 | 星环信息科技(上海)有限公司 | Concurrency quantity increasing speed multiplying determining method, equipment and medium |
CN112181689A (en) * | 2020-09-30 | 2021-01-05 | 华东师范大学 | Runtime system for efficiently scheduling GPU kernel under cloud |
CN112989082A (en) * | 2021-05-20 | 2021-06-18 | 南京甄视智能科技有限公司 | CPU and GPU mixed self-adaptive face searching method and system |
WO2021129873A1 (en) * | 2019-12-27 | 2021-07-01 | 中兴通讯股份有限公司 | Database querying method, device, apparatus, and storage medium |
CN115437795A (en) * | 2022-11-07 | 2022-12-06 | 东南大学 | Video memory recalculation optimization method and system for heterogeneous GPU cluster load perception |
US11954527B2 (en) | 2020-12-09 | 2024-04-09 | Industrial Technology Research Institute | Machine learning system and resource allocation method thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101091175A (en) * | 2004-09-16 | 2007-12-19 | 辉达公司 | Load balancing |
CN101706741A (en) * | 2009-12-11 | 2010-05-12 | 中国人民解放军国防科学技术大学 | Method for partitioning dynamic tasks of CPU and GPU based on load balance |
CN103329100A (en) * | 2011-01-21 | 2013-09-25 | 英特尔公司 | Load balancing in heterogeneous computing environments |
US9311152B2 (en) * | 2007-10-24 | 2016-04-12 | Apple Inc. | Methods and apparatuses for load balancing between multiple processing units |
Non-Patent Citations (1)
Title |
---|
SHEN Wenfeng: "Research and Application of Load Prediction Scheduling Algorithms in CPU-GPU Heterogeneous High-Performance Computing", China Doctoral Dissertations Full-text Database, Information Science and Technology Series *
Also Published As
Publication number | Publication date |
---|---|
CN109213601B (en) | 2021-01-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||