CN105094981B - Method and device for data processing - Google Patents

Info

Publication number
CN105094981B
Authority
CN
China
Prior art keywords
data
gpu
waiting task
acquisition system
described
Prior art date
Application number
CN201410223152.8A
Other languages
Chinese (zh)
Other versions
CN105094981A (en
Inventor
崔慧敏
杨文森
谢睿
Original Assignee
华为技术有限公司
中国科学院计算技术研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 and 中国科学院计算技术研究所
Priority to CN201410223152.8A
Publication of CN105094981A
Application granted
Publication of CN105094981B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]

Abstract

Embodiments of the present invention disclose a method and device for data processing, relating to the communications field and intended to improve the efficiency of data processing. The method comprises: obtaining a to-be-processed task and at least one piece of to-be-processed data corresponding to the task; allocating a graphics processing unit (GPU) to the task; converting the at least one piece of to-be-processed data into data of a data set type; parsing the data of the data set type and generating at least one data block from the parsed data; and sending the generated at least one data block to the allocated GPU, so that the GPU performs computation on the at least one data block according to the task. The invention is applicable to scenarios in which an acceleration component is triggered to process data.

Description

Method and device for data processing

Technical field

The present invention relates to the communications field, and in particular to a method and device for data processing.

Background

With the development of science, technology, and the Internet, the amount of information in modern society is growing rapidly, and this information accumulates into large volumes of data. Part of this data is stored in, or processed by, cloud platforms, and Hadoop makes it possible to store, manage, and analyze the data held in cloud platforms efficiently.

Hadoop is a software framework for distributed processing of massive data. Its bottom layer is a distributed file system, which stores data in a distributed manner, improving read and write speed and expanding storage capacity. The layer above the distributed file system is the MapReduce engine, which integrates the data in the distributed file system and keeps data analysis and processing efficient. Because of these advantages, Hadoop is widely used in many fields. However, in a cluster environment that contains acceleration components such as graphics processing units (GPUs), the <key, value> programming interface that the existing Hadoop MapReduce engine provides to users is a limitation: the engine cannot trigger the acceleration components to process data, so their powerful computing capability goes unused and the efficiency of data processing cannot be improved.

Summary of the invention

Embodiments of the present invention provide a method and device for data processing, to improve the efficiency of data processing.

To achieve the above objective, the embodiments of the present invention adopt the following technical solutions:

In a first aspect, an embodiment of the present invention provides a method of data processing, comprising: obtaining a to-be-processed task and at least one piece of to-be-processed data corresponding to the task; allocating a graphics processing unit (GPU) to the task; converting the at least one piece of to-be-processed data into data of a data set type; parsing the data of the data set type and generating at least one data block from the parsed data; and sending the generated at least one data block to the allocated GPU, so that the GPU performs computation on the at least one data block according to the task.

In a first possible implementation of the first aspect, before the GPU is allocated to the to-be-processed task, the method further comprises: obtaining a preconfigured resource information table, where the resource information table records the number of GPUs and usage information of the GPUs.

With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, after the resource information table is obtained, the method further comprises: determining the number of GPUs that the to-be-processed task requires. Allocating a GPU to the task then comprises: according to the number of GPUs and the usage information of the GPUs in the resource information table, allocating a GPU to the task when the number of unused GPUs is determined to satisfy the number of GPUs that the task requires.

With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the resource information table also records the number of central processing units (CPUs) and usage information of the CPUs. After the number of GPUs that the task requires is determined, the method further comprises: allocating a CPU to the task when the number of unused GPUs in the resource information table is determined not to satisfy the number of GPUs that the task requires.

With reference to the first aspect or any of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, converting the at least one piece of to-be-processed data into data of the data set type comprises: determining the data size of the data set type; and, according to that data size, distributing the at least one piece of to-be-processed data into at least one data set, where the size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.

With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, distributing the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type comprises: when the data type of the at least one piece of to-be-processed data is a variable-length data type, distributing the data into at least one data set according to the data size of the data set type, and recording location information of the data within the at least one data set, so that the GPU obtains the to-be-processed data according to the location information. The location information records where variable-length to-be-processed data is located within a data set.

With reference to the first aspect or any of the first to fifth possible implementations of the first aspect, in a sixth possible implementation of the first aspect, parsing the data of the data set type and generating at least one data block from the parsed data comprises: using a preset parsing function to convert the data format of the data of the data set type into the data format that the GPU requires for computation; and generating at least one data block from the converted data.

With reference to the first aspect or any of the first to sixth possible implementations of the first aspect, in a seventh possible implementation of the first aspect, sending the generated at least one data block to the allocated GPU comprises: storing the generated at least one data block in a buffer of the allocated GPU.

With reference to the first aspect or any of the first to seventh possible implementations of the first aspect, in an eighth possible implementation of the first aspect, after the generated at least one data block is sent to the allocated GPU, the method further comprises: receiving the computation result sent by the allocated GPU, and partitioning, sorting, and merging the computation result.

In a second aspect, an embodiment of the present invention provides a device for data processing, comprising: an obtaining unit, configured to obtain a to-be-processed task and at least one piece of to-be-processed data corresponding to the task; an allocation unit, configured to allocate a graphics processing unit (GPU) to the task; a conversion unit, configured to convert the at least one piece of to-be-processed data into data of a data set type; a parsing unit, configured to parse the data of the data set type produced by the conversion unit and generate at least one data block from the parsed data; and a sending unit, configured to send the at least one data block generated by the parsing unit to the GPU allocated by the allocation unit, so that the GPU performs computation on the at least one data block according to the task.

In a first possible implementation of the second aspect, the obtaining unit is further configured to obtain a preconfigured resource information table, where the resource information table records the number of GPUs and usage information of the GPUs.

With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the device further comprises: a determination unit, configured to determine the number of GPUs that the to-be-processed task requires. The allocation unit is specifically configured to allocate a GPU to the task when, according to the number of GPUs and the usage information of the GPUs in the resource information table obtained by the obtaining unit, the number of unused GPUs is determined to satisfy the number of GPUs, determined by the determination unit, that the task requires.

With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the resource information table also records the number of central processing units (CPUs) and usage information of the CPUs. The allocation unit is further configured to allocate a CPU to the task when the number of unused GPUs is determined not to satisfy the number of GPUs, determined by the determination unit, that the task requires.

With reference to the second aspect or any of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the conversion unit is specifically configured to: determine the data size of the data set type; and, according to that data size, distribute the at least one piece of to-be-processed data into at least one data set, where the size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.

With reference to the third possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the conversion unit is specifically configured to: when the data type of the at least one piece of to-be-processed data is a variable-length data type, distribute the data into at least one data set according to the data size of the data set type, and record location information of the data within the at least one data set, so that the GPU obtains the to-be-processed data according to the location information. The location information records where variable-length to-be-processed data is located within a data set.

With reference to the second aspect or any of the first to fifth possible implementations of the second aspect, in a sixth possible implementation of the second aspect, the parsing unit is specifically configured to: use a preset parsing function to convert the data format of the data of the data set type into the data format that the GPU requires for computation; and generate at least one data block from the converted data.

With reference to the second aspect or any of the first to sixth possible implementations of the second aspect, in a seventh possible implementation of the second aspect, the sending unit is specifically configured to send the generated at least one data block to a buffer of the allocated GPU.

With reference to the second aspect or any of the first to seventh possible implementations of the second aspect, in an eighth possible implementation of the second aspect, the device further comprises: a receiving unit, configured to receive the computation result sent by the GPU; and a processing unit, configured to partition, sort, and merge the computation result.

In a third aspect, an embodiment of the present invention provides a device for data processing, comprising: a processor, a memory, a communication interface, and a bus, where the processor, the memory, and the communication interface communicate over the bus. The memory is configured to store a program. The processor is configured to execute the executable instructions stored in the memory. The communication interface is configured to receive a to-be-processed task and at least one piece of to-be-processed data corresponding to the task. When the data processing device runs, the processor runs the program to execute the following instructions: obtaining a to-be-processed task and at least one piece of to-be-processed data corresponding to the task; allocating a graphics processing unit (GPU) to the task; converting the at least one piece of to-be-processed data into data of a data set type; parsing the data of the data set type and generating at least one data block from the parsed data; and sending the generated at least one data block to the allocated GPU, so that the GPU performs computation on the at least one data block according to the task.

Embodiments of the present invention provide a method and device for data processing. The data processing device obtains a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task. It converts the at least one piece of to-be-processed data into data of a data set type, parses the data of the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU, so that the GPU performs the computation. In this way, after obtaining a task and its corresponding data, the device can allocate a GPU to the task and send the data to the allocated GPU, triggering the GPU to process the data and thereby improving the efficiency of data processing.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a method of data processing according to an embodiment of the present invention;

Fig. 2 is a flowchart of another method of data processing according to an embodiment of the present invention;

Fig. 3 is a functional schematic diagram of a device for data processing according to an embodiment of the present invention;

Fig. 4 is a functional schematic diagram of another device for data processing according to an embodiment of the present invention;

Fig. 5 is a functional schematic diagram of another device for data processing according to an embodiment of the present invention;

Fig. 6 is a structural schematic diagram of a device for data processing according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

An embodiment of the present invention provides a method of data processing. As shown in Fig. 1, the method may include:

101. Obtain a to-be-processed task and at least one piece of to-be-processed data corresponding to the task.

Specifically, when the data processing device executes an application, it can obtain the to-be-processed task corresponding to that application and, according to the task, retrieve the corresponding at least one piece of to-be-processed data from a distributed file system.

It should be noted that the data processing device may run in a Hadoop system. In that case, when an application runs, the device can obtain the to-be-processed task in the Hadoop system and, according to the task, retrieve the corresponding at least one piece of to-be-processed data from the distributed file system of the Hadoop system.

It should be noted that the data processing device may also run in any other system that needs to send data to a GPU for computation; the present invention places no limitation on this.

102. Allocate a graphics processing unit (GPU) to the to-be-processed task.

Specifically, after obtaining the task and its at least one piece of to-be-processed data, the data processing device can determine, according to the requirements of the task, whether the data is to be processed by a GPU (Graphics Processing Unit). If the device determines that the task needs a GPU to process its data, the device can allocate a GPU to the task.

It should be noted that a GPU cannot exist in a cluster system as a standalone component; it must be configured on the data processing device as an acceleration component, so GPU computing resources must be managed by the data processing device. The data processing device therefore has two kinds of computing resources: the CPU (Central Processing Unit) and the GPU.

103. Convert the at least one piece of to-be-processed data corresponding to the task into data of a data set type.

Specifically, after allocating a GPU to the task, the data processing device can determine the data size of the data set type and, according to that data size, distribute the at least one piece of to-be-processed data into at least one data set.

The size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.

That is, after allocating a GPU to the task, the data processing device needs to convert the to-be-processed data from individual pieces of data into groups of data, namely data of the data set (Data Set) type. Once the data size of the data set type has been determined, the device distributes the at least one piece of to-be-processed data into at least one data set according to that size, so that subsequent processing is carried out one data set at a time.

Further, the data type of the at least one piece of to-be-processed data may be a fixed-length data type or a variable-length data type.

When the data type is a fixed-length data type, the at least one piece of to-be-processed data can be distributed directly into at least one data set according to the data size of the data set type. Because every piece of data has the same fixed size, its position within a data set is also fixed, so there is no need to record the positions of fixed-length data within a data set.
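The fixed-length case can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names `pack_fixed_length` and `record_at`, the use of `bytes` for records, and the choice to raise an error for an oversized record are all assumptions made for illustration. It shows why no position table is needed when every record has the same size.

```python
from typing import List


def pack_fixed_length(records: List[bytes], set_size: int) -> List[bytes]:
    """Group equal-length records into data sets of at most set_size bytes.

    Hypothetical helper: the names and the bytes representation are assumptions.
    """
    if not records:
        return []
    record_len = len(records[0])
    if any(len(r) != record_len for r in records):
        raise ValueError("all records must have the same length")
    if record_len == 0 or record_len > set_size:
        raise ValueError("record length must be between 1 and set_size")
    per_set = set_size // record_len  # whole records that fit in one data set
    return [
        b"".join(records[i:i + per_set])
        for i in range(0, len(records), per_set)
    ]


def record_at(data_set: bytes, index: int, record_len: int) -> bytes:
    """Recover a record by arithmetic alone; no recorded position is required."""
    start = index * record_len
    return data_set[start:start + record_len]
```

With 4-byte records and a 10-byte data set size, two records fit in each data set, and the second record of a set always starts at byte 4.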

When the data type of the at least one piece of to-be-processed data corresponding to the task is a variable-length data type, the at least one piece of to-be-processed data is distributed into at least one data set according to the data size of the data set type, and the location information of the data within the at least one data set is recorded, so that the GPU obtains the to-be-processed data according to the location information.

The location information records where variable-length to-be-processed data is located within a data set.

That is, when the data type is a variable-length data type, the pieces of to-be-processed data differ in size. Therefore, while distributing the data into at least one data set according to the data size of the data set type, the device needs to record the location of each piece of data within its data set, to ensure that the GPU can obtain each complete piece of to-be-processed data from this location information when it processes the data.
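A minimal Python sketch of the variable-length case follows. The patent does not specify how location information is encoded; the `(offset, length)` pairs, the greedy packing order, and the names `pack_variable_length` and `read_records` are assumptions chosen for illustration.

```python
from typing import List, Tuple

# A data set is modelled as (payload, positions), where positions holds one
# (offset, length) pair per record. This encoding is an assumption.
DataSet = Tuple[bytes, List[Tuple[int, int]]]


def pack_variable_length(records: List[bytes], set_size: int) -> List[DataSet]:
    """Greedily pack variable-length records, recording where each one lands."""
    data_sets: List[DataSet] = []
    payload = bytearray()
    positions: List[Tuple[int, int]] = []
    for rec in records:
        if len(rec) > set_size:
            raise ValueError("a single record does not fit in one data set")
        if len(payload) + len(rec) > set_size:
            # Current data set is full: close it and start a new one.
            data_sets.append((bytes(payload), positions))
            payload, positions = bytearray(), []
        positions.append((len(payload), len(rec)))
        payload += rec
    if positions:
        data_sets.append((bytes(payload), positions))
    return data_sets


def read_records(data_set: DataSet) -> List[bytes]:
    """Use the recorded positions to recover each complete record."""
    payload, positions = data_set
    return [payload[offset:offset + length] for offset, length in positions]
```

The consumer (the GPU, in the text above) only needs the payload and the position list to slice out each record intact, regardless of how record sizes vary.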

Optionally, as an example, a data set may be a data buffer. The data processing device can store the at least one piece of to-be-processed data in the data buffer, thereby converting the at least one piece of to-be-processed data corresponding to the task into data of the data set type.

104. Parse the data of the data set type, and generate at least one data block from the parsed data.

Specifically, after converting the at least one piece of to-be-processed data into data of the data set type, the data processing device parses that data so as to convert it into the data type that the GPU requires for computation, and generates at least one data block from the parsed data.

Further, the data processing device uses a preset parsing function to convert the data format of the data of the data set type into the data format that the GPU requires for computation, and generates at least one data block from the converted data.

That is, working one data set at a time, the data processing device applies the preset parsing function to the data in the data set to convert its data type into the data type that the GPU requires for computation, and then generates at least one data block from the at least one converted data set.

It should be noted that when the user specifies that the task is to be executed by a GPU, the user needs to determine in advance what kind of computation the GPU will perform; the parsing function can then be determined from that computation. In other words, different computations performed by the GPU call for different preset parsing functions. For example, if the GPU is to perform a logical operation on the to-be-processed data, the preset parsing function may convert the data format of the to-be-processed data into the format required for the logical operation. For instance, the parsing function converts to-be-processed data whose format is text or binary into data of an integer type on which the logical operation can be performed.
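As a hedged illustration of a preset parsing function, the Python sketch below converts text or binary records into integers. The registry `PARSERS`, the function names, the little-endian 32-bit layout for binary records, and the choice of one data block per data set are all assumptions; the patent does not define them.

```python
import struct
from typing import Callable, Dict, List


def parse_text(data_set: List[bytes]) -> List[int]:
    """Convert ASCII decimal text records into integers."""
    return [int(record.decode("ascii").strip()) for record in data_set]


def parse_binary(data_set: List[bytes]) -> List[int]:
    """Convert 4-byte little-endian records into integers (layout is assumed)."""
    return [struct.unpack("<i", record)[0] for record in data_set]


# Hypothetical registry: the parsing function is chosen in advance to match
# the source format and the computation the GPU will perform.
PARSERS: Dict[str, Callable[[List[bytes]], List[int]]] = {
    "text": parse_text,
    "binary": parse_binary,
}


def to_data_blocks(data_sets: List[List[bytes]], source_format: str) -> List[List[int]]:
    """Parse one data set at a time; each parsed data set becomes one data block."""
    parse = PARSERS[source_format]
    return [parse(data_set) for data_set in data_sets]
```

A different GPU computation would register a different parser, for example one that produces floating-point values, without changing `to_data_blocks`.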

105. Send the generated at least one data block to the allocated GPU, so that the GPU performs computation on the at least one data block according to the task.

Specifically, after generating the at least one data block, the data processing device sends the data block to the allocated GPU through the GPU's data interface.

Further, the data processing device can store the generated at least one data block in a buffer of the allocated GPU.
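The buffering step can be sketched on the host side in Python without any GPU library. Everything here is a stand-in: the class `DeviceBufferStandIn`, its byte capacity, and the rule that blocks which do not fit are returned to the caller are assumptions for illustration, and a real implementation would copy the blocks to device memory through the GPU vendor's API.

```python
from typing import List


class DeviceBufferStandIn:
    """Host-side stand-in for the buffer of an allocated GPU (hypothetical)."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.blocks: List[bytes] = []

    def store(self, block: bytes) -> bool:
        """Store a block if it fits; report whether it was stored."""
        if self.used_bytes + len(block) > self.capacity_bytes:
            return False
        self.blocks.append(block)
        self.used_bytes += len(block)
        return True


def send_blocks(blocks: List[bytes], buffer: DeviceBufferStandIn) -> List[bytes]:
    """Store blocks in order and return the ones that did not fit.

    Once one block fails to fit, all later blocks are held back as well so
    that the original block order is preserved (an assumed policy).
    """
    pending: List[bytes] = []
    for block in blocks:
        if pending or not buffer.store(block):
            pending.append(block)
    return pending
```

The caller could retry the returned blocks after the GPU has consumed the buffered ones; that retry policy is outside what the text describes.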

An embodiment of the present invention provides a method of data processing. The data processing device obtains a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task. It converts the at least one piece of to-be-processed data into data of a data set type, parses the data of the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU, so that the GPU performs the computation. In this way, after obtaining a task and its corresponding data, the device can allocate a GPU to the task and send the data to the allocated GPU, triggering the GPU to process the data and thereby improving the efficiency of data processing.

Further, the above process does not require the data type of the at least one piece of to-be-processed data to be a fixed-length data type, which improves the performance of the system. In addition, no manual participation by the user is needed while the process runs, which further improves the efficiency of data processing.

An embodiment of the present invention provides a method of data processing which, as shown in Fig. 2, comprises:

201. The data processing device obtains a to-be-processed task and at least one piece of to-be-processed data corresponding to the task.

For details, refer to step 101; they are not repeated here.

202. The data processing device obtains a preconfigured resource information table.

The resource information table records the number of GPUs and usage information of the GPUs.

Specifically, the first time the data processing device obtains the resource information table, it can obtain the table from the initial cluster file system. After obtaining the table for the first time, the device can store it in a cache so that it can be obtained from there afterwards.

Further, the resource information table also records the number of central processing units (CPUs) and usage information of the CPUs.
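One possible shape for such a table, with load-once caching, is sketched below in Python. The patent does not define the table's layout; the `ResourceTable` class, the per-device in-use flags, the device names, and the `load_from_cluster` callback are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ResourceTable:
    """Hypothetical layout: one in-use flag per GPU and per CPU."""
    gpu_in_use: Dict[str, bool] = field(default_factory=dict)
    cpu_in_use: Dict[str, bool] = field(default_factory=dict)

    def gpu_count(self) -> int:
        return len(self.gpu_in_use)

    def unused_gpus(self) -> List[str]:
        return [name for name, used in self.gpu_in_use.items() if not used]


_cached_table: Optional[ResourceTable] = None


def get_resource_table(load_from_cluster: Callable[[], ResourceTable]) -> ResourceTable:
    """Load the table from the cluster on first use, then serve it from cache."""
    global _cached_table
    if _cached_table is None:
        _cached_table = load_from_cluster()
    return _cached_table
```

Only the first call pays the cost of reading from the cluster file system; later calls return the cached object.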

It should be noted that the present invention to the sequence between step 201 and step 202 with no restrictions.Step can be first carried out 201, step 202 is being executed, step 202 can also be first carried out, step 201 is being executed, can also be performed simultaneously step 201,202.? One kind is only represented in diagram.

203, the device of data processing determines the usage quantity for the GPU that the waiting task needs.

Specifically, needed for the device of data processing carries it after getting waiting task, in waiting task Resource information.The device of data processing can know the use number for the GPU that the waiting task needs according to this resource information Amount.

It should be noted that waiting task can also be in other ways by the usage quantity notification data of the GPU needed for it The device of processing, the present invention are without limitation.

204, the device of data processing determines whether to distribute GPU for the waiting task.

Specifically, after the device of data processing GPU quantity needed for knowing waiting task, it can according to resource information table To determine not used GPU data, so can determine whether to distribute GPU for the waiting task.

Further, the device of data processing is according to the quantity of GPU in the resource information table and the use feelings of the GPU Condition information, determines whether the quantity of not used GPU meets the usage quantity for the GPU that the waiting task needs, thus really Whether fixed be the waiting task distribution GPU.Determine that the quantity of not used GPU meets what the waiting task needed When the usage quantity of GPU, the device of data processing is determined as the waiting task distribution GPU.

Determine that the quantity of not used GPU in the resource information table is unsatisfactory for the use of the GPU of waiting task needs When quantity, the device of data processing, which determines, does not distribute GPU for waiting task, can distribute CPU for the waiting task.

That is, the device of data processing is according to the quantity of GPU in the resource information table and the use feelings of the GPU Condition information can determine the quantity of not used GPU, by GPU's needed for the quantity of the unused GPU and waiting task Quantity is compared, when the quantity of GPU needed for being more than or equal to waiting task in the quantity of the unused GPU, at data The device of reason determines that the quantity of not used GPU meets the usage quantity for the GPU that the waiting task needs, at this time at data The device of reason is determined as waiting task distribution GPU.GPU needed for being less than waiting task in the quantity of the unused GPU Quantity when, the device of data processing determines that the quantity of not used GPU is unsatisfactory for making for the GPU that the waiting task needs With quantity, the device of data processing at this time, which determines, does not distribute GPU for waiting task, can distribute CPU for waiting task.

It should be noted that the steps the data processing device performs next depend on the result of this determination. If it determines to allocate GPUs to the pending task, steps 205a and 206-209 are performed. If it determines not to allocate GPUs to the pending task, step 205b is performed.

205a, when it is determined that the number of unused GPUs satisfies the number of GPUs that the pending task requires, the data processing device allocates GPUs to the pending task.

Specifically, the data processing device may allocate GPUs to the pending task according to the number of GPUs that the task requires. For details, refer to step 102; they are not repeated here.

205b, when it is determined that the number of unused GPUs in the resource information table does not satisfy the number of GPUs that the pending task requires, the data processing device allocates a CPU to the pending task.

Specifically, when the data processing device determines that the number of unused GPUs in the resource information table does not satisfy the number of GPUs that the pending task requires, no GPU can be allocated to perform the computation, so the device may allocate a CPU to the pending task, and the CPU performs the corresponding computation.

206, the data processing device converts the at least one piece of pending data corresponding to the pending task into data of a data set type.

Specifically, refer to step 103; details are not repeated here.

207, the data processing device parses the data of the data set type and generates at least one data block from the parsed data.

Specifically, refer to step 104; details are not repeated here.

208, the data processing device sends the generated at least one data block to the allocated GPU, so that the GPU performs computation on the at least one data block according to the pending task.

Specifically, refer to step 105; details are not repeated here.

209, the data processing device receives the computation result sent by the allocated GPU, and partitions, sorts, and merges the computation result.

Specifically, after receiving the computation result sent by the allocated GPU, the data processing device may partition, sort, and merge the computation result. Partitioning places computation results that have the same key into the same group. For the grouped results, each group of results is sorted according to the key corresponding to that group. The computation results that share a key are then merged.
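As an illustration of the partition, sort, and merge stages, the following Python sketch treats each computation result as a (key, value) pair. The function name `partition_sort_merge` and the default merge operation (summation) are assumptions made for the example; the patent does not state which merge operation is applied.

```python
from collections import defaultdict


def partition_sort_merge(results, merge=sum):
    """Group (key, value) results by key, order groups by key, merge each group."""
    groups = defaultdict(list)
    for key, value in results:
        groups[key].append(value)        # partition: same key -> same group
    merged = []
    for key in sorted(groups):           # sort: order the groups by their key
        merged.append((key, merge(groups[key])))  # merge: combine values per key
    return merged
```

For example, the results `("b", 1)`, `("a", 2)`, `("b", 3)` end up as one entry for key `a` and one for key `b`, with the two `b` values combined by the merge function.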

An embodiment of the present invention provides a data processing method. The data processing device obtains a pending task and at least one piece of pending data corresponding to the pending task, and allocates a GPU to the task. It converts the at least one piece of pending data into data of a data set type, parses that data, generates at least one data block from the parsed data, and sends the generated data blocks to the allocated GPU so that the GPU performs the computation. In this way, after obtaining a pending task and its corresponding pending data, the data processing device can allocate a GPU to the task and send the pending data to the allocated GPU, triggering the GPU to compute on the pending data, which improves data processing efficiency. Moreover, this process does not require the pending data corresponding to the pending task to be of a fixed-length data type, which improves system performance. No manual involvement by the user is needed at run time, which further improves data processing efficiency.

An embodiment of the present invention provides a data processing device which, as shown in Figure 3, comprises:

An acquiring unit 301, configured to obtain a pending task and at least one piece of pending data corresponding to the pending task.

An allocation unit 302, configured to allocate a graphics processing unit (GPU) to the pending task.

Specifically, the allocation unit 302 may determine, according to the requirements of the pending task, whether the at least one piece of pending data corresponding to the task is to be processed by a GPU. If the pending task requires a GPU to process its corresponding pending data, the allocation unit 302 may allocate a GPU to the task.

A converting unit 303, configured to convert the at least one piece of pending data corresponding to the pending task into data of a data set type.

Specifically, the converting unit 303 is configured to determine the size of the data of the data set type and, according to that size, distribute the at least one piece of pending data into at least one data set.

The size of the pending data contained in a data set is not greater than the size of the data of the data set type.
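A minimal Python sketch of this size constraint follows, assuming each piece of pending data is a `bytes` record and that records are assigned greedily in arrival order. The function name `pack_into_sets` and the greedy strategy are illustrative assumptions; the patent only requires that the content of a data set not exceed the data set size.

```python
def pack_into_sets(records, set_size):
    """Distribute byte records into data sets holding at most set_size bytes each."""
    sets, current, used = [], [], 0
    for rec in records:
        if len(rec) > set_size:
            # A single record that cannot fit in any data set is rejected here;
            # how the patent handles this case is not stated.
            raise ValueError("record larger than data set size")
        if used + len(rec) > set_size:
            sets.append(current)         # current set is full, start a new one
            current, used = [], 0
        current.append(rec)
        used += len(rec)
    if current:
        sets.append(current)
    return sets
```

Every data set returned by this sketch holds no more than `set_size` bytes, which mirrors the constraint stated above.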

Further, the converting unit 303 is specifically configured to: when the data type of the at least one piece of pending data corresponding to the pending task is a variable-length data type, distribute the pending data into at least one data set according to the size of the data of the data set type, and record the position information of the pending data within the at least one data set, so that the GPU can obtain the pending data according to the position information.

The position information records where the variable-length pending data is located within the data set.

The converting unit 303 is specifically configured to: when the data type of the at least one piece of pending data is a fixed-length data type, distribute the pending data directly into at least one data set according to the size of the data of the data set type. Because every piece of data has the same fixed size, its position within the data set is also fixed, so the positions of fixed-length pending data within the data set need not be recorded.
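The difference between the two cases can be sketched in Python as follows. Here the position information is assumed to be an (offset, length) pair per record; this representation and the function names are hypothetical, since the patent does not define the format of the position information.

```python
def pack_variable(records):
    """Concatenate variable-length records and record (offset, length) for each."""
    buf = bytearray()
    positions = []
    for rec in records:
        positions.append((len(buf), len(rec)))  # where this record starts, its size
        buf += rec
    return bytes(buf), positions


def read_variable(buf, positions, i):
    """Variable-length case: the recorded position is needed to find record i."""
    offset, length = positions[i]
    return buf[offset:offset + length]


def read_fixed(buf, width, i):
    """Fixed-length case: record i is located by arithmetic, no table needed."""
    return buf[i * width:(i + 1) * width]
```

For fixed-length data the position of record `i` follows from `i * width`, which is why no position information has to be stored in that case.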

A parsing unit 304, configured to parse the data of the data set type converted by the converting unit 303 and generate at least one data block from the parsed data.

Specifically, the parsing unit 304 is configured to use a preset parsing function to convert the data format of the data of the data set type into the data format that the GPU requires for computation, and to generate at least one data block from the format-converted data.
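The parsing step might look like the Python sketch below. It assumes, purely for illustration, that each record is a text line of the form "x,y" and that the GPU kernel expects packed little-endian 32-bit floats; the patent specifies neither the input format nor the GPU-side format, and `parse_record` and `make_blocks` are hypothetical names.

```python
import struct


def parse_record(line):
    """Hypothetical preset parsing function: 'x,y' text -> two packed float32 values."""
    x, y = (float(v) for v in line.split(","))
    return struct.pack("<2f", x, y)


def make_blocks(lines, records_per_block):
    """Parse every record, then group the parsed records into data blocks."""
    parsed = [parse_record(line) for line in lines]
    return [
        b"".join(parsed[i:i + records_per_block])
        for i in range(0, len(parsed), records_per_block)
    ]
```

Each resulting block is a contiguous byte string that could be copied to a device buffer as a unit; the block size in records is a tuning parameter assumed for the example.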

Transmission unit 305 is sent to described point at least one data block described in generating the resolution unit 304 GPU with unit distribution, so that the GPU carries out at calculating at least one described data block according to the waiting task Reason.

Specifically, the sending unit 305 is configured to send the generated at least one data block to the buffer of the allocated GPU.

Further, the acquiring unit 301 is also configured to obtain a preconfigured resource information table.

The resource information table records the number of GPUs and the usage information of the GPUs.

Further, the resource information table also records the number of central processing units (CPUs) and the usage information of the CPUs.

The data processing device, as shown in Figure 4, further comprises:

A determination unit 306, configured to determine the number of GPUs that the pending task requires.

In this case, the allocation unit 302 is specifically configured to allocate GPUs to the pending task when it determines, according to the number of GPUs and the GPU usage information in the resource information table obtained by the acquiring unit 301, that the number of unused GPUs satisfies the number of GPUs required by the pending task as determined by the determination unit 306.

Further, the allocation unit 302 is also configured to allocate a CPU to the pending task when it determines that the number of unused GPUs does not satisfy the number of GPUs required by the pending task as determined by the determination unit 306.

Further, the data processing device, as shown in Figure 5, further comprises:

A receiving unit 307, configured to receive the computation result sent by the allocated GPU.

A processing unit 308, configured to partition, sort, and merge the computation result.

Specifically, after the receiving unit 307 receives the computation result sent by the allocated GPU, the processing unit 308 may partition, sort, and merge the computation result. Partitioning places computation results that have the same key into the same group. For the grouped results, each group of results is sorted according to the key corresponding to that group. The computation results that share a key are then merged.

An embodiment of the present invention provides a data processing device. The data processing device obtains a pending task and at least one piece of pending data corresponding to the pending task, and allocates a GPU to the task. It converts the at least one piece of pending data into data of a data set type, parses that data, generates at least one data block from the parsed data, and sends the generated data blocks to the allocated GPU so that the GPU performs the computation. In this way, after obtaining a pending task and its corresponding pending data, the data processing device can allocate a GPU to the task and send the pending data to the allocated GPU, triggering the GPU to compute on the pending data, which improves data processing efficiency. Moreover, this process does not require the pending data corresponding to the pending task to be of a fixed-length data type, which improves system performance. No manual involvement by the user is needed at run time, which further improves data processing efficiency.

An embodiment of the present invention provides a data processing device which, as shown in Figure 6, comprises a processor 601, a memory 602, a communication interface 603, and a bus 604, where the processor 601, the memory 602, and the communication interface 603 communicate over the bus 604.

The memory 602 is configured to store a program.

The processor 601 is configured to execute the executable instructions stored in the memory.

The communication interface 603 is configured to receive a pending task and at least one piece of pending data corresponding to the pending task.

When the data processing device runs, the processor 601 runs the program to execute the following instructions:

The processor 601 is configured to obtain a pending task and at least one piece of pending data corresponding to the pending task.

The processor 601 is also configured to allocate a graphics processing unit (GPU) to the pending task.

Specifically, the processor 601 may determine, according to the requirements of the pending task, whether the at least one piece of pending data corresponding to the task is to be processed by a GPU. If the pending task requires a GPU to process its corresponding pending data, the processor 601 may allocate a GPU to the task.

The processor 601 is also configured to convert the at least one piece of pending data corresponding to the pending task into data of a data set type.

Specifically, the processor 601 is configured to determine the size of the data of the data set type and, according to that size, distribute the at least one piece of pending data into at least one data set.

The size of the pending data contained in a data set is not greater than the size of the data of the data set type.

Further, the processor 601 is specifically configured to: when the data type of the at least one piece of pending data corresponding to the pending task is a variable-length data type, distribute the pending data into at least one data set according to the size of the data of the data set type, and record the position information of the pending data within the at least one data set, so that the GPU can obtain the pending data according to the position information.

The position information records where the variable-length pending data is located within the data set.

The processor 601 is specifically configured to: when the data type of the at least one piece of pending data is a fixed-length data type, distribute the pending data directly into at least one data set according to the size of the data of the data set type. Because every piece of data has the same fixed size, its position within the data set is also fixed, so the positions of fixed-length pending data within the data set need not be recorded.

The processor 601 is also configured to parse the data of the data set type and generate at least one data block from the parsed data.

Specifically, the processor 601 is configured to use a preset parsing function to convert the data format of the data of the data set type into the data format that the GPU requires for computation, and to generate at least one data block from the format-converted data.

The processor 601 is also configured to send the generated at least one data block to the allocated GPU, so that the GPU performs computation on the at least one data block according to the pending task.

Specifically, the processor 601 is configured to send the generated at least one data block to the buffer of the allocated GPU.

Further, the processor 601 is also configured to obtain a preconfigured resource information table.

The resource information table records the number of GPUs and the usage information of the GPUs.

Further, the resource information table also records the number of central processing units (CPUs) and the usage information of the CPUs.

The processor 601 is also configured to determine the number of GPUs that the pending task requires.

In this case, the processor 601 being configured to allocate a GPU to the pending task specifically means:

The processor 601 is specifically configured to allocate GPUs to the pending task when it determines, according to the number of GPUs and the GPU usage information in the resource information table, that the number of unused GPUs satisfies the number of GPUs that the pending task requires.

Further, the processor 601 is also configured to allocate a CPU to the pending task when it determines that the number of unused GPUs does not satisfy the number of GPUs required by the pending task as determined by the determination unit 306.

Further, the communication interface 603 is also configured to receive the computation result sent by the allocated GPU.

The processor 601 is also configured to partition, sort, and merge the computation result.

Specifically, after the communication interface 603 receives the computation result sent by the allocated GPU, the processor 601 may partition, sort, and merge the computation result. Partitioning places computation results that have the same key into the same group. For the grouped results, each group of results is sorted according to the key corresponding to that group. The computation results that share a key are then merged.

An embodiment of the present invention provides a data processing device. The data processing device obtains a pending task and at least one piece of pending data corresponding to the pending task, and allocates a GPU to the task. It converts the at least one piece of pending data into data of a data set type, parses that data, generates at least one data block from the parsed data, and sends the generated data blocks to the allocated GPU so that the GPU performs the computation. In this way, after obtaining a pending task and its corresponding pending data, the data processing device can allocate a GPU to the task and send the pending data to the allocated GPU, triggering the GPU to compute on the pending data, which improves data processing efficiency. Moreover, this process does not require the pending data corresponding to the pending task to be of a fixed-length data type, which improves system performance. No manual involvement by the user is needed at run time, which further improves data processing efficiency.

Through the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware, or of course by hardware alone, although in many cases the former is the preferred implementation. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, hard disk, or optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.

The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (19)

1. A data processing method, characterized in that the method comprises:
obtaining a pending task and at least one piece of pending data corresponding to the pending task;
allocating a graphics processing unit (GPU) to the pending task;
converting the at least one piece of pending data corresponding to the pending task into data of a data set type;
parsing the data of the data set type, and generating at least one data block from the parsed data;
sending the generated at least one data block to the allocated GPU, so that the GPU performs computation on the at least one data block according to the pending task.
2. The method according to claim 1, characterized in that, before allocating the GPU to the pending task, the method further comprises:
obtaining a preconfigured resource information table, wherein the resource information table records the number of GPUs and the usage information of the GPUs.
3. The method according to claim 2, characterized in that, after obtaining the resource information table, the method further comprises: determining the number of GPUs that the pending task requires;
and allocating the GPU to the pending task comprises:
when it is determined, according to the number of GPUs and the usage information of the GPUs in the resource information table, that the number of unused GPUs satisfies the number of GPUs that the pending task requires, allocating GPUs to the pending task.
4. The method according to claim 3, characterized in that
the resource information table also records the number of central processing units (CPUs) and the usage information of the CPUs;
after determining the number of GPUs that the pending task requires, the method further comprises:
when it is determined that the number of unused GPUs in the resource information table does not satisfy the number of GPUs that the pending task requires, allocating a CPU to the pending task.
5. The method according to any one of claims 1-4, characterized in that
converting the at least one piece of pending data corresponding to the pending task into data of a data set type comprises:
determining the data size of the data set type;
distributing, according to the data size of the data set type, the at least one piece of pending data into at least one data set, wherein the size of the pending data contained in a data set is not greater than the data size of the data set type.
6. The method according to claim 5, characterized in that distributing, according to the data size of the data set type, the at least one piece of pending data into at least one data set comprises:
when the data type of the at least one piece of pending data corresponding to the pending task is a variable-length data type, distributing the at least one piece of pending data into at least one data set according to the data size of the data set type, and recording the position information of the at least one piece of pending data within the at least one data set, so that the GPU obtains the pending data according to the position information, wherein the position information records where the variable-length pending data is located within the data set.
7. The method according to any one of claims 1-4, characterized in that parsing the data of the data set type and generating at least one data block from the parsed data comprises:
using a preset parsing function to convert the data format of the data of the data set type into the data format that the GPU requires for computation;
generating at least one data block from the format-converted data.
8. The method according to any one of claims 1-4, characterized in that sending the generated at least one data block to the allocated GPU comprises:
storing the generated at least one data block into the buffer of the allocated GPU.
9. The method according to any one of claims 1-4, characterized in that, after sending the generated at least one data block to the allocated GPU, the method further comprises:
receiving the computation result sent by the allocated GPU, and partitioning, sorting, and merging the computation result.
10. A data processing device, characterized by comprising:
an acquiring unit, configured to obtain a pending task and at least one piece of pending data corresponding to the pending task;
an allocation unit, configured to allocate a graphics processing unit (GPU) to the pending task;
a converting unit, configured to convert the at least one piece of pending data corresponding to the pending task into data of a data set type;
a parsing unit, configured to parse the data of the data set type converted by the converting unit and generate at least one data block from the parsed data;
a sending unit, configured to send the at least one data block generated by the parsing unit to the GPU allocated by the allocation unit, so that the GPU performs computation on the at least one data block according to the pending task.
11. The device according to claim 10, characterized in that
the acquiring unit is also configured to obtain a preconfigured resource information table, wherein the resource information table records the number of GPUs and the usage information of the GPUs.
12. The device according to claim 11, characterized in that the device further comprises:
a determination unit, configured to determine the number of GPUs that the pending task requires;
the allocation unit is specifically configured to allocate GPUs to the pending task when it determines, according to the number of GPUs and the usage information of the GPUs in the resource information table obtained by the acquiring unit, that the number of unused GPUs satisfies the number of GPUs required by the pending task as determined by the determination unit.
13. The device according to claim 12, characterized in that the resource information table also records the number of central processing units (CPUs) and the usage information of the CPUs;
the allocation unit is also configured to allocate a CPU to the pending task when it determines that the number of unused GPUs does not satisfy the number of GPUs required by the pending task as determined by the determination unit.
14. The device according to any one of claims 10-13, characterized in that
the converting unit is specifically configured to determine the size of the data of the data set type and, according to that size, distribute the at least one piece of pending data into at least one data set, wherein the size of the pending data contained in a data set is not greater than the size of the data of the data set type.
15. The device according to claim 14, characterized in that
the converting unit is specifically configured to: when the data type of the at least one piece of pending data corresponding to the pending task is a variable-length data type, distribute the at least one piece of pending data into at least one data set according to the size of the data of the data set type, and record the position information of the at least one piece of pending data within the at least one data set, so that the GPU obtains the pending data according to the position information, wherein the position information records where the variable-length pending data is located within the data set.
16. The device according to any one of claims 10-13, characterized in that
the parsing unit is specifically configured to use a preset parsing function to convert the data format of the data of the data set type into the data format that the GPU requires for computation;
and to generate at least one data block from the format-converted data.
17. The device according to any one of claims 10-13, characterized in that
the sending unit is specifically configured to send the generated at least one data block to the buffer of the allocated GPU.
18. The device according to any one of claims 10-13, characterized in that the device further comprises:
a receiving unit, configured to receive the computation result sent by the GPU;
a processing unit, configured to partition, sort, and merge the computation result.
19. A data processing device, characterized in that the device comprises a processor, a memory, a communication interface, and a bus, wherein the processor, the memory, and the communication interface communicate over the bus;
the memory is configured to store a program;
the processor is configured to execute the executable instructions stored in the memory;
the communication interface is configured to receive a pending task and at least one piece of pending data corresponding to the pending task;
when the data processing device runs, the processor runs the program to execute the following instructions:
obtaining a pending task and at least one piece of pending data corresponding to the pending task;
allocating a graphics processing unit (GPU) to the pending task;
converting the at least one piece of pending data corresponding to the pending task into data of a data set type;
parsing the data of the data set type, and generating at least one data block from the parsed data;
sending the generated at least one data block to the allocated GPU, so that the GPU performs computation on the at least one data block according to the pending task.
CN201410223152.8A 2014-05-23 2014-05-23 A kind of method and device of data processing CN105094981B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410223152.8A CN105094981B (en) 2014-05-23 2014-05-23 A kind of method and device of data processing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410223152.8A CN105094981B (en) 2014-05-23 2014-05-23 A kind of method and device of data processing
PCT/CN2015/079633 WO2015176689A1 (en) 2014-05-23 2015-05-23 Data processing method and device

Publications (2)

Publication Number Publication Date
CN105094981A CN105094981A (en) 2015-11-25
CN105094981B true CN105094981B (en) 2019-02-12

Family

ID=54553454

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410223152.8A CN105094981B (en) 2014-05-23 2014-05-23 A kind of method and device of data processing

Country Status (2)

Country Link
CN (1) CN105094981B (en)
WO (1) WO2015176689A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103009A (en) * 2016-02-23 2017-08-29 杭州海康威视数字技术股份有限公司 Method and device for data processing
CN107204998A (en) * 2016-03-16 2017-09-26 华为技术有限公司 Data processing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102662639A (en) * 2012-04-10 2012-09-12 南京航空航天大学 Mapreduce-based multi-GPU (Graphic Processing Unit) cooperative computing method
CN102708088A (en) * 2012-05-08 2012-10-03 北京理工大学 CPU/GPU (Central Processing Unit/ Graphic Processing Unit) cooperative processing method oriented to mass data high-performance computation
CN103699656A (en) * 2013-12-27 2014-04-02 同济大学 GPU-based mass-multimedia-data-oriented MapReduce platform

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7119810B2 (en) * 2003-12-05 2006-10-10 Siemens Medical Solutions Usa, Inc. Graphics processing unit for simulation or medical diagnostic imaging
US9104767B2 (en) * 2012-08-28 2015-08-11 Adobe Systems Incorporated Identifying web pages that are likely to guide browsing viewers to improve conversion rate

Also Published As

Publication number Publication date
CN105094981A (en) 2015-11-25
WO2015176689A1 (en) 2015-11-26

Similar Documents

Publication Publication Date Title
Jain et al. Design, implementation, and evaluation of the Linear Road benchmark on the stream processing core
KR101600129B1 (en) Application efficiency engine
Zheng et al. PreDatA–preparatory data analytics on peta-scale machines
US20060277295A1 (en) Monitoring system and monitoring method
US8954992B2 (en) Distributed and scaled-out network switch and packet processing
US8863138B2 (en) Application service performance in cloud computing
US20130081046A1 (en) Analysis of operator graph and dynamic reallocation of a resource to improve performance
US10048976B2 (en) Allocation of virtual machines to physical machines through dominant resource assisted heuristics
US9819731B1 (en) Distributing global values in a graph processing system
KR20160119193A (en) Mobile Cloud Services Architecture
JP2012530295A (en) Method, apparatus, and program for monitoring computer activities of a plurality of virtual devices
JP5313990B2 (en) Estimating the service resource consumption based on the response time
US8589923B2 (en) Preprovisioning virtual machines based on request frequency and current network configuration
Brightwell et al. An analysis of NIC resource usage for offloading MPI
Hegeman et al. Toward optimal bounds in the congested clique: Graph connectivity and MST
US20120078948A1 (en) Systems and methods for searching a cloud-based distributed storage resources using a set of expandable probes
US20130104135A1 (en) Data center operation
US9405589B2 (en) System and method of optimization of in-memory data grid placement
CN102725753B (en) Method and apparatus for optimizing data access, method and apparatus for optimizing data storage
US20140372611A1 (en) Assigning method, apparatus, and system
US8966000B2 (en) Aggregation and re-ordering of input/output requests for better performance in remote file systems
US9092266B2 (en) Scalable scheduling for distributed data processing
US20090187588A1 (en) Distributed indexing of file content
WO2011137815A1 (en) Method, message receiving parser and system for data access
US8910128B2 (en) Methods and apparatus for application performance and capacity analysis

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
GR01 Patent grant