WO2015176689A1 - A data processing method and apparatus - Google Patents


Info

Publication number
WO2015176689A1
Authority
WO
WIPO (PCT)
Prior art keywords: data, processed, gpu, task, type
Application number
PCT/CN2015/079633
Other languages
English (en)
French (fr)
Inventor
崔慧敏
杨文森
谢睿
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2015176689A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • the present invention relates to the field of communications, and in particular, to a data processing method and apparatus.
  • Hadoop is a software architecture capable of distributed processing of large amounts of data.
  • Hadoop includes a distributed file system. By using distributed storage for data storage, the data read and write speed is increased, and the storage capacity is also expanded.
  • the upper layer of the distributed file system is the MapReduce engine.
  • the MapReduce engine integrates the data in the distributed file system to ensure that data is analyzed and processed efficiently. Because of these outstanding advantages, Hadoop is widely used in many fields.
  • However, because of the limitations of the <key, value> programming interface provided by the existing Hadoop MapReduce engine, the engine cannot trigger an existing acceleration component to process data, so the powerful computing capability of the acceleration component cannot be utilized and the efficiency of processing data is not improved.
  • Embodiments of the present invention provide a data processing method and apparatus for improving data processing efficiency.
  • an embodiment of the present invention provides a data processing method, including: acquiring a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task; and allocating a graphics processor GPU for the to-be-processed task Converting the at least one to-be-processed data corresponding to the to-be-processed task into data of a data collection type; parsing the data in the data collection type, and generating at least one data block from the parsed data; The generated at least one data block is sent to the allocated GPU, so that the GPU performs a calculation process on the at least one data block according to the to-be-processed task.
  • Before the allocating a graphics processor GPU for the to-be-processed task, the method further includes: acquiring a pre-configured resource information table, where the resource information table is used to record the number of GPUs and the usage information of the GPUs.
  • After the acquiring a pre-configured resource information table, the method further includes: determining the number of GPUs required by the to-be-processed task; and when it is determined, according to the number of GPUs and the usage information of the GPUs in the resource information table, that the number of unused GPUs meets the number of GPUs required by the to-be-processed task, allocating a GPU for the to-be-processed task.
  • the resource information table is further used to record the number of central processing units CPUs and the usage information of the CPUs; after the determining the number of GPUs required by the to-be-processed task, the method further includes: when it is determined that the number of unused GPUs in the resource information table does not meet the number of GPUs required by the to-be-processed task, allocating a CPU for the to-be-processed task.
  • the converting the at least one to-be-processed data corresponding to the to-be-processed task into data of a data set type includes: determining a data size of the data set type; The data size of the data collection type, the at least one to-be-processed data is allocated to the at least one data set; the size of the to-be-processed data included in the data set is not greater than the data size of the data collection type.
  • the allocating the at least one to-be-processed data to the at least one data set according to the data size of the data set type includes: when the data type of the at least one to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocating the at least one to-be-processed data to the at least one data set according to the data size of the data set type, and recording location information of the at least one to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the location information; the location information is used to record location-related information of the variable-length to-be-processed data in the data set.
  • the parsing the data in the data set type and generating at least one data block from the parsed data includes: converting, by using a preset parsing function, the data format of the data in the data set type into the data format required by the GPU for performing calculation processing; and generating at least one data block from the data after the format conversion.
  • the sending the generated at least one data block to the allocated GPU includes: storing the generated at least one data block into a buffer of the allocated GPU.
  • the method further includes: receiving a calculation processing result sent by the allocated GPU, and performing partitioning, sorting, and merging processing on the calculation processing result.
  • In another aspect, an embodiment of the present invention provides a data processing apparatus, including: an acquiring unit, configured to acquire a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task; an allocating unit, configured to allocate a graphics processor GPU for the to-be-processed task; a converting unit, configured to convert the at least one to-be-processed data corresponding to the to-be-processed task into data of a data set type; a parsing unit, configured to parse the data of the data set type obtained by the converting unit, and generate at least one data block from the parsed data; and a sending unit, configured to send the at least one data block generated by the parsing unit to the GPU allocated by the allocating unit, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
  • the acquiring unit is further configured to obtain a pre-configured resource information table, where the resource information table is used to record the number of GPUs and the usage information of the GPU.
  • the device further includes: a determining unit, configured to determine a quantity of GPUs required for the to-be-processed task
  • the allocating unit is specifically configured to: when it is determined, according to the number of GPUs and the usage information of the GPUs in the resource information table acquired by the acquiring unit, that the number of unused GPUs meets the number of GPUs required by the to-be-processed task determined by the determining unit, allocate a GPU for the to-be-processed task.
  • the resource information table is further configured to record the number of CPUs of the central processing unit and the usage information of the CPU.
  • the allocating unit is further configured to allocate a CPU for the to-be-processed task when the number of unused GPUs does not meet the number of GPUs required by the to-be-processed task determined by the determining unit.
  • the converting unit is specifically configured to: determine the data size of the data set type; and allocate the at least one to-be-processed data to at least one data set according to the data size of the data set type, where the size of the to-be-processed data included in a data set is not greater than the data size of the data set type.
  • the converting unit is specifically configured to: when the data type of the at least one to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocate the at least one to-be-processed data to the at least one data set according to the data size of the data set type, and record location information of the at least one to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the location information; the location information is used to record location-related information of the variable-length to-be-processed data in the data set.
  • the parsing unit is specifically configured to: convert, by using a preset parsing function, the data format of the data in the data set type into the data format required by the GPU for performing calculation processing; and generate at least one data block from the data after the format conversion.
  • the sending unit is specifically configured to send the generated at least one data block to a buffer of the allocated GPU.
  • the device further includes: a receiving unit, configured to receive a calculation processing result sent by the GPU; a processing unit, configured to perform partitioning, sorting, and merging processing on the calculation processing result.
  • an embodiment of the present invention provides an apparatus for data processing, including: a processor, a memory, a communication interface, and a bus, wherein the processor, the memory, and the communication interface communicate through the bus
  • the memory is configured to store a program
  • the processor is configured to execute an execution instruction stored by the memory
  • the communication interface is configured to receive a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task;
  • the processor runs the program to execute the following instructions: acquiring a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task; allocating a graphics processor GPU for the to-be-processed task; converting the at least one to-be-processed data corresponding to the to-be-processed task into data of a data set type; parsing the data in the data set type, and generating at least one data block from the parsed data; and sending the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
  • An embodiment of the present invention provides a data processing method and device. A data processing device acquires a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task, and allocates a GPU for the to-be-processed task; converts the at least one to-be-processed data into data of a data set type; parses the data in the data set type and generates at least one data block from the parsed data; and sends the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing.
  • In this way, after the data processing device obtains the to-be-processed task and the corresponding at least one to-be-processed data, it can allocate a GPU for the to-be-processed task and send the to-be-processed data corresponding to the to-be-processed task to the allocated GPU, triggering the GPU to perform calculation processing on the to-be-processed data, which improves the efficiency of processing data.
  • FIG. 1 is a flowchart of a method for data processing according to an embodiment of the present invention
  • FIG. 2 is a flowchart of another method for data processing according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of functions of a device for data processing according to an embodiment of the present invention.
  • FIG. 4 is another schematic functional diagram of a device for data processing according to an embodiment of the present invention.
  • FIG. 5 is another schematic functional diagram of a device for data processing according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a device for data processing according to an embodiment of the present invention.
  • An embodiment of the present invention provides a data processing method. As shown in FIG. 1 , the method may include:
  • The to-be-processed task corresponding to an application may be obtained, and at least one to-be-processed data corresponding to the to-be-processed task may be obtained from the distributed file system according to the to-be-processed task.
  • the data processing device can be run in the Hadoop system.
  • the task to be processed can be obtained in the Hadoop system, and at least one corresponding data to be processed is obtained according to the distributed file system in the Hadoop system according to the to-be-processed task.
  • the data processing device can also be used in any other system that needs to send data to the GPU for calculation processing by the GPU, which is not limited by the present invention.
  • The data processing device may determine, according to the requirement of the to-be-processed task, whether the at least one to-be-processed data corresponding to the to-be-processed task is to be processed by a GPU (Graphics Processing Unit). If the data processing device determines that the to-be-processed task requires the GPU to process its corresponding at least one to-be-processed data, the data processing device may allocate a GPU for the to-be-processed task.
  • the GPU cannot exist as a separate component in the cluster system, and must be configured as an acceleration component on the data processing device. Therefore, management of the GPU computing resource must be implemented by a data processing device.
  • Specifically, the data size of the data set type may be determined, and the at least one to-be-processed data may be allocated to the at least one data set according to the data size of the data set type; the size of the to-be-processed data included in a data set is not greater than the data size of the data set type.
  • The data processing device needs to convert the at least one to-be-processed data from individual data items into a set of data, that is, into data of a data set (Data Set) type.
  • the at least one to-be-processed data is allocated to at least one data set according to the data size of the data collection type, so that subsequent processing can be performed in units of one data set.
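As an illustration of this packing step, the following sketch (the helper name and byte-string records are our assumptions; the patent does not prescribe an implementation) groups individual records into data sets whose total size never exceeds a fixed limit:

```python
def pack_into_data_sets(records, max_set_size):
    """Group individual records into data sets, each holding at most
    max_set_size bytes of record data (one record never exceeds the limit)."""
    data_sets, current, used = [], [], 0
    for rec in records:
        if len(rec) > max_set_size:
            raise ValueError("a single record exceeds the data set size")
        if used + len(rec) > max_set_size:   # current set is full: start a new one
            data_sets.append(current)
            current, used = [], 0
        current.append(rec)
        used += len(rec)
    if current:
        data_sets.append(current)
    return data_sets

# Example: pack byte records into 8-byte data sets
sets = pack_into_data_sets([b"ab", b"cdef", b"gh", b"ijklm"], 8)
```

Because each data set respects the size limit, subsequent parsing and transfer to the GPU can proceed one data set at a time, as the description states.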
  • The data type of the at least one to-be-processed data may be a fixed-length data type or a variable-length data type.
  • When the data type of the at least one to-be-processed data is a fixed-length data type, the at least one to-be-processed data may be directly allocated to the at least one data set according to the data size of the data set type. Since the size of each data item is constant, its position in the data set is also constant, so there is no need to record the positions of fixed-length to-be-processed data in the data set.
  • When the data type of the at least one to-be-processed data is a variable-length data type, the at least one to-be-processed data is allocated to the at least one data set according to the data size of the data set type, and location information of the at least one to-be-processed data in the at least one data set is recorded, so that the GPU acquires the to-be-processed data according to the location information.
  • The location information is used to record location-related information of the variable-length to-be-processed data in the data set.
  • That is, when the at least one to-be-processed data corresponding to the to-be-processed task is of a variable-length data type, the at least one to-be-processed data is allocated to the at least one data set according to the data size of the data set type, and the position information of each to-be-processed data in the data set is recorded, so that when the GPU performs data processing, it can obtain the complete to-be-processed data based on the location information.
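A minimal sketch of how such location information might look; the (offset, length) layout and the function names are assumptions, since the patent only requires that positions be recoverable:

```python
def pack_variable_length(records):
    """Concatenate variable-length records into one data set buffer and
    record (offset, length) location information for each record."""
    buffer = bytearray()
    locations = []                      # location info: (offset, length) per record
    for rec in records:
        locations.append((len(buffer), len(rec)))
        buffer.extend(rec)
    return bytes(buffer), locations

def read_record(buffer, locations, i):
    """What the consuming side would do: recover record i from its location info."""
    offset, length = locations[i]
    return buffer[offset:offset + length]

buf, locs = pack_variable_length([b"hadoop", b"gpu", b"mapreduce"])
# read_record(buf, locs, 1) recovers b"gpu"
```

With fixed-length records the offsets would be implied by the record size, which is why the description says no location information needs to be recorded in that case.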
  • the data set may be a cache data area.
  • the data processing device may store at least one to-be-processed data into the cache data area, so as to convert the at least one to-be-processed data corresponding to the to-be-processed task into data set type data.
  • After converting the at least one to-be-processed data into data of a data set type, the data processing device parses the data of the data set type, so that the data of the data set type can be converted into the data type required by the GPU for calculation processing, and at least one data block is generated from the parsed data.
  • Specifically, the data processing device converts the data format of the data in the data set type into the data format required by the GPU for performing calculation processing by using a preset parsing function, and generates at least one data block from the data after the format conversion.
  • That is, in units of data sets, the data processing device converts the data format of the data in each data set into the data format required by the GPU for calculation processing by using the preset parsing function, and generates at least one data block from the at least one data set after the format conversion.
  • The parsing function is determined according to the calculation performed by the GPU; that is, for different calculations performed by the GPU, the preset parsing functions are different.
  • For example, if the GPU performs a logical operation, the preset parsing function may convert the data format of the to-be-processed data into the data format required for the logical operation; for instance, the parsing function converts to-be-processed data whose data format is a text type or a binary type into data of an integer data type on which logical operations can be performed.
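As a hedged illustration of such a parsing function, using the text-to-integer conversion given as the example above (the function name and the contiguous int32 layout are our assumptions):

```python
import struct

def parse_for_gpu(data_set):
    """Preset parsing function: convert text-type records into 32-bit
    integers, the format assumed here to be required by the GPU kernel."""
    values = [int(rec) for rec in data_set]          # text -> integer
    # Pack into a contiguous little-endian int32 buffer, a typical GPU-friendly layout
    return struct.pack("<%di" % len(values), *values)

block = parse_for_gpu([b"17", b"42", b"-3"])
# struct.unpack("<3i", block) recovers (17, 42, -3)
```

A different GPU calculation (say, floating-point arithmetic) would use a different preset parsing function producing, for example, a float buffer instead.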
  • the data block is sent to the allocated GPU through the data interface of the GPU.
  • the data processing device may store the generated at least one data block into a buffer area of the allocated GPU.
  • An embodiment of the present invention provides a data processing method. A data processing device acquires a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task, and allocates a GPU for the to-be-processed task; converts the at least one to-be-processed data corresponding to the to-be-processed task into data of a data set type; parses the data in the data set type and generates at least one data block from the parsed data; and sends the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing.
  • In this way, after the data processing device obtains the to-be-processed task and the corresponding at least one to-be-processed data, it can allocate a GPU for the to-be-processed task and send the to-be-processed data corresponding to the to-be-processed task to the allocated GPU, triggering the GPU to perform calculation processing on the to-be-processed data, which improves the efficiency of processing data.
  • Moreover, the foregoing process does not require that the data format of the at least one to-be-processed data corresponding to the to-be-processed task be a fixed-length data type, which improves the performance of the system.
  • In addition, no manual user participation is required during the running process, which further improves the efficiency of processing data.
  • An embodiment of the present invention provides a data processing method, as shown in FIG. 2, including:
  • Step 201: The data processing device acquires a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task.
  • For details, refer to step 101; details are not described herein again.
  • Step 202: The data processing device acquires a pre-configured resource information table. The resource information table is used to record the number of GPUs and the usage information of the GPUs.
  • the data processing device when the data processing device obtains the resource information table for the first time, it can be obtained from the initial cluster file system. After the data processing device obtains the resource information table for the first time, the resource information table may be stored in the cache for later acquisition.
  • the resource information table is further used to record the number of CPUs of the central processing unit and the usage information of the CPU.
  • There is no strict order between step 201 and step 202: step 201 may be performed first, or step 202 may be performed first; only one order is shown in the figure.
  • Step 203: The data processing device determines the number of GPUs required by the to-be-processed task.
  • Specifically, the to-be-processed task carries the required resource information, and the data processing device can learn the number of GPUs required by the to-be-processed task from the resource information. Certainly, the to-be-processed task may also notify the data processing device of the required number of GPUs in other manners, which is not limited by the present invention.
  • Step 204: The data processing device determines whether to allocate a GPU for the to-be-processed task.
  • Specifically, the number of unused GPUs may be determined according to the resource information table, so that whether to allocate a GPU for the to-be-processed task may be determined.
  • The data processing device determines, according to the number of GPUs and the usage information of the GPUs in the resource information table, whether the number of unused GPUs meets the number of GPUs required by the to-be-processed task, thereby determining whether to allocate a GPU for the to-be-processed task.
  • If the requirement is met, the data processing device determines to allocate a GPU for the to-be-processed task; otherwise, it determines not to allocate a GPU for the to-be-processed task and may allocate a CPU for it.
  • Specifically, the data processing device can determine the number of unused GPUs according to the number of GPUs and the usage information of the GPUs in the resource information table, and compare the number of unused GPUs with the number of GPUs required by the to-be-processed task. When the number of unused GPUs is greater than or equal to the number of GPUs required by the to-be-processed task, the data processing device determines that the number of unused GPUs meets the requirement, and determines to allocate a GPU for the to-be-processed task.
  • When the number of unused GPUs is less than the number of GPUs required by the to-be-processed task, the data processing device determines that the requirement is not met; in this case, it determines not to allocate a GPU for the to-be-processed task, and may allocate a CPU for the to-be-processed task.
  • If it is determined that a GPU is allocated for the to-be-processed task, steps 205a and 206 to 209 are performed; if it is determined that no GPU is allocated for the to-be-processed task, step 205b is performed.
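The allocation decision described above, branching to step 205a or 205b, can be sketched as follows; the resource information table layout and the function names are our assumptions, not prescribed by the patent:

```python
def allocate(resource_table, gpus_needed):
    """Decide whether to run the task on GPUs or fall back to a CPU.

    resource_table is assumed to map each device id to a dict like
    {"type": "gpu" | "cpu", "in_use": bool}.
    Returns ("gpu", ids) when enough unused GPUs exist, else ("cpu", ids).
    """
    unused_gpus = [d for d, info in resource_table.items()
                   if info["type"] == "gpu" and not info["in_use"]]
    if len(unused_gpus) >= gpus_needed:          # step 205a: allocate GPUs
        chosen = unused_gpus[:gpus_needed]
        for d in chosen:                         # update usage information
            resource_table[d]["in_use"] = True
        return ("gpu", chosen)
    # step 205b: requirement not met, allocate a CPU instead
    cpus = [d for d, info in resource_table.items()
            if info["type"] == "cpu" and not info["in_use"]][:1]
    for d in cpus:
        resource_table[d]["in_use"] = True
    return ("cpu", cpus)

table = {"gpu0": {"type": "gpu", "in_use": False},
         "gpu1": {"type": "gpu", "in_use": True},
         "cpu0": {"type": "cpu", "in_use": False}}
# allocate(table, 1) assigns gpu0; a second call finds no free GPU and falls back to cpu0
```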
  • Step 205a: The data processing device allocates a GPU for the to-be-processed task.
  • Specifically, the data processing device may allocate GPUs according to the number of GPUs required by the to-be-processed task. For details, refer to step 102; details are not described herein again.
  • Step 205b: The data processing device allocates a CPU for the to-be-processed task.
  • Because a GPU cannot be allocated for the calculation, the data processing device may allocate a CPU for the to-be-processed task, and the CPU performs the corresponding calculation processing.
  • Step 206: The data processing device converts the at least one to-be-processed data corresponding to the to-be-processed task into data of a data set type.
  • For details, refer to step 103; details are not described herein again.
  • Step 207: The data processing device parses the data in the data set type, and generates at least one data block from the parsed data.
  • For details, refer to step 104; details are not described herein again.
  • Step 208: The data processing device sends the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
  • For details, refer to step 105; details are not described herein again.
  • Step 209: The data processing device receives the calculation processing result sent by the allocated GPU, and performs partitioning, sorting, and merging processing on the calculation processing result.
  • Specifically, after receiving the calculation processing result, the data processing device may perform partitioning, sorting, and merging processing on the calculation processing result: partitioning divides calculation results with the same key into the same group; sorting orders the groups according to their corresponding keys; and merging combines the calculation results of the same key.
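A minimal sketch of this partition, sort, and merge step on (key, value) results; this is a simplification of a MapReduce-style shuffle, and the choice of summing as the merge operation is our example, not the patent's:

```python
from collections import defaultdict

def partition_sort_merge(results):
    """Group (key, value) results by key, sort the groups by key, and
    merge each group's values (here: by summing, as an example)."""
    groups = defaultdict(list)
    for key, value in results:          # partition: same key -> same group
        groups[key].append(value)
    merged = []
    for key in sorted(groups):          # sort the groups by key
        merged.append((key, sum(groups[key])))   # merge values of one key
    return merged

# Word-count style example
out = partition_sort_merge([("b", 1), ("a", 2), ("b", 3), ("a", 1)])
# out == [("a", 3), ("b", 4)]
```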
  • An embodiment of the present invention provides a data processing method. A data processing device acquires a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task, and allocates a GPU for the to-be-processed task; converts the at least one to-be-processed data corresponding to the to-be-processed task into data of a data set type; parses the data in the data set type and generates at least one data block from the parsed data; and sends the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing.
  • In this way, after the data processing device obtains the to-be-processed task and the corresponding at least one to-be-processed data, it can allocate a GPU for the to-be-processed task and send the to-be-processed data corresponding to the to-be-processed task to the allocated GPU, triggering the GPU to perform calculation processing on the to-be-processed data, which improves the efficiency of processing data.
  • Moreover, the foregoing process does not require that the data format of the at least one to-be-processed data corresponding to the to-be-processed task be a fixed-length data type, which improves the performance of the system; and no manual user participation is required during the running process, which further improves the efficiency of processing data.
  • An embodiment of the present invention provides a device for data processing, as shown in FIG. 3, including:
  • An acquiring unit 301, configured to acquire a to-be-processed task and at least one to-be-processed data corresponding to the to-be-processed task.
  • the allocating unit 302 is configured to allocate a graphics processor GPU for the task to be processed.
  • the allocating unit 302 may determine, according to the requirement of the to-be-processed task, whether the corresponding at least one to-be-processed data of the to-be-processed task is processed by the GPU. If the to-be-processed task requires the GPU to process its corresponding at least one to-be-processed data, the allocating unit 302 may allocate a GPU for the to-be-processed task.
  • the converting unit 303 is configured to convert the at least one to-be-processed data corresponding to the to-be-processed task into data of a data set type.
  • the converting unit 303 is specifically configured to determine a size of the data of the data set type, and allocate the at least one to-be-processed data into the at least one data set according to the size of the data of the data set type.
  • the size of the data to be processed included in the data set is not greater than the size of the data of the data set type.
  • the converting unit 303 is specifically configured to: when the data type of the at least one to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocate the at least one to-be-processed data to the at least one data set according to the data size of the data set type, and record location information of the at least one to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the location information.
  • the location information is used to record location-related information of the variable-length type of data to be processed in the data set.
  • the converting unit 303 is further configured to: when the data type of the at least one to-be-processed data is a fixed-length data type, directly allocate the at least one to-be-processed data to the at least one data set according to the data size of the data set type. Since the size of each data item is constant, its position in the data set is also constant, so there is no need to record the positions of fixed-length to-be-processed data in the data set.
  • A parsing unit 304, configured to parse the data of the data set type obtained by the converting unit 303, and generate at least one data block from the parsed data.
  • the parsing unit 304 is specifically configured to: convert, by using a preset parsing function, the data format of the data in the data set type into the data format required by the GPU for performing calculation processing, and generate at least one data block from the data after the format conversion.
  • The sending unit 305 is configured to send the at least one data block generated by the parsing unit 304 to the GPU allocated by the allocating unit 302, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
  • the sending unit 305 is specifically configured to send the generated at least one data block to a buffer area of the allocated GPU.
  • the acquiring unit 301 is further configured to acquire a pre-configured resource information table.
  • the resource information table is used to record the number of GPUs and the usage information of the GPU.
  • the resource information table is further used to record the number of CPUs of the central processing unit and the usage information of the CPU.
  • the device for data processing as shown in FIG. 4, further includes:
  • The determining unit 306 is configured to determine the number of GPUs required by the to-be-processed task.
  • The allocating unit 302 is specifically configured to: when it is determined, according to the number of GPUs and the usage information of the GPUs in the resource information table acquired by the acquiring unit 301, that the number of unused GPUs meets the number of GPUs required by the to-be-processed task determined by the determining unit 306, allocate a GPU for the to-be-processed task.
  • The allocating unit 302 is further configured to allocate a CPU for the to-be-processed task when the number of unused GPUs does not meet the number of GPUs required by the to-be-processed task determined by the determining unit 306.
  • the device for data processing as shown in FIG. 5, further includes:
  • the receiving unit 307 is configured to receive a calculation processing result sent by the allocated GPU.
  • the processing unit 308 is configured to perform partitioning, sorting, and merging processing on the calculation processing result.
  • the processing unit 308 may partition, sort, and merge the calculation processing result as follows: partitioning divides calculation results with the same key into the same group; the grouped calculation results are then sorted according to the key corresponding to each group; and the calculation results with the same key are merged.
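The partition/sort/merge step described above can be sketched in a few lines. This is an illustrative sketch only; the merge operation (here, summation) is an assumption, since the patent does not specify how results sharing a key are combined:

```python
# Sketch of partition/sort/merge over <key, value> calculation results:
# same key -> same group (partition), groups ordered by key (sort),
# values of one key combined (merge).

from collections import defaultdict

def partition_sort_merge(results, merge=sum):
    groups = defaultdict(list)
    for key, value in results:            # partition: same key, same group
        groups[key].append(value)
    merged = {}
    for key in sorted(groups):            # sort groups by their key
        merged[key] = merge(groups[key])  # merge values of the same key
    return merged

gpu_results = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]
print(partition_sort_merge(gpu_results))  # {'a': 6, 'b': 4}
```

This mirrors the shuffle phase of a MapReduce pipeline, applied here to the results returned by the allocated GPU.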
  • the embodiment of the present invention provides a data processing apparatus. The data processing apparatus acquires a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task; it converts the at least one piece of to-be-processed data into data of a data set type, parses the data in the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU so that the GPU performs calculation processing.
  • in this way, after obtaining the to-be-processed task and its corresponding at least one piece of to-be-processed data, the data processing apparatus can allocate a GPU to the task and send the corresponding to-be-processed data to the allocated GPU, triggering the GPU to perform calculation processing on the data, which improves the efficiency of data processing.
  • moreover, the foregoing process does not require the data format of the at least one piece of to-be-processed data corresponding to the task to be an equal-length data type, which improves the performance of the system; and no manual user participation is required at runtime, which further improves the efficiency of data processing.
  • An embodiment of the present invention provides a data processing apparatus, as shown in FIG. 6, including: a processor 601, a memory 602, a communication interface 603, and a bus 604, wherein the processor 601, the memory 602, and the The communication interface 603 communicates over the bus 604.
  • the memory 602 is configured to store a program.
  • the processor 601 is configured to execute the execution instructions stored in the memory 602.
  • the communication interface 603 is configured to receive a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task.
  • when the data processing apparatus runs, the processor 601 runs the program to execute the following instructions:
  • the processor 601 is configured to acquire a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task.
  • the processor 601 is further configured to allocate a graphics processor GPU for the to-be-processed task.
  • the processor 601 may determine, according to the requirements of the to-be-processed task, whether the at least one piece of to-be-processed data corresponding to the task is to be processed by a GPU. If the task requires a GPU to process its corresponding data, the processor 601 can allocate a GPU to the task.
  • the processor 601 is further configured to convert the at least one to-be-processed data corresponding to the to-be-processed task into data of a data set type.
  • the processor 601 is specifically configured to determine a size of the data of the data set type, and allocate the at least one to-be-processed data into the at least one data set according to the size of the data of the data set type.
  • the size of the data to be processed included in the data set is not greater than the size of the data of the data set type.
  • the processor 601 is specifically configured to: when the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocate the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, and record the position information of the at least one piece of to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the position information.
  • the location information is used to record location-related information of the variable-length type of data to be processed in the data set.
  • the processor 601 is configured to: when the data type of the at least one piece of to-be-processed data is an equal-length data type, directly allocate the data into at least one data set according to the data size of the data set type. Since the size of each data item is fixed, its position in the data set is also fixed, so there is no need to record the positions of equal-length to-be-processed data in the data set.
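The distinction made above between equal-length and variable-length items can be sketched as follows. This is a hedged illustration under assumed names (`pack_variable_length`, byte-string items, a fixed data set capacity); the patent does not prescribe a concrete layout. Variable-length items get an (offset, length) position record so the GPU can recover each item; equal-length items would need no such records:

```python
# Sketch of packing variable-length items into data sets of bounded size,
# recording per-item (offset, length) position information. An item larger
# than set_size simply gets a set of its own in this sketch.

def pack_variable_length(items, set_size):
    data_sets, positions = [], []
    buf, offsets, off = bytearray(), [], 0
    for item in items:
        if off + len(item) > set_size and buf:   # current set is full
            data_sets.append(bytes(buf)); positions.append(offsets)
            buf, offsets, off = bytearray(), [], 0
        buf += item
        offsets.append((off, len(item)))         # position information
        off += len(item)
    if buf:                                       # flush the last set
        data_sets.append(bytes(buf)); positions.append(offsets)
    return data_sets, positions

sets, pos = pack_variable_length([b"ab", b"c", b"defg"], set_size=4)
print(sets)  # [b'abc', b'defg']
print(pos)   # [[(0, 2), (2, 1)], [(0, 4)]]
```

A consumer (here standing in for the GPU side) can slice `sets[i][off:off+length]` using the recorded positions to recover each original item.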
  • the processor 601 is further configured to parse the data in the data set type and generate at least one data block from the parsed data.
  • the processor 601 is specifically configured to convert, by using a preset parse function, the data format of the data in the data set type into the data format required by the GPU for calculation processing, and to generate at least one data block from the format-converted data.
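To illustrate this step (as a sketch only, not the patent's implementation), a preset parse function can convert each record of a data set into an assumed GPU-side format, after which the converted bytes are cut into data blocks. The little-endian int32 format and the block size are illustrative assumptions:

```python
# Sketch: a preset parse function converts each record into the format
# assumed to be required by the GPU (here little-endian int32), and the
# converted data is cut into at least one fixed-size data block.

import struct

def generate_blocks(data_set, parse_fn, block_size):
    converted = b"".join(parse_fn(r) for r in data_set)  # format conversion
    return [converted[i:i + block_size]                  # >= 1 data block
            for i in range(0, len(converted), block_size)]

parse_int32 = lambda record: struct.pack("<i", int(record))
blocks = generate_blocks(["1", "2", "3"], parse_int32, block_size=8)
print(len(blocks))                     # 2 — 12 bytes split into 8 + 4
print(len(blocks[0]), len(blocks[1]))  # 8 4
```

The resulting blocks are what would be handed to the allocated GPU's buffer area in the step described next.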
  • the processor 601 is further configured to send the generated at least one data block to the allocated GPU, so that the GPU performs a calculation process on the at least one data block according to the to-be-processed task.
  • the processor 601 is specifically configured to send the generated at least one data block to a buffer area of the allocated GPU.
  • the processor 601 is further configured to obtain a pre-configured resource information table.
  • the resource information table is used to record the number of GPUs and the usage information of the GPU.
  • the resource information table is further used to record the number of CPUs of the central processing unit and the usage information of the CPU.
  • the processor 601 is further configured to determine a usage quantity of the GPU required by the to-be-processed task.
  • the processor 601 is configured to allocate a graphics processor GPU to the to-be-processed task, specifically:
  • the processor 601 is specifically configured to allocate a GPU to the to-be-processed task when it determines, according to the number of GPUs and the GPU usage information in the resource information table, that the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task.
  • the processor 601 is further configured to allocate a CPU to the to-be-processed task when the number of unused GPUs does not satisfy the number of GPUs required by the to-be-processed task.
  • the communication interface 603 is further configured to receive a calculation processing result sent by the allocated GPU.
  • the processor 601 is further configured to partition, sort, and merge the calculation processing result.
  • the processor 601 may partition, sort, and merge the calculation processing result as follows: partitioning divides calculation results with the same key into the same group; the grouped calculation results are then sorted according to the key corresponding to each group; and the calculation results with the same key are merged.
  • the embodiment of the present invention provides a data processing apparatus. The data processing apparatus acquires a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task; it converts the at least one piece of to-be-processed data into data of a data set type, parses the data in the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU so that the GPU performs calculation processing.
  • in this way, after obtaining the to-be-processed task and its corresponding at least one piece of to-be-processed data, the data processing apparatus can allocate a GPU to the task and send the corresponding to-be-processed data to the allocated GPU, triggering the GPU to perform calculation processing on the data, which improves the efficiency of data processing.
  • moreover, the foregoing process does not require the data format of the at least one piece of to-be-processed data corresponding to the task to be an equal-length data type, which improves the performance of the system; and no manual user participation is required at runtime, which further improves the efficiency of data processing.
  • from the description of the foregoing embodiments, a person skilled in the art can clearly understand that the present invention may be implemented by software plus necessary general-purpose hardware, or certainly by hardware, but in many cases the former is the better implementation.
  • based on such an understanding, the technical solution of the present invention, in essence or the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a floppy disk, hard disk, or optical disc of a computer, and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Embodiments of the present invention disclose a data processing method and apparatus, relating to the field of communications, for improving the efficiency of data processing. The method includes: acquiring a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task; allocating a graphics processing unit GPU to the to-be-processed task; converting the at least one piece of to-be-processed data corresponding to the task into data of a data set type; parsing the data in the data set type and generating at least one data block from the parsed data; and sending the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task. The present invention is applicable to scenarios in which an accelerator component is triggered to process data.

Description

A data processing method and apparatus
This application claims priority to Chinese Patent Application No. 201410223152.8, filed with the Chinese Patent Office on May 23, 2014 and entitled "A data processing method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of communications, and in particular to a data processing method and apparatus.
Background
With the development of science and technology and the Internet, the amount of information in modern society is growing rapidly. A large amount of data accumulates in this information, part of which will be stored in or processed with the help of cloud platforms; Hadoop makes it possible to efficiently store, manage, and analyze the data stored in cloud platforms.
Hadoop is a software architecture capable of distributed processing of large amounts of data. At its bottom is a distributed file system; by adopting distributed storage, it increases the read/write speed of data and expands the storage capacity. Above the distributed file system is the MapReduce engine, which integrates the data in the distributed file system and thereby ensures efficient analysis and processing of the data. It is because of these outstanding advantages that Hadoop is widely used in many fields. However, in a cluster environment with accelerator components such as graphics processing units (GPUs), the limitations of the <key,value> programming interface that the existing Hadoop MapReduce engine provides to users prevent the engine from triggering the existing accelerator components to process data. The powerful computing capability of the accelerator components therefore cannot be exploited, and the efficiency of data processing cannot be improved.
Summary
Embodiments of the present invention provide a data processing method and apparatus, for improving the efficiency of data processing.
To achieve the above objective, the embodiments of the present invention adopt the following technical solutions:
According to a first aspect, an embodiment of the present invention provides a data processing method, including: acquiring a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task; allocating a graphics processing unit GPU to the to-be-processed task; converting the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type; parsing the data in the data set type, and generating at least one data block from the parsed data; and sending the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
In a first possible implementation of the first aspect, before the graphics processing unit GPU is allocated to the to-be-processed task, the method further includes: acquiring a pre-configured resource information table, where the resource information table is used to record the number of GPUs and the usage information of the GPUs.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, after the resource information table is acquired, the method further includes: determining the number of GPUs required by the to-be-processed task; and allocating the graphics processing unit GPU to the to-be-processed task includes: allocating a GPU to the to-be-processed task when it is determined, according to the number of GPUs and the GPU usage information in the resource information table, that the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the resource information table is further used to record the number of central processing units CPUs and the usage information of the CPUs; and after the number of GPUs required by the to-be-processed task is determined, the method further includes: allocating a CPU to the to-be-processed task when it is determined that the number of unused GPUs in the resource information table does not satisfy the number of GPUs required by the to-be-processed task.
With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, converting the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type includes: determining the data size of the data set type; and allocating the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, where the size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, allocating the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type includes: when the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocating the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, and recording the position information of the at least one piece of to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the position information, where the position information is used to record position-related information of the variable-length to-be-processed data in the data set.
With reference to the first aspect or any one of the first to fifth possible implementations of the first aspect, in a sixth possible implementation of the first aspect, parsing the data in the data set type and generating at least one data block from the parsed data includes: converting, by using a preset parse function, the data format of the data in the data set type into the data format required by the GPU for calculation processing; and generating at least one data block from the format-converted data.
With reference to the first aspect or any one of the first to sixth possible implementations of the first aspect, in a seventh possible implementation of the first aspect, sending the generated at least one data block to the allocated GPU includes: storing the generated at least one data block into a buffer area of the allocated GPU.
With reference to the first aspect or any one of the first to seventh possible implementations of the first aspect, in an eighth possible implementation of the first aspect, after the generated at least one data block is sent to the allocated GPU, the method further includes: receiving a calculation processing result sent by the allocated GPU, and partitioning, sorting, and merging the calculation processing result.
According to a second aspect, an embodiment of the present invention provides a data processing apparatus, including: an acquiring unit, configured to acquire a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task; an allocating unit, configured to allocate a graphics processing unit GPU to the to-be-processed task; a converting unit, configured to convert the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type; a parsing unit, configured to parse the data in the data set type converted by the converting unit and generate at least one data block from the parsed data; and a sending unit, configured to send the at least one data block generated by the parsing unit to the GPU allocated by the allocating unit, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
In a first possible implementation of the second aspect, the acquiring unit is further configured to acquire a pre-configured resource information table, where the resource information table is used to record the number of GPUs and the usage information of the GPUs.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the apparatus further includes: a determining unit, configured to determine the number of GPUs required by the to-be-processed task; and the allocating unit is specifically configured to allocate a GPU to the to-be-processed task when it determines, according to the number of GPUs and the GPU usage information in the resource information table acquired by the acquiring unit, that the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task as determined by the determining unit.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the resource information table is further used to record the number of central processing units CPUs and the usage information of the CPUs; and the allocating unit is further configured to allocate a CPU to the to-be-processed task when the number of unused GPUs does not satisfy the number of GPUs required by the to-be-processed task as determined by the determining unit.
With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the converting unit is specifically configured to: determine the data size of the data set type; and allocate the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, where the size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the converting unit is specifically configured to: when the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocate the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, and record the position information of the at least one piece of to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the position information, where the position information is used to record position-related information of the variable-length to-be-processed data in the data set.
With reference to the second aspect or any one of the first to fifth possible implementations of the second aspect, in a sixth possible implementation of the second aspect, the parsing unit is specifically configured to convert, by using a preset parse function, the data format of the data in the data set type into the data format required by the GPU for calculation processing, and to generate at least one data block from the format-converted data.
With reference to the second aspect or any one of the first to sixth possible implementations of the second aspect, in a seventh possible implementation of the second aspect, the sending unit is specifically configured to send the generated at least one data block to a buffer area of the allocated GPU.
With reference to the second aspect or any one of the first to seventh possible implementations of the second aspect, in an eighth possible implementation of the second aspect, the apparatus further includes: a receiving unit, configured to receive a calculation processing result sent by the GPU; and a processing unit, configured to partition, sort, and merge the calculation processing result.
According to a third aspect, an embodiment of the present invention provides a data processing apparatus, including: a processor, a memory, a communication interface, and a bus, where the processor, the memory, and the communication interface communicate over the bus; the memory is configured to store a program; the processor is configured to execute the execution instructions stored in the memory; and the communication interface is configured to receive a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task. When the data processing apparatus runs, the processor runs the program to execute the following instructions: acquiring a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task; allocating a graphics processing unit GPU to the to-be-processed task; converting the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type; parsing the data in the data set type, and generating at least one data block from the parsed data; and sending the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
Embodiments of the present invention provide a data processing method and apparatus. The data processing apparatus acquires a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task; it converts the at least one piece of to-be-processed data into data of a data set type, parses the data in the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU so that the GPU performs calculation processing. In this way, after obtaining the to-be-processed task and its corresponding to-be-processed data, the data processing apparatus can allocate a GPU to the task and send the corresponding to-be-processed data to the allocated GPU, triggering the GPU to perform calculation processing on the data, which improves the efficiency of data processing.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings needed for describing the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
FIG. 1 is a flowchart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of another data processing method according to an embodiment of the present invention;
FIG. 3 is a functional schematic diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 4 is a functional schematic diagram of another data processing apparatus according to an embodiment of the present invention;
FIG. 5 is a functional schematic diagram of another data processing apparatus according to an embodiment of the present invention;
FIG. 6 is a structural schematic diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a data processing method. As shown in FIG. 1, the method may include:
101. Acquire a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task.
Specifically, when executing an application, the data processing apparatus may acquire the to-be-processed task corresponding to the application, and acquire, from the distributed file system according to the task, the at least one piece of to-be-processed data corresponding to the task.
It should be noted that the data processing apparatus may run in a Hadoop system. In that case, when an application runs, the data processing apparatus may acquire the to-be-processed task in the Hadoop system and acquire its corresponding at least one piece of to-be-processed data from the distributed file system of the Hadoop system according to the task.
It should be noted that the data processing apparatus may also run in any other system that needs to send data to a GPU for calculation processing, which is not limited in the present invention.
102. Allocate a graphics processing unit GPU to the to-be-processed task.
Specifically, after acquiring the at least one piece of to-be-processed data of the to-be-processed task, the data processing apparatus may determine, according to the requirements of the task, whether the corresponding at least one piece of to-be-processed data is to be processed by a GPU (Graphics Processing Unit). If the task requires a GPU to process its corresponding at least one piece of to-be-processed data, the data processing apparatus may allocate a GPU to the task.
It should be noted that in a cluster system a GPU cannot exist as an independent component; it must be configured on the data processing apparatus as an accelerator component, so GPU computing resources must be managed through the data processing apparatus. There are thus two kinds of computing resources in the data processing apparatus: the CPU (Central Processing Unit) and the GPU.
103. Convert the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type.
Specifically, after allocating the GPU to the to-be-processed task, the data processing apparatus may determine the data size of the data set type, and allocate the at least one piece of to-be-processed data into at least one data set according to that size.
The size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.
That is, after the GPU has been allocated to the to-be-processed task, the data processing apparatus needs to convert the at least one piece of to-be-processed data from individual data items into a group of data, i.e., into data of a data set (Data Set) type. After the data size of the data set type is determined, the at least one piece of to-be-processed data is allocated into at least one data set according to that size, so that subsequent processing can be performed in units of data sets.
Further, the data type of the at least one piece of to-be-processed data may be an equal-length data type or a variable-length data type.
When the data type of the at least one piece of to-be-processed data is an equal-length data type, the data may be directly allocated into at least one data set according to the data size of the data set type. Since the size of each data item is fixed, its position in the data set is also fixed, so there is no need to record the positions of equal-length to-be-processed data in the data set.
When the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, the at least one piece of to-be-processed data is allocated into at least one data set according to the data size of the data set type, and the position information of the at least one piece of to-be-processed data in the at least one data set is recorded, so that the GPU acquires the to-be-processed data according to the position information.
The position information is used to record position-related information of the variable-length to-be-processed data in the data set.
That is, when the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, the sizes of the individual data items differ; therefore, when the data is allocated into the data sets according to the data size of the data set type, the position information of each piece of to-be-processed data in the data sets needs to be recorded, so that the GPU can acquire the complete to-be-processed data according to this position information during data processing.
Optionally, as an example, a data set may be a buffered data area. The data processing apparatus may store the at least one piece of to-be-processed data into the buffered data area, thereby converting the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type.
104. Parse the data in the data set type, and generate at least one data block from the parsed data.
Specifically, after converting the at least one piece of to-be-processed data into data of a data set type, the data processing apparatus parses the data of the data set type, so that it can be converted into the data type required by the GPU for calculation processing, and generates at least one data block from the parsed data of the data set type.
Further, the data processing apparatus converts, by using a preset parse function, the data format of the data in the data set type into the data format required by the GPU for calculation processing, and generates at least one data block from the format-converted data.
That is, in units of data sets, the data processing apparatus uses the preset parse function to convert the data type of the data in a data set into the data type required by the GPU for calculation processing, and generates at least one data block from the at least one format-converted data set.
It should be noted that when a user configures a to-be-processed task to be executed by the GPU, it is necessary to determine in advance what kind of computation the GPU will perform, and the parse function can be determined according to that computation. That is, for different computations performed by the GPU, the preset parse functions differ. For example, if the GPU is to perform logical operations on the to-be-processed data, the preset parse function may convert the data format of the to-be-processed data into the data format required for the logical operations; for instance, a parse function may convert to-be-processed data in a text or binary format into data of an integer type on which logical operations can be performed.
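The idea that the preset parse function depends on the computation the GPU will perform can be sketched as follows. This is an illustration only: the registry design, the computation names, and the concrete parse functions are assumptions made for the sketch, not details from the patent:

```python
# Sketch: one preset parse function per kind of GPU computation. A task
# destined for logical operations gets a text-to-integer parse function;
# a hypothetical text computation gets a simpler normalizing one.

PARSE_FUNCTIONS = {
    # logical operations need integer inputs, so text records are parsed;
    # base 0 accepts both decimal ("42") and hexadecimal ("0x1F") text
    "logic": lambda record: int(record, 0),
    # a text computation might only need whitespace stripped
    "text":  lambda record: record.strip(),
}

def prepare(records, computation):
    parse = PARSE_FUNCTIONS[computation]   # pick the preset parse function
    return [parse(r) for r in records]

print(prepare(["0x0F", "3"], "logic"))  # [15, 3]
print(prepare([" word \n"], "text"))    # ['word']
```

Registering one parse function per computation is one plausible way to realize the statement above that "for different computations performed by the GPU, the preset parse functions differ".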
105. Send the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
Specifically, after generating the at least one data block, the data processing apparatus sends the data block(s) to the allocated GPU through the data interface of the GPU.
Further, the data processing apparatus may store the generated at least one data block into the buffer area of the allocated GPU.
An embodiment of the present invention provides a data processing method. The data processing apparatus acquires a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task; it converts the at least one piece of to-be-processed data into data of a data set type, parses the data in the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU so that the GPU performs calculation processing. In this way, after obtaining the to-be-processed task and its corresponding to-be-processed data, the data processing apparatus can allocate a GPU to the task and send the corresponding to-be-processed data to the allocated GPU, triggering the GPU to perform calculation processing on the data, which improves the efficiency of data processing.
Further, the foregoing process does not require the data format of the at least one piece of to-be-processed data corresponding to the to-be-processed task to be an equal-length data type, which improves the performance of the system; and no manual user participation is required at runtime, which further improves the efficiency of data processing.
An embodiment of the present invention provides a data processing method. As shown in FIG. 2, the method includes:
201. The data processing apparatus acquires a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task.
Specifically, refer to step 101; details are not repeated here.
202. The data processing apparatus acquires a pre-configured resource information table.
The resource information table is used to record the number of GPUs and the usage information of the GPUs.
Specifically, when acquiring the resource information table for the first time, the data processing apparatus may acquire it from the initial cluster file system. After the first acquisition, the data processing apparatus may store the resource information table in a cache for later acquisition.
Further, the resource information table is further used to record the number of central processing units CPUs and the usage information of the CPUs.
It should be noted that the present invention does not limit the order of step 201 and step 202. Step 201 may be performed before step 202, step 202 may be performed before step 201, or steps 201 and 202 may be performed simultaneously; only one order is shown in the figure.
203. The data processing apparatus determines the number of GPUs required by the to-be-processed task.
Specifically, the acquired to-be-processed task carries the resource information it requires, from which the data processing apparatus can learn the number of GPUs the to-be-processed task requires.
It should be noted that the to-be-processed task may also notify the data processing apparatus of the required number of GPUs in other ways, which is not limited in the present invention.
204. The data processing apparatus determines whether to allocate a GPU to the to-be-processed task.
Specifically, after learning the number of GPUs required by the to-be-processed task, the data processing apparatus can determine the number of unused GPUs from the resource information table, and thereby determine whether to allocate a GPU to the task.
Further, the data processing apparatus determines, according to the number of GPUs and the GPU usage information in the resource information table, whether the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task, and thereby determines whether to allocate a GPU to the task. When the number of unused GPUs satisfies the required number, the data processing apparatus decides to allocate a GPU to the task.
When it is determined that the number of unused GPUs in the resource information table does not satisfy the number of GPUs required by the task, the data processing apparatus decides not to allocate a GPU and may allocate a CPU to the task.
That is, the data processing apparatus can determine the number of unused GPUs from the number of GPUs and the GPU usage information in the resource information table, and compare it with the number of GPUs required by the to-be-processed task. When the number of unused GPUs is greater than or equal to the required number, the apparatus determines that the requirement is satisfied and decides to allocate a GPU to the task. When the number of unused GPUs is smaller than the required number, the apparatus determines that the requirement is not satisfied; in this case it does not allocate a GPU and may allocate a CPU to the task instead.
It should be noted that the data processing apparatus proceeds differently depending on the result of the determination. If it decides to allocate a GPU to the task, steps 205a and 206-209 are performed; if it decides not to allocate a GPU, step 205b is performed.
205a. When it is determined that the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task, the data processing apparatus allocates a GPU to the task.
Specifically, the data processing apparatus may allocate GPUs according to the number of GPUs required by the task; refer to step 102, details are not repeated here.
205b. When it is determined that the number of unused GPUs in the resource information table does not satisfy the number of GPUs required by the to-be-processed task, the data processing apparatus allocates a CPU to the task.
Specifically, when the number of unused GPUs in the resource information table does not satisfy the required number, no GPU can be allocated for the computation, so the data processing apparatus may allocate a CPU to the task and perform the corresponding calculation processing with the CPU.
206. The data processing apparatus converts the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type.
Specifically, refer to step 103; details are not repeated here.
207. The data processing apparatus parses the data in the data set type and generates at least one data block from the parsed data.
Specifically, refer to step 104; details are not repeated here.
208. The data processing apparatus sends the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
Specifically, refer to step 105; details are not repeated here.
209. The data processing apparatus receives the calculation processing result sent by the allocated GPU, and partitions, sorts, and merges the calculation processing result.
Specifically, after receiving the calculation processing result sent by the allocated GPU, the data processing apparatus may partition, sort, and merge the result: partitioning divides calculation results with the same key into the same group; the grouped calculation results are then sorted according to the key corresponding to each group; and the calculation results with the same key are merged.
An embodiment of the present invention provides a data processing method. The data processing apparatus acquires a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task; it converts the at least one piece of to-be-processed data into data of a data set type, parses the data in the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU so that the GPU performs calculation processing. In this way, after obtaining the to-be-processed task and its corresponding to-be-processed data, the data processing apparatus can allocate a GPU to the task and send the corresponding to-be-processed data to the allocated GPU, triggering the GPU to perform calculation processing on the data, which improves the efficiency of data processing. Moreover, the foregoing process does not require the data format of the to-be-processed data to be an equal-length data type, which improves the performance of the system; and no manual user participation is required at runtime, which further improves the efficiency of data processing.
An embodiment of the present invention provides a data processing apparatus. As shown in FIG. 3, the apparatus includes:
an acquiring unit 301, configured to acquire a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task; and
an allocating unit 302, configured to allocate a graphics processing unit GPU to the to-be-processed task.
Specifically, the allocating unit 302 may determine, according to the requirements of the to-be-processed task, whether the at least one piece of to-be-processed data corresponding to the task is to be processed by a GPU. If the task requires a GPU to process its corresponding at least one piece of to-be-processed data, the allocating unit 302 may allocate a GPU to the task.
A converting unit 303 is configured to convert the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type.
Specifically, the converting unit 303 is configured to determine the data size of the data set type, and allocate the at least one piece of to-be-processed data into at least one data set according to that size.
The size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.
Further, the converting unit 303 is specifically configured to: when the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocate the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, and record the position information of the at least one piece of to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the position information.
The position information is used to record position-related information of the variable-length to-be-processed data in the data set.
The converting unit 303 is specifically configured to: when the data type of the at least one piece of to-be-processed data is an equal-length data type, directly allocate the data into at least one data set according to the data size of the data set type. Since the size of each data item is fixed, its position in the data set is also fixed, so there is no need to record the positions of equal-length to-be-processed data in the data set.
A parsing unit 304 is configured to parse the data in the data set type converted by the converting unit 303 and generate at least one data block from the parsed data.
Specifically, the parsing unit 304 is configured to convert, by using a preset parse function, the data format of the data in the data set type into the data format required by the GPU for calculation processing, and to generate at least one data block from the format-converted data.
A sending unit 305 is configured to send the at least one data block generated by the parsing unit 304 to the GPU allocated by the allocating unit, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
Specifically, the sending unit 305 is configured to send the generated at least one data block to the buffer area of the allocated GPU.
Further, the acquiring unit 301 is further configured to acquire a pre-configured resource information table.
The resource information table is used to record the number of GPUs and the usage information of the GPUs.
Further, the resource information table is further used to record the number of central processing units CPUs and the usage information of the CPUs.
As shown in FIG. 4, the data processing apparatus further includes:
a determining unit 306, configured to determine the number of GPUs required by the to-be-processed task.
In this case, the allocating unit 302 is specifically configured to allocate a GPU to the to-be-processed task when it determines, according to the number of GPUs and the GPU usage information in the resource information table acquired by the acquiring unit 301, that the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task as determined by the determining unit 306.
Further, the allocating unit 302 is further configured to allocate a CPU to the to-be-processed task when the number of unused GPUs does not satisfy the number of GPUs required by the to-be-processed task as determined by the determining unit 306.
Further, as shown in FIG. 5, the data processing apparatus further includes:
a receiving unit 307, configured to receive a calculation processing result sent by the allocated GPU; and
a processing unit 308, configured to partition, sort, and merge the calculation processing result.
Specifically, after the receiving unit 307 receives the calculation processing result sent by the allocated GPU, the processing unit 308 may partition, sort, and merge the result: partitioning divides calculation results with the same key into the same group; the grouped calculation results are then sorted according to the key corresponding to each group; and the calculation results with the same key are merged.
An embodiment of the present invention provides a data processing apparatus. The data processing apparatus acquires a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task; it converts the at least one piece of to-be-processed data into data of a data set type, parses the data in the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU so that the GPU performs calculation processing. In this way, after obtaining the to-be-processed task and its corresponding to-be-processed data, the data processing apparatus can allocate a GPU to the task and send the corresponding to-be-processed data to the allocated GPU, triggering the GPU to perform calculation processing on the data, which improves the efficiency of data processing. Moreover, the foregoing process does not require the data format of the to-be-processed data to be an equal-length data type, which improves the performance of the system; and no manual user participation is required at runtime, which further improves the efficiency of data processing.
An embodiment of the present invention provides a data processing apparatus. As shown in FIG. 6, the apparatus includes: a processor 601, a memory 602, a communication interface 603, and a bus 604, where the processor 601, the memory 602, and the communication interface 603 communicate over the bus 604.
The memory 602 is configured to store a program.
The processor 601 is configured to execute the execution instructions stored in the memory.
The communication interface 603 is configured to receive a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task.
When the data processing apparatus runs, the processor 601 runs the program to execute the following instructions:
The processor 601 is configured to acquire a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task.
The processor 601 is further configured to allocate a graphics processing unit GPU to the to-be-processed task.
Specifically, the processor 601 may determine, according to the requirements of the to-be-processed task, whether the at least one piece of to-be-processed data corresponding to the task is to be processed by a GPU. If the task requires a GPU to process its corresponding at least one piece of to-be-processed data, the processor 601 may allocate a GPU to the task.
The processor 601 is further configured to convert the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type.
Specifically, the processor 601 is configured to determine the data size of the data set type, and allocate the at least one piece of to-be-processed data into at least one data set according to that size.
The size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.
Further, the processor 601 is specifically configured to: when the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocate the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, and record the position information of the at least one piece of to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the position information.
The position information is used to record position-related information of the variable-length to-be-processed data in the data set.
The processor 601 is specifically configured to: when the data type of the at least one piece of to-be-processed data is an equal-length data type, directly allocate the data into at least one data set according to the data size of the data set type. Since the size of each data item is fixed, its position in the data set is also fixed, so there is no need to record the positions of equal-length to-be-processed data in the data set.
The processor 601 is further configured to parse the data in the data set type and generate at least one data block from the parsed data.
Specifically, the processor 601 is configured to convert, by using a preset parse function, the data format of the data in the data set type into the data format required by the GPU for calculation processing, and to generate at least one data block from the format-converted data.
The processor 601 is further configured to send the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
Specifically, the processor 601 is configured to send the generated at least one data block to the buffer area of the allocated GPU.
Further, the processor 601 is further configured to acquire a pre-configured resource information table.
The resource information table is used to record the number of GPUs and the usage information of the GPUs.
Further, the resource information table is further used to record the number of central processing units CPUs and the usage information of the CPUs.
The processor 601 is further configured to determine the number of GPUs required by the to-be-processed task.
In this case, that the processor 601 is configured to allocate a graphics processing unit GPU to the to-be-processed task is specifically:
The processor 601 is specifically configured to allocate a GPU to the to-be-processed task when it determines, according to the number of GPUs and the GPU usage information in the resource information table, that the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task.
Further, the processor 601 is further configured to allocate a CPU to the to-be-processed task when the number of unused GPUs does not satisfy the number of GPUs required by the to-be-processed task.
Further, the communication interface 603 is further configured to receive a calculation processing result sent by the allocated GPU.
The processor 601 is further configured to partition, sort, and merge the calculation processing result.
Specifically, after the communication interface 603 receives the calculation processing result sent by the allocated GPU, the processor 601 may partition, sort, and merge the result: partitioning divides calculation results with the same key into the same group; the grouped calculation results are then sorted according to the key corresponding to each group; and the calculation results with the same key are merged.
An embodiment of the present invention provides a data processing apparatus. The data processing apparatus acquires a to-be-processed task and at least one piece of to-be-processed data corresponding to the task, and allocates a GPU to the task; it converts the at least one piece of to-be-processed data into data of a data set type, parses the data in the data set type, generates at least one data block from the parsed data, and sends the generated at least one data block to the allocated GPU so that the GPU performs calculation processing. In this way, after obtaining the to-be-processed task and its corresponding to-be-processed data, the data processing apparatus can allocate a GPU to the task and send the corresponding to-be-processed data to the allocated GPU, triggering the GPU to perform calculation processing on the data, which improves the efficiency of data processing. Moreover, the foregoing process does not require the data format of the to-be-processed data to be an equal-length data type, which improves the performance of the system; and no manual user participation is required at runtime, which further improves the efficiency of data processing.
From the description of the foregoing embodiments, a person skilled in the art can clearly understand that the present invention may be implemented by software plus necessary general-purpose hardware, or certainly by hardware, but in many cases the former is the better implementation. Based on such an understanding, the technical solution of the present invention, in essence or the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a floppy disk, hard disk, or optical disc of a computer, and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
The foregoing is merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (19)

  1. A data processing method, comprising:
    acquiring a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task;
    allocating a graphics processing unit GPU to the to-be-processed task;
    converting the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type;
    parsing the data in the data set type, and generating at least one data block from the parsed data; and
    sending the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
  2. The method according to claim 1, further comprising, before allocating the graphics processing unit GPU to the to-be-processed task:
    acquiring a pre-configured resource information table, wherein the resource information table is used to record the number of GPUs and usage information of the GPUs.
  3. The method according to claim 2, further comprising, after acquiring the resource information table: determining the number of GPUs required by the to-be-processed task;
    wherein allocating the graphics processing unit GPU to the to-be-processed task comprises:
    allocating a GPU to the to-be-processed task when it is determined, according to the number of GPUs and the GPU usage information in the resource information table, that the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task.
  4. The method according to claim 3, wherein
    the resource information table is further used to record the number of central processing units CPUs and usage information of the CPUs; and
    the method further comprises, after determining the number of GPUs required by the to-be-processed task:
    allocating a CPU to the to-be-processed task when it is determined that the number of unused GPUs in the resource information table does not satisfy the number of GPUs required by the to-be-processed task.
  5. The method according to any one of claims 1-4, wherein
    converting the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type comprises:
    determining the data size of the data set type; and
    allocating the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, wherein the size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.
  6. The method according to claim 5, wherein allocating the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type comprises:
    when the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocating the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, and recording position information of the at least one piece of to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the position information, wherein the position information is used to record position-related information of the variable-length to-be-processed data in the data set.
  7. The method according to any one of claims 1-6, wherein parsing the data in the data set type and generating at least one data block from the parsed data comprises:
    converting, by using a preset parse function, the data format of the data in the data set type into the data format required by the GPU for calculation processing; and
    generating at least one data block from the format-converted data.
  8. The method according to any one of claims 1-7, wherein sending the generated at least one data block to the allocated GPU comprises:
    storing the generated at least one data block into a buffer area of the allocated GPU.
  9. The method according to any one of claims 1-8, further comprising, after sending the generated at least one data block to the allocated GPU:
    receiving a calculation processing result sent by the allocated GPU, and partitioning, sorting, and merging the calculation processing result.
  10. A data processing apparatus, comprising:
    an acquiring unit, configured to acquire a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task;
    an allocating unit, configured to allocate a graphics processing unit GPU to the to-be-processed task;
    a converting unit, configured to convert the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type;
    a parsing unit, configured to parse the data in the data set type converted by the converting unit, and generate at least one data block from the parsed data; and
    a sending unit, configured to send the at least one data block generated by the parsing unit to the GPU allocated by the allocating unit, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
  11. The apparatus according to claim 10, wherein
    the acquiring unit is further configured to acquire a pre-configured resource information table, wherein the resource information table is used to record the number of GPUs and usage information of the GPUs.
  12. The apparatus according to claim 11, further comprising:
    a determining unit, configured to determine the number of GPUs required by the to-be-processed task;
    wherein the allocating unit is specifically configured to allocate a GPU to the to-be-processed task when it determines, according to the number of GPUs and the GPU usage information in the resource information table acquired by the acquiring unit, that the number of unused GPUs satisfies the number of GPUs required by the to-be-processed task as determined by the determining unit.
  13. The apparatus according to claim 12, wherein the resource information table is further used to record the number of central processing units CPUs and usage information of the CPUs; and
    the allocating unit is further configured to allocate a CPU to the to-be-processed task when the number of unused GPUs does not satisfy the number of GPUs required by the to-be-processed task as determined by the determining unit.
  14. The apparatus according to any one of claims 10-13, wherein
    the converting unit is specifically configured to: determine the data size of the data set type; and allocate the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, wherein the size of the to-be-processed data contained in a data set is not greater than the data size of the data set type.
  15. The apparatus according to claim 14, wherein
    the converting unit is specifically configured to: when the data type of the at least one piece of to-be-processed data corresponding to the to-be-processed task is a variable-length data type, allocate the at least one piece of to-be-processed data into at least one data set according to the data size of the data set type, and record position information of the at least one piece of to-be-processed data in the at least one data set, so that the GPU acquires the to-be-processed data according to the position information, wherein the position information is used to record position-related information of the variable-length to-be-processed data in the data set.
  16. The apparatus according to any one of claims 10-15, wherein
    the parsing unit is specifically configured to: convert, by using a preset parse function, the data format of the data in the data set type into the data format required by the GPU for calculation processing; and
    generate at least one data block from the format-converted data.
  17. The apparatus according to any one of claims 10-16, wherein
    the sending unit is specifically configured to send the generated at least one data block to a buffer area of the allocated GPU.
  18. The apparatus according to any one of claims 10-17, further comprising:
    a receiving unit, configured to receive a calculation processing result sent by the GPU; and
    a processing unit, configured to partition, sort, and merge the calculation processing result.
  19. A data processing apparatus, comprising: a processor, a memory, a communication interface, and a bus, wherein the processor, the memory, and the communication interface communicate over the bus;
    the memory is configured to store a program;
    the processor is configured to execute execution instructions stored in the memory;
    the communication interface is configured to receive a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task; and
    when the data processing apparatus runs, the processor runs the program to execute the following instructions:
    acquiring a to-be-processed task and at least one piece of to-be-processed data corresponding to the to-be-processed task;
    allocating a graphics processing unit GPU to the to-be-processed task;
    converting the at least one piece of to-be-processed data corresponding to the to-be-processed task into data of a data set type;
    parsing the data in the data set type, and generating at least one data block from the parsed data; and
    sending the generated at least one data block to the allocated GPU, so that the GPU performs calculation processing on the at least one data block according to the to-be-processed task.
PCT/CN2015/079633 2014-05-23 2015-05-23 Data processing method and apparatus WO2015176689A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410223152.8A CN105094981B (zh) 2014-05-23 2014-05-23 Data processing method and apparatus
CN201410223152.8 2014-05-23

Publications (1)

Publication Number Publication Date
WO2015176689A1 true WO2015176689A1 (zh) 2015-11-26

Family

ID=54553454

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/079633 WO2015176689A1 (zh) 2014-05-23 2015-05-23 一种数据处理的方法及装置

Country Status (2)

Country Link
CN (1) CN105094981B (zh)
WO (1) WO2015176689A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108628817A (zh) * 2017-03-15 2018-10-09 腾讯科技(深圳)有限公司 Data processing method and apparatus
CN110930291A (zh) * 2019-11-15 2020-03-27 山东英信计算机技术有限公司 GPU video memory management control method and related apparatus

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103009B (zh) * 2016-02-23 2020-04-10 杭州海康威视数字技术股份有限公司 Data processing method and apparatus
CN107204998B (zh) * 2016-03-16 2020-04-28 华为技术有限公司 Method and apparatus for processing data
CN109359689B (zh) * 2018-10-19 2021-06-04 科大讯飞股份有限公司 Data recognition method and apparatus
CN110688223B (zh) * 2019-09-11 2022-07-29 深圳云天励飞技术有限公司 Data processing method and related product
CN110716805A (zh) * 2019-09-27 2020-01-21 上海依图网络科技有限公司 Task allocation method and apparatus for a graphics processor, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102662639A (zh) * 2012-04-10 2012-09-12 南京航空航天大学 MapReduce-based multi-GPU collaborative computing method
CN102708088A (zh) * 2012-05-08 2012-10-03 北京理工大学 CPU/GPU collaborative processing method for high-performance computing on massive data
US20140068407A1 (en) * 2012-08-28 2014-03-06 Adobe System Incorporated Identifying web pages that are likely to guide browsing viewers to improve conversion rate

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7119810B2 (en) * 2003-12-05 2006-10-10 Siemens Medical Solutions Usa, Inc. Graphics processing unit for simulation or medical diagnostic imaging
CN103699656A (zh) * 2013-12-27 2014-04-02 同济大学 GPU-based MapReduce platform for massive multimedia data

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102662639A (zh) * 2012-04-10 2012-09-12 南京航空航天大学 MapReduce-based multi-GPU collaborative computing method
CN102708088A (zh) * 2012-05-08 2012-10-03 北京理工大学 CPU/GPU collaborative processing method for high-performance computing on massive data
US20140068407A1 (en) * 2012-08-28 2014-03-06 Adobe System Incorporated Identifying web pages that are likely to guide browsing viewers to improve conversion rate

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108628817A (zh) * 2017-03-15 2018-10-09 腾讯科技(深圳)有限公司 Data processing method and apparatus
CN110930291A (zh) * 2019-11-15 2020-03-27 山东英信计算机技术有限公司 GPU video memory management control method and related apparatus
CN110930291B (zh) * 2019-11-15 2022-06-17 山东英信计算机技术有限公司 GPU video memory management control method and related apparatus

Also Published As

Publication number Publication date
CN105094981B (zh) 2019-02-12
CN105094981A (zh) 2015-11-25

Similar Documents

Publication Publication Date Title
WO2015176689A1 (zh) Data processing method and apparatus
US10831562B2 (en) Method and system for operating a data center by reducing an amount of data to be processed
KR101630749B1 (ko) 데이터센터 리소스 할당
US8782656B2 (en) Analysis of operator graph and dynamic reallocation of a resource to improve performance
US10366046B2 (en) Remote direct memory access-based method of transferring arrays of objects including garbage data
US20200174838A1 (en) Utilizing accelerators to accelerate data analytic workloads in disaggregated systems
US11036558B2 (en) Data processing
KR101656360B1 (ko) 자동 분산병렬 처리 하둡 시스템을 지원하는 클라우드 시스템
US20120221744A1 (en) Migrating Virtual Machines with Adaptive Compression
US20150081914A1 (en) Allocation of Resources Between Web Services in a Composite Service
US11321090B2 (en) Serializing and/or deserializing programs with serializable state
US9237079B2 (en) Increasing performance of a streaming application by running experimental permutations
CN110781159B (zh) Ceph目录文件信息读取方法、装置、服务器及存储介质
CN114598597B (zh) 多源日志解析方法、装置、计算机设备及介质
US10133713B2 (en) Domain specific representation of document text for accelerated natural language processing
CN111221888A (zh) 大数据分析系统及方法
US20180137045A1 (en) Automatic memory management using a memory management unit
WO2021249030A1 (zh) 随机数序列生成方法和随机数引擎
US9176910B2 (en) Sending a next request to a resource before a completion interrupt for a previous request
US20170115888A1 (en) Method and system for a common processing framework for memory device controllers
US10528400B2 (en) Detecting deadlock in a cluster environment using big data analytics
US10185718B1 (en) Index compression and decompression
CN106547603B (zh) 减少golang语言系统垃圾回收时间的方法和装置
WO2013153620A1 (ja) データ処理システム及びデータ処理方法
US20220171657A1 (en) Dynamic workload tuning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15795671

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15795671

Country of ref document: EP

Kind code of ref document: A1