CN115827256A - Task transmission scheduling management system for multi-core storage and computation integrated accelerator network - Google Patents


Info

Publication number
CN115827256A
CN115827256A (application CN202310127045.4A)
Authority
CN
China
Prior art keywords
task
data
accelerator
processing
monitoring
Prior art date
Legal status
Granted
Application number
CN202310127045.4A
Other languages
Chinese (zh)
Other versions
CN115827256B (en)
Inventor
李涛
熊大鹏
胡建伟
Current Assignee
Shanghai Yizhu Intelligent Technology Co ltd
Original Assignee
Shanghai Yizhu Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Yizhu Intelligent Technology Co ltd
Priority to CN202310127045.4A
Publication of CN115827256A
Application granted
Publication of CN115827256B
Status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Debugging And Monitoring (AREA)
  • Multi Processors (AREA)

Abstract

The invention belongs to the field of data processing and relates to multi-core accelerator technology. It is used to solve the problem that existing storage and computation integrated accelerators limit application performance by adopting a static hardware allocation method, and in particular provides a task transmission scheduling management system for a multi-core storage and computation integrated accelerator network. The system comprises a scheduling management platform that is communicatively connected to a task management module and to the accelerator network. The task management module is used for managing and analyzing task transmission processing: an application program is compiled into data-driven tasks, each task is given a unique characteristic value, and the task address space and external data are set dynamically. Through a real-time detection and dynamic task allocation algorithm, each node in the accelerator network forms a reconfigurable storage and computation integrated accelerator core that performs near-data computation on data, and each node comprises a reallocation module supporting real-time detection and scheduling to control the accelerator and its data.

Description

Task transmission scheduling management system for multi-core storage and computation integrated accelerator network
Technical Field
The invention belongs to the field of data processing, relates to a multi-core accelerator technology, and particularly relates to a task transmission scheduling management system for a multi-core storage and computation integrated accelerator network.
Background
To meet application requirements for low latency and simultaneous processing of large amounts of data, a conventional multi-core storage and computation integrated accelerator connects multiple cores to a shared memory to exchange instructions and data among the cores; the whole interaction process is controlled by a main processor connected to the accelerator and is completed by executing the data transmission and task execution instructions specified in the program design;
traditional scientific simulation applications generally divide data into equally sized blocks and perform independent data operations and iterations on them. Emerging high-performance computing applications, however, combine traditional scientific simulation with advanced data analysis and machine learning; their data are often stored in sparse data structures that are more difficult to organize into conventional partitionable layouts, which causes irregular fine-grained data access and a large amount of data interaction and transformation. Most existing storage and computation integrated accelerators adopt a static hardware allocation method, so hardware resources cannot be allocated effectively when the data structure changes, and application performance is limited.
in view of the above technical problem, the present application proposes a solution.
Disclosure of Invention
The invention aims to provide a task transmission scheduling management system for a multi-core storage and computation integrated accelerator network, which solves the problem that the static hardware allocation method adopted by existing storage and computation integrated accelerators limits application performance.
The technical problem to be solved by the invention is: how to provide a task transmission scheduling management system that achieves load balancing among the accelerators so as to maximize hardware utilization.
The purpose of the invention can be realized by the following technical scheme:
the task transmission scheduling management system for the multi-core storage and computation integrated accelerator network comprises a scheduling management platform, wherein the scheduling management platform is in communication connection with a task management module and the accelerator network;
the task management module is used for managing and analyzing the task transmission processing: compiling an application program into a data-driven task mode, providing a unique characteristic value for each task, and dynamically setting a task address space and external data according to a data access mode of the task;
each node of the accelerator network comprises a storage and computation integrated accelerator core, a reallocation module and a plurality of monitoring modules, wherein the storage and computation integrated accelerator core is used for performing near-data computation on data, and the nodes of the accelerator network are connected in a mesh topology to transmit data and tasks; a plurality of storage and computation integrated accelerator cores form an accelerator group, and the monitoring modules correspond one-to-one to the accelerator groups;
the monitoring module is used for monitoring and analyzing the hardware utilization rate of the accelerator group, obtaining the matching value and the priority value of the accelerator group, and sending the matching value and the priority value of the accelerator group to the reallocation module through the scheduling management platform;
the reallocation module is used for performing task reallocation processing on the accelerator groups.
As a preferred embodiment of the present invention, the feature value of the task is composed of a feature parameter a and a feature parameter B, where the feature parameter a is a data access mode of the task, and includes a data accessor mode, an active domain object mode, an object relationship mapping mode, and a layer mode; the characteristic parameter B is a data memory value of the task.
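As an illustrative sketch only (the class and enum names below are assumptions, not taken from the patent), the two-part characteristic value described above can be modeled as:

```python
from dataclasses import dataclass
from enum import Enum

class AccessMode(Enum):
    """Characteristic parameter A: the task's data access mode."""
    DATA_ACCESSOR = "data accessor"
    ACTIVE_DOMAIN_OBJECT = "active domain object"
    OBJECT_RELATIONAL_MAPPING = "object relational mapping"
    LAYER = "layer"

@dataclass(frozen=True)
class TaskFeature:
    """Unique characteristic value assigned to each data-driven task."""
    access_mode: AccessMode  # characteristic parameter A
    memory_bytes: int        # characteristic parameter B: data memory value

# Example: a task using the layer access mode with a 4 KiB data footprint.
task = TaskFeature(access_mode=AccessMode.LAYER, memory_bytes=4096)
```

The frozen dataclass keeps the characteristic value immutable, which matches its role as a fixed identifier attached to the task at compile time.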
As a preferred embodiment of the present invention, the specific process of the monitoring module monitoring and analyzing the hardware utilization of the accelerator group includes: setting a monitoring period, marking an accelerator group as a monitoring object, and acquiring historical processing data of the monitoring object within the monitoring period, wherein the historical processing data comprises the characteristic value of each processing task, duration data SC and utilization data LS; summing the utilization coefficients LY of all processing processes and averaging to obtain the utilization data LS of the monitoring object; and obtaining the application coefficient YY of the monitoring object when performing a processing task by numerical calculation on the duration data SC and the utilization data LS.
As a preferred embodiment of the present invention, the duration data SC is the process duration of the processing task performed by the monitoring object, and the acquisition process of the utilization data LS includes: dividing the process duration of the processing task into a plurality of processing time intervals, and acquiring processing data CL and memory data NC of the monitoring object in each processing time interval, wherein the processing data CL is the maximum processor utilization rate of the monitoring object within the interval and the memory data NC is the maximum memory utilization rate of the monitoring object within the interval; the utilization coefficient LY of the monitoring object in each processing time interval is obtained by numerical calculation on the processing data CL and the memory data NC.
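A minimal sketch of this interval analysis, using the formula LY = α1 × CL + α2 × NC given later in the description; the default coefficients 4.28 and 2.37 are the fitted values quoted there, and the function names are assumptions:

```python
def utilization_coefficient(cl, nc, alpha1=4.28, alpha2=2.37):
    """LY for one processing time interval: cl is the peak processor
    utilization and nc the peak memory utilization within that interval.
    The patent requires the proportional coefficients to satisfy a1 > a2 > 1."""
    assert alpha1 > alpha2 > 1
    return alpha1 * cl + alpha2 * nc

def utilization_data(intervals):
    """LS: the mean of the per-interval LY values over one processing task.
    `intervals` is a list of (CL, NC) pairs, one per processing time interval."""
    lys = [utilization_coefficient(cl, nc) for cl, nc in intervals]
    return sum(lys) / len(lys)
```

For example, a task whose two intervals peaked at (1.0, 1.0) and (0.0, 0.0) yields an LS of half of α1 + α2.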
As a preferred embodiment of the present invention, the process of acquiring the matching value of the monitoring object includes: marking the processing task with the largest application coefficient YY within the monitoring period as the matching task; marking the processing tasks whose characteristic parameter A (data access mode) is the same as that of the matching task as analysis tasks; forming a memory range from the maximum and minimum values of the characteristic parameter B of the analysis tasks and dividing the memory range into a plurality of memory intervals; and marking the memory interval that matches the characteristic parameter B of the matching task as the matching interval of the monitoring object, whereby the characteristic parameter A of the matching task and the matching interval together form the matching value of the monitoring object.
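The interval matching can be sketched as follows. The patent does not fix how many memory intervals the range is divided into, nor that they are equal-width, so both choices here are assumptions:

```python
def matching_interval(analysis_b_values, matching_b, n_intervals=4):
    """Split the memory range [min B, max B] of the analysis tasks into
    n_intervals equal-width memory intervals and return the (low, high)
    interval containing the matching task's characteristic parameter B."""
    low, high = min(analysis_b_values), max(analysis_b_values)
    width = (high - low) / n_intervals
    for i in range(n_intervals):
        a, b = low + i * width, low + (i + 1) * width
        if a <= matching_b <= b:  # shared boundaries fall into the earlier interval
            return (a, b)
    return (low, high)  # B outside the observed range: fall back to the whole range
```

The returned interval, together with the matching task's characteristic parameter A, would then form the matching value of the monitoring object.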
As a preferred embodiment of the present invention, the process of acquiring the priority value of the monitoring object includes: summing and averaging the application coefficients YY of the analysis tasks to obtain the application value corresponding to the characteristic parameter A of the matching task; obtaining the application values corresponding to the remaining characteristic parameters A of the monitoring object's processing tasks in the same way; and arranging the characteristic parameters A in descending order of application value to obtain the priority value of the monitoring object.
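A sketch of the priority ranking described above; the function name and input representation are assumptions:

```python
from collections import defaultdict

def priority_value(task_history):
    """task_history: (access_mode, YY) pairs from the monitoring object's
    historical processing tasks. The application value of each characteristic
    parameter A is the mean YY of its tasks; the priority value is the list
    of access modes sorted by descending application value."""
    by_mode = defaultdict(list)
    for mode, yy in task_history:
        by_mode[mode].append(yy)
    app_value = {mode: sum(v) / len(v) for mode, v in by_mode.items()}
    return sorted(app_value, key=app_value.get, reverse=True)
```

A mode whose tasks historically ran with higher application coefficients therefore ranks first when tasks are later assigned in mode II.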
As a preferred embodiment of the present invention, the specific process of the reallocation module performing task reallocation processing on the accelerator group includes: each storage and computation integrated node carries a control module to detect utilization and perform task allocation; task data space allocation is first performed in allocation mode I, while the hardware utilization rate is monitored in real time and the tasks with the highest and lowest utilization are marked; whether a to-be-determined node exists is then monitored, and if so, task data space allocation is performed in allocation mode II; if not, allocation mode I continues to be used for task data space allocation after the current task is completed.
As a preferred embodiment of the present invention, the allocation procedure of allocation mode I includes: acquiring the matching value of the accelerator group and judging whether the task list contains a task matching that value; if such a task exists, allocating the corresponding task to the accelerator group; if not, marking the node corresponding to the accelerator group as a to-be-determined node; the judgment basis is that the characteristic parameter A of the task is the same as the characteristic parameter A of the matching value and the characteristic parameter B lies within the memory interval;
the allocation process of the second allocation mode comprises the following steps: and obtaining a priority value of the accelerator group corresponding to the node to be determined, and screening a task from the task list and performing task allocation on the accelerator group according to the descending order of the priority value and the numerical value of the characteristic parameter B.
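Combining the two modes, a hedged sketch of the reallocation decision follows; the task representation and function names are assumptions, and the patent does not specify tie-breaking among equally ranked tasks:

```python
def allocate_mode_one(task_list, matching_value):
    """Mode I: return a task whose characteristic parameter A equals the
    matching value's A and whose parameter B lies in the matching interval;
    return None when no such task exists (the node becomes to-be-determined)."""
    mode, (low, high) = matching_value
    for task in task_list:
        if task["A"] == mode and low <= task["B"] <= high:
            return task
    return None

def allocate_mode_two(task_list, priority_modes):
    """Mode II: for a to-be-determined node, walk the access modes in
    descending priority order and pick the task with the largest parameter B."""
    for mode in priority_modes:
        candidates = [t for t in task_list if t["A"] == mode]
        if candidates:
            return max(candidates, key=lambda t: t["B"])
    return None

tasks = [{"A": "layer", "B": 450}, {"A": "orm", "B": 900}, {"A": "orm", "B": 100}]
```

Mode I enforces a strict match against the accelerator group's matching value, while mode II relaxes this to the group's ranked preferences so a pending node still receives work.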
The working method of the task transmission scheduling management system for the multi-core storage and computation integrated accelerator network comprises the following steps:
the method comprises the following steps: compiling an application program into a data-driven task mode, providing a unique characteristic value for each task, and dynamically setting a task address space and external data according to the data access mode of the task;
step two: monitoring and analyzing the hardware utilization rate of the accelerator group: setting a monitoring period, marking the accelerator group as a monitoring object, acquiring historical processing data of the monitoring object in the monitoring period, and performing data analysis to obtain a matching value and a priority value of the monitoring object;
step three: performing task reallocation processing on the accelerator group: and carrying out task data space allocation on each storage and computation integrated node by using an allocation mode I and an allocation mode II.
The invention has the following beneficial effects:
1. through a real-time detection and dynamic task allocation algorithm, each node in the accelerator network forms a reconfigurable storage and computation integrated accelerator core that performs near-data computation on data; each node comprises a reallocation module supporting real-time detection and scheduling to control the accelerator and its data; the nodes are connected in a mesh topology to transmit data and tasks, and task allocation and recombination are supported through the task address space mode and the task addressing transmission method;
2. the hardware utilization rate of the accelerator groups can be detected and analyzed through the monitoring module: the historical processing data of each accelerator group are comprehensively analyzed within a monitoring period to obtain its priority value and matching value, the historical processing tasks of the accelerator groups are screened, and the hardware utilization rate of the processing process is fed back, thereby ensuring the hardware utilization rate of each accelerator group during task processing;
3. the reallocation module can perform task reallocation processing analysis on the accelerator group through the combined application of allocation mode I and allocation mode II: allocation mode I ensures the task adaptation degree of the accelerator group through the characteristic parameter A and the memory interval, and allocation mode II allocates tasks according to the data access mode on the basis of allocation mode I.
Drawings
To illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of a system according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a method according to a second embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To meet application requirements for low latency and simultaneous processing of large amounts of data, a conventional multi-core storage and computation integrated accelerator connects multiple cores to a shared memory to exchange instructions and data among the cores; the whole interaction process is controlled by a main processor connected to the accelerator and is completed by executing the data transmission and task execution instructions specified in the program design. Meanwhile, to ensure the reliability of tasks, this method must strictly control the order of data processing and shared-memory access. Under these two constraints, the conventional multi-core storage and computation integrated accelerator adopts a compiler-controlled static task allocation method, so various tasks cannot be processed simultaneously; and when the data volume of a task changes, the static allocation method cannot effectively utilize the hardware resources on the accelerator, resulting in wasted hardware resources.
Example one
As shown in fig. 1, the task transmission scheduling management system for a multi-core storage-computation-integrated accelerator network includes a scheduling management platform, where the scheduling management platform is communicatively connected to a task management module and an accelerator network.
The task management module is used for managing and analyzing the task transmission processing: compiling an application program into a data-driven task mode, providing a unique characteristic value for each task, and dynamically setting a task address space and external data according to a data access mode of the task; the characteristic value of the task is composed of a characteristic parameter A and a characteristic parameter B, wherein the characteristic parameter A is a data access mode of the task and comprises a data accessor mode, an active domain object mode, an object relation mapping mode and a layer mode; the characteristic parameter B is a data memory value of the task.
Each node of the accelerator network comprises a storage and computation integrated accelerator core, a reallocation module and a plurality of monitoring modules; the storage and computation integrated accelerator core is used for performing near-data computation on data, and the nodes of the accelerator network are connected in a mesh topology to transmit data and tasks; a plurality of storage and computation integrated accelerator cores form an accelerator group, and the monitoring modules correspond one-to-one to the accelerator groups. Through a real-time detection and dynamic task allocation algorithm, each node in the accelerator network forms a reconfigurable storage and computation integrated accelerator core to perform near-data computation on data; the nodes are connected in a mesh topology to transmit data and tasks, and task allocation and recombination are supported through the task address space mode and the task addressing transmission method.
The monitoring module is used for monitoring and analyzing the hardware utilization rate of the accelerator group: a monitoring period is set, the accelerator group is marked as a monitoring object, and historical processing data of the monitoring object within the monitoring period are acquired, comprising the characteristic value of each processing task, duration data SC and utilization data LS. The duration data SC is the process duration of the processing task of the monitoring object, and the utilization data LS is acquired as follows: the process duration of a processing task is divided into a plurality of processing time intervals, and processing data CL and memory data NC of the monitoring object in each interval are acquired, wherein the processing data CL is the maximum processor utilization rate and the memory data NC is the maximum memory utilization rate of the monitoring object within the interval; the utilization coefficient LY of the monitoring object in the interval is obtained through the formula LY = α1 × CL + α2 × NC, wherein α1 and α2 are proportional coefficients satisfying α1 > α2 > 1; the utilization coefficients LY of all the processing intervals are summed and averaged to obtain the utilization data LS of the monitoring object. The application coefficient YY of the monitoring object when performing the processing task is obtained through the formula YY = (β1 × LS)/(β2 × SC), wherein β1 and β2 are proportional coefficients satisfying β1 > β2 > 1; the application coefficient reflects the overall processing efficiency of the monitoring object when performing the task, and a larger value indicates higher overall processing efficiency for the corresponding task. The processing task with the largest application coefficient YY within the monitoring period is marked as the matching task, and the processing tasks whose characteristic parameter A (data access mode) is the same as that of the matching task are marked as analysis tasks; the maximum and minimum values of the characteristic parameter B of the analysis tasks form a memory range, which is divided into a plurality of memory intervals; the memory interval matching the characteristic parameter B of the matching task is marked as the matching interval of the monitoring object, and the characteristic parameter A of the matching task together with the matching interval forms the matching value of the monitoring object. The application coefficients YY of the analysis tasks are summed and averaged to obtain the application value corresponding to the characteristic parameter A of the matching task; the application values corresponding to the remaining characteristic parameters A of the monitoring object's processing tasks are obtained in the same way, and the characteristic parameters A are arranged in descending order of application value to obtain the priority value of the monitoring object. The priority value and the matching value of the monitoring object are then sent to the reallocation module through the scheduling management platform. By detecting and analyzing the hardware utilization rate of the accelerator groups in this way, and comprehensively analyzing the historical processing data of each accelerator group within the monitoring period to obtain its priority value and matching value, the historical processing tasks of the accelerator groups are screened and the hardware utilization rate of the processing process is fed back, thereby ensuring the hardware utilization rate of each accelerator group during task processing.
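The application coefficient step above can be sketched directly from the formula. This is an illustrative sketch only: β1 and β2 are left as parameters because the patent only constrains β1 > β2 > 1, and the default values and the function name are assumptions.

```python
def application_coefficient(ls, sc, beta1=3.0, beta2=2.0):
    """YY = (beta1 * LS) / (beta2 * SC): the larger the value, the higher the
    monitoring object's overall processing efficiency for the task.
    ls is the utilization data LS; sc is the duration data SC."""
    assert beta1 > beta2 > 1  # constraint stated in the patent
    return (beta1 * ls) / (beta2 * sc)
```

Note that YY grows with the utilization data LS and shrinks with the duration SC, matching the stated relationship between efficiency and processing time.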
The reallocation module is used for performing task reallocation processing on the accelerator group: each storage and computation integrated node carries a control module to detect utilization and perform task allocation. Task data space allocation is first performed in allocation mode I, while the hardware utilization rate is monitored in real time and the tasks with the highest and lowest utilization are marked; whether a to-be-determined node exists is then monitored, and if so, task data space allocation is performed in allocation mode II; if not, allocation mode I continues to be used after the current task is completed. The allocation process of mode I comprises: the matching value of the accelerator group is acquired and the task list is checked for a task matching that value; if such a task exists, it is allocated to the accelerator group, and if not, the node corresponding to the accelerator group is marked as a to-be-determined node; the judgment basis is that the characteristic parameter A of the task is the same as the characteristic parameter A of the matching value and the characteristic parameter B lies within the memory interval. The allocation process of mode II comprises: the priority value of the accelerator group corresponding to the to-be-determined node is acquired, and a task is screened from the task list and allocated to the accelerator group in descending order of the priority value and of the numerical value of the characteristic parameter B. Task reallocation processing analysis is thus performed on the accelerator group; through the combined application of allocation modes I and II, mode I ensures the task adaptation degree of the accelerator group through the characteristic parameter A and the memory interval, while mode II allocates tasks according to the data access mode on the basis of mode I.
Example two
As shown in fig. 2, the task transmission scheduling management method for a multi-core storage and computation integrated accelerator network includes the following steps:
the method comprises the following steps: compiling an application program into a data-driven task mode, providing a unique characteristic value for each task, and dynamically setting a task address space and external data according to the data access mode of the task;
step two: monitoring and analyzing the hardware utilization rate of the accelerator group: setting a monitoring period, marking an accelerator group as a monitoring object, acquiring historical processing data of the monitoring object in the monitoring period, performing data analysis to obtain a matching value and a priority value of the monitoring object, and feeding back the hardware utilization rate in the processing process of the monitoring object through the matching value and the priority value;
step three: performing task reallocation processing on the accelerator group: each storage and computation integrated node carries a control module to detect utilization and perform task allocation, and task data space allocation is performed using allocation modes I and II, wherein mode I ensures the task adaptation degree of the accelerator group through the characteristic parameter A and the memory interval, and mode II allocates tasks according to the data access mode on the basis of mode I.
The task transmission scheduling management system for the multi-core storage and calculation integrated accelerator network compiles an application program into a data-driven task mode during working, provides a unique characteristic value for each task, and dynamically sets a task address space and external data according to the data access mode of the task; monitoring and analyzing the hardware utilization rate of the accelerator group: setting a monitoring period, marking the accelerator group as a monitoring object, acquiring historical processing data of the monitoring object in the monitoring period, and performing data analysis to obtain a matching value and a priority value of the monitoring object; performing task reallocation processing on the accelerator group: and carrying out task data space allocation on each storage and computation integrated node by using an allocation mode I and an allocation mode II.
The foregoing is merely exemplary and illustrative of the present invention and various modifications, additions and substitutions may be made by those skilled in the art to the specific embodiments described without departing from the scope of the invention as defined in the following claims.
The above formulas are obtained by collecting a large amount of data and performing software simulation, and the coefficients in the formulas are set by those skilled in the art according to the actual situation. Taking the formula LY = α1 × CL + α2 × NC as an example: those skilled in the art collect multiple groups of sample data and set a corresponding utilization coefficient for each group; the set coefficients and the collected sample data are substituted into the formula, any two instances form a system of linear equations in two unknowns, and the calculated coefficients are screened and averaged to obtain the values of α1 and α2, which are 4.28 and 2.37 respectively;
the size of the coefficient is a specific numerical value obtained by quantizing each parameter, so that the subsequent comparison is convenient, and the size of the coefficient depends on the number of sample data and the corresponding application coefficient preliminarily set by a person skilled in the art for each group of sample data; as long as the proportional relationship between the parameters and the quantized values is not affected, for example, the application coefficient is proportional to the value of the utilization data.
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (9)

1. The task transmission scheduling management system for the multi-core storage and computation integrated accelerator network is characterized by comprising a scheduling management platform, wherein the scheduling management platform is in communication connection with a task management module and the accelerator network;
the task management module is used for managing and analyzing task transmission and processing: compiling an application program into a data-driven task mode, providing a unique characteristic value for each task, and dynamically setting the task address space and external data according to the data access mode of each task;
each node of the accelerator network comprises a storage and computation integrated accelerator core, a reallocation module and a plurality of monitoring modules, wherein the storage and computation integrated accelerator core is used for performing near-data computation, and all nodes of the accelerator network are connected into a mesh topology to transmit data and tasks; a plurality of storage and computation integrated accelerator cores form an accelerator group, and the monitoring modules correspond to the accelerator groups one to one;
the monitoring module is used for monitoring and analyzing the hardware utilization rate of the accelerator group to obtain the matching value and the priority value of the accelerator group, and for sending the matching value and the priority value to the reallocation module through the scheduling management platform;
the reallocation module is used for performing task reallocation processing on the accelerator groups.
2. The task transmission scheduling management system for the multi-core storage and computation integrated accelerator network as claimed in claim 1, wherein the characteristic value of a task is composed of a characteristic parameter A and a characteristic parameter B, wherein the characteristic parameter A is the data access mode of the task and is one of a data accessor mode, an active domain object mode, an object-relational mapping mode and a layer mode, and the characteristic parameter B is the data memory value of the task.
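A minimal sketch of claim 2's two-part characteristic value in Python; the `Task` class, the mode labels, and the use of a byte count for parameter B are illustrative assumptions, not details specified by the claim:

```python
from dataclasses import dataclass

# Illustrative labels for the four data access modes listed in claim 2.
ACCESS_MODES = ("data_accessor", "active_domain_object",
                "object_relational_mapping", "layer")

@dataclass(frozen=True)
class Task:
    """A data-driven task carrying the two-part characteristic value."""
    name: str
    access_mode: str   # characteristic parameter A: one of ACCESS_MODES
    memory_value: int  # characteristic parameter B: the task's data memory value

    @property
    def characteristic_value(self):
        # The unique characteristic value is the pair (A, B).
        return (self.access_mode, self.memory_value)

t = Task("conv1", "data_accessor", 4096)
print(t.characteristic_value)  # ('data_accessor', 4096)
```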
3. The task transmission scheduling management system for the multi-core storage and computation integrated accelerator network as claimed in claim 2, wherein the specific process by which the monitoring module monitors and analyzes the hardware utilization rate of the accelerator group includes: setting a monitoring period, marking the accelerator group as a monitoring object, and acquiring historical processing data of the monitoring object within the monitoring period, wherein the historical processing data includes the characteristic value of each processing task, duration data SC and utilization data LS; summing the utilization coefficients LY of all processing processes and averaging them to obtain the utilization data LS of the monitoring object; and obtaining an application coefficient YY for each processing task of the monitoring object by numerical calculation on the duration data SC and the utilization data LS.
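Claim 3 states only that LS is the average of the utilization coefficients LY and that YY is obtained "by numerical calculation" on SC and LS; the sketch below assumes that calculation is a weighted sum, with the weights `a1` and `a2` being purely illustrative:

```python
def utilization_data(ly_values):
    """LS: mean of the utilization coefficients LY of all processing
    processes of the monitoring object in the monitoring period."""
    return sum(ly_values) / len(ly_values)

def application_coefficient(sc, ls, a1=0.4, a2=0.6):
    """YY: the claim does not give the formula; this weighted sum and
    the weights a1, a2 are assumptions for illustration only."""
    return a1 * sc + a2 * ls

ls = utilization_data([0.5, 0.7, 0.6])        # ~0.6
yy = application_coefficient(sc=2.0, ls=ls)    # ~1.16 under the assumed weights
```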
4. The task transmission scheduling management system for the multi-core storage and computation integrated accelerator network as claimed in claim 3, wherein the duration data SC is the process duration for which the monitoring object performs a processing task, and the process of obtaining the utilization data LS includes: dividing the process duration of the processing task into a plurality of processing time intervals, and acquiring processing data CL and memory data NC of the monitoring object in each processing time interval, wherein the processing data CL is the maximum processor utilization rate of the monitoring object in the processing time interval and the memory data NC is the maximum memory utilization rate of the monitoring object in the processing time interval; the utilization coefficient LY of the monitoring object in a processing time interval is obtained by numerical calculation on the processing data CL and the memory data NC.
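Claim 4's per-interval utilization coefficient LY can be sketched as follows; as the claim again leaves the "numerical calculation" unspecified, a weighted sum with illustrative weights `b1` and `b2` is assumed:

```python
def utilization_coefficient(cpu_samples, mem_samples, b1=0.5, b2=0.5):
    """LY for one processing time interval: CL is the maximum processor
    utilization and NC the maximum memory utilization observed in the
    interval; the weights b1, b2 are assumptions, not from the claim."""
    cl = max(cpu_samples)   # processing data CL
    nc = max(mem_samples)   # memory data NC
    return b1 * cl + b2 * nc

ly = utilization_coefficient([0.2, 0.8, 0.5], [0.3, 0.6])  # 0.5*0.8 + 0.5*0.6 ≈ 0.7
```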
5. The task transmission scheduling management system for the multi-core storage and computation integrated accelerator network as claimed in claim 4, wherein the process of acquiring the matching value of the monitoring object includes: marking the processing task with the maximum application coefficient YY in the monitoring period as the matching task; marking the processing tasks whose characteristic parameter A is the same data access mode as that of the matching task as analysis tasks; forming a memory range from the maximum and minimum values of the characteristic parameter B of the analysis tasks and dividing the memory range into a plurality of memory intervals; and marking the memory interval containing the characteristic parameter B of the matching task as the matching interval of the monitoring object, wherein the characteristic parameter A of the matching task and the matching interval together form the matching value of the monitoring object.
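Claim 5's matching value might be computed as sketched below; tasks are represented as `(access_mode, memory_value)` pairs for illustration, and the number of memory intervals `n_intervals` is an assumption, since the claim says only "a plurality":

```python
def matching_value(tasks, yy, n_intervals=4):
    """Matching value per claim 5. `tasks` is the list of processing
    tasks in the period as (access_mode, memory_value) pairs and `yy[i]`
    is the application coefficient of task i; n_intervals is assumed."""
    i_match = max(range(len(tasks)), key=lambda i: yy[i])  # matching task
    mode, mem = tasks[i_match]
    mems = [m for a, m in tasks if a == mode]              # analysis tasks
    lo, hi = min(mems), max(mems)                          # memory range
    width = (hi - lo) / n_intervals or 1                   # interval width
    k = min(int((mem - lo) / width), n_intervals - 1)      # matching interval index
    interval = (lo + k * width, lo + (k + 1) * width)
    return mode, interval

print(matching_value([("a", 10), ("a", 30), ("b", 20)], [0.9, 0.5, 0.7]))
# ('a', (10.0, 15.0))
```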
6. The task transmission scheduling management system for the multi-core storage and computation integrated accelerator network as claimed in claim 5, wherein the process of acquiring the priority value of the monitoring object includes: summing the application coefficients YY of the analysis tasks and averaging them to obtain the application value corresponding to the characteristic parameter A of the matching task; obtaining the application values corresponding to the remaining characteristic parameters A of the processing tasks of the monitoring object in the same way; and arranging the characteristic parameters A in descending order of application value to obtain the priority value of the monitoring object.
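A sketch of claim 6's priority value: for each data access mode (characteristic parameter A), the application coefficients YY of its tasks are averaged, and the modes are ranked in descending order of that average. The `(access_mode, memory_value)` task representation is illustrative only:

```python
from collections import defaultdict

def priority_value(tasks, yy):
    """Priority value of a monitoring object (claim 6): access modes
    ordered by the average application coefficient YY of their tasks,
    from largest to smallest."""
    totals = defaultdict(lambda: [0.0, 0])
    for (mode, _), y in zip(tasks, yy):
        totals[mode][0] += y
        totals[mode][1] += 1
    app = {m: s / n for m, (s, n) in totals.items()}   # application values
    return sorted(app, key=app.get, reverse=True)

order = priority_value([("a", 10), ("a", 30), ("b", 20)], [0.9, 0.5, 0.8])
# mode 'a' averages 0.7, mode 'b' averages 0.8, so order == ['b', 'a']
```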
7. The task transmission scheduling management system for the multi-core storage and computation integrated accelerator network as claimed in claim 6, wherein the specific process by which the reallocation module performs task reallocation on the accelerator groups includes: equipping each storage and computation integrated node with a control module to detect the utilization rate and perform task allocation; performing task data space allocation in allocation mode I, monitoring the hardware utilization rate in real time, and marking the tasks with the highest and lowest utilization rates; monitoring whether any pending node exists, and if so, performing task data space allocation in allocation mode II; if not, continuing to allocate task data space in allocation mode I after the current tasks are completed.
8. The task transmission scheduling management system for the multi-core storage and computation integrated accelerator network as claimed in claim 7, wherein the allocation process of allocation mode I includes: acquiring the matching value of an accelerator group and judging whether the task list contains a task that matches the matching value of the accelerator group; if so, allocating the corresponding task to the accelerator group; if not, marking the node corresponding to the accelerator group as a pending node; the judgment criteria are that the characteristic parameter A of the task is the same as the characteristic parameter A of the matching value and that the characteristic parameter B lies within the memory interval;
the allocation process of allocation mode II includes: obtaining the priority value of the accelerator group corresponding to the pending node, and screening tasks from the task list and allocating them to the accelerator group in descending order of priority value and of the numerical value of the characteristic parameter B.
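Under the same illustrative `(access_mode, memory_value)` task representation, the two allocation modes of claims 7 and 8 might be sketched as below; the matching value is assumed to be a `(mode, (low, high))` pair and the priority value a list of modes ranked best-first:

```python
def allocate_mode_one(task_list, matching_value):
    """Allocation mode I (claim 8): pick a task whose parameter A equals
    the group's matching mode and whose parameter B lies in the matching
    interval; None means the node becomes a pending node."""
    mode, (low, high) = matching_value
    for task in task_list:
        if task[0] == mode and low <= task[1] <= high:
            return task
    return None

def allocate_mode_two(task_list, priority_value):
    """Allocation mode II (claim 8): for a pending node, choose tasks in
    descending order of the group's priority ranking of parameter A and,
    within a mode, in descending order of parameter B."""
    rank = {m: i for i, m in enumerate(priority_value)}
    candidates = [t for t in task_list if t[0] in rank]
    if not candidates:
        return None
    return min(candidates, key=lambda t: (rank[t[0]], -t[1]))

print(allocate_mode_one([("a", 12), ("b", 20)], ("a", (10, 15))))       # ('a', 12)
print(allocate_mode_two([("a", 12), ("b", 20), ("b", 30)], ["b", "a"]))  # ('b', 30)
```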
9. A working method of the task transmission scheduling management system for the multi-core storage and computation integrated accelerator network according to any one of claims 1 to 8, characterized by comprising the following steps:
the method comprises the following steps: compiling an application program into a data-driven task mode, providing a unique characteristic value for each task, and dynamically setting a task address space and external data according to the data access mode of the task;
step two: monitoring and analyzing the hardware utilization rate of the accelerator group: setting a monitoring period, marking the accelerator group as a monitoring object, acquiring historical processing data of the monitoring object in the monitoring period, and performing data analysis to obtain a matching value and a priority value of the monitoring object;
step three: performing task reallocation on the accelerator groups: allocating task data space to each storage and computation integrated node using allocation mode I and allocation mode II.
CN202310127045.4A 2023-02-17 2023-02-17 Task transmission scheduling management system for multi-core memory and calculation integrated accelerator network Active CN115827256B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310127045.4A CN115827256B (en) 2023-02-17 2023-02-17 Task transmission scheduling management system for multi-core memory and calculation integrated accelerator network


Publications (2)

Publication Number Publication Date
CN115827256A true CN115827256A (en) 2023-03-21
CN115827256B CN115827256B (en) 2023-05-16

Family

ID=85521690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310127045.4A Active CN115827256B (en) 2023-02-17 2023-02-17 Task transmission scheduling management system for multi-core memory and calculation integrated accelerator network

Country Status (1)

Country Link
CN (1) CN115827256B (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030084088A1 (en) * 2001-10-31 2003-05-01 Shaffer Larry J. Dynamic allocation of processing tasks using variable performance hardware platforms
US7647335B1 (en) * 2005-08-30 2010-01-12 ATA SpA - Advanced Technology Assessment Computing system and methods for distributed generation and storage of complex relational data
CN101937370A (en) * 2010-08-16 2011-01-05 中国科学技术大学 Method and device supporting system-level resource distribution and task scheduling on FCMP (Flexible-core Chip Microprocessor)
US20140075439A1 (en) * 2012-06-08 2014-03-13 Huawei Technologies Co., Ltd. Virtualization management method and related apparatuses for managing hardware resources of a communication device
CN104035896A (en) * 2014-06-10 2014-09-10 复旦大学 Off-chip accelerator applicable to fusion memory of 2.5D (2.5 dimensional) multi-core system
CN104317658A (en) * 2014-10-17 2015-01-28 华中科技大学 MapReduce based load self-adaptive task scheduling method
CN104794100A (en) * 2015-05-06 2015-07-22 西安电子科技大学 Heterogeneous multi-core processing system based on on-chip network
CN115526725A (en) * 2022-11-24 2022-12-27 深圳市泰铼科技有限公司 Securities programmed trading risk analysis system based on big data analysis


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116414726A (en) * 2023-03-24 2023-07-11 苏州亿铸智能科技有限公司 Task dynamic allocation data parallel computing method based on memory and calculation integrated accelerator
CN116414726B (en) * 2023-03-24 2024-03-15 苏州亿铸智能科技有限公司 Task dynamic allocation data parallel computing method based on memory and calculation integrated accelerator
CN116049908A (en) * 2023-04-03 2023-05-02 北京数力聚科技有限公司 Multi-party privacy calculation method and system based on blockchain



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB02 Change of applicant information

Address after: Room 1911, Building 1, Jiazhaoye Yuefeng Building, No. 101 Tayuan Road, High tech Zone, Suzhou City, Jiangsu Province, 215011

Applicant after: Suzhou Yizhu Intelligent Technology Co.,Ltd.

Address before: 200120 building C, No. 888, Huanhu West 2nd Road, Lingang New Area, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai

Applicant before: Shanghai Yizhu Intelligent Technology Co.,Ltd.