WO2022111466A1 - Task scheduling method, control method, electronic device and computer-readable medium - Google Patents

Task scheduling method, control method, electronic device and computer-readable medium

Info

Publication number
WO2022111466A1
Authority
WO
WIPO (PCT)
Prior art keywords
core
cluster
core cluster
task
target
Prior art date
Application number
PCT/CN2021/132400
Other languages
English (en)
Chinese (zh)
Inventor
吴臻志
祝夭龙
Original Assignee
北京灵汐科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京灵汐科技有限公司 filed Critical 北京灵汐科技有限公司
Publication of WO2022111466A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807 System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G06F15/7825 Globally asynchronous, locally synchronous, e.g. network on chip
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • The present disclosure relates to the field of computer technology, and in particular to a control method for task scheduling, task scheduling methods, an electronic device, a computer-readable medium, and a computer program product.
  • a many-core system can be composed of at least one chip, each chip has multiple computing units, and the smallest computing unit in each chip that can be independently scheduled and has complete computing power is called a core.
  • multiple cores can work together, and each core can run program instructions independently, using parallel computing capabilities to speed up program execution and provide multitasking capabilities.
  • The present disclosure provides a control method for task scheduling based on a many-core system, task scheduling methods, an electronic device, a computer-readable medium, and a computer program product.
  • An embodiment of the present disclosure provides a control method for task scheduling, applied to a core of a many-core system, including: performing load detection on at least one core cluster of the many-core system, and determining whether a target core cluster exists among the detected core clusters, the target core cluster being a core cluster, among the at least one core cluster, whose task allocation needs to be adjusted; and, if the target core cluster exists, controlling the target core cluster to adjust its task allocation according to the load detection result; wherein the many-core system includes a plurality of cores, at least one of the cores forms a core cluster, and the many-core system includes at least one core cluster.
  • An embodiment of the present disclosure provides a task scheduling method, applied to a first control core of a first core cluster of a many-core system. The task scheduling method includes: performing load detection on the first core cluster and determining whether the first core cluster is a target core cluster, the target core cluster being a core cluster whose task allocation needs to be adjusted; and, if the first core cluster is the target core cluster, adjusting the task allocation of the first core cluster according to the load detection result of the first core cluster; wherein the many-core system includes a plurality of cores, at least one of the cores forms a core cluster, the many-core system includes at least one core cluster, and the first core cluster is one of the at least one core cluster.
  • An embodiment of the present disclosure provides a task scheduling method, applied to a core of a many-core system, including: sending request signaling for acquiring task information of a task processed by a target core cluster, so as to acquire the task information; forming, according to the acquired task information, a second core cluster for replacing the target core cluster, the core being the second control core of the second core cluster; and running, on the second core cluster, the task processed by the target core cluster; wherein the many-core system includes a plurality of cores, at least one of the cores forms a core cluster, the many-core system includes at least one core cluster, and the target core cluster is a core cluster whose task allocation needs to be adjusted.
  • An embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements at least one of the following methods: the control method for task scheduling described in the first aspect of the embodiments of the present disclosure; the task scheduling method described in the second aspect of the embodiments of the present disclosure; and the task scheduling method described in the third aspect of the embodiments of the present disclosure.
  • An embodiment of the present disclosure provides an electronic device, including: a plurality of cores; and a network-on-chip configured to exchange data among the plurality of cores and with the outside; wherein one or more of the cores store one or more instructions that are executed by one or more of the cores to enable the one or more cores to perform at least one of the following methods: the control method for task scheduling described in the first aspect of the embodiments of the present disclosure; the task scheduling method described in the second aspect of the embodiments of the present disclosure; and the task scheduling method described in the third aspect of the embodiments of the present disclosure.
  • An embodiment of the present disclosure provides a computer program product which, when run on a computer, causes the computer to execute at least one of the following methods: the control method for task scheduling described in the first aspect of the embodiments of the present disclosure; the task scheduling method described in the second aspect of the embodiments of the present disclosure; and the task scheduling method described in the third aspect of the embodiments of the present disclosure.
  • Load detection can be performed on each core cluster in the many-core system, and when a core cluster needs its task allocation adjusted, that core cluster is controlled to adjust the task allocation according to the load detection result, thereby improving both the flexibility of task processing in the many-core system and the utilization efficiency of its computing resources.
  • FIG. 1 is a flowchart of a control method for task scheduling in an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of a many-core system in an embodiment of the present disclosure
  • FIG. 3 is a flowchart of a task scheduling method in an embodiment of the present disclosure
  • FIG. 5 is a flowchart of a task scheduling method in an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of a task scheduling method in an embodiment of the present disclosure.
  • FIG. 7 is a block diagram of a core in an embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device in an embodiment of the present disclosure.
  • Static scheduling maps a certain part of the algorithm to the core of the chip, and each core only runs the mapped part of the algorithm, resulting in poor flexibility of task processing in many-core systems.
  • a core cluster is dynamically formed according to computing tasks.
  • A core cluster includes multiple cores, and the many-core system may have multiple core clusters, each of which performs a corresponding computing task.
  • each cluster of cores typically performs computational tasks in a pipelined fashion.
  • When the processing tasks change, for example on a sudden increase in service traffic, some core clusters may become overloaded.
  • the reduction of the processing efficiency of a certain core cluster will lead to a reduction of the processing efficiency of the entire many-core system.
  • FIG. 1 is a flowchart of a control method for task scheduling in an embodiment of the present disclosure. Referring to FIG. 1, the control method includes:
  • In step S110, load detection is performed on at least one core cluster of the many-core system, and it is determined whether a target core cluster exists among the detected core clusters, the target core cluster being a core cluster, among the at least one core cluster, whose task allocation needs to be adjusted;
  • In step S120, when the target core cluster exists, the target core cluster is controlled to adjust its task allocation according to the load detection result;
  • wherein the many-core system includes a plurality of cores, at least one of the cores forms a core cluster, and the many-core system includes at least one core cluster.
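  • The two steps above can be sketched as a single control pass. This is a minimal illustration, not the disclosed implementation: the names `LoadState`, `CoreCluster`, `control_pass`, `detect_load` and `adjust` are hypothetical, and the detection and adjustment policies are passed in as callbacks because the disclosure leaves them open.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class LoadState(Enum):
    """Possible load detection results (hypothetical names)."""
    OVERLOADED = "overloaded"
    EXCESS = "excess computing resources"
    MATCHED = "matched"


@dataclass
class CoreCluster:
    name: str
    core_ids: List[int]


def control_pass(
    clusters: List[CoreCluster],
    detect_load: Callable[[CoreCluster], LoadState],
    adjust: Callable[[CoreCluster, LoadState], None],
) -> List[str]:
    """Run one pass of S110 and S120; return the names of the target clusters."""
    # S110: detect the load of every cluster and pick the ones needing adjustment.
    results = {c.name: detect_load(c) for c in clusters}
    targets = [c for c in clusters if results[c.name] is not LoadState.MATCHED]
    # S120: control each target cluster to adjust its task allocation.
    for cluster in targets:
        adjust(cluster, results[cluster.name])
    return [c.name for c in targets]
```

  • Keeping detection and adjustment as callbacks lets the same pass be reused with any of the overload criteria described later.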
  • FIG. 2 is a schematic diagram of a many-core system in an embodiment of the present disclosure.
  • the many-core system includes a first core and a second core, and the first core and the second core have different control capabilities and different functions in the many-core system.
  • the first core may be the control core of the many-core system, which is used to receive instructions and tasks of the external system; control each core in the many-core system to perform processing tasks and the like.
  • a plurality of second cores are formed into a core cluster (as shown by the dotted box in FIG. 2 ). There may be multiple core clusters in a many-core system, and each core cluster performs a corresponding computing task.
  • The control core of a core cluster is used to receive instructions and tasks from the first core or other components (e.g., many-core system synchronizers, external devices, etc.), to split tasks, and to control each slave core in the core cluster to execute subtasks; the slave cores are used to execute the corresponding subtasks.
  • the present disclosure does not limit the specific functional classification of the cores and the types of tasks performed by each core.
  • By performing steps S110 to S120, the high-level control core in the many-core system performs load detection on each core cluster in the many-core system and, when there are core clusters that need to adjust their task allocation, controls those core clusters to adjust it.
  • The high-level control core that executes steps S110 to S120 may be the first core in the many-core system as shown in FIG. 2, or may be any core in the many-core system that is independent of each core cluster. This embodiment of the present disclosure does not limit this.
  • the load detection refers to determining whether the tasks processed by the core cluster match the computing resources of the core cluster.
  • The result of the load detection may be that the target core cluster is overloaded, in which case adjusting the task allocation in step S120 may mean increasing the number of cores in the target core cluster to increase its computing resources; the result may also be that the target core cluster has excess computing resources, in which case adjusting the task allocation in step S120 may mean reducing the number of cores in the target core cluster to save the computing resources of the many-core system and improve their utilization rate; the result may also be that the tasks processed by the core cluster match the computing resources of the core cluster. This embodiment of the present disclosure does not limit this.
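  • The three possible results map naturally onto three actions. The sketch below assumes cores are identified by integers and that idle cores sit in a shared pool; the function name `adjust_core_count`, the string labels and the `step` parameter are illustrative assumptions rather than anything the disclosure specifies.

```python
from typing import List, Tuple


def adjust_core_count(
    cluster_cores: List[int],
    idle_cores: List[int],
    state: str,
    step: int = 1,
) -> Tuple[List[int], List[int]]:
    """Return (new cluster cores, new idle pool) without mutating the inputs.

    state is one of "overloaded", "excess" or "matched" (hypothetical labels).
    """
    cluster = list(cluster_cores)
    idle = list(idle_cores)
    if state == "overloaded":
        # Grow the cluster with as many idle cores as are available, up to step.
        for _ in range(min(step, len(idle))):
            cluster.append(idle.pop())
    elif state == "excess":
        # Shrink the cluster but always keep at least one core in it.
        for _ in range(min(step, len(cluster) - 1)):
            idle.append(cluster.pop())
    # "matched": nothing to adjust.
    return cluster, idle
```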
  • A high-level control core performs load detection on each core cluster in the many-core system and, when a core cluster needs its task allocation adjusted, controls that core cluster to adjust it. Computing resources can thus be added dynamically to overloaded core clusters and reclaimed promptly from core clusters with excess computing resources. This improves the flexibility of task processing in the many-core system, reduces the cases in which the lowered processing efficiency of some core clusters lowers the overall processing efficiency of the many-core system, and at the same time improves the utilization efficiency of the computing resources of the many-core system.
  • When the high-level control core finds through load detection that a core cluster is overloaded or has excess computing resources, it determines that the core cluster needs to adjust its task allocation.
  • step S110 includes:
  • the core cluster with overload or excess computing resources is the target core cluster.
  • each core cluster processes tasks in a pipeline manner
  • multiple core clusters have a uniform synchronization period.
  • the synchronization period is determined by the runtime of the longest-running core cluster. The longer the run time, the higher the load on the core cluster.
  • it may be determined whether there is an overloaded core cluster according to the relationship between the running durations of multiple core clusters within a certain period of time.
  • the step of judging whether there is an overloaded core cluster in at least one of the core clusters of the many-core system includes:
  • the candidate target core cluster is the core cluster with the longest running time in N synchronization cycles, where N is a natural number greater than or equal to a predetermined number;
  • When the candidate target core cluster exists, it is determined that there is an overloaded core cluster, wherein the candidate target core cluster is the overloaded core cluster.
  • A predetermined period of time may include multiple synchronization cycles (the number is greater than or equal to N). If there is a candidate target core cluster that has the longest running time in N synchronization cycles within the predetermined time period, where N is a natural number greater than or equal to a predetermined number, it can be determined that there is an overloaded core cluster. Otherwise, the judgment can be continued after the end of the next predetermined time period.
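  • One way to read this criterion is as a count of how often each core cluster was the slowest. The sketch below assumes the running duration of every cluster is recorded per synchronization cycle; `find_candidate_target`, `durations` and `predetermined_number` are hypothetical names, and ties within a cycle are resolved in favour of the first cluster listed, which is an arbitrary choice.

```python
from collections import Counter
from typing import Dict, List, Optional


def find_candidate_target(
    durations: Dict[str, List[float]],
    predetermined_number: int,
) -> Optional[str]:
    """Return the cluster that ran longest in at least predetermined_number cycles.

    durations maps a cluster name to its running duration in each
    synchronization cycle of the predetermined time period; all lists are
    assumed to have the same length.
    """
    names = list(durations)
    if not names:
        return None
    num_cycles = len(durations[names[0]])
    slowest_count: Counter = Counter()
    for cycle in range(num_cycles):
        # The cluster with the longest running duration in this cycle.
        slowest = max(names, key=lambda name: durations[name][cycle])
        slowest_count[slowest] += 1
    if not slowest_count:
        return None
    name, count = slowest_count.most_common(1)[0]
    return name if count >= predetermined_number else None
```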
  • a predetermined synchronization period may be preset, and when the running duration of any one core cluster exceeds the predetermined synchronization period, it means that the core cluster is overloaded.
  • the step of judging whether there is an overloaded core cluster in at least one of the core clusters of the many-core system includes:
  • The predetermined synchronization period can be set as the time required for each core cluster to process the computing task of one phase during normal computing; when the predetermined synchronization period is reached, each core in the core cluster can switch to the next phase.
  • If the running duration of a phase of all or part of the cores in a core cluster exceeds the predetermined synchronization period, the core cluster may be considered to be overloaded.
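  • The threshold criterion is simpler to sketch. The snippet below assumes per-core phase durations are available for each cluster and treats a cluster as overloaded as soon as any of its cores exceeds the period, matching the "all or part of the cores" wording; `overloaded_clusters` and its argument names are hypothetical.

```python
from typing import Dict, List


def overloaded_clusters(
    phase_durations: Dict[str, Dict[int, float]],
    predetermined_period: float,
) -> List[str]:
    """Return clusters in which at least one core overran the period.

    phase_durations maps cluster name -> {core id: duration of the current phase}.
    """
    return [
        name
        for name, per_core in phase_durations.items()
        if any(d > predetermined_period for d in per_core.values())
    ]
```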
  • the present disclosure does not limit the specific value of the predetermined synchronization period.
  • multiple overloaded core clusters may also be determined from the core clusters of the many-core system, which is not limited in the present disclosure.
  • a solution for adjusting the task allocation of the target core cluster without stopping is provided.
  • step S120 includes:
  • an idle core can be dynamically applied for, a new core cluster (called the second core cluster) can be rebuilt, and the target core cluster can be replaced by the second core cluster;
  • While the second core cluster is being formed, there is no need to suspend the operation of the target core cluster.
  • the target core cluster may be disbanded, and the cores in the target core cluster become idle cores, thereby releasing computing resources.
  • The high-level control core determines the control core of the second core cluster (referred to as the second control core), and the second control core interacts with the high-level control core and the control core of the target core cluster to complete the formation of the second core cluster.
  • task information of tasks processed by the target core cluster may be stored in the control core of the target core cluster; task information of tasks processed by each core cluster may also be stored in the high-level control core. This embodiment of the present disclosure does not limit this.
  • the second control core may obtain task information of the task processed by the target core cluster from the control core of the target core cluster, and may also obtain task information of the task processed by the target core cluster from the high-level control core. This embodiment of the present disclosure also does not limit this.
  • In response to request signaling sent by the second control core to obtain the task information of the task processed by the target core cluster, the task information of the task processed by the target core cluster is transmitted to the second control core.
  • the task information of the task processed by the target core cluster is not limited.
  • the task information may include configuration information of each core in the second core cluster, and may also include information representing storage content of each core in the second core cluster.
  • the target core cluster can be replaced by the second core cluster, so as to realize the process of adjusting the task allocation of the target core cluster, thereby improving the efficiency of the adjustment.
  • FIG. 3 is a flowchart of a task scheduling method in an embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a task scheduling method, which is applied to a first control core of a first core cluster of a many-core system.
  • Task scheduling methods include:
  • step S210 load detection is performed on the first core cluster to determine whether the first core cluster is a target core cluster, and the target core cluster is a core cluster that needs to adjust task allocation;
  • step S220 when the first core cluster is the target core cluster, adjust the task allocation of the first core cluster according to the load detection result of the first core cluster;
  • the many-core system includes a plurality of cores, at least one of the cores constitutes the core cluster, the many-core system includes at least one of the core clusters, and the first core cluster is one of the at least one core cluster.
  • Any one core cluster (referred to as the first core cluster) in the many-core system performs load detection through its control core (referred to as the first control core) by executing steps S210 to S220.
  • When the first core cluster needs to adjust its task allocation, the task allocation of the first core cluster is adjusted. In this way, the load detection and task adjustment of each core cluster in the many-core system can be realized.
  • the load detection refers to determining whether the tasks processed by the core cluster match the computing resources of the core cluster.
  • The result of the load detection may be that the target core cluster is overloaded, in which case adjusting the task allocation in step S220 may mean increasing the number of cores in the target core cluster to increase its computing resources; the result may also be that the target core cluster has excess computing resources, in which case adjusting the task allocation in step S220 may mean reducing the number of cores in the target core cluster to save the computing resources of the many-core system and improve their utilization rate; the result may also be that the tasks processed by the core cluster match the computing resources of the core cluster. This embodiment of the present disclosure does not limit this.
  • The control core of a core cluster in the many-core system detects the load of the core cluster and adjusts the task allocation of the core cluster when needed, so that computing resources can be added dynamically when the core cluster is overloaded and released promptly when the core cluster has excess computing resources. This improves the flexibility of task processing in the many-core system, reduces the cases in which the lowered processing efficiency of some core clusters lowers the overall processing efficiency of the many-core system, and at the same time improves the utilization efficiency of computing resources in the many-core system.
  • When the first control core finds through load detection that the first core cluster is overloaded or has excess computing resources, it determines that the first core cluster is the target core cluster and that its task allocation needs to be adjusted.
  • step S210 includes:
  • When the first core cluster is overloaded or has excess computing resources, it is determined that the first core cluster is the target core cluster.
  • the first core cluster may be considered overloaded; conversely, if the time required by each core in the first core cluster to process the computing task of a phase is significantly lower than the average value of the cores in each core cluster, it can be considered that the first core cluster has excess computing resources.
  • If the first core cluster is overloaded or has excess computing resources, it may be determined that the first core cluster is the target core cluster and that its task allocation needs to be adjusted. It should be understood that those skilled in the art can set specific judgment conditions for overload or excess computing resources according to actual situations, which are not limited in the present disclosure.
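  • Since the specific judgment conditions are left to the implementer, the following is only one possible reading of an average-based comparison. The thresholds `high` and `low` (multiples of the system-wide average phase time) and the name `classify_first_cluster` are assumptions made for illustration; the disclosure does not give numeric values for "significantly".

```python
from statistics import mean
from typing import List


def classify_first_cluster(
    first_cluster_times: List[float],
    all_core_times: List[float],
    high: float = 1.5,
    low: float = 0.5,
) -> str:
    """Compare per-core phase times of the first cluster with the overall average.

    Returns "overloaded", "excess" or "matched" (hypothetical labels).
    """
    average = mean(all_core_times)
    if all(t > high * average for t in first_cluster_times):
        return "overloaded"
    if all(t < low * average for t in first_cluster_times):
        return "excess"
    return "matched"
```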
  • the synchronization period is determined by the runtime of the core cluster with the largest runtime. The longer the runtime, the higher the load on the core cluster. As an optional implementation manner, it may be determined whether the target core cluster is overloaded according to the relationship between the running duration of the target core cluster and the running durations of other core clusters within a certain period of time.
  • the step of determining whether the first core cluster is overloaded includes:
  • According to the running durations of multiple synchronization cycles in the predetermined time period, determining the number of synchronization cycles in which the first core cluster has the longest running duration among the at least one core cluster;
  • The predetermined time period may include multiple synchronization cycles (the number is greater than or equal to N), and the running durations of the first core cluster in the multiple synchronization cycles of the predetermined time period may be determined first;
  • For each synchronization cycle, the running duration of the first core cluster is sorted together with the running durations of the other core clusters in that cycle to determine which core cluster has the longest running duration. If the number N of synchronization cycles in which the first core cluster has the longest running duration exceeds a predetermined value, it can be determined that the first core cluster is overloaded.
  • a predetermined synchronization period may be preset, and when the running duration of the target core cluster exceeds the predetermined synchronization period, it means that the target core cluster is overloaded.
  • the step of determining whether the first core cluster is overloaded includes:
  • When the running time of the first core cluster exceeds the predetermined synchronization period, it is determined that the first core cluster is overloaded.
  • The predetermined synchronization period can be set as the time required for each core cluster to process the computing task of one phase during normal computing; when the predetermined synchronization period is reached, each core in the core cluster can switch to the next phase.
  • If the running duration of a phase of all or part of the cores in the first core cluster exceeds the predetermined synchronization period, the first core cluster may be considered to be overloaded.
  • the present disclosure does not limit the specific value of the predetermined synchronization period.
  • a solution for adjusting the task allocation of the target core cluster without stopping is provided.
  • The control core of the target core cluster determines the control core of the new core cluster, and the control core of the new core cluster interacts with the control core of the target core cluster to complete the formation of the new core cluster.
  • FIG. 4 is a flowchart of a task scheduling method in an embodiment of the present disclosure.
  • step S220 the step of adjusting the task allocation of the first core cluster according to the load detection result of the first core cluster includes:
  • step S221 apply for an idle core as the second control core of the second core cluster that replaces the first core cluster
  • step S222 in response to the request signaling of the second control core to acquire the task information of the task processed by the first core cluster, transmit the task information of the task processed by the first core cluster to the second control core;
  • step S223 in response to the request signaling of the second control core to obtain the original data, obtain the original data;
  • step S224 transmitting the original data to the second control core
  • step S225 the received input and output information of the second core cluster is added to the pipeline composed of multiple core clusters
  • step S226 start signaling is sent to the second control core.
  • the first control core can dynamically apply for an idle core in step S221 as the second control core of the second core cluster that replaces the first core cluster.
  • The second control core may send request signaling to the first control core to request the task information of the task processed by the first core cluster; when the first control core receives the request signaling, it may, in step S222, transmit the task information of the task processed by the first core cluster to the second control core in response to the request signaling.
  • the task information of the task processed by the target core cluster is not limited.
  • the task information may include configuration information of each core in the second core cluster, and may also include information representing the storage content of each core in the second core cluster.
  • the second control core may form a second core cluster to replace the first core cluster. For example, apply to the many-core system for multiple idle cores according to task information; configure tasks for multiple idle cores.
  • the second control core may send request signaling to the first control core for requesting raw data of the task processed by the first core cluster.
  • When the first control core receives the request signaling, it obtains the original data in step S223 in response to the request signaling, and transmits the original data to the second control core in step S224.
  • the step of acquiring the original data by the first control core includes: searching for data from each core in the first core cluster according to the original compilation information when the tasks of the first core cluster are configured, and reorganizing them into the raw data.
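  • The disclosure does not say what the compilation information looks like, so the following sketch assumes a purely hypothetical layout: each record names a data block and says which core holds a slice of it, at which offset and of what length. Under that assumption, gathering and reorganizing the raw data could look like this.

```python
from typing import Dict, List, Tuple

# Hypothetical record: (data block name, core id, offset in core memory, length)
CompileRecord = Tuple[str, int, int, int]


def collect_raw_data(
    compile_info: List[CompileRecord],
    core_memory: Dict[int, List[int]],
) -> Dict[str, List[int]]:
    """Read each slice from the core that holds it and rebuild the original blocks.

    Records for the same block are assumed to be listed in their original order.
    """
    raw: Dict[str, List[int]] = {}
    for name, core_id, offset, length in compile_info:
        chunk = core_memory[core_id][offset:offset + length]
        raw.setdefault(name, []).extend(chunk)
    return raw
```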
  • the second control core after receiving the raw data, allocates the raw data to each core in the second core cluster, so that the second core cluster can process the tasks processed by the first core cluster. Further, the second control core may determine the input and output routes of the second core cluster, and send the input and output information of the second core cluster to the first control core.
  • the first control core may add the received input and output information of the second core cluster to a pipeline composed of multiple core clusters in step S225.
  • The first control core adds the second core cluster to the pipeline by substituting the input and output information of the second core cluster into the pipeline composed of multiple core clusters, so that the second core cluster enters the normal computing task flow.
  • the first control core may send a start signaling to the second control core in step S226, so that the second core cluster starts to replace the first core cluster to process computing tasks.
  • step S220 further includes:
  • After receiving the message, sent by the second control core, that the second core cluster has been started, the first core cluster is disbanded.
  • the first core cluster can be disbanded, so that the cores in the first core cluster become idle cores, thereby releasing computing resources.
  • The second core cluster can replace the first core cluster to process computing tasks, which realizes the adjustment of the task allocation of the first core cluster and thereby improves both the flexibility of task processing in the many-core system and the utilization efficiency of its computing resources.
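  • The whole replacement sequence (steps S221 to S226 followed by disbanding the first core cluster) can be simulated in a few lines. This is a simplified single-threaded model, not the signaling protocol itself: `ManyCoreSystem`, `replace_without_stopping`, the `num_cores` field of the task information and the round-robin data placement are all assumptions, and the sketch presumes enough idle cores are available.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ManyCoreSystem:
    idle: List[int]                       # pool of idle core ids
    pipeline: List[str]                   # ordered names of clusters in the pipeline
    clusters: Dict[str, List[int]] = field(default_factory=dict)


def replace_without_stopping(
    system: ManyCoreSystem,
    old: str,
    new: str,
    task_info: Dict[str, int],
    raw_data: List[int],
) -> Tuple[Dict[int, List[int]], List[str]]:
    """Replace cluster `old` by a freshly formed cluster `new`; return data placement and step log."""
    log: List[str] = []
    # S221: apply for an idle core to act as the second control core.
    control = system.idle.pop(0)
    log.append("S221")
    # S222: task information is handed over; the second control core forms its cluster.
    members = [control] + [system.idle.pop(0) for _ in range(task_info["num_cores"] - 1)]
    system.clusters[new] = members
    log.append("S222")
    # S223/S224: raw data is gathered and transmitted, then spread over the new cores.
    placement = {core: raw_data[i::len(members)] for i, core in enumerate(members)}
    log += ["S223", "S224"]
    # S225: the new cluster takes the old cluster's place in the pipeline.
    system.pipeline[system.pipeline.index(old)] = new
    log.append("S225")
    # S226: start signaling; once started, the old cluster is disbanded and its cores freed.
    log.append("S226")
    system.idle.extend(system.clusters.pop(old))
    return placement, log
```

  • Note that the old cluster stays in `system.clusters` until the very last line, which mirrors the point that the target core cluster keeps running while its replacement is being formed.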
  • a solution is provided for adjusting the task allocation of the first core cluster by shutting down. For example, when it is necessary to adjust the task allocation of the first core cluster, dynamically apply for an idle core as the core of the first core cluster, and perform task configuration.
  • step S220 includes: applying for an idle core as a core to which the first core cluster belongs; and re-assigning tasks to the cores of the first core cluster.
  • the first control core can send a request for adding cores to the high-level control core of the many-core system, so that the high-level control core allocates a new idle core to the first core cluster as the core of the first core cluster.
  • the first control core may set the number of cores to be added according to actual conditions, which is not limited in the present disclosure.
  • the first control core may reconfigure tasks for each core of the first core cluster and, after the task configuration is completed, control each core of the first core cluster to process computing tasks, thereby adjusting the task allocation of the first core cluster.
  • the present disclosure does not limit the specific manner of task configuration.
  • a global notification mechanism is included; for example, invalid state information and valid state information are broadcast to all cores in the many-core system and to the personal computer (PC) terminal.
  • FIG. 5 is a flowchart of a task scheduling method in an embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a task scheduling method, which is applied to the core of a many-core system, and the method includes:
  • in step S310, request signaling for acquiring task information of the task processed by the target core cluster is sent, so as to obtain the task information;
  • in step S320, a second core cluster for replacing the target core cluster is formed according to the acquired task information, the core being the second control core of the second core cluster;
  • in step S330, the task processed by the target core cluster is executed on the second core cluster;
  • wherein the many-core system includes a plurality of cores, at least one of the cores constitutes the core cluster, the many-core system includes at least one of the core clusters, and the target core cluster is the core cluster whose task allocation needs to be adjusted.
  • a solution for adjusting task allocation of a target core cluster without stopping the system is provided. For example, when it is necessary to adjust the task allocation of the target core cluster, idle cores are dynamically applied for, a second core cluster is established, and the target core cluster is replaced with the second core cluster; while the second core cluster is being formed, the operation of the target core cluster does not need to be suspended.
  • after the second core cluster replaces the target core cluster, the target core cluster may be disbanded, and the cores in the target core cluster become idle cores, thereby releasing computing resources.
  • the second control core of the second core cluster may be determined by the control core of the target core cluster, or the second control core of the second core cluster may be determined by the high-level control core. This embodiment of the present disclosure does not limit this.
  • the second control core interacts with the control core or high-level control core of the target core cluster through steps S310 to S330 to complete the formation of the second core cluster.
  • the second control core may send, to the high-level control core, request signaling for acquiring task information of the task processed by the target core cluster, so as to obtain the task information;
  • alternatively, the second control core may send that request signaling to the control core of the target core cluster to acquire the task information.
  • the second control core may form a second core cluster for replacing the target core cluster according to the acquired task information, and then, in step S330, run the task processed by the target core cluster on the second core cluster.
  • the second control core is determined by the high-level control core or by the control core of the core cluster whose task allocation needs to be adjusted, and the second control core then forms the second core cluster. In this way, computing resources are dynamically increased when a core cluster is overloaded and released promptly when a core cluster has excess computing resources. This improves the flexibility of task processing in the many-core system, prevents reduced processing efficiency in some core clusters from lowering the overall processing efficiency of the many-core system, and improves the utilization efficiency of the computing resources of the many-core system.
  • FIG. 6 is a flowchart of a task scheduling method in an embodiment of the present disclosure. Referring to FIG. 6, in some embodiments, step S320 includes:
  • in step S321, multiple idle cores are applied for according to the task information;
  • in step S322, task configuration is performed on the multiple idle cores;
  • in step S323, request signaling for obtaining raw data is sent, so as to obtain the raw data;
  • in step S324, the obtained raw data is allocated to each of the idle cores according to the task information;
  • in step S325, the input and output information of the second core cluster is determined;
  • in step S326, the input and output information of the second core cluster is sent to the control core of the target core cluster.
  • step S330 includes:
  • in step S331, in response to the start signaling sent by the control core of the target core cluster, the task processed by the target core cluster is executed on the second core cluster.
  • the second control core can apply for multiple idle cores according to the task information in step S321.
  • the number of idle cores required to execute the task is determined according to parameters such as the calculation amount and completion time requirements of the task processed by the target core cluster in the task information, which may be greater than the current number of cores in the target core cluster.
  • the present disclosure does not limit the specific number of idle cores to be applied for.
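
As a rough illustration of how the number of idle cores could be derived from the calculation amount and the completion time requirement, the sketch below takes the smallest core count that meets a deadline. The function name, its parameters and the assumption that the work divides perfectly evenly across cores are all hypothetical; the disclosure itself does not fix any formula.

```python
def estimate_idle_cores(total_ops, deadline_cycles, ops_per_core_per_cycle):
    """Smallest number of cores that completes `total_ops` within the deadline.

    Assumes integer inputs and perfectly parallel work (an idealisation).
    """
    if deadline_cycles <= 0 or ops_per_core_per_cycle <= 0:
        raise ValueError("deadline and per-core throughput must be positive")
    capacity_per_core = deadline_cycles * ops_per_core_per_cycle
    needed = -(-total_ops // capacity_per_core)  # integer ceiling division
    return max(needed, 1)  # a cluster needs at least one core


cores_to_request = estimate_idle_cores(
    total_ops=1200, deadline_cycles=100, ops_per_core_per_cycle=4
)
```

The result may exceed the current size of the target core cluster, which matches the overload case described above.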
  • the second control core may send an allocation request for idle cores to the high-level control core, so that the high-level control core allocates a corresponding number of idle cores to the second core cluster.
  • the second control core may perform task configuration on the multiple idle cores in step S322 according to the task information, for example, split tasks and assign them to each idle core.
  • the present disclosure does not limit the specific manner of task configuration.
  • performing task configuration on idle cores may include determining configuration information corresponding to each core, and may also include specifying information that each core should store. This embodiment of the present disclosure does not limit this.
  • the second control core may, in step S323, send a request signaling for obtaining the original data to the control core of the target core cluster, so as to obtain the original data.
  • in step S324, the acquired original data may be allocated to each idle core according to the task information, so that the second core cluster can process the tasks processed by the target core cluster.
  • the second control core may determine the input and output routes of the second core cluster in step S325, obtain input and output information, and send the input and output information to the control core of the target core cluster in step S326.
  • the control core of the target core cluster after receiving the input and output information, replaces the input and output information of the second core cluster into a pipeline composed of a plurality of core clusters, and adds the second core cluster to the pipeline to Make the second core cluster enter the normal computing task flow.
  • the control core of the target core cluster adds the input and output information of the second core cluster to the pipeline, it sends a start signaling to the second control core of the second core cluster.
  • step S331 the second control core controls the second core cluster to execute the task processed by the target core cluster in response to the activation signaling, thereby realizing the formation of a second core cluster for replacing the target core cluster. the whole process.
  • the task scheduling method further includes: sending a message that the second core cluster has been started to the control core of the target core cluster, so as to dissolve the target core cluster.
  • after the second core cluster is started, the second control core may send, to the control core of the target core cluster, a message that the second core cluster has been started. After receiving the message, the control core of the target core cluster can dissolve the target core cluster, so that the cores in the target core cluster become idle cores, thereby releasing computing resources.
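
The exchange in steps S310 to S331 and the final started message can be walked through with a small single-process simulation. Everything here is an assumption made for illustration: the `TargetControlCore` class, the `cores_needed` field, the round-robin data split and the list-of-names pipeline are hypothetical, and real cores would exchange signaling over the on-chip network rather than call methods directly.

```python
class TargetControlCore:
    """Hypothetical stand-in for the control core of the target core cluster."""

    def __init__(self, task_info, per_core_data, pipeline, name="target"):
        self.task_info = task_info
        self.per_core_data = per_core_data  # data currently held by each core
        self.pipeline = pipeline            # ordered cluster names
        self.name = name
        self.disbanded = False

    def get_task_info(self):
        return dict(self.task_info)

    def get_raw_data(self):
        # Gather the pieces from every core and reassemble the original data.
        return [item for chunk in self.per_core_data for item in chunk]

    def accept_io_info(self, io_info):
        # Substitute the new cluster into the pipeline, then reply with start.
        self.pipeline[self.pipeline.index(self.name)] = io_info["cluster"]
        return "start"

    def on_started(self):
        self.disbanded = True  # cores of the target cluster become idle


def form_second_cluster(target, idle_pool, name="second"):
    """Steps S310 to S331 as seen from the second control core."""
    task_info = target.get_task_info()                               # S310
    count = task_info["cores_needed"]
    if not 1 <= count <= len(idle_pool):
        raise RuntimeError("cannot obtain the requested idle cores")
    cores = [idle_pool.pop() for _ in range(count)]                  # S321
    config = {core: task_info["task"] for core in cores}             # S322
    raw = target.get_raw_data()                                      # S323
    shares = {core: raw[i::count] for i, core in enumerate(cores)}   # S324
    io_info = {"cluster": name, "input": cores[0], "output": cores[-1]}  # S325
    if target.accept_io_info(io_info) != "start":                    # S326
        raise RuntimeError("no start signaling received")
    target.on_started()                       # S331, then the started message
    return {"cores": cores, "config": config, "data": shares}


target = TargetControlCore(
    {"task": "conv", "cores_needed": 3},
    [[1, 2], [3, 4], [5, 6]],
    ["front", "target", "back"],
)
idle = [10, 11, 12, 13]
cluster = form_second_cluster(target, idle)
```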
  • a control device for task scheduling is also provided, which is applied to the core of a many-core system, and the control device includes:
  • a first detection module, configured to perform load detection on at least one core cluster of the many-core system and to determine whether a target core cluster exists among the detected core clusters, the target core cluster being a core cluster, among the at least one core cluster, whose task allocation needs to be adjusted; and a first adjustment module, configured to control the target core cluster to adjust task allocation according to the load detection result when the target core cluster exists; wherein the many-core system includes a plurality of cores, at least one of the cores constitutes the core cluster, and the many-core system includes at least one of the core clusters.
  • the first detection module is configured to: determine whether there is an overloaded core cluster or a core cluster with excess computing resources among the detected core clusters; and, when there is an overloaded core cluster or a core cluster with excess computing resources, determine that the overloaded core cluster or the core cluster with excess computing resources is the target core cluster.
  • the first detection module is configured to: determine whether there is a candidate target core cluster within a predetermined period of time, where the candidate target core cluster is the core cluster with the longest running time in N synchronization cycles, and N is a natural number greater than or equal to a predetermined number; and, when the candidate target core cluster exists, determine that there is an overloaded core cluster, the candidate target core cluster being the overloaded core cluster.
  • the first detection module is configured to: determine whether there is a core cluster whose running duration exceeds a predetermined synchronization period; and, when there is a core cluster whose running duration exceeds the predetermined synchronization period, determine that there is an overloaded core cluster, wherein the core cluster whose running duration exceeds the predetermined synchronization period is the overloaded core cluster.
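
The two overload criteria used by the first detection module can be expressed compactly. The sketch below assumes that the running durations are available as one mapping per synchronization cycle, from a cluster identifier to its running time; the function names and this data layout are illustrative, not taken from the disclosure.

```python
from collections import Counter


def overloaded_by_longest_count(cycles, min_cycles):
    """Clusters that were the slowest in at least `min_cycles` synchronization cycles.

    `cycles` is a list of non-empty dicts mapping cluster id to running duration.
    On a tie within a cycle, the first cluster encountered is counted.
    """
    slowest = Counter(max(cycle, key=cycle.get) for cycle in cycles)
    return {cluster for cluster, n in slowest.items() if n >= min_cycles}


def overloaded_by_period(cycles, sync_period):
    """Clusters whose running duration exceeded the predetermined period at least once."""
    return {
        cluster
        for cycle in cycles
        for cluster, duration in cycle.items()
        if duration > sync_period
    }


observed = [
    {"a": 5, "b": 9, "c": 4},
    {"a": 6, "b": 8, "c": 4},
    {"a": 9, "b": 7, "c": 3},
    {"a": 5, "b": 10, "c": 4},
]
by_count = overloaded_by_longest_count(observed, min_cycles=3)
by_period = overloaded_by_period(observed, sync_period=9)
```

The first criterion is relative (which cluster holds the others up most often), while the second is absolute (a cluster that cannot finish within the synchronization period).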
  • the first adjustment module is configured to: apply for an idle core as a second control core of a second core cluster that replaces the target core cluster; and, in response to request signaling from the second control core for acquiring task information of the task processed by the target core cluster, transmit the task information of the task processed by the target core cluster to the second control core.
  • a task scheduling apparatus is also provided, which is applied to a first control core of a first core cluster of a many-core system, and the task scheduling apparatus includes:
  • the second detection module is configured to perform load detection on the first core cluster, and determine whether the first core cluster is a target core cluster, and the target core cluster is a core cluster that needs to adjust task allocation;
  • a second adjustment module configured to adjust the task allocation of the first core cluster according to the load detection result of the first core cluster when the first core cluster is the target core cluster;
  • the many-core system includes a plurality of cores, at least one of the cores forms the core cluster, the many-core system includes at least one of the core clusters, and the first core cluster is one of the at least one core cluster.
  • the second detection module is configured to: determine whether the first core cluster is overloaded or has excess computing resources; and, when the first core cluster is overloaded or has excess computing resources, determine that the first core cluster is the target core cluster.
  • the second detection module is configured to: determine the running duration of the first core cluster in each of multiple synchronization cycles within a predetermined time period; according to the running durations, determine the number of synchronization cycles in which the first core cluster has the longest running duration among the at least one core cluster; and, when that number exceeds a predetermined value, determine that the first core cluster is overloaded.
  • the second detection module is configured to: determine whether the running duration of the first core cluster exceeds a predetermined synchronization period; and, when the running duration of the first core cluster exceeds the predetermined synchronization period, determine that the first core cluster is overloaded.
  • the second adjustment module is configured to: apply for an idle core as a second control core of a second core cluster that replaces the first core cluster; in response to request signaling from the second control core for acquiring task information of the task processed by the first core cluster, transmit the task information of the task processed by the first core cluster to the second control core; in response to request signaling from the second control core for acquiring original data, obtain the original data; transmit the original data to the second control core; add the received input and output information of the second core cluster to the pipeline composed of multiple core clusters; and send the start signaling to the second control core.
  • the second adjustment module is further configured to: receive a message sent by the second control core that the second core cluster has been started, and dissolve the first core cluster.
  • the second adjustment module is further configured to: apply for an idle core as a core to which the first core cluster belongs; and re-assign tasks to the cores of the first core cluster.
  • a task scheduling apparatus is also provided, which is applied to the core of a many-core system, and the task scheduling apparatus includes:
  • a request signaling sending module, configured to send request signaling for acquiring task information of a task processed by the target core cluster, so as to acquire the task information;
  • a core cluster forming module, configured to form a second core cluster for replacing the target core cluster according to the acquired task information, the core being the second control core of the second core cluster;
  • a task operation module, configured to run the task processed by the target core cluster on the second core cluster; wherein the many-core system includes a plurality of cores, at least one of the cores constitutes the core cluster, the many-core system includes at least one of the core clusters, and the target core cluster is the core cluster for which task allocation needs to be adjusted.
  • the core cluster forming module is configured to: apply for multiple idle cores according to the task information; perform task configuration on the multiple idle cores; send request signaling for obtaining raw data, so as to obtain the raw data; allocate the acquired raw data to each of the idle cores according to the task information; determine the input and output information of the second core cluster; and send the input and output information of the second core cluster to the control core of the target core cluster; and the task operation module is configured to: in response to start signaling sent by the control core of the target core cluster, run the task processed by the target core cluster on the second core cluster.
  • the apparatus further includes: a startup message sending module, configured to send a message that the second core cluster has been started to the control core of the target core cluster, so that the target core cluster is dissolved.
  • FIG. 7 is a block diagram of the composition of a core in an embodiment of the present disclosure.
  • a core is further provided, which is applied to a many-core system.
  • the core includes: one or more processing units 101; and a storage unit 102 on which one or more programs are stored. When the one or more programs are executed by the one or more processing units, the one or more processing units implement at least one of the following methods: the control method for task scheduling described in the first aspect of the embodiments of the present disclosure; the task scheduling method described in the second aspect of the embodiments of the present disclosure; the task scheduling method described in the third aspect of the embodiments of the present disclosure.
  • the processing unit 101 is a device with data processing capability, including but not limited to an arithmetic unit, etc.
  • the storage unit 102 is a device with data storage capability, including but not limited to random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH).
  • a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, at least one of the following methods is implemented: the method described in the first aspect of the embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device in an embodiment of the present disclosure.
  • an electronic device is further provided, including: a plurality of cores 201; and an on-chip network 202 configured to exchange data among the multiple cores 201 and to exchange external data; one or more of the cores 201 store one or more instructions, and the one or more instructions are executed by the one or more cores 201, so that the one or more cores 201 can execute at least one of the following methods: the control method for task scheduling described in the first aspect of the embodiments of the present disclosure; the task scheduling method described in the second aspect of the embodiments of the present disclosure; the task scheduling method described in the third aspect of the embodiments of the present disclosure.
  • a computer program product is also provided.
  • when the computer program product runs on a computer, the computer program product causes the computer to execute at least one of the following methods: the control method for task scheduling described in the first aspect of the embodiments of the present disclosure; the task scheduling method described in the second aspect of the embodiments of the present disclosure; and the task scheduling method described in the third aspect of the embodiments of the present disclosure.
  • a second core cluster is formed, and the second core cluster processes the tasks processed by the target core cluster without suspending the operation of the target core cluster.
  • the process of adjusting the task allocation of the target core cluster includes: determining, by the control core of the target core cluster or the high-level control core, a second control core of a second core cluster that replaces the target core cluster; and obtaining, by the second control core, the task information of the task processed by the target core cluster.
  • for example, the control core of the target core cluster holds the task information of the task processed by the target core cluster, and the high-level control core also holds that task information, so the second control core can apply for the task information to either the control core of the target core cluster or the high-level control core;
  • the second control core applies for a target number of idle cores and performs task configuration; the second control core sends request signaling to the control core of the target core cluster to obtain the original data;
  • after receiving the request signaling, the control core of the target core cluster retrieves data from each core in the target core cluster according to the original compilation information, reassembles the original data, and returns the original data to the second control core; the second control core allocates the received original data to each core in the second core cluster according to the task configuration information.
  • the second control core determines the input and output routes and sends the input and output information of the second core cluster to the control core of the target core cluster; the control core of the target core cluster substitutes the input and output information of the second core cluster into the pipeline and sends start signaling to the second core cluster;
  • the second core cluster starts according to the start signaling and notifies the target core cluster.
  • the target core cluster is disbanded.
  • the following configurations may be performed: allocate memory and determine the information and parameters that each idle core should store; configure arithmetic operators, reducing computational complexity by decomposing operations; configure routing; configure calculation control information such as operation sequence and operation time; configure synchronization management information; and so on.
  • the task configuration includes the following flow:
  • Each computational step is decomposed into several subtasks that can be parallelized according to the required memory and amount of computation.
  • each subtask can be mapped to an idle core; the memory required by each subtask shall not exceed the memory capacity limit of a single idle core, and the routing transmission volume (input and output) of the subtask shall not exceed the routing transmission bandwidth limit of a single idle core;
  • preferably, the computation amount of the subtasks is evenly distributed across the cores;
  • the subtasks that each idle core needs to execute are determined according to the optimal layout scheme, which is the layout scheme that minimizes the routing transmission bandwidth;
  • the configuration information is sent to each idle core through the on-chip network, wherein each idle core will return a configuration completion signaling after receiving all the configuration information.
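
The decomposition and mapping flow above can be sketched as follows. The sketch assumes that a computational step splits evenly, so that the memory, routing volume and computation of each subtask shrink in proportion to the number of subtasks; the `Step` record, the `max_ops_per_core` limit and both helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    memory: int   # memory needed by the whole step
    traffic: int  # routing transmission volume (input and output) per cycle
    ops: int      # computation amount of the whole step


def ceil_div(a, b):
    return -(-a // b)


def subtask_count(step, core_memory, core_bandwidth, max_ops_per_core):
    """Fewest equal subtasks such that each fits the limits of a single core."""
    return max(
        1,
        ceil_div(step.memory, core_memory),
        ceil_div(step.traffic, core_bandwidth),
        ceil_div(step.ops, max_ops_per_core),
    )


def configure(steps, idle_cores, core_memory, core_bandwidth, max_ops_per_core):
    """Map one subtask to one idle core; returns core -> (step, index, parts)."""
    free = list(idle_cores)
    assignment = {}
    for step in steps:
        parts = subtask_count(step, core_memory, core_bandwidth, max_ops_per_core)
        if parts > len(free):
            raise RuntimeError(f"not enough idle cores for step {step.name!r}")
        for index in range(parts):
            assignment[free.pop(0)] = (step.name, index, parts)
    return assignment


steps = [Step("conv", 300, 80, 1000), Step("pool", 50, 20, 100)]
assignment = configure(steps, [0, 1, 2, 3, 4], core_memory=128,
                       core_bandwidth=64, max_ops_per_core=500)
```

Here the memory limit drives the split of the first step into three subtasks, and one idle core is left unused.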
  • the operation of the target core cluster is suspended, and the cores in the target core cluster are increased or decreased according to the result of the load detection, so that the computing resources of the target core cluster and the computing tasks match.
  • the process of adjusting the task allocation of the target core cluster includes: applying for a new core, and performing task configuration.
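
One possible way to decide how many cores to add or release in this suspended mode is shown below. The policy is an assumption of this sketch and is not stated in the disclosure: it supposes that the running duration of the cluster scales inversely with its core count and sizes the cluster so that the task just fits the synchronization period.

```python
def core_delta(current_cores, run_time, sync_period):
    """Cores to add (positive) or release (negative); integer time units assumed."""
    if current_cores < 1 or run_time < 0 or sync_period <= 0:
        raise ValueError("invalid cluster size or timing values")
    # Ceiling of current_cores * run_time / sync_period, with at least one core.
    wanted = max(1, -(-current_cores * run_time // sync_period))
    return wanted - current_cores


grow = core_delta(current_cores=8, run_time=150, sync_period=100)
shrink = core_delta(current_cores=8, run_time=40, sync_period=100)
```

A positive result corresponds to applying for new idle cores, and a negative result to releasing cores so that they become idle again.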
  • remapping includes a global notification mechanism; for example, valid and invalid state information is broadcast to all cores and to the PC side.
  • the following configurations can be performed: allocate memory and determine the information and parameters that each core should store; configure arithmetic operators, reducing computational complexity by decomposing operations; configure routing; configure calculation control information such as operation sequence and operation time; configure synchronization management information; and so on.
  • the task configuration includes the following flow:
  • Each computational step is decomposed into several subtasks that can be parallelized according to the required memory and amount of computation.
  • each subtask can be mapped to one core; the memory required by each subtask shall not exceed the memory capacity limit of a single core, and the routing transmission volume (input and output) of the subtask shall not exceed the routing transmission bandwidth limit of a single core; preferably, the computation amount of the subtasks is evenly distributed across the cores;
  • the subtasks that each core needs to perform are determined according to the optimal layout scheme, which is the layout scheme that minimizes the routing transmission bandwidth;
  • the configuration information is sent to each core through the on-chip network, and each core will return a configuration completion signaling after receiving all the configuration information.
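
The choice of a layout that minimizes routing transmission can be illustrated with an exhaustive search over small placements. The cost model is an assumption of this sketch: cores are taken to sit on a 2D mesh, and the routing cost of a layout is the sum of each transfer volume multiplied by the Manhattan distance between the two cores involved. The function names and the `coords` and `traffic` structures are hypothetical.

```python
from itertools import permutations


def layout_cost(placement, traffic, coords):
    """Sum of transfer volume times hop count for every communicating pair."""
    total = 0
    for (src, dst), volume in traffic.items():
        (r1, c1), (r2, c2) = coords[placement[src]], coords[placement[dst]]
        total += volume * (abs(r1 - r2) + abs(c1 - c2))
    return total


def best_layout(subtasks, cores, traffic, coords):
    """Try every assignment of subtasks to distinct cores; keep the cheapest.

    Exhaustive search is only practical for a handful of subtasks.
    """
    best_cost, best_placement = None, None
    for chosen in permutations(cores, len(subtasks)):
        placement = dict(zip(subtasks, chosen))
        cost = layout_cost(placement, traffic, coords)
        if best_cost is None or cost < best_cost:
            best_cost, best_placement = cost, placement
    return best_cost, best_placement


coords = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}  # a 2 x 2 mesh
traffic = {("s0", "s1"): 10, ("s1", "s2"): 10, ("s0", "s2"): 1}
cost, placement = best_layout(["s0", "s1", "s2"], [0, 1, 2, 3], traffic, coords)
```

In this example the heavy transfers end up between neighbouring cores, and only the light transfer between `s0` and `s2` travels two hops.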
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and that can be accessed by a computer.
  • communication media typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and can include any information delivery media, as is well known to those of ordinary skill in the art.


Abstract

The invention relates to a task scheduling method, a control method, an electronic device and a computer-readable medium. The control method for task scheduling is applied to a core of a many-core system, and the control method comprises: performing load detection on at least one core cluster of the many-core system, and determining whether a target core cluster exists among the detected core clusters, the target core cluster being a core cluster, among the at least one core cluster, whose task allocation needs to be adjusted; and, when a target core cluster exists, controlling, according to a load detection result, the target core cluster to adjust task allocation, wherein the many-core system comprises a plurality of cores, at least one of the cores constitutes the core cluster, and the many-core system comprises the at least one core cluster.
PCT/CN2021/132400 2020-11-24 2021-11-23 Task scheduling method, control method, electronic device and computer-readable medium WO2022111466A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011330751.1A CN114546631A (zh) 2020-11-24 2020-11-24 Task scheduling method, control method, core, electronic device, and readable medium
CN202011330751.1 2020-11-24

Publications (1)

Publication Number Publication Date
WO2022111466A1 true WO2022111466A1 (fr) 2022-06-02

Family

ID=81660359

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/132400 WO2022111466A1 (fr) 2020-11-24 2021-11-23 Task scheduling method, control method, electronic device and computer-readable medium

Country Status (2)

Country Link
CN (1) CN114546631A (fr)
WO (1) WO2022111466A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086232A (zh) * 2022-06-13 2022-09-20 Tsinghua University Method and apparatus for task processing and data flow generation

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140059548A1 (en) * 2012-08-23 2014-02-27 Nvidia Corporation Processor cluster migration techniques
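CN103870322A (zh) * 2012-12-17 2014-06-18 MediaTek Inc. Method for controlling task migration, non-transitory computer-readable medium, and heterogeneous multi-core system
US20140208072A1 (en) * 2013-01-18 2014-07-24 Nec Laboratories America, Inc. User-level manager to handle multi-processing on many-core coprocessor-based systems
CN104995603A (zh) * 2013-11-14 2015-10-21 MediaTek Inc. Task scheduling method based at least in part on the distribution of tasks that share the same data and/or access the same memory address, and related non-transitory computer-readable medium for assigning tasks in a multi-core processor system
CN105528330A (zh) * 2014-09-30 2016-04-27 Hangzhou Huawei Digital Technologies Co., Ltd. Load balancing method and apparatus, cluster, and many-core processor
CN105607720A (zh) * 2014-11-17 2016-05-25 MediaTek Inc. Method for managing the energy efficiency of a computing system and system for managing energy efficiency
CN108170525A (zh) * 2016-12-07 2018-06-15 MStar Semiconductor, Inc. Apparatus and method for dynamically adjusting the task load configuration of a multi-core processor
CN108694151A (zh) * 2017-04-09 2018-10-23 Intel Corporation Compute cluster preemption within a general-purpose graphics processing unit
CN111198757A (zh) * 2020-01-06 2020-05-26 Beijing Xiaomi Mobile Software Co., Ltd. CPU core scheduling method, CPU core scheduling apparatus, and storage medium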


Also Published As

Publication number Publication date
CN114546631A (zh) 2022-05-27


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21896967

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 210923)

122 Ep: pct application non-entry in european phase

Ref document number: 21896967

Country of ref document: EP

Kind code of ref document: A1