WO2022111453A1 - Task processing method and apparatus, task assignment method, and electronic device and medium

Task processing method and apparatus, task assignment method, and electronic device and medium

Info

Publication number
WO2022111453A1
Authority
WO
WIPO (PCT)
Prior art keywords
core
task
processing
processing cores
target processing
Prior art date
Application number
PCT/CN2021/132344
Other languages
English (en)
Chinese (zh)
Inventor
吴臻志
祝夭龙
何伟
Original Assignee
北京灵汐科技有限公司
Priority date
Filing date
Publication date
Priority claimed from CN202011330531.9A external-priority patent/CN114546630A/zh
Priority claimed from CN202011480656.XA external-priority patent/CN114637593A/zh
Application filed by 北京灵汐科技有限公司
Publication of WO2022111453A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • the present disclosure relates to the field of computer technologies, and in particular, to a task processing method and apparatus, a task assignment method and apparatus, electronic equipment, computer-readable media, and computer program products.
  • a many-core system usually has many cores (also called processing cores).
  • the core is the smallest computing unit in the many-core system that can be independently scheduled and has complete computing power.
  • the core has certain resources such as storage and computing.
  • the cores of the many-core system can run program instructions independently; by exploiting parallel computing, they can speed up program execution and provide multi-tasking capability.
  • however, each core passively executes computing functions according to the running program instructions, so the control capability of each core is relatively weak and the many-core system has poor flexibility in processing tasks.
  • the present disclosure provides a task processing method and apparatus, a task assignment method and apparatus, an electronic device, a computer-readable medium, and a computer program product.
  • the present disclosure provides a task processing method based on a many-core system, the many-core system includes a plurality of processing cores, and the method includes: receiving a task to be processed; determining, according to the task to be processed, a core cluster for processing the task to be processed, the core cluster including at least one target processing core determined from the plurality of processing cores; and sending the to-be-processed task to the core cluster for the core cluster to execute the to-be-processed task.
  • the present disclosure provides a task allocation method.
  • the task allocation method is used for a pre-established core cluster in a many-core system.
  • the core cluster is determined based on the above-mentioned task processing method and includes a plurality of target processing cores; the method includes: receiving a task to be processed; dividing the task to be processed into a plurality of subtasks; and allocating corresponding subtasks to at least some target processing cores in the core cluster, each of the at least some target processing cores corresponding to at least one subtask, so that the at least some target processing cores respectively execute their corresponding subtasks.
  • the present disclosure provides a task processing device applied to a many-core system, the many-core system including a plurality of processing cores; the task processing device includes: a first receiving module configured to receive a task to be processed; a core cluster forming module configured to determine, according to the to-be-processed task, a core cluster for processing the to-be-processed task, the core cluster including at least one target processing core determined from the plurality of processing cores; and a sending module configured to send the to-be-processed task to the core cluster, so that the core cluster can execute the to-be-processed task.
  • the present disclosure provides a task allocation device for performing task allocation on a pre-established core cluster in a many-core system, the core cluster including a plurality of target processing cores; the task allocation device includes: a second receiving module configured to receive a task to be processed; a task splitting module configured to split the task to be processed into multiple subtasks; and a task allocation module configured to allocate corresponding subtasks to at least some of the target processing cores in the core cluster, each of the at least some target processing cores corresponding to at least one subtask, so that the at least some target processing cores execute their corresponding subtasks respectively.
  • the present disclosure provides a processing core, which includes the above-mentioned task processing apparatus, and/or the above-mentioned task distribution apparatus.
  • the present disclosure provides an electronic device comprising: a plurality of processing cores; and an on-chip network configured to exchange data among the plurality of processing cores and with external data. One or more first instructions are stored in one or more of the processing cores, and the one or more first instructions are executed by one or more of the processing cores, so that the one or more processing cores can execute the above-mentioned task processing method; and/or, one or more second instructions are stored in one or more of the processing cores, and the one or more second instructions are executed by one or more of the processing cores, so that the one or more processing cores can execute the above-described task allocation method.
  • the present disclosure provides a computer-readable medium on which a first computer program and/or a second computer program are stored, wherein the first computer program, when executed by a processing core, implements the above-mentioned task processing method, and the second computer program, when executed by a processing core, implements the above task allocation method.
  • the present disclosure provides a computer program product, which, when running on a computer, causes the computer to execute the above-mentioned task processing method; or, when the computer program product runs on a computer, causes the computer to execute The task assignment method described above.
  • in the embodiments of the present disclosure, a core cluster for processing the to-be-processed task is formed according to the situation of the to-be-processed task, and the to-be-processed task is processed through the core cluster, thereby realizing dynamic processing of the to-be-processed task and improving the task processing flexibility of the many-core system.
  • FIG. 1 is a flowchart of a task processing method based on a many-core system provided by an embodiment of the present disclosure
  • FIG. 2 is a flowchart of a task processing method based on a many-core system provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a many-core chip according to an embodiment of the present disclosure.
  • FIG. 4 is a flowchart of a task allocation method according to an embodiment of the present disclosure.
  • FIG. 5 is a flowchart of a task allocation method provided by an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of a task processing method based on a many-core system provided by an embodiment of the present disclosure
  • FIG. 7 is a block diagram of a many-core system.
  • FIG. 8 is a block diagram of the composition of a task processing apparatus provided by an embodiment of the present disclosure.
  • FIG. 9 is a block diagram of a composition of a task allocation apparatus provided by an embodiment of the present disclosure.
  • FIG. 10 is a block diagram of an electronic device according to an embodiment of the present disclosure.
  • FIG. 1 is a flowchart of a task processing method based on a many-core system according to an embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a task processing method based on a many-core system, wherein the many-core system includes a plurality of processing cores; the task processing method can be executed by a task processing device, which can be implemented by software and/or hardware, and the task processing method includes:
  • Step S11 receiving the task to be processed.
  • Step S12 Determine a core cluster for processing the to-be-processed task according to the to-be-processed task, where the core cluster includes at least one target processing core determined from a plurality of processing cores.
  • Step S13 Send the pending task to the core cluster for the core cluster to execute the pending task.
  • in this way, a corresponding core cluster is formed according to the to-be-processed task, and each to-be-processed task is processed by its corresponding core cluster, thereby improving the flexibility of task processing in the many-core system.
  • a task to be processed sent by a host is received.
  • the host is a processing unit outside the many-core system, for example, the host is a central processing unit (Central Processing Unit, CPU), and the host is used to transmit the tasks to be processed to the many-core system for the many-core system to perform task processing.
  • the task can be, for example, any suitable computing task, such as an image recognition task, an object detection task, and the like.
  • in some embodiments, the multiple processing cores of the many-core system include a first processing core and multiple second processing cores, and the above-mentioned task processing device may be the first processing core; that is, the above-mentioned task processing method may be implemented based on the first processing core of the many-core system.
  • the first processing core receiving the to-be-processed task may include: the first processing core receiving the to-be-processed task from the host (Host).
  • the first processing core of the many-core system is responsible for receiving tasks to be processed and forming a core cluster for processing the tasks to be processed.
  • in other words, the embodiment of the present disclosure improves the control capability of some of the processing cores (the first processing core) in the many-core system.
  • the task processing device can form corresponding core clusters according to the resource requirements corresponding to the tasks to be processed, wherein step S12 can further include:
  • the resource demand required to execute the to-be-processed task is determined.
  • At least one target processing core is determined from currently available processing cores to form the core cluster according to the resource requirements required to execute the task to be processed.
  • the resource requirement required to execute the to-be-processed task can be determined, for example, the required computing resource requirement, storage resource requirement and/or bandwidth resource requirement.
  • the task processing apparatus may monitor or acquire the status of each processing core in the many-core system, so as to determine the currently available processing cores in the many-core system, and the available processing cores may be idle processing cores.
  • according to the resource requirements required for executing the task to be processed, at least one processing core is determined from the currently available processing cores as a target processing core, and the at least one target processing core is regarded as a core cluster used to process the pending task.
  • the task processing device is a first processing core in a many-core system
  • the core cluster includes at least one target processing core determined from the plurality of second processing cores; that is, the first processing core determines, according to the resource requirements required for executing the task to be processed, at least one second processing core from the currently available second processing cores as a target processing core, so as to form the core cluster.
  • the core cluster is used to process the pending task.
  • the current remaining (idle) resources of the at least one target processing core must meet the resource requirements required for executing the task to be processed, that is, the total current remaining (idle) resources of the at least one target processing core must be greater than or equal to the resource requirement required to execute the pending task. It should be noted that the embodiment of the present disclosure does not specifically limit the number of target processing cores in the core cluster, as long as the current remaining resources of all target processing cores in the core cluster can meet the resource requirements required for executing the to-be-processed task.
  • in this way, different core clusters can be formed according to the actual conditions of different tasks to be processed, which realizes dynamic processing of different tasks to be processed and improves the flexibility of task processing in the many-core system.
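  • as a minimal illustration of the resource-based cluster formation described above, the following sketch greedily adds currently available cores until their combined idle resources cover the demand of the pending task; the class, field and function names and the greedy strategy are illustrative assumptions, not the specific mechanism of the present disclosure.

        from dataclasses import dataclass

        @dataclass
        class Core:
            core_id: int
            free_compute: float  # currently idle computing resources (assumed unit)
            free_memory: float   # currently idle storage resources (assumed unit)

        def form_core_cluster(available_cores, need_compute, need_memory):
            """Greedily pick target cores until the task's resource demand is met."""
            cluster, got_compute, got_memory = [], 0.0, 0.0
            for core in available_cores:
                cluster.append(core)
                got_compute += core.free_compute
                got_memory += core.free_memory
                if got_compute >= need_compute and got_memory >= need_memory:
                    return cluster  # total idle resources now cover the demand
            return None  # demand cannot be met with the currently available cores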
  • the task processing apparatus may also form corresponding core clusters according to the delay requirement (computation time) corresponding to the tasks to be processed. For example, when a task is created, it may carry its corresponding delay requirement, and by analyzing the to-be-processed task, the delay constraint amount corresponding to the to-be-processed task can be determined. Then, in step S12, at least one target processing core is determined from the currently available processing cores according to the delay constraint amount corresponding to the task to be processed to form the core cluster.
  • the number of target processing cores may be determined according to the delay constraint amount corresponding to the task to be processed.
  • the embodiments of the present disclosure do not limit how to determine the number of target processing cores according to the delay constraint amount, as long as one or more target processing cores can be determined.
  • the time used by the core cluster to process the to-be-processed task should satisfy (be less than or equal to) the delay constraint amount corresponding to the to-be-processed task.
  • for example, a processing core is arbitrarily selected from the currently available processing cores, and the time required for that processing core to execute the to-be-processed task is predicted. If the predicted time is less than or equal to the delay constraint amount corresponding to the pending task, that processing core is determined as the target processing core. If the predicted time is greater than the delay constraint amount, one or more further processing cores are arbitrarily selected from the other currently available processing cores, the time required for each of them to execute the task to be processed is predicted, and the ratio of the time required by each selected processing core to the number of selected processing cores is calculated; if the largest ratio is less than or equal to the delay constraint amount corresponding to the task to be processed, the selected processing cores serve as the multiple target processing cores.
  • in some embodiments, the corresponding relationship between the delay constraint amount and the required number of target processing cores may also be pre-configured. After the delay constraint amount of the task to be processed is determined, the number of target processing cores required to execute the to-be-processed task may be determined from the corresponding relationship, so that a corresponding number of target processing cores are selected from the currently available processing cores, for example, a corresponding number of processing cores are randomly selected from the currently available processing cores as target processing cores.
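  • a brief sketch of the trial procedure above, assuming a hypothetical predict_time() helper that estimates how long one core would need for the whole task; the helper and the stopping rule are assumptions made for illustration only.

        def choose_cores_for_delay(available_cores, predict_time, delay_constraint):
            """Add cores until the largest predicted single-core time, divided by the
            number of selected cores, is within the delay constraint."""
            selected = []
            for core in available_cores:
                selected.append(core)
                worst = max(predict_time(c) for c in selected)
                if worst / len(selected) <= delay_constraint:
                    return selected  # these cores become the target processing cores
            return None  # the delay constraint cannot be met with the available cores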
  • in some embodiments, the task processing apparatus may generate a core cluster list corresponding to the core cluster and establish the corresponding relationship between the core cluster and the task to be processed. The core cluster list includes but is not limited to: the cluster identifier of the core cluster, the corresponding relationship between the core cluster and the task to be processed, and the core identification information of each target processing core in the core cluster, where the core identification information includes but is not limited to the core identifier, address and location of the target processing core.
  • the core cluster list may also be sent to each target processing core in the core cluster for storage.
  • sending the task to be processed to the core cluster includes: sending the pending task to the target processing cores in the core cluster.
  • in some embodiments, the method further includes: selecting one target processing core from the core cluster as the main core of the core cluster, so as to manage the core cluster through the main core; for example, one target processing core may be randomly selected as the main core of the core cluster.
  • the above-mentioned core cluster list may further include information for identifying the identity of the main core of the core cluster.
  • the step of sending the task to be processed to the core cluster may further include: sending the task to be processed to the main core of the core cluster for the main core Assign tasks to at least some of the target processing cores in the core cluster according to the tasks to be processed, and at least some of the target processing cores may include the main core itself.
  • the task to be processed may be delivered to the main core of the corresponding core cluster through the on-chip network.
  • when each target processing core in the core cluster and the first processing core used to process the to-be-processed task are located on the same many-core chip, unnecessary energy consumption caused by cross-chip interaction during task processing can be avoided, effectively controlling the power consumption of the many-core system.
  • the above task processing method describes a method for dynamically forming a corresponding core cluster for processing the to-be-processed task according to the to-be-processed task.
  • the corresponding core cluster executes the to-be-processed task, which improves the task processing flexibility of the many-core system and improves the control capability of some processing cores (eg, the first processing core, the main core of the core cluster) in the many-core system.
  • the temperature of a processing core can be used to characterize, to a certain extent, the utilization and load of the processing core. Different from the situation in which core clusters are formed according to processing core resources or the number of tasks to be processed, forming core clusters based on temperature factors allows the actual utilization rate and load of the processing cores to be determined more intuitively, and the utilization of system resources can be maximized.
  • in addition, temperature data is easier to obtain than resource data or task data, which makes task allocation more efficient; its transmission does not occupy the transmission resources of the processing cores and does not affect the ongoing task processing flow of the processing cores.
  • FIG. 2 is a flowchart of a task processing method based on a many-core system according to an embodiment of the present disclosure. As shown in FIG. 2, in some embodiments, step S12 may include:
  • Step S121 obtaining the temperature of the currently available processing core
  • Step S122 according to the to-be-processed task and the temperature of the currently available processing cores, determine at least one target processing core from the currently available processing cores to form the core cluster.
  • before step S121, the status of each processing core in the many-core system can be monitored or obtained, so that the currently available processing cores in the many-core system can be determined.
  • the available processing cores may be idle processing cores.
  • the temperature of the processing core can be directly measured, or the temperature of the processing core can be obtained by acquiring and parsing the temperature data in a temperature database or a cache corresponding to the temperature data.
  • when the temperature of a processing core is within a predetermined interval, it indicates that the processing core is within an operable working range; when the temperature of the processing core is outside the predetermined interval, it indicates that the processing core is in an overloaded or idle state.
  • when there is only one currently available processing core, and its temperature satisfies the corresponding allocation strategy, that processing core is used as the core cluster; the allocation strategy may be based on a preset temperature threshold, temperature interval, or the like. When there are multiple currently available processing cores, some or all of them may be used to form the core cluster according to their temperatures, with reference to factors such as the task requirements and the area in which the cores are located.
  • a task to be processed may be allocated to the core cluster in step S13, so that the core cluster executes the task to be processed.
  • in this way, the temperature of the currently available processing cores can be obtained, and according to the tasks to be processed and the temperature, at least one target processing core is determined from the currently available processing cores to form a core cluster, so that tasks can be allocated in a targeted manner; this ensures on-chip temperature uniformity across the many cores and, through local adjustment, improves the accuracy of temperature adjustment of the many-core system.
  • obtaining the temperature of the currently available processing core in step S121 may include:
  • receiving temperature information sent by a preset controller, where the temperature information includes the temperature data measured by the temperature sensors corresponding to the controller and the temperature measurement distances of the processing cores, the temperature measurement distance being the distance between a temperature sensor and a processing core; and determining the temperatures of the currently available processing cores respectively according to the temperature information.
  • in some embodiments, the many-core system includes one or more temperature sensors, each for measuring the temperature of at least one processing core within a predetermined range.
  • for example, the temperature sensors correspond one-to-one with the processing cores, or a temperature sensor is arranged at the center of the area where a plurality of processing cores are located.
  • each temperature sensor corresponds to at least one processing core respectively, and the predetermined ranges corresponding to each temperature sensor are different. In some embodiments, the temperature sensors are arranged at intervals to form a temperature sensor array.
  • FIG. 3 is a schematic diagram of a many-core chip according to an embodiment of the present disclosure.
  • the many-core chip can be any chip in the many-core system, which includes multiple processing cores and multiple temperature sensors, and the temperature sensors are arranged at intervals to form a temperature sensor array.
  • all temperature sensors may be arranged in an equidistant arrangement.
  • a controller (not shown) may be provided, such as an external controller, which corresponds to one or more temperature sensors, and the temperature sensors may send the measured temperature data to the controller through a local bus;
  • the controller uses the temperature data measured by the temperature sensor and the temperature measurement distance of the processing core as temperature information, and sends it to the first processing core through the on-chip network.
  • the temperature measurement distance is the distance between the center point of the processing core and the center point of the corresponding temperature sensor.
  • the first processing core may separately determine the temperature of the currently available processing core according to the temperature information.
  • for each currently available processing core, the temperature sensors in its neighborhood may be determined, that is, the temperature sensors whose distance from the center point of the processing core is within a preset distance threshold; the present disclosure does not limit the specific value of the distance threshold.
  • the temperature of the processing core can be determined by the following formula (1), in which: T_p represents the temperature of the currently available processing core p; q indexes the temperature sensors in the neighborhood of the processing core p; S_q represents the temperature data measured by temperature sensor q; D_(p,q) represents the temperature measurement distance between the center point of the processing core p and the temperature sensor q; and D̄_(p,q) represents the normalized temperature measurement distance.
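  • formula (1) itself is not reproduced in this text; a plausible form consistent with the symbols defined above is a distance-weighted average of the neighborhood sensor readings, given here only as a hedged reconstruction (the exact formula of the disclosure may differ):

        T_p = \frac{\sum_{q} S_q \,\bigl(1 - \bar{D}_{p,q}\bigr)}{\sum_{q} \bigl(1 - \bar{D}_{p,q}\bigr)}

    that is, sensors closer to the center point of processing core p (smaller normalized temperature measurement distance D̄_(p,q)) contribute more strongly to the estimated temperature T_p.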
  • the temperature of the currently available processing core can be comprehensively calculated through the temperature data of multiple temperature sensors and the temperature measurement distance, thereby improving the accuracy of the temperature measurement of the processing core.
  • the target processing core may be selected from the currently available processing cores according to the temperature of the currently available processing cores, and at the same time with reference to factors such as task requirements, the area in which they are located, and the like.
  • in step S122, at least one target processing core may be determined from the currently available processing cores according to the to-be-processed task and the temperature of the currently available processing cores.
  • step S122 may include:
  • determining, according to parameters of the task to be processed, the number N of target processing cores for executing the to-be-processed task, where N is a positive integer;
  • and determining, as target processing cores, the top N processing cores after sorting the temperatures of the currently available processing cores from low to high, or N processing cores whose temperature is less than or equal to a preset temperature threshold.
  • the number N (N is a positive integer) of the required target processing cores can be determined according to parameters such as the task type of the task to be processed, the task requirement, and the resource demand.
  • for example, the number of processing cores corresponding to the task to be processed may be the minimum number of processing cores required to process the task based on a resource-priority principle, or the maximum number of processing cores based on an efficiency-priority principle; the present disclosure does not limit this.
  • the temperatures of a plurality of currently available processing cores may be sorted from low to high, and the currently available processing cores whose temperatures are at the top N in the sorting result form a core cluster; or N currently available processing cores whose temperature is less than or equal to a preset temperature threshold are determined from a plurality of currently available processing cores to form a target core cluster.
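  • a minimal sketch of the two selection rules just described (coolest-N ordering, or any N cores under a temperature threshold); the function and parameter names are assumed for illustration.

        def pick_target_cores_by_temperature(available, temps, n, temp_threshold=None):
            """available: list of core ids; temps: dict core id -> measured temperature.
            Return N target core ids chosen by temperature, or None if not enough."""
            if temp_threshold is None:
                # sort the currently available cores from coolest to hottest, take the top N
                ranked = sorted(available, key=lambda core: temps[core])
                return ranked[:n] if len(ranked) >= n else None
            # otherwise take N cores whose temperature is at or below the preset threshold
            cool = [core for core in available if temps[core] <= temp_threshold]
            return cool[:n] if len(cool) >= n else None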
  • the present disclosure does not limit the specific selection method of the target processing core and the specific value of the temperature threshold.
  • in some embodiments, a core cluster may also be formed of N currently available processing cores that are physically close, thereby reducing the number of routing steps during task processing and improving data transmission efficiency.
  • in this way, target processing cores can be determined from the currently available processing cores to form a core cluster, which improves the efficiency and rationality of task allocation and further improves the flexibility of task processing in the many-core system.
  • FIG. 4 is a flowchart of a task allocation method provided by an embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a task allocation method.
  • the method is used for a pre-established core cluster in a many-core system; the core cluster is determined based on the above-mentioned task processing method, the core cluster includes a plurality of target processing cores, and the plurality of target processing cores include a target processing core serving as a main core. The task allocation method for the core cluster is implemented based on the main core of the core cluster, and the task allocation method includes:
  • Step S21 receiving the task to be processed.
  • Step S22 Divide the task to be processed into multiple subtasks.
  • Step S23 Allocate corresponding subtasks to at least some of the target processing cores in the core cluster, and each target processing core corresponds to at least one subtask, so that at least some of the target processing cores execute the corresponding subtasks respectively.
  • in step S21, a task to be processed from the task processing device is received, or a task to be processed from the first processing core is received.
  • the to-be-processed task may be parsed in step S22, the to-be-processed task is decomposed into several calculation steps, and the dependencies between the calculation steps are determined according to each calculation step
  • the subtasks are divided according to the resource requirements (such as storage resource requirements and computing resource requirements) required by the computing steps and the dependencies between the computing steps.
  • the calculation steps with dependencies are divided into the same subtask, and the calculation steps without dependencies are divided into different subtasks.
  • in addition, the resource demands corresponding to the divided subtasks are kept as even and balanced as possible.
  • computing steps with dependencies refer to computing steps that require serial computing
  • computing steps that do not have dependencies refer to computing steps that can be computed in parallel.
  • the multiple subtasks that are finally divided are multiple subtasks that can be calculated in parallel.
  • the calculation process includes convolution, pooling, convolution, pooling, convolution, pooling, and full connection.
  • the result of the convolution is the input of the pooling, and the result of the pooling is the input of the full connection, indicating that there is a dependency among convolution, pooling and full connection.
  • a convolution can be decomposed into multiple calculation steps, each processing part of the convolution, so there is no dependency between the multiple calculation steps corresponding to a convolution; in the same way, there is no dependency between the multiple calculation steps corresponding to a pooling, and no dependency between the multiple calculation steps corresponding to the full connection.
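  • the splitting rule above can be pictured as grouping the connected components of the dependency graph: steps that depend on each other end up in the same subtask, and independent groups become separate subtasks that can run in parallel. The following sketch assumes the steps and their dependencies are given as simple Python structures; it is an illustration under these assumptions, not the specific splitting algorithm of the disclosure.

        from collections import defaultdict

        def split_into_subtasks(steps, depends_on):
            """steps: list of step ids; depends_on: dict step -> set of prerequisite steps."""
            adj = defaultdict(set)
            for step, prereqs in depends_on.items():
                for p in prereqs:
                    adj[step].add(p)   # treat dependency edges as undirected links
                    adj[p].add(step)
            seen, subtasks = set(), []
            for step in steps:
                if step in seen:
                    continue
                stack, group = [step], []
                while stack:            # collect one connected component of dependent steps
                    s = stack.pop()
                    if s in seen:
                        continue
                    seen.add(s)
                    group.append(s)
                    stack.extend(adj[s] - seen)
                subtasks.append(group)  # one subtask per component; components run in parallel
            return subtasks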
  • corresponding subtasks may be allocated to at least some target processing cores in the core cluster, and each target processing core corresponds to at least one subtask, for at least some target processing cores to execute corresponding subtasks respectively Subtasks.
  • in this way, the main core of the core cluster divides the to-be-processed task corresponding to the core cluster into multiple subtasks and allocates the multiple subtasks to the target processing cores (including the main core itself) in the cluster for processing, thereby effectively improving the processing efficiency of the tasks to be processed.
  • FIG. 5 is a flowchart of a task allocation method provided by an embodiment of the present disclosure.
  • in some embodiments, steps S221 to S223 are further included.
  • Step S221 count the current valid core resource information of the core cluster.
  • the current valid core resource information of the core cluster may include the number of target processing cores in the core cluster and the current remaining (idle) resources of each target processing core, such as the current remaining amount of computing resources, the current remaining amount of storage resources and/or the current remaining amount of bandwidth resources.
  • Step S222 according to the current valid core resource information, determine whether the core cluster currently has the conditions for executing the multiple subtasks currently split, if yes, jump to step S23, otherwise, go to step S223.
  • in step S222, it is judged whether the current valid core resource information of the core cluster satisfies the resource requirements corresponding to each subtask currently split; if so, it is judged that the core cluster currently has the conditions to execute the multiple subtasks currently split, otherwise it is judged that the core cluster currently does not have those conditions.
  • for example, if one target processing core satisfies the conditions for executing multiple subtasks, each of the other target processing cores satisfies the conditions for executing at least one subtask, and the total number of subtasks that all target processing cores can execute is consistent with the number of subtasks currently split, then it is judged that the core cluster currently has the conditions to execute the multiple subtasks.
  • condition that the target processing core satisfies the execution of multiple subtasks may be that the current remaining resource of the target processing core is greater than or equal to the resource requirements required by the plurality of subtasks, and the condition that the target processing core satisfies the execution of at least one subtask may be The current remaining resource amount of the target processing core is greater than or equal to the resource demand amount required by the at least one subtask.
  • for example, a target processing core satisfies the conditions for executing at least one subtask when the remaining memory capacity of the target processing core (i.e., the current remaining storage resources) exceeds the static memory required by the at least one subtask (i.e., its storage resource requirement), the remaining computing power of the target processing core (i.e., the current remaining computing resources) exceeds the computation required by the at least one subtask (i.e., its computing resource requirement), and the current routing transmission bandwidth of the target processing core (i.e., the current remaining bandwidth resources) exceeds the routing transmission volume required by the at least one subtask (i.e., its bandwidth resource requirement). If the sum of the numbers of subtasks that all target processing cores can execute is consistent with the number of subtasks currently split, all target processing cores of the core cluster satisfy the conditions for executing their corresponding subtasks, and it is judged that the core cluster currently has the conditions to execute the currently split subtasks. If at least one target processing core does not meet the conditions for executing any subtask, for example, its remaining storage resources are less than the storage resources required by the subtask, it is determined that the core cluster currently does not have the conditions for executing the multiple subtasks currently split.
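  • a hedged sketch of this feasibility check: every currently split subtask must fit within some target core's remaining memory, computing power and routing bandwidth. A simple first-fit matching and the dictionary field names are assumed here; the disclosure does not fix a particular matching method.

        def cluster_can_execute(cores, subtasks):
            """cores: list of dicts with remaining 'mem', 'compute', 'bw';
            subtasks: list of dicts with required 'mem', 'compute', 'bw'."""
            remaining = [dict(core) for core in cores]   # copy current idle resources
            for task in subtasks:
                for core in remaining:
                    if (core["mem"] >= task["mem"]
                            and core["compute"] >= task["compute"]
                            and core["bw"] >= task["bw"]):
                        core["mem"] -= task["mem"]       # reserve the resources on this core
                        core["compute"] -= task["compute"]
                        core["bw"] -= task["bw"]
                        break
                else:
                    return False  # some subtask has no core meeting its demands
            return True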
  • step S222 in response to determining that the core cluster currently has the conditions for executing the currently split subtasks, the step of allocating subtasks to each target processing core in the core cluster is performed, that is, step S23 is performed.
  • step S222 in response to determining that the core cluster currently does not have the conditions to execute the multiple subtasks currently split, jump to step S223.
  • Step S223 in response to judging that the core cluster does not currently have the conditions for executing the multiple subtasks currently split, the task to be processed is re-split into multiple new subtasks, and the process jumps to step S222.
  • if the core cluster still cannot meet the conditions for executing the split subtasks after the task to be processed has been re-split multiple times, the task processing device (such as the first processing core) may be requested to add more core resources to the core cluster, or other idle processing cores in the many-core system may be determined and added to the core cluster to expand its current remaining resources, so that the core cluster can meet the execution requirements.
  • the threshold of the number of splits may be preset, for example, may be set to 3 times, which is not limited in the present disclosure.
  • in some embodiments, before step S23, the task allocation method further includes: for each target processing core in at least part of the target processing cores, determining at least one subtask matching the target processing core according to the current remaining resource information of the target processing core and the resource requirements corresponding to each of the currently split subtasks.
  • for at least some of the target processing cores, if the current remaining resource information of a certain target processing core only satisfies the conditions for executing one subtask among the multiple subtasks, the target processing core is matched with that subtask, and in the subsequent task allocation that subtask can be allocated to the target processing core. If the current remaining resource information of a certain target processing core satisfies the conditions for executing two of the currently split subtasks, the target processing core is matched with those two subtasks, and the two subtasks can be allocated to that target processing core.
  • correspondingly, allocating the corresponding subtasks to the at least some target processing cores includes: for each subtask, allocating the subtask to the target processing core in the core cluster that matches this subtask.
  • the subtasks may also be allocated according to the time balancing strategy, so that the time required for each target processing core to process the respective allocated subtasks is equal or approximately equal , that is, the time used by each target processing core to process the corresponding subtask is equal or approximately equal to the time used by other target processing cores.
  • in some embodiments, before step S23, the task allocation method further includes: determining a subtask allocation strategy matching the core cluster according to the resource requirements corresponding to each subtask, and the current remaining resource information and location information of each target processing core in at least some of the target processing cores.
  • for example, for each target processing core in at least some of the target processing cores, at least one subtask matching that target processing core is first determined. After the matching subtasks of each target processing core are determined, the subtasks finally allocated to each target processing core, that is, the subtasks finally executed by each target processing core, are determined according to the position information of each target processing core, so as to obtain the subtask allocation strategy matching the core cluster. The matching subtask allocation strategy includes the corresponding relationship between each target processing core and the subtask it finally needs to execute, and is chosen so that the overall routing transmission bandwidth corresponding to the core cluster is minimized, thereby ensuring the performance of the many-core system.
  • in some embodiments, the routing transmission cost algorithm includes a routing transmission cost function, in which: W represents the routing transmission cost corresponding to an initial subtask allocation strategy; M_n represents the routing transmission cost corresponding to the n-th subtask; x_i and y_i represent the x-axis and y-axis position coordinates of the main core of the core cluster in a preset position coordinate system; and x_j and y_j represent the x-axis and y-axis position coordinates, in the preset position coordinate system, of the target processing core corresponding to the n-th subtask in the initial correspondence relationship.
  • the preset position coordinate system may be a two-dimensional coordinate system established in advance with the first processing core in the many-core system as the origin, or may be a two-dimensional coordinate system established in advance with the main nucleus in the core cluster as the origin, At this time, the position coordinate of the main core in the core cluster is (0, 0).
  • the initial subtask allocation strategy with the smallest value of the route transmission cost W is used as the subtask allocation strategy matching the core cluster.
  • in some embodiments, optimization methods such as a neural-network-based reinforcement learning method, a simulated annealing algorithm or a convex optimization algorithm may further be adopted to optimize the above routing transmission cost function, and the optimal solution of the subtask assignment can be calculated from the optimized cost function based on the position information of the target processing cores; the optimal solution is the subtask assignment strategy matched with the core cluster.
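  • the cost function itself is not reproduced in this text. A plausible reconstruction, consistent with the per-subtask terms M_n and the coordinate definitions above, sums a Manhattan-distance routing term over all subtasks; the data-volume weighting and the exact form are assumptions for illustration only.

        def routing_cost(main_xy, assignment, traffic):
            """assignment: subtask index n -> (x_j, y_j) of its target core;
            traffic: subtask index n -> routed data volume for that subtask."""
            x_i, y_i = main_xy
            total = 0.0
            for n, (x_j, y_j) in assignment.items():
                hops = abs(x_i - x_j) + abs(y_i - y_j)  # Manhattan routing distance
                total += traffic[n] * hops              # M_n for the n-th subtask
            return total  # W: choose the assignment that minimizes this value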
  • assigning corresponding subtasks to the at least part of the target processing cores includes: based on the subtask assignment strategy matching the core cluster, Corresponding subtasks are assigned to at least some of the target processing cores in the core cluster.
  • the task assignment method further includes: step S231-step S233.
  • Step S231 generating task configuration information of subtasks corresponding to each target processing core in at least some of the target processing cores.
  • for the subtasks corresponding to each target processing core, corresponding task configuration information is generated; the task configuration information includes but is not limited to: memory configuration information, operation operator configuration information, routing configuration information, calculation control configuration information, and synchronization management configuration information.
  • the memory configuration information may include the memory information required by the target processing core to execute the subtask and the data and parameters that need to be stored or imported to execute the subtask; the operation operator configuration information may include the information of the operation operators to be executed; the routing configuration information may include the routing transmission paths of data such as calculation results (task execution results); the calculation control configuration information may include information such as the operation sequence and operation time; and the synchronization management configuration information may include information on the data to be synchronized.
  • in some embodiments, the task allocation method further includes: generating, for each target processing core, a corresponding task execution instruction, address generation logic, control logic and synchronization logic.
  • the address generation logic may be, for example, the generation logic of the memory read/write address corresponding to the target processing core
  • the control logic may be, for example, the operation timing and sequence control logic of each target processing core
  • the synchronization logic may be, for example, the logic describing which data are to be processed synchronously.
  • the above-mentioned task configuration information may include logic such as address generation logic, control logic, and synchronization logic.
  • Step S232 Send respective task configuration information to each target processing core in at least some of the target processing cores, so that each target processing core stores its corresponding task configuration information.
  • step S232 the corresponding task configuration information is respectively sent to each target processing core in at least a part of the target processing cores through the network-on-chip.
  • Step S233 in response to the configuration completion information returned by each target processing core, notify the data source corresponding to the to-be-processed task to provide each target processing core with the task data required by each target processing core.
  • in step S233, after task configuration has been performed on each target processing core of at least some of the target processing cores, the data source corresponding to the task to be processed may be notified to deliver the task data (for example, the input picture data, text data, etc. corresponding to the task); the data source can be notified by signaling.
  • step S23 may include: respectively sending corresponding task execution instructions to at least some of the target processing cores, so that each target processing core executes corresponding subtasks in response to the task execution instructions.
  • step S23 and the foregoing step S233 may be performed synchronously and in parallel, or step S23 may be performed after step S233, which is not limited in this embodiment of the present disclosure.
  • each target processing core in at least some of the target processing cores executes the corresponding subtask in response to the task execution instruction, and obtains and stores the corresponding task execution result or calculation result. After the execution of the subtask is completed, the subtask completion message sent by each target processing core is received.
  • after receiving the subtask completion messages of each target processing core, it is determined that the to-be-processed task has been completed, and the data source is notified to read the task execution results distributed in each target processing core out to the memory outside the many-core chip, such as the host memory (Host DDR) or other external memory (DDR).
  • the corresponding core cluster may be disbanded, for example, the control logic may be suspended. At the same time, it is reported to the task processing device that the core cluster has been disbanded, so that the next task to be processed can be performed.
  • FIG. 6 is a flowchart of a task processing method based on a many-core system according to an embodiment of the present disclosure
  • FIG. 7 is a block diagram of a many-core system.
  • the many-core system includes a first processing core and a plurality of second processing cores
  • the task processing method includes:
  • Step S301 the first processing core receives the task to be processed sent by the host.
  • Step S302 The first processing core determines a core cluster for processing the to-be-processed task according to the to-be-processed task, where the core cluster includes at least one target processing core determined from a plurality of second processing cores.
  • Step S303 The first processing core selects a target processing core from the core cluster as the main core of the core cluster, and sends the to-be-processed task to the main core.
  • Step S304 the main core splits the to-be-processed task into multiple subtasks.
  • Step S305 the main core judges whether the core cluster currently has the conditions for executing the currently split multiple subtasks according to the current valid core resource information of the core cluster.
  • Step S306 the main core determines at least one subtask corresponding to each target processing core in at least some of the target processing cores in response to judging that the core cluster currently has the conditions for executing the multiple subtasks currently split.
  • Step S307 The main core generates task configuration information of subtasks corresponding to each target processing core, and sends the task configuration information to each target processing core.
  • Step S308 In response to the configuration completion information returned by each target processing core, the master core notifies the data source corresponding to the task to be processed to provide each target processing core with the task data required by each target processing core.
  • Step S309 the main core sends a corresponding task execution instruction to each target processing core, so that each target processing core executes its corresponding subtask in response to the task execution instruction.
  • Step S310 After the task to be processed is completed, the main core notifies the data source corresponding to the task to be processed to read out the execution results of each task distributed in each target processing core to the external memory.
  • Step S311 the main core disbands the core cluster, and feeds back information that the core cluster has been disbanded to the first processing core.
  • FIG. 8 is a block diagram of the composition of a task processing apparatus provided by an embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a task processing apparatus 500; the task processing apparatus is applied to a many-core system, the many-core system includes a plurality of processing cores, and the task processing apparatus 500 includes: a first receiving module 501, a core cluster forming module 502 and a sending module 503.
  • the first receiving module 501 is used to receive the task to be processed;
  • the core cluster forming module 502 is used to determine the core cluster used to process the to-be-processed task according to the to-be-processed task, and the core cluster includes a core cluster determined from a plurality of processing cores At least one target processing core;
  • the sending module 503 is configured to send the pending task to the core cluster, so that the core cluster can execute the pending task.
  • the task processing apparatus 500 provided by the embodiment of the present disclosure is used to implement the above-mentioned task processing method.
  • for details of the task processing apparatus 500, reference may be made to the description in the above-mentioned task processing method, which will not be repeated here.
  • in some embodiments, the core cluster forming module 502 is configured to: determine, according to the to-be-processed task, the resource demand required to execute the to-be-processed task; and determine, according to the resource demand, at least one target processing core from the currently available processing cores to form the core cluster.
  • in some embodiments, the number of target processing cores is multiple, and the apparatus further includes:
  • a main core selection module for selecting a target processing core from the core cluster as the main core of the core cluster
  • the sending module 503 is configured to: send the to-be-processed task to the main core, so that the main core can perform task assignment to at least some target processing cores in the core cluster according to the to-be-processed task .
  • in some embodiments, the core cluster forming module 502 is configured to: obtain the temperature of the currently available processing cores; and determine, according to the to-be-processed task and the temperature of the currently available processing cores, at least one target processing core from the currently available processing cores to form the core cluster.
  • the acquiring the temperature of the currently available processing core includes: receiving temperature information sent by a preset controller, where the temperature information includes temperature data measured by a temperature sensor corresponding to the controller and the temperature measurement distance of the processing core, where the temperature measurement distance is the distance between the temperature sensor and the processing core;
  • and determining, according to the temperature information, the temperatures of the currently available processing cores respectively.
  • in some embodiments, determining at least one target processing core from the currently available processing cores according to the task to be processed and the temperature of the currently available processing cores includes: determining, according to parameters of the task to be processed, the number N of target processing cores used to execute the task to be processed, where N is a positive integer; and sorting the temperatures of the currently available processing cores from low to high and determining the top N processing cores in the sorting result as target processing cores, or determining, from the currently available processing cores, N processing cores whose temperatures are less than or equal to a preset temperature threshold as target processing cores.
  • the plurality of processing cores include a first processing core and a plurality of second processing cores
  • the task processing device is implemented based on the first processing core
  • the core cluster includes at least one target processing core determined from the plurality of second processing cores.
  • FIG. 9 is a block diagram of the composition of a task allocation apparatus provided by an embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a task allocation apparatus 600, which is used to implement the above task allocation method, and perform task allocation on a pre-established core cluster in a many-core system, where the core cluster includes a plurality of The target processing core, the task allocation device 600 includes: a second receiving module 601 , a task splitting module 602 and a task allocation module 603 .
  • the second receiving module 601 is used to receive the task to be processed; the task splitting module 602 is used to split the task to be processed into multiple subtasks; the task allocation module 603 is used to allocate corresponding tasks to at least some of the target processing cores in the core cluster Each target processing core in at least some of the target processing cores corresponds to at least one subtask, so that at least some of the target processing cores can respectively execute the corresponding subtask.
  • the task allocation apparatus 600 provided by the embodiment of the present disclosure is used to implement the above task allocation method.
  • for details of the task allocation apparatus 600, reference may be made to the description in the above task allocation method, which will not be repeated here.
  • in some embodiments, the apparatus further includes: a resource statistics module, configured to count the current valid core resource information of the core cluster; and a condition judgment module, configured to determine, according to the current valid core resource information, whether the core cluster currently has the conditions to execute the multiple subtasks currently split; wherein the task allocation module 603 is configured to: in response to determining that the core cluster currently has the conditions to execute the multiple subtasks currently split, allocate corresponding subtasks to at least some of the target processing cores in the core cluster.
  • the apparatus further includes: a re-splitting module, configured to, in response to a judgment that the core cluster does not currently satisfy the conditions for executing the currently split subtasks, re-split the task to be processed into a plurality of new subtasks, after which the condition judgment module continues to be executed.
  • before the task allocation module 603 operates, the apparatus further includes: a task determination module, configured to, for each target processing core in the at least some target processing cores, determine the subtask matching the target processing core according to the current remaining resource information of the target processing core and the resource requirements respectively corresponding to the currently split subtasks.
  • the task allocation module 603 is configured to: for each subtask, assign the subtask to the target processing core in the core cluster that matches the subtask.
  • before the task allocation module 603 operates, the apparatus further includes: a policy determination module, configured to determine a subtask allocation strategy matching the core cluster according to the resource requirements corresponding to each subtask and the current remaining resource information and location information of the at least some target processing cores; the task allocation module 603 is configured to: based on the subtask allocation strategy, assign subtasks to each target processing core in the core cluster.
  • the apparatus further includes: a configuration information generating module, configured to generate task configuration information of the subtask corresponding to each target processing core in the at least some target processing cores; and a configuration information sending module, configured to send the corresponding task configuration information to each target processing core in the at least some target processing cores respectively, so that each target processing core stores its corresponding task configuration information.
  • the apparatus further includes: a task data providing module, configured to, in response to the configuration completion information returned by each target processing core, notify the task data source corresponding to the task to be processed to provide each target processing core with the task data it requires; the task allocation module 603 is configured to: send corresponding task execution instructions to the at least some target processing cores respectively, so that each target processing core executes its corresponding subtask in response to the task execution instruction (this configure-then-execute handshake is sketched in code after this description).
  • An embodiment of the present disclosure further provides a processing core, where the processing core includes: the above-mentioned task processing apparatus and/or the above-mentioned task allocation apparatus.
  • FIG. 10 is a block diagram of an electronic device according to an embodiment of the present disclosure.
  • an embodiment of the present disclosure provides an electronic device, which includes multiple processing cores 701 and an on-chip network 702, wherein the multiple processing cores 701 are all connected to the on-chip network 702, and the on-chip network 702 is used for exchanging data among the multiple processing cores 701 and exchanging data with external devices.
  • one or more first instructions are stored in one or more processing cores 701, and the one or more first instructions are executed by the one or more processing cores 701, so that the one or more processing cores 701 can execute the above-mentioned task processing method; and/or, one or more second instructions are stored in one or more processing cores 701, and the one or more second instructions are executed by the one or more processing cores 701, so that the one or more processing cores 701 can execute the above-mentioned task allocation method.
  • an embodiment of the present disclosure also provides a computer-readable medium on which a first computer program and/or a second computer program is stored, wherein the first computer program implements the above-mentioned task processing method when executed by a processing core, and the second computer program implements the above-mentioned task allocation method when executed by a processing core.
  • an embodiment of the present disclosure also provides a computer program product which, when running on a computer, causes the computer to execute the above-mentioned task processing method or the above-mentioned task allocation method.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules or other data.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media, as is well known to those of ordinary skill in the art.
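The temperature-based selection of target processing cores described above can be illustrated with a short sketch. This is only an illustrative example, not the patented implementation: the names CoreStatus, select_target_cores, n and temp_threshold are assumptions introduced here for clarity, and the sketch simply shows the two selection strategies (take the N coolest cores, or take N cores at or below a preset temperature threshold).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CoreStatus:
    core_id: int
    temperature: float  # temperature determined from the controller's temperature information


def select_target_cores(available: List[CoreStatus], n: int,
                        temp_threshold: Optional[float] = None) -> List[int]:
    """Return the ids of N target processing cores chosen by temperature."""
    if temp_threshold is None:
        # Strategy 1: sort temperatures from low to high and keep the first N cores.
        chosen = sorted(available, key=lambda c: c.temperature)[:n]
    else:
        # Strategy 2: keep N cores whose temperature is at or below the preset threshold.
        chosen = [c for c in available if c.temperature <= temp_threshold][:n]
    if len(chosen) < n:
        raise RuntimeError("not enough suitable processing cores are currently available")
    return [c.core_id for c in chosen]


# Example: select_target_cores([CoreStatus(0, 61.0), CoreStatus(1, 48.5), CoreStatus(2, 55.0)], n=2)
# -> [1, 2]
```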
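The split, condition-judgment, re-split and allocation loop performed by the task splitting module, the condition judgment module and the task allocation module can likewise be sketched. The helper names (split, cluster_can_execute, allocate) and the resource model (a single memory figure per core and per subtask) are assumptions made for this sketch only; a real many-core allocator would track richer core resource information.

```python
from typing import Dict, List


def split(task: dict, granularity: int) -> List[dict]:
    """Hypothetical splitter: cut the task into `granularity` equal subtasks."""
    return [{"workload": task["workload"] // granularity,
             "memory": task["memory"] // granularity}
            for _ in range(granularity)]


def cluster_can_execute(subtasks: List[dict], free_resources: List[Dict[str, int]]) -> bool:
    """Condition judgment: every subtask must fit on some core with spare memory."""
    remaining = [dict(r) for r in free_resources]
    for sub in subtasks:
        fit = next((r for r in remaining if r["memory"] >= sub["memory"]), None)
        if fit is None:
            return False
        fit["memory"] -= sub["memory"]
    return True


def allocate(task: dict, free_resources: List[Dict[str, int]],
             max_granularity: int) -> Dict[int, List[dict]]:
    """Split the task, re-splitting more finely until the cluster can run it."""
    for granularity in range(1, max_granularity + 1):
        subtasks = split(task, granularity)
        if not cluster_can_execute(subtasks, free_resources):
            continue  # re-split into a larger number of smaller subtasks
        # Greedy assignment: each subtask goes to the first core that still fits it.
        plan: Dict[int, List[dict]] = {i: [] for i in range(len(free_resources))}
        remaining = [dict(r) for r in free_resources]
        for sub in subtasks:
            for core_id, res in enumerate(remaining):
                if res["memory"] >= sub["memory"]:
                    res["memory"] -= sub["memory"]
                    plan[core_id].append(sub)
                    break
        return plan
    raise RuntimeError("task cannot be placed on the core cluster")


# Example: allocate({"workload": 100, "memory": 64},
#                   [{"memory": 24}, {"memory": 24}, {"memory": 24}], max_granularity=8)
```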
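Finally, the configure-then-execute handshake involving the task configuration information, the configuration completion information, the task data source and the task execution instructions can be sketched as follows. The class and message names (TargetCore, run_on_cluster, configs, data_source) are hypothetical and are introduced only to show the ordering: configure every target core, wait for configuration completion, provide task data, then issue the execution instruction.

```python
from typing import Dict, List


class TargetCore:
    """A stand-in for one target processing core in the core cluster."""

    def __init__(self, core_id: int):
        self.core_id = core_id
        self.config: dict = {}
        self.data: list = []

    def configure(self, config: dict) -> bool:
        self.config = config   # store the task configuration information
        return True            # return value models the configuration completion information

    def receive_data(self, data: list) -> None:
        self.data = data       # task data pushed by the task data source

    def execute(self) -> str:
        # models reacting to the task execution instruction
        return f"core {self.core_id} executed {self.config['subtask']} on {len(self.data)} items"


def run_on_cluster(cores: List[TargetCore], configs: Dict[int, dict],
                   data_source: Dict[int, list]) -> List[str]:
    # Step 1: send each target core its task configuration information.
    completed = {core.core_id: core.configure(configs[core.core_id]) for core in cores}
    # Step 2: once every core reports configuration completion, notify the task
    # data source to provide each core with the task data it needs.
    if not all(completed.values()):
        raise RuntimeError("some target processing cores did not finish configuration")
    for core in cores:
        core.receive_data(data_source[core.core_id])
    # Step 3: send the task execution instructions; each core runs its subtask.
    return [core.execute() for core in cores]


# Example: run_on_cluster([TargetCore(0), TargetCore(1)],
#                         {0: {"subtask": "part-0"}, 1: {"subtask": "part-1"}},
#                         {0: [1, 2, 3], 1: [4, 5]})
```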

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

Disclosed are a task processing method and apparatus, a task assignment method, and an electronic device and medium. A many-core system includes a plurality of processing cores. The method comprises: receiving a task to be processed; determining, according to the task to be processed, a core cluster used to process the task, the core cluster comprising at least one target processing core determined from the plurality of processing cores; and sending the task to be processed to the core cluster so that the core cluster executes the task.
PCT/CN2021/132344 2020-11-24 2021-11-23 Procédé et appareil de traitement de tâches, procédé d'attribution de tâches, et dispositif et support électroniques WO2022111453A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202011330531.9 2020-11-24
CN202011330531.9A CN114546630A (zh) 2020-11-24 2020-11-24 任务处理方法及分配方法、装置、电子设备、介质
CN202011480656.XA CN114637593A (zh) 2020-12-15 2020-12-15 任务分配方法、处理核、电子设备和计算机可读介质
CN202011480656.X 2020-12-15

Publications (1)

Publication Number Publication Date
WO2022111453A1 true WO2022111453A1 (fr) 2022-06-02

Family

ID=81755051

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/132344 WO2022111453A1 (fr) 2020-11-24 2021-11-23 Procédé et appareil de traitement de tâches, procédé d'attribution de tâches, et dispositif et support électroniques

Country Status (1)

Country Link
WO (1) WO2022111453A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086232A (zh) * 2022-06-13 2022-09-20 清华大学 任务处理及数据流生成方法和装置
CN115658269A (zh) * 2022-11-01 2023-01-31 上海玫克生储能科技有限公司 一种用于任务调度的异构计算终端
CN116107724A (zh) * 2023-04-04 2023-05-12 山东浪潮科学研究院有限公司 一种ai加速核调度管理方法、装置、设备及存储介质
CN116405555A (zh) * 2023-03-08 2023-07-07 阿里巴巴(中国)有限公司 数据传输方法、路由节点、处理单元和片上系统
CN117632520A (zh) * 2024-01-25 2024-03-01 山东省计算中心(国家超级计算济南中心) 基于申威众核处理器的主从核监测交互的计算量调度方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101390067A (zh) * 2006-02-28 2009-03-18 英特尔公司 增强众核处理器的可靠性
CN105718318A (zh) * 2016-01-27 2016-06-29 上海戴西实业有限公司 一种基于辅助工程设计软件的集合式调度优化方法
US20170220719A1 (en) * 2016-02-01 2017-08-03 King Fahd University Of Petroleum And Minerals Multi-core compact executable trace processor
CN111143051A (zh) * 2018-11-05 2020-05-12 三星电子株式会社 通过异构资源执行人工神经网络来执行任务的系统和方法
CN111932027A (zh) * 2020-08-28 2020-11-13 电子科技大学 一种融合边缘设施的云服务综合调度优化系统及方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101390067A (zh) * 2006-02-28 2009-03-18 英特尔公司 增强众核处理器的可靠性
CN105718318A (zh) * 2016-01-27 2016-06-29 上海戴西实业有限公司 一种基于辅助工程设计软件的集合式调度优化方法
US20170220719A1 (en) * 2016-02-01 2017-08-03 King Fahd University Of Petroleum And Minerals Multi-core compact executable trace processor
CN111143051A (zh) * 2018-11-05 2020-05-12 三星电子株式会社 通过异构资源执行人工神经网络来执行任务的系统和方法
CN111932027A (zh) * 2020-08-28 2020-11-13 电子科技大学 一种融合边缘设施的云服务综合调度优化系统及方法

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086232A (zh) * 2022-06-13 2022-09-20 清华大学 任务处理及数据流生成方法和装置
CN115086232B (zh) * 2022-06-13 2023-07-21 清华大学 任务处理及数据流生成方法和装置
CN115658269A (zh) * 2022-11-01 2023-01-31 上海玫克生储能科技有限公司 一种用于任务调度的异构计算终端
CN115658269B (zh) * 2022-11-01 2024-02-27 上海玫克生储能科技有限公司 一种用于任务调度的异构计算终端
CN116405555A (zh) * 2023-03-08 2023-07-07 阿里巴巴(中国)有限公司 数据传输方法、路由节点、处理单元和片上系统
CN116405555B (zh) * 2023-03-08 2024-01-09 阿里巴巴(中国)有限公司 数据传输方法、路由节点、处理单元和片上系统
CN116107724A (zh) * 2023-04-04 2023-05-12 山东浪潮科学研究院有限公司 一种ai加速核调度管理方法、装置、设备及存储介质
CN117632520A (zh) * 2024-01-25 2024-03-01 山东省计算中心(国家超级计算济南中心) 基于申威众核处理器的主从核监测交互的计算量调度方法
CN117632520B (zh) * 2024-01-25 2024-05-17 山东省计算中心(国家超级计算济南中心) 基于申威众核处理器的主从核监测交互的计算量调度方法

Similar Documents

Publication Publication Date Title
WO2022111453A1 (fr) Procédé et appareil de traitement de tâches, procédé d'attribution de tâches, et dispositif et support électroniques
KR101286700B1 (ko) 멀티 코어 프로세서 시스템에서 로드 밸런싱을 위한 장치및 방법
US10572290B2 (en) Method and apparatus for allocating a physical resource to a virtual machine
CN107688492B (zh) 资源的控制方法、装置和集群资源管理系统
US9092266B2 (en) Scalable scheduling for distributed data processing
WO2016078008A1 (fr) Procédé et appareil de programmation d'une tâche d'un flux de données
US20230136661A1 (en) Task scheduling for machine-learning workloads
US7920282B2 (en) Job preempt set generation for resource management
US10936377B2 (en) Distributed database system and resource management method for distributed database system
Garala et al. A performance analysis of load Balancing algorithms in Cloud environment
JP7506096B2 (ja) 計算資源の動的割り当て
WO2020238989A1 (fr) Procédé et appareil permettant de planifier une entité de traitement de tâche
US11093291B2 (en) Resource assignment using CDA protocol in distributed processing environment based on task bid and resource cost
US20210390405A1 (en) Microservice-based training systems in heterogeneous graphic processor unit (gpu) cluster and operating method thereof
US9152549B1 (en) Dynamically allocating memory for processes
CN107634978B (zh) 一种资源调度方法及装置
CN114546630A (zh) 任务处理方法及分配方法、装置、电子设备、介质
WO2022111466A1 (fr) Procédé de planification de tâches, procédé de commande, dispositif électronique et support lisible par ordinateur
US10180712B2 (en) Apparatus and method for limiting power in symmetric multiprocessing system
JP5526748B2 (ja) パケット処理装置、パケット振り分け装置、制御プログラム及びパケット分散方法
Li et al. Topology-aware job allocation in 3d torus-based hpc systems with hard job priority constraints
US20170161110A1 (en) Computing resource controller and control method for multiple engines to share a shared resource
CN115878309A (zh) 资源分配方法、装置、处理核、设备和计算机可读介质
US10142245B2 (en) Apparatus and method for parallel processing
US20240143394A1 (en) Heterogeneous computing terminal for task scheduling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21896954

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.09.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21896954

Country of ref document: EP

Kind code of ref document: A1