CN113703936A - Method for creating computing power container, computing power platform, electronic device and storage medium

Info

Publication number: CN113703936A
Application number: CN202110397131.8A
Authority: CN (China)
Legal status: Pending
Inventor: 查冲
Assignee/Applicant: Tencent Technology (Shenzhen) Co., Ltd.
Other languages: Chinese (zh)
Prior art keywords: task, affinity, resource pool, cache, resources

Classifications

    • G06F9/4843 - Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/5022 - Allocation of resources to service a request; mechanisms to release resources
    • G06F9/5027 - Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F2209/5011 - Indexing scheme relating to resource allocation: pool
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

A method for creating a computing power container, a computing power platform, an electronic device and a storage medium are provided, relating to the field of big data processing in cloud technology. The method comprises the following steps: obtaining a target task through an affinity task cache; determining, through the affinity task cache and according to the time period of the target task, a first resource pool corresponding to the target task among N affinity task resource pools respectively corresponding to N time periods, where N is more than 1; and creating a computing power container for the target task in the first resource pool. The method can regularize the computing power resource pool in the time dimension, avoid scattering of the computing power resource pool at time granularity, and reduce computing power fragmentation.

Description

Method for creating computing power container, computing power platform, electronic device and storage medium
Technical Field
The embodiments of this application relate to the field of cloud technologies, in particular to big data processing in cloud technology, and more particularly to a method for creating a computing power container, a computing power platform, an electronic device and a storage medium.
Background
To date, the industry's main technical approach to reducing computing power fragmentation is as follows: after a user submits an AI training task, the user's requirements are satisfied from a resource perspective, usually with a saturation-first algorithm; that is, computing power fragmentation is reduced from the resource angle by preferentially allocating fragmented computing power devices. However, in terms of resource allocation scenarios, the saturation-first algorithm is effective for statically provisioned, production-delivery resource scenarios, whereas in dynamic task-submission scenarios fragmentation can still occur even when tasks are bin-packed according to saturation priority.
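For illustration, a minimal sketch of the saturation-first idea described above. This is a hedged illustration only, not the patent's implementation: the device names, capacity units and function name are invented.

```python
def saturation_first_allocate(devices, required):
    """Saturation-first sketch: among devices with enough free capacity,
    prefer the one that is already most saturated (least free capacity),
    so existing computing power fragments are consumed first.
    'devices' maps device name -> free computing power (arbitrary units)."""
    candidates = [(free, name) for name, free in devices.items() if free >= required]
    if not candidates:
        return None                    # no single device can host the task
    free, name = min(candidates)       # least free capacity = most saturated
    devices[name] = free - required
    return name

gpus = {"gpu-0": 8.0, "gpu-1": 2.0, "gpu-2": 5.0}
print(saturation_first_allocate(gpus, required=2.0))  # -> gpu-1, the fragment
```

As the passage above notes, such per-request packing works on the resource axis but does nothing about tasks whose running times differ widely.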
Therefore, there is a need in the art for a more efficient method of creating computing power containers for diversified resource allocation scenarios.
Disclosure of Invention
The embodiments of this application provide a method for creating a computing power container, a computing power platform, an electronic device and a storage medium, which can regularize the computing power resource pool in the time dimension, avoid scattering of the computing power resource pool at time granularity, and reduce computing power fragmentation.
In one aspect, the present application provides a method for creating a computing power container, where the method is applied to a computing power platform; the method comprises the following steps:
obtaining a target task through an affinity task cache;
determining, through the affinity task cache and according to the time period of the target task, a first resource pool corresponding to the target task among N affinity task resource pools respectively corresponding to N time periods; N is more than 1;
a computing power container is created in the first resource pool for the target task.
In another aspect, an embodiment of the present application provides a computing power platform, including:
the acquiring unit is used for acquiring the target task through the affinity task cache;
a determining unit, configured to determine, through the affinity task cache and according to the time period of the target task, a first resource pool corresponding to the target task among N affinity task resource pools respectively corresponding to N time periods; N is more than 1;
and a creating unit, configured to create a computing power container for the target task in the first resource pool.
In another aspect, an embodiment of the present application provides an electronic device, including:
a processor adapted to execute a computer program;
a computer readable storage medium having stored thereon a computer program which, when executed by the processor, implements the method of the first aspect or the method of the second aspect.
In another aspect, embodiments of the present application provide a computer-readable storage medium storing computer instructions which, when read and executed by a processor of a computer device, cause the computer device to perform the method of the first aspect or the method of the second aspect.
In the method for creating the computing power container, a target task is obtained through an affinity task cache; a first resource pool corresponding to the target task is determined, through the affinity task cache and according to the time period of the target task, among N affinity task resource pools respectively corresponding to N time periods, where N is more than 1; and a computing power container is created for the target task in the first resource pool.
In other words, the affinity task resource pool corresponding to the target task, namely the first resource pool, is determined through the affinity task cache, and a computing power container is created for the target task in the first resource pool. On one hand, since the N affinity task resource pools respectively correspond to N time periods, the computing power resource pool can be regularized in the time dimension; on the other hand, the affinity task cache determines the first resource pool corresponding to the target task among the N affinity task resource pools according to the time period of the target task, so that the target task runs in its corresponding affinity task resource pool, which avoids the computing power fragmentation caused by tasks whose running times differ too much, thereby reducing computing power fragmentation in the time dimension.
In addition, reducing computing power fragmentation in the time dimension facilitates the overall inventory of computing power resources, and can avoid the situation in which the total computing power resources of the computing power platform at a certain moment are sufficient, yet the computing power resources of the particular task resource pool required by a user task cannot be satisfied.
In short, the method for creating a computing power container provided by this application can regularize the computing power resource pool in the time dimension, thereby reducing computing power fragmentation in the time dimension.
Drawings
To describe the technical solutions of the embodiments of this application more clearly, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is an example of a system framework provided by an embodiment of the present application.
Fig. 2 is a schematic flow chart of a method for creating a computing power container provided by an embodiment of the present application.
Fig. 3 is another schematic flow chart of a method for creating a computing power container provided by an embodiment of the present application.
Fig. 4 is another schematic flow chart of a method for creating a computing power container provided by an embodiment of the present application.
Fig. 5 is a schematic block diagram of a computing power platform provided by an embodiment of the application.
Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are only some, rather than all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort shall fall within the protection scope of this application.
The scheme provided by the application can relate to artificial intelligence technology.
Artificial Intelligence (AI) is a theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
It should be understood that artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. The basic artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The embodiments of this application may also involve Machine Learning (ML) in artificial intelligence technology. ML is a multi-field interdiscipline involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other subjects. It specializes in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve its performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and teaching-based learning.
In addition, the scheme provided by the application can relate to cloud technology, in particular to cloud technology big data processing; and in particular to computing resource pooling in cloud technology.
Cloud computing refers to a delivery and use mode of IT infrastructure, namely obtaining the required resources through the network in an on-demand and easily extensible manner; cloud computing in a broad sense refers to a delivery and use mode of services, namely obtaining the required services through the network in an on-demand and easily extensible manner. Such services may be IT and software, internet related, or other services. Cloud computing is a product of the development and fusion of traditional computer and network technologies, such as grid computing, distributed computing, parallel computing, utility computing, network storage technologies, virtualization and load balancing.
With the diversified development of the internet, real-time data streams and connected devices, and driven by demands for search services, social networks, mobile commerce and open collaboration, cloud computing has developed rapidly. Unlike earlier parallel distributed computing, the emergence of cloud computing will conceptually drive revolutionary change in the entire internet model and enterprise management model.
Big data refers to data sets that cannot be captured, managed and processed with conventional software tools within a certain time range; it is a massive, fast-growing and diversified information asset that requires new processing modes in order to deliver stronger decision-making power, insight discovery and process optimization capabilities. With the advent of the cloud era, big data has attracted more and more attention, and it requires special techniques to effectively process large amounts of data within a tolerable elapsed time. Technologies suitable for big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.
The computing resource pooling technology realizes unified management on hardware resources through a software technology, changes the hardware definition of computing resources into software definition and realizes flexible scheduling of the computing resources. Through the computing resource pooling technology, a user can efficiently schedule and use chip resources in a data center, the computing utilization rate is improved, and the fragmentation and computing cost is reduced.
Fig. 1 is an example of a system framework 100 provided by an embodiment of the present application.
As shown in Fig. 1, the system framework 100 may include a task submission module 110, a task container list module 160, a computing resource pool 101, and a data center 150. The computing resource pool 101 includes an affinity task cache 120, affinity task resource pools 1 to N, and a shared cache resource pool 140. The computing resource pool 101 and the data center 150 may be connected over a network to receive or send messages. It should be noted that the number of affinity task resource pools is not limited in this application.
The task submission module 110 may be used to obtain a user's request and submit the task in the request to the computing resource pool 101. Optionally, the task may be any task that requires computation, for example a training task.
The affinity task cache 120 may communicate with affinity task resource pools 1 to N. Specifically, the affinity task cache 120 determines, among affinity task resource pools 1 to N, the affinity task resource pool i corresponding to the task submitted by the task submission module 110, and sends the task to affinity task resource pool i so that it performs the computation. In addition, the affinity task cache 120 may also be configured to send the remaining resources of affinity task resource pools 1 to N and of the shared cache resource pool 140 to the data center 150, so that the data center 150 can dynamically adjust the resources of affinity task resource pools 1 to N and the shared cache resource pool 140.
Affinity task resource pools 1 to N may be used to create a computing power container for a task and return it to the task container list module 160, and the task container list module 160 feeds the result back to the user.
On one hand, the data center 150 acquires a configuration instruction for initialization and sends the initialization policies of affinity task resource pools 1 to N to the affinity task cache 120; on the other hand, it may also be used to dynamically adjust the resources of affinity task resource pools 1 to N and the shared cache resource pool 140.
It should be noted that the system framework 100 is an execution subject of the method for creating a computing power container provided in the embodiments of this application, and the system framework 100 may also be referred to as a computing power platform. The computing resource pool 101 is used to provide a computing power container for a task; it may be any device with data processing capability, including but not limited to a Graphics Processing Unit (GPU), a Central Processing Unit (CPU), a Neural Network Processing Unit (NPU), a Tensor Processing Unit (TPU), and an Accelerated Processing Unit (APU). The data center 150 may be any network device, such as a server, having data computing, transfer and storage capabilities.
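For orientation, the Fig. 1 components can be pictured with the minimal Python sketch below; all class names, fields and units are illustrative assumptions rather than structures taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class AffinityTaskResourcePool:
    pool_id: int
    period: tuple      # the time period this pool serves, e.g. (start_h, end_h)
    capacity: float    # total computing power, in arbitrary units
    free: float        # currently unallocated computing power

@dataclass
class SharedCacheResourcePool:
    free: float        # spare capacity any affinity pool may borrow from

@dataclass
class ComputingResourcePool:   # plays the role of pool 101 in Fig. 1
    affinity_pools: list
    shared_cache: SharedCacheResourcePool

platform = ComputingResourcePool(
    affinity_pools=[AffinityTaskResourcePool(1, (0.0, 4.0), capacity=40.0, free=40.0)],
    shared_cache=SharedCacheResourcePool(free=20.0),
)
```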
Fig. 2 is a schematic flow chart of a method 200 for creating a computing power container provided by an embodiment of the present application.
It should be noted that the method 200 provided herein is applicable to the system framework 100, i.e., any computing platform with data processing capability. Optionally, the computing platform includes a computing resource pool, whose computing devices include, but are not limited to, a Graphics Processing Unit (GPU), a Central Processing Unit (CPU), a Neural Network Processing Unit (NPU), a Tensor Processing Unit (TPU), and an Accelerated Processing Unit (APU). Optionally, the computing resource pool includes an affinity task cache, N affinity task resource pools, and a shared cache resource pool. Optionally, the computing platform may further include a data center. Of course, the computing power platform may also be a cloud computing power platform, which is not specifically limited in this application.
As shown in fig. 2, the method 200 may include:
s201: obtaining a target task through an affinity task cache;
s202: determining a first resource pool corresponding to the target task in N affinity task resource pools respectively corresponding to N time periods according to the time period of the target task through the affinity task cache; n is more than 1;
s203: a computing power container is created in the first resource pool for the target task.
In the embodiments of this application, the N affinity task resource pools respectively correspond to N time periods. This is equivalent to introducing temporal characteristics into conventional resource pools. In other words, the affinity task resource pool referred to in this application may also be called a temporal affinity task resource pool, i.e., a pool used for processing tasks whose time periods fall within a certain threshold. Correspondingly, the affinity task cache referred to in this application is a cache that uses the time period of the target task to determine, among the N affinity task resource pools, the affinity task resource pool corresponding to the target task.
In other words, the affinity task resource pool corresponding to the target task, namely the first resource pool, is determined through the affinity task cache. On one hand, since the N affinity task resource pools respectively correspond to N time periods, the computing power resource pool can be regularized in the time dimension; on the other hand, the affinity task cache determines the first resource pool corresponding to the target task among the N affinity task resource pools according to the time period of the target task, so that the target task runs in its corresponding affinity task resource pool, which avoids the computing power fragmentation caused by tasks whose running times differ too much, thereby reducing computing power fragmentation in the time dimension.
In addition, reducing computing power fragmentation in the time dimension facilitates the overall inventory of computing power resources, and can avoid the situation in which the total computing power resources of the computing power platform at a certain moment are sufficient, yet the computing power resources of the particular task resource pool required by a user task cannot be satisfied.
It should be noted that the number N of affinity task resource pools is greater than 1; the specific number is not limited in this application. Each of the N time periods may be a time period having a start time and an end time, or may be a plain duration without a start time and an end time. Accordingly, the time period of the target task may likewise have a start time and an end time, or not; this application does not specifically limit this. In short, this application aims to regularize the computing power resource pool in the time dimension, so that computing power fragmentation of the resource pools in a computing power platform can be reduced, in the time dimension, based on the temporal information of tasks.
In one implementation, the computing platform obtains an instruction for creating a computing power container for a target task; based on the instruction, it determines, through the affinity task cache and according to the time period of the target task, the first resource pool corresponding to the target task among the N affinity task resource pools, and creates the computing power container for the target task in the first resource pool.
In some embodiments of this application, the time period of the target task comprises an estimated running period of the target task; S202 may include:
determining as the first resource pool, through the affinity task cache and among the N time periods, the affinity task resource pool corresponding to the time period whose difference from the estimated running period is smaller than a first threshold. Optionally, the affinity task cache stores the N time periods respectively corresponding to the N affinity task resource pools.
In one implementation, the estimated running period of the target task may be a target time period with a start time and an end time, within which the target task is estimated to run; each of the N time periods may then likewise be a time period with a start time and an end time. In this case, the computing platform may determine, among the N time periods, the time period whose start time or end time differs from that of the estimated running period by less than the first threshold, and take the affinity task resource pool corresponding to that time period as the first resource pool. Of course, in other implementations of this application, the computing platform may instead determine, among the N time periods, the time period within which the estimated running period of the target task falls, and take the corresponding affinity task resource pool as the first resource pool; this embodiment of the application does not specifically limit this.
In another implementation, the estimated running period of the target task may instead be a run duration without a start time and an end time; each of the N time periods may then likewise be a duration without a start time and an end time. On this basis, among the N time periods, the time period whose difference from the estimated run duration of the target task is smaller than the first threshold is determined, and the corresponding affinity task resource pool is taken as the first resource pool.
The first threshold may be set in advance or input by the user; this application does not limit its size. For example, it may be 0.1 h.
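As a concrete illustration of S202 under the duration-only reading above, the selection step might look like the following sketch; the pool durations, the 0.1 h default threshold and all names are assumptions, not the patent's implementation.

```python
from collections import namedtuple

Pool = namedtuple("Pool", "name duration_h")

def select_first_resource_pool(pools, estimated_runtime_h, first_threshold_h=0.1):
    """Pick the affinity task resource pool whose characteristic run duration
    differs from the task's estimated run duration by less than the first
    threshold (all values in hours); among matches, take the closest."""
    best = None
    for pool in pools:
        diff = abs(pool.duration_h - estimated_runtime_h)
        if diff < first_threshold_h and (best is None or diff < best[0]):
            best = (diff, pool)
    if best is None:
        raise LookupError("no affinity task resource pool within the first threshold")
    return best[1]

pools = [Pool("short", 1.0), Pool("medium", 8.0), Pool("long", 72.0)]
print(select_first_resource_pool(pools, estimated_runtime_h=7.95).name)  # -> medium
```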
In some embodiments of this application, S203 may include:
determining whether the resources of the first resource pool satisfy the resources required by the target task; if they do, creating the computing power container in the first resource pool; if they do not, acquiring a second resource from the shared cache resource pool through the affinity task cache, and creating the computing power container for the target task based on the resources of the first resource pool and the second resource. In one implementation, if the resources of the first resource pool cannot satisfy the resources required by the target task, the second resource in the shared cache resource pool is allocated to the first resource pool through the affinity task cache; the computing power container is then created for the target task in the first resource pool based on the resources of the first resource pool and the second resource obtained from the shared cache resource pool.
In other words, the computing platform determines whether the resources of the first resource pool satisfy the resources required by the target task. If they do, a computing power container is created for the target task in the first resource pool and a container list of the task container is returned; if they do not, a second resource is acquired from the shared cache resource pool through the affinity task cache, a computing power container is created for the target task based on the resources of the first resource pool and the second resource, and a container list of the task container is returned. The task container may also be called a computing power container; as a computing unit, it can perform various computing tasks, including preprocessing data, training machine learning models, and running inference on unlabeled data with an existing model. Accordingly, the container list of the task container may be the result of the task container's computation on the target task, such as a trained model or computed values. After the computing platform creates a computing power container for the target task, the target task can be run in the container, and the running result, i.e., the container list of the task container, is returned when the run finishes.
It should be noted that the second resource may be greater than or equal to the resources the first resource pool additionally requires; this application does not specifically limit this.
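A minimal sketch of this check-and-borrow step, assuming for simplicity that the second resource taken from the shared cache is exactly the shortfall (the passage above allows it to be larger); the field names are illustrative.

```python
from types import SimpleNamespace

def create_container_with_fallback(first_pool, shared_cache, required):
    """Create a computing power container from the first resource pool,
    borrowing the shortfall (the 'second resource') from the shared cache
    resource pool when the first pool alone cannot satisfy the request."""
    if first_pool.free >= required:
        first_pool.free -= required
        return {"pool": first_pool, "borrowed": 0.0}
    shortfall = required - first_pool.free        # the second resource
    if shared_cache.free < shortfall:
        raise RuntimeError("insufficient computing power even with the shared cache")
    shared_cache.free -= shortfall                # move it into the first pool
    first_pool.free = 0.0
    return {"pool": first_pool, "borrowed": shortfall}

pool = SimpleNamespace(free=4.0)
cache = SimpleNamespace(free=10.0)
print(create_container_with_fallback(pool, cache, required=6.0))  # borrows 2.0
```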
In some embodiments of the present application, the method 200 may further comprise:
if task accumulation occurs in the first resource pool, reporting a task accumulation notification to the data center through the affinity task cache; in response to the task accumulation notification, sending, through the data center, a new allocation policy for the resources of n affinity task resource pools and the shared cache resource pool to the affinity task cache, where the n affinity task resource pools respectively correspond to n time periods, the n time periods differ from the N time periods, and n is more than 1; and, based on the new allocation policy, reallocating resources for the n affinity task resource pools and the shared cache resource pool through the affinity task cache.
n may or may not be equal to N; this application does not specifically limit this.
In one implementation, the n time periods include time periods obtained by splitting some of the N time periods. For example, the n time periods include time periods obtained by splitting the time period, among the N time periods, that corresponds to the first resource pool.
In other words, if computing power containers need to be created for a large number of tasks in the first resource pool, i.e., task accumulation occurs in the first resource pool, this indicates that after the data center divided the computing resource pool into the N affinity task resource pools by time period, tasks became concentrated or accumulated in the affinity task resource pool corresponding to a certain time period. In that case, a task accumulation notification may be reported to the data center through the affinity task cache, so that the data center repartitions the time periods, or splits the congested time period, and resends to the affinity task cache the allocation policy of the affinity task resource pool corresponding to each repartitioned or split time period. Correspondingly, the new allocation policy for the n affinity task resource pools and the shared cache resource pool is received from the data center through the affinity task cache, and, based on it, resources can be reallocated for the n affinity task resource pools and the shared cache resource pool. It should be noted that the resources involved in the embodiments of this application may be measured in units of computing power; the specific implementation is not limited here. For example, the maximum number of hash operations per second may be taken as the computing power, denoted hash/s; MH/s is one million hashes per second, and TH/s is one trillion hashes per second.
When task accumulation occurs in the first resource pool, a task accumulation notification is reported to the data center through the affinity task cache; the new allocation policy for the n affinity task resource pools, corresponding to the n repartitioned or split time periods, and for the shared cache resource pool is received from the data center through the affinity task cache; and resources are reallocated for the n affinity task resource pools and the shared cache resource pool based on the new allocation policy. This keeps the allocation across the different affinity task resource pools smooth at time granularity and reduces the probability of computing resource fragmentation in the time dimension.
It should be noted that the new allocation policy may specify the resources of the n affinity task resource pools as new percentages of the total resources of the computing platform (i.e., of the computing resource pool), as new percentages of the total resources of one of the original affinity task resource pools, or directly as allocation amounts. In other words, the new allocation policy is intended to allocate or specify the resources of the n affinity task resource pools; its specific implementation is not limited in this application. Note also that, since the data center originally divided the computing resource pool based on the N time periods and, after tasks accumulated, either repartitioned it into n time periods or split a particular affinity task resource pool, the n repartitioned or split time periods differ from the original N time periods.
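To make the flow concrete, the sketch below assumes a new allocation policy expressed as fractions of the platform's total resources and uses a hard-coded stub in place of the data center; every message shape and name here is invented for illustration.

```python
class DataCenterStub:
    """Stand-in for the data center; returns a hard-coded new policy in which
    the congested period has been split into two finer periods (n = 4)."""
    def notify_backlog(self, pool_id):
        print(f"task accumulation reported for pool {pool_id!r}")
    def new_allocation_policy(self):
        return {"pools": {"0-4h": 0.2, "4-8h": 0.2, "8-24h": 0.3, ">24h": 0.1},
                "shared_cache": 0.2}

def apply_new_policy(total_resources, data_center, congested_pool):
    """Report the backlog, fetch the new policy, and turn its fractions into
    absolute resource amounts for the n pools and the shared cache."""
    data_center.notify_backlog(congested_pool)
    policy = data_center.new_allocation_policy()
    pools = {p: frac * total_resources for p, frac in policy["pools"].items()}
    shared = policy["shared_cache"] * total_resources
    return pools, shared

print(apply_new_policy(100.0, DataCenterStub(), congested_pool="4-24h"))
```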
In some embodiments of the present application, the method 200 may further comprise:
periodically sending, through the affinity task cache, the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool in the computing platform to the data center; receiving, through the affinity task cache, an adjustment policy sent by the data center; and, based on the adjustment policy, adjusting the remaining resources of the N affinity task resource pools and of the shared cache resource pool through the affinity task cache.
In other words, first, the computing platform periodically sends, through the affinity task cache, the remaining resources of the N affinity task resource pools and of the shared cache resource pool to the data center, so that the data center can determine an adjustment policy based on them. The data center then sends the adjustment policy to the affinity task cache. Correspondingly, after receiving the adjustment policy through the affinity task cache, the computing platform adjusts, based on it, the remaining resources of the N affinity task resource pools and of the shared cache resource pool. For example, after receiving the adjustment policy, the computing platform may release surplus remaining resources of the N affinity task resource pools into the shared cache resource pool, or obtain resources from the shared cache resource pool and fill them into whichever affinity task resource pools are short of resources.
On one hand, by periodically sending the remaining resources of the N affinity task resource pools and of the shared cache resource pool to the data center through the affinity task cache, the resource scheduling status of each affinity task resource pool can be reported to the data center regularly, providing a data basis for the data center to adjust the remaining resources of each pool. On the other hand, by receiving the adjustment policy sent by the data center through the affinity task cache and adjusting the remaining resources accordingly, the resource flow of each affinity task resource pool in the computing platform can be controlled, avoiding situations in which the resources of one or more affinity task resource pools remain unscheduled or blocked for a long time; this further facilitates fast iteration of tasks.
It should be noted that the adjustment policy may specify the resources of the N affinity task resource pools as new percentages of the total resources of a particular affinity task resource pool, as new percentages of the total resources of the computing platform, or directly as adjustment amounts. In other words, the adjustment policy aims to reallocate the resources of the N affinity task resource pools or specify the amounts by which to adjust them; its specific implementation is not limited in this application.
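One possible shape of this rebalancing step is sketched below, assuming the adjustment policy arrives as signed per-pool deltas (positive: fill from the shared cache; negative: release into it); this is only one of the several policy forms the passage above allows.

```python
def periodic_rebalance(pools_free, shared_free, adjustment):
    """Apply an adjustment policy given as {pool: delta}: positive deltas are
    topped up from the shared cache resource pool, negative deltas are
    released back into it. Returns the updated free amounts."""
    for pool, delta in adjustment.items():
        if delta > 0:                        # pool is short: fill it up
            take = min(delta, shared_free)
            shared_free -= take
            pools_free[pool] += take
        else:                                # pool has surplus: release it
            pools_free[pool] += delta
            shared_free -= delta
    return pools_free, shared_free

free = {"short": 1.0, "medium": 0.0, "long": 5.0}
print(periodic_rebalance(free, shared_free=3.0,
                         adjustment={"medium": 2.0, "long": -3.0}))
# -> ({'short': 1.0, 'medium': 2.0, 'long': 2.0}, 4.0)
```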
In some embodiments of the present application, before S201, the method 200 may further include:
obtaining an initialization policy through the data center using initialization data; receiving, through the affinity task cache, the initialization policy sent by the data center; and, based on the initialization policy, initializing the N affinity task resource pools and the shared cache resource pool through the affinity task cache.
In other words, after the computing platform obtains the initialization policy through the data center, the initialization policy of the N affinity task resource pools and the shared cache resource pool sent by the data center is received through the affinity task cache, and resources are allocated to the N affinity task resource pools and the shared cache resource pool of the computing platform through the affinity task cache based on the initialization policy.
In some implementations, the initialization data may be operating data of historical tasks, for example their run durations; the computing power platform may use the run durations of historical tasks to determine the task volume corresponding to each of the N time periods, and then determine the initialization policy based on those task volumes. The initialization policy may specify the resources of the N affinity task resource pools and the shared cache resource pool as percentages of the total resources of the computing platform (i.e., of the computing resource pool), or directly as allocation amounts. In other words, the initialization policy aims to specify the allocation amounts of the resources of the N affinity task resource pools and the shared cache resource pool. The specific implementations of the initialization data and the initialization policy are not limited in this application. For ease of understanding, the following example describes an initialization policy expressed as the percentages of the computing platform's total resources taken by the N affinity task resource pools and the shared cache resource pool.
If the total resources of the computing platform (i.e., of the computing resource pool) are M, and the ratio of the total resources of all affinity task resource pools to the resources of the shared cache resource pool is A : B, then the total resources of all affinity task resource pools are M × A and the resources of the shared cache resource pool are M × B. The data center may also determine, from summarized data of the tasks in each time period, the proportion of the resources required by the tasks of that time period within the total resources required by all tasks, and take that proportion as the resource share of the affinity task resource pool corresponding to that time period. For example, taking 3 time periods, each containing one task, the computed shares of the 3 affinity task resource pools corresponding to the 3 time periods are shown in Table 1:
TABLE 1
Affinity task resource pool      Corresponding time period      Share of total affinity-pool resources
Affinity task resource pool 1    Time period 1                  X
Affinity task resource pool 2    Time period 2                  Y
Affinity task resource pool 3    Time period 3                  Z
As shown in Table 1, after receiving the shares of the affinity task resource pools in Table 1 sent by the data center, the computing platform determines the resources of each affinity task resource pool from the total resources M × A of all affinity task resource pools and the share of each pool; that is, the resources of the 3 affinity task resource pools are respectively M × A × X, M × A × Y and M × A × Z.
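As a worked numeric instance of this arithmetic, with all numbers invented for illustration (M = 100 units, A : B = 0.8 : 0.2, and X, Y, Z = 50 %, 30 %, 20 %):

```python
# Initialization arithmetic from the passage above, with invented numbers.
M = 100.0                       # total resources of the computing platform
A, B = 0.8, 0.2                 # affinity pools : shared cache ratio
shares = {"pool 1": 0.5, "pool 2": 0.3, "pool 3": 0.2}   # X, Y, Z from Table 1

shared_cache = M * B                                      # 20.0
pools = {name: M * A * x for name, x in shares.items()}  # 40.0, 24.0, 16.0
print(pools, shared_cache)
```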
In some implementations, prior to obtaining, by the data center, the initialization policy with the initialization data, obtaining, by the data center, a configuration instruction indicating initialization; and responding to the configuration instruction, and acquiring the initialization strategy by the data center by using the initialization data.
Fig. 3 is a schematic flow chart of a method 300 for creating a computing force container provided by an embodiment of the present application.
It should be understood that the method 300 may be performed by the affinity task cache and the data center in a computing platform, which may be any computing platform with data processing capability. Optionally, the computing platform includes a computing resource pool and a data center. The computing resource pool includes the affinity task cache, and its computing devices include, but are not limited to, a Graphics Processing Unit (GPU), a Central Processing Unit (CPU), a Neural Network Processing Unit (NPU), a Tensor Processing Unit (TPU), and an Accelerated Processing Unit (APU); the data center may be any network device with data computing, transfer and storage capabilities, such as a server.
As shown in fig. 3, the method 300 may include:
s301: and the computing platform acquires a configuration instruction for indicating initialization through the data center. The configuration instruction is used for instructing the data center to initialize the N affinity task resource pools and the shared cache resource pool. For example, the configuration instructions are to instruct the data center to send an initialization policy to the affinity task cache. Alternatively, as shown in fig. 1, the configuration instruction may be an instruction input by the user.
S302: the computing platform sends an initialization policy to the affinity task cache through the data center.
S303: the computing platform receives, through the affinity task cache, the initialization policy sent by the data center, and initializes the N affinity task resource pools and the shared cache resource pool of the computing platform based on the initialization policy.
In some implementations, the initialization data may be operating data of historical tasks, for example their run durations; the computing power platform may use the run durations of historical tasks to determine the task volume corresponding to each of the N time periods, and then determine the initialization policy based on those task volumes. The initialization policy may specify the resources of the N affinity task resource pools as percentages of the total resources of the computing platform (i.e., of the computing resource pool), or directly as allocation amounts. In other words, the initialization policy aims to specify the allocation amounts of the resources of the N affinity task resource pools. The specific implementations of the initialization data and the initialization policy are not limited in this application.
In some embodiments of this application, S202 may further include:
determining, through the affinity task cache and based on the machine room of the target task, a first sub-resource pool corresponding to the target task among N1 sub-resource pools in the first resource pool, where the N1 sub-resource pools respectively correspond to N1 machine rooms, the N1 machine rooms include the machine room of the target task, and N1 is more than 1; and creating a computing power container for the target task in the first sub-resource pool in the first resource pool.
In other words, through the affinity task cache, among the N1 sub-resource pools respectively corresponding to the N1 machine rooms in the first resource pool, the sub-resource pool corresponding to the machine room matching that of the target task is determined as the first sub-resource pool.
Determining, among the N1 sub-resource pools in the first resource pool, the first sub-resource pool corresponding to the machine room of the target task amounts to considering the machine-room affinity of the task on top of its time affinity. The computing power resource pool is thus regularized in the time dimension to reduce computing power fragmentation, while the affinity task resource pool is additionally regularized in the machine-room dimension, which can accelerate the processing of tasks. In other words, on one hand, dividing the computing resource pool in the time dimension takes into account the temporal characteristics of tasks, i.e., their time affinity; on the other hand, dividing the affinity task resource pool in the machine-room dimension takes into account the machine-room characteristics of tasks, i.e., their machine-room affinity. On this basis, the affinity task resource pool can be regularized in the machine-room dimension while computing power fragmentation is reduced, and task processing can be accelerated.
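The two-level lookup can be sketched as below; the same shape serves the network-affinity variant in the next passage, keyed on the task's network instead of its machine room. Sub-pool keys and names are illustrative assumptions.

```python
from types import SimpleNamespace

def select_sub_resource_pool(first_pool, task_machine_room):
    """Within the already-selected first resource pool, pick the sub-resource
    pool whose machine room matches the task's (N1 sub-pools, one per room)."""
    try:
        return first_pool.sub_pools[task_machine_room]
    except KeyError:
        raise LookupError(f"no sub-resource pool for machine room {task_machine_room!r}")

first_pool = SimpleNamespace(sub_pools={"room-A": "sub-pool-A", "room-B": "sub-pool-B"})
print(select_sub_resource_pool(first_pool, "room-B"))  # -> sub-pool-B
```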
In some embodiments of this application, S202 may further include:
determining, through the affinity task cache and based on the network adopted by the target task, a second sub-resource pool corresponding to the target task among N2 sub-resource pools in the first resource pool, where the N2 sub-resource pools respectively correspond to N2 networks, the N2 networks include the network adopted by the target task, and N2 is more than 1; and creating a computing power container for the target task in the second sub-resource pool in the first resource pool.
In other words, through the affinity task cache, among the N2 sub-resource pools respectively corresponding to the N2 networks in the first resource pool, the sub-resource pool corresponding to the network matching the network adopted by the target task is determined as the second sub-resource pool.
Determining, among the N2 sub-resource pools in the first resource pool, the second sub-resource pool corresponding to the network adopted by the target task amounts to considering the network affinity of the task on top of its time affinity. The computing power resource pool is thus regularized in the time dimension to reduce computing power fragmentation, while the affinity task resource pool is additionally regularized in the network dimension, which can accelerate the processing of tasks. In other words, on one hand, dividing the computing resource pool in the time dimension takes into account the temporal characteristics of tasks, i.e., their time affinity; on the other hand, dividing the affinity task resource pool in the network dimension takes into account the network characteristics of tasks, i.e., their network affinity. On this basis, the affinity task resource pool can be regularized in the network dimension while computing power fragmentation is reduced, and task processing can be accelerated.
Fig. 4 is a schematic flow chart of a method 400 for creating a computing power container provided by an embodiment of the present application.
It should be understood that the method 400 is applicable to the system framework 100, i.e., any computing platform with data processing capability. Optionally, the computing platform includes a computing resource pool, whose computing devices include, but are not limited to, a Graphics Processing Unit (GPU), a Central Processing Unit (CPU), a Neural Network Processing Unit (NPU), a Tensor Processing Unit (TPU), and an Accelerated Processing Unit (APU). Optionally, the computing resource pool includes an affinity task cache, N affinity task resource pools, and a shared cache resource pool. Optionally, the computing platform may further include a data center. Of course, the computing power platform may also be a cloud computing power platform, which is not specifically limited in this application.
As shown in fig. 4, the method 400 may include:
s401: and acquiring the target task through the affinity task cache.
In one implementation, the computing platform obtains, via the affinity task cache, an instruction for creating a computing power container for the target task, where the instruction requests creation of the computing power container for the target task. Optionally, the instruction includes the estimated running period of the target task.
S402: determining, through the affinity task cache and according to the time period of the target task, a first resource pool corresponding to the target task among N affinity task resource pools respectively corresponding to N time periods; N is more than 1.
Specifically, through the affinity task cache, among the N time periods, the affinity task resource pool corresponding to the time period whose difference from the estimated running period is smaller than the first threshold is determined as the first resource pool.
In one implementation, the estimated running period of the target task may be a target time period with a start time and an end time, within which the target task is estimated to run; each of the N time periods may then likewise be a time period with a start time and an end time. In this case, the computing platform may determine, among the N time periods, the time period whose start time or end time differs from that of the estimated running period by less than the first threshold, and take the affinity task resource pool corresponding to that time period as the first resource pool. Of course, in other implementations of this application, the computing platform may instead determine, among the N time periods, the time period within which the estimated running period of the target task falls, and take the corresponding affinity task resource pool as the first resource pool; this embodiment of the application does not specifically limit this.
In another implementation, the estimated running period of the target task may instead be a run duration without a start time and an end time; each of the N time periods may then likewise be a duration without a start time and an end time. On this basis, among the N time periods, the time period whose difference from the estimated run duration of the target task is smaller than the first threshold is determined, and the corresponding affinity task resource pool is taken as the first resource pool.
S403: determining whether the resources of the first resource pool satisfy the resources required by the target task.
If the resources of the first resource pool satisfy the resources required by the target task, a computing power container is created for the target task in the first resource pool. If the resources of the first resource pool cannot satisfy the resources required by the target task, a second resource is acquired from the shared cache resource pool through the affinity task cache, and a computing power container is created for the target task based on the resources of the first resource pool and the second resource.
S404: acquiring the second resource from the shared cache resource pool through the affinity task cache.
If the resources of the first resource pool cannot satisfy the resources required by the target task, the second resource is acquired from the shared cache resource pool through the affinity task cache. It should be noted that the second resource may be greater than or equal to the resources the first resource pool additionally requires; this application does not specifically limit this.
S405: a computing power container is created for the target task.
The computing platform creates a computing power container for the target task in the first resource pool, or creates the computing power container for the target task based on the resources of the first resource pool and the second resource.
S406: dynamically adjusting the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool.
Specifically, firstly, the computing platform periodically sends the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool in the computing platform to the data center through the affinity task cache, so that the computing platform determines an adjustment policy based on the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool in the computing platform through the data center; then, the computing platform sends an adjustment strategy to the affinity task cache through the data center; correspondingly, after receiving the adjustment strategy sent by the data center through the affinity task cache, the computing platform adjusts the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool through the affinity task cache based on the adjustment strategy. For example, after receiving the adjustment policy through the affinity task cache, the computing platform releases redundant resources in the remaining resources of the N affinity task resource pools to the shared cache resource pool based on the adjustment policy; for example, after receiving the adjustment policy through the affinity task cache, the computing platform obtains resources from the shared cache resource pool based on the adjustment policy, and fills the resources to the insufficient affinity task resource pool among the remaining resources in the N affinity task resource pools.
On the one hand, by periodically sending the remaining resources of the N affinity task resource pools and of the shared cache resource pool to the data center through the affinity task cache, the computing platform regularly reports the resource scheduling situation of each affinity task resource pool, which provides a data basis for the data center to adjust the remaining resources of each pool. On the other hand, by receiving the adjustment strategy from the data center through the affinity task cache and adjusting the remaining resources accordingly, the computing platform can control the flow of resources among the affinity task resource pools and avoid situations in which the resources of one or more affinity task resource pools go unscheduled or remain blocked for a long time; this in turn facilitates fast iteration of tasks.
It should be noted that the adjustment strategy may specify, for a particular affinity task resource pool, a new percentage of its resources within the total; it may specify new percentages of the resources of the N affinity task resource pools within the total resources of the computing platform; or it may directly specify adjustment amounts for the resources of the N affinity task resource pools. In other words, the purpose of the adjustment strategy is to reallocate the resources of the N affinity task resource pools or to specify by how much they should be adjusted; its specific implementation is not limited in this application.
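To make the adjustment mechanics of S406 concrete, here is a sketch under the assumption that the data center's adjustment strategy arrives as signed per-pool deltas (a negative delta releases redundant remaining resources to the shared cache resource pool, a positive delta fills an insufficient pool from it); the delta encoding is an assumption, since the patent also allows percentage-based strategies.

```python
def apply_adjustment_strategy(pool_free: list[int], shared_free: int,
                              deltas: list[int]) -> tuple[list[int], int]:
    """Adjust the remaining resources of the N affinity task resource pools
    and of the shared cache resource pool per the data center's strategy."""
    adjusted = list(pool_free)
    for i, delta in enumerate(deltas):
        if delta < 0:
            # Release redundant remaining resources to the shared cache pool.
            released = min(-delta, adjusted[i])
            adjusted[i] -= released
            shared_free += released
        elif delta > 0:
            # Fill an insufficient pool from the shared cache pool.
            filled = min(delta, shared_free)
            adjusted[i] += filled
            shared_free -= filled
    return adjusted, shared_free

# Pool 0 is idle and pool 2 is starved; resources flow through the shared pool.
print(apply_adjustment_strategy([80, 20, 0], shared_free=10, deltas=[-50, 0, 40]))
# -> ([30, 20, 40], 20)
```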
S407: initialize the N affinity task resource pools and the shared cache resource pool.
Specifically, an initialization strategy is obtained through the data center using initialization data; the initialization strategy sent by the data center is received through the affinity task cache; and, based on the initialization strategy, the N affinity task resource pools and the shared cache resource pool are initialized through the affinity task cache.
In other words, after the computing platform obtains the initialization strategy through the data center, the affinity task cache receives the initialization strategy for the resources of the N affinity task resource pools sent by the data center, and then allocates resources to the N affinity task resource pools and the shared cache resource pool of the computing platform based on that strategy.
In some implementations, the initialization data may be operation data of historical tasks, for example their run durations. The computing power platform may use the run durations of historical tasks to determine the task volume corresponding to each of the N time periods, and then determine the initialization strategy based on those task volumes. The initialization strategy may be the percentages of the resources of the N affinity task resource pools within the total resources of the computing platform (i.e., the computing resource pool), or may directly be the allocation amounts of the resources of the N affinity task resource pools. In other words, the purpose of the initialization strategy is to specify the allocation amounts of resources for the N affinity task resource pools; the specific implementations of the initialization data and the initialization strategy are not limited in this application.
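As an illustration of how historical run durations could be turned into an initialization strategy (one of several options the patent leaves open), the following sketch buckets historical tasks into the N time periods and allocates pool percentages in proportion to task volume; the bucket boundaries and the proportional rule are assumptions.

```python
from bisect import bisect_left

def initialization_strategy(history_minutes: list[float],
                            period_bounds: list[float]) -> list[float]:
    """For each of the N time periods, return the fraction of the total
    computing resource pool to allocate to its affinity task resource pool."""
    counts = [0] * (len(period_bounds) + 1)
    for duration in history_minutes:
        counts[bisect_left(period_bounds, duration)] += 1
    total = sum(counts) or 1  # guard against an empty history
    return [c / total for c in counts]

# Boundaries at 60 and 480 minutes define three periods (short/medium/long);
# half of the historical tasks were short, so the short pool gets 50%.
print(initialization_strategy([10, 25, 45, 90, 300, 700], [60, 480]))
# -> [0.5, 0.333..., 0.166...]
```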
It should be noted that the drawings provided in the embodiments of the present application are only examples and should not be construed as limiting the present application. For example, S401, S402, and S405 shown in fig. 4 may be used in place of S201, S202, and S203 shown in fig. 2, respectively.
The preferred embodiments of the present application have been described in detail above with reference to the accompanying drawings; however, the present application is not limited to the specific details of the above embodiments. Various simple modifications may be made to the technical solution of the present application within its technical idea, and all such simple modifications fall within the protection scope of the present application. For example, the specific features described in the foregoing detailed description may be combined in any suitable manner where no contradiction arises; to avoid unnecessary repetition, the possible combinations are not separately described in this application. As another example, the various embodiments of the present application may be combined with one another arbitrarily, and such combinations should likewise be regarded as disclosures of the present application as long as they do not depart from its concept.
It should also be understood that, in the various method embodiments of the present application, the sequence numbers of the above-mentioned processes do not imply an execution sequence, and the execution sequence of the processes should be determined by their functions and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
The method provided by the embodiments of the present application has been explained above; the computing power platform provided by the embodiments of the present application is explained below.
Fig. 5 is a schematic block diagram of a computing power platform 500 provided by an embodiment of the present application.
As shown in fig. 5, the computing platform 500 includes:
an obtaining unit 501, configured to obtain a target task through an affinity task cache;
a determining unit 502, configured to determine, through the affinity task cache, a first resource pool corresponding to the target task from N affinity task resource pools respectively corresponding to N time periods, according to the time period of the target task; N is more than 1;
a creating unit 503, configured to create a computing power container for the target task in the first resource pool.
In some embodiments of the present application, the determining unit 502 is specifically configured to:
and determining the affinity task resource pool corresponding to the time period with the difference value smaller than the first threshold value in the estimated running time period as the first resource pool through the affinity task cache in the N time periods. Optionally, the affinity task cache stores N time periods corresponding to the N affinity task resource pools respectively.
In some embodiments of the present application, the creating unit 503 is specifically configured to:
determining whether the resources of the first resource pool meet the resources required by the target task;
if the resources of the first resource pool meet the resources required by the target task, creating a computing power container for the target task in the first resource pool;
if the resources of the first resource pool can not meet the resources required by the target task, acquiring second resources from a shared cache resource pool in the computing platform through the affinity task cache; a computing power container is created for the target task based on the resources of the first resource pool and the second resources.
In some embodiments of the present application, the determining unit 502 is further operable to:
if task accumulation occurs in the first resource pool, reporting a task accumulation notification to a data center in the computing platform through the affinity task cache;
in response to the task accumulation notification, sending, through the data center, new allocation strategies for the n affinity task resource pools and the shared cache resource pool of the computing platform to the affinity task cache; the n affinity task resource pools respectively correspond to n time periods, and the n time periods are different from the N time periods; n is more than 1;
and based on the new allocation strategy, reallocating resources for the n affinity task resource pools and the shared cache resource pool through the affinity task cache.
In some embodiments of the present application, the n time periods include time periods obtained by slicing the N time periods.
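For instance — purely as an assumed refinement, since the patent leaves the slicing rule to the data center — the n finer time periods might be obtained by halving each of the original N periods:

```python
def slice_periods(bounds: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Split each (start, end) period into two halves, yielding n = 2N periods."""
    sliced = []
    for start, end in bounds:
        mid = (start + end) / 2
        sliced.extend([(start, mid), (mid, end)])
    return sliced

print(slice_periods([(0, 60), (60, 480)]))
# -> [(0, 30.0), (30.0, 60), (60, 270.0), (270.0, 480)]
```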
In some embodiments of the present application, the determining unit 502 is further operable to:
periodically sending, through the affinity task cache, the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool in the computing platform to a data center;
receiving, through the affinity task cache, an adjustment strategy sent by the data center;
and adjusting, through the affinity task cache and based on the adjustment strategy, the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool in the computing platform.
In some embodiments of the present application, the determining unit 502 is further operable to:
acquiring an initialization strategy by using initialization data through a data center;
receiving the initialization strategy sent by the data center through the affinity task cache;
and initializing the N affinity task resource pools and the shared cache resource pool of the computing platform through the affinity task cache based on the initialization strategy.
In some embodiments of the present application, before obtaining the initialization policy with the initialization data by the data center, the determining unit 502 may further be configured to:
acquiring a configuration instruction for indicating initialization through a data center;
and responding to the configuration instruction, and acquiring the initialization strategy by the data center by using the initialization data.
In some embodiments of the present application, the determining unit 502 may further specifically be configured to:
determining a first sub-resource pool corresponding to the target task in N1 sub-resource pools in the first resource pool based on the machine room of the target task through the affinity task cache; the N1 sub-resource pools respectively correspond to N1 machine rooms, and the N1 machine rooms include the machine room of the target task; n1 > 1;
a computing power container is created for the target task in the first sub-resource pool in the first resource pool.
In some embodiments of the present application, the determining unit 502 may further specifically be configured to:
determining a second sub-resource pool corresponding to the target task in the N2 sub-resource pools in the first resource pool based on the network adopted by the target task through the affinity task cache; the N2 sub-resource pools respectively correspond to N2 networks, and the N2 networks include the network adopted by the target task; n2 > 1;
a computing power container is created for the target task in the second sub-resource pool in the first resource pool.
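A minimal sketch of the two sub-pool selections above, assuming the first resource pool is indexed first by machine room and then by network; the nested-dict layout and the room/network names are hypothetical.

```python
# Hypothetical first resource pool: machine room -> network -> free units.
FIRST_POOL = {
    "room-a": {"rdma": 64, "tcp": 32},
    "room-b": {"rdma": 16, "tcp": 48},
}

def select_sub_pool(room: str, network: str) -> int:
    """Pick the sub-resource pool matching the task's machine room and
    the network it adopts, returning its free resource units."""
    if room not in FIRST_POOL:
        raise KeyError(f"no sub-resource pool for machine room {room!r}")
    networks = FIRST_POOL[room]
    if network not in networks:
        raise KeyError(f"no sub-resource pool for network {network!r}")
    return networks[network]

# A task placed in room-a that adopts RDMA lands in the 64-unit sub-pool.
print(select_sub_pool("room-a", "rdma"))  # -> 64
```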
It is to be understood that the apparatus embodiments and the method embodiments may correspond to one another, and similar descriptions may refer to the method embodiments; to avoid repetition, details are not repeated here. Specifically, the computing platform 500 may correspond to the entity that executes the method 200, the method 300, and the method 400 of the embodiments of the present application, and the foregoing and other operations and/or functions of each module in the computing platform 500 are respectively intended to implement the corresponding flows of the methods in fig. 2 to fig. 4; for brevity, they are not described here again.
It should also be understood that the units in the computing platform 500 related to the embodiments of the present application may be combined, respectively or entirely, into one or several other units, or some unit(s) may be further split into multiple functionally smaller units; either way, the same operations can be achieved without affecting the technical effects of the embodiments of the present application. The above units are divided based on logical functions; in practical applications, the function of one unit may be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present application, the computing platform 500 may also include other units, and in practical applications these functions may be realized with the assistance of other units and through the cooperation of multiple units. According to another embodiment of the present application, the computing platform 500 may be constructed by running a computer program (including program code) capable of executing the steps of the corresponding method on a general-purpose computing device that includes processing elements and storage elements such as a Central Processing Unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM), thereby implementing the method of creating a computing power container according to the embodiments of the present application. The computer program may, for example, be recorded on a computer-readable storage medium, and loaded into and executed by an electronic device through that medium, so as to implement the corresponding methods of the embodiments of the present application.
In other words, the above units may be implemented in hardware, by software instructions, or by a combination of hardware and software. Specifically, the steps of the method embodiments of the present application may be carried out by integrated logic circuits in hardware within a processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software within a decoding processor. Optionally, the software may reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method embodiments in combination with its hardware.
Fig. 6 is a schematic structural diagram of an electronic device 600 provided in an embodiment of the present application.
As shown in fig. 6, the electronic device 600 includes at least a processor 610 and a computer-readable storage medium 620. The processor 610 and the computer-readable storage medium 620 may be connected by a bus or by other means. The computer-readable storage medium 620 is used to store a computer program 621, which includes computer instructions, and the processor 610 is used to execute the computer instructions stored by the computer-readable storage medium 620. The processor 610 is the computing core and control core of the electronic device 600; it is adapted to implement one or more computer instructions, and in particular to load and execute one or more computer instructions so as to realize the corresponding method flow or function.
By way of example, processor 610 may also be referred to as a Central Processing Unit (CPU). The processor 610 may include, but is not limited to: general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, and the like.
By way of example, the computer-readable storage medium 620 may be a high-speed RAM memory or a Non-volatile memory (Non-volatile memory), such as at least one disk memory; optionally, there may be at least one computer readable storage medium located remotely from the processor 610. In particular, computer-readable storage medium 620 includes, but is not limited to: volatile memory and/or non-volatile memory. The non-volatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash Memory. Volatile Memory can be Random Access Memory (RAM), which acts as external cache Memory. By way of example, but not limitation, many forms of RAM are available, such as Static random access memory (Static RAM, SRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic random access memory (Synchronous DRAM, SDRAM), Double Data Rate Synchronous Dynamic random access memory (DDR SDRAM), Enhanced Synchronous SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).
In one implementation, the electronic device 600 may be the computing power platform 500 shown in fig. 5. The computer-readable storage medium 620 stores computer instructions, which are loaded and executed by the processor 610 to implement the corresponding steps of the method embodiments shown in fig. 2 to fig. 4; in a specific implementation, the computer instructions in the computer-readable storage medium 620 are loaded by the processor 610 to execute the corresponding steps, which are not described here again to avoid repetition.
According to another aspect of the present application, a computer-readable storage medium (memory) is provided, which is a storage device in the electronic device 600 and is used for storing programs and data, for example the computer-readable storage medium 620. It is understood that the computer-readable storage medium 620 here may include both a built-in storage medium of the electronic device 600 and, of course, an extended storage medium supported by the electronic device 600. The computer-readable storage medium provides a storage space that stores the operating system of the electronic device 600. Also stored in the storage space are one or more computer instructions, which may be one or more computer programs 621 (including program code), suitable for being loaded and executed by the processor 610.
According to another aspect of the application, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium, for example the computer program 621. In this case, the electronic device 600 may be a computer; the processor 610 reads the computer instructions from the computer-readable storage medium 620 and executes them, so that the computer performs the method of creating a computing power container provided in the various alternatives above.
In other words, when implemented in software, the above may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes of the embodiments of the present application are executed, or the functions of the embodiments of the present application are realized, in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another via a wired connection (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless connection (e.g., infrared, radio, microwave).
Those of ordinary skill in the art will appreciate that the various illustrative elements and process steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Finally, it should be noted that the above is only a specific embodiment of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present application, and all the changes or substitutions should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. A method of creating a computing power container, the method being applied to a computing power platform;
the method comprises the following steps:
obtaining a target task through an affinity task cache;
determining a first resource pool corresponding to the target task in N affinity task resource pools respectively corresponding to N time periods according to the time period of the target task through the affinity task cache; n is more than 1;
creating a computing power container for the target task in the first resource pool.
2. The method of claim 1, wherein the time period of the target task comprises an estimated run period of the target task;
the determining, by the affinity task cache, a first resource pool corresponding to the target task from N affinity task resource pools corresponding to N time periods according to the time period of the target task, includes:
and determining, through the affinity task cache, among the N time periods, an affinity task resource pool corresponding to a time period whose difference from the estimated running period is smaller than a first threshold as the first resource pool.
3. The method of claim 1, wherein creating a computing power container for the target task in the first resource pool comprises:
determining whether the resources of the first resource pool satisfy the resources required by the target task;
if the resources of the first resource pool meet the resources required by the target task, creating a computing power container for the target task in the first resource pool;
if the resources of the first resource pool cannot meet the resources required by the target task, acquiring second resources from a shared cache resource pool in the computing platform through the affinity task cache; creating a computing power container for the target task based on the resources of the first resource pool and the second resources.
4. The method according to any one of claims 1 to 3, further comprising:
if the task accumulation occurs in the first resource pool, reporting a task accumulation notification to a data center in the computing platform through the affinity task cache;
in response to the task accumulation notification, sending, through the data center, new allocation strategies for the n affinity task resource pools and a shared cache resource pool of the computing platform to the affinity task cache; the n affinity task resource pools respectively correspond to n time periods, and the n time periods are different from the N time periods; n is more than 1;
and based on the new allocation strategy, reallocating resources for the n affinity task resource pools and the shared cache resource pool through the affinity task cache.
5. The method of claim 4, wherein the n time periods comprise time periods obtained by slicing the N time periods.
6. The method according to any one of claims 1 to 3, further comprising:
periodically sending the residual resources of the N affinity task resource pools and the residual resources of the shared cache resource pool in the computing platform to a data center through the affinity task cache;
receiving an adjustment strategy sent by the data center through the affinity task cache;
and based on the adjustment strategy, adjusting the residual resources of the N affinity task resource pools and the residual resources of the shared cache resource pool in the computing platform through the affinity task cache.
7. The method of any of claims 1-3, wherein prior to retrieving the target task through the affinity task cache, the method further comprises:
acquiring an initialization strategy by using initialization data through a data center;
receiving the initialization strategy sent by the data center through the affinity task cache;
and initializing the N affinity task resource pools and the shared cache resource pool of the computing platform through the affinity task cache based on the initialization strategy.
8. The method of claim 7, wherein prior to obtaining the initialization policy with the initialization data by the data center, the method further comprises:
acquiring a configuration instruction for indicating initialization through a data center;
and responding to the configuration instruction, and acquiring the initialization strategy by the data center by using the initialization data.
9. The method of any of claims 1-3, wherein creating a computing power container for the target task in the first resource pool comprises:
determining a first sub-resource pool corresponding to the target task in N1 sub-resource pools in the first resource pool based on the machine room of the target task through the affinity task cache; the N1 sub-resource pools respectively correspond to N1 machine rooms, and the N1 machine rooms comprise the machine room of the target task; n1 > 1;
creating a computing power container for the target task in the first sub-resource pool in the first resource pool.
10. The method of any of claims 1-3, wherein creating a computing power container for the target task in the first resource pool comprises:
determining a second sub-resource pool corresponding to the target task in the N2 sub-resource pools in the first resource pool based on the network adopted by the target task through the affinity task cache; the N2 sub-resource pools respectively correspond to N2 networks, and the N2 networks comprise the networks adopted by the target task; n2 > 1;
creating a computing power container for the target task in the second sub-resource pool in the first resource pool.
11. A computing force platform, comprising:
the acquiring unit is used for acquiring the target task through the affinity task cache;
a determining unit, configured to determine, according to the affinity task cache, a first resource pool corresponding to the target task from N affinity task resource pools corresponding to N time periods respectively according to the time period of the target task; n is more than 1;
and the creating unit is used for creating a calculation capacity container for the target task in the first resource pool.
12. An electronic device, comprising:
a processor adapted to execute a computer program;
a computer-readable storage medium, in which a computer program is stored which, when executed by the processor, implements the method of any one of claims 1 to 10.
13. A computer-readable storage medium for storing a computer program which causes a computer to perform the method of any one of claims 1 to 10.
CN202110397131.8A 2021-04-13 2021-04-13 Method for creating computing power container, computing power platform, electronic device and storage medium Pending CN113703936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110397131.8A CN113703936A (en) 2021-04-13 2021-04-13 Method for creating computing power container, computing power platform, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN113703936A true CN113703936A (en) 2021-11-26

Family

ID=78648008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110397131.8A Pending CN113703936A (en) 2021-04-13 2021-04-13 Method for creating computing power container, computing power platform, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN113703936A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160098296A1 (en) * 2014-10-02 2016-04-07 International Business Machines Corporation Task pooling and work affinity in data processing
US20200401449A1 (en) * 2019-06-21 2020-12-24 International Business Machines Corporation Requirement-based resource sharing in computing environment
CN112231049A (en) * 2020-09-28 2021-01-15 苏州浪潮智能科技有限公司 Computing equipment sharing method, device, equipment and storage medium based on kubernets
CN112416599A (en) * 2020-12-03 2021-02-26 腾讯科技(深圳)有限公司 Resource scheduling method, device, equipment and computer readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117472551A (en) * 2023-12-27 2024-01-30 四川弘智远大科技有限公司 Cloud computing hardware acceleration control system and method based on GPU integration
CN117472551B (en) * 2023-12-27 2024-03-01 四川弘智远大科技有限公司 Cloud computing hardware acceleration control system and method based on GPU integration


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination