CN112068957B - Resource allocation method, device, computer equipment and storage medium - Google Patents

Resource allocation method, device, computer equipment and storage medium

Info

Publication number
CN112068957B
Authority
CN
China
Prior art keywords
network
network algorithm
resources
target
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010879892.2A
Other languages
Chinese (zh)
Other versions
CN112068957A (en)
Inventor
吴欣洋
李涵
丁瑞强
孟凡辉
戚海涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Lynxi Technology Co Ltd
Original Assignee
Beijing Lynxi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Lynxi Technology Co Ltd filed Critical Beijing Lynxi Technology Co Ltd
Priority to CN202010879892.2A priority Critical patent/CN112068957B/en
Publication of CN112068957A publication Critical patent/CN112068957A/en
Priority to PCT/CN2021/114217 priority patent/WO2022042519A1/en
Application granted granted Critical
Publication of CN112068957B publication Critical patent/CN112068957B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request

Abstract

The embodiment of the invention discloses a resource allocation method, a resource allocation device, computer equipment and a storage medium. The method includes: acquiring network algorithms to which resources are to be allocated; when it is determined that the network algorithms to which resources are to be allocated include a first network algorithm, determining a target operation resource from the allocable operation resources according to the operation requirement information of the first network algorithm, where the first network algorithm is configured with operation requirement information, and the target operation resource is an operation resource that meets the operation requirement information when the first network algorithm is run; and allocating the target operation resource to the first network algorithm. The embodiment of the invention can configure operation resources reasonably and improve their utilization rate.

Description

Resource allocation method, device, computer equipment and storage medium
Technical Field
The embodiment of the invention relates to the field of artificial intelligence, in particular to a resource allocation method, a resource allocation device, computer equipment and a storage medium.
Background
In recent years, with the rapid development of artificial intelligence related applications and technologies, the demands on computing power and power-consumption efficiency keep increasing, and running AI algorithms on dedicated artificial intelligence (Artificial Intelligence, AI) chips has become an inevitable trend.
However, in the related art, operation resources are allocated unreasonably and their utilization rate is low.
Disclosure of Invention
The embodiment of the invention provides a resource allocation method, a resource allocation device, computer equipment and a storage medium, which can reasonably allocate operation resources and improve the utilization rate of the operation resources.
In a first aspect, an embodiment of the present invention provides a resource allocation method, which is applied to a many-core system, where the many-core system includes an allocable operation resource, and the method includes:
acquiring a network algorithm of resources to be allocated;
when it is determined that the network algorithms to which resources are to be allocated include a first network algorithm, determining a target operation resource from the allocable operation resources according to the operation demand information of the first network algorithm, wherein the first network algorithm is configured with the operation demand information, and the target operation resource is an operation resource which meets the operation demand information when the first network algorithm is run;
and allocating the target operation resource to the first network algorithm.
In a second aspect, an embodiment of the present invention further provides a resource allocation device configured in a many-core system, where the many-core system includes an allocable operation resource, including:
The first network algorithm acquisition module is used for acquiring a network algorithm of the resources to be allocated;
the target operation resource determining module is used for determining target operation resources from the allocable operation resources according to the operation demand information of the first network algorithm when the network algorithm of the resources to be allocated comprises the first network algorithm, wherein the first network algorithm is configured with the operation demand information, and the target operation resources are operation resources which meet the operation demand information when the first network algorithm is operated;
and the first network algorithm resource allocation module is used for allocating the target operation resource to the first network algorithm.
In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor executes the program to implement a resource allocation method according to any one of the embodiments of the present invention.
In a fourth aspect, embodiments of the present invention further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a resource allocation method according to any of the embodiments of the present invention.
According to the embodiment of the invention, the first network algorithm is allocated resources from the allocable operation resources, and the target operation resource allocated to the first network algorithm meets its operation requirement information. Partial resources can thus be allocated to each network algorithm, and the resources matched with a network algorithm are adjusted according to its operation information, so that resources are allocated reasonably for each network algorithm. This solves the problem in the prior art that allocating the entire resources to every network algorithm wastes operation resources; operation resources can be configured reasonably, their utilization rate is improved, and waste of operation resources is reduced.
Drawings
FIG. 1 is a flow chart of a method for allocating resources according to a first embodiment of the present invention;
fig. 2 is a flowchart of a resource allocation method in the second embodiment of the present invention;
FIG. 3a is a flow chart of a method for allocating resources according to a third embodiment of the present invention;
FIG. 3b is a flow chart of a method of resource allocation in accordance with a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of a resource allocation device in a fourth embodiment of the present invention;
fig. 5 is a schematic structural diagram of a computer device in a fifth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Example 1
Fig. 1 is a flowchart of a resource allocation method according to a first embodiment of the present invention, where the method may be implemented by a resource allocation device provided by the embodiment of the present invention, and the device may be implemented in a software and/or hardware manner and may be generally integrated into a computer device. As shown in fig. 1, the method of the present embodiment includes:
any of the embodiments of the present invention is applied to a many-core system that includes allocatable operating resources.
Here, an allocable operation resource may refer to any resource dedicated to being allocated to a network algorithm, for example a computing unit or a thread. Besides the allocable operation resources, the resources of the computer device also include resources allocated to programs other than network algorithms, so that those other programs can run. The computer device executing the resource allocation method includes a many-core system, and the many-core system is used to run multiple network algorithms simultaneously. A many-core (Many Cores) system is a set of cores with high-performance parallel processing capability, formed by connecting a large number (hundreds, or thousands in the future) of cores of one or more kinds in a preset manner. Illustratively, the structure of the computer device performing the resource allocation method includes: single-card single-chip, single-card multi-chip, or multi-card multi-chip. The single-card single-chip device is one many-core system chip; in a single-card multi-chip or multi-card multi-chip device, each card carries at least one chip, where a card is an integrated circuit with a set function.
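As a hedged illustration only (not part of the patent), the following Python sketch shows one way the allocable operation resources of a many-core system might be tracked; the class and field names (ResourcePool, ManyCoreSystem, compute_units, threads) are assumptions introduced here for clarity.

    from dataclasses import dataclass, field

    @dataclass
    class ResourcePool:
        compute_units: int   # allocatable cores (smallest independently scheduled units)
        threads: int         # allocatable scheduling threads

    @dataclass
    class ManyCoreSystem:
        allocatable: ResourcePool                      # resources reserved for network algorithms
        allocated: dict = field(default_factory=dict)  # network algorithm name -> ResourcePool

        def remaining(self) -> ResourcePool:
            # Allocatable resources not yet handed to any network algorithm.
            used_cu = sum(r.compute_units for r in self.allocated.values())
            used_th = sum(r.threads for r in self.allocated.values())
            return ResourcePool(self.allocatable.compute_units - used_cu,
                                self.allocatable.threads - used_th)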
S110, acquiring a network algorithm of the resources to be allocated.
A network algorithm is used to implement a particular function and may refer to a high-performance computing algorithm; for example, it may be a machine learning model network algorithm, such as a deep learning model network algorithm or a neural network algorithm. There may be multiple network algorithms to which resources are to be allocated; at least two of them may be independent of each other, or may have a dependency relationship, where two network algorithms having a dependency relationship means that the output of one network algorithm serves as the input of the other.
A network algorithm needs resources to run; resources are allocated to a network algorithm so that it can run and realize its specific function. In some alternative embodiments, the number of network algorithms to which resources are to be allocated is at least two. For the at least two network algorithms, resources may be allocated to one network algorithm at a time, to several network algorithms simultaneously, or to several network algorithms simultaneously in batches. This can be set according to the actual situation, and the embodiment of the present invention is not particularly limited.
For example, allocating resources to at least two network algorithms may mean allocating them one by one: when the resource allocation of one network algorithm is completed, or one network algorithm is running, resource allocation for the next network algorithm begins.
Optionally, the network algorithm includes: the neural network model and/or at least one network comprised by the neural network model.
Here, a network algorithm may be all or part of a neural network model. It can be understood that a network algorithm is used to perform a complete specific function or part of that specific function. For example, the network algorithm may be a neural network model, or a collection formed by at least one network of a neural network model.
In some alternative embodiments, the network algorithm may include a model formed by the image detection network and the speech recognition network, or the network algorithm may include only the image detection network, or the network algorithm may include only the speech recognition network. As another example, the network algorithm may include a model formed by the image detection network and the object recognition network, or the network algorithm may include only the image detection network, or the network algorithm may include only the object recognition network.
By configuring the network algorithm as the whole model or part of the network included in the model, the application scene of the network algorithm executable by the chip and the service mode of the network algorithm can be enriched, and the utilization rate of operation resources can be improved.
Optionally, the neural network model is an image object detection model, the image object detection model includes an image detection network and an object recognition network, and the network algorithm includes: an image detection network and/or an object recognition network.
Wherein the image object detection model is composed of an image detection network and an object recognition network, and furthermore, the image object detection model does not include other networks. The object recognition network may include a person (e.g., face) recognition network, a vehicle recognition network, or a signal light recognition network, among others.
By configuring the network algorithm as an image detection network and/or an object recognition network, the efficiency of image recognition using the network algorithm can be improved.
S120, when the network algorithm of the resources to be allocated comprises a first network algorithm, determining a target operation resource from the allocable operation resources according to the operation demand information of the first network algorithm, wherein the first network algorithm is configured with the operation demand information, and the target operation resource is an operation resource which meets the operation demand information when the first network algorithm is operated.
The first network algorithm is a network algorithm configured with operational requirement information. The operation requirement information may refer to information of operation performance that the corresponding network algorithm needs to achieve. The operational requirement information may be configurable by a user. Illustratively, the operation requirement information includes an operation speed minimum value and/or an operation accuracy minimum value of the first network algorithm, and the like.
Running a network algorithm means running the program corresponding to that network algorithm, and running a program requires resources; the allocable operation resources are therefore the operation resources that can be allocated to network algorithms and support running them. By way of example, the allocatable resources may include computing units, threads, and the like. A computing unit, also called a core (Core), may refer to the smallest independently scheduled unit with complete computing power. A storage unit may refer to a memory space in the chip cache. A thread may refer to the smallest unit for which the operating system can perform operation scheduling; it is contained in a process and is the unit of operation during a run. It should be noted that, at any given time, the allocated resources cannot overlap one another, that is, each resource runs a network algorithm different from the network algorithms run by other resources.
If the first network algorithm, which is configured with operation requirement information, needs to be allocated resources, part or all of the allocable operation resources can be selected according to the operation requirement information and determined as the target operation resource. The target operation resource is used to run the first network algorithm. The target operation resource is an operation resource that meets the operation requirement information when the first network algorithm is run; that is, when the first network algorithm is run with the target operation resource, its operation performance matches the operation requirement information, so the target operation resource enables the first network algorithm to achieve the operation performance required by the operation requirement information.
A correspondence between operation requirement information and operation resources can be established in advance, so that the corresponding operation resource can be looked up according to the operation requirement information of the first network algorithm and determined as its target operation resource. To establish this correspondence, the operation requirement information of a network algorithm can be determined according to information associated with the network algorithm (such as its structure, its parameters, its requirement information, and the like), matched resources are allocated to different network algorithms respectively, and the correspondence between the operation requirement information of each network algorithm and the operation resources is recorded.
Alternatively, the matched operation resource can be determined as the target operation resource according to the historical resource allocation information of the first network algorithm and the operation information corresponding to each piece of historical allocation information. There are also other ways of determining the target operation resource from the operation requirement information; for example, the first network algorithm may be trial-run with the allocable operation resources to obtain operation information of the trial run, and the operation resource allocated to the first network algorithm is then adjusted according to that operation information until a target operation resource meeting the operation requirement information is obtained. The specific way can be set according to the actual situation.
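The two determination approaches just described (a pre-established correspondence table, or a trial run with adjustment) can be pictured with the hedged sketch below; lookup_table, trial_run, shrink and the attribute names are hypothetical helpers, not the patented implementation.

    def determine_target_resource(first_algo, requirement, allocable,
                                  lookup_table, trial_run, shrink):
        # 1) Pre-established correspondence: operation requirement info -> operation resource.
        key = (first_algo.structure_id, requirement.min_speed)
        if key in lookup_table:
            return lookup_table[key]
        # 2) Fallback: trial-run and shrink until the measured speed would drop
        #    below the required minimum (the adjustment loop of the second embodiment).
        resource = allocable
        while True:
            candidate = shrink(resource)
            if candidate == resource:                            # nothing left to shrink
                return resource
            if trial_run(first_algo, candidate).speed < requirement.min_speed:
                return resource                                  # keep the last sufficient resource
            resource = candidate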
S130, distributing the target operation resource to the first network algorithm.
And distributing the target operation resources to the corresponding first network algorithm, wherein the first network algorithm can be operated by adopting the distributed target operation resources when the first network algorithm is operated. The resource allocation can be dynamic allocation when the chip runs a network algorithm, or can be pre-allocation when compiling.
Optionally, the number of network algorithms running at the same time is at least two, which means that multiple network algorithms can run in parallel, each using different resources. In this way, while one network algorithm is transmitting data and its allocated operation resources are idle, the other operation resources still run their corresponding network algorithms, which reduces the influence of a single network algorithm on the utilization rate of all operation resources, reduces waste of operation resources, and improves the utilization rate of the chip's operation resources.
Optionally, the method further comprises: and if the operation requirement information of the first network algorithm is not matched with the chip performance information, generating requirement adjustment information of the first network algorithm to instruct a user to adjust the operation requirement information of the first network algorithm.
The chip performance information is used to evaluate the performance of the chip or chipset and may include storage space capacity, number of processing units, data throughput, operating frequency, and so on. That the operation requirement information of the first network algorithm does not match the chip performance information means that when the chip or chipset runs the first network algorithm, the running state of the first network algorithm does not match, i.e. cannot satisfy, the operation requirement information. For example, the first network algorithm may be run on all of the resources and its running speed taken as the chip performance information; if the chip performance information is smaller than the operation requirement information, it is determined that the chip performance cannot satisfy the operation requirement information, and therefore the operation requirement information of the first network algorithm does not match the chip performance information.
The requirement adjustment information of the first network algorithm is used for prompting a user to adjust operation requirement information of the first network algorithm. The requirement adjustment information of the first network algorithm comprises identification information of the first network algorithm, mismatch information of operation requirement information and chip performance information and the like.
By detecting that the operation requirement information of the first network algorithm is not matched with the performance information of the chip, the requirement adjustment information is generated, a user is prompted to adjust the operation requirement information of the first network algorithm so as to accurately allocate resources for the first network algorithm, the occurrence of the condition that the resources are continuously allocated for the first network algorithm when the chip or the chipset cannot meet the operation requirement information of the first network algorithm is reduced, the error probability of the resource allocation operation is reduced, and the accuracy of the resource allocation is improved.
The computing capacity and storage space of a chip have upper limits, and connecting multiple AI chips in parallel can raise the computing capacity and enlarge the storage space. It should be understood that the upper limit of chip performance differs with the way chips are combined and with the performance of different chip models. Therefore, the network algorithms need to be checked one by one to judge whether the current chip can meet the expected performance of each network algorithm, and if not, the user is prompted to adjust the expected performance of that network algorithm.
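A minimal sketch of this feasibility check follows, under the assumption that the chip performance information is obtained by running the algorithm on the full resources and comparing the measured speed against the required speed; the field and helper names are invented for illustration.

    def check_requirement(algo, required_speed, full_resources, trial_run):
        chip_peak_speed = trial_run(algo, full_resources).speed   # chip performance information
        if chip_peak_speed < required_speed:
            # Operation requirement info does not match chip performance: build
            # requirement adjustment information to prompt the user.
            return {
                "algorithm_id": algo.id,
                "mismatch": f"required {required_speed}/s, chip peak {chip_peak_speed}/s",
                "action": "please relax the operation requirement information",
            }
        return None   # requirement is achievable; continue with normal allocation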
According to the embodiment of the invention, the first network algorithm is allocated resources from the allocable operation resources, and the target operation resource allocated to it meets its operation requirement information. Partial resources can thus be allocated to each network algorithm, and the resources matched with a network algorithm are adjusted according to its operation information, so that resources are allocated reasonably for each network algorithm. This solves the problem in the prior art that allocating all resources to every network algorithm wastes operation resources; operation resources can be allocated reasonably, their utilization rate is improved, and waste of operation resources is reduced.
Example two
Fig. 2 is a flowchart of a resource allocation method according to a second embodiment of the present invention, which is implemented based on the above-mentioned embodiment. The method of the embodiment can comprise the following steps:
S210, acquiring a network algorithm of the resources to be allocated.
Reference may be made to the foregoing embodiments for a non-exhaustive description of embodiments of the invention.
S220, when the network algorithm of the resources to be allocated comprises a first network algorithm, the first network algorithm is operated by adopting the matched target resources, the current operation information of the first network algorithm is obtained, and the first network algorithm is configured with operation requirement information.
The matched target resource can be all or part of the allocable operation resources, and the current operation information is used to evaluate the performance of the chip when it runs the first network algorithm with the target resource. The current operation information may include, when the first network algorithm is run with the target resource, the duration the first network algorithm takes to calculate the input data and/or the operation speed of the first network algorithm, and the like. The current operation information is used to adjust the target resource.
The target resource adjustment is used for reducing or increasing resources occupied by the network algorithm so as to obtain target operation resources meeting the operation requirement of the first network algorithm.
Optionally, running the first network algorithm with the matched target resource and obtaining the current operation information of the first network algorithm include: generating random input data during the compiling process of the first network algorithm; inputting the random input data into the first network algorithm and running the first network algorithm with the target resource to obtain the output data that the first network algorithm produces for the random input data, where the target resource is matched with the first network algorithm; and obtaining the operation speed at which the first network algorithm calculates the output data from the random input data, and determining this operation speed as the current operation information of the first network algorithm.
Compiling the first network algorithm may be understood as converting the file associated with the first network algorithm into an executable file. The target resources matched with the first network algorithm can be configured in the executable file, and when the chip executes the executable file, the matched target resources can be acquired to execute the executable file, so that the first network algorithm is operated by the target resources.
The random input data is input into the first network algorithm so that the first network algorithm performs calculation and the current operation information of the first network algorithm is obtained. The random input data may refer to automatically generated random data used as the input data of the first network algorithm. The type of the random input data matches the first network algorithm, and the random input data is input in unit quantities. Illustratively, if the first network algorithm is a speech recognition network, the random input data is a piece of random audio, such as one sentence of random audio or one random audio file; for another example, if the target network algorithm is an image detection network, the random input data is one random image.
The random input data is input to the target network algorithm, and the target network algorithm is instructed to calculate the random input data. And operating the target network algorithm by adopting the target resource, inputting the random input data into the target network algorithm, and obtaining output data output by the target network algorithm, wherein the output data is a calculation result obtained by calculating the random input data by the target network algorithm.
In some alternative embodiments, the target resource is used to run a target network algorithm, as opposed to running an executable file compiled from the target network algorithm. The operation process of using the target resource to operate the target network algorithm is a pseudo operation process, and is a process for simulating the actual operation of using the target resource to operate the target network algorithm. For example, in the compiling process of the target network algorithm, a pre-configured module may be invoked to run the target network algorithm based on the target resource, so as to simulate the real running process of running the target network algorithm by using the target resource.
The operation speed may refer to the speed of the calculation process in which the first network algorithm, run with the target resource, calculates the output data from the random input data.
By carrying out resource allocation on the first network algorithm in advance in the compiling process of the first network algorithm, the resource allocation can be carried out in advance before the network algorithm is operated, adjustment is carried out, the residual resources allocated for the next network algorithm can be accurately determined, the resources can be accurately allocated for a plurality of network algorithms, the resources of a plurality of network algorithms can be reasonably allocated, and the resource utilization rate is improved.
And in the compiling process of the first network algorithm, performing resource allocation on the first network algorithm, and after the first network algorithm is compiled, operating the executable file obtained by compiling the first network algorithm by adopting the adjusted target resource.
Optionally, the running the first network algorithm using the adjusted target resource includes: generating a target executable file of the first network algorithm according to the adjusted target resource and the first network algorithm; and storing the target executable file into a cache, and operating the target executable file by adopting the adjusted target resource.
The target executable file is matched with the first network algorithm, and the executable file is used for execution to realize the function of the first network algorithm. The target executable file is configured with executable code of the first network algorithm and the adjusted target resource. And storing the target executable file into a cache for running the target executable file in the on-chip cache.
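The compile-time trial run above can be pictured as follows. This is only a sketch: make_random_input, run_on_resource and the timing approach are assumptions, and the "run" stands for the simulated (pseudo) run invoked during compilation rather than execution of the final target executable file.

    import time

    def measure_current_speed(first_algo, target_resource, make_random_input, run_on_resource):
        sample = make_random_input(first_algo.input_spec)     # e.g. one random image or audio clip
        start = time.perf_counter()
        _output = run_on_resource(first_algo, target_resource, sample)  # simulated run at compile time
        elapsed = time.perf_counter() - start
        return 1.0 / elapsed                                  # current running speed (items per second)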
And S230, adjusting the target resources according to the current operation information and the operation demand information of the first network algorithm until the adjusted target resources meet the minimum resource allocation condition, and determining the target resources meeting the minimum resource allocation condition as target operation resources matched with the first network algorithm, wherein the target operation resources are operation resources meeting the operation demand information when the first network algorithm is operated.
The resource minimum allocation condition is used to detect whether the current target resource is the minimum resource that still meets the running requirement of the network algorithm. The resource minimum allocation condition may include the operation requirement information. Illustratively, the resource minimum allocation condition may be that the target resource is the minimum allocable resource with which the running speed of the first network algorithm remains greater than or equal to one item per second.
The target resource is adjusted until the adjusted target resource meets the minimum allocation condition of the resource, which can be understood as detecting the minimum resource under the condition of ensuring the operation requirement of the first network algorithm.
In one possible implementation, the first network algorithm may be trial-run to obtain corresponding performance results, and the resources are adjusted based on those results, so that the number of occupied resources is reduced while the running requirement is still met, improving the effective utilization of the computing units.
The adjusted target resources are used as target operation resources corresponding to the first network algorithm to operate the first network algorithm, so that the resources occupied by the first network algorithm can be reduced as much as possible under the condition of meeting the requirement of the first network algorithm, the operation resources are reasonably configured, the resource utilization rate is improved, the rest resources can be used for operating other network algorithms, the number of the network algorithms operated in parallel is increased, and the overall operation efficiency of a plurality of network algorithms is improved. For example, the first network algorithm may be run using the adjusted target resource after the first network algorithm is compiled.
Adjusting the target resource may include decreasing or increasing it. If the performance corresponding to the current operation information is lower than the performance corresponding to the operation requirement information, the target resource is increased, provided it does not exceed the allocable operation resources. If the performance corresponding to the current operation information is higher than the performance corresponding to the operation requirement information, the target resource is decreased, provided it does not drop to zero.
Optionally, the current operation information includes a current operation speed, and the operation demand information includes an operation demand speed, where the adjusting the target resource until the adjusted target resource meets a resource minimum allocation condition includes: reducing the target resource; operating the first network algorithm by adopting the reduced target resource, and acquiring the current operation speed of the first network algorithm; if the difference value between the current running speed and the running demand speed is determined to be greater than a set speed threshold, continuing to reduce the target resource, running the first network algorithm by adopting the reduced target resource, and acquiring the current running speed of the first network algorithm; and if the difference value between the current running speed and the running demand speed is smaller than or equal to the set speed threshold value, determining that the reduced target resource meets a resource minimum allocation condition.
The target resource is reduced repeatedly in order to determine the smallest usable resource. The current running speed may refer to the calculation speed of the first network algorithm when the target resource is used to run the first network algorithm and the first network algorithm is instructed to calculate the input data. The running demand speed may refer to the speed the first network algorithm is expected to reach when it is run with the target resource and instructed to calculate the input data.
The first network algorithm is run with the reduced target resource and its current running speed is obtained, so that the running speed of the first network algorithm under the reduced target resource can be obtained in real time.
The speed threshold is set for detecting whether the reduced target resource is close to the target resource meeting the minimum allocation condition of the resource, and also for detecting whether the current running speed associated with the reduced target resource is close to the running demand speed. The set speed threshold may be set according to actual situations, and the embodiment of the present invention is not particularly limited. The input and output of different types of network algorithms are different, for example, the input and output of the image detection network are both images, and for example, the input of the voice recognition network is voice, the output is text, and the set speed thresholds corresponding to different types of network algorithms can be configured to be different.
The difference between the current operating speed and the operating demand speed may be, for example, the difference obtained by subtracting the value of the operating demand speed from the value of the current operating speed. The difference between the current operating speed and the operating demand speed is a non-negative number.
If the difference between the current running speed and the running demand speed is greater than the set speed threshold, the current running speed is far greater than the running demand speed; the target resource currently allocated to the first network algorithm is therefore more than sufficient to meet the running demand, that is, the chip runs the first network algorithm with excess performance and the target resource is excessive. In this case of excessive resources, the operations of reducing the target resource, running the first network algorithm with the reduced target resource, obtaining its current running speed, and comparing the difference between the current running speed and the running demand speed with the set speed threshold can continue to be executed.
The difference between the current running speed and the running demand speed is smaller than or equal to the set speed threshold, which indicates that the current running speed is close to (equal to or slightly larger than) the running demand speed, so that the current target resource allocated to the first network algorithm is enough to support the first network algorithm to reach the running demand, and the performance of the chip running the first network algorithm by adopting the current allocated target resource is not excessive, namely the allocation of the target resource is reasonable. And under the condition that the target resource allocation is reasonable, determining that the adjusted target resource meets the resource minimum allocation condition.
By reducing the target resources, continuously calculating the difference between the current running speed and the running demand speed of the first network algorithm when the first network algorithm is run by adopting the reduced target resources, determining that the resources are excessive when the difference is larger than a set speed threshold value, continuously reducing the resources, and determining that the adjusted target resources meet the minimum allocation condition of the resources when the difference is smaller than or equal to the set speed threshold value, thereby reducing the condition of excessive resources, reasonably configuring the running resources and improving the resource utilization rate.
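The reduce-and-retest loop of S230 can be sketched as below; measure_speed and shrink are assumed helpers, and speed_threshold is the set speed threshold discussed above, so this is an illustration rather than the patented implementation.

    def adjust_to_minimum(first_algo, target_resource, demand_speed, speed_threshold,
                          measure_speed, shrink):
        current_speed = measure_speed(first_algo, target_resource)
        # Keep reducing while the current speed exceeds the demand speed by more
        # than the set threshold, i.e. while the allocation is still excessive.
        while current_speed - demand_speed > speed_threshold:
            target_resource = shrink(target_resource)
            current_speed = measure_speed(first_algo, target_resource)
        # Difference is now <= threshold: the resource minimum allocation condition holds.
        return target_resource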
Optionally, the reducing the target resource includes: calculating a first quantity ratio of computing units included in the target resource to allocable computing units; calculating a second number ratio of threads included in the target resource to allocable threads; if the first number ratio is determined to be greater than or equal to the second number ratio, reducing the number of computing units included in the target resource; if it is determined that the first number ratio is less than the second number ratio, reducing the number of threads included by the target resource.
For example, the target resource may include computing units and threads. The computing units and threads may be reduced at the same time, or only one of the two may have its number reduced. The allocatable resources may include allocatable computing units and allocatable threads. It should be noted that the chip contains a plurality of computing units and threads, of which a portion of the computing units may be selected as the allocatable computing units and a portion of the threads as the allocatable threads; together, the allocatable computing units and allocatable threads form the allocatable resources.
The ratio of the number of computing units currently allocated to the first network algorithm to the total number of allocable computing units is calculated and taken as the first number ratio. The first number ratio describes the proportion of the total allocable computing resources that has been allocated to the first network algorithm.
The ratio of the number of threads currently allocated to the first network algorithm to the total number of allocable threads is calculated and taken as the second number ratio. The second number ratio describes the proportion of the total allocable thread resources that has been allocated to the first network algorithm.
The first number ratio is greater than or equal to the second number ratio, indicating that the number of computing units allocated to the first network algorithm is excessive relative to the threads, i.e., the computing resources are more excessive relative to the thread resources, thereby preferentially reducing the number of computing units. The first number ratio being smaller than the second number ratio indicates that the number of threads allocated to the first network algorithm is excessive relative to the computational unit, i.e., the thread resources are more excessive relative to the computational resources, thereby preferentially reducing the number of threads.
It should be understood that the actual resource occupation and the corresponding performance can be measured, one or both of the number of occupied computing units and the number of threads can be reduced, and the target resource allocated to the first network algorithm can be gradually reduced until the reduced target resource just meets, or slightly exceeds, the performance requirement of the first network algorithm, at which point the resource minimum allocation condition is determined to be met; resources are then allocated to the next network algorithm according to the same flow until all network algorithms have been allocated. In this way the resource utilization rate can be improved while the running requirement of each network algorithm is guaranteed.
In addition, in the target resource, if the number of threads is determined to be one, the number of threads is not reduced continuously, and the number of computing units is reduced continuously. If the number of computing units is determined to be one, the number of computing units is not reduced continuously, and the number of threads is reduced continuously.
When the target resource comprises computing resources and threads, determining the ratio of the number of computing units to the number of the allocatable computing units as a first number ratio, determining the ratio of the number of threads to the number of the allocatable threads as a second number ratio, comparing the first number ratio with the second number ratio, and taking the resources with large number ratio as the target resources needing to be reduced, thereby being capable of rapidly reducing surplus resources and improving the efficiency of reasonable resource allocation.
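A sketch of the reduction rule just described, using the hypothetical ResourcePool fields from the earlier sketch: whichever resource type (computing units or threads) takes the larger share of the allocatable pool is reduced first, and neither is reduced below one.

    def shrink(resource, allocatable):
        cu_ratio = resource.compute_units / allocatable.compute_units   # first number ratio
        th_ratio = resource.threads / allocatable.threads               # second number ratio
        if cu_ratio >= th_ratio and resource.compute_units > 1:
            resource.compute_units -= 1      # computing resources are relatively more excessive
        elif resource.threads > 1:
            resource.threads -= 1            # thread resources are relatively more excessive
        elif resource.compute_units > 1:
            resource.compute_units -= 1      # only one thread left: reduce computing units instead
        return resource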
According to the embodiment of the invention, the target resources are pre-allocated for the first network algorithm, the first network algorithm is operated by adopting the target resources, the target resources are adjusted according to the current operation information of the first network algorithm, the resources occupied by the first network algorithm are reduced as much as possible under the condition of meeting the requirement of the first network algorithm, the computing resources are reasonably configured, the resource utilization rate is improved, other network algorithms can be operated by adopting the residual resources, the number of the network algorithms operated in parallel is increased, and the overall operation efficiency of a plurality of network algorithms is improved.
Example III
Fig. 3a is a flowchart of a resource allocation method according to a third embodiment of the present invention, which is embodied based on the above-described embodiment. The method of the embodiment can comprise the following steps:
s310, a network algorithm of the resources to be allocated is obtained.
Reference may be made to the foregoing embodiments for a non-exhaustive description of embodiments of the invention.
S320, when the network algorithm of the resources to be allocated comprises a first network algorithm, determining a target operation resource from the allocable operation resources according to the operation demand information of the first network algorithm, wherein the first network algorithm is configured with the operation demand information, and the target operation resource is an operation resource which meets the operation demand information when the first network algorithm is operated.
S330, the target operation resource is distributed to the first network algorithm.
And S340, when the network algorithms of the resources to be allocated also comprise a second network algorithm not configured with operation demand information, determining the remaining operation resources according to the allocable operation resources and the target operation resources corresponding to the first network algorithm.
The second network algorithm not configured with the operation requirement information may refer to a network algorithm in which no operation requirement exists, and thus, may operate using any resource. The unconfigured operation requirement information may also be that the operation requirement information is null.
Because the second network algorithm has no operation requirements, it can run with any resources; thus the operation performance of the first network algorithms, which do have operation requirements, is preferentially guaranteed, and the remaining resources are reasonably configured for the second network algorithms that have no operation requirements.
When the resource allocation of every first network algorithm is completed, the remaining allocable operation resources are counted, and the second network algorithm can be run in any manner with them.
Wherein, the first network algorithm may not exist, and at this time, the corresponding target operation resource may be empty.
The remaining operating resources are resources for operating the second network algorithm for which the operating requirement information is not configured. The remaining operational resources and the target operational resources constitute allocatable operational resources. Wherein, the remaining operation resources may be equal to or greater than 0, and/or the target operation resources may be equal to or greater than 0.
And S350, distributing the residual operation resources to the second network algorithm.
The first network algorithms can be operated in parallel, the first network algorithms and the second network algorithms can be operated in parallel, and the second network algorithms can be operated in a time division multiplexing mode.
Optionally, the network algorithm of the resources to be allocated includes a plurality of second network algorithms, where allocating the remaining operation resources to the second network algorithms includes: and distributing the residual operation resources to the plurality of second network algorithms in a time division multiplexing mode.
The time division multiplexing manner may mean that at least one second network algorithm is respectively allocated a time period, and the second network algorithm operates in the allocated time period. The time periods allocated by the different second network algorithms do not overlap each other. And operating the second network algorithm by using the residual operation resources in the time period allocated by the second network algorithm.
By adopting the remaining operation resources to operate the second network algorithm without operation requirements in a time division multiplexing mode, the operation performance and the operation efficiency of the network algorithm without operation requirements can be ensured, the operation performance and the operation efficiency of the network algorithm without operation requirements are simultaneously considered, the resources of the network algorithm without operation requirements are reasonably configured, and the operation performance and the operation efficiency of different types of network algorithms are improved.
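The time-division multiplexing of the remaining operation resources among second network algorithms might look like the following round-robin sketch; the slice length, the finished flag and run_slice are illustrative assumptions, the key point being that the time periods of different second network algorithms do not overlap.

    def run_second_algorithms(second_algos, remaining_resource, run_slice, slice_ms=10):
        # Round-robin: each second network algorithm in turn gets a non-overlapping
        # time slice on the same remaining operation resource.
        while any(not algo.finished for algo in second_algos):
            for algo in second_algos:
                if not algo.finished:
                    run_slice(algo, remaining_resource, slice_ms)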
Optionally, the network algorithm of the resource to be allocated includes a plurality of first network algorithms, where determining, according to operation requirement information of the first network algorithms, a target operation resource from the allocable operation resources includes: classifying the plurality of first network algorithms to form at least one network group, wherein the network group comprises a dependency network group and/or an independent network group, the dependency network group comprises at least two first network algorithms, a dependency relationship exists between each first network algorithm and at least one first network algorithm, and the independent network group comprises first network algorithms without the dependency relationship; when a dependency relationship network group is determined to exist, sequentially determining target operation resources allocated to each first network algorithm in the dependency relationship network group; and when the first network algorithm resources in the dependency relationship network group are determined to be completely allocated or the dependency relationship network group is empty, sequentially determining the target operation resources allocated to the first network algorithms in the independent network group.
A network group represents one type of network algorithm. In the dependency network group, the operation of each first network algorithm interacts with the operation of at least one other first network algorithm, e.g. the output of one first network algorithm is the input of another. The first network algorithms in the independent network group run independently of one another.
In some alternative embodiments, resources are allocated in a fixed order: resources are preferentially allocated to the first network algorithms in the dependency network group, and the first network algorithms in the independent network group are allocated resources after those in the dependency network group have been allocated.
In the dependency relationship network group, because a dependency relationship exists between the first network algorithm and at least one first network algorithm, output data of the first network algorithm needs to be transmitted to the first network algorithm with the dependency relationship, so that the first network algorithm with the dependency relationship calculates according to the output data. The shorter the routing path, the shorter the time required to transmit data during the data transmission process. Storage resources, i.e., memory space, may also be included in the allocatable resources, for example.
The method can therefore allocate resources preferentially to the first network algorithms in the dependency network group and place network algorithms with dependency relationships in adjacent storage space, so that the routing path, and hence the data transmission time, is shortened.
When the resource allocation of all network algorithms in the dependency network group is finished, or the dependency network group is empty, no network algorithm with unallocated resources remains in the dependency network group. At this point, resource allocation for the first network algorithms in the independent network group begins.
The resources remaining for the independent network group are determined from the allocable operation resources and the sum of the target operation resources allocated to the first network algorithms in the dependency network group. If the dependency network group is empty, no resources have been allocated to it, and the allocable operation resources are used directly to allocate resources to the network algorithms in the independent network group. If the dependency network group is not empty, resources are allocated to each first network algorithm in the independent network group from the resources left after the first network algorithms in the dependency network group have been allocated.
The first network algorithms are classified according to the dependency relationship among the first network algorithms, the first network algorithms with the dependency relationship are preferentially allocated with resources, and after the first network algorithms with the dependency relationship are allocated with resources, the first network algorithms which are independent of each other are continuously allocated with resources, so that resources with short routes are allocated for the network algorithms with the dependency relationship, the data transmission path length and the transmission time length of the network algorithms with the dependency relationship are reduced, and the operation performance of the network algorithms with the dependency relationship is improved.
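The grouping and allocation order can be summarized by the sketch below; depends_on and allocate are assumed helpers, and prefer_adjacent stands in for placing dependent algorithms in adjacent storage space to shorten routing.

    def allocate_first_algorithms(first_algos, depends_on, allocate):
        # Classify: a first network algorithm joins the dependency network group if it
        # has a dependency relationship with at least one other first network algorithm.
        dependency_group = [a for a in first_algos
                            if any(depends_on(a, b) or depends_on(b, a)
                                   for b in first_algos if b is not a)]
        independent_group = [a for a in first_algos if a not in dependency_group]

        for algo in dependency_group:          # allocated first, in adjacent storage space
            allocate(algo, prefer_adjacent=True)
        for algo in independent_group:         # allocated from whatever resources remain
            allocate(algo, prefer_adjacent=False)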
Optionally, the method further comprises: operating each network algorithm and counting the operation data of each network algorithm; and when a change request of the operation requirement information aiming at least one network algorithm is acquired, re-allocating resources.
For example, network algorithms whose operation requirement information is configured and not empty reside on the chip. Based on the performance of each network algorithm over a period of time, the user can adjust whether a network algorithm is configured with operation requirement information, thereby adjusting whether it resides on the chip, and in turn adjusting its resource allocation mode and operation mode (such as time division multiplexing or parallel operation). Parallel operation means that at least two network algorithms run at the same time.
The statistics of the operation data of each network algorithm may refer to the operation data of each network algorithm during a period of time. The operation data is used for prompting a user to judge whether operation requirement information is to be added for the network algorithm without the operation requirement information. The change request is used for editing the operation requirement information of the network algorithm, such as editing operations of adding, modifying, deleting and the like.
The classification result of the network algorithms is then updated, that is, the network algorithms included in each network group are updated: the network group to which a network algorithm belongs is adjusted according to its changed operation requirement information.
By collecting the operation data of each network algorithm and acquiring a change request for the operation requirement information entered by the user on the basis of that data, the network group to which a network algorithm belongs can be adjusted, its resource allocation mode and operation mode can be configured flexibly, the available allocation modes are enriched, and the rationality of the resource allocation is improved.
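A minimal sketch, under assumed data structures, of how a change request might move a network algorithm between groups; the `groups` dictionary, the group names, and the `requirement_info` and `dependencies` attributes are all invented for illustration.

```python
def apply_change_request(alg, new_requirement, groups):
    """Move a network algorithm between groups after its requirement info changes.

    groups has keys 'dependency', 'independent', and 'unconfigured' (assumed structure).
    """
    for members in groups.values():               # remove from whichever group it is in now
        if alg in members:
            members.remove(alg)
    alg.requirement_info = new_requirement
    if new_requirement is None:                   # no requirement info -> time-division multiplexed group
        groups['unconfigured'].append(alg)
    elif alg.dependencies:                        # has dependencies -> dependency relationship group
        groups['dependency'].append(alg)
    else:
        groups['independent'].append(alg)
```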
In an alternative example, as shown in fig. 3b, the resource allocation method may include:
S301, the dependency relationship network group and/or the independent network group is loaded into the chip.
Loading the dependency relationship network group and/or the independent network group into the chip may, for example, mean receiving the dependency relationship network group and/or the independent network group and storing it in the chip memory. The network algorithms in the dependency relationship network group and/or the independent network group are configured with operation requirement information.
S302, the allocable resources are obtained, and resources are allocated to each network algorithm in the dependency relationship network group and/or the independent network group.
In the process of acquiring the allocable resources and allocating resources to each network algorithm in the dependency relationship network group and/or the independent network group, if it is detected that the chip performance information cannot meet the operation requirement information of a network algorithm, requirement adjustment information is generated and the user is prompted to adjust the operation requirement information of that network algorithm.
S303, adjusting the resources of each network algorithm.
S304, judging whether any network algorithm with unallocated resources exists in the dependency relationship network group and/or the independent network group; if so, executing S302; otherwise, executing S305.
S305, loading the network algorithm without the operation requirement information into the chip.
Loading a network algorithm that is not configured with operation requirement information into the chip may, for example, mean receiving that network algorithm and storing it in the chip memory.
In this embodiment of the present invention, the allocation manner of the dependency relationship network group and/or the independent network group configured with operation requirement information may be the same for both groups, and may differ from the allocation manner of the network algorithms not configured with operation requirement information. The operation mode of the dependency relationship network group and/or the independent network group configured with operation requirement information may be parallel operation, which may differ from the operation mode of the network algorithms without operation requirement information; these network groups are not required to use time division multiplexing.
According to this embodiment of the invention, the first network algorithms with operation requirements are allocated resources preferentially, and the second network algorithms without operation requirements are allocated afterwards. This improves the operating performance and efficiency of the network algorithms with operation requirements while still configuring resources reasonably for those without, so that the operating performance and efficiency of different types of network algorithms are all improved.
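The S301-S305 flow above could be pictured, purely as a sketch, with the loop below; the `chip` object and its methods, the `RequirementNotSatisfiable` exception, and the `prompt_user_to_adjust` callback are placeholder names rather than anything defined by the embodiments.

```python
class RequirementNotSatisfiable(Exception):
    """Raised when the chip performance cannot meet an algorithm's requirement info."""


def load_and_allocate(chip, dependency_group, independent_group, unconfigured, prompt_user_to_adjust):
    """Illustrative sketch of the S301-S305 flow with invented interfaces."""
    chip.load(dependency_group + independent_group)               # S301: load configured groups
    pending = list(dependency_group) + list(independent_group)
    while pending:                                                # S304: loop until nothing is unallocated
        allocatable = chip.get_allocatable_resources()            # S302: query allocable resources
        for alg in list(pending):
            try:
                chip.allocate(alg, allocatable)                   # S302/S303: allocate and adjust
                pending.remove(alg)
            except RequirementNotSatisfiable:
                alg.requirement_info = prompt_user_to_adjust(alg)  # requirement adjustment information
    chip.load(unconfigured)                                       # S305: load unconfigured algorithms
```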
Example four
Fig. 4 is a schematic diagram of a resource allocation device in a fourth embodiment of the present invention. The fourth embodiment provides a device corresponding to the resource allocation method of the foregoing embodiments of the present invention; the device may be implemented in software and/or hardware, and may generally be integrated into a computer device. As shown in Fig. 4, the device comprises:
A first network algorithm obtaining module 410, configured to obtain a network algorithm of a resource to be allocated;
a target operation resource determining module 420, configured to determine, when it is determined that a network algorithm of the resources to be allocated includes a first network algorithm, a target operation resource from allocable operation resources according to operation requirement information of the first network algorithm, where the first network algorithm is configured with operation requirement information, and the target operation resource is an operation resource that satisfies the operation requirement information when the first network algorithm is operated;
a first network algorithm resource allocation module 430, configured to allocate the target operating resource to the first network algorithm.
According to this embodiment of the invention, the first network algorithm is allocated resources from the allocable operation resources, and the target operation resources allocated to it satisfy its operation requirement information. Partial resources can be allocated to each network algorithm separately, and the resources matched with a network algorithm are adjusted according to its operation information, so that the allocation adapts to each network algorithm. This solves the problem in the prior art of wasted operation resources caused by allocating the full resources to every network algorithm: operation resources are allocated reasonably and in a targeted manner, their utilization rate is improved, and waste is reduced.
Further, the resource allocation device further includes:
the second network algorithm resource allocation module, configured to determine the remaining operation resources according to the allocable operation resources and the target operation resources corresponding to the first network algorithm when the network algorithm of the resources to be allocated further comprises a second network algorithm that is not configured with operation requirement information; and to allocate the remaining operation resources to the second network algorithm.
Further, the network algorithm of the resource to be allocated includes a plurality of first network algorithms, and the target operation resource determining module 420 includes: a network algorithm classifying unit, configured to classify the plurality of first network algorithms to form at least one network group, where the network group includes a dependency network group and/or an independent network group, where the dependency network group includes at least two first network algorithms, a dependency exists between each first network algorithm and at least one first network algorithm, and the independent network group includes a first network algorithm with no dependency; when a dependency relationship network group is determined to exist, sequentially determining target operation resources allocated to each first network algorithm in the dependency relationship network group; and when the first network algorithm resources in the dependency relationship network group are determined to be completely allocated or the dependency relationship network group is empty, sequentially determining the target operation resources allocated to the first network algorithms in the independent network group.
Further, the network algorithm of the resource to be allocated includes a plurality of second network algorithms, and the second network algorithm resource allocation module includes: a time division multiplexing unit, configured to allocate the remaining operation resources to the plurality of second network algorithms in a time division multiplexing manner.
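A round-robin sketch of the time division multiplexing unit, assuming each second network algorithm exposes a `finished` flag and that a `run_slice(alg, resources, ms)` helper executes one time slice; the slice length is likewise an assumption made for this example.

```python
import itertools


def time_division_multiplex(remaining_resources, second_algorithms, run_slice, slice_ms=10):
    """Share the remaining operation resources among second network algorithms in fixed time slices."""
    for alg in itertools.cycle(second_algorithms):
        if all(a.finished for a in second_algorithms):
            break                                          # every second algorithm has completed
        if not alg.finished:
            run_slice(alg, remaining_resources, slice_ms)  # each gets the whole remainder for one slice
```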
Further, the target operation resource determining module 420 includes: an operation resource adjusting unit, configured to operate the first network algorithm by using the matched target resource, and obtain current operation information of the first network algorithm; and adjusting the target resources according to the current operation information and the operation demand information until the adjusted target resources meet the minimum allocation condition of the resources, and determining the target resources meeting the minimum allocation condition of the resources as the target operation resources matched with the first network algorithm.
Further, the operation resource adjustment unit includes: a compiling pseudo-operation subunit, configured to acquire random input data during compilation of the target network algorithm; input the random input data into the target network algorithm and run the target network algorithm with the target resources to obtain the output data produced by the target network algorithm for the random input data, where the target resources are matched with the target network algorithm; and acquire the operation speed at which the target network algorithm computes the output data from the random input data, and determine that operation speed as the current operation information of the target network algorithm.
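A possible reading of the compiling pseudo-operation subunit, sketched with assumed interfaces: random input data is generated, one run is timed on the matched target resources, and the resulting speed is taken as the current operation information. The `algorithm.run` call and the shape of `target_resources` are placeholders, not an API defined by the embodiments.

```python
import time

import numpy as np


def measure_current_speed(algorithm, target_resources, input_shape):
    """Pseudo-run during compilation: feed random input data and time one forward computation."""
    random_input = np.random.random(input_shape).astype(np.float32)   # random input data
    start = time.perf_counter()
    algorithm.run(random_input, resources=target_resources)           # run on the matched target resources
    elapsed = time.perf_counter() - start
    return 1.0 / elapsed if elapsed > 0 else float("inf")             # operation speed, runs per second
```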
Further, the current operation information includes a current operation speed, the operation demand information includes an operation demand speed, and the operation resource adjustment unit includes: an operation demand speed detection subunit configured to reduce the target resource; operating the target network algorithm by adopting the reduced target resources, and acquiring the current operation speed of the target network algorithm; if the difference value between the current running speed and the running demand speed is determined to be greater than a set speed threshold, continuing to reduce the target resources, running the target network algorithm by adopting the reduced target resources, and acquiring the current running speed of the target network algorithm; and if the difference value between the current running speed and the running demand speed is less than or equal to the set speed threshold value, determining that the adjusted target resource meets the resource minimum allocation condition.
Further, the operation demand speed detection subunit includes: a resource step-by-step adjustment subunit, configured to calculate a first number ratio of computing units included in the target resource to allocable computing units; calculating a second number ratio of threads included in the target resource to allocable threads; if the first quantity ratio is greater than or equal to the second quantity ratio, reducing the number of computing units included in the target resource; if it is determined that the first number ratio is less than the second number ratio, reducing the number of threads included by the target resource.
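Putting the two subunits together, the step-by-step reduction might be sketched as below; `target` and `allocatable` are assumed to expose `units` and `threads` counts, `measure_speed` stands for the pseudo-operation measurement above, and the step of one unit or one thread per iteration is an assumption made only for this sketch.

```python
def shrink_to_minimum(algorithm, target, allocatable, required_speed, speed_threshold, measure_speed):
    """Reduce the target resources step by step until the resource minimum allocation condition holds."""
    while True:
        current_speed = measure_speed(algorithm, target)
        if current_speed - required_speed <= speed_threshold:
            return target                                        # minimum allocation condition met
        if target.units <= 1 and target.threads <= 1:
            return target                                        # cannot shrink any further
        unit_ratio = target.units / allocatable.units            # first number ratio
        thread_ratio = target.threads / allocatable.threads      # second number ratio
        if unit_ratio >= thread_ratio:
            target.units -= 1                                    # reduce computing units first
        else:
            target.threads -= 1                                  # otherwise reduce threads
```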
Further, the resource allocation device further includes: the operation demand information changing module is used for operating each network algorithm and counting the operation data of each network algorithm; and when a change request of the operation requirement information aiming at least one network algorithm is acquired, re-allocating resources.
Further, the resource allocation device further includes: and the operation requirement information adjustment module is used for generating requirement adjustment information of the target network algorithm if the operation requirement information of the target network algorithm is not matched with the chip performance information, so as to instruct a user to adjust the operation requirement information of the target network algorithm.
Further, the allocatable running resources include computing units and threads.
Further, the network algorithm includes: a neural network model and/or at least one network included in a neural network model.
The resource allocation device can execute the resource allocation method provided by any one of the embodiments of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example five
Fig. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention. Fig. 5 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in fig. 5 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in Fig. 5, the computer device 12 is in the form of a general purpose computing device. Components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that connects the various system components (including the system memory 28 and the processing units 16). The computer device 12 may be a device attached to a bus.
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include industry standard architecture (Industry Standard Architecture, ISA) bus, micro channel architecture (Micro Channel Architecture, MCA) bus, enhanced ISA bus, video electronics standards association (Video Electronics Standards Association, VESA) local bus, and peripheral component interconnect (Peripheral Component Interconnect, PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in Fig. 5, commonly referred to as a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable nonvolatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from and writing to a removable nonvolatile optical disk (e.g., a compact disk Read Only Memory (CD-ROM), digital versatile disk (Digital Video Disc-Read Only Memory, DVD-ROM), or other optical media) may be provided. In these cases, each drive may be coupled to bus 18 through one or more data medium interfaces. The system memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur via an Input/Output (I/O) interface 22. The computer device 12 may also communicate with one or more networks, such as a local area network (Local Area Network, LAN) or a wide area network (Wide Area Network, WAN), via the network adapter 20. As shown, the network adapter 20 communicates with other modules of the computer device 12 via the bus 18. It should be understood that although not shown in Fig. 5, other hardware and/or software modules may be used in connection with the computer device 12, including, but not limited to, microcode, device drivers, redundant processing units, external disk drive arrays (Redundant Arrays of Inexpensive Disks, RAID) systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing a resource allocation method provided by any of the embodiments of the present invention.
Example six
A sixth embodiment of the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a resource allocation method as provided in all the inventive embodiments of the present application:
that is, the program, when executed by the processor, implements: acquiring a network algorithm of resources to be allocated; when a first network algorithm is included in the network algorithm for determining the resources to be allocated, determining a target operation resource from the allocable operation resources according to the operation demand information of the first network algorithm, wherein the first network algorithm is configured with the operation demand information, and the target operation resource is an operation resource which meets the operation demand information when the first network algorithm is operated; and allocating the target operation resource to the first network algorithm.
The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a RAM, a Read-Only Memory (ROM), an erasable programmable Read-Only Memory (Erasable Programmable Read Only Memory, EPROM), a flash Memory, an optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, radio frequency (RadioFrequency, RF), etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (13)

1. A resource allocation method, applied to a many-core system, the many-core system including allocatable operating resources, the method comprising:
acquiring a network algorithm of resources to be allocated;
when a first network algorithm is included in the network algorithm for determining the resources to be allocated, determining a target operation resource from the allocable operation resources according to the operation demand information of the first network algorithm, wherein the first network algorithm is configured with the operation demand information, and the target operation resource is an operation resource which meets the operation demand information when the first network algorithm is operated;
Allocating the target operating resource to the first network algorithm;
the network algorithm of the resource to be allocated comprises a plurality of first network algorithms,
wherein, according to the operation requirement information of the first network algorithm, determining a target operation resource from the allocable operation resources includes:
classifying the plurality of first network algorithms to form at least one network group, wherein the network group comprises a dependency network group and/or an independent network group, the dependency network group comprises at least two first network algorithms, a dependency relationship exists between each first network algorithm and at least one first network algorithm, and the independent network group comprises first network algorithms without the dependency relationship;
when a dependency relationship network group is determined to exist, sequentially determining target operation resources allocated to each first network algorithm in the dependency relationship network group;
when the first network algorithm resources in the dependency relationship network group are determined to be completely allocated or the dependency relationship network group is empty, sequentially determining target operation resources allocated to the first network algorithms in the independent network group;
operating the first network algorithm by adopting the matched target resource, and acquiring the current operation information of the first network algorithm;
And adjusting the target resources according to the current operation information and the operation demand information until the adjusted target resources meet the minimum allocation condition of the resources, and determining the target resources meeting the minimum allocation condition of the resources as the target operation resources matched with the first network algorithm.
2. The method according to claim 1, wherein the method further comprises:
when the network algorithm of the resources to be allocated also comprises a second network algorithm without configuration operation demand information, determining the rest operation resources according to the allocable operation resources and the target operation resources corresponding to the first network algorithm;
and distributing the residual operation resources to the second network algorithm.
3. The method of claim 2, wherein the network algorithm for the resources to be allocated comprises a plurality of second network algorithms,
wherein said allocating said remaining operating resources to said second network algorithm comprises:
and distributing the residual operation resources to the plurality of second network algorithms in a time division multiplexing mode.
4. The method of claim 1, wherein operating the first network algorithm with the matched target resource and obtaining current operating information of the first network algorithm comprises:
Generating random input data in the compiling process of the first network algorithm;
inputting the random input data into the first network algorithm, and operating the first network algorithm by adopting a target resource to acquire output data of the first network algorithm output aiming at the random input data, wherein the target resource is matched with the first network algorithm;
and acquiring the operation speed at which the first network algorithm computes the output data from the random input data, and determining the operation speed as the current operation information of the first network algorithm.
5. The method of claim 1, wherein the current operating information comprises a current operating speed, the operating demand information comprises an operating demand speed,
the adjusting the target resource until the adjusted target resource meets the minimum allocation condition of the resource comprises the following steps:
reducing the target resource;
operating the first network algorithm by adopting the reduced target resource, and acquiring the current operation speed of the first network algorithm;
if the difference value between the current running speed and the running demand speed is determined to be greater than a set speed threshold, continuing to reduce the target resource, running the first network algorithm by adopting the reduced target resource, and acquiring the current running speed of the first network algorithm;
And if the difference value between the current running speed and the running demand speed is smaller than or equal to the set speed threshold value, determining that the reduced target resource meets a resource minimum allocation condition.
6. The method of claim 5, wherein said reducing said target resource comprises:
calculating a first quantity ratio of computing units included in the target resource to allocable computing units;
calculating a second number ratio of threads included in the target resource to allocable threads;
if the first number ratio is determined to be greater than or equal to the second number ratio, reducing the number of computing units included in the target resource;
if it is determined that the first number ratio is less than the second number ratio, reducing the number of threads included by the target resource.
7. The method according to claim 1, wherein the method further comprises:
operating each network algorithm and counting the operation data of each network algorithm;
and when a change request of the operation requirement information aiming at least one network algorithm is acquired, re-allocating resources.
8. The method according to claim 1, wherein the method further comprises:
And if the operation requirement information of the first network algorithm is not matched with the chip performance information, generating requirement adjustment information of the first network algorithm to instruct a user to adjust the operation requirement information of the first network algorithm.
9. The method of claim 1, wherein the allocatable execution resources comprise computing units and threads.
10. The method according to claim 1, wherein the network algorithm comprises a neural network model and/or at least one network comprised by a neural network model.
11. A resource allocation apparatus configured in a many-core system, the many-core system including allocatable operating resources, the apparatus comprising:
the first network algorithm acquisition module is used for acquiring a network algorithm of the resources to be allocated;
the target operation resource determining module is used for determining target operation resources from the allocable operation resources according to the operation demand information of the first network algorithm when the network algorithm of the resources to be allocated comprises the first network algorithm, wherein the first network algorithm is configured with the operation demand information, and the target operation resources are operation resources which meet the operation demand information when the first network algorithm is operated;
A first network algorithm resource allocation module, configured to allocate the target operation resource to the first network algorithm;
the network algorithm of the resource to be allocated comprises a plurality of first network algorithms,
wherein, according to the operation requirement information of the first network algorithm, determining a target operation resource from the allocable operation resources includes:
classifying the plurality of first network algorithms to form at least one network group, wherein the network group comprises a dependency network group and/or an independent network group, the dependency network group comprises at least two first network algorithms, a dependency relationship exists between each first network algorithm and at least one first network algorithm, and the independent network group comprises first network algorithms without the dependency relationship;
when a dependency relationship network group is determined to exist, sequentially determining target operation resources allocated to each first network algorithm in the dependency relationship network group;
when the first network algorithm resources in the dependency relationship network group are determined to be completely allocated or the dependency relationship network group is empty, sequentially determining target operation resources allocated to the first network algorithms in the independent network group;
Operating the first network algorithm by adopting the matched target resource, and acquiring the current operation information of the first network algorithm;
and adjusting the target resources according to the current operation information and the operation demand information until the adjusted target resources meet the minimum allocation condition of the resources, and determining the target resources meeting the minimum allocation condition of the resources as the target operation resources matched with the first network algorithm.
12. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the resource allocation method of any of claims 1-10 when the program is executed by the processor.
13. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the resource allocation method according to any of claims 1-10.
CN202010879892.2A 2020-08-27 2020-08-27 Resource allocation method, device, computer equipment and storage medium Active CN112068957B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010879892.2A CN112068957B (en) 2020-08-27 2020-08-27 Resource allocation method, device, computer equipment and storage medium
PCT/CN2021/114217 WO2022042519A1 (en) 2020-08-27 2021-08-24 Resource allocation method and apparatus, and computer device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010879892.2A CN112068957B (en) 2020-08-27 2020-08-27 Resource allocation method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112068957A CN112068957A (en) 2020-12-11
CN112068957B true CN112068957B (en) 2024-02-09

Family

ID=73660703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010879892.2A Active CN112068957B (en) 2020-08-27 2020-08-27 Resource allocation method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112068957B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112764509B (en) * 2021-02-03 2024-03-01 北京灵汐科技有限公司 Computing core, computing core temperature adjustment method, computing core temperature adjustment device, computer readable medium, computer program, chip and computer system
CN114039937A (en) * 2021-11-15 2022-02-11 清华大学 Network resource management method and related equipment
CN114168331A (en) * 2021-12-07 2022-03-11 杭州萤石软件有限公司 Algorithm deployment and scheduling method and algorithm deployment and scheduling device
CN116225669B (en) * 2023-05-08 2024-01-09 之江实验室 Task execution method and device, storage medium and electronic equipment
CN117176644B (en) * 2023-10-25 2024-02-06 苏州元脑智能科技有限公司 Multi-channel route matching method, device, equipment and storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN109976901A (en) * 2017-12-28 2019-07-05 航天信息股份有限公司 A kind of resource regulating method, device, server and readable storage medium storing program for executing
CN111143800A (en) * 2019-12-31 2020-05-12 北京华胜天成科技股份有限公司 Cloud computing resource management method, device, equipment and storage medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US8756209B2 (en) * 2012-01-04 2014-06-17 International Business Machines Corporation Computing resource allocation based on query response analysis in a networked computing environment
US10043194B2 (en) * 2014-04-04 2018-08-07 International Business Machines Corporation Network demand forecasting

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN109976901A (en) * 2017-12-28 2019-07-05 航天信息股份有限公司 A kind of resource regulating method, device, server and readable storage medium storing program for executing
CN111143800A (en) * 2019-12-31 2020-05-12 北京华胜天成科技股份有限公司 Cloud computing resource management method, device, equipment and storage medium

Non-Patent Citations (2)

Title
Optimizing a cloud platform based on evaluation of the remaining runtime capability of computing resources; Zhou Mosong, Dong Xiaoshe, Chen Heng, Zhang Xingjun; Journal of Computer Research and Development (11); full text *
Research on resource allocation algorithms for communication networks; Wang Lei; Computer & Network (23); full text *

Also Published As

Publication number Publication date
CN112068957A (en) 2020-12-11

Similar Documents

Publication Publication Date Title
CN112068957B (en) Resource allocation method, device, computer equipment and storage medium
CN111427681B (en) Real-time task matching scheduling system and method based on resource monitoring in edge computing
WO2016078008A1 (en) Method and apparatus for scheduling data flow task
CN110389816B (en) Method, apparatus and computer readable medium for resource scheduling
CN111176852A (en) Resource allocation method, device, chip and computer readable storage medium
CN112465146B (en) Quantum and classical hybrid cloud platform and task execution method
CN105808328A (en) Task scheduling method, device and system
CN110347602B (en) Method and device for executing multitasking script, electronic equipment and readable storage medium
CN113918351A (en) Method and device for adapting to distributed training in deep learning framework and AI acceleration card
CN116467061B (en) Task execution method and device, storage medium and electronic equipment
US20160210171A1 (en) Scheduling in job execution
CN115586961A (en) AI platform computing resource task scheduling method, device and medium
CN113703975A (en) Model distribution method and device, electronic equipment and computer readable storage medium
CN109992408B (en) Resource allocation method, device, electronic equipment and storage medium
CN111459648B (en) Heterogeneous multi-core platform resource optimization method and device for application program
CN106844024B (en) GPU/CPU scheduling method and system of self-learning running time prediction model
CN112905317A (en) Task scheduling method and system under rapid reconfigurable signal processing heterogeneous platform
CN115543577B (en) Covariate-based Kubernetes resource scheduling optimization method, storage medium and device
CN110704182A (en) Deep learning resource scheduling method and device and terminal equipment
CN112988383A (en) Resource allocation method, device, equipment and storage medium
CN111459651B (en) Load balancing method, device, storage medium and scheduling system
CN110705884B (en) List processing method, device, equipment and storage medium
CN117032937B (en) Task scheduling method based on GPU, electronic device and storage medium
CN115269164A (en) Resource allocation method, device, computer equipment and storage medium
CN115269166A (en) Time allocation method and device for computation graph, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant