CN116860435A - Kernel function priority determining method, device, computer equipment and storage medium - Google Patents

Publication number
CN116860435A
Authority
CN
China
Prior art keywords
kernel function
weight
determining
objective
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310708221.3A
Other languages
Chinese (zh)
Inventor
丁光宇
肖熠
卜景德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Shuguang International Information Industry Co ltd
Original Assignee
Zhongke Shuguang International Information Industry Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Zhongke Shuguang International Information Industry Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application relates to a method, a device, computer equipment and a storage medium for determining the priority of a kernel function. The method obtains kernel function information corresponding to a target kernel function through a portable heterogeneous computing interface (HIP) runtime library; determines a first weight corresponding to the resources required by the target kernel function according to the kernel function information; and, when the first weight meets a preset condition, determines the priority of the target kernel function according to the first weight and a second weight of the resources used by the target process to which the target kernel function belongs. The method can improve the utilization of heterogeneous accelerator hardware resources when the hardware executes kernel functions.

Description

Kernel function priority determining method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of computers, and in particular, to a method and apparatus for determining a priority of a kernel function, a computer device, and a storage medium.
Background
With the development of computer technology, it is increasingly common for multiple HIP applications on a high-performance computer based on HIP (Heterogeneous-computing Interface for Portability, a portable heterogeneous computing interface) to complete a task jointly, with these applications sharing the same heterogeneous accelerator hardware resources.
In the conventional technology, when multiple HIP applications run, kernel function priorities are optimized per process. However, optimizing kernel function priority per process uses heterogeneous accelerator hardware resources inefficiently.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a kernel function priority determination method, apparatus, computer device, and storage medium capable of improving the use efficiency of heterogeneous accelerator hardware resources.
In a first aspect, the present application provides a method for determining a priority of a kernel function, the method comprising:
obtaining kernel function information corresponding to a target kernel function through a portable heterogeneous computing interface (HIP) runtime library;
determining a first weight corresponding to a resource required by the target kernel function according to the kernel function information;
and, in the case that the first weight meets a preset condition, determining the priority of the target kernel function according to the first weight and a second weight of the resources used by the target process to which the target kernel function belongs.
In the above embodiment, determining the priority of the target kernel function considers not only the second weight of the resources used by the target process to which the target kernel function belongs, but also the first weight corresponding to the resources required by the target kernel function. Priority is thus determined from the resources required by each kernel function across all processes together with the resources used by the process to which each kernel function belongs; that is, priority is determined per kernel function rather than per process. Because the priority depends on the resources the target kernel function requires, when the heterogeneous accelerator hardware later executes the target kernel function according to that priority, whether the remaining resources of the heterogeneous accelerator satisfy those requirements can be taken into account, which improves the use efficiency of heterogeneous accelerator hardware resources.
In one embodiment, determining, according to kernel function information, a first weight corresponding to a resource required by a target kernel function includes:
acquiring remaining resource information of the heterogeneous accelerator hardware, the remaining resource information comprising a remaining thread block size, a remaining shared memory size and a remaining video memory size;
and determining the first weight according to the remaining resource information and the kernel function information.
In the above embodiment, the first weight is determined by the obtained remaining resource information of the heterogeneous accelerator hardware and the kernel function information of the target kernel function, that is, the determination of the first weight of the target kernel function considers the remaining resources of the heterogeneous accelerator hardware, so that whether the remaining resources of the heterogeneous accelerator hardware meet the resources required for executing the target kernel function or not can be considered when the priority of the target kernel function is determined according to the first weight, and the use efficiency of the resources of the heterogeneous accelerator hardware can be improved.
In one embodiment, the kernel function information includes a required thread block size, a required shared memory size, and a required video memory size of the target kernel function, and determining the first weight according to the remaining resource information and the kernel function information includes:
calculating the difference between the remaining thread block size and the required thread block size to obtain a first difference;
calculating the difference between the remaining shared memory size and the required shared memory size to obtain a second difference;
calculating the difference between the remaining video memory size and the required video memory size to obtain a third difference;
and determining the first weight according to the first difference, the second difference and the third difference.
In the above embodiment, the first weight is determined from three differences: remaining minus required thread block size, remaining minus required shared memory size, and remaining minus required video memory size. This calculation is fast, simple and easy to implement, which improves the efficiency of the kernel function priority determining method. In addition, because the resources considered in the first weight include video memory, determining the priority of the target kernel function from the first weight and the second weight allows subsequent execution of the target kernel function to exploit the spatial locality of video memory accesses, which can improve access efficiency for the L2 cache and shared video memory in the heterogeneous accelerator hardware and raise the L2 cache hit rate during video memory access.
In one embodiment, the method further comprises:
determining whether the first difference, the second difference, and the third difference are all greater than zero;
if the first difference value, the second difference value and the third difference value are all larger than zero, determining that the first weight meets a preset condition;
if at least one of the first difference, the second difference and the third difference is smaller than zero, determining that the first weight does not meet the preset condition, and determining that the priority of the target kernel function is the lowest priority.
In the above embodiment, whether the first weight meets the preset condition is determined by checking whether the first difference, the second difference and the third difference are all greater than zero, which is quick and simple. Moreover, the embodiment specifies how the priority is determined when the first weight does not meet the preset condition: the priority of the target kernel function is set directly to the lowest priority. Covering this case improves the practicability of the kernel function priority determining method.
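The check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `LOWEST_PRIORITY`, `meets_preset_condition` and `gate_priority` are hypothetical, the numeric value of the lowest priority is an assumption, and a difference of exactly zero is treated here as not meeting the condition because the text only defines "all greater than zero" as passing.

```python
LOWEST_PRIORITY = 0  # hypothetical sentinel; the source fixes no numeric scale


def meets_preset_condition(first_diff, second_diff, third_diff):
    """Return True only when all three differences are strictly positive."""
    return first_diff > 0 and second_diff > 0 and third_diff > 0


def gate_priority(diffs, compute_priority):
    """Apply the preset condition before any further priority computation.

    diffs is a (first, second, third) tuple of remaining-minus-required values;
    compute_priority is a zero-argument callable used only when the gate passes.
    """
    if not meets_preset_condition(*diffs):
        # The accelerator cannot currently hold this kernel: lowest priority.
        return LOWEST_PRIORITY
    return compute_priority()
```

Passing the later computation as a callable keeps the gate separate from whatever weighting scheme is chosen for kernels that do fit.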
In one embodiment, a method for acquiring a second weight includes:
acquiring the used thread block size, the used shared memory size and the used video memory size of the target process to which the target kernel function belongs;
and determining the second weight according to the used thread block size, the used shared memory size and the used video memory size.
In the above embodiment, the second weight corresponding to the target process is determined from the thread block size, shared memory size and video memory size used by the target process to which the target kernel function belongs. This way of determining the second weight is fast and easy to implement, which improves the efficiency of the kernel function priority determining method.
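A minimal sketch of a second-weight computation follows. The source states only that the second weight is determined "according to" the three used quantities; combining them by a plain sum is an assumption made here for illustration, and `ProcessUsage` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessUsage:
    """Resources currently used by the target process (hypothetical fields)."""
    used_thread_block_size: int
    used_shared_memory: int  # bytes
    used_video_memory: int   # bytes


def second_weight(usage):
    # Assumption: an unweighted sum; the source does not give the formula.
    return (usage.used_thread_block_size
            + usage.used_shared_memory
            + usage.used_video_memory)
```

Because the three quantities have very different magnitudes, a real scheme would likely normalize or scale them; the sum is kept here only because it is the simplest rule consistent with the text.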
In one embodiment, determining the priority of the target kernel function according to the first weight and the second weight of the resources used by the target process to which the target kernel function belongs includes:
determining the total weight corresponding to the target kernel function according to the first weight and the second weight;
and determining the priority of the target kernel function according to the total weight corresponding to the target kernel function.
In the above embodiment, the total weight corresponding to the target kernel function is determined according to the first weight and the second weight, and the priority of the target kernel function among all kernel functions is determined by comparing its total weight with the total weights of all kernel functions. This way of determining the priority is quick and simple, and the resulting priority is relative to all kernel functions rather than to a single process, so the use efficiency of heterogeneous accelerator hardware resources when the hardware executes kernel functions can be improved.
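The comparison of total weights across all kernel functions can be sketched as below. Two points are assumptions, not statements from the source: the total weight is taken as the sum of the two weights, and a larger total is treated as a higher priority. The function names are hypothetical.

```python
def total_weight(first_weight, second_weight):
    # Assumption: the two weights are simply added.
    return first_weight + second_weight


def rank_kernels(weights):
    """Order kernel names from highest to lowest priority.

    weights maps a kernel name to its (first_weight, second_weight) pair.
    Assumption: a larger total weight means a higher priority.
    """
    totals = {name: total_weight(w1, w2) for name, (w1, w2) in weights.items()}
    return sorted(totals, key=lambda name: totals[name], reverse=True)
```

The ranking is global: kernels from different processes appear in one ordered list, which is the per-kernel (rather than per-process) behaviour the embodiment describes.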
In one embodiment, obtaining the kernel function information corresponding to the target kernel function through the portable heterogeneous computing interface (HIP) runtime library includes:
receiving, through the heterogeneous accelerator driver interface, the kernel function information sent by the HIP runtime library.
In the above embodiment, a heterogeneous accelerator driver interface is provided between the HIP runtime library and the heterogeneous accelerator driver, so that the HIP runtime library can send the kernel function information it receives for the target kernel function to the heterogeneous accelerator driver. The driver then determines the priority of the target kernel function according to the received kernel function information, which improves the practicability of the kernel function priority determining method.
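As a rough illustration of this hand-off, the sketch below models the driver interface as an in-process queue. It is only a stand-in: a real runtime-to-driver interface would cross the user/kernel boundary, and the class `DriverInterface`, its methods and the dictionary payload are all hypothetical.

```python
from collections import deque
from typing import Optional


class DriverInterface:
    """Hypothetical in-process stand-in for the runtime-to-driver interface."""

    def __init__(self) -> None:
        self._pending: deque = deque()

    def send(self, kernel_info: dict) -> None:
        # Called on the runtime-library side for each submitted kernel.
        self._pending.append(kernel_info)

    def receive(self) -> Optional[dict]:
        # Called on the driver side; returns None when nothing is queued.
        return self._pending.popleft() if self._pending else None
```

The driver side would call `receive()` and feed each record into its priority computation, in the order the runtime library submitted them.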
In a second aspect, the present application also provides a kernel function priority determining apparatus, including:
the acquisition module is used for acquiring kernel function information corresponding to the target kernel function through the HIP runtime library;
the determining module is used for determining a first weight corresponding to a resource required by the target kernel function according to the kernel function information;
the determining module is further configured to determine, in the case that the first weight meets a preset condition, the priority of the target kernel function according to the first weight and a second weight of the resources used by the target process to which the target kernel function belongs.
In a third aspect, the present application also provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method as provided in the first aspect above when the computer program is executed by the processor.
In a fourth aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method as provided in the first aspect above.
Drawings
FIG. 1 is a schematic diagram of a computer device in one embodiment;
FIG. 2 is a flowchart illustrating steps of a method for determining a priority of a kernel function according to an embodiment;
FIG. 3 is a flowchart illustrating a method for determining the priority of a kernel function according to another embodiment;
FIG. 4 is a flowchart illustrating steps of a method for determining a priority of a kernel function according to another embodiment;
FIG. 5 is a flowchart illustrating steps of a method for determining a priority of a kernel function according to another embodiment;
FIG. 6 is a flowchart illustrating steps of a method for determining a priority of a kernel function according to another embodiment;
FIG. 7 is a flowchart illustrating steps of a method for determining a priority of a kernel function according to another embodiment;
FIG. 8 is a schematic diagram of the internal architecture of a computer device in one embodiment;
FIG. 9 is a flowchart illustrating steps of a method for determining a priority of a kernel function according to another embodiment;
FIG. 10 is a schematic structural diagram of a kernel function priority determining apparatus in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Ordinal terms such as "first" and "second" in the present application are used only to distinguish the described objects and do not carry any sequential or technical meaning.
Before the technical solution of the embodiments of the application is described in detail, the technical background on which the embodiments are based is introduced. With the development of computer technology, it is increasingly common for multiple HIP applications on a high-performance computer based on HIP (Heterogeneous-computing Interface for Portability, a portable heterogeneous computing interface) to complete a task jointly while sharing the same heterogeneous accelerator hardware resources. In the conventional technology, when multiple HIP applications run, kernel function priorities are optimized per process; that is, the priority of each kernel function is only its priority within the process to which it belongs. Consequently, when kernel functions from all processes are executed on the heterogeneous accelerator hardware, hardware resources may remain available while a kernel function that could use them cannot be executed because its priority is low, so the heterogeneous accelerator hardware resources are used inefficiently. To address this, the present application provides a method for determining the priority of a kernel function.
The technical scheme related to the embodiment of the application is described below in connection with the scene to which the embodiment of the application is applied.
The kernel function priority determining method provided by the application can be applied to computer equipment provided with the open-source heterogeneous programming language HIP, the HIP runtime library, a heterogeneous system architecture (Heterogeneous System Architecture, HSA) and heterogeneous accelerator hardware. The internal structure of the computer device may be as shown in FIG. 1. The computer device includes a processor, a memory, a communication interface, a display screen and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless mode can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program, when executed by the processor, implements a kernel function priority determining method. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input device of the computer device may be a touch layer covering the display screen; a key, a trackball or a touch pad arranged on the housing of the computer device; or an external keyboard, touch pad or mouse.
In one embodiment, as shown in FIG. 2, a kernel function priority determining method is provided. This embodiment is described with the method applied to a computer device as an example. In this embodiment, the method includes the following steps:
Step 200, obtaining kernel function information corresponding to the target kernel function through a portable heterogeneous computing interface (HIP) runtime library.
When the computer device is required to execute a kernel function, the HIP runtime library in the computer device receives the user-defined target kernel function issued by the user together with the kernel function information corresponding to it. That is, the user issues the kernel function information at the same time as the target kernel function. The kernel function information corresponding to the target kernel function includes the address, name, parameter address, parameter size and grid size of the target kernel function, the resources required to run the target kernel function, and so on. This embodiment does not limit the specific content of the kernel function information as long as its function can be realized.
After the computer device receives the kernel function information of the target kernel function through the HIP runtime library, the information is sent to a heterogeneous accelerator driver in the computer device, so that the driver determines the priority of the target kernel function according to the received kernel function information.
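The fields listed above could be grouped into a single record, as in the sketch below. The class name, field names and types are hypothetical; the source names the kinds of information carried but not their layout, and the three required-resource fields follow the required thread block size, shared memory size and video memory size mentioned elsewhere in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelFunctionInfo:
    """Illustrative record of kernel function information (layout assumed)."""
    address: int                      # address of the kernel function
    name: str                         # kernel function name
    param_address: int                # address of the parameter buffer
    param_size: int                   # size of the parameter buffer in bytes
    grid_size: tuple                  # e.g. (x, y, z) grid dimensions
    required_thread_block_size: int   # threads per block the kernel needs
    required_shared_memory: int       # bytes of shared memory it needs
    required_video_memory: int        # bytes of video memory it needs

    def required_resources(self):
        """Return the three required quantities in a fixed order."""
        return (self.required_thread_block_size,
                self.required_shared_memory,
                self.required_video_memory)
```

Keeping the required resources in a fixed order makes it straightforward to subtract them element-wise from the accelerator's remaining resources.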
Step 210, determining a first weight corresponding to a resource required by the objective kernel function according to the kernel function information.
Because the kernel function information includes the resources required to run the target kernel function, the heterogeneous accelerator driver in the computer device, after receiving the kernel function information, can determine the first weight corresponding to the resources required by the target kernel function. The first weight refers to the importance of the resources required by the target kernel function in determining the priority of the target kernel function.
Step 220, in the case that the first weight meets the preset condition, determining the priority of the target kernel function according to the first weight and the second weight of the resources used by the target process to which the target kernel function belongs.
After the computer device obtains the first weight corresponding to the resources required by the target kernel function, it determines whether the first weight meets the preset condition. The preset condition may mean that, when the heterogeneous accelerator hardware of the computer device executes the target kernel function, the heterogeneous accelerator hardware resources can satisfy the resources required by the target kernel function. The first weight thus characterizes whether the heterogeneous accelerator hardware resources in the computer device can satisfy the resources required by the target kernel function.
In the case that the first weight meets the preset condition, the heterogeneous accelerator driver in the computer device acquires the second weight of the resources used by the target process to which the target kernel function belongs. The second weight refers to the importance of the resources used by that target process in determining the priority of the target kernel function. After determining the first weight and the second weight, the heterogeneous accelerator driver can determine the priority of the target kernel function according to them.
The kernel function priority determining method provided by the embodiment of the application obtains the kernel function information corresponding to the target kernel function through the HIP runtime library; determines a first weight corresponding to the resources required by the target kernel function according to the kernel function information; and, in the case that the first weight meets the preset condition, determines the priority of the target kernel function according to the first weight and the second weight of the resources used by the target process to which the target kernel function belongs. In this embodiment, determining the priority of the target kernel function considers not only the second weight of the resources used by the target process, but also the first weight corresponding to the resources required by the target kernel function. Priority is thus determined from the resources required by each kernel function across all processes together with the resources used by the process to which each kernel function belongs; that is, priority is determined per kernel function rather than per process. Because the priority depends on the resources the target kernel function requires, when the heterogeneous accelerator hardware later executes the target kernel function according to that priority, whether the remaining resources of the heterogeneous accelerator satisfy those requirements can be taken into account, which improves the use efficiency of heterogeneous accelerator hardware resources.
In one embodiment, as shown in fig. 3, an implementation manner related to determining a first weight corresponding to a resource required by a target kernel function according to kernel function information includes the following steps:
Step 300, obtaining the remaining resource information of the heterogeneous accelerator hardware; the remaining resource information includes a remaining thread block size, a remaining shared memory size and a remaining video memory size.
The heterogeneous accelerator hardware is used to execute the target kernel function. The remaining resource information of the heterogeneous accelerator hardware refers to the resources that are available in the hardware before it executes the target kernel function, that is, the difference between the total resources of the heterogeneous accelerator hardware and the resources already in use.
The heterogeneous accelerator driver in the computer device obtains the current remaining resource information of the heterogeneous accelerator hardware, that is, the resources the hardware can currently use to execute the target kernel function. In this embodiment, the remaining resource information of the heterogeneous accelerator hardware includes a remaining thread block size, a remaining shared memory size and a remaining video memory size. Thread block size refers to the number of threads that can run synchronously while executing a kernel function. Shared memory is logical memory that different processes are allowed to access. Video memory refers to the memory of the graphics card.
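The definition of remaining resources as total minus used can be sketched as follows. The `Resources` record and its field names are hypothetical, and how a real driver would obtain the total and used figures is outside what the source describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """One snapshot of the three tracked quantities (hypothetical fields)."""
    thread_block_size: int
    shared_memory: int  # bytes
    video_memory: int   # bytes


def remaining(total, used):
    """Return total minus used for each tracked resource."""
    return Resources(
        thread_block_size=total.thread_block_size - used.thread_block_size,
        shared_memory=total.shared_memory - used.shared_memory,
        video_memory=total.video_memory - used.video_memory,
    )
```

The result is a snapshot taken before the target kernel function is scheduled, which is the quantity the first-weight calculation compares against the kernel's requirements.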
Step 310, determining a first weight according to the residual resource information and the kernel function information.
After determining the remaining resource information of the heterogeneous accelerator hardware, the heterogeneous accelerator driver in the computer device determines the first weight based on the remaining resource information and the kernel function information.
In this embodiment, the computer device determines the first weight according to the obtained remaining resource information of the heterogeneous accelerator hardware and the kernel function information of the target kernel function, that is, the determination of the first weight of the target kernel function considers the remaining resources of the heterogeneous accelerator hardware, so that whether the remaining resources of the heterogeneous accelerator hardware meet the resources required for executing the target kernel function or not can be considered when determining the priority of the target kernel function according to the first weight, and the use efficiency of the resources of the heterogeneous accelerator hardware can be improved.
In one embodiment, the kernel function information obtained through the HIP runtime library includes a required thread block size, a required shared memory size and a required video memory size of the target kernel function. As shown in FIG. 4, one implementation of determining the first weight according to the remaining resource information and the kernel function information includes:
step 400, calculating the difference between the size of the remaining thread blocks and the size of the required thread blocks to obtain a first difference.
Step 410, calculating the difference between the remaining shared memory size and the required shared memory size to obtain a second difference.
Step 420, calculating the difference between the residual video memory size and the required video memory size to obtain a third difference.
The heterogeneous accelerator driver in the computer device acquires the remaining resource information of the heterogeneous accelerator hardware and the kernel function information of the target kernel function. It then calculates the difference between the remaining thread block size in the remaining resource information and the required thread block size in the kernel function information to obtain the first difference; the difference between the remaining shared memory size and the required shared memory size to obtain the second difference; and the difference between the remaining video memory size and the required video memory size to obtain the third difference.
Step 430, determining a first weight according to the first difference, the second difference and the third difference.
After the first difference value, the second difference value and the third difference value are obtained through calculation, the heterogeneous accelerator driver in the computer equipment can determine the first weight according to the first difference value, the second difference value and the third difference value.
In an alternative embodiment, the heterogeneous accelerator driver may directly calculate a sum between the first difference, the second difference, and the third difference, with the sum being the first weight.
In another alternative embodiment, the heterogeneous accelerator driver may calculate a first product of the first difference and its corresponding weight factor, a second product of the second difference and its corresponding weight factor, and a third product of the third difference and its corresponding weight factor, and then take the sum of the first, second, and third products as the first weight. The first weight may be expressed as $W = aW_1 + bW_2 + cW_3$, where $W$ represents the first weight, $W_1$ represents the first difference, $a$ represents the weight factor corresponding to the first difference, $W_2$ represents the second difference, $b$ represents the weight factor corresponding to the second difference, $W_3$ represents the third difference, and $c$ represents the weight factor corresponding to the third difference.
In this embodiment, the computer device calculates the differences between the remaining and required thread block sizes, between the remaining and required shared memory sizes, and between the remaining and required video memory sizes, and determines the first weight from these differences. This way of calculating the first weight is fast, simple, and easy to implement, and can improve the efficiency of the kernel function priority determining method. In addition, because the resources considered in determining the first weight include video memory, and the priority of the target kernel function is determined from the first weight and the second weight, the subsequent execution of the target kernel function can exploit the spatial locality of video memory accesses. This can improve the access efficiency of the L2 cache and the shared video memory in the heterogeneous accelerator hardware and raise the L2 cache hit rate during video memory accesses.
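Steps 400 to 430 can be sketched as follows. This is a minimal illustration under stated assumptions, not the driver's implementation: the `Resources` type, the function names, and the default weight factors are hypothetical, and the source specifies neither units nor factor values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Resources:
    """Hypothetical container; the source does not specify units."""
    thread_blocks: int
    shared_memory: int
    video_memory: int


def differences(remaining: Resources, required: Resources) -> Tuple[int, int, int]:
    """Steps 400-420: remaining minus required, one difference per resource."""
    return (
        remaining.thread_blocks - required.thread_blocks,
        remaining.shared_memory - required.shared_memory,
        remaining.video_memory - required.video_memory,
    )


def first_weight(diffs, a=1.0, b=1.0, c=1.0):
    """Step 430: W = a*W1 + b*W2 + c*W3.

    With a = b = c = 1 (an assumed default) this is the plain-sum variant.
    """
    w1, w2, w3 = diffs
    return a * w1 + b * w2 + c * w3


remaining = Resources(thread_blocks=64, shared_memory=65536, video_memory=8192)
required = Resources(thread_blocks=16, shared_memory=16384, video_memory=2048)
diffs = differences(remaining, required)
plain_sum = first_weight(diffs)
weighted = first_weight(diffs, a=2.0, b=0.5, c=0.0)
```

Keeping the weight factors as parameters lets the same function cover both alternative embodiments: the direct sum and the weighted sum.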
When the first weight satisfies the preset condition, the computer device determines the priority of the target kernel function according to the first weight and the second weight of the resources used by the target process to which the target kernel function belongs. In this embodiment, as shown in FIG. 5, to determine whether the first weight satisfies the preset condition and to handle the case in which it does not, the kernel function priority determining method further includes:
Step 500, determining whether the first difference, the second difference, and the third difference are all greater than zero.
In this embodiment, the first difference is a difference obtained by subtracting the required thread block size from the remaining thread block size, the second difference is a difference between the remaining shared memory size and the required shared memory size, and the third difference is a difference between the remaining video memory size and the required video memory size.
After the first difference value, the second difference value and the third difference value are obtained, the heterogeneous accelerator driver in the computer equipment respectively judges whether the first difference value, the second difference value and the third difference value are larger than zero.
Step 510, if the first difference, the second difference, and the third difference are all greater than zero, determining that the first weight satisfies the preset condition.
If the heterogeneous accelerator driver in the computer device finds that the first difference, the second difference, and the third difference are all greater than zero, the remaining resources of the heterogeneous accelerator hardware are sufficient for executing the target kernel function, and the driver determines that the first weight derived from the three differences satisfies the preset condition.
Step 520, if at least one of the first difference, the second difference, and the third difference is smaller than zero, determining that the first weight does not satisfy the preset condition, and determining that the priority of the objective kernel function is the lowest priority.
If the heterogeneous accelerator driver in the computer device finds that at least one of the first difference, the second difference, and the third difference is smaller than zero, the remaining resources of the heterogeneous accelerator hardware are not sufficient for executing the target kernel function, and the driver determines that the first weight derived from the three differences does not satisfy the preset condition. In that case the heterogeneous accelerator hardware cannot execute the target kernel function at this moment, and the driver sets the priority of the target kernel function to the lowest priority.
In this embodiment, whether the first weight satisfies the preset condition is determined by checking whether the first difference, the second difference, and the third difference are all greater than zero, which is fast and simple. Moreover, the embodiment also specifies how the priority is determined when the first weight does not satisfy the preset condition: the priority of the target kernel function is set directly to the lowest priority. This can improve the practicability of the kernel function priority determining method.
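The check in steps 500 to 520 can be sketched as below. The names and the `LOWEST_PRIORITY` sentinel are hypothetical. The source does not say what happens when a difference is exactly zero; this sketch assumes such a case does not satisfy the condition, since step 510 requires every difference to be greater than zero.

```python
import sys

# Hypothetical sentinel: in this sketch a larger number means a lower priority.
LOWEST_PRIORITY = sys.maxsize


def meets_preset_condition(diffs):
    """Steps 500-510: every difference must be strictly greater than zero."""
    return all(d > 0 for d in diffs)


def screen_kernel(diffs):
    """Return (ok, priority).

    priority is None when the condition holds, because it still has to be
    computed from the first and second weights; otherwise it is the lowest
    priority (step 520).
    """
    if meets_preset_condition(diffs):
        return True, None
    return False, LOWEST_PRIORITY


fits = screen_kernel((48, 49152, 6144))
too_big = screen_kernel((48, -1, 6144))
```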
In one embodiment, as shown in fig. 6, one implementation involving acquiring the second weight includes the steps of:
Step 600, obtaining the thread block size used, the shared memory size used, and the video memory size used by the target process to which the target kernel function belongs.
The second weight represents the importance of the resources used by the target process to which the target kernel function belongs, so determining the second weight requires the resource usage information of that process. In this embodiment, the resource usage information includes the thread block size used, the shared memory size used, and the video memory size used by the target process, and the heterogeneous accelerator driver in the computer device obtains these three values.
Step 610, determining a second weight according to the used thread block size, the used shared memory size, and the used video memory size.
The heterogeneous accelerator driver in the computer equipment determines a second weight according to the size of a thread block used by the target process to which the acquired target kernel function belongs, the size of a shared memory used and the size of a video memory used.
In an alternative embodiment, the heterogeneous accelerator driver may calculate the sum of the thread block size used, the shared memory size used, and the video memory size used by the target process, and determine the sum as the second weight.
In another alternative embodiment, the heterogeneous accelerator driver may calculate a fourth product of the used thread block size and its corresponding weight factor, a fifth product of the used shared memory size and its corresponding weight factor, and a sixth product of the used video memory size and its corresponding weight factor, and take the sum of the fourth, fifth, and sixth products as the second weight. The second weight may be expressed as $V = dV_1 + eV_2 + fV_3$, where $V$ represents the second weight, $V_1$ represents the used thread block size, $d$ represents the weight factor corresponding to the used thread block size, $V_2$ represents the used shared memory size, $e$ represents the weight factor corresponding to the used shared memory size, $V_3$ represents the used video memory size, and $f$ represents the weight factor corresponding to the used video memory size.
In this embodiment, the second weight of the target process is determined from the thread block size, the shared memory size, and the video memory size used by the target process to which the target kernel function belongs. This way of determining the second weight is fast and easy to implement, and can improve the efficiency of the kernel function priority determining method.
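The second weight of steps 600 and 610 can be sketched in the same way. The function name, the parameter names, and the default factor values are assumptions made for illustration; the source gives no concrete factor values.

```python
def second_weight(used_thread_blocks, used_shared_memory, used_video_memory,
                  d=1.0, e=1.0, f=1.0):
    """Step 610: V = d*V1 + e*V2 + f*V3.

    With d = e = f = 1 (an assumed default) this is the plain-sum variant.
    """
    return d * used_thread_blocks + e * used_shared_memory + f * used_video_memory


# Hypothetical usage figures for one target process.
v_sum = second_weight(8, 4096, 1024)
v_weighted = second_weight(8, 4096, 1024, d=2.0, e=0.5, f=0.25)
```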
In one embodiment, as shown in fig. 7, an implementation manner related to determining a priority of a target kernel function according to a first weight and a second weight of a usage resource of a target process to which the target kernel function belongs includes:
Step 700, determining the total weight corresponding to the target kernel function according to the first weight and the second weight.
After determining the first weight corresponding to the resources required by the target kernel function and the second weight of the resources used by the target process to which the target kernel function belongs, the heterogeneous accelerator driver in the computer device determines the total weight corresponding to the target kernel function according to the first weight and the second weight.
In an alternative embodiment, the heterogeneous accelerator driver may calculate a sum of the first weight and the second weight, and determine the sum as a total weight corresponding to the objective kernel function.
In another alternative embodiment, the heterogeneous accelerator driver calculates a seventh product of the first weight and its corresponding weight factor, calculates an eighth product of the second weight and its corresponding weight factor, and determines the sum of the seventh and eighth products as the total weight corresponding to the target kernel function. The total weight may be expressed as $H = jW + kV$, where $H$ represents the total weight, $W$ represents the first weight, $j$ represents the weight factor corresponding to the first weight, $V$ represents the second weight, and $k$ represents the weight factor corresponding to the second weight.
Step 710, determining the priority of the objective kernel function according to the total weight corresponding to the objective kernel function.
After determining the total weight corresponding to the target kernel function, the heterogeneous accelerator driver in the computer device compares it with the total weights corresponding to all other kernel functions in the computer device. The smaller the total weight, the higher the priority; the larger the total weight, the lower the priority. That is, the driver sorts the total weight of the target kernel function together with the total weights of all other kernel functions, and determines the priority of the target kernel function from its position in the sorted order.
In this embodiment, the computer device determines the total weight corresponding to the target kernel function according to the first weight and the second weight, and compares it with the total weights corresponding to all kernel functions to determine the priority of the target kernel function among them. This way of determining the priority is fast and simple. Because the resulting priority is relative to all kernel functions, it can improve the utilization of heterogeneous accelerator hardware resources when the hardware executes the kernel functions.
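Steps 700 and 710 can be sketched as follows. The function names, the kernel names, and the rank encoding (0 is the highest priority) are hypothetical. The source states only that a smaller total weight means a higher priority; breaking ties by insertion order is an assumption of this sketch.

```python
def total_weight(w, v, j=1.0, k=1.0):
    """Step 700: H = j*W + k*V; j = k = 1 gives the plain-sum variant."""
    return j * w + k * v


def rank_by_total_weight(totals):
    """Step 710: sort ascending, so the smallest total weight gets rank 0.

    sorted() is stable, so kernels with equal totals keep insertion order.
    """
    ordered = sorted(totals, key=totals.get)
    return {name: rank for rank, name in enumerate(ordered)}


# Hypothetical kernels with (first weight, second weight) pairs.
totals = {
    "kernel_a": total_weight(20.0, 10.0),
    "kernel_b": total_weight(4.0, 6.0),
    "kernel_c": total_weight(15.0, 5.0),
}
ranks = rank_by_total_weight(totals)
```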
In one embodiment, one implementation related to obtaining kernel information corresponding to a target kernel through a portable heterogeneous computing interface HIP runtime library comprises:
Receiving the kernel function information sent by the HIP runtime library through the heterogeneous accelerator driver interface.
A user-defined heterogeneous accelerator driver interface is included between a HIP runtime library and a heterogeneous accelerator driver in a computer device. After receiving the user-defined target kernel function and kernel function information corresponding to the target kernel function through an interface corresponding to the HIP runtime library, the HIP runtime library sends the kernel function information to the heterogeneous accelerator driver through a heterogeneous accelerator driver interface between the HIP runtime library and the heterogeneous accelerator driver.
In this embodiment, by setting a heterogeneous accelerator driver interface between the HIP runtime library and the heterogeneous accelerator driver, the HIP runtime library may send kernel function information corresponding to the received objective kernel function to the heterogeneous accelerator driver, so that the heterogeneous accelerator driver determines the priority of the objective kernel function according to the received objective kernel function information, which may improve the practicality of the kernel function priority determination method.
In an alternative embodiment, after determining the priority of the target kernel function, the heterogeneous accelerator driver in the computer device sends that priority to the HIP runtime library. The HIP runtime library issues the target kernel function into the HSA queue according to the received priority, so that the heterogeneous accelerator hardware executes each kernel function according to its priority in the HSA queue.
In an alternative embodiment, the internal architecture of a computer using the kernel function priority determining method provided in this embodiment is shown in FIG. 8. The HIP application and the HIP runtime library are at the application layer; the heterogeneous accelerator driver and the heterogeneous accelerator driver scheduling system are at the kernel layer. The HIP runtime library sends the kernel function information of the target kernel function to the heterogeneous accelerator driver scheduling system through the heterogeneous accelerator driver interface, so that the heterogeneous accelerator driver can determine the first weight according to the kernel function information and, when the first weight satisfies the preset condition, determine the priority of the target kernel function according to the first weight and the second weight of the resources used by the target process to which the target kernel function belongs. The heterogeneous accelerator driver sends the priority of the target kernel function to the HIP runtime library through the heterogeneous accelerator driver scheduling system, together with a kernel function issuing event. After receiving the kernel function issuing event, the HIP runtime library issues the target kernel function to the HSA queue, so that the heterogeneous accelerator hardware executes the kernel functions in the HSA queue according to their priorities.
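The effect of issuing kernel functions into a queue by priority can be modelled with a toy priority queue. This is only a conceptual stand-in: `QueueModel` and its methods are hypothetical and do not correspond to the HIP or HSA APIs, and the sketch assumes that a smaller number means a higher priority and that equal priorities run in submission order.

```python
import heapq
import itertools


class QueueModel:
    """Toy model of a priority-ordered kernel queue; not the real HSA queue."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: submission order

    def submit(self, kernel_name, priority):
        """Model the runtime issuing a kernel with a driver-assigned priority."""
        heapq.heappush(self._heap, (priority, next(self._seq), kernel_name))

    def drain(self):
        """Model the hardware executing kernels in priority order."""
        order = []
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            order.append(name)
        return order


queue = QueueModel()
queue.submit("kernel_a", 2)
queue.submit("kernel_b", 0)
queue.submit("kernel_c", 1)
queue.submit("kernel_d", 0)
execution_order = queue.drain()
```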
Referring to fig. 9, an embodiment of the present application provides a kernel function priority determining method, which includes the steps of:
Step 900, receiving kernel function information corresponding to a target kernel function sent by a HIP runtime library through a heterogeneous accelerator driver interface; the kernel function information comprises the thread block size required by the target kernel function, the shared memory size required by the target kernel function, and the video memory size required by the target kernel function;
Step 910, obtaining the remaining resource information of the heterogeneous accelerator hardware; the remaining resource information comprises the remaining thread block size, the remaining shared memory size, and the remaining video memory size;
Step 920, calculating the difference between the remaining thread block size and the required thread block size to obtain a first difference;
Step 930, calculating the difference between the remaining shared memory size and the required shared memory size to obtain a second difference;
Step 940, calculating the difference between the remaining video memory size and the required video memory size to obtain a third difference;
Step 950, determining whether the first difference, the second difference, and the third difference are all greater than zero;
Step 960, if yes, determining a first weight according to the first difference, the second difference, and the third difference, wherein the first weight satisfies the preset condition;
Step 970, if not, determining that the first weight does not satisfy the preset condition, and determining that the priority of the target kernel function is the lowest priority;
Step 980, when the first weight satisfies the preset condition, obtaining the thread block size used, the shared memory size used, and the video memory size used by the target process to which the target kernel function belongs;
Step 990, determining a second weight of the target process to which the target kernel function belongs according to the used thread block size, the used shared memory size, and the used video memory size;
Step 991, determining the total weight corresponding to the target kernel function according to the first weight and the second weight;
Step 992, determining the priority corresponding to the target kernel function according to the total weight corresponding to the target kernel function.
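Steps 920 to 991 can be combined into a single function, sketched below under stated assumptions. The function and parameter names are hypothetical, all weight factors default to 1 (the plain-sum variants), and a return value of `None` is this sketch's way of signalling that the kernel function receives the lowest priority. Step 992 then compares the returned total weight with those of the other kernel functions, a smaller total weight meaning a higher priority.

```python
from typing import Optional, Sequence


def kernel_total_weight(
    required: Sequence[float],   # (thread blocks, shared memory, video memory)
    remaining: Sequence[float],  # same order, from the accelerator hardware
    used: Sequence[float],       # same order, used by the owning process
    abc: Sequence[float] = (1.0, 1.0, 1.0),          # factors a, b, c
    def_factors: Sequence[float] = (1.0, 1.0, 1.0),  # factors d, e, f
    jk: Sequence[float] = (1.0, 1.0),                # factors j, k
) -> Optional[float]:
    """Return the total weight H, or None when the kernel gets lowest priority."""
    # Steps 920-940: remaining minus required for each resource.
    diffs = [rem - req for rem, req in zip(remaining, required)]
    # Steps 950 and 970: all differences must be greater than zero.
    if not all(diff > 0 for diff in diffs):
        return None
    # Step 960: W = a*W1 + b*W2 + c*W3.
    w = sum(factor * diff for factor, diff in zip(abc, diffs))
    # Steps 980-990: V = d*V1 + e*V2 + f*V3.
    v = sum(factor * amount for factor, amount in zip(def_factors, used))
    # Step 991: H = j*W + k*V.
    j, k = jk
    return j * w + k * v


h_ok = kernel_total_weight(
    required=(16, 16384, 2048), remaining=(64, 65536, 8192), used=(8, 4096, 1024)
)
h_blocked = kernel_total_weight(
    required=(16, 16384, 2048), remaining=(8, 65536, 8192), used=(8, 4096, 1024)
)
```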
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiments of the application also provide a kernel function priority determining device for implementing the kernel function priority determining method described above. The solution provided by the device is implemented in a manner similar to that described for the method, so for the specific limitations of the device embodiments below, reference may be made to the limitations of the kernel function priority determining method above, which are not repeated here.
In one embodiment, as shown in FIG. 10, there is provided a kernel function prioritization apparatus 10, comprising: an acquisition module 11 and a determination module 12, wherein:
The acquisition module 11 is configured to obtain the kernel function information corresponding to the target kernel function through the HIP runtime library.
The determination module 12 is configured to determine a first weight corresponding to the resources required by the target kernel function according to the kernel function information.
The determination module 12 is further configured to determine, when the first weight satisfies a preset condition, the priority of the target kernel function according to the first weight and the second weight of the resources used by the target process to which the target kernel function belongs.
In one embodiment, the determination module 12 includes an acquisition unit and a determination unit. The acquisition unit is used for acquiring the residual resource information of the heterogeneous accelerator hardware; the residual resource information comprises the residual thread block size, the residual shared memory size and the residual video memory size; the determining unit is used for determining a first weight according to the residual resource information and the kernel function information.
In one embodiment, the determining unit includes a first computing subunit, a second computing subunit, a third computing subunit, and a determining subunit. The first calculating subunit is used for calculating the difference between the size of the residual thread block and the size of the required thread block to obtain a first difference; the second calculating subunit is used for calculating the difference between the size of the residual shared memory and the size of the needed shared memory to obtain a second difference; the third calculation subunit is used for calculating the difference between the residual video memory size and the required video memory size to obtain a third difference; the determining subunit is configured to determine a first weight according to the first difference, the second difference, and the third difference.
In one embodiment, the determining module further comprises a judging unit. The judging unit is used for determining whether the first difference value, the second difference value and the third difference value are all larger than zero; if the first difference value, the second difference value and the third difference value are all larger than zero, determining that the first weight meets a preset condition; if at least one of the first difference value, the second difference value and the third difference value is smaller than zero, determining that the first weight does not meet the preset condition, and determining that the priority of the objective kernel function is the lowest priority.
In one embodiment, the obtaining module is further configured to obtain a thread block size, a shared memory size and a video memory size used by a target process to which the target kernel function belongs; and determining a second weight according to the used thread block size, the used shared memory size and the used video memory size.
In one embodiment, the determining module is specifically configured to determine, according to the first weight and the second weight, a total weight corresponding to the objective kernel function; and determining the priority of the objective kernel function according to the total weight corresponding to the objective kernel function.
In one embodiment, the acquisition module is specifically configured to receive kernel function information sent by the HIP runtime library via the heterogeneous accelerator driver interface.
The modules in the kernel function priority determining device described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in software form in a memory of the computer device, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and the internal structure thereof may be as shown in fig. 1.
It will be appreciated by those skilled in the art that the structure shown in FIG. 1 is merely a block diagram of the part of the structure relevant to the present solution and does not limit the computer device to which the present solution is applied; a particular computer device may include more or fewer components than those shown, may combine some components, or may have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of:
obtaining kernel function information corresponding to a target kernel function through a portable heterogeneous computing interface HIP runtime library;
determining a first weight corresponding to a resource required by the target kernel function according to the kernel function information;
and under the condition that the first weight meets the preset condition, determining the priority of the objective kernel function according to the first weight and the second weight of the use resource of the objective process to which the objective kernel function belongs.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring the residual resource information of heterogeneous accelerator hardware; the residual resource information comprises the residual thread block size, the residual shared memory size and the residual video memory size; and determining a first weight according to the residual resource information and the kernel function information.
In one embodiment, the processor when executing the computer program further performs the steps of: calculating a difference value between the size of the remaining thread blocks and the size of the required thread blocks to obtain a first difference value; calculating a difference value between the size of the residual shared memory and the size of the needed shared memory to obtain a second difference value; calculating a difference value between the residual video memory size and the required video memory size to obtain a third difference value; and determining a first weight according to the first difference value, the second difference value and the third difference value.
In one embodiment, the processor when executing the computer program further performs the steps of: determining whether the first difference, the second difference, and the third difference are all greater than zero; if the first difference value, the second difference value and the third difference value are all larger than zero, determining that the first weight meets a preset condition; if at least one of the first difference value, the second difference value and the third difference value is smaller than zero, determining that the first weight does not meet the preset condition, and determining that the priority of the objective kernel function is the lowest priority.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring the size of a thread block used by a target process to which a target kernel function belongs, the size of a shared memory used and the size of a video memory used; and determining a second weight according to the used thread block size, the used shared memory size and the used video memory size.
In one embodiment, the processor when executing the computer program further performs the steps of: determining the total weight corresponding to the objective kernel function according to the first weight and the second weight; and determining the priority of the objective kernel function according to the total weight corresponding to the objective kernel function.
In one embodiment, the processor when executing the computer program further performs the steps of: and receiving kernel function information sent by the HIP runtime library through the heterogeneous accelerator driving interface.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
obtaining kernel function information corresponding to a target kernel function through a portable heterogeneous computing interface HIP runtime library;
determining a first weight corresponding to a resource required by the target kernel function according to the kernel function information;
and under the condition that the first weight meets the preset condition, determining the priority of the objective kernel function according to the first weight and the second weight of the use resource of the objective process to which the objective kernel function belongs.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring the residual resource information of heterogeneous accelerator hardware; the residual resource information comprises the residual thread block size, the residual shared memory size and the residual video memory size; and determining a first weight according to the residual resource information and the kernel function information.
In one embodiment, the computer program when executed by the processor further performs the steps of: calculating a difference value between the size of the remaining thread blocks and the size of the required thread blocks to obtain a first difference value; calculating a difference value between the size of the residual shared memory and the size of the needed shared memory to obtain a second difference value; calculating a difference value between the residual video memory size and the required video memory size to obtain a third difference value; and determining a first weight according to the first difference value, the second difference value and the third difference value.
In one embodiment, the computer program when executed by the processor further performs the steps of: determining whether the first difference, the second difference, and the third difference are all greater than zero; if the first difference value, the second difference value and the third difference value are all larger than zero, determining that the first weight meets a preset condition; if at least one of the first difference value, the second difference value and the third difference value is smaller than zero, determining that the first weight does not meet the preset condition, and determining that the priority of the objective kernel function is the lowest priority.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: acquiring the used thread block size, the used shared memory size and the used video memory size of the target process to which the target kernel function belongs; and determining the second weight according to the used thread block size, the used shared memory size and the used video memory size.
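One possible way to turn the three usage figures of the target process into a second weight is sketched below. The function name, the device-capacity parameters and the formula itself are assumptions made for illustration: the text only states that the second weight is determined from the three used sizes, and the direction chosen here (a process that already holds fewer resources gets a larger weight) is one plausible fairness policy, not something the source confirms.

```python
def second_weight(
    used_thread_blocks: int,
    used_shared_mem: int,
    used_video_mem: int,
    total_thread_blocks: int,
    total_shared_mem: int,
    total_video_mem: int,
) -> float:
    """Return a weight in [0, 1] for the process that owns the kernel.

    ASSUMPTION: weight = 1 - mean fraction of device capacity the process
    already uses. The capacities are hypothetical inputs, not named in the text.
    """
    totals = (total_thread_blocks, total_shared_mem, total_video_mem)
    if any(t <= 0 for t in totals):
        raise ValueError("device capacities must be positive")
    used = (used_thread_blocks, used_shared_mem, used_video_mem)
    if any(u < 0 or u > t for u, t in zip(used, totals)):
        raise ValueError("usage must lie between 0 and the device capacity")
    mean_usage = sum(u / t for u, t in zip(used, totals)) / 3
    return 1.0 - mean_usage
```

Normalizing each figure by a capacity keeps thread blocks, shared memory and video memory comparable even though their raw magnitudes differ by orders of magnitude.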
In one embodiment, the computer program, when executed by the processor, further performs the steps of: determining a total weight corresponding to the target kernel function according to the first weight and the second weight; and determining the priority of the target kernel function according to the total weight.
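The step from total weight to priority can be illustrated with a small ranking sketch. Two points are assumptions here rather than statements from the text: the total weight is taken as the plain sum of the first and second weights, and a larger total weight is taken to mean a higher priority. A first weight of `None` stands for a kernel whose first weight did not meet the preset condition, which is placed at the lowest priority. The names `total_weight` and `rank_kernels` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def total_weight(first: float, second: float) -> float:
    """ASSUMPTION: the total weight is the unweighted sum of both weights."""
    return first + second


def rank_kernels(weights: Dict[str, Tuple[Optional[float], float]]) -> List[str]:
    """Return kernel names ordered from highest to lowest priority.

    `weights` maps a kernel name to (first weight or None, second weight).
    Kernels whose first weight is None failed the preset condition and are
    appended at the end, i.e. they receive the lowest priority.
    """
    eligible = {
        name: total_weight(first, second)
        for name, (first, second) in weights.items()
        if first is not None
    }
    blocked = [name for name, (first, _) in weights.items() if first is None]
    # sorted() is stable, so kernels with equal totals keep insertion order.
    ordered = sorted(eligible, key=lambda name: eligible[name], reverse=True)
    return ordered + blocked
```

For example, with weights `{"a": (0.5, 0.25), "b": (None, 0.9), "c": (0.75, 0.5)}` the function returns `["c", "a", "b"]`: kernel `b` is ranked last regardless of its second weight because its resource check failed.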
In one embodiment, the computer program, when executed by the processor, further performs the step of: receiving, through a heterogeneous accelerator driver interface, the kernel function information sent by the HIP runtime library.
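The hand-off of kernel function information from the runtime library to the priority-determining side can be pictured with the stand-in below. It does not use any real HIP or driver API: `KernelInfo` and `DriverInterfaceStub` are hypothetical names, the in-process queue merely imitates the driver interface, and the `kernel_name` and `process_id` fields are assumptions added so that the record can be tied to its owning process. Only the three required-resource fields come from the text.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelInfo:
    """Kernel function information as it might be passed to the scheduler."""
    kernel_name: str             # hypothetical identifier
    process_id: int              # hypothetical link to the owning process
    required_thread_blocks: int  # required thread block size
    required_shared_mem: int     # required shared memory size
    required_video_mem: int      # required video memory size


class DriverInterfaceStub:
    """In-process stand-in for the heterogeneous accelerator driver interface."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[KernelInfo]" = queue.Queue()

    def submit(self, info: KernelInfo) -> None:
        """Called on the runtime-library side when a kernel is launched."""
        self._pending.put(info)

    def receive(self) -> KernelInfo:
        """Called on the priority-determining side; raises queue.Empty if idle."""
        return self._pending.get_nowait()
```

A frozen dataclass is used so that the record cannot be altered between submission and the weight calculation.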
In one embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, performs the steps of:
obtaining kernel function information corresponding to a target kernel function through a portable heterogeneous computing interface (HIP) runtime library;
determining a first weight corresponding to the resources required by the target kernel function according to the kernel function information;
and, when the first weight meets a preset condition, determining the priority of the target kernel function according to the first weight and a second weight corresponding to the resources used by the target process to which the target kernel function belongs.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: acquiring remaining resource information of the heterogeneous accelerator hardware, the remaining resource information comprising a remaining thread block size, a remaining shared memory size and a remaining video memory size; and determining the first weight according to the remaining resource information and the kernel function information.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: calculating the difference between the remaining thread block size and the required thread block size to obtain a first difference; calculating the difference between the remaining shared memory size and the required shared memory size to obtain a second difference; calculating the difference between the remaining video memory size and the required video memory size to obtain a third difference; and determining the first weight according to the first difference, the second difference and the third difference.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: determining whether the first difference, the second difference and the third difference are all greater than zero; if all three differences are greater than zero, determining that the first weight meets the preset condition; if at least one of the three differences is less than zero, determining that the first weight does not meet the preset condition and setting the priority of the target kernel function to the lowest priority.
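As a worked illustration of the three differences and the sign check, consider the following numbers. The symbols (B, S and V for thread block, shared memory and video memory sizes, with subscripts "rem" and "req" for remaining and required) and all values are illustrative assumptions, not figures taken from the application.

```latex
\begin{aligned}
d_1 &= B_{\mathrm{rem}} - B_{\mathrm{req}} = 64 - 16 = 48 \\
d_2 &= S_{\mathrm{rem}} - S_{\mathrm{req}} = 65536 - 16384 = 49152 \\
d_3 &= V_{\mathrm{rem}} - V_{\mathrm{req}} = 8192 - 2048 = 6144
\end{aligned}
```

Here all three differences are greater than zero, so the first weight meets the preset condition. If the kernel instead required a video memory size of 10240, the third difference would be 8192 - 10240 = -2048, which is less than zero, and the target kernel function would be given the lowest priority.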
In one embodiment, the computer program, when executed by the processor, further performs the steps of: acquiring the used thread block size, the used shared memory size and the used video memory size of the target process to which the target kernel function belongs; and determining the second weight according to the used thread block size, the used shared memory size and the used video memory size.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: determining a total weight corresponding to the target kernel function according to the first weight and the second weight; and determining the priority of the target kernel function according to the total weight.
In one embodiment, the computer program, when executed by the processor, further performs the step of: receiving, through a heterogeneous accelerator driver interface, the kernel function information sent by the HIP runtime library.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the program may carry out the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database. The processor referred to in the embodiments provided herein may be, but is not limited to, a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, or a data processing logic unit based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations are described; however, any combination of these technical features that involves no contradiction should be considered within the scope of this description.
The above embodiments represent only a few implementations of the application. They are described specifically and in detail, but this should not be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within its scope of protection. Accordingly, the scope of protection of the application shall be determined by the appended claims.

Claims (10)

1. A method for determining a priority of a kernel function, the method comprising:
obtaining kernel function information corresponding to a target kernel function through a portable heterogeneous computing interface (HIP) runtime library;
determining a first weight corresponding to the resources required by the target kernel function according to the kernel function information;
and, when the first weight meets a preset condition, determining the priority of the target kernel function according to the first weight and a second weight corresponding to the resources used by the target process to which the target kernel function belongs.
2. The method according to claim 1, wherein determining the first weight corresponding to the resources required by the target kernel function according to the kernel function information comprises:
acquiring remaining resource information of the heterogeneous accelerator hardware, the remaining resource information comprising a remaining thread block size, a remaining shared memory size and a remaining video memory size;
and determining the first weight according to the remaining resource information and the kernel function information.
3. The method of claim 2, wherein the kernel function information includes a required thread block size, a required shared memory size, and a required video memory size of the target kernel function, and wherein determining the first weight according to the remaining resource information and the kernel function information comprises:
calculating the difference between the remaining thread block size and the required thread block size to obtain a first difference;
calculating the difference between the remaining shared memory size and the required shared memory size to obtain a second difference;
calculating the difference between the remaining video memory size and the required video memory size to obtain a third difference;
and determining the first weight according to the first difference, the second difference and the third difference.
4. A method according to claim 3, characterized in that the method further comprises:
determining whether the first difference, the second difference and the third difference are all greater than zero;
if the first difference, the second difference and the third difference are all greater than zero, determining that the first weight meets the preset condition;
and if at least one of the first difference, the second difference and the third difference is less than zero, determining that the first weight does not meet the preset condition and setting the priority of the target kernel function to the lowest priority.
5. The method of any of claims 1-4, wherein obtaining the second weight comprises:
acquiring the used thread block size, the used shared memory size and the used video memory size of the target process to which the target kernel function belongs;
and determining the second weight according to the used thread block size, the used shared memory size and the used video memory size.
6. The method according to any one of claims 1-4, wherein determining the priority of the target kernel function according to the first weight and the second weight corresponding to the resources used by the target process to which the target kernel function belongs comprises:
determining a total weight corresponding to the target kernel function according to the first weight and the second weight;
and determining the priority of the target kernel function according to the total weight corresponding to the target kernel function.
7. The method of claim 1, wherein obtaining the kernel function information corresponding to the target kernel function through the portable heterogeneous computing interface (HIP) runtime library comprises:
receiving, through a heterogeneous accelerator driver interface, the kernel function information sent by the HIP runtime library.
8. A kernel function priority determining apparatus, the apparatus comprising:
an acquisition module, configured to acquire kernel function information corresponding to a target kernel function through a HIP runtime library;
a determining module, configured to determine a first weight corresponding to the resources required by the target kernel function according to the kernel function information;
the determining module being further configured to determine, when the first weight meets a preset condition, the priority of the target kernel function according to the first weight and a second weight corresponding to the resources used by the target process to which the target kernel function belongs.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202310708221.3A 2023-06-14 2023-06-14 Kernel function priority determining method, device, computer equipment and storage medium Pending CN116860435A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310708221.3A CN116860435A (en) 2023-06-14 2023-06-14 Kernel function priority determining method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310708221.3A CN116860435A (en) 2023-06-14 2023-06-14 Kernel function priority determining method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116860435A true CN116860435A (en) 2023-10-10

Family

ID=88233174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310708221.3A Pending CN116860435A (en) 2023-06-14 2023-06-14 Kernel function priority determining method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116860435A (en)

Similar Documents

Publication Publication Date Title
CN112906865B (en) Neural network architecture searching method and device, electronic equipment and storage medium
US20220343146A1 (en) Method and system for temporal graph neural network acceleration
CN118331513B (en) Data intelligent dynamic scheduling method and device and computer equipment
CN115729687A (en) Task scheduling method and device, computer equipment and storage medium
CN115981843A (en) Task scheduling method and device in cloud-edge cooperative power system and computer equipment
CN117271100B (en) Algorithm chip cluster scheduling method, device, computer equipment and storage medium
CN114089921A (en) Power system data storage method and device, computer equipment and storage medium
CN114461384A (en) Task execution method and device, computer equipment and storage medium
CN114201306B (en) Multi-dimensional geographic space entity distribution method and system based on load balancing technology
CN116820758A (en) Job processing method, apparatus, computer device, storage medium, and program product
CN116860435A (en) Kernel function priority determining method, device, computer equipment and storage medium
CN114253481A (en) Data storage method and device, computer equipment and storage medium
CN116991600B (en) Method, device, equipment and storage medium for processing graphic call instruction
CN116681454B (en) Virtual resource proportioning strategy generation method and device, computer equipment and storage medium
CN117453759B (en) Service data processing method, device, computer equipment and storage medium
CN118608746A (en) Detection frame screening method, detection frame screening device, computer equipment and storage medium
CN118838689A (en) Task scheduling method, device, computer equipment and storage medium
CN118861364A (en) Graph data processing method, device, equipment, storage medium and program product
CN116126490A (en) Resource scheduling method, device, computer equipment and storage medium
CN117971742A (en) Chip data transmission method and device based on transmission sequence
CN117314036A (en) Work order distribution method, apparatus, device, storage medium and program product
CN117112206A (en) Transaction resource isolation method, device, computer equipment and storage medium
CN118708353A (en) Computing force sharing method and device, vehicle-mounted terminal, vehicle, system and storage medium
CN118034885A (en) Task processing method, device, computer equipment and storage medium
CN117950833A (en) Task scheduling method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination