WO2023151340A1 - Graphics processor resource management method, apparatus, device, storage medium and program product - Google Patents

Graphics processor resource management method, apparatus, device, storage medium and program product

Info

Publication number
WO2023151340A1
WO2023151340A1 (PCT/CN2022/132457)
Authority
WO
WIPO (PCT)
Prior art keywords
application process
resources
resource
application
graphics processor
Prior art date
Application number
PCT/CN2022/132457
Other languages
English (en)
French (fr)
Inventor
杨坤
王赐烺
唐玺
杨广东
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority to KR1020247011896A (published as KR20240052091A)
Priority to US18/215,018 (published as US20230342207A1)
Publication of WO2023151340A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/30Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers
    • A63F13/35Details of game servers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/509Offload

Definitions

  • the present application relates to the field of cloud technologies, and more specifically, to a graphics processor resource management method, apparatus, device, computer-readable storage medium, and computer program product.
  • a cloud game renders the game's images on the graphics processing unit (GPU) on the cloud server side, and transmits the rendering result to the user's client through the network.
  • multiple game processes hosted on a cloud server may share the hardware computing resources of the cloud server, so there will be competition for hardware computing resources. If multiple game processes run on one GPU at the same time, these game processes will compete for GPU resources and affect the rendering effect. If a separate GPU is provided for each game process, the rendering quality of each game process can be guaranteed, but this will inevitably cause a serious waste of GPU resources.
  • the present application determines the processing order of these task processes in real time according to the resource requirements of each task process and the resource consumption that has already occurred, thereby realizing efficient allocation of graphics processor resources.
  • Embodiments of the present application provide a graphics processor resource management method, device, device, computer-readable storage medium, and computer program product.
  • An embodiment of the present application provides a graphics processor resource management method, including: determining a plurality of graphics processors for processing application processes; obtaining a plurality of application processes to be processed, and allocating to each application process one graphics processor among the plurality of graphics processors; for each application process in at least one application process allocated to a graphics processor, determining the amount of remaining available resources of the application process in the currently reserved resources of the graphics processor, where the amount of remaining available resources is related to the amount of remaining available resources of the application process in the historical reserved resources of the graphics processor; and, based on the amount of remaining available resources of each application process in the at least one application process in the currently reserved resources, determining a resource allocation command for each application process in the at least one application process, the resource allocation command indicating whether to process the application process; wherein the resource allocation command is used to make the amount of remaining available resources of the application process in the currently reserved resources reach a preset target value.
  • An embodiment of the present application also provides a graphics processor resource management method, including: starting a scheduling process, the scheduling process including an allocation thread and a plurality of processing threads; determining a plurality of graphics processors, and assigning a processing thread to each graphics processor in the plurality of graphics processors; starting a plurality of application processes, each application process in the plurality of application processes including a configured scheduling library; for each application process in the plurality of application processes, assigning to the application process, through the scheduling library of the application process and the allocation thread, one of the plurality of graphics processors and the corresponding processing thread; and, for each application process in the at least one application process assigned to a graphics processor, determining, by the processing thread corresponding to the application process, the amount of remaining available resources of the process in the currently reserved resources of the graphics processor, so as to determine a resource allocation command for the application process, the resource allocation command indicating whether to process the application process; wherein the amount of remaining available resources is related to the amount of remaining available resources of the application process in the historical reserved resources.
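The thread structure described above (one allocation thread plus one processing thread per graphics processor) can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the names (`SchedulingService`, `register`) and the fewest-processes placement policy are assumptions made for the example; the patent allocates by resource requirement weight instead.

```python
import queue
import threading

class SchedulingService:
    """Sketch of a scheduling process: an allocation thread registers
    application processes, and each GPU gets its own processing thread
    that issues resource allocation commands for the processes on it."""

    def __init__(self, gpu_ids):
        # One command queue and one (not yet started) processing thread per GPU.
        self.queues = {g: queue.Queue() for g in gpu_ids}
        self.assignments = {g: [] for g in gpu_ids}
        self.threads = {
            g: threading.Thread(target=self._process, args=(g,), daemon=True)
            for g in gpu_ids
        }

    def register(self, process_id):
        # Allocation-thread role: place the process on the GPU with the
        # fewest processes (placeholder policy, not the patent's weights).
        gpu = min(self.assignments, key=lambda g: len(self.assignments[g]))
        self.assignments[gpu].append(process_id)
        return gpu  # registration info returned to the application process

    def _process(self, gpu):
        # Processing-thread role: handle requests for a single GPU.
        while True:
            process_id = self.queues[gpu].get()
            # ... determine remaining available resources for process_id
            # and issue a resource allocation command ...
```

An application process would call `register` once at startup and thereafter enqueue its rendering requests on the queue of the GPU it was assigned.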
  • An embodiment of the present application provides a graphics processor resource management device, including: a processor determination module configured to determine a plurality of graphics processors that can be used to process application processes; a processor allocation module configured to acquire a plurality of application processes and assign one of the plurality of graphics processors to each application process in the plurality of application processes; a remaining resource determination module configured to, for each application process in the at least one application process allocated to a graphics processor, determine the amount of remaining available resources of the application process in the currently reserved resources of the graphics processor, where the amount of remaining available resources is related to the amount of remaining available resources of the application process in the historical reserved resources of the graphics processor; and a resource allocation module configured to, based on the amount of remaining available resources of each application process in the at least one application process in the currently reserved resources, determine a resource allocation command for each application process in the at least one application process, the resource allocation command indicating whether to process the application process; wherein the resource allocation command is used to make the amount of remaining available resources of the application process in the currently reserved resources reach the preset target value.
  • An embodiment of the present application provides a graphics processor resource management device, including: one or more processors; and one or more memories, where computer-executable programs are stored in the one or more memories, and when the computer-executable program is executed by the processor, the graphics processor resource management method described above is executed.
  • An embodiment of the present application provides a computer-readable storage medium, on which computer-executable instructions are stored, and when executed by a processor, the instructions are used to implement the above-mentioned graphics processor resource management method.
  • Embodiments of the present application provide a computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the graphics processor resource management method according to the embodiment of the present application.
  • the method provided by the embodiments of the present application uses the past resource consumption of the application processes running on the graphics processor as a reference for resource allocation, and uses the actual amount of available resources to adjust resource allocation in real time, thereby avoiding resource competition among multiple application processes.
  • for multiple application processes running simultaneously on the same graphics processor, the method provided by the embodiments of the present application considers the remaining available resources of these application processes in historical resource allocations, and determines the resource allocation scheme in real time based on the amount of resources currently available to these application processes among the resources of the graphics processor, thereby realizing efficient allocation of graphics processor resources.
  • graphics processor resources can be reasonably allocated according to the resource requirements of each application process, the influence of competition among multiple application processes is avoided, and the usage rate of graphics processor resources is improved.
  • FIG. 1 is a schematic diagram showing an example of a scenario in which multiple application processes use GPU resources according to an embodiment of the present application;
  • FIG. 2A is a flowchart illustrating a method for managing graphics processor resources according to an embodiment of the present application
  • FIG. 2B is a schematic flow diagram illustrating a method for managing graphics processor resources according to an embodiment of the present application
  • FIG. 2C is a schematic diagram showing a timing sequence of a graphics processor resource management method according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram illustrating allocating graphics processors for multiple application processes according to an embodiment of the present application
  • FIG. 4A is a schematic diagram illustrating two resource usage situations according to an embodiment of the present application.
  • FIG. 4B is a schematic diagram showing resource allocation ratios of multiple application processes according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram illustrating an acquisition queue and a processing queue according to an embodiment of the present application
  • FIG. 6 is an exemplary schematic diagram illustrating determining a first increment of a used resource amount of an application process in a currently reserved resource according to an embodiment of the present application
  • FIG. 7 is a schematic diagram illustrating acquisition and processing of a first increment according to an embodiment of the present application.
  • FIG. 8A is a schematic diagram illustrating a double-buffering method of the CPU and the GPU when determining the first increment according to an embodiment of the present application
  • FIG. 8B is a schematic diagram illustrating an adaptive buffering method of the CPU and the GPU when determining the first increment according to an embodiment of the present application
  • FIG. 9A is a schematic diagram showing a graphics processor resource management method according to an embodiment of the present application.
  • FIG. 9B is a schematic diagram illustrating scheduling logic of a graphics processor resource management method according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram showing a graphics processor resource management device according to an embodiment of the present application.
  • FIG. 11 shows a schematic diagram of a graphics processor resource management device according to an embodiment of the present application.
  • FIG. 12 shows a schematic diagram of the architecture of an exemplary computing device according to an embodiment of the present application.
  • FIG. 13 shows a schematic diagram of a storage medium according to an embodiment of the present application.
  • the graphics processor resource management method of the present application may be based on cloud technology (Cloud technology).
  • the graphic processor resource management method of the present application may be based on cloud gaming (Cloud gaming).
  • FIG. 1 is a schematic diagram showing an example of a scenario where GPU resources are used by multiple application processes according to an embodiment of the present application.
  • the network can be an Internet of Things based on the Internet and/or a telecommunication network; it can be a wired network or a wireless network, for example, a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a cellular data communication network, or another electronic network that can realize the function of information exchange.
  • the mobile phone application or computer software on the user terminal can send the control command input by the user to the server, thereby starting the corresponding application process.
  • There may be various hardware computing resources on the server, for example, a central processing unit, communication interfaces, memory, and so on.
  • taking the GPU resources shown in Figure 1 as an example, there are multiple GPUs on the server that can be used to process application processes.
  • the server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
  • the user terminal may be a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto.
  • the user terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in this application.
  • a networked game application usually relies on the GPU on the cloud server to synthesize the game screen displayed by the user terminal or perform hardware encoding.
  • Such game applications are also called cloud games (also known as "gaming on demand").
  • the user terminal can transmit the data of the user's game operation to the cloud server through the control stream, and the cloud server transmits one or more audio frames and video frames to the user terminal through the data stream.
  • in cloud gaming, games are stored, synchronized, and presented on remote cloud servers and delivered to players using streaming technology, a type of online gaming service completely different from earlier ones.
  • the cloud server runs the game, renders and encodes its graphical output into video, and then streams the video to a network client, which decodes and displays the video stream; the player's interaction commands are sent back to the cloud server.
  • cloud gaming transfers the computing load of the game from the client to the cloud, thereby releasing its constraints on the player's device.
  • cloud gaming also allows players to start the game immediately without spending time on downloading and installing the game client. Due to these advantages, cloud gaming has attracted great attention from academia and industry.
  • the cloud server is responsible for interpreting player input, executing game code, and rendering graphics, and transmits the game scene to the client through the network, while the client is responsible for decoding and displaying the game scene to the player and for capturing and sending the player's input in real time.
  • in the process of the cloud server using the GPU to perform graphics rendering, because the hardware computing resources of the GPU are limited, multiple virtual entities hosted on a cloud server that provides cloud games may share the hardware computing resources of the cloud server, so there will be competition for hardware computing resources.
  • GPU hardware serves the rendering requests submitted by all game processes on it on a first-come-first-served basis, so an increase in the rendering load of a single game process will affect the normal rendering of other game processes. For example, if the rendering of any game process times out, other game processes are forced to shorten their rendering time, resulting in low rendering quality, and this deterioration of rendering quality may gradually accumulate as rendering progresses, seriously affecting the user's game experience.
  • the present application provides a graphics processor resource management method, which determines the processing order of these task processes in real time according to the resource requirements of each task process and the resource consumption that has already occurred, thereby realizing efficient allocation of graphics processor resources.
  • the method provided by the embodiments of the present application uses the past resource consumption of the application processes running on the graphics processor as a reference for resource allocation, and uses the actual amount of available resources to adjust resource allocation in real time, thereby avoiding resource competition among multiple application processes.
  • for multiple application processes running simultaneously on the same graphics processor, the method provided by the embodiments of the present application considers the remaining available resources of these application processes in historical resource allocations, and determines the resource allocation scheme in real time based on the amount of resources currently available to these application processes among the resources of the graphics processor, thereby realizing efficient allocation of graphics processor resources.
  • the resources of the graphics processor can be reasonably allocated according to the resource requirements of each application process, the influence of competition among multiple application processes is avoided, and the usage rate of the resources of the graphics processor is improved.
  • FIG. 2A is a flowchart illustrating a graphics processor resource management method 200 according to an embodiment of the present application.
  • FIG. 2B is a schematic block diagram illustrating a method for managing graphics processor resources according to an embodiment of the present application.
  • FIG. 2C is a schematic diagram showing a timing sequence of a graphics processor resource management method according to an embodiment of the present application.
  • a plurality of graphics processors for processing application processes may be determined.
  • the resource management of the GPU can be coordinated between the application process and the scheduling service as shown in FIG. 2C, wherein the application process part corresponds to the operations executed by the application process (for example, a game instance) in the GPU resource management process, and the scheduling service part corresponds to the GPU resource management scheduling operations for the corresponding application process.
  • the application process can be any of various application processes, such as a game process, a video process, or a conference process.
  • a game process developed with a graphics library such as OpenGL (Open Graphics Library) is used as an example
  • the allocation thread in the scheduling service can first determine the multiple GPUs currently available for processing application processes and create a corresponding processing thread for each of these GPUs in the scheduling service; resource allocation management for each GPU can then be performed on its corresponding processing thread.
  • the determined GPU should have the ability to perform certain rendering calculations.
  • the GPU resource management method of this application is also applicable to the processing of other application processes, for example, it can be applied to video processes and conference processes
  • the processing may be processing the image rendering of the video image, or may be processing the image rendering of the conference image.
  • in step 202, a plurality of application processes to be processed may be obtained, and one graphics processor among the plurality of graphics processors is assigned to each application process in the plurality of application processes.
  • the server can register the corresponding application process, including but not limited to assigning each application process a GPU for its graphics rendering processing, and returning registration information to the application process, such as the index of the allocated GPU and its corresponding processing thread.
  • the application process's requests for GPU resources can also be forwarded to different GPUs for execution, so that multiple GPUs can be virtualized as one GPU to implement application process processing in this case.
  • provided that the computing performance limit of the GPU is not exceeded, more than one application process may be allocated to the same GPU; that is, the processing tasks of more than one application process may be executed on the same GPU at the same time.
  • FIG. 3 is a schematic diagram illustrating allocating graphics processors to multiple application processes according to an embodiment of the present application.
  • the application processes A and C are allocated to GPU 1 for processing, application process B is assigned to GPU 2 for processing, and GPU 3 may not be assigned any application process to process. Therefore, the application processes A and C running together on GPU 1 share the computing resources of GPU 1, so their resource usage may be in competition, and the GPU resource management method of the present application can avoid the effects of competition between these application processes.
  • each application process in the plurality of application processes may have a predetermined resource requirement weight.
  • the resource requirement weight of each application process may be determined based on the amount of GPU resources required for its calculation, and the resource requirement weight may be predetermined and notified to the scheduling service during registration.
  • the corresponding resource requirement weight can be determined in advance based on the complexity and calculation amount of the screen rendering.
  • the weight can be the proportion of unit GPU resources required for the screen rendering of the game instance; for example, if the screen rendering of a game instance requires 200 ms out of 1 s of GPU hardware computing time, the resource requirement weight of the game instance can be 0.2 (200 ms / 1 s).
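As a worked version of the 200 ms example above (the helper function is hypothetical, introduced only for illustration):

```python
def resource_requirement_weight(required_ms, unit_ms=1000):
    """Fraction of one unit of GPU time (default 1 s) a process needs."""
    return required_ms / unit_ms

# A game instance needing 200 ms of rendering per 1 s of GPU time:
weight = resource_requirement_weight(200)  # 0.2
```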
  • assigning one of the multiple graphics processors to each of the multiple application processes in step 202 may include: determining the available resource ratio of each graphics processor in the plurality of graphics processors, where the available resource ratio is the proportion of resources in the graphics processor that are available for processing application processes; and determining the graphics processor allocated to each application process based on the resource requirement weight of each application process in the plurality of application processes and the available resource ratio of each graphics processor in the plurality of graphics processors.
  • each GPU in the plurality of GPUs is a GPU currently available for processing application processes, but the amounts of computing resources available on these GPUs are not necessarily equal to one another, nor necessarily equal to their total computing resources.
  • the amount of available resources of a GPU can be expressed by its available resource ratio, which represents the proportion of the GPU's unit resources that is available for processing application processes; for example, if in 1 s of GPU hardware computing time the time available for processing application processes is 0.8 s, then the available resource ratio of the GPU can be 0.8.
  • the GPU allocation to the application process can be jointly determined based on the resource requirement weight of each application process and the available resource ratio of each GPU.
  • the sum of resource requirement weights of at least one application process allocated to one graphics processor is not greater than the ratio of available resources of the graphics processor.
  • for each GPU, the sum of the resource requirement weights of the application processes allocated to it for processing cannot be greater than the available resource ratio of the GPU; that is, the amount of resources actually used for processing application processes on the GPU cannot be greater than the expected amount of resources.
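The constraint above (the weights placed on a GPU may not sum past its available-resource ratio) admits a simple greedy sketch. The first-fit-decreasing policy below is an assumption chosen for illustration; the patent does not prescribe a particular placement strategy, and the function name `allocate` is hypothetical.

```python
def allocate(process_weights, gpu_ratios):
    """process_weights: {process: weight}; gpu_ratios: {gpu: ratio}.
    Place each process on the first GPU whose remaining capacity fits it,
    heaviest processes first."""
    placement, load = {}, {g: 0.0 for g in gpu_ratios}
    for proc, w in sorted(process_weights.items(), key=lambda kv: -kv[1]):
        for gpu, ratio in gpu_ratios.items():
            # Invariant from the text: sum of weights on a GPU <= its ratio.
            if load[gpu] + w <= ratio:
                placement[proc] = gpu
                load[gpu] += w
                break
        else:
            placement[proc] = None  # no GPU can host this process right now
    return placement
```

With weights A = 0.5, B = 0.4, C = 0.3 and two GPUs each with available resource ratio 0.8, this places A and C on the first GPU and B on the second, consistent with the FIG. 3 scenario described above.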
  • in step 203, for each application process in at least one application process allocated to a graphics processor, the amount of remaining available resources of the application process in the currently reserved resources of the graphics processor may be determined, where the amount of remaining available resources may be related to the amount of remaining available resources of the application process in the historical reserved resources of the graphics processor.
  • the resource amounts included in the historical reserved resources and the current reserved resources of the graphics processor respectively may be predetermined resource amounts.
  • the amount of resources respectively contained in the historical reserved resource and the current reserved resource may be the above-mentioned unit GPU resource, and the historical reserved resource is a resource reserved by the application process before the currently reserved resource.
  • the usage of unit GPU resources can be used to determine the usage rate of the GPU (the ratio of the actual working time of the GPU to the running time), so the allocation of GPU resources in this application can be based on the allocation of unit GPU resources.
  • the unit GPU resource may be a calculation time of unit length, such as 1 second or 1 frame time described above, which is not limited in the present application.
  • the resource requirement weight of each application process in the plurality of application processes may indicate a proportion of the resource amount required by the application process in the predetermined resource amount.
  • the resource requirement weight of the application process may be the resource ratio required by the processing of the application process in unit GPU resources.
  • the current resource allocation can be determined based on the historical state of resource allocation to the application process.
  • the historical state may include the remaining resource amount of the application process in the historical resource allocation, that is, the resource amount that can be used but not used by it.
  • remaining available resources may exist because the application process's resources were occupied when the processing of other application processes timed out. Such resource occupation may degrade the rendering of the application process; therefore, in order to avoid the accumulation of such occupation, corresponding adjustments can be made in subsequent resource allocations, for example, by associating the amount of resources available to the application process in the current resource allocation with the amount remaining in the historical resource allocation.
  • determining the remaining available resources of the application process in the current reserved resources in step 203 may include: based on the remaining available resources of the application process in the historical reserved resources, the The amount of resources used by the application process in the currently reserved resources and the resource demand weight of the application process are used to determine the remaining available resource amount of the application process in the currently reserved resources.
  • the total amount of available resources of the application process in the currently reserved resources can be determined.
  • subtracting the amount of resources the application process has already used in the currently reserved resources from that total yields its remaining available resources in the currently reserved resources.
  • the current resource allocation can be adjusted based on the error in the historical resource allocation to reduce or even eliminate the resource allocation error. For example, when the remaining available resources of an application process in the historical reserved resources are less than zero, subtracting the absolute value of that historical remainder from the application process's remaining available resources in the currently reserved resources can effectively reduce the resource allocation error.
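As a minimal sketch of the update described above (in Python, with illustrative names; the patent does not prescribe an implementation, and treating a non-negative historical remainder as fully reset is an assumption here):

```python
def remaining_available(weight, quota, used, hist_remaining):
    """Remaining available resources of an application process in the
    currently reserved resources.

    weight         -- resource requirement weight of the process (0..1)
    quota          -- total amount of the currently reserved resources
    used           -- resources already used in the current reservation
    hist_remaining -- leftover from the historical reservation
    """
    # The process's share of the current reservation.
    total = weight * quota
    # A negative historical remainder means the process overran its share
    # earlier (e.g. because another process timed out); subtract that
    # deficit so the allocation error does not accumulate.
    if hist_remaining < 0:
        total -= abs(hist_remaining)
    return total - used
```

For example, with weight 0.5, a quota of 10 resource units, 3 units already used and a historical deficit of 1 unit, the remaining amount is 1.0 rather than 2.0.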
  • the graphics processor resource management method 200 may further include: for each application process in the at least one application process, acquiring the remaining available resources of the application process in the historical predetermined resources amount, and determine the amount of resources used by the application process in the currently reserved resources.
  • the amount of resources used by the application process in the currently reserved resources may include the amount used in the previous resource allocation plus the amount used in earlier allocations within the current reservation, where the amount used in the previous allocation corresponds to the computing resources consumed by the application process's most recent processing.
  • determining the amount of used resources of the application process in the currently scheduled resources may include determining a first increment of the amount of used resources of the application process in the currently scheduled resources, the first increment It may correspond to the previous processing of the processing task from the application process by the graphics processor corresponding to the application process.
  • the first increment may correspond to the time the GPU spent on its most recent rendering of the game instance. As shown in FIG. 2C, the rendering time may be obtained by the application process and reported to the scheduling service through the scheduling library, so that the scheduling service processes the rendering time, including determining the amount of remaining available resources of the application process in the currently reserved resources.
  • based on the rendering times of the application processes, the scheduling service can determine the current resource allocation plan and notify each application process whether to render; during this period the application process waits for the rendering notification from the scheduling service, as shown in FIG. 2C. Therefore, obtaining the remaining available resources of each application process, and in particular obtaining each application process's first increment, is important to the GPU resource management of the present application and can effectively reduce allocation errors.
  • for the manner of obtaining the first increment, reference may be made to the relevant descriptions of FIG. 6 and FIG. 7 below, which will not be detailed here.
  • a resource allocation command for each application process in the at least one application process may be determined based on the amount of remaining available resources of each such application process in the currently reserved resources, the resource allocation command indicating whether to process the application process.
  • step 204 may include: for each application process in the at least one application process, when the remaining available resources of the application process in the currently reserved resources are not greater than zero, determining that the resource allocation command for the application process indicates not to process it; and for the other application processes in the at least one application process whose remaining available resources are greater than zero, determining the resource allocation command for each based on the priority of each application process among those other application processes.
  • in this case, no resources may be allocated to the application process from the currently reserved resources, so as not to affect the processing of other application processes.
  • it may be considered to continue to allocate GPU computing resources to these application processes.
  • the allocation of resources to the application processes can be based on the priority of the application processes, rather than just based on the previously described first-come, first-served competition model.
  • the priority of each application process among the other application processes may be related to how long the application process has been waiting to be processed and to the chronological order in which its latest first increment was determined. For example, pending application processes whose screens are about to freeze can be given higher priority so that they are processed first; for application processes without such an urgent situation, priority can be determined from the chronological order in which their first increments were obtained. For example, the application process that obtained its first increment earliest and whose remaining available resources in the currently reserved resources are greater than zero can be processed first.
  • determining a resource allocation command for each application process based on its priority may include: for each application process in the other application processes, when there is an application process whose waiting time satisfies a predetermined condition, determining the resource allocation command for that application process based on the chronological order in which its latest first increment was determined; and when there is no application process whose waiting time satisfies the predetermined condition, determining the resource allocation command for each application process based on the chronological order in which the latest first increment of each of the other application processes was determined.
  • the following factors (including but not limited to) can be considered when setting the priority of application processes:
  • 1. Remaining available resources: application processes whose remaining available resources in the currently reserved resources are not greater than zero can be given a low priority, with their commands indicating not to execute.
  • 2. Urgency: high priority can be assigned to application processes that need urgent processing, for example, when the screen display is about to freeze or another emergency occurs.
  • 3. Order of obtaining the first increment: the application process whose first increment is obtained first can be assigned a higher priority, making the allocation of GPU resources more continuous and thereby improving the utilization rate of GPU resources. It should be understood that when setting priorities for processing application processes, the method of the present application may also consider various other factors; those listed above are examples rather than limitations.
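A compact sketch of the skip/priority rules above (Python; the field names, the `skip`/`render` encoding and the urgency test are illustrative assumptions, not from the patent):

```python
def allocation_commands(processes, urgent_wait_threshold):
    """processes: dicts with keys name, remaining (resources left in the
    current reservation), wait_time (how long it has waited), incr_time
    (when its latest first increment was determined)."""
    commands = {}
    eligible = []
    for p in processes:
        if p["remaining"] <= 0:
            commands[p["name"]] = "skip"  # factor 1: no resources left
        else:
            eligible.append(p)
    # Factor 2: urgent processes (e.g. screen about to freeze) come
    # first; factor 3: the rest are ordered by when their latest first
    # increment was obtained (earliest first).
    eligible.sort(key=lambda p: (p["wait_time"] < urgent_wait_threshold,
                                 p["incr_time"]))
    for rank, p in enumerate(eligible):
        commands[p["name"]] = ("render", rank)
    return commands
```

Here a process whose wait time meets the threshold is treated as urgent and sorted ahead of the first-increment ordering.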
  • the resource allocation command indicates whether the corresponding application process will send a processing task to the corresponding graphics processor to be processed by it, wherein the graphics processor's processing of the task corresponds to the application process's usage of the graphics processor's resources.
  • the GPU resource management of the present application controls the sending of processing tasks by these application processes by evaluating the resource usage of the GPU hardware resources by each application process.
  • FIG. 2B takes a graphics rendering task as an example.
  • GPU resource management can control the delivery of rendering instructions by each application process through resource allocation commands.
  • each application process may determine, according to the received resource allocation command, whether to send a processing task to the corresponding GPU (for example, deliver the rendering instruction to the GPU rendering instruction queue in FIG. 2B). The GPU's execution of the processing task consumes a certain amount of GPU hardware resources, and that amount of computing resources may correspond to the application process's first increment in the next resource allocation.
  • the resource allocation command may be used to make the amount of remaining available resources of the application process in the currently reserved resources reach a preset target value.
  • the resource allocation command can reduce errors in resource allocation to each application process.
  • the preset target value is a value, set in advance, that the amount of remaining available resources should reach, and it can be set as required; for example, it can be set to zero so that the remaining available resources approach zero, thereby saving resources.
  • FIG. 4A is a schematic diagram illustrating two resource usage situations according to an embodiment of the present application.
  • when the application process can no longer use any of the currently reserved resources, no resources are allocated to it from the currently reserved resources, so that the amount of remaining available resources does not continue to decrease.
  • the resource allocation command is used to make the proportion of resources used by the application process in the currently reserved resources closer to the application process's resource requirement weight than the proportion it used in the historical reserved resources. That is, the resource allocation command is used to make the first error of the application process greater than the second error, where the first error refers to the error between the application process's resource requirement weight and the proportion of resources it used in the historical reserved resources, and the second error refers to the error between the application process's resource requirement weight and the proportion of resources it used in the currently reserved resources.
  • Fig. 4B is a schematic diagram showing resource allocation ratios of multiple application processes according to an embodiment of the present application.
  • the resource requirement weights of the three application processes A, B, and C are 0.5, 0.2, and 0.3, respectively.
  • application process A used more resources than its reserved share (58%), leaving insufficient resources for the other two application processes B and C (17% and 25% respectively, both less than the resource proportions corresponding to their requirement weights). In this case, the remaining available resources of application process A in the historical reserved resources are less than zero (for example, -0.08 in the example of FIG. 4B).
  • in this way, the amount of resources used by the application process in the currently reserved resources can be made closer to the expected amount, that is, to the resource requirements corresponding to the application process's resource requirement weight. Therefore, on-demand resource allocation to application processes and efficient utilization of GPU resources can be realized.
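The proportions in the FIG. 4B example can be checked with a short calculation (Python; the rounding and dictionary layout are for illustration only):

```python
# Requirement weights and actually-used proportions from the example:
weights = {"A": 0.5, "B": 0.2, "C": 0.3}
used    = {"A": 0.58, "B": 0.17, "C": 0.25}

# Leftover share of each process; A's -0.08 deficit is deducted from its
# share in the next allocation so that B and C can approach their weights.
leftover = {p: round(weights[p] - used[p], 2) for p in weights}
```

`leftover` here is `{"A": -0.08, "B": 0.03, "C": 0.05}`, matching the observation that A overran while B and C fell short.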
  • FIG. 5 is a schematic diagram illustrating an acquisition queue and a processing queue according to an embodiment of the present application.
  • new rendering time data are inserted into the acquisition queue sequentially via input events while the processing queue performs rendering time processing; after the processing completes, the two queues can be exchanged, so that the acquisition queue becomes the new processing queue and the processing queue becomes the new acquisition queue. Timed allocation is thus realized through the alternation of the queues.
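The queue alternation can be sketched as follows (Python; the class and method names are illustrative, and the lock stands in for whatever synchronization the real system uses):

```python
import threading

class SwapQueues:
    """Acquisition/processing queue pair: rendering times are appended to
    the acquisition queue while the processing queue is being drained;
    when a drain begins, the two queues exchange roles."""

    def __init__(self):
        self._acquire = []   # receives new rendering time data
        self._process = []   # currently being processed
        self._lock = threading.Lock()

    def push(self, render_time):
        with self._lock:
            self._acquire.append(render_time)

    def drain(self):
        # The acquisition queue becomes the new processing queue, and an
        # emptied queue becomes the new acquisition queue.
        with self._lock:
            self._acquire, self._process = self._process, self._acquire
        batch, self._process = self._process, []
        return batch
```

Each `drain` returns the batch of rendering times accumulated since the previous swap.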
  • the method of obtaining the first increment may also change accordingly.
  • the manner of determining the first increment may be determined based on the processing manner of the graphics processor for the processing task, and the processing manner may include at least one of synchronous rendering or asynchronous rendering A sort of.
  • determining the first increment of the amount of resources used by the application process in the currently reserved resources may be performed in one of the following ways: estimating the first increment by marking the start and end of the previous processing; or obtaining the first increment from the graphics processor by using a query instruction.
  • the determination of the first increment may be completed by intercepting a GPU hardware queue by using a signal.
  • the first increment of the amount of resources used by the application process in the currently reserved resources may be estimated by marking the start and end of the previous processing.
  • Fig. 6 is an exemplary schematic diagram illustrating determining a first increment of a used resource amount of an application process in a currently reserved resource according to an embodiment of the present application.
  • the actual time consumption of rendering may be determined by calculating the execution time of the drawing function, and the actual time consumption of rendering may be used as the first increment of the amount of resources used in the currently scheduled resources.
  • the execution time of the drawing function may include the preparation time and the actual rendering time, where the preparation time covers the transmission of the drawing function through the PCI-E channel and the preparation for rendering, and the actual rendering time covers the execution of the rendering instructions.
  • the command signal F can be inserted before and after the drawing command, so as to inform the application process (or scheduling service) to start or end the timing when the signal F is triggered.
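A rough CPU-side sketch of this start/end marking (Python; `time.perf_counter` stands in for the signal F triggers and `draw_fn` for the real drawing function, so the measured value is an estimate that also contains CPU-side preparation time):

```python
import time

def timed_draw(draw_fn, *args):
    """Estimate the first increment by marking the start and end of a
    draw call and reporting the elapsed time."""
    start = time.perf_counter()            # signal F: start timing
    result = draw_fn(*args)
    elapsed = time.perf_counter() - start  # signal F: stop timing
    return result, elapsed                 # elapsed -> scheduling service
```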
  • the determination of the first increment may be performed locally on the application thread through a query.
  • the first increment of the amount of resources used by the application process in the currently reserved resources may be obtained from the graphics processor by using a query instruction.
  • Fig. 7 is a schematic diagram illustrating acquisition and processing of a first increment according to an embodiment of the present application.
  • a query operation may be sent to the GPU in a query manner, and the query operation may determine the time between two specified query points through the GPU.
  • “Start Query” and “End Query” are two specified query points inserted in the GPU to query the GPU time spent executing the drawing function of the rendering instructions between them, that is, the rendering time. The first inserted “End Query” point is discarded because the position of its corresponding “Start Query” point cannot be known.
  • the gray box in FIG. 7 indicates that the GPU resource management process is executed based on the obtained rendering time: the obtained rendering time is reported to the scheduling service, the application process then waits for the scheduling service's rendering notification, and the scheduling service executes the rendering time processing and sends the rendering notification (that is, the resource allocation command) to the application process by way of an event.
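The pairing of query points, including discarding the initial "End Query" whose "Start Query" position is unknown, can be sketched as follows (Python; the event-tuple representation is an illustrative stand-in for real GPU timer queries such as OpenGL's `glBeginQuery`/`glEndQuery` with `GL_TIME_ELAPSED`):

```python
def pair_queries(events):
    """events: a chronological list of ("start", t) / ("end", t) query
    points read back from the GPU; returns the rendering times between
    matched Start/End pairs."""
    times = []
    start = None
    for kind, t in events:
        if kind == "start":
            start = t
        elif kind == "end":
            if start is None:
                continue  # first End Query: matching Start unknown, drop
            times.append(t - start)
            start = None
    return times
```

For instance, `pair_queries([("end", 5), ("start", 10), ("end", 14)])` returns `[4]`: the leading end point is dropped.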
  • the GPU resource management method of the present application can use the time of the CPU to execute rendering instructions to offset the time of the GPU to prepare query results.
  • FIG. 8A is a schematic diagram illustrating a double-buffering method of the CPU and the GPU when determining the first increment according to an embodiment of the present application.
  • the indexes of CPU and GPU indicate their respective processing order, where the processing of CPU 2 corresponds to the query time of GPU 1, and the processing of CPU 3 corresponds to the query time of GPU 2.
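The offsetting can be sketched as a pipeline in which the CPU work of frame i overlaps reading the query result of frame i-1 (Python; purely illustrative of FIG. 8A, with frames standing in for draw/query submissions):

```python
def double_buffered_reads(cpu_frames):
    """While the CPU records frame i, the query result of frame i-1 is
    read back, hiding the GPU's result-preparation latency."""
    gpu_results = []
    pending = None                       # query issued last frame
    for frame in cpu_frames:
        if pending is not None:
            gpu_results.append(pending)  # read last frame's query result
        pending = frame                  # issue this frame's query
    return gpu_results  # the final query is read in the next frame
```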
  • the GPU management method of the present application can also solve the foregoing problems based on an adaptive buffering mechanism.
  • FIG. 8B is a schematic diagram illustrating an adaptive buffering method of a CPU and a GPU when determining a first increment according to an embodiment of the present application.
  • the double-buffering mechanism can still be used, but when the previous drawing cannot yet be queried, this query need not be ended until it is certain that the previous drawing has completed, during which multiple draw calls can be made.
  • the adaptive buffer method expands the number of draw calls in the query, and expands the single draw call of the query to multiple random calls.
  • the system accuracy may be reduced. This is because rendering from different application processes may be interleaved, and the GPU can only read the time of the current query point and cannot distinguish between different application processes, resulting in inaccurate queries.
  • the rendering instruction buffer can be forcibly flushed asynchronously at the end of the GPU query, which generates a large number of fast draws; such draws can obtain rendering time results from the GPU quickly without using the adaptive buffering method, but this also incurs communication overhead.
  • a Monte Carlo algorithm can be used, which combines the basic algorithm of the double buffer method and the adaptive algorithm, and obtains a better solution (relative to the optimal solution) through a statistical method of probability theory.
  • the degradation of the adaptive algorithm to the double-buffered underlying algorithm can be limited by the number of draw calls or the system runtime, where lower values of these parameters may lead to poor communication performance, while higher values may increase the communication between multiple application processes. Therefore, the rationality of the system can be further guaranteed by using the Monte Carlo algorithm.
  • FIG. 9A is a schematic diagram illustrating a graphics processor resource management method 300 according to an embodiment of the present application.
  • the graphics processor resource management method 300 may include two operations performed by a scheduling process and an application process respectively.
  • the graphics processor resource management method 300 may mainly include the following steps, where the numbers of each step correspond to the reference numerals in FIG. 9A .
  • Start a scheduling process which may include an allocation thread and a plurality of processing threads.
  • the scheduling process may include one allocation thread and multiple processing threads, wherein the allocation thread may be used for allocation of GPUs and application processes and distribution of messages, and the processing thread may be used for GPU rendering time processing.
  • multiple GPUs that are currently available for processing application processes can be determined for subsequent management of resource allocation for each GPU.
  • the scheduling process can allocate three processing threads 1, 2, and 3 for the rendering time processing of the three GPUs respectively. Therefore, the subsequent resource allocation of each GPU can be executed by the corresponding processing thread.
  • the scheduling library can be shared among the multiple application processes, and the information interaction between the application process and the scheduling process can be realized through the scheduling library.
  • the scheduling library can send the information of the application process to the scheduling process for registration, and the registration operation of the application process is a synchronous operation.
  • the application process can continue to send messages after registration, and the scheduling process can notify the processing of the application process through shared events.
  • three application processes can be assigned GPUs for their graphics rendering: for example, application processes 1 and 2 are assigned to perform graphics rendering on GPU 1, application process 3 is assigned to perform graphics rendering on GPU 3, and GPU 2 is not allocated to any of these three application processes.
  • corresponding registration information may be returned to each application process, such as information such as the index of the allocated GPU and its corresponding processing thread.
  • the above rendering processing operations of the scheduling service can be performed by the corresponding processing threads in the scheduling process. For example, processing thread 1 corresponding to GPU 1 can process the rendering times of application processes 1 and 2, including determining the remaining available resources of the two application processes in the currently reserved resources and determining corresponding resource allocation commands based on those amounts.
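The registration flow of FIG. 9A can be sketched as follows (Python; the class shape and the fewest-processes placement rule are illustrative assumptions, since the patent does not fix a placement policy):

```python
class Scheduler:
    """One processing thread per available GPU; registering application
    processes are assigned a GPU and told its processing thread index."""

    def __init__(self, gpu_ids):
        self.threads = {gpu: i + 1 for i, gpu in enumerate(gpu_ids)}
        self.assigned = {gpu: [] for gpu in gpu_ids}

    def register(self, proc):
        # Illustrative policy: place on the GPU with the fewest processes
        # (the load must stay within the GPU's performance limits).
        gpu = min(self.assigned, key=lambda g: len(self.assigned[g]))
        self.assigned[gpu].append(proc)
        # Returned to the application process as registration info.
        return {"gpu": gpu, "thread": self.threads[gpu]}
```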
  • each application process determines whether to send a processing task to the corresponding GPU based on the resource allocation command for processing by the GPU.
  • the manner of obtaining the first increment is different based on the manner in which the GPU processes the processing task.
  • in the case of asynchronous rendering, the first increment can be estimated by the scheduling library directly from the GPU through signal interception; in the case of synchronous rendering, the first increment can be obtained by the scheduling library from the GPU through query instructions, that is, returned along the original path from the GPU to the scheduling library.
  • the amount of remaining available resources is related to the application process's remaining available resources in the historical reserved resources of the graphics processor, and the resource allocation command is used to make the application process's remaining available resources in the currently reserved resources reach the preset target value.
  • in this way, the amount of resources used by the application process in the currently reserved resources can be made closer to the expected amount, that is, to the resource requirements corresponding to the application process's resource requirement weight. Therefore, on-demand resource allocation to application processes and efficient utilization of GPU resources can be realized.
  • FIG. 9B is a schematic diagram showing the scheduling logic of the graphics processor resource management method according to the embodiment of the present application.
  • based on the current design of common graphics systems, the graphics processor resource management method of the present application may involve three runtime libraries, namely client injection (SchedulingClient), scheduling service (SchedulingService), and scheduling logic (Scheduling).
  • client injection and the scheduling service only handle function injection and external event communication; the core logic lies in the scheduling logic.
  • various functions in FIG. 9B (e.g., ResourceScheduling, SchedulingProtocol, etc.) can be called to determine resource allocation commands for each application process by collecting and computing the rendering time and calculating whether to render based on it.
  • FIG. 10 is a schematic diagram showing a graphics processor resource management apparatus 1000 according to an embodiment of the present application.
  • the graphics processor resource management apparatus 1000 may include a processor determination module 1001 , a processor allocation module 1002 , a remaining resource determination module 1003 and a resource allocation module 1004 .
  • the processor determining module 1001 may be configured to determine multiple graphics processors available for processing application processes.
  • the processor determining module 1001 may perform the operations described in step 201 above.
  • the application process may be various application processes such as a game process, a video process, and a conference process.
  • the determined GPUs should be GPUs whose computing power meets the processing requirements of the application processes.
  • the processor allocation module 1002 may be configured to acquire multiple application processes to be processed, and allocate one graphics processor among the multiple graphics processors to each application process in the multiple application processes.
  • the processor allocation module 1002 may perform the operations described above in relation to step 202 .
  • the server may register corresponding application processes for each user terminal, including allocating GPUs for graphics rendering processing to each application process.
  • more than one application process can be assigned to the same GPU, that is, processing tasks of more than one application process can be executed on the same GPU at the same time, but the load assigned to a GPU cannot exceed that GPU's computational performance limits.
  • the remaining resource determining module 1003 may be configured to, for each application process in at least one application process allocated to a graphics processor, determine the amount of remaining available resources of the application process in the currently reserved resources of the graphics processor, The remaining available resource amount is related to the remaining available resource amount of the application process in the historical reserved resources of the graphics processor.
  • the remaining resource determining module 1003 may perform the operations described above in relation to step 203 .
  • the current resource allocation can be adjusted based on the error in the historical resource allocation to reduce or even eliminate the resource allocation error.
  • the resource allocation module 1004 may be configured to determine a resource allocation command for each application process in the at least one application process based on the amount of remaining available resources of each such application process in the currently reserved resources, where the resource allocation command indicates whether to process the application process and is used to make the application process's remaining available resources in the currently reserved resources reach a preset target value.
  • the resource allocation module 1004 may perform the operations described above in relation to step 204 .
  • in this way, the amount of resources used by the application process in the currently reserved resources can be made closer to the expected amount, that is, to the resource requirements corresponding to the application process's resource requirement weight. Therefore, on-demand resource allocation to application processes and efficient utilization of GPU resources can be realized.
  • each application process in the plurality of application processes has a predetermined resource requirement weight
  • the device further includes:
  • the used resource determination module is configured to, for each application process in the at least one application process, obtain the application process's remaining available resources in the historical reserved resources and determine the application process's amount of used resources in the currently reserved resources;
  • the remaining resource determination module is further configured to be based on the amount of remaining available resources of the application process in the historical reserved resources, the amount of used resources of the application process in the current reserved resources, and the application The resource requirement weight of the process determines the amount of remaining available resources of the application process in the currently scheduled resources;
  • the resource allocation command is used to make the first error of the application process larger than the second error, where the first error refers to the error between the application process's resource requirement weight and the proportion of resources it used in the historical reserved resources, and the second error refers to the error between the application process's resource requirement weight and the proportion of resources it used in the currently reserved resources.
  • the resource allocation command indicates whether the corresponding application process will send a processing task to the corresponding graphics processor for processing by the graphics processor, wherein the graphics processor's processing of the task corresponds to the application process's usage of the graphics processor's resources;
  • the used resource determining module is further configured to determine a first increment of the amount of used resources of the application process in the currently reserved resources, the first increment corresponding to the previous processing, by the graphics processor corresponding to the application process, of a processing task from the application process.
  • the resource allocation module is further configured to, for each application process in the at least one application process, if the application process's amount of remaining available resources in the currently reserved resources is not greater than zero, determine that the resource allocation command for the application process indicates not to process it; and for the other application processes in the at least one application process whose remaining available resources are greater than zero, determine the resource allocation command for each application process based on the priority of each application process among those other application processes.
  • the priority of each of the other application processes is related to the length of time the application process waits to be processed and the chronological order in which its latest first increment is determined;
  • the resource allocation module is further configured to, for each application process in the other application processes: when there is an application process whose waiting time satisfies a predetermined condition, determine the resource allocation command for that application process based on the chronological order in which its latest first increment was determined; and when there is no application process whose waiting time satisfies the predetermined condition, determine the resource allocation command for each application process based on the chronological order in which the latest first increment of each of the other application processes was determined.
  • FIG. 11 shows a schematic diagram of a graphics processor resource management device 2000 according to an embodiment of the present application.
  • the graphics processor resource management device 2000 may include one or more processors 2010 and one or more memories 2020.
  • the memory 2020 stores computer-readable code which, when executed by the one or more processors 2010, performs the graphics processor resource management method described above.
  • the processor in the embodiments of the present application may be an integrated circuit chip with signal processing capability.
  • the above-mentioned processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the processor may implement or execute the various methods, steps, and logic block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like, and may be of an x86 architecture or an ARM architecture.
  • the various example embodiments of the present application may be implemented in hardware or special-purpose circuits, software, firmware, logic, or any combination thereof. Certain aspects may be implemented in hardware, while other aspects may be implemented in firmware or software executable by a controller, microprocessor, or other computing device.
  • where aspects of the embodiments of the present application are illustrated or described as block diagrams, flowcharts, or some other graphical representation, it will be understood that the blocks, apparatuses, systems, techniques, or methods described herein may be implemented, as non-limiting examples, in hardware, software, firmware, special-purpose circuits or logic, general-purpose hardware or controllers or other computing devices, or some combination thereof.
  • computing device 3000 may include a bus 3010, one or more CPUs 3020, a read-only memory (ROM) 3030, a random access memory (RAM) 3040, a communication port 3050 connected to a network, input/output components 3060, a hard disk 3070, and the like.
  • a storage device in the computing device 3000, such as the ROM 3030 or the hard disk 3070, may store various data or files used in the processing and/or communication of the graphics processor resource management method provided by the present application, as well as the program instructions executed by the CPU.
  • computing device 3000 may also include a user interface 3080.
  • the architecture shown in FIG. 12 is only exemplary, and one or more components of the computing device shown in FIG. 12 may be omitted as actually needed when implementing different devices.
  • FIG. 13 shows a schematic diagram 4000 of a storage medium according to an embodiment of the present application.
  • a computer-readable storage medium in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memory.
  • the nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), or a flash memory.
  • the volatile memory may be a random access memory (RAM), which serves as an external cache.
  • by way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
  • embodiments of the present application also provide a computer program product or computer program, where the computer program product or computer program includes computer instructions stored in a computer-readable storage medium.
  • the processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the graphics processor resource management method according to the embodiments of the present application.
  • embodiments of the present application provide a graphics processor resource management method, apparatus, device, computer-readable storage medium, and computer program product.
  • the method provided by the embodiments of the present application uses the past resource consumption of the application processes running on a graphics processor as a reference for resource allocation, and adjusts the resource allocation in real time according to the amount of resources each application process can actually use, thereby avoiding resource competition among multiple application processes.
  • for multiple application processes running simultaneously on the same graphics processor, the method provided by the embodiments of the present application considers the remaining available resources of these application processes in historical resource allocations, and determines a resource allocation scheme in real time based on the amount of the graphics processor's resources that these application processes can currently use, thereby achieving efficient allocation of graphics processor resources.
  • graphics processor resources can thus be reasonably allocated according to the resource requirements of each application process, the effects of competition among multiple application processes are avoided, and the usage rate of graphics processor resources is improved.
  • each block in the flowcharts or block diagrams may represent a module, program segment, or part of code that includes at least one executable instruction for implementing the specified logical function.
  • in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Processing Or Creating Images (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Embodiments of the present application provide a graphics processor resource management method, apparatus, device, computer-readable storage medium, and computer program product. The method provided by the embodiments of the present application relates to the fields of cloud technology and cloud gaming. For multiple application processes running simultaneously on the same graphics processor, the method takes into account the remaining available resources of these application processes in historical resource allocations, and determines a resource allocation scheme in real time based on the amount of the graphics processor's resources that these application processes can currently use, thereby achieving efficient allocation of graphics processor resources. The method of the embodiments of the present application can reasonably allocate graphics processor resources according to the resource requirements of each application process, avoids the effects of competition among multiple application processes, and improves the usage rate of graphics processor resources.

Description

Graphics processor resource management method, apparatus, device, storage medium, and program product
This application claims priority to Chinese Patent Application No. 2022101351584, entitled "Graphics processor resource management method, apparatus, device, and storage medium" and filed with the China National Intellectual Property Administration on February 14, 2022, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of cloud technology, and more specifically, to a graphics processor resource management method, apparatus, device, computer-readable storage medium, and computer program product.
Background
With the continuous development of cloud technology, cloud gaming has become increasingly popular in the game industry. In cloud gaming, the rendering of game frames is performed on a graphics processing unit (GPU) on the cloud server side, and the rendering results are transmitted to the user's client over the network. However, GPU resources on the cloud server side are limited, and in a cloud server that provides cloud games, the multiple game processes hosted on the cloud server may share the server's hardware computing resources, which gives rise to competition for those resources. If multiple game processes run on one GPU at the same time, they contend for GPU resources and degrade the rendering quality; if each game process is given a dedicated GPU, the rendering quality of each game process can be guaranteed, but GPU resources are severely wasted.
Therefore, an efficient GPU resource management method is needed, so that higher-quality rendering of multiple game processes can be achieved with limited GPU resources.
Summary
To solve the above problems, the present application determines, in real time, the processing order of task processes according to the resource requirements of each task process and the resource consumption it has already incurred, thereby achieving efficient allocation of graphics processor resources.
Embodiments of the present application provide a graphics processor resource management method, apparatus, device, computer-readable storage medium, and computer program product.
Embodiments of the present application provide a graphics processor resource management method, including: determining multiple graphics processors for processing application processes; obtaining multiple application processes to be processed, and allocating one of the multiple graphics processors to each of the multiple application processes; for each application process of at least one application process allocated to one graphics processor, determining the amount of remaining available resources of the application process in the current predetermined resources of the graphics processor, the amount of remaining available resources being related to the amount of remaining available resources of the application process in the historical predetermined resources of the graphics processor; and determining, based on the amount of remaining available resources of each of the at least one application process in the current predetermined resources, a resource allocation command for each of the at least one application process, the resource allocation command indicating whether to process the application process; wherein the resource allocation command is used to cause the amount of remaining available resources of the application process in the current predetermined resources to reach a preset target value.
Embodiments of the present application further provide a graphics processor resource management method, including: starting a scheduling process, the scheduling process including an allocation thread and multiple processing threads; determining, through the allocation thread, multiple graphics processors for processing application processes, and allocating one processing thread to each of the multiple graphics processors; starting multiple application processes, each of which includes a scheduling library preconfigured by the scheduling process; for each of the multiple application processes, allocating, through the scheduling library of the application process and the allocation thread, one of the multiple graphics processors and a corresponding processing thread to the application process; and for each application process of at least one application process allocated to one graphics processor, determining, through the processing thread corresponding to the application process, the amount of remaining available resources of the application process in the current predetermined resources of the graphics processor, so as to determine a resource allocation command for the application process, the resource allocation command indicating whether to process the application process; wherein the amount of remaining available resources is related to the amount of remaining available resources of the application process in the historical predetermined resources of the graphics processor, and the resource allocation command is used to cause the amount of remaining available resources of the application process in the current predetermined resources to reach a preset target value.
Embodiments of the present application provide a graphics processor resource management apparatus, including: a processor determining module configured to determine multiple graphics processors available for processing application processes; a processor allocation module configured to obtain multiple application processes to be processed and allocate one of the multiple graphics processors to each of the multiple application processes; a remaining-resource determining module configured to, for each application process of at least one application process allocated to one graphics processor, determine the amount of remaining available resources of the application process in the current predetermined resources of the graphics processor, the amount of remaining available resources being related to the amount of remaining available resources of the application process in the historical predetermined resources of the graphics processor; and a resource allocation module configured to determine, based on the amount of remaining available resources of each of the at least one application process in the current predetermined resources, a resource allocation command for each of the at least one application process, the resource allocation command indicating whether to process the application process; wherein the resource allocation command is used to cause the amount of remaining available resources of the application process in the current predetermined resources to reach a preset target value.
Embodiments of the present application provide a graphics processor resource management device, including: one or more processors; and one or more memories storing a computer-executable program which, when executed by the processors, performs the graphics processor resource management method described above.
Embodiments of the present application provide a computer-readable storage medium having computer-executable instructions stored thereon, which, when executed by a processor, implement the graphics processor resource management method described above.
Embodiments of the present application provide a computer program product or computer program including computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the graphics processor resource management method according to the embodiments of the present application.
Compared with conventional graphics processor resource management methods, the method provided by the embodiments of the present application uses the past resource consumption of the application processes running on a graphics processor as a reference for resource allocation, and adjusts the resource allocation in real time according to the amount of resources each application process can actually use, thereby avoiding resource competition among multiple application processes.
For multiple application processes running simultaneously on the same graphics processor, the method provided by the embodiments of the present application considers the remaining available resources of these application processes in historical resource allocations, and determines a resource allocation scheme in real time based on the amount of the graphics processor's resources that these application processes can currently use, thereby achieving efficient allocation of graphics processor resources. The method of the embodiments of the present application can reasonably allocate graphics processor resources according to the resource requirements of each application process, avoids the effects of competition among multiple application processes, and improves the usage rate of graphics processor resources.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the drawings described below are merely some example embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is an example schematic diagram of a scenario in which multiple application processes use GPU resources according to an embodiment of the present application;
FIG. 2A is a flowchart of a graphics processor resource management method according to an embodiment of the present application;
FIG. 2B is a schematic flow block diagram of a graphics processor resource management method according to an embodiment of the present application;
FIG. 2C is a schematic sequence diagram of a graphics processor resource management method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of allocating graphics processors to multiple application processes according to an embodiment of the present application;
FIG. 4A is a schematic diagram of two resource usage situations according to an embodiment of the present application;
FIG. 4B is a schematic diagram of the resource allocation proportions of multiple application processes according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a collection queue and a processing queue according to an embodiment of the present application;
FIG. 6 is an example schematic diagram of determining a first increment of the amount of used resources of an application process in the current predetermined resources according to an embodiment of the present application;
FIG. 7 is a schematic diagram of obtaining and processing the first increment according to an embodiment of the present application;
FIG. 8A is a schematic diagram of a double-buffering method between the CPU and the GPU when determining the first increment according to an embodiment of the present application;
FIG. 8B is a schematic diagram of an adaptive buffering method between the CPU and the GPU when determining the first increment according to an embodiment of the present application;
FIG. 9A is a schematic diagram of a graphics processor resource management method according to an embodiment of the present application;
FIG. 9B is a schematic diagram of the scheduling logic of a graphics processor resource management method according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a graphics processor resource management apparatus according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a graphics processor resource management device according to an embodiment of the present application;
FIG. 12 is a schematic diagram of the architecture of an exemplary computing device according to an embodiment of the present application; and
FIG. 13 is a schematic diagram of a storage medium according to an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, example embodiments according to the present application are described in detail below with reference to the accompanying drawings. Apparently, the described embodiments are only some rather than all of the embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described here.
In the specification and drawings, substantially identical or similar steps and elements are denoted by the same or similar reference numerals, and repeated descriptions of these steps and elements are omitted. In the description of the present application, the terms "first", "second", and so on are used only to distinguish the descriptions and shall not be understood as indicating or implying relative importance or ordering. The term "multiple" may be understood as at least two.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by a person skilled in the art to which the present application belongs. The terms used herein are only for the purpose of describing the embodiments of the present invention and are not intended to limit the present invention.
The graphics processor resource management method of the present application may be based on cloud technology.
The graphics processor resource management method of the present application may be based on cloud gaming.
FIG. 1 is an example schematic diagram of a scenario in which multiple application processes use GPU resources according to an embodiment of the present application.
At present, many mobile applications and computer programs need a network to realize their functions, and this is especially true for game applications. The network may be the Internet of Things based on the Internet and/or a telecommunication network; it may be a wired or wireless network, for example, an electronic network capable of information exchange such as a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), or a cellular data communication network. As shown in FIG. 1, a mobile application or computer program on a user terminal can send a control command input by the user to a server to start its corresponding application process. The server may have various hardware computing resources, for example, central processing units, communication interfaces, and memories. Taking the GPU resources shown in FIG. 1 as an example, there are multiple GPUs on the server (for example, GPU-1 and GPU-2), and each of these GPUs may perform relevant computation for multiple different application processes.
Optionally, the server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The user terminal may be, but is not limited to, a smartphone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, or a smart watch. The user terminal and the server may be directly or indirectly connected in a wired or wireless manner, which is not limited in the present application. For example, in the embodiments of the present application, a networked game application usually relies on the GPU of a cloud server to compose the game frames to be displayed on the user terminal or to perform hardware encoding; such game applications are also called cloud games (also known as "gaming on demand"). The user terminal transmits the data of the user's game operations to the cloud server through a control stream, and the cloud server transmits one or more audio frames and video frames to the user terminal through a data stream.
In cloud gaming, games are stored, synchronized, and rendered on remote cloud servers and delivered to players using streaming technology, which is an online game service completely different from previous types. Specifically, the cloud server runs the game, renders and encodes its graphics output into video, and then streams the video to a network client; the client decodes and displays the video stream for the player to interact with the game, and sends the control commands input by the player to the cloud server. In this way, cloud gaming shifts the computing load of the game from the client to the cloud, removing the constraints on the player's device; in addition, cloud gaming allows the player to start playing immediately without spending time downloading and installing a game client. Because of these advantages, cloud gaming has attracted great attention from academia and industry.
As described above, the cloud server is responsible for interpreting player input, executing the game code, and rendering the graphics, and transmits the game scene to the client over the network, while the client is responsible for decoding and displaying the game scene to the player, and for capturing and sending the player's game operations in real time as input to the cloud server. When the cloud server performs graphics rendering with the GPU, GPU hardware computing resources are limited, and in a cloud server providing cloud games, the multiple virtual entities hosted on the cloud server may share the server's hardware computing resources, which gives rise to competition for those resources. Usually, GPU hardware rendering processes the rendering requests of all game processes on it on a first-come-first-rendered basis, and an increase in the rendering load of a single game process affects the normal rendering of the other game processes. For example, when any game process's rendering times out, the other game processes are forced to shorten their rendering time, resulting in low rendering quality, and this deterioration in rendering quality may accumulate gradually as rendering proceeds, seriously affecting the user's gaming experience.
Current GPU resource management methods either completely isolate resource allocation or allow unconstrained resource preemption, and they do not isolate GPU memory or the bandwidth of the peripheral component interconnect express (PCI-E) bus. This may result in the GPU occupancy being very high at a certain moment while the PCI-E bus is idle, so that the overall resource utilization is low. Technologies such as GPU virtualization can divide GPU resources only at a granularity of one half or one quarter of a GPU and cannot perform finer-grained division; moreover, such resource management methods rely on pre-configuration for specific application processes and cannot allocate GPU computing resources flexibly and in real time as application processes enter and exit.
On this basis, the present application provides a graphics processor resource management method that determines, in real time, the processing order of task processes according to the resource requirements of each task process and the resource consumption it has already incurred, thereby achieving efficient allocation of graphics processor resources.
Compared with conventional graphics processor resource management methods, the method provided by the embodiments of the present application uses the past resource consumption of the application processes running on a graphics processor as a reference for resource allocation, and adjusts the resource allocation in real time according to the amount of resources each application process can actually use, thereby avoiding resource competition among multiple application processes.
For multiple application processes running simultaneously on the same graphics processor, the method provided by the embodiments of the present application considers the remaining available resources of these application processes in historical resource allocations, and determines a resource allocation scheme in real time based on the amount of the graphics processor's resources that these application processes can currently use, thereby achieving efficient allocation of graphics processor resources. The method of the embodiments of the present application can reasonably allocate graphics processor resources according to the resource requirements of each application process, avoids the effects of competition among multiple application processes, and improves the usage rate of graphics processor resources.
FIG. 2A is a flowchart of a graphics processor resource management method 200 according to an embodiment of the present application. FIG. 2B is a schematic flow block diagram of the graphics processor resource management method according to an embodiment of the present application. FIG. 2C is a schematic sequence diagram of the graphics processor resource management method according to an embodiment of the present application.
As shown in FIG. 2A, in step 201, multiple graphics processors for processing application processes may be determined.
Optionally, the resource management of GPUs may be performed cooperatively by the two parts shown in FIG. 2C, the application process and the scheduling service, where the application process part corresponds to the operations performed by an application process (for example, a game instance) during GPU resource management, and the scheduling service part corresponds to the GPU resource management and scheduling operations for the corresponding application process. Optionally, the application process may be any of various application processes such as a game process, a video process, or a conferencing process; in the present application, a game process developed with a graphics engine such as OpenGL (Open Graphics Library) is described as an example rather than a limitation, and any process that requires GPU resource scheduling is applicable to the GPU resource management method of the present application.
As shown in FIG. 2C, before the application processes are started, the allocation thread in the scheduling service may first determine the multiple GPUs currently available for processing application processes and create a corresponding processing thread for each of these GPUs in the scheduling service; subsequent resource allocation management for each GPU may then be performed on its corresponding processing thread. For example, for the graphics rendering task of an application process such as a game instance, the determined GPU should be capable of performing the required rendering computation. It should be understood that, although much of the following describes the graphics rendering of game instances in a cloud gaming scenario, the GPU resource management method of the present application is equally applicable to the processing of other application processes, for example video processes and conferencing processes, where the image rendering of video frames or of conference frames may be processed.
In step 202, multiple application processes to be processed may be obtained, and one of the multiple graphics processors may be allocated to each of the multiple application processes.
As shown in FIG. 2C, after each user terminal starts an application, the server side may register a corresponding application process for it, including but not limited to allocating a GPU for its graphics rendering processing, and may return the registration information, such as the index of the allocated GPU and its corresponding processing thread, to the application process.
Optionally, when the GPU resource requirement of an application process exceeds the available resources of each determined GPU, the application process's requests for GPU resources may also be forwarded to different GPUs for execution, so that multiple GPUs can be virtualized as one GPU to handle the application process in this case.
Optionally, provided that the GPU computing performance limits are satisfied, more than one application process may be allocated to the same GPU, that is, the processing tasks of more than one application process may be executed on the same GPU at the same time.
FIG. 3 is a schematic diagram of allocating graphics processors to multiple application processes according to an embodiment of the present application. As shown in FIG. 3, with three application processes A, B, and C and three available GPUs 1, 2, and 3, after the GPU allocation operation in step 202, application processes A and C are allocated to GPU 1 for processing, application process B is allocated to GPU 2 for processing, and GPU 3 may not be allocated to process any application process. Therefore, application processes A and C, which run together on GPU 1, share the computing resources of GPU 1, so their resource usage may also be competitive; the GPU resource management method of the present application can avoid the effects of competition among these application processes.
According to the embodiments of the present application, each of the multiple application processes may have a predetermined resource-demand weight. Optionally, the resource-demand weight of each application process may be determined based on the amount of GPU resources its computation requires; the weight may be predetermined and made known to the scheduling service at registration. For example, for multiple game instances, the corresponding resource-demand weights may be predetermined based on the complexity and computation amount of their frame rendering; the weight may be the proportion of a unit of GPU resources required by the frame rendering of the game instance. For example, if the frame rendering of a game instance requires 200 ms out of 1 s of GPU hardware computing time, the resource-demand weight of the game instance may be 0.2 (200 ms / 1 s).
According to the embodiments of the present application, allocating one of the multiple graphics processors to each of the multiple application processes in step 202 may include: determining the available resource ratio of each of the multiple graphics processors, the available resource ratio being the proportion of the graphics processor's resources available for processing application processes; and determining the graphics processor allocated to each application process based on the resource-demand weight of each of the multiple application processes and the available resource ratio of each of the multiple graphics processors.
As described above, each of the multiple GPUs is a GPU currently available for processing application processes, but the amounts of available computing resources on these GPUs are not necessarily equal and are not necessarily equal to their total computing resources. Therefore, before a GPU is allocated to an application process with a specific resource requirement, the amount of available resources of the available GPUs needs to be determined. Similar to the above description of the resource-demand weight, the amount of available resources of a GPU may be expressed by its available resource ratio, which may represent the proportion of a unit of the GPU's resources that is available for processing application processes. For example, if 0.8 s out of 1 s of GPU hardware computing time is available for processing application processes, the available resource ratio of the GPU may be 0.8.
Therefore, the GPU allocation for application processes may be jointly determined based on the resource-demand weight of each application process and the available resource ratio of each GPU. According to the embodiments of the present application, the sum of the resource-demand weights of the at least one application process allocated to one graphics processor is not greater than the available resource ratio of the graphics processor. That is, for one GPU, the sum of the resource-demand weights of the application processes allocated to it for processing must not be greater than the GPU's available resource ratio; the amount of the GPU's resources actually used for processing application processes must not be greater than the expected amount.
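The allocation constraint described above (the sum of the demand weights placed on a GPU must not exceed that GPU's available resource ratio) can be illustrated with a minimal sketch. This is illustrative only; the function name, the first-fit strategy, and the heaviest-first ordering are assumptions for the sketch and not part of the claimed method:

```python
def assign_gpus(process_weights, gpu_available_ratios):
    """First-fit assignment: place each process on the first GPU whose
    remaining available-resource ratio can still cover the process's
    resource-demand weight, so the weight sum never exceeds the ratio."""
    remaining = dict(gpu_available_ratios)  # gpu_id -> unused ratio
    assignment = {}
    # Placing heavier processes first tends to reduce fragmentation.
    for pid, w in sorted(process_weights.items(), key=lambda kv: -kv[1]):
        for gid, free in remaining.items():
            if w <= free + 1e-9:            # constraint: sum of weights <= ratio
                assignment[pid] = gid
                remaining[gid] = free - w
                break
        else:
            raise RuntimeError(f"no GPU can host process {pid}")
    return assignment

assignment = assign_gpus({"A": 0.5, "B": 0.2, "C": 0.3},
                         {"gpu1": 0.8, "gpu2": 0.8})
# A and C land on gpu1 (0.5 + 0.3 <= 0.8), B on gpu2
```

With the weights of FIG. 3's processes this reproduces the co-location of two processes on one GPU while the constraint holds for every GPU.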
In step 203, for each application process of at least one application process allocated to one graphics processor, the amount of remaining available resources of the application process in the current predetermined resources of the graphics processor may be determined, and the amount of remaining available resources may be related to the amount of remaining available resources of the application process in the historical predetermined resources of the graphics processor. According to the embodiments of the present application, the amounts of resources contained in the historical predetermined resources and in the current predetermined resources of the graphics processor may both be a predetermined amount of resources.
Optionally, the amount of resources contained in each of the historical and current predetermined resources (that is, the predetermined amount of resources) may be the unit of GPU resources described above, and the historical predetermined resources are the resources predetermined for the application process before the current predetermined resources. The usage of a unit of GPU resources can be used to determine the GPU's usage rate (the ratio of the GPU's actual working time to its running time), so the GPU resource allocation in the present application may be based on the allocation of units of GPU resources. For example, the unit of GPU resources may be a unit length of computing time, such as the 1 s or one-frame time described above, which is not limited in the present application. According to the embodiments of the present application, the resource-demand weight of each of the multiple application processes may indicate the proportion of the predetermined amount of resources required by the application process. As described above, the resource-demand weight of an application process may be the proportion of a unit of GPU resources required for processing the application process.
In the embodiments of the present application, the current resource allocation may be determined based on the historical state of the resource allocation to the application process. The historical state may include the amount of resources remaining for the application process in historical resource allocations, that is, the amount of resources it could have used but did not use. Such remaining available resources may exist because the processing of other application processes timed out and occupied the application process's resources; such occupation may degrade the application process's rendering quality. Therefore, to avoid the accumulation of such resource occupation, corresponding adjustments may be made in subsequent resource allocations, for example by correlating the amount of resources available to an application process in the current resource allocation with its amount of remaining available resources in historical resource allocations.
According to the embodiments of the present application, determining the amount of remaining available resources of the application process in the current predetermined resources in step 203 may include: determining the amount of remaining available resources of the application process in the current predetermined resources based on the amount of remaining available resources of the application process in the historical predetermined resources, the amount of used resources of the application process in the current predetermined resources, and the resource-demand weight of the application process.
Optionally, the total amount of resources available to the application process in the current predetermined resources may be determined based on the application process's resource-demand weight and its amount of remaining available resources in the historical predetermined resources; on this basis, its amount of remaining available resources in the current predetermined resources may be determined from its amount of used resources in the current predetermined resources. That is, subtracting the amount of used resources in the current predetermined resources from the total amount of available resources in the current predetermined resources yields the amount of remaining available resources in the current predetermined resources.
By determining the application process's amount of remaining available resources in the current predetermined resources based on its amount of remaining available resources in the historical predetermined resources, the current resource allocation can be adjusted based on the error in historical resource allocations, so as to reduce or even eliminate the resource allocation error. For example, when the application process's amount of remaining available resources in the historical predetermined resources is less than zero, subtracting the absolute value of that amount from the application process's amount of remaining available resources in the current predetermined resources can effectively reduce the resource allocation error.
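The computation described in the two paragraphs above can be sketched in a few lines. This is a minimal reading of the text, assuming the total available amount is the weighted share of the unit resource corrected by the (possibly negative) historical leftover; the function and parameter names are illustrative:

```python
def remaining_available(weight, unit_resource, historical_remaining, used_current):
    """Remaining available resources of a process in the current predetermined
    resources: the share implied by the resource-demand weight, corrected by
    the leftover carried over from the historical predetermined resources,
    minus what has already been used in the current predetermined resources."""
    total_available = weight * unit_resource + historical_remaining
    return total_available - used_current

# A process with weight 0.5 over a 1000 ms unit, which over-used 80 ms in the
# historical cycle (leftover -80) and has used 300 ms so far this cycle:
left = remaining_available(0.5, 1000.0, -80.0, 300.0)  # 120.0 ms still available
```

A negative `historical_remaining` shrinks the current budget, which is exactly the error-reduction behavior described above.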
Therefore, according to the embodiments of the present application, the graphics processor resource management method 200 may further include: for each application process of the at least one application process, obtaining the amount of remaining available resources of the application process in the historical predetermined resources, and determining the amount of used resources of the application process in the current predetermined resources.
Optionally, the amount of used resources of an application process in the current predetermined resources may include the amount of resources the application process used in the previous resource allocation and the amounts used in earlier allocations of the current predetermined resources, where the amount used in the previous resource allocation corresponds to the computing resources used by its most recent processing.
Therefore, determining the amount of used resources of the application process in the current predetermined resources may include determining a first increment of the amount of used resources of the application process in the current predetermined resources, and the first increment may correspond to the previous processing, by the graphics processor corresponding to the application process, of a processing task from the application process.
Optionally, for the rendering task of a game instance, the first increment may correspond to the time the GPU spent on the previous rendering of the game instance. As shown in FIG. 2C, this rendering time may be obtained by the application process and reported to the scheduling service through the scheduling library configured in the application process, so that the scheduling service processes the rendering time, including determining the application process's amount of remaining available resources in the current predetermined resources.
As described above, by determining the amount of remaining available resources of each application process on the same GPU in the current predetermined resources, the scheduling service can jointly determine the current resource allocation scheme and notify each application process whether to render; in the meantime the application process waits for the rendering notification from the scheduling service, as shown in FIG. 2C. Therefore, obtaining the amount of remaining available resources of each application process, and in particular obtaining the first increment of each application process, is very important for the GPU resource management of the present application and can effectively reduce allocation errors. For how the first increment is obtained, see the descriptions of FIG. 6 and FIG. 7 below, which are not detailed here.
In step 204, a resource allocation command for each of the at least one application process may be determined based on the amount of remaining available resources of each of the at least one application process in the current predetermined resources, the resource allocation command indicating whether to process the application process.
According to the embodiments of the present application, step 204 may include: for each application process of the at least one application process, when the application process's amount of remaining available resources in the current predetermined resources is not greater than zero, determining that the resource allocation command for the application process indicates not to process the application process; and for the other application processes of the at least one application process whose amount of remaining available resources is greater than zero, determining the resource allocation command for each application process based on the priority of each of the other application processes.
Optionally, as described above, an application process whose amount of remaining available resources in the current predetermined resources is not greater than zero may no longer be allocated resources in the current predetermined resources, so as not to affect the processing of other application processes. For application processes that still have an amount of remaining available resources greater than zero in the current predetermined resources, continuing to allocate GPU computing resources to them may be considered. Optionally, the resource allocation for these application processes may be based on their priorities, rather than only on the first-come-first-processed competition mode described earlier.
According to the embodiments of the present application, the priority of each of the other application processes may be related to the length of time the application process has waited to be processed and to the chronological order in which its latest first increment was determined. For example, a pending application process whose frames are about to stutter may be given a higher priority so that it is processed first; when there is no such urgency, the priorities may be determined based on the chronological order in which the first increments of these application processes were obtained, for example, an application process whose first increment was obtained earlier and whose amount of remaining available resources in the current predetermined resources is greater than zero may be processed first.
According to the embodiments of the present application, determining the resource allocation command for each application process based on the priority of each of the other application processes may include: for each application process of the other application processes, when there is an application process whose waiting time satisfies a predetermined condition, determining the resource allocation command for the application process based on the chronological order in which its latest first increment was determined; and when there is no application process whose waiting time satisfies the predetermined condition, determining the resource allocation command for each application process based on the chronological order in which the latest first increment of each of the other application processes was determined.
As described above, the priority setting for application processes may consider (including but not limited to) the following factors: (1) the amount of remaining available resources: a low priority (indicating non-execution) may be set for an application process whose amount of remaining available resources in the current predetermined resources is not greater than zero; (2) urgency: a high priority may be assigned to an application process that urgently needs processing, for example when the displayed frames are about to stutter or another emergency occurs; and (3) the order in which the first increments were obtained: a higher priority may be assigned to an application process whose first increment was obtained earlier, making the GPU resource allocation more continuous and thereby improving the usage rate of GPU resources. It should be understood that the method of the present application may also consider various other factors when setting the processing priorities of application processes; the factors listed above serve only as examples and not as limitations.
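The three priority factors listed above can be combined into a small ordering sketch. This is one plausible reading of the description, assuming urgent processes are served longest-waiting first and the remainder are served in first-increment order; the process record fields and function name are illustrative only:

```python
def allocation_order(processes, urgency_threshold):
    """Order processes for resource allocation:
    1) drop processes with no remaining available resources (command: skip),
    2) if any process has waited at least the threshold, serve the urgent
       processes first (longest-waiting first),
    3) serve everything else in the order its latest first increment was seen.
    Each process: {'id', 'remaining', 'wait', 'increment_seq'}."""
    runnable = [p for p in processes if p["remaining"] > 0]
    urgent = [p for p in runnable if p["wait"] >= urgency_threshold]
    rest = [p for p in runnable if p["wait"] < urgency_threshold]
    if urgent:
        return (sorted(urgent, key=lambda p: -p["wait"])
                + sorted(rest, key=lambda p: p["increment_seq"]))
    return sorted(runnable, key=lambda p: p["increment_seq"])
```

With no urgent process, the order degenerates to the first-increment arrival order, which keeps GPU allocation continuous as described in factor (3).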
According to the embodiments of the present application, the resource allocation command indicates whether the corresponding application process is to send a processing task to the corresponding graphics processor for processing by the graphics processor, where the graphics processor's processing of the processing task corresponds to the application process's use of the graphics processor's resources.
As described above, the execution of the above operation steps can be performed in an event-triggered manner, without the instruction-stream system conventionally required to redirect the rendering instructions of all application processes to a scheduling process for rendering, thereby improving development efficiency and reducing cost.
As shown in FIG. 2B, the GPU resource management of the present application controls the sending of processing tasks by each application process by evaluating each application process's usage of the GPU hardware resources. Taking a graphics rendering task as an example, as in FIG. 2B, the GPU resource management may control each application process's delivery of rendering instructions through resource allocation commands.
Optionally, each application process may determine, according to the received resource allocation command, whether to send a processing task to the corresponding GPU (for example, delivering rendering instructions to the GPU rendering instruction queue in FIG. 2B). The GPU's execution of the processing task will consume a certain amount of GPU hardware resources, and this amount of computing resources may correspond to the application process's first increment referenced in the next resource allocation.
According to the embodiments of the present application, the resource allocation command may be used to cause the application process's amount of remaining available resources in the current predetermined resources to reach a preset target value. Optionally, the resource allocation commands may cause the error in the resource allocation to each application process to decrease. The preset target value is a preset value that the amount of remaining available resources needs to reach, and it may be set as required; for example, it may be set to zero so that the amount of remaining available resources approaches zero, saving resources.
FIG. 4A is a schematic diagram of two resource usage situations according to an embodiment of the present application. For example, when an application process's amount of remaining available resources is greater than zero (for example, the normal situation shown in FIG. 4A), the resource allocation command for the application process may cause it to send a processing task and thereby use a certain amount of computing resources, so that the positive amount of remaining available resources decreases; when an application process's amount of remaining available resources is less than zero (for example, the timeout situation shown in FIG. 4A), the application process can no longer use any resources in the current predetermined resources, so no more resources are allocated to it in the current predetermined resources and its amount of remaining available resources stops decreasing.
According to the embodiments of the present application, the resource allocation command is used to make the proportion of resources used by the application process in the current predetermined resources closer to the application process's resource-demand weight than the proportion of resources it used in the historical predetermined resources. That is, the resource allocation command is used to make a first error of the application process greater than a second error, where the first error is the error between the application process's resource-demand weight and the proportion of resources it used in the historical predetermined resources, and the second error is the error between the application process's resource-demand weight and the proportion of resources it used in the current predetermined resources.
FIG. 4B is a schematic diagram of the resource allocation proportions of multiple application processes according to an embodiment of the present application. As shown in FIG. 4B, the resource-demand weights of the three application processes A, B, and C are 0.5, 0.2, and 0.3, respectively. In the historical predetermined resources, application process A used an amount of resources (58%) exceeding its predetermined resource requirement, leaving the other two application processes B and C with insufficient resources (17% and 25%, respectively, both less than the resource proportions corresponding to their predetermined resource weights).
Therefore, according to the GPU resource management method of the present application, application process A's amount of remaining available resources in the historical predetermined resources is less than zero (for example, -0.08 in the example of FIG. 4B), and the allocation to it in the current predetermined resources should be reduced by the amount it over-used (that is, 0.08). Accordingly, in the resource allocation result for the current predetermined resources, through the resource allocation adjustment, application process A is allocated 42% of the available resources, while the other application processes B and C, whose resources were occupied, are compensated accordingly (that is, application process B's available amount is 20% + (20% - 17%) = 23%, and application process C's available amount is 30% + (30% - 25%) = 35%).
By causing the application process's amount of remaining available resources in the current predetermined resources to reach the preset target value as described above, the amount of resources the application process uses in the current predetermined resources can be made closer to the expected amount, that is, the resource requirement corresponding to its resource-demand weight. Therefore, on-demand resource allocation for application processes can be achieved, as can efficient utilization of GPU resources.
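The compensation arithmetic of the FIG. 4B example can be reproduced directly: each process's current share is its demanded share plus the (possibly negative) share left over from the historical cycle. The function name is illustrative only:

```python
def compensated_shares(weights, used_hist):
    """Current-cycle share per process: the demanded share (weight) corrected
    by the leftover carried over from the historical cycle, rounded to 2dp."""
    carry = {p: weights[p] - used_hist[p] for p in weights}  # A: -0.08, B: +0.03, C: +0.05
    return {p: round(weights[p] + carry[p], 2) for p in weights}

shares = compensated_shares({"A": 0.50, "B": 0.20, "C": 0.30},
                            {"A": 0.58, "B": 0.17, "C": 0.25})
# A over-used by 0.08, so it drops to 0.42; B and C are compensated to 0.23 and 0.35
```

This matches the 42% / 23% / 35% allocation shown for the current predetermined resources in FIG. 4B, and the three shares still sum to 1.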
Considering that the timed allocation of GPU hardware resources described above may require binding a precise timer to the processing thread to trigger time allocation, to reduce the complexity of the method, a collection queue and a processing queue may be created, used respectively for collecting render times and processing render times, to accomplish the "report render time" and "process render time" operations in FIG. 2C.
FIG. 5 is a schematic diagram of the collection queue and the processing queue according to an embodiment of the present application. As shown in FIG. 5, the collection queue sequentially inserts multiple new render-time data items through input events, while the processing queue performs render-time processing during this period; when its processing is finished, it may be swapped with the collection queue, so that the collection queue becomes the new processing queue and the processing queue becomes the new collection queue. Through this alternation of queues, timed allocation is achieved without a precise timer.
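The queue alternation above can be sketched as a pair of swapping buffers. This is a minimal illustration, assuming a lock-protected swap; the class and method names are not part of the described system:

```python
import threading

class RenderTimeQueues:
    """A collect queue receives newly reported render times while the process
    queue is being drained; swapping the two after each drain replaces the
    precise per-thread timer described above."""
    def __init__(self):
        self._collect = []
        self._process = []
        self._lock = threading.Lock()

    def report(self, render_time):
        """Called on a render-time input event (FIG. 2C: report render time)."""
        with self._lock:
            self._collect.append(render_time)

    def drain(self):
        """Called by the processing thread: swap roles, return the batch."""
        with self._lock:
            self._collect, self._process = self._process, self._collect
        batch, self._process = self._process, []
        return batch
```

Events reported while one batch is being processed accumulate in the collect queue and are picked up by the next `drain`, mirroring FIG. 5.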
During the collection of render times, the way the first increment is obtained may change depending on how the application process is rendered. According to the embodiments of the present application, the way of determining the first increment may be determined based on the manner in which the graphics processor processes the processing task, and the processing manner may include at least one of synchronous rendering or asynchronous rendering.
According to the embodiments of the present application, determining the first increment of the amount of used resources of the application process in the current predetermined resources may be performed by one of the following: estimating the first increment by marking the start and end of the previous processing; or obtaining the first increment from the graphics processor by using a query instruction.
Optionally, for the asynchronous rendering manner, the first increment may be determined by intercepting the GPU hardware queue with signals. In this case, according to the embodiments of the present application, the first increment of the amount of used resources of the application process in the current predetermined resources may be estimated by marking the start and end of the previous processing.
FIG. 6 is an example schematic diagram of determining the first increment of the amount of used resources of an application process in the current predetermined resources according to an embodiment of the present application.
Optionally, the actual time consumed by rendering may be determined by computing the execution time of the draw function, and this actual rendering time may be taken as the first increment of the amount of used resources in the current predetermined resources. As shown in FIG. 6, the execution time of the draw function may include a preparation time and an actual rendering time; during the preparation time, the draw function performs transmission over the PCI-E channel and prepares for rendering, including the execution of rendering instructions, while the actual rendering time lies in the execution of the draw instructions. Therefore, an instruction signal F may be inserted before and after the draw instructions, so that when signal F is triggered, the application process (or the scheduling service) is told to start or stop timing.
Optionally, for the synchronous rendering manner, the first increment may be determined locally in the application thread by querying. In this case, according to the embodiments of the present application, the first increment of the amount of used resources of the application process in the current predetermined resources may be obtained from the graphics processor by using a query instruction.
FIG. 7 is a schematic diagram of obtaining and processing the first increment according to an embodiment of the present application.
In the case of synchronous rendering, a query operation may be sent to the GPU, and the query operation can have the GPU determine the time between two designated query points. As shown in FIG. 7, "begin query" and "end query" are two designated query points inserted in the GPU to query the GPU time spent executing the draw functions of the rendering instructions between them, that is, the render time; the first inserted "end query" point discards its query because the insertion position of its corresponding "begin query" point cannot be known. The gray boxes in FIG. 7 represent performing the GPU resource management processing based on the obtained render time, that is, reporting the obtained render time to the scheduling service and then waiting for the scheduling service's rendering notification, and the scheduling service processing the render time and sending the rendering notification (that is, the resource allocation command) to the application process by way of event notification.
In the render-time query operation described above, because waiting for the query time is a synchronous operation on the central processing unit (CPU) side, the wait to obtain the render time may greatly affect the application's performance. Therefore, the GPU resource management method of the present application may use the time the CPU spends running rendering instructions to offset the time the GPU spends preparing the query results.
FIG. 8A is a schematic diagram of a double-buffering method between the CPU and the GPU when determining the first increment according to an embodiment of the present application.
To guarantee the GPU's disjoint query mechanism, in which the begin and end points of a query must be contained in disjoint queries, that is, multiple begin points cannot be inserted consecutively, this embodiment executes the interaction between the CPU and the GPU with the GPU delayed by one computation. As shown in FIG. 8A, the indices of the CPU and the GPU indicate their respective processing orders, where the processing of CPU 2 corresponds to the query time of GPU 1, and the processing of CPU 3 corresponds to the query time of GPU 2. This double-buffering approach effectively improves the CPU's processing performance while the CPU waits for the GPU query time. However, because the CPU's processing may be faster than the GPU's, double buffering may still not completely avoid CPU waiting; therefore, on the basis of the above double-buffering method, the GPU management method of the present application may further solve this problem with an adaptive buffering mechanism.
FIG. 8B is a schematic diagram of an adaptive buffering method between the CPU and the GPU when determining the first increment according to an embodiment of the present application.
In the adaptive buffering mechanism, double buffering may still be employed, but when the completion of the previous draw cannot be queried, the current query may be left open until it is certain that the previous draw has completed, and multiple draw calls may be made in the meantime.
Because the adaptive buffering method extends the number of draw calls within a query relative to the double-buffering method, expanding a query's single draw call into a random number of calls, the system's precision may decrease when multiple application processes are handled simultaneously. This is because there may be interleaved rendering among different application processes; the GPU merely reads the time at the current query point and cannot distinguish different application processes, making the query inaccurate.
Therefore, to better control the submission of the rendering instruction queue, the rendering instruction buffer may be forcibly flushed asynchronously when the GPU query ends; this produces a large number of fast draws, and such draws can obtain the GPU's render-time results quickly without the adaptive buffering method, but it also imposes a communication transmission burden. To balance system performance and precision, reducing the mutual influence among multiple application processes while ensuring communication efficiency, a Monte Carlo algorithm may be adopted that combines the basic double-buffering algorithm with the adaptive algorithm and obtains a better (as opposed to optimal) solution by probabilistic and statistical means. The degradation from the adaptive algorithm to the basic double-buffering algorithm may be limited by the number of draw calls or the system running time, where lower values of these parameters may reduce communication performance and higher values may increase the influence among multiple application processes; adopting the Monte Carlo algorithm therefore further ensures the reasonableness of the system.
FIG. 9A is a schematic diagram of a graphics processor resource management method 300 according to an embodiment of the present application. As shown in FIG. 9A, the graphics processor resource management method 300 may include two parts of operations performed by the scheduling process and the application processes, respectively. The graphics processor resource management method 300 may mainly include the following steps, where the numbering of each step corresponds to the reference numerals in FIG. 9A.
① Start a scheduling process, which may include an allocation thread and multiple processing threads.
As shown in FIG. 9A, the scheduling process may include one allocation thread and multiple processing threads, where the allocation thread may be used for allocating GPUs and application processes and for distributing messages, and the processing threads may be used for the render-time processing of the GPUs.
② Determine, through the allocation thread, multiple graphics processors available for processing application processes.
As described above, the multiple GPUs currently available for processing application processes may be determined, in preparation for the subsequent resource allocation management of each GPU.
③ Allocate one processing thread to each of the multiple graphics processors.
As shown in FIG. 9A, for the three currently determined available GPUs, the scheduling process may allocate three processing threads 1, 2, and 3 for the render-time processing of these three GPUs, respectively; the subsequent resource allocation of each GPU may thus be performed through the corresponding processing thread.
④ Start multiple application processes, each of which may include a scheduling library preconfigured by the scheduling process.
Optionally, the scheduling library may be shared among the multiple application processes, and the information exchange between an application process and the scheduling process may be realized through the scheduling library. After an application process starts, the scheduling library may send the application process's information to the scheduling process for registration; the registration operation of an application process is a synchronous operation. To reduce data backflow, after registration the application process may keep sending messages, and the scheduling process may notify the application process of its processing through shared events.
⑤ For each of the multiple application processes, allocate, through the application process's scheduling library and the allocation thread, one of the multiple graphics processors and the processing thread corresponding to that graphics processor to the application process.
As shown in FIG. 9A, GPUs may be allocated to three application processes for their graphics rendering processing; for example, application processes 1 and 2 are allocated to GPU 1 for graphics rendering, application process 3 is allocated to GPU 3 for graphics rendering, and GPU 2 is not allocated for the processing of these three application processes. After the registration of an application process is completed on the scheduling process, the corresponding registration information, such as the index of the allocated GPU and its corresponding processing thread, may be returned to each application process.
⑥ For each application process of at least one application process allocated to one graphics processor, determine, through the processing thread corresponding to the application process, the application process's amount of remaining available resources in the current predetermined resources of the graphics processor, so as to determine a resource allocation command for the application process, the resource allocation command indicating whether to process the application process.
The rendering processing operations of the scheduling service described above may be performed by the corresponding processing thread in the scheduling process; for example, processing thread 1 corresponding to GPU 1 may process the render times of application processes 1 and 2, including determining the amounts of remaining available resources of these two application processes in the current predetermined resources and determining the corresponding resource allocation commands based on those amounts. After receiving the corresponding resource allocation command, each application process determines, based on the resource allocation command, whether to send a processing task to the corresponding GPU for processing.
As described above, the way the first increment is obtained differs depending on how the GPU processes the processing task. For example, in the case of asynchronous rendering, the first increment may be estimated by the scheduling library directly from the GPU through signal interception; in the case of synchronous rendering, the first increment may be obtained by the scheduling library from the GPU through a query instruction, that is, returned along the solid line from the GPU back to the scheduling library.
According to the embodiments of the present application, the amount of remaining available resources is related to the application process's amount of remaining available resources in the historical predetermined resources of the graphics processor, and the resource allocation command is used to cause the application process's amount of remaining available resources in the current predetermined resources to reach a preset target value. By doing so, the amount of resources the application process uses in the current predetermined resources can be made closer to the expected amount, that is, the resource requirement corresponding to its resource-demand weight. Therefore, on-demand resource allocation for application processes can be achieved, as can efficient utilization of GPU resources.
FIG. 9B is a schematic diagram of the scheduling logic of a graphics processor resource management method according to an embodiment of the present application. Optionally, based on the current design of common graphics systems, the graphics processor resource management method of the present application may involve three runtime libraries: client injection (SchedulingClient), the scheduling service (SchedulingService), and the scheduling logic (Scheduling). The client injection and the scheduling service only handle function injection and external event communication; the core logic lies in the scheduling logic. As shown in FIG. 9B, the various functions in FIG. 9B (for example, ResourceScheduling and SchedulingProtocol) may be called to determine the resource allocation command for each application process by collecting and computing render times and computing, based on the render times, whether to render.
FIG. 10 is a schematic diagram of a graphics processor resource management apparatus 1000 according to an embodiment of the present application.
The graphics processor resource management apparatus 1000 may include a processor determining module 1001, a processor allocation module 1002, a remaining-resource determining module 1003, and a resource allocation module 1004.
According to the embodiments of the present application, the processor determining module 1001 may be configured to determine multiple graphics processors available for processing application processes.
Optionally, the processor determining module 1001 may perform the operations described above with respect to step 201.
Optionally, the application process may be any of various application processes such as a game process, a video process, or a conferencing process; correspondingly, for the graphics rendering task of an application process such as a game instance, the determined GPU should be a GPU capable of performing the required rendering computation.
The processor allocation module 1002 may be configured to obtain multiple application processes to be processed and allocate one of the multiple graphics processors to each of the multiple application processes.
Optionally, the processor allocation module 1002 may perform the operations described above with respect to step 202.
Optionally, after each user terminal starts an application, the server side may register a corresponding application process for it, including allocating a GPU for its graphics rendering processing. Provided that the GPU computing performance limits are satisfied, more than one application process may be allocated to the same GPU, that is, the processing tasks of more than one application process may be executed on the same GPU at the same time, but the load allocated to a GPU must not exceed that GPU's computing performance limit.
The remaining-resource determining module 1003 may be configured to, for each application process of at least one application process allocated to one graphics processor, determine the application process's amount of remaining available resources in the current predetermined resources of the graphics processor, the amount of remaining available resources being related to the application process's amount of remaining available resources in the historical predetermined resources of the graphics processor.
Optionally, the remaining-resource determining module 1003 may perform the operations described above with respect to step 203. By determining the application process's amount of remaining available resources in the current predetermined resources based on its amount of remaining available resources in the historical predetermined resources, the current resource allocation can be adjusted based on the error in historical resource allocations, so as to reduce or even eliminate the resource allocation error.
The resource allocation module 1004 may be configured to determine, based on the amount of remaining available resources of each of the at least one application process in the current predetermined resources, a resource allocation command for each of the at least one application process, the resource allocation command indicating whether to process the application process, where the resource allocation command is used to cause the application process's amount of remaining available resources in the current predetermined resources to reach a preset target value.
Optionally, the resource allocation module 1004 may perform the operations described above with respect to step 204. By causing the application process's amount of remaining available resources in the current predetermined resources to reach the preset target value as described above, the amount of resources the application process uses in the current predetermined resources can be made closer to the expected amount, that is, the resource requirement corresponding to its resource-demand weight. Therefore, on-demand resource allocation for application processes can be achieved, as can efficient utilization of GPU resources.
In an embodiment, each of the multiple application processes has a predetermined resource-demand weight, and the apparatus further includes:
a used-resource determining module, configured to, for each application process of the at least one application process, obtain the application process's amount of remaining available resources in the historical predetermined resources and determine the application process's amount of used resources in the current predetermined resources;
wherein the remaining-resource determining module is further configured to determine the application process's amount of remaining available resources in the current predetermined resources based on the application process's amount of remaining available resources in the historical predetermined resources, the application process's amount of used resources in the current predetermined resources, and the application process's resource-demand weight;
wherein the resource allocation command is used to make a first error of the application process greater than a second error, the first error being the error between the application process's resource-demand weight and the proportion of resources used by the application process in the historical predetermined resources, and the second error being the error between the application process's resource-demand weight and the proportion of resources used by the application process in the current predetermined resources.
In an embodiment, the resource allocation command indicates whether the corresponding application process is to send a processing task to the corresponding graphics processor for processing by the graphics processor, where the graphics processor's processing of the processing task corresponds to the application process's use of the graphics processor's resources;
wherein the used-resource determining module is further configured to determine a first increment of the application process's amount of used resources in the current predetermined resources, the first increment corresponding to the previous processing, by the graphics processor corresponding to the application process, of a processing task from the application process.
In an embodiment, the resource allocation module is further configured to: for each application process of the at least one application process, when the amount of remaining available resources in the current predetermined resources corresponding to the application process is not greater than zero, determine that the resource allocation command for the application process indicates not to process the application process; and for the other application processes of the at least one application process whose amount of remaining available resources is greater than zero, determine the resource allocation command for each application process based on the priority of each of the other application processes.
In an embodiment, the priority of each of the other application processes is related to the length of time the application process has waited to be processed and to the chronological order in which its latest first increment was determined;
the resource allocation module is further configured to: for each application process of the other application processes, when there is an application process whose waiting time satisfies a predetermined condition, determine the resource allocation command for the application process based on the chronological order in which its latest first increment was determined; and when there is no application process whose waiting time satisfies the predetermined condition, determine the resource allocation command for each application process based on the chronological order in which the latest first increment of each of the other application processes was determined.
According to yet another aspect of the present application, a graphics processor resource management device is also provided. FIG. 11 shows a schematic diagram of a graphics processor resource management device 2000 according to an embodiment of the present application.
As shown in FIG. 11, the graphics processor resource management device 2000 may include one or more processors 2010 and one or more memories 2020, where the memory 2020 stores computer-readable code which, when run by the one or more processors 2010, may perform the graphics processor resource management method described above.
The processor in the embodiments of the present application may be an integrated circuit chip with signal processing capability. The above-mentioned processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like, and may be of an x86 architecture or an ARM architecture.
In general, the various example embodiments of the present application may be implemented in hardware or special-purpose circuits, software, firmware, logic, or any combination thereof. Certain aspects may be implemented in hardware, while other aspects may be implemented in firmware or software executable by a controller, microprocessor, or other computing device. Where aspects of the embodiments of the present application are illustrated or described as block diagrams, flowcharts, or some other graphical representation, it will be understood that the blocks, apparatuses, systems, techniques, or methods described herein may be implemented, as non-limiting examples, in hardware, software, firmware, special-purpose circuits or logic, general-purpose hardware or controllers or other computing devices, or some combination thereof.
For example, the method or apparatus according to the embodiments of the present application may also be implemented by means of the architecture of the computing device 3000 shown in FIG. 12. As shown in FIG. 12, the computing device 3000 may include a bus 3010, one or more CPUs 3020, a read-only memory (ROM) 3030, a random access memory (RAM) 3040, a communication port 3050 connected to a network, input/output components 3060, a hard disk 3070, and the like. A storage device in the computing device 3000, such as the ROM 3030 or the hard disk 3070, may store various data or files used in the processing and/or communication of the graphics processor resource management method provided by the present application, as well as the program instructions executed by the CPU. The computing device 3000 may also include a user interface 3080. Of course, the architecture shown in FIG. 12 is only exemplary; when implementing different devices, one or more components of the computing device shown in FIG. 12 may be omitted as actually needed.
According to yet another aspect of the present application, a computer-readable storage medium is also provided. FIG. 13 shows a schematic diagram 4000 of a storage medium according to the present application.
As shown in FIG. 13, computer-readable instructions 4010 are stored on the computer storage medium 4020. When the computer-readable instructions 4010 are run by a processor, the graphics processor resource management method according to the embodiments of the present application described with reference to the above drawings may be performed. The computer-readable storage medium in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DR RAM). It should be noted that the memories of the methods described herein are intended to include, but not be limited to, these and any other suitable types of memory.
Embodiments of the present application also provide a computer program product or computer program, where the computer program product or computer program includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the graphics processor resource management method according to the embodiments of the present application.
Embodiments of the present application provide a graphics processor resource management method, apparatus, device, computer-readable storage medium, and computer program product.
Compared with conventional graphics processor resource management methods, the method provided by the embodiments of the present application uses the past resource consumption of the application processes running on a graphics processor as a reference for resource allocation, and adjusts the resource allocation in real time according to the amount of resources each application process can actually use, thereby avoiding resource competition among multiple application processes.
For multiple application processes running simultaneously on the same graphics processor, the method provided by the embodiments of the present application considers the remaining available resources of these application processes in historical resource allocations, and determines a resource allocation scheme in real time based on the amount of the graphics processor's resources that these application processes can currently use, thereby achieving efficient allocation of graphics processor resources. The method of the embodiments of the present application can reasonably allocate graphics processor resources according to the resource requirements of each application process, avoids the effects of competition among multiple application processes, and improves the usage rate of graphics processor resources.
It should be noted that the flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, program segment, or part of code that includes at least one executable instruction for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures; for example, two blocks shown in succession may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The example embodiments of the present application described in detail above are merely illustrative and not restrictive. A person skilled in the art should understand that various modifications and combinations may be made to these embodiments or their features without departing from the principles and spirit of the present application, and such modifications should fall within the scope of the present application.

Claims (18)

  1. A graphics processor resource management method, performed by a graphics processor resource management device, comprising:
    determining multiple graphics processors for processing application processes;
    obtaining multiple application processes to be processed, and allocating one of the multiple graphics processors to each of the multiple application processes;
    for each application process of at least one application process allocated to one graphics processor, determining an amount of remaining available resources of the application process in current predetermined resources of the graphics processor, the amount of remaining available resources being related to an amount of remaining available resources of the application process in historical predetermined resources of the graphics processor; and
    determining, based on the amount of remaining available resources of each of the at least one application process in the current predetermined resources, a resource allocation command for each of the at least one application process, the resource allocation command indicating whether to process the application process;
    wherein the resource allocation command is used to cause the amount of remaining available resources of the application process in the current predetermined resources to reach a preset target value.
  2. The method according to claim 1, wherein each of the multiple application processes has a predetermined resource-demand weight, and the method further comprises:
    for each application process of the at least one application process, obtaining the amount of remaining available resources of the application process in the historical predetermined resources, and determining an amount of used resources of the application process in the current predetermined resources;
    wherein determining the amount of remaining available resources of the application process in the current predetermined resources comprises:
    determining the amount of remaining available resources of the application process in the current predetermined resources based on the amount of remaining available resources of the application process in the historical predetermined resources, the amount of used resources of the application process in the current predetermined resources, and the resource-demand weight of the application process;
    wherein the resource allocation command is used to make a first error of the application process greater than a second error, the first error being the error between the resource-demand weight of the application process and the proportion of resources used by the application process in the historical predetermined resources, and the second error being the error between the resource-demand weight of the application process and the proportion of resources used by the application process in the current predetermined resources.
  3. The method according to claim 2, wherein the amounts of resources respectively contained in the historical predetermined resources and the current predetermined resources of a graphics processor are both a predetermined amount of resources;
    the resource-demand weight of each of the multiple application processes indicates the proportion of the predetermined amount of resources required by the application process;
    wherein allocating one of the multiple graphics processors to each of the multiple application processes comprises:
    determining an available resource ratio of each of the multiple graphics processors, the available resource ratio being the proportion of the graphics processor's resources available for processing application processes; and
    determining the graphics processor allocated to each application process based on the resource-demand weight of each of the multiple application processes and the available resource ratio of each of the multiple graphics processors;
    wherein the sum of the resource-demand weights of the at least one application process allocated to one graphics processor is not greater than the available resource ratio of the graphics processor.
  4. The method according to claim 2, wherein the resource allocation command indicates whether the corresponding application process is to send a processing task to the corresponding graphics processor for processing by the graphics processor, wherein the graphics processor's processing of the processing task corresponds to the application process's use of the graphics processor's resources;
    wherein determining the amount of used resources of the application process in the current predetermined resources comprises: determining a first increment of the amount of used resources of the application process in the current predetermined resources, the first increment corresponding to the previous processing, by the graphics processor corresponding to the application process, of a processing task from the application process.
  5. The method according to claim 4, wherein determining the first increment of the amount of used resources of the application process in the current predetermined resources comprises:
    determining the first increment of the amount of used resources of the application process in the current predetermined resources based on a manner in which the graphics processor processes the processing task, the processing manner comprising at least one of synchronous rendering or asynchronous rendering.
  6. The method according to claim 5, wherein determining the first increment of the amount of used resources of the application process in the current predetermined resources based on the manner in which the graphics processor processes the processing task comprises:
    when the graphics processor processes the processing task by asynchronous rendering, determining the first increment by marking the start and end of the previous processing.
  7. The method according to claim 5, wherein determining the first increment of the amount of used resources of the application process in the current predetermined resources based on the manner in which the graphics processor processes the processing task comprises:
    when the graphics processor processes the processing task by synchronous rendering, obtaining the first increment from the graphics processor by using a query instruction.
  8. 如权利要求4所述的方法,其中,所述基于所述至少一个应用进程中的每个应用进程在所述当前预定资源中的剩余可用资源量,确定对所述至少一个应用进程中的每个应用进程的资源分配命令包括:
    对于所述至少一个应用进程中的每个应用进程,在所述应用进程对应的当前预定资源中的剩余可用资源量不大于零的情况下,确定对所述应用进程的资源分配命令指示不对所述应用进程进行处理;以及
    对于所述至少一个应用进程中剩余可用资源量大于零的其他应用进程,基于所述其他应用进程中的每个应用进程的优先级确定对每个应用进程的资源分配命令。
  9. 如权利要求8所述的方法,其中,所述其他应用进程中的每个应用进程的优先级与所述应用进程等待被处理的时间长度以及确定其最新的第一增量的时间顺序相关;
    基于所述其他应用进程中的每个应用进程的优先级确定对每个应用进程的资源分配命令包括:
    对于所述其他应用进程中的每个应用进程,在存在等待被处理的时间长度满足预定条件的应用进程的情况下,基于确定所述应用进程的最新的第一增量的时间顺序来确定对所述应用进程的资源分配命令;以及
    在未存在等待被处理的时间长度满足预定条件的应用进程的情况下,基于确定所述其他应用进程中的每个应用进程的最新的第一增量的时间顺序来确定对每个应用进程的资源分配命令。
  10. A graphics processing unit resource management method, performed by a graphics processing unit resource management device, comprising:
    starting a scheduling process, the scheduling process comprising an allocation thread and a plurality of processing threads;
    determining, by the allocation thread, a plurality of graphics processing units for processing application processes, and allocating one processing thread to each of the plurality of graphics processing units;
    starting a plurality of application processes, each of the plurality of application processes comprising a scheduling library preconfigured by the scheduling process;
    for each of the plurality of application processes, allocating, by the scheduling library of the application process and the allocation thread, one graphics processing unit of the plurality of graphics processing units and the corresponding processing thread to the application process;
    for each application process of at least one application process allocated to one graphics processing unit, determining, by the processing thread corresponding to the application process, a remaining available resource amount of the application process in current predetermined resources of the graphics processing unit, so as to determine a resource allocation command for the application process, the resource allocation command indicating whether the application process is to be processed;
    wherein the remaining available resource amount is related to a remaining available resource amount of the application process in historical predetermined resources of the graphics processing unit, and the resource allocation command is used to cause the remaining available resource amount of the application process in the current predetermined resources to reach a preset target value.
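The thread topology of claim 10 — one scheduling process holding an allocation thread plus one processing thread per GPU — can be sketched as follows. The queue-based handoff and all names are illustrative assumptions, not the claimed implementation.

```python
import queue
import threading

def start_scheduler(gpu_ids):
    """Sketch of claim 10's structure (handoff mechanism assumed):
    spawn one processing thread per GPU; each thread drains its own
    inbox of application-process work items.  The body of the loop is a
    stand-in for the per-process quota check and allocation command."""
    inboxes = {g: queue.Queue() for g in gpu_ids}
    done = []  # records (gpu, process) pairs that were handled

    def processing_thread(gpu):
        while True:
            proc = inboxes[gpu].get()
            if proc is None:  # shutdown sentinel
                break
            done.append((gpu, proc))  # placeholder for the real work

    workers = [threading.Thread(target=processing_thread, args=(g,))
               for g in gpu_ids]
    for w in workers:
        w.start()
    return inboxes, workers, done
```

The allocation thread of the claim would correspond to whatever code routes each application process to one of the `inboxes`; here that routing is done directly by the caller.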
  11. A graphics processing unit resource management apparatus, comprising:
    a processor determination module, configured to determine a plurality of graphics processing units for processing application processes;
    a processor allocation module, configured to acquire a plurality of application processes to be processed, and to allocate one graphics processing unit of the plurality of graphics processing units to each of the plurality of application processes;
    a remaining resource determination module, configured to, for each application process of at least one application process allocated to one graphics processing unit, determine a remaining available resource amount of the application process in current predetermined resources of the graphics processing unit, the remaining available resource amount being related to a remaining available resource amount of the application process in historical predetermined resources of the graphics processing unit; and
    a resource allocation module, configured to determine, based on the remaining available resource amount of each application process of the at least one application process in the current predetermined resources, a resource allocation command for each application process of the at least one application process, the resource allocation command indicating whether the application process is to be processed;
    wherein the resource allocation command is used to cause the remaining available resource amount of the application process in the current predetermined resources to reach a preset target value.
  12. The apparatus according to claim 11, wherein each of the plurality of application processes has a predetermined resource demand weight, and the apparatus further comprises:
    a used resource determination module, configured to, for each application process of the at least one application process, acquire the remaining available resource amount of the application process in the historical predetermined resources, and determine a used resource amount of the application process in the current predetermined resources;
    wherein the remaining resource determination module is further configured to determine the remaining available resource amount of the application process in the current predetermined resources based on the remaining available resource amount of the application process in the historical predetermined resources, the used resource amount of the application process in the current predetermined resources, and the resource demand weight of the application process;
    wherein the resource allocation command is used to cause a first error of the application process to be greater than a second error, the first error being an error between the resource demand weight of the application process and a proportion of resources used by the application process in the historical predetermined resources, and the second error being an error between the resource demand weight of the application process and a proportion of resources used by the application process in the current predetermined resources.
  13. The apparatus according to claim 12, wherein the resource allocation command indicates whether the corresponding application process is to send a processing task to the corresponding graphics processing unit for processing by the graphics processing unit, wherein the processing of the processing task by the graphics processing unit corresponds to the application process's use of the resources of the graphics processing unit;
    wherein the used resource determination module is further configured to determine a first increment of the used resource amount of the application process in the current predetermined resources, the first increment corresponding to the previous processing, by the graphics processing unit corresponding to the application process, of a processing task from the application process.
  14. The apparatus according to claim 13, wherein the resource allocation module is further configured to: for each application process of the at least one application process, when the remaining available resource amount in the current predetermined resources corresponding to the application process is not greater than zero, determine that the resource allocation command for the application process indicates that the application process is not to be processed; and for other application processes of the at least one application process whose remaining available resource amounts are greater than zero, determine the resource allocation command for each application process based on a priority of each of the other application processes.
  15. The apparatus according to claim 14, wherein the priority of each of the other application processes is related to a length of time for which the application process has waited to be processed and to a chronological order in which its latest first increment was determined;
    the resource allocation module is further configured to: for each of the other application processes, when there is an application process whose waiting time satisfies a predetermined condition, determine the resource allocation command for the application process based on the chronological order in which the latest first increment of the application process was determined; and when there is no application process whose waiting time satisfies the predetermined condition, determine the resource allocation command for each application process based on the chronological order in which the latest first increment of each of the other application processes was determined.
  16. A graphics processing unit resource management device, comprising:
    one or more processors; and
    one or more memories storing a computer-executable program that, when executed by the processors, performs the method according to any one of claims 1-10.
  17. A computer-readable storage medium having computer-executable instructions stored thereon, the instructions, when executed by a processor, being used to implement the method according to any one of claims 1-10.
  18. A computer program product comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-10.
PCT/CN2022/132457 2022-02-14 2022-11-17 Graphics processing unit resource management method, apparatus, device, storage medium, and program product WO2023151340A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020247011896A KR20240052091A (ko) 2022-02-14 2022-11-17 Graphics processing unit resource management method, apparatus, device, storage medium, and program product
US18/215,018 US20230342207A1 (en) 2022-02-14 2023-06-27 Graphics processing unit resource management method, apparatus, and device, storage medium, and program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210135158.4A CN114490082A (zh) 2022-02-14 2022-02-14 Graphics processing unit resource management method, apparatus, device, and storage medium
CN202210135158.4 2022-02-14

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/215,018 Continuation US20230342207A1 (en) 2022-02-14 2023-06-27 Graphics processing unit resource management method, apparatus, and device, storage medium, and program product

Publications (1)

Publication Number Publication Date
WO2023151340A1 true WO2023151340A1 (zh) 2023-08-17

Family

ID=81479606

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/132457 WO2023151340A1 (zh) 2022-02-14 2022-11-17 Graphics processing unit resource management method, apparatus, device, storage medium, and program product

Country Status (4)

Country Link
US (1) US20230342207A1 (zh)
KR (1) KR20240052091A (zh)
CN (1) CN114490082A (zh)
WO (1) WO2023151340A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114490082A (zh) * 2022-02-14 2022-05-13 Tencent Technology (Shenzhen) Co., Ltd. Graphics processing unit resource management method, apparatus, device, and storage medium
CN115408305B (zh) * 2022-11-03 2022-12-23 Beijing Linzhuo Information Technology Co., Ltd. Graphics rendering mode detection method based on DMA redirection
CN116579914B (zh) * 2023-07-14 2023-12-12 Nanjing Lisuan Technology Co., Ltd. Graphics processing unit engine execution method and apparatus, electronic device, and storage medium
CN117314728B (zh) * 2023-11-29 2024-03-12 Shenzhen Colorful Yugong Technology Development Co., Ltd. GPU operation regulation and control method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160358305A1 (en) * 2015-06-07 2016-12-08 Apple Inc. Starvation free scheduling of prioritized workloads on the gpu
CN110362407A (zh) * 2019-07-19 2019-10-22 Industrial and Commercial Bank of China Limited Computing resource scheduling method and apparatus
CN111597042A (zh) * 2020-05-11 2020-08-28 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Service thread running method and apparatus, storage medium, and electronic device
CN112870726A (zh) * 2021-03-15 2021-06-01 Tencent Technology (Shenzhen) Co., Ltd. Resource allocation method and apparatus for graphics processing unit, and storage medium
CN113849312A (zh) * 2021-09-29 2021-12-28 Beijing Baidu Netcom Science and Technology Co., Ltd. Data processing task allocation method and apparatus, electronic device, and storage medium
CN114490082A (zh) * 2022-02-14 2022-05-13 Tencent Technology (Shenzhen) Co., Ltd. Graphics processing unit resource management method, apparatus, device, and storage medium


Also Published As

Publication number Publication date
CN114490082A (zh) 2022-05-13
KR20240052091A (ko) 2024-04-22
US20230342207A1 (en) 2023-10-26


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22925691; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2022925691; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 20247011896; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2022925691; Country of ref document: EP; Effective date: 20240328)