CN117215764A - Computing power resource processing method, device, equipment and storage medium


Info

Publication number: CN117215764A
Application number: CN202310681868.1A
Authority: CN (China)
Other languages: Chinese (zh)
Inventors: 查冲, 郑亚峰
Current Assignee / Original Assignee: Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Prior art keywords: idle, computing power, time period, power resource, resource

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The application provides a computing power resource processing method, apparatus, device and storage medium, applicable to scenarios such as cloud technology, artificial intelligence, maps, intelligent transportation, assisted driving and in-vehicle systems. The method includes: during a processing procedure of performing task processing on a target model based on a computing power device, determining an idle time period and a processing time period corresponding to the processing procedure; acquiring a first idle computing power resource that is not used for task processing of the target model within the processing time period, and a second idle computing power resource that is in an idle state within the idle time period; and sending the first idle computing power resource and the second idle computing power resource to a resource scheduling system, so that the resource scheduling system schedules and consumes them. The embodiments of the application can effectively reduce model processing cost and the consumption of system resources during model processing.

Description

Computing power resource processing method, device, equipment and storage medium
Technical Field
The application belongs to the field of computer technology, and in particular relates to a computing power resource processing method, apparatus, device and storage medium.
Background
With the development of artificial intelligence (AI) model training, demand for heterogeneous computing power such as graphics processing units (GPUs) has surged, especially in large-model processing scenarios (model training/inference), where thousands or even tens of thousands of GPU cards may be required. This greatly increases the cost of AI model processing and the system resources it consumes, and reduces AI model processing performance.
To reduce the cost of AI model processing, the related art generally slices GPU cards through virtualization and performs AI model processing on the sliced cards. However, virtualization is generally suited to training scenarios for small models and is not suitable for ever larger model training scenarios; moreover, virtualization cannot dynamically mine or dynamically schedule computing power resources, so heterogeneous computing power cannot be used to its full potential, and neither the cost of AI model processing nor its consumption of system resources can be effectively reduced.
Disclosure of Invention
In order to solve the technical problems, the application provides a method, a device, equipment and a storage medium for processing computing power resources.
In one aspect, the present application provides a computing power resource processing method, the method including:
during a processing procedure of performing task processing on a target model based on a computing power device, determining an idle time period and a processing time period corresponding to the processing procedure, where the idle time period is the part of a target time period other than the processing time period, and the target time period is the time period corresponding to operation of the computing power device;
acquiring a first idle computing power resource that is not used for task processing of the target model within the processing time period, and acquiring a second idle computing power resource that is in an idle state within the idle time period, where the first idle computing power resource and the second idle computing power resource are computing power resources corresponding to the computing power device;
and sending the first idle computing power resource and the second idle computing power resource to a resource scheduling system, so that the resource scheduling system schedules and consumes the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
In another aspect, the present application provides a computing power resource processing method, the method including:
receiving a first idle computing power resource and a second idle computing power resource sent by a terminal, where the first idle computing power resource is a computing power resource, acquired by the terminal, that is not used for task processing of a target model within a processing time period, and the second idle computing power resource is a computing power resource, acquired by the terminal, that is in an idle state within an idle time period; the processing time period is the time period corresponding to a processing procedure in which the terminal performs task processing on the target model based on a computing power device; the idle time period is the part of a target time period other than the processing time period; and the target time period is the time period corresponding to operation of the computing power device;
and scheduling and consuming the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
In another aspect, the present application provides a computing power resource processing apparatus, the apparatus including:
a time period determining module, configured to determine, during a processing procedure of performing task processing on a target model based on a computing power device, an idle time period and a processing time period corresponding to the processing procedure, where the idle time period is the part of a target time period other than the processing time period, and the target time period is the time period corresponding to operation of the computing power device;
a resource acquisition module, configured to acquire a first idle computing power resource that is not used for task processing of the target model within the processing time period, and acquire a second idle computing power resource that is in an idle state within the idle time period, where the first idle computing power resource and the second idle computing power resource are computing power resources corresponding to the computing power device;
and a sending module, configured to send the first idle computing power resource and the second idle computing power resource to a resource scheduling system, so that the resource scheduling system schedules and consumes the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
In another aspect, the present application provides a computing power resource processing apparatus, the apparatus including:
a receiving module, configured to receive a first idle computing power resource and a second idle computing power resource sent by a terminal, where the first idle computing power resource is a computing power resource, acquired by the terminal, that is not used for task processing of a target model within a processing time period, and the second idle computing power resource is a computing power resource, acquired by the terminal, that is in an idle state within an idle time period; the processing time period is the time period corresponding to a processing procedure in which the terminal performs task processing on the target model based on a computing power device; the idle time period is the part of a target time period other than the processing time period; and the target time period is the time period corresponding to operation of the computing power device;
and a scheduling consumption module, configured to schedule and consume the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
In another aspect, the present application provides an electronic device for computing power resource processing, the electronic device including a processor and a memory, where the memory stores at least one instruction or at least one program, and the at least one instruction or at least one program is loaded and executed by the processor to implement the computing power resource processing method described above.
In another aspect, the present application provides a computer-readable storage medium in which at least one instruction or at least one program is stored, the at least one instruction or at least one program being loaded and executed by a processor to implement the computing power resource processing method described above.
In another aspect, the application proposes a computer program product that, when executed by a processor, implements the computing power resource processing method described above.
According to the computing power resource processing method, apparatus, device and storage medium, during the processing procedure of performing task processing on the target model based on the computing power device, the processing time period corresponding to the processing procedure and the idle time period of the target time period outside the processing time period are determined; the first idle computing power resource that is not used for task processing of the target model within the processing time period and the second idle computing power resource that is in an idle state within the idle time period are acquired; and the terminal sends the first idle computing power resource and the second idle computing power resource to the resource scheduling system, which schedules and consumes them to perform task processing on other models. The embodiments of the application therefore provide a strategy for dynamically maximizing performance: a first idle computing power resource that is not used for task processing of the target model is dynamically mined from the processing time period (mining of computing power resources in the space dimension), a second idle computing power resource in an idle state is dynamically mined from the idle time period (mining of computing power resources in the time dimension), and the idle computing power resources mined in the space and time dimensions are dynamically scheduled and consumed by the resource scheduling system. Heterogeneous computing power is thus used to the greatest extent, which effectively reduces AI model processing cost and the consumption of system resources during AI model processing. Furthermore, the strategy of dynamically maximizing performance can also support computing power resource mining for large-model scenarios, adapting to ever larger model processing scenarios.
Drawings
To illustrate the technical solutions of the embodiments of the application, or of the prior art, more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the application, and other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram illustrating an implementation environment for a computing power resource processing method, according to an example embodiment.
FIG. 2 is a first flowchart of a computing power resource processing method according to an exemplary embodiment.
Fig. 3 is a schematic diagram illustrating one determination of idle periods according to an example embodiment.
FIG. 4 is a second flowchart of a computing power resource processing method according to an exemplary embodiment.
FIG. 5 is a flow diagram illustrating a process for scheduling computing resources according to an example embodiment.
FIG. 6 is a flow diagram illustrating a resource scheduling system scheduling computing resources according to an exemplary embodiment.
FIG. 7 is a third flowchart of a computing power resource processing method according to an exemplary embodiment.
FIG. 8 is a fourth flowchart of a computing power resource processing method according to an exemplary embodiment.
FIG. 9 is a block diagram illustrating a computing resource processing device according to an example embodiment.
FIG. 10 is a block diagram of another computing resource processing device, shown in accordance with an exemplary embodiment.
Fig. 11 is a block diagram of a hardware structure of a server according to an exemplary embodiment.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by a person skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.
It should be noted that the terms "first", "second" and the like in the description, the claims and the drawings of the embodiments of the present application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It is to be understood that the data so used may be interchanged where appropriate, so that the embodiments of the application described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprise", "include" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to the process, method, product or device.
In order to make the objects, technical solutions and advantages of the embodiments of the present application more apparent, the embodiments of the present application will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples, while indicating the embodiment of the application, are intended for purposes of illustration only and are not intended to limit the scope of the application.
The terms "first" and "second" are used below for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present embodiment, unless otherwise specified, the meaning of "plurality" is two or more.
AI is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can respond in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
AI is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing and machine learning/deep learning.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behaviour to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and teaching-based learning. In the embodiments of the application, the training/inference of the target model by the terminal, the consumption of computing power resources, and the execution of training/inference tasks of other models involve the machine learning technology in AI.
Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like that are applied under the cloud computing business model; it can form a resource pool that is used on demand and is flexible and convenient. Cloud computing technology will become an important support. The background services of technical network systems require large amounts of computing and storage resources, for example video websites, picture websites and other portal websites. With the rapid development and application of the internet industry, every item may in the future have its own identification mark that needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong backing systems, which can only be realized through cloud computing. Cloud technology includes cloud infrastructure technologies and cloud applications. The cloud infrastructure technologies may further include cloud computing, cloud storage, databases, big data and the like. Cloud applications may further include medical cloud, cloud internet of things, cloud security, public cloud, private cloud and the like.
Big data refers to a data set that cannot be captured, managed and processed with conventional software tools within an acceptable time range; it is a massive, high-growth-rate and diversified information asset that requires new processing modes to provide stronger decision-making, insight-discovery and process-optimization capabilities. With the advent of the cloud era, big data has attracted more and more attention; big data requires special techniques to effectively process large amounts of data within an acceptable elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the internet and scalable storage systems.
To facilitate a better understanding of the technical solutions of the embodiments of the present application, technical terms involved in the embodiments of the present application are first described:
Computing power resources: hardware that has certain general-purpose or special-purpose computing capabilities can be regarded as a computing power resource, for example GPU resources and central processing unit (CPU) resources.
Heterogeneous computing power: heterogeneous computing power refers to computing power implemented with different technologies, including different system architectures, different instruction sets, different technology types and different ways of providing computing power, such as the X86 architecture, the ARM architecture, CPUs, GPUs, data processing units (DPUs), computing chips implemented with field programmable gate arrays (FPGAs), dedicated hardware computing chips and so on. The X86 architecture is an instruction set executed by microprocessors, and the ARM architecture is a 32-bit reduced instruction set processor architecture.
An application scenario of the computing power resource processing method provided by the embodiments of the application may be as follows: with the development of AI technology, machine learning models are used more and more widely, and considerable computing power resources are required when processing (training/inference of) a machine learning model; for example, data processing of neural network models requires the use of heterogeneous computing power.
Another application scenario of the computing power resource processing method provided by the embodiments of the application may be as follows: when the video traffic of video production companies, live-streaming platforms and the like is high and video needs to be encoded and decoded in real time, computing power resources in a cloud server, such as GPU computing power resources, are obtained through the computing power resource processing method provided by the embodiments of the application. By configuring GPU computing power through the computing power resource processing method, model training can be completed conveniently and quickly with GPU computing power while the terminal (that is, the model training end) perceives no change in its business processes. The cloud server may be a node in a blockchain network.
For example, in a machine learning model training scenario, with the development of AI model processing, the amount of heterogeneous computing power, such as graphics processing unit (GPU) computing power, that is required has surged, especially in large-model processing scenarios, where thousands or even tens of thousands of GPU cards may be needed. This greatly increases the cost of AI model processing and the system resources it consumes, and reduces AI model processing performance. To reduce the cost of AI model processing, the related art generally slices GPU cards through virtualization and performs AI model processing on the sliced cards. However, virtualization is generally suited to training scenarios for small models and is not suitable for ever larger model processing scenarios; moreover, virtualization cannot dynamically mine or dynamically schedule computing power resources, so heterogeneous computing power cannot be used to its full potential, and neither the cost of AI model processing nor its consumption of system resources can be effectively reduced.
In view of the problems in the related art, the embodiments of the application provide a strategy for dynamically maximizing performance, which improves the dynamic availability of heterogeneous computing power and mines potential computing power resources. The embodiments of the application mine potential computing power from the angle of the computing power resources of heterogeneous computing power (the space dimension) and from the angle of card time (the time dimension). A heterogeneous card can support more than one kind of computing power resource, which can be classified into floating point computing power resources and integer computing power resources, and each kind has its own hardware computing library. Even if the floating point computing power resources are highly utilized, or even running at full load, integer computing power resources can still be mined for AI computation: floating point computing power is usually used to execute AI model training tasks, while integer computing power is used to process AI model inference requests. From the perspective of the computing power hardware of a heterogeneous card, there is therefore mineable computing space; and if other resources such as the CPU, memory or disk become a bottleneck, mining needs to be performed dynamically even when the mined heterogeneous computing power cannot be used. Mining computing power resources from card time is based on the fact that AI training or inference does not run at all times; an AI training scenario in particular contains stages such as loading training data and analysing the resulting model. On the time axis of these stages the heterogeneous computing power resources are idle, and resources such as the CPU and memory are released and idle, that is, from the perspective of card time there is dynamically mineable time and computing space. The AI training scenarios may include, but are not limited to, speech, text, medical and other scenarios.
FIG. 1 is a schematic diagram illustrating an implementation environment for a computing power resource processing method, according to an example embodiment. As shown in fig. 1, the implementation environment may at least include a terminal 01 and a resource scheduling system 02, where the terminal 01 and the resource scheduling system 02 may be directly or indirectly connected through a wired or wireless communication manner, and the embodiment of the present application is not limited herein.
Specifically, the terminal 01 is configured to determine, during a processing procedure of performing task processing on the target model based on the computing power device, an idle time period and a processing time period corresponding to the processing procedure; to acquire a first idle computing power resource that is not used for task processing of the target model within the processing time period and a second idle computing power resource that is in an idle state within the idle time period; and to send the first idle computing power resource and the second idle computing power resource to the resource scheduling system. Illustratively, the terminal 01 may be a resource consuming end (that is, a model training end), which may include, but is not limited to, a smartphone, a tablet computer, a notebook computer, a desktop computer, a smart speaker or a smartwatch. A computing power device, such as a GPU or a CPU, may be provided in the terminal 01.
Specifically, the resource scheduling system 02 may be configured to receive the first idle computing power resource and the second idle computing power resource sent by the terminal, and to schedule and consume them to perform task processing on other models. Illustratively, the resource scheduling system 02 may include a terminal device serving as a resource consuming end, a cloud server serving as a resource providing end, and nodes arranged between the resource consuming end and the resource providing end, which are connected in a wired or wireless manner. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, big data and artificial intelligence platforms.
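To make the interaction in FIG. 1 concrete, the following minimal Python sketch shows one possible shape of the report the terminal could send to the resource scheduling system; all class, field and method names here are illustrative assumptions, not an interface defined by the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IdleResourceReport:
    """Illustrative payload sent from terminal 01 to resource scheduling system 02."""
    device_id: str
    first_idle: List[str] = field(default_factory=list)   # mined in the space dimension
    second_idle: List[str] = field(default_factory=list)  # mined in the time dimension


class ResourceScheduler:
    """Sketch of the scheduler side: it consumes whatever idle resources are reported."""
    def schedule(self, report: IdleResourceReport) -> None:
        for resource in report.first_idle + report.second_idle:
            print(f"scheduling {resource} from {report.device_id} for other models")


# Example: a terminal reports INT8 capacity mined during training plus idle card time.
scheduler = ResourceScheduler()
scheduler.schedule(IdleResourceReport(device_id="gpu-0",
                                      first_idle=["INT8"],
                                      second_idle=["FP16", "INT8"]))
```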
It should be noted that FIG. 1 is only an example; other implementation environments may also be used in other scenarios.
FIG. 2 is a first flowchart of a computing power resource processing method according to an exemplary embodiment. The method may be used in the implementation environment of FIG. 1. This specification provides the method operation steps as described in the embodiments or flowcharts, but more or fewer operation steps may be included based on routine or non-inventive work. The order of steps recited in the embodiments is merely one of many execution orders and does not represent the only execution order. When an actual system or server product is implemented, the methods shown in the embodiments or figures may be executed sequentially or in parallel (for example, in a parallel processor or multithreaded environment). As shown in FIG. 2, the method may include:
S101: during a processing procedure in which the terminal performs task processing on the target model based on the computing power device, the terminal determines an idle time period and a processing time period corresponding to the processing procedure; the idle time period is the part of the target time period other than the processing time period; the target time period is the time period corresponding to operation of the computing power device.
Optionally, the computing power device may be any of various types of heterogeneous computing power, for example a GPU or a CPU, and may support at least one kind of computing power resource. Taking a GPU as an example, according to differences in type, value range, precision, applicable business scenario and the like, the at least one computing power resource corresponding to the GPU may include, but is not limited to: floating point computing power resources, integer computing power resources and codec computing power resources. The types, descriptions and applicable business scenarios of the different computing power resources are shown in Table 1.
TABLE 1 Types, descriptions and applicable business scenarios of different computing power resources

Computing power resource                 | Description             | Applicable business scenario
Floating point computing power resource  | FP16, FP32, FP64, etc.  | AI training scenarios
Integer computing power resource         | INT8                    | AI inference scenarios
Codec computing power resource           | Encode/Decode operators | Video rendering and cloud gaming scenarios
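As a minimal illustration (not part of the application), the correspondence in Table 1 could be encoded as follows; the enum and string names are assumptions chosen for readability.

```python
from enum import Enum


class ComputePowerResource(Enum):
    FLOATING_POINT = "FP16/FP32/FP64"  # floating point computing power resource
    INTEGER = "INT8"                   # integer computing power resource
    CODEC = "Encode/Decode"            # codec computing power resource


# Illustrative mapping from resource type to applicable business scenario (cf. Table 1).
APPLICABLE_SCENARIO = {
    ComputePowerResource.FLOATING_POINT: "AI training",
    ComputePowerResource.INTEGER: "AI inference",
    ComputePowerResource.CODEC: "video rendering / cloud gaming",
}

print(APPLICABLE_SCENARIO[ComputePowerResource.INTEGER])  # AI inference
```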
Optionally, the target model may be any type of model, which is not specifically limited here. The processing procedure of task processing may be a training procedure of the target model or an inference procedure of the target model.
Optionally, during the processing procedure in which the terminal performs task processing on the target model based on the computing power device, the terminal may acquire the target time period corresponding to operation of the computing power device. The target time period may be determined in a variety of ways, which are not specifically limited. For example, the target time period may be determined in basic time units such as hours, days, weeks or months; taking days as an example, the target time period is determined to be 24 hours, recorded as 24 card-hours.
Because the processing procedure of performing task processing on the target model based on the computing power device does not run at all times, and an AI training scenario in particular contains stages such as loading training data and analysing the resulting model, the heterogeneous computing power resources are idle on the time axis of those stages, and resources such as the CPU and memory are released and idle during that time; that is, from the perspective of card time there is dynamically mineable time and computing space. Based on this, after the target time period is determined, the part of the target time period other than the processing time period can be taken as the time period in which the computing power device is idle, that is, the idle time period, or idle card time.
S103: the terminal acquires a first idle computing power resource that is not used for task processing of the target model within the processing time period, and acquires a second idle computing power resource that is in an idle state within the idle time period.
Optionally, regarding mining of computing power resources in the time dimension: during idle card time the load of the whole machine is usually low and the CPU and memory are basically idle, so computing power resource mining during idle card time is exclusive mining. Taking the computing power device being a GPU as an example, during idle card time the terminal can mine the computing power resources of the computing power device that are in an idle state to obtain the second idle computing power resource: the floating point computing power resources and integer computing power resources of the GPU are idle during the idle time period and can be taken as the second idle computing power resource, that is, model training or inference tasks can be run during the idle time period.
Optionally, mining of heterogeneous computing power resources in the space dimension is mining of available computing power resources on the scale of a card: because the processing procedure in which the computing power device processes the target model may be a training procedure or an inference procedure, and different computing power resources suit different business scenarios, the terminal can, according to the processing procedure being executed for the target model, acquire the computing power resources whose applicable business scenario does not match that processing procedure, that is, the computing power resources that are not used for task processing of the target model within the processing time period, to obtain the first idle computing power resource.
S105, the terminal sends the first idle computing power resource and the second idle computing power resource to the resource scheduling system.
S107, the resource scheduling system schedules and consumes the first idle computing power resources and the second idle computing power resources so as to process tasks of other models.
Mining computing power resources in both the space dimension and the time dimension can unlock the maximum performance of heterogeneous computing power, but how the mined computing power resources are consumed is also an important factor affecting heterogeneous computing power performance. In the embodiments of the application, the mined space-dimension and time-dimension computing power resources can be dynamically sent to the resource scheduling system, and the resource scheduling system bridges AI training tasks and inference services to dynamically schedule and consume the mined computing power resources.
The other models are models, other than the target model, that the resource scheduling system can schedule. The type of task processing performed on the other models may be determined according to the type of the idle computing power resource. Referring to Table 1: if the idle computing power resource is a floating point computing power resource, the task processing performed on the other models is training; if it is an integer computing power resource, the task processing performed on the other models is inference; and if it is a codec computing power resource, the task processing performed on the other models is encoding and decoding.
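A hedged sketch of the dispatch rule described above; the function name and the string labels are assumptions made for illustration only.

```python
def task_for_idle_resource(resource_type: str) -> str:
    """Choose what the scheduler does with a mined idle resource (cf. Table 1)."""
    dispatch = {
        "floating_point": "train other models",
        "integer": "run inference for other models",
        "codec": "encode/decode for other models",
    }
    return dispatch.get(resource_type, "unknown resource type")


# Example: INT8 capacity mined during a training period serves inference requests.
assert task_for_idle_resource("integer") == "run inference for other models"
```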
According to the embodiments of the application, a first idle computing power resource that is not used for task processing of the target model can be dynamically mined from the processing time period (mining of computing power resources in the space dimension), and a second idle computing power resource in an idle state can be dynamically mined from the idle time period (mining of computing power resources in the time dimension); the idle computing power resources mined in the space and time dimensions are dynamically scheduled and consumed by the resource scheduling system, so that heterogeneous computing power is used to the greatest extent, effectively reducing AI model processing cost and the consumption of system resources during AI model processing. Furthermore, the strategy of dynamically maximizing performance can also support computing power resource mining for large-model scenarios, adapting to ever larger model processing scenarios.
In an alternative embodiment, in step S101, the terminal determining the idle time period may include:
when the processing time period is a training time period in which the target model is trained, the terminal obtains a non-training time period from the difference between the running time period and the training time period, and determines the non-training time period as the idle time period;
when the processing time period is an inference time period in which inference is performed on the target model, the terminal obtains a non-inference time period from the difference between the running time period and the inference time period, and determines the non-inference time period as the idle time period.
Because the processing procedure of performing task processing on the target model based on the computing power device does not run at all times, and stages such as loading training data and analysing the resulting model exist, the heterogeneous computing power resources are idle on the time axis of those stages, and resources such as the CPU and memory are released and idle during that time; that is, from the perspective of card time there is dynamically mineable time and computing space.
For the training procedure: because the training procedure of training the target model based on the computing power device does not run at all times, and stages such as loading training data and analysing the resulting model exist, the heterogeneous computing power resources are idle on the time axis of those stages and resources such as the CPU and memory are released and idle; that is, from the perspective of card time there is dynamically mineable time and computing space.
In one embodiment, if the processing time period is a training time period in which the target model is trained, the training time period during which the computing power device trains the target model is removed, and the remaining time (the difference between the running time period and the training time period) is the non-training time period, which is the idle time period. In another embodiment, weights corresponding to the running time period and the training time period can be set according to actual business requirements, and the difference between the product of the running time period and its weight and the product of the training time period and its weight is calculated to obtain the idle time period.
For the inference procedure: because the inference procedure of performing inference on the target model based on the computing power device does not run at all times, the heterogeneous computing power resources are idle on the non-running part of the time axis, and resources such as the CPU and memory are released and idle during that time; that is, from the perspective of card time there is dynamically mineable time and computing space.
In one embodiment, if the processing time period is an inference time period in which inference is performed on the target model, the inference time period during which the computing power device performs inference on the target model is removed, and the remaining time (the difference between the running time period and the inference time period) is the non-inference time period, which is the idle time period. In another embodiment, weights corresponding to the running time period and the inference time period can be set according to actual business requirements, and the difference between the product of the running time period and its weight and the product of the inference time period and its weight is calculated to obtain the idle time period.
Step S101 is described below taking the computing power device being a GPU as an example:
FIG. 3 is a schematic diagram of determining an idle time period according to an exemplary embodiment. As shown in FIG. 3, assume that the processing time period is a training time period in which the target model is trained, and that the GPU runs 24 hours a day, that is, the running time period of the GPU is recorded as 24 card-hours; then the time outside the training time period is the idle time period, that is, the difference between the 24 card-hours and the training time period is the idle time period.
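As a hedged numerical sketch of the 24-card-hour example (the optional weighting mentioned above is included as parameters; the function name and the figures are illustrative):

```python
def idle_card_hours(running_hours: float, processing_hours: float,
                    running_weight: float = 1.0, processing_weight: float = 1.0) -> float:
    """Idle time period = weighted running time period minus weighted processing time period."""
    return running_weight * running_hours - processing_weight * processing_hours


# A GPU runs 24 card-hours a day and spends 15 of them training the target model:
print(idle_card_hours(24, 15))            # 9.0 idle card-hours with default weights
print(idle_card_hours(24, 15, 1.0, 1.2))  # 6.0, a weighted variant per business requirements
```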
In this way, the time period in which the heterogeneous computing power resources are idle can be determined within the training time period in which the computing power device performs training, giving the idle time period corresponding to the training time period, or within the inference time period in which the computing power device performs inference tasks, giving the idle time period corresponding to the inference time period. The idle time period can therefore be accurately determined from the processing time period in which the computing power device processes the target model, avoiding the waste of system resources caused by idle time, using heterogeneous computing power to the greatest extent in the time dimension, and effectively reducing AI training/inference cost and the consumption of system resources by the AI training/inference process.
FIG. 4 is a second flowchart of a computing power resource processing method according to an exemplary embodiment. As shown in FIG. 4, in an optional embodiment, in step S103, the terminal acquiring the first idle computing power resource that is not used for task processing of the target model within the processing time period may include:
S1031: when the processing time period is a training time period in which the target model is trained, the terminal acquires a first idle integer computing power resource that is not used for training the target model within the training time period, and determines the first idle integer computing power resource as the first idle computing power resource; the first idle integer computing power resource is used to execute inference tasks of other models when it is scheduled and consumed.
S1033: when the processing time period is an inference time period in which inference is performed on the target model, the terminal acquires a first idle floating point computing power resource that is not used for inference on the target model within the inference time period, and determines the first idle floating point computing power resource as the first idle computing power resource; the first idle floating point computing power resource is used to execute training tasks of other models when it is scheduled and consumed.
Optionally, in S1031, if the processing time period is a training time period in which the target model is trained, then as shown in Table 1 the training time period mainly uses floating point computing power resources; even if the floating point computing power resources are highly utilized, or even running at full load, integer computing power resources can still be mined for AI to execute inference tasks of other models. The terminal can therefore acquire the first idle integer computing power resource that is not used for training the target model within the training time period as the first idle computing power resource, so that inference tasks of other models are executed when the first idle computing power resource is scheduled and consumed. In this way, the first idle integer computing power resource that does not train the target model within the training time period can be mined in the space dimension, ensuring that no idle operators appear during the training time period, so that heterogeneous computing power is used to the greatest extent in the space dimension, effectively reducing AI training/inference cost and the consumption of system resources by the training/inference process.
Step S1031 is described below taking the computing power device being a GPU as an example:
When a GPU card performs training during the training time period, the INT8 computing power used for inference and the codec operators are usually idle; that is, they do not train the target model during the training time period and can be mined to execute inference services or encoding/decoding for other models.
Optionally, in step S1033, if the processing time period is an inference time period in which inference is performed on the target model, then as shown in Table 1 the inference time period mainly uses integer computing power resources; even if the integer computing power resources are highly utilized, or even running at full load, floating point computing power resources can still be mined for AI to execute training tasks of other models. The terminal can therefore acquire the first idle floating point computing power resource that does not perform inference on the target model within the inference time period as the first idle computing power resource, so that training tasks of other models are executed when the first idle computing power resource is scheduled and consumed.
In this way, the first idle integer computing power resource that does not train the target model within the training time period, or the first idle floating point computing power resource that does not perform inference on the target model within the inference time period, can be mined in the space dimension, ensuring that no idle computing power resources appear during the training/inference time period, so that heterogeneous computing power is used to the greatest extent in the space dimension, effectively reducing AI training/inference cost and the consumption of system resources by the AI training/inference process.
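A minimal sketch of the complementary selection in S1031/S1033; the function and type names are assumptions, and a real implementation would also consider CPU, memory and other resource bottlenecks as noted earlier.

```python
from typing import List


def first_idle_resources(processing_period_type: str) -> List[str]:
    """Spatial-dimension mining: return the resource types the current period does not use."""
    if processing_period_type == "training":
        # Training mainly consumes floating point computing power, so INT8 (and codec)
        # capacity can be mined to serve inference or encoding/decoding of other models.
        return ["integer", "codec"]
    if processing_period_type == "inference":
        # Inference mainly consumes integer computing power, so floating point
        # capacity can be mined to serve training of other models.
        return ["floating_point"]
    return []


print(first_idle_resources("training"))   # ['integer', 'codec']
print(first_idle_resources("inference"))  # ['floating_point']
```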
In an optional embodiment, in step S103, acquiring the second idle computing power resource that is in an idle state within the idle time period may include:
when the processing time period is a training time period, the terminal acquires the computing power resources that are not consumed within the non-training time period to obtain the second idle computing power resource;
when the processing time period is an inference time period, the terminal acquires the computing power resources that are not consumed within the non-inference time period to obtain the second idle computing power resource;
the second idle computing power resource includes at least one of a second idle floating point computing power resource and a second idle integer computing power resource; the second idle integer computing power resource is used to execute inference tasks of other models when it is scheduled and consumed, and the second idle floating point computing power resource is used to execute training tasks of other models when it is scheduled and consumed.
In one embodiment, when the processing time period is a training time period, the load of the whole terminal is usually low and the CPU and memory are basically idle, so computing power resource mining during idle card time is exclusive mining; the floating point and integer operators of the GPU card are idle during the non-training time period, that is, training or inference tasks can be run during the idle time period. The terminal may acquire the computing power resources that are not consumed within the non-training time period, which include at least one of a second idle floating point computing power resource and a second idle integer computing power resource, to obtain the second idle computing power resource. The second idle integer computing power resource is used to execute inference tasks of other models when it is scheduled and consumed, and the second idle floating point computing power resource is used to execute training tasks of other models when it is scheduled and consumed.
In another embodiment, when the processing time period is an inference time period, the load of the whole terminal is usually low and the CPU and memory are basically idle, so computing power resource mining during idle card time is exclusive mining; the floating point and integer operators of the GPU card are idle during the non-inference time period, that is, training or inference tasks can be run during the idle time period. The terminal may acquire the computing power resources that are not consumed within the non-inference time period, which include at least one of a second idle floating point computing power resource and a second idle integer computing power resource, to obtain the second idle computing power resource. The second idle integer computing power resource is used to execute inference tasks of other models when it is scheduled and consumed, and the second idle floating point computing power resource is used to execute training tasks of other models when it is scheduled and consumed.
In this way, the second idle floating point computing power resources and/or second idle integer computing power resources that are in an idle state during the non-training time period, or during the non-inference time period, can be mined in the time dimension, ensuring that no idle card time remains unused within the target time period, so that heterogeneous computing power is used to the greatest extent in the time dimension, effectively reducing AI training/inference cost and the consumption of system resources by the AI training/inference process.
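The time-dimension mining can be pictured as offering every unconsumed resource inside the idle windows; the following is only a sketch with assumed names and hour-based windows.

```python
from typing import Dict, List, Tuple


def second_idle_resources(idle_windows: List[Tuple[int, int]]) -> Dict[Tuple[int, int], List[str]]:
    """During idle card time the machine is essentially exclusive, so both floating point
    and integer computing power (plus CPU/memory) can be offered for scheduling."""
    return {window: ["floating_point", "integer"] for window in idle_windows}


# Two idle windows (in hours) mined from a training day, purely illustrative:
print(second_idle_resources([(0, 3), (20, 24)]))
```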
FIG. 5 is a flowchart of scheduling processing of computing power resources according to an exemplary embodiment. As shown in FIG. 5, in an optional embodiment, after the terminal acquires the first idle computing power resource that is not used for task processing of the target model within the processing time period, the method may further include:
s201, the terminal sets the priority of the first idle computing power resource to be low.
S203, under the condition that the processing procedure for performing task processing on the target model meets the preset condition, the terminal performs scheduling processing on the first idle computing power resource.
In this embodiment, for the first free computing power resource mined from the spatial dimension, the terminal may set its priority to a low priority. The terminal monitors whether a processing process of performing task processing on the target model based on the processing equipment meets preset conditions in real time, wherein the preset conditions are conditions capable of affecting smooth performance of the processing process. If the processing process of performing task processing on the target model based on the processing equipment is monitored to meet the preset condition, the processing process may not be performed smoothly, efficiently and accurately, the terminal performs scheduling processing on the first idle computing power resource, so that the processing process can be performed smoothly, efficiently and accurately.
Alternatively, the preset condition may be a condition that an amount of resources required for the process is increased, that there is an increase in time consumption of the subtask in the process, or the like. For example, the amount of resources required by the processing procedure increases, and the existence of the mined computing resources occupies the memory of the processing device, so that the processing device cannot accommodate the resources required by the processing procedure, and therefore the mined low-priority computing resources can be adjusted away, so that the processing device can accommodate the resources required by the processing procedure. For example, the time consumption of the subtasks in the processing process is increased, the time consumption of the tasks is increased due to the existence of the mined computing resources, and therefore the mined low-priority computing resources can be adjusted away, so that the time consumption of the subtasks in the processing process is reduced.
Alternatively, the terminal may schedule the first idle computing resource in a plurality of manners. For example, the terminal may schedule the first free computing resource to the other device such that the first free computing resource mined from the spatial dimension is scheduled away from the processing device.
It should be noted that, the processing device may be configured in advance, so that, when a processing procedure of performing task processing on the target model meets a preset condition, it is allowed to perform scheduling processing on the first idle computing resource mined from the spatial dimension, so as to automatically complete the scheduling operation.
Therefore, when the processing procedure of performing task processing on the target model meets the preset condition, the first idle computing power resource can be scheduled away, so that the processing procedure proceeds smoothly, efficiently and accurately, the mining of computing power resources does not affect the processing procedure on the processing device, heterogeneous computing power is utilized to the maximum extent in the space dimension, and the AI processing cost and the consumption of system resources are further effectively reduced.
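As an illustration of the priority tagging and preemption just described, the following Python sketch shows one way a terminal could mark a space-dimension mined resource as low priority and decide when to schedule it away. The class and function names, the memory and time-consumption checks and the returned strings are illustrative assumptions rather than part of the embodiment.

from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    HIGH = 0
    LOW = 1


@dataclass
class MinedResource:
    # A first idle computing power resource mined from the space dimension.
    resource_id: str
    priority: Priority = Priority.LOW  # mined resources default to low priority


def preset_condition_met(required_gb: float, free_gb: float,
                         subtask_seconds: float, baseline_seconds: float) -> bool:
    # The high-priority procedure is affected if the resources it needs no longer
    # fit on the device, or if a subtask's time consumption grew past its baseline.
    return required_gb > free_gb or subtask_seconds > baseline_seconds


def maybe_schedule_away(resource: MinedResource, condition_met: bool) -> str:
    # Only low-priority (mined) resources are moved, and only when the preset
    # condition is met, so the original processing procedure can proceed.
    if condition_met and resource.priority is Priority.LOW:
        return "schedule " + resource.resource_id + " to another device"
    return "keep " + resource.resource_id + " on the processing device"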
In an optional embodiment, in the step S201, the setting, by the terminal, of the priority of the first idle computing power resource to a low priority may include:
under the condition that the processing time period is a training time period, the terminal sets the priority of the first idle integer computing power resource to a low priority; and under the condition that the processing time period is an inference time period, the terminal sets the priority of the first idle floating-point computing power resource to a low priority.
Accordingly, in the step S203, when determining that the processing procedure for performing task processing on the target model meets the preset condition, the terminal performs scheduling processing on the first idle computing power resource, including:
and under the condition that the processing time period is a training time period and the resource amount required by the terminal for training the target model meets a first preset condition or the training time consumption meets a second preset condition, the terminal performs scheduling processing on the first idle integer computing power resource.
And under the condition that the processing time period is an inference time period and the resource amount required by the terminal to infer the target model meets a third preset condition or the inference time consumption meets a fourth preset condition, the terminal performs scheduling processing on the first idle floating-point type computational power resource.
For a training period:
For a first idle integer computing power resource mined from the space dimension during a training time period in which the processing device trains the target model, the terminal may set its priority to a low priority. The terminal monitors in real time whether the training process of training the target model based on the processing device meets a preset condition, the preset condition being a condition that can affect the smooth progress of the training process. If it is monitored that the training process meets the preset condition, the training process may no longer proceed smoothly, efficiently and accurately, so the terminal performs scheduling processing on the first idle integer computing power resource so that the training process can proceed smoothly, efficiently and accurately.
Optionally, the preset condition may be that the amount of resources required by the training process meets a first preset condition or that the training time consumption meets a second preset condition; for example, the first preset condition is that the amount of resources increases, and the second preset condition is that the time consumption increases or that the rate of increase in time consumption is greater than a certain threshold. Optionally, the terminal may schedule the first idle integer computing power resource in a plurality of manners. For example, the terminal may schedule the first idle integer computing power resource to another device, so that the first idle integer computing power resource mined from the space dimension is scheduled away from the processing device.
It should be noted that the processing device may be configured in advance so that, when the training process of training the target model meets the preset condition, scheduling of the first idle integer computing power resource mined from the space dimension is permitted and the scheduling operation is completed automatically.
For the inference period:
For a first idle floating-point computing power resource mined from the space dimension during an inference time period in which the processing device performs inference on the target model, the terminal may set its priority to a low priority. The terminal monitors in real time whether the inference process of performing inference on the target model based on the processing device meets a preset condition, the preset condition being a condition that can affect the smooth progress of the inference process. If it is monitored that the inference process meets the preset condition, the inference process may no longer proceed smoothly, efficiently and accurately, so the terminal performs scheduling processing on the first idle floating-point computing power resource so that the inference process can proceed smoothly, efficiently and accurately.
Optionally, the preset condition may be that the amount of resources required by the inference process meets a third preset condition or that the inference time consumption meets a fourth preset condition; for example, the third preset condition is that the amount of resources increases, and the fourth preset condition is that the inference time consumption increases or that the rate of increase in inference time consumption is greater than a certain threshold. Optionally, the terminal may schedule the first idle floating-point computing power resource in a plurality of manners. For example, the terminal may schedule the first idle floating-point computing power resource to another device, so that the first idle floating-point computing power resource mined from the space dimension is scheduled away from the processing device.
It should be noted that the processing device may be configured in advance so that, when the inference process of performing inference on the target model meets the preset condition, scheduling of the first idle floating-point computing power resource mined from the space dimension is permitted and the scheduling operation is completed automatically.
Therefore, when the training process of training the target model or the inference process of performing inference meets the preset condition, the first idle integer computing power resource or the first idle floating-point computing power resource can be scheduled away, so that training or inference proceeds smoothly, the mining of computing power resources does not affect the original training or inference process on the processing device, heterogeneous computing power is utilized to the maximum extent in the space dimension, and the AI training/reasoning cost and the consumption of system resources are further effectively reduced.
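The per-period decision above can be sketched as follows: whether the processing period is training or inference determines whether the integer or the floating-point mined resource is eligible for preemption, and the resource-amount and time-consumption checks stand in for the first to fourth preset conditions. The string encodings and the growth threshold are assumptions used only for illustration.

def should_schedule_away(period: str, resource_type: str,
                         resource_amount_delta: float,
                         time_cost_growth_rate: float,
                         growth_threshold: float = 0.1) -> bool:
    # period is "training" or "inference"; resource_type is "integer" or "float".
    # The delta and growth rate are measured against the task's recent baseline.
    condition_met = resource_amount_delta > 0 or time_cost_growth_rate > growth_threshold
    if period == "training" and resource_type == "integer":
        return condition_met   # first / second preset conditions
    if period == "inference" and resource_type == "float":
        return condition_met   # third / fourth preset conditions
    return False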
In an alternative embodiment, the first idle computing power resource and the second idle computing power resource carry respective corresponding resource identification information, where the resource identification information is used to characterize a resource type and an acquisition path of the computing power resource. Then, continuing as shown in fig. 4, in the step S107, the resource scheduling system schedules and consumes the first idle computing power resource and the second idle computing power resource to perform task processing on other models, including:
S1071, a resource scheduling system determines a first acquisition path and a first resource type of a first idle computing resource according to resource identification information corresponding to the first idle computing resource; and determining a second acquisition path and a second resource type of the second idle computing power resource according to the resource identification information corresponding to the second idle computing power resource.
S1073, the resource scheduling system schedules and consumes the first idle computing power resources according to the first acquisition path and the first resource type so as to process tasks of other models; and scheduling consumption of the second idle computing power resources according to the second acquisition path and the second resource type so as to perform task processing on other models.
Optionally, the data format of the idle computing power resources reported by the terminal to the resource scheduling system may be as shown in Table 2:
Table 2. Data format of idle computing power resources (fields: IP, available computing power dimension, card type, computing power resource type)
Here, IP1 refers to the internet protocol address of a computing power resource mined in the space dimension, and IP2 refers to the internet protocol address of a computing power resource mined in the time dimension. The card type is obtained through card-type identification on the processing device; different card types run different training or reasoning tasks, and by identifying the different card types the resource scheduling system can conveniently arrange the scheduling of the mined computing power.
The terminal may generate the resource identification information corresponding to the first idle computing power resource according to at least one of the IP, the available computing power dimension, the card type and the computing power resource type corresponding to the first idle computing power resource; this resource identification information represents the first acquisition path and the first resource type corresponding to the first idle computing power resource. Likewise, the terminal may generate the resource identification information corresponding to the second idle computing power resource according to at least one of the IP, the available computing power dimension, the card type and the computing power resource type corresponding to the second idle computing power resource; this resource identification information represents the second acquisition path and the second resource type corresponding to the second idle computing power resource. The first acquisition path indicates where the first idle computing power resource was mined, that is, its resource dimension, and the first resource type indicates information such as the resource type, card type and IP corresponding to the first idle computing power resource. Similarly, the second acquisition path indicates where the second idle computing power resource was mined, that is, its resource dimension, and the second resource type indicates information such as the resource type, card type and IP corresponding to the second idle computing power resource.
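A minimal sketch of such a report record is given below, assuming the terminal serializes the fields of Table 2 (IP, available computing power dimension, card type, computing power resource type) as JSON; the class and field names and the example values are illustrative, not a prescribed format.

import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class IdleResourceReport:
    # Resource identification information reported to the resource scheduling system.
    ip: str             # internet protocol address of the mined resource
    dimension: str      # acquisition path: "space" (processing period) or "time" (idle period)
    card_type: str      # card type identified on the processing device
    resource_type: str  # "integer", "float", or "float+integer"


def build_report(resources: List[IdleResourceReport]) -> str:
    # Serialize the mined resources for transmission to the resource scheduling system.
    return json.dumps([asdict(r) for r in resources])


report = build_report([
    IdleResourceReport("10.0.0.1", "space", "card-type-A", "integer"),
    IdleResourceReport("10.0.0.2", "time", "card-type-B", "float+integer"),
])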
After the resource scheduling system receives the first idle computing power resource and the second idle computing power resource sent by the terminal, the resource scheduling system can acquire resource identification information carried by the first idle computing power resource and the second idle computing power resource, the resource scheduling system determines a first acquisition path and a first resource type of the first idle computing power resource according to the resource identification information corresponding to the first idle computing power resource, and performs scheduling consumption on the first idle computing power resource according to the first acquisition path and the first resource type so as to perform task processing on other models. The resource scheduling system determines a second acquisition path and a second resource type of the second idle computing power resource according to the resource identification information corresponding to the second idle computing power resource, and schedules and consumes the second idle computing power resource according to the second acquisition path and the second resource type so as to process tasks of other models.
It should be noted that, because the first acquisition path differs from the second acquisition path and the first resource type differs from the second resource type, the processing tasks of other models that the resource scheduling system pulls up for the first idle computing power resource also differ from the processing tasks of other models that it pulls up for the second idle computing power resource.
Therefore, the resource scheduling system can pull up different processing tasks to process other models according to the acquisition path and type of the mined computing power resources, so that dynamic scheduling and consumption of the mined computing power resources are achieved efficiently. This helps maximize the improvement of computing power performance, further effectively reduces the AI training/reasoning cost and the consumption of system resources, and supports computing power mining in large-model scenarios, making the method suitable for increasingly large model processing scenarios.
In an alternative embodiment, in step S1073, the scheduling and consumption of the first idle computing power resource by the resource scheduling system according to the first acquisition path and the first resource type, so as to perform task processing on other models, may include:
when the first obtaining path represents that the first idle computing power resource is obtained from the processing time period and the first resource type represents that the first idle computing power resource is the first idle integer computing power resource, the resource scheduling system performs scheduling consumption on the first idle integer computing power resource so as to execute reasoning tasks of other models; the first idle integer computing power resource is a computing power resource which is obtained from the training time period and is not used for training the target model when the processing time period is the training time period.
When the first obtaining path represents that the first idle computing power resource is obtained from the processing time period and the first resource type represents that the first idle computing power resource is the first idle floating point computing power resource, the resource scheduling system schedules and consumes the first idle floating point computing power resource so as to execute training tasks of other models; the first idle floating point type computing power resource is obtained from the reasoning time period by the terminal under the condition that the processing time period is the reasoning time period, and the computing power resource does not infer the target model.
FIG. 6 is a schematic flowchart of a resource scheduling system scheduling a computing power resource according to an exemplary embodiment. As shown in FIG. 6, when consuming a mined computing power resource, the resource scheduling system determines whether the mined computing power resource was mined from the space dimension; if so, the resource scheduling system performs different dynamic scheduling according to whether the mined computing power resource is a floating-point computing power resource or an integer computing power resource.
In one embodiment, the resource scheduling system parses the received computing power resources. When the first acquisition path of the first idle computing power resource indicates that it was acquired from the processing time period and the first resource type indicates that it is a first idle integer computing power resource, the resource is an integer computing power resource mined from the space dimension (i.e., an integer computing power resource mined in the training time period), and the resource scheduling system schedules and consumes the first idle integer computing power resource to execute reasoning tasks of other models.
In another embodiment, the resource scheduling system parses the received computing power resources. When the first acquisition path of the first idle computing power resource indicates that it was acquired from the processing time period and the first resource type indicates that it is a first idle floating-point computing power resource, the resource is a floating-point computing power resource mined from the space dimension (i.e., a floating-point computing power resource mined in the reasoning time period), and the resource scheduling system schedules and consumes the first idle floating-point computing power resource to execute training tasks of other models.
It should be noted that the tasks consuming the mined computing power are divided into AI training tasks and AI reasoning tasks, and either kind of task may be preempted at any time, so a scenario that consumes mined computing power resources needs the capability to resume computation after a real-time interruption, or to tolerate recomputation. For this reason, the AI training tasks in the embodiment of the present application generally adopt a short, fast training scenario, and AI reasoning adopts a reasoning service scenario with second-level time consumption; even if preemption occurs and a retry of the reasoning request is triggered, only the delay of that single reasoning request increases by seconds, and the overall reasoning service quality is not affected. A short, fast training scenario refers to a training scenario that is short in time consumption and small in fluctuation.
Therefore, for computing power resources mined from the space dimension, the resource scheduling system can pull up different processing tasks to process other models according to the acquisition path and type of the mined computing power resources, so that dynamic scheduling and consumption of the mined computing power resources are achieved efficiently. This helps maximize the improvement of computing power performance, further effectively reduces the AI training/reasoning cost and the consumption of system resources, and supports computing power mining in large-model scenarios, making the method suitable for increasingly large model processing scenarios.
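The routing rule for space-dimension mined resources (integer resources to reasoning tasks, floating-point resources to training tasks of other models) can be sketched as below, assuming a report record of the kind shown earlier; the returned strings merely name the kind of task that would be pulled up.

def dispatch_space_mined(report: dict) -> str:
    # Route a first idle computing power resource mined from the space dimension.
    if report["dimension"] != "space":
        raise ValueError("expected a space-dimension mined resource")
    if report["resource_type"] == "integer":
        return "run reasoning tasks of other models"       # mined in a training period
    if report["resource_type"] == "float":
        return "run short training tasks of other models"  # mined in a reasoning period
    raise ValueError("unknown resource type: " + report["resource_type"])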
In an alternative embodiment, continuing as shown in fig. 6, in step S1073, the scheduling consumption of the second idle computing power resource according to the second acquisition path and the second resource type to perform task processing on the other model may include:
and under the condition that the second obtaining path represents that the second idle computing power resource is obtained from the idle time period and the second resource type represents that the second idle computing power resource is the second idle floating point type computing power resource and the second idle integer type computing power resource, the resource scheduling system performs scheduling consumption on the second idle integer type computing power resource so as to execute reasoning tasks of other models.
Under the condition that the scheduling consumption of the second idle integer computing power resource is determined to be completed, the resource scheduling system schedules the consumption of the second idle floating point computing power resource so as to execute training tasks of other models;
The second idle floating-point computing power resource and the second idle integer computing power resource are computing power resources that the terminal acquires from the non-training time period and that are not consumed, under the condition that the processing time period is the training time period; the non-training time period characterizes the difference between the running time period and the training time period. Alternatively, the second idle floating-point computing power resource and the second idle integer computing power resource are computing power resources that the terminal acquires from the non-reasoning time period and that are not consumed, under the condition that the processing time period is the reasoning time period; the non-reasoning time period characterizes the difference between the running time period and the reasoning time period.
If the resource scheduling system determines that the mined computing power resource was mined from the time dimension (the second acquisition path indicates that the second idle computing power resource was acquired from the idle time period, and the second resource type indicates that it comprises a second idle floating-point computing power resource and a second idle integer computing power resource), then for such a computing power scenario that may be preempted at irregular times, consumption by reasoning can be adopted preferentially, that is, the second idle integer computing power resource is scheduled and consumed first to execute reasoning tasks of other models. This is because the time consumption of reasoning is much shorter than that of training: reasoning is basically at the second level while training is basically at the hour level, so consuming by reasoning first helps guarantee the service quality of the mined computing power. If no reasoning tasks need to be consumed (that is, the scheduling and consumption of the second idle integer computing power resource is completed), AI model training is scheduled to consume the second idle floating-point computing power resource.
In addition, the intermediate state of the training process needs to be saved regularly, so that the model does not have to be trained again from the beginning after the mined idle card is reclaimed, which would waste the completed model training and the heterogeneous computing power already spent.
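The consumption order and the periodic checkpointing can be sketched as follows; the job representation, the checkpoint interval and the function name are assumptions, used only to illustrate that reasoning consumption is attempted first and that training work records when to save its next intermediate state.

import time


def consume_time_mined(pending_inference: list, pending_training: list,
                       checkpoint_every_s: int = 600):
    # Second idle resources mined from the time dimension may be reclaimed at any
    # moment, so second-level reasoning jobs are consumed first; training jobs run
    # only when no reasoning job is waiting, and note when to save their next
    # intermediate state so completed training is not lost on preemption.
    if pending_inference:
        return "inference", pending_inference[0]
    if pending_training:
        job = dict(pending_training[0])
        job["next_checkpoint_at"] = time.time() + checkpoint_every_s
        return "training", job
    return "idle", None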
In an alternative embodiment, after the resource scheduling system performs scheduling consumption on the first idle computing power resource and the second idle computing power resource to perform task processing on the other model, the method further includes:
under the condition that the processing procedures corresponding to the processing tasks of other models meet a fifth preset condition, the resource scheduling system reschedules and consumes the first idle computing power resources and the second idle computing power resources so that the first idle computing power resources and the second idle computing power resources perform task processing on the candidate models;
the candidate models are models except the target model and other models in the models which can be scheduled by the resource scheduling system.
In this embodiment, the resource scheduling system may monitor in real time the processing procedure corresponding to the processing tasks of other models. When it is monitored that this processing procedure meets the fifth preset condition, it indicates that the processing procedure corresponding to the processing tasks of other models (i.e., a high-priority training or reasoning procedure) is being affected, and the resource scheduling system may reschedule and consume the first idle computing power resource and the second idle computing power resource, so that the first idle computing power resource and the second idle computing power resource perform task processing on the candidate model.
Optionally, in a training scenario the fifth preset condition may be an increase in training time consumption or the like, and in a reasoning scenario the fifth preset condition may be an increase in the request delay of reasoning or the like. The candidate model may be a model, among the models that the resource scheduling system is able to schedule, other than the target model and the other models.
Therefore, when the processing procedure corresponding to the processing tasks of other models meets the fifth preset condition, the resource scheduling system reschedules and consumes the first idle computing power resource and the second idle computing power resource, so that the high-priority training or reasoning procedure is not affected. This helps maximize the improvement of computing power performance, effectively reduces the AI training/reasoning cost and the consumption of system resources, and supports computing power mining in large-model scenarios, making the method suitable for increasingly large model processing scenarios.
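A brief sketch of this rescheduling step follows, under the assumption that the fifth preset condition is modelled as any growth in training time or in reasoning request delay and that the candidate model is simply taken from a provided list.

def reschedule_on_degradation(training_time_growth_s: float,
                              inference_delay_growth_ms: float,
                              scheduled_resources: list,
                              candidate_models: list) -> list:
    # When the high-priority procedure of the other model degrades, move the
    # mined resources to a candidate model; otherwise leave them where they are.
    fifth_condition_met = training_time_growth_s > 0 or inference_delay_growth_ms > 0
    if not fifth_condition_met or not candidate_models:
        return []
    target = candidate_models[0]
    return [(resource, target) for resource in scheduled_resources]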
The computing power resource processing method provided by the embodiment of the application has the following beneficial effects:
With the development of AI model processing, the demand for heterogeneous computing power such as GPUs (graphics processing units) has surged; especially in large-model processing scenarios, the demand for heterogeneous computing power such as GPUs is often thousands or even tens of thousands of cards. With cards at such a scale, any idle time period or idle computing power resource represents a great waste. The method mines potential computing power resources in both the space dimension and the time dimension, supports dynamic mining of computing power in the space dimension and dynamic scheduling and mining in the time dimension during card idle time, and dynamically executes, schedules and consumes the mined computing power resources, which helps maximize the improvement of computing power performance and supports computing power mining in large-model scenarios. In addition, given the large computing scale of large models, the amount of available computing power that can be mined is larger than with conventional scheduling, so the incremental benefit is obvious. Furthermore, the consumption of GPU cards can be reduced, thereby reducing the cost of AI training/reasoning.
The computing power resource processing method is introduced below with the terminal as the execution subject:
FIG. 7 is a third flowchart illustrating a computing power resource processing method according to an exemplary embodiment. As shown in FIG. 7, the method may include:
S301, in the processing procedure of performing task processing on the target model based on the computing device, determining an idle time period and a processing time period corresponding to the processing procedure; the idle time period is a time period other than the processing time period in the target time period; the target time period is a time period corresponding to the operation of the computing device;
S303, acquiring a first idle computing power resource that does not perform task processing on the target model in the processing time period, and acquiring a second idle computing power resource that is in an idle state in the idle time period; the first idle computing power resource and the second idle computing power resource are computing power resources corresponding to the computing device;
S305, sending the first idle computing power resource and the second idle computing power resource to a resource scheduling system, so that the resource scheduling system schedules and consumes the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
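Steps S301 to S305 can be pictured with the terminal-side sketch below; representing the periods as durations, the dictionary fields and the send() callable are illustrative assumptions.

def terminal_flow(period_kind: str, run_period_s: float, processing_period_s: float,
                  unused_in_processing: list, unused_in_idle: list, send) -> None:
    # S301: the idle period is the part of the target (run) period outside the
    # processing period.
    idle_period_s = max(run_period_s - processing_period_s, 0.0)
    # S303: first idle resources are mined inside the processing period (space
    # dimension); second idle resources are mined inside the idle period (time
    # dimension).
    first_idle = [{"resource": r, "dimension": "space", "period": period_kind}
                  for r in unused_in_processing]
    second_idle = [{"resource": r, "dimension": "time", "idle_s": idle_period_s}
                   for r in unused_in_idle]
    # S305: report both kinds of idle resources to the resource scheduling system.
    send(first_idle + second_idle)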
In an alternative embodiment, acquiring a first idle computing power resource that does not task a target model for a processing period of time includes:
Under the condition that the processing time period is a training time period for training the target model, acquiring a first idle integer computing power resource that is not used for training the target model in the training time period; determining the first idle integer computing power resource as the first idle computing power resource; the first idle integer computing power resource is used for executing reasoning tasks of other models under the condition of being scheduled for consumption;
under the condition that the processing time period is an reasoning time period for reasoning the target model, acquiring a first idle floating point type computing power resource which does not reason the target model in the reasoning time period; determining the first idle floating point type computing power resource as a first idle computing power resource; the first idle floating point computing power resource is used for executing training tasks of other models under the condition of being scheduled for consumption.
In an alternative embodiment, determining the idle period includes:
under the condition that the processing time period is a training time period for training the target model, a non-training time period is obtained according to the difference value between the running time period and the training time period; determining the non-training time period as an idle time period;
under the condition that the processing time period is an reasoning time period for reasoning the target model, a non-reasoning time period is obtained according to the difference value between the running time period and the reasoning time period; the non-inference period of time is determined to be an idle period of time.
In an alternative embodiment, acquiring a second idle computing power resource in an idle state for an idle period of time includes:
under the condition that the processing time period is a training time period, computing power resources which are not consumed in a non-training time period are obtained, and second idle computing power resources are obtained;
under the condition that the processing time period is an inference time period, computing power resources which are not consumed in a non-inference time period are obtained, and second idle computing power resources are obtained;
the second idle computing power resource comprises at least one of a second idle floating point type computing power resource and a second idle integer type computing power resource, the second idle integer type computing power resource is used for executing reasoning tasks of other models under the condition of scheduled consumption, and the second idle floating point type computing power resource is used for executing training tasks of other models under the condition of scheduled consumption.
In an alternative embodiment, after acquiring the first idle computing force resource that does not task the target model for the processing period, the method further includes:
setting the priority of the first idle computing power resource as a low priority;
and under the condition that the processing procedure for performing task processing on the target model meets the preset condition, scheduling the first idle computing power resource.
In an alternative embodiment, setting the priority of the first idle computing power resource to a low priority includes:
setting the priority of the first idle integer computing power resource to a low priority under the condition that the processing time period is a training time period; and setting the priority of the first idle floating-point computing power resource to a low priority under the condition that the processing time period is an inference time period;
under the condition that the processing procedure for performing task processing on the target model meets the preset condition, scheduling the first idle computing power resource, wherein the scheduling processing comprises the following steps:
under the condition that the processing time period is a training time period and the resource amount required for training the target model meets a first preset condition or the training time consumption meets a second preset condition, scheduling the first idle integer computing power resource;
and under the condition that the processing time period is an inference time period and under the condition that the amount of resources required for reasoning the target model meets a third preset condition or the inference time consumption meets a fourth preset condition, scheduling the first idle floating-point type computational resource.
The computing power resource processing method is described below with the resource scheduling system as the execution subject:
FIG. 8 is a fourth flowchart illustrating a computing power resource processing method according to an exemplary embodiment. As shown in FIG. 8, the method may include:
S401, receiving a first idle computing power resource and a second idle computing power resource sent by a terminal; the first idle computing power resource is a computing power resource that the terminal acquires from the processing time period and that does not perform task processing on the target model, and the second idle computing power resource is a computing power resource that the terminal acquires from the idle time period and that is in an idle state; the processing time period is a time period corresponding to the processing procedure in which the terminal performs task processing on the target model based on the computing device; the idle time period is a time period other than the processing time period in the target time period; the target time period is a time period corresponding to the operation of the computing device;
S403, scheduling and consuming the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
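On the scheduling-system side, S401 and S403 amount to receiving the reported records and consuming each one according to its acquisition path, as in the sketch below; the field names echo the earlier report example and the routing strings are assumptions.

def scheduler_loop(received_reports: list) -> list:
    # S401/S403: consume each reported idle resource according to its
    # acquisition path (dimension) and resource type.
    decisions = []
    for report in received_reports:
        if report["dimension"] == "space":
            if report["resource_type"] == "integer":
                decisions.append((report["ip"], "reasoning tasks of other models"))
            else:
                decisions.append((report["ip"], "training tasks of other models"))
        else:  # time dimension: reasoning consumption first, then training
            decisions.append((report["ip"], "reasoning first, then training"))
    return decisions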
In an alternative embodiment, the first idle computing power resource and the second idle computing power resource carry respective corresponding resource identification information, and the resource identification information is used for representing the resource type and the acquisition path of the computing power resource; scheduling consumption of the first idle computing power resource and the second idle computing power resource to perform task processing on other models, including:
Determining a first acquisition path and a first resource type of the first idle computing power resource according to resource identification information corresponding to the first idle computing power resource; determining a second acquisition path and a second resource type of the second idle computing power resource according to the resource identification information corresponding to the second idle computing power resource;
scheduling consumption of the first idle computing power resources according to the first acquisition path and the first resource type so as to perform task processing on other models; and scheduling consumption of the second idle computing power resources according to the second acquisition path and the second resource type so as to perform task processing on other models.
In an alternative embodiment, scheduling consumption of the first free computing power resource according to the first acquisition pathway and the first resource type to task other models includes:
under the condition that the first obtaining path represents that the first idle computing power resource is obtained from the processing time period and the first resource type represents that the first idle computing power resource is the first idle integer computing power resource, scheduling and consuming the first idle integer computing power resource to execute reasoning tasks of other models; the first idle integer computing power resource is a computing power resource which is obtained from the training time period and is not used for training the target model when the processing time period is the training time period;
Under the condition that the first obtaining path represents that the first idle computing power resource is obtained from the processing time period and the first resource type represents that the first idle computing power resource is the first idle floating point type computing power resource, scheduling and consuming the first idle floating point type computing power resource to execute training tasks of other models; the first idle floating point type computing power resource is obtained from the reasoning time period by the terminal under the condition that the processing time period is the reasoning time period, and the computing power resource does not infer the target model.
In an alternative embodiment, scheduling consumption of the second free computing power resource according to the second acquisition pathway and the second resource type to task the other model includes:
under the condition that the second obtaining path represents that the second idle computing power resource is obtained from an idle time period and the second resource type represents that the second idle computing power resource is a second idle floating point type computing power resource and a second idle integer type computing power resource, scheduling and consuming the second idle integer type computing power resource to execute reasoning tasks of other models;
under the condition that the dispatching consumption of the second idle integer computing power resource is determined to be completed, the dispatching consumption of the second idle floating point computing power resource is carried out so as to execute training tasks of other models;
The second idle floating-point computing power resource and the second idle integer computing power resource are computing power resources that the terminal acquires from the non-training time period and that are not consumed, under the condition that the processing time period is the training time period; the non-training time period characterizes the difference between the running time period and the training time period. Alternatively, the second idle floating-point computing power resource and the second idle integer computing power resource are computing power resources that the terminal acquires from the non-reasoning time period and that are not consumed, under the condition that the processing time period is the reasoning time period; the non-reasoning time period characterizes the difference between the running time period and the reasoning time period.
In an alternative embodiment, after scheduling consumption of the first idle computing power resource and the second idle computing power resource to task the other model, the method further comprises:
under the condition that the processing procedures corresponding to the processing tasks of other models meet a fifth preset condition, rescheduling consumption is carried out on the first idle computing power resources and the second idle computing power resources, so that the first idle computing power resources and the second idle computing power resources carry out task processing on the candidate models;
the candidate models are models except the target model and other models in the models which can be scheduled by the resource scheduling system.
In an alternative embodiment, embodiments of the present application further provide a computing resource processing system, the system comprising: a terminal and a resource scheduling system,
The terminal is used for, in the processing procedure of performing task processing on the target model based on the computing device, determining an idle time period and a processing time period corresponding to the processing procedure; the idle time period is a time period other than the processing time period in the target time period; the target time period is a time period corresponding to the operation of the computing device; acquiring a first idle computing power resource that does not perform task processing on the target model in the processing time period, and acquiring a second idle computing power resource that is in an idle state in the idle time period; the first idle computing power resource and the second idle computing power resource are computing power resources corresponding to the computing device; and sending the first idle computing power resource and the second idle computing power resource to the resource scheduling system.
The resource scheduling system is used for receiving first idle computing power resources and second idle computing power resources sent by the terminal; and the method is used for scheduling and consuming the first idle computing power resource and the second idle computing power resource so as to perform task processing on other models.
FIG. 9 is a block diagram of a computing resource processing device, as shown in FIG. 9, according to an example embodiment, the computing resource processing device comprising:
A time period determining module 501, configured to determine, in a process of performing task processing on a target model based on a computing device, an idle time period and a processing time period corresponding to the process; the idle time period is a time period other than the processing time period in the target time period; the target time period is a time period corresponding to the operation of the computing equipment;
a resource obtaining module 503, configured to obtain a first idle computing power resource that does not perform task processing on the target model in the processing period, and obtain a second idle computing power resource that is in an idle state in the idle period; the first idle computing power resource and the second idle computing power resource are computing power resources corresponding to the computing device;
and the sending module 505 is configured to send the first idle computing power resource and the second idle computing power resource to a resource scheduling system, so that the resource scheduling system performs scheduling consumption on the first idle computing power resource and the second idle computing power resource, so as to perform task processing on other models.
In an alternative embodiment, the resource acquisition module includes:
the integer computing power resource processing unit is used for acquiring a first idle integer computing power resource which is not trained on the target model in the training time period when the processing time period is the training time period for training the target model; determining the first idle integer computing power resource as the first idle computing power resource; the first idle integer computing power resource is used for executing reasoning tasks of other models under the condition of being scheduled for consumption;
The floating point type computing power resource processing unit is used for acquiring a first idle floating point type computing power resource which does not infer the target model in the inference time period under the condition that the processing time period is the inference time period for inferring the target model; determining the first idle floating point computing power resource as the first idle computing power resource; the first idle floating point computing power resource is used for executing training tasks of other models under the condition of scheduled consumption.
In an alternative embodiment, the time period determining module includes:
the non-training time period processing unit is used for obtaining a non-training time period according to the difference value between the running time period and the training time period when the processing time period is the training time period for training the target model; determining the non-training period as the idle period;
the non-inference time period processing unit is used for obtaining a non-inference time period according to the difference value between the running time period and the inference time period under the condition that the processing time period is the inference time period for inferring the target model; and determining the non-inference time period as the idle time period.
In an alternative embodiment, the resource acquisition module includes:
the first unconsumed resource obtaining module is used for obtaining unconsumed computing power resources in the non-training time period under the condition that the processing time period is the training time period, and obtaining the second idle computing power resources;
the second unconsumed resource obtaining module is used for obtaining unconsumed computing power resources in the non-reasoning time period under the condition that the processing time period is the reasoning time period, and obtaining the second idle computing power resources;
the second idle computing power resource comprises at least one of a second idle floating point type computing power resource and a second idle integer type computing power resource, the second idle integer type computing power resource is used for executing reasoning tasks of other models under the condition of being scheduled for consumption, and the second idle floating point type computing power resource is used for executing training tasks of other models under the condition of being scheduled for consumption.
In an alternative embodiment, the apparatus further comprises:
the priority setting module is used for setting the priority of the first idle computing power resource to be low;
and the scheduling processing module is used for scheduling the first idle computing power resource under the condition that the processing process of performing task processing on the target model meets the preset condition.
In an alternative embodiment, the prioritization module includes:
a first priority setting unit, configured to set a priority of the first idle integer computing resource to a low priority if the processing time period is the training time period;
a second priority setting unit, configured to set the priority of the first idle floating-point type power resource to a low priority if the processing time period is the reasoning time period;
the scheduling processing module comprises:
the first scheduling unit is used for scheduling the first idle integer computing power resource under the condition that the processing time period is the training time period and the resource amount required for training the target model meets a first preset condition or the training time consumption meets a second preset condition;
the second scheduling unit is configured to perform scheduling processing on the first idle floating-point type computing power resource when the processing time period is the inference time period and when the amount of resources required for reasoning the target model satisfies a third preset condition or the time consumption for reasoning satisfies a fourth preset condition.
FIG. 10 is a block diagram of another computing resource processing device, as shown in FIG. 10, according to an exemplary embodiment, the computing resource processing device comprising:
a receiving module 601, configured to receive a first idle computing power resource and a second idle computing power resource sent by a terminal; the first idle computing power resource is the computing power resource that the terminal acquires from the processing time period and that does not perform task processing on the target model, and the second idle computing power resource is the computing power resource that the terminal acquires from the idle time period and that is in an idle state; the processing time period is a time period corresponding to the processing procedure in which the terminal performs task processing on the target model based on the computing device; the idle time period is a time period other than the processing time period in the target time period; the target time period is a time period corresponding to the operation of the computing device.
And the scheduling consumption module 603 is configured to schedule consumption of the first idle computing power resource and the second idle computing power resource, so as to perform task processing on other models.
In an optional embodiment, the first idle computing power resource and the second idle computing power resource carry respective corresponding resource identification information, where the resource identification information is used to characterize a resource type and an acquisition path of the computing power resource; the dispatch consumption module 603 includes:
the path type determining unit is used for determining a first acquisition path and a first resource type of the first idle computing power resource according to the resource identification information corresponding to the first idle computing power resource; determining a second acquisition path and a second resource type of the second idle computing power resource according to the resource identification information corresponding to the second idle computing power resource;
the scheduling consumption unit is used for scheduling consumption of the first idle computing power resource according to the first acquisition path and the first resource type so as to process tasks of other models; and scheduling consumption of the second idle computing power resource according to the second acquisition path and the second resource type so as to perform task processing on other models.
In an alternative embodiment, the dispatch consumer unit includes:
the first consumption subunit is configured to schedule consumption of the first idle integer computing power resource to perform reasoning tasks of other models when the first acquisition path indicates that the first idle computing power resource is acquired from the processing time period and the first resource type indicates that the first idle computing power resource is the first idle integer computing power resource; the first idle integer computing power resource is a computing power resource which is obtained from the training time period and is not used for training the target model when the processing time period is the training time period;
The second consumption subunit is configured to schedule consumption of the first idle floating-point type computing resource to execute training tasks of other models when the first acquisition path indicates that the first idle computing resource is acquired from the processing time period and the first resource type indicates that the first idle computing resource is a first idle floating-point type computing resource; and the first idle floating point type computing power resource is a computing power resource which is obtained from the reasoning time period and does not reason the target model when the processing time period is the reasoning time period.
In an alternative embodiment, the dispatch consumer unit includes:
a third consumption subunit, configured to schedule consumption of the second idle integer computing power resource to perform reasoning tasks of other models when the second obtaining path indicates that the second idle computing power resource is obtained from the idle time period and the second resource type indicates that the second idle computing power resource is a second idle floating point type computing power resource and a second idle integer computing power resource;
a fourth consumption subunit, configured to schedule consumption of the second idle floating-point type computing resource to execute training tasks of other models when it is determined that the scheduling consumption of the second idle integer type computing resource is completed;
The second idle floating point type computing power resource and the second idle integer type computing power resource are computing power resources which are obtained from a non-training time period and are not consumed by the terminal under the condition that the processing time period is the training time period; the non-training period characterizes a difference between the run period and the training period; or the second idle floating point type computing power resource and the second idle integer type computing power resource are computing power resources which are obtained and not consumed by the terminal in a non-reasoning time period under the condition that the processing time period is the reasoning time period; the non-inference period characterizes a difference between the run period and the inference period.
In an alternative embodiment, the apparatus further comprises:
rescheduling consumption module, configured to reschedule consumption of the first idle computing power resource and the second idle computing power resource when it is determined that a processing procedure corresponding to a processing task of another model meets a fifth preset condition, so that the first idle computing power resource and the second idle computing power resource perform task processing on a candidate model;
the candidate models are models except the target model and other models in the models which can be scheduled by the resource scheduling system.
It should be noted that, the device embodiment provided by the embodiment of the present application and the method embodiment described above are based on the same inventive concept.
The embodiment of the application also provides an electronic device for processing the computing power resources, which comprises a processor and a memory, wherein at least one instruction or at least one section of program is stored in the memory, and the at least one instruction or the at least one section of program is loaded and executed by the processor to realize the processing method of the computing power resources provided by any embodiment.
Embodiments of the present application also provide a computer readable storage medium that may be provided in a terminal to hold at least one instruction or at least one program for implementing one of the method embodiments, the at least one instruction or at least one program being loaded and executed by a processor to implement the computing resource processing method as provided by the method embodiments described above.
Alternatively, in the embodiments of the present description, the storage medium may be located in at least one network server among a plurality of network servers of a computer network. Alternatively, in the present embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.
The memory of the embodiments of the present description may be used to store software programs and modules, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, application programs required for functions, and the like, and the data storage area may store data created according to the use of the device, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory may also include a memory controller to provide the processor with access to the memory.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the computing power resource processing method provided by the above method embodiment.
An embodiment of the computing power resource processing method provided by the embodiment of the present application may be executed in a terminal, a computer terminal, a server, or a similar computing device. Taking execution on a server as an example, FIG. 11 is a block diagram of a hardware structure of a server according to an exemplary embodiment. As shown in FIG. 11, the server 700 may vary considerably in configuration or performance and may include one or more central processing units (CPUs) 710 (the central processing unit 710 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 730 for storing data, and one or more storage media 720 (e.g., one or more mass storage devices) for storing applications 723 or data 722. The memory 730 and the storage medium 720 may be transient or persistent storage. The program stored in the storage medium 720 may include one or more modules, each of which may include a series of instruction operations on the server. Still further, the central processing unit 710 may be configured to communicate with the storage medium 720 and execute, on the server 700, the series of instruction operations in the storage medium 720. The server 700 may also include one or more power supplies 760, one or more wired or wireless network interfaces 750, one or more input/output interfaces 740, and/or one or more operating systems 721, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
The input/output interface 740 may be used to receive or transmit data via a network. A specific example of the above network may include a wireless network provided by a communication provider of the server 700. In one example, the input/output interface 740 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station so as to communicate with the Internet. In another example, the input/output interface 740 may be a Radio Frequency (RF) module used to communicate with the Internet wirelessly.
It will be appreciated by those of ordinary skill in the art that the configuration shown in fig. 11 is merely illustrative and is not intended to limit the configuration of the electronic device described above. For example, server 700 may also include more or fewer components than shown in fig. 11, or have a different configuration than shown in fig. 11.
It should be noted that the order of the above embodiments of the present application is for description only and does not imply that any embodiment is better or worse than another. The foregoing describes specific embodiments of this specification; other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or a sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, since the device and server embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the description of the method embodiments.
It will be appreciated by those of ordinary skill in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware; the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
The foregoing is merely illustrative of the present application and is not intended to limit it; any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the present application shall fall within the protection scope of the present application.

Claims (15)

1. A method of computing power resource processing, the method comprising:
in the processing process of performing task processing on a target model based on the computing equipment, determining an idle time period and a processing time period corresponding to the processing process; the idle time period is a time period other than the processing time period in the target time period; the target time period is a time period corresponding to the operation of the computing equipment;
acquiring a first idle computing power resource which does not perform task processing on the target model in the processing time period, and acquiring a second idle computing power resource which is in an idle state in the idle time period; the first idle computing power resource and the second idle computing power resource are computing power resources corresponding to the computing equipment;
and sending the first idle computing power resource and the second idle computing power resource to a resource scheduling system so that the resource scheduling system performs scheduling consumption on the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
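As a purely illustrative aid, the reporting flow of claim 1 could be sketched in Python roughly as follows; the claim prescribes no concrete API, so the IdleResource class, the function names, and the capacity units are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class IdleResource:
    kind: str      # e.g. "integer" or "floating_point"
    amount: float  # unused capacity, in device-specific units
    source: str    # "processing_period" or "idle_period"

def collect_idle_resources(run_period, processing_period, used_capacity, total_capacity):
    """Determine the idle period and gather both kinds of idle capacity for reporting."""
    # Idle period = the part of the run period that lies outside the processing period.
    idle_period = (processing_period[1], run_period[1])
    first_idle = IdleResource("integer", total_capacity - used_capacity, "processing_period")
    second_idle = IdleResource("floating_point", total_capacity, "idle_period")
    # Both resources would then be sent to the resource scheduling system.
    return idle_period, [first_idle, second_idle]

# Example: the device runs 0-24 h, the target model is processed 2-8 h, 30 of 100 units are in use.
idle_period, report = collect_idle_resources((0, 24), (2, 8), used_capacity=30, total_capacity=100)
```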
2. The method of claim 1, wherein the obtaining a first idle computing power resource that is not tasked with the target model within the processing time period comprises:
acquiring a first idle integer computing power resource which is not used for training the target model in the training time period under the condition that the processing time period is a training time period for training the target model; determining the first idle integer computing power resource as the first idle computing power resource; the first idle integer computing power resource is used for executing reasoning tasks of other models under the condition of being scheduled for consumption;
under the condition that the processing time period is a reasoning time period for reasoning about the target model, acquiring a first idle floating point type computing power resource which is not used for reasoning about the target model in the reasoning time period; determining the first idle floating point type computing power resource as the first idle computing power resource; the first idle floating point type computing power resource is used for executing training tasks of other models under the condition of being scheduled for consumption.
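A minimal sketch of the selection logic in claim 2, under the assumption that training mainly occupies floating-point capacity while inference mainly occupies integer capacity; the dictionary layout and function name are hypothetical.

```python
def first_idle_resource(period_type, unused_int_capacity, unused_fp_capacity):
    """Pick which kind of unused capacity to report as the first idle computing power resource."""
    if period_type == "training":
        # Training mainly consumes floating-point capacity, so the unused integer
        # capacity can be lent out to run other models' reasoning (inference) tasks.
        return {"kind": "integer", "amount": unused_int_capacity, "lent_for": "inference"}
    if period_type == "inference":
        # Inference mainly consumes integer capacity, so the unused floating-point
        # capacity can be lent out to run other models' training tasks.
        return {"kind": "floating_point", "amount": unused_fp_capacity, "lent_for": "training"}
    raise ValueError(f"unknown period type: {period_type}")

print(first_idle_resource("training", unused_int_capacity=40, unused_fp_capacity=0))
```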
3. The computing power resource processing method of claim 1, wherein the determining an idle period comprises:
obtaining a non-training time period according to the difference between the running time period and the training time period under the condition that the processing time period is a training time period for training the target model, and determining the non-training time period as the idle time period;
obtaining a non-reasoning time period according to the difference between the running time period and the reasoning time period under the condition that the processing time period is a reasoning time period for reasoning about the target model, and determining the non-reasoning time period as the idle time period.
4. The computing power resource processing method according to claim 3, wherein the acquiring the second idle computing power resource which is in the idle state in the idle time period includes:
acquiring, under the condition that the processing time period is the training time period, the computing power resource that is not consumed in the non-training time period, to obtain the second idle computing power resource;
acquiring, under the condition that the processing time period is the reasoning time period, the computing power resource that is not consumed in the non-reasoning time period, to obtain the second idle computing power resource;
the second idle computing power resource comprises at least one of a second idle floating point type computing power resource and a second idle integer type computing power resource, the second idle integer type computing power resource is used for executing reasoning tasks of other models under the condition of being scheduled for consumption, and the second idle floating point type computing power resource is used for executing training tasks of other models under the condition of being scheduled for consumption.
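The idle-period computation of claims 3 and 4 amounts to an interval difference between the running time period and the processing (training or reasoning) time period. A small sketch, assuming intervals are (start, end) pairs on a common clock; the function name is hypothetical.

```python
def non_processing_intervals(run_period, processing_intervals):
    """Return the parts of run_period not covered by any processing interval."""
    idle, cursor = [], run_period[0]
    for start, end in sorted(processing_intervals):
        if start > cursor:
            idle.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < run_period[1]:
        idle.append((cursor, run_period[1]))
    return idle

# The device runs 0-24 h and trains 2-8 h and 14-20 h, so the idle (non-training)
# period is 0-2 h, 8-14 h and 20-24 h.
print(non_processing_intervals((0, 24), [(2, 8), (14, 20)]))
```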
5. The computing power resource processing method of claim 2, wherein after the acquiring the first idle computing power resource for which the target model is not tasked within the processing period, the method further comprises:
setting the priority of the first idle computing power resource to be low priority;
and scheduling the first idle computing power resource under the condition that it is determined that the processing procedure of performing task processing on the target model meets a preset condition.
6. The method of computing power resource processing of claim 5, wherein the setting the priority of the first idle computing power resource to a low priority comprises:
setting the priority of the first idle integer computing power resource to a low priority if the processing time period is the training time period; setting the priority of the first idle floating point type computing power resource to a low priority if the processing time period is the reasoning time period;
the scheduling the first idle computing power resource under the condition that it is determined that the processing procedure of performing task processing on the target model meets the preset condition includes:
under the condition that the processing time period is the training time period and under the condition that the resource amount required for training the target model meets a first preset condition or the training time consumption meets a second preset condition, scheduling the first idle integer computing power resource;
and under the condition that the processing time period is the reasoning time period and under the condition that the resource amount required for reasoning the target model meets a third preset condition or the reasoning time consumption meets a fourth preset condition, scheduling the first idle floating point computing power resource.
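A hedged sketch of the low-priority tagging and reclaiming logic of claims 5 and 6; the threshold parameters and field names are invented for illustration and do not appear in the claims.

```python
def maybe_reclaim(first_idle, required_amount, available_amount,
                  elapsed_seconds, time_budget_seconds):
    """Tag the lent-out resource as low priority and reclaim it if the owner task needs it."""
    first_idle["priority"] = "low"
    needs_resources = required_amount > available_amount      # e.g. the first/third preset condition
    over_time_budget = elapsed_seconds > time_budget_seconds  # e.g. the second/fourth preset condition
    if needs_resources or over_time_budget:
        first_idle["reclaimed"] = True   # pull the capacity back from the scheduler
    return first_idle

print(maybe_reclaim({"kind": "integer", "amount": 40}, 120, 100, 3600, 7200))
```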
7. A method of computing power resource processing, the method comprising:
receiving a first idle computing power resource and a second idle computing power resource sent by a terminal; the first idle computing power resource is a computing power resource, acquired by the terminal from a processing time period, which does not perform task processing on a target model, and the second idle computing power resource is a computing power resource, acquired by the terminal from an idle time period, which is in an idle state; the processing time period is a time period corresponding to a processing process in which the terminal performs task processing on the target model based on computing equipment; the idle time period is a time period other than the processing time period in a target time period; the target time period is a time period corresponding to the operation of the computing equipment;
and scheduling and consuming the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
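On the resource scheduling system side, claim 7 describes receiving the reported resources and consuming them for other models. One possible sketch, using a simple in-memory queue; the class and method names are hypothetical.

```python
import queue

class ResourceScheduler:
    """Receives idle resources reported by terminals and lends them to other models."""

    def __init__(self):
        self.pool = queue.Queue()

    def receive(self, idle_resources):
        for res in idle_resources:
            self.pool.put(res)

    def schedule_next(self, other_model_task):
        if self.pool.empty():
            return None
        res = self.pool.get()
        return {"task": other_model_task, "resource": res}   # consume one idle resource

scheduler = ResourceScheduler()
scheduler.receive([{"kind": "integer", "amount": 40}])
print(scheduler.schedule_next("other-model-inference"))
```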
8. The computing power resource processing method according to claim 7, wherein the first idle computing power resource and the second idle computing power resource carry respective corresponding resource identification information, and the resource identification information is used for representing a resource type and an acquisition path of the computing power resource; the scheduling consumption of the first idle computing power resource and the second idle computing power resource to perform task processing on other models includes:
determining a first acquisition path and a first resource type of the first idle computing power resource according to resource identification information corresponding to the first idle computing power resource; determining a second acquisition path and a second resource type of the second idle computing power resource according to the resource identification information corresponding to the second idle computing power resource;
scheduling consumption of the first idle computing power resource according to the first acquisition path and the first resource type so as to perform task processing on other models; and scheduling consumption of the second idle computing power resource according to the second acquisition path and the second resource type so as to perform task processing on other models.
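Claim 8 keys the dispatch decision on resource identification information that encodes a resource type and an acquisition path. The tag format below ('path:type') is invented solely for illustration; the patent does not specify an encoding.

```python
def parse_resource_id(resource_id):
    """E.g. 'processing_period:integer' -> ('processing_period', 'integer')."""
    path, rtype = resource_id.split(":")
    return path, rtype

def dispatch(resource_id):
    path, rtype = parse_resource_id(resource_id)
    # Integer capacity is used for other models' reasoning tasks, floating-point for training.
    task_kind = "inference" if rtype == "integer" else "training"
    return {"acquired_from": path, "resource_type": rtype, "assigned_task": task_kind}

print(dispatch("idle_period:floating_point"))
```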
9. The computing power resource processing method according to claim 8, wherein the scheduling consumption of the first idle computing power resource according to the first acquisition path and the first resource type so as to perform task processing on other models comprises:
under the condition that the first acquisition path represents that the first idle computing power resource is acquired from the processing time period and the first resource type represents that the first idle computing power resource is a first idle integer computing power resource, performing scheduling consumption on the first idle integer computing power resource so as to execute reasoning tasks of other models; the first idle integer computing power resource is a computing power resource which is acquired from the training time period and is not used for training the target model when the processing time period is the training time period;
under the condition that the first acquisition path represents that the first idle computing power resource is acquired from the processing time period and the first resource type represents that the first idle computing power resource is a first idle floating point type computing power resource, performing scheduling consumption on the first idle floating point type computing power resource so as to execute training tasks of other models; the first idle floating point type computing power resource is a computing power resource which is acquired from the reasoning time period and is not used for reasoning about the target model when the processing time period is the reasoning time period.
10. The computing power resource processing method according to claim 8, wherein the scheduling consumption of the second idle computing power resource according to the second acquisition path and the second resource type so as to perform task processing on other models comprises:
under the condition that the second acquisition path represents that the second idle computing power resource is acquired from the idle time period and the second resource type represents that the second idle computing power resource is a second idle floating point type computing power resource and a second idle integer type computing power resource, performing scheduling consumption on the second idle integer type computing power resource so as to execute reasoning tasks of other models;
under the condition that it is determined that the second idle integer type computing power resource has been completely consumed through scheduling, performing scheduling consumption on the second idle floating point type computing power resource so as to execute training tasks of other models;
the second idle floating point type computing power resource and the second idle integer type computing power resource are computing power resources which are acquired from the non-training time period and not consumed by the terminal under the condition that the processing time period is the training time period, the non-training time period characterizing the difference between the running time period and the training time period; or the second idle floating point type computing power resource and the second idle integer type computing power resource are computing power resources which are acquired from the non-reasoning time period and not consumed by the terminal under the condition that the processing time period is the reasoning time period, the non-reasoning time period characterizing the difference between the running time period and the reasoning time period.
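A sketch of the consumption order in claims 9 and 10: idle integer capacity is consumed first for other models' reasoning tasks, and idle floating-point capacity is consumed for training tasks only after the integer capacity is exhausted. Capacity units and names are hypothetical.

```python
def consume_second_idle(int_capacity, fp_capacity, inference_demand, training_demand):
    """Consume idle integer capacity for inference first, then idle floating-point for training."""
    used_int = min(int_capacity, inference_demand)
    used_fp = 0.0
    if int_capacity - used_int == 0:   # integer capacity completely consumed
        used_fp = min(fp_capacity, training_demand)
    return {"inference_scheduled": used_int, "training_scheduled": used_fp}

print(consume_second_idle(int_capacity=10, fp_capacity=50, inference_demand=25, training_demand=40))
```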
11. The computing power resource processing method of any of claims 7 to 10, wherein after the scheduling consumption of the first idle computing power resource and the second idle computing power resource to task other models, the method further comprises:
under the condition that the processing procedure corresponding to the task processing of other models meets a fifth preset condition, performing rescheduling consumption on the first idle computing power resource and the second idle computing power resource, so that the first idle computing power resource and the second idle computing power resource are used for performing task processing on candidate models;
the candidate models are models, among the models schedulable by the resource scheduling system, other than the target model and the other models.
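Claim 11 reschedules the same idle resources to candidate models once the other models' processing meets the fifth preset condition. A rough sketch, with that condition reduced to a boolean flag and all names hypothetical.

```python
def reschedule(idle_resources, condition_met, schedulable_models, target_model, other_models):
    """Reassign freed idle resources to candidate models once the fifth preset condition holds."""
    if not condition_met:
        return []
    candidates = [m for m in schedulable_models
                  if m != target_model and m not in other_models]
    if not candidates:
        return []
    # Round-robin the idle resources over the candidate models.
    return [(res, candidates[i % len(candidates)]) for i, res in enumerate(idle_resources)]

print(reschedule(["int-40", "fp-50"], True, ["A", "B", "C", "D"], "A", ["B"]))
```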
12. An apparatus for processing computing power resources, the apparatus comprising:
the time period determining module is used for determining an idle time period and a processing time period corresponding to the processing process in the processing process of performing task processing on the target model based on the computing equipment; the idle time period is a time period other than the processing time period in the target time period; the target time period is a time period corresponding to the operation of the computing equipment;
the resource acquisition module is used for acquiring a first idle computing power resource which does not perform task processing on the target model in the processing time period and acquiring a second idle computing power resource which is in an idle state in the idle time period; the first idle computing power resource and the second idle computing power resource are computing power resources corresponding to the computing equipment;
and the sending module is used for sending the first idle computing power resource and the second idle computing power resource to a resource scheduling system so that the resource scheduling system performs scheduling consumption on the first idle computing power resource and the second idle computing power resource to perform task processing on other models.
13. An apparatus for processing computing power resources, the apparatus comprising:
the receiving module is used for receiving a first idle computing power resource and a second idle computing power resource sent by a terminal; the first idle computing power resource is a computing power resource, acquired by the terminal from a processing time period, which does not perform task processing on a target model, and the second idle computing power resource is a computing power resource, acquired by the terminal from an idle time period, which is in an idle state; the processing time period is a time period corresponding to a processing process in which the terminal performs task processing on the target model based on computing equipment; the idle time period is a time period other than the processing time period in a target time period; the target time period is a time period corresponding to the operation of the computing equipment;
and the scheduling consumption module is used for performing scheduling consumption on the first idle computing power resource and the second idle computing power resource so as to perform task processing on other models.
14. An electronic device for computing power resource processing, comprising a processor and a memory, wherein at least one instruction or at least one program is stored in the memory, and the at least one instruction or at least one program is loaded and executed by the processor to implement the computing power resource processing method according to any one of claims 1 to 11.
15. A computer-readable storage medium, wherein at least one instruction or at least one program is stored in the storage medium, and the at least one instruction or at least one program is loaded and executed by a processor to implement the computing power resource processing method according to any one of claims 1 to 11.
CN202310681868.1A 2023-06-09 2023-06-09 Computing power resource processing method, device, equipment and storage medium Pending CN117215764A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310681868.1A CN117215764A (en) 2023-06-09 2023-06-09 Computing power resource processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310681868.1A CN117215764A (en) 2023-06-09 2023-06-09 Computing power resource processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117215764A true CN117215764A (en) 2023-12-12

Family

ID=89043198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310681868.1A Pending CN117215764A (en) 2023-06-09 2023-06-09 Computing power resource processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117215764A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117834614A (en) * 2024-01-11 2024-04-05 北京蓝耘科技股份有限公司 GPU resource scheduling method in cloud computing environment
CN117971502A (en) * 2024-03-29 2024-05-03 南京认知物联网研究院有限公司 Method and device for carrying out online optimization scheduling on AI reasoning cluster

Similar Documents

Publication Publication Date Title
US20170329643A1 (en) Distributed node intra-group task scheduling method and system
CN117215764A (en) Computing power resource processing method, device, equipment and storage medium
CN107871164B (en) Fog computing environment personalized deep learning method
Han et al. Tailored learning-based scheduling for kubernetes-oriented edge-cloud system
CN109324875B (en) Data center server power consumption management and optimization method based on reinforcement learning
CN112035238A (en) Task scheduling processing method and device, cluster system and readable storage medium
CN111813545A (en) Resource allocation method, device, medium and equipment
Lakhan et al. Deadline aware and energy-efficient scheduling algorithm for fine-grained tasks in mobile edge computing
Jha et al. Multiobjective deployment of data analysis operations in heterogeneous IoT infrastructure
CN114895773A (en) Energy consumption optimization method, system and device of heterogeneous multi-core processor and storage medium
CN109976873B (en) Scheduling scheme obtaining method and scheduling method of containerized distributed computing framework
Gutierrez-Torre et al. Automatic distributed deep learning using resource-constrained edge devices
CN114995997A (en) Task processing method
CN114741200A (en) Data center station-oriented computing resource allocation method and device and electronic equipment
CN113014649B (en) Cloud Internet of things load balancing method, device and equipment based on deep learning
CN106802822A (en) A kind of cloud data center cognitive resources dispatching method based on moth algorithm
CN114490049A (en) Method and system for automatically allocating resources in containerized edge computing
CN117349026A (en) Distributed computing power scheduling system for AIGC model training
CN115145709B (en) Low-carbon big data artificial intelligence method and medical health state system
CN113535348A (en) Resource scheduling method and related device
CN115827232A (en) Method, device, system and equipment for determining configuration for service model
Senthilkumar et al. Energy aware task scheduling using hybrid firefly-GA in big data
CN113821313A (en) Task scheduling method and device and electronic equipment
Reznik et al. Distributed neural networks for signal change detection: On the way to cognition in sensor networks
CN116594784B (en) Method, device and system for scheduling edges and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication