CN112905333B - Computing power load scheduling method and device for distributed video intelligent analysis platform - Google Patents

Computing power load scheduling method and device for distributed video intelligent analysis platform

Info

Publication number
CN112905333B
Authority
CN
China
Prior art keywords
algorithm
service
computing power
algorithm service
dif
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110092033.3A
Other languages
Chinese (zh)
Other versions
CN112905333A (en)
Inventor
陈卫强
杜渐
段洪琳
张威奕
张凯
王兆明
王库超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhaoshang Xinzhi Technology Co ltd
Original Assignee
Zhaoshang Xinzhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhaoshang Xinzhi Technology Co ltd filed Critical Zhaoshang Xinzhi Technology Co ltd
Priority to CN202110092033.3A priority Critical patent/CN112905333B/en
Publication of CN112905333A publication Critical patent/CN112905333A/en
Application granted granted Critical
Publication of CN112905333B publication Critical patent/CN112905333B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 — Multiprogramming arrangements
    • G06F 9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5083 — Techniques for rebalancing the load in a distributed system
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 — Multiprogramming arrangements
    • G06F 9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 — Allocation of resources to service a request
    • G06F 9/5011 — Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G06F 9/5016 — Allocation of resources to service a request, the resource being the memory
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 — Multiprogramming arrangements
    • G06F 9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 — Allocation of resources to service a request
    • G06F 9/5027 — Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/505 — Allocation of resources to service a request, the resource being a machine, considering the load

Abstract

One or more embodiments of the present disclosure provide a computing power load scheduling method and apparatus for a distributed video intelligent analysis platform. Because a real-time computing power resource acquisition method is adopted, the load scheduling calculation result is more timely and more accurate. Because the load scheduling calculation combines algorithm resource consumption estimation with the real-time resource idle rate of each algorithm service, the load scheduling result can estimate the computing power load of each algorithm service, ensuring the reliability of the scheduling result. And because a load priority value is calculated and compared during load scheduling, the algorithm service with the optimal system load is selected, so that users' algorithm calls are answered faster and the user experience is improved.

Description

Computing power load scheduling method and device for distributed video intelligent analysis platform
Technical Field
One or more embodiments of the present disclosure relate to the technical field of power load scheduling, and in particular, to a power load scheduling method and apparatus for a distributed video intelligent analysis platform.
Background
In recent years, with the rapid development of machine learning and deep learning, more and more products have become automated and intelligent through the application of image analysis techniques such as face recognition and license plate recognition. The growing number of user-side calls leaves many intelligent video analysis platforms overloaded, forcing many platform vendors to respond by upgrading hardware or adding servers; this approach not only raises platform cost but also increases platform maintenance workload. Studying a scheduling method that can fully and flexibly schedule the platform's computing power resources therefore has strong practical value for an intelligent video analysis platform, both in economic benefit and in resource savings.
Conventional load balancing scheduling mainly pursues two goals: first, sharing a large volume of concurrent accesses or data traffic across multiple node devices for parallel processing, reducing the time users wait for a response; second, avoiding heavy load on any single node, thereby greatly improving the processing capacity of the system. Conventional load balancing methods include Round Robin, Least Connection, and Weighted Response. Round Robin and Least Connection suit servers with similar performance and resource consumption, while Weighted Response is complex to implement and cannot exclude the influence of network speed on the response-time calculation.
For a video intelligent analysis platform, the various image analysis algorithms on the platform servers occupy different amounts of system resources, and even for the same algorithm, resource occupation under multi-channel analysis is not a simple per-channel additive multiple of the single-channel case. Consequently, conventional connection counting or response-time calculation cannot truly solve the computing power load balancing scheduling problem of a video intelligent analysis platform.
Disclosure of Invention
In view of this, an object of one or more embodiments of the present disclosure is to provide a method and an apparatus for computing power load scheduling for a distributed video intelligent analysis platform, so as to solve the problem of computing power load balancing scheduling of the video intelligent analysis platform.
In view of the above objects, one or more embodiments of the present disclosure provide a method for computing power load scheduling for a distributed video intelligent analysis platform, including:
Configuring, for different algorithm types and different vendor types, estimated values of GPU video memory consumption, memory consumption, and CPU occupancy to obtain algorithm resource consumption information;
Acquiring computing power resource data, and analyzing algorithm service information and resource state values to obtain computing power resource consumption information corresponding to each algorithm service;
and acquiring request data of a user, judging, based on the algorithm resource consumption information and the computing power resource consumption information, whether each algorithm service meets the load capacity of the user's call, obtaining the current optimal algorithm service, and returning its algorithm service identifier.
Preferably, after obtaining the request data of the user, the method further comprises:
verifying the legality of the user;
If the verification fails, a request-failure reply is returned; if the user is legal, the algorithm type and algorithm vendor type are extracted from the user's request data and the next step of the flow is executed.
Preferably, collecting computing power resource data, analyzing algorithm service information and resource state values, and obtaining computing power resource consumption information corresponding to each algorithm service includes:
Creating an HTTP service for receiving the computing power resource data;
Each algorithm service in the system pushes computing power resource data to the computing power resource data receiving RESTful interface at regular intervals;
after receiving the computing power resource data, analyzing algorithm service information and resource state values;
And storing the algorithm service information and the resource state value into a generated computing power resource information table.
Preferably, the algorithm service information includes, but is not limited to, an algorithm service support algorithm vendor type list and an algorithm service access address, and the resource status values include, but are not limited to, GPU video memory free capacity, memory free capacity and CPU occupancy.
Preferably, the algorithm resource consumption information comprises a GPU video memory consumption estimated value GM, a memory capacity consumption estimated value M, and a CPU occupancy estimated value C0 corresponding to each vendor's each algorithm type; the computing power resource consumption information comprises an algorithm type support list TList, vendor support types, a CPU occupancy C1, a GPU video memory free capacity GMF, and a memory free capacity MF corresponding to each algorithm service; and the request data of the user comprises a user-requested algorithm type T0;
Judging whether the user-requested algorithm type T0 is in the algorithm type support list TList, and if not, returning FALSE;
If so, for each algorithm service whose list TList supports the user-requested algorithm type T0, calculating an estimated CPU idle rate C_DIF = (CT - C0 - C1), an estimated GPU idle video memory capacity GM_DIF = (GMF - GM), and an estimated idle memory capacity M_DIF = (MF - M); if C_DIF, GM_DIF, or M_DIF is smaller than 0, returning FALSE; otherwise, judging that the algorithm service meets the load capacity of the user's call, where CT is a CPU occupancy threshold.
Preferably, obtaining the current optimal algorithm service and returning the algorithm service identifier includes:
Inputting the estimated CPU idle rate C_DIF, estimated GPU idle video memory capacity GM_DIF, and estimated idle memory capacity M_DIF of an algorithm service meeting the load capacity of the user's call;
Reading the resource idle estimates of the current optimal load algorithm service: CPU idle rate BEST_C_DIF, GPU video memory idle capacity BEST_GM_DIF, and memory idle capacity BEST_M_DIF; comparing GM_DIF with BEST_GM_DIF and returning FALSE if GM_DIF is smaller; otherwise calculating the input algorithm service's load weight f = (C_DIF × 0.7 + M_DIF / 1000 × 0.3) and the current optimal algorithm service's load weight f_best = (BEST_C_DIF × 0.7 + BEST_M_DIF / 1000 × 0.3); if f is smaller than f_best, setting the return value to FALSE; otherwise setting the return value to TRUE and taking the input algorithm service as the new optimal algorithm service;
until all algorithm services meeting the load capacity of the user's call have been traversed, then outputting the algorithm service identifier of the optimal algorithm service.
Preferably, the method further comprises:
Generating algorithm service access authorization information according to the returned algorithm service identification, wherein the algorithm service access authorization information comprises an authorization code and authorization time;
Transmitting the authorization code to a corresponding algorithm service;
and returning the RESTful access interface address and authorization information of the algorithm service.
The specification also provides a computing power load scheduling device for the distributed video intelligent analysis platform, comprising:
the algorithm resource consumption estimation module, used for configuring, for different algorithm types and different vendor types, estimated values of GPU video memory consumption, memory consumption, and CPU occupancy to obtain algorithm resource consumption information;
The computing power resource consumption calculation module is used for collecting computing power resource data, analyzing algorithm service information and resource state values and obtaining computing power resource consumption information corresponding to each algorithm service;
And the computing power load scheduling analysis module is used for acquiring request data of a user, judging whether each algorithm service meets the load capacity called by the user based on the algorithm resource consumption information and the computing power resource consumption information, obtaining the current optimal algorithm service, and returning an algorithm service identifier.
From the above, it can be seen that the computing power load scheduling method and apparatus for a distributed video intelligent analysis platform provided in one or more embodiments of the present disclosure adopt a real-time computing power resource acquisition method, so the load scheduling calculation result is more timely and more accurate; the load scheduling calculation combines algorithm resource consumption estimation with the real-time resource idle rate of each algorithm service, so the load scheduling result can estimate the computing power load of each algorithm service, ensuring the reliability of the scheduling result; and a load priority value is calculated and compared during load scheduling, so the algorithm service with the optimal system load is selected, making users' algorithm call responses faster and improving the user experience.
Drawings
For a clearer description of one or more embodiments of the present specification or of prior-art solutions, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings described below illustrate only one or more embodiments of the present specification, and that a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow diagram of a method of computing power load scheduling in accordance with one or more embodiments of the present disclosure;
FIG. 2 is a schematic diagram of a computing power resource data collection flow in accordance with one or more embodiments of the present disclosure;
FIG. 3 is a schematic diagram of a user request processing flow in accordance with one or more embodiments of the present disclosure;
FIG. 4 is a flow diagram of load scheduling analysis in accordance with one or more embodiments of the present disclosure;
FIG. 5 is a schematic diagram of a load capacity estimation algorithm according to one or more embodiments of the present disclosure;
FIG. 6 is a flow diagram of an optimal load alignment algorithm according to one or more embodiments of the present disclosure;
FIG. 7 is a schematic diagram of an overall architecture of a computing power load scheduler in accordance with one or more embodiments of the present disclosure.
Detailed Description
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made in detail to the following specific examples.
It is noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present disclosure should be taken in a general sense as understood by one of ordinary skill in the art to which the present disclosure pertains. The use of the terms "first," "second," and the like in one or more embodiments of the present description does not denote any order, quantity, or importance, but rather the terms "first," "second," and the like are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
One or more embodiments of the present disclosure provide a computing power load scheduling method for a distributed video intelligent analysis platform, including the following steps:
S101, configuring, for different algorithm types and different vendor types, estimated values of GPU video memory consumption, memory consumption, and CPU occupancy to obtain algorithm resource consumption information. For example, the algorithm resource consumption information may be stored by constructing an algorithm resource consumption estimation table, and the configuration may follow user input, e.g. by providing an input interface for the user.
S102, acquiring computing power resource data, and analyzing algorithm service information and resource state values to obtain computing power resource consumption information corresponding to each algorithm service;
As one implementation, an HTTP service for receiving computing power resource data is first created. Each algorithm service in the system pushes computing power resource data to the receiving RESTful interface at regular intervals (the computing power resource data is packaged in JSON format and placed in the BODY field of the HTTP protocol). After the system receives the computing power resource data, it parses out the algorithm service information and the resource state values, where the algorithm service information includes the list of algorithm vendor types supported by the algorithm service and the algorithm service access address, and the resource state values include GPU video memory free capacity, CPU occupancy rate, and the like. Finally, the algorithm service information and its resource state values are stored into a generated computing power resource information table; that is, the computing power resource consumption information is kept in the form of a computing power resource information table.
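As a concrete illustration of the push-and-parse step above, the following Python sketch parses such a JSON payload from the HTTP BODY into algorithm service information and resource state values. All field names and example values are hypothetical; the embodiment only specifies JSON-encoded data carrying these two kinds of information.

```python
import json

# Hypothetical payload as an algorithm service might push it in the HTTP BODY.
# Field names and values are illustrative, not taken from the patent.
body = json.dumps({
    "service_id": "server_F1",
    "vendor_types": ["shangtang", "vendor_x"],
    "access_address": "http://10.19.155.201:10081/api/face/recog",
    "resource_state": {"cpu_usage": 0.6, "gpu_mem_free": 400, "mem_free": 1000},
})

def parse_compute_resource(raw: str) -> dict:
    """Split pushed data into algorithm service info and resource state values."""
    data = json.loads(raw)
    return {
        "service_info": {
            "id": data["service_id"],
            "vendors": data["vendor_types"],
            "address": data["access_address"],
        },
        "state": data["resource_state"],
    }

record = parse_compute_resource(body)
# The record would then be stored into the computing power resource information table.
print(record["state"]["gpu_mem_free"])  # 400
```

In a real deployment the scheduling service would expose this parser behind the listening HTTP endpoint and refresh the in-memory table on every push.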
S103, acquiring request data of a user, judging whether each algorithm service meets the load capacity called by the user based on the algorithm resource consumption information and the computing power resource consumption information, obtaining the current optimal algorithm service, and returning an algorithm service identifier.
As one embodiment, step S103 includes:
1. Judging whether each algorithm service meets the load capacity of the user's call; if so, continuing to obtain the optimal algorithm service; if not, continuing to traverse to the next algorithm service;
The algorithm resource consumption information comprises a GPU video memory consumption estimated value GM, a memory capacity consumption estimated value M, and a CPU occupancy estimated value C0 corresponding to each vendor's each algorithm type; the computing power resource consumption information comprises an algorithm type support list TList, vendor support types, a CPU occupancy C1, a GPU video memory free capacity GMF, and a memory free capacity MF corresponding to each algorithm service; and the request data of the user comprises a user-requested algorithm type T0;
Judging whether the user-requested algorithm type T0 is in the algorithm type support list TList, and if not, returning FALSE;
If so, for each algorithm service whose list TList supports the user-requested algorithm type T0, calculating an estimated CPU idle rate C_DIF = (CT - C0 - C1), an estimated GPU idle video memory capacity GM_DIF = (GMF - GM), and an estimated idle memory capacity M_DIF = (MF - M); if C_DIF, GM_DIF, or M_DIF is smaller than 0, returning FALSE; otherwise, judging that the algorithm service meets the load capacity of the user's call, where CT is a CPU occupancy threshold.
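The load capacity judgment above can be sketched in Python as follows. Variable names follow the patent's symbols (CT, C0, C1, GM, GMF, M, MF); the function boundary and return shape are assumptions for illustration.

```python
def meets_load_capacity(t0, tlist, ct, c0, c1, gm, gmf, m, mf):
    """Load capacity check following the patent's formulas.

    t0: user-requested algorithm type; tlist: types the service supports;
    ct: CPU occupancy threshold; c0/gm/m: estimated CPU/GPU-memory/memory
    consumption of the requested algorithm; c1/gmf/mf: the service's current
    CPU occupancy, free GPU video memory, and free memory.
    Returns (ok, (C_DIF, GM_DIF, M_DIF)) on success, (False, None) otherwise.
    """
    if t0 not in tlist:
        return False, None
    c_dif = ct - c0 - c1   # estimated CPU idle rate
    gm_dif = gmf - gm      # estimated idle GPU video memory
    m_dif = mf - m         # estimated idle memory
    if c_dif < 0 or gm_dif < 0 or m_dif < 0:
        return False, None
    return True, (c_dif, gm_dif, m_dif)

# Hypothetical numbers: the service has enough headroom on all three resources.
ok, margins = meets_load_capacity("facerecog", ["facerecog"],
                                  ct=0.9, c0=0.2, c1=0.3,
                                  gm=500, gmf=800, m=300, mf=1000)
print(ok)
```

A negative margin on any single resource is enough to exclude the service, matching the FALSE branch of the embodiment.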
2. Obtaining the current optimal algorithm service and returning its algorithm service identifier; that is, comparing the load capacity of each algorithm service that meets the load capacity of the user's call with that of the current optimal algorithm service, and, if its load capacity is better, taking that algorithm service as the new optimal algorithm service and returning its algorithm service identifier.
Specifically, the method comprises the following steps:
Inputting the estimated CPU idle rate C_DIF, estimated GPU idle video memory capacity GM_DIF, and estimated idle memory capacity M_DIF of an algorithm service meeting the load capacity of the user's call;
Reading the resource idle estimates of the current optimal load algorithm service: CPU idle rate BEST_C_DIF, GPU video memory idle capacity BEST_GM_DIF, and memory idle capacity BEST_M_DIF; comparing GM_DIF with BEST_GM_DIF and returning FALSE if GM_DIF is smaller; otherwise calculating the input algorithm service's load weight f = (C_DIF × 0.7 + M_DIF / 1000 × 0.3) and the current optimal algorithm service's load weight f_best = (BEST_C_DIF × 0.7 + BEST_M_DIF / 1000 × 0.3); if f is smaller than f_best, setting the return value to FALSE; otherwise setting the return value to TRUE and taking the input algorithm service as the new optimal algorithm service;
until all algorithm services meeting the load capacity of the user's call have been traversed, then outputting the algorithm service identifier of the optimal algorithm service.
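The comparison against the current optimal load algorithm service can be sketched as a small Python predicate using the load weights from the embodiment (f = C_DIF × 0.7 + M_DIF / 1000 × 0.3); the function boundary is an assumption for illustration.

```python
def is_new_best(c_dif, gm_dif, m_dif, best_c_dif, best_gm_dif, best_m_dif):
    """Return True if the candidate service should replace the current best."""
    # A candidate with less estimated idle GPU video memory is rejected outright.
    if gm_dif < best_gm_dif:
        return False
    # Load weights per the embodiment; larger f means a lighter expected load.
    f = c_dif * 0.7 + m_dif / 1000 * 0.3
    f_best = best_c_dif * 0.7 + best_m_dif / 1000 * 0.3
    return f >= f_best

# Candidate with more idle CPU, GPU memory, and memory than the current best.
print(is_new_best(0.3, 500, 800, best_c_dif=0.1, best_gm_dif=400, best_m_dif=600))  # True
```

Applied across all services that pass the load capacity check, the last candidate for which this returns TRUE is the optimal algorithm service whose identifier is output.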
As an embodiment, after obtaining the request data of the user, the method further includes:
verifying the legality of the user;
If the verification fails, a request-failure reply is returned; if the user is legal, the algorithm type and algorithm vendor type are extracted from the user's request data and the next step of the flow is executed.
After the algorithm service identifier is obtained, the method further comprises the following steps:
Generating algorithm service access authorization information according to the returned algorithm service identification, wherein the algorithm service access authorization information comprises an authorization code and authorization time;
Transmitting the authorization code to a corresponding algorithm service;
and returning the RESTful access interface address and authorization information of the algorithm service.
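A minimal sketch of the authorization step, assuming an authorization code of the "task=…&date=…" form used in the worked example later in the description; the function name and dictionary fields are hypothetical.

```python
from datetime import date

def make_authorization(service_id: str, task_id: int, auth_date: date) -> dict:
    """Generate access authorization info (code + time) for a scheduled service."""
    auth_time = auth_date.isoformat()
    return {
        "service_id": service_id,              # identifier returned by scheduling
        "auth_code": f"task={task_id}&date={auth_time}",  # sent to the service
        "auth_time": auth_time,
    }

auth = make_authorization("server_H2", 1000, date(2019, 10, 10))
print(auth["auth_code"])  # task=1000&date=2019-10-10
```

The scheduling service would transmit `auth_code` to the chosen algorithm service and include it, together with the access interface address, in the reply to the user.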
By providing the computing power load scheduling method for the distributed video intelligent analysis platform, the method of the present invention offers the following advantages: because a real-time computing power resource acquisition method is adopted, the load scheduling calculation result is more timely and more accurate; because the load scheduling calculation combines algorithm resource consumption estimation with the real-time resource idle rate of each algorithm service, the load scheduling result can estimate the computing power load of each algorithm service, ensuring the reliability of the scheduling result; because a load priority value is calculated and compared during load scheduling, the algorithm service with the optimal system load is selected, making users' algorithm call responses faster and improving the user experience; and because each algorithm service reports computing power resource data to the scheduling service through a RESTful interface, the algorithm services are decoupled, making the number of algorithm services in the cluster easier to scale.
In summary, the computing power load scheduling method for the distributed video intelligent analysis platform has the advantages of strong real-time performance, accuracy, and reliability, fast user request response, and easy scalability.
The specification also provides a computing power load scheduling device for a distributed video intelligent analysis platform, comprising:
the algorithm resource consumption estimation module, used for configuring, for different algorithm types and different vendor types, estimated values of GPU video memory consumption, memory consumption, and CPU occupancy to obtain algorithm resource consumption information;
The computing power resource consumption calculation module is used for collecting computing power resource data, analyzing algorithm service information and resource state values and obtaining computing power resource consumption information corresponding to each algorithm service;
And the computing power load scheduling analysis module is used for acquiring request data of a user, judging whether each algorithm service meets the load capacity called by the user based on the algorithm resource consumption information and the computing power resource consumption information, obtaining the current optimal algorithm service, and returning an algorithm service identifier.
The beneficial effects of the device are the same as those of the method and are not repeated here.
The implementation steps of the method of the present invention are described below, taking a certain company's distributed-architecture video analysis platform as an example.
The video analysis platform of a certain company adopts a distributed architecture. Its algorithm services comprise a face recognition algorithm service and a safety helmet recognition algorithm service, deployed across four servers A, B, C, and D: server A deploys face recognition algorithm service F1, server B deploys face recognition algorithm service F2, server C deploys safety helmet recognition algorithm service H1, and server D deploys safety helmet recognition algorithm service H2. The platform needs to acquire, in relatively real time and accurately, an optimally loaded algorithm service in the platform for users' algorithm calls, so that users' algorithm call responses are fast and the experience is good.
The computing power load scheduling method of the distributed video intelligent analysis platform fulfills the platform's computing power scheduling requirement, and mainly comprises the processes of constructing an algorithm resource consumption estimation table, collecting computing power resource data in real time, processing user requests, and analyzing the system's computing power load scheduling.
The first step: the method provides a configuration interface so that a platform administrator can configure estimated values of each algorithm's resource consumption. Assuming the user-configured estimates of the face recognition algorithm's resource consumption are as shown in Table 1, the system stores the algorithm resource consumption information entered by the user into a JSON-format file.
TABLE 1
The second step: first, an HTTP service is started to listen for the computing power resource data pushed by the algorithm services. According to the protocol format specification, the computing power resource data is placed in the HTTP BODY as JSON-format text and is divided into two parts: the first part is the algorithm service information, and the second part is the resource state information.
The computing power resource data pushed by the face recognition algorithm services F1 and F2 and the safety helmet recognition algorithm services H1 and H2 is received; the parsed computing power resource data is shown in Table 2. Because this data is updated in real time, the computing power resource consumption information is stored in an in-memory list.
TABLE 2
The third step: first, an HTTP service is started to listen for user requests. According to the user protocol format specification, the user information is stored in the user field of the HTTP protocol header, and the other request parameters are placed in the HTTP BODY as JSON-format text.
The video intelligent analysis platform's scheduling service receives three user requests: request_1, request_2, and request_3; the users' request data is shown in Table 3.
TABLE 3
After protocol parsing, user access legality verification is performed in turn on the three requests request_1, request_2, and request_3. The user information of request_1 is found not to comply with the system authorization rules, so it is an illegal request, and a request failure is immediately replied for request_1. For the algorithm types and vendor types requested by request_2 and request_3, the fourth step, system computing power load scheduling analysis, is invoked; the results obtained are shown in Table 4.
TABLE 4
As can be seen from the computing power load service acquisition results, the service acquisition for user request request_2 fails, so the system immediately replies that request_2 has failed, while the computing power load service acquisition for user request request_3 succeeds.
Finally, an authorization code "task=1000&date=2019-10-10" is generated, and the access interface "http://10.19.155.207:10081/api/helmet/recog" of the algorithm service identified as server_H2 is obtained; then the user access authorization code is sent to the algorithm service server_H2, and a reply to the user request is returned, the reply message carrying the algorithm service access authorization code and the access interface address.
Fourth step: the computing power load scheduling analysis comprises the following specific steps: first, judge whether the algorithm service type is consistent with the algorithm type requested by the user; second, judge whether the algorithm service manufacturer support list contains the algorithm manufacturer type requested by the user; then, estimate whether the algorithm server's computing power resources are sufficient to support the user's algorithm call request; finally, compare the candidates to find the optimally loaded algorithm service at the current moment.
1. Load scheduling calculation procedure for user request request_2:
a) The algorithm type requested by the user is "facerecog"; since the algorithm type of algorithm services H1 and H2 is "helmetrecog", algorithm services H1 and H2 are excluded.
b) The algorithm manufacturer type requested by the user is "shangtang"; algorithm service F2 does not support this manufacturer type, so algorithm service F2 is excluded.
c) Assume the CPU occupancy threshold CT=0.9. From Table 1, the resource consumption estimate of the face recognition algorithm "shangtang" is CPU occupancy C0=0.2, GPU video memory capacity GM=500 and memory capacity M=300; the resource status data of algorithm service F1 is CPU occupancy C1=0.6, GPU video memory free capacity GMF=400 and memory free capacity MF=1000. According to the load capacity calculation formulas: C_DIF=0.1, GM_DIF=-100, M_DIF=700. Since GM_DIF is less than 0, the GPU video memory capacity is insufficient to support the algorithm operation, so algorithm service F1 is excluded.
d) Since none of the algorithm services F1, F2, H1 and H2 satisfies the load condition, FALSE is returned.
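The load-capacity test used in the steps above can be checked with a short sketch. The function name is illustrative; the formulas C_DIF = CT − C0 − C1, GM_DIF = GMF − GM and M_DIF = MF − M and all numbers come from the worked example:

```python
def load_capacity(ct, c0, gm, m, c1, gmf, mf):
    """Return the free-resource estimates (C_DIF, GM_DIF, M_DIF), or None
    if any estimate is negative, i.e. the service cannot carry the call."""
    c_dif = ct - c0 - c1   # estimated CPU idle utilization
    gm_dif = gmf - gm      # estimated free GPU video memory capacity
    m_dif = mf - m         # estimated free memory capacity
    if c_dif < 0 or gm_dif < 0 or m_dif < 0:
        return None
    return c_dif, gm_dif, m_dif

# Algorithm service F1 against the "shangtang" face recognition estimate:
# CT=0.9, C0=0.2, GM=500, M=300; C1=0.6, GMF=400, MF=1000.
result = load_capacity(0.9, 0.2, 500, 300, 0.6, 400, 1000)
# GM_DIF = 400 - 500 = -100 < 0, so F1 fails the check and result is None.
```

With these inputs the GPU video memory term is the binding constraint, matching the exclusion of F1 in step c).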
2. Load scheduling calculation procedure for user request request_3:
a) Algorithm type support judgment: the algorithm type requested by the user is "helmetrecog"; since the algorithm type of algorithm services F1 and F2 is "facerecog", algorithm services F1 and F2 are excluded.
b) Manufacturer type support judgment: the algorithm manufacturer type requested by the user is "yx", and both algorithm services H1 and H2 support this manufacturer type.
c) Load condition judgment: assume the CPU occupancy threshold CT=0.9. From Table 1, the resource consumption estimate of the safety helmet recognition algorithm "yx" is CPU occupancy C0=0.3, GPU video memory capacity GM=500 and memory capacity M=200. The resource status data of algorithm service H1 is CPU occupancy C1=0.5, GPU video memory free capacity GMF=550 and memory free capacity MF=800; by the load capacity calculation formulas: C_DIF=0.1, GM_DIF=50, M_DIF=600. The resource status data of algorithm service H2 is CPU occupancy C1=0.2, GPU video memory free capacity GMF=550 and memory free capacity MF=500; by the load capacity calculation formulas: C_DIF=0.4, GM_DIF=50, M_DIF=300. Algorithm services H1 and H2 therefore both satisfy the load condition for the algorithm call.
d) Optimal load algorithm service comparison: according to the load weight calculation formula, the load weight of algorithm service H1 is f1=0.1×0.7+0.6×0.3=0.25 and that of algorithm service H2 is f2=0.4×0.7+0.3×0.3=0.37. Since f1 is smaller than f2, algorithm service H2 is the optimal load algorithm service for this user request in the current system.
TRUE is returned, carrying the service identifier H2 of algorithm service H2.
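The optimal-service comparison in step d) can likewise be sketched with the load weight formula f = C_DIF × 0.7 + (M_DIF / 1000) × 0.3 from the example; the larger weight indicates more headroom and wins. The function and variable names are illustrative:

```python
def load_weight(c_dif, m_dif):
    # Weight combines CPU headroom (factor 0.7) and memory headroom
    # scaled by 1000 (factor 0.3), per the example's formula.
    return c_dif * 0.7 + (m_dif / 1000) * 0.3

# Free-resource estimates for H1 and H2 from the request_3 walkthrough.
candidates = {
    "H1": {"c_dif": 0.1, "m_dif": 600},
    "H2": {"c_dif": 0.4, "m_dif": 300},
}
best = max(candidates, key=lambda s: load_weight(**candidates[s]))
# f(H1) = 0.1*0.7 + 0.6*0.3 = 0.25; f(H2) = 0.4*0.7 + 0.3*0.3 = 0.37,
# so H2 is selected as the optimal load algorithm service.
```

This reproduces the comparison f1 = 0.25 < f2 = 0.37 and the selection of H2 from step d).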
In summary, the method provides a computing power load scheduling method suitable for a distributed video intelligent analysis platform, covering four aspects: construction of the algorithm resource consumption estimation table, computing power resource data collection, system computing power load scheduling analysis, and user request processing. The algorithm resource consumption estimation table allows the platform to configure estimated computing power resource consumption values for different algorithm types. Computing power resource data collection automatically gathers, in real time, the computing power resource usage data of each algorithm server, including memory capacity, GPU video memory capacity, CPU occupancy rate and the like. The system computing power load scheduling analysis estimates the computing power required by the user request and analyzes and compares algorithm service resources, thereby determining the optimally loaded algorithm service to process the request. User request processing covers extraction of the user's algorithm type, acquisition of the algorithm load scheduling service, and generation and synchronization of the algorithm service access authorization information. In practical application the method achieves good load balancing of the computing power resources of the video intelligent analysis platform.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the disclosure, including the claims, is limited to these examples; combinations of features of the above embodiments or in different embodiments are also possible within the spirit of the present disclosure, steps may be implemented in any order, and there are many other variations of the different aspects of one or more embodiments described above which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure one or more embodiments of the present description. Furthermore, the apparatus may be shown in block diagram form in order to avoid obscuring the one or more embodiments of the present description, and also in view of the fact that specifics with respect to implementation of such block diagram apparatus are highly dependent upon the platform within which the one or more embodiments of the present description are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that one or more embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
The present disclosure is intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Any omissions, modifications, equivalents, improvements, and the like, which are within the spirit and principles of the one or more embodiments of the disclosure, are therefore intended to be included within the scope of the disclosure.

Claims (5)

1. A method for computing power load scheduling for a distributed video intelligent analysis platform, comprising:
configuring, for different algorithm types and different manufacturer types, estimated values of GPU video memory consumption, memory consumption and CPU occupancy to obtain algorithm resource consumption information;
acquiring computing power resource data, and analyzing algorithm service information and resource state values in the computing power resource data to obtain computing power resource consumption information corresponding to each algorithm service;
acquiring request data of a user, judging whether each algorithm service satisfies the load capacity of the user call based on the algorithm resource consumption information and the computing power resource consumption information obtained by parsing the algorithm service information and the resource state values in the computing power resource data, obtaining the current optimal algorithm service, and returning an algorithm service identifier;
wherein acquiring the computing power resource data and parsing the algorithm service information and the resource state values in the computing power resource data to obtain the computing power resource consumption information corresponding to each algorithm service comprises the following steps:
creating an HTTP service for receiving the computing power resource data;
each algorithm service in the system pushing its computing power resource data at regular intervals to the computing power resource data receiving RESTful interface;
after receiving the computing power resource data, analyzing algorithm service information and resource state values in the computing power resource data;
Storing the algorithm service information and the resource state value into a generated computing power resource information table;
wherein the algorithm service information comprises, but is not limited to, a list of algorithm manufacturer types supported by the algorithm service and the algorithm service access address, and the resource state values comprise, but are not limited to, the GPU video memory free capacity, the memory free capacity and the CPU occupancy rate;
the algorithm resource consumption information comprises a GPU video memory consumption estimated value GM, a memory capacity consumption estimated value M and a CPU occupancy rate estimated value C0 corresponding to each algorithm type of each manufacturer; the computing power resource consumption information comprises an algorithm type support list TList, supported manufacturer types, a CPU occupancy rate C1, a GPU video memory free capacity GMF and a memory free capacity MF corresponding to each algorithm service; and the request data of the user comprises a user request algorithm type T0;
judging whether the user request algorithm type T0 is in the algorithm type support list TList, and if not, returning FALSE;
if so, calculating, for each algorithm service whose algorithm type support list TList contains the user request algorithm type T0, an estimated CPU idle utilization C_DIF = (CT - C0 - C1), an estimated free GPU video memory capacity GM_DIF = (GMF - GM) and an estimated free memory capacity M_DIF = (MF - M); if C_DIF is smaller than 0 or GM_DIF is smaller than 0 or M_DIF is smaller than 0, returning FALSE, otherwise judging that the algorithm service satisfies the load capacity of the user call, wherein CT is a CPU occupancy threshold.
2. The method for computing power load scheduling for a distributed video intelligent analysis platform of claim 1, wherein after obtaining the user's request data, the method further comprises:
verifying the legality of the user;
if the verification fails, replying that the request has failed; if the user is legal, extracting the algorithm type and the algorithm manufacturer type from the request data of the user and executing the next step of the flow.
3. The method for computing power load scheduling for a distributed video intelligent analysis platform according to claim 1, wherein said obtaining a current optimal algorithm service and returning an algorithm service identifier comprises:
Inputting an estimated CPU idle utilization C_DIF, an estimated GPU idle video memory capacity GM_DIF and an estimated idle memory capacity M_DIF of an algorithm service meeting the load capacity called by a user;
reading the CPU idle rate BEST_C_DIF, GPU video memory free capacity BEST_GM_DIF and memory free capacity BEST_M_DIF of the resource-free estimate of the current optimal load algorithm service; comparing GM_DIF with BEST_GM_DIF, and returning FALSE if GM_DIF is smaller; otherwise calculating the input algorithm service load weight f = (C_DIF x 0.7 + M_DIF/1000 x 0.3) and the current optimal algorithm service load weight f_best = (BEST_C_DIF x 0.7 + BEST_M_DIF/1000 x 0.3); if f is smaller than f_best, setting the return value to FALSE, otherwise setting the return value to TRUE and taking the input algorithm service as the new optimal algorithm service;
until all algorithm services satisfying the load capacity of the user call have been traversed, and then outputting the algorithm service identifier of the optimal algorithm service.
4. The method for computing power load scheduling for a distributed video intelligent analysis platform of claim 1, further comprising:
Generating algorithm service access authorization information according to the returned algorithm service identification, wherein the algorithm service access authorization information comprises an authorization code and authorization time;
transmitting the authorization code to the corresponding algorithm service; and
returning the RESTful access interface address and the authorization information of the algorithm service.
5. A computing power load scheduling device for a distributed video intelligent analysis platform, comprising:
an algorithm resource consumption estimation module, configured to configure, for different algorithm types and different manufacturer types, estimated values of GPU video memory consumption, memory consumption and CPU occupancy to obtain algorithm resource consumption information;
the computing power resource consumption calculation module is used for collecting computing power resource data, analyzing algorithm service information and resource state values in the computing power resource data, and obtaining computing power resource consumption information corresponding to each algorithm service;
The computing power load scheduling analysis module is used for acquiring request data of a user, judging whether each algorithm service meets the load capacity called by the user or not based on the algorithm resource consumption information and the computing power resource consumption information obtained by analyzing the algorithm service information and the resource state value in the computing power resource data, obtaining the current optimal algorithm service, and returning an algorithm service identifier;
wherein acquiring the computing power resource data and parsing the algorithm service information and the resource state values in the computing power resource data to obtain the computing power resource consumption information corresponding to each algorithm service comprises the following steps:
creating an HTTP service for receiving the computing power resource data;
each algorithm service in the system pushing its computing power resource data at regular intervals to the computing power resource data receiving RESTful interface;
after receiving the computing power resource data, analyzing algorithm service information and resource state values in the computing power resource data;
Storing the algorithm service information and the resource state value into a generated computing power resource information table;
wherein the algorithm service information comprises, but is not limited to, a list of algorithm manufacturer types supported by the algorithm service and the algorithm service access address, and the resource state values comprise, but are not limited to, the GPU video memory free capacity, the memory free capacity and the CPU occupancy rate;
the algorithm resource consumption information comprises a GPU video memory consumption estimated value GM, a memory capacity consumption estimated value M and a CPU occupancy rate estimated value C0 corresponding to each algorithm type of each manufacturer; the computing power resource consumption information comprises an algorithm type support list TList, supported manufacturer types, a CPU occupancy rate C1, a GPU video memory free capacity GMF and a memory free capacity MF corresponding to each algorithm service; and the request data of the user comprises a user request algorithm type T0;
judging whether the user request algorithm type T0 is in the algorithm type support list TList, and if not, returning FALSE;
if so, calculating, for each algorithm service whose algorithm type support list TList contains the user request algorithm type T0, an estimated CPU idle utilization C_DIF = (CT - C0 - C1), an estimated free GPU video memory capacity GM_DIF = (GMF - GM) and an estimated free memory capacity M_DIF = (MF - M); if C_DIF is smaller than 0 or GM_DIF is smaller than 0 or M_DIF is smaller than 0, returning FALSE, otherwise judging that the algorithm service satisfies the load capacity of the user call, wherein CT is a CPU occupancy threshold.
CN202110092033.3A 2021-01-23 2021-01-23 Computing power load scheduling method and device for distributed video intelligent analysis platform Active CN112905333B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110092033.3A CN112905333B (en) 2021-01-23 2021-01-23 Computing power load scheduling method and device for distributed video intelligent analysis platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110092033.3A CN112905333B (en) 2021-01-23 2021-01-23 Computing power load scheduling method and device for distributed video intelligent analysis platform

Publications (2)

Publication Number Publication Date
CN112905333A CN112905333A (en) 2021-06-04
CN112905333B true CN112905333B (en) 2024-04-26

Family

ID=76118604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110092033.3A Active CN112905333B (en) 2021-01-23 2021-01-23 Computing power load scheduling method and device for distributed video intelligent analysis platform

Country Status (1)

Country Link
CN (1) CN112905333B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113641124B (en) * 2021-08-06 2023-03-10 珠海格力电器股份有限公司 Calculation force distribution method and device, controller and building control system
WO2024007171A1 (en) * 2022-07-05 2024-01-11 北京小米移动软件有限公司 Computing power load balancing method and apparatuses
CN115220916B (en) * 2022-07-19 2023-09-26 浙江通见科技有限公司 Automatic calculation scheduling method, device and system of video intelligent analysis platform
CN117573371B (en) * 2024-01-09 2024-03-29 支付宝(杭州)信息技术有限公司 Scheduling method and device for service running based on graphic processor

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103338228A (en) * 2013-05-30 2013-10-02 江苏大学 Cloud calculating load balancing scheduling algorithm based on double-weighted least-connection algorithm
CN104331325A (en) * 2014-11-25 2015-02-04 深圳市信义科技有限公司 Resource exploration and analysis-based multi-intelligence scheduling system and resource exploration and analysis-based multi-intelligence scheduling method for video resources
CN105656973A (en) * 2014-11-25 2016-06-08 中国科学院声学研究所 Distributed method and system for scheduling tasks in node group
CN106657379A (en) * 2017-01-06 2017-05-10 重庆邮电大学 Implementation method and system for NGINX server load balancing
CN109918198A (en) * 2019-02-18 2019-06-21 中国空间技术研究院 A kind of emulation cloud platform load dispatch system and method based on user characteristics prediction
CN110096349A (en) * 2019-04-10 2019-08-06 山东科技大学 A kind of job scheduling method based on the prediction of clustered node load condition
CN111124662A (en) * 2019-11-07 2020-05-08 北京科技大学 Fog calculation load balancing method and system


Also Published As

Publication number Publication date
CN112905333A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
CN112905333B (en) Computing power load scheduling method and device for distributed video intelligent analysis platform
CN102891868B (en) The load-balancing method of a kind of distributed system and device
CN109995859A (en) A kind of dispatching method, dispatch server and computer readable storage medium
CN108111554B (en) Control method and device for access queue
CN106407002B (en) Data processing task executes method and apparatus
US9501326B2 (en) Processing control system, processing control method, and processing control program
CN110597719B (en) Image clustering method, device and medium for adaptation test
CN108924238A (en) Track collision analysis method and device
CN105897947A (en) Network access method and device for mobile terminal
CN108702334B (en) Method and system for distributed testing of network configuration for zero tariffs
CN110351311A (en) Load-balancing method and computer storage medium
CN107733805B (en) Service load scheduling method and device
CN108600399A (en) Information-pushing method and Related product
CN105610958A (en) Method and device for selecting time synchronization server and intelligent terminal
CN113326946A (en) Method, device and storage medium for updating application recognition model
US11087382B2 (en) Adapting digital order to venue service queue
CN109697155B (en) IT system performance evaluation method, device, equipment and readable storage medium
CN114500381A (en) Network bandwidth limiting method, system, electronic device and readable storage medium
CN108280024B (en) Flow distribution strategy testing method and device and electronic equipment
CN110351345B (en) Method and device for processing service request
CN114331446B (en) Method, device, equipment and medium for realizing out-of-chain service of block chain
CN110991253A (en) Block chain-based face digital identity recognition method and device
CN114722282A (en) Resource acquisition method, server and terminal
CN111625375B (en) Account reservation method and device, storage medium and electronic equipment
CN113055199B (en) Gateway access method and device and gateway equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant