CN116414567A

CN116414567A - Resource scheduling method, device and equipment of intelligent automobile operating system

Info

Publication number: CN116414567A
Application number: CN202310542184.3A
Authority: CN
Inventors: 朱林法; 刘洪振; 许智
Original assignee: Zebred Network Technology Co Ltd
Current assignee: Zebred Network Technology Co Ltd
Priority date: 2023-05-12
Filing date: 2023-05-12
Publication date: 2023-07-11

Abstract

The application provides a resource scheduling method, device and equipment for an intelligent automobile operating system. The method comprises: performing at least one performance test on the artificial intelligence models in the intelligent automobile operating system to obtain at least one static performance index, identifying a priority static index among the at least one static performance index, and setting the static strategy corresponding to the priority static index as the static allocation strategy; allocating graphics processor resources to each artificial intelligence model in the intelligent automobile operating system according to the static allocation strategy; running at least one artificial intelligence model in the intelligent automobile operating system and monitoring the dynamic performance index of each running artificial intelligence model; and generating a dynamic allocation strategy according to the dynamic performance index, and adjusting the graphics processor resources allocated to each artificial intelligence model according to the dynamic allocation strategy. The method avoids the situation in which multiple requests contend for a working thread in the graphics processor at the same time, and keeps graphics processor resource calls stable.

Description

Resource scheduling method, device and equipment of intelligent automobile operating system

Technical Field

The application relates to the technical field of intelligent automobiles, in particular to a resource scheduling method, device and equipment of an intelligent automobile operating system.

Background

With the development of intelligent driving technology, automatic driving scenarios for intelligent automobiles have emerged. An automatic driving scenario is built on the inference results that a graphics processor generates through a plurality of artificial intelligence models. An artificial intelligence model is a series of calculations and rules for solving a problem or analyzing a set of data; it resembles a flowchart containing a step-by-step description of the problem, but is written in mathematical and programming code form. Artificial intelligence models are therefore typically used to implement specified tasks such as radar sensing, visual sensing and path planning. To obtain an inference result, an inference request is sent to the graphics processor on behalf of an artificial intelligence model, so that the graphics processor calls the model and generates the inference result from the collected data loaded into the model.

However, the inventors found that when a plurality of artificial intelligence models in a current automatic driving scenario send inference requests to the graphics processor, multiple requests easily contend for the same working thread in the graphics processor at the same time. The working thread is then called frequently by conflicting inference requests, and graphics processor resource calls become unstable.

Disclosure of Invention

The application provides a resource scheduling method, device and equipment for an intelligent automobile operating system, to solve the following problem: when a plurality of artificial intelligence models in a current automatic driving scenario send inference requests to a graphics processor, multiple requests easily contend for the same working thread in the graphics processor at the same time, so that the working thread is called frequently by conflicting inference requests and graphics processor resource calls become unstable.

In a first aspect, the present application provides a resource scheduling method for a graphics processor of an operating system of an intelligent automobile, including:

performing at least one performance test on the artificial intelligence models in the intelligent automobile operating system to obtain at least one static performance index; identifying a priority static index among the at least one static performance index, and setting the static strategy corresponding to the priority static index as the static allocation strategy; wherein the intelligent automobile operating system has at least one artificial intelligence model, the artificial intelligence model is used to implement a specified task, the specified task is used to realize automatic driving of the automobile, the static performance index reflects the performance of the artificial intelligence model in a performance test, the priority static index is a static performance index that meets a preset priority rule, the static strategy is a computer strategy for grouping the artificial intelligence models and allocating working threads to the grouped artificial intelligence models, and a working thread is a sequence used to schedule stream processors and/or computing units in the graphics processor;

allocating graphics processor resources to each artificial intelligence model in the intelligent automobile operating system according to the static allocation strategy;

running at least one artificial intelligent model in the intelligent automobile operating system, and monitoring dynamic performance indexes of the running artificial intelligent model, wherein the dynamic performance indexes reflect performance of the artificial intelligent model when the distributed graphic processor resources are called for operation;

generating a dynamic allocation strategy according to the dynamic performance index, and adjusting the graphics processor resources allocated to each artificial intelligent model according to the dynamic allocation strategy, wherein the dynamic allocation strategy is used for allocating the graphics processor resources to the running artificial intelligent model according to the performance of the artificial intelligent model during running.

In the above scheme, performing at least one performance test on the artificial intelligence models in the intelligent automobile operating system to obtain at least one static performance index includes:

acquiring at least one static strategy, respectively grouping artificial intelligent models in the intelligent automobile operating system according to each static strategy, and distributing working threads to each group of artificial intelligent models to obtain at least one static sample, wherein the static strategy is a computer strategy for grouping the artificial intelligent models and distributing the working threads to the grouped artificial intelligent models;

loading preset test cases into the artificial intelligence models in each static sample, and running the test cases in the artificial intelligence models through the working threads allocated to those models in each static sample, so as to perform a performance test on each static sample and obtain at least one static performance index corresponding to the at least one static strategy, wherein the test cases are cases used to test the performance of the artificial intelligence models.

In the above scheme, according to each static policy, respectively grouping the artificial intelligent models in the intelligent automobile operating system and distributing working threads to each group of artificial intelligent models to obtain at least one static sample, including:

grouping each artificial intelligent model in the intelligent automobile operating system according to the partitioning rule in the static strategy to obtain at least one test group, wherein the test group is provided with at least one artificial intelligent model;

acquiring at least one working thread in the graphic processor, and distributing one working thread to each test group according to a resource distribution rule in the static strategy;

and combining the at least one test group and the working thread corresponding to each test group to form the static sample corresponding to the static strategy.

In the above scheme, grouping each artificial intelligent model in the intelligent automobile operating system according to the division rule in the static strategy to obtain at least one test group, including:

if at least one directed acyclic graph is determined to exist in the intelligent automobile operating system, dividing the artificial intelligent models belonging to the same directed acyclic graph into a test group, wherein the directed acyclic graph reflects the logic relationship between two or more artificial intelligent models in the intelligent automobile operating system;

if it is determined that other artificial intelligence models which do not belong to the directed acyclic graph exist in the intelligent automobile operating system, grouping the other artificial intelligence models according to model attribute data of each other artificial intelligence model to obtain at least one test group, wherein the model attribute data describe calculation power consumed by the artificial intelligence model for realizing a specified task;

and if it is determined that no directed acyclic graph exists in the intelligent automobile operating system, grouping the artificial intelligence models in the intelligent automobile operating system according to the model attribute data of each artificial intelligence model to obtain at least one test group.
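The three grouping cases above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the edge-list representation of the directed acyclic graphs, the `compute_cost` figure standing in for the model attribute data, the `bucket_size` binning rule, and the function name `group_models` are all hypothetical.

```python
from collections import defaultdict


def group_models(models, dag_edges, compute_cost, bucket_size=10):
    """Group models: members of one DAG together, the rest by compute cost.

    models       -- list of model names
    dag_edges    -- list of (upstream, downstream) pairs (assumed format)
    compute_cost -- dict: model name -> assumed compute-cost figure
    bucket_size  -- assumed width of one cost bucket
    """
    # Union-find over the edges: models linked by edges belong to one DAG.
    parent = {m: m for m in models}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path compression
            m = parent[m]
        return m

    in_dag = set()
    for upstream, downstream in dag_edges:
        parent[find(upstream)] = find(downstream)
        in_dag.update((upstream, downstream))

    dag_groups = defaultdict(list)
    cost_groups = defaultdict(list)
    for m in models:
        if m in in_dag:
            # Cases 1: models of the same DAG form one test group.
            dag_groups[find(m)].append(m)
        else:
            # Cases 2 and 3: remaining models are grouped by attribute data.
            cost_groups[compute_cost[m] // bucket_size].append(m)

    # DAG groups first, then attribute groups in ascending cost order.
    groups = list(dag_groups.values())
    groups += [cost_groups[k] for k in sorted(cost_groups)]
    return groups
```

When `dag_edges` is empty, every model falls through to the attribute-based branch, which corresponds to the case in which no directed acyclic graph exists.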

In the above scheme, identifying a priority static index of the at least one static performance index includes:

extracting a first index element from each static performance index, and sorting the first index elements to obtain a target sequence, wherein each static performance index has at least one static index element, a static index element reflects the performance of an artificial intelligence model in one performance dimension in a performance test, and the first index element is one static index element in the static performance index;

determining a performance value of each first index element according to the rank of each first index element in the target sequence, wherein the performance value reflects the performance quality degree of the first index element;

obtaining the comprehensive performance value of each static performance index according to the performance value of each static index element in each static performance index;

and setting the static performance index with the highest comprehensive performance value as a priority static index.
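The rank-based selection above can be sketched as follows. The source does not state how a rank is converted to a performance value or how the values are combined, so the scoring used here (worst rank earns 1 point, best rank earns one point per candidate, values are summed) is an assumption, as are the element names `latency_ms` and `throughput` and the function name `pick_priority_index`.

```python
def pick_priority_index(indices, lower_is_better=("latency_ms",)):
    """Return the strategy whose static performance index scores highest.

    indices         -- dict: strategy name -> {element name: measured value}
    lower_is_better -- element names for which a smaller value is better
    """
    names = list(indices)
    elements = list(next(iter(indices.values())))
    totals = {name: 0 for name in names}

    for element in elements:
        # Build the target sequence for this element, ordered worst to best.
        # For lower-is-better elements the largest value is the worst one.
        worst_first = sorted(
            names,
            key=lambda n: indices[n][element],
            reverse=element in lower_is_better,
        )
        # Assumed scoring: performance value equals position in the sequence.
        for points, name in enumerate(worst_first, start=1):
            totals[name] += points

    # The index with the highest comprehensive value is the priority index.
    best = max(names, key=lambda n: totals[n])
    return best, totals


indices = {
    "static policy 1": {"latency_ms": 20, "throughput": 50},
    "static policy 2": {"latency_ms": 15, "throughput": 60},
    "static policy 3": {"latency_ms": 30, "throughput": 55},
}
best, totals = pick_priority_index(indices)
```

With these made-up measurements, static policy 2 ranks best on both elements and is therefore selected.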

In the above scheme, according to the static allocation policy, allocating graphics processor resources for each artificial intelligent model in the intelligent automobile operating system includes:

grouping the artificial intelligence models in the intelligent automobile operating system according to the partitioning rule in the static allocation strategy to obtain at least one running group;

and acquiring at least one working thread in the graphics processor, and allocating one working thread to each running group according to the resource allocation rule in the static allocation strategy, so as to allocate graphics processor resources to each artificial intelligence model in each running group.

In the above scheme, generating the dynamic allocation policy according to the dynamic performance index includes:

extracting a second index element in the dynamic performance index, and acquiring an index rule corresponding to the second index element, wherein the dynamic performance index is provided with at least one index element, the second index element is one index element in the dynamic performance index, and the index rule is a computer rule for defining the normal and abnormal index elements;

if the second index element is determined to accord with the index rule, setting the second index element as a normal index element;

if the second index element is determined not to accord with the index rule, setting the second index element as an abnormal index element;

if the number of normal index elements in the dynamic performance index does not reach a preset normal threshold, or the number of abnormal index elements reaches a preset abnormal threshold, determining the dynamic performance index to be an abnormal performance index;

if the number of normal index elements in the dynamic performance index reaches the preset normal threshold, or the number of abnormal index elements does not reach the preset abnormal threshold, determining the dynamic performance index to be a normal performance index;

and generating a dynamic allocation strategy according to the normal performance index and the abnormal performance index.
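The classification steps above can be sketched as follows. The element names, limit values and thresholds are hypothetical, and because the source joins both the abnormal and the normal conditions with "or", checking the abnormal condition first is an assumption about precedence.

```python
def classify_index(index, rules, normal_threshold, abnormal_threshold):
    """Classify one dynamic performance index as "normal" or "abnormal".

    index -- dict: element name -> measured value
    rules -- dict: element name -> predicate that is True for a normal value
    """
    # An element that satisfies its index rule is a normal index element.
    normal = sum(1 for name, value in index.items() if rules[name](value))
    abnormal = len(index) - normal

    # Assumed precedence: the abnormal condition is evaluated first.
    if normal < normal_threshold or abnormal >= abnormal_threshold:
        return "abnormal"
    return "normal"


# Hypothetical index rules for three of the monitored elements.
rules = {
    "cpu_utilization": lambda v: v < 0.90,
    "delay_ms": lambda v: v < 50,
    "frame_rate": lambda v: v >= 24,
}

healthy = {"cpu_utilization": 0.50, "delay_ms": 30, "frame_rate": 30}
overloaded = {"cpu_utilization": 0.95, "delay_ms": 80, "frame_rate": 30}

healthy_label = classify_index(healthy, rules, normal_threshold=2, abnormal_threshold=2)
overloaded_label = classify_index(overloaded, rules, normal_threshold=2, abnormal_threshold=2)
```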

In the above scheme, generating a dynamic allocation policy according to the normal performance index and the abnormal performance index includes:

setting an artificial intelligent model corresponding to the normal performance index as a normal model, setting an artificial intelligent model corresponding to the abnormal performance index as an abnormal model, setting an operation group in which the normal model is located and an operation group in which an artificial intelligent model which does not operate in the intelligent automobile operating system is located as a normal group, and setting the operation group in which the abnormal model is located as an abnormal group;

if it is determined that the abnormal model has a logical relationship with other artificial intelligence models in the abnormal group: adjusting the partitioning rule in the static allocation strategy or the dynamic allocation strategy so that the adjusted partitioning rule moves the independent models in the abnormal group to a normal group; and/or adjusting the resource allocation rule in the static allocation strategy or the dynamic allocation strategy so that the adjusted resource allocation rule changes the working thread corresponding to the abnormal group to a working thread corresponding to the normal group; wherein an independent model is an artificial intelligence model that has no logical relationship with the other artificial intelligence models in the abnormal group;

if it is determined that the abnormal model has no logical relationship with other artificial intelligence models in the abnormal group: adjusting the partitioning rule in the static allocation strategy or the dynamic allocation strategy so that the adjusted partitioning rule moves the abnormal model to a normal group; and/or adjusting the resource allocation rule in the static allocation strategy or the dynamic allocation strategy so that the adjusted resource allocation rule changes the working thread corresponding to the abnormal group to a working thread corresponding to the normal group;

and generating a dynamic allocation strategy according to the adjusted partitioning rule and/or the adjusted resource allocation rule.
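The partitioning-rule adjustment above can be sketched as follows. The sketch covers only the regrouping branch, not the working-thread reassignment, and it rests on assumptions: logical relationships are given as unordered pairs, the target normal group is chosen by the caller, and the name `rebalance` is hypothetical.

```python
def rebalance(groups, abnormal_model, related, normal_group_id):
    """Return new groups after moving models out of the abnormal group.

    groups          -- dict: group id -> list of model names
    abnormal_model  -- model whose dynamic performance index was abnormal
    related         -- set of frozenset pairs that have a logical relationship
    normal_group_id -- id of a group already classified as normal
    """
    new_groups = {gid: list(members) for gid, members in groups.items()}
    abnormal_gid = next(g for g, ms in groups.items() if abnormal_model in ms)
    members = groups[abnormal_gid]

    def has_relation(model):
        # True if the model is logically related to any other group member.
        return any(
            frozenset((model, other)) in related
            for other in members
            if other != model
        )

    if has_relation(abnormal_model):
        # Related models stay together; independent models are moved away.
        movers = [m for m in members if not has_relation(m)]
    else:
        # The abnormal model is itself independent, so it is moved.
        movers = [abnormal_model]

    for model in movers:
        new_groups[abnormal_gid].remove(model)
        new_groups[normal_group_id].append(model)
    return new_groups


groups = {"g1": ["a", "b", "c", "e"], "g2": ["d"]}
related = {frozenset(("a", "b"))}

after_a = rebalance(groups, "a", related, "g2")  # "a" is related to "b"
after_c = rebalance(groups, "c", related, "g2")  # "c" is independent
```

In the first call the independent models "c" and "e" leave the abnormal group; in the second call only the abnormal model "c" is moved.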

In a second aspect, the present application provides a resource scheduling device for a graphics processor of an operating system of a smart car, including:

the static test module is used for performing at least one performance test on the artificial intelligence models in the intelligent automobile operating system to obtain at least one static performance index, identifying a priority static index among the at least one static performance index, and setting the static strategy corresponding to the priority static index as the static allocation strategy; wherein the intelligent automobile operating system has at least one artificial intelligence model, the artificial intelligence model is used to implement a specified task, the specified task is used to realize automatic driving of the automobile, the static performance index reflects the performance of the artificial intelligence model in a performance test, the priority static index is a static performance index that meets a preset priority rule, the static strategy is a computer strategy for grouping the artificial intelligence models and allocating working threads to the grouped artificial intelligence models, and a working thread is a sequence used to schedule stream processors and/or computing units in the graphics processor;

The static allocation module is used for allocating graphics processor resources for each artificial intelligent model in the intelligent automobile operating system according to the static allocation strategy;

the dynamic monitoring module is used for running at least one artificial intelligent model in the intelligent automobile operating system and monitoring dynamic performance indexes of the running artificial intelligent model, wherein the dynamic performance indexes reflect performance of the artificial intelligent model when the distributed graphic processor resource is called for operation;

and the dynamic allocation module is used for generating a dynamic allocation strategy according to the dynamic performance index, and adjusting the graphics processor resources allocated to each artificial intelligent model according to the dynamic allocation strategy, wherein the dynamic allocation strategy is used for allocating the graphics processor resources to the running artificial intelligent model according to the performance of the artificial intelligent model during running.

In a third aspect, the present application provides a computer device comprising: a processor and a memory communicatively coupled to the processor;

the memory stores computer-executable instructions;

the processor executes computer-executable instructions stored in the memory to implement the resource scheduling method of the graphics processor as described in claim.

In a fourth aspect, the present application provides a computer readable storage medium having stored therein computer executable instructions that when executed by a processor are configured to implement the above-described resource scheduling method for a graphics processor.

In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the above-described resource scheduling method of a graphics processor.

According to the resource scheduling method, device and equipment for the intelligent automobile operating system, at least one performance test is conducted on the artificial intelligent model in the intelligent automobile operating system, so that before the intelligent automobile operating system is operated, the static performance test is conducted on the artificial intelligent model in the intelligent automobile operating system, at least one static performance index is obtained, and therefore performance of each artificial intelligent model in the intelligent automobile operating system is determined when different groups are formed and different working threads are distributed to the artificial intelligent model.

By identifying the priority static index among the at least one static performance index, the grouping mode and working-thread matching mode with the best performance are identified and set as the static allocation strategy, so that the allocation of graphics processor resources is made as reasonable as possible before the intelligent automobile operating system runs.

Graphics processor resources are allocated to each artificial intelligence model through the static allocation strategy corresponding to the priority static index, so that each artificial intelligence model receives the best-performing configuration of graphics processor resources before the intelligent automobile operating system runs.

The dynamic performance index of each running artificial intelligence model is monitored through the performance acquisition module. The index elements in the dynamic performance index include CPU utilization, memory occupancy, disk IO, system average load, delay and frame rate. By generating the dynamic allocation strategy according to the dynamic performance index, graphics processor resources are allocated to each artificial intelligence model based on its performance during the current run. This ensures that each running model has sufficient graphics processor resources to call, resolves conflicts between the inference requests generated by the models when they call working threads of the graphics processor, avoids multiple requests contending for a working thread at the same time, maintains the performance of the plurality of artificial intelligence models in the automatic driving scenario, and thereby keeps graphics processor resource calls stable.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.

FIG. 1 is a schematic diagram of an application scenario provided in an embodiment of the present application;

FIG. 2 is a flowchart of Embodiment 1 of the resource scheduling method for a graphics processor provided in an embodiment of the present application;

FIG. 3 is a schematic diagram of the program modules of the resource scheduling device for a graphics processor according to the present application;

FIG. 4 is a schematic diagram of the hardware structure of the computer device according to the present application.

Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.

Referring to fig. 1, the specific application scenario in the present application is:

A control unit 11, which runs the resource scheduling method for the graphics processor of the intelligent automobile operating system, is installed in the intelligent automobile operating system 1, and the control unit 11 is connected with the artificial intelligence model 12 and the graphics processor 13 in the intelligent automobile operating system 1.

The control unit 11 performs at least one performance test on the artificial intelligence model 12 in the intelligent automobile operating system 1 to obtain at least one static performance index, identifies a priority static index among the at least one static performance index, and sets the static strategy corresponding to the priority static index as the static allocation strategy.

The control unit 11 allocates graphics processor 13 resources to each artificial intelligence model 12 in the intelligent car operating system 1 according to a static allocation strategy.

The control unit 11 runs at least one artificial intelligence model 12 in the intelligent car operating system 1 and monitors the dynamic performance index of the running artificial intelligence model 12.

The control unit 11 generates a dynamic allocation policy according to the dynamic performance index, and adjusts the graphics processor resources allocated to each artificial intelligence model 12 according to the dynamic allocation policy, where the dynamic allocation policy is used to allocate the graphics processor resources to the running artificial intelligence model 12 according to the performance of the artificial intelligence model 12 during running.

The control unit 11 is a computer module mounted in a circuit or chip that carries the intelligent automobile operating system 1, and the graphics processor 13 is a graphics accelerator mounted in that circuit or chip as part of the intelligent automobile operating system 1.

The following describes the technical solutions of the present application and how the technical solutions of the present application solve the prior art problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.

Example 1:

Referring to FIG. 2, the present application provides a resource scheduling method for a graphics processor of an intelligent automobile operating system, including:

S201: performing at least one performance test on the artificial intelligence models in the intelligent automobile operating system to obtain at least one static performance index; identifying a priority static index among the at least one static performance index, and setting the static strategy corresponding to the priority static index as the static allocation strategy; wherein the intelligent automobile operating system has at least one artificial intelligence model, the artificial intelligence model is used to implement a specified task, the specified task is used to realize automatic driving of the automobile, the static performance index reflects the performance of the artificial intelligence model in a performance test, the priority static index is a static performance index that meets a preset priority rule, the static strategy is a computer strategy for grouping the artificial intelligence models and allocating working threads to the grouped artificial intelligence models, and a working thread is a sequence used to schedule stream processors and/or computing units in the graphics processor.

In this step, the intelligent automobile operating system is the underlying operating system (OS) of the intelligent automobile. It controls and manages the hardware and software resources of the whole intelligent automobile and provides interfaces and environments for users and other software. Current intelligent automobile operating systems include the autopilot OS and the smart cockpit OS. The autopilot OS has very high requirements for safety, real-time performance and stability, whereas the smart cockpit OS places more emphasis on openness and compatibility.

The autopilot OS is mainly used to control the chassis and power of the vehicle so as to realize basic driving functions such as throttle, steering, gear shifting and braking. The autopilot OS is the core of the Level 3 (L3) and higher automatic driving functions currently under development, and it integrates technologies from many industries, such as high-performance complex embedded systems, artificial intelligence chips and algorithms, high-speed networks, mass data processing, and cloud machine coordination.

The smart cockpit OS is a complete system composed of different cockpit electronics. The smart cockpit is mainly divided into five parts: the in-vehicle infotainment system, the streaming-media central rearview mirror, the head-up display (HUD) system, the full liquid crystal instrument, and the vehicle networking module. The smart cockpit realizes human-computer interaction through multi-screen fusion, using the liquid crystal instrument, the HUD, the central control screen, the central control in-vehicle information terminal, the rear-seat HMI entertainment screen, the interior and exterior rearview mirrors and the like as carriers to realize more intelligent interaction modes such as voice control and gesture operation. Technologies such as artificial intelligence, AR, ADAS and VR may be incorporated in the future.

The worker thread is used to schedule a stream processor (SP, streaming processor), which is the most basic processing unit in the graphics processor, also called CUDA core, on which specific instructions and tasks are processed.

Worker threads are also used to schedule computing units (SM, streaming multiprocessor, also known as graphics processor cores) in the graphics processor. An SM is a large core of the graphics processor, composed of multiple SPs plus some other resources such as registers and shared memory. The SM can be regarded as the heart of the graphics processor (comparable to a CPU core), and its registers and shared memory are scarce resources.

Performing at least one performance test on the artificial intelligence models before the intelligent automobile operating system runs constitutes a static performance test. It yields at least one static performance index, which shows how each artificial intelligence model in the intelligent automobile operating system performs when the models are grouped in different ways and different working threads are allocated to them.

The optimal grouping mode and the working thread matching mode are identified by identifying the priority static index in at least one static performance index, so that the allocation rationality of the graphics processor resources is optimized to the maximum degree before the intelligent automobile system is operated, and the optimal grouping mode and the working thread matching mode are set as a static allocation strategy.

In a preferred embodiment, at least one performance test is performed on an artificial intelligence model in an operating system of a smart car to obtain at least one static performance indicator, including:

obtaining at least one static strategy, respectively grouping artificial intelligent models in an intelligent automobile operating system according to each static strategy, and distributing working threads to each group of artificial intelligent models to obtain at least one static sample, wherein the static strategy comprises a division rule and a resource distribution rule, the division rule is used for grouping the artificial intelligent models in the intelligent automobile operating system, and the resource distribution rule is used for distributing the working threads to each group of artificial intelligent models.

Specifically, according to each static strategy, respectively grouping artificial intelligent models in an intelligent automobile operating system and distributing working threads to each group of artificial intelligent models to obtain at least one static sample, wherein the method comprises the following steps:

grouping each artificial intelligent model in the intelligent automobile operating system according to a partitioning rule in the static strategy to obtain at least one test group, wherein the test group is provided with at least one artificial intelligent model;

Acquiring at least one working thread in a graphic processor, and distributing one working thread to each test group according to a resource distribution rule in a static strategy;

Illustratively, suppose the artificial intelligent models in the intelligent automobile operating system are: model 1, model 2, model 3, model 4, and model 5.

If static policy 1 groups the models two by two and places any remaining model in a group of its own, the result is: test group 1: model 1, model 2; test group 2: model 3, model 4; test group 3: model 5.

If static policy 2 places a specified model (e.g., model 3) in a group of its own and groups the other models in pairs, the result is: test group 1: model 3; test group 2: model 1, model 2; test group 3: model 4, model 5.

The graphics processor is provided with at least one working thread, and each working thread calls its own corresponding stream processors or computing units, so the graphics processor resources that each working thread can call are different. Assume that the working threads in the graphics processor are thread 1 and thread 2. Then static policy 1 may assign test group 1 and test group 2 to thread 1 and test group 3 to thread 2; static policy 2 may assign test group 1 to thread 1 and test group 2 and test group 3 to thread 2, and so on.

Thus, based on the above example, the static sample corresponding to static policy 1 is: model 1, model 2, model 3, and model 4 are assigned thread 1, and model 5 is assigned thread 2. The static sample corresponding to static policy 2 is: model 3 is assigned thread 1, and models 1, 2, 4, and 5 are assigned thread 2. Further static samples are obtained in the same way and are not described in detail herein.
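The construction of the two static samples in the example above can be sketched as follows. This is only an illustrative sketch: the function names `pair_up`, `isolate_then_pair`, and `build_static_sample` are hypothetical and do not appear in the application, and the thread assignment is passed in by hand rather than derived from a resource allocation rule.

```python
from typing import Dict, List


def pair_up(models: List[str]) -> List[List[str]]:
    """Static policy 1: group models two by two; a leftover model forms its own group."""
    return [models[i:i + 2] for i in range(0, len(models), 2)]


def isolate_then_pair(models: List[str], isolated: str) -> List[List[str]]:
    """Static policy 2: one specified model forms its own group; the rest are paired."""
    rest = [m for m in models if m != isolated]
    return [[isolated]] + pair_up(rest)


def build_static_sample(groups: List[List[str]], threads: List[str]) -> Dict[str, str]:
    """Map each model to the working thread assigned to its test group."""
    if len(groups) != len(threads):
        raise ValueError("one thread name is required per test group")
    return {model: thread for group, thread in zip(groups, threads) for model in group}


MODELS = ["model 1", "model 2", "model 3", "model 4", "model 5"]

# Static policy 1: test groups 1 and 2 use thread 1, test group 3 uses thread 2.
sample_1 = build_static_sample(pair_up(MODELS), ["thread 1", "thread 1", "thread 2"])

# Static policy 2: test group 1 uses thread 1, test groups 2 and 3 use thread 2.
sample_2 = build_static_sample(
    isolate_then_pair(MODELS, "model 3"), ["thread 1", "thread 2", "thread 2"]
)
```

Each resulting dictionary corresponds to one static sample: a complete model-to-thread assignment that can then be run against the test cases.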

Further, grouping each artificial intelligent model in the intelligent automobile operating system according to the partitioning rule in the static strategy to obtain at least one test group, including:

if it is determined that other artificial intelligence models which do not belong to the directed acyclic graph exist in the intelligent automobile operating system, grouping the other artificial intelligence models according to model attribute data of each other artificial intelligence model to obtain at least one test group, wherein the model attribute data describe the computational power consumed by the artificial intelligence models for completing the specified task;

And if the intelligent automobile operating system is determined to not have the directed acyclic graph, grouping the artificial intelligent models in the intelligent automobile operating system according to the model attribute data of each artificial intelligent model to obtain at least one test group.

Illustratively, if it is determined that the directed acyclic graphs in the intelligent vehicle operating system include a first graph and a second graph, where the first graph represents model 1 pointing to model 2 and the second graph represents model 3 pointing to model 4, then model 1 and model 2 are divided into one test group, and model 3 and model 4 are divided into another test group.

If it is determined that there are other artificial intelligence models in the intelligent vehicle operating system that do not belong to a directed acyclic graph, such as model 5 and model 6, which belong to neither the first graph nor the second graph, then model 5 and model 6 are grouped according to their model attribute data.

Model attribute data describes the computing power consumed by an artificial intelligence model to complete its specified task. For example, the specified task of model 5 is radar perception, which consumes computing power M1; the specified task of model 6 is visual perception, which consumes computing power M2.

It should be noted that a directed acyclic graph is composed of a finite number of vertices and directed edges, such that starting from any vertex and following the directed edges, it is impossible to return to that vertex. In this embodiment, each artificial intelligence model is a vertex in the directed acyclic graph, and each directed edge describes a relationship between two artificial intelligence models, for example a dependency, association, aggregation, or composition relationship. The artificial intelligence models belonging to the same directed acyclic graph depend on one another: the result of the specified task performed by an upstream artificial intelligence model serves as the input of the downstream artificial intelligence model, so the models are executed in sequence. The combined task is the overall task completed on the basis of the specified tasks performed by the artificial intelligence models in the directed acyclic graph.
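The grouping by directed acyclic graph can be sketched as follows, assuming that every set of models connected by directed edges forms one graph and therefore one test group, as in the example with the first graph and the second graph. The function name `group_by_dag` and the edge-list representation are hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def group_by_dag(
    models: List[str], edges: List[Tuple[str, str]]
) -> Tuple[List[List[str]], List[str]]:
    """Return (test groups, one per connected graph; models outside every graph)."""
    # Treat edges as undirected only to find which models share a graph.
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for upstream, downstream in edges:
        neighbours[upstream].add(downstream)
        neighbours[downstream].add(upstream)

    groups: List[List[str]] = []
    seen: Set[str] = set()
    for model in models:
        if model in seen or model not in neighbours:
            continue
        # Depth-first walk collecting every model reachable from this one.
        stack, component = [model], []
        seen.add(model)
        while stack:
            current = stack.pop()
            component.append(current)
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(component))

    others = [m for m in models if m not in seen]
    return groups, others


ALL_MODELS = ["model 1", "model 2", "model 3", "model 4", "model 5", "model 6"]
# First graph: model 1 -> model 2; second graph: model 3 -> model 4.
dag_groups, other_models = group_by_dag(
    ALL_MODELS, [("model 1", "model 2"), ("model 3", "model 4")]
)
```

The models returned in `other_models` (model 5 and model 6 here) are the ones that would then be grouped by their model attribute data.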

Optionally, grouping the artificial intelligence models according to the model attribute data includes:

if it is determined that the sum of the model attribute data of two or more artificial intelligent models does not exceed a preset computing-power threshold, dividing the two or more artificial intelligent models into one test group;

and if it is determined that the sum of the model attribute data of a first artificial intelligent model and the model attribute data of other artificial intelligent models in the intelligent automobile operating system exceeds the preset computing-power threshold, dividing the first artificial intelligent model into a test group of its own, wherein the first artificial intelligent model is one artificial intelligent model in the intelligent automobile operating system.

Illustratively, if the sum of the computing power consumed by model 5 and model 6 exceeds the preset computing-power threshold M3, model 5 is set as one test group and model 6 is set as another test group;

if the sum of the computing power consumed by model 5 and model 6 does not exceed the preset computing-power threshold M3, model 5 and model 6 are set together as one test group.

If it is determined that the intelligent automobile operating system does not have a directed acyclic graph, model 1, model 2, model 3, model 4, model 5, and model 6 are all grouped according to their model attribute data in the same way to obtain at least one test group.
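One possible way to apply the computing-power threshold is sketched below. The text only states the two conditions (combined cost within the threshold: same test group; combined cost above the threshold: separate test group), so the first-fit order used here is an assumption of this sketch, as are the function name `group_by_compute` and the numeric values standing in for M1, M2, and M3.

```python
from typing import Dict, List


def group_by_compute(costs: Dict[str, int], threshold: int) -> List[List[str]]:
    """Place each model in the first test group whose total stays within the threshold."""
    groups: List[List[str]] = []
    totals: List[int] = []
    for model, cost in costs.items():
        for i, total in enumerate(totals):
            if total + cost <= threshold:
                groups[i].append(model)
                totals[i] += cost
                break
        else:
            # No existing group can absorb this model, so it starts its own group.
            groups.append([model])
            totals.append(cost)
    return groups


# Hypothetical values: M1 = 6 for model 5 (radar perception), M2 = 5 for model 6 (visual perception).
COSTS = {"model 5": 6, "model 6": 5}

separate = group_by_compute(COSTS, threshold=10)  # M1 + M2 exceeds M3 = 10
together = group_by_compute(COSTS, threshold=12)  # M1 + M2 does not exceed M3 = 12
```

A different packing order could produce different groups for larger model sets; the sketch is only meant to make the threshold comparison concrete.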

Inputting a preset test case into the artificial intelligent models in each static sample, and running the test case in each artificial intelligent model through the working thread allocated to that model in the static sample, so as to perform a performance test on each static sample and obtain at least one static performance index corresponding to the at least one static strategy, wherein the test case is a test case for performing a performance test on the artificial intelligent model.

In a preferred embodiment, identifying a priority static index in the at least one static performance index comprises:

extracting a first index element in each static performance index, and sequencing the first index elements to obtain a target sequence, wherein the static performance index is provided with at least one static index element, the static index element reflects the performance of the artificial intelligent model in one performance dimension in the performance test, and the first index element is one static index element in the static performance index.

Illustratively, the index elements in the static performance index include: run time, memory footprint, SM utilization, power consumption.

Assume that three static performance indexes are generated based on three static policies (static performance index 1, static performance index 2, and static performance index 3) and that the first index element is run time. The following target sequence is then obtained:

Run-time target sequence	Run time
Static performance index 1	0.1s
Static performance index 2	0.2s
Static performance index 3	0.3s

And determining a performance value of each first index element according to the rank of each first index element in the target sequence, wherein the performance value reflects the performance quality degree of the first index element.

In this embodiment, a higher rank indicates better performance of the first index element, and a lower rank indicates worse performance. Accordingly, the target sequence for run time is in ascending order, since a shorter run time indicates better performance; the target sequence for memory footprint is in ascending order, since a smaller memory footprint indicates better performance; the target sequence for SM utilization is in descending order, since a higher SM utilization indicates better performance; and the target sequence for power consumption is in ascending order, since lower power consumption indicates better performance.

And obtaining the comprehensive performance value of each static performance index according to the performance values of the static index elements in each static performance index.

Illustratively, based on the above example, assume that the performance value of the first rank is 3, that of the second rank is 2, and that of the third rank is 1. The performance values of the static index elements in each static performance index are substituted into a preset weighting function, and the weighting function is evaluated to obtain the comprehensive performance value.

The weighting function is: $S = a \cdot x + b \cdot y + c \cdot z + d \cdot m$

Where x is the performance value of run time and a is the weight of run time; y is the performance value of memory footprint and b is the weight of memory footprint; z is the performance value of SM utilization and c is the weight of SM utilization; m is the performance value of power consumption and d is the weight of power consumption; S is the comprehensive performance value.
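The ranking and weighting steps can be sketched as follows. Only the run times (0.1 s, 0.2 s, 0.3 s) and the rank-to-value mapping (3, 2, 1) come from the example; the memory, SM utilization, and power figures, the unit weights, and the names `performance_values` and `composite_scores` are assumptions made for illustration. Ties are not treated specially in this sketch.

```python
from typing import Dict, List

# True means a smaller measurement is better for that index element.
LOWER_IS_BETTER = {"run_time": True, "memory": True, "sm_utilization": False, "power": True}


def performance_values(values: List[float], lower_is_better: bool) -> List[int]:
    """Rank candidates on one index element: the best gets n, the worst gets 1."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=not lower_is_better)
    scores = [0] * n
    for rank, idx in enumerate(order):
        scores[idx] = n - rank
    return scores


def composite_scores(indexes: List[Dict[str, float]], weights: Dict[str, float]) -> List[float]:
    """Evaluate S = a*x + b*y + c*z + d*m for every static performance index."""
    totals = [0.0] * len(indexes)
    for element, weight in weights.items():
        column = [index[element] for index in indexes]
        for i, value in enumerate(performance_values(column, LOWER_IS_BETTER[element])):
            totals[i] += weight * value
    return totals


# Run times follow the example; the other measurements are hypothetical.
STATIC_INDEXES = [
    {"run_time": 0.1, "memory": 300, "sm_utilization": 50, "power": 20},
    {"run_time": 0.2, "memory": 200, "sm_utilization": 80, "power": 25},
    {"run_time": 0.3, "memory": 250, "sm_utilization": 60, "power": 15},
]
WEIGHTS = {"run_time": 1.0, "memory": 1.0, "sm_utilization": 1.0, "power": 1.0}

scores = composite_scores(STATIC_INDEXES, WEIGHTS)
best = scores.index(max(scores))  # position of the highest comprehensive performance value
```

With these hypothetical inputs the second static performance index has the highest comprehensive performance value, even though the first one has the shortest run time.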

S202: allocating graphics processor resources to each artificial intelligent model in the intelligent automobile operating system according to the static allocation strategy.

In this step, graphics processor resources are allocated to each artificial intelligent model through the static allocation strategy corresponding to the priority static index, so that each artificial intelligent model is given the best-performing configuration of graphics processor resources before the intelligent automobile operating system runs.

In a preferred embodiment, allocating graphics processor resources for each artificial intelligence model in an intelligent vehicle operating system according to a static allocation policy comprises:

grouping artificial intelligent models in an intelligent automobile operating system according to a division rule in a static allocation strategy to obtain at least one running group;

acquiring at least one working thread in the graphics processor, and allocating one working thread to each running group according to the resource allocation rule in the static allocation strategy, so that graphics processor resources are allocated to each artificial intelligent model in each running group, wherein a working thread is a sequence for calling graphics processor resources in the graphics processor to run the artificial intelligent model.

S203: running at least one artificial intelligent model in the intelligent automobile operating system, and monitoring the dynamic performance index of the running artificial intelligent model, wherein the dynamic performance index reflects the performance of the artificial intelligent model when it runs by calling the allocated graphics processor resources.

In this step, a performance acquisition module monitors the dynamic performance index of the running artificial intelligent model. The index elements in the dynamic performance index include: CPU usage, memory occupancy, disk IO, system average load, latency, and frame rate.

The performance acquisition module may be the KylinTOP test and monitoring platform, LoadRunner, KylinPET, Apache JMeter, NeoLoad, WebLOAD, Loadster, Loadstorm, Load impact, OpenSTA, or Telegraf.

S204: generating a dynamic allocation strategy according to the dynamic performance index, and adjusting the graphics processor resources allocated to each artificial intelligent model according to the dynamic allocation strategy, wherein the dynamic allocation strategy is used for allocating the graphics processor resources to the running artificial intelligent model according to the performance of the artificial intelligent model during running.

In this step, a dynamic allocation strategy is generated according to the dynamic performance index, so that graphics processor resources are allocated to each artificial intelligence model based on its performance during the current run. This ensures that every running artificial intelligence model has enough graphics processor resources to call, resolves conflicts among the inference requests generated by the artificial intelligence models when they call the working threads of the graphics processor, avoids multiple requests contending for a working thread in the graphics processor at the same time, and safeguards the performance of multiple artificial intelligence models in an automatic driving scene.

In a preferred embodiment, generating the dynamic allocation policy based on the dynamic performance metrics comprises:

extracting a second index element from the dynamic performance index, and acquiring the index rule corresponding to the second index element, wherein the dynamic performance index has at least one index element, the second index element is one index element in the dynamic performance index, and the index rule is a computer rule that defines when an index element is normal and when it is abnormal.

And if the second index element is determined to accord with the index rule, setting the second index element as a normal index element.

And if the second index element is determined not to accord with the index rule, setting the second index element as an abnormal index element.

If it is determined that the number of normal index elements in the dynamic performance index does not reach a preset normal threshold, or that the number of abnormal index elements reaches a preset abnormal threshold, the dynamic performance index is determined to be an abnormal performance index.

If it is determined that the number of normal index elements in the dynamic performance index reaches the preset normal threshold, or that the number of abnormal index elements does not reach the preset abnormal threshold, the dynamic performance index is determined to be a normal performance index.

Illustratively, the obtained dynamic performance index includes the following index elements: CPU usage, memory occupancy, disk IO, system average load, latency, and frame rate. Assume that the second index element is the system average load, which here describes the utilization of the graphics processor: a system average load equal to 1.0 indicates that the graphics processor is fully utilized; a system average load less than 1.0 indicates that the graphics processor still has idle capacity; and a system average load greater than 1.0 indicates that the graphics processor is overloaded.

Assume that the index rule for the system average load is: if the system average load belongs to [0, 1], it is judged to be a normal index element; if the system average load does not belong to [0, 1], it is judged to be an abnormal index element.

Assume that the dynamic performance index has 6 index elements, the normal threshold is 4, and the abnormal threshold is 3.

If the dynamic performance index has 5 normal index elements and 1 abnormal index element, the dynamic performance index is judged to be the normal performance index.

If the dynamic performance index has 2 normal index elements and 4 abnormal index elements, the dynamic performance index is judged to be the abnormal performance index.
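The classification above can be sketched as follows. Only the [0, 1] rule for the system average load and the thresholds 4 and 3 come from the example; every other range in `RULES`, the sample measurements, and the names `is_normal` and `classify_index` are hypothetical. The sketch checks the abnormal condition first, which is an assumption about how the two conditions are combined.

```python
from typing import Dict, Tuple

# Inclusive (low, high) ranges in which an index element counts as normal.
RULES: Dict[str, Tuple[float, float]] = {
    "cpu_usage": (0.0, 0.9),          # hypothetical range
    "memory_occupancy": (0.0, 0.8),   # hypothetical range
    "disk_io": (0.0, 200.0),          # hypothetical range
    "system_load": (0.0, 1.0),        # rule taken from the example
    "latency_ms": (0.0, 100.0),       # hypothetical range
    "frame_rate": (24.0, 120.0),      # hypothetical range
}


def is_normal(name: str, value: float) -> bool:
    """Apply the index rule of one index element."""
    low, high = RULES[name]
    return low <= value <= high


def classify_index(
    elements: Dict[str, float], normal_threshold: int = 4, abnormal_threshold: int = 3
) -> str:
    """Label a dynamic performance index as 'normal' or 'abnormal'."""
    normal = sum(is_normal(name, value) for name, value in elements.items())
    abnormal = len(elements) - normal
    if normal < normal_threshold or abnormal >= abnormal_threshold:
        return "abnormal"
    return "normal"


# 5 normal elements and 1 abnormal element (system_load is above 1.0).
healthy = {"cpu_usage": 0.5, "memory_occupancy": 0.4, "disk_io": 50.0,
           "system_load": 1.4, "latency_ms": 20.0, "frame_rate": 30.0}
# 2 normal elements and 4 abnormal elements.
degraded = {"cpu_usage": 0.95, "memory_occupancy": 0.9, "disk_io": 50.0,
            "system_load": 1.4, "latency_ms": 250.0, "frame_rate": 30.0}

healthy_label = classify_index(healthy)
degraded_label = classify_index(degraded)
```

The two sample measurements reproduce the two cases in the example: five normal elements yield a normal performance index, and four abnormal elements yield an abnormal one.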

Specifically, generating a dynamic allocation policy according to the normal performance index and the abnormal performance index includes:

setting an artificial intelligent model corresponding to a normal performance index as a normal model, setting an artificial intelligent model corresponding to an abnormal performance index as an abnormal model, setting an operation group in which the normal model is located and an operation group in which an artificial intelligent model which does not operate in an intelligent automobile operating system is located as a normal group, and setting the operation group in which the abnormal model is located as an abnormal group;

if it is determined that the abnormal model has a logical relationship with other artificial intelligent models in the abnormal group, adjusting the partitioning rule in the static allocation strategy or the dynamic allocation strategy, so that the adjusted partitioning rule is used for moving the independent model in the abnormal group to a normal group; and/or adjusting the resource allocation rule in the static allocation strategy or the dynamic allocation strategy, so that the adjusted resource allocation rule is used for changing the working thread corresponding to the abnormal group to a working thread corresponding to a normal group; wherein the independent model is an artificial intelligent model that has no logical relationship with the other artificial intelligent models in the abnormal group;

if it is determined that the abnormal model has no logical relationship with the other artificial intelligent models in the abnormal group, adjusting the partitioning rule in the static allocation strategy or the dynamic allocation strategy, so that the adjusted partitioning rule is used for moving the abnormal model to a normal group; and/or adjusting the resource allocation rule in the static allocation strategy or the dynamic allocation strategy, so that the adjusted resource allocation rule is used for changing the working thread corresponding to the abnormal group to a working thread corresponding to a normal group.

Illustratively, assume the abnormal model running in the abnormal group is model 1, and the abnormal group includes model 1, model 2, and model 3. Assume that model 1 and model 2 have a logical relationship, for example a dependency, association, aggregation, or composition relationship; model 3 is then the independent model.

Assume that the working thread corresponding to the abnormal group is thread 1; normal group 1 includes model 4 and model 5, and its corresponding working thread is thread 2; normal group 2 includes model 6, and its corresponding working thread is thread 3, wherein model 6 is an artificial intelligence model that is not running. Model 3 may then be moved to normal group 1 or normal group 2, or thread 2 or thread 3 may be assigned to the abnormal group.

Alternatively, assume the abnormal group includes model 1, model 2, and model 3, which have no logical relationship with one another, and the abnormal model is model 1. Based on the above example, model 2 and/or model 3 may be moved to normal group 1 or normal group 2, or thread 2 or thread 3 may be assigned to the abnormal group.
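The adjustment of the partitioning rule can be sketched as follows. The sketch follows the rule text: when the abnormal model has a logical relationship inside its group, the independent models are moved out; otherwise the abnormal model itself is moved. Choosing the normal group with the fewest models as the destination, treating every other group as normal, and the names `models_to_move` and `rebalance` are all assumptions of this sketch; the text leaves the choice of destination open.

```python
from typing import Dict, List, Set, Tuple

Relation = Tuple[str, str]


def related(a: str, b: str, relations: Set[Relation]) -> bool:
    """Two models are related if a logical relationship is recorded in either direction."""
    return (a, b) in relations or (b, a) in relations


def models_to_move(abnormal_model: str, group: List[str], relations: Set[Relation]) -> List[str]:
    """Pick the models that should leave the abnormal group."""
    others = [m for m in group if m != abnormal_model]
    if any(related(abnormal_model, m, relations) for m in others):
        # Move the independent models: those with no relationship inside the group.
        return [m for m in group
                if not any(related(m, o, relations) for o in group if o != m)]
    return [abnormal_model]


def rebalance(abnormal_thread: str, abnormal_model: str,
              groups: Dict[str, List[str]], relations: Set[Relation]) -> Dict[str, List[str]]:
    """Return a new thread-to-models mapping after moving the selected models."""
    movers = models_to_move(abnormal_model, groups[abnormal_thread], relations)
    # Assumption: every group other than the abnormal one is a normal group.
    normal_threads = [t for t in groups if t != abnormal_thread]
    target = min(normal_threads, key=lambda t: len(groups[t]))
    updated = {t: list(ms) for t, ms in groups.items()}
    updated[abnormal_thread] = [m for m in updated[abnormal_thread] if m not in movers]
    updated[target].extend(movers)
    return updated


GROUPS = {
    "thread 1": ["model 1", "model 2", "model 3"],  # abnormal group
    "thread 2": ["model 4", "model 5"],             # normal group 1
    "thread 3": ["model 6"],                        # normal group 2
}

# Case 1: model 1 and model 2 are related, so model 3 is the independent model.
with_relation = rebalance("thread 1", "model 1", GROUPS, {("model 1", "model 2")})
# Case 2: no relationships, so the abnormal model itself is moved.
without_relation = rebalance("thread 1", "model 1", GROUPS, set())
```

The alternative adjustment described in the text, reassigning the abnormal group to the working thread of a normal group, would change the keys of the mapping rather than its members and is not shown here.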

Example 2:

referring to fig. 3, the present application provides a resource scheduling device 3 of a graphics processor of an operating system of a smart car, including:

the static test module 31 is configured to perform at least one performance test on the artificial intelligent model in the intelligent automobile operating system to obtain at least one static performance index, identify a priority static index in the at least one static performance index, and set the static strategy corresponding to the priority static index as a static allocation strategy; wherein the intelligent automobile operating system is provided with at least one artificial intelligent model, the artificial intelligent model is used for realizing a specified task, the specified task is used for realizing automatic driving of an automobile, the static performance index reflects the performance of the artificial intelligent model in the performance test, the priority static index is a static performance index meeting a preset priority rule, the static strategy is a computer strategy used for grouping the artificial intelligent models and allocating working threads to the grouped artificial intelligent models, and a working thread is a sequence used for scheduling a stream processor and/or a computing unit in the graphics processor;

The static allocation module 32 is configured to allocate graphics processor resources for each artificial intelligent model in the intelligent automobile operating system according to a static allocation policy;

a dynamic monitoring module 33, configured to run at least one artificial intelligence model in an operating system of the intelligent automobile, and monitor a dynamic performance index of the running artificial intelligence model, where the dynamic performance index reflects a performance of the artificial intelligence model when the distributed graphics processor resource is invoked for operation;

the dynamic allocation module 34 is configured to generate a dynamic allocation policy according to the dynamic performance index, and adjust graphics processor resources allocated to each artificial intelligence model according to the dynamic allocation policy, where the dynamic allocation policy is configured to allocate graphics processor resources to the running artificial intelligence model according to performance of the artificial intelligence model during running.

Example 3:

to achieve the above object, the present application further provides a computer device 4, including: a processor 42 and a memory 41 communicatively connected to the processor 42; the memory stores computer-executable instructions;

the processor executes the computer-executable instructions stored in the memory 41 to implement the above-mentioned resource scheduling method of the graphics processor. The components of the resource scheduling apparatus of the graphics processor may be distributed in different computer devices, and the computer device 4 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including a stand-alone server, or a server cluster formed by a plurality of application servers) that executes a program, or the like. The computer device of the present embodiment includes at least, but is not limited to: a memory 41 and a processor 42, which may be communicatively coupled to each other via a system bus, as shown in fig. 4. It should be noted that fig. 4 only shows a computer device with certain components, but it should be understood that not all of the illustrated components are required to be implemented and that more or fewer components may be implemented instead. In the present embodiment, the memory 41 (i.e., a readable storage medium) includes a flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 41 may be an internal storage unit of a computer device, such as a hard disk or a memory of the computer device. In other embodiments, the memory 41 may also be an external storage device of a computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like. Of course, the memory 41 may also include both internal storage units of the computer device and external storage devices.
In this embodiment, the memory 41 is typically used to store an operating system installed in a computer device and various types of application software, such as program codes of a resource scheduling apparatus of a graphics processor of the third embodiment. In addition, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output. Processor 42 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device. In this embodiment, the processor 42 is configured to execute the program code stored in the memory 41 or process data, for example, execute the resource scheduling device of the graphics processor, so as to implement the resource scheduling method of the graphics processor of the above embodiment.

The integrated modules, which are implemented in the form of software functional modules, may be stored in a computer readable storage medium. The software functional modules described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or processor to perform some steps of the methods of the various embodiments of the present application. It should be appreciated that the processor may be a central processing unit (Central Processing Unit, CPU for short), other general purpose processors, digital signal processor (Digital Signal Processor, DSP for short), application specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution. The memory may comprise a high-speed RAM memory, and may further comprise a non-volatile memory NVM, such as at least one magnetic disk memory, and may also be a U-disk, a removable hard disk, a read-only memory, a magnetic disk or optical disk, etc.

To achieve the above object, the present application further provides a computer readable storage medium such as a flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, etc., on which computer-executable instructions are stored, which when executed by the processor 42, perform the corresponding functions. The computer-readable storage medium of the present embodiment is used to store computer-executable instructions that implement the resource scheduling method of the graphics processor, and when executed by the processor 42, implement the resource scheduling method of the graphics processor of the above-described embodiment.

The storage medium may be implemented by any type of volatile or nonvolatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer.

An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application specific integrated circuit (Application Specific Integrated Circuits, ASIC for short). It is also possible that the processor and the storage medium reside as discrete components in an electronic device or a master device.

The application provides a computer program product, comprising a computer program, wherein the computer program realizes the resource scheduling method of the graphics processor when being executed by the processor.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.

It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims

1. A method for scheduling resources of a graphics processor of an intelligent automobile operating system, comprising:

performing at least one performance test on an artificial intelligent model in an intelligent automobile operating system to obtain at least one static performance index; the method comprises the steps of identifying a priority static index in at least one static performance index, setting a static strategy corresponding to the priority static index as a static allocation strategy, wherein an intelligent automobile operating system is provided with at least one artificial intelligent model, the artificial intelligent model is used for realizing a specified task, the specified task is used for realizing automatic driving of an automobile, the static performance index reflects performance of the artificial intelligent model in a performance test, the priority static index is a static performance index meeting a preset priority rule, the static strategy is a computer strategy used for grouping the artificial intelligent model, and allocating working threads to the grouped artificial intelligent model, and the working threads are sequences used for scheduling a flow processor and/or a computing unit in a graphic processor;

2. The resource scheduling method for a graphics processor according to claim 1, wherein performing at least one performance test on an artificial intelligence model in the intelligent automobile operating system to obtain at least one static performance index comprises:

acquiring at least one static strategy, grouping the artificial intelligence models in the intelligent automobile operating system according to each static strategy respectively, and allocating working threads to each group of artificial intelligence models, to obtain at least one static sample;

3. The resource scheduling method for a graphics processor according to claim 2, wherein grouping the artificial intelligence models in the intelligent automobile operating system according to each static strategy respectively, and allocating working threads to each group of artificial intelligence models, to obtain at least one static sample, comprises:

4. The resource scheduling method for a graphics processor according to claim 3, wherein grouping the artificial intelligence models in the intelligent automobile operating system according to a partitioning rule in the static strategy, to obtain at least one test group, comprises:

5. The resource scheduling method for a graphics processor according to claim 1, wherein identifying a priority static index among the at least one static performance index comprises:

6. The resource scheduling method for a graphics processor according to claim 1, wherein allocating graphics processor resources to each artificial intelligence model in the intelligent automobile operating system according to the static allocation strategy comprises:

7. The resource scheduling method for a graphics processor according to claim 6, wherein generating a dynamic allocation strategy according to the dynamic performance index comprises:

8. The resource scheduling method for a graphics processor according to claim 7, wherein generating a dynamic allocation strategy according to the normal performance index and the abnormal performance index comprises:

9. A resource scheduling device for a graphics processor of an intelligent automobile operating system, comprising:

10. A computer device, comprising: a processor and a memory communicatively coupled to the processor;

the memory stores computer-executable instructions;

the processor executes the computer-executable instructions stored in the memory to implement the resource scheduling method for a graphics processor according to any one of claims 1 to 8.