CN116560843A - Intelligent automobile GPU resource optimization method and device based on environment awareness

Info

Publication number
CN116560843A
Authority
CN
China
Prior art keywords
model
target
frame rate
inference
gpu
Prior art date
Legal status
Pending
Application number
CN202310540229.3A
Other languages
Chinese (zh)
Inventor
许智
刘洪振
朱林法
黄啸
Current Assignee
Zebred Network Technology Co Ltd
Original Assignee
Zebred Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zebred Network Technology Co Ltd
Priority to CN202310540229.3A
Publication of CN116560843A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011: Allocation of resources, e.g. of the central processing unit [CPU] to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00: Computing arrangements using knowledge-based models
    • G06N 5/04: Inference or reasoning models
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application provides an intelligent automobile GPU resource optimization method and device based on environment awareness. The method includes: determining that the driving scene of the intelligent automobile has changed, and determining, according to the perception information of the changed driving scene, a first target AI model in a first AI model set corresponding to the changed driving scene and a target inference frame rate of the first target AI model; running the first target AI model using GPU resources, acquiring the running inference frame rate of the first target AI model, and adjusting the GPU resources occupied by the first target AI model according to the running inference frame rate and the target inference frame rate of the first target AI model. The method can dynamically adjust the GPU resource occupancy of the AI models of the intelligent automobile operating system, improves the GPU resource utilization and system stability of the intelligent automobile operating system, and provides more efficient and energy-saving intelligent driving services for the intelligent automobile.

Description

Intelligent automobile GPU resource optimization method and device based on environment awareness
Technical Field
The application relates to the technical field of intelligent automobile operating systems, in particular to an intelligent automobile GPU resource optimization method and device based on environment awareness.
Background
With the development of intelligent automobiles, more and more functions can be realized through intelligent driving. Intelligent driving means equipping an automobile with advanced sensors and other devices and applying new technologies such as artificial intelligence, so that the automobile has intelligent driving capability and assists the driver in completing driving tasks safely and conveniently. In an intelligent driving operating system, artificial intelligence (Artificial Intelligence, AI) models play a very important role: they analyze and process the environmental information around the vehicle (such as road conditions and traffic flow), make corresponding decisions, and control the running of the vehicle.
Currently, AI models in intelligent driving operating systems are mainly computed on the graphics processing unit (Graphics Processing Unit, GPU) resources carried by the intelligent automobile, so the inference capability of an AI model mainly depends on the performance of the GPU and the GPU resources available to the model. However, current methods for allocating GPU resources to AI models in intelligent driving operating systems suffer from resource waste.
Therefore, how to improve the utilization rate of the GPU resources allocated to the AI models of a vehicle's intelligent driving operating system is a problem to be solved.
Disclosure of Invention
The application provides an intelligent automobile GPU resource optimization method and device based on environment awareness, which are used to solve the problem of resource waste in existing methods for allocating GPU resources to AI models in intelligent driving operating systems.
In a first aspect, the present application provides an intelligent automobile GPU resource optimization method based on environmental awareness, including:
determining that the driving scene of the intelligent automobile changes;
determining a first target AI model in a first AI model set corresponding to the changed driving scene according to the perception information of the changed driving scene, and a target inference frame rate of the first target AI model; the AI models in the first AI model set are used for realizing intelligent driving functions;
operating the first target AI model by using GPU resources, and acquiring an operation reasoning frame rate of the first target AI model;
and adjusting GPU resources occupied by the first target AI model according to the running reasoning frame rate of the first target AI model and the target reasoning frame rate of the first target AI model.
Optionally, the determining, according to the perceived information of the driving scene after the change, the first target AI model in the first AI model set corresponding to the driving scene after the change includes:
Determining the first AI model set corresponding to the changed driving scene and a first reasoning frame rate set of the first AI model set according to the perception information of the changed driving scene, wherein the first reasoning frame rate set comprises a target reasoning frame rate of each AI model in the first AI model set;
acquiring a second AI model set corresponding to the driving scene before the change, and a second reasoning frame rate set of the second AI model set, wherein the second reasoning frame rate set comprises a target reasoning frame rate of each AI model in the second AI model set;
and determining a first target AI model in the first AI model set according to the first AI model set, the first reasoning frame rate set, the second AI model set and the second reasoning frame rate set, wherein the first target AI model is an AI model which is different from any AI model in the second AI model set or is an AI model with a target reasoning frame rate changed in the second AI model set.
Optionally, the acquiring the operation inference frame rate of the first target AI model includes:
when the first target AI model is an AI model not belonging to the second AI model set, determining an initial GPU resource for running the first target AI model;
And operating the first target AI model by using the initial GPU resource to obtain the operation reasoning frame rate of the first target AI model.
Optionally, obtaining the operation inference frame rate of the first target AI model includes:
when the first target AI model is the AI model with the target inference frame rate changed in the second AI model set, acquiring the identification of the first target AI model;
acquiring a target inference frame rate of an AI model corresponding to the identifier in the second AI model set according to the identifier of the first target AI model;
and taking the target inference frame rate as the operation inference frame rate of the first target AI model.
Optionally, the adjusting the GPU resources occupied by the first target AI model according to the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model includes:
if the operation inference frame rate of the first target AI model is greater than the target inference frame rate of the first target AI model, and the difference between the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model is greater than or equal to a preset frame rate threshold, reducing the GPU resource occupancy rate of the first target AI model according to a first preset adjustment step length until the operation inference frame rate of the first target AI model is within the deviation range of the target inference frame rate of the first target AI model;
If the operation inference frame rate of the first target AI model is smaller than the target inference frame rate of the first target AI model, and the difference between the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model is greater than or equal to the preset frame rate threshold, the GPU resource occupancy rate of the first target AI model is increased according to a second preset adjustment step length until the operation inference frame rate of the first target AI model is within the deviation range of the target inference frame rate of the first target AI model.
Optionally, the method further comprises:
determining a second target AI model with the maximum target inference frame rate in the first AI model set;
and adjusting the frequency of the GPU running the AI model in the first AI model set according to the running inference frame rate of the second target AI model and the target inference frame rate of the second target AI model.
Optionally, the adjusting the frequency of the GPU to operate the AI model in the first AI model set according to the operation inference frame rate of the second target AI model, and the target inference frame rate of the second target AI model includes:
if the operation inference frame rate of the second target AI model is greater than the target inference frame rate of the second target AI model, and the difference between the operation inference frame rate of the second target AI model and the target inference frame rate of the second target AI model is greater than or equal to a preset frame rate threshold, reducing the frequency of the GPU according to a third preset adjustment step length until the operation inference frame rate of the second target AI model is within the deviation range of the target inference frame rate of the second target AI model;
If the operation inference frame rate of the second target AI model is smaller than the target inference frame rate of the second target AI model, and the difference between the operation inference frame rate of the second target AI model and the target inference frame rate of the second target AI model is greater than or equal to the preset frame rate threshold, the frequency of the GPU is increased according to a fourth preset adjustment step length until the operation inference frame rate of the second target AI model is within the deviation range of the target inference frame rate of the second target AI model.
Optionally, the determining that the driving scene of the intelligent automobile changes includes:
receiving a driving mode switching instruction of the intelligent automobile, and determining that the driving scene of the intelligent automobile changes according to the driving mode switching instruction;
or,
and acquiring surrounding environment perception information of the intelligent automobile, and determining that the driving scene of the intelligent automobile changes according to the surrounding environment perception information.
In a second aspect, the present application provides an intelligent automobile GPU resource optimization device based on environmental awareness, including:
the determining module is used for determining that the driving scene of the intelligent automobile changes; determining a first target AI model in a first AI model set corresponding to the changed driving scene according to the perception information of the changed driving scene, and a target inference frame rate of the first target AI model; the AI models in the first AI model set are used for realizing intelligent driving functions;
The acquisition module is used for operating the first target AI model by using GPU resources and acquiring the operation reasoning frame rate of the first target AI model;
the processing module is used for adjusting GPU resources occupied by the first target AI model according to the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model; and operating the first target AI model by using the adjusted GPU resources.
In a third aspect, the present application provides an electronic device, comprising: a processor, a communication interface, and a memory; the processor is respectively in communication connection with the communication interface and the memory;
the memory stores computer-executable instructions;
the communication interface performs communication interaction with external equipment;
the processor executes computer-executable instructions stored by the memory to implement the method of any one of the first aspects.
In a fourth aspect, the present application provides a computer-readable storage medium having stored therein computer-executable instructions that, when executed by a processor, are configured to implement the context-aware-based intelligent automotive GPU resource optimization method of any of the first aspects.
In a fifth aspect, the present application provides a computer program product for implementing the context aware based intelligent automotive GPU resource optimization method of any of the first aspects when executed by a processor.
According to the intelligent automobile GPU resource optimization method and device based on environment awareness, the AI model required to be used by the intelligent automobile operating system under different driving scenes and the reasoning frame rate requirements of the AI model are determined by considering the corresponding relation between the driving scenes of the intelligent automobile and the AI model and the reasoning frame rate requirements of the AI model. And the GPU resource occupancy rate of the AI model is adjusted by comparing the current reasoning frame rate of the AI model operated in the driving scene with the reasoning frame rate requirement corresponding to the AI model in the driving scene, so that each AI model is ensured to meet the reasoning frame rate requirement, and meanwhile, no extra GPU resource is wasted, and the utilization rate of the GPU resource is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of the change of the inference frame rate of a target detection model under different driving scenarios;
Fig. 2 is a schematic flow chart of an intelligent automobile GPU resource optimization method based on environmental awareness according to an embodiment of the present application;
FIG. 3 is a flowchart of another method for optimizing GPU resources of an intelligent automobile based on environmental awareness according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for obtaining an operation inference frame rate of a first target AI model according to an embodiment of the present application;
FIG. 5 is a flowchart of another method for obtaining an operation inference frame rate of a first target AI model according to an embodiment of the present application;
FIG. 6 is a flowchart of another method for optimizing GPU resources of an intelligent automobile based on environmental awareness according to an embodiment of the present application;
FIG. 7 is a flowchart of another method for optimizing GPU resources of an intelligent automobile based on environmental awareness according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an intelligent automobile GPU resource optimization device based on environmental awareness according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as recited in the appended claims.
At present, there are mainly the following two methods for allocating GPU resources to AI models in the intelligent driving operating system of an intelligent automobile:
Method one: fixed GPU resources are pre-allocated to all AI models in the intelligent driving operating system, and each AI model occupies its pre-allocated fixed GPU resources when it runs.
Method two: when multiple AI models occupying the same GPU resources are invoked in the intelligent driving operating system, the AI models are ordered according to performance requirements or preset priorities; the AI models with stronger performance requirements or higher preset priorities occupy the GPU resources preferentially, and the other AI models compete for the GPU resources according to actual conditions. The priorities may be determined according to actual driving requirements.
However, the AI models that the intelligent driving operating system needs to invoke differ across driving scenes of the intelligent automobile. For example, in a highway autonomous driving scene, the AI models to be invoked include a target detection model, a road condition recognition model, a path planning model, a behavior decision and control model, and so on, whereas in an assisted driving scene the path planning model and the behavior decision and control model can be turned off. In different scenes, the inference frame rate requirements on the AI models invoked by the intelligent driving operating system may also differ; for example, when interference is large on rainy days, the inference frame rate requirement on the target detection model is higher than on sunny days. The inference frame rate is the number of detections or computations an AI model performs within a preset time, and characterizes the inference capability of the AI model. Fig. 1 is a schematic diagram of the change of the inference frame rate of a target detection model under different driving scenarios. As shown in fig. 1, the intelligent automobile requires an inference frame rate of 60 from the target detection model on a sunny day, i.e., 60 detections per second, and an inference frame rate of 100, i.e., 100 detections per second, when the driving scene switches from sunny to rainy.
However, neither of the two existing methods for allocating GPU resources to AI models in an intelligent driving operating system considers the differences in the AI models that need to be invoked in different scenes, or the influence of differing inference capability requirements on the GPU resource allocation of the AI models. As a result, method one fixedly allocates GPU resources to each AI model, so that when the inference frame rate requirements on an AI model differ, its GPU resources are either wasted or insufficient; and in method two, multiple AI models compete for GPU resources, which reduces the stability of the intelligent driving operating system.
In view of this, the application provides an intelligent automobile GPU resource optimization method based on environment awareness, which determines an AI model required to be used by an intelligent automobile operating system under different driving scenes and the reasoning frame rate requirements of the AI model by considering the corresponding relation between the driving scenes of the intelligent automobile and the AI model and the reasoning frame rate requirements of the AI model. And the GPU resource occupancy rate of the AI model is adjusted by comparing the current reasoning frame rate of the AI model operated in the driving scene with the reasoning frame rate requirement corresponding to the AI model in the driving scene, so that each AI model is ensured to meet the reasoning frame rate requirement, and meanwhile, no extra GPU resource is wasted, and the utilization rate of the GPU resource is improved.
The execution subject of the environment-awareness-based intelligent automobile GPU resource optimization method may be the intelligent driving operating system of the intelligent automobile, which may be deployed on the in-vehicle terminal device of the intelligent automobile. The execution subject may also be a processing chip of the in-vehicle terminal device; in this case, the processing chip may execute the software or program code of the environment-awareness-based intelligent automobile GPU resource optimization method, thereby realizing GPU resource allocation for the AI models. The intelligent driving operating system may be developed on an existing embedded system platform integrating a CPU and a GPU, such as an Orin platform, whose high performance, low power consumption, small size, and other characteristics make it well suited to intelligent automobiles.
Taking the intelligent driving operating system of an intelligent automobile as the execution subject as an example, the technical solution of the present application and how it solves the above technical problems are described in detail below through specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 2 is a flow chart of an intelligent automobile GPU resource optimization method based on environmental awareness according to an embodiment of the present application. As shown in fig. 2, the method may include:
s201, determining that the driving scene of the intelligent automobile changes.
The driving scene of the intelligent automobile may include the driving mode of the intelligent automobile and the surrounding environment of the intelligent automobile. The driving mode may include, for example, a manual driving mode, an assisted driving mode, an autonomous driving mode, and the like; the surrounding environment may include, for example, weather conditions, road conditions, traffic conditions, and the like. If the driving mode or any aspect of the surrounding environment changes, the driving scene of the intelligent automobile is considered to have changed.
Optionally, for the driving mode of the intelligent automobile, a driving mode switching instruction of the intelligent automobile may be received, and it is determined according to the driving mode switching instruction that the driving scene of the intelligent automobile has changed; the driving mode switching instruction may be issued by the driver of the intelligent automobile by operating the intelligent driving operating system. In addition, whether the driving scene of the intelligent automobile has changed may be determined by periodically detecting the driving mode of the intelligent automobile.
Optionally, for the surrounding environment of the intelligent automobile, surrounding-environment perception information of the intelligent automobile may be acquired, and it is determined according to the surrounding-environment perception information that the driving scene of the intelligent automobile has changed. The surrounding-environment perception information may be obtained from a perception module of the intelligent automobile, which may use various sensors, such as cameras, radar, and lidar, to perceive the environment around the vehicle in real time. Based on the information acquired by the sensors, the weather conditions, road conditions, traffic conditions, and other aspects of the environment around the intelligent automobile are recognized through technologies such as deep learning and image recognition; for example, urban roads, highways, congested road sections, sunny days, rainy days, and snowy days can be recognized.
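Purely as an illustrative sketch of how such scene-change detection could be organized (the DrivingScene and SceneMonitor names below are hypothetical and not part of the patent text), the two trigger paths, namely a driving mode switching instruction and classified perception output, could feed a single monitor:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrivingScene:
    driving_mode: str   # e.g. "assisted", "autonomous"
    weather: str        # e.g. "sunny", "rainy", "snowy"
    road_type: str      # e.g. "urban", "highway"


class SceneMonitor:
    """Tracks the current driving scene and reports when it changes."""

    def __init__(self, initial_scene: DrivingScene):
        self.current = initial_scene

    def on_mode_switch(self, new_mode: str) -> bool:
        """Handle a driving mode switching instruction issued by the driver."""
        candidate = DrivingScene(new_mode, self.current.weather, self.current.road_type)
        return self._update(candidate)

    def on_perception(self, weather: str, road_type: str) -> bool:
        """Handle classified perception output (camera / radar / lidar based)."""
        candidate = DrivingScene(self.current.driving_mode, weather, road_type)
        return self._update(candidate)

    def _update(self, candidate: DrivingScene) -> bool:
        # Any change to the driving mode or to an aspect of the surroundings
        # counts as a driving scene change.
        if candidate != self.current:
            self.current = candidate
            return True
        return False
```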
S202, determining a first target AI model in a first AI model set corresponding to the changed driving scene according to the perception information of the changed driving scene, and determining a target inference frame rate of the first target AI model.
The AI models in the first AI model set are used to realize intelligent driving functions. Taking a target detection model as an example, the model can detect objects near the intelligent automobile in the assisted driving mode or the autonomous driving mode, so that the intelligent automobile can avoid the objects or prevent collisions with them. The AI models that need to be invoked may differ across driving scenes, so the first AI model set is the set of all AI models that need to be invoked for the changed driving scene.
Even when the same AI model is invoked, different driving scenes may impose different inference frame rate requirements (target inference frame rates) on it; for example, the target inference frame rate of the target detection model differs between sunny and rainy days. Thus, for each AI model in the first AI model set, its target inference frame rate also corresponds to the driving scene.
The first target AI model is an AI model in the first AI model set whose GPU resource occupancy needs to be adjusted. It may be an AI model that differs from any AI model corresponding to the driving scene before the change, or an AI model that was running both before and after the scene change but whose target inference frame rate has changed. Both kinds of first target AI model can be determined by comparison with the AI models corresponding to the driving scene before the change. An AI model that differs from any AI model corresponding to the driving scene before the change may also be identified by the intelligent driving operating system from its action of invoking and starting the new AI model upon detecting the driving scene change.
S203, using GPU resources to operate the first target AI model, and acquiring an operation reasoning frame rate of the first target AI model.
The GPU resources here refer to the initial GPU resources allocated to the first target AI model. The initial GPU resources may be preset by a system developer of the intelligent driving operating system, or may be preset separately for different driving scenes.
The operation reasoning frame rate is the current reasoning frame rate corresponding to the first target AI model when the driving scene changes. The operation inference frame rate may or may not be capable of satisfying the target inference frame rate of the first target AI model after the driving scene change.
One possible implementation may determine the operational inference frame rate of the first target AI model by detecting a current inference frame rate of the first target AI model at the time of operation after a driving scene change.
In another possible implementation manner, if the first target AI model is also in an operating state before the driving scene change, the target inference frame rate of the first target AI model before the driving scene change is taken as the operating inference frame rate of the first target AI model.
S204, adjusting GPU resources occupied by the first target AI model according to the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model.
Within the inference frame rate limit of an AI model, the more GPU resources the AI model occupies, the higher its inference frame rate. Therefore, when the gap between the running inference frame rate of the first target AI model and its target inference frame rate is large, the gap can be reduced by adjusting the GPU resources occupied by the first target AI model. The GPU resources occupied by each AI model may be adjusted through the Multi-Process Service (MPS) of the GPU. MPS allows multiple Compute Unified Device Architecture (CUDA) applications to share a single GPU device simultaneously and provides fine-grained resource control, so that multiple AI models can share a single GPU device at the same time and the GPU resources of each AI model can be adjusted dynamically, without affecting the running effect or safety of each AI model.
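As a hedged illustration only (not the patent's mandated mechanism): on NVIDIA GPUs that support MPS, one way to bound a client process's share of compute resources is the CUDA_MPS_ACTIVE_THREAD_PERCENTAGE environment variable. The launcher below is a hypothetical sketch that assumes an MPS control daemon is already running; adjusting the share of an already running model would, under this approach, typically mean restarting that client or using the MPS control interface.

```python
import os
import subprocess


def launch_model_with_gpu_share(model_cmd: list[str], gpu_share_percent: int) -> subprocess.Popen:
    """Launch one AI model process under MPS with a suggested compute share.

    CUDA_MPS_ACTIVE_THREAD_PERCENTAGE caps the fraction of SM threads this
    client may use, which is one way to realize per-model GPU occupancy.
    """
    env = dict(os.environ)
    env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(gpu_share_percent)
    return subprocess.Popen(model_cmd, env=env)


# Hypothetical usage: give the target detection model 40% and the road
# condition recognition model 20% of the GPU's compute resources.
# detection = launch_model_with_gpu_share(["python", "run_detection.py"], 40)
# road_cond = launch_model_with_gpu_share(["python", "run_road_condition.py"], 20)
```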
One possible implementation manner adjusts the GPU resources occupied by the first target AI model multiple times through a preset step size, so that the running inference frame rate is close to the target inference frame rate.
In another possible implementation manner, the difference between the operation inference frame rate and the target inference frame rate is reduced by determining a mapping relationship between the operation inference frame rate change of the first target AI model and the GPU resource occupancy rate change, and determining and adjusting the GPU resource occupied by the first target AI model according to the mapping relationship.
And distributing the adjusted GPU resources to the first target AI model so that the first target AI model runs on the adjusted GPU resources, thereby meeting the intelligent driving requirement of the intelligent automobile in the driving environment.
According to the method provided by the embodiment of the application, the corresponding relation between the driving scene of the intelligent automobile and the AI model and the reasoning frame rate requirement of the AI model are considered, and the AI model required to be used by the intelligent automobile operating system under different driving scenes and the reasoning frame rate requirement of the AI model are determined. And the GPU resource occupancy rate of the AI model is adjusted by comparing the current reasoning frame rate of the AI model operated in the driving scene with the reasoning frame rate requirement corresponding to the AI model in the driving scene, so that each AI model is ensured to meet the reasoning frame rate requirement, and meanwhile, no extra GPU resource is wasted, and the utilization rate of the GPU resource is improved.
Next, taking the case where the first target AI model is determined by comparison with the AI models corresponding to the driving scene before the change as an example, how step S202 determines, according to the perception information of the changed driving scene, the first target AI model in the first AI model set corresponding to the changed driving scene is described in detail. Fig. 3 is a flow chart of another environment-awareness-based intelligent automobile GPU resource optimization method according to an embodiment of the present application. As shown in fig. 3, step S202 may include:
S301, determining a first AI model set corresponding to the changed driving scene and a first reasoning frame rate set of the first AI model set according to the perception information of the changed driving scene.
The perception information is used for representing specific scenes in the driving scenes, such as perceived driving mode switching instructions, and can represent changes of driving modes in the driving scenes; perceived changes in weather, road conditions, etc., may characterize changes in the ambient environment of the intelligent vehicle in the driving scenario. And determining a specific scene after the driving scene is changed according to the perception information of the driving scene after the change.
The first inference frame rate set includes the target inference frame rate of each AI model in the first AI model set. Each driving scene corresponds to an AI model set: the AI models that need to be invoked are determined from each specific scene within the driving scene, all AI models that need to be invoked for the driving scene are determined accordingly, and together they form the first AI model set. For example, taking driving scenes composed of driving modes such as the assisted driving mode and the autonomous driving mode, and of surrounding environments such as weather conditions, the AI model set corresponding to each driving scene and the target inference frame rate set of that AI model set may be as shown in Table 1 below:
TABLE 1
When the changed driving scene is the assisted driving mode with rainy weather, the first AI model set corresponding to this scene includes the target detection model and the road condition recognition model, and the first inference frame rate set of the first AI model set is: target detection model: frame rate 100; road condition recognition model: frame rate 30.
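Purely as an illustrative sketch of how such a scene-to-model mapping could be stored (the structure and all values not cited in the text above are hypothetical placeholders, not the contents of Table 1):

```python
# Maps (driving_mode, weather) -> {model_name: target inference frame rate}.
# The assisted/rainy entry and the sunny target detection value (60) reflect
# figures mentioned in the description; the remaining values are placeholders.
SCENE_MODEL_TABLE: dict[tuple[str, str], dict[str, int]] = {
    ("assisted", "sunny"): {"target_detection": 60, "road_condition": 30},
    ("assisted", "rainy"): {"target_detection": 100, "road_condition": 30},
    ("autonomous", "rainy"): {
        "target_detection": 100,
        "road_condition": 30,
        "path_planning": 20,       # placeholder value
        "behavior_decision": 20,   # placeholder value
    },
}


def model_set_for_scene(driving_mode: str, weather: str) -> dict[str, int]:
    """Return the AI model set and target inference frame rates for a scene."""
    return SCENE_MODEL_TABLE[(driving_mode, weather)]
```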
S302, a second AI model set corresponding to the driving scene before the change and a second reasoning frame rate set of the second AI model set are obtained.
Wherein the second set of inferred frame rates includes a target inferred frame rate for each AI model in the second set of AI models. The second AI model set and the method for obtaining the second inference frame rate set of the second AI model set are the same as the first AI model set and the first inference frame rate set of the first AI model set, which are not described herein.
S303, determining a first target AI model in the first AI model set according to the first AI model set, the first inference frame rate set, the second AI model set, and the second inference frame rate set.
The first target AI model is an AI model that does not belong to the second AI model set, or an AI model whose target inference frame rate in the second AI model set has changed.
When the first target AI model is an AI model that does not belong to the second AI model set, the AI models in the first AI model set are compared with the AI models in the second AI model set, and an AI model in the first AI model set that does not belong to the second AI model set is taken as the first target AI model. For example, as shown in Table 1, when the intelligent automobile switches from the assisted driving mode to the autonomous driving mode, the first target AI models are the path planning model and the behavior decision and control model.
When the first target AI model is an AI model whose target inference frame rate in the second AI model set has changed, the target inference frame rates of the same AI model in the first and second inference frame rate sets are compared, and the AI model whose target inference frame rate has changed is taken as the first target AI model. For example, as shown in Table 1, when it is perceived that the surroundings of the intelligent automobile change from a sunny day to a rainy day, the first target AI model is the target detection model.
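A minimal sketch of this comparison, assuming AI model sets are represented as mappings from model name to target inference frame rate (the function name below is hypothetical):

```python
def find_first_target_models(
    first_set: dict[str, int],   # model -> target inference frame rate (after change)
    second_set: dict[str, int],  # model -> target inference frame rate (before change)
) -> dict[str, str]:
    """Return each first-target AI model and the reason it was selected."""
    targets: dict[str, str] = {}
    for model, target_fps in first_set.items():
        if model not in second_set:
            targets[model] = "new model for the changed scene"
        elif second_set[model] != target_fps:
            targets[model] = "target inference frame rate changed"
    return targets


# Example: switching from a sunny to a rainy assisted-driving scene would
# select the target detection model because its target frame rate changed.
```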
Based on the first target AI model determined in fig. 3, the running inference frame rate of the first target AI model in step S203 is obtained separately for the two kinds of first target AI model distinguished above.
Case a: the first target AI model is an AI model that does not belong to the second AI model set. In this case, fig. 4 is a flowchart of a method for obtaining an operation inference frame rate of the first target AI model according to an embodiment of the present application. As shown in fig. 4, this step S203 may include:
S401, determining initial GPU resources for running a first target AI model.
In one possible implementation manner, the initial GPU resources may be preset by a system developer of the intelligent driving operating system, for example, a mapping relationship of the initial GPU resources corresponding to each AI model may be stored in the intelligent driving operating system, and after determining the first target AI model, the initial GPU resources of the first target AI model are determined according to the mapping relationship.
In another possible implementation manner, the initial GPU resources of the first target AI model may be preset according to different driving scenarios, where the initial GPU resources corresponding to the first target AI model are also different. And determining initial GPU resources of the first target AI model according to the current driving scene and the determined first target AI model.
S402, using the initial GPU resource to operate the first target AI model to obtain the operation reasoning frame rate of the first target AI model.
The initial GPU resource is assigned to the first target AI model and the first target AI model is caused to run on the initial GPU resource. And acquiring the operation reasoning frame rate of the first target AI model from the operation information by detecting the operation information of the first target AI model on the initial GPU resource. The running inference frame rate may be the current inference frame rate at a certain time point, or may be an average value of the inference frame rates in a certain period of time, etc.
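As one illustrative way such a running inference frame rate could be measured (the InferenceRateMeter helper below is hypothetical; it simply averages completed inferences over a sliding time window):

```python
import time
from collections import deque


class InferenceRateMeter:
    """Measures the running inference frame rate over a sliding time window."""

    def __init__(self, window_seconds: float = 1.0):
        self.window = window_seconds
        self.timestamps: deque[float] = deque()

    def record_inference(self) -> None:
        """Call once per completed model inference."""
        now = time.monotonic()
        self.timestamps.append(now)
        # Drop inferences that fall outside the averaging window.
        while self.timestamps and now - self.timestamps[0] > self.window:
            self.timestamps.popleft()

    def running_fps(self) -> float:
        """Average number of inferences per second over the window."""
        return len(self.timestamps) / self.window
```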
Case B: the first target AI model is an AI model with a change in target inference frame rate in the second AI model set. In this case, fig. 5 is a flowchart of another method for obtaining the operation inference frame rate of the first target AI model according to an embodiment of the present application. As shown in fig. 5, this step S203 may include:
s501, acquiring an identification of a first target AI model.
The identification of the first target AI model may be, for example, a model name, a model number, etc. of the first target AI model, which identification may be extracted from information of the first target AI model. The specific form of identification of the AI model, and the storage location, is not limited in this application.
S502, according to the identification of the first target AI model, acquiring the target inference frame rate of the AI model corresponding to the identification in the second AI model set.
And according to the identification of the first target AI model, matching with the identification of all AI models in the second AI model set one by one, determining the AI models in the second AI model set matched with the identification, and obtaining the target inference frame rate of the AI models.
S503, taking the target reasoning frame rate as the running reasoning frame rate of the first target AI model.
Because each time the driving scene changes the method of the present application adjusts the running inference frame rate of each AI model to its target inference frame rate (that is, this has already been done before the current driving scene change), the target inference frame rate corresponding to that AI model in the second AI model set may be taken as the running inference frame rate of the first target AI model.
For the method described in fig. 5, another possible implementation is to obtain, according to the identification of the first target AI model, the current inference frame rate of the corresponding AI model in the second AI model set, and to take that current inference frame rate as the running inference frame rate of the first target AI model. Because the running inference frame rate of an AI model is only adjusted to within an error range of the target inference frame rate when the driving scene changes (i.e., before the current change), this implementation can further improve the accuracy of the running inference frame rate of the first target AI model.
In the following, taking the method described in any of fig. 4 or fig. 5 as an example, how to adjust the GPU resources occupied by the first target AI model according to the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model in step S204 is described in detail.
Case 1: the operation reasoning frame rate of the first target AI model is larger than the target reasoning frame rate of the first target AI model, namely the GPU resource occupancy rate of the current first target AI model is too high, and the condition of wasting resources exists.
If the operation inference frame rate of the first target AI model is greater than the target inference frame rate of the first target AI model, and the difference between the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model is greater than or equal to a preset frame rate threshold, reducing the GPU resource occupancy rate of the first target AI model according to a first preset adjustment step length until the operation inference frame rate of the first target AI model is within the deviation range of the target inference frame rate of the first target AI model.
The preset frame rate threshold may be determined according to actual requirements, and may be the same or different for different first target AI models, which is not limited in this application. When the difference is greater than or equal to the preset frame rate threshold, it indicates that the GPU resource occupancy of the first target AI model is too high and considerable resources are wasted, so the GPU resource occupancy of the first target AI model needs to be reduced to save resources.
The first preset adjustment step may be determined according to actual requirements. For example, it may be made as small as possible, so that each adjustment of the GPU resource occupancy is small and the adjustment of the running inference frame rate of the first target AI model is more precise; by detecting the running inference frame rate after each adjustment and judging whether further adjustment is needed, the running inference frame rate can be brought within as small a deviation range of the target inference frame rate of the first target AI model as possible.
Case 2: the operation reasoning frame rate of the first target AI model is smaller than the target reasoning frame rate of the first target AI model, namely, the GPU resource occupancy rate representing the current first target AI model is too low, the condition of insufficient resources exists, and the problem of driving danger caused by the fact that the reasoning frame rate of the first target AI model is too low is easily caused.
If the operation inference frame rate of the first target AI model is smaller than the target inference frame rate of the first target AI model, and the difference between the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model is greater than or equal to the preset frame rate threshold, the GPU resource occupancy rate of the first target AI model is increased according to a second preset adjustment step length until the operation inference frame rate of the first target AI model is within the deviation range of the target inference frame rate of the first target AI model.
The preset frame rate threshold here may be the same as that in case 1 or a different threshold; it may be determined and adjusted according to actual requirements, which is not limited in this application. When the difference is greater than or equal to the preset frame rate threshold, it indicates that the GPU resource occupancy of the first target AI model is too low and resources are insufficient, so the GPU resource occupancy of the first target AI model needs to be increased to meet the inference requirement.
The second preset adjustment step may be determined according to actual requirements, and may be equal to or greater than the first preset adjustment step. When the second preset adjustment step is set greater than the first preset adjustment step, it can be made as large as practical, so that the GPU resource occupancy of the first target AI model is raised quickly and the adjustment time is shortened, reducing the driving hazard that would arise if the adjustment took too long while the running inference frame rate, and hence the inference capability, of the first target AI model remained below the requirement.
By detecting the running inference frame rate after each adjustment and judging whether further adjustment is needed, the running inference frame rate can be brought within the deviation range of the target inference frame rate of the first target AI model as quickly as possible.
For this adjustment method, an example of acquiring the operation inference frame rate by detecting the current operation condition of the first target AI model may be described as a flowchart shown in fig. 6. Fig. 6 is a flowchart of another intelligent automobile GPU resource optimization method based on environmental awareness according to an embodiment of the present application. As shown in fig. 6, the method may include:
S601, detecting the operation condition of a first target AI model.
The running condition of the first target AI model can be monitored in real time through Profiling tools and the like.
S602, acquiring an operation reasoning frame rate of the first target AI model according to the operation condition.
S603, judging whether the running reasoning frame rate meets the target reasoning frame rate. If yes, step S607 is executed, and if not, step S604 is executed.
Specifically, it is determined whether the difference between the running inference frame rate and the target inference frame rate is greater than or equal to a preset difference threshold; if so, the target is not met, and if not, it is met. The preset difference threshold may also serve as the deviation range of the target inference frame rate of the first target AI model.
S604, judging whether the running inference frame rate is too high or too low. If too high, the GPU resource occupancy of the first target AI model needs to be reduced, and step S605 is performed; if too low, the GPU resource occupancy of the first target AI model needs to be increased, and step S606 is performed.
S605, reducing the GPU resource occupancy rate of the first target AI model according to the first preset adjustment step length, and executing step S601.
And S606, improving the GPU resource occupancy rate of the first target AI model according to the second preset adjustment step length, and executing the step S601.
S607, stopping adjustment.
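The loop of steps S601 to S607 could be sketched as follows (illustrative only; get_running_fps and set_gpu_share are hypothetical stand-ins for the frame rate monitoring and the MPS-based occupancy control described above):

```python
def regulate_gpu_share(
    get_running_fps,        # callable returning the current running inference frame rate
    set_gpu_share,          # callable applying a GPU resource occupancy (in percent)
    target_fps: float,
    initial_share: float,
    fps_tolerance: float = 5.0,    # deviation range / preset difference threshold
    step_down: float = 1.0,        # first preset adjustment step
    step_up: float = 5.0,          # second preset adjustment step
    max_iterations: int = 100,
) -> float:
    """Adjust GPU occupancy until the running frame rate is near the target."""
    share = initial_share
    set_gpu_share(share)
    for _ in range(max_iterations):
        running_fps = get_running_fps()                  # S601 / S602
        if abs(running_fps - target_fps) < fps_tolerance:
            break                                        # S603 -> S607: stop adjusting
        if running_fps > target_fps:                     # S604: frame rate too high
            share = max(share - step_down, 1.0)          # S605: release GPU resources
        else:                                            # frame rate too low
            share = min(share + step_up, 100.0)          # S606: grant more GPU resources
        set_gpu_share(share)
    return share
```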
According to the method provided by this embodiment of the application, by determining the different kinds of first target AI model and acquiring their running inference frame rates, the running inference frame rate of the first target AI model is adjusted step by step, using preset adjustment steps, to within the error range of its target inference frame rate. This realizes adjustment of the GPU resource occupancy of multiple AI models on one GPU and improves both the accuracy and the speed of adjusting the GPU resource occupancy of the first target AI model.
Fig. 7 is a flow chart of another intelligent automobile GPU resource optimization method based on environmental awareness according to an embodiment of the present application. As shown in fig. 7, the method may further include:
s701, determining a second target AI model with the maximum target inference frame rate in the first AI model set.
The AI model with the maximum target inference frame rate in the first AI model set is determined as the second target AI model by comparing the target inference frame rates within the first AI model set. Since the inference frame rate of an AI model is affected not only by its GPU resource occupancy but also by the performance of the GPU, if the GPU performance is too low, the running inference frame rate of the AI model may not be adjustable to within the error range of its target inference frame rate even when the AI model's GPU resource occupancy is raised very high.
In this embodiment, GPU frequency is mainly taken as the example of GPU performance to describe how to satisfy the target inference frame rates of all AI models in the first AI model set; it should be understood that other parameters characterizing GPU performance may also be adjusted to implement the method of the present application, which is not limited here. When the GPU frequency is increased, the inference frame rates of all AI models running on the GPU resources increase; when the GPU frequency is decreased, the inference frame rates of all AI models running on the GPU resources decrease.
S702, adjusting the frequency of the GPU running the AI model in the first AI model set according to the running inference frame rate of the second target AI model and the target inference frame rate of the second target AI model.
One possible implementation is to determine a mapping relationship between the change in the running inference frame rate of the second target AI model and the change in GPU frequency, and to determine and adjust the frequency of the GPU running the AI models in the first AI model set according to that mapping relationship, so as to reduce the gap between the running inference frame rate and the target inference frame rate.
In another possible implementation, the frequency of the GPU running the AI models in the first AI model set is adjusted multiple times by preset steps, so that the running inference frame rate approaches the target inference frame rate. In this implementation, the adjustment of the GPU frequency may be realized by the following method.
If the operation inference frame rate of the second target AI model is greater than the target inference frame rate of the second target AI model, and the difference between the operation inference frame rate of the second target AI model and the target inference frame rate of the second target AI model is greater than or equal to a preset frame rate threshold, reducing the frequency of the GPU according to a third preset adjustment step length until the operation inference frame rate of the second target AI model is within the deviation range of the target inference frame rate of the second target AI model;
if the operation inference frame rate of the second target AI model is smaller than the target inference frame rate of the second target AI model, and the difference between the operation inference frame rate of the second target AI model and the target inference frame rate of the second target AI model is greater than or equal to the preset frame rate threshold, the frequency of the GPU is increased according to a fourth preset adjustment step length until the operation inference frame rate of the second target AI model is within the deviation range of the target inference frame rate of the second target AI model.
The implementation manner of the adjustment of the GPU frequency is similar to the manner of adjusting the GPU resource occupancy rate of the first target AI model described in fig. 6, and the implementation method and the technical effect thereof are similar, and are not repeated here.
According to the method provided by this embodiment of the application, the second target AI model with the maximum target inference frame rate in the first AI model set is determined, and the frequency of the GPU running the first AI model set is adjusted according to the running inference frame rate and the target inference frame rate of the second target AI model, so that the running inference frame rates of all AI models in the first AI model set fall within the deviation range of the target inference frame rate corresponding to each AI model in the driving scene, improving the system stability and resource rationality of the intelligent driving operating system.
In this way, more efficient and energy-saving intelligent driving services can be provided for intelligent automobiles running in different scenes and conditions, improving the performance and safety of the intelligent automobile, reducing its operating and maintenance costs, and improving the driving experience and satisfaction of users. It should be appreciated that the present application may also serve as a reference for intelligent system development on other similar embedded system platforms, such as unmanned aerial vehicles and robots, so as to achieve better GPU resource management and optimization.
Fig. 8 is a schematic structural diagram of an intelligent automobile GPU resource optimization device based on environment awareness according to an embodiment of the present application. As shown in Fig. 8, the device may include a determining module 11, an acquiring module 12, and a processing module 13.
The determining module 11 is configured to determine that the driving scene of the intelligent automobile has changed, and to determine, according to the perception information of the changed driving scene, a first target AI model in a first AI model set corresponding to the changed driving scene and a target inference frame rate of the first target AI model. The AI models in the first AI model set are used to implement intelligent driving functions.
The acquiring module 12 is configured to run the first target AI model using GPU resources and obtain the running inference frame rate of the first target AI model.
The processing module 13 is configured to adjust the GPU resources occupied by the first target AI model according to the running inference frame rate and the target inference frame rate of the first target AI model, and to run the first target AI model using the adjusted GPU resources.
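To show how the three modules could cooperate, the following skeleton is a sketch under assumed class and method names; it is not the actual implementation of the device in Fig. 8.

    # Illustrative skeleton of the three-module device in Fig. 8; the class and
    # method names below are assumptions.
    class GpuResourceOptimizerDevice:
        def __init__(self, determining_module, acquiring_module, processing_module):
            self.determining = determining_module   # determining module 11
            self.acquiring = acquiring_module       # acquiring module 12
            self.processing = processing_module     # processing module 13

        def on_scene_change(self, perception_info):
            # Module 11: pick the first target AI model and its target frame rate.
            model, target_fps = self.determining.select_target(perception_info)
            # Module 12: run the model on GPU resources and measure its frame rate.
            running_fps = self.acquiring.run_and_measure(model)
            # Module 13: adjust the GPU resources occupied by the model, then
            # continue running it with the adjusted resources.
            self.processing.adjust(model, running_fps, target_fps)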
Optionally, to determine that the driving scene of the intelligent automobile has changed, the acquiring module 12 may receive a driving-mode switching instruction of the intelligent automobile and determine the scene change from that instruction, or acquire surrounding-environment perception information of the intelligent automobile and determine the scene change from that perception information.
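The two detection paths above can be summarized in the following sketch; the field names in the perception message (for example, tunnel_entered) are hypothetical and only illustrate combining a mode-switch instruction with perception-based cues.

    # Illustrative sketch of the two ways to detect a driving-scene change.
    def scene_changed(mode_switch_cmd, surroundings: dict) -> bool:
        # Path 1: an explicit driving-mode switching instruction (e.g. city ->
        # highway) directly indicates a scene change.
        if mode_switch_cmd is not None:
            return True
        # Path 2: infer the change from surrounding-environment perception,
        # e.g. entering a tunnel or a reported weather change.
        return bool(surroundings.get("tunnel_entered") or surroundings.get("weather_changed"))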
In one possible implementation, the determining module 11 is specifically configured to determine, according to the perception information of the changed driving scene, the first AI model set corresponding to the changed driving scene and a first inference frame rate set of the first AI model set. The acquiring module 12 is specifically configured to acquire a second AI model set corresponding to the driving scene before the change and a second inference frame rate set of the second AI model set. The determining module 11 is specifically configured to determine the first target AI model in the first AI model set according to the first AI model set, the first inference frame rate set, the second AI model set, and the second inference frame rate set. The first inference frame rate set includes the target inference frame rate of each AI model in the first AI model set, and the second inference frame rate set includes the target inference frame rate of each AI model in the second AI model set. The first target AI model is an AI model that differs from every AI model in the second AI model set, or an AI model whose target inference frame rate in the second AI model set has changed.
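The following sketch illustrates this set comparison, representing each model set as a dictionary from model identifier to target inference frame rate; the dictionary representation and example values are assumptions.

    # Illustrative sketch of selecting target AI models by comparing the two sets.
    def find_first_target_models(first_set: dict, second_set: dict) -> list:
        """first_set / second_set map AI model identifiers to target inference
        frame rates for the changed and pre-change driving scenes respectively.
        Returns models that are new to the scene or whose target rate changed."""
        targets = []
        for model_id, target_fps in first_set.items():
            if model_id not in second_set:
                targets.append(model_id)           # model not in the second set
            elif second_set[model_id] != target_fps:
                targets.append(model_id)           # target inference frame rate changed
        return targets

    # Hypothetical example: lane detection is new; traffic-light detection now
    # needs a higher target inference frame rate.
    first_set = {"lane_detect": 30, "traffic_light": 20, "pedestrian": 15}
    second_set = {"traffic_light": 10, "pedestrian": 15}
    print(find_first_target_models(first_set, second_set))  # ['lane_detect', 'traffic_light']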
Optionally, when the first target AI model is an AI model that does not belong to the second AI model set, the acquiring module 12 is specifically configured to determine an initial GPU resource for running the first target AI model, and to run the first target AI model using the initial GPU resource to obtain the running inference frame rate of the first target AI model.

Optionally, when the first target AI model is an AI model whose target inference frame rate in the second AI model set has changed, the acquiring module 12 is specifically configured to acquire the identifier of the first target AI model, acquire the target inference frame rate of the AI model corresponding to that identifier in the second AI model set, and use that target inference frame rate as the running inference frame rate of the first target AI model.
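Combining the two cases above, the acquiring module could obtain the running inference frame rate as in the sketch below; run_on_gpu and the initial GPU resource share are assumed placeholders, not part of this application.

    # Illustrative sketch of obtaining the running inference frame rate.
    def get_running_fps(model_id: str, second_set: dict, run_on_gpu) -> float:
        if model_id not in second_set:
            # Case 1: the model is new to the scene. Run it on an initial GPU
            # resource allocation and measure the resulting inference frame rate.
            initial_gpu_share = 0.2      # assumed initial GPU resource occupancy
            return run_on_gpu(model_id, initial_gpu_share)
        # Case 2: the model already ran in the previous scene with a different
        # target rate. Reuse its previous target inference frame rate as the
        # current running inference frame rate.
        return float(second_set[model_id])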
For either of the above two cases, in one possible implementation, the processing module 13 is specifically configured to: if the running inference frame rate of the first target AI model is greater than its target inference frame rate, and the difference between the two is greater than or equal to the preset frame rate threshold, reduce the GPU resource occupancy rate of the first target AI model by a first preset adjustment step size until the running inference frame rate of the first target AI model is within the deviation range of its target inference frame rate; if the running inference frame rate of the first target AI model is smaller than its target inference frame rate, and the difference between the two is greater than or equal to the preset frame rate threshold, increase the GPU resource occupancy rate of the first target AI model by a second preset adjustment step size until the running inference frame rate of the first target AI model is within the deviation range of its target inference frame rate.
Optionally, the determining module 11 is further configured to determine a second target AI model with the largest target inference frame rate in the first AI model set. The processing module 13 is further configured to adjust the frequency of the GPU running the AI models in the first AI model set according to the running inference frame rate and the target inference frame rate of the second target AI model.
In this implementation, the processing module 13 is specifically configured to: if the running inference frame rate of the second target AI model is greater than its target inference frame rate, and the difference between the two is greater than or equal to the preset frame rate threshold, reduce the GPU frequency by the third preset adjustment step size until the running inference frame rate of the second target AI model is within the deviation range of its target inference frame rate; if the running inference frame rate of the second target AI model is smaller than its target inference frame rate, and the difference between the two is greater than or equal to the preset frame rate threshold, increase the GPU frequency by the fourth preset adjustment step size until the running inference frame rate of the second target AI model is within the deviation range of its target inference frame rate.
The intelligent automobile GPU resource optimization device based on environment awareness provided by the embodiments of the present application can execute the intelligent automobile GPU resource optimization method based on environment awareness in the foregoing method embodiments; the implementation principles and technical effects are similar and are not repeated here.
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device is configured to execute the above intelligent automobile GPU resource optimization method based on environment awareness, and may be, for example, the above-mentioned automobile terminal device. As shown in Fig. 9, the electronic device 900 may include: at least one processor 901, a memory 902, and a communication interface 903.
The memory 902 is configured to store a program. Specifically, the program may include program code comprising computer operation instructions.

The memory 902 may include a high-speed RAM and may further include a non-volatile memory, such as at least one disk storage.
The processor 901 is configured to execute computer-executable instructions stored in the memory 902 to implement the methods described in the foregoing method embodiments. The processor 901 may be a CPU, or an application specific integrated circuit (Application Specific Integrated Circuit, abbreviated as ASIC), or one or more integrated circuits configured to implement embodiments of the present application.
The processor 901 may communicate with external devices, such as sensors, through the communication interface 903. In a specific implementation, if the communication interface 903, the memory 902, and the processor 901 are implemented independently, they may be connected to each other and communicate with each other through a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on, but this does not mean there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the communication interface 903, the memory 902, and the processor 901 are integrated on a chip, the communication interface 903, the memory 902, and the processor 901 may complete communication through internal interfaces.
The present application also provides a computer-readable storage medium, which may include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code. Specifically, the computer-readable storage medium stores program instructions for the methods in the foregoing embodiments.
The present application also provides a program product, which includes execution instructions stored in a readable storage medium. At least one processor of a computing device may read the execution instructions from the readable storage medium and execute them, so that the computing device implements the above intelligent automobile GPU resource optimization method based on environment awareness.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some or all of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. The intelligent automobile GPU resource optimization method based on environment awareness is characterized by comprising the following steps of:
determining that the driving scene of the intelligent automobile changes;
determining a first target AI model in a first AI model set corresponding to the changed driving scene according to the perception information of the changed driving scene, and a target inference frame rate of the first target AI model; the AI models in the first AI model set are used for realizing intelligent driving functions;
operating the first target AI model by using GPU resources, and acquiring an operation reasoning frame rate of the first target AI model;
and adjusting GPU resources occupied by the first target AI model according to the running reasoning frame rate of the first target AI model and the target reasoning frame rate of the first target AI model.
2. The method of claim 1, wherein the determining a first target AI model in the first AI model set corresponding to the changed driving scenario based on the perceived information of the changed driving scenario comprises:
determining the first AI model set corresponding to the changed driving scene and a first reasoning frame rate set of the first AI model set according to the perception information of the changed driving scene, wherein the first reasoning frame rate set comprises a target reasoning frame rate of each AI model in the first AI model set;
acquiring a second AI model set corresponding to the driving scene before the change, and a second reasoning frame rate set of the second AI model set, wherein the second reasoning frame rate set comprises a target reasoning frame rate of each AI model in the second AI model set;
and determining a first target AI model in the first AI model set according to the first AI model set, the first reasoning frame rate set, the second AI model set and the second reasoning frame rate set, wherein the first target AI model is an AI model which is different from any AI model in the second AI model set or is an AI model with a target reasoning frame rate changed in the second AI model set.
3. The method of claim 2, wherein the obtaining the operational inference frame rate of the first target AI model comprises:
when the first target AI model is an AI model not belonging to the second AI model set, determining an initial GPU resource for running the first target AI model;
and operating the first target AI model by using the initial GPU resource to obtain the operation reasoning frame rate of the first target AI model.
4. The method of claim 2, wherein obtaining the operational inference frame rate of the first target AI model comprises:
when the first target AI model is the AI model with the target inference frame rate changed in the second AI model set, acquiring the identification of the first target AI model;
acquiring a target inference frame rate of an AI model corresponding to the identifier in the second AI model set according to the identifier of the first target AI model;
and taking the target inference frame rate as the operation inference frame rate of the first target AI model.
5. The method of any of claims 3 or 4, wherein the adjusting the GPU resources occupied by the first target AI model based on the operational inference frame rate of the first target AI model and the target inference frame rate of the first target AI model comprises:
if the operation inference frame rate of the first target AI model is greater than the target inference frame rate of the first target AI model, and the difference between the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model is greater than or equal to a preset frame rate threshold, reducing the GPU resource occupancy rate of the first target AI model according to a first preset adjustment step length until the operation inference frame rate of the first target AI model is within the deviation range of the target inference frame rate of the first target AI model;
if the operation inference frame rate of the first target AI model is smaller than the target inference frame rate of the first target AI model, and the difference between the operation inference frame rate of the first target AI model and the target inference frame rate of the first target AI model is greater than or equal to the preset frame rate threshold, the GPU resource occupancy rate of the first target AI model is increased according to a second preset adjustment step length until the operation inference frame rate of the first target AI model is within the deviation range of the target inference frame rate of the first target AI model.
6. The method according to any one of claims 2-5, further comprising:
determining a second target AI model with the maximum target inference frame rate in the first AI model set;
and adjusting the frequency of the GPU running the AI model in the first AI model set according to the running inference frame rate of the second target AI model and the target inference frame rate of the second target AI model.
7. The method of claim 6, wherein the adjusting the frequency at which the GPU operates the AI model of the first AI model set based on the operating inference frame rate of the second target AI model and the target inference frame rate of the second target AI model comprises:
if the operation inference frame rate of the second target AI model is greater than the target inference frame rate of the second target AI model, and the difference between the operation inference frame rate of the second target AI model and the target inference frame rate of the second target AI model is greater than or equal to a preset frame rate threshold, reducing the frequency of the GPU according to a third preset adjustment step length until the operation inference frame rate of the second target AI model is within the deviation range of the target inference frame rate of the second target AI model;
if the operation inference frame rate of the second target AI model is smaller than the target inference frame rate of the second target AI model, and the difference between the operation inference frame rate of the second target AI model and the target inference frame rate of the second target AI model is greater than or equal to the preset frame rate threshold, the frequency of the GPU is increased according to a fourth preset adjustment step length until the operation inference frame rate of the second target AI model is within the deviation range of the target inference frame rate of the second target AI model.
8. The method of any one of claims 1-5, wherein said determining that a driving scenario of the smart car has changed comprises:
receiving a driving mode switching instruction of the intelligent automobile, and determining that the driving scene of the intelligent automobile changes according to the driving mode switching instruction;
or,
and acquiring surrounding environment perception information of the intelligent automobile, and determining that the driving scene of the intelligent automobile changes according to the surrounding environment perception information.
9. An intelligent automobile GPU resource optimization device based on environment awareness, which is characterized by comprising:
the determining module is used for determining that the driving scene of the intelligent automobile changes; determining a first target AI model in a first AI model set corresponding to the changed driving scene according to the perception information of the changed driving scene, and a target inference frame rate of the first target AI model; the AI models in the first AI model set are used for realizing intelligent driving functions;
the acquisition module is used for operating the first target AI model by using GPU resources and acquiring the operation reasoning frame rate of the first target AI model;
and the processing module is used for adjusting GPU resources occupied by the first target AI model according to the running reasoning frame rate of the first target AI model and the target reasoning frame rate of the first target AI model.
10. An electronic device, comprising: the processor is respectively in communication connection with the communication interface and the memory;
the memory stores computer-executable instructions;
the communication interface performs communication interaction with external equipment;
the processor executes computer-executable instructions stored in the memory to implement the method of any one of claims 1-8.
CN202310540229.3A 2023-05-12 2023-05-12 Intelligent automobile GPU resource optimization method and device based on environment awareness Pending CN116560843A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310540229.3A CN116560843A (en) 2023-05-12 2023-05-12 Intelligent automobile GPU resource optimization method and device based on environment awareness

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310540229.3A CN116560843A (en) 2023-05-12 2023-05-12 Intelligent automobile GPU resource optimization method and device based on environment awareness

Publications (1)

Publication Number Publication Date
CN116560843A true CN116560843A (en) 2023-08-08

Family

ID=87501475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310540229.3A Pending CN116560843A (en) 2023-05-12 2023-05-12 Intelligent automobile GPU resource optimization method and device based on environment awareness

Country Status (1)

Country Link
CN (1) CN116560843A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117009091A (en) * 2023-10-07 2023-11-07 浪潮(山东)计算机科技有限公司 Resource adjustment method, device, equipment and readable storage medium
CN117009091B (en) * 2023-10-07 2023-12-19 浪潮(山东)计算机科技有限公司 Resource adjustment method, device, equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN108475056B (en) Method for fully automatically guiding a vehicle system and motor vehicle
CN111301427A (en) Method and driver assistance system for determining a lane and vehicle
CN110910657A (en) Intersection right-of-way distribution method and device and electronic equipment
CN112937593B (en) Vehicle motion control method, device and system and computer equipment
US10154419B2 (en) Control of data connections and/or data transmissions in a mobile radio device
CN111338360B (en) Method and device for planning vehicle driving state
GB2556655A (en) Image processing algorithm
CN116560843A (en) Intelligent automobile GPU resource optimization method and device based on environment awareness
CN109219839A (en) Control method for vehicle, apparatus and system
CN113259905A (en) Adaptive operation vehicle road cooperation method, device and system
CN112088117A (en) Method for operating a motor vehicle to improve the operating conditions of an evaluation unit of the motor vehicle, control system for carrying out the method, and motor vehicle having the control system
CN110162026B (en) Object recognition system, method and device
CN111932884A (en) Traffic jam recognition method and device, vehicle and storage medium
CN112230657A (en) Intelligent vehicle-oriented regional collaborative driving intention scheduling method, system and medium
CN115061808A (en) Vehicle cloud computing power scheduling method and device, electronic equipment and storage medium
CN114162068B (en) Method and device for managing intelligent driving function of vehicle and vehicle
KR20200140957A (en) Apparatus for controlling speed of platooning vehicle and method thereof
CN113593221A (en) Information value evaluation type driving system, internet vehicle system and data transmission method
CN112849144A (en) Vehicle control method, device and storage medium
CN115648957B (en) Vehicle control method, device, storage medium and chip
US20220402490A1 (en) Vehicle Control Method and Apparatus
CN115431961A (en) Vehicle control method and device, vehicle and storage medium
US20240053161A1 (en) Method for Predicting a Velocity Profile of a Vehicle
CN117435329A (en) Method and device for adjusting computing platform and intelligent equipment
CN115179943A (en) Automatic guiding of a motor vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination