CN116225688A - Multi-core collaborative rendering processing method based on GPU instruction forwarding - Google Patents

Multi-core collaborative rendering processing method based on GPU instruction forwarding

Info

Publication number
CN116225688A
CN116225688A (application CN202211608792.1A)
Authority
CN
China
Prior art keywords
gpu
graphic
command
operating system
core
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211608792.1A
Other languages
Chinese (zh)
Inventor
廖科
曲国远
郭文骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Aeronautical Radio Electronics Research Institute
Original Assignee
China Aeronautical Radio Electronics Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Aeronautical Radio Electronics Research Institute filed Critical China Aeronautical Radio Electronics Research Institute
Priority to CN202211608792.1A priority Critical patent/CN116225688A/en
Publication of CN116225688A publication Critical patent/CN116225688A/en
Pending legal-status Critical Current

Classifications

    • G06F 9/5016: Allocation of resources to service a request, the resource being memory
    • G06F 9/30047: Prefetch instructions; cache control instructions
    • G06F 9/3814: Implementation provisions of instruction buffers, e.g. prefetch buffer; banks
    • G06F 9/5044: Allocation of resources to service a request, the resource being a machine, considering hardware capabilities
    • G06F 9/544: Interprogram communication using buffers; shared memory; pipes
    • G06T 1/20: Processor architectures; processor configuration, e.g. pipelining
    • G06T 1/60: Memory management
    • G06T 15/005: 3D image rendering; general purpose rendering architectures
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The embodiment of the invention discloses a multi-core collaborative rendering processing method based on GPU instruction forwarding, which comprises the following steps: in each partition operating system, the user-mode driver software establishes an independent GPU command buffer and data buffer; when the graphics application software calls a graphics API, the user-mode driver software converts the call into GPU commands and data, which are cached in the GPU command buffer and the data buffer respectively; when a graphics API call from the graphics application software triggers GPU command submission, the user-mode driver software submits the GPU commands to the GPU command "forwarding" service in the GPU management module; the GPU management module then transfers the GPU commands to the GPU through the ring buffer in the shared storage area, and the GPU executes the graphics rendering described by those commands. By effectively exploiting the multi-core capability of the processor, the technical scheme provided by the embodiment of the invention removes the CPU performance bottleneck caused by traditional single-threaded graphics rendering, thereby effectively improving airborne graphics drawing performance.

Description

Multi-core collaborative rendering processing method based on GPU instruction forwarding
Technical Field
The invention relates to the technical field of graphics drivers in avionics systems, and in particular to a multi-core collaborative rendering processing method based on GPU instruction forwarding.
Background
With the continuous increase in the complexity and resolution of on-board displays, graphics systems have become larger and more complex. Analysis of typical on-board graphics rendering scenes shows that in many of them the performance bottleneck sits at the CPU: the CPU is fully loaded while the GPU is often idle. The problem is especially prominent in scenes that draw many Chinese characters.
Due to technological limitations, single-core CPU performance has reached a bottleneck, and single-core performance gains alone cannot satisfy future high-performance graphics rendering requirements. Only by fully utilizing existing multi-core platforms and exploiting their multi-core performance can the basic performance requirements of application programs be effectively guaranteed.
Disclosure of Invention
The purpose of the invention is as follows: the embodiment of the invention provides a multi-core collaborative rendering processing method based on GPU instruction forwarding, which removes the CPU performance bottleneck caused by traditional single-threaded graphics rendering by effectively exploiting the multi-core capability of the processor, thereby effectively improving airborne graphics drawing performance.
The technical scheme of the invention is as follows: the embodiment of the invention provides a multi-core collaborative rendering processing method based on GPU instruction forwarding. The hardware for executing multi-core collaborative rendering comprises a multi-core CPU and a GPU; the software comprises an operating system supporting multi-core partitioning, graphics driver software, and graphics application software. The multi-core CPU runs multiple partition operating systems, and each partition operating system is allocated an independent CPU core, independent memory space, and independent peripherals. The graphics application software and the GPU's user-mode driver software run on each partition operating system; the graphics driver software comprises the user-mode driver software running in each partition operating system and a GPU management module running in the operating system virtualization layer, where the GPU management module provides a GPU command "forwarding" service for each partition operating system. The multi-core collaborative rendering processing method comprises the following steps:
Step 1: in each partition operating system, the user-mode driver software establishes an independent GPU command buffer and data buffer;
Step 2: in each partition operating system, when the graphics application software calls a graphics API in the user-mode driver software, the user-mode driver software converts the called graphics API into GPU commands and data, and caches them in the GPU command buffer and the data buffer respectively;
Step 3: in each partition operating system, when a graphics API call from the graphics application software triggers GPU command submission, the user-mode driver software in that partition operating system submits the GPU commands, according to the agreed protocol, to the GPU command "forwarding" service in the GPU management module;
Step 4: the GPU management module transfers the GPU commands to the GPU through the ring buffer in the operating system's shared storage area, so that the GPU executes the graphics rendering described by those commands.
Optionally, in the multi-core collaborative rendering processing method based on GPU instruction forwarding as described above,
the GPU management module provides the GPU command "forwarding" service for each partition operating system, so that all GPU commands submitted to the GPU by the partition operating systems are transferred through the GPU command "forwarding" service of the GPU management module.
Optionally, in the multi-core collaborative rendering processing method based on GPU instruction forwarding as described above,
in step 3, a graphics API call from the graphics application software triggers GPU command submission in the following cases:
Case 1: when the graphics application software calls a graphics API, the called graphics API converts into a specific GPU command, and the user-mode driver software thereby triggers GPU command submission;
Case 2: when the graphics application software calls a graphics API, the user-mode driver software converts the call into GPU commands and data, and GPU command submission is triggered when the GPU command buffer or the data buffer becomes full.
Optionally, in the multi-core collaborative rendering processing method based on GPU instruction forwarding as described above,
in step 3, in each partition operating system, after a graphics API call from the graphics application software triggers GPU command submission, the GPU command "forwarding" service provided by the GPU management module is invoked, and the "forwarding" service caches the GPU commands submitted by the user-mode driver software in the ring buffer.
Optionally, in the multi-core collaborative rendering processing method based on GPU instruction forwarding as described above, after the GPU command "forwarding" service caches the GPU commands submitted by the user-mode driver software in the ring buffer,
the GPU command "forwarding" service schedules according to the priority of the GPU commands submitted by the partition operating systems, so that the GPU renders the GPU commands submitted by the partition operating systems in a time-shared manner.
Optionally, in the method for processing multi-core collaborative rendering based on GPU instruction forwarding as described above, the method further includes:
the GPU management module monitors the GPU load in each partition operating system to realize load balancing of GPU commands in each partition operating system.
Optionally, in the multi-core collaborative rendering processing method based on GPU instruction forwarding as described above,
in step 2, the actual copying operations for the GPU commands and data are replaced by pointer transfers, which ensures the timeliness of command forwarding.
Optionally, in the method for processing multi-core collaborative rendering based on GPU instruction forwarding as described above, the method further includes:
when the state in which the GPU is executing graphics rendering is inconsistent with the state of the current partition operating system, the GPU command "forwarding" service switches the GPU's execution state to the rendering context of the current partition operating system, and then executes the GPU commands from that partition operating system's GPU command buffer to complete the GPU command submission, so that the GPU executes the graphics processing tasks of all partitions.
Optionally, in the multi-core collaborative rendering processing method based on GPU instruction forwarding as described above,
the service process of the GPU management module is in a silent state by default; it is woken when a partition operating system submits GPU commands, and after the submission has been processed it returns to the silent state.
Optionally, in the method for processing multi-core collaborative rendering based on GPU instruction forwarding as described above, the method further includes:
the graphics results rendered by the GPU are transmitted through the video controller to different displays for display.
The invention has the following beneficial effects: the embodiment of the invention provides a multi-core collaborative rendering processing method based on GPU instruction forwarding, specifically a shared access scheme built on GPU instruction forwarding. The multi-core collaborative rendering processing method provided by the embodiment of the invention can break through the single-core computing power bottleneck, improve the performance of graphics applications that load the GPU lightly, and raise the system's integration level for graphics processing. Compared with the three existing shared access modes, the interference between graphics tasks residing on different CPU cores can be effectively controlled, and a fault in a single graphics processing task does not affect other tasks; the method avoids the disastrous effect of heavily loaded low-priority tasks on lightly loaded high-priority tasks, and the graphics applications running on the individual CPU cores are isolated from one another, so safety and reliability are better guaranteed and practicality is very good.
Furthermore, the technical scheme provided by the embodiment of the invention does not depend on a specific hardware platform, offers good adaptability and flexibility, is simple and easy to implement, and suits many multi-core collaborative rendering scenarios; it also generalizes well, with broad market applicability and clear economic benefit.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification; they illustrate the invention and do not limit it.
Fig. 1 is a schematic diagram of a multi-core collaborative rendering processing method based on GPU instruction forwarding according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions, and advantages of the present invention more apparent, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, provided no conflict arises, the embodiments and the features in the embodiments may be combined with each other arbitrarily.
As noted in the background art, in the airborne graphics processing technology of avionics systems, single-core CPU performance has already reached a bottleneck, and single-core performance gains cannot satisfy future high-performance graphics rendering requirements. To address this problem, the existing multi-core platform must be fully utilized and its multi-core performance exploited so that the basic performance requirements of application programs can be effectively guaranteed.
At present, multi-core platforms are developing rapidly: mainstream embedded processor manufacturers such as NXP, and domestic Chinese vendors such as Phytium and Loongson, offer CPUs with 4 or 8 cores, so effectively utilizing the advantages of multi-core CPUs to improve graphics rendering performance has become an urgent problem. To improve GPU utilization, multiple CPU cores must be able to use the GPU for graphics rendering; that is, a multi-core sharing capability of the GPU, also called GPU virtualization, must be realized.
The existing multi-core sharing access method of the GPU mainly comprises the following modes:
mode one: shared access based on graphics Application Programming Interface (API) forwarding
Referring to fig. 1, a schematic diagram of a prior art shared access based on forwarding by a graphics Application Programming Interface (API) is shown. The principle of the scheme is that graphics API forwarding software and graphics application software are run on a graphics API client partition. The graphic API called when the graphic application software processes the graphic task is sent to the partition processing of the graphic API server side through the graphic API forwarding software according to the agreed protocol. Graphics driver software is not needed on the graphics API client partition, the graphics driver software is not directly crosslinked with the GPU, and the operation of the graphics driver software is not related to the specific GPU, so that the graphics API client partition has very high universality. However, the graphics API forwarding method has a large data processing amount and a high application processing delay, which is extremely difficult to solve.
The partition of the graphic API server side exclusive to the GPU runs the graphic driving software and the graphic API service software. The graphic API service software receives the graphic API call and related data forwarded from the graphic API client partition according to the agreed protocol, analyzes the graphic API call and related data into a graphic API, calls a command stream converted into a GPU by a graphic driver, and sends the command stream to the GPU for execution, so that the graphic task processing is completed. The data processing amount of the graphic API forwarding mode is large, the processing load of the CPU of the graphic API service software is very high, and in addition, the CPU load of the graphic driving software is also very high, so that the partition of the graphic API server end is very easy to become a bottleneck of graphic processing.
The shared access manner based on the graphics API forwarding is basically exclusive to the GPU. The processing load of a graphics application can be distributed to multiple processor cores for parallel execution by splitting into multiple graphics API client partitions. All the load of the graphics driver software is not resolved and is still concentrated on the processor cores of the graphics API server-side partition. The load of the graphics API forwarding software is distributed over the processor cores of the individual API client partitions to execute in parallel, but the load of the graphics API service software is all concentrated in the graphics API server-side partition.
Mode two: GPU direct shared access based on mutual exclusion
In this mode, each partition runs graphics application software and graphics driver software, and the processor cores of each partition acquire GPU access rights through a mutual-exclusion sharing mechanism before accessing the GPU directly. A partition issues a GPU access request; once it obtains GPU access it takes the mutex and operates the GPU directly, and no other partition can access the GPU during this time. The GPU's execution state is switched to the rendering context of the current partition, a command packet is inserted into the ring buffer to submit the graphics instructions, the GPU is started on the current partition's graphics processing task, and the mutex is released after execution completes.
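The acquire, submit, and release sequence of this mutual-exclusion mode can be sketched as follows. This is an illustrative sketch only: the structure, field names, and the use of a POSIX mutex as the mutual-exclusion mechanism are assumptions for demonstration, not details taken from any concrete driver.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared-GPU state guarded by one global mutex (mode two). */
typedef struct {
    pthread_mutex_t gpu_lock;
    int owner;          /* partition currently holding the GPU, -1 if free */
    int submissions;    /* count of completed submissions (for illustration) */
} shared_gpu_t;

static void gpu_exclusive_submit(shared_gpu_t *g, int partition)
{
    pthread_mutex_lock(&g->gpu_lock);   /* other partitions now block here */
    g->owner = partition;
    /* ... switch the GPU context to this partition, insert a command packet
     * into the ring buffer, start the GPU, wait for completion ... */
    g->submissions++;
    g->owner = -1;
    pthread_mutex_unlock(&g->gpu_lock); /* release for the next partition */
}
```

The sketch also makes the mode's weakness visible: if a partition dies between lock and unlock, every other partition blocks forever, which is exactly problem (c) discussed below.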
Mode three: shared access based on GPU hardware virtualization
Shared access based on GPU virtualization requires hardware virtualization support in the GPU: a graphics virtualization manager is integrated on the GPU and can be configured to expose multiple virtual GPUs. A processor core operates a virtual GPU exactly as it would an exclusively owned GPU, with the virtual GPUs distinguished by ID; logically this is equivalent to a multi-core processor connected to multiple physical GPUs. Each partition runs graphics application software and graphics driver software, and the GPU's built-in virtualization manager handles the use and scheduling of GPU resources among the partitions. The graphics driver on a partition establishes an independent GPU command buffer and data buffer; when the application calls a graphics API, the graphics driver converts the call into GPU instructions and data, caches them in those buffers, and accesses the virtual GPU exclusively whenever required to submit the GPU instructions and execute the graphics processing task.
The implementation schemes of the three GPUs for multi-core shared access have respective advantages and disadvantages, and are specifically described as follows:
1) Shared access based on graphics API forwarding has good generality and can realize multi-core sharing of the GPU at the level of the graphics drawing functions; however, the CPU processing bottleneck is concentrated in the graphics API server partition and is difficult to break through, so processing efficiency is hard to improve effectively, and this mode does not meet the goal of airborne multi-core parallel collaborative graphics processing;
2) GPU direct shared access based on mutual exclusion can distribute the processing load of the graphics application software and the graphics driver software across multiple processor cores for parallel execution by decomposing it into multiple partitions, and can therefore genuinely break through the CPU computing power bottleneck in graphics processing applications; but this approach has the following problems:
(a) first, overly frequent GPU task switching drastically reduces GPU efficiency;
(b) second, heavily loaded low-priority tasks can affect lightly loaded high-priority tasks, because all partitions have equal standing and no priority scheduling is possible;
(c) an abnormality in any partition can easily hang the whole system: if any partition holds the GPU access rights without releasing them, all graphics processing tasks in the system hang and stop responding.
3) In shared access based on GPU virtualization, each partition enjoys convenience similar to exclusive GPU access, and the processing load of the graphics application software and the graphics driver software can be distributed across multiple processor cores for parallel execution by decomposing it into multiple partitions, breaking through the CPU computing power bottleneck in graphics processing applications; however, only GPUs that support hardware virtualization can use this mode, so its range of application is small.
To address the problems of the three existing GPU multi-core shared access modes, the multi-core collaborative rendering processing method based on GPU instruction forwarding provided by the embodiment of the invention is specifically an improvement on the GPU direct shared access mode based on mutual exclusion, and belongs to the graphics driver implementation schemes for embedded graphics processors (GPUs) in the field of airborne graphics processing in avionics systems.
The following specific embodiments may be combined with each other, and descriptions of identical or similar concepts or processes may not be repeated in some embodiments.
Fig. 1 is a schematic diagram of the multi-core collaborative rendering processing method based on GPU instruction forwarding according to an embodiment of the present invention and illustrates its shared access principle. First, the hardware and software for executing the method are described as follows:
As shown in fig. 1, the hardware for performing multi-core collaborative rendering comprises a multi-core CPU and a GPU, and the software comprises an operating system supporting multi-core partitioning (i.e., an embedded real-time operating system supporting multi-core virtualization), graphics driver software, and graphics application software. The multi-core CPU runs multiple partition operating systems, and each partition operating system is allocated an independent CPU core, independent memory space, and independent peripherals. The graphics application software and the GPU's user-mode driver software run on each partition operating system. It should be noted that the graphics driver software in the embodiment of the present invention comprises the user-mode driver software running in each partition operating system and a GPU management module running in the operating system virtualization layer, where the GPU management module provides the GPU command "forwarding" service for each partition operating system.
Based on the hardware and software described above, the multi-core collaborative rendering processing method provided by the embodiment of the invention comprises the following steps:
Step 1: in each partition operating system, the user-mode driver software establishes an independent GPU command buffer and data buffer;
Step 2: in each partition operating system, when the graphics application software calls a graphics API in the user-mode driver software, the user-mode driver software converts the called graphics API into GPU commands and data, and caches them in the GPU command buffer and the data buffer respectively;
Step 3: in each partition operating system, when a graphics API call from the graphics application software triggers GPU command submission, the user-mode driver software in that partition operating system submits the GPU commands, according to the agreed protocol, to the GPU command "forwarding" service in the GPU management module;
Step 4: the GPU management module transfers the GPU commands to the GPU through the ring buffer in the operating system's shared storage area, so that the GPU executes the graphics rendering described by those commands.
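A minimal sketch of the shared-memory ring buffer used in step 4 is given below. The entry format, the buffer size, and all names are illustrative assumptions rather than the patent's concrete layout: one side (the "forwarding" service) pushes command handles, and the other side (the GPU fetch path) pops them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 1024u  /* entries; a power of two so indices can be masked */

/* Hypothetical single-producer/single-consumer ring in the shared storage area. */
typedef struct {
    uint64_t cmd[RING_SIZE]; /* each entry: a handle to a GPU command packet */
    uint32_t head;           /* advanced by the producer (forwarding service) */
    uint32_t tail;           /* advanced by the consumer (GPU fetch logic) */
} gpu_ring_t;

static void ring_init(gpu_ring_t *r) { memset(r, 0, sizeof *r); }

static int ring_push(gpu_ring_t *r, uint64_t cmd)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;           /* full: caller must wait for the GPU to drain */
    r->cmd[r->head & (RING_SIZE - 1)] = cmd;
    r->head++;               /* on real hardware a write barrier and a doorbell
                              * register write would follow here */
    return 0;
}

static int ring_pop(gpu_ring_t *r, uint64_t *cmd)
{
    if (r->tail == r->head)
        return -1;           /* empty: nothing for the GPU to fetch */
    *cmd = r->cmd[r->tail & (RING_SIZE - 1)];
    r->tail++;
    return 0;
}
```

Because head and tail are only ever incremented, the unsigned difference handles wraparound correctly as long as the capacity is a power of two.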
In the embodiment of the invention, the GPU management module provides the GPU command "forwarding" service for each partition operating system, so that all GPU commands submitted to the GPU by the partition operating systems are transferred through the GPU command "forwarding" service of the GPU management module.
In step 3 of the embodiment of the present invention, a graphics API call from the graphics application software triggers GPU command submission in the following two cases:
Case 1: when the graphics application software calls a graphics API, the called graphics API converts into a specific GPU command, and the user-mode driver software thereby triggers GPU command submission;
Case 2: when the graphics application software calls a graphics API, the user-mode driver software converts the call into GPU commands and data, and GPU command submission is triggered when the GPU command buffer or the data buffer becomes full.
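The two trigger conditions can be sketched as follows. The buffer size, the flush opcode value, and the function names are assumptions made for illustration; the real driver would hand the buffer to the GPU management module instead of counting submissions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_BUF_SIZE  256u
#define GPU_CMD_FLUSH 0xFFu  /* hypothetical "specific command" forcing submission */

typedef struct {
    uint8_t buf[CMD_BUF_SIZE];
    size_t  used;
    int     submits;         /* counts forwarded submissions (stand-in for RPC) */
} cmd_buffer_t;

static void submit_to_forwarding_service(cmd_buffer_t *b)
{
    b->submits++;            /* real code would forward to the GPU manager */
    b->used = 0;
}

/* Called by the user-mode driver for each GPU command produced from an API call. */
static void emit_cmd(cmd_buffer_t *b, uint8_t cmd)
{
    if (b->used == CMD_BUF_SIZE)        /* case 2: the buffer is full */
        submit_to_forwarding_service(b);
    b->buf[b->used++] = cmd;
    if (cmd == GPU_CMD_FLUSH)           /* case 1: a specific command triggers */
        submit_to_forwarding_service(b);
}
```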
In step 3 of the embodiment of the present invention, after a graphics API call from the graphics application software triggers GPU command submission in a partition operating system, the GPU command "forwarding" service provided by the GPU management module is invoked, and the "forwarding" service caches the GPU commands submitted by the user-mode driver software in the ring buffer.
In the specific implementation, after the GPU command 'forwarding' service caches the GPU command submitted by the user mode driving software in the annular buffer, the GPU command 'forwarding' service performs scheduling according to the priority of the GPU command submitted by each partition operating system so as to control the GPU to perform graphic rendering on the GPU command submitted by each partition operating system in a time-sharing manner.
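One plausible form of the priority scheduling described above (the patent does not fix a concrete algorithm) is a highest-priority-first pick over the pending partition submissions; the convention that a lower number means higher priority is an assumption of this sketch.

```c
/* Hypothetical priority scheduler for the forwarding service: pick the
 * pending submission with the highest priority (lower number = higher
 * priority, an assumption of this sketch). */
#include <assert.h>
#include <stddef.h>

typedef struct {
    int partition_id;
    int priority;   /* lower value = more urgent (illustrative) */
    int pending;    /* nonzero if this partition has work queued */
} submission;

/* Returns the index of the highest-priority pending submission,
 * or -1 when nothing is pending. */
static int pick_next(const submission *subs, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!subs[i].pending)
            continue;
        if (best < 0 || subs[i].priority < subs[best].priority)
            best = (int)i;
    }
    return best;
}
```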
In one implementation of the embodiment of the invention, the GPU management module monitors the GPU load of each partition operating system to achieve load balancing of the GPU commands across the partition operating systems.
In the embodiment of the present invention, the actual copy operations for the GPU commands and data of step 2 are all performed by passing pointers, so the load of the GPU command forwarding service is very light and commands are forwarded in a timely manner.
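The zero-copy idea can be illustrated with a small submission descriptor: the forwarding service moves only pointers and lengths between partition and management module, never the command or data payload itself. The `submit_desc` layout below is an assumption for illustration.

```c
/* Sketch of the pointer-based hand-off: a submission is described by a
 * few machine words referencing the partition's buffers, so "copying"
 * a submission never touches the payload. Layout is illustrative. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const void *cmds;  size_t cmd_bytes;   /* GPU commands, by reference */
    const void *data;  size_t data_bytes;  /* vertex/texture data, by reference */
} submit_desc;

/* Building the request copies only this small descriptor,
 * not the buffers it points into. */
static submit_desc make_request(const void *cmds, size_t cb,
                                const void *data, size_t db)
{
    submit_desc d = { cmds, cb, data, db };
    return d;
}
```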
Further, in the multi-core collaborative rendering processing method provided by the embodiment of the invention, when the execution state of the GPU performing graphics rendering is inconsistent with the state of the current partition operating system, the method proceeds as follows:
the GPU command forwarding service switches the GPU's execution state to the rendering context of the current partition operating system, and then executes the GPU commands from that partition operating system's GPU command buffer to complete the submission, so that the GPU executes the graphics processing tasks of each partition.
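The context check can be sketched as below; the integer context identifiers and the switch counter are illustrative stand-ins for a real rendering-context restore.

```c
/* Sketch of the context check in the forwarding service: before running
 * a partition's commands, switch the GPU to that partition's rendering
 * context if it is not already current. Names are illustrative. */
#include <assert.h>

static int gpu_current_ctx = -1;  /* -1: no context bound yet */
static int ctx_switches;          /* counts the (expensive) switches */

static void execute_for_partition(int partition_ctx)
{
    if (gpu_current_ctx != partition_ctx) {  /* states inconsistent */
        gpu_current_ctx = partition_ctx;     /* restore rendering context */
        ctx_switches++;
    }
    /* ... then drain that partition's GPU command buffer ... */
}
```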
It should be noted that, during execution of the multi-core collaborative rendering processing method provided by the embodiment of the invention, the service process of the GPU management module normally stays in a silent state; it is woken when a partition operating system submits GPU commands, and returns to the silent state once the submitted commands have been processed.
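The silent-until-submission behaviour maps naturally onto a condition variable. The sketch below is one possible realization using POSIX threads; the patent does not specify the wake-up mechanism, so all of this is an assumption.

```c
/* Hedged sketch of the service life cycle: the management service
 * thread blocks (silent state) while its queue is empty, wakes when a
 * partition submits, and goes back to waiting when the queue drains. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static int  pending;          /* submitted-but-unprocessed commands */
static int  processed;        /* commands the service has handled */
static bool shutting_down;    /* test-only: lets the thread exit */

static void *gmm_service(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mu);
    for (;;) {
        while (pending == 0 && !shutting_down)
            pthread_cond_wait(&cv, &mu);  /* silent state */
        if (pending == 0 && shutting_down)
            break;
        pending--;                        /* process one submission */
        processed++;
    }
    pthread_mutex_unlock(&mu);
    return NULL;
}

static void submit_command(void)          /* called by a partition */
{
    pthread_mutex_lock(&mu);
    pending++;
    pthread_cond_signal(&cv);             /* wake the service */
    pthread_mutex_unlock(&mu);
}
```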
Further, the multi-core collaborative rendering processing method provided by the embodiment of the invention may further include:
transmitting the graphics results rendered by the GPU to different displays for display through the video controller.
Compared with the three traditional implementation schemes, the technical scheme of the invention offers the same good universality as implementation scheme I while removing the CPU performance bottleneck caused by a heavily loaded server side. Compared with the direct shared-access mode based on mutually exclusive GPU sharing, it eliminates the sharp drop in GPU efficiency caused by frequent GPU task switching and the risk that any abnormal partition hangs the whole system, and it alleviates the problem of a heavily loaded low-priority task affecting a lightly loaded high-priority task, greatly improving practicability. Compared with scheme III, it can exploit the computing power of multiple processors on platforms whose GPU does not support hardware virtualization, breaking through the graphics-application bottleneck and greatly widening the range of application.
The multi-core collaborative rendering processing method based on GPU instruction forwarding provided by the embodiment of the invention is essentially a shared-access mode based on GPU virtualization: by adding a GPU management module to the virtualization layer of the multi-core partition operating system, multi-core sharing of the GPU is achieved both at the functional level and in terms of direct access to GPU resources. The method can break through the single-core computing bottleneck, improve the performance of lightly loaded GPU graphics applications, and raise the integration level of the system's graphics processing. Compared with the three shared-access modes above, the mutual influence between graphics tasks residing on different CPU cores can be effectively controlled, and the failure of a single graphics processing task is prevented from affecting other tasks; the catastrophic effect of a heavily loaded low-priority task on a lightly loaded high-priority task is avoided, and the graphics applications running on the individual CPU cores are isolated from one another, so safety and reliability are well guaranteed and the method is highly practical.
Furthermore, the technical scheme provided by the embodiment of the invention does not depend on a specific hardware platform; it has good adaptability and flexibility, is simple and convenient to implement, and suits a wide variety of multi-core collaborative rendering scenarios. It is also easy to popularize, with a broad market and clear economic benefits.
The following schematically describes, through an implementation example, a specific realization of the GPU-instruction-forwarding-based multi-core collaborative rendering processing method of the embodiment of the present invention.
As shown in fig. 1, the multi-core collaborative rendering processing method based on GPU instruction forwarding provided in this embodiment is implemented by the following scheme.
(1) The hardware for the multi-core collaborative rendering processing method consists of a multi-core CPU and a GPU; the software consists of an embedded real-time operating system supporting multi-core virtualization (i.e., an operating system supporting multi-core partitioning), graphics driver software, and graphics application software;
(2) A multi-core partitioned operating system runs on the multi-core CPU, i.e., a number of partitions from partition operating system 1 through partition operating system n. Each partition operating system is allocated independent CPU cores, independent memory space, and independent peripherals, and may be allocated one or more CPU cores as needed, so that the impact of a single graphics processing task's failure on other tasks is effectively contained;
(3) The user-mode driver software of the GPU resides in each partition operating system, where it runs together with the graphics application software; the graphics driver software consists of the user-mode driver software residing in each partition operating system and the GPU management module running in the operating-system virtualization layer (i.e., the hypervisor layer);
The user-mode driver software in each partition operating system establishes independent GPU command buffers (CMDS) and data buffers (DATA). When the graphics application software calls a graphics API, the user-mode driver software converts the called graphics API into GPU commands and data, which are cached in the GPU command buffer and the data buffer respectively.
When the graphics application software calls a graphics API that triggers GPU command submission, or when the GPU command buffer or data buffer becomes full, the user-mode driver software in the corresponding partition operating system applies, according to the agreed protocol, to the GPU command "forwarding" service in the GPU management module for GPU command submission;
(4) A dedicated management program, the GPU management module, runs in the memory of the operating system; the user-mode driver software of each partition operating system can invoke the services provided by the GPU management module via system calls, and to guarantee performance the management program generally runs on a dedicated CPU core;
In a specific implementation, only the GPU management module can access the GPU directly, and commands are transmitted to the GPU through ring buffers in shared memory; in addition, the GPU management module provides a GPU command forwarding service for each partition operating system, so that all commands submitted to the GPU by the partition operating systems pass through this forwarding service;
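A single-producer/single-consumer ring buffer of the kind mentioned here might look like the following sketch; the slot count and the descriptor layout are assumptions made for illustration, not the patent's actual shared-memory format.

```c
/* Illustrative ring buffer for streaming submission descriptors from
 * the GPU management module to the GPU. One slot is sacrificed to
 * distinguish full from empty. Layout and size are assumptions. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const void *cmds;  size_t cmd_bytes;
    const void *data;  size_t data_bytes;
} submit_desc;

#define RING_SLOTS 8
static submit_desc ring[RING_SLOTS];
static size_t head, tail;  /* head: producer index, tail: consumer index */

static int ring_push(submit_desc d)
{
    if ((head + 1) % RING_SLOTS == tail)
        return -1;                 /* ring full */
    ring[head] = d;                /* copies the small descriptor only */
    head = (head + 1) % RING_SLOTS;
    return 0;
}

static int ring_pop(submit_desc *out)
{
    if (tail == head)
        return -1;                 /* ring empty */
    *out = ring[tail];
    tail = (tail + 1) % RING_SLOTS;
    return 0;
}
```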
(5) Because the actual copy operations for the GPU commands (cmds) and data (data) are performed by passing pointers, the load of the GPU command forwarding service is very light, which ensures timely command forwarding;
(6) The GPU command forwarding service receives the service requests submitted by each partition operating system and submits the GPU commands to the GPU in order, according to the scheduling strategy set in the configuration;
Specifically, priority and load-monitoring strategies are pre-configured in the service process of the GPU management module; the service monitors the GPU load of each partition operating system and schedules GPU commands by priority according to an internal scheduling algorithm, which prevents a low-priority task from occupying GPU resources for so long that high-priority tasks cannot execute, while load balancing is achieved through the GPU load monitoring.
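The load-monitoring strategy could, for example, take the form of a per-window GPU-time budget for each partition, so a heavily loaded low-priority partition cannot monopolize the GPU. The window/budget mechanics below are invented for illustration; the patent does not specify them.

```c
/* Sketch of per-partition GPU load accounting: each partition gets a
 * GPU-time budget per scheduling window; once it is spent, further
 * submissions are deferred until the next window. All thresholds are
 * hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned used_us;    /* GPU time consumed in the current window */
    unsigned budget_us;  /* configured share of the window */
} partition_load;

/* May this partition submit more work in the current window? */
static bool may_submit(const partition_load *p)
{
    return p->used_us < p->budget_us;
}

/* Charge completed GPU work against the partition's budget. */
static void account(partition_load *p, unsigned gpu_time_us)
{
    p->used_us += gpu_time_us;
}

/* Start a new scheduling window: reset every partition's accounting. */
static void new_window(partition_load *parts, size_t n)
{
    for (size_t i = 0; i < n; i++)
        parts[i].used_us = 0;
}
```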
(7) When the execution state of the GPU performing graphics rendering is inconsistent with the state of the current partition operating system, the GPU command forwarding service switches the GPU's execution state to the rendering context of the current partition operating system, and then executes the GPU commands from that partition's GPU command buffer to complete the submission, so that the GPU executes the graphics processing tasks of each partition.
(8) The service process of the GPU management module normally stays in a silent state; it is woken when a partition operating system submits rendering tasks (i.e., GPU commands), and returns to the silent state once the submitted commands have been processed.
(9) The results rendered by the GPU are sent to different displays for display through the video controller.
Compared with the traditional approach, in which a graphics processing hardware platform running in single-core mode can process only one task in real time, the embodiment of the invention uses multiple processor cores sharing the GPU through instruction forwarding to process graphics cooperatively, enabling real-time processing of two or even three graphics processing tasks of similar scale.
The embodiments of the invention provide a virtualization-based implementation of multi-core-processor collaborative graphics rendering. As on-board graphics applications grow more complex, the performance of a single-core processor can no longer meet the requirements and becomes the performance bottleneck of the whole system. To fully exploit the multi-core processor, the embodiment of the invention distributes the cores of the multi-core CPU to several partition operating systems through virtualization and manages the GPU commands and data of those partitions through the GPU management module, so that every core can ultimately render graphics with the GPU. This effectively raises multi-core CPU utilization, removes the CPU performance bottleneck of on-board graphics rendering, provides good isolation, and improves product reliability.
The multi-core-processor collaborative graphics rendering scheme provided by the embodiment of the invention is well suited to scenarios in which the graphics applications of several partition operating systems all need to draw with the GPU and the safety requirements are very high. Moreover, the scheme does not depend on a specific hardware platform and has good adaptability and flexibility; it is simple and convenient to implement, applicable to many kinds of GPU driver implementations, easy to popularize, and offers a broad market and clear economic benefits.
Although embodiments of the present invention are described above, they serve only to aid understanding and do not limit the invention. Any person skilled in the art may make modifications and variations in form and detail without departing from the spirit and scope of the present disclosure, but the scope of protection of the present invention is determined by the appended claims.

Claims (10)

1. A multi-core collaborative rendering processing method based on GPU instruction forwarding, characterized in that the hardware for executing the multi-core collaborative rendering comprises a multi-core CPU and a GPU, and the software comprises an operating system supporting multi-core partitioning, graphics driver software and graphics application software; a multi-core partitioned operating system runs on the multi-core CPU, and each partition operating system is provided with independent CPU cores, an independent memory space and independent peripherals; graphics application software and user-mode driver software of the GPU run on each partition operating system; the graphics driver software comprises the user-mode driver software running in each partition operating system and a GPU management module running in the operating-system virtualization layer, and the GPU management module provides a GPU command forwarding service for each partition operating system; the multi-core collaborative rendering processing method comprises the following steps:
Step 1: in each partition operating system, the user-mode driver software establishes an independent GPU command buffer and data buffer;
Step 2: in each partition operating system, when the graphics application software calls a graphics API of the user-mode driver software, the user-mode driver software converts the called graphics API into GPU commands and data, which are cached in the GPU command buffer and the data buffer respectively;
Step 3: in each partition operating system, when the graphics application software calls a graphics API that triggers GPU command submission, the user-mode driver software in that partition operating system applies, according to the agreed protocol, to the GPU command "forwarding" service in the GPU management module for GPU command submission;
Step 4: the GPU management module transfers the GPU commands to the GPU through the ring buffer in the shared memory area of the operating system, so that the GPU performs the graphics rendering specified by those commands.
2. The method for multi-core collaborative rendering processing based on GPU instruction forwarding according to claim 1, wherein,
the GPU management module provides a GPU command forwarding service for each partition operating system, so that GPU commands submitted to the GPU by each partition operating system are transmitted through the GPU command forwarding service of the GPU management module.
3. The method for multi-core collaborative rendering processing based on GPU instruction forwarding according to claim 2, wherein,
in step 3, a graphics API call by the graphics application software triggers GPU command submission in the following cases:
Case 1: the graphics application software calls a graphics API, and the user-mode driver software converts that call into a specific GPU command which itself triggers GPU command submission;
Case 2: the graphics application software calls a graphics API, the user-mode driver software converts the called graphics API into GPU commands and data, and GPU command submission is triggered once the GPU command buffer or the data buffer becomes full.
4. The method for multi-core collaborative rendering processing based on GPU instruction forwarding according to claim 2, wherein,
in step 3, after the graphics application software in each partition operating system calls a graphics API that triggers GPU command submission, the GPU command "forwarding" service provided by the GPU management module is invoked, and that service caches the GPU commands submitted by the user-mode driver software in the ring buffer.
5. The method of claim 4, wherein after the GPU command "forwarding" service caches the GPU commands submitted by the user-mode driver software in the ring buffer,
the GPU command forwarding service schedules them according to the priority of the GPU commands submitted by each partition operating system, so that the GPU renders the commands of the partition operating systems in a time-shared manner.
6. The GPU instruction forwarding-based multi-core collaborative rendering processing method according to any one of claims 1-5, further comprising:
the GPU management module monitors the GPU load in each partition operating system to realize load balancing of GPU commands in each partition operating system.
7. The method for multi-core collaborative rendering processing based on GPU instruction forwarding according to any one of claims 1-5, wherein,
in step 2, the actual copy operations for the GPU commands and data are performed by passing pointers, so as to ensure timely command forwarding.
8. The GPU instruction forwarding-based multi-core collaborative rendering processing method according to any one of claims 1-5, further comprising:
when the execution state of the GPU performing graphics rendering is inconsistent with the state of the current partition operating system, the GPU command forwarding service switches the GPU's execution state to the rendering context of the current partition operating system, and then executes the GPU commands from that partition operating system's GPU command buffer to complete the submission, so that the GPU executes the graphics processing tasks of each partition.
9. The method for multi-core collaborative rendering processing based on GPU instruction forwarding according to any one of claims 1-5, wherein,
the service process of the GPU management module stays in a silent state by default; it is woken when a partition operating system submits GPU commands, and returns to the silent state once the submitted commands have been processed.
10. The GPU instruction forwarding-based multi-core collaborative rendering processing method according to any one of claims 1-5, further comprising:
transmitting the graphics results rendered by the GPU to different displays for display through the video controller.
CN202211608792.1A 2022-12-14 2022-12-14 Multi-core collaborative rendering processing method based on GPU instruction forwarding Pending CN116225688A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211608792.1A CN116225688A (en) 2022-12-14 2022-12-14 Multi-core collaborative rendering processing method based on GPU instruction forwarding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211608792.1A CN116225688A (en) 2022-12-14 2022-12-14 Multi-core collaborative rendering processing method based on GPU instruction forwarding

Publications (1)

Publication Number Publication Date
CN116225688A true CN116225688A (en) 2023-06-06

Family

ID=86583234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211608792.1A Pending CN116225688A (en) 2022-12-14 2022-12-14 Multi-core collaborative rendering processing method based on GPU instruction forwarding

Country Status (1)

Country Link
CN (1) CN116225688A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116991600A (en) * 2023-06-15 2023-11-03 上海一谈网络科技有限公司 Method, device, equipment and storage medium for processing graphic call instruction
CN116991600B (en) * 2023-06-15 2024-05-10 上海一谈网络科技有限公司 Method, device, equipment and storage medium for processing graphic call instruction

Similar Documents

Publication Publication Date Title
US8443377B2 (en) Parallel processing system running an OS for single processors and method thereof
US5127098A (en) Method and apparatus for the context switching of devices
US10733019B2 (en) Apparatus and method for data processing
EP3054382B1 (en) Busy - wait loop
US7526673B2 (en) Parallel processing system by OS for single processors and parallel processing program
Sengupta et al. Scheduling multi-tenant cloud workloads on accelerator-based systems
JP5516398B2 (en) Multiprocessor system and method for sharing device between OS of multiprocessor system
CN107122233B (en) TSN service-oriented multi-VCPU self-adaptive real-time scheduling method
KR20050016170A (en) Method and system for performing real-time operation
WO2022028061A1 (en) Gpu management apparatus and method based on detection adjustment module, and gpu server
US20140281629A1 (en) Power management for a computer system
CN103927225A (en) Multi-core framework Internet information processing and optimizing method
US8631253B2 (en) Manager and host-based integrated power saving policy in virtualization systems
KR102052964B1 (en) Method and system for scheduling computing
CN116225688A (en) Multi-core collaborative rendering processing method based on GPU instruction forwarding
US7765548B2 (en) System, method and medium for using and/or providing operating system information to acquire a hybrid user/operating system lock
JP4183712B2 (en) Data processing method, system and apparatus for moving processor task in multiprocessor system
WO2024007934A1 (en) Interrupt processing method, electronic device, and storage medium
CN110018782B (en) Data reading/writing method and related device
CN114048026A (en) GPU resource dynamic allocation method under multitask concurrency condition
US7320044B1 (en) System, method, and computer program product for interrupt scheduling in processing communication
Shen et al. An NFV framework for supporting elastic scaling of service function chain
US11971830B2 (en) Efficient queue access for user-space packet processing
US20220350639A1 (en) Processing device, control unit, electronic device, method for the electronic device, and computer program for the electronic device
CN115934385B (en) Multi-core inter-core communication method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination