CN109800088A - Training-based GPU configuration management method, device, storage medium, and GPU - Google Patents


Info

Publication number
CN109800088A
CN109800088A (application CN201910005831.0A)
Authority
CN
China
Prior art keywords
gpu
resource
configuration information
statistical data
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910005831.0A
Other languages
Chinese (zh)
Other versions
CN109800088B (en)
Inventor
张宏伟
孙琳娜
田珍
纪楠
刘红红
马佳静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Technology Co Ltd
Original Assignee
Xi'an Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Technology Co Ltd
Publication of CN109800088A
Application granted
Publication of CN109800088B
Legal status: Active
Anticipated expiration


Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Generation (AREA)
  • Stored Programmes (AREA)

Abstract

The present invention relates to a training-based GPU configuration management method, device, storage medium, and GPU. The method includes: collecting first statistical data and second statistical data of the GPU while an application program runs; generating first resource configuration information from the first statistical data and second resource configuration information from the second statistical data; and adjusting the GPU's resource configuration for running the application program according to the first resource configuration information and the second resource configuration information. By reconfiguring and optimizing the driver software and hardware, the present invention markedly improves the rendering performance of the GPU. Because the configuration optimization targets the GPU's underlying software and hardware, users no longer need to modify their own programs; the method can be used directly in different application scenarios, and the GPU's performance satisfies application scenarios with demanding frame-rate requirements.

Description

Training-based GPU configuration management method, device, storage medium, and GPU
Technical field
The invention belongs to the technical field of graphics processors, and in particular relates to a training-based GPU configuration management method, device, storage medium, and GPU.
Background technique
A graphics processor (abbreviated GPU), also known as a display core, visual processor, or display chip, is a microprocessor dedicated to image computation in personal computers, workstations, game consoles, and some mobile devices (such as tablet computers and smartphones). Its purpose is to convert and drive the display information required by the computer system, provide line-scan signals to the display, and control the display's correct output; it is a key element connecting the display to the PC motherboard and one of the important devices for "human-machine dialogue".
Graphics processor performance is an important indicator for judging a processor chip, and the expected performance can be reached by changing the software and hardware resource configuration at run time. For somewhat complex application scenarios, the graphics processor's performance may be insufficient to reach the expected frame-rate requirement. For example, an airborne scene may require drawing at 60 frames per second; such somewhat complex scenes are very difficult to achieve when running on an ordinary GPU, at which point the graphics processor must be optimized.
Existing optimization methods, however, can only optimize the user program and cannot optimize the GPU's underlying layers; that is, without modifying the GPU's implementation, only the user program can be optimized, which is inconvenient for users.
Summary of the invention
To solve the above problems in the prior art, the present invention provides a training-based GPU configuration management method, device, storage medium, and GPU. The technical problem to be solved by the present invention is achieved through the following technical solutions:
A training-based GPU configuration management method, comprising the following steps:
collecting first statistical data and second statistical data of the GPU while an application program runs;
generating first resource configuration information from the first statistical data and second resource configuration information from the second statistical data;
adjusting the GPU's resource configuration for running the application program according to the first resource configuration information and the second resource configuration information.
Further, the first statistical data includes one or any combination of API count, API type, API usage frequency, and frame-rate change rate.
Further, the second statistical data includes one or any combination of PCI bandwidth utilization, pipeline-stage resource utilization, and DDR resource utilization.
Further, the first resource configuration information includes trimming of the driver software.
Further, the second resource configuration information includes closing unused resources, opening spare computing resources, and reallocating hardware resources.
Further, adjusting the GPU's resource configuration for running the application program according to the first resource configuration information and the second resource configuration information specifically includes:
initializing the GPU's basic resource configuration;
reading the first resource configuration information;
optimizing the configuration of the host driver software according to the first resource configuration information read;
reading the second resource configuration information;
optimizing the configuration of the hardware logic resources according to the second resource configuration information read;
restarting the GPU device.
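The step sequence above can be sketched as follows. This is an illustrative Python sketch, not the patent's actual driver code; every function name here is a hypothetical placeholder for the corresponding step, passed in as a callable so the sequence itself is what the sketch shows.

```python
def reconfigure_gpu(init_base_resources, read_config, apply_driver_trim,
                    apply_hw_config, restart_device):
    """Apply a trained configuration in the order the method prescribes."""
    init_base_resources()                        # initialize basic resources
    driver_cfg = read_config("first_resource")   # read driver-software config
    apply_driver_trim(driver_cfg)                # trim the host driver software
    hw_cfg = read_config("second_resource")      # read hardware config
    apply_hw_config(hw_cfg)                      # optimize hardware logic resources
    restart_device()                             # restart the GPU device
```

The callables would be supplied by the real driver; the ordering (driver trim strictly before hardware reconfiguration, restart last) is the invariant the claim describes.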
A training-based GPU configuration management device, comprising:
a statistics module, which collects first statistical data and second statistical data while an application program runs;
a generation module, which generates first resource configuration information from the first statistical data and second resource configuration information from the second statistical data;
an adjustment module, which adjusts the GPU's resource configuration for running the application program according to the first resource configuration information and the second resource configuration information.
A training-based GPU configuration management device, comprising a storage medium and a processor, wherein the storage medium stores a computer program and the processor, when executing the computer program, performs the following steps:
collecting first statistical data and second statistical data of the GPU while an application program runs; generating first resource configuration information from the first statistical data and second resource configuration information from the second statistical data; and adjusting the GPU's resource configuration for running the application program according to the first resource configuration information and the second resource configuration information.
A storage medium storing a computer program which, when executed by a processor, implements the above training-based GPU configuration management method.
A training-based GPU, comprising the above GPU configuration management device.
Compared with the prior art, the beneficial effects of the present invention are:
1. The training-based GPU configuration management method of the present invention optimizes the configuration of the host driver software: according to the trimming information read, functions that are not needed are cut from the shading program, reducing redundant code, increasing the shading program's execution speed, and optimizing GPU rendering performance. It also optimizes the configuration of the hardware logic resources: non-essential features that heavily affect GPU rendering performance, such as anti-aliasing, blending, depth testing, and stencil testing, are closed, greatly improving rendering performance; the spare computing resources of the pipeline's function modules, such as spare shader cores and multi-lane PCI configuration, are opened to satisfy high frame-rate requirements; and hardware resources are reallocated, so that memory space not needed by one use is given to applications that need more memory, enabling normal drawing of the application scenario.
2. The training-based GPU configuration management device of the present invention completes the training of the GPU through a statistics module, a module that generates the driver-software configuration information at GPU run time, and a module that generates the hardware-resource configuration information; the GPU's host driver software and hardware logic resources are then optimized accordingly, improving the GPU's rendering performance and meeting users' performance requirements for the GPU in complex application scenarios.
3. The training-based GPU of the present invention optimizes the configuration of the GPU's underlying software and hardware, so users no longer need to modify their own programs; it can be used directly in different application scenarios, and the GPU's performance satisfies the various application scenarios with demanding frame-rate requirements.
Detailed description of the invention
Fig. 1 is a flowchart of the training-based GPU configuration management method provided by an embodiment of the present invention;
Fig. 2 is a refined flowchart of the training-based GPU configuration management method provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of the training-based GPU configuration management device provided by an embodiment of the present invention;
Fig. 4 is another structural diagram of a training-based GPU configuration management device provided by an embodiment of the present invention;
Fig. 5 is a flowchart of the program executed by another training-based GPU configuration management device provided by an embodiment of the present invention.
Specific embodiment
The present invention is described in further detail below in combination with specific embodiments, but the embodiments of the present invention are not limited thereto.
To solve the problems that existing GPUs struggle to meet application scenarios with demanding frame-rate requirements, and that existing optimization methods cannot optimize the GPU's underlying layers and can only modify the user program, which is inconvenient for users, the present invention provides a training-based GPU configuration management method, device, storage medium, and GPU.
The English abbreviations involved in the present invention are first explained:
GPU: graphics processor
API: application programming interface
PCI: Peripheral Component Interconnect
DDR: Double Data Rate synchronous DRAM
BAR: Base Address Register
OpenGL: open graphic library
Embodiment 1:
Refer to Fig. 1, which is a flowchart of the training-based GPU configuration management method of Embodiment 1 of the present invention. As shown in Fig. 1, the training-based GPU configuration management method of Embodiment 1 comprises the following steps:
S101: collect first statistical data and second statistical data of the GPU while an application program runs.
The application program in this step is the program that runs on the GPU in a specific application scenario. A specific application scenario is one with demanding drawing frame-rate requirements, such as the instrument panel of an airborne scene. The first statistical data is software statistics, including the API count, API types, the usage frequency of each API, and the frame-rate change while the program runs. The second statistical data is hardware statistics, including PCI bandwidth utilization, the resource utilization of each pipeline stage, DDR resource utilization, and the input/output variation of each module. Statistics are collected by adding, in the driver software's statistics structure, a usage counter for each kind of OpenGL function interface; the counter is incremented each time the corresponding interface executes.
That is, in this step the application program of the frame-rate-demanding scenario is run on the initial GPU, and the driver software collects software data at run time, such as the API count, API types, the usage frequency of each API, and the frame-rate change, together with hardware data such as PCI bandwidth utilization, the resource utilization of each pipeline stage, DDR resource utilization, and the input/output variation of each module.
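As a hedged illustration of the counting scheme just described — one usage counter per OpenGL interface incremented on each call, plus per-frame samples for the frame-rate change — a minimal sketch might look like the following. The class and method names are assumptions for illustration, not the patent's actual driver structure.

```python
from collections import Counter

class GpuUsageStats:
    """First statistical data: per-API call counters and frame-rate samples."""

    def __init__(self):
        self.api_calls = Counter()   # one counter per OpenGL function interface
        self.frame_rates = []        # frames/second sampled while running

    def on_api_call(self, name):
        self.api_calls[name] += 1    # incremented each time the interface executes

    def on_frame(self, fps):
        self.frame_rates.append(fps)

    def frame_rate_change(self):
        """Frame-rate change across the run (last sample minus first)."""
        return self.frame_rates[-1] - self.frame_rates[0] if self.frame_rates else 0.0
```

The real counters live inside the driver's statistics structure in C; this sketch only shows the bookkeeping the text describes.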
S102: generate first resource configuration information from the first statistical data and second resource configuration information from the second statistical data.
The first resource configuration information in this embodiment is the driver-software resource configuration information, including trimming of the driver software. The driver-software resource configuration information of the present invention, however, is not limited to trimming of the driver software; other driver-software resource configuration information can be reconfigured according to the demands of the specific application scenario.
Specifically, this step mainly uses the driver software to count the usage of the OpenGL APIs used in the application scenario and the features used, such as texture, lighting, and fog, and generates a trimming-configuration txt file according to whether the texture, lighting, and fog features are used.
Driver-software trimming is needed because some GPU features are executed by shading programs. These shading programs are usually written in assembly language, and the more lines of shader code there are, the greater the impact on overall execution performance; through the software trimming configuration, GPU rendering performance can be improved.
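A hypothetical generator for the trimming configuration just described might decide, per fixed-function feature, whether the shading program keeps it, based on whether any of that feature's APIs appeared in the statistics. The API-to-feature mapping below is an illustrative assumption; the real driver's table would differ.

```python
# Illustrative mapping from features to the OpenGL calls that exercise them.
FEATURE_APIS = {
    "texture": {"glBindTexture", "glTexImage2D"},
    "lighting": {"glLightfv", "glMaterialfv"},
    "fog": {"glFogf", "glFogfv"},
}

def make_trim_config(used_apis):
    """Mark each feature 'keep' if any of its APIs was called, else 'trim'."""
    return {feat: ("keep" if apis & used_apis else "trim")
            for feat, apis in FEATURE_APIS.items()}

def to_txt(cfg):
    """Serialize in the spirit of the txt configuration file the method stores."""
    return "\n".join(f"{feat}={state}" for feat, state in sorted(cfg.items()))
```

A scene that calls only `glFogf` would thus yield a configuration that trims texture and lighting from the regenerated shading program while keeping fog.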
The second resource configuration information in this embodiment is the hardware resource configuration information, including closing unused resources, opening spare computing resources, and reallocating hardware resources.
Specifically, closing unused resources includes closing non-essential features that heavily affect GPU rendering performance, such as anti-aliasing, blending, depth testing, and stencil testing. Opening spare computing resources includes opening spare shader cores and configuring the multi-lane PCI and other spare computing resources of the pipeline's function modules. Reallocating hardware resources means redistributing, according to the actual application scenario, the memory space that the GPU allocates in advance by default.
Generating the run-time hardware resource configuration information from the collected hardware data specifically means the following: the GPU contains hardware statistics registers for each pipeline function, and the driver software automatically reads these registers after each frame ends. The registers record the processing time and the waiting time of each pipeline function module for one frame of data; after reading them, the driver software generates a configuration file from the statistics. For example, suppose the pipeline has three modules A, B, and C: A takes 10 ms to process a frame and waits 40 ms, B takes 50 ms and waits 0 ms, and C takes 2 ms and waits 0 ms. This indicates that B's processing capacity is the weakest, so the generated configuration file increases module B's computing resources, strengthening B's processing capacity.
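The A/B/C example above amounts to a simple bottleneck rule: the stage with the largest busy time and no waiting is the one holding the pipeline back, so the generated configuration gives it more compute. A sketch under those assumptions (the function name and the `{module: (busy_ms, wait_ms)}` format are illustrative, not the patent's register layout):

```python
def find_bottleneck(stage_stats):
    """stage_stats: {module: (busy_ms, wait_ms)} read from the per-stage
    hardware statistics registers after a frame ends."""
    # Stages that never wait are the ones the others are waiting on;
    # among them, the slowest is the bottleneck.
    never_waiting = {m: busy for m, (busy, wait) in stage_stats.items() if wait == 0}
    return max(never_waiting, key=never_waiting.get)
```

On the patent's worked example `{"A": (10, 40), "B": (50, 0), "C": (2, 0)}` this rule selects B, matching the text's conclusion that B's computing resources should be increased.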
S103: adjust the GPU's resource configuration for running the application program according to the first resource configuration information and the second resource configuration information.
Refer to Fig. 2, which is a refined flowchart of the training-based GPU configuration management method provided by an embodiment of the present invention. Step S103 specifically includes:
S1031: initialize the GPU's basic resource configuration.
Completing the preceding steps completes the training process for the GPU. Initializing the GPU device in this step means initializing the GPU's various necessary, non-configurable resources, such as BAR space configuration, PCI initialization, host driver software memory-space application, and OpenGL core-library initialization, so that the GPU behaves correctly and reaches a workable state.
S1032: read the first resource configuration information.
The driver-software resource configuration information generated in S102 is read; in this embodiment, reading the software resource configuration information mainly means reading the trimming information of the driver software.
S1033: optimize the configuration of the host driver software according to the first resource configuration information read.
Specifically, the trimming information read is used to cut functions that are not needed from the shading program; removing unneeded functions reduces redundant code, increases the shading program's execution speed, and optimizes GPU performance.
For example, in some application scenarios, GPU performance with a full-featured shading program is usually only 50% to 70% of the performance after trimming. Suppose the full-featured shading program includes lighting, texture, and fog features, but the user's scene uses only the fog feature and never uses lighting or texture; after the statistics have been collected, the optimization tool generates the corresponding shading-program trimming configuration, and during initialization the shading program is regenerated by trimming, improving the drawing performance of the user's scene.
S1034: read the second resource configuration information.
Reading the second resource configuration information means reading the hardware resource configuration information, including closing unused resources, opening spare computing resources, and reallocating hardware resources. It specifically includes: reading the information on closing unused resources such as anti-aliasing, blending, depth testing, and stencil testing; reading the information on opening spare shader cores and configuring the multi-lane PCI and other spare computing resources of the pipeline's function modules; and reading the hardware-resource reallocation information that, according to the application scenario's drawing frame-rate requirements, redistributes the memory space of different uses.
S1035: optimize the configuration of the hardware logic resources according to the second resource configuration information read.
According to the needs of different application scenarios, closing non-essential features that heavily affect GPU rendering performance, such as anti-aliasing, blending, depth testing, and stencil testing, can greatly improve GPU rendering performance; opening the spare computing resources of the pipeline's function modules, such as spare shader cores and multi-lane PCI configuration, can satisfy high frame-rate requirements; and reallocating hardware resources gives memory space that is not needed to applications that need more memory, enabling normal drawing of the application scenario.
It should be noted that, when optimizing the configuration of the hardware logic resources, whether closing unused resources, opening spare computing resources, and reallocating hardware resources are all performed depends on the demands of the GPU application scenario: sometimes all three optimizations of the GPU's hardware logic resources are needed at once, and sometimes only one of them is.
In some application scenarios, improving the GPU's rendering performance requires closing some unused resources in the GPU. For example, to make drawn figures rounder and less jagged, the anti-aliasing feature is often enabled when drawing; when drawing a ring scene, a user would usually enable line anti-aliasing. But if the ring itself is drawn well, without jagged edges, enabling anti-aliasing can reduce ring-drawing performance by 50% compared with leaving it off; in that case the feature switch can be closed during initialization, disabling the anti-aliasing effect and meeting the performance requirement. The same applies to blending, depth testing, and stencil testing: they usually affect performance heavily, and typically only part of the figures in a scene need blending, depth testing, or stencil testing. Performing these operations on the other figures as well would make performance too slow, so the corresponding features can be closed by configuration, greatly improving GPU rendering performance.
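The decision just described — switch off anti-aliasing, blending, depth testing, or stencil testing when the traced frames never benefit from them — can be sketched as below. The state names and the per-frame trace format are assumptions for illustration, not the configuration file's actual schema.

```python
# Costly render states the method considers closing when unused.
COSTLY_STATES = ("antialiasing", "blending", "depth_test", "stencil_test")

def states_to_close(frames):
    """frames: one set of render states actually relied on per traced frame.
    Returns the costly states no frame used, i.e. those safe to close."""
    used = set().union(*frames) if frames else set()
    return [s for s in COSTLY_STATES if s not in used]
```

A trace where only blending and depth testing ever mattered would thus produce a configuration that closes anti-aliasing and stencil testing during initialization.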
In some application scenarios, improving the GPU's rendering performance requires opening spare shader cores, configuring the multi-lane PCI, and other computing resources of the pipeline's function modules. By default, not all hardware resources are turned on when drawing, and they can be opened according to the generated optimized configuration; opening these pipeline function-module computing resources satisfies the frame-rate performance demands of the running program.
In some application scenarios, improving the GPU's rendering performance requires reallocating hardware resources. For example, a GPU implementation pre-allocates default memory spaces in DDR to different uses (display lists, textures, buffer objects), such as 50 MB for display lists, 80 MB for textures, and 20 MB for buffer objects, and these allocations normally cannot be modified. Suppose that in the user's application scenario the display list is not used, but textures are used and the loaded texture data exceeds the 80 MB memory range; under normal circumstances this scene cannot be drawn, because the texture data exceeds the storage range and cannot be loaded, so texturing cannot proceed. If, during GPU initialization, the space allocation is regenerated from the statistics and the display list's 50 MB is given to textures, the space actually available for textures becomes 130 MB, the user's texture data can be loaded normally, and the application scenario can be drawn normally.
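The reallocation in the example above — the unused display-list pool's 50 MB moving to the overflowing texture pool — can be sketched as a simple rule: reclaim pools with zero observed demand and hand the reclaimed space to the most over-subscribed pool. The pool names and sizes follow the patent's worked example; the function itself is an illustrative assumption, not the driver's allocator.

```python
def reallocate_pools(defaults, demand_mb):
    """defaults / demand_mb: {pool_name: MB}. Returns the regenerated allocation."""
    pools = dict(defaults)
    # Reclaim every pool the trace shows is unused.
    spare = sum(size for name, size in pools.items() if demand_mb.get(name, 0) == 0)
    for name in pools:
        if demand_mb.get(name, 0) == 0:
            pools[name] = 0
    # Give all reclaimed space to the pool that overflows the most.
    over = [n for n in pools if demand_mb.get(n, 0) > pools[n]]
    if over and spare:
        neediest = max(over, key=lambda n: demand_mb[n] - pools[n])
        pools[neediest] += spare
    return pools
```

With the patent's numbers (display list unused, 110 MB of texture data against an 80 MB pool), the texture pool grows to 130 MB, matching the example's outcome.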
S1036: restart the GPU device.
The GPU device is restarted with the host driver software and hardware information whose configuration has been optimized.
By optimizing the configuration of the host driver software, the training-based GPU configuration management method of this embodiment uses the trimming information read to cut unneeded functions from the shading program, reducing redundant code, increasing the shading program's execution speed, and optimizing GPU rendering performance.
By optimizing the configuration of the hardware logic resources, the training-based GPU configuration management method of this embodiment closes non-essential features that heavily affect GPU rendering performance, such as anti-aliasing, blending, depth testing, and stencil testing, greatly improving rendering performance; opens the spare computing resources of the pipeline's function modules, such as spare shader cores and multi-lane PCI configuration, satisfying high frame-rate requirements; and reallocates hardware resources, giving memory space that is not needed to applications that need more memory, enabling normal drawing of the application scenario.
Embodiment 2:
Refer to Fig. 3, which is a structural diagram of the training-based GPU configuration management device 200 of Embodiment 2 of the present invention. As shown in Fig. 3, the device 200 comprises:
a statistics module 201, which collects first statistical data and second statistical data while an application program runs.
The first statistical data is software statistics, including the API count, API types, the usage frequency of each API, and the frame-rate change while the program runs; the second statistical data is hardware statistics, including PCI bandwidth utilization, the resource utilization of each pipeline stage, DDR resource utilization, and the input/output variation of each module. The statistics module 201 adds, in the driver software's statistics structure, a usage counter for each kind of OpenGL function interface and accumulates the count each time the corresponding interface executes.
a generation module 202, which generates first resource configuration information from the first statistical data and second resource configuration information from the second statistical data.
The first resource configuration information in this embodiment is the driver-software resource configuration information, including trimming of the driver software; it is not limited to trimming, and other driver-software resource configuration information can be reconfigured according to the demands of the specific application scenario. The second resource configuration information is the hardware resource configuration information, including closing unused resources, opening spare computing resources, and reallocating hardware resources. Specifically, closing unused resources includes closing non-essential features that heavily affect GPU rendering performance, such as anti-aliasing, blending, depth testing, and stencil testing; opening spare computing resources includes opening spare shader cores and configuring the multi-lane PCI and other spare computing resources of the pipeline's function modules; and reallocating hardware resources means redistributing, according to the actual application scenario, the memory space that the GPU allocates in advance by default.
an adjustment module 203, which adjusts the GPU's resource configuration for running the application program according to the first resource configuration information and the second resource configuration information.
The adjustment module 203 includes a driver-software adjustment module and a hardware adjustment module.
Specifically, the driver-software adjustment module adjusts and optimizes the host driver software by reading the driver-software trimming generated by the generation module 202. For example, the trimming information read is used to cut unneeded functions from the shading program; removing unneeded functions reduces redundant code, increases the shading program's execution speed, and optimizes GPU performance.
The hardware adjustment module adjusts and optimizes the hardware logic resources by reading the hardware resource configuration information generated by the generation module 202. It specifically includes: reading the information on closing unused resources such as anti-aliasing, blending, depth testing, and stencil testing; reading the information on opening spare shader cores and configuring the multi-lane PCI and other spare computing resources of the pipeline's function modules; and reading the hardware-resource reallocation information that, according to the application scenario's drawing frame-rate requirements, redistributes the memory space of different uses.
The training-based GPU configuration management device of this embodiment completes the training of the GPU through the statistics module and the generation module; the adjustment module then optimizes the GPU's host driver software and hardware logic resources accordingly, improving the GPU's rendering performance and meeting users' performance requirements for the GPU in complex application scenarios.
Embodiment 3:
Refer to Fig. 4, which is another structural diagram of a training-based GPU configuration management device 300 provided by an embodiment of the present invention. As shown in Fig. 4, the device 300 of this embodiment comprises a storage medium 301 and a processor 302, and the storage medium 301 stores a computer program.
Refer to Fig. 5, which is a flowchart of the program executed by another training-based GPU configuration management device provided by an embodiment of the present invention. The processor 302, when executing the computer program, performs the following steps:
S310, the first statistical data and the second statistical data of GPU when statistics application program is run;
S320 generates first resource configuration information according to first statistical data, raw according to second statistical data At Secondary resource configuration information;
S330 adjusts the GPU according to the first resource configuration information and the Secondary resource configuration information and is running The resource distribution when application program.
The specific steps implemented when the processor of the training-based GPU configuration management device of this embodiment executes the computer program are the same as in Embodiment 1; the implementation principle and technical effects are similar and are not repeated here.
The training-based GPU configuration management device of this embodiment completes the training of the GPU by collecting the first statistical data and the second statistical data of the GPU while the application program is running, generating first resource configuration information according to the first statistical data and second resource configuration information according to the second statistical data; the host driver software and hardware logic resources of the GPU are then adjusted and optimized according to the trained GPU, improving the rendering performance of the GPU and meeting the user's performance requirements for the GPU in complex application scenarios.
Embodiment 4:
A storage medium having a computer program stored thereon, the computer program implementing the training-based GPU management method of Embodiment 1 when executed by a processor; the implementation principle and technical effects are similar to those of Embodiment 1 and are not repeated here.
Embodiment 5:
A training-based GPU including a GPU configuration management device as described in Embodiment 2 or Embodiment 3; the implementation principle and technical effects are similar and are not repeated here.
With the training-based GPU of this embodiment, the GPU is configured and optimized through its underlying software and hardware. In use, users no longer need to modify their programs for optimization and can use the GPU directly in different application scenarios, while the performance of the GPU satisfies the different application scenarios with high frame-rate requirements.
The above content is a further detailed description of the present invention in conjunction with specific preferred embodiments, and the specific implementation of the present invention shall not be deemed to be limited to these descriptions. For those of ordinary skill in the art to which the present invention belongs, a number of simple deductions or substitutions may be made without departing from the inventive concept, all of which shall be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A training-based GPU configuration management method, characterized by comprising the following steps:
collecting first statistical data and second statistical data of a GPU while an application program is running;
generating first resource configuration information according to the first statistical data, and generating second resource configuration information according to the second statistical data;
adjusting the resource configuration of the GPU when running the application program according to the first resource configuration information and the second resource configuration information.
2. The training-based GPU configuration management method according to claim 1, characterized in that the first statistical data includes one or any combination of API count, API type, API usage frequency, and frame-rate change rate.
3. The training-based GPU configuration management method according to claim 2, characterized in that the second statistical data includes one or any combination of PCI bandwidth utilization, pipeline-stage resource utilization, and DDR resource utilization.
4. The training-based GPU configuration management method according to claim 3, characterized in that the first resource configuration information includes tailoring of the driver software.
5. The training-based GPU configuration management method according to claim 4, characterized in that the second resource configuration information includes disabling unused resources, enabling spare computing resources, and reallocating hardware resources.
6. The training-based GPU configuration management method according to claim 1 or 5, characterized in that adjusting the resource configuration of the GPU when running the application program according to the first resource configuration information and the second resource configuration information specifically comprises:
initializing the basic resource configuration of the GPU;
reading the first resource configuration information;
optimizing the configuration of the host driver software according to the read first resource configuration information;
reading the second resource configuration information;
optimizing the configuration of the hardware logic resources according to the read second resource configuration information;
restarting the GPU device.
7. A training-based GPU configuration management device, characterized by comprising:
a statistical module for collecting first statistical data and second statistical data while an application program is running;
a generation module for generating first resource configuration information according to the first statistical data and second resource configuration information according to the second statistical data;
an adjustment module for adjusting the resource configuration of the GPU when running the application program according to the first resource configuration information and the second resource configuration information.
8. A training-based GPU configuration management device, including a storage medium and a processor, the storage medium storing a computer program, characterized in that the processor implements the following steps when executing the computer program:
collecting first statistical data and second statistical data of a GPU while an application program is running; generating first resource configuration information according to the first statistical data, and generating second resource configuration information according to the second statistical data; adjusting the resource configuration of the GPU when running the application program according to the first resource configuration information and the second resource configuration information.
9. A storage medium having a computer program stored thereon, characterized in that the computer program implements the method of any one of claims 1 to 6 when executed by a processor.
10. A training-based GPU, characterized by including a GPU configuration management device as claimed in claim 7 or 8.
CN201910005831.0A 2018-11-14 2019-01-03 Training-based GPU configuration management method and device, storage medium and GPU Active CN109800088B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2018113550525 2018-11-14
CN201811355052 2018-11-14

Publications (2)

Publication Number Publication Date
CN109800088A true CN109800088A (en) 2019-05-24
CN109800088B CN109800088B (en) 2023-06-20

Family

ID=66558577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910005831.0A Active CN109800088B (en) 2018-11-14 2019-01-03 Training-based GPU configuration management method and device, storage medium and GPU

Country Status (1)

Country Link
CN (1) CN109800088B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102981807A (en) * 2012-11-08 2013-03-20 北京大学 Graphics processing unit (GPU) program optimization method based on compute unified device architecture (CUDA) parallel environment
WO2015060174A1 (en) * 2013-10-25 2015-04-30 シャープ株式会社 Information processing device, information processing device control method, and information processing device control program
CN104834529A (en) * 2015-05-25 2015-08-12 腾讯科技(深圳)有限公司 Method and device for optimizing performance of application
CN106709861A (en) * 2016-12-12 2017-05-24 中国航空工业集团公司西安航空计算技术研究所 Dye device drive static reconstruction method
CN106776023A (en) * 2016-12-12 2017-05-31 中国航空工业集团公司西安航空计算技术研究所 A kind of self adaptation GPU unifications dyeing array task load equalization methods
CN107295082A (en) * 2017-06-21 2017-10-24 北京奇虎科技有限公司 Running software processing method, apparatus and system
CN107958436A (en) * 2017-11-24 2018-04-24 中国航空工业集团公司西安航空计算技术研究所 A kind of figure towards OpenGL loads quantified detection method
CN108491275A (en) * 2018-03-13 2018-09-04 广东欧珀移动通信有限公司 program optimization method, device, terminal and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Zhen et al.: "Transparent Parallelization of Direct3D9 Applications Supporting Shader", Journal of Computer Research and Development *

Also Published As

Publication number Publication date
CN109800088B (en) 2023-06-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant