Background Art
GCN (Graphics Core Next) is a family of microarchitectures and an instruction set developed by Advanced Micro Devices (AMD). The first product based on the GCN architecture was released in 2011, and the AMD GCN family has gone through five iterations so far. AMD/ATI graphics cards that predate the R600 architecture contain a dedicated region-copy acceleration hardware unit, which the driver can control directly to perform the corresponding two-dimensional (2D) acceleration operations. In R600 and later graphics cards the 2D acceleration hardware unit was removed, and all 2D acceleration operations are carried out by the 3D unit. Compared with earlier graphics cards, GCN graphics cards offer higher parallel processing capability, higher architecture utilization, and higher instruction throughput.
In the Linux graphics stack, two main acceleration modes support 2D acceleration on graphics cards: the Glamor mode and the EXA mode. The EXA mode mainly comprises three operations: region copy, region fill, and image blending. In the current Linux open-source graphics software stack, as shown in Fig. 1, GCN graphics cards implement 2D acceleration with Glamor, and no EXA implementation of 2D acceleration exists for them. Because the Glamor mode cannot fully exploit the capability of GCN graphics cards, 2D graphics rendering performance is poor. To improve the 2D graphics performance of GCN graphics cards, it is therefore necessary to implement an EXA 2D acceleration method for them.
Existing GCN graphics cards implement 2D acceleration with Glamor. This acceleration mode uses the EGL interface to convert X Window System rendering into OpenGL operations and, relying on the existing 3D engine in the operating system, achieves 2D acceleration by calling the API of the OpenGL driver. As a result, whenever an application needs to perform a 2D operation, contexts must be switched back and forth before the operation can complete. This consumes substantial system resources and leads to poor 2D acceleration performance on GCN graphics cards.
The 2D acceleration methods in the existing EXA mode are implemented mainly for the hardware characteristics of non-GCN graphics cards and consist of three parts: initialization of the 3D rendering engine, configuration of the shader programs, and configuration of graphics resources (vertex resources, texture resources, and constant resources). This approach does not need to switch contexts back and forth; compared with the Glamor mode, it consumes fewer system resources and delivers higher performance. As far as is known, no 2D acceleration method in the EXA mode has yet been implemented for GCN graphics cards.
Chinese invention patent (application No. CN201210380598.2) provides a standalone graphics card architecture based on embedded acceleration cores, comprising acceleration components, an interconnect bus, and transmission components. The display controller outputs pixel data from display memory to the display device; the graphics processor accelerates graphics tasks; and the video accelerator encodes and decodes video and image data. Although that invention builds a fully functional standalone graphics card based on embedded acceleration cores and meets the display and multimedia processing needs of PCs, servers, and similar fields, it does not address the 2D graphics acceleration problem of GCN graphics cards considered by the present invention.
Chinese invention patent (application No. CN201310549819.9) provides a method for browser rendering using hardware acceleration, and a browser. The graphics card of the terminal on which the browser is installed has a GPU hardware acceleration function. The method comprises the following steps: looking up, in a preset hardware acceleration mapping table, the hardware acceleration mapping data item corresponding to the graphics card according to the graphics card's information; determining from that data item the web page element types for which the graphics card supports hardware acceleration; and applying GPU hardware acceleration to the rendering of browser pages according to the determined element types. Although that invention improves the usability of GPU hardware acceleration in web page rendering and avoids browser blue screens or crashes caused by hardware acceleration, it does not address the 2D graphics acceleration problem of GCN graphics cards considered by the present invention.
Chinese invention patent (application No. CN201410205303.7) provides a parallel acceleration method for the X graphics system based on the Feiteng processor. Its implementation steps are as follows: the X server main thread initializes the input/output devices, creates an input event processing subsystem thread, and monitors, respectively, X client program requests, requests to manage the graphics card and process display output, and input device events; the X server main thread responds to X client program requests; the X server main thread creates a graphics card management and drawing subsystem thread for managing the graphics card and processing display output; the graphics card management and drawing subsystem thread handles requests to manage the graphics card and process display output; and the input event processing subsystem thread retrieves and responds to input device events. Although that invention can use the multi-core advantage of the Feiteng processor to improve the performance of the X graphics system on that processor, with high hardware resource utilization, a smooth user experience, and high graphics processing performance, it does not address the 2D graphics acceleration problem of GCN graphics cards considered by the present invention.
Chinese invention patent (application No. CN201510981689.5) discloses a method for accelerating browser rendering, and a browser. The graphics card of the device on which the browser is installed has a GPU hardware acceleration function. The method comprises: when the graphics card has the GPU hardware acceleration function enabled, obtaining the running state information of each process related to the GPU hardware acceleration function within a preset time period; obtaining the weight value of the running state information and comparing it with the process weight value corresponding to the device in a preset process table; and determining, according to the comparison result, whether to disable the GPU hardware acceleration function. Although that invention improves the usability of GPU hardware acceleration in web page rendering and avoids browser blue screens or crashes caused by hardware acceleration, it does not address the operating-system-level 2D graphics acceleration problem considered by the present invention.
Chinese invention patent (application No. CN03142366.3) provides a method and apparatus for accelerating graphics data that can reduce the computational complexity of graphics data processing. The method for accelerating 2D graphics data comprises: receiving information related to the width of the graphics window being processed; reading pixel data from a memory that stores the pixel data of the graphics window; receiving information related to two pixel data fields, the two fields being divided from the memory region according to the width information of the graphics window, one field being processed in burst mode and the other in units of bytes; and performing predetermined graphics processing on each of the divided pixel data fields. Although that invention proposes a 2D graphics data acceleration method for non-GCN graphics cards under the Windows operating system, it does not consider the 2D graphics acceleration problem for GCN graphics cards under the Linux operating system, which the present invention targets.
Chinese invention patent (application No. CN201410653610.1) discloses a domestically developed embedded computer system and its display driver method, comprising a Loongson 2F (Godson 2F) central processing unit and an SM722 display chip, the SM722 display chip being connected to the Loongson 2F through a PCI bus. The display driver method comprises modifying the firmware-layer driver and the operating-system kernel-layer driver that support the SM712, so that the Loongson can carry the SM722 display chip, which originally supported neither the MIPS architecture nor the PCI bus, and thereby obtain stronger performance. That invention can improve the human-machine interaction and graphics display control performance of Loongson 2F CPU products, but it does not consider the 2D graphics acceleration problem of GCN graphics cards on the Feiteng processor platform targeted by the present invention.
Summary of the Invention
In view of this, to remedy the defects and deficiencies of the prior art, the present invention provides a 2D graphics acceleration method based on GCN graphics cards. The method implements 2D graphics acceleration for GCN graphics cards on the EXA acceleration framework and improves 2D graphics acceleration performance.
To solve the above technical problem, the present invention discloses a 2D graphics acceleration method based on GCN graphics cards, implemented through the following technical solution.
A 2D graphics acceleration method based on GCN graphics cards, comprising the following steps:
S1: obtain the parameters of the current acceleration operation from the application scenario;
S2: initialize the 3D rendering engine of the graphics card by setting the general 3D graphics rendering registers of the graphics card;
S3: set the clipping region, and set the memory address at which the 3D rendering result of the graphics card is stored;
S4: build the source file of a vertex shader and/or the source file of a fragment shader, compile the two source files into executable files, and store them;
S5: set the runtime parameters of the executable files;
S6: start the 3D rendering engine and perform the 2D acceleration operation;
S7: send a synchronization command to ensure that the 2D acceleration operation has completed.
Further, the acceleration operation comprises region fill, region copy, and/or image blending.
Further, setting the clipping region specifically comprises writing the width and height specified in the interface parameters obtained in S1 into the clip register, and writing the target information of the current acceleration operation into the color buffer register.
Further, when the acceleration operation is a region fill, the memory address of the destination fill region is written into the color buffer register; when the acceleration operation is a region copy, the memory address of the destination copy region is written into the color buffer register; and when the acceleration operation is an image blend, the memory address of the background image is written into the color buffer register. This sets the memory address at which the 3D rendering result of the graphics card is stored. For an image blend, the blend operation type parameter is also written into the blend register to control the effect of the acceleration operation.
Further, S4 specifically comprises calling the shader management module and building the vertex shader source file and/or the fragment shader source file in the OpenGL Shading Language.
Further, the vertex shader is used to translate, scale, and rotate vertices, texture coordinates, and/or vectors; the fragment shader is used to link the vertices into pixels and to complete the rasterization of 3D graphics, pixel lighting, and/or texture mapping.
Further, S5 specifically comprises calling the resource descriptor module to set the runtime parameters of the executable files.
Further, S6 specifically comprises sending a render command, executing the vertex shader executable file and the fragment shader executable file according to the runtime parameters that have been set, starting the 3D rendering engine, and performing the 2D acceleration operation.
A computer-readable storage medium storing a computer program, characterized in that, when executed by a processor, the computer program implements the steps of the 2D graphics acceleration method according to any one of claims 1-9.
A 2D graphics acceleration apparatus based on GCN graphics cards, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the 2D graphics acceleration method according to any one of claims 1-9.
Compared with the prior art, the present invention can achieve the following technical effects:
1. It implements a 2D acceleration method for GCN graphics cards based on the EXA acceleration framework. Exploiting the characteristics of GCN graphics cards, it quickly establishes the rendering path, performs state programming and shader programming on the GCN graphics card, and uses vertex resources, texture resources, and constant resources to carry out 2D acceleration under the EXA acceleration framework.
2. It avoids the drawbacks of the Glamor mode, in which the acceleration method must first call EGL to initialize the graphics rendering context and then call the OpenGL API, going through a complicated context-switching process, to reach the 3D graphics renderer. The implementation of the whole acceleration interface is also considerably improved in time complexity, stability, and memory consumption, which substantially improves 2D graphics acceleration performance.
Of course, a product implementing the present invention need not achieve all of the above technical effects at the same time.
Detailed Description
The embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples, so that the process by which the invention applies technical means to solve the technical problem and achieve the technical effects can be fully understood and implemented accordingly.
A 2D graphics acceleration method based on GCN graphics cards. The method builds on EXA, the general graphics acceleration framework of the X Window System, and on the characteristics of GCN graphics cards, and achieves 2D graphics acceleration on GCN graphics cards by implementing the main acceleration operations of the EXA acceleration framework.
The functional structure of the 2D graphics acceleration method based on GCN graphics cards described in the present invention is shown in Fig. 2. It is divided mainly into the EXA acceleration framework initialization layer, the acceleration interface implementation layer, and the hardware low-level interface implementation layer.
(1) The EXA acceleration framework initialization layer mainly registers the interface callback functions of the EXA acceleration framework, initializes the runtime environment of the interfaces, and initializes the modules of the acceleration interface implementation layer. The interfaces whose runtime environment is initialized are mainly the region fill, region copy, and image blending interfaces. The modules of the acceleration interface implementation layer that are initialized include video memory management, shader management, and resource descriptor management.
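The registration step in (1) can be pictured as filling a table of callbacks. The sketch below is purely illustrative: `AccelFramework`, `init_framework`, and the three handler functions are hypothetical names introduced here, not the X server's EXA API (which is a C interface), and the software fallback for unregistered operations is an assumption rather than something the text specifies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


# Hypothetical handlers standing in for the three accelerated interfaces.
def region_fill(**params):
    return ("fill", params)


def region_copy(**params):
    return ("copy", params)


def image_blend(**params):
    return ("blend", params)


@dataclass
class AccelFramework:
    """Toy stand-in for the acceleration framework's callback table."""
    hooks: Dict[str, Callable] = field(default_factory=dict)

    def register(self, name: str, handler: Callable) -> None:
        self.hooks[name] = handler

    def dispatch(self, name: str, **params):
        handler = self.hooks.get(name)
        if handler is None:
            # Assumption: operations without a registered hook are not accelerated.
            return ("software_fallback", params)
        return handler(**params)


def init_framework() -> AccelFramework:
    """Register the three interface callbacks named in the text."""
    fw = AccelFramework()
    fw.register("fill", region_fill)
    fw.register("copy", region_copy)
    fw.register("blend", image_blend)
    return fw
```

After `init_framework()`, a call such as `fw.dispatch("fill", color=1)` is routed to the fill handler, while an unregistered name takes the fallback path.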
(2) The acceleration interface implementation layer contains not only the concrete implementations of the three interfaces (region fill, region copy, and image blending) but also the video memory management module, the shader management module, and the resource descriptor management module.
Region fill fills a specified rectangular region with a solid color. Region copy copies the pixel content of a source region into a destination region. Image blending combines a foreground image and a background image according to a specified compositing type to produce an image with partial or full transparency effects. Transparency information is a value between 0 and 1 stored in the image's alpha channel: a value of 0 means the pixel is transparent, and a value of 1 means it is opaque.
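As a point of reference for what region fill and region copy compute, here is a minimal CPU-side sketch that operates on a surface stored as a list of rows. It is not the GPU implementation described in this document; the function names and the list-of-rows representation are assumptions made for illustration.

```python
def region_fill(surface, x, y, width, height, color):
    """Fill the rectangle with top-left corner (x, y) with a solid color."""
    for row in range(y, y + height):
        for col in range(x, x + width):
            surface[row][col] = color


def region_copy(src, src_x, src_y, dst, dst_x, dst_y, width, height):
    """Copy a width x height block from src to dst (top-left coordinates)."""
    # Snapshot the source block first so that overlapping regions
    # within the same surface are copied correctly.
    block = [src[src_y + r][src_x:src_x + width] for r in range(height)]
    for r in range(height):
        dst[dst_y + r][dst_x:dst_x + width] = block[r]
```

For example, filling a 2x2 block of a 4x3 surface and then copying that block to another surface leaves all pixels outside the two rectangles untouched.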
The video memory management module manages video memory, including virtual address (VA) management and the implementation of the cache mechanism.
The shader management module builds the shader binary files for each interface (region fill, region copy, and image blending) and sets up the shader-related resources.
The resource descriptor management module sets the runtime parameters of the shader binary files for each interface.
(3) The hardware low-level interface implementation layer comprises the buffer object (BufferObject) interface, the image surface interface, and the command stream submission interface.
A 2D graphics acceleration method based on GCN graphics cards, as shown in Fig. 3, comprises the following specific steps:
S301: obtain the parameters of the current acceleration operation from the application scenario.
There are three main acceleration operations: region fill, region copy, and image blending. Each corresponds to one acceleration interface in the EXA acceleration framework. Because the operations differ in function, the parameters of their interfaces differ as well. The parameters of the region fill interface include the height and width of the fill region, the color, the coordinates of the diagonal corners, and the region's address in memory. The parameters of the region copy interface include the point coordinates of the source copy region and its address in memory, the point coordinates of the destination copy region and its address in memory, and the width and height of the source copy region, where the point coordinates are those of the region's top-left corner. The parameters of the image blending interface include the address of the foreground image in memory, the address of the mask image in memory, the address of the background image in memory, the region of each image that takes part in the blend (specified by each image's own top-left coordinates and a shared height and width), optional coordinate transformation matrices for the foreground and mask images, and the blend operation type, which selects among different blending effects.
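The three parameter sets listed above can be written down as plain records. The sketch below is one possible layout; every class and field name is hypothetical and chosen here for readability, and the string-valued `blend_op` is an assumption, since the text does not say how the blend type is encoded.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical 3x3 coordinate transformation matrix, stored row by row.
Matrix3x3 = Tuple[Tuple[float, float, float], ...]


@dataclass
class FillParams:
    dst_addr: int   # memory address of the fill region
    x1: int         # first diagonal corner
    y1: int
    x2: int         # opposite diagonal corner
    y2: int
    color: int

    @property
    def width(self) -> int:
        return self.x2 - self.x1

    @property
    def height(self) -> int:
        return self.y2 - self.y1


@dataclass
class CopyParams:
    src_addr: int
    src_x: int      # top-left corner of the source region
    src_y: int
    dst_addr: int
    dst_x: int      # top-left corner of the destination region
    dst_y: int
    width: int
    height: int


@dataclass
class BlendParams:
    src_addr: int                 # foreground image
    mask_addr: int                # mask image
    dst_addr: int                 # background image
    src_xy: Tuple[int, int]       # top-left corner in each image
    mask_xy: Tuple[int, int]
    dst_xy: Tuple[int, int]
    width: int                    # shared by all three regions
    height: int
    blend_op: str                 # assumption: blend type given by name
    src_transform: Optional[Matrix3x3] = None
    mask_transform: Optional[Matrix3x3] = None
```

Deriving the fill width and height from the diagonal corners keeps the two descriptions of the fill region consistent with each other.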
S302: initialize the 3D rendering engine of the graphics card by setting the general 3D graphics rendering registers of the GCN graphics card hardware.
S303: set the clipping region, and set the memory address at which the graphics card stores its 3D rendering result.
The width and height specified in the current interface parameters obtained in S301 are written into the clip (CLIP) register to set the clipping region, and the target information of the current acceleration operation is written into the color buffer register.
When the acceleration operation is a region fill, the memory address of the destination fill region is written into the color buffer (COLORBUFFER) register. When it is a region copy, the memory address of the destination copy region is written into the COLORBUFFER register. When it is an image blend, the memory address of the background image is written into the COLORBUFFER register. In each case this sets the memory address at which the graphics card stores its 3D rendering result. For an image blend, the blend operation type parameter must also be written into the blend (BLEND) register, which controls the final effect of the acceleration operation.
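The register selection in S303 reduces to a small case analysis. The following sketch models the registers as dictionary entries rather than hardware writes; the register names CLIP, COLORBUFFER, and BLEND come from the text, while the function name, the parameter keys, and the dictionary representation are assumptions made for illustration.

```python
def target_register_writes(op, params):
    """Return the register writes S303 would issue for one operation.

    `op` is "fill", "copy", or "blend"; `params` is a dict with
    hypothetical keys (width, height, dst_addr, background_addr, blend_op).
    """
    # The clipping region always comes from the interface width and height.
    writes = {"CLIP": (params["width"], params["height"])}

    if op in ("fill", "copy"):
        # Fill and copy both render into the destination region.
        writes["COLORBUFFER"] = params["dst_addr"]
    elif op == "blend":
        # Blending renders into the background image and also needs
        # the blend type to select the compositing effect.
        writes["COLORBUFFER"] = params["background_addr"]
        writes["BLEND"] = params["blend_op"]
    else:
        raise ValueError(f"unknown acceleration operation: {op}")
    return writes
```

Only the blend case touches the BLEND register, which matches the description that the blend type is an extra requirement of image blending.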
S304: call the shader management module in the acceleration interface implementation layer to build the vertex shader source file and the fragment shader source file, compile them, and store the results in video memory.
Because GCN graphics cards have a programmable pipeline, the vertex shader source file and the fragment shader source file can each be written in the OpenGL Shading Language according to the current operation. The two source files are then compiled into separate binary executables and stored in video memory, where they control the rendering process of the graphics card and carry out the acceleration operation.
The vertex shader code is mainly responsible for translating, scaling, and rotating vertices, texture coordinates, and vectors. The fragment shader code is mainly responsible for linking the vertices into pixels and for operations such as the rasterization of 3D graphics, pixel lighting, and texture mapping.
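The translation, scaling, and rotation attributed to the vertex shader are commonly expressed as 3x3 matrices acting on homogeneous 2D coordinates. The sketch below shows that arithmetic on the CPU; it illustrates the math only and is not the shader code of this method, and all function names are chosen here for illustration.

```python
import math


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def rotation(theta):
    """Counter-clockwise rotation by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a, b):
    """3x3 matrix product; matmul(a, b) applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform_point(m, x, y):
    """Apply matrix m to the point (x, y) in homogeneous coordinates."""
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (xh / w, yh / w)
```

Composing `translation(1, 0)` with `scaling(2, 2)` via `matmul` scales a point first and then shifts it, which is the order in which a shader would typically apply a combined transform to each vertex.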
S305: set the runtime parameters of the binary executables in video memory.
The resource descriptor module in the acceleration interface implementation layer is called to set the runtime parameters of the binary executables in video memory. These runtime parameters mainly comprise, from S301: the diagonal corner coordinates of the fill region, the fill color, and the memory address of the fill region; the point coordinates and memory address of the source copy region and the point coordinates of the destination copy region; the point coordinates and memory address of the foreground image, the point coordinates and memory address of the mask image, and the point coordinates of the background image; and the coordinate transformation matrices of the foreground and mask images.
S306: start the 3D rendering engine and perform the 2D acceleration operation.
A render command is sent. According to the runtime parameters set in S305, the graphics card executes the vertex shader binary executable and the fragment shader binary executable, starts the 3D rendering engine, and performs the 2D acceleration operation.
S307: send a synchronization command to ensure that the acceleration operation has completed.
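The ordering constraints among S301 to S307 can be made explicit with a mock device that records commands and refuses to draw before setup is complete. This is a sketch under stated assumptions: `FakeGpu`, `accelerate`, the command names, and the parameter keys are all hypothetical, and no real shader compilation or hardware access takes place.

```python
class FakeGpu:
    """Hypothetical stand-in for the card: records commands, checks ordering."""

    def __init__(self):
        self.commands = []
        self.state = set()
        self.pending_draws = 0

    def _require(self, *needed):
        missing = [n for n in needed if n not in self.state]
        if missing:
            raise RuntimeError(f"missing setup: {missing}")

    def init_engine(self):                          # S302
        self.commands.append("INIT_3D_ENGINE")
        self.state.add("engine")

    def set_target(self, clip, target_addr):        # S303
        self._require("engine")
        self.commands.append(("SET_TARGET", clip, target_addr))
        self.state.add("target")

    def load_shaders(self, vs_source, fs_source):   # S304
        self._require("engine")
        # A real driver would compile the sources; here we only record sizes.
        self.commands.append(("LOAD_SHADERS", len(vs_source), len(fs_source)))
        self.state.add("shaders")

    def set_runtime_params(self, params):           # S305
        self._require("shaders")
        self.commands.append(("SET_PARAMS", tuple(sorted(params))))
        self.state.add("params")

    def draw(self):                                 # S306
        self._require("engine", "target", "shaders", "params")
        self.commands.append("DRAW")
        self.pending_draws += 1

    def sync(self):                                 # S307
        self.commands.append("SYNC")
        self.pending_draws = 0


def accelerate(gpu, params):
    """Run S302-S307 in order; `params` is the S301 result (a dict)."""
    gpu.init_engine()
    gpu.set_target((params["width"], params["height"]), params["dst_addr"])
    gpu.load_shaders(params["vs_source"], params["fs_source"])
    gpu.set_runtime_params(params)
    gpu.draw()
    gpu.sync()
    return gpu.commands
```

Calling `draw()` on a fresh `FakeGpu` raises `RuntimeError`, which mirrors the requirement that the engine, render target, shaders, and runtime parameters are all in place before rendering starts.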
GCN (Graphics Core Next) is the code name for a series of microarchitectures and an instruction set developed by AMD to replace the TeraScale microarchitecture and instruction set. The GCN architecture is also used in the graphics portion of AMD accelerated processing units (APUs), such as the APUs used in the PlayStation 4 and Xbox One. GCN abandons the VLIW packed-throughput model that AMD had used since the R600. VLIW (very long instruction word) packs many instructions into one long instruction that the GPU's arithmetic units can execute in a single issue, which eliminates much instruction scheduling and waiting time and thereby improves computational efficiency; single-thread execution density is very high. Compared with earlier graphics cards, the hardware shader cores of the GCN architecture were completely redesigned: they use a non-VLIW instruction set architecture and read all resource descriptors from memory rather than from registers.
EXA: a graphics acceleration framework for the X Window System, derived from KAA (the KDrive Acceleration Architecture). EXA is a general-purpose acceleration framework used by many DDX graphics drivers.
A shader is a short custom program written by a developer. Shaders run on the GPU (Graphics Processing Unit) of the graphics card and replace part of the fixed rendering pipeline, making different stages of the pipeline programmable. Shaders mainly comprise vertex shaders and fragment shaders, and sometimes geometry shaders and others. The vertex shader is mainly responsible for translating, scaling, and rotating vertices, texture coordinates, and vectors. The fragment shader is mainly responsible for linking the vertices into pixels and for operations such as the rasterization of 3D graphics, pixel lighting, and texture mapping. Under the OpenGL specification, shaders are written in GLSL (the OpenGL Shading Language). GLSL is a high-level shading language based on C, which avoids the complexity of using assembly language or hardware-specific languages.
Region fill: filling a specified rectangular region with a solid color. As a basic acceleration operation, it can accelerate the drawing of points, lines, and line segments.
Region copy: copying the pixel content of a source region into a destination region.
Image blending: in computer graphics, the process of combining a foreground image and a background image, according to the transparency information of the images and a specified compositing type, to produce an image with partial or full transparency effects. Transparency information is a value between 0 and 1 stored in the image's alpha channel; a value of 0 means the pixel is transparent, and a value of 1 means it is opaque. The X Window System adds a mask image that contains only alpha channel information, which is used to render polygons.
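To make the role of the mask image concrete, the sketch below blends one pixel with the widely used "over" operator: the foreground is first scaled by the mask's alpha and then composited onto the background. The choice of "over" and of premultiplied-alpha colors is an assumption made for illustration; the text only says that the blend type is specified by a parameter.

```python
def blend_over(src, mask_alpha, dst):
    """Composite one pixel: (src scaled by mask) over dst.

    src and dst are premultiplied RGBA tuples with components in [0, 1];
    mask_alpha is the mask image's alpha value for this pixel.
    """
    # Scale every foreground component, including alpha, by the mask.
    s = tuple(c * mask_alpha for c in src)
    # The background shows through wherever the masked foreground is transparent.
    inv = 1.0 - s[3]
    return tuple(sc + dc * inv for sc, dc in zip(s, dst))
```

With an opaque red foreground `(1.0, 0.0, 0.0, 1.0)`, an opaque blue background `(0.0, 0.0, 1.0, 1.0)`, and a mask alpha of 0.5, the result is `(0.5, 0.0, 0.5, 1.0)`. A mask alpha of 0 leaves the background unchanged, and a mask alpha of 1 with an opaque foreground replaces it entirely.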
The beneficial effects of the present invention are as follows:
1. It implements a 2D acceleration method for GCN graphics cards based on the EXA acceleration framework. Exploiting the characteristics of GCN graphics cards, it quickly establishes the rendering path, performs state programming and shader programming on the GCN graphics card, and uses vertex resources, texture resources, and constant resources to carry out 2D acceleration under the EXA acceleration framework.
2. It avoids the drawbacks of the Glamor mode, in which the acceleration method must first call EGL to initialize the graphics rendering context and then call the OpenGL API, going through a complicated context-switching process, to reach the 3D graphics renderer. The implementation of the whole acceleration interface is also considerably improved in time complexity, stability, and memory consumption, which substantially improves 2D graphics acceleration performance.
The 2D graphics acceleration method based on GCN graphics cards provided by the embodiments of the present invention has been described in detail above. The description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. A person of ordinary skill in the art may, following the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, the contents of this specification should not be construed as limiting the invention.
Certain terms are used in the specification and claims to refer to particular components or modules. Those skilled in the art will appreciate that different organizations may use different names for the same component or module. This specification and the claims do not distinguish components or modules by differences in name, but by differences in function. The terms "comprising" and "including" used throughout the specification and claims are open-ended and should be interpreted as "including but not limited to". "Substantially" means that, within an acceptable error range, a person skilled in the art can solve the technical problem and substantially achieve the technical effect. The description in this specification sets out preferred embodiments of the invention; it is given to illustrate the general principles of the invention and is not intended to limit its scope. The scope of protection of the invention is defined by the appended claims.
It should also be noted that the terms "comprise", "include", and any other variants are intended to cover non-exclusive inclusion, so that a product or system comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a product or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the product or system that comprises the element.
The above description shows and describes several preferred embodiments of the present invention. As stated earlier, however, the invention is not limited to the forms disclosed herein, and these should not be regarded as excluding other embodiments. The invention can be used in various other combinations, modifications, and environments, and can be modified within the scope of the inventive concept described herein through the above teachings or through the skill or knowledge of the related art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the protection scope of the appended claims.