CN106683036A

CN106683036A - Storing and encoding method of frame buffer for efficient GPU drawing

Info

Publication number: CN106683036A
Application number: CN201611139601.6A
Authority: CN
Inventors: 田泽; 郑新建; 任向隆; 卢俊; 韩立敏; 张骏
Original assignee: Xian Aeronautics Computing Technique Research Institute of AVIC
Current assignee: Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date: 2016-12-12
Filing date: 2016-12-12
Publication date: 2017-05-17

Abstract

The invention provides a frame buffer storage encoding method for efficient GPU rendering. The method comprises the following steps: during encoding, the image data in each Tile unit or Super Tile unit is encoded according to the normal encoding order; the encoding order of the Tile units or Super Tile units in each Block unit is consistent with the rasterization direction, the Tile units or Super Tile units are encoded in a zigzag pattern, encoding in each Block unit starts from the first Tile unit or Super Tile unit in the lower left corner and proceeds from left to right and then from bottom to top; and the encoding order of the Block units in each encoded storage object is consistent with the rasterization direction, the Block units are encoded in a zigzag pattern, encoding in each encoded storage object starts from the first Block unit in the lower left corner and proceeds from left to right and then from bottom to top. According to the invention, the spatial locality of the frame buffer during graphics rendering can be exploited to the greatest extent, the miss rate of the color, depth and texture Caches can be reduced, GPU rendering can be accelerated, and the DDR bandwidth requirement can be reduced.

Description

A frame buffer storage encoding method for efficient GPU rendering

Technical field

The present invention relates to the field of computer hardware technology, and in particular to a frame buffer storage encoding method for a GPU.

Background art

A GPU requires very high memory bandwidth during 3D graphics rendering, mainly because of accesses to texture data, color data and depth data. Designs commonly use a texture Cache, a color Cache and a depth Cache, together with corresponding compression algorithms, to relieve the pressure on DDR memory bandwidth. A Cache is accessed by Block, whereas a compression algorithm compresses and decompresses by Tile, so the storage encoding used for the frame buffer in DDR can greatly affect the efficiency of the compression algorithm, the Cache hit rate and the Cache update efficiency.

Summary of the invention

The purpose of the present invention is:

The present invention describes a frame buffer storage encoding method for efficient GPU rendering, which can exploit the spatial locality of the frame buffer during graphics rendering to the greatest extent, reduce the miss rate of the color, depth and texture Caches, accelerate GPU rendering, and reduce the DDR bandwidth demand.

The technical solution of the present invention is:

A frame buffer storage encoding method for efficient GPU rendering, comprising:

The encoded storage object is divided by a grid into several equally sized Block units, and each Block unit is divided by a grid into several equally sized Tile units or SuperTile units; each Tile unit or SuperTile unit contains the same amount of image data;

During encoding, the image data in each Tile unit or SuperTile unit is encoded according to the normal encoding order;

The encoding order of the Tile units or SuperTile units in each Block unit is consistent with the rasterization direction and follows a zigzag ("之"-shaped) pattern: in each Block unit, encoding starts from the first Tile unit or SuperTile unit in the lower left corner and proceeds from left to right, then from bottom to top;

The encoding order of the Block units in each encoded storage object is consistent with the rasterization direction and follows a zigzag ("之"-shaped) pattern: in each encoded storage object, encoding starts from the first Block unit in the lower left corner and proceeds from left to right, then from bottom to top.

The encoded storage object is texture data, color data or depth data.

The image data is 16 texels in 4 rows by 4 columns, or 64 pixels in 4 rows by 16 columns, or 64 depth values in 8 rows by 8 columns.
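The three-level layout described above (element within Tile, Tile within Block, Block within storage object) can be sketched as an address calculation. The following is a minimal illustration, not the patented implementation. The function name `zigzag_offset` and all parameter names are hypothetical. It assumes that the origin is the lower left corner, that the "normal encoding order" inside a Tile is row-major, that each row of Tiles or Blocks restarts at the left edge, and that offsets are counted in elements rather than bytes.

```python
def zigzag_offset(x, y, tile_w, tile_h, tiles_x, tiles_y, blocks_x):
    """Linear element offset of coordinate (x, y), origin at the lower left.

    tile_w, tile_h   : Tile (or SuperTile) size in elements
    tiles_x, tiles_y : Tiles per Block, horizontally and vertically
    blocks_x         : Blocks per row of the storage object
    """
    block_w, block_h = tile_w * tiles_x, tile_h * tiles_y

    bx, by = x // block_w, y // block_h    # which Block
    lx, ly = x % block_w, y % block_h      # position inside that Block
    tx, ty = lx // tile_w, ly // tile_h    # which Tile inside the Block
    ex, ey = lx % tile_w, ly % tile_h      # position inside that Tile

    tile_elems = tile_w * tile_h
    block_elems = tile_elems * tiles_x * tiles_y

    block_index = by * blocks_x + bx       # left to right, then bottom to top
    tile_index = ty * tiles_x + tx         # same order inside the Block
    elem_index = ey * tile_w + ex          # ASSUMPTION: row-major inside a Tile

    return block_index * block_elems + tile_index * tile_elems + elem_index


# Texture-like parameters: 4 x 4 Tiles, 4 x 4 Tiles per Block, 2 Blocks per row.
print(zigzag_offset(4, 0, 4, 4, 4, 4, 2))   # first element of the second Tile
print(zigzag_offset(16, 0, 4, 4, 4, 4, 2))  # first element of the second Block
```

With these parameters, every element of a Tile lands in one run of 16 consecutive offsets, and every element of a Block lands in one run of 256.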

The advantages of the present invention are:

The storage encoding of texture data ensures optimal spatial locality of the Block data that the texture Cache accesses each time; the storage encoding of color data ensures the best balance among the spatial locality of the Block data that the color Cache accesses each time, the data compression scheme, and the bandwidth and buffering needed when the pixel buffer is displayed; the storage encoding of depth data ensures the best balance between the spatial locality of the Block data that the depth Cache accesses each time and the data compression scheme.

Brief description of the drawings

Fig. 1 is a schematic diagram of the texture buffer storage encoding for efficient GPU rendering according to the present invention;

Fig. 2 is a schematic diagram of the color buffer storage encoding for efficient GPU rendering according to the present invention;

Fig. 3 is a schematic diagram of the depth buffer storage encoding for efficient GPU rendering according to the present invention.

Detailed description of the embodiments

The technical solution of the present invention is described clearly and completely below with reference to the accompanying drawings and specific embodiments. Obviously, the embodiments described are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

The encoded storage object is texture data, color data or depth data.

Embodiment

Fig. 1 shows the texture buffer storage encoding for efficient GPU rendering. Depending on the texture mapping mode, one pixel may need 2, 4, 8 or even more spatially adjacent texels to be combined when the texture map is accessed, so the spatial locality of the texture Cache that supplies texel data for texture mapping is especially important. The design uses a Tile-based zigzag storage encoding: one Cache Block contains 16 Tiles in two-dimensional space, that is, 256 texels in 16 rows by 16 columns, which makes its spatial locality optimal.

The texture Cache is a read-only Cache. During texture mapping, multiple texels that are adjacent in two-dimensional space are read from the Cache. If a texture Cache miss occurs, the data of one Block must be fetched from DDR at once. With the Tile-based zigzag storage encoding, the data of a Block that is contiguous in two-dimensional space is stored contiguously in DDR, so the DDR fetch can be completed with a single burst transfer. This reduces the DDR access bandwidth demand while keeping texel spatial locality optimal.
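To illustrate why one Block can be fetched in a single burst, the sketch below counts how many separate runs of consecutive offsets one 16 x 16 texel Block occupies under the tiled layout and under a plain row-major layout. This is an illustrative sketch only. The helper names are hypothetical, the texture width of 64 texels is an arbitrary example, the order inside a Tile is assumed to be row-major, and offsets are counted in texels rather than bytes.

```python
TILE = 4               # Tile side in texels (4 x 4 Tile, from the text)
TILES = 4              # Tiles per Block side (16 Tiles per Block, from the text)
BLOCK = TILE * TILES   # 16 texels per Block side


def tiled_offset(x, y, width):
    """Offset under the Block/Tile layout; width must be a multiple of BLOCK."""
    blocks_x = width // BLOCK
    block_index = (y // BLOCK) * blocks_x + (x // BLOCK)
    lx, ly = x % BLOCK, y % BLOCK
    tile_index = (ly // TILE) * TILES + (lx // TILE)
    elem_index = (ly % TILE) * TILE + (lx % TILE)  # ASSUMPTION: row-major in Tile
    return (block_index * TILES * TILES + tile_index) * TILE * TILE + elem_index


def linear_offset(x, y, width):
    """Offset under a plain row-major layout, for comparison."""
    return y * width + x


def contiguous_runs(offsets):
    """Number of maximal runs of consecutive offsets."""
    s = sorted(offsets)
    return 1 + sum(1 for a, b in zip(s, s[1:]) if b != a + 1)


def block_runs(bx, by, width, offset_fn):
    """Runs needed to cover every texel of Block (bx, by)."""
    coords = [(bx * BLOCK + i, by * BLOCK + j)
              for j in range(BLOCK) for i in range(BLOCK)]
    return contiguous_runs(offset_fn(x, y, width) for x, y in coords)


print(block_runs(1, 2, 64, tiled_offset))   # 1 run: a single burst suffices
print(block_runs(1, 2, 64, linear_offset))  # 16 runs: one per texel row
```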

Fig. 2 shows the color buffer storage encoding for efficient GPU rendering. The color buffer stores the final results of the fragments rendered by the GPU and is read by the display module and shown on the screen. The encoding format of the color buffer must therefore consider not only the access characteristics of the color Cache during rendering but also the display characteristics when the display module reads it. In addition, to reduce the DDR bandwidth demand, the color buffer usually needs a lossless compression algorithm based on Tiles or SuperTiles. The design uses a SuperTile-based zigzag storage encoding in which one SuperTile is one compression block, compressed with a lossless compression algorithm. To improve compressibility, the size of a SuperTile is set to 4 rows by 16 columns of pixels. Because the color buffer is ultimately displayed line by line, an encoding with optimal two-dimensional spatial locality would force the display module to buffer at least 16 rows of data when reading the color buffer, whereas line-by-line display does not actually need such a large buffer. To strike the best balance between optimal two-dimensional spatial locality for color Cache accesses and the buffer capacity needed for display reads, one Cache Block is designed to contain 4 SuperTiles in two-dimensional space, 256 pixels in total.

When the color Cache reads or writes the buffer and a color Cache miss occurs, the data of one Block must be fetched from DDR at once. With the Tile-based zigzag storage encoding, the data of a Block that is contiguous in two-dimensional space is stored contiguously in DDR, so the DDR fetch can be completed with a single burst transfer. This reduces the DDR access bandwidth demand while keeping pixel spatial locality optimal. The display module does not need to read by Block; each read only needs the data of one SuperTile of 4 rows by 16 columns, which saves internal buffer space. The result is the best balance among data spatial locality, the data compression scheme, and the bandwidth and buffering needed when the pixel buffer is displayed.
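The display-buffering trade-off can be made concrete with a rough estimate. The sketch below assumes, as a simplification not stated in the source, that the display module buffers one full band of rows across the screen, where the band height equals the height of the unit it reads. The function name and the screen width of 1920 pixels are hypothetical examples.

```python
SUPERTILE_ROWS = 4      # SuperTile height from the text (4 rows x 16 columns)
SQUARE_LAYOUT_ROWS = 16 # the "at least 16 rows" case mentioned in the text


def display_buffer_pixels(rows_per_unit, screen_width):
    """Pixels held if the display buffers one full band of rows_per_unit rows."""
    return rows_per_unit * screen_width


SCREEN_WIDTH = 1920     # hypothetical example width in pixels

square = display_buffer_pixels(SQUARE_LAYOUT_ROWS, SCREEN_WIDTH)
supertile = display_buffer_pixels(SUPERTILE_ROWS, SCREEN_WIDTH)

print(square, supertile, square // supertile)  # 30720 7680 4
```

Under this assumption, reading by 4-row SuperTile needs a quarter of the buffering of a 16-row layout, independent of the screen width.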

Fig. 3 shows the depth buffer storage encoding for efficient GPU rendering. The depth buffer stores the depth values of the fragments rendered by the GPU, and subsequent fragments are depth-tested by per-fragment operations to determine which fragments can be displayed on the screen. To reduce the DDR bandwidth demand, the depth buffer usually needs a lossless compression algorithm based on Tiles or SuperTiles. The design uses a SuperTile-based zigzag storage encoding in which one SuperTile is one compression block, compressed with a lossless compression algorithm. To improve compressibility, the size of a SuperTile is set to 8 rows by 8 columns of fragment depth values, and one Cache Block is designed to contain 4 SuperTiles in two-dimensional space, 256 fragment depth values in total.

When the depth Cache reads or writes the buffer and a depth Cache miss occurs, the data of one Block must be fetched from DDR at once. With the Tile-based zigzag storage encoding, the data of a Block that is contiguous in two-dimensional space is stored contiguously in DDR, so the DDR fetch can be completed with a single burst transfer. This reduces the DDR access bandwidth demand while keeping pixel spatial locality optimal.
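The miss handling described above can be sketched as a lookup from a fragment coordinate to the single contiguous range that holds its Block. This is a minimal illustration, not the hardware design. The function name `miss_burst` is hypothetical, the 2 x 2 arrangement of the 4 SuperTiles inside a Block is an assumption (the text gives only the count), the origin is taken as the lower left corner, and offsets are counted in depth values rather than bytes.

```python
ST_W, ST_H = 8, 8     # SuperTile: 8 rows x 8 columns of depth values (from the text)
ST_X, ST_Y = 2, 2     # ASSUMPTION: the 4 SuperTiles form a 2 x 2 Block

BLOCK_W = ST_W * ST_X             # 16 depth values wide
BLOCK_H = ST_H * ST_Y             # 16 depth values high
BLOCK_ELEMS = BLOCK_W * BLOCK_H   # 256 depth values per Block


def miss_burst(x, y, width):
    """Return (first_offset, length) of the burst covering fragment (x, y).

    width is the depth buffer width in fragments and must be a multiple
    of BLOCK_W. Blocks are ordered left to right, then bottom to top.
    """
    blocks_x = width // BLOCK_W
    block_index = (y // BLOCK_H) * blocks_x + (x // BLOCK_W)
    return block_index * BLOCK_ELEMS, BLOCK_ELEMS


print(miss_burst(17, 5, 64))   # fragment in the second Block of the bottom row
```

Because the Block is one contiguous range, the result is a single start offset and length, which is what allows the fetch to be issued as one burst.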

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A frame buffer storage encoding method for efficient GPU rendering, characterized by comprising:

The encoded storage object is divided by a grid into several equally sized Block units, and each Block unit is divided by a grid into several equally sized Tile units or SuperTile units; each Tile unit or SuperTile unit contains the same amount of image data;

During encoding, the image data in each Tile unit or SuperTile unit is encoded according to the normal encoding order;

The encoding order of the Tile units or SuperTile units in each Block unit is consistent with the rasterization direction and follows a zigzag ("之"-shaped) pattern: in each Block unit, encoding starts from the first Tile unit or SuperTile unit in the lower left corner and proceeds from left to right, then from bottom to top;

The encoding order of the Block units in each encoded storage object is consistent with the rasterization direction and follows a zigzag ("之"-shaped) pattern: in each encoded storage object, encoding starts from the first Block unit in the lower left corner and proceeds from left to right, then from bottom to top.

2. The frame buffer storage encoding method for efficient GPU rendering according to claim 1, characterized in that the encoded storage object is texture data, color data or depth data.

3. The frame buffer storage encoding method for efficient GPU rendering according to claim 1, characterized in that the image data is 16 texels in 4 rows by 4 columns, or 64 pixels in 4 rows by 16 columns, or 64 depth values in 8 rows by 8 columns.