GB2461900A - Storing and Displaying Parameter Data Relating to Separately Rendered Areas of an Image - Google Patents

Publication number
GB2461900A
Authority
GB
United Kingdom
Prior art keywords
rendering
processing apparatus
frame
parameter
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB0813028A
Other versions
GB0813028D0 (en
GB2461900B (en
Inventor
Frank Klaeboe Langtind
Remi Pedersen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ARM Ltd
Original Assignee
ARM Ltd
Advanced Risc Machines Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ARM Ltd, Advanced Risc Machines Ltd filed Critical ARM Ltd
Priority to GB0813028.8A priority Critical patent/GB2461900B/en
Publication of GB0813028D0 publication Critical patent/GB0813028D0/en
Priority to JP2009161993A priority patent/JP5317866B2/en
Priority to CN200910161610.9A priority patent/CN101650821B/en
Priority to US12/458,609 priority patent/US8144167B2/en
Publication of GB2461900A publication Critical patent/GB2461900A/en
Priority to US13/310,008 priority patent/US8339414B2/en
Application granted granted Critical
Publication of GB2461900B publication Critical patent/GB2461900B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/40 Filling a planar surface by adding surface attributes, e.g. colour or texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/60 Memory management

Abstract

A graphics processing apparatus 6 is provided with rendering circuitry 24 which separately renders different areas of a frame of pixel values. Monitoring circuitry 30, 32, 34, 36, 38, 40 coupled to the rendering circuitry 24 captures for each area rendered one or more parameters (e.g. event counts, processing cycles, cache misses, bus transactions and potentially intersecting primitives) and stores these parameters to a parameter memory 20. A performance frame can be generated from the captured and stored parameters with performance-representing pixel values for each area within the performance frame corresponding to an area within the image frame and having a visual characteristic selected in dependence upon the performance parameter which was captured. The visual characteristic may be a grey-scale value, a pixel intensity or a pixel colour.

Description

MONITORING GRAPHICS PROCESSING
This invention relates to the field of graphics processing. More particularly, this invention relates to the monitoring of graphics processing performed in rendering different areas of a frame.
Graphics processing is a computationally intensive task. In order that such graphics processing can be performed efficiently it is often necessary to adapt the processing being performed or the system which is performing that processing so as, for example, to reduce performance bottlenecks. The effects which give rise to performance bottlenecks and other operating characteristics can be subtle and it can be difficult to identify the cause of low performance or other problems.
It is known to provide graphics processing systems with monitoring circuitry which is able to capture diagnostic/performance information in respect of graphics processing operations that are performed. Such information may, for example, tell the hardware designer or application author how many processing cycles of the graphics processing apparatus are used in rendering each frame. While such mechanisms may be useful in identifying that a problem exists, such that an excessive number of processing cycles are required, there exists a difficulty in understanding what is causing such problems. Techniques which can assist in the understanding of the complex behaviour of graphics processing systems to identify problems therein are advantageous.
Viewed from one aspect the present invention provides graphics processing apparatus for rendering a frame of pixel values representing a scene, said graphics processing apparatus comprising: rendering circuitry for rendering separately different areas of said frame; monitoring circuitry coupled to said rendering circuitry to capture for each area rendered one or more parameters; and a parameter memory coupled to said monitoring circuitry to store separately for each area said one or more parameters captured by said monitoring circuitry.
The present technique recognises that capturing parameters in respect of each of a plurality of separately rendered areas within a frame permits a more ready understanding of effects which give rise to those parameters, and potential problems which they may indicate. For example, a parameter indicating an excessive cycle count associated with a particular area within a frame rendered will allow a user to concentrate on properties particular to the area which gave rise to that excessive cycle count when identifying its cause. Permitting a ready correlation between monitored parameters that are captured and the corresponding areas within a frame rendered considerably facilitates diagnostic, optimisation and other activities.
It will be appreciated that the rendering performed by the graphics processing apparatus could take a wide variety of different forms. In one example, the scene rendered may include one or more primitives and the frame be formed of a plurality of tiles of pixel values.
In the above context, the rendering circuitry may be a tile-based rendering circuitry which reads data characterising one or more primitives and renders a sequence of tiles to generate the overall frame with each of the tiles being rendered in turn for a selection of the primitives that are identified as potentially intercepting that tile.
The separate rendering of each tile in such systems is well suited to the separate capture and storage of parameters associated with that rendering.
It will be appreciated that the parameters captured and stored can have a wide variety of forms. The parameters may be diagnostic in a general sense. However, the present technique is particularly well suited to uses where the parameters monitored, captured and stored are performance parameters.
It is advantageous if the action of the monitoring circuitry does not interfere with the rendering circuitry as this could give rise to inaccurate and/or misleading parameters being captured, e.g. if the action of capturing and storing parameters interferes with memory accesses required by the rendering circuitry, then a decrease in performance may be observed as a result of the monitoring, capture and storage rather than as a consequence of defects present without such monitoring, capture and storage.
In some embodiments of the invention the rendering circuitry upon completion of rendering a currently active area writes pixel values for that currently active area to a frame memory. The monitoring circuitry can be formed to write the one or more parameters to the parameter memory at times when the rendering circuitry does not have rendered pixel values for a completed area to be written to the frame memory.
In this way, the writing of the parameters to the parameter memory can be performed when the rendering circuitry is not trying to perform its own writes and accordingly the monitoring circuitry will have a low impact upon this aspect of the performance of the graphics processing apparatus.
Whilst it will be appreciated that the parameter memory and the frame memory may be separately provided, it is convenient if these form part of a common shared memory. Providing a special purpose parameter memory only for use by the monitoring circuitry would be wasteful as in field use, when monitoring was not required, such a dedicated parameter memory would lie idle. If the parameter memory forms part of a common shared memory, then that common shared memory may be used for other purposes when not required to store the parameters generated by the monitoring circuitry and there is also greater flexibility in the size of parameter memory that can be used.
The monitoring circuitry can take a wide variety of different forms. Providing the monitoring circuitry in the form of one or more counters coupled to respective points within the graphics processing apparatus to count events associated with the rendering of each area provides a low overhead monitoring mechanism which is capable of providing a wide variety of useful parameters.
The flexibility of the monitoring circuitry can be improved by the provision of parameter selecting circuitry associated with one or more of the counters and responsive to a parameter selecting signal to select which points/event within the graphics processing apparatus is to be counted by the counter concerned. In this way, the overhead associated with the monitoring circuitry can be reduced since relatively few counters need be provided and if a wide range of parameters need capturing then this can be achieved by re-executing the rendering of the same frame with different parameter selecting signals such that the counters count different events upon different renderings.
The above flexibility is aided when the parameter selecting signal is a user programmable signal.
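By way of illustration only, the trade-off between few counters and repeated renderings may be sketched in software as follows. The event names, the class `SelectableCounter` and the function `capture` are hypothetical and chosen purely for this sketch; the event stream stands in for one rendering of a frame and is assumed to be identical on each re-execution.

```python
# Hypothetical event names modelled on the kinds of event the text lists.
EVENTS = ("cycle", "primitive", "cache_miss", "bus_transaction")


class SelectableCounter:
    """One counter whose input is chosen by a parameter selecting signal."""

    def __init__(self):
        self.selected = EVENTS[0]
        self.value = 0

    def select(self, event):
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event}")
        self.selected = event
        self.value = 0

    def observe(self, event):
        # Only the selected event reaches the counter.
        if event == self.selected:
            self.value += 1


def capture(event_stream, events_wanted, num_counters=2):
    """Replay the same stream once per group of num_counters events."""
    results = {}
    for start in range(0, len(events_wanted), num_counters):
        group = events_wanted[start:start + num_counters]
        counters = [SelectableCounter() for _ in group]
        for counter, event in zip(counters, group):
            counter.select(event)
        for event in event_stream:  # one re-execution of the rendering
            for counter in counters:
                counter.observe(event)
        for counter in counters:
            results[counter.selected] = counter.value
    return results
```

With two counters and four events of interest, the sketch replays the stream twice, which mirrors re-executing the rendering of the same frame with different parameter selecting signals.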
It will be appreciated that the parameters which are monitored can have a wide variety of different forms. Particularly useful parameters which may be monitored include a number of processing cycles used by the graphics processing apparatus in rendering an area, a number of cache misses within a cache memory of the graphics processing apparatus when rendering an area, a number of bus transactions on a bus coupled to the graphics processing apparatus when rendering an area and a number of primitives identified as potentially intercepting an area that are processed when rendering the area. It will be appreciated that many different and/or alternative parameters to these may be monitored, captured and stored in different embodiments. The present techniques encompass a wide variety of different parameters as may be deemed appropriate/useful in a particular graphics processing apparatus.
The parameters may be stored within the parameter memory in a variety of different ways. However, it is convenient if the parameters are stored within the parameter memory such that upon completion of rendering of a frame, the parameter memory contains an array of parameters corresponding to an array of areas forming the frame which has been rendered. Such a one-to-one correspondence between the areas and the elements of the parameter array facilitates a ready understanding of the parameter data and a simplified processing of that parameter data to assist such an understanding.
Viewed from another aspect the present invention provides a method of generating monitoring data for a graphics processing apparatus rendering a frame of pixel values representing a scene, said method comprising the steps of: separately rendering different areas of said frame; capturing for each area rendered one or more parameters; and separately storing for each area said one or more parameters captured.
Viewed from a further aspect the present invention provides a method of analysing a graphics processing apparatus separately rendering different areas of an image frame representing a scene, said method comprising the steps of: reading an array of one or more parameters stored within a parameter memory for respective areas of said frame; and generating a parameter frame with areas of parameter-representing pixel values having at least one visual characteristic selected in dependence upon at least one parameter of a corresponding area within said image frame.
As well as the capture of parameter data for the rendering on an area-by-area basis discussed above, a complementary aspect of the present invention provides a method of analysing a graphics processing apparatus separately rendering different areas of an image frame representing a scene, said method comprising the steps of: reading an array of one or more parameters stored within a parameter memory for respective areas of said frame; and generating a parameter frame with areas of parameter-representing pixel values having at least one visual characteristic selected in dependence upon at least one parameter of a corresponding area within said image frame.
Having separately captured parameter data in respect of areas which are separately rendered within a frame, this aspect of the present technique serves to read such an array of parameters and generate a parameter frame for display with areas of parameter-representing pixel values with at least one visual characteristic selected in dependence upon at least one parameter of a corresponding area within the image frame.
In this way, the captured parameter data can be displayed as a parameter frame on an area-by-area basis in a manner which considerably facilitates the understanding of the parameter data, such as by facilitating a comparison between the parameter frame and the image frame so as to identify areas of the image frame giving rise to parameter values of note.
As previously, the rendering of the image frame can be performed in a variety of different ways giving rise to area-by-area processing, but the present technique is particularly well suited to tile-based rendering. Furthermore, the parameters stored within the parameter memory can have many different forms, but the present technique is well suited to use when the parameters are performance parameters.
The visual characteristics of the parameter-representing pixel values could be selected in a variety of different ways. Some visual characteristics which are particularly useful in permitting a ready understanding of the captured parameter data using a parameter frame are a pixel grey-scale value, a pixel intensity and/or a pixel colour.
While it is possible that only one parameter frame may be generated from the parameter values captured, it may be advantageous in some example embodiments to capture multiple arrays of parameters (or an array with multiple parameter elements) such that a plurality of parameter frames may be generated for each image frame with different parameter frames having areas with pixel values dependent upon different parameters. The effects which give rise to performance bottlenecks and the like can be subtle and problems sometimes can be more readily identified using multiple parameter frames such that combinations of effects for particular areas within the image frame can be identified.
It will be appreciated that in many embodiments the graphics processing apparatus is part of an integrated circuit and it may be convenient that the steps of reading and generating mentioned above are performed with a general purpose computer coupled to the integrated circuit concerned. General purpose computers coupled to an integrated circuit in this way for diagnostic/performance monitoring reasons will be familiar to those in this technical field during the design and debugging phases of hardware and software development.
Another aspect of the present invention provides a computer program storage media storing a computer program for controlling a general purpose computer in accordance with the above methods of reading and generating.
The graphics processing apparatus incorporating the monitoring circuitry and the parameter memory may require such features to be configured for use.
Accordingly, another aspect of the present technique provides a computer program storage media storing a driver computer program for controlling a graphics processing apparatus as discussed above and in particular permitting user selection of the one or more parameters to be captured.
Such a driver computer program may be executed by the integrated circuit of which the graphics processing apparatus forms a part or within the user device of which the graphics processing apparatus forms part, such as a driver which initialises and controls the graphics processing apparatus.
The driver program may also permit user selection of a storage location to be used as the parameter memory.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:
Figure 1 schematically illustrates a system-on-chip integrated circuit incorporating a graphics processing unit and coupled to a diagnostic general purpose computer and an LCD display;
Figure 2 schematically illustrates an image frame containing three primitives to be rendered and composed of an array of tiles of pixel values which are separately rendered on a tile-by-tile basis;
Figure 3 schematically illustrates a performance parameter array of captured parameter values corresponding to the image frame of Figure 2 and representing the number of primitives potentially intersecting respective tiles;
Figure 4 schematically illustrates a performance frame generated from the performance parameter array of Figure 3 in which each tile of pixel values within the performance frame has a visual characteristic corresponding to the parameter value associated with the corresponding tile within the image frame of Figure 2 as read from the performance parameter array of Figure 3;
Figure 5 is a flow diagram schematically illustrating the tile rendering performed by the graphics processing unit of Figure 1 including the capture of counter values and the storage of counter values as parameters to a parameter memory;
Figure 6 is a flow diagram schematically illustrating how a performance frame such as is illustrated in Figure 4 may be generated from a performance parameter array such as is illustrated in Figure 3;
Figure 7 schematically illustrates the action of a driver computer program in initialising a graphics processing unit including initialising the monitoring circuitry (counters and counter controller); and
Figure 8 schematically illustrates a general purpose computer of a type suitable for performing the processing illustrated in at least Figure 6.
Figure 1 schematically illustrates a system-on-chip integrated circuit 2 including a central processing unit 4 (such as a general purpose ARM processor), a graphics processing unit 6, a memory 8, a display driver 10 and an input output circuit 12 all coupled via a system bus 14. A general purpose computer 16 as will be described in connection with Figure 8 in the following text is connected to the input output circuit 12 to permit the reading of data from the memory 8 and the writing of data to the memory 8. The data read from the memory 8 by the general purpose computer 16 can include image frame data 18 as well as an array of performance parameter data 20. The general purpose computer 16 can display the image frame data 18 and generate and display a performance frame using the array of performance data 20 so as to facilitate understanding of processing being performed by the graphics processing unit 6. The performance parameter data can represent a wide variety of different parameters. Examples of these parameters will be discussed below.
In normal (non-diagnostic) operation the graphics processing unit 6 generates image frame data 18 for display on an attached LCD 22 using the display driver 10.
The graphics processing unit 6 performs three dimensional graphics processing such as includes tile-based rendering of the type performed by the MALI graphics processing units designed by ARM Limited of Cambridge, England.
The graphics processing unit 6 includes tile-based rendering circuitry 24 and a graphics processing unit cache 26 together with a memory interface 28 for connecting to the system bus 14. It will be appreciated that in practice the graphics processing unit 6 will typically include many further circuit elements but these have been omitted from Figure 1 for the sake of clarity.
Also shown within Figure 1 is monitoring circuitry including counters 30, 32 controlled by a counter controller 34 and supplied with respective signals to be counted via multiplexers 36, 38. Each of the multiplexers 36, 38 receives four input signals respectively coupled to different points within the graphics processing unit 6 so as selectively to monitor the number of processing clock cycles used by the graphics processing unit 6, the number of primitives identified as potentially intersecting a tile being rendered by the tile-based rendering circuitry 24, a miss within the graphics processing unit cache 26 and a bus transaction as performed by the memory interface circuit 28. The multiplexers 36, 38 select different event signals to monitor and supply these to their respective counters 30, 32 so as to be counted.
The counter controller 34 is responsive to a user programmable value within a memory mapped register 40 to select the signals passed by the multiplexers 36, 38 to the counters 30, 32. Thus, a user can write to the register 40 to select which of the parameters are to be monitored and form the performance parameter array when an image frame is rendered. The register 40 is also user programmable to specify a storage location within the memory 8 at which the performance parameter array data will be stored.
The counter controller 34 is responsive to a tile complete signal generated by the tile-based rendering circuitry 24 to trigger the counter controller 34 to read the current values of the counters 30, 32 and send these values to the memory interface circuitry 28 to be written into the memory 8 as part of the performance parameter array data 20 at a position corresponding to the tile rendered which gave rise to those count values. The count values may be cumulative or may be reset each time they are read depending upon the nature of the count concerned.
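As a minimal sketch of how count values might be placed at a position corresponding to the tile rendered, the following assumes a row-major layout with a fixed number of bytes per parameter. The layout, the entry size and the names `parameter_address` and `CounterController` are assumptions made for illustration and are not specified in the description; a dictionary stands in for the memory 8.

```python
def parameter_address(base, tile_x, tile_y, tiles_per_row,
                      params_per_tile=2, bytes_per_param=4):
    """Address of the first parameter for a tile (assumed row-major layout)."""
    tile_index = tile_y * tiles_per_row + tile_x
    return base + tile_index * params_per_tile * bytes_per_param


class CounterController:
    """Software stand-in for the counter controller; memory is a dict."""

    def __init__(self, base, tiles_per_row, reset_on_read=True):
        self.base = base
        self.tiles_per_row = tiles_per_row
        self.reset_on_read = reset_on_read
        self.memory = {}  # address -> stored count value

    def on_tile_complete(self, tile_x, tile_y, counters, bytes_per_param=4):
        # Triggered by the tile complete signal: read and store the counts.
        address = parameter_address(self.base, tile_x, tile_y,
                                    self.tiles_per_row, len(counters),
                                    bytes_per_param)
        for offset, value in enumerate(counters):
            self.memory[address + offset * bytes_per_param] = value
        if self.reset_on_read:
            counters[:] = [0] * len(counters)  # per-tile rather than cumulative
```

Setting `reset_on_read=False` in this sketch corresponds to cumulative count values, in which case per-tile figures would have to be recovered by differencing successive entries.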
The memory interface circuitry 28 is also responsible for writing pixel values of the rendered tile generated by the tile-based rendering circuitry 24 into the image frame 18 of the memory 8. Such writing of the pixel values of the tile rendered takes place in bursts as each tile is completed and the writing of the parameter data to the performance parameter array 20 can be fitted into the gaps between the writing of the pixel values of the tile data such that the writing of the parameters does not interfere with the performance of the graphics processing unit 6. The memory interface 28 may be arranged to arbitrate between the writes from the tile-based rendering circuitry 24 and the writes from the counter controller 34 such that the writes from the tile-based rendering circuitry 24 always have high priority.
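The arbitration described above may be sketched, purely as an assumption-laden software model, with two queues in which pixel writes are always served first. The class `MemoryInterface` and its method names are hypothetical, and one write per call is assumed.

```python
from collections import deque


class MemoryInterface:
    """Toy fixed-priority arbiter: pixel writes always beat parameter writes."""

    def __init__(self):
        self.pixel_writes = deque()
        self.param_writes = deque()

    def queue_pixels(self, item):
        self.pixel_writes.append(item)

    def queue_params(self, item):
        self.param_writes.append(item)

    def next_write(self):
        """Return the write issued this cycle, or None when idle."""
        if self.pixel_writes:
            return ("pixel", self.pixel_writes.popleft())
        if self.param_writes:
            # Parameter data only fills gaps left by the pixel bursts.
            return ("param", self.param_writes.popleft())
        return None
```

In this model a queued parameter write is delayed for as long as pixel data is pending, so the pixel write stream sees the same timing whether or not monitoring is enabled.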
Tile-based rendering in this example is performed using tiles which contain 16*16 pixel values (although it will be appreciated that other sizes and shapes of tiles are possible). A display list 42 stored within the memory 8 stores lists of primitives which potentially intersect each tile to be rendered by the tile-based rendering circuitry. The display lists 42 may be generated by the general purpose processor 4 and stored within the memory 8. The graphics processing unit 6 serves to render each tile on a tile-by-tile basis by reading the display list 42 and then calculating each pixel value depending upon the data identifying the primitives potentially intersecting the tile concerned and taking into account any texture, shading or other graphics controlling data which may also be in use. When the tile has been generated, the array of 16*16 pixel values are written into the corresponding position within the image frame 18 of the memory 8. Such tile-by-tile processing is distinguished from what is normally termed immediate mode processing in which the image is formed by rendering each primitive in turn on a primitive-by-primitive basis into the image frame 18 as a whole.
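One way in which per-tile lists of potentially intersecting primitives could be built is a conservative bounding-box test, sketched below. The description does not state how the display list 42 is generated, so the bounding-box approach, the function name `bin_primitives` and the use of integer pixel coordinates are all assumptions for illustration.

```python
TILE = 16  # tile edge in pixels, as in the example above


def bin_primitives(primitives, tiles_x, tiles_y, tile=TILE):
    """Return lists[ty][tx] of primitive indices whose bounding box overlaps the tile.

    Each primitive is a sequence of (x, y) integer vertex coordinates. The test
    is conservative: a tile may be listed even if the primitive misses it.
    """
    lists = [[[] for _ in range(tiles_x)] for _ in range(tiles_y)]
    for index, vertices in enumerate(primitives):
        xs = [x for x, _ in vertices]
        ys = [y for _, y in vertices]
        # Clamp the covered tile range to the frame.
        x0 = max(min(xs) // tile, 0)
        x1 = min(max(xs) // tile, tiles_x - 1)
        y0 = max(min(ys) // tile, 0)
        y1 = min(max(ys) // tile, tiles_y - 1)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                lists[ty][tx].append(index)
    return lists
```

The length of each per-tile list is then the kind of "number of primitives potentially intersecting a tile" value that the monitoring circuitry is described as counting.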
The parameters monitored can take a wide variety of different forms. Particularly useful parameters to monitor include a number of processing cycles used by the graphics processing unit 6 in rendering each tile, a number of cache misses within the graphics processing unit cache 26 when rendering each tile, a number of bus transactions on the system bus 14 performed by the memory interface circuitry 28 when rendering each tile and/or a number of primitives identified as potentially intersecting a tile being processed as identified by the tile-based rendering circuitry 24 from reading the display list 42.
Figure 2 schematically illustrates a simple image frame to be rendered. This image frame contains three primitives in the form of two triangles and one square. It will be seen that the image frame is composed of an array of 10*10 tiles and each of these tiles contains 16*16 pixel values. Each tile is rendered in turn by the tile-based rendering circuitry 24. As each tile is rendered, the display list 42 within the memory 8 is read to identify the number of primitives potentially intersecting that tile. This number of primitives data is output by the tile-based rendering circuitry 24 and is captured within one of the counters, 30, 32.
Figure 3 illustrates a performance parameter array corresponding to the image frame of Figure 2 in which the number of primitives potentially intersecting each tile within the array has been captured and stored. It will be seen that there is a parameter value representing the number of primitives stored in respect of each tile within the image frame. There is a one-to-one correspondence in this example between the tiles of the image frame of Figure 2 and the parameter value stored within the performance parameter array of Figure 3. It will be appreciated that each entry within the performance parameter array of Figure 3 could include multiple different parameters relating to the same tile, such as a number of primitives count, a cycle count, a cache miss count, a number of memory transactions count etc. As an alternative, separate performance parameter arrays could be kept in respect of different performance parameters being monitored and captured. It will be observed from Figure 3 that even though the number of primitives associated with each tile is a relatively straightforward parameter to capture and count, the interpretation of the array of data illustrated in Figure 3 is not straightforward even though it is illustrated in Figure 3 in the form of a two dimensional array.
Figure 4 illustrates how a performance frame may be generated from the performance parameter array of Figure 3 in order to facilitate understanding and interpretation of the performance parameters which have been captured. The performance frame of Figure 4 is formed with tiles in one-to-one correspondence with the tiles of the image frame of Figure 2 and the data values stored within the performance parameter array of Figure 3. For the sake of convenience, the tiles within the performance frame can have the same size as the tiles within the image frame of Figure 2, namely formed of 16*16 parameter-representing pixel values with at least one visual characteristic selected in dependence upon the corresponding parameter value within the performance parameter array of Figure 3. It will be seen from the performance parameter array of Figure 3 that the maximum number of primitives for any tile is 3 and the minimum number is 0. The maximum and minimum values can be searched for within the performance parameter array and used to effectively select the mapping between parameters and visual characteristics of the parameter-representing pixel values within the performance frame of Figure 4.
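A minimal sketch of selecting the mapping from the minimum and maximum values found within the array is given below. A linear mapping onto an 8-bit grey-scale is assumed here for illustration; the description does not prescribe the form of the mapping, and the name `grey_mapping` is hypothetical.

```python
def grey_mapping(values, black=0, white=255):
    """Build a function mapping parameter values linearly onto a grey-scale.

    The smallest value found maps to `black` and the largest to `white`.
    """
    lo, hi = min(values), max(values)
    span = hi - lo

    def to_grey(value):
        if span == 0:
            return black  # all tiles equal: nothing to distinguish
        clamped = min(max(value, lo), hi)
        return black + (clamped - lo) * (white - black) // span

    return to_grey
```

For parameter values ranging from 0 to 3, as in Figure 3, this sketch yields four evenly spaced grey levels.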
Another approach would be to allow the user to manually select the mapping to be used, such as manually selecting minimum and maximum values and which visual characteristics these corresponded to with the visual characteristic varying in a predetermined manner in dependence upon the parameter value between these extremes. The minimum and maximum values can be determined on the basis of a single performance parameter array or they may be determined based upon multiple performance parameter arrays for the same parameter. Setting the mapping taking into account the parameter values of multiple arrays captured for the same parameter may be preferable as it may more readily allow unusual parameter values within individual performance frames to be identified.
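The benefit of taking the limits from multiple arrays can be illustrated with a short sketch. The function names `shared_limits` and `normalise` are hypothetical, and the limits passed to `normalise` could equally be values chosen manually by the user.

```python
def shared_limits(arrays):
    """Minimum and maximum over several parameter arrays for one parameter."""
    values = [v for array in arrays for row in array for v in row]
    return min(values), max(values)


def normalise(value, limits):
    """Position of a value between the limits, clamped to the range 0.0 to 1.0."""
    lo, hi = limits
    if hi == lo:
        return 0.0
    return (min(max(value, lo), hi) - lo) / (hi - lo)
```

With limits taken across two frames whose maxima are 3 and 12, a value of 3 is shown at a quarter of full scale in both frames, so the tile with the value 12 stands out; with per-frame limits the value 3 would have appeared at full scale in the first frame.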
The example shown in Figure 4 associates solid shading with the tiles for which three primitives were potentially intersecting, cross hatched shading for tiles with two primitives, diagonal shading with tiles for one primitive and no shading for tiles with zero primitives. In this way, the tiles for which the highest number of primitives required consideration can be readily identified and a visual comparison may be made with the image frame of Figure 2 should such a high number of primitives be considered a problem. The nature of the image frame giving rise to such a high number of primitives may then be adapted if needed.
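As an illustrative sketch only, a performance frame of this kind may be built by expanding each parameter value into a tile of identical pixel values. The function name `performance_frame` is hypothetical, and the particular shade values used in the usage note are assumptions standing in for the shading styles of Figure 4.

```python
def performance_frame(params, to_pixel, tile=16):
    """Expand a 2D parameter array into a frame of tile*tile pixel blocks.

    `to_pixel` maps one parameter value to one pixel value; every pixel in a
    tile receives the same value.
    """
    frame = []
    for row in params:
        pixel_row = []
        for value in row:
            pixel_row.extend([to_pixel(value)] * tile)
        # Each of the `tile` pixel rows within a row of tiles is identical.
        frame.extend(list(pixel_row) for _ in range(tile))
    return frame
```

For example, passing `{0: 255, 1: 170, 2: 85, 3: 0}.get` as `to_pixel` would render tiles with three primitives darkest and tiles with none lightest, loosely analogous to the solid and unshaded tiles of Figure 4.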
It will be appreciated that the above is only one example of how a performance frame may be formed. The visual characteristic varied in dependence upon the parameter value can have a wide variety of different forms. As an example, the visual characteristic may be a pixel grey-scale value, a pixel intensity and/or a pixel colour. Other visual characteristics (e.g. flashing when over a certain parameter value) may also be envisaged and used if desired.
Figure 5 schematically illustrates a flow diagram corresponding to processing performed by the tile-based rendering circuitry 24. At step 44 the first tile to be rendered is selected. At step 46 the display list 42 is read to identify the primitives which potentially intersect the current tile. Step 48 renders the tile using the primitives read and also updates the counters 30, 32 in dependence upon the currently selected parameters being monitored. At step 50, the array of rendered pixel values are written into the image frame 18 within the memory 8. At step 52 the performance counter values from the counters 30, 32 for the tile which has just been rendered are written into the performance parameter array 20 by the counter controller 34 via the memory interface circuitry 28. Step 54 identifies whether the current tile is the last tile within the image frame. If the current tile is not the last tile, then step 56 selects the next tile and processing returns to step 46, otherwise the tile rendering of the frame has been completed.
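The flow of Figure 5 may be modelled in software along the following lines. This is a sketch under stated assumptions: the function names, the use of dictionaries for the image frame and parameter array, and the cost model in `toy_render_tile` are all invented for illustration and do not describe the actual circuitry.

```python
def render_frame(display_lists, render_tile):
    """Render tile by tile, capturing per-tile counter values.

    display_lists[ty][tx] holds the primitives potentially intersecting a tile.
    """
    image = {}   # (tx, ty) -> tile of pixel values
    params = {}  # (tx, ty) -> captured counter values
    for ty, row in enumerate(display_lists):           # steps 44, 54 and 56
        for tx, primitives in enumerate(row):          # step 46
            counters = {"primitives": 0, "cycles": 0}  # reset for each tile
            pixels = render_tile(tx, ty, primitives, counters)  # step 48
            image[(tx, ty)] = pixels                   # step 50
            params[(tx, ty)] = dict(counters)          # step 52
    return image, params


def toy_render_tile(tx, ty, primitives, counters):
    """Placeholder renderer with an invented cost model."""
    counters["primitives"] += len(primitives)
    counters["cycles"] += 10 + 5 * len(primitives)
    return [[len(primitives)] * 16 for _ in range(16)]
```

The resulting `params` mapping has one entry per tile, matching the one-to-one correspondence between tiles and parameter array entries.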
Figure 6 is a flow diagram schematically illustrating the generation of a performance frame, such as illustrated in Figure 4, from an array of parameter values, such as illustrated in Figure 3. The processing illustrated in Figure 6 may be performed by the diagnostic computer 16 of Figure 1, which has read the array of performance data 20 from the memory 8 via the input output unit 12. At step 58 the mapping between performance parameter values and visual characteristics is either calculated or selected as previously discussed. At step 60 the first value in the performance parameter array is selected. Step 62 generates a corresponding tile of performance-representing pixel values with a visual characteristic dependent upon the performance parameter read from the array at step 60. At step 64 the tile of performance-representing pixel values are written to the performance frame of Figure 4. Step 66 determines whether the current parameter value within the array is the last array value. If the current parameter value is not the last array value, then step 68 selects the next array value and processing returns to step 62. If all of the array values have been mapped to performance-representing pixel values such that the full performance frame of Figure 4 has been generated, then processing proceeds to step where the performance frame is displayed on the diagnostic computer 16. The processing illustrated in Figure 6 may be performed by the diagnostic computer 16 under control of a computer program stored on a computer readable storage medium, such as a disk memory, etc.
Figure 7 is a flow diagram schematically illustrating the action of a driver computer program in initialising the graphics processing unit 6 of Figure 1. The driver computer program may be executed by the general purpose processor 4 in Figure 1 and may be stored within the memory 8.
At step 72, the graphics processing unit 6 is initialised other than in respect of its diagnostic capabilities with which the present technique is concerned. At step 74 a determination is made as to whether or not diagnostics are required to be run. If diagnostics are not required, then processing proceeds to step 76 where the graphics processing unit 6 is started.
If diagnostics are required, then step 78 reads a user input specifying which parameters are to be monitored. This user input could be made via the diagnostic computer 16. The user input could also be made in a number of other ways, such as via an input device associated with the apparatus of which the system-on-chip integrated circuit 2 forms a part. The user input specifying which parameters to monitor is written to the register 40 within the counter controller 34, which accordingly generates corresponding control signals for the multiplexers 36, 38 as previously discussed. The writing of the parameter selecting value to the counter controller 34 takes place at step 80. At step 82, further user input is read specifying which memory storage location is to be used for the performance parameter array 20. When this user input has been received, step 84 writes this memory storage location specifying information into the register 40 of the counter controller 34 such that the counter controller 34 will generate appropriately addressed memory transactions to the memory 8 in respect of parameter data to be written into the performance parameter array 20 as each tile is completed. Processing then proceeds to step 76 where the graphics processing unit 6 is started.
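By way of illustration only, the driver flow of Figure 7 may be sketched as follows. The classes, field names and the `read_user_input` callback are all hypothetical stand-ins introduced for this example; the description does not specify a driver interface or a register layout.

```python
class CounterController:
    """Stand-in for counter controller 34; `register` plays the role of register 40."""

    def __init__(self):
        self.register = {"parameter_select": None, "array_address": None}


class GraphicsProcessingUnit:
    """Stand-in for graphics processing unit 6."""

    def __init__(self):
        self.counter_controller = CounterController()
        self.initialised = False
        self.started = False


def initialise_gpu(gpu, diagnostics_required, read_user_input=None):
    """Model of the Figure 7 driver flow (all names here are assumptions)."""
    # Step 72: initialise everything other than the diagnostic capabilities.
    gpu.initialised = True
    # Step 74: decide whether diagnostics are required.
    if diagnostics_required:
        # Step 78: read which parameters to monitor, then write the
        # selection into the counter controller's register.
        selection = read_user_input("parameters")
        gpu.counter_controller.register["parameter_select"] = selection
        # Steps 82 and 84: read the storage location for the performance
        # parameter array and write it into the same register.
        address = read_user_input("array_address")
        gpu.counter_controller.register["array_address"] = address
    # Step 76: start the graphics processing unit.
    gpu.started = True
    return gpu
```

In this sketch the user input is supplied through a callback, so the same flow could be driven from a diagnostic computer or from a local input device, as the description contemplates.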
Figure 8 schematically illustrates a general purpose computer 200 of the type that may be used to implement the above described techniques. The general purpose computer 200 includes a central processing unit 202, a random access memory 204, a read only memory 206, a network interface card 208, a hard disk drive 210, a display driver 212 and monitor 214 and a user input/output circuit 216 with a keyboard 218 and mouse 220 all connected via a common bus 222. In operation the central processing unit 202 will execute computer program instructions that may be stored in one or more of the random access memory 204, the read only memory 206 and the hard disk drive 210 or dynamically downloaded via the network interface card 208.
The results of the processing performed may be displayed to a user via the display driver 212 and the monitor 214. User inputs for controlling the operation of the general purpose computer 200 may be received via the user input/output circuit 216 from the keyboard 218 or the mouse 220. It will be appreciated that the computer program could be written in a variety of different computer languages. The computer program may be stored and distributed on a recording medium or dynamically downloaded to the general purpose computer 200. When operating under control of an appropriate computer program, the general purpose computer 200 can perform the above described techniques and can be considered to form an apparatus for performing the above described technique. The architecture of the general purpose computer 200 could vary considerably and Figure 8 is only one example.
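By way of illustration only, the Figure 6 processing that such a computer could perform may be sketched in Python as follows. The linear grey-scale mapping used here is an assumption made for the example (the description only states that a mapping is calculated or selected), and the function names and the small tile size are likewise hypothetical.

```python
def make_mapping(values):
    """Step 58: calculate a mapping from parameter value to grey level 0..255.

    A linear scaling between the smallest and largest value is assumed here.
    """
    lo, hi = min(values), max(values)
    span = hi - lo

    def to_grey(value):
        if span == 0:
            return 0  # all tiles identical: nothing to distinguish
        return 255 * (value - lo) // span

    return to_grey


def build_performance_frame(parameter_array, tile=4):
    """Expand each per-tile parameter value into a tile of identical pixels."""
    flat = [value for row in parameter_array for value in row]
    to_grey = make_mapping(flat)
    frame = []
    for row in parameter_array:            # steps 60, 66 and 68: walk the array
        pixel_row = []
        for value in row:
            # Step 62: pixel values whose grey level depends on the parameter.
            pixel_row.extend([to_grey(value)] * tile)
        # Step 64: write the tiles into the performance frame.
        for _ in range(tile):
            frame.append(list(pixel_row))
    return frame
```

For example, `build_performance_frame([[0, 50], [100, 25]], tile=2)` yields a 4 by 4 frame in which the tile with the largest parameter value is brightest, so that expensive tiles stand out when the frame is viewed alongside the rendered image.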

Claims (28)

  1. Graphics processing apparatus for rendering a frame of pixel values representing a scene, said graphics processing apparatus comprising: rendering circuitry for rendering separately different areas of said frame; monitoring circuitry coupled to said rendering circuitry to capture for each area rendered one or more parameters; and a parameter memory coupled to said monitoring circuitry to store separately for each area said one or more parameters captured by said monitoring circuitry.
  2. Graphics processing apparatus as claimed in claim 1, wherein said scene includes one or more primitives and said frame is formed of a plurality of tiles of pixel values.
  3. Graphics processing apparatus as claimed in claim 2, wherein said rendering circuitry is tile-based rendering circuitry responsive to data characterising said one or more primitives to render a sequence of said tiles to generate said frame, each of said tiles being rendered in turn for a selection of said one or more primitives identified as potentially intersecting said tile.
  4. Graphics processing apparatus as claimed in any one of claims 1, 2 and 3, wherein said monitoring circuitry includes performance monitoring circuitry and said one or more parameters include one or more performance parameters.
  5. Graphic processing apparatus as claimed in any one of the preceding claims, wherein said rendering circuitry upon completion of rendering of a currently active area writes pixel values for said currently active area to a frame memory, and said monitoring circuitry writes said one or more parameters to said parameter memory at times when said rendering circuitry does not have rendered pixel values for a completed area to be written to said frame memory.
  6. Graphic processing apparatus as claimed in claim 5, wherein said parameter memory and said frame memory are parts of a common shared memory.
  7. Graphics processing apparatus as claimed in any one of the preceding claims, wherein said monitoring circuitry comprises one or more counters coupled to respective points within said graphics processing apparatus to count events associated with rendering of each area.
  8. Graphics processing apparatus as claimed in claim 7, wherein parameter selecting circuitry associated with at least one of said one or more counters is responsive to a parameter selecting signal to select to which point within said graphics processing apparatus said counter is coupled and accordingly which events are counted.
  9. Graphics processing apparatus as claimed in claim 8, wherein said parameter selecting signal is user programmable such that a user can select which events are to be counted.
  10. Graphic processing apparatus as claimed in any one of the preceding claims, wherein said one or more parameters comprise one or more of: a number of processing cycles used by said graphics processing apparatus in rendering an area; a number of cache misses within a cache memory of said graphics processing apparatus when rendering an area; a number of bus transactions on a bus coupled to said graphics processing apparatus when rendering an area; and a number of primitives identified as potentially intersecting an area that are processed when rendering said area.
  11. Graphic processing apparatus as claimed in any one of the preceding claims, wherein upon completion of rendering of said frame, said parameter memory contains an array of parameters corresponding to an array of said areas forming said frame.
  12. A method of generating monitoring data for a graphics processing apparatus rendering a frame of pixel values representing a scene, said method comprising the steps of: separately rendering different areas of said frame; capturing for each area rendered one or more parameters; and separately storing for each area said one or more parameters captured.
  13. Graphics processing apparatus for rendering a frame of pixel values representing a scene, said graphics processing apparatus comprising: rendering means for rendering separately different areas of said frame; monitoring means coupled to said rendering means for capturing for each area rendered one or more parameters; and parameter memory means coupled to said monitoring means for storing separately for each area said one or more parameters captured by said monitoring means.
  14. A method of analysing a graphics processing apparatus separately rendering different areas of an image frame representing a scene, said method comprising the steps of: reading an array of one or more parameters stored within a parameter memory for respective areas of said frame; and generating a parameter frame with areas of parameter-representing pixel values having at least one visual characteristic selected in dependence upon at least one parameter of a corresponding area within said image frame.
  15. A method as claimed in claim 14, wherein said scene includes one or more primitives and said frame is formed of a plurality of tiles of pixel values.
  16. A method as claimed in claim 15, wherein in response to data characterising said one or more primitives, said image frame is rendered as a sequence of said tiles, each of said tiles being rendered in turn for a selection of said one or more primitives identified as potentially intersecting said tile.
  17. A method as claimed in any one of claims 14, 15 and 16, wherein said one or more parameters include one or more performance parameters.
  18. A method as claimed in any one of claims 14 to 17, wherein each area within said parameter frame has a corresponding area within said image frame.
  19. A method as claimed in any one of claims 14 to 18, wherein said at least one visual characteristic comprises one or more of: pixel grey-scale value; pixel intensity; and pixel colour.
  20. A method as claimed in any one of claims 14 to 19, wherein a plurality of parameter frames are generated for each image frame, different parameter frames having areas with pixel values dependent upon different parameters.
  21. A method as claimed in any one of claims 14 to 20, wherein said array of one or more parameters and said image frame are stored within a common memory.
  22. A method as claimed in any one of claims 14 to 21, wherein said graphic processing apparatus is part of an integrated circuit and said steps of reading and generating are performed with a general purpose computer coupled to said integrated circuit.
  23. A computer program storage medium storing a computer program for controlling a general purpose computer to perform the method as claimed in any one of claims 14 to 22.
  24. A computer program storage medium storing a driver computer program for controlling a graphic processing apparatus as claimed in any one of claims 1 to 11, said driver computer program permitting user selection of said one or more parameters to be captured.
  25. A computer program storage medium as claimed in claim 24, wherein said driver program permits user selection of a storage location to be used as said parameter memory.
  26. Graphic processing apparatus substantially as hereinbefore described with reference to the accompanying drawings.
  27. A method of analysing performance of a graphics processing apparatus substantially as hereinbefore described with reference to the accompanying drawings.
  28. A computer program storage medium storing a driver computer program for controlling a graphic processing apparatus substantially as hereinbefore described with reference to the accompanying drawings.
GB0813028.8A 2008-07-16 2008-07-16 Monitoring graphics processing Expired - Fee Related GB2461900B (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
GB0813028.8A GB2461900B (en) 2008-07-16 2008-07-16 Monitoring graphics processing
JP2009161993A JP5317866B2 (en) 2008-07-16 2009-07-08 Graphic processing monitoring
CN200910161610.9A CN101650821B (en) 2008-07-16 2009-07-15 Monitor graphics process
US12/458,609 US8144167B2 (en) 2008-07-16 2009-07-16 Monitoring graphics processing
US13/310,008 US8339414B2 (en) 2008-07-16 2011-12-02 Monitoring graphics processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0813028.8A GB2461900B (en) 2008-07-16 2008-07-16 Monitoring graphics processing

Publications (3)

Publication Number Publication Date
GB0813028D0 GB0813028D0 (en) 2008-08-20
GB2461900A true GB2461900A (en) 2010-01-20
GB2461900B GB2461900B (en) 2012-11-07

Family

ID=39722389

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0813028.8A Expired - Fee Related GB2461900B (en) 2008-07-16 2008-07-16 Monitoring graphics processing

Country Status (4)

Country Link
US (2) US8144167B2 (en)
JP (1) JP5317866B2 (en)
CN (1) CN101650821B (en)
GB (1) GB2461900B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014168740A3 (en) * 2013-04-11 2015-06-11 Qualcomm Incorporated Intra-frame timestamps for tile-based rendering

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2473682B (en) * 2009-09-14 2011-11-16 Sony Comp Entertainment Europe A method of determining the state of a tile based deferred re ndering processor and apparatus thereof
US8760460B1 (en) * 2009-10-15 2014-06-24 Nvidia Corporation Hardware-managed virtual buffers using a shared memory for load distribution
US8339409B2 (en) * 2011-02-16 2012-12-25 Arm Limited Tile-based graphics system and method of operation of such a system
US8830246B2 (en) 2011-11-30 2014-09-09 Qualcomm Incorporated Switching between direct rendering and binning in graphics processing
KR101953133B1 (en) * 2012-02-27 2019-05-22 삼성전자주식회사 Apparatus and method for rendering
US9672584B2 (en) * 2012-09-06 2017-06-06 Imagination Technologies Limited Systems and methods of partial frame buffer updating
US9152872B2 (en) * 2012-11-12 2015-10-06 Accenture Global Services Limited User experience analysis system to analyze events in a computer desktop
US9030480B2 (en) * 2012-12-18 2015-05-12 Nvidia Corporation Triggering performance event capture via pipelined state bundles
GB201223089D0 (en) 2012-12-20 2013-02-06 Imagination Tech Ltd Hidden culling in tile based computer generated graphics
GB2509113B (en) * 2012-12-20 2017-04-26 Imagination Tech Ltd Tessellating patches of surface data in tile based computer graphics rendering
US9261939B2 (en) 2013-05-09 2016-02-16 Apple Inc. Memory power savings in idle display case
US9286649B2 (en) * 2013-05-31 2016-03-15 Qualcomm Incorporated Conditional execution of rendering commands based on per bin visibility information with added inline operations
GB2546810B (en) * 2016-02-01 2019-10-16 Imagination Tech Ltd Sparse rendering
US10373286B2 (en) 2016-08-03 2019-08-06 Samsung Electronics Co., Ltd. Method and apparatus for performing tile-based rendering
GB2555586B (en) * 2016-10-31 2019-01-02 Imagination Tech Ltd Performance profiling in a graphics unit
CN109840876B (en) * 2017-11-24 2023-04-18 成都海存艾匹科技有限公司 Graphic memory with rendering function
CN110459180B (en) 2018-05-07 2022-04-22 京东方科技集团股份有限公司 Drive control method and device and display device
GB2577062B (en) * 2018-09-11 2021-04-28 Advanced Risc Mach Ltd Methods, apparatus and processor for producing a higher resolution frame
CN110471701B (en) * 2019-08-12 2021-09-10 Oppo广东移动通信有限公司 Image rendering method and device, storage medium and electronic equipment
CN110992867B (en) * 2019-12-18 2023-02-28 京东方科技集团股份有限公司 Image processing method and display device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19826512A1 (en) * 1998-06-15 1999-12-16 Siemens Ag Picture display system for monitor
US6344852B1 (en) * 1999-03-17 2002-02-05 Nvidia Corporation Optimized system and method for binning of graphics data
GB2374782A (en) * 1998-04-01 2002-10-23 Real 3 D Inc A linear surface memory to spatial tiling algorithm/mechanism

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5995113A (en) 1996-08-02 1999-11-30 Hewlett-Packard Company Coloring events in event streams in order to provide information about operation of a software library
JP2000132349A (en) * 1998-10-21 2000-05-12 Fuji Xerox Co Ltd Plotting processor
JP2000132350A (en) * 1998-10-21 2000-05-12 Fuji Xerox Co Ltd Plotting processor
GB2378108B (en) * 2001-07-24 2005-08-17 Imagination Tech Ltd Three dimensional graphics system
US6738069B2 (en) 2001-12-31 2004-05-18 Intel Corporation Efficient graphics state management for zone rendering
US6885376B2 (en) * 2002-12-30 2005-04-26 Silicon Graphics, Inc. System, method, and computer program product for near-real time load balancing across multiple rendering pipelines
US7075541B2 (en) * 2003-08-18 2006-07-11 Nvidia Corporation Adaptive load balancing in a multi-processor graphics processing system
JP4347017B2 (en) * 2003-10-17 2009-10-21 キヤノン株式会社 Information processing method and image processing method
US7834890B2 (en) * 2003-10-17 2010-11-16 Canon Kabushiki Kaisha Information processing method and image processing method
US7190366B2 (en) * 2004-05-14 2007-03-13 Nvidia Corporation Method and system for a general instruction raster stage that generates programmable pixel packets
JP4566772B2 (en) 2005-02-14 2010-10-20 キヤノン株式会社 Image processing apparatus, image processing method, and program
US20070139421A1 (en) * 2005-12-21 2007-06-21 Wen Chen Methods and systems for performance monitoring in a graphics processing unit

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014168740A3 (en) * 2013-04-11 2015-06-11 Qualcomm Incorporated Intra-frame timestamps for tile-based rendering
CN105122310A (en) * 2013-04-11 2015-12-02 高通股份有限公司 Intra-frame timestamps for tile-based rendering
US9449410B2 (en) 2013-04-11 2016-09-20 Qualcomm Incorporated Intra-frame timestamps for tile-based rendering
CN105122310B (en) * 2013-04-11 2018-06-26 高通股份有限公司 For time stamp in the frame of the rendering based on tile

Also Published As

Publication number Publication date
CN101650821A (en) 2010-02-17
US20100020090A1 (en) 2010-01-28
JP5317866B2 (en) 2013-10-16
US8339414B2 (en) 2012-12-25
US20120075321A1 (en) 2012-03-29
CN101650821B (en) 2015-11-25
JP2010027050A (en) 2010-02-04
US8144167B2 (en) 2012-03-27
GB0813028D0 (en) 2008-08-20
GB2461900B (en) 2012-11-07

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20220716