CN102799431B - Graphics primitive preprocessing method, graphics primitive processing method, graphic processing method, processor and device - Google Patents


Info

Publication number
CN102799431B
Authority
CN
China
Prior art keywords
summit
pel
vertex
speed cache
sequence table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210226716.4A
Other languages
Chinese (zh)
Other versions
CN102799431A (en
Inventor
沙力
李济川
赵波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Galaxycore Shanghai Ltd Corp
Original Assignee
SHANGHAI SUANXIN MICROELECTRONICS CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI SUANXIN MICROELECTRONICS CO Ltd filed Critical SHANGHAI SUANXIN MICROELECTRONICS CO Ltd
Priority to CN201210226716.4A priority Critical patent/CN102799431B/en
Publication of CN102799431A publication Critical patent/CN102799431A/en
Application granted granted Critical
Publication of CN102799431B publication Critical patent/CN102799431B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a graphics primitive preprocessing method, a graphics primitive processing method, a graphics processing method, a graphics primitive preprocessor, a graphics primitive processor and a graphics processing device. The graphics primitive preprocessing method comprises the following steps: simulating the actual output process of a graphics primitive, obtaining the index value in the cache of each vertex of the graphics primitive, obtaining a vertex index sequence table through reordering, and determining the number of vertices to be replaced for the graphics primitive. The graphics primitive processing method comprises the steps of outputting the graphics primitive and replacing vertex data in the cache. The graphics processing method comprises processing all graphics primitives in turn with the graphics primitive processing method, in the order of the primitive string. The graphics primitive preprocessor comprises a simulation unit, an index value unit, a reordering unit and a vertex count unit. The graphics primitive processor comprises an output unit and a replacement unit. The graphics processing device comprises a receiving unit, a graphics primitive preprocessor and a graphics primitive processor. According to the invention, the conventional cache tag is not required, so chip area is saved and the design complexity and power consumption of the chip are reduced as well.

Description

Primitive preprocessing and processing methods, graphics processing method, and processor and device thereof
Technical field
The present invention relates to graphics processing, and in particular to a primitive preprocessing method and a primitive processing method, a graphics processing method, a primitive preprocessor and a primitive processor, and a graphics processing device.
Background
Users today demand ever better visual effects from computer applications, especially games, which often need to render complex and finely detailed graphics; a single scene often involves a large number of shapes. This inevitably entails a large volume of continuous graphics computation and, correspondingly, places ever higher demands on the computing capability of graphics processing chips.
In the prior art, the primitive is the basic unit of graphics drawing. Each primitive comprises one or more vertices. For example, a point primitive is a single vertex, a line-segment primitive comprises two vertices, a triangle primitive comprises three vertices, a quadrilateral primitive comprises four vertices, and so on. Primitives can be defined by a graphics application programming interface (API) standard. Commonly used graphics APIs include the Open Graphics Library (OpenGL) and Direct3D (D3D). OpenGL is a cross-language, cross-platform programming interface that allows two- and three-dimensional computer graphics applications to be developed independently of the window system and operating system; applications built on it can be ported easily between platforms. D3D is a Microsoft standard that works with the various Microsoft systems to perform two- and three-dimensional graphics processing.
Any complex figure can be decomposed into multiple relatively simple primitives. For example, the figure of a person can be split into a circle representing the head, a large rectangle representing the torso, and four small rectangles representing the limbs, where the circle is in turn assembled from many triangle primitives. For ease of drawing, a complex figure is decomposed as far as possible into simple primitives such as triangles and quadrilaterals.
Drawing a figure means drawing each primitive that composes it. The information for these primitives is stored in memory (for example, DDR memory). This primitive information comprises, first, an index table covering all vertices, which gives the actual physical storage address of each vertex's data, and second, the data of each vertex. A primitive information table may also be provided, indicating which vertices make up each primitive. Drawing a primitive consists of using the vertex index information of the primitive to find the actual physical storage addresses, reading the corresponding vertex data from memory, and drawing. Because memory read speed is limited, existing 3D graphics chip designs all read vertex data through a vertex cache, a widely used technique. It keeps the most recently used vertex data in a cache memory (Cache). The cache's capacity is much smaller than that of main memory, so it can hold only a limited amount of data, but it is much faster to read, and it is typically used to hold data the processor is likely to need next. Because vertices in graphics programs exhibit spatial and temporal locality, two consecutively read primitives often share vertices; placing such vertices in the cache shortens the time needed to read the same vertex data the next time.
Because the cache's capacity is limited, it cannot hold all vertex data; when the cache is full, data must be replaced according to some policy. A tag must therefore be kept to indicate whether a vertex is in the cache. Whenever a new primitive is read, the vertex indexes of the primitive are compared with the tags in the cache. If they match, it is a hit: the vertex data is in the cache and can be read from it directly. Otherwise it is a miss: the missing vertex must be fetched from memory to replace a vertex in the cache, after which the cache is read.
For example, suppose a memory can hold the data of 1024 vertices; each vertex then needs 10 bits to identify it (since 2^10 = 1024). Suppose the cache can hold the data of 8 vertices; each vertex then needs only 3 bits (since 2^3 = 8), normally the lowest 3 bits. That is, vertex 0 is 0000000000 in memory and its lowest 3 bits, 000, in the cache. Vertex 1 is 0000000001 in memory and 001 in the cache. Vertex 7 is 0000000111 in memory and 111 in the cache. The problem is that vertex 8 is 0000001000 in memory, and its lowest 3 bits in the cache are also 000. It is then impossible to tell whether this entry represents vertex 0 or vertex 8. A tag is therefore needed for each vertex to hold the vertex's high-order bits: the tag of vertex 0 is 0000000 and the tag of vertex 8 is 0000001. Comparing tag values determines whether a given vertex is in the cache. So besides the vertex data, the cache must also store the vertex tags, which take up cache storage and also require a large amount of additional control logic, increasing chip design complexity, chip area and chip power consumption.
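The bit arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the conventional tagged lookup described above, using the 10-bit and 3-bit widths from the example; the function and variable names (`split_vertex_number`, `is_hit`, `stored_tags`) are assumptions made for the sketch and do not come from the patent.

```python
ADDRESS_BITS = 10  # memory holds 2**10 = 1024 vertices
INDEX_BITS = 3     # cache holds 2**3 = 8 vertices


def split_vertex_number(vertex_number):
    """Split a 10-bit vertex number into (tag, cache_index)."""
    if not 0 <= vertex_number < (1 << ADDRESS_BITS):
        raise ValueError("vertex number does not fit in 10 bits")
    cache_index = vertex_number & ((1 << INDEX_BITS) - 1)  # lowest 3 bits
    tag = vertex_number >> INDEX_BITS                      # upper 7 bits
    return tag, cache_index


def is_hit(stored_tags, vertex_number):
    """Return True if the cache slot for this vertex currently holds it.

    stored_tags[i] is the tag kept alongside cache slot i, or None if empty.
    """
    tag, cache_index = split_vertex_number(vertex_number)
    return stored_tags[cache_index] == tag


stored_tags = [None] * (1 << INDEX_BITS)
tag0, slot0 = split_vertex_number(0)
stored_tags[slot0] = tag0           # vertex 0 now occupies slot 000

print(split_vertex_number(0))       # (0, 0)
print(split_vertex_number(8))       # (1, 0): same slot, different tag
print(is_hit(stored_tags, 0))       # True
print(is_hit(stored_tags, 8))       # False: only the tag tells them apart
```

The `stored_tags` list and the comparison in `is_hit` stand for the extra storage and control logic that the tagged design pays for on every lookup.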
Summary of the invention
The technical problem addressed by the present invention is to draw graphics without keeping tags in the cache, saving graphics chip area and reducing the design complexity and power consumption of the graphics processing chip.
To solve this problem, the invention provides a primitive preprocessing method, comprising:
Simulating the actual output process of the primitive, to obtain the vertices in the cache when the primitive is actually output and the order in which the vertices of the primitive enter the cache;
Obtaining, from the vertices in the cache when the primitive is actually output, the index value in the cache of each vertex of the primitive, there being a one-to-one correspondence between index values in the cache and vertices in the cache;
Reordering the vertices of the primitive according to the order in which they enter the cache, to obtain a vertex index sequence table; the vertex index sequence table stores the index of each vertex's data, from which the actual physical address of the vertex data, and hence the data of the vertex, can be obtained;
Determining, from the vertices in the cache when the primitive is actually output and the vertices of the next primitive, the number n of vertices to be replaced for the primitive, n being a natural number.
Optionally, the reordering comprises:
Taking any vertex of the primitive as the current vertex;
Repeating the following steps until every vertex of the primitive has been processed:
If the number of vertices in the vertex index sequence table is less than the number of vertices the cache can hold, adding the current vertex to the vertex index sequence table;
Otherwise, if the current vertex is not in the cache at the time the preceding primitive is actually output, adding it to the vertex index sequence table;
If the number of vertices in the vertex index sequence table after the current vertex has been added is greater than or equal to the number of vertices the cache can hold, determining whether the vertex stored earliest in the cache at the time of the actual vertex replacement is a vertex of this primitive; if so, adding that earliest-stored vertex to the vertex index sequence table and making the next-earliest-stored vertex the earliest-stored vertex; repeating this step until the earliest-stored vertex at the time of the actual replacement is not a vertex of this primitive;
Making the next vertex of the primitive the current vertex.
Optionally, adding a vertex to the vertex index sequence table comprises at least:
When the vertex index sequence table is empty, the vertex becomes the first vertex of the table; when the table is not empty, the vertex is appended to the end of the table.
Optionally, determining the number n of vertices to be replaced for the primitive comprises at least:
Setting n for this primitive to an initial value;
When the number of vertices in the vertex index sequence table obtained after reordering the vertices of this primitive is greater than or equal to the number of vertices the cache can hold, taking any vertex of the next primitive as the current vertex and repeating the following steps until every vertex of the next primitive in primitive reading order has been processed, which yields n for this primitive:
If the current vertex is not in the cache at the time this primitive is actually output, incrementing n by 1;
After the increment, determining whether the vertex stored earliest in the cache at the time of the actual vertex replacement is a vertex of the next primitive; if so, incrementing n by 1 and making the next-earliest-stored vertex the earliest-stored vertex; repeating this step until the earliest-stored vertex at the time of the actual replacement is not a vertex of the next primitive;
Making the next vertex of the next primitive the current vertex.
Optionally, each primitive in the primitive string to be drawn is preprocessed with the preprocessing method.
The present invention also provides a primitive processing method, comprising: obtaining the data of each vertex from the cache according to the index value in the cache of each vertex of the primitive, and outputting the primitive; and, when n for the primitive is not the initial value, replacing n vertices in the cache with n vertices from the vertex index sequence table.
Optionally, when the primitive is the first primitive, the method further comprises, before outputting the primitive and replacing n vertices in the cache: starting from the first vertex of the vertex index sequence table, fetching vertex data one vertex at a time according to the vertex index and storing it in the cache until the cache is full; and making the first not-yet-read vertex in the vertex index sequence table the current vertex.
Optionally, replacing n vertices in the cache with n vertices from the vertex index sequence table comprises at least:
Starting from the current vertex in the vertex index sequence table, reading n vertices one at a time and using the data of these n vertices to replace the data of the n vertices stored earliest in the cache, adjusting which vertex counts as the earliest-stored vertex;
Making the first not-yet-read vertex in the vertex index sequence table the current vertex.
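The processing method above can be sketched as a short replay loop. This is a minimal sketch, assuming a first-in-first-out replacement order and assuming that the three preprocessing results (per-primitive cache index values, per-primitive n, and the vertex index sequence table) are already available; the function name `draw_primitive_string`, its parameters, and the example tables are hypothetical and chosen by hand for a cache of four vertices.

```python
def draw_primitive_string(prim_slots, replace_counts, sequence, memory, capacity):
    """Replay a preprocessed primitive string with no tag comparison.

    prim_slots[k]     : cache index value of each vertex of primitive k
    replace_counts[k] : n, vertices to replace after primitive k is output
    sequence          : vertex index sequence table (order of entry into cache)
    memory            : maps a vertex index to its vertex data
    capacity          : number of vertices the cache can hold
    """
    cache = [None] * capacity
    cursor = 0   # current vertex in the sequence table
    oldest = 0   # cache slot holding the earliest-stored vertex

    # First primitive: fill the cache from the head of the sequence table.
    while cursor < len(sequence) and cursor < capacity:
        cache[cursor] = memory[sequence[cursor]]
        cursor += 1

    output = []
    for slots, n in zip(prim_slots, replace_counts):
        # Output: read vertex data straight from the precomputed slots.
        output.append(tuple(cache[s] for s in slots))
        # Replace: the next n table entries overwrite the n oldest slots.
        for _ in range(n):
            cache[oldest] = memory[sequence[cursor]]
            cursor += 1
            oldest = (oldest + 1) % capacity
    return output


# Hand-made example: four triangles over vertices 0..5.
memory = {i: f"v{i}" for i in range(6)}
sequence = [0, 1, 2, 3, 4, 5]
prim_slots = [(0, 1, 2), (1, 2, 3), (2, 3, 0), (3, 0, 1)]
replace_counts = [0, 1, 1, 0]
drawn = draw_primitive_string(prim_slots, replace_counts, sequence, memory, 4)
print(drawn)
```

In this example the replay yields the triangles (v0, v1, v2), (v1, v2, v3), (v2, v3, v4) and (v3, v4, v5), and at no point does the loop ask whether a vertex is present in the cache: the precomputed tables already encode that answer.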
Optionally, the cache is divided into a reserved space and an output space; the reserved space stores the vertex data needed for replacement, and the output space stores the vertex data needed to output primitives;
While a primitive is being output, the vertex data needed for replacement is prefetched in parallel from the vertex index sequence table and stored in the reserved space;
During replacement, vertex data is read from the reserved space and replaces vertex data in the output space of the cache.
The present invention also provides a graphics processing method, comprising: processing each primitive in the primitive string in turn, in the order of the primitives in the string, with any of the primitive processing methods described above.
Optionally, when the drawing is a first drawing, the method further comprises, before processing each primitive in the primitive string in turn with any of the primitive processing methods: preprocessing each primitive in the primitive string with any of the primitive preprocessing methods described above, to obtain the index value in the cache of each vertex of each primitive, the vertex index sequence table, and n for each primitive.
The present invention also provides a primitive preprocessor, comprising:
A simulation unit, for simulating the actual output process of the primitive, to obtain the vertices in the cache when the primitive is actually output and the order in which the vertices of the primitive enter the cache;
An index value unit, for obtaining, from the vertices in the cache when the primitive is actually output, the index value in the cache of each vertex of the primitive;
A reordering unit, for reordering the vertices of the primitive according to the order in which they enter the cache, to obtain the vertex index sequence table;
A vertex count unit, for determining, from the vertices in the cache when the primitive is actually output and the vertices of the next primitive, the number n of vertices to be replaced for the primitive.
The present invention also provides a primitive processor, comprising:
A first-primitive setup unit, for fetching vertex data one vertex at a time according to the vertex index, starting from the first vertex of the vertex index sequence table, and storing it in the cache until the cache is full, and then making the first not-yet-read vertex in the vertex index sequence table the current vertex;
An output unit, for reading the data of each vertex from the cache according to the index value in the cache of each vertex of the primitive, and outputting the primitive;
A replacement unit, for replacing n items of vertex data in the cache with n items of vertex data from the vertex index sequence table when n for the primitive is not 0.
Optionally, the primitive processor further comprises: a prefetch unit, for prefetching the vertex data needed for replacement from the vertex index sequence table in parallel while a primitive is being output.
The present invention also provides a graphics processing device, comprising: a receiving unit, for receiving the primitive information of the primitive string to be drawn, comprising vertex data and a primitive information table, the primitive information table containing the vertex indexes that make up each primitive; the primitive information is stored in memory;
The primitive preprocessor described above, for preprocessing each primitive in the primitive string;
The primitive processor described above, for drawing each primitive in the primitive string.
Compared with the prior art, the present invention has the following advantages:
1. Because the drawing order of the primitives is known in advance, the number of vertices to be replaced can be calculated beforehand, and the vertices to be replaced can be sorted in the order in which they are brought into the cache. During replacement, vertices are always read from the current position in this sequence and, following a fixed replacement policy, replace vertices in the cache. This eliminates the tag of a conventional cache and saves chip area.
2. Because the tag is eliminated, no tag comparison is needed during replacement, which saves a large amount of control logic, reduces the design complexity of the chip, and reduces its power consumption.
Brief description of the drawings
Fig. 1 is a schematic diagram of the running environment of the present invention;
Fig. 2 is a flow chart of an embodiment of the primitive preprocessing method of the present invention;
Fig. 3 is a flow chart of an embodiment of the reordering step in the primitive preprocessing method of the present invention;
Fig. 4 is a flow chart of an embodiment of the step of determining the number of vertices to be replaced in the primitive preprocessing method of the present invention;
Fig. 5 is a flow chart of an embodiment of the primitive processing method of the present invention;
Fig. 6 is a flow chart of an embodiment of the graphics processing method of the present invention;
Fig. 7 shows a figure drawn with an embodiment of the graphics processing method of the present invention;
Fig. 8 is a structural diagram of an embodiment of the primitive preprocessor of the present invention;
Fig. 9 is a structural diagram of an embodiment of the primitive processor of the present invention;
Fig. 10 is a structural diagram of an embodiment of the graphics processing device of the present invention.
Detailed description
Many specific details are set forth in the following description to provide a full understanding of the present invention. The invention can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its substance; the invention is therefore not limited by the specific embodiments disclosed below.
As noted above, because the cache's capacity is limited, it cannot hold all vertex data, so when the cache is full data must be replaced according to some policy. Since the cache capacity available in the prior art cannot yet meet the needs of graphics drawing, this replacement of vertex data remains unavoidable in graphics processing.
The prior art needs tags to manage vertex data dynamically because the graphics chip does not know the exact order of the vertex data stream in advance; that is, it does not know the order in which the primitives will be drawn. If the order of the primitives in the primitive string to be drawn could be determined in advance, the order of the vertex data stream and the behavior of the vertex data could be predicted, and such complex dynamic management of vertex data would no longer be needed. In practice, it is very common for the primitive string to be known in advance. A string of primitives in a scene is often drawn by the graphics chip many times; in animation, for example, each frame contains the same figure and only its position changes. The same stream of vertex data is read repeatedly by the graphics chip, the primitives do not change from one drawing to the next, the order of the vertex data stream does not change either, and the behavior of the vertex data can be predicted.
The present invention exploits exactly this fixed primitive drawing order. With the preprocessing method of the invention, the number of vertices to be replaced is calculated in advance, and the vertices needed for replacement are sorted in the order in which they enter the cache. During replacement, the processing method of the invention always reads vertices from the current position in this sequence and, following the fixed replacement policy, replaces vertices in the cache. The tag of a conventional cache is no longer needed, which saves chip area and reduces the design complexity and power consumption of the graphics processing chip.
As described earlier, when a complex figure is drawn, it is decomposed as far as possible into primitives with few vertices, such as triangles and quadrilaterals, for ease of drawing. The maximum number of vertices of a single primitive is therefore generally not large, and the number of vertices that current caches can hold is generally much greater than it. In other words, the following situation does not arise: the maximum vertex count of some primitive exceeds the number of vertices the cache can hold, so that even with the cache completely full some vertices of the primitive have not been read into the cache and the primitive cannot be output. If that situation did arise, neither the prior-art processing method nor the present method could work.
Some terms used in the present invention are explained first.
Cache: a high-speed, small-capacity memory between the central processing unit and main memory. It comprises at least a cache memory bank, an address translation component and a replacement component. The cache memory bank holds instruction and data blocks brought in from main memory. The address translation component maintains a directory table to translate main-memory addresses into cache addresses. The replacement component replaces data blocks according to some policy when the cache is full, and updates the address translation component. The directory table contains the index values in the cache; using an index value, the actual storage location of data in the cache can be found.
Replacement policy: when data needed by the processor is not in the cache and no free location remains in the cache, some data in the cache must be evicted to make room for the newly loaded data; this is called replacement. The rule that decides what to replace is the replacement policy. Common replacement policies include least recently used (LRU), first in first out (FIFO) and random (RAND).
Vertex data: comprises at least a vertex index and the actual data values. From the vertex index, the actual physical address at which the vertex data is stored can be found, in order to read the data values. The vertex index occupies little storage. The storage occupied by the data values varies with the number of attributes they contain, but is in every case much larger than that of the vertex index; the values may comprise multiple attributes and numbers, and can be read selectively.
Primitive information: comprises at least which vertices make up the primitive, and the vertex indexes of those vertices. Using those vertex indexes, the vertex data can be read when needed.
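To make the difference between two of these policies concrete, the following sketch counts misses for the same access trace under FIFO and under LRU. It is an illustration only: the function names and the trace are made up for the example, and neither function is taken from the patent.

```python
from collections import OrderedDict, deque


def fifo_misses(trace, capacity):
    """Count misses when the earliest-stored entry is always evicted."""
    cached, order, misses = set(), deque(), 0
    for vertex in trace:
        if vertex in cached:
            continue                      # hit: FIFO order is unchanged
        misses += 1
        if len(cached) == capacity:
            cached.discard(order.popleft())
        cached.add(vertex)
        order.append(vertex)
    return misses


def lru_misses(trace, capacity):
    """Count misses when the least recently used entry is evicted."""
    cached, misses = OrderedDict(), 0
    for vertex in trace:
        if vertex in cached:
            cached.move_to_end(vertex)    # hit: mark as most recently used
            continue
        misses += 1
        if len(cached) == capacity:
            cached.popitem(last=False)    # evict the least recently used
        cached[vertex] = True
    return misses


trace = [0, 1, 2, 0, 3, 0]
print(fifo_misses(trace, 3))  # 5
print(lru_misses(trace, 3))   # 4
```

Because the two policies evict different entries, any table derived by simulating the cache depends on which policy is chosen.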
To achieve the above objectives, the invention provides a primitive preprocessing method, a primitive processing method and a graphics processing method.
Fig. 1 is a schematic diagram of the running environment of the present invention. As shown in Fig. 1, the central processing unit (CPU) first reads the data it needs from the cache. Only when the needed data is not in the cache is it transferred from main memory into the cache and then read from the cache. The processing power that the invention requires of the CPU varies with the complexity of the graphics being processed, so it is not specifically limited here. The running environment must, however, at least provide a cache of a certain capacity and a memory much larger than the cache. The capacity of the cache should at least exceed the maximum number of vertices of a single primitive in the primitive string being drawn, so that a single primitive can be output. Besides holding all vertex data and primitive information, the memory also needs enough capacity to store the results of the preprocessing of the invention, comprising at least the vertex index sequence table, the index value in the cache of each vertex of each primitive, and the number of vertices to be replaced for each primitive.
Fig. 2 is a flow chart of an embodiment of the primitive preprocessing method of the present invention, which comprises at least the following steps:
Perform step S201: simulate the actual output process of the primitive, to obtain the vertices in the cache when the primitive is actually output and the order in which the vertices of the primitive enter the cache.
Perform step S202: obtain the index value in the cache of each vertex of the primitive. That is, from the vertices in the cache when the primitive is actually output, obtain the index value in the cache of each vertex of the primitive; there is a one-to-one correspondence between index values in the cache and vertices in the cache. Because the drawing order of the primitives is known, the chosen vertex replacement policy and the simulation of step S201 make it possible to foresee which vertices are in the cache when the primitive is output. Even if a vertex needed by this primitive was not previously in the cache, it will have been brought in by replacement after the preceding primitive was output, so all vertices the primitive needs are guaranteed to be in the cache at this point. This step must provide the index value in the cache of each vertex of the primitive, so that the later output step can read the vertex data in the cache by index value.
Perform step S203: reorder the vertices of the primitive according to the order in which they enter the cache, to obtain the vertex index sequence table. The vertex index sequence table stores the index of each vertex's data, from which the actual physical address of the vertex data, and hence the data of the vertex, can be obtained. Because the drawing order of the primitives is known in advance, the chosen vertex replacement policy makes it possible to predict the order in which vertices are brought into the cache. With the reordering of this step, the later vertex replacement step can simply read the vertices to be replaced, in order, from the vertex index sequence table that this step produces. Note that the vertex index sequence table produced by this step differs according to the vertex replacement policy chosen.
Perform step S204: from the vertices in the cache when the primitive is actually output and the vertices of the next primitive, determine the number n of vertices to be replaced for the primitive, n being a natural number. The number of vertices to be replaced for this primitive is the number of vertices of the next primitive that are not in the cache and must be brought in during the later replacement step. Note that the number of vertices to be replaced in the cache, as produced by this step, differs according to the vertex replacement policy chosen.
Note that the results produced by each step in Fig. 2 are all stored in memory. When later steps need these results, they are read from memory and do not take up cache space.
Note also that, although in this embodiment preprocessing proceeds by obtaining the index value in the cache of each vertex of the primitive in step S202, reordering in step S203 and determining the number of vertices to be replaced in step S204, the order of steps S202, S203 and S204 is in fact unimportant; it is only necessary that these steps be completed before the primitive is actually output.
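The overall idea of steps S201 to S204 can be sketched as a single simulation pass. The sketch below is a simplification under stated assumptions, not the flow charts of Figs. 3 and 4: it assumes a FIFO replacement policy, it fills the cache on demand rather than filling it to capacity before the first primitive, and it reports the loads needed by the first primitive separately as `initial_fill`. The function name `preprocess` and all variable names are hypothetical.

```python
def preprocess(primitives, capacity):
    """Simulate FIFO vertex caching for a primitive string known in advance.

    Returns (sequence, prim_slots, initial_fill, replace_counts):
      sequence       : vertex indexes in the order they enter the cache
      prim_slots     : for each primitive, the cache slot of each vertex
      initial_fill   : loads needed before the first primitive is output
      replace_counts : for each primitive, loads needed before the next one
    """
    slot_of = {}                    # vertex index -> cache slot
    vertex_in = [None] * capacity   # cache slot -> vertex index
    oldest = 0                      # slot of the earliest-stored vertex
    sequence, prim_slots, loads_before = [], [], []

    for prim in primitives:
        if len(set(prim)) > capacity:
            raise ValueError("primitive has more vertices than the cache holds")
        loads = 0
        missing = [v for v in prim if v not in slot_of]
        while missing:
            vertex = missing[0]
            evicted = vertex_in[oldest]
            if evicted is not None:
                del slot_of[evicted]   # may evict a vertex of this primitive
            vertex_in[oldest] = vertex
            slot_of[vertex] = oldest
            oldest = (oldest + 1) % capacity
            sequence.append(vertex)
            loads += 1
            # An evicted vertex of this primitive shows up as missing again
            # and is appended to the sequence a second time.
            missing = [v for v in prim if v not in slot_of]
        prim_slots.append(tuple(slot_of[v] for v in prim))
        loads_before.append(loads)

    initial_fill = loads_before[0] if loads_before else 0
    replace_counts = loads_before[1:] + [0] if loads_before else []
    return sequence, prim_slots, initial_fill, replace_counts


triangles = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
sequence, prim_slots, initial_fill, replace_counts = preprocess(triangles, 4)
print(sequence)        # [0, 1, 2, 3, 4, 5]
print(prim_slots)      # [(0, 1, 2), (1, 2, 3), (2, 3, 0), (3, 0, 1)]
print(replace_counts)  # [1, 1, 1, 0]
```

With a cache of three vertices and the primitives (0, 1, 2) and (2, 3, 0), the same function returns the sequence [0, 1, 2, 3, 0]: loading vertex 3 evicts vertex 0, which the second primitive still needs, so vertex 0 enters the sequence a second time. This corresponds to the case in which the earliest-stored vertex is itself a vertex of the primitive being prepared.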
Fig. 3 is a kind of process flow diagram of embodiment of step of reordering in pel preprocess method of the present invention.Composition graphs 3, illustrates the step S203 that reorders in Fig. 2.
Perform step S301, arbitrary summit of getting this pel is current vertex;
Perform step S302, judge that whether current vertex processed? if so, then reorder and complete in each summit of current pel.
If not, then show that this pel also has summit not processed.
Perform step S303, judge whether the summit quantity in vertex index sequence table is less than or equal to high-speed cache open ended summit quantity.If so, then perform S304a, judge current vertex whether in vertex index sequence table.If so, then perform step S312, next summit of this pel is adjusted to current vertex, then circulate from step S302, continue next summit of this pel of process.If not, then step S305 is performed.
If the summit quantity in vertex index sequence table is greater than high-speed cache open ended summit quantity, then perform step S304b, do you judge in the high-speed cache of current vertex when reality exports a upper pel of this pel? if, then perform step S312, next summit of this pel is adjusted to current vertex, then circulate from step S302, continue next summit of this pel of process.
If not, the current vertex was not in the cache at that time and will later have to be loaded into the cache by replacement. Step S305 is performed: it is judged whether the vertex index sequence table is empty. If so, step S306 is performed: the current vertex becomes the first vertex of the vertex index sequence table. If not, step S307 is performed: the current vertex is appended to the end of the vertex index sequence table, thereby updating the table.
Step S308 is performed: it is judged whether the number of vertices in the vertex index sequence table, after the current vertex has been added, is greater than the number of vertices the cache can hold. If not, the cache still has free space and the current vertex can be stored directly without any replacement. Step S312 is performed: the next vertex of the primitive becomes the current vertex, and the flow loops back to step S302 to process that vertex.
If so, a replacement is needed in the cache. Step S309 is performed: it is judged whether the vertex that was stored in the cache earliest at the time the vertex data replacement is actually carried out is a vertex of this primitive. The replacement policy adopted in this embodiment is that, when the cache is full, the vertex stored in the cache earliest is always the one replaced. It is therefore necessary to judge here whether the vertex being replaced is itself a vertex of this primitive. If so, that vertex must be stored in the cache again after it has been replaced. Step S310 is performed: the vertex stored in the cache earliest is appended to the end of the vertex index sequence table. Step S311 is performed: the vertex stored in the cache second earliest becomes the vertex stored in the cache earliest.
If not, that vertex is not a vertex of this primitive, and there is no need to consider storing it in the cache again. Step S312 is performed: the next vertex of the primitive becomes the current vertex, and the flow loops back to step S302 to process that vertex.
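The reordering flow described above can be sketched in Python as a simulation of a first-in-first-out cache. This is a minimal sketch under stated assumptions rather than a step-by-step transcription of Fig. 3: the name `build_sequence_table`, its parameters and the use of plain integers as vertex indices are illustrative, and the sketch handles a whole primitive string in one call instead of one primitive at a time.

```python
def build_sequence_table(primitives, capacity):
    """Simulate a FIFO vertex cache and record the order in which vertices are loaded.

    primitives -- iterable of tuples of vertex indices (hypothetical input format)
    capacity   -- number of vertices the cache can hold
    """
    table = []      # vertex index sequence table being built
    cache = []      # cache slots; each slot holds one vertex index
    oldest = 0      # slot that the FIFO policy replaces next

    def load(vertex):
        """Append `vertex` to the table and cache it; return the evicted vertex or None."""
        nonlocal oldest
        table.append(vertex)
        if len(cache) < capacity:           # free space left: no replacement needed
            cache.append(vertex)
            return None
        evicted = cache[oldest]
        cache[oldest] = vertex
        oldest = (oldest + 1) % capacity
        return evicted

    for prim in primitives:
        if len(set(prim)) > capacity:
            raise ValueError("a primitive must fit in the cache")
        for vertex in prim:
            if vertex in cache:
                continue
            evicted = load(vertex)
            # A replaced vertex that this primitive still needs is loaded again.
            while evicted in prim:
                evicted = load(evicted)
    return table
```

For the nine primitives of Table 1 in the worked example later in this description and a capacity of 8, this sketch returns 0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 9, 1, which agrees with the table stated there.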
Fig. 4 is a flow chart of one embodiment of the step of giving the number of vertices to be replaced in the primitive preprocessing method of the present invention. Step S204 of Fig. 2 is described below with reference to Fig. 4.
Step S401 is performed: the number n of vertices of the primitive to be replaced is set to 0. That is, the number of vertices to be replaced defaults to 0; the subsequent steps determine the actual number and change this value.
Step S402 is performed: it is judged whether the number of vertices in the vertex index sequence table obtained after the vertices of the primitive have been reordered is greater than or equal to the number of vertices the cache can hold. If not, the cache still has free space, no replacement is needed, and the number of vertices to be replaced remains 0.
If so, the cache is completely full, and storing any new data requires a replacement. Step S403 is performed: any vertex of the next primitive is taken as the current vertex. Step S404 is performed: it is judged whether the current vertex has already been processed. If so, the number of vertices to be replaced in the cache for this primitive has been determined.
If not, step S405 is performed: it is judged whether the current vertex is in the cache at the time this primitive is actually output. If so, no replacement is needed; step S410 is performed: the next vertex of the next primitive becomes the current vertex. The flow loops back to step S404 to process that vertex.
If not, this vertex has to be loaded by replacement, and the number n of vertices to be replaced in the cache for this primitive must be adjusted. Step S406 is performed: the number n of vertices of the primitive to be replaced is increased by 1.
Step S407 is performed: it is judged whether the vertex that was stored in the cache earliest at the time the vertex data replacement is actually carried out is a vertex of the next primitive. The replacement policy adopted in this embodiment is that, when the cache is full, the vertex stored in the cache earliest is always the one replaced. It is therefore necessary to judge here whether the vertex being replaced is itself a vertex of the next primitive. If not, that vertex does not need to be loaded again; step S410 is performed: the next vertex of the next primitive becomes the current vertex. The flow loops back to step S404 to process that vertex.
If so, that vertex has to be loaded again by replacement, and the number n of vertices to be replaced in the cache for this primitive must be adjusted. That is, step S408 is performed: the number n of vertices of the primitive to be replaced is increased by 1. Step S409 is performed: the vertex stored in the cache second earliest becomes the vertex stored in the cache earliest. The flow loops back to step S407 until the vertex stored in the cache earliest at the time of the actual vertex data replacement is not a vertex of the next primitive. Step S410 is then performed: the next vertex of the next primitive becomes the current vertex. The flow loops back to step S404 to process that vertex.
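The counting of steps S401 to S410 can be illustrated with a short Python sketch. It assumes the cache is already full (the "yes" branch of step S402) and uses the earliest-stored replacement policy of this embodiment; the function name `count_replacements` and its parameters are hypothetical, and the sketch tracks the replaced slot explicitly instead of following the flow chart box by box.

```python
def count_replacements(cache, oldest, next_prim):
    """Return n, the number of cache entries replaced before `next_prim` is output.

    cache     -- vertex indices held by the full cache when this primitive is output
    oldest    -- slot of the vertex stored earliest (the one replaced first)
    next_prim -- vertex indices of the next primitive
    """
    slots = list(cache)                     # work on a copy; the caller's cache is untouched
    if len(set(next_prim)) > len(slots):
        raise ValueError("the next primitive must fit in the cache")
    n = 0                                   # S401: by default nothing is replaced
    for vertex in next_prim:                # S403, S404, S410: walk the next primitive
        if vertex in slots:                 # S405: already cached, no replacement
            continue
        pending = vertex
        while pending is not None:
            evicted = slots[oldest]
            slots[oldest] = pending         # the pending vertex takes the oldest slot
            oldest = (oldest + 1) % len(slots)
            n += 1                          # S406 / S408: one more replacement
            # S407: a replaced vertex that the next primitive needs is loaded again
            pending = evicted if evicted in next_prim else None
    return n
```

With a cache holding vertices 0 to 7, the earliest-stored vertex in slot 0, and a next primitive with vertices 0, 7, 8, the sketch returns 2: vertex 8 replaces vertex 0, and vertex 0 is then loaded again in place of vertex 1.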
It should be noted that different replacement policies give different vertex index sequence tables in the reordering step S203 of Fig. 2. Accordingly, the number n of vertices of the primitive to be replaced obtained in step S204 of Fig. 2 may also differ. The replacement policy adopted in this embodiment is that, when the cache is full, the vertex stored in the cache earliest is always replaced; this should not be taken to mean that the method is limited to this policy. In fact, provided the reordering and the determination of the number of vertices to be replaced are made consistent with the replacement policy, any replacement policy of a conventional cache can be used with this method. For example: when the cache is full, the vertex that entered the cache last is always replaced. Or the vertex in the cache with identical low-order bits is replaced; with the cache of capacity 8 vertices described above, when vertex 8 (low-order bits 000) enters the cache, it replaces vertex 0 (low-order bits also 000). Or the vertices in the cache are ranked by frequency of use, and the least frequently used vertex is replaced each time. Or a hash value is chosen arbitrarily and one vertex is replaced at intervals of that hash value, and so on.
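As an illustration of the low-order-bit policy mentioned above, the following Python sketch selects the replaced slot from the low-order bits of the vertex index. The helper name `slot_for` and the requirement that the capacity be a power of two are assumptions made for this sketch only.

```python
def slot_for(vertex_index, capacity=8):
    """Return the cache slot selected by the low-order bits of a vertex index."""
    if capacity <= 0 or capacity & (capacity - 1):
        raise ValueError("capacity must be a power of two")
    return vertex_index & (capacity - 1)    # keep only the low-order bits


cache = list(range(8))          # slots 0..7 hold vertices 0..7
cache[slot_for(8)] = 8          # vertex 8 (low bits 000) replaces vertex 0 (low bits 000)
print(cache)                    # [8, 1, 2, 3, 4, 5, 6, 7]
```

Whichever policy is chosen, the preprocessing simulation and the actual replacement must apply the same rule, since the sequence table is only valid for the policy it was built with.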
Fig. 5 is a flow chart of one embodiment of the primitive processing method of the present invention, which comprises at least the following steps:
Step S501 is performed: it is judged whether this primitive is the first primitive. If not, no initial setup is required and output of the primitive can start directly, continuing from step S504.
If so, the initial setup steps are required. Step S502 is performed: starting from the first vertex of the vertex index sequence table, vertices are read one by one, the vertex data is obtained according to the index of the vertex data and stored in the cache, until the cache is full. Step S503 is performed: the first vertex in the vertex index sequence table that has not yet been read becomes the current vertex.
Step S504 is performed: the primitive is output, that is, the data of each vertex is obtained from the cache according to the index value of each vertex of the primitive in the cache, and the primitive is output. The index values of the vertices of the primitive in the cache were given by the earlier preprocessing, and the vertices of this primitive that were not in the cache were loaded during the replacement step of the previous primitive. This ensures that all vertices of the primitive are in the cache when it is output, so they can be read directly according to their index values in the cache and output.
After output, step S505 is performed: it is judged whether the number n of vertices of the primitive to be replaced is 0. If so, every vertex of the next primitive is already in the cache and no replacement is needed.
If not, n vertices of the next primitive are not yet in the cache and a replacement is needed. Step S506 is performed: n vertex data items in the cache are replaced with n vertex data items from the vertex index sequence table. Specifically, starting from the current vertex of the vertex index sequence table, n vertices are read one by one, and their data replaces the data of the n vertices stored in the cache earliest. The vertex index sequence table was obtained by preprocessing, which ordered the vertices by the sequence in which they are loaded into the cache. The n vertices starting at the current position of the table are therefore exactly the n vertices of the next primitive that are not in the cache, and the replacement can be carried out directly. After the replacement is complete, the first vertex in the vertex index sequence table that has not yet been read becomes the current vertex.
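Steps S504 to S506 for a single primitive can be sketched in Python as follows. The names `process_primitive`, `fetch_vertex` and the `state` dictionary are hypothetical; in particular `fetch_vertex` is only a stand-in that returns the index itself, where a real device would read the vertex data from memory.

```python
def fetch_vertex(index):
    """Stand-in for reading vertex data from memory (assumption: data == index)."""
    return index


def process_primitive(cache, state, slots, n, table):
    """Output one primitive, then replace n cache entries for the next primitive.

    cache -- list of cached vertex data, modified in place
    state -- dict with 'oldest' (slot replaced next) and 'current'
             (position of the first unread entry of the sequence table)
    slots -- index value in the cache of each vertex of this primitive
    n     -- number of vertices to replace after output
    table -- vertex index sequence table produced by preprocessing
    """
    output = [cache[s] for s in slots]              # S504: direct read, no tag comparison
    for _ in range(n):                              # S505/S506: nothing happens when n == 0
        cache[state["oldest"]] = fetch_vertex(table[state["current"]])
        state["oldest"] = (state["oldest"] + 1) % len(cache)
        state["current"] += 1
    return output


table = [0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 9, 1]
cache = [fetch_vertex(i) for i in table[:8]]        # state after the initial setup
state = {"oldest": 0, "current": 8}
print(process_primitive(cache, state, (0, 6, 7), 2, table))   # [0, 6, 7]
print(cache)                                                   # [8, 0, 2, 3, 4, 5, 6, 7]
```

The output step never checks whether a vertex is present; it relies entirely on the index values and the count n supplied by preprocessing.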
Optionally, the cache may also be divided into a reserved space and an output space: the reserved space stores the vertex data needed for replacement, and the output space stores the vertex data needed to output primitives. For example, in a cache with a capacity of 8 vertices, 7 vertices may serve as output space, storing the vertices needed to output primitives, and 1 as reserved space; or 6 vertices may serve as output space and 2 as reserved space. The actual split can be chosen freely according to practical requirements, within the limits of the cache capacity. While the graphics processing chip is outputting a primitive, the vertex data needed for replacement can be read in parallel from the vertex index sequence table and stored in the reserved space; at replacement time, the vertex data is read from the reserved space and replaces vertex data in the output space of the cache. Operating in parallel in this way further improves the execution efficiency of the graphics processing chip.
It should be noted that, when a string of primitives is drawn, outputting primitives and replacing vertex data form a repeating cycle, so the order of the two is unimportant. That is, the vertex data for the current primitive may be replaced and the current primitive then output, and so on; or the current primitive may be output and the vertex data for the next primitive then replaced, and so on.
Fig. 6 is a flow chart of one embodiment of the graphic processing method of the present invention, which comprises at least the following steps:
Step S601 is performed: it is judged whether this primitive string is being drawn for the first time. If not, preprocessing was already carried out during the earlier drawing; because the same primitive string is being drawn, preprocessing it again would give the same result as the first time. The earlier preprocessing result can therefore be used directly, and the flow continues from step S604.
It should be noted that, for the sake of uniform processing, repeating the preprocessing steps on a redraw is not prohibited; it merely lowers processing efficiency accordingly. The advantage of the graphic processing method of the present invention is more pronounced when the same primitive string is drawn repeatedly.
If the string is being drawn for the first time, step S602 is performed: it is judged whether every primitive in the primitive string has been preprocessed. If so, preprocessing is complete and step S604 is performed; if not, step S603 is performed: the primitive is preprocessed with the primitive preprocessing method. The flow then loops back to step S602 until every primitive in the primitive string has been preprocessed.
Step S604 is performed: it is judged whether every primitive in the primitive string has been processed, that is, whether each primitive has been output and the vertex data for the output of the next primitive has been replaced. If so, drawing of the figure ends.
If not, step S605 is performed: the primitive is processed with the primitive processing method, that is, the primitive is output and the vertex data is replaced for the output of the next primitive. The flow then loops back to step S604 until every primitive in the primitive string has been processed and drawing of the figure is complete.
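The control flow of steps S601 to S605 can be sketched in Python as below. Everything named here is hypothetical: `draw`, the `results` dictionary used to remember earlier preprocessing results, and the `preprocess` and `process` callables, which stand for the primitive preprocessing method and the primitive processing method described above.

```python
def draw(string_id, primitives, results, preprocess, process):
    """Draw one primitive string, preprocessing it only on the first draw.

    string_id  -- key identifying the primitive string
    primitives -- the primitives of the string, in drawing order
    results    -- dict that keeps preprocessing results between draws
    preprocess -- callable(primitive) -> preprocessing result for that primitive
    process    -- callable(primitive, result) that outputs the primitive and
                  replaces vertex data for the next one
    """
    if string_id not in results:                                    # S601: first draw?
        results[string_id] = [preprocess(p) for p in primitives]    # S602/S603
    for primitive, result in zip(primitives, results[string_id]):   # S604/S605
        process(primitive, result)
```

Keeping the results keyed by primitive string is one possible way to reuse them on a redraw; the description leaves the storage mechanism open.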
It should be noted that those skilled in the art will appreciate that all or part of the primitive preprocessing method, primitive processing method and graphic processing method of the above embodiments may be carried out by hardware under the instruction of a program. The program may be stored in a computer-readable storage medium, such as a ROM, a RAM, a magnetic disk or an optical disc.
A specific example is given below so that the method can be understood more intuitively. Fig. 7 shows the figure to be drawn in this example. The figure has been split into 9 primitives numbered ① to ⑨, each containing 3 vertices. The primitive information is given in primitive information Table 1 below:
Primitive number Vertex index Vertex index Vertex index
① 0 1 2
② 0 2 3
③ 0 3 4
④ 0 4 5
⑤ 0 5 6
⑥ 0 6 7
⑦ 0 7 8
⑧ 0 8 9
⑨ 0 9 1
Table 1
The cache in this example can hold the data of at most 8 vertices.
First, the primitive string is preprocessed.
Primitive ① is the first primitive. The vertices in the cache at the time primitive ① is actually output are simulated. Because it is the first primitive, the cache at output time is in the state left by the completed initial setup, that is, it holds 8 vertices: 0, 1, 2, 3, 4, 5, 6, 7. The index values of the vertices of this primitive in the cache are 0, 1, 2. Since primitive ① is the first primitive, the vertex index sequence table is empty when its first vertex 0 is processed, so vertex 0 becomes the first vertex of the vertex index sequence table. When the second vertex 1 is processed, the table holds 1 vertex (vertex 0), fewer than the 8 vertices the cache can hold, so 1 is appended after 0 and the table is updated. When the third vertex 2 is processed, the table holds two vertices, 0 and 1, still fewer than the 8 vertices the cache can hold; 2 is appended after 1, giving the new vertex index sequence table: 0, 1, 2. The number n of vertices to be replaced is then set to 0. Because the table now holds 3 vertices, fewer than the 8 the cache can hold, the cache still has free space, no replacement is needed, and n remains 0.
Primitive ② is processed next. Since no vertex replacement was needed after primitive ① was output, the cache still holds the original 8 vertices: 0, 1, 2, 3, 4, 5, 6, 7. The index values of the vertices of primitive ② in the cache are 0, 2, 3. The vertex index sequence table now holds 3 vertices, fewer than the 8 the cache can hold, and primitive ② has one vertex, vertex 3, that is not in the table; 3 is appended after 2, giving the new vertex index sequence table: 0, 1, 2, 3. The number n of vertices to be replaced is then set to 0. Because the table now holds 4 vertices, fewer than the 8 the cache can hold, the cache still has free space, no replacement is needed, and n remains 0.
Primitives ③, ④ and ⑤ are preprocessed in a similar way, which is not repeated here.
Primitive ⑥ is processed next. None of the preceding primitives required vertex data to be replaced after output, so the cache still holds the 8 vertices 0, 1, 2, 3, 4, 5, 6, 7. The index values of the vertices of primitive ⑥ in the cache are 0, 6, 7. The vertex index sequence table now holds 7 vertices, namely 0, 1, 2, 3, 4, 5, 6, fewer than the 8 the cache can hold. Primitive ⑥ has one vertex, vertex 7, that is not in the current table; 7 is appended to the end of the table, giving the new vertex index sequence table: 0, 1, 2, 3, 4, 5, 6, 7. The number of vertices to be replaced is then set to 0. The table now holds 8 vertices, equal to the 8 the cache can hold, so replacement has to be considered. The next primitive ⑦ has 3 vertices: 0, 7, 8. Of these, 0 and 7 are in the cache when primitive ⑥ is actually output; only vertex 8 is not, so the number n of vertices to be replaced in the cache for primitive ⑥ changes from 0 to 1. At this point the vertex stored in the cache earliest is vertex 0; that is, when vertex 8 enters the cache it replaces vertex 0, and vertex 0 is itself a vertex of the next primitive ⑦. Vertex 0 therefore has to be loaded again by replacement, so n for primitive ⑥ is increased by 1 again, from 1 to 2. Vertex 1, stored in the cache second earliest, becomes the vertex stored in the cache earliest. It is judged again whether vertex 1 is a vertex of the next primitive ⑦. It is not, so the number n of vertices to be replaced in the cache for primitive ⑥ is 2. That is, after primitive ⑥ is actually output, two vertex replacements take place: in the first, vertex 8 replaces vertex 0; in the second, vertex 0 replaces vertex 1.
Primitive ⑦ is processed next. When it is output, the cache holds vertices 8, 0, 2, 3, 4, 5, 6, 7. The index values of the vertices of primitive ⑦ in the cache are 1, 7, 0. The vertex index sequence table is now 0, 1, 2, 3, 4, 5, 6, 7, exactly equal to the number of vertices the cache can hold. Primitive ⑦ has 1 vertex, vertex 8, that is not in the current table; vertex 8 is appended to the end of the table, giving the new vertex index sequence table: 0, 1, 2, 3, 4, 5, 6, 7, 8, which exceeds the 8 vertices the cache can hold. It must now be judged whether vertex 0, the vertex stored in the cache earliest at the time the vertex data replacement is actually carried out, is a vertex of primitive ⑦. Vertex 0 is a vertex of primitive ⑦, so it is appended to the end of the table, giving the new vertex index sequence table: 0, 1, 2, 3, 4, 5, 6, 7, 8, 0. It is then judged whether vertex 1, stored in the cache second earliest, is a vertex of primitive ⑦. It is not, so the reordering is complete. The number of vertices to be replaced is then set to 0. The next primitive ⑧ has 3 vertices: 0, 8, 9. One of them, vertex 9, is not in the cache, so the number n of vertices to be replaced in the cache for primitive ⑦ changes from 0 to 1. The cache now holds 8, 0, 2, 3, 4, 5, 6, 7, and the vertex stored in the cache earliest is vertex 2; that is, when vertex 9 enters the cache it replaces vertex 2. Vertex 2 is not a vertex of the next primitive ⑧, so the number of vertices to be replaced needs no further change. The number n for primitive ⑦ is therefore 1: after primitive ⑦ is actually output, one vertex replacement is carried out, in which vertex 9 replaces vertex 2.
Primitives ⑧ and ⑨ are preprocessed in a similar way, which is not repeated here.
The vertex index sequence table produced by the reordering is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 9, 1.
The results produced by preprocessing, and the vertices in the cache when each primitive is output, are given in Table 2:
Table 2
Actual output of the primitives then begins. On the first draw, the initial setup must be carried out: starting from the first vertex 0 of the vertex index sequence table, vertex data is stored in the cache one by one until the cache is full. The vertex data in the cache is then 0, 1, 2, 3, 4, 5, 6, 7. The current vertex of the vertex index sequence table is set to the 9th entry, that is, vertex 8.
Output and vertex replacement are then carried out cyclically for the primitive string.
Referring to Table 2, when primitive ① is output the cache holds vertices 0, 1, 2, 3, 4, 5, 6, 7. The vertex index values of primitive ① are 0, 1, 2, corresponding to the vertex data at positions 0, 1 and 2 of the cache; the vertex data 0, 1, 2 at positions 0, 1 and 2 of the cache is read and primitive ① is output. Since the number n of vertices to be replaced for primitive ① is 0, no vertex data is replaced.
Continuing with Table 2, when primitive ② is output the cache still holds vertices 0, 1, 2, 3, 4, 5, 6, 7. The vertex index values of primitive ② are 0, 2, 3, corresponding to the vertex data at positions 0, 2 and 3 of the cache; the vertex data 0, 2, 3 at positions 0, 2 and 3 of the cache is read and primitive ② is output. Since the number n of vertices to be replaced in the cache for primitive ② is 0, no vertex data is replaced.
Continuing with Table 2, when primitive ③ is output the cache still holds vertices 0, 1, 2, 3, 4, 5, 6, 7. The vertex index values of primitive ③ are 0, 3, 4, corresponding to the vertex data at positions 0, 3 and 4 of the cache; the vertex data 0, 3, 4 at positions 0, 3 and 4 of the cache is read and primitive ③ is output. Since the number n of vertices to be replaced in the cache for primitive ③ is 0, no vertex data is replaced.
Continuing with Table 2, when primitive ④ is output the cache still holds vertices 0, 1, 2, 3, 4, 5, 6, 7. The vertex index values of primitive ④ are 0, 4, 5, corresponding to the vertex data at positions 0, 4 and 5 of the cache; the vertex data 0, 4, 5 at positions 0, 4 and 5 of the cache is read and primitive ④ is output. Since the number n of vertices to be replaced in the cache for primitive ④ is 0, no vertex data is replaced.
Continuing with Table 2, when primitive ⑤ is output the cache still holds vertices 0, 1, 2, 3, 4, 5, 6, 7. The vertex index values of primitive ⑤ are 0, 5, 6, corresponding to the vertex data at positions 0, 5 and 6 of the cache; the vertex data 0, 5, 6 at positions 0, 5 and 6 of the cache is read and primitive ⑤ is output. Since the number n of vertices to be replaced in the cache for primitive ⑤ is 0, no vertex data is replaced.
Continuing with Table 2, when primitive ⑥ is output the cache still holds vertices 0, 1, 2, 3, 4, 5, 6, 7. The vertex index values of primitive ⑥ are 0, 6, 7, corresponding to the vertex data at positions 0, 6 and 7 of the cache; the data 0, 6, 7 at positions 0, 6 and 7 of the cache is read and primitive ⑥ is output. The number n of vertices to be replaced in the cache for primitive ⑥ is 2, so 2 vertices must be replaced. First, 1 vertex is read from the current position of the vertex index sequence table, namely vertex 8, and it replaces the vertex that entered the cache earliest, namely vertex 0. The vertex that entered the cache earliest is now vertex 1, and the current vertex of the vertex index sequence table becomes vertex 0. Then 1 more vertex is read from the current position of the vertex index sequence table, namely vertex 0, and it replaces the vertex that entered the cache earliest, namely vertex 1. The vertex that entered the cache earliest is now vertex 2, and the current vertex of the vertex index sequence table becomes vertex 9.
Continuing with Table 2, when primitive ⑦ is output, 2 vertices of the original cache have been replaced and the cache now holds 8, 0, 2, 3, 4, 5, 6, 7. The vertex index values of primitive ⑦ are 1, 7, 0, corresponding to the vertex data at positions 1, 7 and 0 of the cache; the data 0, 7, 8 at positions 1, 7 and 0 of the cache is read and primitive ⑦ is output. The number n of vertices to be replaced in the cache for primitive ⑦ is 1, so 1 vertex must be replaced. 1 vertex is read from the current position of the vertex index sequence table, namely vertex 9, and it replaces the vertex that entered the cache earliest, namely vertex 2. The vertex that entered the cache earliest is now vertex 3, and the current vertex of the vertex index sequence table becomes vertex 1.
Continuing with Table 2, when primitive ⑧ is output, 1 vertex of the previous cache has been replaced and the cache now holds 8, 0, 9, 3, 4, 5, 6, 7. The vertex index values of primitive ⑧ are 1, 0, 2, corresponding to the vertex data at positions 1, 0 and 2 of the cache; the data 0, 8, 9 at positions 1, 0 and 2 of the cache is read and primitive ⑧ is output. The number n of vertices to be replaced in the cache for primitive ⑧ is 1, so 1 vertex must be replaced. 1 vertex is read from the current position of the vertex index sequence table, namely vertex 1, and it replaces the vertex that entered the cache earliest, namely vertex 3. The vertex that entered the cache earliest is now vertex 4. Vertex 1 is the last entry of the vertex index sequence table and no unread vertex remains, so the current vertex is not adjusted further.
Continuing with Table 2, when primitive ⑨ is output, 1 vertex of the previous cache has been replaced and the cache now holds 8, 0, 9, 1, 4, 5, 6, 7. The vertex index values of primitive ⑨ are 1, 2, 3, corresponding to the vertex data at positions 1, 2 and 3 of the cache; the data 0, 9, 1 at positions 1, 2 and 3 of the cache is read and primitive ⑨ is output. The number n of vertices to be replaced in the cache for primitive ⑨ is 0, so no replacement is needed.
At this point every primitive of the primitive string has been output in the order in which the primitives are read, and drawing ends.
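The output walk-through above can be checked by replaying it in Python. The data below (the sequence table, the index values in the cache and the replacement counts n) is taken from the worked example; the function name `replay` and the simplification that each vertex's data equals its index are assumptions of this sketch.

```python
TABLE = [0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 9, 1]        # vertex index sequence table
SLOTS = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 5), (0, 5, 6),
         (0, 6, 7), (1, 7, 0), (1, 0, 2), (1, 2, 3)]  # index values in the cache, ① to ⑨
COUNTS = [0, 0, 0, 0, 0, 2, 1, 1, 0]                 # n for primitives ① to ⑨
CAPACITY = 8


def replay(table, slots, counts, capacity):
    """Replay initial setup, output and replacement; return the vertices drawn."""
    cache = table[:capacity]    # initial setup: fill the cache from the head of the table
    current = capacity          # first unread entry of the table
    oldest = 0                  # slot holding the vertex stored earliest
    drawn = []
    for prim_slots, n in zip(slots, counts):
        drawn.append(tuple(cache[s] for s in prim_slots))   # output the primitive
        for _ in range(n):                                  # replace n oldest entries
            cache[oldest] = table[current]
            oldest = (oldest + 1) % capacity
            current += 1
    return drawn


for number, vertices in enumerate(replay(TABLE, SLOTS, COUNTS, CAPACITY), start=1):
    print(number, vertices)
```

The replay prints the vertex triples of Table 1 in order, ending with (0, 7, 8), (0, 8, 9) and (0, 9, 1) for primitives ⑦, ⑧ and ⑨, without ever comparing a cache tag.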
The present invention also provides a primitive preprocessor, a primitive processor and a graphic processing device.
Fig. 8 is a structural schematic diagram of one embodiment of the primitive preprocessor of the present invention. As shown in Fig. 8, the primitive preprocessor U8 comprises a simulation unit 801, an index value unit 802, a reordering unit 803 and a vertex quantity unit 804. The simulation unit 801 simulates the actual output process of the primitive from information such as the primitive information, the cache and the vertex data of the primitive. It inputs the vertices found in the cache when the primitive is actually output to the index value unit 802 and the vertex quantity unit 804, and inputs the order in which the vertices of the primitive enter the cache to the reordering unit 803. The index value unit 802 obtains the index value of each vertex of the primitive in the cache from the vertices in the cache when the primitive is actually output, and stores it in memory. The reordering unit 803 reorders the vertices of the primitive according to the order in which they enter the cache, obtains the vertex index sequence table, and stores it in memory. The vertex quantity unit 804 gives the number n of vertices of the primitive to be replaced, based on the vertices in the cache when the primitive is actually output and the input vertices of the next primitive, and stores it in memory.
Fig. 9 is a structural schematic diagram of one embodiment of the primitive processor of the present invention. As shown in Fig. 9, the primitive processor U9 comprises a first-primitive setup unit 901, an output unit 902, a pre-read unit 903 and a replacement unit 904. The first-primitive setup unit 901 is configured to obtain vertex data one by one according to the vertex indices, starting from the first vertex of the vertex index sequence table, and store it in the cache until the cache is full, and then to make the first unread vertex in the vertex index sequence table the current vertex. The output unit 902 is configured to read the data of each vertex from the cache according to the index value of each vertex of the primitive in the cache, and to output the primitive. The pre-read unit 903 is configured to pre-read from the vertex index sequence table the vertex data needed for replacement, in parallel with the output of the primitive by the output unit 902. The replacement unit 904 is configured to replace n vertex data items in the cache with n vertex data items from the vertex index sequence table when the number n of vertices of the primitive to be replaced is not 0.
Fig. 10 is a structural schematic diagram of one embodiment of the graphic processing device of the present invention. As shown in Fig. 10, the graphic processing device U10 comprises a receiving unit 101, the primitive preprocessor U8 and the primitive processor U9. The receiving unit 101 is configured to receive the primitive information of the primitive string to be drawn and store it in memory (not shown). The primitive information comprises vertex data and a primitive information table, and the primitive information table contains the vertex indices that make up each primitive. The primitive preprocessor U8 is configured to preprocess each primitive in the primitive string. The primitive processor U9 is configured to draw each primitive in the primitive string.
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the present invention. Any person skilled in the art may, without departing from the spirit and scope of the present invention, use the methods and technical content disclosed above to make possible changes and modifications to the technical solution of the present invention. Therefore, any simple modification, equivalent change or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, falls within the scope of protection of the technical solution of the present invention.

Claims (13)

1. A primitive preprocessing method, characterized by comprising at least:
simulating the actual output process of a primitive, so as to obtain the vertices in the cache when the primitive is actually output and the order in which the vertices of the primitive enter the cache;
obtaining the index value of each vertex of the primitive in the cache according to the vertices in the cache when the primitive is actually output, there being a one-to-one correspondence between the index values in the cache and the vertices in the cache;
reordering the vertices of the primitive according to the order in which the vertices of the primitive enter the cache, to obtain a vertex index sequence table, the vertex index sequence table storing the index of each vertex's data, from which index the actual physical address of the vertex data can be obtained and the data of the vertex fetched;
giving the number n of vertices of the primitive to be replaced, based on the vertices in the cache when the primitive is actually output and the vertices of the next primitive, n being a natural number;
wherein the reordering comprises at least:
taking any vertex of the primitive as the current vertex;
repeating the following steps until every vertex of the primitive has been processed:
if the number of vertices in the vertex index sequence table is less than the number of vertices the cache can hold and the current vertex is not in the vertex index sequence table, adding the current vertex to the vertex index sequence table;
otherwise, if the current vertex is not in the cache when the primitive preceding this primitive is actually output, adding the current vertex to the vertex index sequence table;
if the number of vertices in the vertex index sequence table after the current vertex has been added is greater than or equal to the number of vertices the cache can hold, judging whether the vertex replaced first according to the replacement policy when the vertex replacement is actually carried out is a vertex of this primitive; if so, adding the vertex replaced first to the vertex index sequence table and making the vertex replaced second the vertex replaced first; repeating this step until the vertex replaced first when the vertex replacement is actually carried out is not a vertex of this primitive;
making the next vertex of the primitive the current vertex;
wherein adding to the vertex index sequence table comprises at least:
when the vertex index sequence table is empty, making the vertex the first vertex of the vertex index sequence table;
when the vertex index sequence table is not empty, appending the vertex to the end of the vertex index sequence table;
wherein giving the number n of vertices of the primitive to be replaced comprises at least:
setting the number n of vertices of the primitive to be replaced to an initial value;
when the number of vertices in the vertex index sequence table obtained after the vertices of the primitive have been reordered is greater than or equal to the number of vertices the cache can hold, taking any vertex of the next primitive as the current vertex and repeating the following steps until every vertex of the next primitive in the primitive reading order has been processed, thereby obtaining the number of vertices of the primitive to be replaced:
if the current vertex is not in the cache when the primitive is actually output, adding 1 to the number n of vertices of the primitive to be replaced;
after adding 1, judging whether the vertex replaced first when the vertex replacement is actually carried out is a vertex of the next primitive; if so, adding 1 to the number n of vertices of the primitive to be replaced and making the vertex replaced second the vertex replaced first; repeating this step until the vertex replaced first when the vertex replacement is actually carried out is not a vertex of the next primitive;
making the next vertex of the next primitive the current vertex.
2. pel preprocess method as claimed in claim 1, is characterized in that: the described summit be first replaced according to replacement policy for the earliest stored in the summit in high-speed cache, the summit finally entering identical lowest order in summit in high-speed cache, high-speed cache, high-speed cache according to the summit of the minimum use of frequency or according to provide cryptographic hash compartment of terrain to be replaced summit.
3. The primitive preprocessing method as claimed in claim 1 or 2, characterized in that:
Each primitive in the primitive string to be drawn is preprocessed with said preprocessing method.
4. A primitive processing method, characterized in that it at least comprises:
According to the index value in the cache of each vertex of this primitive, obtaining the data of each vertex from the cache and outputting this primitive; when the number n of vertices to be replaced for this primitive is not the initial value, replacing n vertices in the cache with n vertices from the vertex index sequence table; wherein the index value in the cache of each vertex of this primitive, the vertex index sequence table and the number n of vertices to be replaced for this primitive are obtained by preprocessing this primitive with the method of claim 1.
5. The primitive processing method as claimed in claim 4, characterized in that:
When this primitive is the first primitive, before said outputting of this primitive and said replacing of n vertices in the cache, the method further comprises: starting from the first vertex of the vertex index sequence table, obtaining vertex data one by one according to the vertex indices and storing it in the cache until the cache is full; and making the first vertex not yet read in the vertex index sequence table the current vertex.
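The initial fill described in this claim can be sketched as below. The names `fill_cache` and `vertex_memory` are hypothetical, and the mapping from vertex index to vertex data stands in for whatever memory holds the vertex data.

```python
def fill_cache(sequence_table, vertex_memory, capacity):
    """Load vertex data from the head of the table until the cache is full.

    Returns (cache, current), where current is the position of the first
    table entry that has not been read yet.
    """
    cache = []
    current = 0
    while current < len(sequence_table) and len(cache) < capacity:
        cache.append(vertex_memory[sequence_table[current]])
        current += 1
    return cache, current
```

If the table is shorter than the cache, the loop simply stops at the end of the table, and no later replacement is needed.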
6. The primitive processing method as claimed in claim 4, characterized in that said replacing n vertices in the cache with n vertices from the vertex index sequence table at least comprises:
Starting from the current vertex in said vertex index sequence table, reading n vertices one by one and, with the data of these n vertices, replacing the data of the n vertices that were stored in the cache earliest, and updating which vertex is the earliest stored in the cache;
Making the first vertex not yet read in said vertex index sequence table the current vertex.
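A minimal sketch of this replacement step, assuming the cache is a fixed-size list used as a ring in which `oldest` marks the slot stored earliest. The function name `replace_oldest` and its parameters are hypothetical.

```python
def replace_oldest(cache, oldest, sequence_table, current, vertex_memory, n):
    """Overwrite the n earliest-stored cache slots with the next n table entries.

    cache is modified in place. Returns the updated (oldest, current) pair.
    """
    for _ in range(n):
        cache[oldest] = vertex_memory[sequence_table[current]]
        oldest = (oldest + 1) % len(cache)  # the next slot is now the oldest
        current += 1                        # advance to the next unread entry
    return oldest, current
```

Because slots are overwritten in arrival order, a single pointer is enough to track the earliest-stored vertex, and no per-slot tag has to be compared.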
7. The primitive processing method as claimed in claim 6, characterized in that:
Said cache is divided into a reserved space and an output space, the reserved space storing the vertex data needed for replacement and the output space storing the vertex data needed for outputting primitives;
While a primitive is being output, the vertex data needed for replacement is pre-read in parallel, according to the vertex index sequence table, and stored in the reserved space;
During replacement, vertex data is read from the reserved space and replaces vertex data in the output space of the cache.
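The split between reserved space and output space can be sketched as a small class. The sketch is sequential, so the parallelism between output and pre-reading is not modelled; the class name `SplitCache` and its methods are hypothetical.

```python
class SplitCache:
    """Cache split into an output space and a reserved (staging) space."""

    def __init__(self, output_data, reserved_slots):
        self.output = list(output_data)  # vertex data used to output primitives
        self.reserved = []               # vertex data pre-read for replacement
        self.reserved_slots = reserved_slots
        self.oldest = 0                  # output slot that was stored earliest

    def prefetch(self, sequence_table, current, vertex_memory, n):
        """Stage the next n table entries; the output space is left untouched."""
        if n > self.reserved_slots - len(self.reserved):
            raise ValueError("reserved space too small for n vertices")
        for _ in range(n):
            self.reserved.append(vertex_memory[sequence_table[current]])
            current += 1
        return current

    def commit(self):
        """Replace the earliest-stored output slots with the staged data."""
        for data in self.reserved:
            self.output[self.oldest] = data
            self.oldest = (self.oldest + 1) % len(self.output)
        self.reserved.clear()
```

A caller would invoke `prefetch` while the current primitive is being output and `commit` once the output has finished, so the output never reads a slot that is being overwritten.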
8. A graphic processing method for drawing a known primitive string, characterized in that it at least comprises:
Processing each primitive in the primitive string in turn, in the order of the primitives in said primitive string, with the primitive processing method of any one of claims 4 to 7.
9. The graphic processing method as claimed in claim 8, characterized in that:
When said drawing is the first drawing, before each primitive in the primitive string is processed in turn with the primitive processing method of any one of claims 4 to 7, the method further comprises: preprocessing each primitive in the primitive string with the primitive preprocessing method of any one of claims 1 to 3, to obtain the index value in the cache of each vertex of each primitive, the vertex index sequence table, and the number n of vertices to be replaced for each primitive.
10. A primitive preprocessor, characterized in that it at least comprises:
A simulation unit, configured to simulate the actual output process of this primitive, so as to obtain the vertices in the cache when this primitive is actually output and the order in which the vertices of this primitive enter the cache;
An index value unit, configured to obtain the index value in the cache of each vertex of this primitive, according to the vertices in the cache when this primitive is actually output;
A reordering unit, configured to reorder the vertices of this primitive according to the order in which they enter the cache, and to obtain the vertex index sequence table;
A vertex count unit, configured to provide the number n of vertices to be replaced for this primitive, based on the vertices in the cache when this primitive is actually output and on the vertices of the next primitive;
Said reordering at least comprises:
Taking any vertex of this primitive as the current vertex;
Repeating the following steps until every vertex of this primitive has been processed:
If the number of vertices in the vertex index sequence table is less than the number of vertices the cache can hold, and the current vertex is not in the vertex index sequence table, the current vertex is added to the vertex index sequence table;
Otherwise, if the current vertex is not in the cache as it stands when the primitive preceding this primitive is actually output, the current vertex is added to the vertex index sequence table;
If, after the current vertex has been added, the number of vertices in the vertex index sequence table is greater than or equal to the number of vertices the cache can hold, it is determined whether the vertex that would be replaced first, according to the replacement policy, in an actual vertex replacement is a vertex of this primitive; if so, the vertex replaced first is added to the vertex index sequence table and the vertex that would be replaced second becomes the vertex replaced first; this step is repeated until the vertex that would be replaced first in an actual vertex replacement is not a vertex of this primitive;
The next vertex of this primitive is made the current vertex;
Said adding to the vertex index sequence table at least comprises:
When the vertex index sequence table is empty, the vertex becomes the first vertex of said vertex index sequence table;
When the vertex index sequence table is not empty, the vertex is appended to the end of said vertex index sequence table;
Said providing the number n of vertices to be replaced for this primitive at least comprises:
The number n of vertices to be replaced for this primitive is set to an initial value;
When the number of vertices in the vertex index sequence table obtained after reordering the vertices of this primitive is greater than or equal to the number of vertices the cache can hold, any vertex of the next primitive is taken as the current vertex, and the following steps are repeated until every vertex of the next primitive, in primitive reading order, has been processed, thereby obtaining the number of vertices to be replaced for this primitive:
If the current vertex is not in the cache as it stands when this primitive is actually output, the number n of vertices to be replaced for this primitive is incremented by 1;
After the increment, it is determined whether the vertex that would be replaced first in an actual vertex replacement is a vertex of the next primitive; if so, the number n of vertices to be replaced for this primitive is incremented by 1 and the vertex that would be replaced second becomes the vertex replaced first; this step is repeated until the vertex that would be replaced first in an actual vertex replacement is not a vertex of the next primitive;
The next vertex of the next primitive is made the current vertex.
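The cooperation of the four units can be illustrated with a simplified software simulation of a whole primitive string. The sketch below is one possible reading, not the claimed implementation: it assumes a first-in-first-out policy, models the cache as a ring of slots, takes 0 as the initial value of n, and derives n from the growth of the sequence table rather than from the step-by-step count in the claim. The function name `preprocess` and all variable names are hypothetical.

```python
def preprocess(primitives, capacity):
    """Return (slot_indices, sequence_table, n_values) for a primitive string."""
    slots = [None] * capacity  # simulated cache contents (vertex indices)
    write = 0                  # ring position of the slot replaced next (FIFO)
    table = []                 # vertex index sequence table
    slot_indices = []          # per primitive: cache slot of each vertex
    table_len = []             # table length after each primitive

    for prim in primitives:
        if len(prim) >= capacity:
            raise ValueError("cache must hold more vertices than one primitive")
        needed = set(prim)
        for vertex in prim:
            pending = vertex if vertex not in slots else None
            while pending is not None:
                victim = slots[write]
                slots[write] = pending
                write = (write + 1) % capacity
                table.append(pending)
                # A replaced vertex that this primitive still needs is loaded again.
                pending = victim if victim in needed else None
        slot_indices.append([slots.index(v) for v in prim])
        table_len.append(len(table))

    # The first `capacity` table entries are loaded by the initial fill; entries
    # beyond that are loaded after the preceding primitive has been output.
    n_values = []
    for i in range(len(primitives)):
        if i + 1 < len(primitives):
            grown = max(table_len[i + 1], capacity) - max(table_len[i], capacity)
            n_values.append(grown)
        else:
            n_values.append(0)
    return slot_indices, table, n_values
```

For the primitive string (0, 1, 2), (1, 2, 3), (3, 4, 5), (0, 1, 4) and a four-entry cache, this yields the sequence table 0, 1, 2, 3, 4, 5, 0, 1 and replacement counts 0, 2, 2, 0. At drawing time only the slot indices, the table and the counts are needed, so no tag comparison is performed.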
11. A primitive processor, characterized in that it at least comprises:
A first-primitive setup unit, configured to, starting from the first vertex of the vertex index sequence table, obtain vertex data one by one according to the vertex indices and store it in the cache until the cache is full, and then make the first vertex not yet read in the vertex index sequence table the current vertex;
An output unit, configured to read the data of each vertex from the cache according to the index value in the cache of each vertex of this primitive, and to output this primitive;
A replacement unit, configured to, when the number n of vertices to be replaced for this primitive is not 0, replace n vertex data in the cache with n vertex data from the vertex index sequence table;
The index value in the cache of each vertex of this primitive, the vertex index sequence table and the number n of vertices to be replaced for this primitive are obtained by preprocessing this primitive with the primitive preprocessor as claimed in claim 10.
12. The primitive processor as claimed in claim 11, characterized in that it further comprises:
A pre-read unit, configured to pre-read, in parallel with the output of a primitive, the vertex data needed for replacement according to the vertex index sequence table.
13. A graphic processing device, characterized in that it at least comprises:
A receiving unit, configured to receive the primitive information of the primitive string to be drawn, comprising vertex data and a primitive information table, said primitive information table comprising the vertex indices that make up each primitive; said primitive information is stored in memory;
The primitive preprocessor as claimed in claim 10, configured to preprocess each primitive in said primitive string;
The primitive processor as claimed in claim 11 or 12, configured to draw each primitive in said primitive string.
CN201210226716.4A 2012-07-02 2012-07-02 Graphics primitive preprocessing method, graphics primitive processing method, graphic processing method, processor and device Active CN102799431B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210226716.4A CN102799431B (en) 2012-07-02 2012-07-02 Graphics primitive preprocessing method, graphics primitive processing method, graphic processing method, processor and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210226716.4A CN102799431B (en) 2012-07-02 2012-07-02 Graphics primitive preprocessing method, graphics primitive processing method, graphic processing method, processor and device

Publications (2)

Publication Number Publication Date
CN102799431A CN102799431A (en) 2012-11-28
CN102799431B true CN102799431B (en) 2015-06-10

Family

ID=47198548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210226716.4A Active CN102799431B (en) 2012-07-02 2012-07-02 Graphics primitive preprocessing method, graphics primitive processing method, graphic processing method, processor and device

Country Status (1)

Country Link
CN (1) CN102799431B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104166568B (en) * 2014-08-13 2017-08-29 国电南瑞科技股份有限公司 Method for loading power system picture files in parallel based on CIM/G
CN108986014A (en) * 2018-07-19 2018-12-11 芯视图(常州)微电子有限公司 Primitive assembly unit suitable for out-of-order vertex shading
CN113254127A (en) * 2021-05-13 2021-08-13 中国电力工程顾问集团西南电力设计院有限公司 Processing method for large-data-volume graphic elements in power transmission line engineering measurement software
CN115880133B (en) * 2023-01-31 2023-07-25 南京砺算科技有限公司 Graphics processing unit and terminal device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1753033A (en) * 2005-11-10 2006-03-29 北京航空航天大学 Real time drawing method of vivid three dimensional land form geographical model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110043518A1 (en) * 2009-08-21 2011-02-24 Nicolas Galoppo Von Borries Techniques to store and retrieve image data

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1753033A (en) * 2005-11-10 2006-03-29 北京航空航天大学 Real time drawing method of vivid three dimensional land form geographical model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Coordinate Systems and Basic Primitive Drawing in Direct3D"; Wang Decai et al.; "Computer Programming Skills & Maintenance"; 20070430; pp. 4-6, 28 *

Also Published As

Publication number Publication date
CN102799431A (en) 2012-11-28

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220729

Address after: 200120 room 11F, building 2, Lane 560, shengxia Road, Pudong New Area, Shanghai

Patentee after: GALAXYCORE SHANGHAI Ltd.,Corp.

Address before: Room 1004 and room 1005, building 2, No. 560, shengxia Road, Pudong New Area, Shanghai 201203

Patentee before: SHANGHAI SUANXIN MICROELECTRONICS Co.,Ltd.