Stack cache memory and caching method applicable to context switching
Technical field
The present invention relates to the technical field of microprocessor architecture, and in particular to a stack cache memory and caching method applicable to context switching.
Background art
With the rapid development of microprocessor design and fabrication technology, the gap between memory access speed and processor computation speed has become increasingly significant. This gap grows at roughly 50% per year, so memory access speed has increasingly become the bottleneck limiting processor performance. Exploiting the principle of locality by adopting one or more levels of cache memory (Cache) is one of the effective means of improving memory-system performance. A cache is a small, fast, special-purpose memory that holds the instructions and data the processor has used most recently. When the processor runs, if the instruction or data being accessed is in the cache, it can be accessed at very high speed; otherwise the processor must access main memory and wait much longer. An efficiently designed cache can significantly reduce the processor's average memory access time.
To better exploit the principle of locality and improve the cache hit rate, the cache is further divided into an instruction cache (Instruction Cache) and a data cache (Data Cache). Memory accesses fall into code segment, data segment, heap space and stack space accesses, which makes a further refinement of the data cache possible. Among these, stack-space accesses exhibit good temporal and spatial locality: a program continuously accesses data near the top of the stack. Saving local variables, passing parameters, and saving and restoring registers during function calls are all accomplished through stack-space accesses. A stack cache memory (Stack Cache) separates stack accesses from the data cache. It can better exploit the characteristics of stack-space accesses, avoids the pollution caused when stack data evicts heap data from the data cache, and reduces contention for the data cache ports. The characteristics of stack-space data accesses are: (1) Stack-space accesses have good temporal and spatial locality, with the program continuously accessing data near the stack top; the stack cache therefore does not need to be large to achieve a very high hit rate. (2) When stack space is allocated, i.e. the stack-top pointer (sp) decreases, the original contents of the corresponding blocks need not be fetched from the lower-level memory system. (3) When stack space is reclaimed, i.e. the stack-top pointer (sp) increases, the reclaimed data (even if dirty) need not be written back to the lower-level memory system. (4) Stack accesses address a contiguous region, which can be accessed as a common base address plus an offset.
Fig. 1 shows the structure and access path of a traditional stack cache. To determine quickly whether the stack cache hits and to obtain the hit data, the stack cache is accessed by virtual address. The stack cache is divided into three parts: tag, data and control. Its input is the address of the stack cache access, and its outputs are the hit or miss signal and the hit data. The tag part of the stack cache comprises a virtual base address (Vbase) field, a valid bit (Valid) field, a physical base address (Pbase) field, a stack-top address (Top) field and a stack-bottom address (Bottom) field. The data part of the stack cache comprises a data (Data) field and a dirty bit (W) field indicating whether the data has been written. Because the microprocessor stack grows from high addresses to low addresses, the stack-bottom address is greater than the stack-top address. The stack-bottom address is determined by software convention, namely by the application binary interface (ABI) standard. As shown in Fig. 1, the control part comprises a first comparator circuit, a second comparator circuit and an AND circuit. The first comparator circuit determines whether the base address of the access equals the virtual base address of the stack cache, i.e. whether Base = Vbase. The second comparator circuit determines whether the access address belongs to the stack space, i.e. whether Top ≤ Vaddr ≤ Bottom. The AND circuit performs an AND operation on the output of the first comparator circuit, the output of the second comparator circuit and the valid bit, thereby determining whether the stack cache hits and outputting the hit or miss signal.
The address used to access the stack cache is divided into two fixed parts: a base address (Base) and an offset (Offset). The base address is compared with the virtual base address of the stack cache to determine whether the access hits. The offset selects the required content within the data field, and the stack cache hit data is output.
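As an illustrative sketch (in Python; the 20-bit base / 12-bit offset split matches the 4 KB block size used later in the embodiments, and the function name is ours), the fixed decomposition of a 32-bit access address might look like:

```python
def split_address(vaddr: int, offset_bits: int = 12):
    """Split a 32-bit virtual address into (Base, Offset).

    Base is compared against the cached virtual base address (Vbase)
    to decide a hit; Offset indexes the data field within the block.
    A 12-bit offset corresponds to a 4 KB block size; other block
    sizes would change offset_bits.
    """
    base = vaddr >> offset_bits               # upper 20 bits
    offset = vaddr & ((1 << offset_bits) - 1)  # lower 12 bits
    return base, offset

# Address values used in the embodiments: Vaddr = 0x7fff7f80
base, offset = split_address(0x7fff7f80)
assert base == 0x7fff7 and offset == 0xf80
```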
The stack cache is organized as a circular queue and caches a contiguous region. The stack cache determines the allocation and reclamation of stack space by detecting changes of the stack-top pointer. When stack space is allocated, i.e. the stack-top pointer (sp) decreases, the space is allocated directly in the stack cache without fetching the original contents of the corresponding blocks from the lower-level memory system. If the space in the stack cache is insufficient for the new allocation, the data nearest the stack bottom is evicted from the stack cache to preserve the contiguity of the cached region. When stack space is reclaimed, i.e. the stack-top pointer (sp) increases, dirty data need not be written back to the lower-level memory system.
A stack cache access determines whether the cache hits by tag comparison. The tag comparison checks that the access address belongs to the stack space, i.e. that the access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address, that the base address of the access is identical to the virtual base address in the stack cache tag, and that the valid bit field in the tag has the value 1. If all of these conditions hold simultaneously, the stack cache hits, and the offset indexes the data field to obtain the required data.
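A minimal sketch of this traditional hit test (in Python; function and parameter names are ours, and the 12-bit offset width is an assumption carried over from the 4 KB block size of the embodiments):

```python
def stack_cache_hit(vaddr, vbase, valid, top, bottom, offset_bits=12):
    """Traditional stack cache hit test (no process identifier yet):
    the address must fall inside the stack region (Top <= Vaddr <= Bottom,
    since the stack grows downward), the access base address must equal
    the cached virtual base address, and the valid bit must be set."""
    base = vaddr >> offset_bits
    in_stack = top <= vaddr <= bottom
    return in_stack and (base == vbase) and valid

# A hit: the address lies in the stack region and the bases match.
assert stack_cache_hit(0x7fff7f80, vbase=0x7fff7, valid=True,
                       top=0x7fff7400, bottom=0x7fff8000)
# A miss: the cached base differs from the access base.
assert not stack_cache_hit(0x7fff7f80, vbase=0x7fff6, valid=True,
                           top=0x7fff6000, bottom=0x7fff8000)
```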
If the stack cache misses and the access address does not belong to the stack space, the access is handled by the data cache. If the stack cache misses, the access address belongs to the stack space, and the access address is smaller than the lowest address currently in the stack cache, i.e. beyond the address near the stack top, then the data from the lowest address of the stack cache down to the missing address is fetched from the lower-level memory system. If the stack cache misses, the access address belongs to the stack space, and the access address is greater than the highest address currently in the stack cache, i.e. beyond the address near the stack bottom, then the data from the highest address of the stack cache up to the missing address is fetched from the lower-level memory system. This preserves the contiguity of the stack cache.
When a context switch occurs, because the stack cache records no identification of the process (including thread), a new process may be assigned the same virtual addresses as the original process, so that their physical addresses differ while the virtual addresses coincide. Therefore, to guarantee data consistency at a context switch, all dirty data in the stack cache must be written back to the lower-level memory system to free the space for the new process; see U.S. Patent No. 6,167,488, entitled "Stack caching circuit with overflow/underflow unit". Even if the space in the stack cache is sufficient for the new process, the dirty data must still be written back to the lower-level memory system to guarantee correctness.
When the processor runs a single-process application, the stack cache shows good performance. Under multi-user (Multi-user), multi-programming (Multi-programming) and multi-threading (Multi-threading) operation, however, every context switch requires writing all dirty data in the stack cache back to the lower-level memory system at once, which is very expensive. After the context switches back, the data that was written away must be fetched into the stack cache again, so the data-transfer cost is very high. Therefore, a processor with a stack cache performs well in single-process applications, but in a multi-process (including multi-threaded) environment, and especially when context switches occur frequently, the effect is unsatisfactory. Multi-user, multi-programming and multi-threaded applications are the inevitable trend of microprocessor development.
Given these deficiencies of the prior art, a microprocessor stack cache applicable to context switching needs to be designed to reduce the processor's average memory access time and substantially improve the microprocessor's memory access performance in practical applications.
Summary of the invention
The object of the invention is to overcome the deficiency that the stack cache of the prior art is not suited to context switching, and thereby to provide a microprocessor stack cache and method applicable to context switching that continues to perform well under multi-user, multi-programming and multi-threaded environments with frequent context switches, with small hardware overhead and easy implementation.
To achieve the above object, the present invention adopts the following technical scheme:
A stack cache memory applicable to context switching, comprising:
at least two stack cache blocks, each stack cache block consisting of a tag part, a data part and a control part;
an OR circuit, connected to the output terminals of the control parts of the at least two stack cache blocks, for performing an OR operation on the hit signals of the individual stack cache blocks and outputting the hit or miss result of the stack cache memory;
a selector, connected to the output terminals of the control parts of the at least two stack cache blocks and to the output terminals of the data parts of the at least two stack cache blocks, for selecting the data of the stack cache block that hits and outputting the stack cache hit data.
Further, the tag part of each stack cache block comprises a virtual base address (Vbase) field, a valid bit (Valid) field, a physical base address (Pbase) field, a stack-top address (Top) field, a stack-bottom address (Bottom) field, and a process address space identifier (PASID, "process identifier" for short) field.
Further, the data part of each stack cache block comprises a data (Data) field and a dirty bit (W) field indicating whether the data has been written.
Further, the control part of each stack cache block comprises at least three comparator circuits and an AND circuit. The input terminals of the first comparator circuit are connected to the virtual base address field of the tag part and to the base address field of the access address of the stack cache; it determines whether the base address of the access equals the virtual base address of the tag part, and its output terminal is connected to the AND circuit. The input terminals of the second comparator circuit are connected to the stack-top address field and the stack-bottom address field of the tag part and to the access address of the stack cache; it determines whether the access address belongs to the stack space, and its output terminal is connected to the AND circuit. The input terminals of the third comparator circuit are connected to the process identifier field of the tag part and to the process identifier field of the control register; it determines whether the value of the process address space identifier field equals the process address space identifier of the access instruction, and its output terminal is connected to the AND circuit. The valid bit field of the tag part is also connected to an input terminal of the AND circuit. The output terminal of the AND circuit is connected to the OR circuit and to the selector, respectively.
The inputs of the microprocessor stack cache memory applicable to context switching provided by the invention are the address of the stack cache access and the process address space identifier of the access instruction; its outputs are the hit or miss signal and the hit data.
In the present invention, the value of the process address space identifier of the access instruction comes from a control register of the microprocessor; every access instruction has a corresponding process address space identifier. Every microprocessor has control registers holding contents corresponding to the process address space identifier; only the particular register and its storage format differ. For example, a MIPS processor stores the address space identifier ASID (Address Space Identifier) in the EntryHi register and the global bit G (Global Bit) in the EntryLow register, and together they constitute the process address space identifier.
Based on the above microprocessor stack cache memory, a microprocessor stack caching method applicable to context switching comprises the following steps:
(1) On a context switch, initialize the stack: if no stack space of the corresponding process has been allocated in the stack cache, record the initial stack-bottom address and stack-top address in the stack cache.
(2) Allocate stack space: if the stack cache has allocatable space, allocate new free space in the stack cache; if the stack cache has no allocatable space, select a stack cache block to write back to the lower-level memory system and initialize the tag of the newly allocated stack cache block.
(3) Reclaim stack space: without writing dirty data back to the lower-level memory system, directly reclaim and release the stack cache space.
(4) On an instruction access to the stack cache, perform the tag comparison and determine from the comparison result whether the access hits. If it hits, go to step (5); if it misses, go to step (6).
(5) Output the stack cache block that hits and use the offset to index the data, obtaining the hit data.
(6) Determine whether the access address belongs to the stack space. If the access address does not belong to the stack space, the access is handled by the data cache; if it belongs to the stack space, fetch from the lower-level memory system the data of the stack cache block containing the missing address that lies between the stack-bottom and stack-top addresses.
In the above step (2), if the stack cache has no allocatable space, the stack cache block to be written back to the lower-level memory system can be selected by a first-in-first-out (FIFO) policy, a random (Random) policy or a least-recently-used (LRU) policy. If the FIFO policy is adopted, the block that entered first can be identified by adding to the tag of each stack cache block a field (Age) recording the allocation time: the Age field of the newly allocated block's tag is cleared, and the Age fields of the other blocks' tags are incremented by 1.
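The FIFO selection via the Age field can be sketched as follows (in Python; the dict representation of a block tag and the function names are assumptions of this sketch, not part of the patent):

```python
def choose_victim_fifo(blocks):
    """Pick the block that entered first: the one with the largest Age."""
    return max(range(len(blocks)), key=lambda i: blocks[i]["Age"])

def update_ages(blocks, newly_allocated):
    """Clear the Age of the newly allocated block; age every other block."""
    for i, blk in enumerate(blocks):
        blk["Age"] = 0 if i == newly_allocated else blk["Age"] + 1

# Four blocks; the second (Age 8) entered first and is evicted.
blocks = [{"Age": 3}, {"Age": 8}, {"Age": 1}, {"Age": 0}]
victim = choose_victim_fifo(blocks)
assert victim == 1
update_ages(blocks, victim)
assert [b["Age"] for b in blocks] == [4, 0, 2, 1]
```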
In the above step (4), the tag comparison means determining whether the following conditions are all satisfied: the access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address; the base address of the access is identical to the virtual base address in a stack cache block tag; the value of the valid bit field in the stack cache block tag whose virtual base address matches the access base address is 1; and the value of the process address space identifier field is identical to the process address space identifier of the access instruction. The stack cache access hits if all of these conditions are satisfied; otherwise the access misses.
The present invention has following advantage:
1. The stack cache of the invention is organized in blocks and adopts a dedicated process address space identifier in each stack cache block tag to distinguish the address spaces of different processes. It is therefore a stack caching method aimed specifically at multi-user, multi-programming and multi-threaded environments and adapts well to process (including thread) context switches.
2. The invention only needs to add the process address space identifier (PASID) field and the Age field to the tag of each stack cache block; the hardware overhead is small, the control is simple, and implementation complexity is avoided.
Description of drawings
Fig. 1 shows the structure and access path of a traditional stack cache.
Fig. 2 shows the structure and access path of a stack cache applicable to context switching according to one embodiment of the invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings and specific embodiments.
In Fig. 2, numeral 10 denotes a specific embodiment of the stack cache memory applicable to context switching according to the invention. In this embodiment, the stack cache applicable to context switching consists of two stack cache blocks, an OR circuit 11 and a selector 12. The inputs of this embodiment are the address 13 of the stack cache access and the process address space identifier 14 of the access instruction; the outputs are the hit or miss signal and the hit data. The data within each stack cache block is contiguous, similar to a traditional stack cache.
The two stack cache blocks have the same structure; for ease of description they are called the first and second stack cache blocks. The first stack cache block comprises three parts: a tag part 15, a data part 16 and a control part 17. The tag part 15 of the first stack cache block comprises: a virtual base address (Vbase) field representing the virtual base address of the block; a valid bit (Valid) field indicating whether the block is valid; a physical base address (Pbase) field representing the physical base address corresponding to the block's virtual base address; a stack-top address (Top) field representing the stack-top address of the stack space of the process to which the block belongs; a stack-bottom address (Bottom) field representing the stack-bottom address of that stack space; a process address space identifier (PASID) field representing the process address space identifier of the process to which the block belongs; and an allocation time (Age) field representing the allocation time of the block. Each stack cache tag entry in the figure corresponds to the tag of one stack cache block; numeral 18 denotes the tag part of the second stack cache block, whose structure is identical to the tag part 15 of the first. The data part 16 of the first stack cache block comprises a data (Data) field and a dirty bit (W) field indicating whether the data has been written. Each stack cache data entry in the figure represents the data of one stack cache block; numeral 19 denotes the data part of the second stack cache block, whose structure is identical to the data part 16 of the first. The control part 17 of the first stack cache block comprises a first comparator circuit 20, a second comparator circuit 21, a third comparator circuit 22 and an AND circuit 23. The first comparator circuit 20 determines whether the base address of the access equals the virtual base address of the first stack cache block, i.e. whether Base = Vbase. The second comparator circuit 21 determines whether the access address belongs to the stack space, i.e. whether Top ≤ Vaddr ≤ Bottom. The third comparator circuit 22 determines whether the value of the process address space identifier field equals the process address space identifier 14 of the access instruction. The value of the process address space identifier (PASID) 14 of the access instruction comes from a control register of the microprocessor; every access instruction has a corresponding process address space identifier. The AND circuit 23 performs an AND operation on the output of the first comparator circuit 20, the output of the second comparator circuit 21, the output of the third comparator circuit 22 and the valid bit field of the tag part 15, determines whether the first stack cache block hits, and outputs the hit or miss signal of the first stack cache block.
The OR circuit 11 performs an OR operation on the hit signals of the individual stack cache blocks, i.e. of the first and second stack cache blocks in the present embodiment, and outputs the hit or miss result of the stack cache 10.
The selector 12 selects the data of the stack cache block that hits and outputs the stack cache hit data.
The address (Vaddr) 13 of the instruction accessing the stack cache is divided into two fixed parts: a base address (Base) and an offset (Offset). The base address is compared with the virtual base addresses in the stack cache block tags to determine whether the access hits; the offset selects the required content within the data field. A control register (Control Register) of the processor holding process identity information supplies the process address space identifier 14 of the instruction accessing the stack cache, which is compared with the process address space identifiers in the stack cache block tags to identify the process to which the access instruction belongs.
In the present embodiment, the stack cache of the invention is illustrated with two stack cache blocks as an example. It should be understood that a stack cache according to the invention may comprise a plurality of stack cache blocks, all with the same structure and connections, as will be apparent to those skilled in the art.
Based on the stack cache memory provided by the present embodiment, a microprocessor stack caching method applicable to context switching is implemented with the following concrete steps:
(1) On a context switch, initialize the stack. If no stack space of the corresponding process has been allocated in the stack cache, record the initial stack-bottom address and stack-top address of the process in the stack cache. If stack space of the corresponding process has already been allocated in the stack cache, the context is switching back to the original process and no operation is needed.
(2) Allocate stack space, i.e. the stack-top pointer decreases. If the stack cache has allocatable space, allocate new free space in the stack cache. If the stack cache has no allocatable space, write the stack cache block that entered first back to the lower-level memory system according to the first-in-first-out (FIFO) policy, release it for use by the new process, and initialize the tag of the newly allocated stack cache block. The block that entered first is identified through the allocation time (Age) field added to the tag of each stack cache block: the block with the largest Age value is the block that entered first. The Age field of the newly allocated block's tag is cleared, and the Age fields of the other blocks' tags are incremented by 1. If the process that needs the stack space has already been allocated stack cache blocks, modify the stack-top pointer in the tag of that process's block and clear the W field corresponding to the newly allocated stack space; if the stack-top address is not within the blocks already allocated to the process, newly allocate a block containing the stack-top address. If the process that needs the stack space has not been allocated any stack cache block, allocate for it a new block containing the stack-top address. In the tag of the newly allocated block, Vbase is set from the base address of the stack top, Pbase is set to the physical base address corresponding to the stack-top base address, the process address space identifier field is set to the current process address space identifier, the stack-bottom address (Bottom) and stack-top address (Top) are set to the initial stack-bottom address and the current stack-top address, the valid bit field (Valid) is set to 1, and the W field of the block's data part is initialized to 0.
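The tag initialization of a newly allocated block in step (2) can be sketched as follows (in Python; the dict representation, function name, and per-byte granularity of the W bits are assumptions of this sketch):

```python
def init_block_tag(vbase, pbase, pasid, bottom, top, block_bytes=4096):
    """Build the tag of a newly allocated stack cache block as step (2)
    describes: Vbase/Pbase from the stack-top base address, PASID from
    the current process, Bottom/Top from the stack bounds, Valid set,
    Age cleared, and all W (dirty) bits cleared."""
    return {
        "Vbase": vbase,    # base address of the stack top
        "Pbase": pbase,    # corresponding physical base address
        "PASID": pasid,    # current process address space identifier
        "Bottom": bottom,  # initial stack-bottom address
        "Top": top,        # current stack-top address
        "Valid": 1,
        "Age": 0,
        "W": [0] * block_bytes,  # dirty bits of the data part cleared
    }

# Values from Example 1 below: process 8 takes over the freed block.
tag = init_block_tag(0x7fff7, 0x01ff7, 8, 0x7fff8000, 0x7fff7c00)
assert tag["Valid"] == 1 and tag["Age"] == 0 and sum(tag["W"]) == 0
```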
(3) Reclaim stack space, i.e. the stack-top pointer increases. Without writing dirty data back to the lower-level memory system, directly reclaim and release the stack cache space and modify the stack-top pointer. If all the stack space is reclaimed, i.e. the stack-top pointer equals the value of the stack-bottom pointer, set the value of the valid bit field (Valid) of the tag corresponding to the reclaimed stack space to 0.
(4) On an instruction access to the stack cache, perform the tag comparison and determine from the comparison result whether the stack cache hits. The tag comparison determines whether the following are all satisfied: (a) the access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address; (b) the base address of the access is identical to the virtual base address in a stack cache block tag; (c) the value of the valid bit (Valid) field in the block tag whose virtual base address matches the access base address is 1; (d) the value of the process address space identifier field in the block tag with the matching base address is identical to the process address space identifier of the access instruction. If all of these conditions are satisfied, the stack cache hits; go to step (5). Otherwise go to step (6).
(5) The access hits: output the stack cache block that hits and use the offset to index the data, obtaining the hit data.
(6) The access misses: determine whether the access address belongs to the stack space, and handle the two cases separately. (a) If the access address does not belong to the stack space, the access is handled by the data cache. (b) If the access address belongs to the stack space, fetch from the lower-level memory system the data of the stack cache block containing the missing address that lies between the stack-bottom and stack-top addresses, preserving the contiguity of the stack cache. Allocate a stack cache block whose virtual base address is identical to the base address of the access address. When the data returns from the lower-level memory system, set Vbase in the block tag to the base address, set Pbase in the tag to the physical base address corresponding to the base address, set the Valid field in the tag to 1, fill the PASID field in the block tag with the PASID in the control register, clear the Age field, and increment the Age fields of the other stack cache blocks by 1. The returned data is stored in the Data field of the block's data part, and the corresponding W fields are set to 0.
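Case (b) of step (6) can be sketched as follows (in Python; `fetch` stands in for the lower-level memory interface and `pbase` for the already-translated physical base address — both are assumptions of this sketch, as are the function and field names):

```python
def refill_on_miss(vaddr, pbase, pasid, fetch, offset_bits=12):
    """On a miss inside the stack space: allocate a block whose Vbase
    equals the base of the missing address, fetch the whole block from
    the lower-level memory system, and initialize the tag per step (6):
    Valid set, PASID from the control register, Age cleared, W cleared."""
    base = vaddr >> offset_bits
    block_size = 1 << offset_bits
    data = fetch(pbase << offset_bits, block_size)  # fetch the 4 KB block
    tag = {"Vbase": base, "Pbase": pbase, "Valid": 1,
           "PASID": pasid, "Age": 0}
    w = [0] * block_size  # freshly fetched data is clean
    return tag, data, w

# Example 3's miss: Vaddr 0x7fff7f80, physical base 0x01ff7, process 7.
tag, data, w = refill_on_miss(0x7fff7f80, 0x01ff7, 7,
                              fetch=lambda addr, n: bytes(n))
assert tag["Vbase"] == 0x7fff7 and tag["PASID"] == 7 and len(data) == 4096
```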
Three specific examples are given below: stack space allocation and reclamation, a stack cache access that hits, and a stack cache access that misses. They illustrate in detail how the stack cache proposed by the invention is organized in blocks, adds a process address space identifier field to each stack cache block tag to distinguish the space occupied by different processes, and realizes a stack caching method applicable to context switching.
Example 1: stack space allocation. The virtual address is 32 bits, the stack-bottom address is 0x7fff8000, the stack-top address decreases to 0x7fff7c00, the process address space identifier PASID in the control register is that of process 8, the stack cache block size is 4 KB, the Vbase of each stack cache block tag is 20 bits, the Offset is 12 bits, and the stack cache size is 16 KB, divided into 4 stack cache blocks. There is no stack cache block with process number 8, and no stack cache block whose tag valid bit is 0. The space to be allocated is 1 KB (0x7fff8000 − 0x7fff7c00). Searching for the stack cache block with the largest Age, the second stack cache block has the largest Age of 8; its process number is 1, its stack-top address is 0x7fff7b00, its stack-bottom address is 0x7fff8000, its Vbase is 0x7fff7, its Valid is 1, and its Pbase is 0x00ff7. This stack cache block is evicted: from the stack top of process 1 down to the stack bottom, i.e. from offset 0xb00 to 0xfff, if the W bit of an indexed data-field entry is 1, i.e. dirty, that entry is written back to the lower-level memory system and the value of its W field is then set to 0. The write-back address is the Offset appended to the Pbase. For example, if the data at Offset 0xb00 is dirty and Pbase is 0x00ff7, the write-back address is 0x00ff7b00. The stack cache block size of 4 KB is greater than or equal to the space to be allocated, so it is sufficient for the new process's stack space. Process 8 enters the second stack cache block: the Vbase of the tag is set to 0x7fff7, Valid is set to 1, PASID is set to the value 8 from the control register, Pbase is set to the physical base address 0x01ff7 corresponding to this virtual base address, Bottom is set to 0x7fff8000, Top is set to 0x7fff7c00, and Age is set to 0. The Ages of the other stack cache blocks are incremented by 1. When the stack space is reclaimed and the stack-top address increases to 0x7fff7f00, the corresponding data in the stack cache need not be written to the lower-level memory system; only the stack top needs to be changed to 0x7fff7f00.
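The write-back address computation in Example 1 can be checked numerically (a minimal sketch in Python; the function name is ours):

```python
def writeback_address(pbase: int, offset: int, offset_bits: int = 12) -> int:
    """Physical write-back address: the 20-bit Pbase concatenated with
    the 12-bit Offset, as described in Example 1."""
    return (pbase << offset_bits) | offset

# The dirty entry at offset 0xb00 of the evicted block (Pbase 0x00ff7):
assert writeback_address(0x00ff7, 0xb00) == 0x00ff7b00
```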
Example 2: the address Vaddr of the access instruction is 0x7fff7f80, the process address space identifier PASID in the control register is that of process 7, the stack cache block size is 4 KB, the Vbase of each stack cache block tag is 20 bits, the Offset is 12 bits, and the stack cache size is 16 KB, divided into 4 stack cache blocks. The base address (Base) of the access address is 0x7fff7 and the offset (Offset) is 0xf80. In the stack cache, the virtual base address (Vbase) of the first stack cache block is 0x7fff7, its process address space identifier (PASID) is 7, its valid bit (Valid) is 1, its stack-bottom address (Bottom) is 0x7fff8000, its stack-top address (Top) is 0x7fff7400, its physical base address (Pbase) is 0x01ff7, and its Age is 0. The data indexed by offset 0xf80 in the first stack cache block is 0x01fc00c0. The tag comparison: the access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address (0x7fff7400 ≤ 0x7fff7f80 ≤ 0x7fff8000), the access base address equals the virtual base address 0x7fff7 of the first stack cache block tag, the value of the valid bit field in the first block tag is 1, and the value of the process address space identifier field and the process address space identifier of the access instruction are both 7. The stack cache hit conditions are satisfied, so the hit information is returned. The first stack cache block is selected, the offset indexes the data field to obtain the data 0x01fc00c0, and the hit data 0x01fc00c0 is output.
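The full tag comparison of the invention, including the PASID condition, can be sketched and checked against Examples 2 and 3 (in Python; the dict representation and function name are assumptions of this sketch):

```python
def hit(vaddr, pasid, tag, offset_bits=12):
    """Tag comparison of the invention: the three traditional conditions
    plus equality of the process address space identifier (PASID)."""
    base = vaddr >> offset_bits
    return (tag["Top"] <= vaddr <= tag["Bottom"]   # inside the stack space
            and base == tag["Vbase"]               # bases match
            and tag["Valid"] == 1                  # block is valid
            and pasid == tag["PASID"])             # same process

# Example 2: Vaddr = 0x7fff7f80, PASID = 7 -- a hit.
tag2 = {"Vbase": 0x7fff7, "Valid": 1, "PASID": 7,
        "Bottom": 0x7fff8000, "Top": 0x7fff7400}
assert hit(0x7fff7f80, 7, tag2)

# Example 3 differs only in Vbase (0x7fff6) and Top (0x7fff6000) -- a miss,
# because the access base 0x7fff7 does not equal the cached Vbase.
tag3 = dict(tag2, Vbase=0x7fff6, Top=0x7fff6000)
assert not hit(0x7fff7f80, 7, tag3)
```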
Example 3. The address Vaddr of the access instruction is 0x7fff7f80, the process address space identifier PASID in the control register is that of process 7, the stack cache block size is 4KB, the Vbase field of each tag is 20 bits, the Offset is 12 bits, and the stack cache is 16KB, divided into 4 stack cache blocks. The base address (Base) of the access address is 0x7fff7 and the offset (Offset) is 0xf80. In the tag of the first stack cache block, the virtual base address (Vbase) is 0x7fff6, the process address space identifier (PASID) is 7, the valid bit (Valid) is 1, the stack-bottom address (Bottom) is 0x7fff8000, the stack-top address (Top) is 0x7fff6000, the physical base address (Pbase) is 0x01ff6, and Age is 2. The access address lies between the stack-top and stack-bottom addresses (0x7fff6000 ≤ 0x7fff7f80 ≤ 0x7fff8000), the valid bit in the tag is 1, and the process address space identifier field in the tag matches the PASID of the access instruction (both are 7). However, the access base address does not equal the virtual base address 0x7fff6 of the first stack cache block, no other stack cache block carries process address space identifier 7, and the valid bit of the third stack cache block's tag is 0. The stack cache therefore misses. Since the access address belongs to the stack space, the block containing it, which lies between the stack-top and stack-bottom addresses but is not present in the stack cache, is fetched from the lower-level memory system into the third stack cache block, keeping the cached stack region contiguous. That is, the data from virtual address 0x7fff7000 to 0x7fff7fff are fetched; the corresponding physical addresses are 0x01ff7000 to 0x01ff7fff. When the lower-level memory access returns, the Vbase in the third tag is set to 0x7fff7, Pbase to the corresponding physical base address 0x01ff7, the Valid field to 1, the PASID field to the address space identifier 7 from the control register, and the Age field to 0; the value of the Age field in every other stack cache block is incremented by 1. The returned data are placed in the Data field of the third stack cache block, and the corresponding W bits are cleared to 0.
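The miss-handling sequence of example 3 can be sketched as follows. This is a hypothetical Python model of the refill step only (the fetched data themselves are omitted); the function name `refill` and the tag representation are illustrative assumptions, not the claimed circuit:

```python
OFFSET_BITS = 12
BLOCK_SIZE = 1 << OFFSET_BITS              # 4KB stack cache block

def refill(vaddr: int, pbase: int, pasid: int, tags: list, victim: int):
    """On a stack cache miss inside the stack region, fetch the 4KB
    block containing the access address into a free block (the third
    block in the example) and rebuild its tag; all other blocks age
    by one. Returns the virtual range and physical start fetched."""
    vbase = vaddr >> OFFSET_BITS
    va_lo = vbase << OFFSET_BITS           # start of fetched virtual range
    va_hi = va_lo + BLOCK_SIZE - 1         # end of fetched virtual range
    pa_lo = pbase << OFFSET_BITS           # start of physical range
    for i, t in enumerate(tags):
        if i == victim:
            t.update(vbase=vbase, pbase=pbase, valid=1,
                     pasid=pasid, age=0)   # rebuild the victim's tag
        else:
            t['age'] += 1                  # age every other block
    return va_lo, va_hi, pa_lo

# Example 3: first block holds PASID 7 at Vbase 0x7fff6; third is free.
tags = [dict(vbase=0x7fff6, pbase=0x01ff6, valid=1, pasid=7, age=2),
        dict(vbase=0, pbase=0, valid=0, pasid=0, age=0),
        dict(vbase=0, pbase=0, valid=0, pasid=0, age=0),   # victim
        dict(vbase=0, pbase=0, valid=0, pasid=0, age=0)]
va_lo, va_hi, pa_lo = refill(0x7fff7f80, pbase=0x01ff7, pasid=7,
                             tags=tags, victim=2)
assert (va_lo, va_hi) == (0x7fff7000, 0x7fff7fff)
assert pa_lo == 0x01ff7000
assert tags[2] == dict(vbase=0x7fff7, pbase=0x01ff7, valid=1, pasid=7, age=0)
```

The computed ranges match the example: virtual 0x7fff7000 to 0x7fff7fff mapped to physical 0x01ff7000 onward, with the third tag rebuilt and the other blocks aged.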
The above embodiments make the advantages of the present invention evident. The present invention overcomes the shortcoming of traditional stack cache methods, which are unsuited to process (including thread) context switches, and offers good feasibility.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions may still be made to the present invention, and that any modification or partial substitution that does not depart from the spirit and scope of the present invention shall be encompassed within the scope of the claims of the present invention.