CN103186474B - Method for cleaning the cache of a processor, and associated processor
Abstract
The present invention relates to a method for cleaning the cache of a processor, and to the processor itself. The method includes: generating a specific instruction according to a request, the specific instruction comprising an operation command, a first field, and a second field; obtaining an offset value and a start address according to the first field and the second field; selecting a designated section in the cache according to the start address and the offset value; and cleaning the data stored in the designated section. The specific instruction may comprise a writeback (Writeback), an invalidate (Invalidate), or a writeback plus invalidate (Writeback+Invalidate) operation.
Description
Technical field
The present invention relates to a method for cleaning a cache, and more particularly to a method for cleaning a designated section of the cache of a processor.
Background

A cache is a type of memory whose access speed is faster than that of general random-access memory. Unlike the main system memory (main memory), which generally uses DRAM technology, a cache uses SRAM technology, which is more expensive but faster. Referring to Fig. 1, because the execution speed of the processor (CPU) 10 is much higher than the read speed of the main memory 12, the processor 10 must wait several processor clock cycles whenever it accesses the main memory 12, which wastes processing efficiency. Therefore, when the processor 10 accesses data, its core 102 first looks in the cache 104; if the required data was stored in the cache 104 by a previous operation, the processor 10 need not read it from the main memory 12 and can obtain the desired data directly from the cache 104, thereby increasing access speed and achieving better performance.
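The lookup order described above can be sketched as a toy model (the direct-mapped layout, sizes, and all names here are assumptions for illustration, not part of the patent):

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the access path in Fig. 1: the core first looks in the
 * cache; only on a miss does it read main memory and fill the cache. */
#define CACHE_SLOTS 4u

static struct { bool valid; uint32_t addr; uint32_t data; } cache[CACHE_SLOTS];
static uint32_t main_memory[64];
static unsigned slow_reads;   /* counts accesses that reached main memory */

static uint32_t load(uint32_t addr)
{
    unsigned slot = addr % CACHE_SLOTS;
    if (cache[slot].valid && cache[slot].addr == addr)
        return cache[slot].data;      /* hit: served from the cache */
    slow_reads++;                     /* miss: several cycles spent on main memory */
    cache[slot].valid = true;
    cache[slot].addr  = addr;
    cache[slot].data  = main_memory[addr];
    return cache[slot].data;
}
```

A second access to the same address is served from the cache without touching main memory, which is the performance benefit the passage describes.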
The CPU cache was once a high-end technology used only in supercomputers, but the microprocessors of modern computers all integrate a data cache and an instruction cache of various sizes on the chip, commonly called the L1 cache (Level 1 on-die cache, the first-level on-chip cache). The L2 cache, larger than the L1, was once placed outside the CPU, for example on the motherboard or on the CPU adapter, but has now become a standard component inside the CPU. More expensive top-end desktop and workstation CPUs may even be equipped with a level-3 cache memory (level 3 on-die cache; L3 cache) larger than the L2 cache.
The purpose of providing a cache is to adapt the speed of data access to the processing speed of the CPU. To fully exploit the effect of the cache, modern caches no longer rely only on recently accessed temporary data to provide caching capability; they also cooperate with hardware-implemented branch prediction and data prefetching techniques to fetch, as far as possible, the data that will be used from the main memory into the cache in advance, increasing the probability that the CPU finds the desired data in the cache. Because the capacity of a cache is limited, in addition to effectively prefetching the data the CPU needs, cleaning the data stored in the cache at the right time is also particularly important. The CPU can issue a writeback (Writeback) or invalidate (Invalidate) instruction to the cache according to the demand of the system or of software. Referring to Fig. 1, when the core 102 performs a writeback operation on the cache 104, the data originally stored in the cache 104 is written back to the main memory 12; when an invalidate operation is performed, the core 102 cleans all data in the cache 104. Generally, a writeback instruction is issued together with an invalidate instruction, so that the whole cache is cleaned after its data has been written back to the main memory 12. In the early days, cache capacity was tiny, at most a few KB, so there was no need to consider how to clean a partial section; but caches have now expanded to several MB, and how to clean the data in a particular section of the cache has become a new problem.
Hacking et al. proposed a solution in U.S. Patent No. US 6,978,357; however, this cleaning scheme has several restrictions: first, the size of the selected section must be a multiple of 2; second, only sections of fixed length can be cleaned.
Summary of the invention

An object of the present invention is to propose an instruction format for selecting a section in a cache, and a method of selecting and cleaning a section of the cache of a processor accordingly.

Another object of the present invention is to propose an instruction format for selecting a section in a cache, and a processor that accordingly cleans the selected section of its cache.

According to the present invention, a method for cleaning the cache of a processor includes: generating a specific instruction according to a request, the specific instruction comprising an operation command, a first field, and a second field; obtaining an offset value and a start address according to the first field and the second field; selecting a designated section in the cache according to the start address and the offset value; and cleaning the data stored in the designated section.

According to the present invention, a processor includes: a cache system, including a cache and a cache controller; and a processor core which generates a specific instruction according to a request, the specific instruction comprising an operation command, a first field, and a second field, and which obtains an offset value and a start address according to the first field and the second field; wherein the processor core sends the start address and the offset value to the cache controller, and the cache controller selects a designated section in the cache according to the start address and the offset value and cleans the data stored in the designated section.

The instruction format proposed by the present invention makes both the start address and the size of the section to be cleaned adjustable.
Brief description of the drawings

Fig. 1 is a schematic diagram of a processor architecture in the prior art;
Fig. 2 shows the instruction format proposed according to the present invention;
Fig. 3 is a flowchart according to an embodiment of the present invention; and
Fig. 4 is a schematic diagram of the processor architecture of the embodiment of Fig. 3.

Description of the main reference numerals

10 processor; 102 core; 104 cache; 12 main memory; 20 instruction; 22 OP field; 24 offset field; 26 register field; 40 core; 402 instruction fetch stage; 404 instruction decode stage; 406 address-command generation and issue stage; 42 cache system; 422 cache controller; 424 data cache; 426 instruction cache
Detailed description

The present invention proposes a method for cleaning the cache of a processor. Fig. 2 shows the proposed instruction format. In the instruction 20, the OP field 22 specifies the operation, for example writeback (Writeback), invalidate (Invalidate), or writeback plus invalidate (Writeback+Invalidate); the offset field 24 is for writing an offset value (offset); and the register field 26, denoted rS, points to a register that holds a start address. Generally, a processor is provided with 32 registers, collectively called the register file. In the present embodiment, the register field 26 points to one of these 32 registers, whose stored value is 0x8000_0000; the core therefore takes 0x8000_0000 as the start address (starting address), and the end address is 0x8000_0000 + offset. The offset represented by the offset field 24 may be the number of cache lines (cache lines) covered by the offset.
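As a minimal sketch, the format of Fig. 2 could be packed into a machine word as follows. The patent fixes only the order of the fields (OP, then offset, then rS); the 32-bit width and the bit positions used here are assumptions for illustration:

```c
#include <stdint.h>

/* Hypothetical 32-bit layout for instruction 20 of Fig. 2:
 * bits 31..24 = OP field 22, bits 23..8 = offset field 24,
 * bits 7..0 = register field 26 (rS). Widths are assumed. */
enum op_code { OP_WRITEBACK = 0x1, OP_INVALIDATE = 0x2, OP_WB_INV = 0x3 };

static uint32_t encode_insn(enum op_code op, uint16_t offset, uint8_t rs)
{
    return ((uint32_t)op << 24) | ((uint32_t)offset << 8) | rs;
}

static enum op_code insn_op(uint32_t insn)     { return (enum op_code)(insn >> 24); }
static uint16_t     insn_offset(uint32_t insn) { return (uint16_t)(insn >> 8); }
static uint8_t      insn_rs(uint32_t insn)     { return (uint8_t)insn; }
```

With 8 bits for rS, any of the 32 registers of the register file can be named; the decode stage (404 in Fig. 4) would extract the fields much as the accessor functions do here.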
For example, when the cache line size (cache line size) is 8 bytes, the value of the register pointed to by the register field 26 is 0x0000, and the offset is 1, the end address is rS + offset = 0 bytes + (1 << 3) bytes = 8; the start address is then 0x0000 and the end address 0x0008. According to the instruction in the OP field 22, the CPU writes the data stored at cache addresses 0x0000 to 0x0008 back to the main memory, or cleans it. Thus, by changing the values of the offset field 24 and the register field 26, both the size and the start address of the selected section are adjustable.
Fig. 3 is a flowchart according to an embodiment of the present invention, described in conjunction with the processor architecture diagram of Fig. 4. As mentioned above, the processor includes two parts, a core 40 and a cache system 42, and the processing of the core 40 is further divided into multiple stages, such as an instruction fetch (Instruction Fetch; IF) stage 402, an instruction decode (Instruction Decode; ID) stage 404, and finally an address-command generation and issue (Address-Command Generation & Issue) stage 406. In the present embodiment, after starting at 301, step 302 is performed: according to a request from software, the core 40 fetches the instruction in the instruction fetch stage 402 and decodes it in the instruction decode stage 404 to obtain the operation command and the information related to the offset value and the start address, namely the values in the register field and the offset field that follow the operation command, and then generates the complete command in the address-command generation and issue stage 406. In step 303, the core 40 obtains the start address from the register pointed to by the register field; in step 304, it computes the end address from the start address and the offset value. In step 305, the core 40 sends the operation command, the start address, and the end address to the cache controller 422 of the cache system 42. In step 306, the cache controller 422 performs the specific operation corresponding to the operation command, for example writeback, invalidate, or writeback plus invalidate, on the section between the start address and the end address; the flow then ends at 307. The cache in the cache system 42 can be further divided into two parts, a data cache (data cache) 424 and an instruction cache (instruction cache) 426; the scheme proposed by the present invention is applicable to both kinds of caches at the same time, although the instruction cache 426 generally has no need to perform a writeback instruction.
In the embodiment of Fig. 3, the core 40 provides the start address, the offset value, and the end address to the cache system; but in other embodiments, the end address need not be computed by the core 40: the core provides only the operation command, the start address, and the offset value to the cache system 42, and the cache controller 422 in the cache system then computes the end address.
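A toy model of the controller's range operation in step 306 can illustrate the idea. The direct-mapped valid/dirty bookkeeping and all names here are assumptions for illustration, not the patent's implementation:

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_SIZE 8u    /* bytes per cache line, as in the example above */
#define NUM_LINES 16u   /* a tiny direct-mapped cache for illustration */

static bool     line_valid[NUM_LINES];
static bool     line_dirty[NUM_LINES];
static unsigned writebacks;   /* lines written back to main memory */

/* Apply writeback and/or invalidate to every cache line whose address
 * falls in [start, end), mimicking cache controller 422 in Fig. 4. */
static void range_op(uint32_t start, uint32_t end, bool wb, bool inv)
{
    for (uint32_t addr = start; addr < end; addr += LINE_SIZE) {
        unsigned idx = (addr / LINE_SIZE) % NUM_LINES;
        if (wb && line_valid[idx] && line_dirty[idx]) {
            writebacks++;              /* copy the line back to main memory */
            line_dirty[idx] = false;
        }
        if (inv)
            line_valid[idx] = false;   /* clean the line from the cache */
    }
}
```

Because only the lines between the start and end addresses are touched, the selected section can start anywhere and span any number of lines, which is the flexibility the instruction format is designed to provide.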
The foregoing are merely preferred embodiments of the present invention; all equivalent changes and modifications made within the scope of the patent claims of the present invention shall fall within the scope covered by the present invention.
Claims (14)

1. A method for cleaning the cache of a processor, wherein the cache includes a plurality of cache lines and each cache line includes a plurality of sections, the method comprising:
generating a specific instruction according to a request, the specific instruction comprising an operation command, a first field, and a second field, the operation command including a writeback command;
obtaining an offset value and a start address according to the first field and the second field;
selecting a designated section in the cache according to the start address and the offset value; and
writing the data stored in the designated section back to a memory in response to the writeback command.

2. The method according to claim 1, wherein the step of selecting the designated section according to the start address and the offset value includes:
computing an end address according to the first field and the second field; and
determining the designated section according to the start address, the offset value, and the end address.

3. The method according to claim 1, wherein the request comes from software.

4. The method according to claim 1, wherein the specific instruction comprises an invalidate command.

5. The method according to claim 1, wherein the first field and the second field sequentially follow the operation command, and the second field points to a register.

6. The method according to claim 1, wherein the step of generating the specific instruction according to the request includes decoding the request to generate the specific instruction.

7. The method according to claim 1, wherein the offset value is a number of cache lines of the offset.
8. A processor, comprising:
a cache system, including a cache memory and a cache controller, wherein the cache memory includes a plurality of cache lines and each cache line includes a plurality of sections; and
a processor core which generates a specific instruction according to a request, the specific instruction comprising an operation command, a first field, and a second field, the operation command including a writeback command, and the processor core obtaining an offset value and a start address according to the first field and the second field;
wherein the processor core sends the start address and the offset value to the cache controller; the cache controller selects a designated section in the cache memory according to the start address and the offset value; and the cache controller, in response to the writeback command, writes the data stored in the designated section back to a memory.

9. The processor according to claim 8, wherein the processor core further computes an end address according to the offset value and the start address, and provides the start address, the offset value, and the end address to the cache controller.

10. The processor according to claim 8, wherein the cache controller computes an end address according to the start address and the offset value to determine the designated section.

11. The processor according to claim 8, wherein the request comes from software.

12. The processor according to claim 8, wherein the specific instruction comprises an invalidate command.

13. The processor according to claim 8, further comprising a plurality of registers, wherein the first field and the second field sequentially follow the operation command, and the second field points to one of the plurality of registers.

14. The processor according to claim 8, wherein the offset value is a number of cache lines of the offset.
Priority application: CN201110448085.6A, filed 2011-12-28
Publications: CN103186474A, published 2013-07-03; CN103186474B, granted 2016-09-07