WO1989009444A1 - Cache memory having at least two different fill sizes - Google Patents
- Publication number
- WO1989009444A1 WO1989009444A1 PCT/US1989/001314 US8901314W WO8909444A1 WO 1989009444 A1 WO1989009444 A1 WO 1989009444A1 US 8901314 W US8901314 W US 8901314W WO 8909444 A1 WO8909444 A1 WO 8909444A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cache
- data
- miss
- filling
- requested information
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0842—Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking
Definitions
- the present invention relates to the field of digital computers and their architecture. More particularly, it relates to the filling of cache memories used in such computers.
- Random Access Memories (RAMs) with fast access times are commonly used as 'cache' memories. These caches are generally small, on the order of a few thousand bytes, in order to allow the rapid retrieval of data. Since there are few complete programs or data bases that can be stored in memories of that size, computer systems also incorporate memories with larger capacities but slower access and retrieval times. These can include larger RAMs with slower retrieval speeds, bubble memories, and disc memories of various types.
- a commonly used method to optimize computer operations is to couple a cache memory directly to the CPU and another, larger, memory unit to both the cache memory and the CPU. In this manner the cache can supply the CPU with the instructions and data needed immediately at a rate which will allow unimpeded CPU operation.
- the main memory usually supplies refill data to the cache, keeping it full. If an instruction or a required piece of data is not in the cache when the CPU requires it, it can be obtained from the main memory, at the expense of the extra time that this requires.
- a memory can be mapped in at least two ways.
- the first is physical mapping where instructions refer to the actual physical address where the required data is stored.
- the second way is virtual mapping.
- the instruction refers to a virtual address which must be translated in some fashion to obtain the physical address where the data is stored.
- Virtual mapping allows better main memory utilization and is particularly useful in multiprogramming environments as the memory can be allocated without contiguous partitions between the users. Both physically and virtually mapped caches are currently being used in computer design.
- the physical location of the cache memory also plays an important role in optimizing computer operation. CPU operations are performed with virtual addresses. If the computer system uses a virtually mapped cache it becomes advantageous to couple the cache directly to the CPU.
- from time to time, the virtual to physical address translation map of a virtually mapped cache changes. When this occurs, the cache must be flushed (cleared) and refilled under the new map.
- after the cache is flushed, it is refilled with new data and instructions.
- in the prior art, after the cache was flushed, it was refilled at the same rate at which data or instructions were fed to the cache when a given program had been running for a long period of time. Caches work most efficiently when completely full, since fewer attempts to find data or instructions in the cache then result in misses that require a search of main memory. Consequently, when the cache was refilled at a constant rate after flushing, numerous "misses" requiring reference to and response from the main memory occurred, resulting in inefficient cache utilization. On the other hand, if the cache is continually refilled or refreshed at a very high rate, other problems occur, such as writing over data or instructions which are still current and useful.
- the present invention provides a method of filling a cache in a computer with information.
- the method includes the steps of searching the cache for requested information and generating a miss signal when the requested information is not found in the cache, and examining a valid bit of a data block in the cache where the requested information should be located when the miss signal is generated.
- N data blocks are written to the cache if the valid bit is not on, which indicates that the data in the block was used previously but is no longer current. These N data blocks will include the data block containing the requested information. If the valid bit is on, P blocks of data are written to the cache at one time, where P is less than N, and these P data blocks include a data block that contains the requested information.
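The two fill sizes can be sketched as follows; a minimal Python sketch assuming N = 4 and P = 1 (the values the first embodiment below uses), with an illustrative function name:

```python
# Fill-size decision on a cache miss, assuming N = 4 and P = 1.
N_BLOCKS = 4  # fast refill: N naturally aligned blocks
P_BLOCKS = 1  # cautious refill: a single block

def fill_size_on_miss(valid_bit: bool) -> int:
    """Return how many blocks to fetch after a miss."""
    if not valid_bit:
        # Nothing has been written here since the last flush, so the
        # cache is still being repopulated: refill at the fast rate.
        return N_BLOCKS
    # The slot holds valid data; refill conservatively to avoid
    # writing over blocks that may still be useful.
    return P_BLOCKS
```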
- FIG. 1 shows the structure of a data block stored in a cache
- FIG. 2 is a block diagram of a computer system which utilizes a virtually mapped cache.
- caches generally store information in blocks of data.
- Each block, here numbered 10, 20, 30, 40, and 50 respectively, contains a valid bit, a tag field, a Process Identification (PID) field, and a data field.
- the valid bit is used to determine if the information contained in the block is valid.
- when the cache is flushed, all valid bits in each of the data blocks are set to 0, indicating invalid data and allowing the present contents of the block to be written over.
- when new data is written to a block, the valid bit is turned on, indicating that the data contained therein is valid and usable.
- each user's program is allowed to run for a certain amount of time, whereupon another program is run. For reasons which will be discussed later, it is useful to identify each program being run with a unique PID number.
- the data field is where the data stored in each block is actually located.
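The block layout of FIG. 1 might be modeled as below; the field types and defaults are illustrative assumptions, not part of the patent:

```python
from dataclasses import dataclass

@dataclass
class CacheBlock:
    valid: bool = False  # 0 after a flush; turned on once valid data is written
    tag: int = 0         # address tag checked on each lookup
    pid: int = 0         # Process Identification of the owning program
    data: bytes = b""    # the cached data itself
```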
- cache 100 is virtually mapped and coupled to CPU 120.
- Cache 100 can be a translation buffer, for example, that caches virtual to physical translations.
- Main memory 160 is coupled to both cache 100 and CPU 120, as well as a plurality of input/output devices, not shown.
- virtually mapped caches must be flushed every time the virtual to physical mapping changes. One instance of when this occurs is when one running program is changed for another in the CPU.
- the present invention optimizes the refilling of the virtual cache through hardware in the following manner.
- a cache miss is, in other words, a failure to find desired data or instructions in the cache: the address tag being used for the search refers to a particular block, but the block contains different data or invalid data.
- a first embodiment of the invention checks to see if the valid bit is off or on. If it is off, it means that no data has been written to this block since the last flush and that therefore the cache should be refilled at a fast rate equal to N blocks at a time. If the valid bit is on, the cache is refilled with one block, based on the assumption that useful data already exists and it would waste time to write over useful data.
- the principle of spatial locality which has been discovered to operate in computer environments, states that when a given block of data or instructions is needed, it is very likely that contiguous blocks of data or instructions will also be required.
- the number of blocks N is equal to four. Therefore, four blocks which are naturally aligned to one another are used to refill the cache.
- the blocks are chosen in naturally aligned groups of four blocks; more precisely, block numbers 0 to 3, 4 to 7, etc. are fetched as a group if the "missed" block falls within that group. For example, if block 2 was found to be invalid, blocks 0 to 3 would be fetched.
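The naturally aligned group can be found by rounding the missed block number down to a multiple of N; a sketch (block numbers, not byte addresses, are assumed):

```python
N = 4  # group size used in the first embodiment

def aligned_group(block_number: int) -> range:
    """Blocks fetched together when `block_number` misses."""
    start = (block_number // N) * N  # round down to a multiple of N
    return range(start, start + N)
```

As in the example above, a miss on block 2 yields the group of blocks 0 to 3.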
- a second embodiment of this invention relies on both the PID number and the valid bit and is particularly useful in a multi-user computer system where a number of different programs or processes are run at nearly the same time.
- Each PID represents a unique number which refers to one of at least thirty-two processes or programs which are running at nearly the same time on a single CPU.
- the valid bit is checked after every miss. If the valid bit is off, the situation is considered identical to that described in the first embodiment -- the area of the cache is empty, and an N block refill occurs. If the valid bit is on, a second comparison is made, this time between the PID of the process being run and that of the particular block being read.
- the miss is refilled with N blocks in this instance also. Only if the valid bit is on and the PID numbers match is the miss refilled with one block. This avoids writing over data which may still be useful to the process being run.
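The second embodiment's decision, combining the valid bit and the PID comparison, can be sketched like this (names and the n/p defaults are illustrative):

```python
def fill_size(valid_bit: bool, block_pid: int, current_pid: int,
              n: int = 4, p: int = 1) -> int:
    """Blocks to fetch on a miss under the valid-bit + PID scheme."""
    if not valid_bit:
        return n  # empty since the last flush: refill fast
    if block_pid != current_pid:
        return n  # data left by another process: safe to write over
    return p      # valid data of the running process: refill one block
```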
- a further embodiment stores the address of the last miss.
- the locations of the two misses are compared. If they occurred in the same aligned group of blocks, for example, at blocks 2 and 3, it is assumed that the program being run has moved to a new area, requiring new data and instructions, and the miss is refilled with N blocks. This condition is in addition to those described in the previous embodiment.
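The comparison against the stored last-miss address reduces to a same-group test; a sketch assuming block numbers and the four-block groups described earlier:

```python
def same_aligned_group(prev_miss: int, curr_miss: int, n: int = 4) -> bool:
    """True when two miss addresses fall in the same aligned group of n blocks."""
    return prev_miss // n == curr_miss // n
```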
- a still further embodiment provides a miss counter.
- the miss counter keeps track of all the read misses that occur even when the valid bit is on and the PID numbers match. If this count exceeds a pre-determined threshold, each miss is refilled with N blocks. In this case it is assumed that the program being run has reached some transition and jumped to a new region, requiring a change of the data and instructions. As soon as a hit occurs, the counter is reset to zero. With this embodiment, it is alternatively contemplated to decrement the counter upon each hit. Only when the counter decreases below a second pre-determined threshold will a miss be refilled with one block.
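The miss-counter embodiment might look like the sketch below; the threshold value and the reset-on-hit policy are the simple variant, with N and P as illustrative defaults:

```python
class MissCounter:
    """Counts consecutive read misses that occur even though the valid
    bit is on and the PID matches; past a threshold, misses are refilled
    with n blocks instead of p."""

    def __init__(self, threshold: int, n: int = 4, p: int = 1):
        self.count = 0
        self.threshold = threshold
        self.n, self.p = n, p

    def fill_on_miss(self) -> int:
        self.count += 1
        return self.n if self.count > self.threshold else self.p

    def on_hit(self) -> None:
        self.count = 0  # the simple variant resets on any hit
```

The alternative mentioned above would decrement the counter on each hit instead of resetting it, switching back to p-block refills only below a second threshold.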
- a further embodiment examines which block in an aligned group of blocks is being read. As in the last two described embodiments, if the valid bit is off and/or if the PID numbers do not match, misses are refilled with N blocks. Even if the valid bit is on and the PID numbers match, if the block being examined is the first block in the aligned group of blocks, the miss is refilled with N blocks. This decision is based upon the traffic patterns of certain programs and data sets.
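This last rule folds into the earlier decision as one extra test; a sketch with illustrative names, assuming block numbers and aligned groups of n:

```python
def fill_size_first_block(valid_bit: bool, pid_match: bool,
                          block_number: int, n: int = 4, p: int = 1) -> int:
    """A miss on the first block of an aligned group always triggers a
    full n-block refill, even with valid data and a matching PID."""
    if not valid_bit or not pid_match:
        return n
    if block_number % n == 0:  # first block of its aligned group
        return n
    return p
```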
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The invention concerns optimizing the performance of a cache memory system. During operation of a computer system whose processor (120) is supported by a virtual cache (100), the cache must be flushed and refilled so that old data can be replaced by more current data. The cache is refilled with either P or N (N > P) blocks of data. Several methods for dynamically selecting N or P blocks are possible. For example, immediately after the cache is flushed, a miss is refilled with N blocks, allowing data to be moved into the cache at a high rate. Once the cache is nearly full, misses tend to be refilled with P blocks. This keeps the data in the cache current while avoiding writing over data already present in the cache. The invention is useful in a multi-user, multitasking system in which the running program changes often, requiring frequent flushing and clearing of the cache.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE68924896T DE68924896T2 (de) | 1988-04-01 | 1989-03-30 | Cachespeicher mit mindestens zwei füllgrössen. |
EP89904922A EP0359815B1 (fr) | 1988-04-01 | 1989-03-30 | Antememoire ayant au moins deux tailles de remplissage differentes |
KR1019890701200A KR930002786B1 (ko) | 1988-04-01 | 1989-03-30 | 적어도 2가지 필사이즈를 갖는 캐쉬 메모리 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17659688A | 1988-04-01 | 1988-04-01 | |
US176,596 | 1988-04-01 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1989009444A1 true WO1989009444A1 (fr) | 1989-10-05 |
Family
ID=22645015
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1989/001314 WO1989009444A1 (fr) | 1988-04-01 | 1989-03-30 | Antememoire ayant au moins deux tailles de remplissage differentes |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP0359815B1 (fr) |
JP (1) | JP2700148B2 (fr) |
KR (1) | KR930002786B1 (fr) |
CA (1) | CA1314107C (fr) |
DE (1) | DE68924896T2 (fr) |
WO (1) | WO1989009444A1 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1182562A1 (fr) * | 2000-08-21 | 2002-02-27 | Texas Instruments France | Antémémoire intelligente à préextraction de bloc interruptible |
WO2004019213A1 (fr) | 2002-08-23 | 2004-03-04 | Koninklijke Philips Electronics N.V. | Procede de prelecture permettant d'adapter des caracteristiques de protocole de bus memoire |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4370710A (en) * | 1980-08-26 | 1983-01-25 | Control Data Corporation | Cache memory organization utilizing miss information holding registers to prevent lockup from cache misses |
US4392200A (en) * | 1980-01-28 | 1983-07-05 | Digital Equipment Corporation | Cached multiprocessor system with pipeline timing |
US4442488A (en) * | 1980-05-05 | 1984-04-10 | Floating Point Systems, Inc. | Instruction cache memory system |
US4489378A (en) * | 1981-06-05 | 1984-12-18 | International Business Machines Corporation | Automatic adjustment of the quantity of prefetch data in a disk cache operation |
1989
- 1989-03-30 DE DE68924896T patent/DE68924896T2/de not_active Expired - Fee Related
- 1989-03-30 JP JP1504733A patent/JP2700148B2/ja not_active Expired - Lifetime
- 1989-03-30 EP EP89904922A patent/EP0359815B1/fr not_active Expired - Lifetime
- 1989-03-30 WO PCT/US1989/001314 patent/WO1989009444A1/fr active IP Right Grant
- 1989-03-30 KR KR1019890701200A patent/KR930002786B1/ko not_active IP Right Cessation
- 1989-03-31 CA CA000595321A patent/CA1314107C/fr not_active Expired - Fee Related
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4392200A (en) * | 1980-01-28 | 1983-07-05 | Digital Equipment Corporation | Cached multiprocessor system with pipeline timing |
US4442488A (en) * | 1980-05-05 | 1984-04-10 | Floating Point Systems, Inc. | Instruction cache memory system |
US4370710A (en) * | 1980-08-26 | 1983-01-25 | Control Data Corporation | Cache memory organization utilizing miss information holding registers to prevent lockup from cache misses |
US4489378A (en) * | 1981-06-05 | 1984-12-18 | International Business Machines Corporation | Automatic adjustment of the quantity of prefetch data in a disk cache operation |
Non-Patent Citations (1)
Title |
---|
See also references of EP0359815A4 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1182562A1 (fr) * | 2000-08-21 | 2002-02-27 | Texas Instruments France | Antémémoire intelligente à préextraction de bloc interruptible |
US6678797B2 (en) | 2000-08-21 | 2004-01-13 | Texas Instruments Incorporated | Cache/smartcache with interruptible block prefetch |
WO2004019213A1 (fr) | 2002-08-23 | 2004-03-04 | Koninklijke Philips Electronics N.V. | Procede de prelecture permettant d'adapter des caracteristiques de protocole de bus memoire |
CN100390757C (zh) * | 2002-08-23 | 2008-05-28 | Nxp股份有限公司 | 处理器预取以匹配存储器总线协议特性 |
Also Published As
Publication number | Publication date |
---|---|
EP0359815B1 (fr) | 1995-11-22 |
JPH02500552A (ja) | 1990-02-22 |
EP0359815A1 (fr) | 1990-03-28 |
JP2700148B2 (ja) | 1998-01-19 |
CA1314107C (fr) | 1993-03-02 |
KR900700959A (ko) | 1990-08-17 |
DE68924896D1 (de) | 1996-01-04 |
EP0359815A4 (en) | 1992-04-01 |
KR930002786B1 (ko) | 1993-04-10 |
DE68924896T2 (de) | 1996-07-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5038278A (en) | Cache with at least two fill rates | |
EP1654660B1 (fr) | Methode de mise en cache de donnees | |
US6038647A (en) | Cache memory device and method for providing concurrent independent multiple accesses to different subsets within the device | |
US5689679A (en) | Memory system and method for selective multi-level caching using a cache level code | |
US6823428B2 (en) | Preventing cache floods from sequential streams | |
EP0667580B1 (fr) | Système d'antémémoire pour une mémoire | |
KR100240912B1 (ko) | 데이터 프리페치 장치 및 시스템, 캐시 라인 프리페치 방법 | |
US5214770A (en) | System for flushing instruction-cache only when instruction-cache address and data-cache address are matched and the execution of a return-from-exception-or-interrupt command | |
US20070094450A1 (en) | Multi-level cache architecture having a selective victim cache | |
US6965970B2 (en) | List based method and apparatus for selective and rapid cache flushes | |
KR20170098187A (ko) | 저장 서브시스템을 위한 연관적 및 원자적 라이트-백 캐싱 시스템 및 방법 | |
JPH1196074A (ja) | 交換アルゴリズム動的選択コンピュータシステム | |
KR20010101695A (ko) | 가상 메모리 시스템에서의 메모리 접근 개선 기술 | |
US8621152B1 (en) | Transparent level 2 cache that uses independent tag and valid random access memory arrays for cache access | |
US6745292B1 (en) | Apparatus and method for selectively allocating cache lines in a partitioned cache shared by multiprocessors | |
EP0543991A1 (fr) | Amelrioration des performances d'un ordinateur par associativite simulee de l'antememoire | |
US5897651A (en) | Information handling system including a direct access set associative cache and method for accessing same | |
EP0675443A1 (fr) | Dispositif et procédé d'accès à antémémoire à mappage direct | |
US6311253B1 (en) | Methods for caching cache tags | |
KR100379993B1 (ko) | 컴퓨터 시스템에서 캐시 라인 교체를 관리하기 위한 방법및 장치 | |
EP0359815B1 (fr) | Antememoire ayant au moins deux tailles de remplissage differentes | |
US6792512B2 (en) | Method and system for organizing coherence directories in shared memory systems | |
US7143239B2 (en) | Cache structure and methodology | |
KR100486240B1 (ko) | 분리된 캐쉬 메모리를 구비한 마이크로프로세서 및 메모리 액세스 방법 | |
CA1315004C (fr) | Commande de vidage d'antememoire a instructions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | Wipo information: entry into national phase | Ref document number: 1989904922; Country of ref document: EP |
| AK | Designated states | Kind code of ref document: A1; Designated state(s): JP KR |
| AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH DE FR GB IT LU NL SE |
| WWP | Wipo information: published in national office | Ref document number: 1989904922; Country of ref document: EP |
| WWG | Wipo information: grant in national office | Ref document number: 1989904922; Country of ref document: EP |