KR20040005275A

KR20040005275A - Optimum Method for Reference of Cache Memory

Info

Publication number: KR20040005275A
Application number: KR1020020039772A
Authority: KR
Inventors: 이유철
Original assignee: 삼성전자주식회사
Priority date: 2002-07-09
Filing date: 2002-07-09
Publication date: 2004-01-16
Also published as: KR100460142B1

Abstract

PURPOSE: A cache memory management method is provided that frees space in the cache when a processor incurs an initial access failure (compulsory miss) in the cache memory, and that removes in advance the cause of a write-back to the main memory when execution of a specific function finishes, so that meaningless garbage data is not transferred between the main memory and the cache memory. CONSTITUTION: A processor reads an instruction (S1). If the instruction starts a new function, that is, requires allocating a stack (S2), the initial access management bits corresponding to the allocated stack area are initialized (S3). If the instruction returns the stack (S4), the dirty bits in the cache memory corresponding to the returned stack are set to 0 (S5). It is then determined whether the instruction is a write to an address within the stack (S6). If it is, and no block for the address exists in the cache memory and the address has not yet been referenced, the contents are not copied from the main memory and the initial access bit for the block is set to 1.

Description

Optimization method for cache memory reference {Optimum Method for Reference of Cache Memory}

The present invention relates to a method for optimizing cache memory references in a computer system comprising a processor (CPU), a cache memory, and a main memory. On an initial access failure (compulsory miss) in the cache memory, the processor frees space in the cache, and when execution of a specific function ends, the cause of a write-back to the main memory is removed in advance, thereby eliminating transfers of meaningless garbage values between the main memory and the cache memory.

In general, a computer system necessarily includes a processor that performs functions such as control and arithmetic operations and a main memory that stores the data for various operations.

Looking at recent development trends in the hardware semiconductor chips inside computers, processor speed has improved dramatically, and microprocessors with performance of 700 MIPS or more are now being announced. Main memory development, by contrast, has focused only on increasing density, with little progress in access time. As a result, the speed gap between the processor and the main memory keeps widening.

Cache memory is what is commonly used to bridge this speed gap between the processor and the main memory and to improve main memory access time.

Typically, a cache memory temporarily stores data such as frequently used instructions so that the processor can access them quickly. It is therefore built from static random access memory (SRAM), which is fast but low in density and expensive, whereas the main memory is built from dynamic random access memory (DRAM).

Because SRAM access is 8 to 16 times faster than DRAM access, the cache memory gives the processor high-speed access to main memory contents and shortens access time by storing frequently used data blocks such as instructions.

FIG. 1 is a block diagram showing how a conventional processor, main memory, and cache memory are interconnected. The processor 10 is connected to the main memory 30 through the cache memory 20, so once the cache memory 20 is enabled, the processor 10 performs every access to instructions and data through the cache memory 20.

Data stored in the main memory 30 can therefore be accessed by the processor 10 only after it has been moved to the cache memory 20. The basic unit of data moved from the main memory 30 to the cache memory 20 is called a block.

In a structure that uses the cache memory 20 in this way, operation proceeds normally when the content the processor 10 needs is present in the cache memory 20. When that content is not present in the cache memory 20, an access failure (miss) occurs.

When a miss occurs, a block must be copied from the main memory 30 to the cache memory 20 so that the needed content is present in the cache memory 20.

Because copying a block from the main memory 30 to the cache memory 20 takes time, a miss slows the processor 10 and significantly affects the performance of the computer.

Misses fall into three categories:

1. Compulsory miss (initial access failure): a miss that occurs when the processor 10 accesses a block for the very first time; also called a first reference miss.

It occurs because the block the processor 10 is trying to access has never been loaded into the cache memory 20.

2. Capacity miss: a miss that occurs because the cache memory 20 cannot hold all blocks.

3. Conflict miss: a miss that occurs when the cache is not fully associative, because multiple blocks map to the same location.
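The distinction between a compulsory miss and the other two kinds can be illustrated with a small trace replay. The following is a minimal sketch, not part of the patent: it assumes a direct-mapped cache, and the function name `classify_misses` and the label `"other"` (which lumps capacity and conflict misses together) are illustrative choices.

```python
def classify_misses(block_trace, num_lines):
    """Replay a trace of block numbers through a direct-mapped cache.

    Returns one label per access: "hit", "compulsory", or "other"
    ("other" covers capacity and conflict misses without separating them).
    """
    lines = [None] * num_lines   # block currently held by each cache line
    seen = set()                 # blocks that have been loaded at least once
    labels = []
    for block in block_trace:
        index = block % num_lines        # direct mapping: one candidate line
        if lines[index] == block:
            labels.append("hit")
        elif block not in seen:
            labels.append("compulsory")  # first reference to this block
        else:
            labels.append("other")       # block was loaded before, then evicted
        lines[index] = block
        seen.add(block)
    return labels


print(classify_misses([0, 4, 0, 1, 1], num_lines=4))
```

In this trace, blocks 0 and 4 share line 0 of a four-line cache, so the second access to block 0 misses even though the cache is mostly empty, which is the conflict case described in item 3.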

The cache memory 20 can also be classified into the following two types according to its data write policy.

1. Write-through: data is written to the cache memory 20 and the main memory 30 at the same time. That is, whenever a write is made to the cache, the corresponding location in the main memory 30 is also written, so that the cache memory 20 and the main memory 30 always hold the same contents.

2. Write-back: a write is first performed on the cache memory 20 only. The existing data is moved to the main memory 30 only when the data of that block in the cache memory 20 must be replaced, after which the block is replaced with the new data. Because a write to the cache memory 20 does not cause a write to the main memory 30, the contents of the two are not always the same.

In the write-back scheme, therefore, a block residing in the cache memory 20 must give up its place for a new block, and the contents of the block that was originally in the cache memory 20 must be preserved.

Hence, when the contents of the cache memory 20 differ from those of the main memory 30, a write-back operation that copies the contents to the main memory 30 is required, and a write to the main memory 30 occurs only when a block in the cache memory 20 is evicted.
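The difference between the two write policies can be made concrete by counting main memory writes for a single block. This is a minimal sketch under simplifying assumptions (one block, written several times and then evicted); the function name `count_memory_writes` and the policy strings are hypothetical.

```python
def count_memory_writes(policy, writes_before_eviction):
    """Count main-memory writes for one block that is written in the cache
    `writes_before_eviction` times and then evicted."""
    if policy not in ("write-through", "write-back"):
        raise ValueError(f"unknown policy: {policy}")

    memory_writes = 0
    dirty = False
    for _ in range(writes_before_eviction):
        if policy == "write-through":
            memory_writes += 1   # every cache write also updates main memory
        else:
            dirty = True         # only record that the block was modified

    # Eviction: a write-back cache copies the block out only if it is dirty.
    if policy == "write-back" and dirty:
        memory_writes += 1
    return memory_writes


print(count_memory_writes("write-through", 5))
print(count_memory_writes("write-back", 5))
```

The dirty flag is the only thing that triggers the write-back on eviction, which is why clearing it is enough to suppress the transfer.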

The basic conventional handling of a miss in the cache memory 20 is to vacate the corresponding location in the cache memory 20 and copy the contents from the corresponding location in the main memory 30 into it.

The contents at the location being evicted can no longer remain in the cache memory 20. If they have been modified, they must be copied to the main memory 30 so that execution is unaffected the next time they are brought into the cache memory 20.

Of course, if no miss occurs, the contents are already present in the cache memory 20 and no such extra handling is needed.

When the address that caused the miss lies within the stack 21, however, there are cases in which copying the contents from the main memory 30 is unnecessary.

This is because, when the stack 21 is first allocated, its contents are garbage values left over from earlier use, so copying them into the cache memory 20 is merely a waste of time.

Consider also the case in which contents of the stack 21 are evicted from the cache memory 20. Eviction causes a write to the corresponding location in the main memory 30, but if the routine that used the stack 21 has already finished executing, those contents no longer matter.

That is, if the address of the block to be evicted belongs to a region of the stack 21 that has already been returned, moving the contents of the cache memory 20 to the main memory 30 is effectively meaningless.

Under the conventional cache memory 20 reference method described above, regardless of whether the data lies in the address range of the stack 21, a miss unconditionally copies the contents from the main memory 30 to the cache memory 20, and an eviction from the cache memory 20 unconditionally copies the contents to the main memory 30, so unnecessary garbage transfers occur.

These unnecessary garbage transfers waste transfer time and ultimately reduce memory access speed.

The present invention was devised to solve these problems. Its object is to provide a method for optimizing cache memory references that improves the overall performance of the processor and memory by freeing space in the cache on an initial access failure in the cache memory and by removing in advance the cause of a write-back to the main memory when execution of a specific function ends, thereby eliminating transfers of meaningless garbage values between the main memory and the cache memory.

FIG. 1 is a block diagram showing the interconnection of a conventional processor, main memory, and cache memory;

FIG. 2 is a flowchart showing a method for optimizing cache memory references according to a preferred embodiment of the present invention;

FIG. 3 is an exemplary diagram explaining in detail the content of step 2 shown in FIG. 2; and

FIG. 4 is an exemplary diagram explaining in detail the content of step 3 shown in FIG. 2.

<Description of reference numerals for the main parts of the drawings>

10: Processor (CPU)

20: Cache memory

30: Main memory

To achieve this object, the present invention provides a method for optimizing cache memory references in a computer system comprising a processor, a cache memory, and a main memory, in which the processor first reads one instruction.

Next, if the instruction read relates to stack allocation in the cache memory, the initial access management bits corresponding to the allocated stack region are initialized to 0.

If the instruction read relates to a stack return in the cache memory, the dirty bits of the blocks in the cache memory corresponding to the returned stack region are cleared to 0.

If the instruction read is a write instruction for an address within the stack in the cache memory, and the block for that address is not present in the cache memory and the address has not yet been referenced, the contents are not copied from the main memory and the initial access bit for the block is changed to 1.
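The three cases above amount to a dispatch on the kind of instruction that was read. The sketch below only illustrates that control flow; the opcode strings (`"alloc_stack"`, `"return_stack"`, `"store_stack"`) and the `Action` enum are hypothetical names, since the source does not define an instruction encoding.

```python
from enum import Enum


class Action(Enum):
    CLEAR_INITIAL_ACCESS_BITS = "initialize initial access bits of the allocated region to 0"
    CLEAR_DIRTY_BITS = "clear dirty bits of blocks in the returned region to 0"
    CHECK_STACK_WRITE = "on a miss with initial access bit 0, skip the copy and set the bit to 1"
    CONVENTIONAL = "no special handling"


def dispatch(opcode):
    """Map a (hypothetical) decoded opcode to the action the method takes."""
    if opcode == "alloc_stack":      # stack allocation, e.g. a function entry
        return Action.CLEAR_INITIAL_ACCESS_BITS
    if opcode == "return_stack":     # stack return, e.g. a function exit
        return Action.CLEAR_DIRTY_BITS
    if opcode == "store_stack":      # write to an address within the stack
        return Action.CHECK_STACK_WRITE
    return Action.CONVENTIONAL       # everything else is handled as usual


for op in ["alloc_stack", "store_stack", "load", "return_stack"]:
    print(op, "->", dispatch(op).name)
```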

Hereinafter, a preferred embodiment of the present invention is described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice it.

Note that, for ease of understanding, the same reference numerals are given to the same components even when they appear in different drawings.

FIG. 2 is a flowchart showing the flow of a method for optimizing cache memory references according to a preferred embodiment of the present invention.

First, the processor 10 reads one instruction (step S1).

If the instruction read marks the start of a new function, that is, it is an instruction relating to allocation of the stack 21 (step S2), the initial access management bits corresponding to the allocated region of the stack 21 are initialized to 0, marking that stack region as empty (step S3).

For ease of understanding, step S3 is explained using the example shown in FIG. 3, in which the first stack pointer is the stack pointer before allocation and the second stack pointer is the stack pointer after allocation.

If the value of the first stack pointer in FIG. 3 is 100008 and the size of the allocated stack 21 is 32, the value of the second stack pointer is 99976.

If the size of one block is 16 and the size of one word is 4, one block contains four words.

With a stack 21 size of 32, therefore, the eight words in total are spread over two or three blocks (three in this case).

Assume that address 100004 is the start address of word 1 and that 99976 is the start address of word 8. Then words 1 and 2 belong to block A, the four words 3, 4, 5, and 6 belong to block B, and words 7 and 8 belong to block C.

Block A was already managed when the previous stack was allocated, but blocks B and C are newly allocated blocks, so the initial access bits for blocks B and C are cleared to 0.

Here, 0 means that no data reference to the block has occurred yet.
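The arithmetic of the FIG. 3 example can be checked with a short sketch. It assumes that a block number is the address divided by the block size and that a block counts as newly allocated when it lies entirely below the old stack pointer; that rule is one reading of the example (it excludes block A and includes blocks B and C), not something the source states explicitly. The function names are illustrative.

```python
BLOCK_SIZE = 16   # block size used in the example
WORD_SIZE = 4     # word size used in the example


def block_of(addr):
    """Block number containing an address (assumed: address // block size)."""
    return addr // BLOCK_SIZE


def newly_allocated_blocks(old_sp, size):
    """Blocks touched by the new stack region that lie entirely below old_sp."""
    new_sp = old_sp - size   # the stack grows toward lower addresses
    return list(range(block_of(new_sp), block_of(old_sp)))


old_sp, size = 100008, 32
new_sp = old_sp - size                                   # second stack pointer

# Word k (1..8) starts WORD_SIZE * k below the old stack pointer.
word_addr = {k: old_sp - WORD_SIZE * k for k in range(1, size // WORD_SIZE + 1)}

words_by_block = {}
for k, addr in word_addr.items():
    words_by_block.setdefault(block_of(addr), []).append(k)

# Clear the initial access bits of the newly allocated blocks (B and C).
initial_access_bit = {b: 0 for b in newly_allocated_blocks(old_sp, size)}

print(new_sp)              # second stack pointer
print(words_by_block)      # block 6250 is A, 6249 is B, 6248 is C
print(initial_access_bit)
```

With these numbers, block A is the block starting at address 100000, block B starts at 99984, and block C starts at 99968. Block A also covers addresses at and above 100008, which belong to the previous stack region, so its bit is left alone.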

If the instruction read in step S1 is an instruction relating to return of the stack 21 (step S4), the dirty bits of the blocks in the cache memory 20 corresponding to the returned region of the stack 21 are set to 0, discarding the data (step S5).

For ease of understanding, step S5 is explained using the example shown in FIG. 4, in which the third stack pointer is the stack pointer before the return and the fourth stack pointer is the stack pointer after the return.

If the value of the third stack pointer is 99976 and the size of the returned stack 21 is 32, the value of the fourth stack pointer is 100008.

If the size of one block is 16 and the size of one word is 4, one block contains four words.

With a stack 21 size of 32, therefore, the words are spread over two or three blocks (three in this case).

Assume that address 100004 is the start address of word 1 and that 99976 is the start address of word 8. Then words 1 and 2 belong to block A, the four words 3, 4, 5, and 6 belong to block B, and words 7 and 8 belong to block C.

Block A still contains contents of the stack 21 that have not been returned, but blocks B and C have been returned, so the contents of those blocks are already garbage.

The dirty bits of these blocks are therefore cleared to 0 to prevent their contents from being copied to the main memory 30 when they are evicted from the cache.
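The effect of step S5 on later evictions can be sketched as follows. The sketch assumes the same address-to-block mapping as the example (address divided by block size) and treats a block as returned when it lies entirely below the stack pointer after the return; `release_stack` and `evict` are hypothetical helper names, and the dirty bits are modeled as a plain dictionary.

```python
BLOCK_SIZE = 16   # block size used in the example


def release_stack(dirty, sp_before_return, size):
    """Clear the dirty bit of every cached block in the returned region.

    `dirty` maps block number -> 0 or 1 for blocks currently in the cache.
    Returns the stack pointer after the return.
    """
    sp_after_return = sp_before_return + size
    first = sp_before_return // BLOCK_SIZE
    last = sp_after_return // BLOCK_SIZE      # exclusive: may still hold live data
    for block in range(first, last):
        if block in dirty:
            dirty[block] = 0                  # contents are garbage now
    return sp_after_return


def evict(dirty, block):
    """Remove a block from the cache; return True if a write-back is needed."""
    return dirty.pop(block, 0) == 1


# Blocks C (6248), B (6249), and A (6250) were all written by the function.
dirty = {6248: 1, 6249: 1, 6250: 1}
sp = release_stack(dirty, sp_before_return=99976, size=32)
write_backs = sum(evict(dirty, b) for b in [6248, 6249, 6250])

print(sp)            # fourth stack pointer
print(write_backs)   # only block A still needs a write-back
```

Without the call to `release_stack`, all three evictions would trigger a write-back, two of which would carry only garbage.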

It is then determined whether the instruction read in step S1 is a write instruction for an address within the stack 21 (step S6). If it is, and the block for the address is not present in the cache memory 20 (a miss) and the address has not yet been referenced (the initial access bit is 0), the contents are not copied from the main memory 30 and the initial access bit for the block is changed to 1 (step S7).

Only the write instruction is singled out here because the contents of an address that has not been written are garbage, so an instruction that reads such an address is unlikely to exist.

It is therefore usual for any given address within the stack 21 to be written before it is read.

That is, the initial access can be regarded as occurring in a write instruction.

If the access is an initial access (the initial access bit is 0) and the address causes a miss, separate handling is required to eliminate the overhead of copying the contents of the main memory 30 to the cache memory 20.

After the contents of the block to be evicted from the cache memory 20 have been transferred to the main memory 30, the newly allocated block is handled as if it had not caused a miss.

Because a block that has not missed is treated as already present in the cache memory 20, no transfer from the main memory 30 to the cache memory 20 takes place.

Since the block containing this address is no longer in the initial access state, the value of the initial access bit for the block is changed to 1.

Here, the value 1 indicates that the contents of the block are no longer garbage.
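The write handling of steps S6 and S7 can be sketched with a small model that counts fetches from the main memory. This is a simplified illustration under stated assumptions: the cache is treated as unbounded, so eviction of a victim block is left out, and the class name `StackWriteModel`, its fields, and its methods are all hypothetical.

```python
BLOCK_SIZE = 16   # block size used in the example


class StackWriteModel:
    """Toy model of a write-back cache with per-block initial access bits."""

    def __init__(self):
        self.cached = set()            # blocks currently present in the cache
        self.initial_access_bit = {}   # block -> 0 (not yet referenced) or 1
        self.dirty = {}                # block -> 1 once written
        self.fetches = 0               # block copies from main memory

    def allocate(self, blocks):
        """Step S3: mark newly allocated stack blocks as not yet referenced."""
        for block in blocks:
            self.initial_access_bit[block] = 0

    def write(self, addr):
        """Steps S6 and S7 for a write to a stack address."""
        block = addr // BLOCK_SIZE
        if block not in self.cached:                      # miss
            if self.initial_access_bit.get(block) == 0:
                self.initial_access_bit[block] = 1        # skip the fetch
            else:
                self.fetches += 1                         # conventional fill
            self.cached.add(block)
        self.dirty[block] = 1


model = StackWriteModel()
model.allocate([6248, 6249])   # blocks C and B from the example
model.write(99996)             # word 3 in block B: miss, bit 0, no fetch
model.write(99992)             # word 4 in block B: hit
model.write(100004)            # word 1 in block A: miss, conventional fetch
print(model.fetches)
print(model.initial_access_bit)
```

Only the write to block A, which has no cleared initial access bit in this model, costs a transfer from the main memory; the first write to block B installs the block without one.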

Although a preferred embodiment of the present invention has been described in detail above, those of ordinary skill in the art to which the invention pertains will appreciate that the invention may be practiced with various modifications or changes without departing from the spirit and scope of the invention as defined in the appended claims.

In particular, the cache memory 20 reference optimization method described above can be applied in a variety of environments in which allocation and return of the stack 21 occur frequently and data transfers between the cache memory 20 and the main memory 30 are frequent, and it is expected to perform well there.

Accordingly, modifications of future embodiments of the present invention will not depart from the technology of the present invention.

As described above, unlike existing cache memory management methods, according to the present invention the processor frees space in the cache on an initial access failure in the cache memory and removes in advance the cause of a write-back to the main memory when execution of a specific function ends. This eliminates transfers of meaningless garbage values between the main memory and the cache memory and thereby improves the overall performance of the processor and memory.

Claims

A method for optimizing cache memory references in a computer system comprising a processor, a cache memory, and a main memory, the method comprising:

reading one instruction;

initializing to 0 the initial access management bits corresponding to an allocated stack region if the instruction read in the reading step relates to stack allocation in the cache memory;

clearing to 0 the dirty bit of each block in the cache memory corresponding to a returned stack region if the instruction read in the reading step relates to a stack return in the cache memory; and

if the instruction read in the reading step is a write instruction for an address within the stack in the cache memory, and the block for the address does not exist in the cache memory and the address has not yet been referenced, not copying the contents from the main memory and changing the initial access bit for the block to 1.