KR20010015045A

KR20010015045A - Method and apparatus for accessing special purpose registers within a processor

Info

Publication number: KR20010015045A
Application number: KR1020000033552A
Authority: KR
Inventors: 요르단폴요셉; 튜벨아미
Original assignee: 포만 제프리 엘; 인터내셔널 비지네스 머신즈 코포레이션
Priority date: 1999-06-29
Filing date: 2000-06-19
Publication date: 2001-02-26
Also published as: JP2001022578A

Abstract

PURPOSE: To provide an improved method and apparatus for accessing special purpose registers within a processor that eliminates the delay associated with special purpose register access. CONSTITUTION: A cache inhibit bit and a special purpose register access bit are provided. A group of special purpose registers (34) is connected to a data cache (13) via a cache bypass bus (31). The cache bypass bus (31) can be used for cache-inhibited accesses when the cache inhibit bit is asserted. When the special purpose register access bit is asserted, however, data from a special purpose register operation can be transferred to the special purpose registers (34) via the cache bypass bus (31).

Description

METHOD AND APPARATUS FOR ACCESSING SPECIAL PURPOSE REGISTERS WITHIN A PROCESSOR

The present invention relates generally to a method and apparatus for data processing, and in particular to a method and apparatus for accessing registers within a processor. More specifically, the present invention relates to a method and apparatus for accessing special purpose registers within a processor.

In current processor architectures, several kinds of registers, known as special purpose registers or configuration registers, are used to communicate configuration and status information between the processor and software. During normal operation, these special purpose registers supply the processor with various information that alters how the processor behaves. Because of this communication role, special purpose registers must be readable and writable by software, often after they have been updated as required by the processor.

Although the access method exposed to software is common to all special purpose registers within a processor, the registers themselves are usually distributed across the processor's functional blocks according to where the information they hold is used locally. In prior art designs, therefore, a special purpose register is typically placed near the particular block of hardware that reads and writes it. Access to these widely distributed special purpose registers is provided by dedicated control circuitry (for example, a register engine) and a dedicated bus with its own protocol. Because the dedicated bus must reach every functional unit that contains a special purpose register, it is routed through all the different areas of the processor. As a result of this long, distributed bus organization, special purpose register access instructions incur a large latency. It would therefore be desirable to provide an improved method and apparatus for accessing special purpose registers within a processor in which the latency associated with special purpose register access is eliminated.

The invention itself, as well as an improved mode of use, further objects, and attendant advantages, will be better understood from the following detailed description of an embodiment when read with reference to the accompanying drawings.

In accordance with an improved embodiment of the present invention, a cache inhibit bit and a special purpose register access bit are provided. Several groups of special purpose registers are connected to the data cache via a cache bypass bus. When the cache inhibit bit is asserted, the cache bypass bus can be used for cache-inhibited accesses. When the special purpose register access bit is asserted, however, data from a special purpose register operation is allowed to be transferred to the special purpose registers via the cache bypass bus.

All objects, features, and advantages of the present invention will become apparent from the following detailed description.

FIG. 1 is a block diagram of a processor in which an improved embodiment of the present invention may be incorporated.

FIG. 2 is a block diagram of the placement of special purpose registers within a processor according to the prior art.

FIG. 3 is a block diagram of the placement of special purpose registers within the processor of FIG. 1, in accordance with an improved embodiment of the present invention.

FIG. 4 is a high-level logic flow diagram of a method for remapping SPR accesses within a processor, in accordance with an improved embodiment of the present invention.

<Explanation of reference numerals for the main parts of the drawings>

10: processor

11: instruction unit

12: bus interface unit

13: data cache

14: instruction cache

15: integer unit

16: load/store unit

17: floating-point unit

The present invention may be implemented in a variety of processors, for example a reduced instruction set computer (RISC) processor or a complex instruction set computer (CISC) processor. For purposes of illustration, the improved embodiment described below is implemented in a RISC processor such as the PowerPC, manufactured by IBM of Armonk, New York.

Reference is now made to the drawings, and in particular to FIG. 1, which is a block diagram of a processor incorporating an improved embodiment of the present invention. Within processor 10, a bus interface unit 12 is coupled to a data cache 13 and an instruction cache 14. Data cache 13 and instruction cache 14 are both high-speed set-associative caches that let processor 10 hold a subset of the data or instructions previously transferred from main memory (not shown) and access it with relatively short latency. Instruction cache 14 is further coupled to an instruction unit 11, which fetches instructions from instruction cache 14 during each execution cycle.

Processor 10 also includes at least three execution units: an integer unit 15, a load/store unit (LSU) 16, and a floating-point unit 17. Each of execution units 15-17 can execute one or more classes of instructions, and all of execution units 15-17 can operate concurrently during each processor cycle. After an instruction has finished executing, the execution unit concerned stores the data results in a respective rename buffer, depending on the instruction type. That execution unit then signals a completion unit 20 that the instruction has finished execution. Finally, each instruction is completed in program order, and the result data is transferred from the respective rename buffer to a general purpose register 18 or a floating-point register 19.

Reference is now made to FIG. 2, which is a block diagram of the placement of special purpose registers within a processor 21 according to the prior art. As shown, several special purpose registers (SPRs) 24 are connected to an SPR engine 22 via an SPR bus 23. SPR engine 22 is further coupled to the data path for a data cache 25. A stream of instruction opcodes 29 that access SPRs 24 is delivered to SPR engine 22 via a bus 26. The instruction opcodes include instructions that can change the contents of SPRs 24. For example, they may include instructions that load data into or store data from SPRs 24 (such as the dedicated mfspr and mtspr instructions in a PowerPC processor).

In the prior art, SPRs 24 are typically placed near a particular block of hardware, for example a completion unit (not shown), that needs to read results from and write results to SPRs 24. Access to SPRs 24 is provided by SPR engine 22 and SPR bus 23 using a unique SPR bus protocol. Because SPR bus 23 must connect to every piece of hardware that contains an SPR, it is routed through many different areas of processor 21. As a result, instructions that access the SPRs must tolerate the large latency caused by the long, distributed organization of SPR bus 23. In addition, both SPR engine 22 and SPR bus 23 must be specified, designed, and verified during the design phase of processor 21. Because of their elaborate layout, SPR engine 22 and SPR bus 23 occupy a large amount of chip area. If SPR engine 22 and the elaborate structure of SPR bus 23 could be eliminated, the chip area of processor 21 could be reduced considerably, making processor 21 cheaper and more efficient to manufacture.

In general, during non-cache-inhibited operations, data is written through the data cache before it is passed to the system bus. During cache-inhibited operations, however, data is transferred only between the execution unit and the system bus, and no copy of the data is passed to the data cache. In accordance with an improved embodiment of the present invention, SPR operations can therefore be handled using a mechanism similar to the one used to handle cache-inhibited operations.
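The difference between the two kinds of store described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions: the function name `store` and the use of dictionaries for the data cache and system memory are hypothetical and do not come from the specification.

```python
from typing import Dict


def store(cache: Dict[int, int], memory: Dict[int, int],
          addr: int, value: int, cache_inhibited: bool) -> None:
    """Model one store as seen by the data cache and the system bus."""
    if not cache_inhibited:
        # Non-cache-inhibited store: a copy goes to the data cache first.
        cache[addr] = value
    # In both cases the data is passed on to memory over the system bus.
    memory[addr] = value


# Hypothetical addresses and values, chosen only for the example.
cache: Dict[int, int] = {}
memory: Dict[int, int] = {}
store(cache, memory, 0x100, 7, cache_inhibited=False)
store(cache, memory, 0x200, 9, cache_inhibited=True)
```

After these two calls only address 0x100 is present in the modeled cache, while both addresses are present in the modeled memory.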

Reference is now made to FIG. 3, which is a block diagram of the placement of special purpose registers within processor 10 of FIG. 1, in accordance with an improved embodiment of the present invention. As shown, several SPRs 34 are connected directly to a cache bypass bus 31. During a non-cache-inhibited operation, that is, when cache inhibit bit 32 is not asserted, a copy of the data is passed to data cache 13 before the data is passed to a system bus 38. During a cache-inhibited operation, that is, when cache inhibit bit 32 is asserted, data is transferred only between the execution unit (not shown) and system bus 38. In that case the data bypasses data cache 13 by way of cache bypass bus 31.

With the present invention, because SPRs 34 are now connected directly to cache bypass bus 31, SPR operations can be handled using the mechanism that handles cache-inhibited operations, namely cache-inhibited loads and stores. In a PowerPC embodiment, for example, the mtspr and mfspr instructions are executed by LSU 16, the same execution unit that handles all cache-inhibited load and store operations.

Incidentally, all the buses on the chip that are needed for cache-inhibited operations can also be used to perform SPR operations. In addition, two or more logic blocks, namely decoder logic and driver control, may be included to take advantage of the mechanism used to handle cache-inhibited operations. The decoder logic forces a cache-inhibited state for the entire SPR operation, because an SPR operation has no associated segment register entry (cache inhibit bit 32) of the kind normally provided to indicate the cache-inhibited state. The driver control, which can be implemented as a switch or an AND gate, prevents SPR operations from leaving processor 10 on system bus 38. The decoder logic can be implemented in the form of a special purpose register access (SPRA) bit 33. In the improved embodiment, the cache-inhibited state is forced for an SPR operation by asserting SPRA bit 33 before and throughout the SPR operation. The state of cache inhibit bit 32 is irrelevant while an SPR operation is in progress.
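The interaction of cache inhibit bit 32, SPRA bit 33, and the driver control can be sketched at the signal level as follows. This is a minimal sketch, not the circuit of the specification: the names `Controls` and `decode_controls`, and the assumption that every modeled operation requests the bus, are hypothetical.

```python
from typing import NamedTuple


class Controls(NamedTuple):
    bypass_cache: bool      # route the data onto the cache bypass bus
    drive_system_bus: bool  # let the driver put the data on the system bus


def decode_controls(cache_inhibit_bit: bool, spra_bit: bool) -> Controls:
    """Derive the two control signals from the two status bits."""
    # Decoder logic: an asserted SPRA bit forces the cache-inhibited state,
    # so the value of the cache inhibit bit does not matter for SPR operations.
    bypass_cache = cache_inhibit_bit or spra_bit
    # Driver control modeled as an AND gate: bus request AND NOT SPRA.
    bus_request = True  # assumption: every modeled operation requests the bus
    drive_system_bus = bus_request and not spra_bit
    return Controls(bypass_cache, drive_system_bus)
```

With SPRA asserted, the sketch returns the same result for either value of the cache inhibit bit, which mirrors the statement that bit 32 is irrelevant during an SPR operation.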

For a simpler design, SPRs 34 can be held in a single array or register file structure. The impact on loads and stores to memory is then minimal, taking the form of slightly reduced execution bandwidth and increased bus loading. This reduced bandwidth and increased loading should be offset by the advantages gained from a simpler and more convenient placement of SPRs 34.
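A single register file holding all SPRs can be sketched as a flat array indexed by SPR number. The class below is an illustrative assumption, not the structure in the specification: the name `SprFile`, the 1024-entry size, and the 32-bit width are chosen only for the example, and `LR` and `CTR` are used merely as sample indices.

```python
class SprFile:
    """All SPRs held in one flat array, indexed by SPR number."""

    def __init__(self, num_sprs: int = 1024, width_bits: int = 32) -> None:
        self._mask = (1 << width_bits) - 1
        self._regs = [0] * num_sprs

    def mtspr(self, spr: int, value: int) -> None:
        # Move to SPR: keep only the bits that fit the register width.
        self._regs[spr] = value & self._mask

    def mfspr(self, spr: int) -> int:
        # Move from SPR: return the current register contents.
        return self._regs[spr]


# Sample SPR numbers, used here only as illustrative indices.
LR, CTR = 8, 9
sprs = SprFile()
sprs.mtspr(CTR, 0x1_0000_0005)  # the upper bit is truncated to 32 bits
```

Because every register lives in the same array, one access path serves all SPRs, which is the simplification the paragraph above describes.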

To maintain high performance on the common bus within processor 10 despite the additional load of SPRs 34, some local buffering or latching should be provided to isolate SPRs 34. SPRs 34 should generally be placed close to the common bus, even if that is far from the execution units that use or update them. This is acceptable because software changes to SPRs 34 must in any case be carefully enforced with synchronizing instructions to ensure correct operation on an out-of-order processor such as processor 10. The extra cycles needed to transfer changed state to the execution unit that uses the value can therefore easily be amortized in the rare cases in which the value changes. Hardware updates can likewise be pipelined to accommodate the transfer time.

Reference is now made to FIG. 4, which is a high-level logic flow diagram of a method for remapping SPR accesses within a processor, in accordance with an improved embodiment of the present invention. Starting at block 40, a determination is made as to whether the operation is a cache-inhibited operation, as shown in block 41, and a further determination is made as to whether the operation is an SPR operation, as shown in block 42. If the operation is either a cache-inhibited operation or an SPR operation, the data is passed to the cache bypass bus so as to avoid the data cache, as shown in block 45. If the operation is neither a cache-inhibited operation nor an SPR operation, the data is passed to the data cache, as shown in block 43, and then to the system bus, as shown in block 44.

After the data has been passed to the cache bypass bus, a determination is made as to whether the special purpose register access bit (for example, SPRA bit 33 in FIG. 3) is asserted, as shown in block 46. If the special purpose register access bit is not asserted, the data is passed to the system bus, as shown in block 44. Otherwise, the data is passed to the appropriate SPR, as shown in block 47.
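The flow of FIG. 4 can be sketched as a short routing function. This is a minimal sketch under stated assumptions: the function name `route_operation` and the string labels are hypothetical, and the SPRA bit is assumed to be asserted exactly when the operation is an SPR operation.

```python
from typing import List


def route_operation(cache_inhibited: bool, spr_operation: bool) -> List[str]:
    """Return the destinations the data visits, following blocks 41-47."""
    # Blocks 41 and 42: neither cache-inhibited nor an SPR operation.
    if not (cache_inhibited or spr_operation):
        return ["data_cache", "system_bus"]  # blocks 43 and 44
    path = ["cache_bypass_bus"]  # block 45
    # Block 46: assumption, the SPRA bit is asserted only for SPR operations.
    spra_bit = spr_operation
    # Block 47 (SPR) if asserted, otherwise block 44 (system bus).
    path.append("spr" if spra_bit else "system_bus")
    return path
```

For example, `route_operation(True, False)` yields the bypass bus followed by the system bus, while `route_operation(False, True)` yields the bypass bus followed by the SPR.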

As described above, the present invention provides a method and apparatus for remapping SPR accesses within a processor. With the improved SPR placement and SPR operation, the SPR controller and SPR bus of prior art configurations, together with their special specifications and characteristics, can be eliminated.

Although the present invention has been shown and described with reference to a particular improved embodiment, those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention.

As described above, according to the present invention, the improved SPR placement and SPR operation eliminate the SPR controller and SPR bus of prior art configurations, together with their special specifications and characteristics, so that chip area can be reduced.

Claims

1. An apparatus within a processor, comprising:

a cache inhibit bit and a special purpose register access bit;

a plurality of special purpose registers coupled to a data cache via a cache bypass bus, wherein the cache bypass bus is usable for cache-inhibited operations when the cache inhibit bit is asserted; and

means for allowing data from a special purpose register operation to be transferred to the plurality of special purpose registers via the cache bypass bus when the special purpose register access bit is asserted.

2. The apparatus of claim 1, wherein the special purpose register operation comprises a move-to-special-purpose-register instruction.

3. The apparatus of claim 1, wherein the special purpose register operation comprises a move-from-special-purpose-register instruction.

4. A processor, comprising:

a load/store unit for executing memory access instructions;

a bus coupled to the load/store unit via a data path, for reading from and writing to system memory by the load/store unit in response to the load/store unit executing a memory access instruction; and

a special purpose register coupled to the data path, for being read and written by the load/store unit in response to the load/store unit executing a special purpose register instruction.

5. The processor of claim 4, further comprising a cache memory coupled to the bus and cache control logic for controlling access to the cache memory,

wherein the cache control logic enables some memory accesses by the load/store unit to update the cache memory and inhibits other memory accesses by the load/store unit from updating the cache memory, and

wherein, for the load/store unit executing a special purpose register instruction, the control logic inhibits the load/store unit from updating the cache memory.

6. The processor of claim 5, further comprising a driver coupled to the bus for writing to the system memory by the load/store unit,

wherein, for the load/store unit executing a special purpose register instruction, the control logic inhibits the driver to prevent the load/store unit from writing to the system memory.

7. An information processing system, comprising:

a system memory coupled to a system bus; and

a processor coupled to the system bus, wherein the processor includes:

a load/store unit for executing memory access instructions;

a bus coupled to the load/store unit via a data path, for reading from and writing to the system memory by the load/store unit in response to the load/store unit executing a memory access instruction; and

a special purpose register coupled to the load/store unit via the data path, for being read and written by the load/store unit in response to the load/store unit executing a special purpose register instruction.

8. The information processing system of claim 7, wherein the processor further comprises a cache memory coupled to the bus and cache control logic for controlling access to the cache memory,

wherein, for the load/store unit executing a special purpose register instruction, the control logic inhibits the load/store unit from updating the cache memory.

9. The information processing system of claim 7, wherein the processor further comprises a driver coupled to the bus for writing to the system memory by the load/store unit,

wherein, for the load/store unit executing a special purpose register instruction, the control logic inhibits the driver to prevent the load/store unit from writing to the system memory.

10. A method of accessing a special purpose register within a processor, comprising:

executing memory access instructions in a load/store unit;

reading from and writing to a system memory by the load/store unit, via a bus coupled to the load/store unit, in response to the load/store unit executing one of the memory access instructions; and

reading from and writing to a special purpose register by the load/store unit, via the bus coupled to the load/store unit, in response to the load/store unit executing one of the memory access instructions that takes the form of a special purpose register instruction.

11. The method of claim 10, further comprising writing to and reading from a cache memory by the load/store unit, via the bus coupled to the load/store unit, in response to the load/store unit executing one of the memory access instructions,

wherein, for the load/store unit executing a special purpose register instruction, control logic inhibits the load/store unit from updating the cache memory.

12. The method of claim 10, further comprising writing to the system memory by the load/store unit,

wherein, for the load/store unit executing a special purpose register instruction, control logic inhibits a driver to prevent the load/store unit from writing to the system memory.