JP2004094973A

JP2004094973A - Processor

Info

Publication number: JP2004094973A
Application number: JP2003374914A
Authority: JP
Inventors: Masato Suzuki; 鈴木　正人
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2003-11-04
Filing date: 2003-11-04
Publication date: 2004-03-25

Abstract

PROBLEM TO BE SOLVED: To provide a processor in which, when an instruction comprising a branch and at least one non-branch processing unit is executed, no pipeline interlock occurs even if the branch destination of the instruction is misaligned, so that the execution time of the instruction is not extended and processing performance is not degraded.

SOLUTION: An instruction reading part 60 reads an instruction based on a read address. When the read instruction comprises a branch and one or more non-branch processing units, an instruction decoding part 30 issues the command for the branch processing unit before those for any of the non-branch processing units. This prevents the pipeline interlock caused by a misaligned branch destination, so the execution time of the instruction is not extended and processing performance is not degraded.

COPYRIGHT: (C)2004,JPO

Description

The present invention relates to a processor that is used as the CPU of an information processing apparatus and employs pipeline control with at least three stages: an instruction read stage, an instruction decoding stage, and an instruction execution stage.

With the recent development of electronic technology, information processing devices such as microprocessors have become widespread and are used in every field.
Conventional processors fall broadly into two classes: CISC (Complex Instruction Set Computer) processors, characterized by a rich variety of instructions, and RISC (Reduced Instruction Set Computer) processors, characterized by a restricted instruction set and high speed. For example, TRON and the MC68040 belong to the former, and SPARC and MIPS to the latter.

In both types of processor, a pipeline structure is used to shorten the apparent execution time of instructions. Pipelining is a processing method in which the processing of an instruction is divided into at least read, decode, and execute stages, and different stages of multiple instructions are executed in parallel. Unlike RISC processors, many CISC processors use a variable-length instruction format. In a variable-length format the instruction length depends on the type of instruction, and programs can generally be made smaller than with a fixed-length format.

On the other hand, because instruction placement is no longer fixed relative to the word (16-bit) or double-word (32-bit) boundaries used for instruction reads, many instructions end up straddling a boundary. As a result, even with instruction prefetching, the instruction to be decoded may not yet have been read through to its last byte, in which case the parallel processing of the pipeline must be suspended temporarily.
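Whether an instruction straddles a read boundary depends only on its start address and its length. The following is a minimal Python sketch, assuming byte addressing and a read unit of 2 bytes (one 16-bit word) by default; the function name `straddles_boundary` and its parameters are illustrative and do not come from the source.

```python
def straddles_boundary(address: int, length: int, unit: int = 2) -> bool:
    """Return True if the first and last bytes fall in different read units."""
    if length < 1 or unit < 1:
        raise ValueError("length and unit must be positive")
    first_unit = address // unit
    last_unit = (address + length - 1) // unit
    return first_unit != last_unit
```

Note that an instruction longer than the read unit always spans more than one unit, so the check is most informative for instructions no longer than the unit.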

This temporary suspension of the pipeline is called a pipeline interlock, and the condition that causes it, in which an instruction is placed across a boundary, is called instruction misalignment.
One conventional processor is disclosed in Japanese Patent Application Laid-Open No. 5-197546 (title of the invention: "Microcomputer and division circuit"). As described in paragraph 88 on page 27 of that publication, for the JSR (Jump SubRoutine: subroutine call / procedure call) instruction this processor performs the branch after saving the contents of the program counter PC.

This conventional processor has a three-stage pipeline consisting of an instruction read stage, a decoding/address calculation stage, and an operation execution stage. It has a 4-byte instruction buffer that stores the instructions read in the instruction read stage, and a control storage that holds the microinstructions implementing the processing units into which the operation of each instruction is subdivided.
In the instruction read stage, two bytes are read in one machine cycle if the read address is even, and one byte if it is odd; the bytes read are stored in the instruction buffer.
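The read rule of the instruction read stage can be stated compactly in code. The following Python sketch is only an illustration; the function names `fetch_width` and `next_fetch_address` are assumptions introduced here and do not appear in the source.

```python
def fetch_width(address: int) -> int:
    """Bytes read in one machine cycle: 2 at an even address, 1 at an odd one."""
    return 2 if address % 2 == 0 else 1


def next_fetch_address(address: int) -> int:
    """Read address for the following cycle, advanced by the bytes just read."""
    return address + fetch_width(address)


if __name__ == "__main__":
    for addr in (100, 201):
        print(addr, fetch_width(addr), next_fetch_address(addr))
```

Under this rule a read from an odd address delivers a single byte and leaves the following read aligned to the next even address.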

In the decoding/address calculation stage, the microinstruction corresponding to the instruction at the bottom of the instruction buffer (the first instruction) is read from the control storage, and the control signals specified by that microinstruction are output. If the instruction being decoded consists of multiple microinstructions, the control signals for one microinstruction are output per machine cycle.
In the operation execution stage, the operation of one processing unit output from the control storage is executed in one machine cycle.

FIG. 5 shows the operation timing of the conventional processor. For each timing interval, called a machine cycle, it shows the instruction processed in each pipeline stage, the contents of the instruction buffer, and the output of the control storage for each processing unit. The program used as the example in the figure is as follows.
Instruction 1: address 100 ADD D0,D1
(A 1-byte instruction that adds the value of the D0 register to the value of the D1 register and stores the result in D1; it consists of a microinstruction of one processing unit.)
Instruction 2: address 101 JSR @(disp16,PC)
(A 3-byte instruction that branches to the subroutine at the address obtained by adding a 16-bit displacement to the program counter value; it consists of microinstructions of three processing units. The branch destination is address 201.)
Instruction 3: address 201 MOV @(disp8,A0),D0
(A 2-byte instruction that loads into D0 the data at the address obtained by adding an 8-bit displacement to the value of the A0 register; it consists of a microinstruction of one processing unit.)
In this example program, the 3-byte JSR instruction (instruction 2) and the 2-byte MOV instruction (instruction 3) both start at odd addresses and are therefore misaligned.

FIG. 9 is an explanatory diagram showing the operations performed by the three processing units of instruction 2 (JSR @(disp16,PC)). As shown there, the JSR instruction consists of three processing units: stack pointer subtraction, return address store, and branch execution. SP denotes the stack pointer, PC the program counter, and disp16 a 16-bit address displacement. In FIG. 5 the operations of these processing units are executed at timings t4, t5, and t6, respectively.
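The three processing units of the conventional JSR can be sketched as three sequential state updates. This Python sketch is illustrative only and rests on assumptions not given in the text: a 4-byte stack slot, a return address equal to the address immediately after the JSR, and a branch destination passed in directly rather than computed from disp16. The names `CpuState` and `jsr_conventional` are hypothetical.

```python
from dataclasses import dataclass, field

RETURN_ADDRESS_SIZE = 4  # assumption: the stack slot size is not given in the text


@dataclass
class CpuState:
    pc: int
    sp: int
    memory: dict = field(default_factory=dict)


def jsr_conventional(state: CpuState, instruction_length: int, target: int) -> list:
    """Run the three processing units in the conventional order; return their names."""
    return_address = state.pc + instruction_length  # assumed: address after the JSR
    executed = []

    state.sp -= RETURN_ADDRESS_SIZE          # 1st unit: stack pointer subtraction
    executed.append("sp_subtract")

    state.memory[state.sp] = return_address  # 2nd unit: return address store
    executed.append("store_return")

    state.pc = target                        # 3rd unit: branch execution
    executed.append("branch")
    return executed
```

In this ordering the program counter is rewritten only in the last of the three machine cycles, which is the point the later discussion turns on.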

The operation of the conventional processor is described below with reference to FIG. 5.
(Timing t1)
In the instruction read stage, 2 bytes of instruction code are read from address 100.
(Timing t2)
The instruction codes at addresses 100 and 101 read at timing t1 are stored in the instruction buffer, and instruction 1 at address 100 is taken from the bottom of the instruction buffer and decoded in the decoding/address calculation stage. Because instruction 1 is 1 byte long, the whole instruction is present in the instruction buffer. The instruction read stage calculates a read address incremented by 2 and reads 2 bytes of instruction code from address 102.

(Timing t3)
The decoding/address calculation stage outputs the addition operation for instruction 1, and the operation execution stage executes it. Instruction 1 has only one processing unit, so this completes its execution. The instruction codes at addresses 102 and 103 read at the previous timing are stored in the instruction buffer, and instruction 2 at address 101 is taken from the bottom of the instruction buffer; the decoding/address calculation stage decodes it and calculates the branch destination address. Instruction 2 is 3 bytes long, but the whole instruction is present in the instruction buffer. The instruction read stage performs no read because the instruction buffer does not have 2 or more free bytes.

(Timing t4)
The decoding/address calculation stage outputs the stack pointer decrement, the first processing unit of instruction 2, and the operation execution stage executes it. Instruction 2 consists of three processing units. The instruction read stage calculates a read address incremented by 2 and reads 2 bytes of instruction code from address 104.

(Timing t5)
The decoding/address calculation stage outputs the store of the return address to the stack, the second processing unit of instruction 2, and the operation execution stage executes it. The instruction codes at addresses 104 and 105 read at the previous timing are stored in the instruction buffer. The instruction read stage calculates a read address incremented by 2 and reads 2 bytes of instruction code from address 106.

(Timing t6)
The decoding/address calculation stage outputs the branch, the third processing unit of instruction 2, and the operation execution stage executes it. This branch flushes all instructions in the instruction buffer, and the instruction read stage receives the branch destination address calculated earlier in the decoding/address calculation stage and reads 1 byte of instruction code from address 201. Only 1 byte is read because the received address is odd. This completes the execution of instruction 2.

(Timing t7)
The instruction code at address 201 read at the previous timing is stored in the instruction buffer, and the decoding/address calculation stage attempts to take instruction 3 at address 201 from the bottom of the instruction buffer and decode it. However, instruction 3 is 2 bytes long and its last byte is not yet in the instruction buffer, so it cannot be decoded. Operation of the decoding/address calculation stage is therefore suspended (pipeline interlock). Because the previous read address was odd, the instruction read stage calculates a read address incremented by 1 and reads 2 bytes of instruction code from address 202.

(Timing t8)
Because the decoding/address calculation stage was suspended at the previous timing, the operation execution stage is suspended at this timing (pipeline interlock). The instruction codes at addresses 202 and 203 read at the previous timing are stored in the instruction buffer, and instruction 3 at address 201 is taken from the bottom of the instruction buffer; the decoding/address calculation stage decodes it and calculates the load address. Because instruction 3 is 2 bytes long, this is the first timing at which the whole instruction is present in the instruction buffer. The instruction read stage performs no read because the instruction buffer does not have 2 or more free bytes.

(Timing t9)
The decoding/address calculation stage outputs the load operation for instruction 3, and the operation execution stage executes it. Instruction 3 has only one processing unit, so this completes its execution. The instruction read stage calculates a read address incremented by 2 and reads 2 bytes of instruction code from address 204.

However, in the conventional processor described above, when a branch instruction consisting of multiple processing units (microinstructions) is misaligned in a variable-length instruction format, the processing unit that performs the branch is executed in the last machine cycle. A pipeline interlock therefore occurs, the overall instruction execution time is extended, and processing performance is degraded.
More specifically, referring to FIG. 5, at timing t7 the processor tries to take instruction 3 from the bottom of the instruction buffer and decode it, but cannot, because the last byte of the 2-byte instruction 3 is not in the instruction buffer. This results from the 2-byte instruction 3 being placed at an odd address and from the read of the branch-destination instruction being performed only at the immediately preceding timing t6. In addition, at timings t4 and t5 the processor prefetches instructions that will clearly be flushed without ever being executed, and these reads waste power.

In view of these problems, an object of the present invention is to provide a processor in which, when an instruction consisting of a branch and at least one non-branch processing unit is executed, no pipeline interlock due to a missing instruction occurs even if the branch destination of the instruction is misaligned, so that the execution time of the instruction is not extended and processing performance is not degraded.

To solve the above problem, the processor according to claim 1 of the present invention is a processor that prefetches instructions, comprising:
fetch address holding means for holding the address of the instruction to be prefetched; and
control means which, in control processing for executing a branch instruction that includes a branch processing unit, whose content is to rewrite the address in the fetch address holding means, and other processing units, first controls execution of the branch processing unit and then controls execution of the other processing units, wherein the branch processing unit and the other processing units are executed in the same pipeline stage.
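The ordering rule of claim 1 can be illustrated as follows. This Python sketch only models the issue order of processing units; in the processor described here the order would be fixed by the control means rather than sorted at run time, and the names `ProcessingUnit` and `issue_order` are hypothetical.

```python
from typing import List, NamedTuple


class ProcessingUnit(NamedTuple):
    name: str
    rewrites_fetch_address: bool  # True only for the branch processing unit


def issue_order(units: List[ProcessingUnit]) -> List[ProcessingUnit]:
    """Place the branch processing unit first, keeping the others in their given order."""
    branch_units = [u for u in units if u.rewrites_fetch_address]
    other_units = [u for u in units if not u.rewrites_fetch_address]
    return branch_units + other_units
```

Applied to the conventional JSR sequence (stack pointer subtraction, return address store, branch), this yields branch, stack pointer subtraction, return address store.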

The processor according to claim 2 is the processor according to claim 1, wherein
the control processing includes execution control of a subroutine call instruction, and
the control means is configured so that, in controlling the branch processing unit, it writes the branch destination address obtained from the operand of the subroutine call instruction into the fetch address holding means, and, in the other processing units, it controls calculation of the address of the area in which the return address is to be saved and the saving of the return address.

The processor according to claim 3 is the processor according to claim 2, wherein
the control processing includes control of the transition to an interrupt processing routine, and
the control means is configured so that, in controlling the branch processing unit, it writes the start address of a predetermined interrupt processing routine into the fetch address holding means, and, in the other processing units, it controls calculation of the address of the area in which the return address is to be saved and the saving of the return address.

The processor according to claim 4 is the processor according to any one of claims 1 to 3, wherein the control processing includes a return instruction that instructs a return from a subroutine and a return instruction that instructs a return from an interrupt processing routine, and
the control means is configured to write the return address into the fetch address holding means in controlling the branch processing unit.

The processor according to claim 5 is the processor according to claim 3, wherein the control means comprises:
a control storage unit that stores a plurality of microinstructions implementing the operation of the control processing; and
a microinstruction issuing unit that sequentially reads the microinstruction corresponding to each processing unit from the control storage and issues the control signals specified by the microinstruction to the inside of the processor,
wherein the microinstruction issuing unit
issues, for a subroutine call instruction, a microinstruction that instructs storing the address specified in the instruction in the fetch address holding means, a microinstruction that instructs calculating the address of the area in which the return address is to be saved, and a microinstruction that instructs saving the return address, in this order, and
issues, for control of the transition to an interrupt processing routine, a microinstruction that instructs storing a predetermined interrupt processing start address in the fetch address holding means, a microinstruction that instructs calculating the address of the area in which the return address is to be saved, and a microinstruction that instructs saving the return address, in this order.

The processor according to claim 6 is the processor according to claim 3, wherein the control means comprises:
an address calculation unit that predecodes a subroutine call instruction and calculates the branch destination address of the instruction;
a control storage unit that stores a plurality of microinstructions implementing the control processing; and
a microinstruction issuing unit that sequentially reads the microinstruction corresponding to each processing unit from the control storage and issues the microinstruction to the inside of the processor,
wherein the microinstruction issuing unit
issues, for a subroutine call instruction, a microinstruction that instructs storing the address obtained by the address calculation unit in the fetch address holding means and holding the return address in a store buffer, which holds data to be stored in memory; a microinstruction that instructs updating the stack pointer, which points to the save area, and storing the updated contents of the stack pointer in an operand address buffer, which holds an operand address; and a microinstruction that instructs storing the contents of the store buffer in the save area pointed to by the operand address buffer, in this order, and
issues, for control of the transition to an interrupt processing routine, a microinstruction that instructs storing the start address of the predetermined interrupt processing in the fetch address holding means and holding the return address in the store buffer; a microinstruction that instructs updating the stack pointer and storing the updated contents of the stack pointer in the operand address buffer; and a microinstruction that instructs storing the contents of the store buffer in the save area pointed to by the operand address buffer, in this order.
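The three microinstructions that claim 6 recites for a subroutine call can be sketched as state updates on a small machine model. This Python sketch is an illustration under assumptions not stated in the claim: the stack pointer "update" is taken to be a decrement by a 4-byte slot, and the return address is supplied by the caller. The names `Machine` and `call_subroutine` are hypothetical.

```python
from dataclasses import dataclass, field

STACK_SLOT_SIZE = 4  # assumption: the claim does not give the slot size


@dataclass
class Machine:
    fetch_address: int
    stack_pointer: int
    store_buffer: int = 0
    operand_address_buffer: int = 0
    memory: dict = field(default_factory=dict)


def call_subroutine(m: Machine, target: int, return_address: int) -> None:
    """Apply the three microinstructions in the order recited in claim 6."""
    # 1st microinstruction: redirect the fetch address and hold the return address.
    m.fetch_address = target
    m.store_buffer = return_address

    # 2nd microinstruction: update the stack pointer (assumed to be a decrement)
    # and copy it into the operand address buffer.
    m.stack_pointer -= STACK_SLOT_SIZE
    m.operand_address_buffer = m.stack_pointer

    # 3rd microinstruction: write the store buffer to the save area.
    m.memory[m.operand_address_buffer] = m.store_buffer
```

One reading of this sequence is that capturing the return address in the store buffer in the first microinstruction is what allows the fetch address to be rewritten before the return address has been written to memory.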

According to the processor of claim 1, at least two instruction reads have been performed by the time the branch-destination instruction becomes decodable, so no pipeline interlock due to a missing instruction occurs even if the branch destination is misaligned. Consequently the execution time of the instruction is not extended and processing performance is not degraded.
According to the processor of claim 2, the same effect as that of claim 1 is obtained, in particular for subroutine call instructions.
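The effect stated for claim 1 can be checked against a small timing model. The Python sketch below is an illustration under simplifying assumptions that are not taken verbatim from the source: the read of the branch destination is issued in the cycle in which the branch processing unit executes, bytes read in one cycle are in the instruction buffer in the next, a read is skipped when its bytes would not fit in the 4-byte buffer, and the destination instruction cannot be decoded before the cycle after the last processing unit of the branch instruction executes. All names are hypothetical.

```python
BUFFER_SIZE = 4  # bytes, as in the conventional processor described earlier


def fetch_width(address: int) -> int:
    """2 bytes per cycle at an even address, 1 byte at an odd address."""
    return 2 if address % 2 == 0 else 1


def first_decode_cycle(branch_cycle: int, last_unit_cycle: int,
                       target: int, length: int, max_cycles: int = 64) -> int:
    """Earliest cycle in which the branch-destination instruction can be decoded."""
    if not 1 <= length <= BUFFER_SIZE:
        raise ValueError("instruction must fit in the buffer")
    buffered = 0          # destination bytes currently held in the buffer
    address = target      # next address to read
    cycle = branch_cycle  # the destination read starts in this cycle
    for _ in range(max_cycles):
        width = fetch_width(address)
        if buffered + width <= BUFFER_SIZE:  # read only if the bytes fit
            buffered += width
            address += width
        cycle += 1  # bytes read above are visible from this cycle on
        if buffered >= length and cycle > last_unit_cycle:
            return cycle
    raise RuntimeError("destination instruction never became decodable")
```

For the example program (a 2-byte instruction at address 201, with the JSR processing units executing at t4 to t6), `first_decode_cycle(6, 6, 201, 2)` models the conventional branch-last order and `first_decode_cycle(4, 6, 201, 2)` models the branch-first order; under this model the second call returns a cycle one earlier than the first.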

According to the processor of claim 3, the same effect as that of claim 2 is obtained, in particular for control of the transition to an interrupt processing routine as well.
According to the processor of claim 4, in addition to the effects of any of claims 1 to 3, the same effect is obtained for the return instruction that instructs a return from a subroutine and the return instruction that instructs a return from an interrupt processing routine.

According to the processor of claim 5, in addition to the effect of claim 3, control by a microprogram ensures that, in particular for subroutine call instructions and control of the transition to an interrupt processing routine, the execution time of the instruction is not extended and processing performance is not degraded.
According to the processor of claim 6, in addition to the effect of claim 3, controlling the address calculation independently of the microprogram control ensures that, in particular for subroutine call instructions and control of the transition to an interrupt processing routine, the execution time of the instruction is not extended and processing performance is not degraded.

The processor according to claim 1 is a processor that prefetches instructions, comprising:
fetch address holding means for holding the address of the instruction to be prefetched; and
control means which, in control processing for executing a branch instruction that includes a branch processing unit, whose content is to rewrite the address in the fetch address holding means, and other processing units, first controls execution of the branch processing unit and then controls execution of the other processing units. As a result, no pipeline interlock due to a missing instruction occurs even if the branch destination is misaligned.

In the processor according to claim 2, the control processing of the processor according to claim 1 includes execution control of a subroutine call instruction, and
the control means, in controlling the branch processing unit, writes the branch destination address obtained from the operand of the subroutine call instruction into the fetch address holding means, and, in the other processing units, controls calculation of the address of the area in which the return address is to be saved and the saving of the return address.

In the processor according to claim 3, the control processing of the processor according to claim 2 includes control of the transition to an interrupt processing routine, and
the control means, in controlling the branch processing unit, writes the start address of a predetermined interrupt processing routine into the fetch address holding means, and, in the other processing units, controls calculation of the address of the area in which the return address is to be saved and the saving of the return address.

In the processor according to claim 4, the control processing of the processor according to any one of claims 1 to 3 includes a return instruction that instructs a return from a subroutine and a return instruction that instructs a return from an interrupt processing routine, and
the control means writes the return address into the fetch address holding means in controlling the branch processing unit.

In the processor according to claim 5, the control means of the processor according to claim 3 comprises:
a control storage unit that stores a plurality of microinstructions implementing the operation of the control processing; and
a microinstruction issuing unit that sequentially reads the microinstruction corresponding to each processing unit from the control storage and issues the control signals specified by the microinstruction to the inside of the processor,
wherein the microinstruction issuing unit
issues, for a subroutine call instruction, a microinstruction that instructs storing the address specified in the instruction in the fetch address holding means, a microinstruction that instructs calculating the address of the area in which the return address is to be saved, and a microinstruction that instructs saving the return address, in this order, and
issues, for control of the transition to an interrupt processing routine, a microinstruction that instructs storing a predetermined interrupt processing start address in the fetch address holding means, a microinstruction that instructs calculating the address of the area in which the return address is to be saved, and a microinstruction that instructs saving the return address, in this order.

In the processor according to claim 6, the control means of the processor according to claim 3 comprises:
an address calculation unit that predecodes a subroutine call instruction and calculates the branch destination address of that instruction;
a control storage unit that stores a plurality of microinstructions realizing the control processing; and
a microinstruction issuing unit that sequentially reads the microinstructions corresponding to each processing unit from the control storage and issues them within the processor,
wherein the microinstruction issuing unit
issues, for a subroutine call instruction, in this order: a microinstruction directing that the address obtained by the address calculation unit be stored in the fetch address holding means and that the return address be held in a store buffer, which holds data to be stored in memory; a microinstruction directing that the stack pointer pointing to the save area be updated and that the updated stack pointer value be stored in an operand address buffer, which holds an operand address; and a microinstruction directing that the contents of the store buffer be stored in the save area pointed to by the operand address buffer; and
issues, for control of transition to an interrupt processing routine, in this order: a microinstruction directing that the start address of a predetermined interrupt process be stored in the fetch address holding means and that the return address be held in the store buffer; a microinstruction directing that the stack pointer be updated and that the updated stack pointer value be stored in the operand address buffer; and a microinstruction directing that the contents of the store buffer be stored in the save area pointed to by the operand address buffer.

FIG. 1 is a block diagram showing the schematic configuration of a processor according to an embodiment of the present invention. As shown in the figure, the processor 10 includes a decoding unit 30 having an instruction buffer 32 and a control storage 33, a register unit 40, an operation execution unit 50, an instruction reading unit 60, and a bus interface unit 70. The processing in the instruction reading unit 60, the decoding unit 30, and the operation execution unit 50 is carried out concurrently as a pipeline.

The input/output bus 711 exchanges data between the processor 10 and the outside world, and is connected to, among other things, an external memory (not shown) that stores programs and data.
The bus interface unit 70 controls the input/output bus 711.
The instruction reading unit 60 reads instructions from the external memory via the bus interface unit 70. It receives an address from the decoding unit 30 or the operation execution unit 50 only when the read address becomes discontinuous, for example because a branch instruction is executed; when read addresses are consecutive, it computes the read address with its internal increment circuit and reads the instruction. In one machine cycle it reads 2 bytes if the read address is even and 1 byte if the read address is odd, and stores what it reads in the 4-byte instruction buffer 32.
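The even/odd read rule described above can be sketched in Python as follows. This is a minimal illustration, not the circuit itself: the function names `fetch_width` and `fetch_trace` are hypothetical, and the sketch assumes consecutive reads with no stall for a full instruction buffer.

```python
def fetch_width(address: int) -> int:
    """Bytes read in one machine cycle: 2 at an even address, 1 at an odd one."""
    return 2 if address % 2 == 0 else 1


def fetch_trace(start: int, cycles: int) -> list[tuple[int, int]]:
    """Return (read address, bytes read) for each of `cycles` consecutive reads.

    Assumption: every cycle performs a read; buffer-full stalls are ignored.
    """
    trace = []
    address = start
    for _ in range(cycles):
        width = fetch_width(address)
        trace.append((address, width))
        address += width  # the increment circuit adds +1 or +2
    return trace
```

Under these assumptions, a read sequence starting at an odd address such as 201 delivers only one byte in its first cycle, which is the misaligned case the embodiment is concerned with.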

The decoding unit 30 decodes the instruction received from the instruction reading unit 60 and controls its execution. When the instruction involves a memory access, it has the operation execution unit 50 calculate the operand address; when the instruction involves a branch, it has the operation execution unit 50 calculate the branch destination address. The decoding unit 30 contains the control storage 33. When the decoded instruction consists of one processing unit, it issues a single processing directive; when the instruction consists of several processing units, it issues the directives in sequence, one processing unit per machine cycle.

The register block 40 has a plurality of registers that are specified by instruction operands.
The operation execution unit 50 contains an arithmetic unit that performs arithmetic and logic operations, and in one machine cycle it executes the directive for one processing unit output from the control storage 33.
FIG. 2(a) is an explanatory diagram showing the configuration of the control storage 33. The control storage 33 stores the processing units (microinstructions) corresponding to all machine language instructions; FIG. 2(a) shows part of them.

Reference numeral 211 denotes a control storage area that stores the microinstruction referenced and issued when instruction 1 (an addition instruction) is decoded; it represents the add directive. For instruction 1, only the contents of control storage area 211 are issued, in one machine cycle.
Reference numerals 221 to 223 denote control storage areas that store the microinstructions referenced and issued when instruction 2 (a JSR instruction) is decoded; they represent the processing units that realize, respectively, the branch, the stack pointer decrement, and the store of the return address to the stack. For instruction 2, the contents of control storage areas 221 to 223 are issued in sequence over three machine cycles.
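As a rough illustration of how areas 211 and 221 to 223 map instructions to processing units, the following Python sketch models that part of the control storage as a table. The dictionary layout, the mnemonic keys, and the function names are assumptions made for illustration; only the area numbers, the operations, and the one-unit-per-cycle issue order come from the description.

```python
# Hypothetical model of part of control storage 33 (FIG. 2(a)).
# Each entry lists (control storage area, operation) in issue order.
CONTROL_STORE = {
    "ADD": [(211, "add")],
    "JSR": [
        (221, "branch"),
        (222, "decrement stack pointer"),
        (223, "store return address on stack"),
    ],
}


def issue(mnemonic):
    """Yield (cycle, area, operation), one microinstruction per machine cycle."""
    for cycle, (area, operation) in enumerate(CONTROL_STORE[mnemonic], start=1):
        yield cycle, area, operation


def cycles_to_issue(mnemonic):
    """Number of machine cycles needed to issue all processing units."""
    return len(CONTROL_STORE[mnemonic])
```

In this sketch the branch entry sits first in the JSR list, which is the ordering the embodiment relies on.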

Reference numeral 231 denotes a control storage area that stores the processing referenced and issued when instruction 3 (a MOV instruction) is decoded; it represents the load directive. For instruction 3, only the contents of control storage area 231 are issued, in one machine cycle.
FIG. 3 is a block diagram showing a more detailed configuration of the processor shown in FIG. 1.
In FIG. 3, the processor 10 has an instruction decoding unit 30, a register unit 40, an operation execution unit 50, an instruction reading unit 60, and a bus interface unit 70. These are connected as illustrated by a bus (hereinafter ABUS) 20, a first data bus (B1BUS) 21, a second data bus (B2BUS) 22, an instruction address bus (NIADDR) 701, an instruction bus (IBUS) 702, an operand address bus (OADDR) 703, a store data bus (STBUS) 704, and a load data bus (LDBUS) 705.

The instruction decoding unit 30 stores and decodes prefetched instructions and controls the microcomputer as a whole. For this purpose it comprises an interrupt control unit 31, an instruction buffer (IB) 32, a control storage 33, a selector 34, an instruction register 35, a status register 36, a predecoder 37, and a microinstruction register 38.
The interrupt control unit 31 controls the operation sequence of the microcomputer when an interrupt is accepted.

The instruction buffer 32 holds instructions that the instruction reading unit 60 has read from memory ahead of their execution. In this embodiment it has the capacity to hold 4 bytes of instructions.
The selector 34 selects either the instruction input from the instruction buffer 32 or the instruction input from the instruction bus.
The instruction register 35 holds the instruction output from the selector 34.

The status register 36 holds the various status flags needed to decode instructions.
The control storage 33 decodes the instruction in the instruction register 35 with reference to the contents of the status register 36. In this embodiment, microprogrammed control logic is implemented using a programmable logic array (PLA), which outputs in order the microinstructions that realize the instruction in the instruction register 35.

The predecoder 37 receives the contents of the instruction register 35 and of the status register 36, and outputs control signals mainly for executing load instructions and conditional branch instructions that operate in one cycle. This operation takes place in the instruction decoding stage, ahead of the instruction execution stage. In particular, for branch instructions such as a subroutine call instruction or an interrupt processing request instruction, it performs control to store the branch destination address specified in the instruction into the PCB 64 and the IAB 72. When the branch destination is specified as a displacement, it has the operation execution unit 50 calculate the branch destination address and stores the result in the PCB 64 and the IAB 72.

The microinstruction register 38 holds the control directive that results from decoding by the control storage 33.
The register unit 40 holds data and addresses, and comprises a data register group 41, an address register group 42, and a selector 43. The data register group 41 has four 16-bit registers, DR3 to DR0, that mainly hold data. The address register group 42 has four 16-bit registers, AR3 to AR0, that mainly hold addresses. Of these, AR3 functions as the stack pointer.

The selector 43 selects between the ABUS 20 and the LDBUS 705 and outputs the selected value to the data register group 41 and the address register group 42.
The operation execution unit 50 calculates addresses and performs data operations, and comprises an arithmetic unit 51, a program status word 52, an operand address register 53, a selector 54, a selector 55, a temporary register 56, a selector 57, a selector 58, and a shifter 59.

The arithmetic unit 51 performs 16-bit data operations and address calculations.
The program status word 52 is a 16-bit register that holds operation result flags and the like.
The operand address register 53 stores the address used to access memory.
The selectors 54 and 55 select the operands to be input to the arithmetic unit 51.

The temporary register 56 temporarily holds the output of the arithmetic unit 51.
The selector 57 selects either the temporary register 56 or the operand address register 53 and outputs the selected value to the operand address buffer 74.
The selector 58 selects either the ABUS 20 or the shifter 59.
The shifter 59 receives the output of the selector 58 and performs shift operations together with the arithmetic unit 51.

The instruction reading unit 60 controls the position from which instructions are read, and comprises a first prefetch counter (PFC) 61, a second prefetch counter (PFCP) 62, a selector 63, a program counter buffer (PCB) 64, an incrementer (INC) 65, and a selector 66.
The PCB 64 holds the instruction read address output from the selector 66.

The INC 65 increments the contents of the PCB 64 by +1 or +2 and outputs the result, as the instruction prefetch address, to the PFC 61 and, via the selector 66, to the IAB 72.
The PFC 61 holds the address incremented by the INC 65.
The PFCP 62 holds the address that the PFC 61 held immediately before.
The selector 63 selects either the PFC 61 or the PFCP 62 and outputs the selected value to the ABUS 20 and the B1BUS 21.

The selector 66 selects the output of the PFC 61 when instructions at consecutive addresses are read; on a branch, it selects and outputs the address from the temporary register 56 or the operand address register 53.
The bus interface unit 70 controls the bus connections used when instructions and data are read from the external memory (not shown), and comprises an interface unit 71, an instruction address buffer 72, an instruction buffer 73, an operand address buffer 74, a store buffer 75, a load buffer 76, a bus switch 77, a RAM 78, and a ROM 79.

The interface unit 71 controls the connection between the bus of the CPU 6 and the outside.
The instruction address buffer (IAB) 72, instruction buffer 73, operand address buffer 74, store buffer 75, and load buffer 76 are buffers that hold, respectively, an instruction address, an instruction, an operand address, store data, and load data.
The bus switch 77 connects and disconnects the buses 706 to 709.

The RAM 78 and the ROM 79 store data and instructions, respectively. Because this embodiment assumes an external memory, the RAM 78 and the ROM 79 may be disregarded.
FIG. 2(b) is an explanatory diagram showing the detailed operations of the processing units by which the predecoder 37, the control storage 33, and the microinstruction register 38 realize the machine language instruction JSR @(disp16,PC). As also shown in FIG. 2(a), the JSR @(disp16,PC) instruction consists of three processing units: the branch, the stack pointer subtraction, and the return-address store.

The branch is realized by a control operation of the predecoder 37 (PFCP or PFC + disp16 + 0 or 1 → PCB, IAB) and a control operation of the control storage 33 and the microinstruction register 38 (PFCP or PFC + 0 or 1 → STB). Under the control of the predecoder 37, the branch destination address (PFCP or PFC + disp16 + 0 or 1) is calculated and the result is written to the PCB 64 and the IAB 72. The branch destination address is obtained by adding together either the PFC 61 or the PFCP 62, the 16-bit displacement, and 0 or 1. Whether the PFC 61 or the PFCP 62 is used, and whether 0 or 1 is added, is chosen according to the amount of data remaining in the instruction buffer 32. Once the result has been written to the PCB 64 and the IAB 72, instructions are fetched sequentially from the branch destination address. The latter control operation holds the return address in the store buffer 75; this preserves the return address in preparation for the return-address store. This control is realized when the control signals designated by the microinstruction in control storage area 221 are issued from the microinstruction register 38.

The stack pointer subtraction processing unit changes the stack pointer so that it points to an unused area of the stack (the save area) (AR3 - 4 → AR3, OAB). In this embodiment the address register AR3 is the stack pointer, so 4 is subtracted from AR3. At the same time, in preparation for the return-address store, the result of the subtraction is also stored in the OAB 74. This control is realized by the stack pointer subtraction microinstruction in control storage area 222.

The return-address store processing unit saves the return address in the unused stack area pointed to by the stack pointer. By this point the decremented stack pointer value is already held in the OAB 74 and the return address is already held in the STB 75, so control is performed to store the contents of the STB 75 in the memory area pointed to by the OAB 74. This control is realized by the return-address store microinstruction in control storage area 223.
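The three processing units of JSR @(disp16,PC) can be sketched as register transfers in Python. This is a simplified model under stated assumptions: the `State` class and function names are hypothetical, values are masked to 16 bits because the registers are described as 16 bits long, and memory is modeled as a dictionary keyed by address. The example values (base 104, displacement 97, initial stack pointer 0x1000) are illustrative and are not taken from the description.

```python
from dataclasses import dataclass, field

MASK16 = 0xFFFF  # assumption: 16-bit wraparound, matching the 16-bit registers


@dataclass
class State:
    ar3: int                 # stack pointer (address register AR3)
    pcb: int = 0             # program counter buffer 64
    iab: int = 0             # instruction address buffer 72
    stb: int = 0             # store buffer 75
    oab: int = 0             # operand address buffer 74
    memory: dict = field(default_factory=dict)


def branch(state: State, base: int, disp16: int, adjust: int) -> None:
    """Area 221: (PFCP or PFC) + disp16 + (0 or 1) -> PCB, IAB; return address -> STB."""
    target = (base + disp16 + adjust) & MASK16
    state.pcb = state.iab = target
    state.stb = (base + adjust) & MASK16


def decrement_stack_pointer(state: State) -> None:
    """Area 222: AR3 - 4 -> AR3, OAB."""
    state.ar3 = (state.ar3 - 4) & MASK16
    state.oab = state.ar3


def store_return_address(state: State) -> None:
    """Area 223: contents of STB -> memory area pointed to by OAB."""
    state.memory[state.oab] = state.stb


def run_jsr(state: State, base: int, disp16: int, adjust: int) -> State:
    """Issue the three processing units in the order used by the embodiment."""
    branch(state, base, disp16, adjust)
    decrement_stack_pointer(state)
    store_return_address(state)
    return state
```

For example, `run_jsr(State(ar3=0x1000), base=104, disp16=97, adjust=0)` leaves 201 in both PCB and IAB, and the return address 104 stored at address 0x0FFC. Because the branch step writes PCB and IAB first, fetching from the target can begin while the two stack steps are still being issued.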

The operation of the processor of this embodiment, configured as described above, is explained below. To make the difference from the prior art clear, the explanation uses the same example program that was used in the description of the prior art. The example program is as follows.
Instruction 1: address 100 ADD D0,D1
(A 1-byte instruction that adds the value of the D0 register to the value of the D1 register and stores the result in the D1 register; it consists of a microinstruction of one processing unit.)
Instruction 2: address 101 JSR @(disp16,PC)
(A 3-byte instruction that branches to the subroutine at the address obtained by adding a 16-bit displacement to the value of the program counter; it consists of microinstructions of three processing units. The branch destination is address 201.)
Instruction 3: address 201 MOV @(disp8,A0),D0
(A 2-byte instruction that loads into the D0 register the data at the address obtained by adding an 8-bit displacement to the value of the A0 register; it consists of a microinstruction of one processing unit.)
FIG. 7 is an operation timing chart of the processor in this embodiment. For each timing, called a machine cycle, it shows the instructions being processed by the instruction reading unit 60, the decoding unit 30, and the operation execution unit 50, together with the contents of the instruction buffer 32 and the output of the control storage 33. Each timing is described below in chronological order.

(Timing t1)
The instruction reading unit 60 reads 2 bytes of instruction code from address 100. This read address is assumed to have been received at this timing from the decoding unit 30 or the operation execution unit 50 because of a branch or for some other reason not shown. The instruction buffer 32 is also assumed to be empty.
(Timing t2)
The instruction codes at addresses 100 and 101, read at the previous timing, are stored in the instruction buffer 32, and instruction 1 at address 100 is taken from the bottom of the instruction buffer 32 and decoded by the decoding unit 30. Because instruction 1 is 1 byte long, the whole instruction is present in the instruction buffer 32. The instruction reading unit 60 calculates a read address incremented by 2 and reads 2 bytes of instruction code from address 102.

(Timing t3)
The add directive for instruction 1 is output from control storage area 211 of the decoding unit 30, and the operation execution unit 50 executes it. This completes the execution of instruction 1. The instruction codes at addresses 102 and 103, read at the previous timing, are stored in the instruction buffer 32, and instruction 2 at address 101 is taken from the bottom of the instruction buffer 32; the decoding unit 30 decodes it and the branch destination address is calculated. Instruction 2 is 3 bytes long, but the whole instruction is present in the instruction buffer 32. The instruction reading unit 60 does not read an instruction, because the instruction buffer 32 does not have 2 or more bytes free.

(Timing t4)
The branch directive, which is the first processing unit of instruction 2, is output from control storage area 221 of the decoding unit 30, and the operation execution unit 50 executes it as shown in FIG. 4. This branch directive flushes all instructions in the instruction buffer 32, and the instruction reading unit 60 receives the branch destination address that the decoding unit 30 calculated at the previous timing and reads 1 byte of instruction code from address 201. Because the received address is odd, only 1 byte is read.

(Timing t5)
The stack pointer decrement directive, which is the second processing unit of instruction 2, is output from control storage area 222 of the decoding unit 30, and the operation execution unit 50 executes it as shown in FIG. 5. The instruction code at address 201, read at the previous timing, is stored in the instruction buffer 32. Because the previous read address was odd, the instruction reading unit 60 calculates a read address incremented by 1 and reads 2 bytes of instruction code from address 202.

(Timing t6)
The directive to store the return address to the stack, which is the third processing unit of instruction 2, is output from control storage area 223 of the decoding unit 30, and the operation execution unit 50 executes it as shown in FIG. 6. The instruction codes at addresses 202 and 203, read at the previous timing, are stored in the instruction buffer 32. The instruction reading unit 60 does not read an instruction, because the instruction buffer 32 does not have 2 or more bytes free. This execution completes instruction 2.

(Timing t7)
Instruction 3 at address 201 is taken from the bottom of the instruction buffer 32; the decoding unit 30 decodes it and the load address is calculated. Instruction 3 is 2 bytes long, but at this timing the whole instruction is already present in the instruction buffer 32, so no pipeline interlock occurs. The instruction reading unit 60 does not read an instruction, because the instruction buffer 32 does not have 2 or more bytes free.

(Timing t8)
The load directive for instruction 3 is output from control storage area 231 of the decoding unit 30, and the operation execution unit 50 executes it. This completes the execution of instruction 3. The instruction reading unit 60 calculates a read address incremented by 2 and reads 2 bytes of instruction code from address 204.

As described above, according to this embodiment, when a branch instruction to a subroutine is executed, the control storage 33 of the decoding unit 30 outputs the branch directive first, ahead of the other directives. The instruction reading unit 60 can therefore read branch-target instruction code twice in parallel with the execution, in the operation execution unit 50, of the stack pointer decrement and the store of the return address to the stack. Even if the branch-target instruction is misaligned, 3 bytes of branch-target instruction code are held in the instruction buffer 32 by the time the branch-target instruction is decoded, so a pipeline interlock caused by the absence of an instruction in the instruction buffer 32 can be avoided.
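The effect can be illustrated with a small Python model of how many branch-target bytes reach the instruction buffer before the target instruction is decoded. This is a sketch under simplifying assumptions: the function names are hypothetical, the one-cycle delay between a read and its arrival in the buffer is ignored, no bytes are consumed during the counted cycles, and a read is skipped whenever fewer than 2 bytes are free, following the rule stated in the timing description. The single-read case is included only as an assumed contrast, not as a reproduction of the prior-art timing.

```python
def bytes_buffered(target: int, read_cycles: int, capacity: int = 4) -> int:
    """Bytes of branch-target code buffered after `read_cycles` read opportunities."""
    buffered = 0
    address = target
    for _ in range(read_cycles):
        if capacity - buffered < 2:
            continue  # no read when fewer than 2 bytes are free
        width = 2 if address % 2 == 0 else 1  # 1 byte at an odd address
        buffered += width
        address += width
    return buffered


def interlocks(target: int, length: int, read_cycles: int) -> bool:
    """True if the target instruction is not fully buffered when decoding starts."""
    return bytes_buffered(target, read_cycles) < length
```

With the odd target 201 and the 2-byte MOV instruction, two read opportunities (as at t4 and t5) give 3 buffered bytes and no interlock, whereas a single read opportunity would give only 1 byte and the decode would have to wait.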

Although this embodiment describes the case of a branch instruction to a subroutine, the technique applies to any instruction that consists of a branch processing unit and at least one non-branch processing unit. For example, for branching to an interrupt processing routine, the control storage 33 may be configured to issue the branch directive ahead of the stack pointer decrement and the store of the return address and status word to the stack. For a return instruction from a subroutine or from an interrupt processing routine, the control storage 33 may be configured to issue the branch directive ahead of the load of the return address and status word from the stack and the stack pointer increment.

In the embodiment the capacity of the instruction buffer 32 is 4 bytes, but it may be 5 bytes or more. In that case at least 5 bytes of branch-target instruction code are held in the instruction buffer 32 by the time the branch-target instruction of a subroutine branch is decoded, which is more effective. Alternatively, the capacity of the instruction buffer 32 may be 3 bytes.
Furthermore, in the embodiment the instruction reading unit 60 reads at most 2 bytes of instruction code in one machine cycle, but this may be 4 bytes or more. The longer the instruction word read in one machine cycle, the lower the probability of an instruction misalignment that straddles a read address boundary; however long the word is made, though, instruction misalignment never disappears entirely.
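The relationship between read width and misalignment can be made concrete with a small calculation. The Python sketch below rests on assumptions that are not in the description: reads are aligned to the read width, the instruction is no longer than the read width, and the instruction's start offset within a read word is uniformly distributed. The function name `straddle_fraction` is hypothetical.

```python
from fractions import Fraction


def straddle_fraction(length: int, read_width: int) -> Fraction:
    """Fraction of start offsets at which an instruction crosses a read boundary.

    Assumes aligned reads of `read_width` bytes and a uniformly distributed
    start offset in the range 0 .. read_width - 1.
    """
    straddling = sum(
        1 for offset in range(read_width) if offset + length > read_width
    )
    return Fraction(straddling, read_width)
```

Under these assumptions a 2-byte instruction straddles a boundary at 1/2 of the offsets with 2-byte reads, 1/4 with 4-byte reads, and 1/8 with 8-byte reads. The fraction shrinks as the read width grows but stays above zero for any instruction longer than 1 byte, which is consistent with the statement above.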

The present invention relates to a technique that can suitably prevent a pipeline interlock caused by the absence of an instruction even when the branch destination of an instruction is misaligned, and it can be suitably used in any processor with a pipeline structure, whether a CISC processor or a RISC processor.

Block diagram showing the schematic configuration of a processor according to an embodiment of the present invention.
(a) Configuration diagram of the control storage 33 in the embodiment. (b) Diagram showing the operation of the control storage 33 in the embodiment.
Block diagram showing the detailed configuration of the processor in the embodiment.
Operation timing chart of the processor in the embodiment.
Operation timing chart of a processor according to the related art.
Configuration diagram of a control storage according to the related art.

Explanation of reference numerals

10 processor
11 input/output bus
12 bus interface circuit
13 instruction read circuit
14 decoding and address calculation circuit
15 operation execution circuit
131 instruction buffer
141 control storage
211 to 231 control storage areas

Claims

A processor for prefetching instructions, comprising:
fetch address holding means for holding an address of an instruction to be prefetched; and
control means which, in control processing relating to execution of a branch instruction that includes a branch processing unit that rewrites the address held in the fetch address holding means and another processing unit, controls execution of the branch processing unit first and then controls execution of the other processing unit, wherein the branch processing unit and the other processing unit are executed in the same pipeline stage.

The processor according to claim 1, wherein the control processing includes execution control of a subroutine call instruction, and
the control means, in controlling the branch processing unit, writes a branch destination address obtained from an operand of the subroutine call instruction into the fetch address holding means, and, in controlling the other processing unit, controls calculation of the address of an area in which a return address is to be saved and saving of the return address.

The processor according to claim 2, wherein the control processing includes control of a transition to an interrupt processing routine, and
the control means, in controlling the branch processing unit, writes a start address of a predetermined interrupt processing routine into the fetch address holding means, and, in controlling the other processing unit, controls calculation of the address of an area in which a return address is to be saved and saving of the return address.

The processor according to claim 1, wherein the control processing includes control of a return instruction instructing return from a subroutine and of a return instruction instructing return from an interrupt processing routine, and
the control means, in controlling the branch processing unit, writes a return address into the fetch address holding means.

The processor according to claim 3, wherein the control means includes:
a control storage unit that stores a plurality of microinstructions for realizing the operation of the control processing; and
a microinstruction issuing unit that sequentially reads the microinstructions corresponding to each processing unit from the control storage unit and issues the control signals designated by each microinstruction to the inside of the processor,
wherein the microinstruction issuing unit:
for a subroutine call instruction, issues, in this order, a microinstruction instructing that the address specified in the instruction be stored in the fetch address holding means, a microinstruction instructing calculation of the address of the area in which the return address is to be saved, and a microinstruction instructing that the return address be saved; and
for control of a transition to the interrupt processing routine, issues, in this order, a microinstruction instructing that a predetermined interrupt processing start address be stored in the fetch address holding means, a microinstruction instructing calculation of the address of the area in which the return address is to be saved, and a microinstruction instructing that the return address be saved.

The processor according to claim 3, wherein the control means includes:
an address calculation unit that pre-decodes a subroutine call instruction and calculates the branch destination address of the instruction;
a control storage unit that stores a plurality of microinstructions that implement the control processing; and
a microinstruction issuing unit that sequentially reads the microinstructions corresponding to the respective processing units from the control storage unit and issues them inside the processor,
wherein the microinstruction issuing unit:
for a subroutine call instruction, issues, in this order, a microinstruction instructing that the address obtained by the address calculation unit be stored in the fetch address holding means and that the return address be held in a store buffer, which holds data to be stored in memory, a microinstruction instructing that the stack pointer pointing to the save area be updated and that the updated contents of the stack pointer be stored in an operand address buffer, which holds an operand address, and a microinstruction instructing that the contents of the store buffer be stored in the save area pointed to by the operand address buffer; and
for control of a transition to the interrupt processing routine, issues a microinstruction instructing that the start address of the predetermined interrupt processing be stored in the fetch address holding means and that the return address be held in the store buffer, a microinstruction instructing that the stack pointer be updated and that the updated contents of the stack pointer be stored in the operand address buffer, and a microinstruction instructing that the contents of the store buffer be stored in the save area pointed to by the operand address buffer.