JP2000305778A

JP2000305778A - Instruction processor

Info

Publication number: JP2000305778A
Application number: JP11447599A
Authority: JP
Inventors: Hiroshi Kadota; 浩廉田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1999-04-22
Filing date: 1999-04-22
Publication date: 2000-11-02

Abstract

PROBLEM TO BE SOLVED: To improve program execution performance by eliminating useless DRAM accesses: instruction reading is stopped when a branch instruction is detected and restarted once the branch instruction has been executed. SOLUTION: A branch instruction code detecting means 118 detects a branch instruction while the fetched instruction is still on the instruction bus. The output of the code detecting means is input to the access stop terminal 121 of a selective memory access means 116, and the means 116 suspends access until the next access restart control input arrives. The branch instruction is stored in a fetch control part like a normal instruction, then decoded and executed, which determines whether the branch is taken. The next instruction is then accessed at the appropriate address based on that condition judgement, and the fetched instruction is output to the instruction bus. The next correct instruction is stored in the fetch control part and is thereafter decoded and executed as a normal instruction.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a high-speed processor LSI, and in particular to a pipelined processor that has an instruction prefetch mechanism and whose main memory is embedded on the same chip.

[0002]

[Prior art] Advances in LSI fine-process manufacturing technology have made it possible to embed large-scale logic gates and DRAM (dynamic random access memory) on a single chip, so that a processor core and its main memory can now be built on one chip.

[0003] The high-speed SRAM (static random access memory) conventionally used as on-chip memory has no access overhead, but its capacity is small, so it was not sufficient to serve as main memory on its own. DRAM capacity, by contrast, is large enough that the main memory can be built from on-chip memory alone.

[0004] 880 in FIG. 8 shows the configuration of a conventional, simple DRAM-embedded processor chip. Because of sense amplifier operation and precharge operation, a DRAM has a considerably longer access cycle time than an SRAM. By widening the input/output bit width, however, it can supply data and instructions to a high-speed processor as long as consecutive addresses are accessed.

[0005] When a processor executes a general program, however, instructions and data are rarely accessed strictly in memory-address order; the memory access address changes according to various conditions. FIG. 7 shows examples of programs in which the execution order of instructions is changed by a branch instruction. In example 770, a branch instruction 773 is present among normal instructions 771, and the several instructions 772 that follow the branch instruction are skipped without being executed. 775 in the same figure is the case of repetitive processing (a loop): after normal instructions 776, the instructions from the loop start instruction 777 to the loop end instruction 778 are executed repeatedly as long as return condition 779 is satisfied, after which the instructions following the loop are executed in order. The return condition is usually of the form "until the repetition count reaches a specified value".

[0006] In general, in programs dominated by numerical computation, such as signal processing, the probability that the condition of a conditional branch instruction holds and the program actually branches is very high. For data accesses, in applications close to numerical computation, the way the program is written can keep accesses more or less in sequential order. For instructions, however, every time a branch occurs the processor must cancel the group of instructions that the instruction prefetch mechanism has already fetched, together with the next access that has already been started, and then wait for the DRAM access cycle to finish before starting the access for the required instruction. Timing charts for this situation are shown in FIG. 9 and FIG. 10: FIG. 9 for execution of a simple branch instruction, and FIG. 10 for execution of a loop instruction.

[0007] In both figures, 950 and 150 denote the state of the instruction bus access address section, 951 and 151 the instruction fetch output state of the instruction bus, 952 and 152 the storage state of the instruction fetch control unit, 953 and 153 the instruction decode state, and 954 and 154 the state of the instruction execution unit.

[0008] In FIG. 9, by the time branch instruction 990 is being decoded and executed, fetch 991 of the following instructions has already taken place, a further instruction access 992 has also been issued, and the corresponding DRAM output, that is, instruction fetch 993, has been performed as well. As a result, the nine cycles indicated by 994, including the instruction cycles corresponding to 991 and 993, are wasted. The access to the correct branch destination is made in cycle 995, a cycle in which the DRAM can be accessed after the branch instruction has reached the execution stage; the instruction is output onto the instruction bus at 996 and stored in the instruction fetch control unit at 997.

[0009] Similarly, in the loop processing of FIG. 10, when the loop end instruction indicated by 1100 is executed and the memory access to the loop return destination is about to start, instruction fetch 191 and instruction access 192 have already been completed. The DRAM access 1101 to the return address can start only after instruction fetch 193, which corresponds to access 192, has finished. The instruction at the start of the loop is fetched onto the bus at 1102 and stored in the instruction fetch control unit at 1103; the nine instruction cycles indicated by 194 up to that point are wasted.

[0010] To reduce the dead time that accompanies such DRAM accesses and achieve high-speed processor operation, the conventional approach has been to insert an additional large SRAM or cache memory between the processor and the DRAM. 883 in FIG. 8 outlines this configuration. This suppresses the performance degradation caused by branch instructions, but the cache increases the chip area and therefore raises the chip cost substantially.

[0011]

[Problem to be solved by the invention] Accordingly, an object of the present invention is to provide a processor, in a DRAM-embedded processor LSI, that suffers little performance degradation even when branch instructions or loop iterations are executed, without using a high-speed buffer memory element such as a cache.

[0012]

[Means for solving the problem] A first means for solving the problem is an instruction processing apparatus with the following configuration. In a processor whose data input/output signal line bundle and instruction input line bundle are separate, branch instruction code detecting means is provided for each of the one or more fields of the instruction input line bundle that correspond to an instruction code. The instruction prefetch control part of the processor comprises address counter means for the memory accesses used in instruction prefetch, register means for storing a branch destination address, and selective memory access means that receives the outputs of the counter means and the register means and selects one of them for output. The selective memory access means has a selection input terminal, a control input terminal that temporarily stops the memory access operation, and a control terminal that restarts memory access. The logical OR of the detection output signals of the detecting means is connected to the control input terminal that temporarily stops memory access; the signal line that indicates branch taken or not taken, output when the branch instruction in question is executed, is connected to the selection input terminal; and the signal indicating the end of execution of the branch instruction is connected to the terminal that restarts memory access.

[0013] A second means for solving the problem is an instruction processing apparatus with the following configuration. In a processor whose data input/output signal line bundle and instruction input line bundle are separate, and which has a loop processing start instruction and a loop processing end instruction, code detecting means for the loop processing start instruction and the loop processing end instruction is provided for each of the one or more fields of the instruction input line bundle that correspond to an instruction code. The instruction prefetch control part of the processor comprises counter means that indicates the memory access address for instruction prefetch, register means for storing a loop return address, selective memory access means that receives the outputs of the counter means and the register means and selects one of them for output, the selective memory access means having a selection input terminal, and loop counter means for counting the number of loop iterations. The detection signal produced when the detecting means detects the start of loop processing is connected to the control signal terminal that causes the loop return address register to store the memory address and instruction address at that time, and also to the control signal terminal that sets the loop count in the loop counter means. Logic means is provided that propagates the detection signal produced when the detecting means detects the end of loop processing to the selection control input of the selective memory access means when the value of the loop counter means is not zero, and does not propagate it when the value is zero.

[0014]

[Embodiments of the invention] Embodiments of the present invention are described below with reference to the drawings.

[0015] FIG. 1 and FIG. 2 show the configurations of the first and second embodiments of the present invention, respectively.

[0016] In the figures, 101 and 201 are LSI chips; 102 and 202 are processor cores; 103 and 203 are embedded DRAMs (dynamic random access memories); 104 and 204 are instruction signal line bundles (hereinafter, instruction buses) that supply instructions from the DRAM to the processor; and 105 and 205 are data signal line bundles (hereinafter, data buses) that carry data between the DRAM and the processor. Each bus is divided into an address part and a content part. The instruction bus consists of the instruction access address section 106, 206 and the instruction access output section 107, 207, which is the instruction content part. The data bus consists of the data access address section 108, 208 and the data access input/output section 109, 209, which is the data content part.

[0017] Inside the processor core are a data processing unit (data path) 111, 211 and a control unit (control path) 110, 210 that performs the various kinds of control. The control unit contains an instruction fetch control unit 112, 212, an instruction decode unit 113, 213, and a control signal generation unit 114, 214. The fetch control unit consists of an instruction prefetch counter 115, 215, which generates consecutive addresses for fetching in advance the instructions that are due to be executed; selective memory access means 116, 216, which selects either this counter's address or another, non-consecutive address and outputs it to the memory for instruction access; and a selection signal input terminal 117, 217 for this selection.

[0018] In addition, in FIG. 1, which shows the configuration of the first embodiment of the present invention, branch instruction code detecting means 118 is provided on the predetermined field of the instruction access output section of the instruction bus so that a branch instruction is detected immediately after it is fetched from memory.

[0019] FIG. 3 shows how the branch instruction detecting means is installed. In this figure, 330 is the overall instruction code format (16 bits), 331 is its opcode field (6 bits), 332 is the first operand field (5 bits), and 333 is the second operand field (5 bits). In this example the branch instruction is assumed to have the opcode 1100. The first operand of the branch instruction is, for example, the branch condition, and the second operand is, for example, the number of the internal register that holds the branch destination address. The first element of the detecting means 118 is a logic gate 334 that outputs logic 1 as soon as the pattern 1100 is input from the instruction bus. In the example of FIG. 3, the instruction access output section 107 is three instructions wide and the instructions are output in parallel, so three first elements are provided; in the example of FIG. 1 there are two. The outputs of these first elements are ORed by logic circuit 335, and its output 336 is the output of the detecting means 118.
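The detection scheme of FIG. 3 can be sketched in software as a per-slot opcode comparison followed by an OR. This is only an illustrative model: the bit positions, the zero-extension of the opcode 1100 to the 6-bit field, and the names `is_branch`, `detect_branch`, and `encode` are assumptions, not details given in the specification.

```python
# Assumed layout (hypothetical): 16-bit instruction word with the 6-bit opcode
# in the most significant bits, followed by two 5-bit operand fields (format 330).
OPCODE_SHIFT = 10
OPCODE_MASK = 0x3F
# The text gives the branch opcode as "1100"; zero-extending it to the 6-bit
# opcode field is an assumption made only for this sketch.
BRANCH_OPCODE = 0b001100


def is_branch(instruction: int) -> bool:
    """Model of one first detection element (gate 334): match one slot's opcode."""
    return ((instruction >> OPCODE_SHIFT) & OPCODE_MASK) == BRANCH_OPCODE


def detect_branch(bus_slots: list[int]) -> bool:
    """Model of OR circuit 335: true if any parallel slot holds a branch opcode."""
    return any(is_branch(slot) for slot in bus_slots)


def encode(opcode: int, operand1: int, operand2: int) -> int:
    """Pack a 16-bit instruction word (hypothetical helper for the example)."""
    return (opcode << OPCODE_SHIFT) | (operand1 << 5) | operand2


# Three instructions delivered in parallel, as in the FIG. 3 example.
three_slots = [
    encode(0b000001, 1, 2),
    encode(BRANCH_OPCODE, 3, 4),
    encode(0b000010, 5, 6),
]
no_branch_slots = [encode(0b000001, 0, 0)] * 3
print(detect_branch(three_slots), detect_branch(no_branch_slots))
```

Because the comparison looks only at the opcode bits on the bus, it needs no decode stage, which is what allows the detection to happen while the instruction is still on the instruction bus.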

[0020] In general, the access cycle time of a DRAM is several times the operating cycle time of the processor. The instruction bus from the embedded DRAM is therefore made wide so that several instructions are read out in parallel, and the cycle-time gap is bridged by raising the average throughput.
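The trade-off between bus width and DRAM cycle time can be illustrated with a one-line calculation. The function name and the numbers below are hypothetical; the specification gives no concrete cycle-time ratio, and the model assumes purely sequential fetches with no branches.

```python
def sustained_instructions_per_cycle(slots_per_access: int,
                                     processor_cycles_per_access: int) -> float:
    """Average instructions delivered per processor cycle under sequential fetch.

    slots_per_access: instructions read in parallel by one DRAM access.
    processor_cycles_per_access: DRAM access cycle time in processor cycles.
    """
    return slots_per_access / processor_cycles_per_access


# Hypothetical figures: a bus three instructions wide and a DRAM access that
# takes three processor cycles keep a one-instruction-per-cycle pipeline fed.
print(sustained_instructions_per_cycle(3, 3))
# A narrower bus with a slower DRAM would starve the same pipeline.
print(sustained_instructions_per_cycle(2, 4))
```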

[0021] Further, in the first embodiment of the present invention, this output 336 is connected to the access stop input terminal 121 of the selective memory access means 116, as shown in FIG. 1. The selective memory access means 116 takes two input addresses: the output of the instruction prefetch counter 115, which indicates the memory address immediately following the address of the instruction currently being fetched, and the output of the branch destination address register 119, which holds the branch destination address (or will hold it when a branch instruction is executed in the future). Depending on the polarity of the signal at the selection signal input terminal 117, the address from 115 or from 119 is selected. In normal operation the selected address is output unchanged to the instruction access address section 106 and an instruction fetch is performed. When logic 1 is input to the access stop input terminal 121, however, address output and the fetch operation stop immediately. A suspended instruction fetch operation returns to normal operation, and fetching resumes, when an assert signal is input to the access restart control input terminal 120. After a branch instruction has been fetched, it is stored in the instruction fetch control unit 112; when its turn comes it is decoded by the instruction decode unit 113, control signals are generated by 114, and it is executed by the branch instruction processing means 122 in the data processing unit 111. If the branch condition is met, the branch is taken: a branch occurrence notification signal 124 is sent from 122 to the selection signal input terminal 117, and the branch destination address register 119 is selected. Conversely, if the branch condition is not met, the instruction at the next sequential address must be fetched, so the prefetch address counter is selected.
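The stop, select, and restart behaviour of the selective memory access means can be modelled as a small state machine. This is a behavioural sketch under stated assumptions: the class and method names are hypothetical, addresses advance by one per access, the prefetch counter is reloaded from the branch target after a taken branch, and the select and restart signals are delivered together in a single `resume` call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SelectiveMemoryAccess:
    """Behavioural model of means 116 (all names are hypothetical)."""
    prefetch_counter: int          # counter 115: next sequential address
    branch_target: int = 0         # register 119: branch destination address
    select_branch: bool = False    # level on selection terminal 117
    stopped: bool = False          # set via terminal 121, cleared via terminal 120

    def stop(self) -> None:
        """Access stop input 121: a branch opcode was seen on the bus."""
        self.stopped = True

    def resume(self, branch_taken: bool) -> None:
        """Restart input 120, with the taken/not-taken level applied to 117."""
        self.select_branch = branch_taken
        self.stopped = False

    def next_address(self) -> Optional[int]:
        """Return the address to access this cycle, or None while stopped."""
        if self.stopped:
            return None
        if self.select_branch:
            address = self.branch_target
            self.select_branch = False
        else:
            address = self.prefetch_counter
        # Assumption: sequential prefetch continues from the address just used.
        self.prefetch_counter = address + 1
        return address


unit = SelectiveMemoryAccess(prefetch_counter=100, branch_target=200)
trace = [unit.next_address()]          # normal sequential fetch
unit.stop()                            # branch detected on the bus
trace.append(unit.next_address())      # no DRAM access is issued
unit.resume(branch_taken=True)         # branch executed and taken
trace.append(unit.next_address())      # fetch from the branch target
trace.append(unit.next_address())      # sequential fetch continues
print(trace)
```

The `None` entry in the trace corresponds to the access that the conventional design would have issued and then discarded.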

[0022] At the end of branch instruction execution, a branch instruction execution end signal 123 is sent from the branch instruction processing means 122 to the access restart control input terminal 120. FIG. 5 shows the operation of this embodiment. In the figure, 550 denotes the state of the instruction bus access address section, 551 the instruction fetch output state of the instruction bus, 552 the storage state of the instruction fetch control unit, 553 the instruction decode state, and 554 the state of the instruction execution unit.

[0023] At time 560, while the fetched instruction is still on the instruction bus, the branch instruction code detecting means 118 detects branch instruction 556, and the code detecting means output 336 is input to the access stop terminal 121 of the selective memory access means 116. The selective memory access means 116 then suspends access until the next access restart control input arrives, so the address access 555 that was due to occur next can be cancelled. Like a normal instruction, branch instruction 556 is stored in the fetch control unit, decoded, and executed; execution determines whether the branch is taken, and at the time indicated by 557 the instruction is accessed at the appropriate address based on that condition judgement. As a result, the fetched instruction is output onto the instruction bus at 558. The next correct instruction is then stored in the fetch control unit at 559 and is thereafter decoded and executed as a normal instruction. The invalid cycles 561 between the branch instruction and the next executed instruction number six in this example.

[0024] The second embodiment, shown in FIG. 2, is a configuration designed for loop processing. To perform highly efficient loop processing with this configuration, a loop start instruction and a loop end instruction are added to the processor in addition to the normal conditional branch instructions. 225 is the loop processing start instruction code and loop processing end instruction code detecting means provided on the output section of the instruction bus. 226 is the loop return address register, which stores the address at which the loop processing start instruction is located; this is achieved by transferring the value of the instruction prefetch counter 215 to register 226 when the detecting means 225 detects a loop processing start instruction code. FIG. 4 shows a configuration example of the detecting means 225. Here two kinds of code must be detected, the loop processing start instruction 440 and the loop processing end instruction 441, so the required number of first detection elements 442 and 443 for each are placed side by side, the outputs of the elements are ORed by logic circuits 444 and 445, and the resulting outputs 446 and 447 are the two outputs of the detecting means 225. 227 is logic means that controls loop processing autonomously, and 228 is the loop counter, whose value is initially set to 0. When a loop processing start instruction is executed at the instruction execution stage, one of the following is performed depending on the condition.
1) Loop counter 228 is 0: the value obtained by subtracting 1 from the repetition count specified in the operand part of the loop processing start instruction 440 is set in the loop counter 228.
2) Loop counter 228 is positive: the loop counter 228 is decremented, reducing its value by 1.

[0025] Likewise, when the detecting means 225 detects a loop processing end instruction code 441, one of the following is performed depending on the condition.
1) Loop counter 228 is 0: the specified number of loop repetitions has been completed, so no branch is taken and execution proceeds to the next instruction. That is, the selective memory access means 216 selects the instruction prefetch counter 215 (which holds the incremented value following the memory address currently being accessed) and performs the next memory access.
2) Loop counter 228 is positive: further repetitions of the loop are required, so in order to access the loop processing start instruction the selective memory access means 216 selects the loop return address register 226, the branch is made on the next memory access, and the value of the loop return address register 226 is loaded into the instruction prefetch counter 215.
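The two rule sets above, for the loop start and loop end instructions, can be checked with a short behavioural model. This sketch ignores pipeline timing and treats detection and execution as one step; the class name, method names, and the addresses used in `run_loop` are hypothetical, and a repetition count of at least 1 is assumed.

```python
class LoopController:
    """Behavioural model of loop counter 228 and return address register 226."""

    def __init__(self) -> None:
        self.loop_counter = 0      # counter 228, initially 0
        self.return_address = 0    # register 226

    def on_loop_start(self, address: int, repeat_count: int) -> None:
        """Rules for the loop start instruction."""
        self.return_address = address          # latch the loop start address
        if self.loop_counter == 0:
            if repeat_count < 1:
                raise ValueError("repeat_count must be at least 1 in this sketch")
            self.loop_counter = repeat_count - 1   # case 1: first entry
        else:
            self.loop_counter -= 1                 # case 2: repeated entry

    def on_loop_end(self, next_sequential: int) -> int:
        """Rules for the loop end instruction: return the next fetch address."""
        if self.loop_counter == 0:
            return next_sequential     # case 1: fall through
        return self.return_address     # case 2: go back to the loop start


def run_loop(repeat_count: int) -> int:
    """Count how many times the body of a single loop executes."""
    loop_start, loop_end = 10, 14      # hypothetical instruction addresses
    controller = LoopController()
    pc, body_runs = loop_start, 0
    while pc <= loop_end:
        if pc == loop_start:
            controller.on_loop_start(pc, repeat_count)
            pc += 1
        elif pc == loop_end:
            body_runs += 1
            pc = controller.on_loop_end(pc + 1)
        else:
            pc += 1
    return body_runs


print(run_loop(1), run_loop(3))
```

Because the return address points at the loop start instruction itself, that instruction is re-executed on every iteration, which is why the counter is loaded with the repetition count minus 1 on first entry and decremented on each later entry.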

[0026] FIG. 6 shows the operation of this configuration. If the instructions on the bus fetched by the previous access, shown at 660, include a loop processing end instruction, the loop processing start/end instruction code detecting means 225 detects it at 660. Depending on the value of the loop counter, either the return address or the next sequential address is prepared, and it is accessed at 662.

[0027] In FIG. 6, L.end denotes the loop processing end instruction and L.sta the loop processing start instruction. The return address contains the loop processing start instruction; it is fetched at 663, stored in the fetch control unit as shown at 665, decoded, and executed. The invalid instruction cycle 661 between loop processing end instruction 664 and loop processing start instruction 665 is only one cycle.
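The cycle counts quoted for FIG. 5, FIG. 6, FIG. 9, and FIG. 10 can be combined into a rough estimate of the saving. The per-event penalties below are the figures stated in the description; the workload (100 branches and 1000 loop-backs) is a hypothetical example, and the function name is an assumption.

```python
CONVENTIONAL_PENALTY = 9   # wasted cycles per branch or loop-back (FIG. 9, FIG. 10)
BRANCH_PENALTY = 6         # invalid cycles per branch, first embodiment (FIG. 5)
LOOP_PENALTY = 1           # invalid cycles per loop-back, second embodiment (FIG. 6)


def wasted_cycles(branches: int, loop_backs: int,
                  branch_penalty: int, loop_penalty: int) -> int:
    """Total invalid cycles for a given number of branches and loop-backs."""
    return branches * branch_penalty + loop_backs * loop_penalty


# Hypothetical workload, chosen only to illustrate the arithmetic.
branches, loop_backs = 100, 1000
before = wasted_cycles(branches, loop_backs,
                       CONVENTIONAL_PENALTY, CONVENTIONAL_PENALTY)
after = wasted_cycles(branches, loop_backs, BRANCH_PENALTY, LOOP_PENALTY)
print(before, after)
```

For loop-dominated code of this kind, most of the saving comes from the loop-back penalty dropping from nine cycles to one.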

[0028]

[Effects of the invention] As is clear from the above description, in a processor system that uses DRAM as its storage element (particularly when a wide parallel input/output bit width is available), the present invention achieves a substantial improvement in program execution performance by eliminating useless DRAM accesses when changes in instruction execution order caused by branch instructions and loop processing instructions occur frequently. Eliminating useless memory accesses also saves the power those accesses would have consumed, so lower power consumption is achieved as well. The hardware required is small, consisting of simple logic circuits, counters, and the like, so the invention delivers the considerable benefits of higher performance and lower power consumption at little cost.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of the first embodiment of the present invention.

FIG. 2 is a configuration diagram of the second embodiment of the present invention.

FIG. 3 is a configuration diagram of the instruction code of a branch instruction, the fields of the instruction bus, and the instruction code detecting means.

FIG. 4 is a configuration diagram of the instruction codes of the loop processing start and end instructions, the fields of the instruction bus, and the instruction code detecting means.

FIG. 5 is a diagram showing the operation of the first embodiment.

FIG. 6 is a diagram showing the operation of the second embodiment.

FIG. 7 is a diagram showing the execution order of instructions.

FIG. 8 is a diagram of conventional processor and memory configuration examples.

FIG. 9 is a diagram showing the operation of branch instruction execution in the conventional scheme.

FIG. 10 is a diagram showing the operation of loop instruction execution in the conventional scheme.

[Explanation of symbols]

101 LSI chip
102 Processor core unit
103 DRAM
104 Instruction bus
105 Data bus
106 Instruction access address unit
107 Instruction access output unit
108 Data access address unit
109 Data access input/output unit
110 Control processing unit
111 Data processing unit
112 Instruction fetch control unit
113 Instruction decode unit
114 Control signal generation unit
115 Instruction prefetch counter
116 Selective memory access means
117 Selection signal input terminal
118 Branch instruction code detection means
119 Branch destination address register
120 Access resume control input terminal
121 Access stop input terminal
122 Branch instruction processing means
123 Branch instruction execution end signal
124 Branch occurrence notification signal
150 Instruction bus access address unit state
151 Instruction bus instruction fetch output state
152 Instruction fetch control unit storage state
153 Instruction decode state
154 Instruction execution unit state
191 Invalid instruction fetched by a previous access and present on the bus
192 Access to an invalid address
193 Fetched invalid instruction on the bus
194 Invalid instruction cycle, or cycle in which nothing is executed
201 LSI chip
202 Processor core unit
203 DRAM
204 Instruction bus
205 Data bus
206 Instruction access address unit
207 Instruction access output unit
208 Data access address unit
209 Data access input/output unit
210 Control processing unit
211 Data processing unit
212 Instruction fetch control unit
213 Instruction decode unit
214 Control signal generation unit
215 Instruction prefetch counter
216 Selective memory access means
217 Selection signal input terminal
225 Loop processing start/end instruction code detection means
226 Loop return destination address register
227 Loop processing determination logic means
228 Loop count counter
307 Instruction access output unit
330 Instruction code example 1
331 Opcode field
332 First operand field
333 Second operand field
334 First code detection element
335 Second code detection element (OR circuit)
336 Code detection means output
407 Instruction access output unit
430 Instruction code example 1
431 Opcode field
432 First operand field
433 Second operand field
440 Instruction code example 2 (loop processing start instruction)
441 Instruction code example 3 (loop processing end instruction)
442 First element of the loop processing start code detection unit
443 First element of the loop processing end code detection unit
444 Second element of the loop processing start code detection unit (OR circuit)
445 Second element of the loop processing end code detection unit (OR circuit)
446 Loop processing start code detection unit output
447 Loop processing end code detection unit output
550 Instruction bus access address unit state
551 Instruction bus instruction fetch output state
552 Instruction fetch control unit storage state
553 Instruction decode state
554 Instruction execution unit state
555 Canceled access
556 Branch instruction
557 Access to the branch destination address
558 Fetched branch destination instruction
559 Branch destination instruction or re-execution start instruction cycle
560 Instruction fetched by a previous access and present on the bus
561 Invalid instruction cycle, or cycle in which nothing is executed
650 Instruction bus access address unit state
651 Instruction bus instruction fetch output state
652 Instruction fetch control unit storage state
653 Instruction decode state
654 Instruction execution unit state
660 Instruction fetched by a previous access and present on the bus
661 Invalid instruction cycle, or cycle in which nothing is executed
662 Loop return destination address
663 Instructions on the bus, including the loop start instruction
664 Loop end instruction
665 Loop start instruction
770 Case of a pure branch
771 Executed instruction (other than a branch)
772 Instruction not executed
773 Branch instruction
775 Case of loop processing
776 Executed instruction (excluding loop start and end instructions)
777 Loop start instruction
778 Loop end instruction
779 Return condition: (number of repetitions) < (predetermined number)
880 Simple DRAM-embedded processor chip
881 CPU portion
882 Embedded DRAM
883 DRAM-embedded processor chip that also includes a cache
884 On-chip cache portion
950 Instruction bus access address unit state
951 Instruction bus instruction fetch output state
952 Instruction fetch control unit storage state
953 Instruction decode state
954 Instruction execution unit state
990 Branch instruction
991 Invalid instruction fetched by a previous access and present on the bus
992 Access to an invalid address
993 Fetched invalid instruction on the bus
994 Invalid instruction cycle, or cycle in which nothing is executed
995 Access to the branch destination address
996 Fetched branch destination instruction on the instruction bus
997 Cycle in which the branch destination instruction is stored in the instruction fetch control unit
1100 Branch instruction at the end of the loop
1101 Access to the branch destination address
1102 Fetched loop return destination instruction
1103 Return destination instruction cycle

Claims

[Claims]

1. An instruction processor comprising: an instruction prefetch counter; branch destination address storage means; selection means for selecting between the value of the instruction prefetch counter and the value of the branch destination address storage means and outputting the selected value as an address signal; and branch instruction detecting means connected to an instruction input signal, wherein reading of instructions is stopped when the branch instruction detecting means detects a branch instruction, and reading of instructions is restarted when the branch instruction is executed.

2. An instruction processor having a loop processing start instruction and a loop processing end instruction, comprising: loop processing start instruction detecting means connected to an instruction input signal for detecting the loop processing start instruction; loop processing end instruction detecting means connected to the instruction input signal for detecting the loop processing end instruction; an instruction prefetch counter; branch destination address storage means; selecting means for selecting between the value of the instruction prefetch counter and the value of the branch destination address storage means; and a counter for counting the number of loop iterations, wherein, when the loop processing end instruction detecting means detects the loop processing end instruction and the counter indicates that the loop has not ended, the selecting means outputs the value of the branch destination address storage means.
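
The address selection recited in the claims can be illustrated with a small behavioral model. The following Python sketch is not part of the disclosure and is only one possible reading of the claims: the class name `FetchUnit`, its method names, and the assumption that the prefetch counter advances by one address per instruction are all hypothetical. It models stopping and restarting instruction reading around a branch, and selecting the stored return address at a loop end while the return condition (number of repetitions) < (predetermined number) holds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FetchUnit:
    """Hypothetical behavioral model of the claimed address selection."""
    prefetch_counter: int = 0   # instruction prefetch counter
    branch_target: int = 0      # branch destination address storage means
    stopped: bool = False       # True while instruction reading is stopped
    loop_count: int = 0         # counter for the number of loop iterations
    loop_limit: int = 0         # predetermined number of iterations

    def fetch(self) -> Optional[int]:
        """Return the next fetch address, or None while reading is stopped."""
        if self.stopped:
            return None
        address = self.prefetch_counter
        self.prefetch_counter += 1  # assumption: one address per instruction
        return address

    def on_branch_detected(self) -> None:
        """Branch detected on the instruction input: stop reading."""
        self.stopped = True

    def on_branch_executed(self, taken: bool, target: int) -> None:
        """Branch executed: select the proper address and restart reading."""
        if taken:
            self.branch_target = target
            self.prefetch_counter = self.branch_target
        self.stopped = False

    def on_loop_start(self, return_address: int, iterations: int) -> None:
        """Loop start detected: store the return address, reset the counter."""
        self.branch_target = return_address
        self.loop_count = 0
        self.loop_limit = iterations

    def on_loop_end(self) -> bool:
        """Loop end detected: return True if fetching goes back to the start."""
        self.loop_count += 1
        if self.loop_count < self.loop_limit:
            self.prefetch_counter = self.branch_target
            return True
        return False
```

In this model no address is issued between `on_branch_detected()` and `on_branch_executed()`, which corresponds to avoiding the DRAM accesses to invalid addresses shown for the conventional method. A loop set up with `on_loop_start(return_address=5, iterations=3)` returns to address 5 on the first two loop ends and continues sequentially on the third.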