JP2000259408A

JP2000259408A - Software breakpoint in delay slot

Info

Publication number: JP2000259408A
Application number: JP2000062443A
Authority: JP
Inventors: Shigeyuki Abiko; 茂志安孫子; Gilbert Laurenti; ローランティジルベール; Mark Buser; ビュセルマーク; Eric Ponsot; ポンソエリック
Original assignee: Texas Instruments Inc
Current assignee: Texas Instruments Inc
Priority date: 1999-03-08
Filing date: 2000-03-07
Publication date: 2000-09-22

Abstract

PROBLEM TO BE SOLVED: To provide high digital signal processor performance with low power consumption by decoding a software breakpoint instruction whose length equals any one of the instruction length formats. SOLUTION: An instruction buffer unit 106 decodes an instruction fetched from an instruction memory, the instruction having a length selected from a plurality of instruction length formats. The decoded instruction is executed by a data computation unit 112, and a program counter generates the instruction address that is provided to the instruction memory. The instruction buffer unit 106 is operable to decode a first software breakpoint instruction whose length is selected to equal any one of the instruction length formats. In addition, the instruction buffer unit 106 is operable to decode, in one cycle, a second software breakpoint instruction combined with a first no-operation instruction.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD The present invention relates to processors, and more particularly to emulation of a processor for debugging hardware or software.

[0002]

Description of the Related Art A microprocessor is a general-purpose processor that requires high instruction throughput in order to execute software, and it may have a wide range of processing requirements depending on the particular software application involved. It is known to provide a stack that can be used to pass variables from one software routine to another. Such stacks are also used to preserve the contents of the program counter when a first software routine calls a second software routine, so that program flow can return to the first software routine upon completion of the called second routine. A call within the second software routine can in turn call a third routine, and so on. It is also known to provide a software breakpoint instruction for use during software debugging.

[0003]

Problems to Be Solved by the Invention Many different types of processors are known, of which the microprocessor is one example. Digital signal processors (DSPs), for example, are widely used for specific applications such as mobile processing applications. DSPs are typically configured to optimize the performance of the applications concerned, and to achieve this they employ more specialized execution units and instruction sets. Particularly in applications such as, but not limited to, mobile communications, it is desirable to provide ever-increasing DSP performance while keeping power consumption as low as possible.

[0004]

Summary of the Invention Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Features of the dependent claims may be combined with features of the independent claims as appropriate, and not merely as explicitly set out in the claims. The present invention is directed to improving the performance of processors such as, but not exclusively, digital signal processors.

[0005] In accordance with a first aspect of the present invention, there is provided a processor that is a programmable digital signal processor (DSP) offering both high code density and easy programming. The architecture and instruction set are optimized for low power consumption and for efficient execution of DSP algorithms, such as those for wireless telephones, as well as pure control tasks. The processor includes an instruction buffer unit operable to decode an instruction fetched from an instruction memory. The instruction can have any of a number of instruction length formats. The processor also has a data computation unit for executing the instructions decoded by the instruction buffer unit, and a program counter operable to provide the address that is supplied to the instruction memory. The instruction buffer unit is operable to decode a software breakpoint instruction (SWBP) having a length equal to any one of the instruction length formats of the instruction set.

[0006] In accordance with another aspect of the invention, the instruction buffer unit is operable to decode, in one cycle, a software breakpoint instruction combined with a no-operation (NOP) instruction, such that the combined software breakpoint instruction and NOP instruction are treated as a single instruction by the data computation unit.

[0007] In accordance with another aspect of the invention, the software breakpoint instruction has only a small number of instruction length formats, fewer than the number of instruction length formats in the complete instruction set. However, for each instruction length format of the instruction set there is a combination of an SWBP instruction and a NOP instruction whose combined length matches that format.
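The length-matching idea can be illustrated with a small search over format lengths. This is only a sketch under stated assumptions: the 1- to 6-byte instruction formats follow from the 8-bit to 48-bit range given in the description, but the specific SWBP and NOP lengths below (`SWBP_LENGTHS`, `NOP_LENGTHS`) are hypothetical values chosen for illustration and are not taken from the patent.

```python
from itertools import product

# Instruction length formats in bytes (8 to 48 bits, per the description).
INSTRUCTION_LENGTHS = (1, 2, 3, 4, 5, 6)

# Hypothetical format lengths in bytes; the text does not list them here.
SWBP_LENGTHS = (1, 2)
NOP_LENGTHS = (1, 2, 4)


def breakpoint_encoding(target_len):
    """Return (swbp_len, nop_len) whose sum equals target_len.

    nop_len is 0 when a bare SWBP format already has the target length.
    """
    if target_len not in INSTRUCTION_LENGTHS:
        raise ValueError(f"not an instruction length format: {target_len}")
    if target_len in SWBP_LENGTHS:
        return (target_len, 0)
    for swbp_len, nop_len in product(SWBP_LENGTHS, NOP_LENGTHS):
        if swbp_len + nop_len == target_len:
            return (swbp_len, nop_len)
    raise ValueError(f"no SWBP/NOP combination for length {target_len}")


# Every instruction length format maps to some SWBP (+ NOP) encoding.
COVERAGE = {n: breakpoint_encoding(n) for n in INSTRUCTION_LENGTHS}
```

With these assumed lengths, two SWBP formats are enough to cover all six instruction formats, which is the point of pairing the breakpoint with a NOP.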

[0008] In accordance with another aspect of the invention, a method of operating a digital system is provided. A plurality of instructions are executed in an instruction pipeline of a processor core. The instructions are fetched from an instruction memory associated with the processor core in response to a program counter, and the sequence of instructions is selected from an instruction set having a number of instruction length formats. During emulation, an instruction in the sequence is replaced, regardless of its length, with a software breakpoint instruction having the same instruction length format as the instruction being replaced. After a portion of the sequence of instructions has been executed, the execution sequence is broken by executing the software breakpoint instruction. Execution of the sequence is then resumed by replacing the software breakpoint instruction with the previously replaced instruction.
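The replace-and-restore cycle of this method can be sketched as follows. This is a minimal illustration, not the emulator described in the patent: the `BreakpointManager` class, the opcode values `SWBP_OPCODE` and `NOP_OPCODE`, and the use of repeated one-byte padding are all hypothetical stand-ins for an SWBP encoding of the same length as the replaced instruction.

```python
SWBP_OPCODE = 0xE0  # hypothetical opcode byte, for illustration only
NOP_OPCODE = 0x20   # hypothetical padding byte, for illustration only


class BreakpointManager:
    """Patch and restore instructions in a byte-addressed program image."""

    def __init__(self, memory):
        self.memory = memory  # bytearray holding the program
        self.saved = {}       # address -> original instruction bytes

    def set_breakpoint(self, address, length):
        """Replace the instruction at address with a same-length placeholder."""
        if address in self.saved:
            raise ValueError("breakpoint already set at this address")
        if length < 1 or address + length > len(self.memory):
            raise ValueError("instruction lies outside the program image")
        self.saved[address] = bytes(self.memory[address:address + length])
        patch = bytes([SWBP_OPCODE] + [NOP_OPCODE] * (length - 1))
        self.memory[address:address + length] = patch

    def clear_breakpoint(self, address):
        """Put the previously replaced instruction back so execution can resume."""
        original = self.saved.pop(address)
        self.memory[address:address + len(original)] = original
```

Because the patch has exactly the length of the replaced instruction, the bytes of the neighbouring instructions are never disturbed, whatever the length format of the instruction being replaced.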

[0009] Another aspect of the invention is to form the first software breakpoint instruction by selecting one of several software breakpoint instructions such that the combined length of the software breakpoint instruction and a NOP instruction equals the length of the replaced instruction. Another aspect of the invention is that, when a software breakpoint instruction is executed in a delay slot resulting from execution of a discontinuity-type instruction, a return address is stored with the same value as if the replaced instruction were present in the sequence of instructions.

[0010] Particular embodiments in accordance with the invention will now be described, by way of example only, with reference to the accompanying drawings, in which like reference numerals are used to denote like parts. Unless otherwise stated, the figures relate to the processor of FIG. 1.

[0011]

DESCRIPTION OF THE EMBODIMENTS Although the invention finds particular application to digital signal processors (DSPs) implemented, for example, in an application-specific integrated circuit (ASIC), it also finds application to other types of processors. The basic architecture of an example of a processor according to the invention is described below. Processor 100 is a programmable fixed-point DSP core with variable instruction length (8 bits to 48 bits), offering both high code density and easy programming. The architecture and instruction set are optimized for low power consumption and for efficient execution of DSP algorithms, such as those for wireless telephones and dedicated control tasks. Processor 100 includes emulation and code debugging facilities.

[0012] FIG. 1 is a schematic diagram of a digital device 10 in accordance with an embodiment of the present invention. The digital device includes a processor 100 and a processor backplane 20. In a particular example of the invention, the digital device is a digital signal processor device 10 implemented in an application-specific integrated circuit (ASIC). For simplicity, FIG. 1 shows only those portions of microprocessor 100 that are needed to understand an embodiment of the present invention. Details of the general construction of DSPs are well known and may be found elsewhere. For example, U.S. Pat. No. 5,072,418 issued to Frederick Boutaud et al. describes a DSP in detail and is incorporated herein by reference. U.S. Pat. No. 5,329,471 issued to Gary Swoboda et al. describes in detail how to test and emulate a DSP and is incorporated herein by reference. Details of the portions of microprocessor 100 relevant to an embodiment of the present invention are explained below so that one of ordinary skill in the microprocessor art can make and use the invention.

[0013] Several example devices that can benefit from aspects of the present invention are described in U.S. Pat. No. 5,072,418, incorporated herein by reference, particularly with reference to FIGS. 2 through 18 of that patent. A microprocessor incorporating an aspect of the present invention to improve performance and reduce cost can be used to further improve the devices described in U.S. Pat. No. 5,072,418. Such devices include, but are not limited to, industrial process controls, automotive systems, motor controls, robotic control devices, satellite communication systems, echo cancellation devices, modems, video imaging devices, speech recognition devices, and encrypted vocoder-modem devices. A description of various architectural features of the microprocessor of FIG. 1 and of the complete instruction set is given in co-assigned application Ser. No. 09/410,977 (TI-28433), which is incorporated herein by reference.

[0014] As shown in FIG. 1, processor 100 forms a central processing unit (CPU) having a processor core 102 and a memory interface unit 104 that interfaces the processor core 102 with memory units external to the processor core 102. Processor backplane 20 includes a backplane bus 22, to which the memory management unit 104 of the processor is connected. An instruction memory 24, peripheral devices 26, and an external interface 28 are also connected to the backplane bus 22. It will be appreciated that in other examples the invention can be implemented using different configurations and/or different technologies. For example, processor 100 may form a first integrated circuit, with the processor backplane 20 separate from it. Processor 100 may, for example, be a DSP separate from and mounted on a backplane 20 that supports a backplane bus 22 and peripheral and external interfaces. Processor 100 may, for example, be a microprocessor rather than a DSP, and may be implemented in technologies other than ASIC technology. The processor, or a processor including the processor, may be implemented in one or more integrated circuits.

[0015] FIG. 2 shows the basic structure of an embodiment of the processor core 102. As illustrated, this embodiment of the processor core 102 includes four elements, namely an instruction buffer unit (I unit) 106 and three execution units. The execution units are a program flow unit (P unit) 108, an address data flow unit (A unit) 110, and a data computation unit (D unit) 112, which execute instructions decoded by the instruction buffer unit (I unit) 106 and control and monitor program flow.

[0016] FIG. 3 shows the P unit 108, A unit 110, and D unit 112 of the processor core 102 in more detail, together with the bus structure connecting the various elements of the processor core 102. The P unit 108 includes, for example, loop control circuitry, GoTo/branch control circuitry, and various registers for controlling and monitoring program flow, such as repeat counter registers and interrupt mask, flag, or vector registers. The P unit 108 is coupled to general-purpose data write buses (EB, FB) 130, 132, data read buses (CB, DB) 134, 136, and an address constant bus (KAB) 142. The P unit 108 is also coupled to sub-units within the A unit 110 and D unit 112 via various buses labeled CSR, ACB, and RGD.

[0017] As shown in FIG. 3, in this embodiment the A unit 110 includes a register file 30, a data address generation sub-unit (DAGEN) 32, and an arithmetic and logic unit (ALU) 34. The A unit register file 30 includes various registers, for example 16-bit pointer registers (AR0 to AR7) and data registers (DR0 to DR3) that may also be used for data flow as well as address generation. The register file also includes 16-bit circular buffer registers and 7-bit data page registers. In addition to the general-purpose buses (EB, FB, CB, DB) 130, 132, 134, 136, a data constant bus 140 and an address constant bus 142 are coupled to the A unit register file 30. The A unit register file 30 is coupled to the A unit DAGEN unit 32 by unidirectional buses 144 and 146, which operate in opposite directions. The DAGEN unit 32 includes 16-bit X/Y registers and coefficient and stack pointer registers, for example for controlling and monitoring address generation within the processing engine 100.

[0018] The A unit 110 also includes the ALU 34, which provides a shifter function as well as the functions typically associated with an ALU, such as addition, subtraction, and the AND, OR, and XOR logical operators. The ALU 34 is also coupled to the general-purpose buses (EB, DB) 130, 136 and to an instruction constant data bus (KDB) 140. The A unit ALU is coupled to the P unit 108 by a PDA bus for receiving register constants from the P unit 108 register file. The ALU 34 is also coupled to the A unit register file 30 by buses RGA and RGB for receiving the contents of address and data registers, and by a bus RGD for forwarding the contents of the address and data registers of the register file 30.

[0019] In accordance with the illustrated embodiment of the invention, the D unit 112 includes a D unit register file 36, a D unit ALU 38, a D unit shifter 40, and two multiply-and-accumulate units (MAC1, MAC2) 42 and 44. The D unit register file 36, D unit ALU 38, and D unit shifter 40 are coupled to buses (EB, FB, CB, DB, KDB) 130, 132, 134, 136, and 140, and the MAC units 42 and 44 are coupled to buses (CB, DB, KDB) 134, 136, 140 and to a data read bus (BB) 144. The D unit register file 36 includes 40-bit accumulators (AC0 to AC3) and a 16-bit transition register. In addition to the 40-bit accumulators, the D unit 112 can also use the 16-bit pointer and data registers of the A unit 110 as source or destination registers. The D unit register file 36 receives data from the D unit ALU 38 and from MAC1 42 and MAC2 44 over accumulator write buses (ACW0, ACW1) 146, 148, and from the D unit shifter 40 over accumulator write bus (ACW1) 148. Data is read from the D unit register file accumulators to the D unit ALU 38, D unit shifter 40, MAC1 42, and MAC2 44 over accumulator read buses (ACR0, ACR1) 150, 152. The D unit ALU 38 and D unit shifter 40 are also coupled to sub-units of the A unit 110 via various buses labeled EFC, DRB, DR2, and ACB.

[0020] Referring to FIG. 4, there is shown an instruction buffer unit 106 in accordance with the present invention, including a 32-word instruction buffer queue (IBQ) 502. The IBQ 502 includes 32×16-bit registers 504, logically divided into 8-bit bytes 506. Instructions arrive at the IBQ 502 via the 32-bit program bus (PB) 122. The instructions are fetched in a 32-bit cycle into the location pointed to by the local write program counter (LWPC) 532. The LWPC 532 is contained in a register located in the P unit 108. The P unit 108 also includes the local read program counter (LRPC) 536 register, the write program counter (WPC) 530 register, and the read program counter (RPC) 534 register. The LRPC 536 points to the location in the IBQ 502 of the next instruction to be loaded into the instruction decoders 512 and 514. That is, the LRPC 536 points to the location in the IBQ 502 of the instruction currently being dispatched to the decoders 512, 514. The WPC points to the address in program memory of the start of the next 4 bytes of instruction code for the pipeline. On each fetch into the IBQ, the next 4 bytes from program memory are fetched regardless of instruction boundaries. The RPC 534 points to the address in program memory of the instruction currently being dispatched to the decoders 512/514.
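The interplay of the four counters can be sketched with a small software model. This is an illustrative model only, not the hardware: it assumes the 64-byte queue is used as a circular buffer and that instruction lengths are supplied by the caller (in the processor they come from the decoders). The class and method names are hypothetical.

```python
IBQ_BYTES = 32 * 2   # 32 registers of 16 bits, viewed as 8-bit bytes
FETCH_BYTES = 4      # one 32-bit program bus transfer


class InstructionBufferQueue:
    def __init__(self, program):
        self.program = bytes(program)
        self.buffer = bytearray(IBQ_BYTES)
        self.wpc = 0   # program-memory address of the next fetch
        self.rpc = 0   # program-memory address of the next dispatch
        self.lwpc = 0  # write position inside the queue
        self.lrpc = 0  # read position inside the queue

    def fetch(self):
        """Copy the next 4 program bytes into the queue, ignoring boundaries."""
        chunk = self.program[self.wpc:self.wpc + FETCH_BYTES]
        if (self.wpc - self.rpc) + len(chunk) > IBQ_BYTES:
            raise BufferError("queue full")
        for byte in chunk:
            self.buffer[self.lwpc] = byte
            self.lwpc = (self.lwpc + 1) % IBQ_BYTES
        self.wpc += len(chunk)
        return len(chunk)

    def dispatch(self, length):
        """Hand the next instruction of the given byte length to a decoder."""
        if length > self.wpc - self.rpc:
            raise BufferError("instruction not fully fetched")
        data = bytes(self.buffer[(self.lrpc + i) % IBQ_BYTES]
                     for i in range(length))
        self.lrpc = (self.lrpc + length) % IBQ_BYTES
        self.rpc += length
        return data
```

In this model WPC and RPC track program-memory addresses while LWPC and LRPC track positions inside the queue, which mirrors the division of roles given in the text. Fetches always move 4 bytes, whereas dispatches move whole instructions of 1 to 6 bytes.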

[0021] In this embodiment, the instructions are formed into a 48-bit word and are loaded into the instruction decoders 512, 514 over a 48-bit bus 516 via multiplexers 520 and 521. It will be apparent to a person of ordinary skill in the art that the instructions may be formed into words of a size other than 48 bits, and that the present invention is not limited to the specific embodiment described above.

[0022] For the presently preferred 48-bit word size, bus 516 can load a maximum of two instructions, one per decoder, during any one instruction cycle for parallel execution. The instructions may be in any combination of formats (8, 16, 24, 32, 40, and 48 bits) that fits across the 48-bit bus. If only one instruction can be loaded during a cycle, decoder 1 (512) is loaded in preference to decoder 2 (514). The respective instructions are then forwarded to the respective functional units in order to execute them and to access the data on which the instruction or operation is to be performed. Before being passed to the instruction decoders, the instructions are aligned on byte boundaries. The alignment is based on the format derived for the previous instruction during its decoding. The multiplexing associated with aligning instructions on byte boundaries is performed in multiplexers 520 and 521.
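The size constraint on instruction pairs amounts to simple arithmetic on the format lengths, as the sketch below shows. It checks only whether two formats fit across the 48-bit bus. Any further conditions for parallel issue are outside its scope, and the function name `can_load_pair` is hypothetical.

```python
FORMATS_BITS = (8, 16, 24, 32, 40, 48)  # instruction formats from the text
BUS_WIDTH_BITS = 48                     # width of bus 516


def can_load_pair(first_bits, second_bits):
    """Return True if two instructions of these formats fit on the bus together."""
    for bits in (first_bits, second_bits):
        if bits not in FORMATS_BITS:
            raise ValueError(f"unknown instruction format: {bits} bits")
    return first_bits + second_bits <= BUS_WIDTH_BITS


# All ordered (decoder 1, decoder 2) format pairs that fit in one cycle.
LOADABLE_PAIRS = [(a, b) for a in FORMATS_BITS for b in FORMATS_BITS
                  if can_load_pair(a, b)]
```

A 48-bit instruction never appears in `LOADABLE_PAIRS`, since it fills the bus by itself and is loaded alone into decoder 1.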

[0023] Two instructions can be put in parallel if one of the two instructions has a parallel enable bit. The hardware that supports this type of parallelism is called the parallel enable mechanism. Likewise, two instructions can be put in parallel if both make single data memory accesses (Smem or dbl(lmem)) in indirect mode. The hardware that supports this type of parallelism is called the soft dual mechanism.

[0024] Processor core 102 executes instructions through a seven-stage pipeline, the stages of which are described below with reference to Table 1 and FIG. 5. Processor instructions are executed through the seven-stage pipeline regardless of where execution takes place (A unit or D unit). In accordance with an aspect of the invention, in order to reduce program code size, a C compiler dispatches as many instructions as possible for execution in the A unit, so that the D unit can be switched off to conserve power. This requires the A unit to support the basic operations performed on memory operands.

[0025]

[Table 1]

[0026] The first stage of the pipeline is the PRE-FETCH (P0) stage 202, during which the next program memory location is addressed by asserting an address on the address bus (PAB) 118 of the memory interface unit 104. In the next stage, the FETCH (P1) stage 204, the program memory is read and the I unit 106 is filled from the memory interface unit 104 via the PB bus 122. The PRE-FETCH and FETCH stages are decoupled from the other pipeline stages: the pipeline can be interrupted during these two stages to break the sequential program flow and point to other instructions in program memory, for example for a branch instruction.

[0027] Next, in the third stage, the DECODE (P2) stage 206, the next instruction in the instruction buffer is dispatched to the decoders 512/514, where it is decoded and dispatched to the execution unit that executes it, for example the P unit 108, the A unit 110, or the D unit 112. The decode stage 206 includes decoding at least part of an instruction, including a first part indicating the type of the instruction, a second part indicating the format of the instruction, and a third part indicating the addressing mode for the instruction. The next stage is the ADDRESS (P3) stage 208, in which the address of the data to be used by the instruction is computed, or a new program address is computed if the instruction requires a program branch or jump. These computations take place in the A unit 110 and the P unit 108, respectively.

[0028] In the ACCESS (P4) stage 210, the address of a read operand is generated, and a memory operand whose address has been generated in the DAGEN Y operator with Ymem indirect addressing mode is read from the indirectly addressed Y memory (Ymem). The next stage of the pipeline is the READ (P5) stage 212, in which a memory operand is read whose address has been generated in the DAGEN X operator with Xmem indirect addressing mode or in the DAGEN C operator with coefficient address mode. The address of the memory location to which the result of the instruction is to be written is also generated.

[0029] Finally, there is the EXEC (P6) stage 214, in which the instruction is executed in either the A unit 110 or the D unit 112. The result is then stored in a data register or accumulator, or written to memory for read/modify/write instructions. In addition, shift operations are performed on data in the accumulators during the EXEC stage. The pipeline of processor 100 is protected. This significantly improves C compiler performance, since no NOP instructions need to be inserted to meet latency requirements. It also makes code translation from a prior-generation processor to a later-generation processor much easier.

[0030] The basic rule of pipeline protection used in processor 100 is as follows. If a write access has been initiated before an ongoing read access has completed, and both accesses share the same resource, extra cycles are inserted so that the write can complete and the next instruction can execute with the updated operands. For emulation, however, single-step code execution must behave exactly like free-running code execution.
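The protection rule can be illustrated with a deliberately simplified stall calculation. The model below is an assumption made for illustration, not the interlock logic of processor 100: it supposes that a value written in a given stage becomes readable in the following cycle, that instructions otherwise advance one stage per cycle, and that `distance` is the number of cycles by which the reading instruction trails the writing one. The function name `stall_cycles` is hypothetical.

```python
STAGES = ("PRE-FETCH", "FETCH", "DECODE", "ADDRESS", "ACCESS", "READ", "EXEC")


def stall_cycles(write_stage, read_stage, distance=1):
    """Extra cycles needed so a later read sees an earlier write.

    The writer enters the pipeline at cycle 0 and writes during stage
    write_stage. The reader enters `distance` cycles later and would read
    during stage read_stage. The read must fall in a cycle after the write.
    """
    if distance < 1:
        raise ValueError("the reader must follow the writer")
    write_cycle = STAGES.index(write_stage)
    read_cycle = distance + STAGES.index(read_stage)
    return max(0, write_cycle - read_cycle + 1)
```

Under these assumptions, an instruction that needs a register in the ADDRESS stage immediately after an instruction that writes it in the EXEC stage is held for three cycles, while a reader that uses the value in its own EXEC stage needs no stall. Because the hardware inserts these cycles itself, the programmer or compiler does not have to pad the code with NOP instructions.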

[0031] The basic principle of operation of a pipelined processor is described below with reference to FIG. 5. As FIG. 5 shows, for the first instruction 302 the successive pipeline stages are executed during time periods T₁ through T₇. Each time period is one clock cycle of the processor machine clock. The second instruction 304 can enter the pipeline in period T₂, because the previous instruction has by then moved on to the next pipeline stage. For instruction 3 (306), the PRE-FETCH stage 202 occurs in time period T₃. As FIG. 5 shows, a seven-stage pipeline can process a total of seven instructions simultaneously. FIG. 5 shows all seven instructions 302 to 314 being processed in time period T₇. Such a structure adds a form of parallelism to the processing of instructions.
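
The overlap described above can be illustrated with a minimal sketch. It assumes an ideal pipeline with no stalls in which instruction k enters the first stage in period T_k; the function names `stage_at` and `occupancy` are hypothetical and are not taken from the specification.

```python
# Stage names in pipeline order (P0 to P6), as named elsewhere in this description.
STAGES = ["PRE-FETCH", "FETCH", "DECODE", "ADDRESS", "ACCESS", "READ", "EXECUTE"]


def stage_at(k, t):
    """Stage occupied in period T_t by the k-th instruction (both 1-based).

    Assumption: instruction k enters PRE-FETCH in period T_k and advances
    one stage per clock with no stalls. Returns None outside the pipeline.
    """
    index = t - k
    return STAGES[index] if 0 <= index < len(STAGES) else None


def occupancy(t, n_instructions=7):
    """Map instruction number -> stage for every instruction in flight in T_t."""
    result = {}
    for k in range(1, n_instructions + 1):
        stage = stage_at(k, t)
        if stage is not None:
            result[k] = stage
    return result


if __name__ == "__main__":
    for k, stage in occupancy(7).items():
        print(f"instruction {k}: {stage}")
```

Under these assumptions, all seven stages are occupied in period T₇, which is the situation FIG. 5 is described as showing.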

[0032] As shown in FIG. 6, this embodiment of the invention includes a memory interface unit 104, which is coupled to an external program storage unit 150 via a 24-bit address bus 118 and a 32-bit bidirectional data bus 120. The memory interface unit 104 is also coupled to a data storage unit 151 via a 24-bit address bus 114 and a bidirectional 16-bit data bus 116. The memory interface unit 104 is further coupled to the I unit 106 of the machine processor core 102 via a 32-bit program read bus (PB) 122. The P unit 108, A unit 110 and D unit 112 are coupled to the memory interface unit 104 via data read and data write buses and their corresponding address buses. The P unit 108 is additionally coupled to a program address bus 128.

[0033] More specifically, the P unit 108 is coupled to the memory interface unit 104 via a 24-bit program address bus 128, the two 16-bit data write buses (EB, FB) 130, 132, and the two 16-bit data read buses (CB, DB) 134, 136. The A unit 110 is coupled to the memory interface unit 104 via two 24-bit data write address buses (EAB, FAB) 160, 162, the two 16-bit data write buses (EB, FB) 130, 132, the three data read address buses (BAB, CAB, DAB) 164, 166, 168, and the two 16-bit data read buses (CB, DB) 134, 136. The D unit 112 is coupled to the memory interface unit 104 via the two data write buses (EB, FB) 130, 132 and three data read buses (BB, CB, DB) 144, 134, 136.

[0034] FIG. 6 represents, at 124, the passing of instructions from the I unit 106 to the P unit 108, for example for forwarding branch instructions. FIG. 6 also represents, at 126 and 128, the passing of data from the I unit 106 to the A unit 110 and the D unit 112.

[0035] As shown in FIG. 7, processor 100 is organized around a unified program/data space. The program pointer is internally 24 bits wide and has byte addressing capability, but only a 22-bit address is sent to memory, because program fetches always take place on a 32-bit boundary. During emulation, for example for software development, the full 24-bit address is nevertheless provided in order to implement hardware breakpoints. Data pointers are 16 bits wide, extended by a 7-bit main data page, and have word addressing capability.
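
The address widths above can be made concrete with a short sketch. The 24-bit to 22-bit reduction follows directly from fetching on 32-bit (4-byte) boundaries. The way the 7-bit main data page is combined with the 16-bit pointer (page in the upper bits, giving a 23-bit word address) is an assumption made for illustration, and all function names are hypothetical.

```python
def fetch_address(pc24):
    """22-bit address sent to memory for a 24-bit byte program pointer.

    Fetches are 32-bit aligned, so the two low-order byte-address bits
    are not needed to select the fetched word.
    """
    if not 0 <= pc24 < (1 << 24):
        raise ValueError("program pointer must fit in 24 bits")
    return pc24 >> 2


def byte_offset_in_fetch(pc24):
    """Byte position of the program pointer within its 32-bit fetch word."""
    return pc24 & 0x3


def data_word_address(mdp7, ptr16):
    """Extended data word address.

    Assumption: the 7-bit main data page supplies the upper address bits
    above the 16-bit pointer; the description states only the widths.
    """
    if not 0 <= mdp7 < (1 << 7) or not 0 <= ptr16 < (1 << 16):
        raise ValueError("page must fit in 7 bits and pointer in 16 bits")
    return (mdp7 << 16) | ptr16


if __name__ == "__main__":
    print(hex(fetch_address(0x000007)), byte_offset_in_fetch(0x000007))
    print(hex(data_word_address(0x01, 0x0000)))
```

A hardware breakpoint comparator needs the two low-order bits that `fetch_address` drops, which is consistent with the full 24-bit address being provided during emulation.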

[0036] Software can define up to three main data pages, as follows:
• MDP: direct access; indirect access via CDP
• MDP05: indirect access via AR[0-5]
• MDP67: indirect access via AR[6-7]
A stack is maintained and always resides on main data page 0. The CPU memory-mapped registers are visible from all pages. Various aspects of processor 100 are summarized in Table 2.

【００３７】[0037]

【表２】 [Table 2]

[0038] To debug software or hardware, it is known to connect an emulation host processor to the microprocessor under test so that the host processor can display the contents of various internal registers in response to various software or hardware events, as described in U.S. Pat. No. 5,329,471 (issued to Gary Swoboda), cited above. A software event can be generated by replacing an instruction in a program with a software breakpoint instruction. When the microprocessor under test executes this software breakpoint instruction, the host processor is invoked. Similarly, executing a software breakpoint instruction can invoke debug software on the microprocessor under test.

[0039] A software breakpoint can be implemented by replacing the instruction originally loaded with an instruction that causes the processor to halt execution. The emulation software uses the following steps to set a software breakpoint.

[0040]
1. The debugger specifies the byte address of the instruction to be replaced by the software breakpoint.
2. The emulation software reads a portion of the program starting at that byte address, typically 64 bits.
3. The emulation software determines the size of the original instruction, taking parallelism, soft dual and so on into account.
4. The emulation software selects a software breakpoint instruction of the same size as the original instruction and writes it to program memory.
5. The emulation software then confirms that the software breakpoint is present in memory by reading the memory back and comparing the result with what was written.
If the specified address is in read-only memory (ROM), a software breakpoint cannot be used, and a hardware breakpoint is used instead.

[0041] Several instructions are provided in the instruction set of processor 100 for use by emulation. The ESTOP_0 instruction is used for software breakpoints. When an emulator is connected, the ESTOP_0 instruction halts processor execution with the PC pointing at the ESTOP_0 instruction. This instruction is used to implement the debugger's software breakpoints. When no emulator is connected (the run state machine (RSM) is in the EXE_CONT state), the instruction is effectively a NOP and the PC increments past the ESTOP_0 instruction. The instruction takes effect in the DECODE pipeline stage. There are two formats of the ESTOP_0 instruction: a 32-bit format and an 8-bit format. For a given original instruction to be replaced, either the 8-bit or the 32-bit ESTOP_0 instruction is used as the replacement. An 8-bit NOP instruction or a 16-bit NOP instruction is appended to pad the software breakpoint to the size of the original instruction. For the purpose of software breakpoint replacement, parallel instructions are treated as a single instruction to be replaced. Consequently, the combinations of an SWBP-ESTOP() instruction and a NOP instruction provide a combined instruction length format matching each instruction length format of the instruction set of this embodiment.

[0042] The ESTOP_0 instruction is not defined in the assembly language and is used only by emulation software. Its encodings are "estop_0()" (0x92) and "estop_32()" (0xFD000000). The ESTOP_1 instruction is similar to the ESTOP_0 instruction, except that the PC advances past the ESTOP_1 instruction. This instruction is used to embed breakpoints in an application. Such an embedded breakpoint can be used whether or not an emulator is connected. When an emulator is connected, the embedded breakpoint behaves like a software breakpoint. When no emulator is connected, the embedded breakpoint can raise an emulation trap to a real-time operating system (RTOS) (or monitor program), which can then service the emulation request. If no emulation event has been configured, the instruction is effectively a NOP. The instruction takes effect in the DECODE pipeline stage and can occur in either position of a parallel pair. The assembly mnemonic and encoding of ESTOP_1 are 'estop_1()' (0x2AC1).

[0043] The operation and use of the ESTOP instructions will now be described in more detail with reference to FIGS. 8 to 15. FIG. 8 is a schematic diagram of the instruction pipeline described earlier while it executes a software breakpoint instruction. One feature of the invention relates to the need, during emulation, to halt instruction execution on an assembly language instruction boundary in response to a software breakpoint instruction. In the instruction pipeline of this embodiment of the processor core, once an instruction has entered the address stage (P3), it cannot be aborted for emulation purposes. The instruction must complete, so that no spurious memory accesses are performed.

[0044] Still referring to FIG. 8, at time 800 the status of four instructions from an instruction sequence, A, B, C and ESTOP, is shown in four pipeline stages, P3, P2, P1 and P0 respectively. For clarity, other instructions of the sequence in stages P4 to P6 are not shown. ESTOP is a software breakpoint instruction. At time 802, each instruction has advanced to its next pipeline stage and a new instruction D has entered the prefetch stage. At time 804, each instruction has again advanced to its next pipeline stage and a new instruction E has entered the prefetch stage. At time 804 the ESTOP instruction is being decoded in decode stage P2. At time 806, each instruction has advanced to its next pipeline stage and a new instruction F has entered the prefetch stage.

[0045] After the ESTOP instruction has been decoded, a NULL is jammed into the address stage, as shown at 820. A NULL is similar to a no-operation instruction (NOP), but it does not increment the PC (that is, the CPU state remains unchanged) and does not alter the programmer's model of the CPU state (that is, the status bits are unchanged). When NULL instructions have been jammed into all pipeline stages from address P3 through execute P6, the CPU is considered to be "halted".

[0046] The pipeline of this embodiment operates with the following characteristics:
• No register or memory changes occur before the address stage of the pipeline. Instructions that are in the prefetch, fetch or decode stage can be discarded and re-fetched later without changing the state of the programmer's model.
• Register or memory changes that occur in the address stage or later stages (access, read, execute) do not depend on instructions that have not yet reached the address stage (that is, the contents of the pre-address stages may be discarded by an interrupt or a branch).
• The act of discarding the contents of the pre-address pipeline stages while allowing the address and subsequent pipeline stages to complete is called flushing the pipeline.
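
The flush operation defined above can be sketched on a simple pipeline snapshot. The dictionary representation and the function name `flush` are hypothetical; the split into pre-address and address-or-later stages follows the characteristics listed above.

```python
PRE_ADDRESS = ("PRE-FETCH", "FETCH", "DECODE")
ADDRESS_AND_LATER = ("ADDRESS", "ACCESS", "READ", "EXECUTE")


def flush(pipeline):
    """Flush a pipeline snapshot.

    pipeline maps stage name -> instruction label (or None for an empty stage).
    Returns (new_pipeline, discarded): pre-address stages are emptied and their
    instructions listed for later re-fetch; later stages are left to complete.
    """
    discarded = [pipeline[s] for s in PRE_ADDRESS if pipeline.get(s) is not None]
    new_pipeline = {s: None for s in PRE_ADDRESS}
    for s in ADDRESS_AND_LATER:
        new_pipeline[s] = pipeline.get(s)
    return new_pipeline, discarded


if __name__ == "__main__":
    snapshot = {"PRE-FETCH": "F", "FETCH": "E", "DECODE": "D",
                "ADDRESS": "C", "ACCESS": "B", "READ": "A", "EXECUTE": None}
    after, refetch = flush(snapshot)
    print(after)
    print("to re-fetch:", refetch)
```

Because no register or memory change happens before the address stage, dropping the three pre-address entries loses no architectural state in this model.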

[0047] The PC register (as displayed by the emulation program) generally holds the address of the instruction in the address stage of the pipeline. This is the next instruction to be "executed", assuming that no instruction jam occurs.

[0048] A software breakpoint instruction causes a debug event at the start of the instruction's address stage. If the breakpoint instruction is an ESTOP_0 (as set by a debugger), the PC points to the address of the ESTOP instruction being executed, whereas for an ESTOP_1 (as embedded in code) the PC points to the address following the ESTOP instruction. In either case, the status bits are unchanged from their previous values.

[0049] If no emulator host is connected to processor 100, or if debug software has disabled emulation, the ESTOP instruction is treated as a NOP and the program counter is incremented past the ESTOP instruction.
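
The PC behaviour described in the last two paragraphs can be summarized in a small sketch. The function name and the `(halted, pc)` return shape are hypothetical. The sketch models only the PC value; it does not model the emulation trap that an embedded breakpoint may raise to an RTOS or monitor program when no emulator is connected.

```python
def pc_after_estop(kind, addr, length, emulation_active):
    """Return (halted, pc) after an ESTOP instruction at byte address addr.

    kind             -- "ESTOP_0" (set by the debugger) or "ESTOP_1" (embedded)
    length           -- instruction length in bytes
    emulation_active -- True if an emulator is connected and emulation is enabled
    """
    if kind not in ("ESTOP_0", "ESTOP_1"):
        raise ValueError("unknown breakpoint kind")
    if not emulation_active:
        # Treated as a NOP: the PC simply moves past the instruction.
        return (False, addr + length)
    if kind == "ESTOP_0":
        # Halt with the PC pointing at the breakpoint itself.
        return (True, addr)
    # ESTOP_1: halt with the PC pointing at the following instruction.
    return (True, addr + length)


if __name__ == "__main__":
    print(pc_after_estop("ESTOP_0", 0x100, 1, True))
    print(pc_after_estop("ESTOP_1", 0x100, 2, True))
    print(pc_after_estop("ESTOP_0", 0x100, 4, False))
```

Leaving the PC on an ESTOP_0 suits a debugger-set breakpoint, since the debugger can restore the original instruction at that address and resume from it.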

[0050] FIG. 9 is a flowchart showing the flow of program execution during a subroutine call. Within the subroutine, variables are accessed using stack-pointer-relative addressing, as shown in the following code sequence.

【００５１】[0051]

【表３】 [Table 3]

[0052] In the code sequence above, the term "*SP(offset_var1)" indicates that a data value is fetched from the memory location at offset (offset_var1) relative to the stack pointer SP. The compiler calculates the relative address of each variable during the compilation process. After the variables have been pushed onto the stack, a delayed CALL (DCALL) instruction is executed, and the value of the program counter PC is pushed in response to execution of the "dCall func_a" instruction. The return address is formed by processor 100 based on the instruction lengths of the instructions that follow the DCALL instruction. The instruction pipeline of this embodiment includes delay slots after instructions that change program flow, such as DCALL, JUMP and Branch. Instructions of these types are referred to herein as "discontinuous instructions". The delay slots allow one or more instructions following a discontinuous instruction to be executed while the instruction pipeline is being flushed.

[0053] Still referring to FIG. 9, an example program is shown that includes a sequence of instructions in a first portion 900, one or more instructions in a delay slot 901, and a sequence of instructions in a second portion 902. Instructions A, B and C represent the sequence in 900. A subroutine 905 is located in a different part of instruction memory, and as a result of the discontinuous instruction 910, program flow is transferred to subroutine 905, as shown at 906. Instructions D and E are executed during delay slot 901. After subroutine 905 completes, program flow returns to sequence 902, starting at instruction F, as shown at 907.

[0054] FIGS. 10A to 10C are timelines showing the calculation of the return address during the subroutine call of FIG. 9 in combination with software breakpoint instructions. According to a feature of the invention, the instruction set of processor 100 includes several different instruction length formats. Because processor 100 has variable-length instructions, the return address for a discontinuous instruction is calculated by decoding each instruction in the delay slot and determining their combined length. FIG. 10A shows the case in which instructions D and E are both one-byte instructions, with lengths L1 and L2 of one byte each. In this case the return address is calculated to be n+2, where n is the value of the program counter after discontinuous instruction C. FIG. 10B shows the case in which instruction D has a length L1 of six bytes, so that the return address is calculated to be n+7. Similarly, FIG. 10C shows the case in which instruction E has a length L2 of three bytes, so that the return address is calculated to be n+4. Although the figures show a DCALL instruction as discontinuous instruction C, this description applies to any discontinuous instruction in the instruction set of processor 100.
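
The return address arithmetic of FIGS. 10A to 10C can be reproduced with a minimal sketch. The function name `return_address` is hypothetical; the three cases use the lengths stated above, with the unstated slot in FIGS. 10B and 10C taken to be one byte, as implied by the results n+7 and n+4.

```python
def return_address(n, delay_slot_lengths):
    """Return address for a discontinuous instruction.

    n                  -- program counter value after the discontinuous instruction
    delay_slot_lengths -- byte lengths of the instructions in the delay slot
    """
    if any(length <= 0 for length in delay_slot_lengths):
        raise ValueError("instruction lengths must be positive")
    return n + sum(delay_slot_lengths)


if __name__ == "__main__":
    n = 0x1000
    print(hex(return_address(n, [1, 1])))  # FIG. 10A case: n + 2
    print(hex(return_address(n, [6, 1])))  # FIG. 10B case: n + 7
    print(hex(return_address(n, [1, 3])))  # FIG. 10C case: n + 4
```

The same function shows why the breakpoint must match the length of the instruction it replaces: substituting a one-byte breakpoint for the six-byte instruction D of FIG. 10B would yield n+2 instead of n+7.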

[0055] It is desirable to be able to replace any instruction in a program sequence with a software breakpoint instruction so that software or hardware debugging can be performed. For the return address to be calculated correctly in this embodiment of processor 100, a software breakpoint instruction must exist for each instruction length in the instruction set of processor 100. As described earlier, the processor's instruction set includes 8-bit, 16-bit, 24-bit and 32-bit instruction length formats. In addition, because instruction buffer unit 106 can decode two instructions in parallel and have them executed in a single cycle, the instruction execution pipeline also handles a 40-bit and a 48-bit instruction length format. In other words, the instruction set of processor 100 effectively has six different instruction length formats. It is a feature of the invention to provide a total of six different software breakpoint instructions, one for each instruction length format.

[0056] Another feature of the invention is to reduce the number of software breakpoint instructions required by combining a smaller number of software breakpoint instructions with several no-operation instructions. This advantageously reduces the number of instructions that instruction decoders 512 and 514 must recognize.

[0057] FIG. 11 is a chart showing breakpoint instructions of various lengths formed by combination with no-operation instructions, according to a feature of the invention. Two kinds of software breakpoint instruction are provided: estop 1100, with an 8-bit length 1110, and estop_32, with a 32-bit length 1113. For other reasons, the instruction set of processor 100 provides two variants of the no-operation instruction that have a parallel enable bit: NOP, with an 8-bit length, and NOP_16, with a 16-bit length. As described with reference to FIG. 4, instructions in processor 100 can be combined by setting the parallel enable bit in one instruction of a pair. Therefore, by combining a selected one of the two estop instructions with a selected one of the NOP instructions and setting the parallel enable bit in the NOP instruction, a software breakpoint instruction is provided having an 8-bit length 1110, a 16-bit length 1111, a 24-bit length 1112, a 32-bit length 1113, a 40-bit length 1114 or a 48-bit length 1115.
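
The six lengths listed above follow from pairing one of the two estop formats with no NOP, the 8-bit NOP, or the 16-bit NOP_16. The sketch below enumerates those pairings. It works only with bit lengths; actual encodings and the position of the parallel enable bit are not modelled, and the function name is hypothetical.

```python
from itertools import product

# Bit lengths of the two breakpoint formats and the optional padding NOPs.
ESTOP_BITS = {"estop": 8, "estop_32": 32}
NOP_BITS = {None: 0, "NOP": 8, "NOP_16": 16}


def breakpoint_combinations():
    """Map total length in bits -> list of (estop variant, NOP variant or None)."""
    table = {}
    for (bp, bp_bits), (nop, nop_bits) in product(ESTOP_BITS.items(), NOP_BITS.items()):
        table.setdefault(bp_bits + nop_bits, []).append((bp, nop))
    return dict(sorted(table.items()))


if __name__ == "__main__":
    for bits, combos in breakpoint_combinations().items():
        print(bits, combos)
```

Each of the six lengths is produced by exactly one pairing, so two breakpoint formats and two NOP formats are enough to match every instruction length format of this embodiment.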

[0058] As described above, before substituting a software breakpoint instruction, the emulation software determines the instruction format of the instruction to be replaced, taking parallelism and soft dual into account.

[0059] FIG. 12 is a more detailed block diagram of the various registers used in the instruction buffer unit to calculate the return address. Target register 1200 holds the target address of a discontinuous instruction. Write program counter 530, local write program counter 532, read program counter 534 and local read program counter 536 are as described with reference to FIG. 4. Temporary read program counter 1210 and temporary write program counter 1220 hold their values while those values are being written to stack 1230.

[0060] FIG. 13 is a timing diagram showing the operation of the instruction pipeline during a subroutine call, as described above with reference to FIGS. 9 and 10A to 10C. At time 1300, the discontinuous instruction is decoded, as shown at 1310. At time 1306, the contents of LCRPC register 536 are transferred to T1RPC register 1210, as shown at 1312. The contents are then written to stack 1230, as shown at 1314. The previous return address is thus saved on the stack. During time 1308, the return address for DCALL 1310 is saved in LCRPC, as shown at 1313. LCRPC is in effect the top of stack (TOS); the stack architecture of this embodiment is pipelined, with the TOS held in a register.

[0061] Referring to FIG. 13, "@SR(p-1)" denotes the previous return address, as opposed to the current return address "@SR(p)". For example, a DCALL instruction saves SR(p), and the return address of the DCALL becomes SR(p). Register T1RPC holds SR(p-1), while register LCRPC holds SR(p). The values of these registers spill onto the stack, so that multiple nested DCALLs, interrupts and the like are supported.
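
The register and stack handling described for FIG. 13 can be modelled roughly as follows. This is a sketch under assumptions: the class `ReturnStack` and its method names are hypothetical, the call sequence (copy LCRPC to T1RPC, write it to the stack, then load the new return address into LCRPC) follows the description above, and the return sequence is an assumed mirror image that the description does not spell out.

```python
class ReturnStack:
    """Toy model of a return stack whose top entry lives in a register."""

    def __init__(self):
        self.lcrpc = None   # current return address SR(p), acting as top of stack
        self.t1rpc = None   # temporary register holding SR(p-1) during a call
        self.stack = []     # memory stack receiving spilled return addresses

    def call(self, return_address):
        """Model a DCALL: spill the previous return address, keep the new one."""
        self.t1rpc = self.lcrpc
        if self.t1rpc is not None:
            self.stack.append(self.t1rpc)
        self.lcrpc = return_address

    def ret(self):
        """Assumed return behaviour: use LCRPC, then refill it from the stack."""
        target = self.lcrpc
        self.lcrpc = self.stack.pop() if self.stack else None
        return target


if __name__ == "__main__":
    rs = ReturnStack()
    rs.call(0x1002)      # outer call
    rs.call(0x2007)      # nested call
    print(hex(rs.ret()), hex(rs.ret()))
```

In this model a nested call costs one stack write for the outer return address, while the innermost return address is always available directly from the register.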

[0062] FIG. 14 is a timing diagram showing the operation of the instruction pipeline while executing a software breakpoint instruction SWBP placed in the first delay slot after the DCALL instruction shown at 1410. Advantageously, because the software breakpoint instruction has the same length as the instruction it replaced, the same return address is calculated and stored in LCRPC, and the previous return address is stored on stack 1230, as shown at 1420.

[0063] FIG. 15 is a timing diagram showing the operation of the instruction pipeline while executing a software breakpoint instruction SWBP placed in the second delay slot after the DCALL instruction shown at 1510. As in FIG. 14, because the software breakpoint instruction has the same length as the instruction it replaced, the same return address is calculated and stored in LCRPC, and the previous return address is stored on stack 1230, as shown at 1520.

[0064] FIG. 16 is a schematic representation of an integrated circuit incorporating processor 100. As shown, the integrated circuit includes a plurality of contacts for surface mounting. However, the integrated circuit could take other forms. For example, it could have a plurality of pins on its lower surface for mounting in a zero insertion force socket, or any other suitable configuration.

[0065] FIG. 17 shows an example of such an integrated circuit implemented in a mobile communication device, such as a mobile telephone with an integrated keyboard 12 and display 14. As shown in FIG. 17, the digital device 10, which includes processor 100, is connected to the keyboard 12, where appropriate via a keyboard adapter (not shown), to the display 14, where appropriate via a display adapter (not shown), and to radio frequency (RF) circuit 16. The RF circuit 16 is connected to an antenna 18.

[0066] Fabrication of data processing device 100 involves multiple steps of implanting various amounts of impurities into a semiconductor substrate and diffusing the impurities to selected depths within the substrate to form transistor devices. Masks are formed to control the placement of the impurities. Multiple layers of conductive and insulating material are deposited and etched to interconnect the various devices. These steps are performed in a clean room environment.

[0067] A significant portion of the cost of producing a data processing device is related to testing. While in wafer form, individual devices are biased to an operational state and tested for basic operational functionality. The wafer is then separated into individual dice, which may be sold as bare dice or packaged. After packaging, the finished parts are biased to an operational state and tested for operational functionality.

[0068] Another embodiment of the invention includes other circuitry combined with the circuitry disclosed herein in order to reduce the total gate count of the combined functions. Because techniques for gate minimization are known to those skilled in the art, such embodiments are not described here.

[0069] Other embodiments of the software breakpoint instruction may have three or more instruction length formats. Likewise, different instruction encodings may be used.

[0070] Other embodiments of processor 100 may have a larger number of instruction pipeline stages, and the software breakpoint instruction may take effect in a different pipeline stage. Furthermore, the number of instruction length formats in other embodiments may be greater or smaller than the three format lengths of processor 100.

【００７１】Thus, a processor has been described that is a programmable digital signal processor (DSP) offering both high code density and easy programming. The architecture and instruction set are optimized for low power consumption and for efficient execution of DSP algorithms, such as those for wireless telephones, as well as algorithms for pure control tasks. The processor includes an instruction buffer unit and a data calculation unit for executing instructions decoded by the instruction buffer unit. A software breakpoint instruction is provided for debugging purposes. To correctly emulate the operation of the instruction pipeline when a software breakpoint instruction is executed in a delay slot, the width of the software breakpoint must be the same as that of the instruction it replaces. A limited number of breakpoint instruction length formats are combined with non-operation instructions to form numerous combined instructions that match any instruction length format. Advantageously, only a limited number of software breakpoint instructions need to be defined and decoded, while a large number of instruction length formats are still provided. For example, in this embodiment, only two software breakpoint instructions are needed to replace instructions having six different instruction length formats.
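The length-matching idea in the paragraph above can be sketched as follows. This is only an illustration under stated assumptions, not the encoding used by the processor: it assumes instruction lengths of 1 to 6 bytes, two breakpoint formats of 1 and 2 bytes, and non-operation formats of 1 and 2 bytes. The labels SWBP_8 and SWBP_16 are hypothetical, the byte sizes given to NOP and NOP_16 are assumed, and the rule of padding with several NOPs for longer instructions is also an assumption.

```python
# Hypothetical byte lengths; the sizes are assumptions made for illustration.
SWBP_FORMATS = {"SWBP_8": 1, "SWBP_16": 2}   # assumed breakpoint formats
NOP_FORMATS = {"NOP": 1, "NOP_16": 2}        # assumed no-operation formats


def build_breakpoint(target_len):
    """Return (name, length) parts whose lengths sum to target_len bytes."""
    if target_len < 1:
        raise ValueError("instruction length must be at least one byte")
    # Use the widest breakpoint format that fits in the replaced instruction.
    swbp = "SWBP_16" if target_len >= 2 else "SWBP_8"
    parts = [(swbp, SWBP_FORMATS[swbp])]
    remaining = target_len - SWBP_FORMATS[swbp]
    # Pad with NOPs, widest first, until the total width matches.
    while remaining > 0:
        nop = "NOP_16" if remaining >= 2 else "NOP"
        parts.append((nop, NOP_FORMATS[nop]))
        remaining -= NOP_FORMATS[nop]
    return parts


for length in range(1, 7):
    combo = build_breakpoint(length)
    # Every combination occupies exactly the bytes of the replaced instruction.
    assert sum(size for _, size in combo) == length
    print(length, " || ".join(name for name, _ in combo))
```

With these assumed sizes, two breakpoint encodings are enough to cover all six lengths, which mirrors the ratio stated in the text.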

【００７２】Advantageously, aspects of the present invention allow a software breakpoint to be inserted in the delay slot of a DCALL instruction. Advantageously, the delay slot can be of any length, and the debug system can adjust the length of the SWBP instruction to match the length of the delay slot. Advantageously, by supporting SWBP at any instruction, for example in a delay slot, a processor according to the present invention can be interrupted at any instruction. This improves the observability of the CPU state machine during debug operation.

【００７３】As used herein, the terms "applied", "connected", and "connection" mean electrically connected, and additional elements may be present in the electrical connection path.

【００７４】While the present invention has been described with reference to illustrative embodiments, this description is not to be construed in a limiting sense. Various other embodiments of the invention will be apparent to persons skilled in the art upon reading this description. The appended claims are therefore intended to cover any such modifications of the embodiments as fall within the spirit and scope of the invention.

【００７５】With respect to the above description, the following items are further disclosed.
(1) A digital system comprising a microprocessor, wherein the microprocessor comprises: an instruction buffer unit operable to decode an instruction fetched from an instruction memory, the instruction having a first length selected from a first plurality of instruction fetch lengths; a data calculation unit for executing the instruction decoded by the instruction buffer unit; and a program counter operable to generate an instruction address provided to the instruction memory; wherein the instruction buffer unit is operable to decode a first software breakpoint instruction selected to have a length equal to any of the first plurality of instruction length formats.

【００７６】(2) The digital system of item 1, wherein the instruction buffer is operable to decode, in a single cycle, a second software breakpoint instruction combined with a first non-operation instruction, such that the combined second software breakpoint instruction and first non-operation instruction are treated by the data calculation unit as one said first software breakpoint instruction.

【００７７】(3) The digital system of item 2, wherein the second software breakpoint instruction has a second instruction length format selected from a second plurality of instruction length formats, the second plurality of instruction length formats being smaller than the first plurality of instruction length formats.

【００７８】(4) The digital system according to any of the preceding claims, being a cellular telephone, further comprising: an integrated keyboard connected to the processor via a keyboard adapter; a display connected to the processor via a display adapter; radio frequency (RF) circuitry connected to the processor; and an antenna connected to the RF circuitry.

【００７９】(5) A method for operating a digital system, comprising the steps of: executing a sequence of instructions in an instruction pipeline of a processor core, wherein instructions are fetched from an instruction memory associated with the processor core in response to a program counter, and the sequence of instructions is selected from an instruction set having a first plurality of instruction length formats; replacing a first instruction having a first instruction length format in the sequence of instructions with a first software breakpoint instruction having the first instruction length format, selected from a plurality of software breakpoint instructions having a second plurality of instruction length formats; breaking the execution sequence by executing the first software breakpoint instruction after executing a first portion of the sequence of instructions; and resuming execution of the sequence of instructions by replacing the first software breakpoint instruction with the first instruction in the instruction sequence.
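The replace, break, and restore steps of item (5) can be illustrated with a toy model. This is a sketch under assumptions, not the debug system described in the specification: program memory is modeled as a `bytearray`, the class name `Debugger` and its methods are hypothetical, and the byte values used as opcodes are invented for the example.

```python
class Debugger:
    """Toy model: program memory is a bytearray, breakpoints are byte patterns."""

    def __init__(self, memory):
        self.memory = memory
        self.saved = {}  # address -> original instruction bytes

    def set_breakpoint(self, addr, length, swbp_bytes):
        # The breakpoint must occupy exactly the bytes of the replaced instruction.
        if len(swbp_bytes) != length:
            raise ValueError("breakpoint width must equal the replaced instruction width")
        self.saved[addr] = bytes(self.memory[addr:addr + length])
        self.memory[addr:addr + length] = swbp_bytes

    def clear_breakpoint(self, addr):
        # Put the original instruction back so that execution can resume.
        original = self.saved.pop(addr)
        self.memory[addr:addr + len(original)] = original


# Invented program image: a 1-byte, a 3-byte, and a 1-byte instruction.
program = bytearray(b"\x10\x20\x21\x22\x30")
dbg = Debugger(program)
dbg.set_breakpoint(1, 3, b"\xEE\xEE\x00")   # hypothetical 2-byte SWBP plus 1-byte NOP
patched = bytes(program)
dbg.clear_breakpoint(1)
restored = bytes(program)
```

Because the patch has the same width as the instruction it replaces, the addresses of all following instructions are unchanged while the breakpoint is in place.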

【００８０】(6) The method of item 5, wherein the first software breakpoint instruction is formed by: selecting a second software breakpoint instruction having a second instruction length format from the second plurality of instruction length formats; selecting a first non-operation instruction having a third length from a third plurality of instruction length formats; and combining the second software breakpoint instruction and the first non-operation instruction such that the combined length of the second instruction length and the third instruction length equals the first instruction length.

【００８１】(7) The method of item 5, wherein the combining step includes indicating that the second software breakpoint instruction and the first non-operation instruction are to be executed in parallel in the instruction pipeline.

【００８２】(8) The method of item 5, wherein the second plurality of instruction length formats is fewer than the first plurality of instruction length formats.

【００８３】(9) The method of item 5, wherein the first software breakpoint instruction is executed in a delay slot resulting from execution of a second instruction in the first portion of the sequence of instructions, the second instruction being a discontinuity-type instruction that causes a branch to a second sequence of instructions, and wherein the step of executing the second instruction includes storing a return address having the same value as if the first instruction were present in the first sequence of instructions.
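Why a width-matched breakpoint leaves the return address unchanged can be shown with a small calculation. The sketch assumes, for illustration only, that a delayed call stores as its return address the address just past its delay-slot instructions, that is, call address + call length + L1 + L2, where L1 and L2 are the delay-slot instruction lengths named in the description of FIG. 10. The function names, addresses, and lengths below are hypothetical.

```python
def return_address(call_addr, call_len, slot_lens):
    """Address just past the delay-slot instructions of a delayed call (assumed rule)."""
    return call_addr + call_len + sum(slot_lens)


def with_breakpoint(slot_lens, index, swbp_len):
    """Copy of slot_lens with one delay-slot instruction replaced by a breakpoint."""
    patched = list(slot_lens)
    patched[index] = swbp_len
    return patched


slots = [6, 3]  # L1 = 6 bytes, L2 = 3 bytes (made-up values)
original = return_address(0x1000, 3, slots)
# Breakpoint padded to the full 6-byte width of the replaced instruction.
matched = return_address(0x1000, 3, with_breakpoint(slots, 0, 6))
# A bare 1-byte breakpoint would shift the stored return address.
mismatched = return_address(0x1000, 3, with_breakpoint(slots, 0, 1))
```

Under this assumed rule, only the width-matched breakpoint reproduces the return address that the unmodified instruction sequence would store.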

【００８４】(10) The method of item 5, wherein the first instruction is a combination of at least two instructions selected from the instruction set to be executed in parallel in the instruction pipe.

【００８５】(11) The method of item 7, wherein: the second plurality of instruction length formats is fewer than the first plurality of instruction length formats; the first instruction is a combination of at least two instructions selected from the instruction set to be executed in parallel in the instruction pipe; and the first software breakpoint instruction is executed in a delay slot resulting from execution of a second instruction in the first portion of the sequence of instructions, the second instruction being a discontinuity-type instruction that causes a branch to a second sequence of instructions, the step of executing the second instruction including storing a return address having the same value as if the first instruction were present in the first sequence of instructions.

【００８６】(12) A processor (100) is provided that is a programmable digital signal processor (DSP) offering both high code density and easy programmability. The architecture and instruction set are optimized for low power consumption and for efficient execution of DSP algorithms, such as those for wireless telephones, as well as algorithms for pure control tasks. A software breakpoint instruction for debugging is also provided. To correctly emulate the operation of the instruction pipeline when a software breakpoint instruction is executed in a delay slot, the width of the software breakpoint (1110-1115) is the same as that of the replaced instruction. Non-operation instructions (NOP, NOP_16) are combined with a limited number of breakpoint instruction length formats (1100, 1102) to form numerous combined instructions that match the instruction length formats.

【００８７】This application claims priority based on European Patent Application No. 99400558.5 (TI-27761EU), filed in Europe on March 8, 1999, and European Patent Application No. 98402455.4 (TI-28433EU), filed in Europe on October 6, 1999.

[Brief description of the drawings]

FIG. 1 is a schematic block diagram of a processor according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of the core of the processor of FIG. 1.

FIG. 3 is a more detailed schematic block diagram of various execution units of the processor core.

FIG. 4 is a schematic diagram of an instruction buffer queue and an instruction decoder of the processor.

FIG. 5 is a schematic diagram of the processor core for explaining the operation of the processor pipeline.

FIG. 6 is a block diagram of the processor showing the memories interconnected by the memory management unit.

FIG. 7 shows the unified structure of the program memory space and data memory space of the processor.

FIG. 8 is a schematic diagram of the instruction pipeline while executing a software breakpoint memory.

FIG. 9 is a flowchart showing the program execution flow during a subroutine call.

FIG. 10 is a timing diagram showing the calculation of the return address during the subroutine call of FIG. 9 in combination with a software breakpoint instruction, in which FIG. 10A shows the case of one-byte instructions, where instructions D and E have lengths L1 and L2 of one byte each, FIG. 10B shows the case where instruction D has a length L1 of 6 bytes, and FIG. 10C shows the case where instruction E has a length L2 of 3 bytes.

FIG. 11 is a chart showing breakpoint instructions of various lengths formed by combination with non-operation instructions in accordance with aspects of the present invention.

FIG. 12 is a more detailed block diagram of various registers used in the instruction buffer unit of FIG. 4.

FIG. 13 is a timing diagram showing the operation of the instruction pipeline during a subroutine call.

FIG. 14 is a timing diagram showing the operation of the instruction pipeline during execution of a software breakpoint instruction placed in the first delay slot after a DCALL instruction.

FIG. 15 is a timing diagram showing the operation of the instruction pipeline during execution of a software breakpoint instruction placed in the second delay slot after a DCALL instruction.

FIG. 16 is a schematic diagram of an integrated circuit incorporating the processor.

FIG. 17 is a schematic diagram of a communication device incorporating the processor of FIG. 1.

[Explanation of symbols]

20: processor backplane
22: ASIC backplane
24: memory
26: peripheral device
28: external interface
100: processor
102: processor core
104: memory management unit

Continued from the front page
(72) Inventor: Mark Buser, 329 Doui Avenue, Pittsburgh, Pennsylvania, United States
(72) Inventor: Eric Ponsot, Redon de Lacoste, 347 Route de Canigne, Vance, France

Claims


1. A digital system comprising a microprocessor, wherein the microprocessor comprises: an instruction buffer unit operable to decode an instruction fetched from an instruction memory, the instruction having a first length selected from a first plurality of instruction fetch lengths; a data calculation unit for executing the instruction decoded by the instruction buffer unit; and a program counter operable to generate an instruction address provided to the instruction memory; wherein the instruction buffer unit is operable to decode a first software breakpoint instruction selected to have a length equal to any of the first plurality of instruction length formats.

2. A method for operating a digital system, comprising the steps of: executing a sequence of instructions in an instruction pipeline of a processor core, wherein instructions are fetched from an instruction memory associated with the processor core in response to a program counter, and the sequence of instructions is selected from an instruction set having a first plurality of instruction length formats; replacing a first instruction having a first instruction length format in the sequence of instructions with a first software breakpoint instruction having the first instruction length format, selected from a plurality of software breakpoint instructions having a second plurality of instruction length formats; breaking the execution sequence by executing the first software breakpoint instruction after executing a first portion of the sequence of instructions; and resuming execution of the sequence of instructions by replacing the first software breakpoint instruction with the first instruction in the instruction sequence.