JPH0475143A

JPH0475143A - Processor with task switching function

Info

Publication number: JPH0475143A
Application number: JP18888390A
Authority: JP
Inventors: Hiroshi Kadota; 廉田　浩
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1990-07-17
Filing date: 1990-07-17
Publication date: 1992-03-10
Anticipated expiration: 2010-11-29
Also published as: JPH07111683B2

Abstract

PURPOSE: To eliminate dependencies between consecutive instructions and prevent register hazards in a pipelined processor. When mutually dependent instructions that cannot be executed back to back are detected, the task is switched so that an instruction from another task, independent of the instruction whose execution has just started, is issued next. CONSTITUTION: The processor comprises plural instruction buffer means 1, an instruction decoder means 2, a task switching control means 3, an address conversion means 4, a register block 5, and plural arithmetic processing blocks 6. The instruction decoder means 2 includes a circuit that determines whether the instruction currently being decoded can be followed immediately by the next instruction of the same task, and the task switching control means 3 is set so that a task switch is always performed when the next instruction of the same task cannot be executed. This reduces pipeline disturbances as far as possible, which speeds up pipeline processing, and makes it easy to synchronize tasks when plural tasks are executed.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a processor with a task switching function, and more particularly to a high-performance processor that processes a plurality of tasks one after another in a pipelined manner.

(Prior Art) Pipeline configurations have long been widely used as a means of speeding up processors such as CPUs in computer systems. In general, a pipeline configuration divides one large process into multiple processing elements and starts a new instruction at intervals equal to the time each processing element requires, that is, the pipeline pitch. This greatly increases throughput.

FIGS. 5(a) to 5(c) each show a processing method: (a) is conventional sequential processing, (b) is ideal pipeline processing, and (c) is actual pipeline processing. In the figure, A, B, C, and D are instructions, Ts is the processing time for sequential processing, Tp is the processing time for ideal pipeline processing, Tpo is the processing time for actual pipeline processing, and Δ is the pipeline pitch.

As is clear from FIG. 5, when consecutive instructions A, B, C, and D are executed, the processing time Tp of ideal pipeline processing is close to 1/N of the processing time Ts of the most primitive sequential processing, where N is the number of pipeline stages.
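The relation between Ts and Tp can be checked with a short calculation. The sketch below is illustrative only: it assumes every instruction passes through N stages of equal duration Δ and that no dependency stalls the pipeline. The function names are hypothetical and do not come from the patent.

```python
def sequential_time(num_instructions: int, num_stages: int, pitch: float) -> float:
    """Ts: each instruction occupies all N stages before the next one starts."""
    return num_instructions * num_stages * pitch


def ideal_pipeline_time(num_instructions: int, num_stages: int, pitch: float) -> float:
    """Tp: a new instruction enters every pitch; the last one still needs N pitches."""
    return (num_stages + num_instructions - 1) * pitch


if __name__ == "__main__":
    n_stages, pitch = 4, 1.0
    for m in (4, 100, 10_000):
        ts = sequential_time(m, n_stages, pitch)
        tp = ideal_pipeline_time(m, n_stages, pitch)
        print(f"M={m}: Ts={ts}, Tp={tp}, Tp/Ts={tp / ts:.3f}")
```

Under these assumptions the ratio Tp/Ts approaches 1/N as the number of instructions grows; for only four instructions the start-up cost of filling the pipeline is still visible.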

(Problem to Be Solved by the Invention) For the ideal pipeline processing described above to be achieved, however, instructions A, B, C, and D must each be independent. In practice, situations frequently arise in which, for example, instructions B and D can start only once the execution results of instructions A and C are available, and the pipeline is disturbed. FIG. 5(c) shows a time chart of instruction execution in such a case. The total processing time Tpo is larger than the ideal processing time Tp and is in fact closer to the sequential processing time Ts.

Meanwhile, multiprocessor systems that assign a processor to each task have begun to be developed in recent years in order to process multiple tasks at high speed (a task being a unit of processing carried out by one instruction sequence). In these systems, the instructions executing simultaneously on the individual processors are normally independent of one another. Sometimes, however, instructions that depend on each other across tasks must be executed, or tasks must be synchronized, and conventional systems could not easily do either.

In view of the above, an object of the present invention is to provide a processor configuration that minimizes pipeline disturbances in pipeline processing within a processor system, thereby speeding up pipeline processing, and that makes it easy to synchronize tasks when a plurality of tasks are executed.

(Means for Solving the Problem) To achieve the above object, the invention of claim (1) comprises: a plurality of instruction buffer means, each storing a portion of one of a plurality of tasks, a task being an instruction sequence; program counter means that stores, for each task, the address of the instruction to be executed next, is updated when that instruction is actually executed, and then holds the following instruction address; instruction decoding means that selectively takes in the outputs of the instruction buffer means, decodes them in sequence, and outputs, based on the decoding result, a check signal indicating whether the instruction to be executed next can be executed; task switching control means that outputs a task selection signal, which identifies the task to be selected next, based on the check signal from the instruction decoding means and on internally set switching algorithm information for the tasks; address conversion means that generates an actual register number by combining the task selection signal from the task switching control means with a provisional register number output from the instruction decoding means; a register block that, on receiving the actual register number, accesses the storage location at that register number and can write data from an internal bus or read data out to the internal bus; and one or more arithmetic processing blocks connected to the internal bus that perform arithmetic processing on data. The instruction decoding means has a determination circuit that determines whether the instruction currently being decoded can be followed immediately by the next instruction of the same task, and the task switching control means has a control circuit set so that a task switching operation is always performed when the check signal from the instruction decoding means indicates that the next instruction of the same task cannot be executed.

The invention of claim (2), in order to realize with one processor the same functions as a multiprocessor system, adds to the configuration of claim (1): a plurality of data input/output ports through which a plurality of data streams can be input and output in parallel; data switch means that connects each data input/output port to the register block according to the task selection signal from the task switching control means; and a plurality of instruction input ports, each connected to one of the plurality of instruction buffer means.

The invention of claim (3), in order to synchronize a plurality of tasks and to specify the execution order of arbitrary instructions across tasks, adds the following to the configuration of claim (1) or (2). The instruction decoding means comprises: a strength counter that stores or updates the strength of instructions; strength memory determination means consisting of a memory circuit that extracts and stores strength information present in each instruction sequence, either as part of an instruction code or as an independent non-executed control code, a comparison circuit that compares the strength in the strength counter with the strength information in the memory circuit, and an output circuit that outputs a lock flag signal when the comparison shows that the strength counter holds the higher strength; and lock instruction buffer memories, equal in number to the tasks, which on receiving the lock flag signal lock execution in the arithmetic processing block of the instruction that carries the strength information and temporarily store that instruction. A program counter is provided for each task and stops updating to the next instruction address on receiving the lock flag signal from the output circuit.

The invention of claim (4) provides simple control for the situation in which various data processing instructions are requested by a plurality of tasks and the result data are stored in registers at intricately interleaved timings. To the configurations of claims (1) to (3) it adds a register section with multiple input/output ports in which at least writing and reading can be performed independently. The arithmetic processing blocks have multi-stage pipeline configurations, and for each pipeline configuration there are provided: a shift register with a bit width sufficient to specify a register number and the type of arithmetic processing block; an input buffer memory with the same bit width as the data input section of each shift register stage; a status flag section indicating the state of each shift register stage, namely empty, in use, or waiting; and operation control means that determines the type of operation of each shift register stage according to the value of the status flag. Input position control means is also provided, which enters the destination register number output from the instruction decoding means, used to store each arithmetic processing result, at the stage position of the shift register appropriate to the number of pipeline stages the arithmetic processing requires.

(Operation) With the configuration of the invention of claim (1), when mutually dependent instructions that cannot be executed consecutively in the pipeline are detected, the task is switched and an instruction independent of the instruction whose execution has just started is fetched from another task.

With the configuration of the invention of claim (2), which has the plurality of data input/output ports and instruction input ports described above, simultaneous input and output of data and of instructions becomes possible, and processing equivalent to that of a multiprocessor system is performed.

With the configuration of the invention of claim (3), consisting of the strength counter, lock instruction buffer memories, and strength memory determination means described above, the strength encoded in each task's instruction sequence is detected and compared with the current strength of the processor. Execution continues if the instruction's strength is higher and stops if it is lower, which achieves synchronization between tasks.

Further, with the configuration of the invention of claim (4), which has the shift register section with input buffer memory described above, the timing of the write to the register named by the destination operand of an instruction can be adjusted by appropriately selecting the input position into the shift register.

(Embodiment) FIG. 1 shows a first embodiment of the present invention. In the figure, (1) denotes a plurality of instruction buffer means, in which the instruction queues of a plurality of tasks are stored.

In FIG. 1, (2) is the instruction decoding means, which selectively takes in and decodes one instruction from the instruction buffer means (1) and sends control signals for executing the instruction to each block inside the processor. For any pair of instructions that would disturb pipeline operation if executed consecutively in the same task, a flag is set in advance in part of the instruction code of the first instruction. This flag addresses cases such as the consecutive instructions
C ← A + B ... ①
E ← C - D ... ②
where instruction ② cannot start until the value of C, the result of the addition in ①, has been determined (this is usually called a register hazard). With a little ingenuity in the assembler or compiler, it is easy to detect that the sequence ① → ② causes a register hazard, and easy to set a flag in the instruction code of ①.
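The assembler-side check mentioned here can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a hazard exists only between directly adjacent instructions, and the `Instr` record and `mark_register_hazards` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instr:
    dest: str                  # destination register name
    srcs: Tuple[str, ...]      # source register names
    hazard_flag: bool = False  # set when the NEXT instruction reads this result


def mark_register_hazards(program: List[Instr]) -> List[Instr]:
    """Flag each instruction whose destination is a source of its successor."""
    for first, second in zip(program, program[1:]):
        first.hazard_flag = first.dest in second.srcs
    return program


if __name__ == "__main__":
    prog = [
        Instr("C", ("A", "B")),  # C <- A + B
        Instr("E", ("C", "D")),  # E <- C - D, reads C from the line above
        Instr("F", ("A", "D")),  # independent of the previous instruction
    ]
    for ins in mark_register_hazards(prog):
        print(ins)
```

Only the first instruction is flagged in this example, which is the signal the decoder would turn into a check signal. A deeper pipeline would need to look further ahead than one instruction.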

The instruction decoding means (2) detects this flag and generates the check signal concerning consecutive instruction execution. When an internal register is used as an operand of the instruction, it also outputs the register number at the same time. This register number is unique within a task, but the same number is very likely to be used in other tasks as well. The register number is therefore provisional; the actual register number is determined by the address conversion means (4), described later, by combining it with the task number.

In FIG. 1, (3) is the task switching control means, which switches tasks by combining the check signal from the instruction decoding means (2) with a switching algorithm held internally. One concrete way of assigning a priority to each task and switching accordingly is the following.

That is, priorities 4, 3, 2, and 1 (a larger number means a higher priority) are assigned to tasks a, b, c, and d, and execution starts with task a.

The task switching control means (3) switches to task b immediately when the instruction decoding means (2) generates a check signal. Even when no check signal is generated, it also switches to task b once 4 instructions of task a have been executed consecutively.

Execution of task b likewise switches to task c when a check signal is generated or when 3 instructions of task b have been executed consecutively. In the same way, control then passes from task c to task d, and from task d back to task a.
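The switching rule for tasks a to d can be simulated in a few lines. The sketch below is a behavioral model under stated assumptions: each task's instruction stream is a list of booleans (True meaning the instruction carries the hazard flag), a finished task is simply skipped, and the `schedule` function and its parameters are hypothetical names, not part of the patent.

```python
from typing import Dict, List, Tuple


def schedule(streams: Dict[str, List[bool]],
             quanta: Dict[str, int],
             order: List[str]) -> List[Tuple[str, int]]:
    """Return the issue order as (task, instruction index) pairs."""
    pc = {task: 0 for task in order}  # one program counter per task
    issued: List[Tuple[str, int]] = []
    current = 0
    while any(pc[t] < len(streams[t]) for t in order):
        task = order[current]
        run = 0
        # Issue until the quantum (the task's priority value) is used up.
        while pc[task] < len(streams[task]) and run < quanta[task]:
            idx = pc[task]
            issued.append((task, idx))
            pc[task] += 1
            run += 1
            if streams[task][idx]:  # check signal: switch immediately
                break
        current = (current + 1) % len(order)
    return issued


if __name__ == "__main__":
    streams = {
        "a": [False, True, False, False, False],
        "b": [False, False, False, False],
        "c": [False],
        "d": [False, False],
    }
    quanta = {"a": 4, "b": 3, "c": 2, "d": 1}
    print(schedule(streams, quanta, ["a", "b", "c", "d"]))
```

In this run task a gives up the processor after its second instruction because that instruction is flagged, while task b runs its full quantum of three instructions.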

FIG. 6 shows the hardware configuration of the task switching control means (3) that realizes the operation described above. In the figure, (31-1) to (31-5) are counters, (32) is a decoder, (33) is an OR circuit, (34) is an AND circuit, (35) is the system basic clock, (36) is the check signal, (37) is the task selection signal, and (38) is the task switching pulse.

As shown in FIG. 6, counters (31-1) to (31-4) are provided, equal in number to the maximum number of tasks the processor handles, and each is initialized to the priority of its corresponding task. The system basic clock (35), which corresponds to the pipeline pitch, is fed to the clock input of each counter (31-1) to (31-4), and the counters are kept counting at all times. However, because the task switching pulse (38) is fed to the S terminals of counters (31-1) to (31-4), all of these counters are reset each time a task switching pulse (38) occurs, and counting restarts from 0.

Meanwhile, the task currently being executed is indicated by the outputs Q1 and Q2 of a separate counter (31-5), which counts task switching signals so that the tasks are switched one after another. Because the outputs Q1 and Q2 of the counter (31-5) are normally binary coded, the decoder (32) decodes them into individual signals for tasks a, b, c, and d and outputs them.

The task switching pulse (38) is generated by ORing the check signal (36) sent from the instruction decoding means (2) with the ANDs of each count end signal E of counters (31-1) to (31-4) and the corresponding task selection signal (37). In other words, when the count end signal E of the currently executing task or the check signal (36) arrives, execution switches to the next task, and the counters (31-1) to (31-4) start counting again from 0 up to the value corresponding to each task's priority.
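The combinational logic that produces the task switching pulse can be written as a single boolean expression. The following is a sketch only; it assumes the task selection signal is one-hot (one line per task) and uses a hypothetical function name.

```python
from typing import Sequence


def task_switch_pulse(count_end: Sequence[bool],
                      task_select: Sequence[bool],
                      check: bool) -> bool:
    """OR of the check signal with (E_i AND select_i) over all task counters."""
    return check or any(e and s for e, s in zip(count_end, task_select))


if __name__ == "__main__":
    # Task a is selected and its counter has reached its end value.
    print(task_switch_pulse([True, False, False, False],
                            [True, False, False, False], check=False))
    # Counter a has ended but task b is the selected one, so no pulse.
    print(task_switch_pulse([True, False, False, False],
                            [False, True, False, False], check=False))
```

Gating each count end signal with its own selection line means that only the counter of the running task can end a time slice, while the check signal forces a switch regardless of the counters.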

In FIG. 1, (4) is the address conversion means, which generates the actual register number by combining the task selection signal from the task switching control means (3) with the provisional register number output from the instruction decoding means (2). The address conversion means (4) can be implemented, for example, with the same structure as the Translation Lookaside Buffer (TLB) found in processors that support virtual memory systems.
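One simple way to picture the address conversion is to give each task its own fixed window in the register file. This is an assumption made for illustration; the patent only states that a TLB-like structure can be used. The window size, task count, and function name below are all hypothetical.

```python
REGS_PER_TASK = 8  # assumed size of each task's register window
NUM_TASKS = 4      # assumed number of tasks (a, b, c, d)


def actual_register(task: int, provisional: int) -> int:
    """Combine a task index and a provisional register number into one index."""
    if not 0 <= task < NUM_TASKS:
        raise ValueError("unknown task index")
    if not 0 <= provisional < REGS_PER_TASK:
        raise ValueError("provisional register number out of range")
    return task * REGS_PER_TASK + provisional


if __name__ == "__main__":
    # The same provisional number maps to different physical registers.
    print(actual_register(0, 3), actual_register(2, 3))
```

With this mapping two tasks can both refer to provisional register 3 without colliding. A table-based mapping, closer to a TLB, would allow windows of different sizes per task.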

In FIG. 1, (5) is the register block. On receiving the actual register number, the register block (5) accesses the storage location at that register number and can write data from the internal bus or read data out to the internal bus.

As shown in FIG. 1, the signal converted into an actual register number by the address conversion means (4) is supplied to the address decoder (5-1) of the register block (5), which accesses the register storage section (5-2) to read or write the register. Data output from the register block (5) is supplied through the internal bus (7) to the arithmetic processing block (6) and processed there, and the result is normally written back to the register block (5), again through the internal bus (7).

The arithmetic processing block (6) in the example of FIG. 1 performs two-stage pipeline processing internally. In the figure, (6-1) is the first stage of the arithmetic unit, (6-2) is an intermediate latch that holds intermediate results, (6-3) is the second stage of the arithmetic unit, and (6-4) is an output latch that holds the final result.

In FIG. 1, (8) is the program counter means, which indicates the address of the instruction each task should execute next. The program counter means (8) stores the address of the instruction to be executed for each task; as a task's instructions are executed, the corresponding program counter is updated and holds the new address. When a branch instruction or the like is executed, the program counter value is changed to the branch target address, as in conventional processors.

The instruction decoding means (2) consists of the instruction decoder circuit of an ordinary processor, a selector circuit at its input for task selection, and, as additions to the decoder circuit, a detector that determines whether consecutive instruction execution is possible and a check signal generating circuit.

FIG. 2 shows a second embodiment of the present invention. In addition to the components of the first embodiment, the second embodiment has: a plurality of instruction input ports (11), each connected to one of the instruction buffer means (1), through which independent instructions for each task are input from outside the processor; a plurality of independent data input/output ports (12) through which a plurality of data streams can be input and output in parallel; and data switch means (13) that connects the data input/output ports (12) to the register block (5) according to the task selection signal from the task switching control means (3). With this configuration, a conventional multiprocessor system is equivalently replaced by a single processor.

FIG. 3 shows a third embodiment of the present invention.

In the third embodiment, when the execution of instructions is to be ordered across tasks or tasks are to be synchronized, strength information is assigned to each instruction in advance. Instructions that need no synchronization are given the highest strength, and the processor's own discrimination strength is set one rank lower than that in the strength counter (2-3). For example, to execute instruction Xd of task d first, then instruction Yc of task c, then instruction Zb of task b, and then instruction Wa of task a, the strength of Xd is set to 4, that of Yc to 3, that of Zb to 2, that of Wa to 1, and the initial strength of the processor to 4.

The strength memory determination means (2-2) consists of a memory circuit that stores the part of each instruction containing its strength information, a comparison circuit that compares the processor strength flag signal (2-4) output from the strength counter (2-3) with the strength of each instruction, and a next-strength generating output circuit. It operates as follows. (a) When the instruction strength is greater than or equal to the processor strength, the next-strength generating circuit computes the instruction strength minus 1, and this value is returned to the strength counter (2-3) as the strength update signal (2-5) and set there. (b) Conversely, when the processor strength is greater than the instruction strength, the instruction, less its strength information, is stored in the lock instruction buffer memory (2-1), execution of the instruction in the arithmetic processing block (6) is locked, the program counter (8) of the corresponding task is stopped, and a check signal for task switching is issued.

For example, suppose Zb is supplied to the instruction decoding means (2) first. The processor strength is 4 and the strength of Zb is 2, which is case (b), so Zb is temporarily stored in the strength memory determination means (2-2) and lock instruction buffer memory (2-1) corresponding to task b, the corresponding program counter (8) is stopped, and the task is switched to c. When instruction Xd arrives next, case (a) applies, so Xd is executed as is and the strength memory determination means (2-2) updates the processor strength from 4 to 3. When Wa arrives next, case (b) applies, so Wa is also locked and the task is switched to b; because task b is locked, a further task switch occurs and the task is switched to c. When instruction Yc arrives next, case (a) applies, so Yc is executed as is and the processor strength is updated to 2. When the task is subsequently switched to b, case (a) now applies and the locked Zb is executed. Finally Wa is executed. In this way, instructions residing in different tasks are executed in the order Xd → Yc → Zb → Wa.
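The worked example with Xd, Yc, Zb, and Wa can be reproduced with a small simulation of rules (a) and (b). This is a simplified model: it assumes locked instructions are retried in arrival order on each pass instead of modeling the task switching hardware, and the function name `run_with_strength` is hypothetical.

```python
from typing import List, Tuple


def run_with_strength(arrivals: List[Tuple[str, int]],
                      processor_strength: int) -> List[str]:
    """Return instruction names in the order they actually execute."""
    executed: List[str] = []
    pending = list(arrivals)
    while pending:
        still_locked: List[Tuple[str, int]] = []
        progressed = False
        for name, strength in pending:
            if strength >= processor_strength:
                # Case (a): execute, then set the next strength to strength - 1.
                executed.append(name)
                processor_strength = strength - 1
                progressed = True
            else:
                # Case (b): the instruction stays locked and is retried later.
                still_locked.append((name, strength))
        if not progressed:
            raise RuntimeError("no locked instruction can ever be released")
        pending = still_locked
    return executed


if __name__ == "__main__":
    arrivals = [("Zb", 2), ("Xd", 4), ("Wa", 1), ("Yc", 3)]
    print(run_with_strength(arrivals, processor_strength=4))
```

Although Zb and Wa arrive early, they stay locked until the processor strength has dropped to their level, so the execution order matches the one given in the text.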

FIG. 4 shows a fourth embodiment of the present invention. Compared with the first embodiment, the fourth embodiment has two paths for register numbers. One is the number designation signal line (24), a direct path by which the provisional register number output from the instruction decoding means (2) is supplied straight to the address conversion means (4). The other passes in turn through the input buffer memory (21), the status flag and control section (23), and the shift register (22), described below, and reaches the address conversion means (4) and the output latch (6-4) of the arithmetic processing block (6) after a fixed delay. The direct number designation signal line (24) supplies the register numbers of the source operands of an operation, while the path from the input buffer memory (21) through the status flag and control section (23) to the address conversion means (4) supplies the register number of the destination operand after a delay. Each stage of the shift register (22) has enough bits to specify a register number, and the system basic clock moves the data one stage at a time toward the address conversion means (4). The input buffer memory (21) feeds each stage of the shift register (22) and is set to behave as follows according to the state of each stage of the shift register (22), which is held in the status flag and control section (23). If the shift register (22) stage is empty, the input data (a register number) passes through the input buffer memory (21) and is stored directly in the shift register (22). If the shift register (22) stage is already occupied by data (a register number), the input data is stored in the input buffer memory (21).

Further, the shift operation of the shift register (22) stops while data is held in the input buffer memory (21) of the next stage.
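The behaviour of the shift register with per-stage input buffers can be sketched in software as follows. This is only an illustrative model, not the circuit of FIG. 7: the class name `BufferedDelayLine`, the one-entry buffer per stage, the convention that stage 0 is the stage nearest the address conversion means, and the rule that a buffered entry takes priority over data arriving from the stage behind it are all assumptions made for the sketch.

```python
from typing import List, Optional


class BufferedDelayLine:
    """Shift register whose stages each have a one-entry input buffer.

    Assumption: stage 0 is the exit stage (address conversion side).
    """

    def __init__(self, stages: int) -> None:
        self.shift: List[Optional[str]] = [None] * stages
        self.buf: List[Optional[str]] = [None] * stages

    def insert(self, stage: int, reg: str) -> None:
        """Enter a register number at a stage (input request)."""
        if self.shift[stage] is None:
            self.shift[stage] = reg      # stage empty: store directly
        elif self.buf[stage] is None:
            self.buf[stage] = reg        # stage full: park in the buffer
        else:
            raise RuntimeError("stage and its input buffer are both full")

    def tick(self) -> Optional[str]:
        """Advance one clock; return the register number leaving stage 0."""
        out = self.shift[0]
        self.shift[0] = None
        for i in range(len(self.shift)):
            if self.shift[i] is not None:
                continue  # stage holds its data, so stages behind it stall
            if self.buf[i] is not None:
                # Buffered entry goes first; the stage behind does not shift.
                self.shift[i], self.buf[i] = self.buf[i], None
            elif i + 1 < len(self.shift) and self.shift[i + 1] is not None:
                self.shift[i], self.shift[i + 1] = self.shift[i + 1], None
        return out


line = BufferedDelayLine(stages=4)
line.insert(1, "r3")
line.insert(1, "r7")  # same stage in the same cycle: goes to the buffer
print([line.tick() for _ in range(4)])  # prints [None, 'r3', 'r7', None]
```

In this model a colliding entry is delayed by one clock rather than lost, and any data further back in the line waits for it.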

FIG. 7(b) shows the configuration of a control circuit that operates as described above; it shows the configuration for a single stage. In the figure, IN-DATA is the input data, IN-RQ is the input request pulse signal for input to this stage, and CK is the system basic clock. (21) is the input buffer memory, specifically a multi-bit flip-flop circuit that stores the D(L) input while the E input is asserted. (23-1) is one bit of the status flag section (23), a flip-flop circuit that, in addition to the E and D(L) inputs, stores the D input on the edge of the clock CK input. (23-2) is a flip-flop circuit with E and D(L) inputs. (22) is one stage of the shift register, a multi-bit flip-flop circuit with E, D(L), CK, and D inputs. (23-3) is a logic network that realizes the desired operation, and (23-4) is a selector network that selects either the data from the input buffer or the data from the preceding shift register stage. The status flags Q1 and Q2 are defined as described in FIG. 7(a).

In FIG. 4, (25) is the input position control means. According to the number of pipeline stages of the arithmetic processing block (6) used to execute each instruction output from the instruction decoding means (2), it issues an input request pulse to the appropriate input stage of the shift register (22) and supplies the data to that stage.
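As an illustration of how the entry stage might be chosen, the sketch below enters each destination register number at a position equal to the pipeline depth of the block that executes the instruction, so that the number reaches the exit of the delay line in the cycle the result is produced. The block names, their depths in `PIPELINE_STAGES`, and the helper names are hypothetical, and the delay line is simplified to a plain list with no input buffers.

```python
from typing import List, Optional, Tuple

# Hypothetical pipeline depths (in clock cycles) of three arithmetic blocks.
PIPELINE_STAGES = {"add": 1, "mul": 3, "div": 5}


def entry_stage(op: str) -> int:
    """Stage index at which to enter the delay line; stage 0 is the exit."""
    return PIPELINE_STAGES[op] - 1


def writeback_schedule(program: List[Tuple[str, str]]) -> List[Tuple[int, str]]:
    """Issue one (op, dest) per cycle; return (cycle, dest) write-backs."""
    line: List[Optional[str]] = [None] * max(PIPELINE_STAGES.values())
    pending = list(program)
    schedule: List[Tuple[int, str]] = []
    cycle = 0
    while pending or any(slot is not None for slot in line):
        if pending:
            op, dest = pending.pop(0)
            stage = entry_stage(op)
            # A collision here is the case the input buffer memory (21)
            # absorbs; this sketch simply assumes it does not happen.
            assert line[stage] is None, "stage already occupied"
            line[stage] = dest
        cycle += 1
        out, line = line[0], line[1:] + [None]  # shift one stage per clock
        if out is not None:
            schedule.append((cycle, out))
    return schedule


# mul issues first but finishes after the add that issues one cycle later.
print(writeback_schedule([("mul", "r1"), ("add", "r2")]))
# prints [(2, 'r2'), (3, 'r1')]
```

Each destination number leaves the line exactly as many cycles after issue as its block has pipeline stages, which is the timing alignment the input position control means is described as providing.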

(Effects of the Invention) As described above, in the processor with a task switching function according to the invention of claim (1), when a pipelined processor detects mutually dependent instructions that cannot be executed consecutively, it switches tasks and brings in from another task an instruction that is independent of the instruction whose execution has just started. Because the instructions of multiple tasks are executed in turn, there are no dependencies between consecutive instructions and no register hazards occur. The invention therefore achieves near-ideal pipeline operation and yields a processor with very high execution performance.
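The switching rule can be sketched as a small issue-order model. It is a simplification under stated assumptions: an instruction is a hypothetical `(name, dest, sources)` tuple, a dependency means that the next instruction of the same task reads the register the current one writes, tasks are visited round-robin (the text leaves the switching algorithm configurable), and registers are private to each task.

```python
from typing import List, Tuple

# Hypothetical instruction form: (name, destination register, source registers)
Instr = Tuple[str, str, Tuple[str, ...]]


def depends(prev: Instr, nxt: Instr) -> bool:
    """True if nxt reads the register that prev writes."""
    return prev[1] in nxt[2]


def issue_order(tasks: List[List[Instr]]) -> List[str]:
    """Issue instructions, switching task when the next one is dependent."""
    pcs = [0] * len(tasks)  # one program counter per task
    cur = 0
    order: List[str] = []
    while any(pc < len(t) for pc, t in zip(pcs, tasks)):
        if pcs[cur] >= len(tasks[cur]):
            cur = (cur + 1) % len(tasks)  # task finished: move on
            continue
        instr = tasks[cur][pcs[cur]]
        order.append(instr[0])
        pcs[cur] += 1
        nxt = pcs[cur]
        if nxt < len(tasks[cur]) and depends(instr, tasks[cur][nxt]):
            cur = (cur + 1) % len(tasks)  # next is dependent: switch task
    return order


task_a = [("A1", "r1", ()), ("A2", "r2", ("r1",)), ("A3", "r3", ("r0",))]
task_b = [("B1", "r1", ()), ("B2", "r2", ("r1",))]
print(issue_order([task_a, task_b]))
# prints ['A1', 'B1', 'A2', 'A3', 'B2']
```

In the printed order no instruction immediately follows the instruction it depends on. When only one task remains, the model issues the dependent instruction back to back, which is the case where real hardware would have to wait.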

Further, because the processor with a task switching function according to the invention of claim (2) has a plurality of data input/output ports and instruction input/output ports to the outside of the processor, simultaneous data input/output and simultaneous instruction input/output become possible, and a multiprocessor system can be realized with a single processor. Moreover, data input/output instructions, which generally take a long time to execute and have been a major cause of pipeline disruption, can also be handled by task switching, so these too can be ideally pipelined. The invention therefore enables high-speed arithmetic processing.

Further, the processor with a task switching function according to the invention of claim (3) has a strength counter, a strength buffer memory, and strength storage and determination means. It detects the strength encoded in the instruction sequence of each task and compares it with the current strength of the processor; if the detected strength is higher, execution continues, and if it is lower, execution is stopped. Synchronization between tasks is achieved in this way. The synchronization of processing between processors, which has been a problem in conventional multiprocessor systems, can thus be realized easily, with only a small amount of added hardware, by the pseudo-multiprocessor configuration of the present invention using a single processor.

The invention therefore allows a large reduction in processing time compared with conventional software synchronization methods.

Furthermore, in the processor with a task switching function according to the invention of claim (4), when a plurality of arithmetic processing blocks are made to execute instructions one after another, the time at which each result is produced differs with the number of pipeline stages of each block. Because the processor has the shift register section with input buffer memory described above, the timing of the write to the destination operand register named in the same instruction can be adjusted by appropriately selecting the input position in the shift register, and the output latch of the arithmetic processing block and the write register can be accessed in step with the time at which the result is produced. The operation code, source operands, and destination operand of an instruction can therefore all be written on the same line. The invention thus greatly improves the efficiency of program development.

With the processor with a task switching function according to any of the above inventions, the execution performance of a highly pipelined processor can be greatly improved, and its ease of use can be increased.

[Brief Description of the Drawings]

FIG. 1 is an internal configuration diagram of a processor showing a first embodiment of the present invention. FIG. 2 is a configuration diagram of a second embodiment, which adds input/output ports and other elements to the configuration of the first embodiment to achieve a greater effect. FIG. 3 is a configuration diagram of a third embodiment of the present invention, in which synchronization between tasks and the execution order of instructions can be specified. FIG. 4 is a circuit configuration diagram of a fourth embodiment of the present invention, in which the operation of the registers and arithmetic processing blocks can be controlled automatically in step with the generation timing of the destination operand. FIG. 5 is a timing diagram of conventional pipeline operation in general. FIG. 6 is a diagram showing a task switching circuit. FIG. 7 is a diagram showing detailed circuits of the input buffer memory, one stage of the shift register, and the status flag and control section shown in FIG. 4.

1 ... Instruction buffer means; 2 ... Instruction decoding means; 2-1 ... Lock instruction buffer memory; 2-2 ... Strength storage and determination means; 2-3 ... Strength counter; 3 ... Task switching control means; 4 ... Address conversion means; 5 ... Register block; 6 ... Arithmetic processing block; 7 ... Internal bus; 8 ... Program counter; ... Instruction input port; 2 ... Data input/output port; 3 ... Data switch means; 21 ... Input buffer memory; 22 ... Shift register; 23 ... Status flag and control section; 25 ... Input position control means

Claims

[Claims]

(1) A processor with a task switching function, comprising: a plurality of instruction buffer means, each storing part of one of a plurality of tasks, each task being an instruction sequence; program counter means for storing the address of the next instruction to be executed in each task and holding that address until the instruction is actually executed; instruction decoding means for selectively taking in and sequentially decoding the outputs of the instruction buffer means and, based on the decoding result, outputting a check signal indicating whether the next instruction to be executed is executable; task switching control means for outputting a task selection signal, which is information on the task to be selected next, based on the check signal from the instruction decoding means and on internally set switching algorithm information for each task; address conversion means for generating an actual register number by combining the task selection signal from the task switching control means with the temporary register number output from the instruction decoding means; a register block that receives the actual register number, accesses the storage location at that register number, and writes data from, or reads data to, an internal bus; and one or more arithmetic processing blocks that execute arithmetic processing; characterized in that the instruction decoding means has a determination circuit that determines whether the instruction currently being decoded allows the following instruction of the same task to be executed consecutively, and the task switching control means has a control circuit configured to always perform a task switching operation when the check signal from the instruction decoding means indicates that the next instruction of the same task cannot be executed.

(2) The processor with a task switching function according to claim (1), further comprising: a plurality of data input/output ports capable of inputting and outputting a plurality of data streams in parallel; a data switch that connects each of the data input/output ports to the register block in response to the task selection signal from the task switching control means; and a plurality of instruction input ports, each connected to one of the plurality of instruction buffer means.

(3) The processor with a task switching function according to claim (1) or (2), characterized in that the instruction decoding means includes: a strength counter that stores or updates the strength of instructions; a storage circuit that stores strength information which is contained in part of an instruction code or is present in each instruction sequence as an independent non-executing control code; a comparison circuit that compares the strength in the strength counter with the strength information in the storage circuit; an output circuit that outputs a lock flag signal when the comparison by the comparison circuit shows the strength in the strength counter to be higher; and lock instruction buffer memories, equal in number to the tasks, which on receiving the lock flag signal lock execution in the arithmetic processing block of the instruction carrying the strength information and temporarily hold that instruction; and in that a program counter is provided for each task and has a function of stopping the update to the next instruction address on receiving the lock flag signal from the output circuit.

(4) The processor with a task switching function according to any of claims (1) to (3), characterized in that it has at least a register section with multiple input/output ports capable of executing writes and reads independently, the arithmetic processing blocks have multi-stage pipeline configurations, and the processor comprises: a shift register with a bit width sufficient to specify a register number and the type of arithmetic processing block; an input buffer memory of the same bit width at the data input of each stage of the shift register; a status flag section that indicates the state of each stage of the shift register, namely empty, in use, or waiting; operation control means for determining the type of operation of each stage of the shift register according to the value of the status flag indicated by the status flag section; and input position control means for inputting the destination register number, in which the result of each arithmetic operation output from the instruction decoding means is to be stored, at the stage position of the shift register appropriate to the number of pipeline stages required for that operation.