JP5842255B2

JP5842255B2 - Apparatus and method for generating logic circuit from logic circuit description in programming language

Info

Publication number: JP5842255B2
Application number: JP2013256881A
Authority: JP
Inventors: Tsuyoshi Isshiki (一色 剛)
Original assignee: Tokyo Institute of Technology NUC
Current assignee: Tokyo Institute of Technology NUC
Priority date: 2013-12-12
Filing date: 2013-12-12
Publication date: 2016-01-13
Anticipated expiration: 2033-12-12
Also published as: CN105814568A; CN105814568B; WO2015087957A1; JP2015114874A; US20160299998A1; US10089426B2

Description

The present invention relates to a method for synthesizing semiconductor integrated circuits and, more specifically, to an apparatus and method for synthesizing a logic circuit from a logic circuit description written in a programming language.

Conventionally, the logic circuits of an LSI (large-scale integrated circuit) have been designed using RTL (Register Transfer Level) descriptions, with data flow control designed using state transition diagrams. RTL denotes a level of abstraction in logic circuit design, and an RTL description is a kind of description written in a low-abstraction hardware description language (HDL) that describes the structure and behavior of hardware. In RTL, the flow of data is described in units of registers. HDL source code written at RTL is converted into a gate-level circuit description by software called a logic synthesis tool.

When a logic circuit is designed using a state transition diagram, the flow of state transitions must be verified manually after the diagram has been created. Consequently, when designing a logic circuit that requires complex data flow control, such as a high-performance LSI for image processing, design quality is liable to deteriorate because of omissions in verifying the state transitions, the introduction of bugs, and so on.
For this reason, the high-level synthesis technology described below has been introduced in recent years.

High-level synthesis, as described in Non-Patent Document 1, is known as a technique for designing logic circuits in a short period of time. High-level synthesis is a design automation technique that automatically converts a behavioral description of a logic circuit written in a procedural software programming language such as C (hereinafter, a “software description”) into an RTL description in an HDL. The process of automatically synthesizing an RTL description from a software description using high-level synthesis consists of the following steps.
(1) Allocation step: the types and numbers of arithmetic units, memories, and other resources to be included in the synthesized logic circuit are determined.
(2) Scheduling step: the execution time of each operation in the software description is determined. This step is based on the “parallelism analysis” function used in parallelizing compiler technology for VLIW (Very Long Instruction Word) processors, which can perform parallel operations on multiple arithmetic units.
(3) Binding step: the assignment of each operation to an arithmetic unit, the assignment of intermediate data to registers, and so on are determined.
(4) FSMD generation step: the control unit needed for the logic circuit to execute processing equivalent to the software description is implemented as an FSM (Finite State Machine), and an FSMD (Finite State Machine + Datapath) is generated in which this control unit drives a datapath composed of arithmetic units, registers, memories, buses, and the like. The FSMD finally generated is the output RTL description.
Of these steps, the “operation scheduling and binding step”, which consists of multiplexing the operations of the software in time by scheduling and multiplexing them in space by binding (assignment to arithmetic units), is the core of high-level synthesis.
Details of high-level synthesis and its steps are described in, for example, Patent Document 1 (JP 2003-076728 A), Patent Document 2 (JP 2006-011878 A), and Non-Patent Document 1.
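
As a rough illustration of the scheduling step in (2), the following Python sketch assigns each operation the earliest control step at which all of its operands are available (as-soon-as-possible scheduling). This is only one simple policy chosen for illustration and is not taken from the cited documents; the function name `asap_schedule` and the operation names are hypothetical.

```python
def asap_schedule(deps):
    """Map each operation to its earliest control step.

    deps maps an operation name to the operations whose results it needs
    (assumed to form an acyclic dependency graph).
    """
    step = {}

    def visit(op):
        if op not in step:
            # An operation with no predecessors lands in step 0.
            step[op] = 1 + max((visit(p) for p in deps[op]), default=-1)
        return step[op]

    for op in deps:
        visit(op)
    return step


# t3 needs t1 and t2; t4 needs t3 and t1.
deps = {"t1": [], "t2": [], "t3": ["t1", "t2"], "t4": ["t3", "t1"]}
schedule = asap_schedule(deps)
```

Operations that share a step (here `t1` and `t2`) could run in parallel on separate arithmetic units, which is the kind of decision the binding step in (3) then has to make.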

Another known technique for designing logic circuits in a short period of time is the use of SystemC, as described in Non-Patent Document 2. SystemC is a kind of HDL provided as a set of C++ class and macro definitions that offer an interface for event-driven simulation of hardware systems. Because SystemC can express the “hierarchical descriptions”, “concurrent process descriptions and process activation conditions (sensitivity lists)”, and “signal connections” that hardware description requires, a logic circuit can be described at RTL in SystemC. In addition, with TLM (Transaction Level Modeling), a SystemC description style that separates the description of communication from the description of computation, the functions of a system can be expressed abstractly as a C++ software description, making large-scale system-level simulation possible.
Details of SystemC are described in, for example, Non-Patent Document 2.

Patent Document 1: JP 2003-076728 A
Patent Document 2: JP 2006-011878 A

Non-Patent Document 1: Kazutoshi Wakabayashi, “High-level synthesis technology that synthesizes hardware descriptions from software programs”, IEICE Fundamental Review, 2012, Vol. 6, No. 1, pp. 37-50
Non-Patent Document 2: Thorsten Groetker, “System Design with SystemC”, Japanese translation supervised by Masaru Kakimoto, Masamichi Kawarabayashi, and Takashi Hasegawa, Maruzen, 2003, ISBN 4-621-07144-0 C3055

High-level synthesis has the following problems.
First, building the technology and implementing the tools needed to synthesize high-performance, high-efficiency circuits becomes extremely complex and large in scale. The program code that high-level synthesis takes as input expresses a sequential procedure, so it has fundamentally the same structure as software program code executed on a CPU and offers a very high degree of descriptive freedom. The performance of the synthesized circuit is determined by how much operation-level parallelism can be extracted from such a free-form software description, which makes it necessary to apply very complex parallelism analysis techniques (data dependence analysis, control dependence analysis, loop analysis, array reference dependence analysis, and so on). As a result, the “operation scheduling and binding step” described above becomes very complex.

In addition, with high-level synthesis it is extremely difficult to predict how the content of the RTL description, that is, the state of the logic circuit, will change when the input software description is changed. Because the “operation scheduling and binding step” is complex, as noted above, it is extremely difficult to predict the characteristics of the output circuit (circuit size and processing time) from the input software description. For example, if none of the synthesized circuits meets the performance requirements, the software description must be changed so that increased operation parallelism or reduced memory traffic brings the circuit within the requirements (shorter processing time, smaller circuit size). However, since it is hard to predict how a change to the software description will affect the characteristics of the synthesized circuit, the task of modifying the software description can itself become extremely difficult.

Furthermore, because the structure (architecture) of the logic circuit depends heavily on the “operation scheduling and binding step”, it is difficult with high-level synthesis to write a software description that reflects the designer's architectural intent and know-how. For this reason, manual design that does not rely on high-level synthesis (that is, describing the logic structure directly in a hardware description language) is still the mainstream approach for processing blocks that require a high degree of parallelism.

The SystemC approach, on the other hand, has the following problems. First, describing a logic circuit at RTL in SystemC requires the RTL description classes defined by SystemC, and the amount of code is comparable to or greater than that needed in a hardware description language, so the gain in design productivity for RTL description is very limited. Second, although tools exist that use the high-level synthesis techniques described above to convert a TLM description automatically into an RTL description, the problems of high-level synthesis apply unchanged when such a tool is used.

The present invention has been made to solve these problems of the prior art. Its object is to provide an apparatus, a method, and a computer program with which a specific information processing function intended for circuit implementation is described in a programming language and a logic-synthesizable RTL description is generated automatically from that description.

A first aspect of the present invention is a logic circuit generation device that takes as input a program including a top-level function, which is the target of logic circuit generation and contains a behavioral description (the flow of a series of hardware processing steps for circuit design), and generates a logic circuit description. The logic circuit generation device comprises a control flow graph generation unit, a control flow degeneration conversion unit, a data flow graph generation unit, and a logic circuit description output unit.

The control flow graph generation unit generates a control flow graph from the top-level function, which contains neither loop processing parts nor function call instructions. The control flow degeneration conversion unit takes the control flow graph, which contains exactly one assignment instruction per variable, and removes all conditional branch instructions from it, thereby generating a control flow degenerate program, that is, a program whose control flow has been degenerated. The data flow graph generation unit generates a data flow graph from the control flow degenerate program by making each instruction of the program a node and adding a directed edge from the instruction that assigns each variable to every instruction that references that variable. The logic circuit description output unit generates a logic circuit description representing a sequential circuit in which the directed edges of the data flow graph correspond to wires of the logic circuit and the nodes of the data flow graph correspond to arithmetic units of the logic circuit.
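
The data flow graph construction described above can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the tuple encoding of instructions, the function name `build_dfg`, and the `select` operation standing in for a removed conditional branch are all assumptions made for the example.

```python
def build_dfg(instructions):
    """Return the directed edges of a data flow graph.

    instructions is a branch-free program in single-assignment form, given as
    a list of (destination, operation, operands). Each instruction is a node,
    identified by its position in the list.
    """
    # With one assignment per variable, each name has a unique defining node.
    definer = {dest: i for i, (dest, _, _) in enumerate(instructions)}
    edges = []
    for i, (_, _, operands) in enumerate(instructions):
        for name in operands:
            if name in definer:  # circuit inputs have no defining node
                edges.append((definer[name], i))
    return edges


program = [
    ("c", "lt", ("a", "b")),             # node 0: c = a < b
    ("x1", "add", ("a", "b")),           # node 1
    ("x2", "sub", ("a", "b")),           # node 2
    ("x", "select", ("c", "x1", "x2")),  # node 3: replaces the branch on c
]
edges = build_dfg(program)
```

Read as hardware, each node is an operator and each edge such as `(1, 3)` is a wire from the adder output to the selector input.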

The state variables that represent the state of the sequential circuit are expressed in the program as local variables or static variables of an upper-level function that calls the top-level function. The value of a state variable before the assignment instruction to that variable is executed represents the current state of the sequential circuit, and its value after the assignment instruction has been executed represents the next state.

In one embodiment, the logic circuit generation device preferably further comprises a static single assignment form conversion unit. If the top-level function is not in static single assignment form, in which each variable has exactly one assignment instruction, the static single assignment form conversion unit converts the control flow graph into static single assignment form before it is input to the control flow degeneration conversion unit. The static single assignment form conversion unit includes a φ function instruction insertion unit, a variable name conversion unit, and a state variable name re-conversion unit. The φ function instruction insertion unit inserts, at each point in the control flow graph where multiple definitions of the value of the same variable merge, a φ function instruction that selects, from all the variable definitions merging at that point, the definition on the path that was actually executed. The variable name conversion unit renames the variables in the control flow graph so that every assignment instruction assigns to a differently named variable, thereby converting the control flow graph into static single assignment form, in which each renamed variable has exactly one assignment instruction. The state variable name re-conversion unit renames each renamed state variable again so that its name in the start block of the control flow graph matches the name that reaches the end block. Preferably, the control flow degeneration conversion unit includes a φ function instruction materialization unit, which converts each φ function instruction into a concrete operation instruction.
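
To make the renaming and φ function steps concrete, the following Python sketch shows a function before and after a hand-applied conversion. It assumes that a φ function with two incoming definitions is materialized as a select on the branch condition; the function and variable names are hypothetical.

```python
def original(a, b, c):
    # x has two assignments that merge after the if/else.
    if c:
        x = a + b
    else:
        x = a - b
    return x * 2


def degenerated(a, b, c):
    # Each assignment gets its own name, so both branches can be computed
    # unconditionally and the conditional branch disappears.
    x1 = a + b
    x2 = a - b
    x3 = x1 if c else x2  # phi(x1, x2) materialized as a select on c
    return x3 * 2
```

Both versions return the same value for every input, but the second has no control flow left: the adder, the subtractor, and the selector all exist side by side, as they would in a circuit.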

In one embodiment, the logic circuit generation device preferably further comprises a register/memory array access instruction decomposition unit, which includes an array assignment instruction decomposition unit and an array reference instruction decomposition unit. When the top-level function in the program contains array assignment instructions, that is, instructions that write to an array variable, the array assignment instruction decomposition unit adds a write data variable and a write address variable to the control flow graph for each array variable and decomposes each array assignment instruction into an instruction that assigns the value to be stored in the array element to the write data variable, an instruction that assigns the array index value to the write address variable, and an instruction that assigns the value of the write data variable to the array element selected by using the write address variable as the array index. When the top-level function in the program contains array reference instructions, that is, instructions that read from an array variable, the array reference instruction decomposition unit adds a read address variable to the control flow graph for each array variable and decomposes each array reference instruction into an instruction that assigns the array index value to the read address variable and an instruction that references the array element selected by using the read address variable as the array index. Preferably, in the memory that holds the array element data in the logic circuit, the write data variable corresponds to the write data port, the write address variable corresponds to the write address port, and the read address variable corresponds to the read address port. The control flow graph generated by the control flow graph generation unit is processed by the register/memory array access instruction decomposition unit and then by the control flow degeneration conversion unit.
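
The decomposition of array accesses can be sketched in Python as a before/after pair. The variable names `mem_raddr`, `mem_wdata`, and `mem_waddr` are hypothetical; the source only specifies that such read address, write data, and write address variables are introduced per array.

```python
def access_original(mem, i, v, j):
    y = mem[j]   # array reference instruction
    mem[i] = v   # array assignment instruction
    return y


def access_decomposed(mem, i, v, j):
    mem_raddr = j               # read address variable  -> read address port
    y = mem[mem_raddr]          # reference through the read address
    mem_wdata = v               # write data variable    -> write data port
    mem_waddr = i               # write address variable -> write address port
    mem[mem_waddr] = mem_wdata  # the only remaining write to the array
    return y


m1 = [0, 1, 2, 3]
m2 = list(m1)
y1 = access_original(m1, 1, 9, 2)
y2 = access_decomposed(m2, 1, 9, 2)
```

After decomposition, every access to the array goes through a small set of named variables, and those variables line up one-to-one with the ports of the memory block.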

In one embodiment, more preferably, the register/memory array access instruction decomposition unit further includes a write port number assignment unit and a read port number assignment unit. For each array assignment instruction to each array variable in the control flow graph, the write port number assignment unit assigns, as that instruction's write port number, the maximum number of array assignment instructions to the same array variable that are executed between the start block and that instruction. For each array reference instruction to each array variable in the control flow graph, the read port number assignment unit assigns, as that instruction's read port number, the maximum number of array reference instructions to the same array variable that are executed between the start block and that instruction. Preferably, the array assignment instruction decomposition unit adds, for each array variable, a write data variable and a write address variable to the control flow graph for each write port number, and a read address variable for each read port number. Preferably, the memory that holds the array element data has as many write ports as there are write port numbers assigned to the assignment instructions of the array variable, and as many read ports as there are read port numbers assigned to its reference instructions.
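
One way to picture the write port numbering is as a longest-path count over the acyclic control flow graph. The Python sketch below is an illustration under assumptions: it handles a single array, counts the instruction itself (so port numbers start at 1), and uses a hypothetical block/successor encoding and function name.

```python
def assign_write_ports(blocks, succ):
    """Return a write port number for every array-write instruction.

    blocks maps a basic block to its ordered list of write instructions;
    succ maps a basic block to its successor blocks (acyclic graph).
    """
    preds = {b: [] for b in blocks}
    for b, targets in succ.items():
        for t in targets:
            preds[t].append(b)

    writes_out = {}  # max writes executed on any path up to the block's end

    def out_count(b):
        if b not in writes_out:
            before = max((out_count(p) for p in preds[b]), default=0)
            writes_out[b] = before + len(blocks[b])
        return writes_out[b]

    ports = {}
    for b in blocks:
        before = out_count(b) - len(blocks[b])
        for k, instr in enumerate(blocks[b], start=1):
            ports[instr] = before + k
    return ports


# Diamond-shaped graph: B0 branches to B1 or B2, which rejoin at B3.
blocks = {"B0": ["w0"], "B1": ["w1", "w2"], "B2": [], "B3": ["w3"]}
succ = {"B0": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": []}
ports = assign_write_ports(blocks, succ)
```

In this example `w3` receives port 4 because the path through `B1` executes three writes before it, even though the path through `B2` executes only one. The number of distinct port numbers (four here) is the number of write ports the memory would need.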

In one embodiment, more preferably, the logic circuit generation device further comprises a write memory port count determination unit and a read memory port count determination unit. The write memory port count determination unit determines whether the number of write port numbers assigned by the write port number assignment unit is less than or equal to a predetermined write memory port count threshold. The read memory port count determination unit determines whether the number of read port numbers assigned by the read port number assignment unit is less than or equal to a predetermined read memory port count threshold. Preferably, the logic circuit generation device stops the process of generating the logic circuit description if either determination unit returns “no”.

In one embodiment, the program input to the logic circuit generation device preferably includes attribute descriptions that assign attributes to variables. The attributes include a bit width attribute, which specifies the bit width of a variable's data; a register attribute, which specifies that the value of a variable is held in a register; and a memory attribute, which specifies that the values of the elements of an array variable are held in a memory. Preferably, the logic circuit generation device generates a logic circuit description that includes a sequential circuit whose state is expressed by treating the variables given the register attribute or the memory attribute as state variables.

In one embodiment, more preferably, the logic circuit generation device further comprises a bit width determination unit, an arithmetic unit circuit delay evaluation unit, and a pipeline boundary placement unit. For each variable in the data flow graph, the bit width determination unit calculates the variable's bit width from the bit widths of the variables and/or constants referenced in the instruction that assigns it and from the type of operation that instruction performs. The arithmetic unit circuit delay evaluation unit calculates the signal propagation delay of each arithmetic unit from the bit widths calculated for the variables in the data flow graph. The pipeline boundary placement unit includes a pipeline constraint extraction unit and a pipeline stage count determination unit. The pipeline constraint extraction unit adds a pipeline boundary attribute to the directed edge between the arithmetic unit that executes the assignment instruction for a state variable in the data flow graph and the arithmetic unit that executes a reference instruction for that state variable. The pipeline stage count determination unit determines the number of pipeline stages used to generate the circuit description of a clock-synchronous pipeline circuit from two quantities: the pipeline stage count lower bound, which is the minimum number of stages required by the constraints derived from the pipeline boundary attributes, and either the number of stages calculated from a specified clock period or a number of stages specified in advance.
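
The bit width and stage count calculations might look like the following Python sketch. The width rules (unsigned arithmetic), the way the clock period is turned into a stage count, and the use of `max` to combine it with the lower bound are assumptions for illustration; the source does not state the exact formulas.

```python
import math


def result_width(op, operand_widths):
    """Bit width of an unsigned result (assumed rules, not from the source)."""
    if op == "add":
        return max(operand_widths) + 1  # one extra bit for the carry
    if op == "mul":
        return sum(operand_widths)      # full-precision product
    if op in ("and", "or", "xor"):
        return max(operand_widths)
    raise ValueError(f"unsupported operation: {op}")


def pipeline_stages(total_delay_ns, clock_period_ns, lower_bound):
    """Stage count that meets both the clock period and the lower bound."""
    from_clock = math.ceil(total_delay_ns / clock_period_ns)
    return max(lower_bound, from_clock)


add_width = result_width("add", [8, 8])
mul_width = result_width("mul", [8, 4])
stages_slow_logic = pipeline_stages(7.5, 2.0, lower_bound=2)
stages_fast_logic = pipeline_stages(1.0, 2.0, lower_bound=2)
```

In the last call the logic would fit in one clock period, but the lower bound derived from the pipeline boundary attributes still forces two stages.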

In one embodiment, the logic circuit generation device preferably further comprises a logic circuit input signal extraction unit and a logic circuit output signal extraction unit. The logic circuit input signal extraction unit extracts the input signals of the circuit to be described by the circuit description from the arguments of the top-level function and from global variables. The logic circuit output signal extraction unit extracts the output signals of that circuit from the arguments and return value of the top-level function and from global variables.

In one embodiment, the logic circuit generation device preferably further comprises an acyclic/non-hierarchical conversion unit. When the top-level function contains function call instructions, a complete inline expansion unit inlines each function call instruction, converting the top-level function into a bottom-level function that contains no function call instructions. When the bottom-level function produced by the complete inline expansion unit contains loops with a fixed iteration count, a complete loop unrolling unit unrolls each such loop, converting the bottom-level function into an acyclic bottom-level function that contains no loops. Preferably, the program input to the logic circuit generation device is converted into an acyclic bottom-level function by the acyclic/non-hierarchical conversion unit before being input to the control flow graph generation unit.
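
The effect of complete inlining followed by complete loop unrolling can be shown with a hand-transformed Python example. The functions `scale`, `top_original`, and `top_flattened` are hypothetical and only illustrate the shape of the result.

```python
def scale(x):
    return 3 * x


def top_original(a):
    acc = 0
    for i in range(4):            # loop with a fixed iteration count
        acc = acc + scale(a[i])   # function call instruction
    return acc


def top_flattened(a):
    # After inlining scale() and unrolling the loop four times,
    # no call and no loop remain: the body is straight-line code.
    acc = 0
    acc = acc + 3 * a[0]
    acc = acc + 3 * a[1]
    acc = acc + 3 * a[2]
    acc = acc + 3 * a[3]
    return acc


result_original = top_original([1, 2, 3, 4])
result_flattened = top_flattened([1, 2, 3, 4])
```

The flattened form costs four multipliers and four adders instead of one of each reused over time, which is the spatial, fully parallel structure the later graph-based steps operate on.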

In one embodiment, more preferably, the complete inline expansion unit is configured to judge that a function cannot be converted into a bottom-level function, and to abort processing, if the function call instructions in the input function are still not fully expanded after inline expansion has been repeated a predetermined number of times. The complete loop unrolling unit is configured to judge that a function cannot be converted into an acyclic bottom-level function, and to abort processing, if the input function contains a loop whose iteration count is not a constant. Preferably, the logic circuit generation device stops the process of generating the logic circuit description if the complete inline expandability determination unit or the complete loop unrollability determination unit aborts processing.

In one embodiment, the logic circuit generation device preferably further comprises a state variable instruction dependency determination unit. The state variable instruction dependency determination unit determines whether the control flow graph is free of any case in which an assignment instruction, a reference instruction, and another assignment instruction for the same state variable are executed consecutively in this order. Preferably, the logic circuit generation device stops the process of generating the logic circuit description if the state variable instruction dependency determination unit returns “no”.
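
The assignment, reference, assignment pattern can be detected with a small scan. The Python sketch below checks a single execution path only, whereas the determination unit described above works on the whole control flow graph; the trace encoding and the function name are assumptions.

```python
def has_assign_ref_assign(trace, var):
    """Return True if var is assigned, then referenced, then assigned again.

    trace is the list of ('assign' | 'ref', variable) events on one path.
    """
    stage = 0  # 0: nothing seen, 1: assignment seen, 2: then a reference seen
    for kind, name in trace:
        if name != var:
            continue
        if kind == "assign":
            if stage == 2:
                return True
            stage = 1
        elif kind == "ref" and stage == 1:
            stage = 2
    return False


bad_path = [("assign", "s"), ("ref", "s"), ("assign", "s")]
ok_path = [("ref", "s"), ("assign", "s"), ("ref", "s")]
bad_found = has_assign_ref_assign(bad_path, "s")
ok_found = has_assign_ref_assign(ok_path, "s")
```

A generator built along these lines would stop when the function returns `True` for any state variable, which corresponds to the determination unit returning “no”.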

A second aspect of the present invention is a logic circuit generation method performed by a logic circuit generation device that takes as input a program including a top-level function, which is the target of logic circuit generation and contains a behavioral description (the flow of a series of hardware processing steps for circuit design), and generates a logic circuit description.

A third aspect of the present invention is a logic circuit generation computer program for causing a computer to execute each step of a logic circuit generation method performed by a logic circuit generation device that takes as input a program including a top-level function, which is the target of logic circuit generation and contains a behavioral description (the flow of a series of hardware processing steps for circuit design), and generates a logic circuit description.

According to the device, method, and computer program of the present invention, a logic-synthesizable RTL description can be generated automatically from a software description that offers a high degree of descriptive freedom. High-quality, high-efficiency large-scale integrated circuits can therefore be designed, developed, and verified in a short time and at low cost.

FIG. 1 is an exemplary functional block diagram showing the functions of a logic circuit generation device 1 according to one embodiment of the present invention.
FIG. 2 is a diagram showing an exemplary hardware configuration of the logic circuit generation device 1 according to one embodiment of the present invention.
FIG. 3 is a diagram showing a description example of data type alias definitions in the software description of the present invention.
FIG. 4 is a diagram showing a description example that adds variable attributes to data types in the software description of the present invention.
FIG. 5 is a diagram showing a description example of a program that calls the same function with arguments having different variable attributes in the software description of the present invention, together with the pipeline circuit diagram corresponding to that program.
FIG. 6 is a diagram showing a description example of the state transitions of a sequential circuit using a static variable in the software description of the present invention, together with the state transition diagram of that sequential circuit.
FIG. 7 is a diagram showing a description example of an image-processing line buffer and shift register circuit using memory attribute variables in the software description of the present invention, together with the corresponding pipeline circuit diagram.
FIG. 8 is a diagram showing the software description according to the present invention and the prior-art RTL description, each describing the line buffer update operation of FIG. 7.
FIG. 9 is a diagram showing a description example of a recursive function call.
FIG. 10 is a diagram showing the result of inline-expanding the function shown in FIG. 9 in the logic circuit generation device and method of the present invention.
FIG. 11 is a diagram showing the result of applying constant-propagation optimization to the function shown in FIG. 10 in the logic circuit generation device and method of the present invention.
FIG. 12 is a diagram showing the result of complete inline expansion of the function shown in FIG. 9 in a step of the circuit generation method of the present invention.
FIG. 13 is a diagram showing a specific example of complete loop expansion in the logic circuit generation device and method of the present invention. (a1) shows an example of a function containing a loop processing part that repeats a constant number of times. (a2) shows the control flow graph corresponding to the function of (a1). (b1) shows the result of complete loop expansion of the function of (a1).
FIG. 14 is a diagram showing a specific example of the decomposition of memory array access instructions in the logic circuit generation device and method of the present invention. (a) shows an example of a function containing memory array access instructions. (b) shows the software description obtained by decomposing the memory array access instructions of (a) using port variables.
FIG. 15 is a diagram showing a specific example of the SSA conversion procedure in the logic circuit generation device and method of the present invention. (a) is an example of a software description, (b) is the control flow graph corresponding to the software description of (a), (c) is the control flow graph after φ-function instructions are inserted, (d) is the control flow graph after the names of assigned variables are converted, (e) is the control flow graph after the names of referenced variables are converted, and (f) is the control flow graph after the names of state variables are reconverted.
FIG. 16 is a diagram showing a specific example of the control flow graph of a program containing memory array access instructions in the logic circuit generation device and method of the present invention. (a) shows the control flow graph corresponding to the software description of FIG. 14(b). (b) shows the control flow graph of (a) after SSA conversion.
FIG. 17 is a diagram showing the result of applying, to the control flow graph of FIG. 15(f), the control flow degeneration conversion by materialization of φ-function instructions in the logic circuit generation device and method of the present invention.
FIG. 18 is a diagram showing a specific example of control flow degeneration conversion involving register/memory array access instructions in the logic circuit generation device and method of the present invention. (a) shows the software description corresponding to the control flow graph of FIG. 16(b), which is the control flow graph after SSA conversion. (b) shows the software description obtained by applying control flow degeneration processing to the software description of (a).
FIG. 19 is a diagram showing the data flow graph generated from the control flow degeneration program of FIG. 17.
FIG. 20 is a diagram showing the data flow graph generated from the control flow degeneration program of FIG. 18(b).
FIG. 21 is a diagram showing a specific example of the software description of a program involving constant multiplication.
FIG. 22 is a diagram showing a specific example and the result of division decomposition in the logic circuit generation device and method of the present invention.
FIG. 23 is a diagram showing the software description corresponding to the data flow graph of FIG. 24.
FIG. 24 is a diagram showing a specific example of constant multiplication decomposition, constant propagation, and common subexpression elimination in the logic circuit generation device and method of the present invention.
FIG. 25 is a diagram schematically showing the relationship between the maximum signal propagation time of a logic circuit and the input and output signals of the arithmetic units it contains.
FIG. 26 is a diagram schematically showing the arithmetic unit circuit delay model of the logic circuit generation device and method of the present invention.
FIG. 27 is a diagram schematically showing, in that arithmetic unit circuit delay model, the relationship among the propagation time of each input signal of an arithmetic unit, the propagation delay of the arithmetic unit, and the propagation time of its output signal.
FIG. 28 is a diagram showing the relationship between the input signals and the output signal of a 2-input, 1-output multiplexer.
FIG. 29 is a diagram showing the circuit diagram of a Booth recoder assumed in one embodiment of the logic circuit generation device and method of the present invention, the method of determining its input signals, and the relationship between its input and output signals.
FIG. 30 is a circuit diagram showing an equivalence comparison negator assumed in one embodiment of the logic circuit generation device and method of the present invention.
FIG. 31 is a diagram showing the circuit diagram of a shift operator assumed in one embodiment of the logic circuit generation device and method of the present invention, and the relationship between its input and output signals.
FIG. 32 is a diagram showing a specific example of the software description of a program containing a reference-before-assignment instruction and a reference-after-assignment instruction for a register variable.
FIG. 33 is a diagram showing a data flow graph representing the pipeline circuit generated from the program of FIG. 32.
FIG. 34 is a diagram showing the data flow graph obtained by applying register variable update instruction node grouping to the data flow graph of FIG. 33.
FIG. 35 is a flowchart showing the flow of pipeline circuit synthesis in the logic circuit generation device and method of the present invention.
FIG. 36 is a diagram schematically showing an example of a data flow graph in which three pipeline boundary edges lie on one path.
FIG. 37 is a flowchart showing the control flow of local modification of pipeline boundaries using the SA method in the logic circuit generation device and method of the present invention.
FIGS. 38 to 44 are diagrams each showing one embodiment of the logic circuit generation device of the present invention.

An embodiment of the present invention is described below with reference to the drawings (FIGS. 1 to 37). FIG. 1 is an exemplary block diagram showing the functions of a logic circuit generation device 1 according to one embodiment of the present invention. As shown in FIG. 1, the logic circuit generation device 1 takes as input a program P that includes a top-level function Ftop targeted for logic circuit generation, and can output a logic circuit description D representing a logic circuit. The logic circuit generation device 1 can include:
・Acyclic / non-hierarchical conversion unit 21
・Logic circuit input/output signal extraction unit 22
・Control flow graph generation unit 23
・State variable instruction dependency determination unit 24
・Register/memory array access instruction decomposition unit 25
・Memory port number determination unit 26
・Static single assignment form conversion unit 27
・Control flow degeneration conversion unit 28
・Data flow graph generation unit 29
・Data flow graph optimization unit 30
・Arithmetic unit circuit delay / circuit scale evaluation unit 31
・Pipeline boundary arrangement unit 32
・RTL description output unit 33
An outline of each unit is given below; the detailed processing performed in each unit is described later with reference to FIGS. 8 to 37.

The acyclic / non-hierarchical conversion unit 21 converts the top-level function Ftop, which is included in the program P input to the logic circuit generation device 1 and is the target of logic circuit generation, into an acyclic bottom-layer function Fexp, a form that contains neither loop processing parts nor function call instructions.

The logic circuit input/output signal extraction unit 22 extracts the input signals and output signals of the logic circuit to be generated by the logic circuit generation device 1 from the arguments of the acyclic bottom-layer function Fexp and the global variables of the program P.

The control flow graph generation unit 23 generates a control flow graph Fcfg from the acyclic bottom-layer function Fexp.

The state variable instruction dependency determination unit 24 determines whether the control flow graph Fcfg is free of any case in which an assignment instruction, a reference instruction, and another assignment instruction are executed consecutively in this order on the same state variable. When the unit determines that it is not, the logic circuit generation device 1 can stop generating the logic circuit description.
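The check performed by unit 24 can be sketched in C for a single execution path. This is a minimal illustration under stated assumptions, not the device's actual algorithm: the types Op and OpKind, the constant MAX_VARS, and the function has_assign_ref_assign are hypothetical names, and a real implementation would examine every path of the control flow graph Fcfg rather than one flat sequence.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one instruction touching one state variable. */
typedef enum { OP_ASSIGN, OP_REF } OpKind;
typedef struct { int var; OpKind kind; } Op;

#define MAX_VARS 16  /* assumed upper bound on state variable ids */

/* Returns true if, along the given path, some state variable sees
 * assignment -> reference -> assignment in this order. */
bool has_assign_ref_assign(const Op *path, size_t n) {
    /* 0: nothing seen, 1: assigned, 2: assigned and then referenced */
    int phase[MAX_VARS] = {0};
    for (size_t i = 0; i < n; i++) {
        int v = path[i].var;
        if (v < 0 || v >= MAX_VARS) continue;   /* ignore unknown ids */
        if (path[i].kind == OP_ASSIGN) {
            if (phase[v] == 2) return true;     /* forbidden pattern found */
            phase[v] = 1;
        } else if (phase[v] == 1) {
            phase[v] = 2;
        }
    }
    return false;
}
```

A path such as "assign s, reference s, assign s" is rejected, whereas "reference s, assign s, reference s" is accepted.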

The register/memory array access instruction decomposition unit 25 assigns a write port number to each array assignment instruction in the control flow graph Fcfg and subdivides the array assignment instruction by adding to Fcfg variables corresponding to the data input port and the address input port. It likewise assigns a read port number to each array reference instruction and subdivides the array reference instruction by adding to Fcfg a variable corresponding to the address input port.

The memory port number determination unit 26 determines whether the number of write port numbers assigned to array assignment instructions and the number of read port numbers assigned to array reference instructions by the register/memory array access instruction decomposition unit 25 are each at or below a predetermined threshold. When the unit determines that they are not, the logic circuit generation device 1 can stop generating the logic circuit description.

The static single assignment form conversion unit 27 converts the control flow graph Fcfg into static single assignment (SSA) form, in which each variable it contains is the target of exactly one assignment instruction.

The control flow degeneration conversion unit 28 removes all conditional branch instructions from the control flow graph Fcfg in static single assignment form, thereby generating a control flow degeneration program Fdeg, that is, a program whose control flow has been degenerated.
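The effect of SSA conversion followed by control flow degeneration can be illustrated with a small C sketch. The functions f_branch and f_degenerate and their bodies are hypothetical examples, not taken from the figures; the sketch only assumes that each φ-function is materialized as a selection between the values computed on the two branches, which in hardware corresponds to a multiplexer.

```c
/* Original form: the result depends on a conditional branch. */
int f_branch(int c, int x, int y) {
    int r;
    if (c) {
        r = x + y;
    } else {
        r = x - y;
    }
    return r;
}

/* Degenerated form: every variable is assigned exactly once, both
 * branch bodies are computed unconditionally, and the phi-function
 * r3 = phi(r1, r2) is materialized as a select on the condition. */
int f_degenerate(int c, int x, int y) {
    int r1 = x + y;          /* value from the "then" side */
    int r2 = x - y;          /* value from the "else" side */
    int r3 = c ? r1 : r2;    /* models a 2-input multiplexer */
    return r3;
}
```

Both functions return the same value for every input; the second form has straight-line control flow, so a data flow graph can be built from it directly.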

The data flow graph generation unit 29 generates a data flow graph Fdfg from the control flow degeneration program Fdeg.

The data flow graph optimization unit 30 optimizes the data flow graph Fdfg.

The arithmetic unit circuit delay / circuit scale evaluation unit 31 estimates, based on the data flow graph Fdfg, the circuit delay and circuit scale of the logic circuit to be generated by the logic circuit generation device 1.

The pipeline boundary arrangement unit 32 synthesizes pipeline circuit data Fplc by placing pipeline boundaries in the data flow graph Fdfg.

The RTL description output unit 33 generates and outputs the logic circuit description D based on the pipeline circuit data Fplc. This logic circuit description D is the output of the logic circuit generation device 1.

FIG. 2 shows an exemplary hardware configuration of the logic circuit generation device 1 according to one embodiment of the present invention. The logic circuit generation device 1 can be implemented on a general-purpose computer having a central processing unit (CPU) 11; storage devices such as a RAM 12, a ROM 13, and a hard disk drive (HDD) 14 that store the programs executed by the CPU and associated data; and a bus 10 that interconnects these devices. As needed, the logic circuit generation device 1 may also be connected to a drive device 15 that reads from and writes to an external recording medium such as a CD-ROM or DVD-ROM, an input device 16 such as a keyboard or mouse, an output device 17 such as a CRT, liquid crystal display, or printer, and a communication interface 18 for communicating with other computers or a network.

In one embodiment of the present invention, the logic circuit generation device 1 can receive the program P from the drive device 15, the input device 16, or the communication interface 18. The program P input to the logic circuit generation device 1 can be stored in the RAM 12. The RAM 12 can also store a separate program 4 (not shown) that causes the CPU 11 to execute the functions of the units of the logic circuit generation device 1 shown in FIG. 1. The CPU 11 processes the program P stored in the RAM 12 in accordance with the program 4 and can thereby generate the logic circuit description D. The intermediate data that the CPU 11 generates while processing the program P, such as the control flow graph Fcfg, the control flow degeneration program Fdeg, the data flow graph Fdfg, and the pipeline circuit data Fplc, as well as the finally generated logic circuit description D, can also be stored in the RAM 12. The logic circuit description D stored in the RAM 12 can be output from the drive device 15, the output device 17, or the communication interface 18.

The program P used as input to the logic circuit generation device or logic circuit generation method of the present invention is now described with reference to FIGS. 3 to 7, taking as one embodiment example programs written in the C language. In this specification, the description of the program P, which is used as input to the logic circuit generation device or method of the present invention and contains a behavioral description (the sequence of hardware processing for circuit design), is referred to as the software description of the present invention.

[Data type alias definition]
In the C language, the typedef specifier can be used to define an alias data type for an already defined data type. FIG. 3 shows a description example of data type alias definitions in the software description of the present invention. In this example, the INT10 and INT12 types are alias data types for the int type, and the S_INT12 and M_INT12 types are alias data types for the INT12 type. The ST_A type is an alias data type for a structure having a (char type), b (INT10 type), and c (INT12 type) as member variables.

[Adding variable attributes to data types]
FIG. 4 shows a description example that adds variable attributes to data types in the software description of the present invention. In this example the variable attributes are written in the C source code using #pragma statements, but other embodiments may use other methods, such as reading a file that describes the correspondence between type names and variable attributes, or entering this information through a user interface.

Looking at FIG. 4 in more detail, it shows descriptions that add three kinds of variable attribute to data types: the bit width attribute, the memory attribute, and the register attribute. Specifically, a 10-bit bit width attribute is added to the INT10 type, a 12-bit bit width attribute to the INT12 type, the memory attribute to the M_INT12 type, and the register attribute to the S_INT12 and ST_A types.

Furthermore, because the S_INT12 and M_INT12 types are aliases of the INT12 type, the 12-bit bit width attribute is implicitly added to S_INT12 and M_INT12 as well. Likewise, because the ST_A type is a structure consisting of three member variables, a of char type, b of INT12 type, and c of INT10 type, the register attribute explicitly added to ST_A is implicitly added to the member variables a, b, and c too. This variable attribute description scheme, characterized by combining a means of defining data type aliases with a means of describing attribute information for each data type, allows the various kinds of additional information needed for logic circuit generation to be associated with the software description efficiently and with little extra description.
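The implicit propagation of attributes along alias chains can be sketched in C as follows. This is only an illustration under stated assumptions: the TypeInfo table, the ATTR_* flags, and the functions find_type, resolve_bits, and resolve_attrs are hypothetical names and do not reproduce the #pragma syntax of FIG. 4; the table rows simply encode the aliases and explicit attributes named in the text.

```c
#include <stddef.h>
#include <string.h>

enum { ATTR_NONE = 0, ATTR_REG = 1, ATTR_MEM = 2 };

/* Hypothetical attribute table: one row per type name. */
typedef struct {
    const char *name;  /* type name */
    const char *base;  /* aliased type, or NULL for a built-in type */
    int bits;          /* explicitly attached bit width, 0 if none */
    int attrs;         /* explicitly attached ATTR_* flags */
} TypeInfo;

static const TypeInfo TYPES[] = {
    { "int",     NULL,    0,  ATTR_NONE },
    { "INT10",   "int",   10, ATTR_NONE },
    { "INT12",   "int",   12, ATTR_NONE },
    { "S_INT12", "INT12", 0,  ATTR_REG  },
    { "M_INT12", "INT12", 0,  ATTR_MEM  },
};

static const TypeInfo *find_type(const char *name) {
    if (name == NULL) return NULL;
    for (size_t i = 0; i < sizeof TYPES / sizeof TYPES[0]; i++) {
        if (strcmp(TYPES[i].name, name) == 0) return &TYPES[i];
    }
    return NULL;
}

/* Walk the alias chain until an explicit bit width is found. */
int resolve_bits(const char *name) {
    for (const TypeInfo *t = find_type(name); t != NULL; t = find_type(t->base)) {
        if (t->bits != 0) return t->bits;
    }
    return 0;  /* no bit width attribute anywhere on the chain */
}

/* Collect the attribute flags attached anywhere on the alias chain. */
int resolve_attrs(const char *name) {
    int attrs = ATTR_NONE;
    for (const TypeInfo *t = find_type(name); t != NULL; t = find_type(t->base)) {
        attrs |= t->attrs;
    }
    return attrs;
}
```

With this table, S_INT12 resolves to a 12-bit width plus the register attribute even though only the register attribute was attached to it directly, which mirrors the implicit inheritance described above.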

[Improving design reusability by adding bit width attributes to data types]
FIG. 5 shows a description example of code that calls the same function with arguments having different variable attributes in the software description of the present invention, together with the pipeline circuit diagram corresponding to that program. In this example, the function top0 is the function targeted for logic circuit generation; the 10-bit arguments a and b and the 12-bit argument c are supplied as inputs, and the return value of INT12 type is the output. How inputs and outputs are identified is described later. The local variable d, of S_INT12 type, carries the 12-bit width attribute and the register attribute. The function func0 is called twice inside top0: its arguments are 10 bits wide in the first call and 12 bits wide in the second, so the same function description expresses circuits of different bit widths. Adjusting bit widths in this way in a hardware description language or in SystemC requires explicit bit width specifications, whereas the scheme of the present invention, in which variable attributes are attached to types, lets bit widths be specified implicitly and thus greatly improves design reusability.

[Implicit specification of timing information by adding a register attribute]
In the description example of FIG. 5, the variable d has the register attribute. In general, the register attribute does not change the behavior of the software description in any way; it specifies that a variable carrying it is implemented as a register in the logic circuit. The defining feature of a register in a logic circuit is that its input signal propagates to its output signal only at the instant of the rising edge of the clock signal (that is, the transition from 0 to 1), which is what allows a register to hold the internal state of a sequential circuit as a state variable. Consequently, at least one clock cycle must elapse between assigning a value to a register-attribute variable and being able to read that value back. In the description of FIG. 5, therefore, d becomes readable one clock after the return value of the first func0 call is assigned to it, and the result of the second func0 call, which uses that value, is output one clock later. In this way, adding a register attribute to a software description, which has no notion of time, implicitly adds the timing information that is essential to logic circuit operation. Because the variable c is input at the same time as a and b, the argument c must be synchronized with the argument d of the second func0 call; the register needed for this is inserted automatically during logic circuit generation, as described later. Adding register attributes thus makes it possible to adjust the timing of logic circuit operation freely without changing the software description.
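The one-clock delay introduced by the register attribute can be made concrete with a small C model. This is a hedged sketch: FIG. 5 is not reproduced here, so the body of func0 (a plain addition), the struct Top0State, and the functions top0_software and top0_clock are assumptions introduced for illustration, and bit widths are not modeled.

```c
/* Hypothetical body: the text does not say what func0 computes. */
static int func0(int x, int y) { return x + y; }

/* Software view: no notion of time, d is an ordinary variable. */
int top0_software(int a, int b, int c) {
    int d = func0(a, b);
    return func0(d, c);
}

/* Circuit view: d is a register, and c passes through an
 * automatically inserted register so that it stays aligned with d. */
typedef struct {
    int d_reg;  /* register for d */
    int c_reg;  /* synchronization register for c */
} Top0State;

/* Models one rising clock edge: returns the output visible during
 * this cycle, then latches the new register inputs. */
int top0_clock(Top0State *s, int a, int b, int c) {
    int out = func0(s->d_reg, s->c_reg);  /* second stage reads register outputs */
    s->d_reg = func0(a, b);               /* first stage result is latched */
    s->c_reg = c;                         /* c is delayed by one clock */
    return out;
}
```

Feeding inputs (a, b, c) on one cycle yields top0_software(a, b, c) on the following cycle, so the circuit computes the same function as the software description with a latency of one clock.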

[Describing a sequential circuit using static variables]
FIG. 6 shows a description example of the state transitions of a sequential circuit using a static variable in the software description of the present invention, together with the state transition diagram of that sequential circuit. In this example, the static variable stt is given a 2-bit width attribute and the register attribute and is used as the state variable of the sequential circuit. To express the behavior of a sequential circuit's state variable in a software description, the value of the state variable must be retained after each execution of the function top1, so the state variable must be declared as a static variable. Adding the register attribute to this state variable specifies that it is implemented as a register. The initialization to 0 in the declaration of stt is performed only once, before the program starts (in the logic circuit, at reset), and each time top1 is executed the value of stt is updated according to the value of the input a. The value of stt assigned to the variable prev_stt and the value of stt tested in the switch statement are both the value before stt is updated. In this way, the state transitions of a sequential circuit, one of the most important building blocks of a logic circuit, can be described very simply with a static variable in the software description.
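A description in this style might look like the following C sketch. FIG. 6 is not reproduced here, so the transition rule, the return value, and the signature of top1 are assumptions made purely for illustration; only the use of a static variable stt, the copy into prev_stt, and the switch on stt come from the text.

```c
/* Hypothetical state machine in the style described for FIG. 6.
 * Assumed rule: the state advances 0 -> 1 -> 2 -> 3 while a is
 * nonzero and returns to 0 otherwise. */
int top1(int a) {
    static int stt = 0;   /* 2-bit register-attribute state; reset value 0 */
    int prev_stt = stt;   /* reads the value held before this update */

    switch (stt) {        /* the tested value is also the pre-update one */
    case 0:  stt = a ? 1 : 0; break;
    case 1:  stt = a ? 2 : 0; break;
    case 2:  stt = a ? 3 : 0; break;
    default: stt = 0;         break;
    }
    return prev_stt;      /* assumed output: the state before the update */
}
```

Each call corresponds to one clock cycle: the static variable carries the state from one call to the next, just as the register carries it from one clock edge to the next.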

[Description including a memory circuit using memory attribute variables]
FIG. 7 shows an example description of an image-processing line buffer and shift register circuit using memory attribute variables in the software description of the present invention, together with the corresponding pipeline circuit diagram. In this example, the function top2 takes the 8-bit data pin as input and outputs the 3 × 3 array data pw[3][3]. The function generates the neighboring pixel data needed for, e.g., spatial filtering of raster-scan image data, by means of line buffers (buffers that each store one horizontal line of pixel data and act as vertical "delay circuits") and shift registers (horizontal "delay circuits").

More specifically, the line buffers are realized by two memory attribute array variables, lbuf1[1024] and lbuf2[1024], and a 10-bit-wide state variable pos is used as the common index into both arrays. A reference to a memory attribute array, such as lb1 = lbuf1[pos], expresses a memory read, and an assignment to a memory attribute array, such as lbuf1[pos] = pin, expresses a memory write. The shift registers are realized by a series of instructions that shift the state variable array pw[3][3], such as pw[0][2] = pw[0][1]. In contrast, variables without a state attribute, such as lb1 and lb2, are used to represent signals that transfer data instantaneously.
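As a rough illustration of the line buffer and shift register just described, the sketch below models top2 in plain C. FIG. 7 is not reproduced here, so the exact statement order, the wrap-around of pos, and the way lb1 and lb2 feed the rows of pw are assumptions; the identifiers top2, pin, pw, lbuf1, lbuf2, pos, lb1 and lb2 come from the description, while LINE_WIDTH is a name introduced for this example. The memory and register attributes are indicated only in comments.

```c
#include <stdint.h>

#define LINE_WIDTH 1024  /* assumed line length, matching lbuf1[1024] */

/* Hypothetical plain-C model of a FIG. 7 style 3x3 window generator. */
void top2(uint8_t pin, uint8_t pw[3][3])
{
    static uint8_t lbuf1[LINE_WIDTH];  /* memory attribute assumed */
    static uint8_t lbuf2[LINE_WIDTH];  /* memory attribute assumed */
    static uint16_t pos = 0;           /* 10-bit state variable in the patent */

    /* Memory reads: pixels one and two lines above the current one. */
    uint8_t lb1 = lbuf1[pos];
    uint8_t lb2 = lbuf2[pos];

    /* Memory writes at the same address. */
    lbuf1[pos] = pin;
    lbuf2[pos] = lb1;

    /* Horizontal shift of each row of the window. */
    for (int r = 0; r < 3; r++) {
        pw[r][2] = pw[r][1];
        pw[r][1] = pw[r][0];
    }
    pw[0][0] = pin;
    pw[1][0] = lb1;
    pw[2][0] = lb2;

    /* Advance the shared index, wrapping at the end of the line (assumed). */
    pos = (uint16_t)((pos + 1) % LINE_WIDTH);
}
```

Calling top2 once per pixel in raster order leaves the current pixel in pw[0][0], the pixel one line earlier in pw[1][0], and the pixel two lines earlier in pw[2][0]. The model says nothing about clock-level timing.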

Memories come in two kinds, synchronous and asynchronous, depending on whether they are clock-synchronized. Because synchronous memories are now used almost exclusively for the sake of speed, a synchronous memory is assumed here. FIG. 8 shows the software description of the present invention (top) and a prior-art RTL description (bottom) for the line buffer update of FIG. 7. More specifically, FIG. 8 shows the software description of the update of the two line buffers in FIG. 7 (simultaneous read and write at the same address) and an RTL description, in the Verilog hardware description language, of the same operation. In a synchronous memory circuit, read data becomes valid one clock later, so the write to lbuf2 (lbuf2[pos] = lb1) must take place one clock after the read from lbuf1 (lb1 = lbuf1[pos]). Moreover, because a line buffer must read and write the same address at the same time, the read from lbuf2 must also be delayed by one clock to match the write timing. The read data of lbuf2 (lb2 = lbuf2[pos]) therefore becomes valid one further clock after that of lbuf1. These timing offsets also affect the operation timing of the downstream circuits that use the data, and an RTL description must express this timing exactly. In the software description of the present invention, by contrast, only the data flow needs to be expressed, without regard to such detailed timing. That is, the logic circuit description method using a programming language according to the present invention hides the detailed circuit operation timing and allows only the data flow to be expressed. This not only greatly improves the readability of the description but also greatly reduces the sources of design defects, so in this respect as well design productivity is dramatically improved compared with RTL description.

[Specifying the top-level function]
In the software description of the present invention, the part subject to logic circuit generation can be the entire software description, but it can also be determined by specifying the top-level function that is the target of logic circuit generation. In this embodiment, although not shown, the target of logic circuit generation is determined by writing a #pragma statement that designates the top-level function in the software description. In other embodiments, other methods may be used, such as reading a file that contains the name of the top-level function, or specifying the scope of logic circuit generation through a user interface.

The features of the software description, that is, the description of the program P used as input to the logic circuit generation device or logic circuit generation method according to the present invention, have been explained above using examples written in the C language. However, those skilled in the art will readily understand that the program P can be described in the same way in various other software description languages, such as procedural languages other than C, functional languages, or object-oriented languages, as long as the language can express these features.

Next, each part of the logic circuit generation device 1 of this embodiment, shown in FIG. 1, is described in detail with reference to FIGS. 9 to 37.

[Acyclic/non-hierarchical conversion unit 21]
The program P input to the logic circuit generation device 1 is first input to the acyclic/non-hierarchical conversion unit 21. The acyclic/non-hierarchical conversion unit 21 performs acyclic/non-hierarchical conversion on the top-level function Ftop, which is contained in the program P and is the target of logic circuit generation, and outputs the acyclic lowest-layer function Fexp, a function that contains neither loop processing parts nor function call instructions.

More specifically, the acyclic/non-hierarchical conversion unit 21 includes the complete inline expansion unit 211 and the complete loop expansion unit 212 described below. The top-level function Ftop input to the acyclic/non-hierarchical conversion unit 21 is first input to the complete inline expansion unit 211 and converted into a lowest-layer function, that is, a function that contains no function call instructions. The lowest-layer function is then input to the complete loop expansion unit 212 and converted into an acyclic lowest-layer function.

The complete inline expansion unit 211 converts the input function into a lowest-layer function, that is, a function that contains no function call instructions. The process of converting a function into a lowest-layer function is called complete inline expansion. The complete inline expansion unit 211 repeatedly inlines the function call instructions in the input function. In this way, instructions that call non-recursive functions are completely expanded. In addition, while repeating the inlining of function calls, the complete inline expansion unit 211 performs optimization by constant propagation as needed. As a result, even for a recursive function, an instruction that calls it is completely expanded provided that an upper limit on the number of recursive calls can be determined.

If function call instructions are still not completely expanded after inlining has been repeated a predetermined number of times, the complete inline expansion unit 211 can judge that the function cannot be converted into a lowest-layer function and abort the processing. In that case, the logic circuit generation device 1 can, for example, output a warning message to the output device 17 and stop the logic circuit generation processing.

With reference to FIGS. 9 to 12, the process of completely inlining a function is described using a concrete example. FIG. 9 shows an example description of a recursive function call. In this example, the function funcB is recursive because it contains an instruction that calls funcB itself. The function funcA contains an instruction that calls the recursive function funcB. Consider the process of completely inlining funcA in this example. FIG. 10 shows the function funcA that results from inlining the function call funcB(4) contained in funcA. Here, the variable funcB_0 is introduced to hold the return value of the call funcB(4). FIG. 11 shows the result of applying optimization by constant propagation to the function funcA of FIG. 10. Specifically, in FIG. 10 the variable a is assigned the constant 4 and is then referenced unchanged in two places, so these references to a are replaced by the constant 4. As a result, the part
{ int a = 4; funcB_0 = (a == 1) ? 1 : funcB(a - 1) + 1; }
is converted to
{ funcB_0 = (4 == 1) ? 1 : funcB(4 - 1) + 1; }
In the conditional expression on the right-hand side of this assignment,
(4 == 1) ? 1 : funcB(4 - 1) + 1
the condition (4 == 1) evaluates to false, so the conditional expression is simplified to its third operand, namely
funcB(4 - 1) + 1
and the expression 4 - 1 is further simplified to the constant 3. The result is the function funcA of FIG. 11. Inlining of the instruction that calls funcB and optimization by constant propagation are then applied repeatedly to the resulting funcA until the condition a == 1, under which no recursive call occurs, is satisfied. As a result, as shown in FIG. 12, a function that contains no function call instructions, that is, a lowest-layer function, is obtained.
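To make the end result of this procedure concrete, the sketch below places a recursive form next to a call-free form in plain C. FIGS. 9 and 12 are not reproduced here: the body of funcB is reconstructed from the expressions quoted above, while the signatures, the name funcA_inlined, and the temporaries funcB_1 to funcB_3 are assumptions made for this example (only funcB_0 is named in the description).

```c
/* Recursive form, reconstructed from the expressions quoted in the text. */
int funcB(int a)
{
    return (a == 1) ? 1 : funcB(a - 1) + 1;
}

int funcA(void)
{
    return funcB(4);
}

/* A possible call-free equivalent after repeated inlining and constant
 * propagation. Variable names other than funcB_0 are hypothetical. */
int funcA_inlined(void)
{
    int funcB_0, funcB_1, funcB_2, funcB_3;

    funcB_3 = 1;            /* funcB(1): a == 1 holds, recursion ends */
    funcB_2 = funcB_3 + 1;  /* funcB(2) */
    funcB_1 = funcB_2 + 1;  /* funcB(3) */
    funcB_0 = funcB_1 + 1;  /* funcB(4) */
    return funcB_0;
}
```

Because the argument 4 is a constant, the recursion depth is known at conversion time, which is what allows every call to be eliminated.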

As described above, the complete inline expansion unit 211 converts the input function into a lowest-layer function, that is, a function that contains no function call instructions. If the input function contains no function call instructions, the complete inline expansion unit 211 outputs the input function unchanged.

The complete loop expansion unit 212 converts the input lowest-layer function into an acyclic lowest-layer function that contains no loop processing parts. This is done by expanding every loop processing part contained in the function, and this process is called complete loop expansion.

With reference to FIG. 13, the process of converting a lowest-layer function into an acyclic lowest-layer function is described using a concrete example. The function funcD in FIG. 13(a) contains a loop processing part, and the loop is repeated three times. In the first iteration, the value of the variable i is the constant 0. That is, the first iteration of this loop processing part is expanded to
i = 0; a += a * i + 1;
and applying constant propagation to this yields
a += a * 0 + 1;
Similarly, the second and third iterations yield
a += a * 1 + 1;
a += a * 2 + 1;
respectively. As a result, as shown in FIG. 13(b), a function funcD that contains no loop processing part is obtained.
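The loop expansion above can be checked with a small plain-C sketch. FIG. 13 is not reproduced here, so the signature int funcD(int a), the loop header, and the name funcD_unrolled are assumptions; the loop body and the three expanded statements are the ones quoted in the text.

```c
/* Looped form: three iterations with i = 0, 1, 2 (loop header assumed). */
int funcD(int a)
{
    for (int i = 0; i < 3; i++) {
        a += a * i + 1;
    }
    return a;
}

/* Fully expanded form after constant propagation of i. */
int funcD_unrolled(int a)
{
    a += a * 0 + 1;
    a += a * 1 + 1;
    a += a * 2 + 1;
    return a;
}
```

For example, starting from a = 1 both versions compute 2, then 5, then 16.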

As described above, the complete loop expansion unit 212 converts the input lowest-layer function into an acyclic lowest-layer function that contains no loop processing parts. If the input lowest-layer function contains no loop processing parts, the complete loop expansion unit 212 outputs the input lowest-layer function unchanged.

In another embodiment of the present invention, the logic circuit generation device 1 may omit the acyclic/non-hierarchical conversion unit 21. In that case, the logic circuit generation device 1 takes as input a program P whose top-level function Ftop is already an acyclic lowest-layer function.

[Logic circuit input/output signal extraction unit 22]
The program P, whose top-level function Ftop has been converted into the acyclic lowest-layer function Fexp by the acyclic/non-hierarchical conversion unit 21, is next input to the logic circuit input/output signal extraction unit 22. From the arguments and return value of the input acyclic lowest-layer function Fexp and the global variables contained in the program P, the logic circuit input/output signal extraction unit 22 extracts the input signals and output signals of the logic circuit represented by the logic circuit description D that the logic circuit generation device 1 outputs.

More specifically, the logic circuit input/output signal extraction unit 22 includes the logic circuit input signal extraction unit 221 and the logic circuit output signal extraction unit 222 described below. The logic circuit input signal extraction unit 221 extracts input signals from the arguments of the acyclic lowest-layer function Fexp and the global variables contained in the program P. The logic circuit output signal extraction unit 222 extracts output signals from the arguments and return value of the input acyclic lowest-layer function Fexp and the global variables contained in the program P.

The logic circuit input signal extraction unit 221 extracts input signals from the arguments of the acyclic lowest-layer function Fexp and the global variables contained in the program P. Specifically, it identifies and extracts the items listed in (1) to (3) below as input signals. Here, "argument" means an argument of the acyclic lowest-layer function Fexp, and "function" means the acyclic lowest-layer function Fexp.
(1) Non-pointer-type arguments.
(2) The underlying variables referenced by pointer-type arguments for which, inside the function, there is a value read through the pointer and no value assignment through the pointer.
(3) Global variables that are read and never assigned inside the function.

The logic circuit output signal extraction unit 222 extracts output signals from the arguments and return value of the input acyclic lowest-layer function Fexp and the global variables contained in the program P. Specifically, it identifies and extracts the items listed in (1) to (3) below as output signals. Here, "argument" means an argument of the acyclic lowest-layer function Fexp, and "function" means the acyclic lowest-layer function Fexp.
(1) The return value of the function.
(2) The underlying variables referenced by pointer-type arguments for which, inside the function, there is a value assignment through the pointer.
(3) Global variables that are assigned inside the function.

With reference to FIGS. 5 to 7, the processes by which the logic circuit input signal extraction unit 221 and the logic circuit output signal extraction unit 222 extract input signals and output signals, respectively, are described using concrete examples.
In the function top0 of FIG. 5, the non-pointer-type arguments a, b, and c are identified as input signals, and the expression that computes the return value is identified as the output signal. In this example the return value is the variable e, but in general the return value of a function can be an arbitrary expression rather than a single variable.
In the function top1 of FIG. 6, the non-pointer-type argument a is identified as an input signal, and prev_stt, the expression that computes the return value, is identified as the output signal. The static variable stt inside the function cannot be referenced from outside the function, so it is identified as a signal internal to the logic circuit and is not an input or output signal.
In the function top2 of FIG. 7, the non-pointer-type argument pin is identified as an input signal. The argument pw is of pointer type, and value assignments through the pointer (pw[0][0] = pin, etc.) exist inside the function, so all nine array elements are identified as output signals. The variable that holds the array referenced by the pointer-type argument pw is specified by a function (not shown) that calls top2. It may be a static variable (including a global variable), or it may be a local variable of a higher-level function that calls top2 directly or indirectly.
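The input and output rules (1) to (3) above amount to a small decision procedure, sketched here in plain C. This is only an illustration under assumptions: the type and function names (sym_kind, symbol, classify, SIG_*) are invented for this example, and the is_read and is_written flags are taken as given, whereas a real tool would derive them by analyzing the function body.

```c
/* Hypothetical classification of interface symbols into signals. */
enum sym_kind { SYM_VALUE_ARG, SYM_POINTER_ARG, SYM_GLOBAL, SYM_RETURN };
enum { SIG_NONE = 0, SIG_INPUT = 1, SIG_OUTPUT = 2 };

struct symbol {
    enum sym_kind kind;
    int is_read;     /* value is read inside the function (via the pointer
                        for SYM_POINTER_ARG) */
    int is_written;  /* value is assigned inside the function (via the
                        pointer for SYM_POINTER_ARG) */
};

int classify(const struct symbol *s)
{
    switch (s->kind) {
    case SYM_VALUE_ARG:          /* input rule (1) */
        return SIG_INPUT;
    case SYM_RETURN:             /* output rule (1) */
        return SIG_OUTPUT;
    case SYM_POINTER_ARG:        /* input rule (2), output rule (2) */
    case SYM_GLOBAL:             /* input rule (3), output rule (3) */
        if (s->is_written)
            return SIG_OUTPUT;   /* any assignment makes it an output */
        if (s->is_read)
            return SIG_INPUT;    /* read only, never assigned */
        return SIG_NONE;         /* not accessed at all */
    }
    return SIG_NONE;
}
```

Under this sketch, pin in top2 is a SYM_VALUE_ARG and therefore an input, while pw is a SYM_POINTER_ARG with is_written set and therefore an output.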

[Control flow graph generation unit 23]
The control flow graph generation unit 23 generates the control flow graph Fcfg from the acyclic lowest-layer function Fexp obtained by the acyclic/non-hierarchical conversion unit 21.
In general, the control flow graph of a function is a directed graph that represents the paths taken when the function is executed. The nodes of a control flow graph are basic blocks, a start block, and an end block. A basic block is a sequential instruction sequence that has no branch target other than its first instruction and no branch instruction other than its last instruction. The start block is the node corresponding to the function entry, where execution of the function begins. The end block is the node corresponding to the function exit, where execution of the function ends. The edges of a control flow graph are directed edges that represent direct transitions between nodes during execution of the function.
Generating a control flow graph from a function is a known technique, so its specific method is not described here. The control flow graph Fcfg generated by the control flow graph generation unit 23 from the acyclic lowest-layer function Fexp contains no loop structure (that is, the graph has no path that passes through the same node more than once). However, if the top-level function Ftop contains conditional branch instructions, the control flow graph Fcfg contains conditional branches of the path (and the corresponding merges).

[State variable instruction dependency determination unit 24]
For the control flow graph Fcfg generated by the control flow graph generation unit 23, the state variable instruction dependency determination unit 24 determines whether logic circuit generation is possible according to logic circuit generation possibility determination rules 1 and 2 described below. If the state variable instruction dependency determination unit 24 determines, under either rule, that logic circuit generation is not possible, the logic circuit generation device 1 can, for example, output a warning message to the output device 17 and stop the logic circuit generation processing.

Logic circuit generation possibility determination rule 1 states that, for each assignment instruction to an array variable, logic circuit generation is not possible if the array index is a non-constant variable and the array variable has neither a register attribute nor a memory attribute. If the control flow graph Fcfg contains such an assignment, the state variable instruction dependency determination unit 24 determines that logic circuit generation is not possible.

Logic circuit generation possibility determination rule 2 states that, for each variable X that is a register/memory array variable (an array variable having a register attribute or a memory attribute), logic circuit generation is not possible if there exist instructions A, B, and C such that there is a read-after-write dependency on X from instruction A to instruction B and a write-after-read dependency on X from instruction B to instruction C. In that case, the assignment to X by instruction A, the reference to X by instruction B, and the assignment to X by instruction C would be executed in succession, and such a software description cannot be realized as a circuit operation within one clock, so logic circuit generation is judged not to be possible. If the control flow graph Fcfg contains instructions in this relationship, the state variable instruction dependency determination unit 24 determines that logic circuit generation is not possible.
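The write, read, write pattern that rule 2 rejects can be illustrated with a deliberately simplified checker in plain C. The simplifications are assumptions of this sketch, not part of the patent: it looks at a single execution path only, it treats every access to the array X as if it touched the same element, and it encodes the accesses as a string of 'W' (assignment to X) and 'R' (reference to X). The function name violates_rule2 is hypothetical.

```c
/* Returns 1 if the access trace of one array variable X along one path
 * contains a write, a later read, and a still later write (A, B, C in the
 * text); returns 0 otherwise. Characters other than 'W'/'R' are ignored. */
int violates_rule2(const char *trace)
{
    int seen_write = 0;             /* some instruction A has written X */
    int seen_read_after_write = 0;  /* some instruction B read X after A */

    for (; *trace != '\0'; trace++) {
        if (*trace == 'W') {
            if (seen_read_after_write)
                return 1;           /* this write plays the role of C */
            seen_write = 1;
        } else if (*trace == 'R') {
            if (seen_write)
                seen_read_after_write = 1;
        }
    }
    return 0;
}
```

For instance, a body such as x[i] = v; t = x[i]; x[i] = t + 1; gives the trace "WRW" and is flagged, whereas the line buffer update lb1 = lbuf1[pos]; lbuf1[pos] = pin; gives "RW" for lbuf1 and is accepted.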

In another embodiment of the present invention, the logic circuit generation device 1 may omit the state variable instruction dependency determination unit 24. In either case, the program P that the logic circuit generation device 1 accepts as input is limited to one whose top-level function Ftop is converted into a control flow graph that satisfies logic circuit generation possibility determination rules 1 and 2 above (that is, one for which neither rule determines that logic circuit generation is not possible).

[Register/memory array access instruction decomposition unit 25]
To execute multiple read instructions and multiple write instructions on a register/memory array variable (an array variable having a register attribute or a memory attribute) within one clock, the circuit must be realized with a multi-port register file for a register array variable (an array variable having a register attribute) or with a multi-port memory for a memory array variable (an array variable having a memory attribute). By assigning port numbers to array assignment instructions and array reference instructions, the register/memory array access instruction decomposition unit 25 associates the multiple write ports with the array assignment instructions in the software description and the multiple read ports with the array reference instructions in the software description. In addition, the register/memory array access instruction decomposition unit 25 subdivides the array assignment instructions and array reference instructions by adding variables that correspond to the input ports of the register file circuit or memory circuit generated for each register/memory array variable.
More specifically, the register/memory array access instruction decomposition unit 25 of this embodiment includes the write port number assignment unit 251, the read port number assignment unit 252, the array assignment instruction decomposition unit 253, and the array reference instruction decomposition unit 254, which are described below.

The write port number assignment unit 251 assigns a write port number to each array assignment instruction to a register/memory array variable contained in the control flow graph Fcfg, and determines the number of write ports of the register/memory array variable. More specifically, for each array assignment instruction A to a register/memory array variable X contained in the control flow graph Fcfg, the write port number assignment unit 251 counts, on each path from the start block to instruction A in the control flow graph Fcfg, the number of array assignment instructions to the array variable X other than instruction A, and assigns the maximum of these counts to instruction A as its write port number.

The number of distinct write port numbers assigned in this way to the array assignment instructions of the register/memory array variable X is called the number of write ports of the register/memory array variable X. That is, when the write port numbers are the Nw numbers from 0 to Nw−1 for some integer Nw, the number of write ports is Nw. The number of write ports of a register array variable is called the number of register file write ports, and the number of write ports of a memory array variable is called the number of memory write ports.

The read port number assignment unit 252 assigns a read port number to each array reference instruction to a register/memory array variable contained in the control flow graph Fcfg, and determines the number of read ports of the register/memory array variable. More specifically, for each array reference instruction B to a register/memory array variable X contained in the control flow graph Fcfg, the read port number assignment unit 252 counts, on each path from the start block to instruction B in the control flow graph Fcfg, the number of array reference instructions to the array variable X other than instruction B, and assigns the maximum of these counts to instruction B as its read port number.

The number of read port numbers thus assigned to the array reference instructions of the register/memory array variable X is called the number of read ports of X. That is, when the read port numbers are the Nr numbers from 0 to Nr−1 for some integer Nr, the number of read ports is Nr. The number of read ports of a register array variable is called the number of register file read ports, and the number of read ports of a memory array variable is called the number of memory read ports.
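The counting rule used by the write port number assigning unit 251 and the read port number assigning unit 252 can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: it assumes an acyclic control flow graph whose nodes are single instructions stored in topological order (every predecessor has a smaller index), and the names CfgNode, is_access, and assign_port_numbers are hypothetical. For each node it computes the maximum number of accesses of the chosen kind (assignments or references to one array variable X) found on any path from the start node, which is the port number described above.

```c
#include <assert.h>

#define MAX_NODES 32
#define MAX_PREDS 4

/* One instruction-level node of an acyclic control flow graph
 * (hypothetical layout, introduced only for this sketch). */
typedef struct {
    int is_access;         /* 1 if this node is an access of the kind being numbered */
    int num_preds;         /* number of valid entries in preds[] */
    int preds[MAX_PREDS];  /* predecessor indices, each smaller than this node's index */
} CfgNode;

/* Writes port[i] for every access node (non-access nodes get -1) and
 * returns the number of ports, i.e. the largest assigned number plus one. */
int assign_port_numbers(const CfgNode *nodes, int n, int *port)
{
    int before[MAX_NODES]; /* max count of accesses strictly before node i on any path */
    int num_ports = 0;

    assert(n >= 0 && n <= MAX_NODES);
    for (int i = 0; i < n; i++) {
        before[i] = 0;
        for (int k = 0; k < nodes[i].num_preds; k++) {
            int p = nodes[i].preds[k];
            assert(p >= 0 && p < i); /* topological order assumption */
            int via_p = before[p] + (nodes[p].is_access ? 1 : 0);
            if (via_p > before[i]) {
                before[i] = via_p;
            }
        }
        port[i] = nodes[i].is_access ? before[i] : -1;
        if (nodes[i].is_access && before[i] + 1 > num_ports) {
            num_ports = before[i] + 1;
        }
    }
    return num_ports;
}
```

Calling the function once with the array assignment instructions marked and once with the array reference instructions marked yields the write and read port numbers separately. The return value equals the port count because the numbers assigned by this rule run from 0 upward without gaps.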

A register file circuit or memory circuit that implements a register/memory array variable includes write ports, used to write values to the register/memory array variable, and read ports, used to read values from it. A write port includes a data input port and an address input port. The array assignment instruction decomposition unit 253 adds variables corresponding to these two input ports to the control flow graph Fcfg and subdivides the array assignment instructions in Fcfg. A read port includes an address input port. The array reference instruction decomposition unit 254 adds a variable corresponding to this input port to Fcfg and subdivides the array reference instructions in Fcfg. The variables added for these ports are called “register/memory array port variables” or simply “port variables”.

With reference to FIG. 14, the processing of the register/memory array access instruction decomposition unit 25 described above is explained using a specific example. (a1) is the software description of a program containing a function top3, which includes array assignment instructions and array reference instructions for the memory array variable A, and (a2) is the control flow graph of the function top3.

The function top3 includes two array reference instructions, v = A[a] and u = A[b], as shown in lines 8 and 12 of (a1). These appear unchanged in the control flow graph of (a2). In (a2), no array reference instruction for the array variable A exists between the start block top3:start and the array reference instruction v = A[a] (that is, there are zero such instructions), so the read port number assigning unit 252 assigns read port number 0 to v = A[a]. Between the start block top3:start and the array reference instruction u = A[b], there are paths that pass through v = A[a] and paths that do not. The paths from top3:start to u = A[b] therefore contain zero or one array reference instruction, and the maximum count is 1, so the read port number assigning unit 252 assigns read port number 1 to u = A[b]. Since the read port numbers assigned to the array reference instructions for the array variable A are thus 0 and 1, the read port number assigning unit 252 determines that the number of read ports of A is 2.

The function top3 also includes three array assignment instructions, A[a] = v + 1, A[b] = u + 1, and A[0] = v + u, as shown in lines 9, 13, and 16 of (a1). The instructions A[a] = v + 1 and A[b] = u + 1 are handled in the same way as the first and second array reference instructions above: the write port number assigning unit 251 assigns write port number 0 to A[a] = v + 1 and write port number 1 to A[b] = u + 1. Between the start block top3:start and the array assignment instruction A[0] = v + u, there are paths that pass through A[a] = v + 1 and paths that do not, but no path passes through A[b] = u + 1. The paths from top3:start to A[0] = v + u therefore contain zero or one array assignment instruction, and the maximum count is 1, so the write port number assigning unit 251 assigns write port number 1 to this third array assignment instruction. Since the write port numbers assigned to the array assignment instructions for the array variable A are thus 0 and 1, the write port number assigning unit 251 determines that the number of write ports of A is 2.

As described above, write ports 0 and 1 have been assigned to the array assignment instructions of the array variable A, and read ports 0 and 1 to its array reference instructions. The array assignment instruction decomposition unit 253 and the array reference instruction decomposition unit 254 therefore add the following port variables, one for each port, to the control flow graph Fcfg.
Read address variables: A0r (port 0), A1r (port 1)
Write address variables: A0w (port 0), A1w (port 1)
Write data variables: A0d (port 0), A1d (port 1)
Here, a read address variable corresponds to the address input port of a read port, a write address variable corresponds to the address input port of a write port, and a write data variable corresponds to the data input port of a write port.

Further, the array assignment instruction decomposition unit 253 and the array reference instruction decomposition unit 254 generate instructions that assign the array index expression to the write address variable or the read address variable, and instructions that assign the expression computing the value to be stored to the write data variable. In this way, the two units convert the array assignment instructions and array reference instructions in the control flow graph Fcfg into subdivided instructions, as shown in FIG. 16(a). FIG. 14(b1) re-expresses the control flow graph of FIG. 16(a) as a software description to aid understanding.
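The effect of this subdivision on a single reference and a single assignment can be illustrated with the following C sketch. It is an assumption about the general shape of the decomposed code, since FIG. 14(b1) and FIG. 16(a) are not reproduced here: the function names, the array size A_SIZE, and the choice of port 0 for both accesses are hypothetical, while the port variable names A0r, A0w, and A0d follow the naming used in the text.

```c
#include <assert.h>

enum { A_SIZE = 8 }; /* hypothetical array size for this sketch */

/* Original form: one array reference followed by one array assignment. */
int access_original(int A[A_SIZE], int a)
{
    assert(a >= 0 && a < A_SIZE);
    int v = A[a];
    A[a] = v + 1;
    return v;
}

/* Subdivided form: the index and the stored value are first assigned to
 * port variables, so each access names only a port variable. */
int access_decomposed(int A[A_SIZE], int a)
{
    int A0r; /* read address variable of read port 0 */
    int A0w; /* write address variable of write port 0 */
    int A0d; /* write data variable of write port 0 */

    assert(a >= 0 && a < A_SIZE);
    A0r = a;          /* index expression of the reference */
    int v = A[A0r];
    A0w = a;          /* index expression of the assignment */
    A0d = v + 1;      /* expression computing the stored value */
    A[A0w] = A0d;
    return v;
}
```

Both functions leave the array in the same state and return the same value; the second form only makes the address and data inputs of each port explicit as ordinary variables.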

In another embodiment of the present invention, the register/memory array access instruction decomposition unit 25 may omit the write port number assigning unit 251 and the read port number assigning unit 252. In that case, the array assignment instruction decomposition unit 253 and the array reference instruction decomposition unit 254 add one set of port variables for each array variable in the control flow graph Fcfg. In yet another embodiment of the present invention, the logic circuit generation device 1 may omit the register/memory array access instruction decomposition unit 25. In that case, the logic circuit generation device 1 takes as input a program P that contains no register/memory array variables.

[Memory Port Number Determination Unit 26]
The memory port number determination unit 26 determines whether the number of register file write ports, the number of register file read ports, the number of memory write ports, and the number of memory read ports calculated by the register/memory array access instruction decomposition unit 25 each stay within a predetermined threshold. The thresholds can be set in advance, individually for the four kinds of ports, according to technical factors such as circuit characteristics (for example, circuit scale, circuit operating speed, and power consumption) and the semiconductor manufacturing process. If any of these port counts exceeds its threshold, the memory port number determination unit 26 determines that circuit synthesis is not possible. In that case, the logic circuit generation device 1 can, for example, output a warning message to the output device 17 and stop the logic circuit generation process.
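The check performed by the memory port number determination unit 26 amounts to four independent comparisons, as in the following C sketch. The struct PortCounts and the function ports_within_limits are hypothetical names introduced for illustration; the text does not prescribe any particular threshold values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Port counts (or per-kind thresholds) for the four kinds of ports. */
typedef struct {
    int regfile_write;
    int regfile_read;
    int mem_write;
    int mem_read;
} PortCounts;

/* Returns true when no port count exceeds its threshold, i.e. when
 * logic circuit generation may proceed. */
bool ports_within_limits(const PortCounts *counts, const PortCounts *limits)
{
    assert(counts != NULL && limits != NULL);
    return counts->regfile_write <= limits->regfile_write
        && counts->regfile_read  <= limits->regfile_read
        && counts->mem_write     <= limits->mem_write
        && counts->mem_read      <= limits->mem_read;
}
```

A caller would report a warning and stop generation when the function returns false, matching the behavior described above.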

In another embodiment of the present invention, the logic circuit generation device 1 may omit the memory port number determination unit 26. In that case, the logic circuit generation device 1 performs logic circuit generation regardless of the number of ports.

[Static Single Assignment Format Conversion Unit 27]
When the memory port number determination unit 26 determines that a logic circuit can be generated from the control flow graph Fcfg, Fcfg is next input to the static single assignment format conversion unit 27. The static single assignment format conversion unit 27 converts Fcfg into a control flow graph in static single assignment form.

In general, static single assignment form is one of the intermediate representations used by software compilers, in which each variable has exactly one definition of its value (that is, a value is assigned to each variable at exactly one place). Static single assignment form is obtained by renaming variables and by adding instructions called φ function instructions at the control flow locations where multiple value definitions of the same variable merge.

More specifically, the static single assignment format conversion unit 27 includes a φ function instruction insertion unit 271, a variable name conversion unit 272, and a state variable name reconversion unit 273, which are described next.

The φ function instruction insertion unit 271 inserts a φ function instruction at each location in the control flow graph where multiple value definitions of the same variable merge. A φ function instruction selects, from all the variable definitions (definitions of a value for a variable) that merge at that location, the one on the path that was actually executed. The value of the variable definition selected by the φ function instruction is assigned to that same variable. The instantiation of φ function instructions (their conversion into concrete operation instructions) is performed by the control flow degeneration conversion unit 28 described later.

The variable name conversion unit 272 gives each definition of a variable's value, whether by a variable assignment instruction or by a φ function instruction in the control flow graph, a unique subscripted variable name. Further, for each instruction in the control flow graph that references variables, the variable name conversion unit 272 replaces the name of each referenced variable with the subscripted variable name of the value definition (by a variable assignment instruction or a φ function instruction) that reaches that instruction. This processing converts the control flow graph into static single assignment form.
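The renaming step can be sketched in C for the simplest case, a straight-line instruction sequence with no merges, so that no φ function instructions are needed. This is an illustrative simplification rather than the unit's full procedure: the Instr layout, the integer variable ids, and the function rename_straight_line are all hypothetical. The caller supplies the subscript already in effect for each variable at entry (for example 1 for a function argument or global variable, 0 for a local variable not yet defined).

```c
#include <assert.h>

#define NUM_VARS 3 /* hypothetical number of distinct source variables */

/* One instruction of the form dst = src[0] op src[1]. */
typedef struct {
    int dst;        /* id of the variable being defined */
    int src[2];     /* ids of referenced variables, or -1 for a constant operand */
    int dst_ver;    /* output: subscript given to this definition */
    int src_ver[2]; /* output: subscript of the reaching definition, or -1 */
} Instr;

/* Renames a straight-line sequence. version[v] holds the subscript of the
 * definition of v that currently reaches the next instruction. */
void rename_straight_line(Instr *code, int n, int version[NUM_VARS])
{
    for (int i = 0; i < n; i++) {
        /* References are renamed first, so "b = b + 1" reads the old b. */
        for (int k = 0; k < 2; k++) {
            int s = code[i].src[k];
            assert(s >= -1 && s < NUM_VARS);
            code[i].src_ver[k] = (s >= 0) ? version[s] : -1;
        }
        assert(code[i].dst >= 0 && code[i].dst < NUM_VARS);
        /* Each definition receives a fresh subscript for its variable. */
        code[i].dst_ver = ++version[code[i].dst];
    }
}
```

With variables a and b starting at subscript 1 and c at 0, the sequence c = b - 1; b = b + 1; a = b + 1 is renamed to c1 = b1 - 1; b2 = b1 + 1; a2 = b2 + 1. Handling merges additionally requires the φ function instructions inserted by the φ function instruction insertion unit 271.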

As additional processing on the control flow graph converted into static single assignment form, the state variable name reconversion unit 273 replaces, for each static variable (including global variables), the subscripted variable name assigned to the value definition that reaches the end block, which corresponds to the function exit, with the subscripted variable name used in the start block, which corresponds to the function entry. This makes the values of static and global variables at the function exit available at the function entry in the next execution of the function.

With reference to FIG. 15, the processing by which the static single assignment format conversion unit 27 converts a control flow graph Fcfg involving merges of multiple variable definitions into static single assignment form is explained using a specific example. FIG. 15(b) is a control flow graph representing the control flow of the top-level function top4 contained in the program shown in FIG. 15(a). First, as shown in FIG. 15(c), the φ function instruction insertion unit 271 inserts φ function instructions at the head of the basic block where the definitions of the variables a, b, and c merge. Specifically, for the function argument variable a, the value definition at the function entry, given as the value of the function argument, and the value definition by the assignment instruction a = b + 1 merge at the location where the φ function instruction is inserted. For the global variable b, three definitions merge there: the value definition at the function entry, given as the value of the global variable, the value definition by the assignment instruction b = b + 1, and the value definition by the assignment instruction b = a. For the local variable c, three definitions merge there: the value definitions by the assignment instructions c = b - 1, c = c + 1, and c = 0.

Next, as shown in FIG. 15(d), the variable name conversion unit 272 gives each definition of a variable's value, whether by a variable assignment instruction or by a φ function instruction, a unique subscripted variable name. For example, the variable name c1 is the subscripted variable name assigned to the value definition by one assignment instruction to the variable c, and the variable name c2 is the subscripted variable name assigned to the value definition by another assignment instruction to the same variable c. Although not shown in the figure, the variable name conversion unit 272 also treats the values of function argument variables and static variables (including global variables) as being defined in the start block top4:start of the control flow graph, which corresponds to the entry of the function top4, and assigns subscripted variable names to these variables as well. Specifically, it assigns the subscripted variable name a1 to the definition of the value of the variable a given as the function argument. For the global variable b, it regards the value at the time of function entry as a definition and assigns the subscripted variable name b1 to that definition.

Further, as shown in FIG. 15(e), for each instruction that references variables, the variable name conversion unit 272 replaces the name of each referenced variable with the subscripted variable name of the value definition (by a variable assignment instruction or a φ function instruction) that reaches that instruction. Specifically, the instruction written c1 = b - 1 in FIG. 15(d) references the global variable b. As described above, the definition of the value of b that reaches this instruction has been assigned the subscripted variable name b1, so the variable name b in this instruction is replaced with b1, as shown in FIG. 15(e). The same applies to the other instructions that reference variables.

The processing so far converts the control flow graph into static single assignment form, and the state variable name reconversion unit 273 then performs additional processing on the resulting control flow graph. That is, as shown in FIG. 15(f), for each static variable (including global variables), it replaces the subscripted variable name assigned to the value definition that reaches the end block top4:end, which corresponds to the exit of the function top4, with the subscripted variable name used in the start block top4:start, which corresponds to the entry of the function top4.

FIG. 16 shows the conversion of the control flow graph Fcfg into static single assignment form using another specific example. FIG. 16(a) is the control flow graph of the function top3 described by the software description shown in FIG. 14(b). FIG. 16(b) is the result of the static single assignment format conversion unit 27 converting the control flow graph of FIG. 16(a) into static single assignment form. The conversion method is as described above with reference to FIG. 15. The function top3 in this example contains register/memory array access instructions, and the register/memory array access instructions in the control flow graph Fcfg have been subdivided, using register/memory array port variables, by the register/memory array access instruction decomposition unit 25 described above. The static single assignment format conversion unit 27 renames the register/memory array port variables in the same way as other variables. However, it does not rename the register/memory array variables themselves.

In another embodiment of the present invention, the logic circuit generation device 1 may omit the static single assignment format conversion unit 27. In that case, the logic circuit generation device 1 takes as input a program P containing a top-level function Ftop written as a software description in static single assignment form.

[Control Flow Degeneration Conversion Unit 28]
The control flow graph Fcfg converted into static single assignment form by the static single assignment format conversion unit 27 is next input to the control flow degeneration conversion unit 28. As described earlier, the control flow graph Fcfg contains no loop structures but may contain conditional branches of paths (and the accompanying merges), so it is not suited for use as-is as data representing a logic circuit. However, the control flow graph Fcfg input to the control flow degeneration conversion unit 28 has been converted into static single assignment form, which contains only one assignment to each variable. Therefore, once the φ function instructions added to Fcfg are instantiated, the program behaves correctly even if the conditional branches are ignored and the instructions of all nodes of the control flow graph are executed. This is a very important fact for the technique of generating a logic circuit from a software description.

The control flow degeneration conversion unit 28 therefore instantiates the φ function instructions in the input control flow graph Fcfg and removes all conditional branches from Fcfg, thereby generating a control flow degenerate program Fdeg, that is, a program whose control flow has been degenerated. Compared with the control flow graph Fcfg containing conditional branches, the control flow degenerate program Fdeg represents the structure of the logic circuit more directly.

More specifically, the control flow degeneration conversion unit 28 in the present embodiment includes a φ function instruction instantiation unit 281, a basic block unification unit 282, and a register/memory array access instruction fusion unit 283, which are described next.

The φ function instruction instantiation unit 281 instantiates the φ function instructions in the input control flow graph Fcfg (that is, converts each φ function instruction into a concrete operation instruction). As stated in the description of the static single assignment format conversion unit 27, a φ function instruction is inserted at a location in the control flow graph where multiple variable definitions merge, and selects, from all the variable definitions that merge at that location, the one on the path that was actually executed. The selection is made based on the condition values of the conditional branches of the paths. For example, suppose that two paths branch on the truth value of a condition p, that a variable definition a1 lies on the path executed when p is true, that a variable definition a2 lies on the path executed when p is false, and that a φ function instruction φ(a1, a2) selecting one of these definitions lies after the point where the two paths merge. The φ function instruction φ(a1, a2) is then instantiated so that it selects a1 when the condition p is true and a2 otherwise (that is, when p is false). More specifically, φ(a1, a2) is instantiated as the expression p ? a1 : a2.
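The two-way case can be written out in C as follows. The functions with_branch and degenerated, and the concrete expressions x + 1 and x - 1, are hypothetical examples chosen only to show the shape of the transformation: both definitions are computed unconditionally, and the instantiated φ function instruction picks one of them with the conditional operator.

```c
/* Branching form: exactly one of the two definitions of a is executed. */
int with_branch(int p, int x)
{
    int a;
    if (p) {
        a = x + 1; /* definition on the path taken when p is true */
    } else {
        a = x - 1; /* definition on the path taken when p is false */
    }
    return a * 2;
}

/* Degenerated form: both definitions are executed, and the instantiated
 * phi function phi(a1, a2) becomes the selection p ? a1 : a2. */
int degenerated(int p, int x)
{
    int a1 = x + 1;
    int a2 = x - 1;
    int a3 = p ? a1 : a2;
    return a3 * 2;
}
```

Because each subscripted name is assigned only once, computing the definition on the path not taken has no effect on the result, which is why the branch can be dropped. In hardware terms the conditional operator corresponds to a multiplexer controlled by p.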

The basic block unification unit 282 removes all conditional branches from the control flow graph Fcfg in which the φ function instructions have been instantiated by the φ function instruction instantiation unit 281, thereby generating the control flow degenerate program Fdeg, a program whose control flow has been degenerated.

In general, a control flow graph that contains no conditional branches has a single path, and all of its instructions are executed one after another in sequence. All basic blocks can therefore be merged into one basic block by joining the instructions (other than branch instructions) of all basic blocks into a single instruction sequence. A program represented by a control flow graph whose basic blocks have been merged into one in this way is called a program whose control flow has been degenerated.

As described above, the control flow graph Fcfg with instantiated φ function instructions that is input to the basic block unification unit 282 behaves correctly even if the conditional branches are ignored and the instructions of all nodes are executed. The basic block unification unit 282 therefore joins the instructions (other than branch instructions) of all basic blocks of the input control flow graph Fcfg into a single instruction sequence, merging all basic blocks into a single basic block and generating a program whose control flow has been degenerated. The order of instructions in the merged basic block follows the execution order of the original basic blocks. For example, consider the case where, in the input control flow graph Fcfg, paths branch from basic block B1 to basic blocks B2 and B3 and then merge at basic block B4. On the path through B1 and B2, B1 is executed before B2, so in the merged basic block the instructions from B1 are placed before the instructions from B2. The same applies to the relationship between B1 and B3, between B2 and B4, and between B3 and B4. However, no path passes through both B2 and B3, so the instructions from B2 may be placed either before or after the instructions from B3. As a result, the order of this part of the merged basic block is either B1→B2→B3→B4 or B1→B3→B2→B4. In either case, the instructions from B2 and the instructions from B3, which were executed only mutually exclusively in the input control flow graph Fcfg, are both executed in the merged basic block.
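An ordering that respects these constraints is a topological order of the acyclic control flow graph. The following C sketch uses a simple form of Kahn's algorithm to produce one; the text does not say which ordering procedure the basic block unification unit 282 uses, so this is only one possible choice, and the adjacency-matrix representation and the name linearize_blocks are hypothetical.

```c
#define MAX_BLOCKS 16

/* Computes an order of basic blocks 0..n-1 in which every block appears
 * after all of its predecessors. succ[u][v] != 0 means there is an edge
 * from block u to block v. Returns n on success, or -1 if the graph has
 * a cycle (which a loop-free control flow graph should not have). */
int linearize_blocks(int n, int succ[MAX_BLOCKS][MAX_BLOCKS], int *order)
{
    int indeg[MAX_BLOCKS] = {0};  /* number of not-yet-placed predecessors */
    int placed[MAX_BLOCKS] = {0};
    int count = 0;

    for (int u = 0; u < n; u++) {
        for (int v = 0; v < n; v++) {
            if (succ[u][v]) {
                indeg[v]++;
            }
        }
    }
    while (count < n) {
        int picked = -1;
        /* Pick the lowest-numbered block whose predecessors are all placed. */
        for (int v = 0; v < n; v++) {
            if (!placed[v] && indeg[v] == 0) {
                picked = v;
                break;
            }
        }
        if (picked < 0) {
            return -1;
        }
        placed[picked] = 1;
        order[count++] = picked;
        for (int v = 0; v < n; v++) {
            if (succ[picked][v]) {
                indeg[v]--;
            }
        }
    }
    return count;
}
```

For the diamond-shaped example, numbering B1 to B4 as blocks 0 to 3, this tie-breaking rule yields B1→B2→B3→B4; choosing the highest-numbered ready block instead would yield the equally valid B1→B3→B2→B4.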

The register/memory array access instruction fusion unit 283 performs two processes on the control flow degeneration program Fdeg with materialized φ function instructions: a process of fusing register/memory array access instructions that are executed exclusively, and a process of adding an execution condition expression to register/memory array access instructions that are executed conditionally.

The process performed by the control flow degeneration conversion unit 28 will be described using a specific example with reference to FIG. 15(f) and FIG. 17. First, the φ function instruction materialization unit 281 materializes the φ function instructions contained in the control flow graph Fcfg. Based on the condition values of the conditional branches on the paths, the φ function instructions contained in the control flow graph of FIG. 15(f) operate as follows. Here, the condition values are denoted P1 = (a1==0) and P2 = (a1<0).
φ(a1,a2) selects a2 if P1 is true, and a1 if P1 is false.
φ(b1,b2,b3) selects b2 if P1 is true, b3 if P1 is false and P2 is true, and b1 if P1 is false and P2 is false.
φ(c1,c2,c3) selects c2 if P1 is true, c3 if P1 is false and P2 is true, and c1 if P1 is false and P2 is false.
Expressing these as concrete expressions, the φ function instructions are materialized as follows.
φ(a1,a2) = (P1) ? a2 : a1;
S_b3_b1 = (P2) ? b3 : b1;
φ(b1,b2,b3) = (P1) ? b2 : S_b3_b1;
S_c3_c1 = (P2) ? c3 : c1;
φ(c1,c2,c3) = (P1) ? c2 : S_c3_c1;
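The materialization of a three-input φ function into two nested selections can be sketched in C as follows. This is only an illustrative sketch: the function names phi3 and check_phi3 and the sample operand values are assumptions introduced here and do not appear in the figures.

```c
#include <assert.h>

/* Hypothetical helper: a three-input phi lowered to two nested selects.
 * p1 and p2 play the roles of P1 and P2; v1..v3 are the phi operands. */
int phi3(int p1, int p2, int v1, int v2, int v3)
{
    int s_v3_v1 = p2 ? v3 : v1;   /* inner select, like S_b3_b1 */
    return p1 ? v2 : s_v3_v1;     /* outer select on P1 */
}

/* Checks the three cases listed in the text for phi(b1,b2,b3). */
void check_phi3(void)
{
    assert(phi3(1, 0, 10, 20, 30) == 20);  /* P1 true            -> b2 */
    assert(phi3(0, 1, 10, 20, 30) == 30);  /* P1 false, P2 true  -> b3 */
    assert(phi3(0, 0, 10, 20, 30) == 10);  /* P1 false, P2 false -> b1 */
}
```

Because both selections are plain expressions, they can be evaluated unconditionally, which is what allows the conditional branches to be dropped afterwards.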

Next, the basic block unification unit 282 concatenates the instructions (other than branch instructions) of all basic blocks contained in the control flow graph Fcfg with materialized φ function instructions into a single instruction sequence. As a result, all conditional branches are removed, and the control flow degeneration program Fdeg shown in FIG. 17 is generated.

As another specific example, the process performed by the control flow degeneration conversion unit 28 will be described with reference to FIG. 16(b) and FIG. 18. FIG. 18(a) shows the control flow degeneration program Fdeg converted from the control flow graph Fcfg of FIG. 16(b), which contains register/memory array access instructions. The specific conversion procedure is as described with reference to FIG. 15(f) and FIG. 17. The register/memory array access instruction fusion unit 283 fuses the exclusively executed register/memory array access instructions in this control flow degeneration program Fdeg. Specifically, the array assignment instruction A[A1w1] = A1d1 shown on line 17 of FIG. 18(a) and the array assignment instruction A[A1w2] = A1d2 shown on line 20 are fused into the single array assignment instruction A[A1w3] = A1d3 shown on line 26 of FIG. 18(b), using the results of materializing the φ function instructions A1w3 = φ(A1w1,A1w2) and A1d3 = φ(A1d1,A1d2).
The register/memory array access instruction fusion unit 283 further adds an execution condition expression to conditionally executed register/memory array access instructions. Specifically, the assignment instruction A1r1 = a1 to the read address port variable of the array reference instruction v2 = A[A1r1] shown on line 5 of FIG. 18(a) is executed when P1 is true, so this array reference instruction is converted to if(P1) v2 = A[A1r1], as shown on line 5 of FIG. 18(b). Similarly, for the array assignment instruction A[A1w1] = A1d1 shown on line 8 of FIG. 18(a), the assignment instruction A1w1 = a1 to the write address port variable and the assignment instruction A1d1 = u2+1 to the write data port variable are both executed when P1 is true, so this array assignment instruction is converted to if(P1) A[A1w1] = A1d1 in FIG. 18(b). In contrast, the write address port variable and the write data port variable of the array assignment instruction A[A1w3] = A1d3 are output by φ function instructions, and these φ function instructions lie on a control flow path that is always executed, so no execution condition is added.

As described above, the control flow degeneration conversion unit 28 generates, from the input control flow graph Fcfg, the control flow degeneration program Fdeg, which is a program whose control flow has been degenerated. In another embodiment, when the logic circuit generation device 1 does not include the φ function instruction insertion unit 271, the control flow graph Fcfg never contains φ function instructions, so the control flow degeneration conversion unit 28 need not include the φ function instruction materialization unit 281. In yet another embodiment, when the logic circuit generation device 1 does not include the register/memory array access instruction decomposition unit 25, the control flow degeneration conversion unit 28 need not include the register/memory array access instruction fusion unit 283.

[Data Flow Graph Generation Unit 29]
The control flow degeneration program Fdeg generated by the control flow degeneration conversion unit 28 is then input to the data flow graph generation unit 29. The data flow graph generation unit 29 generates the data flow graph Fdfg from the sequential instruction sequence, free of control flow branches and merges, contained in the input control flow degeneration program Fdeg. In general, a data flow graph is a directed graph in which each instruction of a sequential instruction sequence is a node, and a directed edge connects each instruction that defines the value of a variable to each instruction that refers to that definition. Generating a data flow graph from a sequential instruction sequence without control flow branches and merges can be implemented easily using known techniques, so a description of the specific method is omitted here. As a specific example, FIG. 19 shows the data flow graph generated from the control flow degeneration program of FIG. 17. As another specific example, FIG. 20 shows the data flow graph generated from the control flow degeneration program of FIG. 18(b).

[Data Flow Graph Optimization Unit 30]
The data flow graph Fdfg generated by the data flow graph generation unit 29 is then input to the data flow graph optimization unit 30. The data flow graph optimization unit 30 optimizes the input data flow graph Fdfg. More specifically, the data flow graph optimization unit 30 includes a bit width determination unit 301, a redundant comparison operation deletion unit 302, an operation decomposition unit 303, and a redundant operation deletion unit 304, which are described below.

In an RTL description, the bit width of each piece of data in an arithmetic circuit must be stated explicitly, but specifying every bit width by hand is very tedious. The bit width determination unit 301 therefore automatically determines the bit widths of the intermediate data and output data of the arithmetic circuit from the bit width specifications given for the input data and register variables of the logic circuit. In the present embodiment, the interval of values that each piece of data can take is obtained, and the minimum bit width capable of representing any value in that interval is taken as the bit width of that data. The process performed by the bit width determination unit 301 is described in detail below. The following description considers only integer values as data, but values of other types (for example, floating-point numbers) can clearly be handled in the same way. In the present embodiment, negative values are represented in two's complement notation.

In the following description, the minimum and maximum values that data x can take are denoted x.L and x.H, respectively. The interval [x.L, x.H] defined by these two end points is called the data interval of data x. The data interval of data whose only possible value is a constant c is [c, c].

The minimum bit width capable of representing the bit representation of a constant c excluding its sign bit is called the minimum bit width of c and is denoted min_bits(c). If c is non-negative, i.e., c >= 0, min_bits(c) is the smallest integer n satisfying 2ⁿ - 1 >= c. If c is negative, i.e., c < 0, min_bits(c) is the smallest integer n satisfying 2ⁿ >= - c.
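As a minimal sketch, min_bits(c) could be computed in C as shown below. The use of a 64-bit long long and the rewriting of the negative case as 2ⁿ - 1 >= - c - 1 (where - c - 1 equals ~c in two's complement) are assumptions of this sketch, not details given in the description.

```c
#include <assert.h>

/* Smallest n with 2^n - 1 >= c (c >= 0) or 2^n >= -c (c < 0).
 * For c < 0 the condition equals 2^n - 1 >= -c - 1, and -c - 1 is ~c
 * in two's complement, so both cases count the bits of a
 * non-negative magnitude. */
int min_bits(long long c)
{
    unsigned long long m = (unsigned long long)(c >= 0 ? c : ~c);
    int n = 0;
    while (m != 0) {   /* count significant bits of m */
        m >>= 1;
        n++;
    }
    assert(n <= 63);   /* the sign bit is never counted */
    return n;
}
```

For instance, both 127 and -128 need 7 bits besides the sign bit under this definition.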

Using min_bits(c), the minimum bit width bit_width([x.L, x.H]) for representing data x having the data interval [x.L, x.H] is expressed as follows.
When data x can take only non-negative values, i.e., x.L >= 0, x is treated as unsigned data and is represented in unsigned binary notation. In this case, a bit width capable of representing the maximum value of x can represent any value that x can take, so the minimum bit width for representing data x is
bit_width([x.L, x.H]) = min_bits(x.H);
On the other hand, when data x can take negative values, i.e., x.L < 0, x is treated as signed data and is represented in signed binary notation (two's complement for negative values). In this case, the minimum bit width for representing data x, including one bit for the sign, is
bit_width([x.L, x.H]) = max(min_bits(x.L), min_bits(x.H)) + 1;
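The two cases of bit_width can be sketched in C as follows. The Interval struct and its field names lo and hi (standing for x.L and x.H) are assumptions made for this sketch.

```c
#include <assert.h>

/* Hypothetical interval type: lo and hi stand for x.L and x.H. */
typedef struct { long long lo, hi; } Interval;

/* Bits needed for c without its sign bit (see the definition of min_bits). */
int min_bits(long long c)
{
    unsigned long long m = (unsigned long long)(c >= 0 ? c : ~c);
    int n = 0;
    while (m != 0) { m >>= 1; n++; }
    return n;
}

/* Minimum bit width able to represent every value in [x.lo, x.hi]. */
int bit_width(Interval x)
{
    assert(x.lo <= x.hi);
    if (x.lo >= 0)                      /* unsigned data */
        return min_bits(x.hi);
    int a = min_bits(x.lo);             /* signed data */
    int b = min_bits(x.hi);
    return (a > b ? a : b) + 1;         /* +1 for the sign bit */
}
```

With these definitions, the interval [-128, 127] needs 8 bits and [0, 255] also needs 8 bits, the former signed and the latter unsigned.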

First, for the input data and state variables of the logic circuit, the bit width determination unit 301 calculates the range of values that each piece of data can take from its bit width information and data type, and sets that range as the initial data interval.
The initial data interval of n-bit unsigned data x is set to
[x.L, x.H] = [0, 2ⁿ - 1];
The initial data interval of n-bit signed data x is set to
[x.L, x.H] = [ - 2^{n - 1}, 2^{n - 1} - 1];
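A minimal C sketch of the initial data interval follows. The function name initial_interval, the is_signed flag, and the restriction to widths of at most 62 bits (to keep the shifts within a 64-bit long long) are assumptions of this sketch.

```c
#include <assert.h>

/* Hypothetical interval type: lo and hi stand for x.L and x.H. */
typedef struct { long long lo, hi; } Interval;

/* Initial data interval of an n-bit input or state variable. */
Interval initial_interval(int n, int is_signed)
{
    assert(n >= 1 && n <= 62);          /* keeps 1LL << n in range */
    Interval r;
    if (is_signed) {
        r.lo = -(1LL << (n - 1));       /* -2^(n-1)    */
        r.hi = (1LL << (n - 1)) - 1;    /* 2^(n-1) - 1 */
    } else {
        r.lo = 0;
        r.hi = (1LL << n) - 1;          /* 2^n - 1 */
    }
    return r;
}
```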

Next, the bit width determination unit 301 determines the data interval of each operation result based on the kind of operation and the data intervals of its operands. The process of determining the result data interval is described below for each kind of operation.

(1) Data interval of a binary operation
The data interval of the result of a binary operation of the form z = x op y is written
[z.L, z.H] = Range(op, [x.L, x.H], [y.L, y.H]);
The processing described here covers binary operations whose operator op is one of the C language binary operators { +, -, *, /, %, <<, >>, &, |, ^, ==, !=, <, >, <=, >= }, but the same processing can clearly be applied to other binary operations such as exponentiation. The data interval of the result of a binary operation is determined from the operator op and the data intervals [x.L, x.H] and [y.L, y.H] of the operands x and y. The specific processing for determining the result data interval is described below for each kind of binary operation.

(1.1) Data interval of a monotonic binary operation
A binary operation x op y whose operator op is one of { +, -, *, <<, >> } is referred to here as a monotonic binary operation. In this case, the data interval of the result of x op y is obtained from the minimum values x.L, y.L and the maximum values x.H, y.H. More specifically, define
MRange(op, [x.L, x.H], [y.L, y.H]) = [min(z0, z1, z2, z3), max(z0, z1, z2, z3)];
where
z0 = x.L op y.L;
z1 = x.L op y.H;
z2 = x.H op y.L;
z3 = x.H op y.H;
Since this is the data interval of the result of a monotonic binary operation, the result data interval is determined as
Range(op, [x.L, x.H], [y.L, y.H]) = MRange(op, [x.L, x.H], [y.L, y.H]);
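The corner evaluation of MRange can be sketched in C as below. The names apply and mrange, the use of the characters 'L' and 'R' to stand for << and >>, and the assumption that no intermediate result overflows a 64-bit long long are all choices made for this sketch. The shifts are computed by multiplication and floor division so that negative operands do not depend on implementation-defined shift behavior.

```c
#include <assert.h>

/* Hypothetical interval type: lo and hi stand for x.L and x.H. */
typedef struct { long long lo, hi; } Interval;

/* Applies one operator; 'L' and 'R' stand in for << and >>. */
long long apply(char op, long long x, long long y)
{
    switch (op) {
    case '+': return x + y;
    case '-': return x - y;
    case '*': return x * y;
    case 'L':
        assert(y >= 0 && y < 62);
        return x * (1LL << y);                 /* x << y */
    case 'R': {
        assert(y >= 0 && y < 62);
        long long d = 1LL << y, q = x / d;
        return (x % d != 0 && x < 0) ? q - 1 : q;  /* floor(x / 2^y) */
    }
    default:
        assert(0);
        return 0;
    }
}

/* MRange: minimum and maximum over the four corner results z0..z3. */
Interval mrange(char op, Interval x, Interval y)
{
    long long z[4] = {
        apply(op, x.lo, y.lo), apply(op, x.lo, y.hi),
        apply(op, x.hi, y.lo), apply(op, x.hi, y.hi)
    };
    Interval r = { z[0], z[0] };
    for (int i = 1; i < 4; i++) {
        if (z[i] < r.lo) r.lo = z[i];
        if (z[i] > r.hi) r.hi = z[i];
    }
    return r;
}
```

For example, multiplying [-3, 2] by [4, 5] yields the corner products -12, -15, 8, and 10, so the result interval is [-15, 10].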

(1.2) Data interval of division
When the binary operation x op y is a division, i.e., op is /, the cases are further distinguished by the data interval [y.L, y.H] of the divisor y, as described below. The case y == 0 is not considered here.
When y takes only non-negative values or only non-positive values, i.e., y.L >= 0 || y.H <= 0, let [y'.L, y'.H] be the valid divisor interval, that is, the data interval of y with y == 0 removed. Its end points are
y'.L = (y.L == 0) ? 1 : y.L;
y'.H = (y.H == 0) ? - 1 : y.H;
Using these, the result data interval in this case is determined as
Range(/, [x.L, x.H], [y.L, y.H]) = MRange(/, [x.L, x.H], [y'.L, y'.H]);
Otherwise, i.e., when y takes both negative and positive values (y.L < 0 && y.H > 0), the interval of the divisor y is split into a negative range and a positive range, the result data interval is obtained with the divisor restricted to each range, and the two are merged to determine the final result data interval. Specifically, let [L0, H0] be the result data interval with the divisor restricted to the negative range, and [L1, H1] the result data interval with the divisor restricted to the positive range:
[L0, H0] = MRange(/, [x.L, x.H], [y.L, - 1]);
[L1, H1] = MRange(/, [x.L, x.H], [1, y.H]);
The final result data interval in this case is obtained by merging these:
Range(/, [x.L, x.H], [y.L, y.H]) = [min(L0, L1), max(H0, H1)];
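The case analysis for division can be sketched in C as follows. The names div_corners and range_div are assumptions of this sketch, which relies on C integer division truncating toward zero and ignores the overflow of dividing the most negative 64-bit value by -1.

```c
#include <assert.h>

/* Hypothetical interval type: lo and hi stand for x.L and x.H. */
typedef struct { long long lo, hi; } Interval;

/* MRange for '/': corner quotients; y must not contain 0. */
Interval div_corners(Interval x, Interval y)
{
    assert(y.lo > 0 || y.hi < 0);
    long long z[4] = { x.lo / y.lo, x.lo / y.hi, x.hi / y.lo, x.hi / y.hi };
    Interval r = { z[0], z[0] };
    for (int i = 1; i < 4; i++) {
        if (z[i] < r.lo) r.lo = z[i];
        if (z[i] > r.hi) r.hi = z[i];
    }
    return r;
}

/* Range for '/': removes 0 from the divisor interval, splitting it if needed. */
Interval range_div(Interval x, Interval y)
{
    assert(!(y.lo == 0 && y.hi == 0));     /* y == 0 only is not handled */
    if (y.lo >= 0 || y.hi <= 0) {          /* divisor has a single sign */
        Interval v = { y.lo == 0 ? 1 : y.lo, y.hi == 0 ? -1 : y.hi };
        return div_corners(x, v);
    }
    Interval n = div_corners(x, (Interval){ y.lo, -1 });  /* [L0, H0] */
    Interval p = div_corners(x, (Interval){ 1, y.hi });   /* [L1, H1] */
    Interval r = { n.lo < p.lo ? n.lo : p.lo, n.hi > p.hi ? n.hi : p.hi };
    return r;
}
```

As a usage example, dividing [10, 20] by [-2, 4] gives [-20, -5] for the negative divisors and [2, 20] for the positive ones, which merge to [-20, 20].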

(1.3) Data interval of the remainder operation
When the binary operation x op y is a remainder operation, i.e., op is %, the result data interval is determined as follows.
In general, for y != 0, when x is negative, the minimum value of x % y is - (abs(y) - 1) and the maximum value is 0. When x is non-negative, the minimum value of x % y is 0 and the maximum value is abs(y) - 1. Therefore, by defining
y_abs_max = max(abs(y.L), abs(y.H));
z4 = (x.L < 0) ? - (y_abs_max - 1) : 0;
z5 = (x.H >= 0) ? y_abs_max - 1 : 0;
the data interval of the result of the remainder operation is determined as
Range(%, [x.L, x.H], [y.L, y.H]) = [min(z4, 0), max(z5, 0)];
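The remainder rule can be sketched in C as below. The function name range_rem is an assumption of this sketch, which also assumes the C semantics in which the sign of x % y follows the dividend x, and that the divisor interval is not [0, 0].

```c
#include <assert.h>
#include <stdlib.h>   /* llabs */

/* Hypothetical interval type: lo and hi stand for x.L and x.H. */
typedef struct { long long lo, hi; } Interval;

/* Range for '%': bounded by the largest divisor magnitude minus one. */
Interval range_rem(Interval x, Interval y)
{
    long long a = llabs(y.lo), b = llabs(y.hi);
    long long y_abs_max = a > b ? a : b;
    assert(y_abs_max > 0);                               /* y is not always 0 */
    long long z4 = (x.lo < 0) ? -(y_abs_max - 1) : 0;    /* negative dividends */
    long long z5 = (x.hi >= 0) ? y_abs_max - 1 : 0;      /* non-negative dividends */
    Interval r = { z4 < 0 ? z4 : 0, z5 > 0 ? z5 : 0 };
    return r;
}
```

Note that the resulting interval always contains 0, so it can be wider than the exact set of remainders.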

(1.4) Data interval of a logical operation
The following describes the process of determining the result data interval when the binary operation x op y is a bitwise logical operation, i.e., when the operator op is one of { &, |, ^ }. In this description, the term "logical operation" by itself means a bitwise logical operation.

When computing the data interval of a logical operation, unlike the case of an arithmetic operation, the concept of two's complement does not apply. That is, since -1 and 0 are 111…1 and 000…0 in binary representation, respectively, an interval containing both -1 and 0 is discontinuous in binary representation. An interval that contains both negative and non-negative values, and is therefore discontinuous in binary representation, is called a binary discontinuous interval. Conversely, an interval containing only negative values, or only non-negative values, is called a binary continuous interval.

For two integers x and y, each represented in two's complement notation, let unsigned_max(x, y) denote the larger and unsigned_min(x, y) the smaller of the two when they are reinterpreted and compared as unsigned integers. Defined concretely by expressions,
unsigned_max(x, y) = ((unsigned) x > (unsigned) y) ? x : y;
unsigned_min(x, y) = ((unsigned) x < (unsigned) y) ? x : y;
Stated differently: when x and y have the same sign, i.e., ((x < 0) == (y < 0)), an ordinary comparison gives
unsigned_max(x, y) = (x > y) ? x : y;
unsigned_min(x, y) = (x < y) ? x : y;
On the other hand, when x and y have different signs, i.e., ((x < 0) != (y < 0)), reinterpreting them as unsigned integers makes the reinterpreted negative value larger than the reinterpreted non-negative value, so
unsigned_max(x, y) = (x < 0) ? x : y;
unsigned_min(x, y) = (x >= 0) ? x : y;
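The two equivalent definitions can be sketched and cross-checked in C as follows. The 64-bit types and the names unsigned_max_by_sign, unsigned_min_by_sign, and check_agree are assumptions of this sketch.

```c
#include <assert.h>

/* Definition by reinterpreting both operands as unsigned integers. */
long long unsigned_max(long long x, long long y)
{
    return ((unsigned long long)x > (unsigned long long)y) ? x : y;
}

long long unsigned_min(long long x, long long y)
{
    return ((unsigned long long)x < (unsigned long long)y) ? x : y;
}

/* Definition by sign cases, as in the alternative formulation. */
long long unsigned_max_by_sign(long long x, long long y)
{
    if ((x < 0) == (y < 0))
        return (x > y) ? x : y;     /* same sign: ordinary comparison */
    return (x < 0) ? x : y;         /* the negative value wins */
}

long long unsigned_min_by_sign(long long x, long long y)
{
    if ((x < 0) == (y < 0))
        return (x < y) ? x : y;
    return (x >= 0) ? x : y;        /* the non-negative value wins */
}

/* Asserts that both formulations agree for one pair of values. */
void check_agree(long long x, long long y)
{
    assert(unsigned_max(x, y) == unsigned_max_by_sign(x, y));
    assert(unsigned_min(x, y) == unsigned_min_by_sign(x, y));
}
```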

First, the case in which both operand data intervals of a logical operation are binary continuous intervals is described. That is, in a logical operation of the form z = x op y, both [x.L, x.H] and [y.L, y.H] are binary continuous intervals. The result data interval [z.L, z.H] in this case is written
[z.L, z.H] = C_Range(op, [x.L, x.H], [y.L, y.H]);
Specifically, it is obtained by the following expressions.
b = max(min_bits(x.L), min_bits(x.H), min_bits(y.L), min_bits(y.H));
x_sign = (x.L < 0); (since the interval is binary continuous, (x.L < 0) == (x.H < 0))
y_sign = (y.L < 0); (since the interval is binary continuous, (y.L < 0) == (y.H < 0))
min_range = (x_sign op y_sign) ? - 2^b : 0;
max_range = (x_sign op y_sign) ? - 1 : 2^b - 1;
C_Range(&, [x.L, x.H], [y.L, y.H]) = [min_range, unsigned_min(x.H, y.H)];
C_Range(|, [x.L, x.H], [y.L, y.H]) = [unsigned_max(x.L, y.L), max_range];
C_Range(^, [x.L, x.H], [y.L, y.H]) = [min_range, max_range];
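The C_Range expressions can be sketched in C as follows. The Interval struct, the helper names umin, umax, and c_range, the 64-bit types, and the limit b < 62 are assumptions of this sketch.

```c
#include <assert.h>

/* Hypothetical interval type: lo and hi stand for x.L and x.H. */
typedef struct { long long lo, hi; } Interval;

int min_bits(long long c)
{
    unsigned long long m = (unsigned long long)(c >= 0 ? c : ~c);
    int n = 0;
    while (m != 0) { m >>= 1; n++; }
    return n;
}

long long umin(long long x, long long y)   /* unsigned_min */
{ return ((unsigned long long)x < (unsigned long long)y) ? x : y; }

long long umax(long long x, long long y)   /* unsigned_max */
{ return ((unsigned long long)x > (unsigned long long)y) ? x : y; }

/* C_Range for op in { '&', '|', '^' }; both inputs must be binary continuous. */
Interval c_range(char op, Interval x, Interval y)
{
    assert((x.lo < 0) == (x.hi < 0) && (y.lo < 0) == (y.hi < 0));
    int w[4] = { min_bits(x.lo), min_bits(x.hi), min_bits(y.lo), min_bits(y.hi) };
    int b = w[0];
    for (int i = 1; i < 4; i++)
        if (w[i] > b) b = w[i];
    assert(b < 62);
    int xs = x.lo < 0, ys = y.lo < 0;
    /* sign of the result: the operator applied to the two sign bits */
    int neg = (op == '&') ? (xs & ys) : (op == '|') ? (xs | ys) : (xs ^ ys);
    Interval r = { neg ? -(1LL << b) : 0,          /* min_range */
                   neg ? -1 : (1LL << b) - 1 };    /* max_range */
    if (op == '&') r.hi = umin(x.hi, y.hi);
    if (op == '|') r.lo = umax(x.lo, y.lo);
    return r;
}
```

For example, [3, 12] | [1, 5] uses b = 4 and yields [3, 15], since an OR result is never below either operand and never needs more than b bits.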

In the above expressions, min_range and max_range are used to define the end points of the interval, but depending on the operand data intervals, the actual range of results does not necessarily include the value min_range or max_range. In another embodiment of the present invention, the result data interval may be obtained more precisely by analyzing the values each bit can take given the operand data intervals. Conversely, in yet another embodiment, the processing may be made simpler than in the present embodiment, for example by considering only the bit widths.

Next, the general case of a logical operation is described. For each of the operand data intervals [x.L, x.H] and [y.L, y.H] of an operation x op y whose operator op is one of { &, |, ^ }, if the interval is a binary discontinuous interval, it is split into two binary continuous intervals, the result data interval is computed for each binary continuous interval, and the results are merged to determine the final result data interval. Expressed with concrete expressions, this is as follows. First, it is determined whether each of the data intervals [x.L, x.H] and [y.L, y.H] is a binary discontinuous interval.
x_neg_pos = (x.L < 0 && x.H >= 0); ([x.L, x.H] is a binary discontinuous interval)
y_neg_pos = (y.L < 0 && y.H >= 0); ([y.L, y.H] is a binary discontinuous interval)
When x_neg_pos == 0 && y_neg_pos == 0, i.e., both intervals are binary continuous intervals, the result is obtained as described above:
Range(op, [x.L, x.H], [y.L, y.H]) = C_Range(op, [x.L, x.H], [y.L, y.H]);
When x_neg_pos == 0 && y_neg_pos == 1, the binary discontinuous data interval [y.L, y.H] is split, giving
[L0, H0] = C_Range(op, [x.L, x.H], [y.L, - 1]);
[L1, H1] = C_Range(op, [x.L, x.H], [0, y.H]);
Range(op, [x.L, x.H], [y.L, y.H]) = [min(L0, L1), max(H0, H1)];
When x_neg_pos == 1 && y_neg_pos == 0, the binary discontinuous data interval [x.L, x.H] is split, giving
[L2, H2] = C_Range(op, [x.L, - 1], [y.L, y.H]);
[L3, H3] = C_Range(op, [0, x.H], [y.L, y.H]);
Range(op, [x.L, x.H], [y.L, y.H]) = [min(L2, L3), max(H2, H3)];
When x_neg_pos == 1 && y_neg_pos == 1, both data intervals [x.L, x.H] and [y.L, y.H] are split, giving
[L4, H4] = C_Range(op, [x.L, - 1], [y.L, - 1]);
[L5, H5] = C_Range(op, [x.L, - 1], [0, y.H]);
[L6, H6] = C_Range(op, [0, x.H], [y.L, - 1]);
[L7, H7] = C_Range(op, [0, x.H], [0, y.H]);
Range(op, [x.L, x.H], [y.L, y.H]) = [min(L4, L5, L6, L7), max(H4, H5, H6, H7)];
In this way, the data interval of the result of a logical operation is determined in all cases.
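The four cases above can be folded into a single loop over the sign-uniform pieces of each operand, as in the following C sketch. The names split and range_logic, the Interval struct, and the loop formulation itself are assumptions of this sketch. It computes the same merged interval as the case-by-case expressions.

```c
#include <assert.h>

/* Hypothetical interval type: lo and hi stand for x.L and x.H. */
typedef struct { long long lo, hi; } Interval;

int min_bits(long long c)
{
    unsigned long long m = (unsigned long long)(c >= 0 ? c : ~c);
    int n = 0;
    while (m != 0) { m >>= 1; n++; }
    return n;
}

long long umin(long long x, long long y)
{ return ((unsigned long long)x < (unsigned long long)y) ? x : y; }

long long umax(long long x, long long y)
{ return ((unsigned long long)x > (unsigned long long)y) ? x : y; }

/* C_Range for binary continuous operands, op in { '&', '|', '^' }. */
Interval c_range(char op, Interval x, Interval y)
{
    assert((x.lo < 0) == (x.hi < 0) && (y.lo < 0) == (y.hi < 0));
    int w[4] = { min_bits(x.lo), min_bits(x.hi), min_bits(y.lo), min_bits(y.hi) };
    int b = w[0];
    for (int i = 1; i < 4; i++)
        if (w[i] > b) b = w[i];
    assert(b < 62);
    int xs = x.lo < 0, ys = y.lo < 0;
    int neg = (op == '&') ? (xs & ys) : (op == '|') ? (xs | ys) : (xs ^ ys);
    Interval r = { neg ? -(1LL << b) : 0, neg ? -1 : (1LL << b) - 1 };
    if (op == '&') r.hi = umin(x.hi, y.hi);
    if (op == '|') r.lo = umax(x.lo, y.lo);
    return r;
}

/* Splits a binary discontinuous interval at -1 | 0; returns the piece count. */
int split(Interval v, Interval out[2])
{
    if (v.lo < 0 && v.hi >= 0) {
        out[0] = (Interval){ v.lo, -1 };
        out[1] = (Interval){ 0, v.hi };
        return 2;
    }
    out[0] = v;
    return 1;
}

/* Range for logical operators: merge C_Range over all piece pairs. */
Interval range_logic(char op, Interval x, Interval y)
{
    Interval xp[2], yp[2];
    int nx = split(x, xp), ny = split(y, yp);
    Interval r = c_range(op, xp[0], yp[0]);
    for (int i = 0; i < nx; i++)
        for (int j = 0; j < ny; j++) {
            Interval p = c_range(op, xp[i], yp[j]);
            if (p.lo < r.lo) r.lo = p.lo;
            if (p.hi > r.hi) r.hi = p.hi;
        }
    return r;
}
```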

(1.5) Data interval of a comparison operation
When the binary operation x op y is a comparison operation, i.e., when the operator op is one of { ==, !=, <, >, <=, >= }, the result data interval is one of [0, 0], [1, 1], and [0, 1]. The result of a comparison operation may become the constant 0 or 1 through ordinary constant propagation. It may also be evaluated to a constant by data interval determination, as described later.
When the result of the comparison operation is not a constant, the result data interval is [0, 1]. That is, it is determined as
Range(op, [x.L, x.H], [y.L, y.H]) = [0, 1];

By (1.1) to (1.5) above, the data interval of the result of a binary operation is determined.

(2) Data interval of a unary operation
Next, the process of determining the data interval of the result of a unary operation is described.
The data interval of the result of a unary operation of the form u = op x is written
[u.L, u.H] = Range(op, [x.L, x.H]);
The processing described here covers unary operations whose operator op is one of the C language unary operators { !, ~, -, + }.
The data interval of the result of a unary operation is determined from the operator op and the data interval [x.L, x.H] of the operand x as follows.
Range(!, [x.L, x.H]) = [0, 1]; (logical negation)
Range(~, [x.L, x.H]) = [~x.H, ~x.L]; (bit inversion)
Range(-, [x.L, x.H]) = [-x.H, -x.L]; (unary minus)
Range(+, [x.L, x.H]) = [x.L, x.H]; (unary plus)
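The four unary rules can be sketched in C as follows. The function name range_unary and the Interval struct are assumptions of this sketch, and negating the most negative 64-bit value is not handled.

```c
#include <assert.h>

/* Hypothetical interval type: lo and hi stand for x.L and x.H. */
typedef struct { long long lo, hi; } Interval;

/* Range for the unary operators '!', '~', '-', and '+'. */
Interval range_unary(char op, Interval x)
{
    assert(x.lo <= x.hi);
    Interval r = x;                        /* '+' leaves the interval as is */
    switch (op) {
    case '!': r.lo = 0;     r.hi = 1;     break;  /* logical negation */
    case '~': r.lo = ~x.hi; r.hi = ~x.lo; break;  /* end points swap  */
    case '-': r.lo = -x.hi; r.hi = -x.lo; break;  /* end points swap  */
    case '+': break;
    default:  assert(0);
    }
    return r;
}
```

Bit inversion and negation are both order-reversing, which is why the upper end point of the operand produces the lower end point of the result.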

(3) Data interval of a selection operation
Finally, the process of determining the data interval of the result of a selection operation (ternary operation) of the form s = t ? x : y is described. The data interval of the result of a selection operation is determined by the relationship between the conditional expression t and the selected terms x and y. When the conditional expression t is a constant, the selection operation is simplified to a plain assignment by the redundant comparison operation deletion unit 302, as described later, so the result data interval need not be considered here. The following therefore describes the case in which it is not evident that the conditional expression t is a constant.

(3.1) When the conditional expression t is a magnitude comparison (<, >, <=, >=) between a variable w and a constant
The data interval [w.L, w.H] of w is divided into the interval in which the conditional expression t is true and the interval in which it is false. The data intervals of the selected terms x and y are each evaluated on the assumption that the w appearing in them is confined to the corresponding divided interval, and the two resulting data intervals are then merged to determine the data interval of the final result. Depending on the form of t, one of the following processes (a) to (d) is selected to evaluate the data intervals of x and y. Here, c denotes a constant value.
(a) When t has the form w < c, the data interval of the second term x is evaluated with the data interval of w taken as [w.L, c - 1], and the data interval of the third term y is evaluated with the data interval of w taken as [c, w.H].
(b) When t has the form w >= c, t can be written as !(w < c). Conversely to (a), the data interval of the second term x is therefore evaluated with the data interval of w taken as [c, w.H], and the data interval of the third term y is evaluated with the data interval of w taken as [w.L, c - 1].
(c) When t has the form w <= c, setting c' = c + 1 allows t to be written as w < c', so the data intervals of x and y are evaluated according to (a) using the form w < c'.
(d) When t has the form w > c, setting c' = c + 1 allows t to be written as w >= c', so the data intervals of x and y are evaluated according to (b) using the form w >= c'.
The data interval of the result of the selection operation is determined by merging the data intervals of x and y evaluated in (a) to (d). That is, when the evaluated data interval of x is [x'.L, x'.H] and the evaluated data interval of y is [y'.L, y'.H], the data interval of the final result is determined as
[s.L, s.H] = [min(x'.L, y'.L), max(x'.H, y'.H)];

The determination of the data interval of a selection operation whose conditional expression t is a magnitude comparison between a variable w and a constant is illustrated below with specific examples. As a first example, consider the data interval [s1.L, s1.H] of the result of the expression
s1 = (w > 127) ? 127 : w;
In this expression, the second term 127 is selected when w > 127, and the third term w is selected when w <= 127. Because the conditional expression w > 127 is assumed not to be a constant value, w.L <= 127 && w.H > 127 holds. The conditional expression w > 127 has the form w > c for a constant value c, so case (d) applies. The data interval [w.L, w.H] of w is therefore divided into [w.L, 127] and [128, w.H]; the former is applied to the third term w and the latter to the second term 127. As a result, the data interval of the second term 127 is [127, 127] and the data interval of the third term w is [w.L, 127]. Merging these intervals, the data interval of the result is determined as
[s1.L, s1.H] = [min(w.L, 127), 127] = [w.L, 127];

As a second example, consider the data interval [s2.L, s2.H] of the result of the expression
s2 = (w < 0) ? -w : w
In this expression, the second term -w is selected when w < 0, and the third term w is selected when w >= 0. Because the conditional expression w < 0 is assumed not to be a constant value, w.L < 0 && w.H >= 0 holds. The conditional expression w < 0 has the form w < c for a constant value c, so case (a) applies. The data interval [w.L, w.H] of w is therefore divided into [w.L, -1] and [0, w.H]; the former is applied to the second term -w and the latter to the third term w. As a result, the data interval of the second term -w is Range(-, [w.L, -1]) = [1, -w.L], and the data interval of the third term w is [0, w.H]. Merging these intervals, the data interval of the result is determined as
[s2.L, s2.H] = [0, max(-w.L, w.H)]
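Rules (a) to (d) and the two worked examples can be sketched in C as follows. This is a minimal sketch under stated assumptions: the names Interval, CmpOp, TermEval, range_select, term_w, term_neg_w and term_127 are all hypothetical, each selected term is modeled as a callback that maps the interval of w to the interval of the term, and the condition is assumed not to be constant (w.lo <= c' - 1 and c' <= w.hi after normalization).

```c
/* Closed integer interval [lo, hi]; all names in this sketch are hypothetical. */
typedef struct { long lo; long hi; } Interval;
typedef enum { CMP_LT, CMP_GE, CMP_LE, CMP_GT } CmpOp;

/* A selected term, modeled as "interval of w" -> "interval of the term". */
typedef Interval (*TermEval)(Interval w);

/* Data interval of s = (w op c) ? x : y, following cases (a) to (d). */
Interval range_select(CmpOp op, long c, Interval w, TermEval x, TermEval y)
{
    /* Cases (c) and (d): rewrite <= and > as < and >= with c' = c + 1. */
    if (op == CMP_LE) { op = CMP_LT; c += 1; }
    if (op == CMP_GT) { op = CMP_GE; c += 1; }

    Interval below = { w.lo, c - 1 };    /* part of w where w <  c holds */
    Interval above = { c, w.hi };        /* part of w where w >= c holds */

    /* Case (a): x sees "below", y sees "above"; case (b) is the reverse. */
    Interval xr = (op == CMP_LT) ? x(below) : x(above);
    Interval yr = (op == CMP_LT) ? y(above) : y(below);

    /* Merge the two evaluated intervals. */
    Interval s;
    s.lo = (xr.lo < yr.lo) ? xr.lo : yr.lo;
    s.hi = (xr.hi > yr.hi) ? xr.hi : yr.hi;
    return s;
}

/* Example terms: w itself, -w, and the constant 127. */
Interval term_w(Interval w)     { return w; }
Interval term_neg_w(Interval w) { return (Interval){ -w.hi, -w.lo }; }
Interval term_127(Interval w)   { (void)w; return (Interval){ 127, 127 }; }
```

With w in [0, 255], range_select(CMP_GT, 127, w, term_127, term_w) reproduces the first example and yields [0, 127]; with w in [-100, 50], range_select(CMP_LT, 0, w, term_neg_w, term_w) reproduces the second example and yields [0, 100].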

(3.2) When the conditional expression t is not a magnitude comparison between a variable w and a constant
The data interval of the selection operation is determined by simply merging the data intervals of the selected terms x and y. That is,
[s.L, s.H] = [min(x.L, y.L), max(x.H, y.H)];

The data interval of the result of a selection operation is thus determined by (3.1) and (3.2). In the present embodiment, whenever the conditional expression t is not a magnitude comparison between a variable w and a constant, the data intervals of the selected terms are simply merged. In another embodiment of the present invention, the relationship between the conditional expression and the selected terms may be analyzed further so that the data interval of the result is obtained more precisely. Conversely, in yet another embodiment, the processing may be made simpler than in the present embodiment, for example by always merging the data intervals of the selected terms without considering the conditional expression.

Through the processing described in (1) to (3) above, the bit width determination unit 301 determines the data intervals of the intermediate data and output data of the arithmetic circuit. The bit width of each item of data is defined as the minimum bit width that can represent every value within its data interval.
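One way to derive a bit width from a data interval is sketched below. The function name bit_width is hypothetical, and the encoding is an assumption not stated in this passage: an interval with no negative values is treated as unsigned, and any other interval as two's-complement signed.

```c
/* Minimum number of bits that can represent every value in [lo, hi].
 * Assumption (not stated in the text): non-negative intervals use an
 * unsigned encoding, all others a two's-complement signed encoding.
 * Intended for intervals well inside the range of long. */
int bit_width(long lo, long hi)
{
    int n = 1;
    if (lo >= 0) {
        /* Unsigned: smallest n with hi < 2^n. */
        while ((hi >> n) != 0)
            n++;
        return n;
    }
    /* Signed: smallest n with -2^(n-1) <= lo and hi <= 2^(n-1) - 1. */
    while (lo < -(1L << (n - 1)) || hi > (1L << (n - 1)) - 1)
        n++;
    return n;
}
```

Under these assumptions, the interval [0, 255] needs 8 bits, [0, 1020] needs 10 bits, and [-510, 510] needs 10 bits.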

In another embodiment of the present invention, the data flow graph optimization unit 30 may be configured without the bit width determination unit 301. In that case, the logic circuit generation device 1 may determine the bit widths of the intermediate data and output data of the arithmetic circuit from the types of the data or, when the software description of the program P input to the logic circuit generation device 1 specifies the bit width of the data, may use the specified bit width as is.

Once the data interval of each item of data has been determined, it may become evident that the result of a comparison operation always evaluates to a constant value. A comparison operation whose result is a constant value is called a redundant comparison operation. Based on the data interval of each item of data determined by the bit width determination unit 301, the redundant comparison operation deletion unit 302 identifies redundant comparison operations, replaces them with constant values, and then performs further optimization by constant propagation.
For example, when the interval of data x is [x.L, x.H] = [0, 63] and the interval of data y is [y.L, y.H] = [64, 200], the expression x < y always evaluates to 1, so the expression x < y is replaced with the constant value 1.
As another example, when the interval of data x is [x.L, x.H] = [0, 63], the expression x < 0 always evaluates to 0, so the expression x < 0 is replaced with the constant value 0.
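The test for a redundant x < y comparison can be sketched as follows. The names Interval and fold_less_than are hypothetical, and the return convention (-1 meaning "not redundant") is an assumption made for this illustration only.

```c
/* Closed integer interval [lo, hi]; names in this sketch are hypothetical. */
typedef struct { long lo; long hi; } Interval;

/* Try to fold the comparison x < y using only the data intervals.
 * Returns 1 if x < y always holds, 0 if it never holds,
 * and -1 if the result depends on the actual values (not redundant). */
int fold_less_than(Interval x, Interval y)
{
    if (x.hi < y.lo)
        return 1;     /* every x is below every y */
    if (x.lo >= y.hi)
        return 0;     /* every x is at or above every y */
    return -1;
}
```

A constant on either side is modeled as a degenerate interval, so x < 0 becomes fold_less_than(x, (Interval){ 0, 0 }).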

Referring to FIG. 21, the data interval determination performed by the bit width determination unit 301 and the processing of the redundant comparison operation deletion unit 302 are described using a specific example. For convenience of illustration and explanation, the program in this example is expressed as a software description in the C language, but what the bit width determination unit 301 and the redundant comparison operation deletion unit 302 actually process is a data flow graph. The description should be read with the understanding that the data flow graph corresponding to the software description is the actual object of processing.

In this example, the function funcF includes clipping of the variables cc and dd (specifically, clipping to 0 when the value is negative) and clipping of the variable r (specifically, clipping to 255 when the value exceeds 255). The function top6 defines the concrete values of the arrays c[3] and d[3].

First, as the initial data interval, the bit width determination unit 301 sets the data interval of each element of the array x[3] at the entry of the function top6 (written here as xi for convenience), based on its type, to
[xi.L, xi.H] = [0, 255]

The first expression of the function funcF, namely
cc = x[0] * c[0] + x[1] * c[1] + x[2] * c[2];
contains several operations and, when broken down, is expressed as
tmp0 = x[0] * c[0];
tmp1 = x[1] * c[1];
tmp2 = x[2] * c[2];
tmp3 = tmp0 + tmp1;
cc = tmp3 + tmp2;
For these expressions, the bit width determination unit 301 propagates the elements of the array c[3] = {-1, 2, -1} as constants and, following the processing described above, determines the data interval of each item of data as
[tmp0.L, tmp0.H] = [-255, 0]
[tmp1.L, tmp1.H] = [0, 510]
[tmp2.L, tmp2.H] = [-255, 0]
[tmp3.L, tmp3.H] = [-255, 510]
[cc.L, cc.H] = [-510, 510]

Similarly, the second expression of the function funcF, namely
dd = x[0] * d[0] + x[1] * d[1] + x[2] * d[2];
is broken down and expressed as
tmp4 = x[0] * d[0];
tmp5 = x[1] * d[1];
tmp6 = x[2] * d[2];
tmp7 = tmp4 + tmp5;
dd = tmp7 + tmp6;
The bit width determination unit 301 propagates the elements of the array d[3] = {1, 2, 1} as constants and determines the data interval of each item of data as
[tmp4.L, tmp4.H] = [0, 255]
[tmp5.L, tmp5.H] = [0, 510]
[tmp6.L, tmp6.H] = [0, 255]
[tmp7.L, tmp7.H] = [0, 765]
[dd.L, dd.H] = [0, 1020]

Because the data interval of the variable cc is [cc.L, cc.H] = [-510, 510] as shown above, the conditional expression cc < 0 in the selection operation of the third expression of the function funcF, namely
cc_clip = (cc < 0) ? 0 : cc;
is not a redundant comparison operation. The bit width determination unit 301 therefore determines the data interval of the result of the selection operation, following case (a) of process (3.1) described above, as
[cc_clip.L, cc_clip.H] = [0, 510]

On the other hand, because the data interval of the variable dd is [dd.L, dd.H] = [0, 1020] as shown above, the conditional expression dd < 0 in the selection operation of the fourth expression of the function funcF, namely
dd_clip = (dd < 0) ? 0 : dd;
is a redundant comparison operation that always evaluates to 0. The redundant comparison operation deletion unit 302 therefore simplifies this expression to
dd_clip = dd;
As a result, the bit width determination unit 301 determines the data interval of the result as
[dd_clip.L, dd_clip.H] = [dd.L, dd.H] = [0, 1020]

In the same manner, the bit width determination unit 301 determines the data interval of
r = (cc_clip + dd_clip) >> 2;
as
[r.L, r.H] = [0, 382]
and the data interval of
r_clip = (r > 255) ? 255 : r;
as
[r_clip.L, r_clip.H] = [0, 255]
The data intervals of all intermediate data and output data of the function funcF are thereby determined, and the bit widths are fixed accordingly.
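The interval values in this walkthrough can be re-derived with a small amount of interval arithmetic, sketched below. Everything here is illustrative: the helper names (iv_add, iv_mul_const, iv_shr, iv_clip_low, iv_clip_high, iv_weighted_sum) are hypothetical, and the rules for addition, constant multiplication and right shift are assumed to be the usual endpoint rules, since the patent's own rules for binary operations are given in an earlier part of the description.

```c
/* Closed integer interval [lo, hi]; all names in this sketch are hypothetical. */
typedef struct { long lo; long hi; } Interval;

/* Assumed endpoint rule for addition. */
Interval iv_add(Interval a, Interval b)
{
    return (Interval){ a.lo + b.lo, a.hi + b.hi };
}

/* Multiplication by a constant k; a negative k swaps the bounds. */
Interval iv_mul_const(Interval a, long k)
{
    long p = a.lo * k, q = a.hi * k;
    return (Interval){ p < q ? p : q, p < q ? q : p };
}

/* Right shift by k bits; only meaningful here for non-negative intervals. */
Interval iv_shr(Interval a, int k)
{
    return (Interval){ a.lo >> k, a.hi >> k };
}

/* Interval of (a < c) ? c : a, including the redundant-comparison cases. */
Interval iv_clip_low(Interval a, long c)
{
    if (a.lo >= c) return a;                   /* a < c is always 0 */
    if (a.hi < c)  return (Interval){ c, c };  /* a < c is always 1 */
    return (Interval){ c, a.hi };
}

/* Interval of (a > c) ? c : a, including the redundant-comparison cases. */
Interval iv_clip_high(Interval a, long c)
{
    if (a.hi <= c) return a;                   /* a > c is always 0 */
    if (a.lo > c)  return (Interval){ c, c };  /* a > c is always 1 */
    return (Interval){ a.lo, c };
}

/* Interval of x[0]*k[0] + x[1]*k[1] + x[2]*k[2] when every x[i] lies in x. */
Interval iv_weighted_sum(Interval x, const long k[3])
{
    Interval t = iv_add(iv_mul_const(x, k[0]), iv_mul_const(x, k[1]));
    return iv_add(t, iv_mul_const(x, k[2]));
}
```

Starting from x in [0, 255] with the coefficient arrays {-1, 2, -1} and {1, 2, 1}, chaining these helpers in the order of the expressions of funcF gives [-510, 510] for cc, [0, 1020] for dd, [0, 382] for r and [0, 255] for r_clip, matching the values stated in the text.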

As described above, the redundant comparison operation deletion unit 302 optimizes the data flow graph by removing redundant comparison operations. In another embodiment of the present invention, the data flow graph optimization unit 30 may be configured without the redundant comparison operation deletion unit 302. In that case, the comparison operations in the data flow graph Fdfg are not optimized, so the logic circuit finally output by the logic circuit generation device 1 may be more redundant than in the present embodiment, but the object of the present invention is still achieved.

As additional processing for optimizing the data flow graph Fdfg, the arithmetic decomposition unit 303 converts complex arithmetic operations into simple arithmetic operations that are easy to synthesize into logic. Specifically, as described next, it performs (1) decomposition of constant multiplication into additions and shifts, and (2) decomposition of division into shifts and trial subtractions.

(1) Constant multiplication decomposition
Among the operations contained in the data flow graph Fdfg, for each multiplication in which one operand (that is, the multiplicand or the multiplier) is a constant, the arithmetic decomposition unit 303 replaces the multiplication with a series of operations that shift the other operand to each bit position at which the bit string of the constant is 1 and sum the shifted values. This process is called constant multiplication decomposition. For example, in the expression x * 5 the constant multiplier 5 is 101 in binary, so the expression is replaced with the sum of x shifted left by 2 bits and x itself, that is, (x << 2) + x.
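Constant multiplication decomposition can be sketched as the following C function. The name mul_const_shift_add is hypothetical; in a synthesized circuit the constant is known in advance, so the loop would correspond to a fixed set of shifts and additions rather than to run-time iteration.

```c
/* Multiply x by the constant k using only shifts and additions:
 * for every bit position where k has a 1, add x shifted to that position.
 * Hypothetical helper illustrating constant multiplication decomposition. */
unsigned long mul_const_shift_add(unsigned long x, unsigned long k)
{
    unsigned long sum = 0;
    for (int bit = 0; k != 0; bit++, k >>= 1) {
        if (k & 1UL)
            sum += x << bit;
    }
    return sum;
}
```

For k = 5 (binary 101) the function computes (x << 2) + x, which is the decomposition given in the text.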

(2) Division decomposition
The arithmetic decomposition unit 303 decomposes a division into shifts and trial subtractions by the so-called long-division method. This process is called division decomposition. Long division in binary is the same as ordinary long division in decimal, except that the quotient bit obtained at each subtraction step is 0 or 1 and, correspondingly, the subtrahend is 0 or the divisor. At each subtraction step, therefore, if the minuend is greater than or equal to the divisor, the quotient bit is set to 1 and the subtrahend to the divisor; otherwise the quotient bit is set to 0 and the subtrahend to 0. This combination of comparison and subtraction is known as trial subtraction. Whether the minuend is greater than or equal to the divisor may be determined from the sign of the result of actually subtracting the divisor.

Referring to FIG. 22, the decomposition of the operations a = c / d and b = c % d on 8-bit unsigned data a, b, c, d is described. The upper part of the figure shows, as a specific example, the long division that yields the quotient 12 and the remainder 2 when 254 is divided by 21, in binary and in decimal. In the binary long division, for example, the value 000001111 of n3 is less than the divisor 00010101, so the corresponding quotient bit g3 is 0 and the subtrahend is also 0. The value 000011111 of n4, on the other hand, is greater than or equal to the divisor, so the corresponding quotient bit g4 is 1 and the subtrahend is 00010101. The lower part of the figure expresses the steps of the long division as a series of C-language expressions. That is, the operations a = c / d and b = c % d on 8-bit unsigned data a, b, c, d are in general decomposed into this series of expressions. More generally, the number of trial subtractions depends on the bit width of the data c; the bit width of c determined by the bit width determination unit 301 described above is used here.
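The long-division scheme can be sketched in C as below. This is not the series of expressions shown in FIG. 22, which is not reproduced here, but a loop form of the same shift and trial-subtraction idea; the names DivResult and div_trial_subtract are hypothetical, and the divisor is assumed to be non-zero.

```c
#include <stdint.h>

/* Quotient and remainder of an 8-bit unsigned division (hypothetical type). */
typedef struct { uint8_t quot; uint8_t rem; } DivResult;

/* Divide c by d (d != 0 assumed) using shifts and trial subtractions,
 * one quotient bit per step, most significant bit first. */
DivResult div_trial_subtract(uint8_t c, uint8_t d)
{
    uint16_t n = 0;   /* partial remainder (the minuend of each step) */
    uint8_t q = 0;    /* quotient built up bit by bit */

    for (int i = 7; i >= 0; i--) {
        /* Shift in the next bit of the dividend. */
        n = (uint16_t)((n << 1) | ((c >> i) & 1u));
        q = (uint8_t)(q << 1);
        /* Trial subtraction: subtract the divisor only if it fits. */
        if (n >= d) {
            n = (uint16_t)(n - d);
            q |= 1u;
        }
    }
    return (DivResult){ q, (uint8_t)n };
}
```

For c = 254 and d = 21 the partial remainder reaches 1111 after four steps (too small, quotient bit 0) and 11111 after five (quotient bit 1), and the final result is quotient 12 and remainder 2, in line with the example in the text.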

In another embodiment of the present invention, the arithmetic decomposition unit 303 may perform, in addition to (1) and (2) above, other optimizations of operations that are generally known in the field of software or processing devices. In yet another embodiment of the present invention, the data flow graph optimization unit 30 may be configured without the arithmetic decomposition unit 303. In that case, the operations contained in the data flow graph Fdfg do not receive the optimizations described above, so the logic circuit finally output by the logic circuit generation device 1 may be more redundant than in the present embodiment, but the object of the present invention is still achieved.

As additional processing for optimizing the data flow graph Fdfg, the redundant operation deletion unit 304 performs constant propagation, common subexpression elimination, and dead code elimination. Constant propagation replaces an expression whose result is a constant value with that constant value (that is, it propagates the constant). An expression that consists simply of a single variable is likewise replaced with that variable. Common subexpression elimination merges multiple expressions that evidently evaluate to the same value into a single expression. Dead code elimination deletes variables that are never referenced.

Referring to FIGS. 23 and 24, constant multiplication decomposition by the arithmetic decomposition unit 303 and constant propagation and common subexpression elimination by the redundant operation deletion unit 304 are described using a specific example. In FIG. 23, (a) is the software description of a program P whose top-level function is top5, and (b) is the function top5 after inline expansion by the non-circular / non-hierarchical conversion unit 21 into a non-circular bottom-layer function. In FIG. 24, (a) is the data flow graph of the inline-expanded function top5.

In FIG. 24, (b) is the data flow graph that results from applying constant propagation to (a). Specifically, the variable b0 is replaced with the constant value 3 and the variable b1 with the constant value 7. In addition, the variables a0 and a1, each of which is simply equal to the variable c, are replaced with c. (c) is the data flow graph that results from applying constant multiplication decomposition to (b). Specifically, the expression c * 3 is decomposed into (c << 1) + c, and the expression c * 7 is decomposed into (c << 2) + (c << 1) + c. The data flow graph of (c) is obtained by introducing an intermediate variable (for example, tmp0) for each subexpression (for example, c << 1) of the decomposed expressions and representing each subexpression as a node. (d) is the data flow graph that results from applying common subexpression elimination to (c). Specifically, the variables tmp0 and tmp1 are both the result of the expression c << 1, so the common expression c << 1 is merged into one by replacing references to tmp1 with references to tmp0. (e) is the data flow graph that results from applying common subexpression elimination once more to (d). (f) is the data flow graph that results from applying dead code elimination to (e). Specifically, the nodes tmp1 and tmp3, which are no longer referenced by any other node, are deleted from the data flow graph.
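The combined effect of these passes can be illustrated with a before-and-after pair in C. The code of top5 in FIG. 23 is not reproduced in this passage, so the two functions below are hypothetical stand-ins: products_before computes c * 3 and c * 7 through copies and constants in the way the text describes, and products_after is what remains once constant propagation, constant multiplication decomposition, common subexpression elimination and dead code elimination have been applied. The surviving names tmp0 and tmp2 are chosen for illustration only.

```c
/* Hypothetical "before" form: copies of c and constants held in variables. */
void products_before(unsigned c, unsigned *p3, unsigned *p7)
{
    unsigned a0 = c, a1 = c;   /* plain copies, replaced by c */
    unsigned b0 = 3, b1 = 7;   /* constants, propagated into the products */
    *p3 = a0 * b0;
    *p7 = a1 * b1;
}

/* Hypothetical "after" form: shifts and additions with c << 1 shared. */
void products_after(unsigned c, unsigned *p3, unsigned *p7)
{
    unsigned tmp0 = c << 1;    /* common subexpression used by both results */
    unsigned tmp2 = c << 2;
    *p3 = tmp0 + c;            /* c * 3 = (c << 1) + c */
    *p7 = tmp2 + tmp0 + c;     /* c * 7 = (c << 2) + (c << 1) + c */
}
```

Both functions produce the same outputs for every c, but the second needs no multiplier and computes c << 1 only once.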

As described above, the redundant operation deletion unit 304 performs constant propagation, common subexpression elimination, and dead code elimination as additional processing for optimizing the data flow graph Fdfg. In another embodiment of the present invention, the data flow graph optimization unit 30 may be configured without the redundant operation deletion unit 304.

[Arithmetic unit circuit delay / circuit area evaluation unit 31]
The data flow graph Fdfg optimized by the data flow graph optimization unit 30 is next input to the arithmetic unit circuit delay / circuit area evaluation unit 31. From the input and output bit widths of each operation instruction determined by the bit width determination unit 301 described above and from the type of each operation, the arithmetic unit circuit delay / circuit area evaluation unit 31 estimates the circuit delay (that is, the signal propagation delay time) and the circuit area of the corresponding arithmetic unit. Estimating the circuit delay of the arithmetic units makes it possible to estimate the operating speed (that is, the maximum operating clock frequency) of the logic circuit finally generated by the logic circuit generation device 1. Predicting the circuit area of the arithmetic units with reasonable accuracy also makes it more efficient to tune the software description of the program P input to the logic circuit generation device 1 in order to obtain a logic circuit with a smaller area or one that operates faster. In the present embodiment, as described later, the circuit delay of the arithmetic units evaluated here is also used in the processing that maximizes the operating clock frequency of the logic circuit as a pipeline circuit.

Various approaches are possible for calculating the circuit delay of an arithmetic unit. To calculate the circuit delay with high accuracy, one method applies a logic synthesis tool (a tool that synthesizes a circuit from an RTL description using a circuit library for a specific semiconductor process) directly to each arithmetic unit; another stores the results of logic synthesis performed in advance for several bit widths and predicts the circuit delay at the actual bit width from those results. In the present embodiment, the arithmetic unit circuit delay / circuit area evaluation unit 31 estimates the circuit delay and circuit area of each arithmetic unit from the simple delay model and area model described below. These models make it possible to estimate the circuit delay and circuit area with reasonable accuracy without the support of an external tool such as a logic synthesis tool.

As shown in FIG. 25, in a synchronous logic circuit the maximum operating clock frequency is determined by the maximum signal propagation time of the output signals of the logic circuit. Taking as signal propagation sources the input signals of the whole logic circuit and the register output signals, which change at the instant of the rising clock edge (that is, the transition from 0 to 1), the maximum signal propagation time is the longest of the times required for a signal to propagate from these sources to any other signal.

Signal propagation delay arises from the circuit delay of each logic gate in the logic circuit, that is, the time from a change in the gate's input signal to the resulting change in its output signal. In the present embodiment, these circuit delays are modeled per arithmetic unit. The circuit area is likewise modeled per arithmetic unit. A complex arithmetic unit is expressed hierarchically in terms of simpler arithmetic units, and the delay model and area model of the arithmetic unit are expressed hierarchically in the same way. A delay model and an area model can thus be associated with every arithmetic unit that can be expressed in a software description, so there is no need to rely on an external logic synthesis tool.

In the circuit delay model of an arithmetic unit, the propagation time of each signal is represented at only two points: the MSB (most significant bit) and the LSB (least significant bit) of the signal. This is far faster to compute than calculating the propagation time separately for every bit of the signal. Moreover, because the difference in signal propagation time between the MSB and the LSB is represented, the model reflects the carry propagation structure contained in arithmetic operators (addition, subtraction, multiplication, and so on), and the circuit delay is calculated with high accuracy.

In the following description, the LSB propagation time of a signal S is written S.TL and its MSB propagation time S.TM. For a signal SI that is a source of signal propagation (that is, an input signal of the overall logic circuit or a register output signal), SI.TL = SI.TM = 0. As shown in FIG. 26, the input signal at port i of an arithmetic unit is written S(i), and the output signal via input port i is written SO'(i). Thus, for example, the LSB propagation time at input port i is written S(i).TL. As further shown in FIG. 26, the following four notations are introduced to express the signal propagation delay between each input port and the output port of an arithmetic unit.

L(i) (simple propagation delay) is the propagation delay from the LSB of input port i to the LSB of the output port (LSB propagation delay), or from the MSB of input port i to the MSB of the output port (MSB propagation delay). The model makes the simplifying assumption that the LSB propagation delay and the MSB propagation delay are equal.

C(i) (carry propagation delay) is the signal propagation delay from the LSB of input port i to the MSB of the output port. It is used mainly to express the carry delay of an adder. Arithmetic units for which carry propagation delay is taken into account (that is, C(i) is nonzero) include adders, subtractors (including unary minus), multipliers, and magnitude comparators.

F(i) (input delay synchronization flag) indicates whether the LSB and MSB propagation times of the input are synchronized (that is, aligned to the later of the two): F(i) = 0 means the signal is not synchronized, and F(i) = 1 means it is synchronized. Specifically, for an equality comparator and a logical negation unit, the output is determined only after all input bits have arrived, so F(i) = 1 for every input port i. For a shift unit, the output is determined only after all bits of the second input port, which specifies the shift amount, have arrived, so F(2) = 1 (this second port is numbered port 1 in the shift unit formulas, where it appears as SHF.F(1) = 1). In this model, as described later, the LSB propagation time of a signal is never later than its MSB propagation time, so when a signal is synchronized its LSB propagation time is aligned to its MSB propagation time.

TL(i), which is not shown in the figure, is the LSB propagation time of the input signal after accounting for the synchronization selected by F(i). When F(i) == 0, no synchronization is performed, so TL(i) is S(i).TL. When F(i) == 1, synchronization is performed and the LSB propagation time is aligned to the MSB propagation time, so TL(i) is S(i).TM. As an expression:
TL(i) = (F(i) == 1) ? S(i).TM : S(i).TL

As further shown in FIG. 26, the LSB propagation time SO'(i).TL of the output signal is the LSB propagation time TL(i) of the input signal, after synchronization is taken into account, plus the simple propagation delay L(i) of the arithmetic unit. As an expression:
SO'(i).TL = TL(i) + L(i)

The MSB propagation time SO'(i).TM of the output signal is the larger of two values: the LSB propagation time TL(i) of the input signal plus the carry propagation delay C(i) of the arithmetic unit, and the MSB propagation time S(i).TM of the input signal plus the simple propagation delay L(i) of the arithmetic unit. As an expression:
SO'(i).TM = max(S(i).TM + L(i), TL(i) + C(i))

These expressions show that if, at every port of an arithmetic unit, the LSB propagation time of the input signal is no later than its MSB propagation time, then the LSB propagation time of the output signal is no later than its MSB propagation time. Since every source signal has equal LSB and MSB propagation times, it follows that throughout the logic circuit the LSB propagation time of a signal is never later than its MSB propagation time.
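The step from the per-port expressions to this conclusion can be written out as a short sketch. It uses only the symbols already defined (TL(i), L(i), C(i), S(i), SO'(i)) and assumes, as the induction hypothesis, that S(i).TL ≤ S(i).TM holds for the input signal:

```latex
\begin{align*}
TL(i) &\le S(i).TM
  && \text{because } TL(i) \in \{\, S(i).TL,\ S(i).TM \,\}
     \text{ and } S(i).TL \le S(i).TM \\
SO'(i).TL &= TL(i) + L(i) \;\le\; S(i).TM + L(i) \\
  &\le \max\bigl(S(i).TM + L(i),\; TL(i) + C(i)\bigr) \;=\; SO'(i).TM
\end{align*}
```

The base case is a source signal, for which SI.TL = SI.TM = 0. Taking the maximum over all input ports preserves the inequality, so it carries over to the final output signal of the arithmetic unit.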

As shown in FIG. 27, the MSB propagation time SO.TM of the final output signal SO of an arithmetic unit is the latest of the MSB propagation times SO'(i).TM of the output signal via each input port i, taken over all input ports of the arithmetic unit. The LSB propagation time SO.TL is likewise the latest of the LSB propagation times SO'(i).TL over all input ports. However, when the output signal of the arithmetic unit is 1 bit wide, the LSB and the MSB are the same bit, so the value obtained for SO.TM is also used as SO.TL.

Here, B (output delay synchronization flag) indicates whether the output signal of the arithmetic unit is 1 bit wide: B = 1 for a 1-bit output and B = 0 otherwise. Using B, the MSB propagation time SO.TM and the LSB propagation time SO.TL of the final output signal SO are expressed as:
SO.TM = max{ SO'(i).TM | i ∈ I }
SO.TL = (B == 1) ? SO.TM : max{ SO'(i).TL | i ∈ I }
In these expressions, I is the set of all input ports i of the arithmetic unit.
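The expressions for TL(i), SO'(i).TL, SO'(i).TM, SO.TM, and SO.TL can be combined into one small routine. The following Python sketch is illustrative only: the names `Signal`, `PortModel`, and `propagate`, and the delay values L = 2 and C = 8 used for the adder, are assumptions made for this example and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    tl: int  # LSB propagation time (S.TL)
    tm: int  # MSB propagation time (S.TM)


@dataclass
class PortModel:
    l: int      # simple propagation delay L(i)
    c: int      # carry propagation delay C(i)
    f: int = 0  # input delay synchronization flag F(i)


def propagate(inputs, ports, one_bit_output=False):
    """Return the output Signal of one arithmetic unit (B = one_bit_output)."""
    via_tl, via_tm = [], []
    for s, p in zip(inputs, ports):
        tl = s.tm if p.f == 1 else s.tl           # TL(i)
        via_tl.append(tl + p.l)                   # SO'(i).TL
        via_tm.append(max(s.tm + p.l, tl + p.c))  # SO'(i).TM
    so_tm = max(via_tm)
    so_tl = so_tm if one_bit_output else max(via_tl)
    return Signal(so_tl, so_tm)


# Hypothetical adder model: L = 2 and C = 8 on both ports.
adder = [PortModel(2, 8), PortModel(2, 8)]
source = Signal(0, 0)  # circuit input or register output

first = propagate([source, source], adder)
second = propagate([first, source], adder)  # a second adder fed by the first
```

With these assumed values, `first` has TL = 2 and TM = 8, and `second` has TL = 4 and TM = 10. The two chained adders therefore cost 10 time units rather than 16, because the carry chain of the second adder starts as soon as the LSB of the first result is available; this is the effect that tracking the LSB and MSB separately is meant to capture.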

The circuit delay model and area model described above are now explained more concretely for each type of arithmetic unit. First, the circuit delay and circuit area of the basic logic gates are modeled appropriately for the assumed implementation technology (a semiconductor process or an FPGA: Field Programmable Gate Array). The basic gates modeled here are AND, OR, NOT, exclusive OR (XOR), and the 2-input 1-output multiplexer (MUX). FIG. 28 shows the circuit diagram and operation of the 2-input 1-output multiplexer. Other basic gates, such as NAND or NOR, may be added. The circuit delay model of each basic logic gate G is written G.delay and its circuit area model G.area (for example, the circuit delay of the AND gate is written AND.delay).

Next, models of composite gates built from the basic logic gates are described. Three representative composite gates are considered: the 1-bit full adder (FA), the 1-bit half adder (HA), and the Booth recoder (BR). In the description, circuit delay and circuit area are calculated for an assumed circuit structure in each case, but they may instead be calculated exactly from the circuit library to be used.

The 1-bit full adder (FA) takes three input bits a, b, and c and outputs the bits sum and carry. The relationship between inputs and outputs is:
sum = a ^ b ^ c;
carry = (a & b) | (b & c) | (a & c);
Assuming a circuit structure that implements these expressions directly, the sum output delay FA.s_delay, the carry output delay FA.c_delay, and the circuit area FA.area are expressed as follows.
FA.s_delay = 2 * XOR.delay;
FA.c_delay = AND.delay + OR.delay; (carry propagation delay)
FA.c_area = 3 * AND.area + 2 * OR.area; (carry output circuit area)
FA.area = 2 * XOR.area + FA.c_area;

The 1-bit half adder (HA) takes two input bits a and b and outputs the bits sum and carry. The relationship between inputs and outputs is:
sum = a ^ b;
carry = a & b;
Assuming a circuit structure that implements these expressions directly, the sum output delay HA.s_delay, the carry output delay HA.c_delay, and the circuit area HA.area are expressed as follows.
HA.s_delay = XOR.delay;
HA.c_delay = AND.delay; (carry propagation delay)
HA.c_area = AND.area; (carry output circuit area)
HA.area = XOR.area + HA.c_area;
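As a minimal sketch, the FA and HA models can be evaluated from a table of basic gate parameters. The gate delay and area numbers below are hypothetical placeholders chosen for this example (real values depend on the target semiconductor process or FPGA), and the function names are likewise illustrative.

```python
# Hypothetical basic-gate parameters (arbitrary units), not from the specification.
DELAY = {"AND": 1, "OR": 1, "XOR": 2}
AREA = {"AND": 2, "OR": 2, "XOR": 4}


def full_adder_model(delay, area):
    """FA.s_delay, FA.c_delay, FA.c_area and FA.area as defined above."""
    c_area = 3 * area["AND"] + 2 * area["OR"]
    return {
        "s_delay": 2 * delay["XOR"],
        "c_delay": delay["AND"] + delay["OR"],
        "c_area": c_area,
        "area": 2 * area["XOR"] + c_area,
    }


def half_adder_model(delay, area):
    """HA.s_delay, HA.c_delay, HA.c_area and HA.area as defined above."""
    c_area = area["AND"]
    return {
        "s_delay": delay["XOR"],
        "c_delay": delay["AND"],
        "c_area": c_area,
        "area": area["XOR"] + c_area,
    }


def full_adder(a, b, c):
    """Bit-level behaviour of the FA: returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


FA = full_adder_model(DELAY, AREA)
HA = half_adder_model(DELAY, AREA)
```

Keeping the gate table separate from the composite-gate formulas mirrors the hierarchical modeling described in the text: retargeting to another technology only changes `DELAY` and `AREA`.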

The Booth recoder (BR) is used in a multiplier. It decodes three consecutive bits of input port 1 and outputs 2 times, 1 times, 0 times, -1 times, or -2 times the data of input port 0. As shown in FIG. 29, the assumed circuit structure obtains each output bit as follows: for the 2-bit input (x0, x1) formed from the 1-times and 2-times values of input port 0, two XOR gates produce the -1-times and -2-times values, a 2-input 1-output multiplexer selects between them, and an AND gate produces the 0-times value. The signals sel, neg, and nz in the figure are given by
sel = y_i ^ y_{i-1};
neg = y_{i+1};
nz = ~(y_{i+1} & y_i & y_{i-1}) & (y_{i+1} | y_i | y_{i-1});
A decoder circuit that outputs these signals is assumed to follow these expressions directly. The output delay BR.delay and the circuit area BR.area are expressed as follows, where bw_0 is the bit width of input port 0.
BR.delay = XOR.delay + MUX.delay + AND.delay;
BR.dec_area = XOR.area + AND.area * 3 + OR.area * 2 + NOT.area; (area of the decoder circuit)
BR.area(bw_0) = (XOR.area * 2 + MUX.area + AND.area) * bw_0 + BR.dec_area;
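The decoder expressions can be checked against the usual radix-4 Booth digit, -2·y_{i+1} + y_i + y_{i-1}. The sketch below does this exhaustively. The mapping from (sel, neg, nz) to a multiple of the input (sel chooses between the 1-times and 2-times magnitude, neg negates, nz = 0 forces zero) is an interpretation assumed for this example; the specification gives it only through FIG. 29.

```python
from itertools import product


def booth_decode(y_hi, y_mid, y_lo):
    """sel, neg and nz for the bit triple (y_{i+1}, y_i, y_{i-1})."""
    sel = y_mid ^ y_lo
    neg = y_hi
    nz = (1 ^ (y_hi & y_mid & y_lo)) & (y_hi | y_mid | y_lo)
    return sel, neg, nz


def booth_digit(sel, neg, nz):
    """Multiple of input port 0 selected by the decoded signals (assumed mapping)."""
    magnitude = 1 if sel else 2
    signed = -magnitude if neg else magnitude
    return signed if nz else 0


# Decoded digit for each of the eight bit triples.
table = {
    bits: booth_digit(*booth_decode(*bits))
    for bits in product((0, 1), repeat=3)
}
```

Under this reading, every entry of `table` equals -2·y_{i+1} + y_i + y_{i-1}, so the three signals together select exactly one of the five multiples listed in the text.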

Using the basic logic gate and composite gate models described above, the estimation of the circuit delay model and circuit area model of each arithmetic unit is described below. Circuit delay and circuit area are estimated for an assumed circuit structure in each case; assuming a different circuit structure may give different results.

In the description, the circuit area of an arithmetic unit X is written X.AREA.
Parameters of the input and output signals are written as follows.
bw_in(i) is the bit width of the input signal at input port i.
bw is the bit width of the output signal.
bw_max is the maximum bit width of the input signals over all input ports.
bw_min is the minimum bit width of the input signals over all input ports.
bw_dif is the difference between the maximum and minimum input bit widths.
Expressed as formulas:
bw_max = max{ bw_in(i) | i ∈ I }
bw_min = min{ bw_in(i) | i ∈ I }
bw_dif = bw_max - bw_min
In these expressions, I is the set of all input ports i of the arithmetic unit.

The adder (ADD) is assumed to have a circuit structure in which bw_min bits of the output are implemented with full adders (FA) and the remaining bw_dif bits with half adders (HA). The circuit delay and circuit area of the adder in this structure are as follows, where the input port index i is 0 or 1.
ADD.L(i) = FA.s_delay (one FA sum output delay)
ADD.C(i) = FA.c_delay * bw_min + HA.c_delay * bw_dif (bw_min FA carry output delays plus bw_dif HA carry output delays)
ADD.F(i) = 0
ADD.AREA = FA.area * bw_min + HA.area * bw_dif
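A worked instance may make the adder model concrete. The sketch below evaluates it for a 12-bit and an 8-bit input; the FA and HA figures are hypothetical values assumed for this example, and `width_params` and `adder_model` are illustrative names.

```python
# Hypothetical composite-gate parameters (arbitrary units), not from the specification.
FA = {"s_delay": 4, "c_delay": 2, "area": 18}
HA = {"c_delay": 1, "area": 6}


def width_params(bw_in):
    """Return (bw_max, bw_min, bw_dif) for a list of input bit widths."""
    bw_max, bw_min = max(bw_in), min(bw_in)
    return bw_max, bw_min, bw_max - bw_min


def adder_model(bw_in, fa, ha):
    """ADD.L, ADD.C, ADD.F and ADD.AREA; identical for input ports 0 and 1."""
    _, bw_min, bw_dif = width_params(bw_in)
    return {
        "L": fa["s_delay"],
        "C": fa["c_delay"] * bw_min + ha["c_delay"] * bw_dif,
        "F": 0,
        "AREA": fa["area"] * bw_min + ha["area"] * bw_dif,
    }


ADD = adder_model([12, 8], FA, HA)
```

Here bw_min = 8 and bw_dif = 4, so the carry delay is 8 FA carry delays plus 4 HA carry delays. The subtractor model that follows in the text adds NOT.delay to L and C and NOT.area * bw_in(1) to the area on top of these values.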

The subtractor (SUB) is assumed to have the circuit structure of the adder with NOT gates added in front of input port 1. The circuit delay and circuit area of the subtractor in this structure are as follows, where the input port index i is 0 or 1.
SUB.L(i) = FA.s_delay + NOT.delay
SUB.C(i) = FA.c_delay * bw_min + HA.c_delay * bw_dif + NOT.delay
SUB.F(i) = 0
SUB.AREA = FA.area * bw_min + HA.area * bw_dif + NOT.area * bw_in(1)

The multiplier (MUL) is assumed here to have a circuit structure that adds the outputs of ceil(bw_in(1) / 2) Booth recoders. The circuit delay and circuit area of the multiplier in this structure are as follows, where the input port index i is 0 or 1.
booth_count = ceil(bw_in(1) / 2) (number of Booth recoders)
MUL.L(i) = BR.delay + booth_count * FA.s_delay (one Booth recoder output delay plus booth_count FA sum output delays)
MUL.C(i) = MUL.L(i) + bw_in(0) * FA.c_delay (the multiplier output delay plus FA carry propagation delay for bw_in(0) bits)
MUL.F(i) = 0
MUL.AREA = booth_count * BR.area(bw_in(0)) + booth_count * (bw_in(0) + 1) * FA.area (total circuit area of the Booth recoders of bit width bw_in(0) and the full adders that sum their outputs)
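The multiplier model composes the Booth recoder and full adder models, which the following sketch evaluates for an 8-bit by 8-bit multiplier. All gate and FA parameters are hypothetical values assumed for this example, and the function names are illustrative.

```python
import math

# Hypothetical basic-gate and FA parameters (arbitrary units), not from the specification.
GATE_DELAY = {"AND": 1, "XOR": 2, "MUX": 2}
GATE_AREA = {"AND": 2, "OR": 2, "NOT": 1, "XOR": 4, "MUX": 4}
FA = {"s_delay": 4, "c_delay": 2, "area": 18}


def booth_recoder_model(bw_0, d, a):
    """BR.delay and BR.area(bw_0) for one Booth recoder."""
    dec_area = a["XOR"] + 3 * a["AND"] + 2 * a["OR"] + a["NOT"]
    return {
        "delay": d["XOR"] + d["MUX"] + d["AND"],
        "area": (2 * a["XOR"] + a["MUX"] + a["AND"]) * bw_0 + dec_area,
    }


def multiplier_model(bw_in, d, a, fa):
    """MUL.L, MUL.C, MUL.F and MUL.AREA for input widths (bw_in(0), bw_in(1))."""
    bw_0, bw_1 = bw_in
    booth_count = math.ceil(bw_1 / 2)
    br = booth_recoder_model(bw_0, d, a)
    simple = br["delay"] + booth_count * fa["s_delay"]
    return {
        "booth_count": booth_count,
        "L": simple,
        "C": simple + bw_0 * fa["c_delay"],
        "F": 0,
        "AREA": booth_count * br["area"] + booth_count * (bw_0 + 1) * fa["area"],
    }


MUL = multiplier_model((8, 8), GATE_DELAY, GATE_AREA, FA)
```

Because booth_count depends only on bw_in(1), placing the narrower operand on port 1 reduces both delay and area under this model.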

The selection unit (SEL) outputs one of two selectable inputs according to a 1-bit select input. The assumed circuit structure uses as many multiplexers as the bit width of the output signal. The circuit delay and circuit area of the selection unit in this structure are as follows, where the input port index i is 0 or 1.
SEL.L(i) = MUX.delay
SEL.C(i) = 0
SEL.F(i) = 0
SEL.AREA = MUX.area * bw
When the two input ports have different bit widths, a more exact calculation using bw_min and bw_dif is also possible.

The magnitude comparator (CMP) is assumed to be a circuit that subtracts the two input ports and outputs a result computed from the sign bit of the difference. Because the difference itself need not be output, the circuit area is correspondingly smaller than that of the subtractor. Because the output is 1 bit wide, the output delay synchronization flag (B) is 1. The circuit delay and circuit area of the magnitude comparator in this structure are as follows, where the input port index i is 0 or 1.
CMP.L(i) = FA.c_delay
CMP.C(i) = FA.c_delay * bw_min + HA.c_delay * bw_dif + NOT.delay
CMP.F(i) = 0
CMP.AREA = FA.c_area * bw_min + HA.c_area * bw_dif + NOT.area * bw_in(1)

The not-equal comparator (NEQ) outputs 0 when every pair of corresponding bits of the two input ports is equal, and 1 otherwise. As shown in FIG. 30, the assumed circuit structure combines the bitwise XOR outputs of the two inputs with OR gates arranged as a binary tree. Because the output is 1 bit wide, the output delay synchronization flag (B) is 1; however, since the delays are already synchronized on the input port side, the LSB and MSB propagation times of the output signal are automatically equal. The circuit delay and circuit area of the not-equal comparator in this structure are as follows, where the input port index i is 0 or 1.
NEQ.L(i) = XOR.delay + ceil(log₂(bw)) * OR.delay
NEQ.C(i) = 0
NEQ.F(i) = 1 (input delay synchronization)
NEQ.AREA = bw * XOR.area + (bw - 1) * OR.area
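The ceil(log₂(bw)) and (bw - 1) terms in the NEQ model follow from the shape of the OR tree. The sketch below builds such a tree bit by bit and counts its levels and gates. The function name and the way an unpaired signal is passed up to the next level are assumptions for this example; FIG. 30 may arrange the tree differently.

```python
def neq_tree(x, y, bw):
    """Return (result, or_levels, or_gates) for a bw-bit not-equal comparison."""
    # One XOR gate per bit position.
    level = [((x >> k) ^ (y >> k)) & 1 for k in range(bw)]
    or_levels = 0
    or_gates = 0
    # Reduce pairwise with 2-input OR gates until one signal remains.
    while len(level) > 1:
        reduced = []
        for k in range(0, len(level) - 1, 2):
            reduced.append(level[k] | level[k + 1])
            or_gates += 1
        if len(level) % 2:
            reduced.append(level[-1])  # unpaired signal passes to the next level
        level = reduced
        or_levels += 1
    return level[0], or_levels, or_gates


result, levels, gates = neq_tree(0b10110010, 0b10110011, 8)
```

For bw = 8 the tree has 3 OR levels and 7 OR gates, matching ceil(log₂(8)) and bw - 1. The same counts hold for widths that are not powers of two, which is why the model rounds the logarithm up.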

The shift unit (SHF) outputs the data of port 0 shifted right or left by the amount given at port 1. A barrel shifter is assumed as the circuit structure. A barrel shifter is built by connecting in series circuits that each shift by a power-of-two number of bits, one circuit per bit of the binary representation of the shift amount at port 1. For example, Z = X >> Y (X, Y, Z: unsigned 32-bit) has the circuit structure shown in FIG. 31, where only the lower 5 bits of Y specify the shift amount. The circuit delay and circuit area of the shift unit in this structure are as follows, where the input port index i is 0 or 1.
bw_1' = min(bw_in(1), ceil(log₂(bw_in(0)))) (the bit width of port 1 is counted as at most log₂(bw_in(0)) bits)
SHF.L(i) = bw_1' * MUX.delay
SHF.C(i) = 0
SHF.F(0) = 0, SHF.F(1) = 1 (input port 1 is delay-synchronized)
SHF.AREA = bw_in(0) * bw_1' * MUX.area
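The barrel shifter structure and its cost model can be sketched together. In the Python example below, `barrel_shift_right` mimics the series of power-of-two shift stages for an unsigned right shift, and `shifter_model` evaluates the SHF expressions. The MUX delay and area values passed in the example are hypothetical, and all names are illustrative.

```python
def ceil_log2(n):
    """ceil(log2(n)) for n >= 1, computed without floating point."""
    return (n - 1).bit_length()


def shifter_model(bw_in, mux_delay, mux_area):
    """SHF.L, SHF.C, SHF.F and SHF.AREA for input widths (bw_in(0), bw_in(1))."""
    bw_0, bw_1 = bw_in
    stages = min(bw_1, ceil_log2(bw_0))  # bw_1'
    return {
        "stages": stages,
        "L": stages * mux_delay,
        "C": 0,
        "F": (0, 1),  # (F(0), F(1)): only the shift-amount port is synchronized
        "AREA": bw_0 * stages * mux_area,
    }


def barrel_shift_right(x, y, bw_0=32):
    """Unsigned right shift built from one power-of-two stage per shift-amount bit."""
    for k in range(ceil_log2(bw_0)):
        if (y >> k) & 1:  # stage k is one row of multiplexers: shift by 2**k or pass
            x >>= 1 << k
    return x


SHF = shifter_model((32, 32), mux_delay=2, mux_area=4)
shifted = barrel_shift_right(0xF0F0F0F0, 37)
```

For 32-bit operands the model uses 5 stages even though port 1 is 32 bits wide, and `barrel_shift_right` ignores the upper bits of the shift amount in the same way, so a shift amount of 37 behaves like a shift by 5.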

The logic circuit generation device 1 can output the circuit delay and circuit scale estimates calculated by the arithmetic unit circuit delay / circuit scale evaluation unit 31 to, for example, the output device 17. In another embodiment, the logic circuit generation device 1 may omit the arithmetic unit circuit delay / circuit scale evaluation unit 31. In that case the device does not calculate estimates of circuit delay and circuit scale, but because these estimates are not information that directly constitutes the generated logic circuit, the object of the present invention is still achieved.

[Pipeline boundary arrangement unit 32]
The data flow graph Fdfg, for which the arithmetic unit circuit delay / circuit scale evaluation unit 31 has evaluated the circuit delay and circuit scale of each arithmetic unit, is next input to the pipeline boundary arrangement unit 32. The pipeline boundary arrangement unit 32 synthesizes a pipeline circuit from the data flow graph Fdfg and outputs pipeline circuit data Fplc. In a pipeline circuit, repeatedly executed processing is divided into processing units called pipeline stages, and the circuit blocks that execute these processing units are connected in series. In this specification, a pipeline stage is also called a "pipe stage".

A program converted into a data flow graph corresponds directly to the circuit structure of a logic circuit: each operation instruction in the data flow graph corresponds to an arithmetic unit in the circuit, and each directed edge corresponds to a wire (or signal). Synthesizing a pipeline circuit is therefore equivalent to dividing the data flow graph into pipe stages, that is, assigning a pipe stage to each instruction node of the data flow graph. A register is inserted on each data flow edge that crosses the pipeline boundary between two adjacent pipe stages, so that a value input to the register can be referenced at the register output after a delay of one clock.
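Once every instruction node has a pipe stage, the registers follow from the edges alone. The sketch below counts them under one assumption that goes beyond the text: an edge that spans several pipeline boundaries receives one register per boundary crossed. The node names and the stage assignment are hypothetical.

```python
def registers_on_edges(edges, stage):
    """Map each data flow edge to the number of pipeline registers it needs.

    Assumption: one register per pipeline boundary crossed by the edge.
    """
    registers = {}
    for src, dst in edges:
        crossed = stage[dst] - stage[src]
        if crossed < 0:
            raise ValueError(f"edge {src}->{dst} runs backward across a boundary")
        registers[(src, dst)] = crossed
    return registers


# Hypothetical data flow graph and stage assignment.
edges = [("load", "mul"), ("mul", "add"), ("load", "add"), ("add", "store")]
stage = {"load": 0, "mul": 1, "add": 2, "store": 2}

registers = registers_on_edges(edges, stage)
total_registers = sum(registers.values())
```

In this example the edge from `load` to `add` skips a stage and so needs two registers, while the edge from `add` to `store` stays inside one stage and needs none.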

The pipeline boundary arrangement unit 32 therefore first extracts the constraints that apply when the data flow graph Fdfg is divided into a pipeline structure, and then assigns a pipe stage to each instruction node of the data flow graph Fdfg. More specifically, the pipeline boundary arrangement unit 32 includes a pipeline constraint extraction unit 321, a pipeline stage number determination unit 322, and a pipeline circuit synthesis unit 323.

The pipeline constraint extraction unit 321 extracts the constraints that apply when the data flow graph Fdfg is divided into pipe stages, in preparation for the processing in the pipeline circuit synthesis unit 323 described later.

The register attribute specifies that the value of a variable is stored in a register, for which a delay of one clock occurs between an assignment (write) to the variable and a reference (read) of the variable. The read timing of an array variable with the memory attribute likewise incurs a one-clock delay when a commonly used synchronous memory is employed. When synthesizing a pipeline circuit structure, the time at which each variable becomes available for reference must be evaluated accurately. A variable that represents a state variable of a sequential circuit must hold its value across clock cycles and therefore must have the register attribute. In the present embodiment, the program P input to the logic circuit generation device 1 is required to declare the register attribute explicitly for every variable that represents a state variable of a sequential circuit; if the declaration is omitted, circuit synthesis is not possible. In another embodiment, the register attribute can be added automatically even when the declaration is omitted for such a variable. In either embodiment, the state variables of the sequential circuit carry the register attribute in the stages of processing that follow.

Reference instructions to a register variable, which involves a one-clock delay, fall into two classes. The first is the post-assignment reference instruction: a reference to the variable made after the assignment instruction has executed. The second is the pre-assignment reference instruction: a reference to the variable made before the assignment instruction executes.

The pipeline constraint extraction unit 321 first assigns these reference instructions to pipe stages. A post-assignment reference instruction is assigned to the pipe stage immediately after the stage that contains the assignment instruction; that is, a pipeline boundary is placed between the assignment instruction and the post-assignment reference instruction. The register variable is then referenced one clock later than it is assigned, so the post-assignment reference reads the data written by the assignment instruction. A pre-assignment reference instruction, on the other hand, is assigned to the same pipe stage as the assignment instruction, so it reads the data from before the assignment instruction executes.

The assignment of reference instructions to pipe stages is described with a concrete example, referring to FIGS. 32 and 33. FIG. 32 shows an example program that contains a pre-assignment reference instruction and a post-assignment reference instruction for a register variable. FIG. 33 shows the pipeline arrangement in the pipeline circuit generated from the program of FIG. 32. In this example, the pipeline constraint extraction unit 321 assigns to the same pipe stage the pre-assignment reference instruction (○1) for the variable stt, the assignment instruction (○3), and the instruction (○2) that lies on the path (○1) → (○3) in the data flow graph Fdfg.

Next, the pipeline constraint extraction unit 321 groups the register variable update instruction nodes. As described above, a pre-assignment reference instruction for a register variable, when assigned to the same pipe stage as the assignment instruction, reads the data from before the assignment executes, that is, the data from one clock earlier. It cannot be assigned to an earlier pipe stage: if it were, the reference would read data assigned two or more clocks earlier, and the circuit would not realize the behavior intended by the program (namely, referring to the data assigned one clock earlier). All instructions on the data flow graph path from each pre-assignment reference instruction to the assignment instruction (the path (○1) → (○2) → (○3) in FIG. 33) must likewise be assigned to the same pipe stage. The pipeline constraint extraction unit 321 therefore transforms the data flow graph Fdfg, by the processing described next, into a structure that guarantees these instructions are assigned to the same pipe stage.

The set of all instruction nodes on the data flow graph paths from a pre-assignment reference instruction node for a register variable to the assignment instruction node for the same variable is called a register variable update instruction node group. For a pre-assignment reference instruction node that has no data flow graph path to the assignment instruction node, the group contains only that pre-assignment reference instruction node. The pipeline constraint extraction unit 321 performs a data flow graph conversion that replaces each register variable update instruction node group with a register variable update degenerate instruction node, that is, a single instruction node into which the group is collapsed. Specifically, all directed edges connecting instruction nodes within the group are removed, and each directed edge connecting an instruction node in the group to an instruction node outside it is reattached to the degenerate instruction node. This conversion also removes every backward data flow graph directed edge from an assignment instruction to a pre-assignment reference instruction (that is, every directed edge that forms a loop).
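The conversion described above can be sketched in Python as follows. This is an illustrative sketch only: the edge-set representation, the function names `reachable`, `update_group` and `contract`, and the use of a reachability intersection to find the nodes on the paths are assumptions made for the example, not details taken from the embodiment.

```python
def reachable(adj, start):
    """Return every node reachable from `start` (inclusive) through `adj`."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def update_group(edges, ref, assign):
    """Nodes on some path from the pre-assignment reference `ref` to `assign`.

    Assumption: a node is on such a path when it is reachable from `ref`
    and `assign` is reachable from it. If no path exists, the group is
    just the reference node, as the text specifies.
    """
    fwd, bwd = {}, {}
    for src, dst in edges:
        fwd.setdefault(src, []).append(dst)
        bwd.setdefault(dst, []).append(src)
    group = reachable(fwd, ref) & reachable(bwd, assign)
    return group or {ref}


def contract(edges, group, merged):
    """Collapse `group` into the single node `merged`."""
    result = set()
    for src, dst in edges:
        if src in group and dst in group:
            continue  # internal edge (including the loop edge): removed
        result.add((merged if src in group else src,
                    merged if dst in group else dst))
    return result


# Hypothetical graph shaped like the (○1) -> (○2) -> (○3) example,
# with a backward edge n3 -> n1 for the register variable.
edges = {("in", "n1"), ("n1", "n2"), ("n2", "n3"), ("n3", "n1"), ("n3", "out")}
group = update_group(edges, "n1", "n3")
contracted = contract(edges, group, "n1_n2_n3")
```

In this toy graph the three nodes collapse into one, and only the two edges that cross the group boundary survive, now attached to the merged node.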

FIG. 34 shows the data flow graph Fdfg that results from applying the register variable update instruction node grouping to the data flow graph Fdfg of FIG. 33. In FIG. 33, the instruction nodes (○1), (○2) and (○3), which are all the instruction nodes on the data flow graph path from the pre-assignment reference instruction node (○1) for the register variable stt to the assignment instruction node (○3) for the same variable, form one register variable update instruction node group. The pipeline constraint extraction unit 321 replaces this group with the register variable update degenerate instruction node shown as (○1)(○2)(○3) in FIG. 34.

Next, the pipeline constraint extraction unit 321 adds pipeline boundary attributes to data flow graph directed edges. A pipeline boundary must exist between an assignment instruction for a register variable and a post-assignment reference instruction for the same variable in order to realize the one-clock register output delay. The pipeline constraint extraction unit 321 therefore adds a pipeline boundary attribute to the data flow graph directed edge connecting these instructions, so that pipe stages are assigned in a way that makes this directed edge cross a pipeline boundary.

A reference instruction for a memory array variable has already been split, in the register/memory array access instruction decomposition unit 25 described above, into an assignment instruction to a read address variable and an array reference instruction that uses the read address variable as its array index. As described above, a pipeline boundary must also exist between these two instructions in order to realize the one-clock data read delay of a synchronous memory. The pipeline constraint extraction unit 321 therefore adds a pipeline boundary attribute to the data flow graph directed edge that connects these instructions and corresponds to the read address variable, so that pipe stages are assigned in a way that makes this directed edge cross a pipeline boundary.

As described above, synthesizing a pipeline circuit is equivalent to assigning a pipe stage to each instruction node of the data flow graph. The pipeline circuit synthesis unit 323 first performs an initial pipe stage assignment that satisfies the pipe stage assignment constraints described next. It then synthesizes the clock-synchronous pipeline circuit Fplc from the data flow graph Fdfg by changing the pipe stage assignments of the instruction nodes of Fdfg while performing two processes: pipeline delay equalization, which equalizes the signal propagation time of each pipe stage (hereinafter also called the "pipeline delay"), and pipeline register count minimization, which minimizes the number of registers placed on the pipeline boundaries. FIG. 35 shows the flow of these processes in the pipeline circuit synthesis unit 323, and they are described in more detail below. In the following description, the pipe stage of the start node of a data flow graph directed edge is called the "start pipe stage" of the edge, and the pipe stage of its end node is called the "end pipe stage" of the edge.

The pipe stage assignment constraints on the instruction nodes of the data flow graph Fdfg are the data dependency constraint and the pipeline boundary constraint, both described next.

A data flow graph directed edge represents the data dependency between its start node (an instruction node that assigns to a variable) and its end node (an instruction node that refers to the variable). The start pipe stage and end pipe stage of a directed edge must either be the same or be assigned so that the start pipe stage is earlier than the end pipe stage. This constraint is called the data dependency constraint.

A data flow graph directed edge to which a pipeline boundary attribute has been added is called a pipeline boundary edge. For a pipeline boundary edge, the start pipe stage must be earlier than the end pipe stage; the two cannot be the same pipe stage. This constraint is called the pipeline boundary constraint.

A pipeline boundary edge whose start pipe stage and end pipe stage are adjacent (that is, whose end pipe stage is the pipe stage immediately after its start pipe stage) is called a critical pipeline boundary edge. This is not a pipe stage assignment constraint in itself; it is an attribute used to determine whether the pipeline boundary constraint would still be satisfied when a pipe stage assignment is changed.

When a pipeline boundary edge exists, at least two pipe stages are needed, one on each side of the pipeline boundary. When several pipeline boundary edges lie on one data flow graph path, all of the pipeline boundary edges on that path must cross different pipeline boundaries. The minimum number of pipe stages required by the pipeline boundary constraint in this way is called the pipeline stage count lower bound. For example, when three pipeline boundary edges lie on one data flow graph path, as shown in FIG. 36, the pipeline stage count lower bound is 4. The pipeline stage count determination unit 322 takes the larger of the calculated pipeline stage count lower bound and the pipeline stage count based on the circuit designer's specification, described next, as the final pipeline stage count.
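One way to compute such a lower bound is to find the path that carries the most pipeline boundary edges and add one. The following Python sketch does this on an acyclic graph; the `(src, dst, is_boundary)` edge tuples and the function name `stage_lower_bound` are hypothetical, and the embodiment does not state which algorithm it actually uses.

```python
from functools import lru_cache


def stage_lower_bound(edges):
    """Return 1 + the largest number of boundary edges on any path.

    edges: iterable of (src, dst, is_boundary) for an acyclic graph
    (loop edges are assumed to have been removed by the grouping step).
    """
    succ = {}
    for src, dst, is_boundary in edges:
        succ.setdefault(src, []).append((dst, is_boundary))

    @lru_cache(maxsize=None)
    def most_boundaries_from(node):
        # Best count over all paths that start at `node`.
        return max((int(b) + most_boundaries_from(d)
                    for d, b in succ.get(node, ())), default=0)

    nodes = {n for src, dst, _ in edges for n in (src, dst)}
    return 1 + max((most_boundaries_from(n) for n in nodes), default=0)


# A path with three boundary edges, as in the FIG. 36 example.
chain = [("a", "b", True), ("b", "c", False), ("c", "d", True), ("d", "e", True)]
ps_min = stage_lower_bound(chain)
```

For the chain above the result is 4, matching the example in the text.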

本実施形態の論理回路生成装置１を使用する回路設計者は、生成される論理回路のクロック周期を指定することができ、又は、生成される論理回路のパイプライン段数を指定することができる。回路設計者がクロック周期を指定した場合には、クロック周期から算出されたパイプライン段数を、回路設計者の指定に基づくパイプライン段数とする。回路設計者がパイプライン段数を指定した場合には、そのパイプライン段数を、回路設計者の指定に基づくパイプライン段数とする。 The circuit designer who uses the logic circuit generation device 1 of the present embodiment can specify the clock cycle of the generated logic circuit, or can specify the number of pipeline stages of the generated logic circuit. When the circuit designer designates the clock period, the number of pipeline stages calculated from the clock period is set as the number of pipeline stages based on the designation of the circuit designer. When the circuit designer designates the number of pipeline stages, the number of pipeline stages is set as the number of pipeline stages based on the designation of the circuit designer.

More specifically, when the circuit designer specifies the clock period, the number of pipeline stages is calculated so that the maximum signal propagation time of each pipe stage is at most the specified clock period. Let Dtotal be the maximum signal propagation time of the entire data flow graph Fdfg before pipeline division, and let Dspec be the specified clock period. The calculated pipeline stage count PSspec is then
PSspec = ceil(Dtotal / Dspec)
When the circuit designer specifies the number of pipeline stages instead, the specified value is used as PSspec. In either case, with PSmin denoting the pipeline stage count lower bound, the final pipeline stage count PS is
PS = max(PSspec, PSmin)
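The two formulas can be checked with a few lines of Python. This is a minimal sketch; the function name `pipeline_stage_count` and its keyword arguments are invented for the example.

```python
import math


def pipeline_stage_count(ps_min, d_total=None, d_spec=None, ps_requested=None):
    """Final stage count PS = max(PSspec, PSmin).

    PSspec is either the designer's requested stage count or, when a
    clock period d_spec is given instead, ceil(d_total / d_spec).
    """
    if ps_requested is not None:
        ps_spec = ps_requested
    else:
        ps_spec = math.ceil(d_total / d_spec)
    return max(ps_spec, ps_min)


# Total delay 10.0 with a 4.0 clock period gives PSspec = 3.
by_period = pipeline_stage_count(ps_min=2, d_total=10.0, d_spec=4.0)
# The lower bound wins when it exceeds PSspec.
by_bound = pipeline_stage_count(ps_min=4, d_total=10.0, d_spec=4.0)
```

Note that PSspec only guarantees enough stages on average; whether every stage actually fits within Dspec depends on how evenly the later equalization can split the delays.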

The pipeline circuit synthesis unit 323 first performs initial pipe stage assignment, which assigns a pipe stage to each instruction so that the pipe stage assignment constraints are satisfied. The assignment of pipe stages to instructions produced by this process is called the initial pipe stage assignment. It is obtained by a simple procedure that follows the instructions on the data flow graph paths in the forward direction (that is, from the input side toward the output side) and assigns pipe stages in order, starting from the earliest pipe stage.
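A forward pass of this kind might look like the following Python sketch. It places every node in the earliest stage allowed by the data dependency and pipeline boundary constraints; that "earliest stage" policy, the edge tuples and the name `initial_stages` are assumptions for illustration, since the text only says that stages are assigned in order from the front. The standard-library `graphlib` module (Python 3.9 or later) supplies the topological order.

```python
from graphlib import TopologicalSorter


def initial_stages(edges):
    """Assign each node the earliest legal pipe stage (0-based).

    edges: iterable of (src, dst, is_boundary) for an acyclic graph.
    A plain edge needs stage[dst] >= stage[src]; a pipeline boundary
    edge needs stage[dst] >= stage[src] + 1.
    """
    preds = {}
    for src, dst, is_boundary in edges:
        preds.setdefault(src, [])
        preds.setdefault(dst, []).append((src, is_boundary))

    order = TopologicalSorter(
        {node: [s for s, _ in incoming] for node, incoming in preds.items()}
    ).static_order()

    stage = {}
    for node in order:  # predecessors are always visited first
        stage[node] = max((stage[s] + int(b) for s, b in preds[node]),
                          default=0)
    return stage


stages = initial_stages([("a", "b", True), ("b", "c", False), ("a", "c", False)])
```

Here `a` lands in stage 0 and both `b` and `c` in stage 1, because the boundary edge `a -> b` forces a stage change and `c` depends on `b`.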

次いで、パイプライン回路合成部３２３は、パイプライン遅延均等化処理及びパイプラインレジスタ最小化処理を行う。これらの処理においては、パイプ段割当から定まるパイプライン境界を局所的に変更する（即ち、１つの命令ノードのパイプ段割当を変更する）処理が繰返される。パイプ段割当制約を満たしながらパイプ段割当の変更が可能な命令ノードは、次に説明するパイプ段出力辺境ノード及びパイプ段入力辺境ノードに限られる。 Next, the pipeline circuit synthesis unit 323 performs pipeline delay equalization processing and pipeline register minimization processing. In these processes, the process of locally changing the pipeline boundary determined from the pipe stage assignment (that is, changing the pipe stage assignment of one instruction node) is repeated. The instruction nodes that can change the pipe stage assignment while satisfying the pipe stage assignment constraint are limited to the pipe stage output border node and the pipe stage input border node described below.

A pipe stage output border node is an instruction node for which
(1) the end pipe stages of all data flow graph directed edges starting at the instruction node are later than the pipe stage of the instruction node (that is, the start pipe stage of those edges), and
(2) no critical pipeline boundary edge starts at the instruction node.
A pipe stage output border node can be moved to the next later pipe stage.

A pipe stage input border node is an instruction node for which
(1) the start pipe stages of all data flow graph directed edges ending at the instruction node are earlier than the pipe stage of the instruction node (that is, the end pipe stage of those edges), and
(2) no critical pipeline boundary edge ends at the instruction node.
A pipe stage input border node can be moved to the next earlier pipe stage.
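The two border-node definitions translate directly into predicates. The Python sketch below assumes a hypothetical representation (a `stage` dictionary and `(src, dst, is_boundary)` edge tuples) and does not check that the target stage exists, which a real implementation would also have to do for nodes in the first or last stage.

```python
def is_critical(src, dst, is_boundary, stage):
    """A boundary edge is critical when its two stages are adjacent."""
    return is_boundary and stage[dst] == stage[src] + 1


def is_output_border(node, stage, edges):
    """True when `node` may move one stage later."""
    outgoing = [e for e in edges if e[0] == node]
    return (all(stage[d] > stage[node] for _, d, _ in outgoing)
            and not any(is_critical(s, d, b, stage) for s, d, b in outgoing))


def is_input_border(node, stage, edges):
    """True when `node` may move one stage earlier."""
    incoming = [e for e in edges if e[1] == node]
    return (all(stage[s] < stage[node] for s, _, _ in incoming)
            and not any(is_critical(s, d, b, stage) for s, d, b in incoming))


stage = {"a": 0, "b": 1, "c": 2}
edges = [("a", "b", True), ("b", "c", False)]
```

With this assignment, `b` is an output border node and `c` is an input border node, while the critical boundary edge `a -> b` pins `a` against moving later and `b` against moving earlier.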

パイプライン境界の局所変更処理において、個々の局所変更が「有益」であるか否かを判定するための目的関数について説明する。この目的関数は、パイプライン境界の局所変更処理の目的に依存して、以下のように定義する。 An objective function for determining whether or not each local change is “useful” in the pipeline boundary local change process will be described. This objective function is defined as follows depending on the purpose of the local change processing of the pipeline boundary.

The purpose of pipeline delay equalization is to maximize the operating clock frequency of the whole circuit by equalizing the signal propagation times of the pipe stages. With D(n) denoting the maximum signal propagation time of the n-th pipe stage, the objective function E1 of pipeline delay equalization is defined as
E1 = Σ_n D(n)²
That is, the objective function here is the sum of the squares of the maximum signal propagation times of the individual pipe stages. The pipeline circuit synthesis unit 323 equalizes the maximum signal propagation times of the pipe stages by adjusting the pipeline boundaries so as to minimize this objective function. In this embodiment, the maximum signal propagation time D(n) of a pipe stage is calculated from the arithmetic unit circuit delays evaluated by the arithmetic unit circuit delay / circuit scale evaluation unit 31.

The purpose of pipeline register count minimization is to minimize the circuit area by minimizing the number of registers inserted on the data flow graph directed edges that cross each pipeline boundary. The bit width of the variable corresponding to a data flow graph directed edge is called the weight of that edge. With R(n) denoting the total weight of the data flow graph directed edges that cross the pipeline boundary on the output side of the n-th pipe stage, the objective function E2 is defined as
E2 = Σ_n R(n)
That is, the objective function here is the sum of the weights of the data flow graph directed edges that cross pipeline boundaries. The pipeline circuit synthesis unit 323 minimizes the number of pipeline registers by adjusting the pipeline boundaries so as to minimize this objective function.

In addition to the objective functions E1 and E2 of pipeline delay equalization and pipeline register count minimization, the following evaluation function, which ultimately determines the operating clock frequency of the whole pipeline circuit, is defined.
Dmax = max{ D(n) | n ∈ all pipe stages }
Any local change of a pipeline boundary that would increase Dmax is prohibited.
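The three quantities E1, E2 and Dmax are simple to compute once D(n) and the stage assignment are known. The Python sketch below takes the stage delays as given numbers and uses hypothetical `(src, dst, width)` edge tuples; how D(n) itself is derived from arithmetic unit delays is outside the sketch.

```python
def e1(stage_delays):
    """E1: sum of squared maximum signal propagation times D(n)."""
    return sum(d * d for d in stage_delays)


def d_max(stage_delays):
    """Dmax: the largest D(n), which sets the clock period."""
    return max(stage_delays)


def boundary_weights(edges, stage, num_stages):
    """R(n): total bit width crossing the boundary after stage n.

    An edge from stage s to stage t crosses the boundaries after
    stages s, s+1, ..., t-1 and needs a register at each of them.
    """
    r = [0] * num_stages
    for src, dst, width in edges:
        for n in range(stage[src], stage[dst]):
            r[n] += width
    return r


def e2(edges, stage, num_stages):
    """E2: sum of R(n) over all pipe stages."""
    return sum(boundary_weights(edges, stage, num_stages))


stage = {"a": 0, "b": 1, "c": 2}
edges = [("a", "b", 8), ("a", "c", 16)]
r = boundary_weights(edges, stage, 3)
```

The sum of squares rewards balance: stage delays of 2 and 6 give E1 = 40, whereas 4 and 4 give E1 = 32 for the same total delay, which is why minimizing E1 pushes the stages toward equal delay.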

本実施形態においては、パイプライン境界の局所変更処理を制御する方法として、焼き鈍し法（Simulated Annealing 法、本明細書において「ＳＡ法」とも呼ぶ。）を用いる。図３７を参照して、パイプライン回路合成部３２３が行うＳＡ法について、より具体的に説明する。 In this embodiment, an annealing method (Simulated Annealing method, also referred to as “SA method” in this specification) is used as a method for controlling the local change processing of the pipeline boundary. With reference to FIG. 37, the SA method performed by the pipeline circuit synthesis unit 323 will be described more specifically.

まず、ＳＡ法のパラメータである温度 T に、適切な初期値を設定する。温度 T の意味については後述する。 First, an appropriate initial value is set for the temperature T which is a parameter of the SA method. The meaning of temperature T will be described later.

Next, one instruction node is selected at random from among the pipe stage output border nodes and pipe stage input border nodes, and its pipe stage assignment is changed in the direction in which it can move. Let Dmax (defined above) be the maximum pipeline delay before the change and D'max the maximum pipeline delay after the change. If Dmax < D'max, the pipe stage assignment change is rejected and the assignment from before the change is restored. If Dmax >= D'max, processing continues as follows.

Next, using the objective function defined above (E1 or E2), the pipe stage assignment change cost ΔC is calculated as the difference between the objective function value E before the change and the objective function value E' after the change. That is,
ΔC = E - E'

Next, the adoption threshold used to decide probabilistically whether to adopt the pipe stage assignment change is defined from the change cost ΔC as
P = exp(ΔC / T)
In this expression, T is a parameter that takes positive real values, called the "temperature" of the SA method, which controls the probability of adopting an assignment change.

Next, to make the final decision on whether to adopt the pipe stage assignment change, a random number R with 0 <= R < 1 is generated, and one of the following is executed according to the relationship between P and R.
(1) If R < P, the pipe stage assignment change is adopted. In addition, the minimum value Emin of the objective function over the pipe stage assignments so far is compared with E'; if Emin > E', Emin is set to E' and the best pipe stage assignment found so far is updated.
(2) If R >= P, the pipe stage assignment change is rejected and the state before the change is restored.
Note that for a pipe stage assignment change that decreases the objective function value, that is, when ΔC = E - E' > 0, the adoption threshold P is greater than 1, so (1) is always executed and the change is adopted.

Next, the temperature T is updated. In the initial phase, T is set to a relatively large ("hot") value, which raises the probability of adopting pipe stage assignment changes that increase the objective function value and so makes it easier to escape from local optima of the pipe stage arrangement. As the assignment change process proceeds, T is gradually decreased ("cooled") so that the arrangement approaches the globally optimal solution.

ノードの選択から温度 T の更新までの上記処理を、温度 T が所定値を下回るまで繰り返す。温度 T が所定値を下回ったときの上述の最適解が、ＳＡ法によるパイプライン遅延均等化処理及びパイプラインレジスタ最小化処理の結果となる。 The above processing from the selection of the node to the update of the temperature T is repeated until the temperature T falls below a predetermined value. The above-described optimal solution when the temperature T falls below a predetermined value is the result of pipeline delay equalization processing and pipeline register minimization processing by the SA method.
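The loop described above can be sketched in Python as follows. The acceptance rule (ΔC = E - E', P = exp(ΔC / T), adopt when R < P) and the Dmax guard follow the text; everything else is an assumption made for the example, including the geometric cooling schedule, the parameter values, the function names, and the toy problem, which splits a linear chain of delays into three stages instead of operating on a real data flow graph.

```python
import math
import random


def anneal(state, neighbor, energy, d_max,
           t_start=10.0, t_end=0.01, cooling=0.95, seed=0):
    """Simulated annealing with the acceptance rule from the text."""
    rng = random.Random(seed)
    e_cur = energy(state)
    best, e_min = state, e_cur
    t = t_start
    while t >= t_end:
        cand = neighbor(state, rng)
        if d_max(cand) <= d_max(state):        # changes that raise Dmax are rejected
            e_new = energy(cand)
            delta = e_cur - e_new              # ΔC = E - E'
            # exp(ΔC / T) >= 1 whenever ΔC >= 0, so skip exp() there
            # (this also avoids overflow for large positive ΔC / T).
            p = 1.0 if delta >= 0 else math.exp(delta / t)
            if rng.random() < p:               # R < P: adopt the change
                state, e_cur = cand, e_new
                if e_new < e_min:              # track the best solution so far
                    best, e_min = cand, e_new
        t *= cooling                           # assumed geometric cooling
    return best, e_min


# Toy problem: cut a chain of operation delays into three stages.
DELAYS = [3, 1, 4, 1, 5, 9, 2, 6]


def stage_delays(cuts):
    bounds = [0, *cuts, len(DELAYS)]
    return [sum(DELAYS[a:b]) for a, b in zip(bounds, bounds[1:])]


def energy(cuts):
    return sum(d * d for d in stage_delays(cuts))   # E1


def shift_one_cut(cuts, rng):
    """Move one stage boundary by one position, keeping stages non-empty."""
    moved = list(cuts)
    moved[rng.randrange(len(moved))] += rng.choice((-1, 1))
    bounds = [0, *moved, len(DELAYS)]
    valid = all(a < b for a, b in zip(bounds, bounds[1:]))
    return tuple(moved) if valid else cuts


start = (1, 2)
best, e_min = anneal(start, shift_one_cut, energy,
                     lambda cuts: max(stage_delays(cuts)))
```

By construction the returned solution is never worse than the starting point in either E1 or Dmax; how close it gets to the true optimum depends on the cooling schedule and the number of iterations, which the text leaves to the implementer.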

[RTL description output unit 33]
The pipeline circuit data Fplc output by the pipeline boundary arrangement unit 32 is then input to the RTL description output unit 33. The logic circuit generation device 1 of this embodiment outputs the generated logic circuit in the form of an RTL description, so the RTL description output unit 33 converts the input pipeline circuit data Fplc into an RTL description. Because expressing a logic circuit as an RTL description is a well-known technique, a detailed explanation is omitted here. The pipeline circuit data Fplc is a data flow graph and, as described above, a data flow graph corresponds to the circuit structure of a logic circuit, so it can be expressed directly as an RTL description. To achieve the object of the present invention, the output format of the logic circuit generation device 1 need not be an RTL description; it may be any suitable format for describing a logic circuit. That is, the logic circuit generation device 1 of the present invention generally includes a logic circuit description output unit, and in this embodiment that unit is the RTL description output unit 33, which outputs an RTL description.

ＲＴＬ記述出力部３３によって変換されたＲＴＬ記述は、論理回路記述Ｄとして、論理回路生成装置１から出力される。以上が、本実施形態における論理回路生成装置１の各部の説明である。 The RTL description converted by the RTL description output unit 33 is output from the logic circuit generation device 1 as the logic circuit description D. The above is the description of each unit of the logic circuit generation device 1 in the present embodiment.

Next, several examples of embodiments of the logic circuit generation device 1 of the present invention are described with reference to FIGS. 38 to 44. In each figure, the parts drawn in a light color are components that are not included in the logic circuit generation device 1 of the embodiment shown in that figure; they are shown for comparison with the other embodiments.

図３８に示される実施形態の論理回路生成装置１は、制御フローグラフ生成部２３と、制御フロー縮退変換部２８と、データフローグラフ生成部２９と、ＲＴＬ記述出力部（論理回路記述出力部）３３とを備える。制御フロー縮退変換部２８は、基本ブロック単一化部２８２を含む。この実施形態の論理回路生成装置１は、関数呼出し命令もループ制御部も含まず、変数をインデックスとする配列アクセス命令も含まない、静的単一代入形式で記述されたプログラムＰを入力に取り、論理回路記述Ｄを出力する。論理回路の入力信号及び出力信号は、プログラムＰにおいて明示的に指定することができる。 The logic circuit generation device 1 of the embodiment shown in FIG. 38 includes a control flow graph generation unit 23, a control flow degeneration conversion unit 28, a data flow graph generation unit 29, and an RTL description output unit (logic circuit description output unit). 33. The control flow degeneration conversion unit 28 includes a basic block unification unit 282. The logic circuit generation device 1 of this embodiment takes as input a program P described in a static single assignment format that does not include a function call instruction or a loop control unit, and does not include an array access instruction using a variable as an index. The logic circuit description D is output. The input signal and output signal of the logic circuit can be explicitly specified in the program P.

図３９に示される実施形態の論理回路生成装置１は、図３８の実施形態に加えて、静的単一代入形式変換部２７をさらに備える。また、制御フロー縮退変換部２８は、φ関数命令実体化部２８１をさらに含む。本実施形態の論理回路生成装置１は、入力されたプログラムＰを内部で静的単一代入形式に変換するため、論理回路生成装置１が入力に取るプログラムＰは、必ずしも静的単一代入形式で記述されたものである必要はない。 The logic circuit generation device 1 of the embodiment shown in FIG. 39 further includes a static single assignment format conversion unit 27 in addition to the embodiment of FIG. The control flow degeneration conversion unit 28 further includes a φ function instruction materialization unit 281. Since the logic circuit generation device 1 of the present embodiment converts the input program P into the static single assignment format internally, the program P taken by the logic circuit generation device 1 as an input is not necessarily the static single assignment format. It does not have to be described in.

図４０に示される実施形態の論理回路生成装置１は、図３８の実施形態に加えて、レジスタ／メモリ配列アクセス命令分解部２５と、メモリポート数判定部２６とをさらに備える。また、制御フロー縮退変換部２８は、レジスタ／メモリ配列命令融合部２８３をさらに含む。本実施形態の論理回路生成装置１は、変数をインデックスとする配列アクセス命令を含むプログラムＰを入力に取ることができ、マルチポートレジスタファイル及びマルチポートメモリを含む論理回路を表現する論理回路記述Ｄを生成することができる。 The logic circuit generation device 1 of the embodiment shown in FIG. 40 further includes a register / memory array access instruction decomposition unit 25 and a memory port number determination unit 26 in addition to the embodiment of FIG. The control flow degeneration conversion unit 28 further includes a register / memory array instruction fusion unit 283. The logic circuit generation device 1 of the present embodiment can take a program P including an array access instruction with a variable as an index, and a logic circuit description D representing a logic circuit including a multiport register file and a multiport memory. Can be generated.

図４１に示される実施形態の論理回路生成装置１は、図３８の実施形態に加えて、データフローグラフ最適化部３０、演算器の回路遅延・回路規模評価部３１、及びパイプライン境界配置部３２をさらに備える。本実施形態の論理回路生成装置１は、最適化されたパイプライン回路記述を出力することができる。また、本実施形態の論理回路生成装置１は、生成する論理回路の回路遅延及び回路規模の見積を算出することができる。 In addition to the embodiment of FIG. 38, the logic circuit generation device 1 of the embodiment shown in FIG. 41 includes a data flow graph optimization unit 30, a circuit delay / circuit scale evaluation unit 31, and a pipeline boundary arrangement unit. 32 is further provided. The logic circuit generation device 1 according to the present embodiment can output an optimized pipeline circuit description. Further, the logic circuit generation device 1 according to the present embodiment can calculate the circuit delay and the circuit scale estimate of the logic circuit to be generated.

図４２に示される実施形態の論理回路生成装置１は、図３８の実施形態に加えて、論理回路入出力信号抽出部２２をさらに備える。本実施形態の論理回路生成装置１は、生成することになる論理回路の入力信号及び出力信号を抽出するため、これらの信号が論理回路生成装置１の外部から明示的に指定される必要はない。 The logic circuit generation device 1 of the embodiment shown in FIG. 42 further includes a logic circuit input / output signal extraction unit 22 in addition to the embodiment of FIG. Since the logic circuit generation device 1 of the present embodiment extracts the input signal and output signal of the logic circuit to be generated, these signals do not need to be explicitly specified from the outside of the logic circuit generation device 1. .

図４３に示される実施形態の論理回路生成装置１は、図３８の実施形態に加えて、非循環・非階層変換部２１をさらに備える。本実施形態の論理回路生成装置１は、入力されたプログラムＰに含まれる最上位関数Ｆｔｏｐを内部で非循環型最下層関数Ｆｅｘｐに変換するため、入力されるプログラムＰに含まれる最上位関数Ｆｔｏｐは、必ずしも非循環型最下層関数である必要はない。 The logic circuit generation device 1 of the embodiment shown in FIG. 43 further includes an acyclic / non-hierarchical conversion unit 21 in addition to the embodiment of FIG. The logic circuit generation device 1 of the present embodiment internally converts the highest function Ftop included in the input program P into an acyclic bottom layer function Fexp, so that the highest function Ftop included in the input program P is converted. Need not be an acyclic bottom layer function.

The logic circuit generation device 1 of the embodiment shown in FIG. 44 further includes a state variable instruction dependency determination unit 24 in addition to the embodiment of FIG. 38. When a program P is input in which an assignment instruction, a reference instruction, and an assignment instruction for the same state variable are executed consecutively in that order, the logic circuit generation device 1 of this embodiment stops the process of generating the logic circuit description.

Claims

A logic circuit generation device that takes as input a program including a top-level function, which is the target of logic circuit generation and describes a behavioral description, that is, the flow of a series of hardware processes for circuit design, and that generates a logic circuit description, the device comprising:
A control flow graph generation unit that generates a control flow graph from the top-level function, which includes neither a loop processing unit nor a function call instruction;
A control flow degeneration conversion unit that generates, from the control flow graph in which each variable has only one assignment instruction, a control flow degenerate program, which is a program whose control flow has been degenerated by removing all conditional branch instructions;
A data flow graph generation unit that generates a data flow graph from the control flow degenerate program by taking each instruction of the control flow degenerate program as a node and adding a directed edge from the assignment instruction for each variable to each instruction that refers to that variable; and
A logic circuit description output unit that generates a logic circuit description representing a sequential circuit in which each directed edge of the data flow graph corresponds to a wire of the logic circuit and each node of the data flow graph corresponds to an arithmetic unit of the logic circuit,
wherein
The state variable representing the state of the sequential circuit is expressed in the program as a local variable or a static variable of an upper-level function that calls the top-level function, and
The value of the state variable before execution of the assignment instruction to the state variable represents the current state of the sequential circuit, and the value of the state variable after execution of that assignment instruction represents the next state of the sequential circuit.
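
The data flow graph construction recited in claim 1 can be illustrated with a minimal sketch. It assumes a hypothetical instruction encoding of `(destination, operator, operands)` and a three-instruction single-assignment program; the function name `build_data_flow_graph` and all variable names are illustrative and do not come from the specification.

```python
# Minimal sketch, not the patented implementation: every instruction of a
# branch-free, single-assignment program becomes a node, and a directed edge
# runs from the instruction that assigns a variable to each instruction that
# refers to it. The tuple encoding below is an assumption for illustration.

def build_data_flow_graph(instructions):
    """Return directed edges as (defining_index, using_index, variable)."""
    defined_at = {}   # variable name -> index of its single assignment
    edges = []
    for index, (dest, _op, operands) in enumerate(instructions):
        for operand in operands:
            # Operands with no defining instruction are inputs or the
            # current value of a state variable, so they get no edge here.
            if operand in defined_at:
                edges.append((defined_at[operand], index, operand))
        defined_at[dest] = index
    return edges


# "count" is read before it is assigned, so it plays the role of the current
# state; "count_next" plays the role of the next state.
program = [
    ("t1", "add", ["count", "step"]),
    ("t2", "lt", ["t1", "limit"]),
    ("count_next", "select", ["t2", "t1", "zero"]),
]
edges = build_data_flow_graph(program)
```

In this sketch each edge would map to a wire and each node to an arithmetic unit, which is the correspondence the logic circuit description output unit relies on.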

The logic circuit generation device according to claim 1,
Further comprising a static single assignment form conversion unit that, if the top-level function is not in static single assignment form, in which each variable has only one assignment instruction, converts the control flow graph into static single assignment form before it is input to the control flow degeneration conversion unit,
wherein the static single assignment form conversion unit includes:
A φ function instruction insertion unit that inserts, at each location in the control flow graph where a plurality of definitions of the same variable merge, a φ function instruction that selects, from all the merging definitions, the definition on the path actually executed;
A variable name conversion unit that converts the control flow graph into static single assignment form by renaming each variable in the control flow graph so that it has a different name for each assignment instruction, so that after the renaming each variable has only one assignment instruction; and
A state variable name reconversion unit that, for each state variable after the renaming, renames the variable again so that its name in the start block of the control flow graph matches its name reaching the end block,
and
The control flow degeneration conversion unit includes a φ function instruction materializing unit that converts each φ function instruction into a concrete operation instruction.
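
As a rough illustration of claim 2, the sketch below models a φ function instruction materialized as a two-input selector. Treating the concrete operation as a multiplexer-style select is an assumption made for illustration; the helper names `materialize_phi` and `step` are hypothetical.

```python
# Minimal sketch, assuming the phi function is materialized as a select
# (multiplexer) driven by the branch condition. Names are illustrative.

def materialize_phi(cond, value_if_true, value_if_false):
    """Model phi(value_if_true, value_if_false) as a 2-input selector."""
    return value_if_true if cond else value_if_false


def step(state, enable):
    # Hypothetical source before conversion:
    #     if enable:
    #         state = state + 1
    # After renaming, each assignment gets its own name.
    state_1 = state + 1                                # definition on the taken path
    state_2 = materialize_phi(enable, state_1, state)  # phi at the merge point
    # A state variable name reconversion would give state_2 the same name as
    # the incoming "state", so one register holds both current and next value.
    return state_2
```

Once the φ function instruction is a plain select, the conditional branch carries no remaining information and can be removed, which is the effect of the control flow degeneration.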

The logic circuit generation device according to claim 1 or 2,
Further comprising a register / memory array access instruction decomposition unit that includes:
An array assignment instruction decomposition unit that, when the top-level function includes an array assignment instruction, which is a write instruction for an array variable,
adds a write data variable and a write address variable to the control flow graph for each array variable, and
decomposes each array assignment instruction into an instruction that assigns the value to be assigned to the write data variable, an instruction that assigns the array index value to the write address variable, and an instruction that assigns the value of the write data variable to the array element indexed by the write address variable; and
An array reference instruction decomposition unit that, when the top-level function includes an array reference instruction, which is a read instruction for an array variable,
adds a read address variable to the control flow graph for each array variable, and
decomposes each array reference instruction into an instruction that assigns the array index value to the read address variable and an instruction that refers to the array element indexed by the read address variable,
wherein, for the memory that holds the array element data in the logic circuit,
The write data variable corresponds to a write data port of the memory;
The write address variable corresponds to a write address port of the memory;
The read address variable corresponds to a read address port of the memory; and
The control flow graph generated by the control flow graph generation unit is processed by the register / memory array access instruction decomposition unit and then processed by the control flow degeneration conversion unit.
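
The decomposition in claim 3 can be sketched as a textual rewrite. The suffixes `_wd`, `_wa`, and `_ra` for the write data, write address, and read address variables are naming assumptions made only for this example, as are the function names.

```python
# Minimal sketch of the array access decomposition, operating on source text.
# The variable naming scheme (<array>_wd, <array>_wa, <array>_ra) is assumed.

def decompose_array_assignment(array, index_expr, value_expr):
    """Split `array[index_expr] = value_expr` into three instructions."""
    write_data, write_addr = f"{array}_wd", f"{array}_wa"
    return [
        f"{write_data} = {value_expr}",             # drives the write data port
        f"{write_addr} = {index_expr}",             # drives the write address port
        f"{array}[{write_addr}] = {write_data}",    # the memory write itself
    ]


def decompose_array_reference(dest, array, index_expr):
    """Split `dest = array[index_expr]` into two instructions."""
    read_addr = f"{array}_ra"
    return [
        f"{read_addr} = {index_expr}",              # drives the read address port
        f"{dest} = {array}[{read_addr}]",           # the memory read itself
    ]
```

After this rewrite the address and data expressions are ordinary variables in the control flow graph, so the later stages can handle them like any other signal.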

The logic circuit generation device according to claim 3,
wherein the register / memory array access instruction decomposition unit further includes:
A write port number assigning unit that, for each array assignment instruction for each array variable in the control flow graph, assigns to that array assignment instruction, as its write port number, the maximum number of array assignment instructions to the array variable that are executed between the start block and that array assignment instruction; and
A read port number assigning unit that, for each array reference instruction for each array variable in the control flow graph, assigns to that array reference instruction, as its read port number, the maximum number of array reference instructions to the array variable that are executed between the start block and that array reference instruction,
and wherein
The array assignment instruction decomposition unit adds, for each array variable, the write data variable and the write address variable to the control flow graph for each write port number, and adds the read address variable to the control flow graph for each read port number, and
The memory for holding the array element data has the same number of write ports as the number of write port numbers assigned to the assignment instructions of the array variable, and the same number of read ports as the number of read port numbers assigned to the reference instructions of the array variable.
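
One possible reading of the port number assignment in claim 4 is a longest-path count over an acyclic control flow graph. The sketch below takes that reading and assumes 1-based numbering that counts the instruction itself; the block names and the `assign_port_numbers` helper are hypothetical.

```python
# Minimal sketch under stated assumptions: the control flow graph is acyclic,
# each block lists how many accesses to one array it contains, and a port
# number is the largest count of such accesses on any path from the start
# block up to and including the access (1-based). Not the patented algorithm.

def assign_port_numbers(accesses_per_block, predecessors):
    """Return {(block, k): port_number} for the k-th access in each block."""
    max_before_cache = {}

    def max_before(block):
        # Largest number of accesses executed on any path before this block.
        if block not in max_before_cache:
            preds = predecessors.get(block, [])
            max_before_cache[block] = max(
                (max_before(p) + accesses_per_block[p] for p in preds),
                default=0,
            )
        return max_before_cache[block]

    ports = {}
    for block, count in accesses_per_block.items():
        base = max_before(block)
        for k in range(count):
            ports[(block, k)] = base + k + 1
    return ports


# Diamond-shaped graph: entry -> then / else -> exit.
cfg_accesses = {"entry": 1, "then": 2, "else": 1, "exit": 1}
cfg_predecessors = {"then": ["entry"], "else": ["entry"], "exit": ["then", "else"]}
ports = assign_port_numbers(cfg_accesses, cfg_predecessors)
required_ports = max(ports.values())
```

Because the `then` and `else` blocks are mutually exclusive, their accesses can share port numbers, so the memory needs as many ports as the longest path requires, not as many as the total number of access instructions.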

The logic circuit generation device according to claim 4,
Further comprising a memory port number determination unit that determines whether the number of write port numbers assigned by the write port number assigning unit is less than or equal to a predetermined write memory port number threshold, and whether the number of read port numbers assigned by the read port number assigning unit is less than or equal to a predetermined read memory port number threshold,
wherein
If the number of write port numbers is not less than or equal to the predetermined write memory port number threshold, or if the number of read port numbers is not less than or equal to the predetermined read memory port number threshold, the process of generating the logic circuit description is stopped.

The logic circuit generation device according to any one of claims 1 to 5,
wherein the input program includes an attribute description for giving an attribute to each variable, and the attribute includes:
A bit width attribute that specifies the bit width of the variable's data;
A register attribute that specifies that the value of the variable is held in a register; and
A memory attribute that specifies that the values of the array elements of an array variable are held in a memory,
and wherein
The logic circuit generation device generates a logic circuit description including a sequential circuit whose state is expressed by using, as the state variable, a variable having the register attribute or the memory attribute.

The logic circuit generation device according to claim 6, further comprising:
A bit width determination unit that, for each variable included in the data flow graph, calculates the bit width of the variable from the bit widths of the variables and / or constants referenced in the assignment instruction for the variable and from the type of operation executed in that assignment instruction;
An arithmetic circuit delay evaluation unit that calculates the signal propagation delay time of each arithmetic unit based on the bit widths calculated for the variables included in the data flow graph; and
A pipeline boundary arrangement unit including a pipeline constraint extraction unit and a pipeline stage number determination unit,
wherein
The pipeline constraint extraction unit adds a pipeline boundary attribute to each directed edge of the data flow graph between an arithmetic unit that executes an assignment instruction for the state variable and an arithmetic unit that executes a reference instruction for the state variable, and
The pipeline stage number determination unit determines the number of pipeline stages used to generate a circuit description of a clock-synchronous pipeline circuit in accordance with a pipeline stage number lower limit, which is the minimum necessary number of pipeline stages determined by the constraint based on the pipeline boundary attribute, and with either a number of pipeline stages calculated from a designated clock period or a number of pipeline stages designated in advance.
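
The bit width and pipeline stage calculations of claim 7 might look like the following sketch. The width rules (one extra bit for unsigned addition, summed widths for unsigned multiplication) are standard worst-case arithmetic bounds rather than rules taken from the specification, and combining the lower limit with the requested stage count by taking the maximum is likewise an assumption; the claim only says the count is determined in accordance with both.

```python
import math

# Minimal sketch under stated assumptions; function names are illustrative.

def result_bit_width(op, operand_widths):
    """Worst-case result width of an unsigned operation (assumed rules)."""
    if op == "add":
        return max(operand_widths) + 1     # one carry bit
    if op == "mul":
        return sum(operand_widths)         # full-precision product
    if op in ("and", "or", "xor"):
        return max(operand_widths)
    raise ValueError(f"unsupported operation: {op}")


def stages_from_clock(total_delay_ns, clock_period_ns):
    """Number of stages needed so each stage fits in one clock period."""
    return max(1, math.ceil(total_delay_ns / clock_period_ns))


def pipeline_stage_count(lower_limit, requested):
    """Never go below the lower limit imposed by pipeline boundary constraints."""
    return max(lower_limit, requested)
```

For example, a 9.0 ns combinational path with a 2.5 ns clock period would need four stages under this model, and a designer's request for two stages would be raised to three if the state variable constraints impose a lower limit of three.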

The logic circuit generation device according to any one of claims 1 to 7, further comprising:
A logic circuit input signal extraction unit that extracts the input signals of the circuit to be described by the circuit description from the arguments of the top-level function and from global variables; and
A logic circuit output signal extraction unit that extracts the output signals of the circuit to be described by the circuit description from the arguments and the return value of the top-level function and from global variables.

The logic circuit generation device according to any one of claims 1 to 8,
Further comprising an acyclic / non-hierarchical conversion unit that includes:
A complete inline expansion unit that, when the top-level function includes a function call instruction, converts the top-level function into a lowest-level function that does not include a function call instruction by inlining each function call instruction; and
A complete loop expansion unit that, when the lowest-level function converted by the complete inline expansion unit includes a loop processing unit having a fixed number of repetitions, converts the lowest-level function into an acyclic lowest-level function that does not include a loop processing unit by unrolling each loop processing unit having a fixed number of repetitions,
wherein
The program input to the logic circuit generation device is converted into an acyclic lowest-level function by the acyclic / non-hierarchical conversion unit and is then input to the control flow graph generation unit.
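
Complete loop expansion as recited in claim 9 can be sketched on source text. The template string with an `{i}` placeholder and the function name `unroll_fixed_loop` are assumptions for illustration; the refusal of a non-constant repetition count mirrors the stop condition of the following claim.

```python
# Minimal sketch: unroll a loop whose repetition count is a known constant
# into straight-line statements. The "{i}" template convention is assumed.

def unroll_fixed_loop(body_template, iterations):
    """Return one copy of the loop body per iteration, with {i} substituted."""
    if not isinstance(iterations, int) or iterations < 0:
        # A repetition count that is not a constant cannot be fully unrolled.
        raise ValueError("loop repetition count must be a non-negative constant")
    return [body_template.format(i=i) for i in range(iterations)]


unrolled = unroll_fixed_loop("acc = acc + x[{i}]", 3)
```

The result contains no back edge, so a control flow graph built from it is acyclic, which is what the later stages of the device require.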

The logic circuit generation device according to claim 9, wherein
The complete inline expansion unit is configured to determine that the function cannot be converted into a lowest-level function, and to stop processing, when the function call instructions are not fully expanded even after inline expansion of the input function has been repeated a predetermined number of times,
The complete loop expansion unit is configured to determine that the function cannot be converted into an acyclic lowest-level function, and to stop processing, when the input function includes a loop processing unit whose number of repetitions is not a constant, and
When the complete inline expansion unit stops processing, or when the complete loop expansion unit stops processing, the logic circuit generation device stops the process of generating the logic circuit description.

The logic circuit generation device according to any one of claims 1 to 10,
Further comprising a state variable instruction dependency determination unit that determines whether the control flow graph is free of any sequence in which an assignment instruction, a reference instruction, and an assignment instruction are executed in succession in this order for the same state variable,
wherein, when the determination by the state variable instruction dependency determination unit is negative, that is, when such a sequence is present, the process of generating the logic circuit description is stopped.
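
The dependency check of claim 11 can be sketched for a single execution path. The encoding of a path as `(kind, variable)` pairs and the function name `has_assign_ref_assign` are assumptions; a real implementation would have to examine every path through the control flow graph.

```python
# Minimal sketch: scan the accesses to one state variable along one path and
# report whether an assignment, then a reference, then another assignment
# occur in that order. The path encoding is assumed for illustration.

def has_assign_ref_assign(path, state_variable):
    """path: list of (kind, variable) with kind 'assign' or 'ref'."""
    stage = 0   # 0: nothing seen, 1: assignment seen, 2: reference after it
    for kind, variable in path:
        if variable != state_variable:
            continue
        if stage == 0 and kind == "assign":
            stage = 1
        elif stage == 1 and kind == "ref":
            stage = 2
        elif stage == 2 and kind == "assign":
            return True
    return False


acceptable = [("ref", "s"), ("assign", "s")]
rejected = [("assign", "s"), ("ref", "s"), ("assign", "s")]
```

A plausible reason for rejecting the second pattern is that the reference would observe an intermediate value that is neither the current state nor the next state of the register.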

A logic circuit generation method performed by a logic circuit generation device that takes as input a program including a top-level function, which is the target of logic circuit generation and describes a behavioral description, that is, the flow of a series of hardware processes for circuit design, and that generates a logic circuit description,
wherein the logic circuit generation device performs:
A control flow graph generation step of generating a control flow graph from the top-level function, which includes neither a loop processing unit nor a function call instruction;
A control flow degeneration conversion step of generating, from the control flow graph in which each variable has only one assignment instruction, a control flow degenerate program, which is a program whose control flow has been degenerated by removing all conditional branch instructions;
A data flow graph generation step of generating a data flow graph from the control flow degenerate program by taking each instruction of the control flow degenerate program as a node and adding a directed edge from the assignment instruction for each variable to each instruction that refers to that variable; and
A logic circuit description generation step of generating a logic circuit description representing a sequential circuit in which each directed edge of the data flow graph corresponds to a wire of the logic circuit and each node of the data flow graph corresponds to an arithmetic unit of the logic circuit,
and wherein
The state variable representing the state of the sequential circuit is expressed in the program as a local variable or a static variable of an upper-level function that calls the top-level function, and
The value of the state variable before execution of the assignment instruction to the state variable represents the current state of the sequential circuit, and the value of the state variable after execution of that assignment instruction represents the next state of the sequential circuit.

The logic circuit generation method according to claim 12, wherein
When the top-level function is not in static single assignment form, in which each variable has only one assignment instruction, the logic circuit generation device further performs, before the control flow degeneration conversion step, a static single assignment form conversion step of converting the control flow graph into static single assignment form,
The static single assignment form conversion step includes:
A φ function instruction insertion step of inserting, at each location in the control flow graph where a plurality of definitions of the same variable merge, a φ function instruction that selects, from all the merging definitions, the definition on the path actually executed;
A variable name conversion step of converting the control flow graph into static single assignment form by renaming each variable in the control flow graph so that it has a different name for each assignment instruction, so that after the renaming each variable has only one assignment instruction; and
A state variable name reconversion step of renaming, for each state variable after the renaming, the variable again so that its name in the start block of the control flow graph matches its name reaching the end block,
and
The control flow degeneration conversion step further includes a φ function instruction materializing step of converting each φ function instruction into a concrete operation instruction.

The logic circuit generation method according to claim 12 or 13, wherein
When the top-level function includes an array assignment instruction, which is a write instruction for an array variable, and / or an array reference instruction, which is a read instruction for an array variable, the logic circuit generation device further performs, before the control flow degeneration conversion step, a register / memory array access instruction decomposition step of decomposing the array assignment instructions and / or the array reference instructions in the control flow graph,
The register / memory array access instruction decomposition step includes:
An array assignment instruction decomposition step of
adding a write data variable and a write address variable to the control flow graph for each array variable, and
decomposing each array assignment instruction into an instruction that assigns the value to be assigned to the write data variable, an instruction that assigns the array index value to the write address variable, and an instruction that assigns the value of the write data variable to the array element indexed by the write address variable; and
An array reference instruction decomposition step of
adding a read address variable to the control flow graph for each array variable, and
decomposing each array reference instruction into an instruction that assigns the array index value to the read address variable and an instruction that refers to the array element indexed by the read address variable,
and, for the memory that holds the array element data in the logic circuit,
The write data variable corresponds to a write data port of the memory;
The write address variable corresponds to a write address port of the memory; and
The read address variable corresponds to a read address port of the memory.

The logic circuit generation method according to claim 14, wherein
The register / memory array access instruction decomposition step further includes:
A write port number assignment step of assigning, for each array assignment instruction for each array variable in the control flow graph, to that array assignment instruction, as its write port number, the maximum number of array assignment instructions to the array variable that are executed between the start block and that array assignment instruction; and
A read port number assignment step of assigning, for each array reference instruction for each array variable in the control flow graph, to that array reference instruction, as its read port number, the maximum number of array reference instructions to the array variable that are executed between the start block and that array reference instruction,
The array assignment instruction decomposition step adds, for each array variable, the write data variable and the write address variable to the control flow graph for each write port number, and adds the read address variable to the control flow graph for each read port number, and
A description is generated of the memory for holding the array element data, the memory having the same number of write ports as the number of write port numbers assigned to the assignment instructions of the array variable and the same number of read ports as the number of read port numbers assigned to the reference instructions of the array variable.

The logic circuit generation method according to claim 15, wherein
The logic circuit generation device further performs a memory port number determination step,
The memory port number determination step includes:
Determining whether the number of write port numbers assigned in the write port number assignment step is less than or equal to a predetermined threshold; and
Determining whether the number of read port numbers assigned in the read port number assignment step is less than or equal to a predetermined threshold,
and
When the number of write port numbers or the number of read port numbers exceeds its threshold, the logic circuit generation device stops the process of generating the logic circuit description.

The logic circuit generation method according to any one of claims 12 to 16, wherein
The input program includes an attribute description for giving an attribute to each variable, and the attribute includes:
A bit width attribute that specifies the bit width of the variable's data;
A register attribute that specifies that the value of the variable is held in a register; and
A memory attribute that specifies that the values of the array elements of an array variable are held in a memory,
and
The logic circuit generation device generates a circuit description including a sequential circuit whose state is expressed by using, as the state variable, a variable having the register attribute or the memory attribute.

The logic circuit generation method according to claim 17, wherein
The logic circuit generation device further performs:
A bit width determination step of calculating, for each variable included in the data flow graph, the bit width of the variable from the bit widths of the variables and / or constants referenced in the assignment instruction for the variable and from the type of operation executed in that assignment instruction;
An arithmetic circuit delay evaluation step of calculating the signal propagation delay time of each arithmetic unit based on the bit widths calculated for the variables included in the data flow graph; and
A pipeline boundary arrangement step including a pipeline constraint extraction step and a pipeline stage number determination step,
In the pipeline constraint extraction step, a pipeline boundary attribute is added to each directed edge of the data flow graph between an arithmetic unit that executes an assignment instruction for the state variable and an arithmetic unit that executes a reference instruction for the state variable, and
In the pipeline stage number determination step, the number of pipeline stages used to generate a circuit description of a clock-synchronous pipeline circuit is determined in accordance with a pipeline stage number lower limit, which is the minimum necessary number of pipeline stages determined by the constraint based on the pipeline boundary attribute, and with either a number of pipeline stages calculated from a designated clock period or a number of pipeline stages designated in advance.

The logic circuit generation method according to any one of claims 12 to 18, wherein
The logic circuit generation device further performs, before the control flow graph generation step:
A logic circuit input signal extraction step of extracting the input signals of the circuit to be described by the circuit description from the arguments of the top-level function and from global variables; and
A logic circuit output signal extraction step of extracting the output signals of the circuit to be described by the circuit description from the arguments and the return value of the top-level function and from global variables.

The logic circuit generation method according to any one of claims 13 to 19, wherein
When the top-level function includes a function call instruction and / or a loop processing unit having a fixed number of repetitions, the logic circuit generation device further performs an acyclic / non-hierarchical conversion step before the control flow graph generation step,
The acyclic / non-hierarchical conversion step includes:
A complete inline expansion step of converting the top-level function into a lowest-level function that does not include a function call instruction by inlining each function call instruction; and
A complete loop expansion step of converting the lowest-level function into an acyclic lowest-level function that does not include a loop processing unit by unrolling each loop processing unit having a fixed number of repetitions.

The logic circuit generation method according to claim 20, wherein
In the complete inline expansion step, when the function call instructions are not fully expanded even after inline expansion of the input function has been repeated a predetermined number of times, it is determined that the function cannot be converted into a lowest-level function, and processing is stopped,
In the complete loop expansion step, when the input function includes a loop structure whose number of repetitions is not a constant, it is determined that the function cannot be converted into an acyclic lowest-level function, and processing is stopped, and
When processing is stopped in the complete inline expansion step or in the complete loop expansion step, the logic circuit generation device stops the process of generating the logic circuit description.

The logic circuit generation method according to any one of claims 13 to 21, wherein
The logic circuit generation device further performs
A state variable instruction dependency determination step of determining whether the control flow graph is free of any sequence in which an assignment instruction, a reference instruction, and an assignment instruction are executed in succession in this order for the same state variable, and
When the determination in the state variable instruction dependency determination step is negative, that is, when such a sequence is present, the process of generating the logic circuit description is stopped.

A logic circuit generation computer program for causing a computer to execute each step of the logic circuit generation method according to any one of claims 12 to 22.