JP4836162B2

JP4836162B2 - Semiconductor integrated circuit

Info

Publication number: JP4836162B2
Application number: JP2004341612A
Authority: JP
Inventors: 俊輝山中
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2004-11-26
Filing date: 2004-11-26
Publication date: 2011-12-14
Anticipated expiration: 2024-11-26
Also published as: JP2006155703A

Description

The present invention relates to a semiconductor integrated circuit, such as an SRAM, in which a memory unit having a plurality of data transmission lines and a logic unit are formed on a single semiconductor chip and which performs a plurality of data processing operations simultaneously. In particular, it relates to a SIMD-type semiconductor integrated circuit used for image processing and the like.

Conventionally, in image processing and other applications that handle large amounts of data, semiconductor integrated circuits in which a memory unit and a logic unit (processor) are mounted on a single semiconductor chip have been widely used as dedicated high-speed processing systems. A typical example is the SIMD (Single Instruction Multiple Data) type, in which a single instruction causes a plurality of arithmetic circuits to process a large amount of data from the memory unit (memory core) simultaneously and in parallel. Various kinds of image processing are realized by executing this repeatedly.

Processing such a large amount of data requires a correspondingly large-capacity memory circuit and a plurality of arithmetic circuits to process the data. A large memory area can also be provided by using a plurality of small-capacity memory circuits. In that case, however, each memory circuit needs its own control circuit, which increases the chip size.

In a semiconductor integrated circuit that performs simultaneous parallel processing, as in the SIMD type, it is possible, partly for the purpose of reducing chip size, to mount a single large-capacity memory circuit (memory unit) that has a plurality of data transmission lines (data input/output lines) and whose word lines are driven by a common driver. In that case, the data on one word line of the memory cell array are read out in parallel, and arithmetic processing of the read data is executed in parallel.
However, processing a large amount of data at once requires a large number of data input/output lines in the memory core. This increases the load in the word line direction, and the wiring resistance of the word line produces a large time difference (delay) in word line selection time between the vicinity of the driver and the farthest end. As a result, the read and write timing of the memory unit differs among the data input/output lines.
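The position dependence of the word line selection time can be illustrated with a first-order RC model. The sketch below is not taken from the patent: it assumes the word line behaves as a uniform ladder of n identical segments driven from one end, with hypothetical normalized per-segment resistance r and capacitance c, and it uses the Elmore approximation to estimate the delay at tap k.

```python
def elmore_delay(k: int, n: int, r: float, c: float) -> float:
    """Elmore delay at tap k (1 = nearest the driver, n = farthest end)
    of a uniform n-segment RC ladder driven from tap 0.

    Each capacitor j contributes c times the resistance it shares with
    the path to tap k, which is r * min(j, k).
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    shared_up_to_k = k * (k + 1) / 2   # sum of min(j, k) for j = 1..k
    shared_beyond_k = k * (n - k)      # sum of min(j, k) for j = k+1..n
    return r * c * (shared_up_to_k + shared_beyond_k)


if __name__ == "__main__":
    n = 4096            # illustrative number of cells on one word line
    r, c = 1.0, 1.0     # hypothetical, normalized per-segment values
    near = elmore_delay(1, n, r, c)
    far = elmore_delay(n, n, r, c)
    print(f"near end: {near:.0f}, far end: {far:.0f}, ratio: {far / near:.1f}")
```

Under this model the far-end delay is r*c*n*(n+1)/2, so it grows roughly with the square of the number of cells on the line, which is why adding data input/output lines widens the gap between the near and far ends.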

Wiring delay caused by such an increase in capacity can arise in the logic unit as well. The signals that control the plurality of arithmetic circuits for simultaneous processing of a large amount of data have a considerable wiring length and wiring load, although not as large as those of the signal that drives the word line of the memory unit. Consequently, the data processing timing also becomes skewed among the arithmetic circuits in the logic unit.

Meanwhile, Patent Document 1 and others disclose a technique for simulating changes in word line resistance and bit line capacitance in a circuit consisting of a memory alone, with the aim of optimizing the timing of the dynamic sense amplifier in the memory circuit.
Patent Documents 2 to 4 and others disclose techniques that use dummy cells in a circuit consisting of a memory alone, with the aim of, for example, reducing current consumption in an SRAM self-timing circuit.

Patent Document 1: Japanese Patent No. Hei 10-177792
Patent Document 2: JP 2002-367377 A
Patent Document 3: JP 2003-7055 A
Patent Document 4: JP 2003-36678 A

Although the conventional semiconductor integrated circuits described above, typified by the SIMD type, can process a large amount of data simultaneously, the input/output signals of the memory unit (memory core) and of the logic unit containing the arithmetic circuits appear with mutually independent delays. This makes the timing control needed to synchronize the two difficult.

For example, even if the timing of the input/output data near the driver of the memory core's control signal (word line signal) is matched to that of the corresponding arithmetic circuit, the timing of the input/output data far from the driver no longer matches, because the memory core and the logic unit have different degrees of delay. If, instead, setup and hold times are provided for each data item so that all arithmetic circuits operate correctly, the operating frequency has to be lowered. Such timing skew not only hinders high-speed operation but also wastes current.

One reason the memory core and the logic unit are difficult to synchronize is the layout peculiar to memory cells. Arithmetic circuits are normally synthesized by combining basic logic gates, whereas in a memory circuit the cell shape for each bit is drawn according to special design rules in order to raise the integration density. This makes it difficult to compile delay information into a database and handle the memory like other logic circuits, so the memory core itself has often been treated as a black box. Setting the timing of each input/output data line in advance is also conceivable, but in practice manufacturing variations and the like make accurate setting difficult, and each specification ends up being set with an added operating margin.

For a semiconductor integrated circuit that handles small amounts of data, transferring all data with the same timing causes no great problem. A SIMD-type semiconductor integrated circuit (processor) that handles large amounts of data, however, performs parallel processing simultaneously, so it must be controlled so that the operation timings match. In the layout of a SIMD-type semiconductor integrated circuit, the input/output lines of the large-capacity memory circuit and those of the logic unit containing the arithmetic circuits are arranged at matching pitches. Because the wiring loads of the respective control signals differ, however, a method of adjusting the timing skew of the input/output lines is needed.

To address such problems, techniques have been disclosed that monitor the delay components of the word lines and bit lines in a memory circuit and control the activation timing of its sense amplifiers and the like. These techniques provide dummy circuits that replicate the word lines and bit lines actually used and simulate their operation. Specifically, a dummy memory cell is placed at the farthest end of the word line and bit line to simulate a memory cell access on the critical path, and the activation period of the memory circuit is controlled accordingly. Because such techniques determine the timing from the slowest path, they pose no problem for a small-capacity memory circuit whose input/output timing is defined uniformly, but for a large-capacity memory circuit they yield a circuit that accommodates only the critical path.

Moreover, the techniques of Patent Documents 1 to 4 described above all provide a dummy circuit in a circuit consisting of a memory alone. None of them controls the delay-induced variation among the input/output data lines in a semiconductor integrated circuit in which a memory unit and a logic unit are formed on a single semiconductor chip.

The present invention has been made to solve the problems described above. Its object is to provide a semiconductor integrated circuit in which a memory unit and a logic unit are formed on a single semiconductor chip and which performs a plurality of data processing operations simultaneously, wherein the timing of data exchange between the memory unit and the logic unit is optimized so that operating performance and operating speed are improved and current consumption is reduced.

The semiconductor integrated circuit according to claim 1 of the present invention is a semiconductor integrated circuit in which a memory unit having a plurality of data transmission lines and a logic unit are formed on a single semiconductor chip and which performs a plurality of data processing operations simultaneously, comprising: a monitor circuit that monitors information relating to the delay arising among the plurality of data transmission lines in the memory unit; and a generation circuit that generates a plurality of internal synchronous clocks whose phases differ in accordance with the delay, wherein the internal synchronous clocks are used as signals for adjusting the timing between the memory unit and the logic unit; the plurality of data transmission lines are a plurality of word lines and bit lines; the monitor circuit is a dummy word line formed so as to simulate the operation of the word lines; a plurality of memory cells are connected to each word line; and dummy memory cells, which hold data fixed in advance and are formed so as to simulate the operation of the memory cells, are connected to the dummy word line; the circuit further comprising: a second dummy word line formed so as to operate in opposite phase to the dummy word line; a second dummy memory cell connected to the second dummy word line; a precharge circuit that is connected to the output node of the dummy memory cell and receives the output signal from the second dummy memory cell; and a second precharge circuit that is connected to the output node of the second dummy memory cell and receives the output signal from the dummy memory cell.

The semiconductor integrated circuit according to claim 2 is the circuit according to claim 1, wherein the generation circuit generates the internal synchronous clocks in accordance with rising or falling transitions of the word line.

In the present invention, in a semiconductor integrated circuit in which a memory unit and a logic unit are formed on a single semiconductor chip and which performs a plurality of data processing operations simultaneously, delay information among the data transmission lines in the memory unit is monitored, internal synchronous clocks whose phases differ in accordance with the delay are generated, and these clocks are used as signals for timing adjustment. This makes it possible to provide a semiconductor integrated circuit in which the timing of data exchange between the memory unit and the logic unit is optimized, operating performance and operating speed are improved, and current consumption is reduced.

The best mode for carrying out the present invention is described in detail below with reference to the drawings. In the figures, identical or corresponding parts are given the same reference numerals, and redundant description is simplified or omitted as appropriate.

Embodiment 1
Embodiment 1 of the present invention is described in detail with reference to FIGS. 1 to 3. In describing Embodiment 1, reference is made as appropriate to FIGS. 7 and 8, which relate to a conventional semiconductor integrated circuit.
FIG. 1 is a circuit diagram showing the SIMD-type semiconductor integrated circuit of Embodiment 1. FIG. 7, by contrast, is a circuit diagram showing a conventional SIMD-type semiconductor integrated circuit. The semiconductor integrated circuit 1 of Embodiment 1 differs structurally from the conventional one chiefly in that it is provided with a dummy word line 12 to which a plurality of dummy memory cells 21 are connected.
FIG. 2 is a timing chart showing the operation timing of the semiconductor integrated circuit of Embodiment 1, and FIG. 3 is a timing chart showing the operation timing when only the logic unit operates. FIG. 8, by contrast, is a timing chart showing the operation timing of the conventional semiconductor integrated circuit.

Referring to FIG. 1 (or FIG. 7), the SIMD-type semiconductor integrated circuit 1 consists mainly of a memory unit 2 (memory core) having a plurality of data transmission lines 11 and 13, and a logic unit 3. The memory unit 2 and the logic unit 3 are connected to a decoder 6.
To process a large amount of data simultaneously and in parallel, the SIMD-type semiconductor integrated circuit 1 drives a plurality of circuits with a single data transmission line (control signal line). Specifically, a plurality of memory cells 20 are connected to each word line 11 of the memory unit 2.

First, as background for understanding the configuration and operation of the semiconductor integrated circuit 1 of Embodiment 1, the problems of the conventional semiconductor integrated circuit 1 are summarized below with reference to FIGS. 7 and 8.
Suppose the semiconductor integrated circuit 1 is used as an image processor that processes, for example, 8-bit data for 512 PEs (processor elements). In that case 4096 memory cells must be selected at once; that is, a single word line 11 must drive as many as 4096 memory cells. The wiring load on the word line 11 is therefore very heavy and the wiring is very long.
Consequently, in the conventional semiconductor integrated circuit 1 shown in FIG. 7, wiring delay produces a time difference between the near end and the farthest end of the word line 11, and as a result the time at which data transferred between the memory unit 2 and the logic unit 3 appears differs from position to position.

One conceivable way to reduce the wiring load of the word line 11 is to divide it into global wiring and local wiring and to buffer the local wiring. This reduces the load on the global wiring and so reduces the wiring delay to some extent. However, because the layout of each one-bit memory cell 20 is small, inserting a buffer for every word line 11 at that pitch makes the layout considerably larger. Furthermore, in recent fine-geometry semiconductor processes, inserting logic layout into the memory unit 2 requires layout dummy patterns to stabilize the finish of the boundary regions. If such dummy patterns are inserted for every buffer of the word line 11, the layout becomes considerably larger than when the word line 11 is driven by a single driver.
Reducing the wiring load of the word line 11 in this way therefore prevents the chip area from being reduced. To achieve both small area and high performance, the position-dependent delay of the input/output data in the memory unit 2 must be left as it is, and the data input/output timing of the logic unit 3 must be matched to it.

As shown in FIG. 8, the conventional semiconductor integrated circuit 1 has the problem that the delay component of the word line 11 in the memory unit 2 differs from the delay component of the control signal (CTREG) in the logic unit 3.
Specifically, because of the position-dependent delay of the word line 11 (WL0(0), WL0(m), WL0(n)) in the memory unit 2, the output signals (DO(0), DO(m), DO(n)) of the memory unit 2, which pass through the bit lines 13 (BL(0), BL(m), BL(n)), are output at different times. When these output signals are received by the register circuits 31 of the logic unit 3, each register circuit 31 must capture its data in step with the output timing of the memory unit 2. That is, the output of the memory unit 2 must be settled before the CTREG signal rises at the register circuit 31. If the control signal (CTREG) in the logic unit 3 rises too early, data cannot be captured at some positions.

Referring to FIG. 8, the data output from DO(0), near the decoder 6, has a short access time from the memory unit 2, so the register circuit 31 captures it with margin to spare. At DO(n), the farthest point from the decoder 6, however, the data output is late because of the delay of the word line 11 while the control signal of the register circuit 31 arrives early, so the data from the memory unit 2 can no longer be captured.
Thus, when the control signal timings of the memory unit 2 and the logic unit 3 do not match and the signal of the logic unit 3 arrives too early, the data setup time is insufficient and the logic unit 3 malfunctions. When the control signal of the logic unit 3 arrives too late, the data from the memory unit 2 cannot be held, resulting in an erroneous write to the logic unit 3.
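The two failure modes described above (a control edge that is too early, and one that is too late) can be sketched as a simple window check. The following is only an illustration under assumed numbers: the stable windows for DO(0), DO(m) and DO(n), the CTREG edge times, and the setup and hold requirements are all hypothetical values in picoseconds, not figures from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataWindow:
    """Interval (in ps) during which a memory output DO(i) is stable."""
    valid_from: int
    valid_until: int


def check_capture(window: DataWindow, edge: int, t_setup: int, t_hold: int) -> str:
    """Classify a register capture at time `edge` as 'ok', 'setup' or 'hold'."""
    if edge - t_setup < window.valid_from:
        return "setup"   # control edge too early: data not yet stable
    if edge + t_hold > window.valid_until:
        return "hold"    # control edge too late: data changes too soon
    return "ok"


# Hypothetical windows: outputs near the decoder settle first.
WINDOWS = {
    "DO(0)": DataWindow(100, 900),
    "DO(m)": DataWindow(300, 1100),
    "DO(n)": DataWindow(500, 1300),
}


def classify_all(edges: dict, t_setup: int = 50, t_hold: int = 50) -> dict:
    """Check every output against the control edge seen at its position."""
    return {name: check_capture(WINDOWS[name], edges[name], t_setup, t_hold)
            for name in WINDOWS}


if __name__ == "__main__":
    early = {"DO(0)": 250, "DO(m)": 270, "DO(n)": 290}     # fast CTREG
    late = {"DO(0)": 1000, "DO(m)": 1000, "DO(n)": 1000}   # slow CTREG
    print(classify_all(early))
    print(classify_all(late))
```

With these assumed numbers, the fast control signal captures DO(0) correctly but fails setup at DO(m) and DO(n), while the slow control signal fails hold at DO(0), matching the qualitative behavior described for FIG. 8.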

To prevent the malfunctions described above, the semiconductor integrated circuit 1 of Embodiment 1 operates the control signal of the logic unit 3 and the word line 11 of the memory unit 2 with the same delay component.
Specifically, referring to FIG. 1, the semiconductor integrated circuit 1 of Embodiment 1 has, in the memory unit 2, a dummy word line 12 for simulating the operation of the word line 11. A plurality of dummy memory cells 21 are connected to the dummy word line 12. The data in each dummy memory cell 21 is fixed in advance so that "L" (low) is output to one line (DBL) of a pair of bit lines 13. A precharge circuit 22 is connected to the output node of DBL so that its output is fixed at "H" (high) beforehand.

Thus, the semiconductor integrated circuit 1 of Embodiment 1 is obtained by adding one word line (the dummy word line 12) to the memory unit 2 of the conventional circuit (see FIG. 7). This addition corresponds to adding one bit's worth of memory cells and has almost no effect on the overall chip size.

The signal output by way of the dummy word line 12 (the internal synchronous clock) is supplied from the memory unit 2 to the logic unit 3 as an internal synchronization signal. The logic unit 3 is provided with selection circuits 35 and 36 that select between the control signals of the logic unit 3 (CTREG, CTALU) and the synchronization signal from the memory unit 2 (CKI), so that the signals can be used selectively according to the application. In Embodiment 1, OR gates are used as the selection circuits 35 and 36 and one of the two signals is held inactive, but a multiplexer may instead be used as the selection circuit to select one of the signals.
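The two selection options mentioned above can be compared at the logic level. This is a minimal behavioral sketch, not the patent's circuit: it assumes active-high signals and that "holding a signal inactive" means holding it low, and the function names are hypothetical.

```python
from itertools import product


def or_select(ctreg: bool, cki: bool) -> bool:
    """OR-gate selection: the unused source must be held low."""
    return ctreg or cki


def mux_select(use_cki: bool, ctreg: bool, cki: bool) -> bool:
    """Multiplexer selection: an explicit select input picks the source."""
    return cki if use_cki else ctreg


if __name__ == "__main__":
    for use_cki, level in product([False, True], repeat=2):
        # Hold the unselected source low and drive the selected one to `level`.
        ctreg = False if use_cki else level
        cki = level if use_cki else False
        assert or_select(ctreg, cki) == mux_select(use_cki, ctreg, cki)
    print("OR gate matches multiplexer when the idle source is held low")
```

The OR gate needs no select input but relies on the idle source staying low; the multiplexer tolerates activity on the idle source at the cost of an extra control signal.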

The operation of the semiconductor integrated circuit 1 of Embodiment 1 configured as described above is now explained with reference to FIG. 2.
As in the conventional circuit, the word line 11 (WL0(0), WL0(m), WL0(n)) is selected early on the side near the decoder 6 and late on the far side. The dummy word line 12 (WLd) simulates the operation of the word line 11 and behaves in the same way.

When a dummy memory cell 21 is selected by the dummy word line 12, "L" is output to DBL. Because DBL has been fixed at "H" in advance by the precharge circuit 22, CKI (the internal synchronous clock output by way of the dummy word line 12) changes from "L" to "H" in step with the output from the dummy memory cell 21. This is passed to the logic unit 3 as the internal synchronization signal.
Once the synchronization signal reaches the logic unit 3, the logic unit 3 captures the data from the memory unit 2 in step with that synchronization timing.
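Why a clock derived from the dummy word line tracks the data can be shown with a small timing model. The sketch below rests on assumptions that are not stated in the patent: the dummy word line sees the same arrival time as the real word line at each position, and the read delay `t_read` and the dummy-path delay `t_dummy` (dummy cell discharge plus CKI buffering) are position-independent constants. All values are hypothetical, in picoseconds.

```python
def tracked_margins(wl_arrival, t_read, t_dummy):
    """Capture margin per position when CKI is derived from the dummy word line.

    Data is ready at wl_arrival[i] + t_read; CKI rises at wl_arrival[i] + t_dummy.
    """
    return [(t + t_dummy) - (t + t_read) for t in wl_arrival]


def common_edge_margins(wl_arrival, t_read, edge):
    """Capture margin per position when every register sees the same edge time."""
    return [edge - (t + t_read) for t in wl_arrival]


if __name__ == "__main__":
    wl_arrival = [0, 200, 400]   # near, middle, far end of the word line
    print(tracked_margins(wl_arrival, t_read=150, t_dummy=200))
    print(common_edge_margins(wl_arrival, t_read=150, edge=250))
```

In this model the tracked clock gives the same margin at every position, whereas a single common edge gives a positive margin near the decoder and a negative one at the far end.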

The falling edge of the internal synchronization signal is obtained when the dummy word line 12 falls and the precharge signal (PRC) of the precharge circuit 22 is enabled. The timing of the precharge signal may also be synchronized with the transition of the word line 11 (this is described in another embodiment).

Because the synchronization signal generated by the dummy word line 12, which monitors the delay of the word line 11 in the memory unit 2, is also used in the logic unit 3, the conventional problem that the data setup time differs with the position of the input/output line is eliminated. Timing margins can therefore be kept to a minimum and the operating frequency can be raised. Matching the timing also reduces unnecessary operation in the semiconductor integrated circuit 1 and so reduces current consumption.
In addition, the synchronization signal supplied to the logic unit 3 is used for timing adjustment not only with the register circuit 31 (REG) but also with the arithmetic circuit 32 (ALU), so the arithmetic circuit 32 and the like can be operated while the memory unit 2 is being accessed.

FIG. 2 has been used to explain the operation of reading data from the memory unit 2 and its effect, but the same effect is obtained when data is written from the logic unit 3 to the memory unit 2. That is, the selection period of the word line 11 of the memory unit 2 and the timing of the input data from the logic unit 3 are matched, so that data is written correctly.

Thus, in Embodiment 1, the dummy word line 12 functions as a monitor circuit that monitors information relating to the delay arising on the word line 11, and the plurality of dummy memory cells 21 connected to the dummy word line 12 function as a generation circuit that generates a plurality of internal synchronous clocks whose phases differ in accordance with that delay.

In other words, the semiconductor integrated circuit 1 of Embodiment 1 is a circuit that, like a SIMD processor, handles a large amount of data simultaneously, and it raises the operating frequency by correcting the timing skew that arises during data transfer between the memory unit 2 and the logic unit 3 and matching the input/output timing.
This is achieved by generating the internal synchronous clocks using the layout shape peculiar to the memory cells 20. Using this semiconductor integrated circuit 1 makes timing adjustment between the memory unit 2 and the logic unit 3 easy and allows the operating frequency to be raised. Stabilizing the operation timing also reduces unnecessary current consumption. Furthermore, because the memory unit 2 is handled as a single block without being divided, the chip size can be kept relatively small.

Next, the operation timing of the semiconductor integrated circuit 1 when only the logic unit 3 is operated, without operating the memory unit 2, is described with reference to FIG. 3.
When only the logic unit 3 is operated, there is no need to stagger the operation timing of the circuits in the processor element (PE) direction to match the delay on the word line 11, so the internal synchronization signal (CKI) supplied from the memory unit 2 is not used. Because the logic unit 3 is built from logic circuits whose characteristics have been compiled into a database in advance, synchronizing the register circuit 31 (REG) and the arithmetic circuit 32 (ALU) is easy. As shown in FIG. 3, when only the logic unit 3 is operated, the operation timing is adjusted within the logic unit 3. That is, the load on the control signals (CTREG, CTALU) is lightened to reduce the delay within the logic unit 3. This allows faster operation when only the logic unit 3 operates than when it operates together with the memory unit 2.

As described above, in the first embodiment, in the semiconductor integrated circuit 1 in which the memory unit 2 and the logic unit 3 are formed on a single semiconductor chip and a plurality of data processing operations are performed simultaneously, delay information between the word lines 11 in the memory unit 2 is monitored, internal synchronization clocks whose phases differ in accordance with the delay are generated, and these clocks are used as signals for timing adjustment. As a result, the timing of data exchange between the memory unit 2 and the logic unit 3 is optimized, so that operating performance and operating speed are improved and current consumption is reduced.

In the first embodiment, the same number of dummy memory cells 21 as the memory cells 20 connected to a word line 11 are connected to the dummy word line 12. Alternatively, the dummy memory cells 21 can be connected to the dummy word line 12 every several bits (at predetermined intervals). In that case, the memory unit 2 generates a synchronization signal every several bits. Even so, the delay that arises within a span of several bits is not large, so the same effect as in the first embodiment can be obtained.
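The effect of thinning out the dummy memory cells can be illustrated with a rough model. The Python sketch below assumes, purely for illustration, that the word-line delay grows linearly by a fixed amount per column and that each column uses the synchronization signal from the nearest dummy memory cell at or before it. The function name, the parameters, and the linear delay model are assumptions and do not appear in the patent.

```python
def worst_case_skew_ps(num_columns, dummy_interval, delay_per_column_ps):
    """Largest gap between a column's word-line delay and the delay seen by
    the dummy memory cell whose synchronization signal that column uses.

    Assumed model (not from the patent): delay grows linearly along the word
    line, and a dummy memory cell sits at every `dummy_interval`-th column.
    """
    if dummy_interval < 1:
        raise ValueError("dummy_interval must be at least 1")
    worst = 0
    for column in range(num_columns):
        # Column position of the dummy cell that serves this column.
        tap = (column // dummy_interval) * dummy_interval
        worst = max(worst, (column - tap) * delay_per_column_ps)
    return worst


# One dummy cell per column (as in the first embodiment) versus one every 4 columns.
print(worst_case_skew_ps(64, 1, 10))  # 0
print(worst_case_skew_ps(64, 4, 10))  # 30
```

Under these assumptions the residual mismatch is bounded by the interval between dummy cells, which is consistent with the statement that a span of several bits adds only a small delay.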

Embodiment 2.
The second embodiment of the present invention will be described in detail with reference to FIG. 4.
FIG. 4 is a circuit diagram showing the semiconductor integrated circuit 1 according to the second embodiment. The semiconductor integrated circuit 1 of the second embodiment differs from that of the first embodiment in that the logic unit 3 is provided with a dummy word line 41 serving as a second monitor circuit.

As shown in FIG. 4, in the second embodiment, a dummy word line 41 to which a plurality of dummy memory cells 44 are connected is formed in the logic unit 3 so that the logic unit 3 can also be synchronized with the memory unit 2.
The signal output via the dummy word line 41 is supplied, as an internal synchronization signal, to the selection circuit 42 connected to the register circuit 31 and to the selection circuit 43 connected to the arithmetic circuit 32. The selection circuits 42 and 43 each select whether to use the control signal (CTREG, CTALU) or the internal synchronization signal generated by the dummy memory cells 44.
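The role of the selection circuits 42 and 43 can be sketched as a two-input multiplexer. The Python model below is only an illustration: signals are represented as 0/1 levels, and the single `use_internal_sync` flag shared by both selection circuits is an assumption, since the text does not say how the selection is controlled.

```python
def select_timing(use_internal_sync, control_signal, internal_sync):
    """Model of one selection circuit (42 or 43): pass through either the
    control signal or the internally generated synchronization signal."""
    return internal_sync if use_internal_sync else control_signal


def logic_unit_timing(use_internal_sync, ctreg, ctalu, internal_sync):
    """Timing inputs seen by the register circuit 31 (REG) and the
    arithmetic circuit 32 (ALU) after the selection circuits."""
    return {
        "REG": select_timing(use_internal_sync, ctreg, internal_sync),
        "ALU": select_timing(use_internal_sync, ctalu, internal_sync),
    }


# With the internal synchronization signal selected, both circuits follow it.
print(logic_unit_timing(True, ctreg=0, ctalu=0, internal_sync=1))
# With the control signals selected, each circuit follows its own control signal.
print(logic_unit_timing(False, ctreg=1, ctalu=0, internal_sync=1))
```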

To stabilize the process, it is preferable to form layout dummy patterns around the dummy memory cells 44 connected to the dummy word line 41. The second embodiment requires a larger area than forming the dummy word pattern in the memory unit 2 (as in the first embodiment), but it is effective for a semiconductor integrated circuit whose chip size is so large that the synchronization signal would have to be routed over a long distance from the memory unit 2.

As described above, in the second embodiment, the timing of data exchange between the memory unit 2 and the logic unit 3 is optimized and the timing adjustment within the logic unit 3 is also optimized, so that operating performance and operating speed are improved and current consumption is reduced.

Embodiment 3.
The third embodiment of the present invention will be described in detail with reference to FIG. 5.
FIG. 5 is a circuit diagram showing the semiconductor integrated circuit 1 according to the third embodiment. The semiconductor integrated circuit 1 of the third embodiment differs from that of the first embodiment in that dummy bit lines 23 are formed instead of the dummy word line 12.

As shown in FIG. 5, in the third embodiment, no dummy word line is provided; instead, dummy memory cells 28 are inserted into each word line 11 at equal intervals. Specifically, a dummy memory cell 28 is provided every m bits. There are as many dummy memory cells 28 as there are word lines 11, so that one of the dummy memory cells 28 is always selected. The dummy memory cells 28 are connected to a pair of dummy bit lines 23. Here, the dummy bit lines 23 (DBL) are formed so as to simulate the operation of the bit lines 13 (BL). A dummy sense amplifier 18, which replicates the sense amplifier 14 connected to the bit lines 13, is connected to the dummy bit lines 23.

In the semiconductor integrated circuit 1 configured as described above, when a word line 11 is selected, the memory cells 20 are selected and data appears on the pairs of bit lines 13. At the same time, the dummy memory cell 28 is also selected, and data appears on the pair of dummy bit lines 23.
The data on the bit lines 13 is amplified by the sense amplifier 14 and output. In synchronization with this, the dummy sense amplifier 18 generates a detection signal for the output signal. Using this detection signal as the internal synchronization signal makes it possible to monitor not only the delay component generated in the word line 11 but also the delay component generated in the bit line 13. The internal synchronization signal can therefore be generated at a timing even closer to that of the output signal from the memory unit 2.

In the third embodiment, the dummy circuits 18, 23, and 28 are formed next to the m-th bit, and these dummy circuits 18, 23, and 28 provide a single synchronization signal for the data from bit 0 to bit m as a group.
The smaller m is, the higher the accuracy, but the more dummy circuits are inserted and the larger the chip size becomes. Conversely, the larger m is, the harder it becomes to adjust the timing with the logic unit 3. Because the number of divisions (m) can easily be changed, it is preferable to choose it appropriately for the application of the semiconductor integrated circuit 1 with this trade-off in mind.
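The trade-off in choosing m can be made concrete with a small sizing sketch in Python. It assumes, for illustration only, that the worst-case timing mismatch inside one group is m times a fixed per-bit delay and that one set of dummy circuits is needed per group of m bits. The function name, the skew budget, and the candidate list are hypothetical and are not taken from the patent.

```python
import math


def choose_division(total_bits, delay_per_bit_ps, skew_budget_ps, candidates):
    """Return (m, dummy_circuit_count) for the largest candidate m whose
    assumed worst-case skew (m * delay_per_bit_ps) fits within the budget,
    or None if no candidate fits.

    A larger m means fewer dummy circuits (smaller chip) but coarser timing.
    """
    feasible = [m for m in candidates
                if m >= 1 and m * delay_per_bit_ps <= skew_budget_ps]
    if not feasible:
        return None
    m = max(feasible)
    return m, math.ceil(total_bits / m)


# 256 data bits, 10 ps of assumed delay per bit, 100 ps skew budget.
print(choose_division(256, 10, 100, [4, 8, 16, 32]))  # (8, 32)
```

Picking the largest feasible m minimizes the number of inserted dummy circuits while keeping the assumed mismatch within the budget; a design with a tighter timing requirement would simply pass a smaller budget.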

As described above, in the third embodiment, information related to the delay generated in the bit line 13 is monitored and an internal synchronization clock matched to it is generated, so the timing of data exchange between the memory unit 2 and the logic unit 3 is optimized, operating performance and operating speed are improved, and current consumption is reduced.

In the third embodiment, the dummy sense amplifier 18 is connected to the dummy bit lines 23 to generate a timing signal for reading from the memory cells 20. Alternatively, by connecting to the dummy bit lines 23 a dummy write buffer that replicates the write buffer 15 connected to the bit lines 13, a signal synchronized with the write timing to the memory cells 20 can also be generated easily.

Embodiment 4.
The fourth embodiment of the present invention will be described in detail with reference to FIG. 6.
FIG. 6 is a circuit diagram showing the semiconductor integrated circuit 1 according to the fourth embodiment. The semiconductor integrated circuit 1 of the fourth embodiment differs from that of the first embodiment in that a second dummy word line 51 is formed in addition to the dummy word line 12.

As shown in FIG. 6, in the fourth embodiment, in addition to the dummy word line 12 (WLd) that simulates the word line 11, a reverse-phase word line 51 (WLb) is formed as a second dummy word line that operates in reverse phase with respect to the dummy word line 12. As with the dummy word line 12, dummy memory cells 54 holding data fixed in advance are connected to the reverse-phase word line 51. A second precharge circuit 52 is connected to the output node of each dummy memory cell 54. The precharge circuit 22 and the second precharge circuit 52 are connected so that the output signal of each serves as the input signal of the other. With this configuration, the signal from the dummy word line 12 is used as the internal synchronization signal, and the signal from the reverse-phase word line 51 is used as the precharge signal (PRC).

In the semiconductor integrated circuit 1 configured as described above, when the word line 11 is not selected, the reverse-phase word line 51 is selected and the precharge signal is enabled.
When the word line 11 rises, the reverse-phase word line 51 falls at the same time, and the internal synchronization signal (CKI) rises as the precharge ends. Conversely, when the word line 11 falls, the dummy memory cells 21 connected to the dummy word line 12 are deselected, the reverse-phase word line 51 is selected accordingly, and the precharge signal is generated. The internal synchronization signal then changes from "H" to "L". In this way, the semiconductor integrated circuit 1 of the fourth embodiment can generate an internal clock in synchronization with the rise and fall of the word line 11.
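The signal relationships described above can be traced with an idealized, zero-delay model. The Python sketch below treats each signal as a 0/1 level per time step and ignores every circuit delay, which is a simplifying assumption; the function names and the dictionary keys are chosen here for illustration.

```python
def embodiment4_signals(word_line):
    """Idealized levels for each time step of the word line 11 (WL).

    WLb is the reverse-phase word line 51, PRC the precharge signal, and
    CKI the internal synchronization signal. All delays are ignored.
    """
    trace = []
    for wl in word_line:
        wlb = 1 - wl  # the reverse-phase word line is selected when WL is not
        trace.append({"WL": wl, "WLb": wlb, "PRC": wlb, "CKI": wl})
    return trace


def edges(levels):
    """List (step, 'rise' or 'fall') for every level change in a waveform."""
    return [(i, "rise" if b > a else "fall")
            for i, (a, b) in enumerate(zip(levels, levels[1:]), start=1)
            if a != b]


trace = embodiment4_signals([0, 0, 1, 1, 0])
cki = [row["CKI"] for row in trace]
print(edges(cki))  # [(2, 'rise'), (4, 'fall')]
```

In this simplified trace the internal clock rises and falls at the same steps as the word line, and the precharge signal is active exactly while the word line is not selected.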

As described above, in the fourth embodiment, the internal synchronization clock is generated in accordance with the rising and falling transitions of the word line 11, so the timing of data exchange between the memory unit 2 and the logic unit 3 is optimized, operating performance and operating speed are improved, and current consumption is reduced.

In each of the above embodiments, the present invention is applied to a single-port SRAM as the semiconductor integrated circuit, but it can of course also be applied to multi-port semiconductor integrated circuits such as a dual-port SRAM. Furthermore, for memory circuits such as DRAM as well, timing adjustment with the logic unit becomes possible by monitoring the delay generated in the data transmission lines and generating an internal synchronization clock matched to it.

It is clear that the present invention is not limited to the above embodiments and that, within the scope of the technical idea of the present invention, the embodiments can be modified as appropriate beyond what is suggested in them. In addition, the number, position, shape, and the like of the constituent members are not limited to those of the above embodiments and can be set to a number, position, shape, and the like suitable for carrying out the present invention.

FIG. 1 is a circuit diagram showing a semiconductor integrated circuit according to the first embodiment of the present invention.
FIG. 2 is a diagram showing operation timing in the semiconductor integrated circuit of FIG. 1.
FIG. 3 is a diagram showing operation timing when only the logic unit operates in the semiconductor integrated circuit of FIG. 1.
FIG. 4 is a circuit diagram showing a semiconductor integrated circuit according to the second embodiment of the present invention.
FIG. 5 is a circuit diagram showing a semiconductor integrated circuit according to the third embodiment of the present invention.
FIG. 6 is a circuit diagram showing a semiconductor integrated circuit according to the fourth embodiment of the present invention.
FIG. 7 is a circuit diagram showing a conventional semiconductor integrated circuit.
FIG. 8 is a diagram showing operation timing in the semiconductor integrated circuit of FIG. 7.

Explanation of symbols

1 semiconductor integrated circuit,
2 memory unit (memory core),
3 logic unit, 6 decoder,
11 word line,
12, 41, 51 dummy word lines,
13 bit line, 14 sense amplifier,
15 write buffer, 18 dummy sense amplifier,
20 memory cell,
21, 28, 44, 54 dummy memory cells,
23 dummy bit line,
22, 45, 52 precharge circuits,
31 register circuit, 32 arithmetic circuit,
35, 36, 42, 43 selection circuits.

Claims

A semiconductor integrated circuit in which a memory unit having a plurality of data transmission lines and a logic unit are formed on a single semiconductor chip and which performs a plurality of data processing operations simultaneously, comprising:
a monitor circuit that monitors information related to a delay generated between the plurality of data transmission lines in the memory unit; and
a generation circuit that generates a plurality of internal synchronization clocks having different phases in accordance with the delay,
wherein the internal synchronization clocks are used as signals for adjusting the timing of the memory unit and the logic unit,
the plurality of data transmission lines are a plurality of word lines and bit lines,
the monitor circuit is a dummy word line formed to simulate the operation of the word line,
the word line is connected to a plurality of memory cells,
the dummy word line is connected to a dummy memory cell formed so as to hold data fixed in advance and to simulate the operation of the memory cell,
and the semiconductor integrated circuit further comprises:
a second dummy word line formed to operate in reverse phase with respect to the dummy word line;
a second dummy memory cell connected to the second dummy word line;
a precharge circuit connected to an output node of the dummy memory cell and to which an output signal from the second dummy memory cell is input; and
a second precharge circuit connected to an output node of the second dummy memory cell and to which an output signal from the dummy memory cell is input.

The semiconductor integrated circuit according to claim 1, wherein the generation circuit generates the internal synchronization clock in accordance with a change in rising or falling of the word line.