JP2003122736A

JP2003122736A - Matrix arithmetic unit

Info

Publication number: JP2003122736A
Application number: JP2001314389A
Authority: JP
Inventors: Katsuyoshi Naka; 勝義中; Keiichi Kitagawa; 恵一北川
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2001-10-11
Filing date: 2001-10-11
Publication date: 2003-04-25

Abstract

PROBLEM TO BE SOLVED: To calculate, at extremely high speed and with a small, low-power unit, the solution of simultaneous linear equations expressed in a form including an upper triangular matrix or a lower triangular matrix, by forward substitution or backward substitution. SOLUTION: This matrix arithmetic unit, implemented in hardware, has a product-sum operation part (101, 102, 103, 105, 106, 107, 108, 109, 110) and a linear operation part that performs a prescribed linear operation on the result of the product-sum operation. While the linear operation part is computing the solution of the n-th linear equation, the product-sum operation part executes in advance the product-sum operation for those terms, among the product-sum terms required for the solution of the next ((n+1)-th) linear equation, that do not include the solution of the n-th linear equation. The multipliers of a multiplication part 107 are thereby used on a time-sharing basis.

Description

Detailed Description of the Invention

BACKGROUND OF THE INVENTION

1. Field of the Invention

[0001] The present invention relates to a matrix arithmetic unit that can be used to solve simultaneous linear equations expressed in a form including an upper triangular matrix or a lower triangular matrix.

2. Description of the Related Art

[0002] When a numerical calculation (matrix operation) is performed on a computer, a symmetric matrix is sometimes Cholesky-decomposed into the product of a lower triangular matrix and an upper triangular matrix, and the solutions of the simultaneous linear equations are then obtained one after another by forward substitution or backward substitution.

[0003] Such a matrix operation is executed, for example, by a multiprocessor system 1000 having a plurality of processors (DSPs 1002 and 1004), as shown in FIG. 12. Matrix operations using a plurality of processors are described in, for example, Japanese Patent Laid-Open No. 2000-339296.

PROBLEMS TO BE SOLVED BY THE INVENTION

[0004] In a method that performs matrix operations using a plurality of processors, information must be transferred between the processors, which delays processing. Moreover, because the operation is basically performed in software, there is a limit to how far it can be sped up.

[0005] In addition, such a system tends to be large, so it is difficult to mount in a mobile device such as a mobile phone, where small size, light weight, and low power consumption are strictly required.

[0006] An object of the present invention is to greatly improve the processing capability of a matrix arithmetic unit while reducing its size and power consumption.

MEANS FOR SOLVING THE PROBLEMS

[0007] The matrix arithmetic unit of the present invention is a cyclic arithmetic processing circuit, implemented in hardware, that obtains the solutions of simultaneous linear equations one after another by forward substitution or backward substitution. The circuit elements that perform the multiplication, division, addition, and subtraction needed to solve each linear equation (that is, the parts that actually carry out those operations) are arranged along the procedure (flow) of the operation, so that processing proceeds in a pipelined manner with a natural data flow. Because the processing is done in hardware, the operation can run at the full processing capability of the hardware implemented as an LSI. As a result, a speedup of, for example, 10 times over the conventional approach is achieved.

[0008] Further, in the matrix arithmetic unit of the present invention, the multipliers are used in a time-division manner. This reduces the number of multipliers effectively and without strain, which helps to make the unit smaller and to lower its power consumption.

[0009] That is, when n simultaneous linear equations are solved in order, for example by forward substitution, solving the n-th linear equation requires an operation (for example, a product-sum operation) that uses the previously obtained solutions of the 1st to (n-1)-th linear equations. Consequently, the solution of the n-th linear equation cannot be obtained until the solution of the immediately preceding (n-1)-th linear equation has been determined.

[0010] Considered strictly, this means that while the solution of the immediately preceding linear equation is still being computed, the terms that include that solution cannot be processed. On the other hand, the terms that include the solutions of the 1st to (n-2)-th linear equations, which have already been obtained, can be processed.

[0011] Focusing on this point, the matrix arithmetic unit of the present invention divides the operation terms into two groups: a group having a term that includes the solution of the immediately preceding linear equation, and a group consisting only of terms that do not include it. While the solution of the immediately preceding linear equation is being computed, the unit executes in advance the operation for the group of terms, among those needed to solve the next linear equation, that do not include the solution of the immediately preceding linear equation, and holds the result temporarily.

[0012] Then, once the solution of the immediately preceding linear equation has been obtained, the unit performs the operation for the terms that include that solution and adds the result to the result of the operation executed in advance. This completes the operation over all the terms, covering the solutions of the 1st to (n-1)-th linear equations, that are needed to solve the n-th linear equation.
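The split described in paragraphs [0011] and [0012] can be sketched in a few lines of Python. This is a minimal illustration, not the patented circuit: the function name `solve_row_split` and its argument names are hypothetical, and the later group is reduced to the single term containing the immediately preceding solution, which is the simplest split the text allows.

```python
def solve_row_split(l_row, z_prev, r_n, l_diag):
    """Solve the n-th equation with the product-sum split into two groups.

    l_row  : general elements l_n1 .. l_n(n-1) of row n (hypothetical name)
    z_prev : previously obtained solutions z_1 .. z_(n-1)
    r_n    : n-th element of the known vector r
    l_diag : diagonal element l_nn
    """
    # Group without the immediately preceding solution z_(n-1).
    # These terms can be accumulated while z_(n-1) is still being solved.
    early = sum(l * z for l, z in zip(l_row[:-1], z_prev[:-1]))
    # Group containing z_(n-1): can only be computed once z_(n-1) is known.
    late = l_row[-1] * z_prev[-1]
    # Combine both partial sums, subtract from r_n, divide by the diagonal.
    return (r_n - (early + late)) / l_diag
```

For example, `solve_row_split([2.0, 3.0], [1.0, 2.0], 20.0, 4.0)` accumulates 2.0 early, adds 6.0 once the last solution is available, and returns 3.0.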

[0013] Even if a large number of multipliers were provided and used without time division, the processing for the next linear equation would still have to wait until the solution of the immediately preceding linear equation is obtained, so the processing time would be almost the same as with time-division use. In other words, the present invention makes it possible to reduce circuit size and power consumption without strain while still performing high-speed processing.

[0014] The matrix arithmetic unit of the present invention can be applied to the forward substitution used to solve simultaneous linear equations that include a lower triangular matrix. It can also be applied to the backward substitution used to solve simultaneous linear equations that include an upper triangular matrix. When solving simultaneous linear equations that include a symmetric matrix, it can be applied to the simultaneous linear equations obtained by transforming the symmetric matrix with Cholesky decomposition or modified Cholesky decomposition. It can also be applied to a matrix calculator capable of computing an inverse matrix by means of Cholesky decomposition.

[0015] The matrix arithmetic unit of the present invention can be installed in a receiving device having a joint detection demodulation function, a receiving device equipped with an AAA (adaptive array antenna) based on the least square error method, or a receiving device equipped with an adaptive equalizer that includes a transversal filter or the like.

[0016] Because the matrix arithmetic unit of the present invention is small, consumes little power, and is capable of very high-speed processing, it can perform well as an arithmetic processing unit in a mobile station device such as a mobile phone or in a wireless base station device.

BEST MODE FOR CARRYING OUT THE INVENTION

[0017] First, a method of solving simultaneous linear equations using Cholesky decomposition (LU decomposition (triangular decomposition)) will be described. The case considered here is that of solving a large-scale system of simultaneous linear equations such as the one given by equation (1).

F d = r    (1)

Here, F is a known matrix of n rows × n columns, r is a known matrix of n rows × 1 column, and d is the matrix to be found (n rows × 1 column).

[0018] According to the LU decomposition method, the known symmetric matrix F can be expressed by a lower triangular matrix L and its transpose L^T as in equation (2). Substituting equation (2) into equation (1) gives equation (3).

F = L L^T    (2)

L L^T d = r    (3)

Equation (3) is solved for d. This process is broadly divided into two stages of calculation. The first stage and the second stage are described below.

[0019] (First stage) Setting L^T d = z, equation (3) is transformed into equation (4).

L z = r    (4)

The matrix z is obtained from equation (4). Since the matrix L is a lower triangular matrix, the formula for obtaining the matrix z can be expressed as equation (5) below.

[0020]

[Equation 1] where i = 2, 3, ..., N.
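Equation (5) itself is not reproduced in this text. A form consistent with the description of process C in paragraph [0031] (sum the products of the earlier solutions and the general elements, subtract the sum from the element of r, divide by the diagonal element) is sketched below. The symbol l_ij for the element of L in row i and column j is an assumed notation, not one taken from the original figure.

```latex
z_1 = \frac{r_1}{l_{11}}, \qquad
z_i = \frac{1}{l_{ii}} \left( r_i - \sum_{j=1}^{i-1} l_{ij}\, z_j \right),
\quad i = 2, 3, \ldots, N
```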

[0021] The matrix z is obtained as follows. First, the element z_1 in the first row is calculated. Next, the element z_2 in the second row is calculated from equation (5) using the calculated z_1. Thereafter, in the same way, z_i is calculated using the results for z_1 to z_{i-1}. An operation that calculates the elements of the matrix z in order from the first element to the N-th element in this way is called forward substitution.
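The forward substitution just described can be sketched in software as follows. This is a plain reference version for illustration only; the function name `forward_substitution` and the list-of-lists representation of L are assumptions, and nothing here models the hardware.

```python
def forward_substitution(L, r):
    """Solve L z = r for z, where L is lower triangular (list of rows)."""
    n = len(r)
    z = [0.0] * n
    for i in range(n):
        # Sum of products of already computed solutions and the
        # general (off-diagonal) elements of row i.
        acc = sum(L[i][j] * z[j] for j in range(i))
        # Subtract from the known element and divide by the diagonal element.
        z[i] = (r[i] - acc) / L[i][i]
    return z
```

With `L = [[2, 0, 0], [1, 3, 0], [4, 5, 6]]` and `r = [2, 7, 32]`, the elements are found in the order z_1 = 1, z_2 = 2, z_3 = 3.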

[0022] (Second stage) Using the matrix z calculated in the first stage, the solution d is obtained from the expression L^T d = z introduced in the first stage. Here, the matrix L^T is an upper triangular matrix because it is the transpose of the lower triangular matrix L.

[0023] Therefore, the formula for obtaining the solution d of the simultaneous linear equations is equation (6) below.

[0024]

[Equation 2] where i = N-1, N-2, ..., 1. Because the matrix d is calculated in reverse order, from the N-th element down to the first element, this operation is called backward substitution.
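Equation (6) is likewise not reproduced in this text. Under the same assumed notation l_ij for the elements of L, and using the fact that the element of L^T in row i and column j is l_ji, a form consistent with the second stage described above would be:

```latex
d_N = \frac{z_N}{l_{NN}}, \qquad
d_i = \frac{1}{l_{ii}} \left( z_i - \sum_{j=i+1}^{N} l_{ji}\, d_j \right),
\quad i = N-1, N-2, \ldots, 1
```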

[0025] The matrix arithmetic unit of the present invention can be widely used for operations that solve simultaneous linear equations by such forward substitution and backward substitution.
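The two-stage procedure (decompose F, forward-substitute for z, backward-substitute for d) can be put together as a small reference sketch. The function names `cholesky` and `solve_cholesky` are hypothetical, and the code assumes F is symmetric and positive definite, a condition the text does not state but which a real-valued Cholesky decomposition requires.

```python
import math

def cholesky(F):
    """Return lower triangular L with F = L L^T (F symmetric positive definite)."""
    n = len(F)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(F[i][i] - s)   # diagonal element
            else:
                L[i][j] = (F[i][j] - s) / L[j][j]  # general element
    return L

def solve_cholesky(F, r):
    """Solve F d = r in two stages: L z = r, then L^T d = z."""
    n = len(r)
    L = cholesky(F)
    # Stage 1: forward substitution, first element to last.
    z = [0.0] * n
    for i in range(n):
        z[i] = (r[i] - sum(L[i][j] * z[j] for j in range(i))) / L[i][i]
    # Stage 2: backward substitution, last element to first.
    # The element of L^T in row i, column j is L[j][i].
    d = [0.0] * n
    for i in range(n - 1, -1, -1):
        d[i] = (z[i] - sum(L[j][i] * d[j] for j in range(i + 1, n))) / L[i][i]
    return d
```

For `F = [[4, 2], [2, 5]]` the decomposition gives `L = [[2, 0], [1, 2]]`, and solving with `r = [8, 12]` yields d close to `[1, 2]`.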

[0026] Embodiments of the present invention will be described below with reference to the drawings.

[0027] (Embodiment 1) A circuit configuration for solving simultaneous linear equations involving a triangular-decomposed matrix of n rows × n columns will be described.

[0028] Let L be a lower triangular matrix of n rows × n columns and r a known matrix of n rows × 1 column. The simultaneous linear equations are then expressed by equation (7).

L z = r    (7)

Here, z is the matrix of n rows × 1 column that is the solution of the simultaneous linear equations. Because the matrix L is a lower triangular matrix, the elements z_1 to z_N can be obtained one after another by the following equation. Part of this calculation is shown in FIG. 1.

[0029] Process A in FIG. 1 shows the matrix operation including the lower triangular matrix, process B shows the first to fourth linear equations, and process C shows the specific calculations for obtaining the solutions of the first to fourth linear equations. The general formula for obtaining the solution of each linear equation is equation (5) above.

[0030] As shown in process A of FIG. 1, a lower triangular matrix is a matrix whose upper-right half is all "0" and whose matrix elements are arranged in the lower-left half. The matrix elements located on the boundary adjoining the "0" region of the upper-right half are called "diagonal elements" (circled in FIG. 1), and the other matrix elements are called "general elements" (enclosed in triangles in FIG. 1). The diagonal elements and general elements are defined in the same way for an upper triangular matrix.

[0031] As is clear from the calculation for the solution z_4 of the fourth linear equation shown in process C of FIG. 1, the operation for obtaining a solution consists of three steps: summing the products of the previously calculated solutions of the linear equations and the corresponding general elements of the lower triangular matrix; subtracting that sum from the corresponding element of the known matrix r; and dividing the result by the corresponding diagonal element of the lower triangular matrix.
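Written out for the fourth equation, the three steps (product-sum, subtraction, division) take the form below. The element names l_41 to l_44 are an assumed notation for row 4 of the lower triangular matrix; FIG. 1 itself is not reproduced here.

```latex
z_4 = \frac{r_4 - \left( l_{41} z_1 + l_{42} z_2 + l_{43} z_3 \right)}{l_{44}}
```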

[0032] FIG. 2 shows a specific configuration of a matrix arithmetic unit that executes such a matrix operation at high speed. FIG. 3 is a timing diagram for explaining the characteristic operation of the matrix arithmetic unit of FIG. 2.

[0033] The matrix arithmetic unit of FIG. 2 is a cyclic arithmetic processing circuit, implemented in hardware, that obtains the solutions of simultaneous linear equations one after another by forward substitution or backward substitution. The circuit elements that perform the multiplication, division, addition, and subtraction needed to solve each linear equation are arranged along the procedure (flow) of the operation, so that processing proceeds in a pipelined manner with a natural data flow.

[0034] The configuration of the matrix arithmetic unit of FIG. 2 can be broadly divided into two parts. The first is a hardware product-sum operation part (101, 102, 103, 105, 106, 107, 108, 109, 110) that performs a predetermined product-sum operation over all previously obtained solutions, including the solution of the immediately preceding linear equation, that are needed to solve the n-th linear equation (n is a natural number of 2 or more). The second is a hardware linear operation part (111, 112, 113, 114) that performs a predetermined linear operation on the value output from the product-sum operation part to obtain the solution of the n-th linear equation.

[0035] In this matrix arithmetic unit, during the period in which the linear operation part is computing the solution of the (n-1)-th linear equation, the product-sum operation part executes in advance the product-sum operation for the partial terms, among the product-sum terms needed for the solution of the n-th linear equation, that do not include the solution of the (n-1)-th linear equation. Once the solution of the (n-1)-th linear equation has been obtained, the product-sum operation is executed for the partial terms that do include it. In this way the multipliers of the multiplication part 107 are used in a time-division manner.

[0036] The specific configuration is described below. In the matrix arithmetic unit of FIG. 2, the register 101 stores the solution of the immediately preceding linear equation. The shift register 102 stores, in sequence, the solutions of the linear equations obtained in the past.

[0037] The register 101 and the shift register 102 are kept separate because the timing at which the solution of the immediately preceding linear equation is latched (set) in the register 101 differs from the timing at which the data in the register 101 and the data at each tap of the shift register are shifted by one stage.

[0038] The timing at which the solution of the immediately preceding linear equation is latched (set) in the register 101 is determined by the register latch clock (RC), as also shown in FIG. 3.

[0039] The timing at which the data in the register 101 and the data at each tap of the shift register are shifted by one stage is controlled by the shift clock (SCL). The register latch clock (RC) and the shift clock (SCL) are supplied to the register 101 through an OR gate (OR).

[0040] What should be noted here is that the shift register 102 is folded back partway along its length. As a result, the outputs of the delay elements (storage elements) at corresponding positions in the first half and the latter half of the shift register 102 form pairs, and the signals of each pair are input to one of the switches SW1 to SW(N/2) provided in the switch unit 105.

[0041] This configuration is adopted so that the multipliers (MUL(1) to MUL(n/2)) in the multiplication part 107 can be used in a time-division manner, thereby reducing their number. That is, during the period in which the solution of the immediately preceding linear equation has not yet been set in the register 101, the data in the register 101 and the data at each tap of the shift register are shifted by one stage, the switches SW1 to SW(n/2) are switched to the b terminal side, only the solutions that have already been obtained are taken from the delay elements (storage elements) in the latter half of the shift register 102, and the operation is advanced ahead of time. This point is explained in detail later.

[0042] The first memory unit 103 (comprising memory (1) to memory (n-1)) stores the values of the general elements of the lower triangular matrix. These values are input, through a plurality of switches (PW1 to PW(n/2)) provided in the switch unit 106, to the multipliers (MUL(1) to MUL(n/2)) provided in the multiplication part 107, where the solutions of the linear equations that have already been obtained are multiplied by the general elements of the lower triangular matrix.

[0043] The adder 108 then adds the output values of the multipliers, and the result is sent through the adder 110 to the register 109, where it is stored temporarily. After the solution of the immediately preceding linear equation has been latched in the register 101, each switch of the switch unit 105 is switched to the a terminal side and the same product-sum operation is executed. In the adder 110, this product-sum result is added to the result of the advance processing held in the register 109.

[0044] The second memory 111 stores the elements of the known matrix r. The subtractor 112 subtracts the product-sum result from the element of the known matrix r. The divider 114 then divides this result by the value of the diagonal element of the lower triangular matrix. The third memory 113 stores the values of the diagonal elements of the lower triangular matrix.

[0045] The solution of the linear equation obtained as the result of the division is stored in the fourth memory (memory for storing solutions) 115 and is also set in the register 101. The same processing is then repeated.

[0046] FIG. 3 is a timing diagram for explaining the operation of the part of the matrix arithmetic unit of FIG. 2 that performs the product-sum operation (the operation that sums the outputs of the multipliers).

[0047] For convenience, it is assumed here that eight simultaneous linear equations are solved. Times t0 to t3 are the period for obtaining the solution z_1 of the first equation. Likewise, times t2 to t5, t4 to t7, t6 to t9, t8 to t11, t10 to t13, t12 to t15, and t14 to t17 are the periods for obtaining the solutions z_2, z_3, z_4, z_5, z_6, z_7, and z_8, respectively.
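The eight periods listed above follow a regular pattern: each lasts three time steps and starts two steps after the previous one, so consecutive periods overlap by one step. A minimal sketch of that pattern follows; the function name `solve_window` is hypothetical, and the closed-form rule is inferred from the listed periods rather than stated in the text.

```python
def solve_window(k):
    """Return (start, end) time indices for the period in which z_k is solved.

    Inferred from the listed periods: z_1 uses t0..t3, z_2 uses t2..t5, and so on.
    """
    start = 2 * (k - 1)
    return start, start + 3

# Periods for the eight equations of the example.
windows = {k: solve_window(k) for k in range(1, 9)}
```

Here `windows[1]` is `(0, 3)` and `windows[8]` is `(14, 17)`. Each period begins one step before the previous one ends, which is where the advance product-sum operation fits.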

[0048] The switches SW1 to SW(n/2) of the switch unit 105 are switched alternately and periodically between the a terminal side and the b terminal side.

[0049] When a given z_i is to be obtained, the operation starts after the switches have been switched to the b terminal side. The switch unit 106 is switched in the same way. When the switches are on the a terminal side, data is taken from the delay elements (storage elements) in the first half of the shift register 102 and from the register 101. When the switches are on the b terminal side, data is taken from the delay elements in the latter half of the shift register 102.

[0050] The shift clock (SCL) and the register latch clock (RC) are out of phase, with the shift clock (SCL) leading. This means that the data is shifted and the state of the shift register is updated before the solution of the immediately preceding linear equation has been obtained.

[0051] FIG. 4 schematically shows the operation for obtaining the solution z_8 of the eighth equation. As illustrated, obtaining z_8 requires adding the terms that include the solutions of the earlier equations, but the operation cannot be completed until z_7, the solution of the immediately preceding equation, has been calculated.

[0052] Therefore, as shown in the upper part of FIG. 4, the terms subject to the product-sum operation are divided into group A, which does not include the solution of the immediately preceding equation, and group B, which does. Then, after the addition by the adder 110 for obtaining the preceding solution z_7 has finished (that is, after the operation in the product-sum operation part has finished, while the subtraction by the subtractor 112 and the division by the divider 114 are in progress), the contents of the shift register are updated, the previously obtained solutions z_1 to z_3 are taken from the latter half of the shift register, and the product-sum operation for group A is carried out in advance.
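The grouping for z_8 can be illustrated with a short sketch. The function name `split_terms` is hypothetical, and the rule that group B holds the N/2 most recent solutions (register 101 plus the first half of the shift register) while group A holds the older ones is an assumption generalized from the N = 8 example.

```python
def split_terms(i, n):
    """Split the solution indices needed for z_i into groups A and B.

    Group A: older solutions, read from the latter half of the shift
             register and processed in advance.
    Group B: the n/2 most recent solutions, including z_(i-1),
             processed after z_(i-1) has been latched.
    """
    half = n // 2
    needed = list(range(1, i))   # indices of z_1 .. z_(i-1)
    group_a = needed[:-half]     # everything older than the last n/2
    group_b = needed[-half:]     # the last n/2 (or fewer) solutions
    return group_a, group_b
```

For the eighth of eight equations, `split_terms(8, 8)` gives `([1, 2, 3], [4, 5, 6, 7])`, matching the z_1 to z_3 read from the latter half of the shift register. For early equations such as `split_terms(5, 8)`, group A is empty, so the advance operation contributes nothing.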

[0053] The lower part of FIG. 3 shows the operation executed in each period. At first the data in the latter half of the shift register is still "0", so even when the switches are switched to the b side and a product-sum operation is performed on data taken from the latter half, the result remains "0".

[0054] As the operation proceeds, however, the solutions of the linear equations obtained earlier are shifted into the latter half of the shift register. Eventually, during the period in which the switches are on the b side (that is, the period in which the outputs from the latter half of the shift register are selected), part of the product-sum operation needed to solve the next linear equation is carried out in advance. The shared multipliers are then used in a time-division manner, which makes the processing very efficient.

[0055] The above is an outline of the characteristic portion of the arithmetic unit of FIG. 2. The matrix arithmetic unit of FIG. 2 is described in more detail below.

[0056] In the following description, the register 101 is called the first register, the register 104 the second register, and the register 109 the third register. The adder 108 is called the first adder and the adder 110 the second adder. The switch unit 105 is called the first switch and the switch unit 106 the second switch.

[0057] The matrix arithmetic unit of FIG. 2 comprises: a first register 101 that stores the most recently obtained result (z_i); an (N-2)-stage shift register 102 that stores the results obtained so far (z₁ to z_{i-1}); a first memory 103 that stores all general elements of the known lower triangular matrix (L), that is, all elements other than the diagonal elements; a second register 104 that always holds 0; a first switch 105 that selects whether the first half or the second half is read, where the first half consists of the first register 101 together with the first (N/2-1) stages of the shift register 102 and the second half consists of the last (N/2-1) stages of the shift register; a second switch 106 that selects whether the first half or the second half is read, where the first half consists of the first (N/2) memories of the first memory 103 and the second half consists of the last (N/2-1) memories of the first memory 103 together with the second register 104; N/2 multipliers 107 that multiply the output values of the first switch 105 by the output values of the second switch 106; a first adder 108 that sums all the results output by the N/2 multipliers 107; a third register 109 that holds the result of the first adder 108 when that result was obtained by reading the second half; a second adder 110 that, when the result of the first adder 108 was obtained by reading the first half, adds that result to the value held in the third register 109; a second memory 111 that stores the elements of the known N-row × 1-column matrix; a subtractor 112 that subtracts the result of the second adder 110 from the value read from the second memory 111; a third memory 113 that stores the diagonal elements of the known lower triangular matrix; a divider 114 that divides the output value of the subtractor 112 by the value read from the third memory 113; and a fourth memory 115 that stores the results output by the divider 114. As a result, z₁ through z_N are stored in the fourth memory in order.

[0058] Next, the circuit operation of FIG. 2 is described.

[0059] At the start of the calculation, the general elements of the known lower triangular matrix L are stored in the first memory 103, all elements of the known N-row × 1-column matrix r are stored in the second memory 111, and the diagonal elements (L₁₁, L₂₂, ..., L_NN) of the lower triangular matrix L are stored in the third memory 113. The second register 104 and the third register 109 are set to 0.

[0060] The first memory 103 consists of N-1 memories, in each of which general elements of the lower triangular matrix L are stored in a regular pattern.

[0061] Memory (1) stores (0, L₂₁, L₃₂, L₄₃, ..., L_{N,N-1}); memory (2) stores (0, 0, L₃₁, L₄₂, L₅₃, ..., L_{N,N-2}); and so on, up to memory (N-2), which stores (0, 0, ..., 0, L_{N-1,1}, L_{N,2}), and memory (N-1), which stores (0, 0, ..., 0, L_{N,1}). Every memory has N addresses. The memory addresses are incremented in step with the shifts of the shift register 102, so that the contents are read out sequentially.
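The storage pattern described above can be sketched in Python as follows. This is an illustrative sketch only: the function name `build_lower_banks` and the 0-based list indexing are assumptions, not part of the patent. Bank k holds, at the address for row i, the element k columns to the left of the diagonal, or 0 where no such element exists.

```python
def build_lower_banks(L):
    """Return N-1 banks of N entries each, mirroring memories (1)..(N-1).

    Hypothetical helper: banks[k-1][i] is L[i][i-k] when that element
    lies below the diagonal, and 0 otherwise (i is a 0-based row index).
    """
    n = len(L)
    banks = []
    for k in range(1, n):
        banks.append([L[i][i - k] if i - k >= 0 else 0 for i in range(n)])
    return banks


# 3x3 example whose entries are named after their 1-based indices.
L_example = [[11, 0, 0],
             [21, 22, 0],
             [31, 32, 33]]
print(build_lower_banks(L_example))  # [[0, 21, 32], [0, 0, 31]]
```

Reading every bank at the same address therefore yields all off-diagonal coefficients of one row at once, which is what allows the multipliers to work in parallel.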

[0062] At the start of the calculation (when z₁ is computed), the first register 101 and the shift register 102 hold their initial value, 0.

[0063] First, the first switch 105 is set so that values are read from the second half of the shift register 102, and the second switch 106 is set so that values are read from the second half of the first memory 103.

[0064] At this point the shift register 102 holds only 0s, so the results of the N/2 multipliers 107 are all 0. The result of the first adder 108 is therefore also 0, and 0 is stored in the third register 109.

[0065] Next, the first switch 105 is switched so that values are read from the first register 101 and the first half of the shift register 102 and input to the multipliers 107. At the same time, the first half of the first memory 103 and the second register 104 are read and input to the multipliers 107. Because the products for the first half are also all 0, the result of the first adder 108 is 0 as well. The result of the first adder 108 (the sum of the first-half products) and the value held in the third register 109 (the sum of the second-half products) are input to the second adder 110.

[0066] The output of the second adder 110 is also 0. When the second adder has finished, the first element r₁ of the known matrix r is read from the second memory 111, and the subtractor 112 subtracts the result of the second adder 110 from r₁.

[0067] The third register 109 is also initialized. Because the result of the second adder 110 is 0, the output of the subtractor 112 is r₁ - 0 = r₁. When the subtractor 112 has finished, the diagonal element L₁₁ of the lower triangular matrix is read from the third memory 113, and the divider 114 divides the output value of the subtractor 112 by L₁₁.

[0068] The output of the divider 114 is r₁/L₁₁, so the solution z₁ is obtained at this point. The solution z₁ is stored in the fourth memory 115 and, at the same time, in the first register 101.

[0069] Next, the operation for obtaining the solution z₂ is described. During the computation of z₁, as soon as the second adder 110 has finished, the value of the first register 101 is input to the shift register 102 and the shift register is shifted by one stage.

[0070] The read addresses of the (N-1) memories of the first memory 103 are also incremented. The first switch 105 is then switched so that values are read from the second half of the shift register 102. At the same time, the second switch 106 is switched so that values are read from the second half of the first memory 103.

[0071] The second half of the shift register 102 and the second half of the first memory are input to the multipliers 107 and multiplied. The values in the shift register 102 are still all 0 at this point, so the products are all 0.

[0072] The products are input to the first adder 108 and summed. The result of the adder is also 0, and it is stored in the third register 109. If the solution z₁ has not yet been obtained at this point, the first switch 105 and the second switch 106 are not switched, and the computation for z₂ waits.

[0073] Once z₁ has been obtained and stored in the first register 101, the first switch 105 is switched so that values are read from the first register 101 and the first half of the shift register 102 and input to the multipliers 107.

[0074] At the same time, the first half of the first memory 103 and the second register 104 are read and input to the multipliers 107. The multipliers 107 multiply the outputs of the first switch 105 by the outputs of the second switch 106.

[0075] The first register 101 holds the solution z₁, and because the read address of the first memory 103 has been incremented, L₂₁ is read from memory (1). The result of multiplier MUL(1) is therefore z₁·L₂₁. Because the shift register 102 holds only 0s, the results of all multipliers other than MUL(1) are 0.

[0076] The result of the first adder 108 is therefore z₁·L₂₁. The second adder 110 adds the value held in the third register 109 to the result of the first adder 108. Because the third register 109 held 0, the output of the second adder 110 is z₁·L₂₁.

[0077] When the second adder 110 has finished, the read address of the second memory 111 is incremented, and the value r₂ is read and input to the subtractor 112. The subtractor 112 subtracts the output value z₁·L₂₁ of the second adder 110 from r₂.

[0078] The result of the subtractor 112 is therefore (r₂ - z₁·L₂₁). When the subtractor 112 has finished, the read address of the third memory 113 is incremented, and the diagonal element L₂₂ of the lower triangular matrix L is read and input to the divider 114. The divider 114 divides the result (r₂ - z₁·L₂₁) of the subtractor 112 by L₂₂ read from the third memory 113.

[0079] The result of the divider 114 is (r₂ - z₁·L₂₁)/L₂₂. From equation (5) above, z₂ = (r₂ - z₁·L₂₁)/L₂₂. In other words, the output of the divider 114 is z₂, which is input to the fourth memory 115 and the first register 101. The remaining solutions z₃, ..., z_N are obtained in the same way.
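The two-phase sequence walked through above can be modelled in software. The following Python sketch is an assumption-laden behavioural model, not the patented circuit: the function name, the even-N restriction, and the variable `reg109` (standing in for the third register 109) are illustrative choices. Phase 1 multiplies only older solutions, which are already available; phase 2 uses the newest solutions, including the one just produced.

```python
def forward_substitution_two_phase(L, r):
    """Solve L z = r for lower triangular L using two half-width passes per row."""
    n = len(r)
    assert n % 2 == 0, "sketch assumes an even N so that N/2 multipliers suffice"
    half = n // 2
    z = [0.0] * n
    for i in range(n):
        def tap(k):    # solution found k steps ago (0 if none yet)
            return z[i - k] if i - k >= 0 else 0.0

        def coeff(k):  # coefficient read from bank k for row i
            return L[i][i - k] if i - k >= 0 else 0.0

        # Phase 1: second half (older solutions), padded with the zero register.
        pairs = [(tap(k), coeff(k)) for k in range(half + 1, n)]
        pairs.append((0.0, 0.0))
        assert len(pairs) == half          # fits the N/2 multipliers
        reg109 = sum(a * b for a, b in pairs)

        # Phase 2: first half (newest solutions, including the previous one).
        pairs = [(tap(k), coeff(k)) for k in range(1, half + 1)]
        acc = reg109 + sum(a * b for a, b in pairs)

        z[i] = (r[i] - acc) / L[i][i]      # subtractor, then divider
    return z


L_example = [[2, 0, 0, 0],
             [1, 1, 0, 0],
             [3, 2, 4, 0],
             [1, 1, 1, 2]]
r_example = [2, 3, 19, 14]
print(forward_substitution_two_phase(L_example, r_example))  # [1.0, 2.0, 3.0, 4.0]
```

In hardware, phase 1 for row i+1 can overlap with the subtraction and division for row i, because it never needs z_i; the sequential model above only shows that splitting the sum this way gives the same result.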

[0080] The effects of the arithmetic unit of this embodiment, described above, are as follows.

[0081] Values are read from the first register 101 and the (N-2)-stage shift register 102 and multiplied, so providing one multiplier per stage would allow parallel processing and shorten the processing time. When all the multiplications are performed in parallel at once, however, the unit has to wait until the immediately preceding solution has been stored in the first register 101.

[0082] In the matrix arithmetic unit of the present invention, therefore, the second half of the shift register 102, which holds solutions that have already been obtained, is processed first, and the first half is processed once the immediately preceding solution has been obtained. In this way the number of multipliers is halved and the multipliers are used in a time-division manner.

[0083] Consequently, the matrix arithmetic unit of the present invention can perform forward substitution at high speed with a small circuit.

[0084] (Embodiment 2) The embodiment above described computation with a lower triangular matrix. This embodiment describes computation with an upper triangular matrix.

[0085] Let L be an N-row × N-column upper triangular matrix and z a known N-row × 1-column matrix. The simultaneous linear equations are then expressed by equation (8): Ld = z ... (8). The solution d is an N-row × 1-column matrix. An upper triangular matrix is, as shown in FIG. 5 for example, a matrix whose lower-left half is all 0 and whose matrix elements are located in the upper-right half.

[0086] Because the matrix L is upper triangular, the simultaneous linear equations can be solved in reverse order, from d_N to d₁. The formula for this case is given by equation (6) above.

[0087] This computation is executed by the matrix arithmetic unit of FIG. 2.

[0088] As described above, the matrix arithmetic unit of FIG. 2 comprises: a first register 101 that stores the most recently obtained result (d_i); an (N-2)-stage shift register 102 that stores the results obtained so far (d_N to d_{i+1}); a first memory 103 that stores all general elements of the known upper triangular matrix (L), that is, all elements other than the diagonal elements; a second register 104 that always holds 0; a first switch 105 that selects whether the first half or the second half is read, where the first half consists of the first register 101 together with the first (N/2-1) stages of the shift register 102 and the second half consists of the last (N/2-1) stages of the shift register; a second switch 106 that selects whether the first half or the second half is read, where the first half consists of the first (N/2) memories of the first memory 103 and the second half consists of the last (N/2-1) memories of the first memory 103 together with the second register 104; N/2 multipliers 107 that multiply the output values of the first switch 105 by the output values of the second switch 106; a first adder 108 that sums all the results output by the N/2 multipliers 107; a third register 109 that holds the result of the first adder 108 when that result was obtained by reading the second half; a second adder 110 that, when the result of the first adder 108 was obtained by reading the first half, adds that result to the value held in the third register 109; a second memory 111 that stores the elements of the known N-row × 1-column matrix; a subtractor 112 that subtracts the result of the second adder 110 from the value read from the second memory 111; a third memory 113 that stores the diagonal elements of the known upper triangular matrix; a divider 114 that divides the output value of the subtractor 112 by the value read from the third memory 113; and a fourth memory 115 that stores the results output by the divider 114. As a result, d_N through d₁ are stored in the fourth memory in order.

[0089] Next, the circuit operation is described.

[0090] At the start of the calculation, the general elements of the known upper triangular matrix L are stored in the first memory 103, all elements of the known N-row × 1-column matrix z (z₁, z₂, ..., z_N) are stored in the second memory 111, and the diagonal elements (L₁₁, L₂₂, ..., L_NN) of the upper triangular matrix L are stored in the third memory 113. The second register 104 and the third register 109 are set to 0. The first memory 103 consists of N-1 memories, in each of which general elements of the upper triangular matrix L are stored in a regular pattern. In the first memory 103, memory (1) stores (L₁₂, L₂₃, L₃₄, ..., L_{N-1,N}, 0); memory (2) stores (L₁₃, L₂₄, L₃₅, ..., L_{N-2,N}, 0, 0); and so on, up to memory (N-2), which stores (L_{1,N-1}, L_{2,N}, 0, ..., 0), and memory (N-1), which stores (L_{1,N}, 0, ..., 0). Every memory has N addresses. The memory addresses are decremented in step with the shifts of the shift register 102, so that the contents are read out sequentially.
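The upper triangular storage pattern mirrors the lower triangular one. A minimal Python sketch follows; the function name `build_upper_banks` and the 0-based indexing are illustrative assumptions. Bank k holds, at the address for row i, the element k columns to the right of the diagonal, and the trailing entries are 0.

```python
def build_upper_banks(U):
    """Return N-1 banks of N entries each, mirroring memories (1)..(N-1).

    Hypothetical helper: banks[k-1][i] is U[i][i+k] when that element
    lies above the diagonal, and 0 otherwise (i is a 0-based row index).
    """
    n = len(U)
    banks = []
    for k in range(1, n):
        banks.append([U[i][i + k] if i + k < n else 0 for i in range(n)])
    return banks


# 3x3 example whose entries are named after their 1-based indices.
U_example = [[11, 12, 13],
             [0, 22, 23],
             [0, 0, 33]]
print(build_upper_banks(U_example))  # [[12, 23, 0], [13, 0, 0]]
```

Because backward substitution starts from the last row, the banks are read from the highest address downward, which matches the decrementing addresses described in the text.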

[0091] At the start of the calculation (when d_N is computed), the first register 101 and the shift register 102 hold their initial value, 0.

[0092] First, the first switch 105 is set so that values are read from the second half of the shift register 102, and the second switch 106 is set so that values are read from the second half of the first memory 103. At this point the shift register 102 holds only 0s, so the results of the N/2 multipliers 107 are all 0.

[0093] The result of the first adder 108 is therefore also 0, and 0 is stored in the third register 109.

[0094] Next, the first switch 105 is switched so that values are read from the first register 101 and the first half of the shift register 102 and input to the multipliers 107. At the same time, the first half of the first memory 103 and the second register 104 are read and input to the multipliers 107.

[0095] Because the products for the first half are also all 0, the result of the first adder 108 is 0 as well. The result of the first adder 108 (the sum of the first-half products) and the value held in the third register 109 (the sum of the second-half products) are input to the second adder 110. The output of the second adder 110 is also 0. When the second adder has finished, the N-th element z_N of the known matrix z is read from the second memory 111, and the subtractor 112 subtracts the result of the second adder 110 from z_N. The third register 109 is also initialized.

[0096] Because the result of the second adder 110 is 0, the output of the subtractor 112 is z_N - 0 = z_N. When the subtractor 112 has finished, the diagonal element L_NN of the upper triangular matrix is read from the third memory 113, and the divider 114 divides the output value of the subtractor 112 by L_NN. The output of the divider 114 is z_N/L_NN, so the solution d_N is obtained at this point. The solution d_N is stored in the fourth memory 115 and, at the same time, in the first register 101.

[0097] Next, the operation for obtaining the solution d_{N-1} is described. During the computation of d_N, as soon as the second adder 110 has finished, the value of the first register 101 is input to the shift register 102 and the shift register is shifted by one stage.

[0098] The read addresses of the (N-1) memories of the first memory 103 are also decremented. The first switch 105 is then switched so that values are read from the second half of the shift register 102.

[0099] At the same time, the second switch 106 is switched so that values are read from the second half of the first memory 103. The second half of the shift register 102 and the second half of the first memory are input to the multipliers 107 and multiplied. The values in the shift register 102 are still all 0 at this point, so the products are all 0. The products are input to the first adder 108 and summed. The result of the adder is also 0, and it is stored in the third register 109. If the solution d_N has not yet been obtained at this point, the first switch 105 and the second switch 106 are not switched, and the computation for d_{N-1} waits. Once d_N has been obtained and stored in the first register 101, the first switch 105 is switched so that values are read from the first register 101 and the first half of the shift register 102 and input to the multipliers 107. At the same time, the first half of the first memory 103 and the second register 104 are read and input to the multipliers 107. The multipliers 107 multiply the outputs of the first switch 105 by the outputs of the second switch 106. The first register 101 holds the solution d_N, and because the read address of the first memory 103 has been decremented, L_{N-1,N} is read from memory (1). The result of multiplier MUL(1) of the multipliers 107 is therefore d_N·L_{N-1,N}. Because the shift register 102 holds only 0s, the results of all multipliers other than MUL(1) are 0. The result of the first adder 108 is therefore d_N·L_{N-1,N}. The second adder 110 adds the value held in the third register 109 to the result of the first adder 108. Because the third register 109 held 0, the output of the second adder 110 is d_N·L_{N-1,N}.

[0100] When the second adder 110 has finished, the read address of the second memory 111 is decremented, and the value z_{N-1} is read and input to the subtractor 112. The subtractor 112 subtracts the output value d_N·L_{N-1,N} of the second adder 110 from z_{N-1} read from the second memory 111. The result of the subtractor 112 is therefore (z_{N-1} - d_N·L_{N-1,N}). When the subtractor 112 has finished, the read address of the third memory 113 is decremented, and the diagonal element L_{N-1,N-1} of the upper triangular matrix L is read and input to the divider 114.

[0101] The divider 114 divides the result (z_{N-1} - d_N·L_{N-1,N}) of the subtractor 112 by L_{N-1,N-1} read from the third memory 113. The result of the divider 114 is (z_{N-1} - d_N·L_{N-1,N})/L_{N-1,N-1}. From equation (6), d_{N-1} = (z_{N-1} - d_N·L_{N-1,N})/L_{N-1,N-1}. In other words, the output of the divider 114 is d_{N-1}, which is input to the fourth memory 115 and the first register 101. The remaining solutions d_{N-2}, ..., d₁ are obtained in the same way.
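The backward pass can be modelled in the same way as the forward pass. The Python sketch below is a behavioural model under stated assumptions (even N, an illustrative function name, and `reg109` standing in for the third register 109); it is not the circuit itself. Rows are processed from the last to the first, and each row's sum is split into an early pass over older solutions and a late pass over the newest ones.

```python
def back_substitution_two_phase(U, z):
    """Solve U d = z for upper triangular U using two half-width passes per row."""
    n = len(z)
    assert n % 2 == 0, "sketch assumes an even N so that N/2 multipliers suffice"
    half = n // 2
    d = [0.0] * n
    for i in reversed(range(n)):
        def tap(k):    # solution found k steps ago (0 if none yet)
            return d[i + k] if i + k < n else 0.0

        def coeff(k):  # coefficient read from bank k for row i
            return U[i][i + k] if i + k < n else 0.0

        # Phase 1: second half (older solutions), padded with the zero register.
        pairs = [(tap(k), coeff(k)) for k in range(half + 1, n)]
        pairs.append((0.0, 0.0))
        reg109 = sum(a * b for a, b in pairs)

        # Phase 2: first half (newest solutions, including the previous one).
        pairs = [(tap(k), coeff(k)) for k in range(1, half + 1)]
        acc = reg109 + sum(a * b for a, b in pairs)

        d[i] = (z[i] - acc) / U[i][i]      # subtractor, then divider
    return d


U_example = [[2, 1, 1, 1],
             [0, 4, 2, 3],
             [0, 0, 1, 1],
             [0, 0, 0, 2]]
z_example = [11, 26, 7, 8]
print(back_substitution_two_phase(U_example, z_example))  # [1.0, 2.0, 3.0, 4.0]
```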

[0102] The effects of this embodiment are as follows.

[0103] Values are read from the first register 101 and the (N-2)-stage shift register 102 and multiplied, so providing one multiplier per stage would allow parallel processing and shorten the processing time. When all the multiplications are performed in parallel at once, however, the unit has to wait until the immediately preceding solution has been stored in the first register 101.

[0104] In the matrix arithmetic unit of the present invention, therefore, the second half of the shift register 102, which holds solutions that have already been obtained, is processed first, and the first half is processed once the immediately preceding solution has been obtained. In this way the number of multipliers is halved and the multipliers are used in a time-division manner.

[0105] Consequently, the matrix arithmetic unit of the present invention can perform backward substitution at high speed with a small circuit.

[0106] (Embodiment 3) The matrix arithmetic unit of the present invention can also be applied to joint detection demodulation, a method for demodulating signals received by wireless communication.

[0107] Joint detection demodulation (hereinafter, JD demodulation) is a demodulation method suited to W-CDMA TDD-mode communication. Unlike methods that demodulate by multiplying by a spreading code and detecting the autocorrelation, it detects the cross-correlation for each of the multiple user signals superimposed on the received signal and subtracts the components other than the user's own signal, thereby extracting only each user's signal. It is thus a demodulation method that actively exploits the principle of interference cancellation. Because it can accurately remove interference components that cannot be removed by autocorrelation alone, it achieves more accurate demodulation. JD demodulation can also cancel interference arising from cross-correlation caused by delayed waves.

[0108] FIG. 7 shows a propagation model of the signals subject to JD demodulation.

[0109] d(1) to d(k) denote the signals transmitted by the k users; these are what is to be demodulated. c(1) to c(k) are the spreading codes, and h(1) to h(k) are the estimated propagation characteristics (delay profiles, that is, the estimated channel impulse responses).

[0110] b(1) to b(k) are the vectors obtained by convolving the spreading codes with the propagation characteristics. The received signal e is the result with noise n added. The JD demodulation section 203 demodulates it, separating and demodulating the transmitted signals d(1) to d(k) of the individual users.
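The convolution that yields each vector b can be illustrated with a short Python sketch. The function name `convolve` and the example code and channel values are made up for illustration; real spreading codes and impulse responses are generally complex-valued and much longer.

```python
def convolve(c, h):
    """Full linear convolution of a spreading code c with an impulse response h."""
    out = [0] * (len(c) + len(h) - 1)
    for i, ci in enumerate(c):
        for j, hj in enumerate(h):
            out[i + j] += ci * hj
    return out


c_example = [1, -1, 1]   # hypothetical 3-chip spreading code
h_example = [1, 0.5]     # hypothetical 2-tap channel impulse response
print(convolve(c_example, h_example))  # [1, -0.5, 0.5, 0.5]
```

The resulting vector is longer than the spreading code by the channel length minus one, which is why consecutive symbols overlap in the received signal.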

[0111] FIG. 8 shows the propagation model of FIG. 7 in matrix form.

[0112] As FIG. 8 shows, JD demodulation computes the matrix d from the equation Ad + n = e. Multiplying both sides from the left by the conjugate transpose A^H of the matrix A gives A^H A d + A^H n = A^H e. If the noise n is small enough to be ignored, or if A^H n can be expressed as σ²Id (where I is the N-row × N-column identity matrix), the equation takes the form Fd = r.

[0113] That is, if A^H n = σ²Id (where I is the N-row × N-column identity matrix), the equation can be rewritten as (A^H A + σ²I)d = A^H e. Here A^H e is the symbol data after RAKE combining and is denoted r. Setting A^H A + σ²I = F, the equation can be written as Fd = r.
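The construction of F and r can be sketched in plain Python with complex numbers. The helper names `conj_transpose`, `matmul`, and `normal_equations`, and the parameter `sigma2` (standing for σ²), are illustrative assumptions; the example matrix is made up.

```python
def conj_transpose(A):
    """Conjugate transpose A^H of a matrix given as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def normal_equations(A, e, sigma2=0.0):
    """Return F = A^H A + sigma2*I and r = A^H e, so that F d = r."""
    AH = conj_transpose(A)
    F = matmul(AH, A)
    for i in range(len(F)):
        F[i][i] += sigma2
    r = [sum(AH[i][k] * e[k] for k in range(len(e))) for i in range(len(AH))]
    return F, r


A_example = [[1, 0], [1j, 1], [0, 2]]
e_example = [1, 1 + 1j, 2]          # equals A_example applied to d = [1, 1]
F, r = normal_equations(A_example, e_example)
print(F[0][1] == F[1][0].conjugate())  # True: F is Hermitian
```

For this noise-free example, F multiplied by d = [1, 1] reproduces r, which is the relation Fd = r stated in the text.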

[0114] The matrix F is a cross-correlation matrix. When the noise n is small enough to be ignored, F is the product of a matrix A, generated by regularly arranging the vectors B1 (= B2 = Bn) obtained by convolving the spreading codes with the estimated channel impulse responses, and its conjugate transpose A^H. F is a symmetric matrix.

[0115] To obtain the matrix d from such an equation, one could generate the inverse of the matrix F and multiply by it, but in practice computing the inverse matrix is not easy. The JD demodulation unit of the present embodiment therefore applies Cholesky decomposition to the matrix F and obtains the elements of the matrix d by solving simultaneous linear equations.

[0116] FIG. 6(a) is a block diagram showing the configuration of a CDMA receiver including a JD demodulation unit, and FIG. 6(b) shows the format of the transmission data.

[0117] The signal received by antenna 2 is amplified by radio receiving unit 10 and input to channel estimation unit 201 and despreading unit 207.

[0118] Channel estimation unit 201 estimates the channel of each user's signal from the impulse response for the known signal (midamble code) contained in the received signal.

[0119] The midamble code is a known code for channel estimation, inserted at the center of each slot as shown in FIG. 6(b).

[0120] Channel estimation unit 201 has a midamble correlation processing unit 20, a midamble code generation unit 24, and a path selection unit 22.

[0121] The signal that has passed through radio receiving unit 10 is despread by despreading unit 207. The channel estimates obtained by channel estimation unit 201 are input to RAKE combining unit 202 and JD demodulation unit 203. The despread symbol data are phase-compensated on the basis of the channel estimates and RAKE-combined by RAKE combining unit 202, and the RAKE combining result r is input to JD demodulation unit 203.

[0122] JD demodulation unit 203 has: a cross-correlation matrix (F) generation unit 204 that obtains the cross-correlation matrix (F) from the spreading code supplied by spreading code generation unit 30 and the estimated channel impulse response; a Cholesky decomposition unit 205 that applies Cholesky decomposition or modified Cholesky decomposition to the cross-correlation matrix to put it in the form of a product of a lower triangular matrix and an upper triangular matrix; and a simultaneous equation calculation unit 206 that calculates, by forward substitution or backward substitution, the solution of simultaneous linear equations expressed in a form including the lower triangular matrix or the upper triangular matrix. The simultaneous equation calculation unit 206 includes the matrix operation device of FIG. 2.

[0123] FIG. 9 is a diagram for explaining the function of the cross-correlation matrix (F) generation unit 204. In the convolution processing X1 shown in the upper part of FIG. 9, the spreading code (C_1 to C_Q) output from portion 900, which stores the spreading code, and the channel estimate parameters (h_1 to h_W) output from portion 902, which stores the channel estimates, are convolved using adders 903 (903a to 903c, etc.) and adder 904 to obtain the vector b_1 to b_{Q+W-1}.

[0124] Then, in processing Y1 shown in the lower part of FIG. 9, the vector b is regularly arranged to generate the matrix A. Further, in processing Y2, the matrix A is multiplied by the matrix A^H, the conjugate transpose of A, to generate the matrix F (= A^H A).

[0125] Here, letting d be the estimate of the transmitted signal obtained as the result of JD demodulation, the following equation (9) holds.

Fd = r ... (9)

The simultaneous equation calculation unit 206 solves this equation for d.

[0126] Since the matrix F is a symmetric matrix, Cholesky decomposition (including modified Cholesky decomposition) can be applied, and F can be decomposed as F = LL^H using a lower triangular matrix L, where L^H is the conjugate transpose of L. Cholesky decomposition unit 205 performs this Cholesky decomposition (or modified Cholesky decomposition).

[0127] From the above, LL^H d = r. Substituting L^H d = z gives Lz = r, and this system of equations for the lower triangular matrix is first solved for z. Since the matrix z is then known, the system of equations L^H d = z for the upper triangular matrix can be solved for d.
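The two-stage solution can be sketched in software as follows. This is a minimal sequential illustration in Python, assuming L is already available from the Cholesky decomposition unit; the function names and the example values are hypothetical, and the sketch does not model the hardware pipeline of FIG. 2.

```python
def forward_substitution(L, r):
    """Solve L z = r for z, where L is lower triangular."""
    n = len(r)
    z = [0j] * n
    for i in range(n):
        acc = sum(L[i][k] * z[k] for k in range(i))   # product-sum over known z
        z[i] = (r[i] - acc) / L[i][i]                 # subtract, then divide by the diagonal
    return z


def backward_substitution_conj(L, z):
    """Solve L^H d = z for d, where L^H is the conjugate transpose of L."""
    n = len(z)
    d = [0j] * n
    for i in reversed(range(n)):
        # Element (i, k) of L^H is the conjugate of element (k, i) of L.
        acc = sum(L[k][i].conjugate() * d[k] for k in range(i + 1, n))
        d[i] = (z[i] - acc) / L[i][i].conjugate()
    return d


def solve_llh(L, r):
    """Solve L L^H d = r: first L z = r, then L^H d = z."""
    return backward_substitution_conj(L, forward_substitution(L, r))


L = [[2, 0], [1 - 1j, 2]]      # hypothetical Cholesky factor
r = [4 + 0j, 10 - 2j]          # hypothetical RAKE-combined data
d = solve_llh(L, r)
```

Each row needs one product-sum over previously found elements followed by a subtraction and a division, which is the pattern the matrix operation device implements in hardware.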

[0128] By solving these simultaneous linear equations for the lower triangular matrix and for the upper triangular matrix with the matrix operation device of the present invention shown in FIG. 2, the solution can be obtained at very high speed.

[0129] (Embodiment 4) The matrix operation device of the present invention can also be used in a communication device equipped with an adaptive array based on the minimum mean square error (MMSE) method.

[0130] FIG. 10 shows a communication device equipped with an adaptive array based on the minimum mean square error (MMSE) method.

[0131] As an example, the case of three antennas is described.

[0132] Received signals are input from antenna 301, antenna 302 and antenna 303. So that demodulator 304 can demodulate the received signal optimally, the signals received from the antennas are multiplied by the weights generated by weight generation unit 305, applying the optimal weighting.

[0133] Letting R be the autocorrelation matrix of the received signals input from antenna 301, antenna 302 and antenna 303, and P the cross-correlation matrix between the received signals and the known signal, the weight w can be obtained from equation (10).

Rw = P ... (10)

Since R is a symmetric matrix, it can be Cholesky-decomposed. The problem therefore reduces to solving simultaneous linear equations for an upper triangular matrix and for a lower triangular matrix.

[0134] Using the matrix operation device of the present invention therefore allows the optimal weights to be obtained at high speed.

[0135] (Embodiment 5) The matrix operation device of the present invention can also be used in a communication device equipped with an adaptive equalizer. An adaptive equalizer is a filter that precisely controls the time response of the transmission path so as to flatten its amplitude and delay characteristics. FIG. 11 shows a communication device equipped with an adaptive equalizer.

[0136] The received signal is input to transversal filter (FIR) 401 and weight calculation unit 402.

[0137] Weight calculation unit 402 calculates the optimal tap coefficients of transversal filter 401. Let M be the number of taps of transversal filter 401. The optimal tap coefficients are calculated as follows. Letting the optimal tap coefficients be the M-row, 1-column matrix w, the autocorrelation matrix of the received signal be R (an M-row, M-column matrix), and the cross-correlation matrix between the received signal and the desired response corresponding to the known signal be P (an M-row, 1-column matrix), the following equation (11) holds.

Rw = P ... (11)

Here, letting r(i) be the received signal at time i, the autocorrelation matrix of the received signal is given as follows.

【０１３８】[0138]

[Equation 3]

Further, the cross-correlation matrix P between the desired response d(n) and the received signal is given as follows.

【０１３９】[0139]

【数４】 [Equation 4]

[0140] Equation (11) is solved for w to find the optimal tap coefficients and generate the optimal filter. Since R is a symmetric matrix, Cholesky decomposition or modified Cholesky decomposition can be applied. By using the matrix operation device of the present invention for the subsequent forward and backward substitution, the computation can be performed at high speed with a small circuit.

[0141] (Embodiment 6) In a communication device equipped with an adaptive array, the matrix operation device of the present invention is also effective for computing the Capon method, which is one type of direction-of-arrival estimation algorithm.

[0142] Letting R_xx be the correlation matrix of the received signal and a(θ) the response vector of the array, the Capon angle spectrum P_cp(θ) can be obtained from equation (12).

【０１４３】[0143]

[Equation 5]

Here, since the correlation matrix R_xx of the received signal is a symmetric matrix, it can be Cholesky-decomposed. Letting the lower triangular matrix be L, R_xx = LL^H. Therefore R_xx^-1 = (LL^H)^-1. The inverse of LL^H can be found easily using equation (13). Letting x be the inverse of LL^H and E the identity matrix, equation (13) holds.

LL^H x = E ... (13)

By using the matrix operation device of the present invention to solve equation (13) for x, the computation can be performed at high speed with a small circuit. As shown above, the matrix operation device of the present invention can obtain an inverse matrix at high speed.
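Solving LL^H x = E amounts to one forward and one backward substitution per column of the identity matrix. The Python sketch below illustrates this under the assumption that the input matrix is Hermitian and positive definite, so that the square roots in the decomposition are real; the function names and the example matrix are hypothetical.

```python
import math


def cholesky(R):
    """Return lower-triangular L with R = L L^H (R assumed Hermitian positive definite)."""
    n = len(R)
    L = [[0j] * n for _ in range(n)]
    for j in range(n):
        diag = R[j][j].real - sum(abs(L[j][k]) ** 2 for k in range(j))
        L[j][j] = complex(math.sqrt(diag))
        for i in range(j + 1, n):
            acc = sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            L[i][j] = (R[i][j] - acc) / L[j][j]
    return L


def inverse_via_cholesky(R):
    """Solve L L^H x = E column by column and return x, the inverse of R."""
    n = len(R)
    L = cholesky(R)
    X = [[0j] * n for _ in range(n)]
    for col in range(n):
        # Forward substitution: L z = (column `col` of the identity matrix E).
        z = [0j] * n
        for i in range(n):
            rhs = 1.0 if i == col else 0.0
            z[i] = (rhs - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
        # Backward substitution: L^H x = z, filling column `col` of X.
        for i in reversed(range(n)):
            acc = sum(L[k][i].conjugate() * X[k][col] for k in range(i + 1, n))
            X[i][col] = (z[i] - acc) / L[i][i].conjugate()
    return X


R_xx = [[4, 2 + 2j], [2 - 2j, 6]]    # hypothetical correlation matrix
R_inv = inverse_via_cholesky(R_xx)
```

The columns are independent of one another, so the same substitution hardware can process them one after another without any change to its structure.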

【０１４４】[0144]

[Effects of the Invention] As described above, according to the present invention, very high-speed matrix operations can be executed by a cyclic arithmetic processing circuit implemented in hardware. As a result, a speedup of, for example, ten times over the conventional approach is achieved.

[0145] Further, in the present invention, using the multipliers in a time-division manner makes it possible to reduce the number of multipliers effectively and without strain, promoting a smaller device and lower power consumption. That is, while the solution of the immediately preceding linear equation is still being computed, the product-sum operation for the solution of the next linear equation is started. However, because the product-sum operation also involves the immediately preceding element, which is still being calculated at that point, the complete product-sum operation cannot yet be performed. The terms are therefore divided into a first half and a second half (the one that includes the immediately preceding element), a plurality of multipliers are arranged so that multiplications can be performed simultaneously, and multiplication is started from the second half. Once the immediately preceding element has been obtained, the multiplications of the first half are started. Compared with waiting for the immediately preceding element and then performing all the multiplications at once, this requires, for example, only half as many multipliers, and the processing time is almost unchanged.
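The scheduling idea can be modeled in software as below. This Python sketch is only a sequential model of the ordering of operations, assuming a lower triangular system Lx = y: the terms of the next row that do not depend on the pending solution are accumulated first, and the one remaining term is added once that solution is available. In hardware the early accumulation would overlap with the division; the sketch does not model that overlap, the tap switching, or the number of multipliers. The function name and the example values are hypothetical.

```python
def forward_substitution_lookahead(L, y):
    """Forward substitution for L x = y with the product-sum split into
    an early part (known solutions only) and a late part (the newest solution)."""
    n = len(y)
    x = [0.0] * n
    partial = 0.0   # product-sum terms of the current row accumulated in advance
    for i in range(n):
        if i > 0:
            # Late part: the single term that had to wait for x[i-1].
            partial += L[i][i - 1] * x[i - 1]
        # Linear operation (subtract and divide) for row i.
        pending = (y[i] - partial) / L[i][i]
        # Early part for row i+1: every term that does not involve x[i].
        # In hardware this would run while the division above is in progress.
        if i + 1 < n:
            partial = sum(L[i + 1][k] * x[k] for k in range(i))
        x[i] = pending
    return x


L = [[2.0, 0.0, 0.0],
     [1.0, 3.0, 0.0],
     [4.0, 5.0, 6.0]]
y = [2.0, 7.0, 32.0]
x = forward_substitution_lookahead(L, y)
```

The result is identical to ordinary forward substitution; only the order in which the product-sum terms are accumulated changes.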

[0146] In this way, according to the present invention, high-speed matrix operations can be realized efficiently with small, low-power hardware. The matrix operation device of the present invention is well suited to LSI implementation and can therefore also be applied to mobile communication devices such as mobile phones.

[Brief description of drawings]

FIG. 1 is a diagram showing the processing of a matrix operation involving a lower triangular matrix.

FIG. 2 is a circuit diagram showing the configuration of an example of the matrix operation device of the present invention.

FIG. 3 is a timing diagram for explaining the characteristic operation of the matrix operation device of FIG. 2.

FIG. 4 is a diagram for explaining an example of the processing performed when the matrix operation device of FIG. 2 obtains the solution of a linear equation.

FIG. 5 is a diagram showing the processing of a matrix operation involving an upper triangular matrix.

FIG. 6(a) is a block diagram showing the configuration of a CDMA receiver including a JD demodulation unit, and FIG. 6(b) is a diagram showing the format of the transmission data.

FIG. 7 is a diagram showing a propagation model of multi-user transmission signals.

FIG. 8 is a diagram showing the propagation model of FIG. 7 in matrix form.

FIG. 9 is a diagram for explaining the generation of the cross-correlation matrix (F).

FIG. 10 is a diagram showing the configuration of an adaptive array device to which the matrix operation device of the present invention is applied.

FIG. 11 is a diagram showing the configuration of an adaptive equalizer to which the matrix operation device of the present invention is applied.

FIG. 12 is a diagram for explaining a conventional matrix operation method using multiprocessors.

[Explanation of symbols]

101, 104, 109 Register
102 Shift register
103, 111, 113, 115 Memory
105, 106 Switch unit
108, 110 Adder
112 Subtractor
114 Divider
206 Matrix operation device

Continued from front page: F-terms (reference)
5B056 AA05 BB01 BB31 BB71
5J021 AA05 AA09 CA06 DB02 DB03 EA04 FA13 FA14 FA15 FA16 FA17 FA20 FA26 FA29 FA30 FA32 GA02 HA04 HA05 JA10
5K046 AA05 BA01 BA07 EE05 EE06 EE32 EE47 EF15
5K059 AA08 AA12 CC01 CC04 DD31

Claims

[Claims]

1. A matrix operation device that calculates the solution of simultaneous linear equations expressed as L (or U)·X = Y, where "L" is a known lower triangular matrix, "U" is a known upper triangular matrix, X is the matrix to be obtained and Y is a known matrix, by using forward substitution or backward substitution to obtain the values of all elements of the matrix X to be obtained, the device comprising: a product-sum operation unit of hardware configuration that performs the product-sum operation, over all previously obtained solutions including the solution of the immediately preceding linear equation, required to find the solution of the n-th (n is a natural number of 2 or more) linear equation; and a linear operation unit of hardware configuration that performs a predetermined linear operation on the value output from the product-sum operation unit to obtain the solution of the n-th linear equation, wherein, during the period in which the linear operation unit is performing the operation for obtaining the solution of the (n-1)-th linear equation, the product-sum operation unit executes in advance the product-sum operation for those partial terms, among the product-sum terms required for obtaining the solution of the n-th linear equation, that do not include the solution of the (n-1)-th linear equation.

2. The matrix operation device according to claim 1, wherein the multiplier forming the product-sum operation unit is used in a time division manner.

3. A matrix operation device that calculates the solution of simultaneous linear equations expressed as L (or U)·X = Y, where "L" is a known lower triangular matrix, "U" is a known upper triangular matrix, X is the matrix to be obtained and Y is a known matrix, by using forward substitution or backward substitution to obtain the values of all elements of the matrix X to be obtained, the device comprising: a first register that temporarily stores the solution of the immediately preceding linear equation needed to find the solution of the n-th (n is a natural number of 2 or more) linear equation; a shift register that temporarily stores the solutions of the linear equations preceding the immediately preceding linear equation needed to find the solution of the n-th linear equation, and that is folded back in the middle so as to be divided into a first half portion and a second half portion; a plurality of switches, one provided for each set, where a set consists of mutually corresponding taps of the first half portion and the second half portion of the folded shift register, or of the first register and the corresponding tap of the second half portion of the shift register, each switch selecting which of the two is output; a coefficient generator that generates, in a predetermined order, the general elements of the lower triangular matrix L or the upper triangular matrix U as coefficients by which the values output from the plurality of switches are multiplied; a plurality of multipliers that multiply the value output from each of the plurality of switches by the coefficient for that switch generated by the coefficient generator; an adder that adds the values output from the plurality of multipliers; a second register that temporarily stores a value output from the adder forming part of the product-sum operation result required to obtain the solution of the n-th linear equation, and returns the stored value to the adder when the next addition is performed in the adder; and a linear operation circuit that performs the necessary linear operation on a value output from the adder indicating the complete product-sum operation result required to obtain the solution of the n-th linear equation, calculates the solution of the n-th linear equation, and returns the obtained solution to the first register, wherein, before the solution of the n-th linear equation is set in the first register, the first register and the shift register are shifted by one stage; after the value indicating the complete product-sum operation result required to obtain the solution of the n-th linear equation has been output from the adder, and during the period in which the linear operation circuit is operating, the plurality of switches are switched so that the values of the taps in the second half portion of the shift register are output, and part of the product-sum operation required to obtain the solution of the next, (n+1)-th, linear equation is executed in advance; and thereafter the obtained solution is returned to the first register and the operation continues, whereby all elements of the matrix X to be obtained are determined.

4. A matrix operation device in which the coefficient generator outputs the values of the general elements of the lower triangular matrix or the upper triangular matrix as coefficients, and the linear operation circuit comprises a circuit that performs a division using a diagonal element of the lower triangular matrix or the upper triangular matrix as the divisor, or a multiplication equivalent to this division.

5. A joint detection demodulation device comprising: a cross-correlation matrix generation unit that generates a cross-correlation matrix by multiplying a matrix, in which the results of convolving a spreading code with an estimated channel impulse response are regularly arranged, by the conjugate transpose of that matrix; a Cholesky decomposition unit that applies Cholesky decomposition or modified Cholesky decomposition to the cross-correlation matrix to put it in the form of a product of a lower triangular matrix and an upper triangular matrix; and a simultaneous equation calculation unit that uses the matrix operation device according to any one of claims 1 to 4 to calculate, by forward substitution or backward substitution, the solution of simultaneous linear equations expressed in a form including the lower triangular matrix or the upper triangular matrix.

6. An adaptive array device, characterized in that a weight for multiplying a received signal of each antenna in the adaptive array is obtained by using the matrix operation device according to any one of claims 1 to 4.

7. An adaptive equalizer including a transversal filter, characterized in that the tap coefficients of the transversal filter are obtained by using the matrix operation device according to claim 1.

8. An inverse matrix calculation device, wherein the matrix calculation device according to any one of claims 1 to 4 is used to perform an inverse matrix calculation using Cholesky decomposition.

9. A wireless communication device equipped with any of the joint detection demodulation device according to claim 5, the adaptive array device according to claim 6, the adaptive equalizer according to claim 7, or the inverse matrix calculation device according to claim 8.