CA2017835A1 - Parallel distributed processing network characterized by an information storage matrix - Google Patents

Parallel distributed processing network characterized by an information storage matrix

Info

Publication number
CA2017835A1
Authority
CA
Canada
Prior art keywords
matrix
distributed processing
processing network
parallel distributed
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002017835A
Other languages
French (fr)
Inventor
Nikola Samardzija
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
EIDP Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Publication of CA2017835A1 publication Critical patent/CA2017835A1/en
Abandoned legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N3/065 - Analogue means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)
  • Multi Processors (AREA)

Abstract

PARALLEL DISTRIBUTED PROCESSING NETWORK CHARACTERIZED
BY AN
INFORMATION STORAGE MATRIX

ABSTRACT OF THE INVENTION

A single layer parallel distributed processing network is characterized by having connection weights between nodes that are defined by an [N x N] information storage matrix [A] that satisfies the matrix equation

[A] [T] = [T] [Λ]   (1)

where [Λ] is an [N x N] diagonal matrix the components of which are the eigenvalues of the matrix [A] and [T] is an [N x N] similarity transformation matrix whose columns are formed of some predetermined number M of target vectors (where M <= N) and whose remaining columns are formed of some predetermined number Q of slack vectors (where Q = N - M), both of which together comprise the eigenvectors of [A].

Description

TITLE

PARALLEL DISTRIBUTED PROCESSING NETWORK CHARACTERIZED BY AN INFORMATION STORAGE MATRIX

FIELD OF THE INVENTION

The present invention relates to a parallel distributed processing network wherein the connection weights are defined by an [N x N] information storage matrix [A] that satisfies the matrix equation

[A] [T] = [T] [Λ]   (1)

where [Λ] is an [N x N] diagonal matrix the elements of which are the eigenvalues of the matrix [A] and [T] is an [N x N] similarity transformation matrix whose columns are formed of some predetermined number M of target vectors (where M <= N) and whose remaining columns are formed of some predetermined number Q of slack vectors (where Q = N - M), both of which together comprise the eigenvectors of [A].

BACKGROUND OF THE INVENTION

Parallel distributed processing networks (also popularly known by the term "neural networks") have been shown to be useful for solving large classes of complex problems in analog fashion. They are a class of highly parallel computational circuits with a plurality of linear and non-linear amplifiers having transfer functions that define input-output relations, arranged in a network that connects the output of each amplifier to the input of some or all of the amplifiers. Such networks may be implemented in hardware (either in discrete or integrated form) or by simulation using a traditional von Neumann architecture digital computer.

Such networks are believed to be more suitable for certain types of problems than a traditional von Neumann architecture digital computer. Exemplary of the classes of problems with which parallel distributed processing networks have been used are associative memory, classification applications, feature extraction, pattern recognition, and logic circuit realization. These applications are often found in systems designed to perform process control, and signal and/or data processing. For example, copending application Serial Number 07/316,717, filed February 28, 1989 (ED-0373) and assigned to the assignee of the present invention, discloses an apparatus and method for controlling a process using a trained parallel distributed processing network.

There are numerous examples of parallel distributed processing networks described in the prior art used to solve problems in the areas listed above. Two of the frequently used parallel distributed processing network architectures are described by the Hopfield algorithm (J.J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proceedings of the National Academy of Sciences, USA, Vol. 81, pages 3088-3092, May 1984, Biophysics) and the back propagation algorithm (for example, see Rumelhart, Hinton, and Williams, "Learning Internal Representations by Error Propagation," Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume I: Foundations, Rumelhart and McClelland, editors, MIT Press, Cambridge, Massachusetts (1986)).

It has been found convenient to conceptualize a parallel distributed processing network in terms of an N-dimensional vector space having a topology comprising one or more localized energy minimum or equilibrium points surrounded by basins, to which the network operation will gravitate when presented with an unknown input. Moreover, since matrix mathematics has been demonstrated to accurately predict the characteristics of N-dimensional vectors in other areas, it has also been found convenient to characterize such parallel distributed processing networks and to analyze their behavior using traditional matrix techniques.

Both the Hopfield and the back propagation networks are designed using algorithms that share the following principles: (1) Based on the desired output quantities (often referred to as desired output, or target, vectors), network operation is such that for some (or any) input code (input vector) the network will produce one of the target vectors; (2) The network may be characterized by a linear operator [A], which is a matrix with constant coefficients, and a non-linear thresholding device denoted by v(.). The coefficients of the matrix [A] determine the connection weights between the amplifiers in the network, and v(.) represents a synaptic action at the output or input of each amplifier.

The essential problem in designing such a parallel distributed processing network is to find a linear operator [A] such that for some (or any) input vector Xin, [A] and v(.) will produce some desired output Xo, that is:

[A] Xin -> Xo   (2)

where the operation by v(.) is implicitly assumed.

-o-O-o-

The Hopfield and back propagation models derive the matrix operator [A] by using different techniques.

In the Hopfield algorithm the operator [A] essentially results from the sum of matrices created through the outer product operation on desired output, or target, vectors. That is, if Xo1, Xo2, . . ., Xon are the desired target vectors then

[A] = Xo1 Xo1^t + Xo2 Xo2^t + . . . + Xon Xon^t   (3)

where Xoi^t is the transpose of Xoi, and Xoi Xoi^t denotes an outer product for i = 1, . . ., n. The operator [A] is then modified by placing zeroes along the diagonal. Once such a matrix operator [A] is created, then for an input vector Xin the iterative procedure can be used to obtain a desired output Xoi, i.e.,

[A] Xin = X1
[A] X1 = X2
. . .
[A] Xk = Xoi   (4)

where again the operation by v(.) is implicitly assumed.
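
To make the prior-art construction concrete, the following sketch builds a Hopfield-style weight matrix from target vectors by summing outer products with the diagonal zeroed, then runs the thresholded iteration of Equation (4). It is an illustrative Python/NumPy reading of the cited algorithm with made-up target vectors, not code taken from the patent or from Hopfield's paper, and it takes the implicit v(.) to be a sign function.

```python
import numpy as np

def hopfield_matrix(targets):
    """Sum of outer products of the target vectors, diagonal forced to zero (Equation (3))."""
    A = sum(np.outer(x, x) for x in targets).astype(float)
    np.fill_diagonal(A, 0.0)           # Hopfield removes the self-connections
    return A

def iterate(A, x, steps=20):
    """Repeated multiply-and-threshold, per Equation (4)."""
    for _ in range(steps):
        x = np.sign(A @ x)
        x[x == 0] = 1                   # break ties toward +1
    return x

targets = np.array([[ 1,  1, -1, -1],
                    [ 1, -1,  1, -1]])
A = hopfield_matrix(targets)
print(iterate(A, targets[0]), iterate(A, targets[1]))   # each stored target is a fixed point
```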

Unfortunately, this algorithm, because of the way it is structured, may converge to a result that is not a desired target vector. An additional limitation of this algorithm is that when the network behaves as an associative memory for a given input, it will converge to a stable state only when that input is close (in Hamming distance) to the stable state. This is one of the serious limitations of the Hopfield net. Furthermore, there is very poor control over the speed and the way in which results converge. This is because the coefficients in matrix [A] (the connection weights between the amplifiers in the parallel distributed processing network) are "firmly" determined by the output vectors and the outer product operation, i.e., there is no flexibility in restructuring [A]. Note, the outer product operation always makes [A] symmetric.

The associative memory network disclosed in Hopfield, United States Patent 4,660,166 (Hopfield), uses an interconnection scheme that connects each amplifier output to the input of all other amplifiers except itself. In the Hopfield network as disclosed in this last mentioned patent, the connectivity matrix characterizing the connection weights has to be symmetric, and the diagonal elements need to be equal to zero. (Note that Figure 2 of the Hopfield paper referenced above is believed to contain an error in which the output of an amplifier is connected back to its input. Figure 2 of the last-referenced patent is believed to correctly depict the interconnection scheme of the Hopfield paper.) The Hopfield network has provided the basis for various applications. For example, see United States Patent 4,719,591 (Hopfield and Tank), where the network is applied to the problem of decomposition of signals into component signals. United States Patents 4,731,747 and 4,737,929 (both to Denker) improve the Hopfield network by adjusting the time constants of the amplifiers to control the speed of convergence, by using negative gain amplifiers that possess a single output, and by using a clipped connection matrix having only two values which permits the construction of the network with fewer leads. United States Patent 4,752,906 (Kleinfeld) overcomes the deficiency of the Hopfield network of not being able to provide temporal association by using delay elements in the output which are fed back to an input interconnection network. United States Patent 4,755,963 (Denker, Howard, and Jackel) extends the range of problems solvable by the Hopfield network.

The back propagation algorithm results in a multi-layer feed-forward network that uses a performance criterion in order to evaluate [A] (minimizing error at the output by adjusting coefficients in [A]). This technique produces good results but, unfortunately, is computationally intensive. This implies a long time for learning to converge. The back propagation network requires considerable time training or learning the information to be stored. Many techniques have been developed to reduce the training time. See, for example, copending application Serial Number 07/285,534, filed December 16, 1988 (ED-0367) and assigned to the assignee of the present invention, which relates to the use of stiff differential equations in training the back propagation network.

SUMMARY OF THE INVENTION

The present invention relates to a parallel distributed processing network comprising a plurality of amplifiers, or nodes, connected in a single layer, with each amplifier having an input and an output. The output of each of the nodes is connected to the inputs of some or of all of the other nodes in the network (including being fed back into itself) by a respective line having a predetermined connection weight. The connection weights are defined by an [N x N] matrix [A], termed the "information storage matrix", wherein the element Ai,j of the information storage matrix [A] is the connection weight between the j-th input node and the i-th output node.

In accordance with the present invention the information storage matrix [A] satisfies the matrix equation

[A] [T] = [T] [Λ]   (1).

The matrix [T] is an [N x N] matrix, termed the "similarity transformation matrix", the columns of which are formed from a predetermined number (M) of [N x 1] target vectors plus a predetermined number (Q) of [N x 1] arbitrary, or "slack", vectors. Each target vector represents one of the outputs of the parallel distributed processing network. The value of M can be <= N, and Q = (N - M). Preferably, each of the vectors in the similarity transformation matrix is linearly independent of all other of the vectors in that matrix. Each of the vectors in the similarity transformation matrix may or may not be orthogonal to all other of the vectors in that matrix.

If the matrix [T] is nonsingular so that the matrix [T]-1 exists, the information storage matrix [A] is defined as the matrix product

[A] = [T] [Λ] [T]-1   (5).

The matrix [Λ] is an [N x N] diagonal matrix, each element along the diagonal corresponding to a predetermined one of the target or the slack vectors. The relative value of each element along the diagonal of the [Λ] matrix corresponds to the rate of convergence of the outputs of the parallel distributed processing network toward the corresponding target vector. In general, the values of the elements of the diagonal matrix corresponding to the target vectors are preferably larger than the values of the elements of the diagonal matrix corresponding to the slack vectors. More specifically, the elements of the diagonal matrix corresponding to the target vectors have an absolute value greater than one while the values of the elements of the diagonal matrix corresponding to the slack vectors have an absolute value less than one.
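
The role of the eigenvalue magnitudes can be seen in a few lines of NumPy: repeated application of [A] grows the component of a state along an eigenvector whose eigenvalue exceeds one in magnitude and shrinks the component along an eigenvector whose eigenvalue is below one. This is an illustrative sketch with assumed two-dimensional values, not part of the patent disclosure.

```python
import numpy as np

T   = np.array([[1.0, 1.0],        # column 0: a target vector (assumed)
                [1.0, 0.0]])       # column 1: a slack vector (assumed)
Lam = np.diag([2.0, 0.5])          # |lambda| > 1 for the target, < 1 for the slack
A   = T @ Lam @ np.linalg.inv(T)   # Equation (5)

x = np.array([0.3, 1.0])           # arbitrary starting state
for _ in range(6):
    x = A @ x                      # linear action only, no thresholding here
print(x / np.linalg.norm(x))       # direction lines up with the target column of T
```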

The network in accordance with the present invention provides some advantages over the networks discussed hereinbefore.

The information storage matrix [A] is more general, i.e., it does not have to be symmetric, or closely symmetric, and it does not require the diagonal elements to be equal to zero as in the Hopfield network. This means that the hardware realization is also more general. The cognitive behavior of the information storage matrix is more easily understood than the prior art. When an input vector is presented to the network and the network converges to a solution which is not a desired or targeted vector, a cognitive solution has been reached, which is, in general, a linear combination of target vectors.

The inclusion of the arbitrary vectors in the formation of the similarity transformation matrix allows flexibility in molding basins for the target vectors. The presence of such slack vectors is not found in the Hopfield algorithm.

The inclusion of the matrix [Λ] as one of the factors in forming the information storage matrix is also a feature not present in the Hopfield network. The speed of convergence to a target solution is controllable by the selection of the values of the [Λ] matrix.

In addition, the computation of the components of the information storage matrix is faster and more efficient than computation of the connectivity matrix using the back propagation algorithm. This is because the back propagation algorithm utilizes a generalized delta-rule for determining the connectivity matrix. This rule, however, is at least an order of magnitude computationally more intensive than the numerical techniques used for the information storage matrix.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be more fully understood from the following detailed description thereof, taken in connection with the accompanying drawings, which form a part of this application and in which:

Figure 1 is a generalized schematic diagram of a portion of a parallel distributed processing network the connection weights of which are characterized by the components of an information storage matrix in accordance with the present invention;

Figure 2A is a schematic diagram of a given amplifier, including the feedback and biasing resistors, corresponding to an element in the information storage matrix having a value greater than zero;

Figure 2B is a schematic diagram of a given amplifier, including the feedback and biasing resistors, corresponding to an element in the information storage matrix having a value less than zero;

Figure 3 is a schematic diagram of a nonlinear thresholding amplifier implementing the synaptic action of the function v(.); and

Figure 4 is a schematic diagram of a parallel distributed processing network in accordance with the present invention used in Example II herein.

The Appendix, which forms part of this application, comprises pages A-1 through A-6 and is a listing, in Fortran language, of a program for implementing a parallel distributed processing network in accordance with the present invention on a traditional von Neumann architecture digital computer. The listing implements the network used in Example II herein.

DETAILED DESCRIPTION OF THE INVENTION

Throughout the following detailed description similar reference numerals refer to similar elements in all Figures of the drawings.

The parallel distributed processing network in accordance with the present invention will first be discussed in terms of its underlying theory and mathematical basis, after which schematic diagrams of various implementations thereof will be presented. Thereafter, several examples of the operation of the parallel distributed processing network in accordance with the present invention will be given.

-o-O-o-

As noted earlier it has been found convenient to conceptualize a parallel distributed processing network in terms of an N-dimensional vector space. Such a space has a topology comprising one or more localized equilibrium points each surrounded by a basin to which the network operation will gravitate when presented with an unknown input. The input is usually presented to the network in the form of a digital code, comprised of N binary digits, usually with values of 1 and -1, respectively. The N-dimensional space would accommodate 2^N possible input codes.

The network in accordance with the present invention may be characterized using an [N x N] matrix, hereafter termed the "information storage matrix", that specifies the connection weights between the amplifiers implementing the parallel distributed processing network. Because of the operational symmetricity inherent when using the information storage matrix only one-half of the 2^N possible input codes are distinct. The other codes (2^(N-1) in number) are complementary.

In general, the information storage matrix [A] is the [N x N] matrix that satisfies the matrix equation:

[A] [T] = [T] [Λ]   (1)

Equation (1) defines an eigenvalue problem in which each λi, that is, each element in the [Λ] matrix, is an eigenvalue and each column vector in the similarity transformation matrix [T] is the associated eigenvector. Equation (1) can have up to N distinct solution pairs. This matrix equation may be solved using Gaussian Elimination techniques or the Delta Rule. When [T]-1 exists the information storage matrix [A] is formed by the matrix product:

[A] = [T] [Λ] [T]-1   (5).
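
As an illustration of Equation (5), the fragment below forms [A] from an assumed [T] and [Λ] and checks that Equation (1) holds. The particular vectors and eigenvalues are placeholders chosen for the sketch, not values taken from the patent.

```python
import numpy as np

x1 = np.array([ 1.0,  1.0, -1.0])     # target vector (assumed)
x2 = np.array([ 1.0, -1.0,  1.0])     # target vector (assumed)
z3 = np.array([ 0.0,  0.0,  1.0])     # slack vector (assumed)

T   = np.column_stack([x1, x2, z3])   # similarity transformation matrix
Lam = np.diag([2.0, 2.0, 0.5])        # eigenvalues: > 1 for targets, < 1 for the slack vector
A   = T @ Lam @ np.linalg.inv(T)      # Equation (5)

assert np.allclose(A @ T, T @ Lam)    # Equation (1): [A][T] = [T][Lambda]
```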

The matrix [T] is termed the "similarity transformation matrix" and is an [N x N] matrix the columns of which are formed from a predetermined number (M) of [N x 1] target vectors. Each target vector takes the form of one of the 2^N possible codes able to be accommodated by the N-dimensional space representing the network. Each target vector represents one of the desired outputs, or targets, of the parallel distributed processing network. Each target vector contains information that is desired to be stored in some fashion and retrieved at some time in the future, and thus the set {X1, X2, . . ., XM} of M = N target vectors forms the information basis of the network. Preferably each target vector in the set is linearly independent from the other target vectors, and any vector X in N-dimensional space can thus be expressed as a linear combination of the set of target vectors. In this event, the inverse of the similarity transformation matrix [T] exists. Some or all of the M target vectors may or may not be orthogonal to each other, if desired.

The number M of target vectors may be less than the number N, the dimension of the information storage matrix [A]. If less than N target vectors are specified (that is, M < N), the remainder of the similarity transformation matrix [T] is completed by a predetermined number Q of [N x 1] arbitrary, or slack, vectors (where Q = N - M).

The slack vectors are fictitious from the storage point of view since they do not require the data format characteristic of target vectors. However, it turns out that in most applications the slack vectors are important. The elements of the slack vectors should be selected such that the slack vectors do not describe one of the possible codes of the network. For example, if in a typical case the target vectors are each represented as a digital string N bits long (that is, composed of the binary digits 1 and -1, e.g., [1 -1 1 . . . -1 -1 1]), forming a slack vector from the same binary digits would suppress the corresponding code. Accordingly, a slack vector should be formed of digits that clearly distinguish it from any of the 2^N possible target vectors. In general, a slack vector may be formed with one (or more) of its elements having a fractional value, a zero value, and/or positive or negative integer values.

The slack vectors are important in that they assist in contouring the topology and shaping the basins of the N-dimensional space corresponding to the network.

In sum, dependent upon the value of the number M, the target vectors may form all, or part, of the information storage spectrum of the matrix [A]. If less than N target vectors are specified, then the remaining vectors in the transformation matrix are arbitrary or slack vectors. In each instance the vectors in the similarity transformation matrix [T] form the geometric spectrum of the information storage matrix [A] (i.e., they are the eigenvectors of [A]).

The [Λ] matrix is an [N x N] diagonal matrix that represents the collection of all eigenvalues of the information storage matrix [A] and is known as the algebraic spectrum of [A]. Each element of the [Λ] matrix corresponds to a respective one of the target or slack vectors. The values assigned to the elements of the [Λ] matrix determine the convergence properties of the network. The freedom in selecting the values of the [Λ] matrix implies that the speed of the network can be controlled. Thus, the time required for the network to reach a decision or to converge to a target after initialization can be controlled by the appropriate selection of the values of the [Λ] matrix.

The values assigned to the elements of the [Λ] matrix have an impact on the network of the present invention. If a preassigned λi > 1, then the corresponding eigenvector Ti (which contains a desired output information) will determine an asymptote in the N-dimensional information space that will motivate the occurrence of the desired event. If a preassigned λi < 1, then the corresponding eigenvector Ti will determine an asymptote in the N-dimensional information space that will suppress the occurrence of the event. If a preassigned λi >> 1, then the network will converge quickly to the corresponding target vector, approximating the feed-forward action of a back propagation network.

Preferably, the values assigned to the elements of the [Λ] matrix corresponding to the target vectors are greater than the values of the elements of the [Λ] matrix corresponding to the slack vectors. More specifically, the elements of the diagonal matrix corresponding to the target vectors have an absolute value greater than one while the values of the elements of the diagonal matrix corresponding to the slack vectors have an absolute value less than one.

In sum, through assigning the λi's and selecting slack vectors, speed of network convergence and flexibility in shaping the basins associated with fixed equilibrium points are respectively obtained.

To evaluate the information storage matrix [A] two methods can be used. These methods are either the Gaussian elimination method or the Delta-rule method.

The evaluation of the information storage matrix [A] using the Gaussian elimination method will be first addressed. Let X1, X2, . . ., XM, ZM+1, . . ., ZN be the basis in R^N (i.e., the Xi's represent target vectors and the Zi's are slack vectors), where R^N represents an N-dimensional real vector space. Using this basis construct a similarity transformation matrix

[T] = [ X1, X2, . . ., XM, ZM+1, . . ., ZN ]

To it associate the diagonal matrix

[Λ] = diag( λ1, λ2, . . ., λN )

that contains eigenvalues predetermined for each element in the basis. Next, form the matrix equation

[A] [T] = [T] [Λ]   (1)

and solve it for [A] using the Gaussian elimination method. It is more convenient to solve the problem

[T]^t [A]^t = [Λ] [T]^t   (6)

since in this transposed version of Equation (1) the matrix coefficients fall out in the natural form with respect to the coefficients of [A]. Note, Equation (1) or Equation (6) produces N^2 linearly coupled equations which determine the N^2 coefficients of [A].
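
A direct way to carry out the Gaussian elimination step of Equation (6) numerically is to hand the transposed system to a linear solver, which factors [T]^t by elimination with pivoting rather than forming an explicit inverse. The sketch below is one such reading, with placeholder vectors; it is not the Fortran routine referenced in the Appendix.

```python
import numpy as np

# assumed basis: two target vectors and one slack vector, stacked as the columns of T
T   = np.column_stack([[ 1.0,  1.0, -1.0],
                       [ 1.0, -1.0,  1.0],
                       [ 0.0,  0.0,  1.0]])
Lam = np.diag([2.0, 2.0, 0.5])

# Equation (6): T^t A^t = Lam T^t  -> solve for A^t, then transpose
At = np.linalg.solve(T.T, Lam @ T.T)
A  = At.T

assert np.allclose(A @ T, T @ Lam)    # A satisfies Equation (1)
```
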
A second method is the Delta-rule method. Here a set of linear equations is formed:

[A] X1 = λ1 X1
.
.
[A] XM = λM XM   (7)
[A] ZM+1 = λM+1 ZM+1
.
.
[A] ZN = λN ZN

in which the λi's are predetermined eigenvalues. Now applying the Delta rule in an iterative fashion to the system of linear Equations (7), [A] can be evaluated. The Delta Rule is discussed in W. P. Jones and J. Hoskins, "Back Propagation", Byte, pages 155-162, October 1987.

Comparing the two methods it is found that the Gaussian elimination technique is faster than the Delta-Rule. If the inverse of the [T] matrix exists, the information storage matrix may be found by forming the matrix product of Equation (5).

-o-O-o-

Figure 1 is a generalized schematic diagram of a portion of a parallel distributed processing network in accordance with the present invention. The network, generally indicated by the reference character 10, includes a plurality of amplifiers, or nodes, 12 connected in a single layer 14. The network 10 includes N amplifiers 12-1 through 12-N, where N corresponds to the dimension of the information storage matrix [A] derived as discussed earlier. In Figure 1 only four of the amplifiers 12 are shown, namely the first amplifier 12-1, the i-th and the j-th amplifiers 12-i and 12-j, respectively, and the last amplifier 12-N. The interconnection of the other of the N amplifiers comprising the network 10 is readily apparent from the drawing of Figure 1. By way of further explanation Figure 4 is a schematic diagram that illustrates a specific parallel distributed processing network 10 used in Example II to follow, where N is equal to 4, and is provided only to illustrate a fully interconnected network 10. The specific values of the resistors used in the network shown in Figure 4 are also shown.

Each amplifier 12 has an inverting input port 16, a noninverting input port 18, and an output port 20. The output port 20 of each amplifier 12 is connected to the inverting input port 16 thereof by a line containing a feedback resistor 22. In addition the output port 20 of each amplifier 12 is applied to a squasher 26 which implements the thresholding nonlinearity or synaptic squashing v(.) discussed earlier. The detailed diagram of the squasher 26 is shown in Figure 3.

The signal at the output port 20 of each of the N amplifiers 12, after the same has been operated upon by the squasher 26, is connected to either the inverting input port 16 or the noninverting input port 18, as will be explained, of some or all of the other amplifiers 12 in the network 10 (including itself) by a connection line 30. The interconnection of the output of any given amplifier to the input of another amplifier is determined by the value of the corresponding element in the information storage matrix [A], with an element value of zero indicating no connection, a positive element value indicating connection to the noninverting input, and a negative element value indicating connection to the inverting input.

Each connection line 30 contains a connectivity resistor 34, which is also subscripted by the same variables i,j, denoting that the given subscripted connectivity resistor 34 is connected in the line 30 that connects the j-th input to the i-th output amplifier. The connectivity resistor 34 defines the connection weight of the line 30 between the j-th input and the i-th output amplifier. The value of the connectivity resistor 34 is related to the corresponding subscripted variable in the information storage matrix, as will be understood from the discussion that follows.

Each of the lines 30 also includes a delay element 38 which has a predetermined signal delay time associated therewith which is provided to permit the time sequence of each iteration needed to implement the iterative action (mathematically defined in Equation (4)) by which the output state of the given amplifier 12 is reached. The same subscripted variable scheme as applied to the connection lines and their resistors applies to the delay lines. As noted earlier, the values assigned to the eigenvalues λ in the [Λ] matrix correspond to the time (or the number of iterations) required for the network 10 to settle to a decision.

The manner in which the values of the connectivity resistors 34 are realized, as derived from the corresponding elements of the information storage matrix [A], and the gains of the amplifiers 12 in the network 10 may now be discussed with reference to Figures 2A and 2B.

An input vector applied to the network 10 takes the form:

Xin = [ x1, x2, . . ., xN ]^t

The information storage matrix [A], when evaluated in the manner earlier discussed, takes the following form:

      | A1,1  . . .  A1,N |
[A] = | A2,1  . . .  A2,N |
      |  .             .  |
      | AN,1  . . .  AN,N |

where each element Ai,j of the information storage matrix [A] is either a positive or a negative real constant, or zero.

When an element Ai,j of the information storage matrix [A] is positive, the relationship between the value of that element Ai,j and the values of the feedback resistor 22 (RF) and the connectivity resistor 34 (Ri,j) may be understood from Figure 2A. The line 30 containing the resistor 34 (Ri,j) is connected to the noninverting input port 18 of the amplifier 12, and the gain of the amplifier 12 is given by the following:

e0 / Xj = [ 1 + ( RF / Ri,j ) ] = | Ai,j |   (8)

where e0 is the voltage of the output signal at the output port 20 of the amplifier 12. Typically the values of the feedback resistors 22 (RF) for the entire network are fixed at a predetermined constant value, and the values of the connectivity resistors Ri,j may be readily determined from Equation (8).

When an element Ai,j of the information storage matrix [A] is negative, the relationship between the value of that element Ai,j and the values of the feedback resistor 22 (RF) and the connectivity resistor 34 (Ri,j) may be understood from Figure 2B. The line 30 containing the resistor 34 (Ri,j) is, in this case, connected to the inverting input port 16 of the amplifier 12, and the gain of the amplifier 12 is given by the following:

e0 / Xj = - RF / Ri,j = - | Ai,j |   (9)

where e0 is the voltage of the output signal at the output port 20 of the amplifier 12. Again, with the values of the feedback resistors 22 (RF) for the entire network fixed at the predetermined constant value, the values of the connectivity resistors Ri,j may be readily determined from Equation (9).

It should be noted that if the element Ai,j of the information storage matrix is between zero and 1 then one can use hardware or software techniques to eliminate difficulties in its realization. For example, a software technique would require adjusting coefficients so that the value of Ai,j becomes greater than 1. A hardware technique would cascade two inverting amplifiers to provide a positive value in the region specified.

When an element Ai,j of the information storage matrix [A] is zero, there is no connection between the j-th input node and the i-th output node.

Figure 3 shows a schematic diagram of the nonlinear thresholding device, or squasher, that generates the function v(.). The squasher 26 defines a network that limits the value of the output of the node 12 to a range defined between a predetermined upper limit and a predetermined lower limit. The upper and the lower limits are, typically, +1 and -1, respectively.
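
In a software simulation the squasher reduces to a clipping function applied after every matrix multiplication. The snippet below shows that reading, with the ±1 limits given in the text; the two-node matrix and the 25-step horizon are assumptions of the sketch, not of the Figure 3 circuit.

```python
import numpy as np

def squash(x, lower=-1.0, upper=1.0):
    """Hard-limit each node output to [lower, upper] (the v(.) nonlinearity)."""
    return np.clip(x, lower, upper)

def settle(A, x, iterations=25):
    """One multiply-and-squash per delay-line step, per Equation (4)."""
    for _ in range(iterations):
        x = squash(A @ x)
    return x

A = np.diag([2.0, 0.5])                 # toy two-node network (assumed values)
print(settle(A, np.array([0.3, -0.2]))) # the |lambda| > 1 direction saturates at +1, the other decays
```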

It should be understood that the network 10 illustrated in the schematic diagrams of Figures 1 to 4 may be physically implemented in any convenient format. That is to say, it lies within the contemplation of this invention that the network 10 be realized in an electronic hardware implementation, an optical hardware implementation, or a combination of both. The electronic hardware implementation may be effected by interconnecting the components thereof using discrete analog devices and/or amplifiers, resistors, delay elements such as capacitors or RC networks; or by the interconnection of integrated circuit elements; or by integrating the entire network using any integrated circuit fabrication technology on a suitable substrate diagrammatically indicated in the Figures by the character S. In addition the network may be realized using a general purpose digital computer, such as a Hewlett Packard Vectra, a Digital Equipment VAX* or a Cray X-MP*, operating in accordance with a program. In this regard the Appendix contains a listing, in Fortran language, whereby the network 10 may be realized on a Digital Equipment VAX. The listing implements the network used in Example II herein.

* trade mark

EXAMPLES

The operation of the parallel distributed processing network of the present invention will now be discussed in connection with the following Examples I and II.

Example I

Example I is an example of the use of the parallel distributed processing network as a Classifier. A big corporation has a need to collect and process a personal data file of its constituents. The data collected reflects the following personal profile:

        PROFILE                                                 CODE
MALE    SINGLE     NOT DIVORCED     NO KIDS     COLLEGE           1
FEMALE  MARRIED    DIVORCED         KIDS        NO COLLEGE       -1

and is entered into computer files every time a new member joins the corporation. For a new, married, not previously divorced, college graduate, male member of the corporation without children the entry would take the form:

Name: John Doe

Member:  Male/Female:  Single/Married:  Not divorced/Divorced:  No kids/Kids:  College/No college:
  1          1              -1                    1                   1                1

Thus, each member has a 6-bit code that describes the personal profile associated with her/his name. The name and code are entered jointly into the data file. The "member" entry is included to account for the symmetric operation of the network 10 characterized by the information storage matrix.

This corporation has thousands of constituents and requires a fast parallel distributed processing network that will classify members according to the information given in a profile code. Suppose the corporation wants to know the names of all members that fall within the following interest groups: (1) male and single; (2) female and single; and (3) female and married. These three interest groups are reflected in the following table:

CLASSIFICATION 1: MALE AND SINGLE

MALE       X     dc
FEMALE     X     X

where "dc" represents "don't care".

In addition, one may also desire the classifications presented below:

CLASSIFICATION 2:

MALE       X     dc
FEMALE     X     X

CLASSIFICATION 3:

MALE       X     dc
FEMALE     X     X

CLASSIFICATION 4:

           NO COLLEGE    COLLEGE
MALE           X            dc
FEMALE         X            X

These four classifications now can be used to generate target vectors. By examining the MALE status in all four classifications a target vector is obtained.

X1 = [ 1, 1, 1, -1, 1, 1 ]^t, where the entries signify:

 1 - member
 1 - male
 1 - single
-1 - divorced
 1 - no kids
 1 - with college

Similarly, two more target vectors can be derived by examining the FEMALE status in all four classifications:

X2 = [ 1, -1, 1, -1, 1, -1 ]^t

 1 - member
-1 - female
 1 - single
-1 - divorced
 1 - no kids
-1 - no college

X3 = [ 1, -1, -1, 1, -1, 1 ]^t

 1 - member
-1 - female
-1 - married
 1 - not divorced
-1 - kids
 1 - college

Clearly there are three target vectors and the information dimension is six. Thus three more slack vectors are needed. Let ei denote the vector with a 1 in the i-th position and zeros elsewhere; the vectors e1, e2, . . ., e6 define a standard basis set in R^6. Now select Z4 = e4, Z5 = e5, and Z6 = e6 to be the three additional slack vectors and form the similarity transformation matrix

T1 = [ X1, X2, X3, Z4, Z5, Z6 ]

Also select the following diagonal matrix

[Λ] = diag( 2.5, 2.5, 2.5, .5, .5, .5 )

T1 and [Λ] will produce the information storage matrix. This matrix, when executed against all possible codes (i.e., 32 distinct elements in the code considered), will produce four basins illustrated in Table 1. The first basin in Table 1 shows all elements of the code that converge to target X1. Similarly, the third and fourth basins are responsible for X2 and X3. Each code falling in these target basins will increment a suitable counter. However, the second basin is responsible for the cognitive solution C = [1, 1, -1, 1, -1, 1]^t that is not of our interest (i.e., C gives a classification for member, male, married, not divorced, with kids, with college).

Whenever the code associated with a member's name converges to X1, the member's name is entered into a class of male and single. Similarly, if the code converges to X2 or X3, the name is entered into female and single or female and married, respectively. But when the code converges to C, the name is ignored. The arrows in Table 1 indicate common information within each basin. Thus T1 and [Λ] are used to design an information storage matrix for a parallel distributed processing network that is executing the function of CLASSIFICATION 1.
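
For readers following this example in software, the sketch below assembles T1 and [Λ] from the three target vectors and slack vectors above, forms the information storage matrix by Equation (5), and buckets every 6-bit code by the state it settles into. The hard-limit squasher and the iteration count are simulation assumptions, and the 2.5/0.5 eigenvalue split follows the partly garbled values quoted in this example.

```python
import numpy as np
from itertools import product

X1 = np.array([ 1,  1,  1, -1,  1,  1], dtype=float)
X2 = np.array([ 1, -1,  1, -1,  1, -1], dtype=float)
X3 = np.array([ 1, -1, -1,  1, -1,  1], dtype=float)
E  = np.eye(6)
T1  = np.column_stack([X1, X2, X3, E[:, 3], E[:, 4], E[:, 5]])   # slack vectors e4, e5, e6
Lam = np.diag([2.5, 2.5, 2.5, 0.5, 0.5, 0.5])
A   = T1 @ Lam @ np.linalg.inv(T1)                               # Equation (5)

def settle(x, steps=30):
    for _ in range(steps):
        x = np.clip(A @ x, -1.0, 1.0)                            # multiply, then squash
    return np.sign(x)

basins = {}
for code in product([-1.0, 1.0], repeat=6):
    basins.setdefault(tuple(settle(np.array(code))), []).append(code)
print(len(basins), "distinct settled states over all 64 input codes")
```
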
Furthermore, if [Λ] is used with the following similarity transformations:

T2 = [X1, X2, X3, Z4, Z5, Z6], where Z4 = e3, Z5 = e5 and Z6 = e6
T3 = [X1, X2, X3, Z4, Z5, Z6], where Z4 = e3, Z5 = e4 and Z6 = e6
T4 = [X1, X2, X3, Z4, Z5, Z6], where Z4 = e3, Z5 = e4 and Z6 = e5

then Tables 2, 3, and 4 are obtained. These tables, respectively, produce CLASSIFICATIONS 2, 3, and 4.

Next suppose that one requires to obtain the information illustrated in CLASSIFICATIONS 5, 6, and 7 below:

CLASSIFICATION 5

              SINGLE      MARRIED
COLLEGE          X            X
NO COLLEGE       X            dc

CLASSIFICATION 6

              DIVORCED    NOT DIVORCED
COLLEGE          X            X
NO COLLEGE       X            dc

CLASSIFICATION 7

              NO KIDS     KIDS
COLLEGE          X            X
NO COLLEGE       X            dc

then the same target vectors and [Λ] can be used. The new similarity transformations are:

T5 = [X1, X2, X3, Z4, Z5, Z6], where Z4 = e2, Z5 = e4 and Z6 = e5
T6 = [X1, X2, X3, Z4, Z5, Z6], where Z4 = e2, Z5 = e3 and Z6 = e5
T7 = [X1, X2, X3, Z4, Z5, Z6], where Z4 = e2, Z5 = e3 and Z6 = e4

The basin results are illustrated in Tables 5, 6, and 7.
'` ` '~ '' `~! ' i -, ;
' ' ' ' !' ~ . :
` ~:'.' :' ' ' , , . . ;: . :: : .
: . , ,: ~ . ;.. .
-; . ' ' ~ ' ' :. ' ~`' . ~
' . . ~: ' ~. '` ' .. :;.
.. .~. ` .; ~ . .

Example II

Logic circuit realization: Consider a 4-bit symmetric complementary code such that

1 1 -1 -1   is equivalent to   -1 -1 1 1

Next, we want to design a logic circuit in which whenever C = D, the state X1 = [1, 1, -1, -1]^t is executed, and whenever C does not equal D, the state X2 = [1, -1, 1, -1]^t is obtained. To do this consider a similarity transformation T = [X1, X2, Z3, Z4], where Z3 = e1, Z4 = e2 and

[Λ] = diag( 2, 2, -.5, -.5 )

These matrices produce a connectivity pattern given by the following information storage matrix

      | -.5    0     0   -2.5 |
[A] = |   0  -.5  -2.5     0  |
      |   0    0     2      0 |
      |   0    0     0      2 |

which when iterated and "squashed", as prescribed in the Figures, will produce basins for targets X1 and X2 given below:

basin for X1        basin for X2

[the original table lists the 4-bit input codes falling within each basin, with the rows containing the target codes marked "target"]

Thus the logic is performed and realized through the analog circuit. Observe that in this network there are no cognitive solutions.

The schematic diagram for a parallel distributed processing network used in this Example II is shown in Figure 4. Thus [Y] = [A] [X], that is,

| Y1 |   | -.5    0     0   -2.5 |   | X1 |
| Y2 | = |   0  -.5  -2.5     0  | . | X2 |
| Y3 |   |   0    0     2      0 |   | X3 |
| Y4 |   |   0    0     0      2 |   | X4 |

so

Y1 = -.5 X1 - 2.5 X4
Y2 = -.5 X2 - 2.5 X3
Y3 = 2 X3
Y4 = 2 X4

The values of the resistors (assuming a 1K feedback resistor) are derived using Equations (8) and (9). The Appendix, containing pages A-1 through A-6, is a Fortran listing implementing the network shown in Figure 4 on a Digital Equipment VAX computer.
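
A quick simulation of Example II can confirm the behavior described above: the code below rebuilds [A] from T and [Λ], checks it against the printed matrix, and iterates every 4-bit code through the multiply-and-squash loop. The ±1 clipping and the 20-step settling horizon are simulation choices, not values specified in the patent.

```python
import numpy as np
from itertools import product

X1, X2 = np.array([1., 1., -1., -1.]), np.array([1., -1., 1., -1.])
T   = np.column_stack([X1, X2, np.eye(4)[:, 0], np.eye(4)[:, 1]])   # Z3 = e1, Z4 = e2
Lam = np.diag([2.0, 2.0, -0.5, -0.5])
A   = T @ Lam @ np.linalg.inv(T)
print(np.round(A, 2))            # matches the information storage matrix printed above

def settle(x, steps=20):
    for _ in range(steps):
        x = np.clip(A @ x, -1.0, 1.0)
    return np.sign(x)

for code in product([-1.0, 1.0], repeat=4):
    out = settle(np.array(code))
    print(code, "->", out)       # every code ends on X1, X2, or one of their complements
```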

Those skilled in the art, having the benefit of the teachings of the present invention, may impart numerous modifications thereto. It should be understood, however, that such modifications lie within the contemplation of the present invention, as defined by the appended claims.

Claims (60)

1. A parallel distributed processing network comprising a plurality of nodes connected in a single layer, each having an input and an output, wherein the output of each of the nodes is connected to the input of some predetermined ones of the nodes in the network by a predetermined connection weight,

wherein the connection weights are defined by an [N x N] information storage matrix [A] wherein the element Ai,j of the information storage matrix [A] is the connection weight between the j-th input node and the i-th output node,

wherein the information storage matrix [A] satisfies the matrix equation [A] [T] = [T] [Λ]

where [T] is an [N x N] similarity transformation matrix the columns of which are formed from a predetermined number (M) of [N x 1] target vectors plus a predetermined number (Q) of [N x 1] arbitrary vectors, each target vector representing one of the outputs of the parallel distributed processing network, where M <= N, and Q = (N - M), and where [Λ] is an [N x N] diagonal matrix, each element along the diagonal of the matrix [Λ] corresponding to a predetermined one of the vectors in the similarity transformation matrix, the relative value of each element along the diagonal of the matrix [Λ] corresponding to the rate of convergence of the parallel distributed processing network toward the corresponding target vector.
2. The parallel distributed processing network of claim 1 wherein the output of each of the nodes is connected to the input of all of the nodes in the network including itself.
3. The parallel distributed processing network of claim 1 wherein all of the vectors in the similarity transformation matrix are linearly independent.
4. The parallel distributed processing network of claim 3 wherein all of the vectors in the similarity transformation matrix are orthogonal.
5. The parallel distributed processing network of claim 2 wherein all of the vectors in the similarity transformation matrix are linearly independent.
6. The parallel distributed processing network of claim 5 wherein all of the vectors in the similarity transformation matrix are orthogonal.
7. The parallel distributed processing network of claim 5 wherein each of the elements of the diagonal matrix [Λ] that corresponds to an arbitrary vector in the similarity transformation matrix has a value less than the value of each of the components of the diagonal matrix that corresponds to a target vector in the similarity transformation matrix.
8. The parallel distributed processing network of claim 3 wherein each of the elements of the diagonal matrix [Λ] that corresponds to an arbitrary vector in the similarity transformation matrix has a value less than the value of each of the components of the diagonal matrix that corresponds to a target vector in the similarity transformation matrix.
9. The parallel distributed processing network of claim 2 wherein each of the elements of the diagonal matrix [Λ] that corresponds to an arbitrary vector in the similarity transformation matrix has a value less than the value of each of the components of the diagonal matrix that corresponds to a target vector in the similarity transformation matrix.
10. The parallel distributed processing network of claim 1 wherein each of the elements of the diagonal matrix [Λ] that corresponds to an arbitrary vector in the similarity transformation matrix has a value less than the value of each of the components of the diagonal matrix that corresponds to a target vector in the similarity transformation matrix.
11. The parallel distributed processing network of claim 5 wherein each of the elements of the diagonal matrix [Λ] that corresponds to an arbitrary vector in the similarity transformation matrix has an absolute value less than one (1) and each of the components of the diagonal matrix that corresponds to a target vector in the similarity transformation matrix has an absolute value greater than one (1).
12. The parallel distributed processing network of claim 3 wherein each of the elements of the diagonal matrix [Λ] that corresponds to an arbitrary vector in the similarity transformation matrix has an absolute value less than one (1) and each of the components of the diagonal matrix that corresponds to a target vector in the similarity transformation matrix has an absolute value greater than one (1).
13. The parallel distributed processing network of claim 2 wherein each of the elements of the diagonal matrix [Λ] that corresponds to an arbitrary vector in the similarity transformation matrix has an absolute value less than one (1) and each of the components of the diagonal matrix that corresponds to a target vector in the similarity transformation matrix has an absolute value greater than one (1).
14. The parallel distributed processing network of claim 1 wherein each of the elements of the diagonal matrix [Λ] that corresponds to an arbitrary vector in the similarity transformation matrix has an absolute value less than one (1) and each of the components of the diagonal matrix that corresponds to a target vector in the similarity transformation matrix has an absolute value greater than one (1).
15. The parallel distributed processing network of claim 2 further comprising a delay element disposed between the output of each node and the input of the other nodes to which it is connected.
16. The parallel distributed processing network of claim 1 further comprising a delay element disposed between the output of each node and the input of the other nodes to which it is connected.
17. The parallel distributed processing network of claim 16 further comprising a network connected to the output of each of the nodes that limits the value of the output of the node to a range defined between an upper and a lower predetermined limit.
18. The parallel distributed processing network of claim 15 further comprising a network connected to the output of each of the nodes that limits the value of the output of the node to a range defined between an upper and a lower predetermined limit.
19. The parallel distributed processing network of claim 2 further comprising a network connected to the output of each of the nodes that limits the value of the output of the node to a range defined between an upper and a lower predetermined limit.
20. The parallel distributed processing network of claim 1 further comprising a network connected to the output of each of the nodes that limits the value of the output of the node to a range defined between an upper and a lower predetermined limit.
21. The parallel distributed processing network of claim 20 wherein each of the nodes has an inverting and a noninverting input terminal, and wherein the i-th output node is connected to the noninverting input port of the j-th input node when the corresponding element Ai,j of the information storage matrix is greater than zero.
22. The parallel distributed processing network of claim 21 wherein the j-th input node has a feedback resistor connected between its output port and its inverting input port, and a connectivity resistor connected at its inverting input port, and wherein the feedback resistor and the connectivity resistor are related to the corresponding element Ai,j of the information storage matrix by the relationship | Ai,j | = [ 1 + ( RF / Ri,j ) ].
23. The parallel distributed processing network of claim 19 wherein each of the nodes has an inverting and a noninverting input terminal, and wherein the i-th output node is connected to the noninverting input port of the j-th input node when the corresponding element Ai,j of the information storage matrix is greater than zero.
24. The parallel distributed processing network of claim 23 wherein the j-th input node has a feedback resistor connected between its output port and its inverting input port, and a connectivity resistor connected at its inverting input port, and wherein the feedback resistor and the connectivity resistor are related to the corresponding element Ai,j of the information storage matrix by the relationship | Ai,j | = [ 1 + ( RF / Ri,j ) ].
25. The parallel distributed processing network of claim 20 wherein each of the nodes has an inverting and a noninverting input terminal, and wherein the i-th output node is connected to the inverting input port of the j-th input node when the corresponding element Ai,j of the information storage matrix is less than zero.
26. The parallel distributed processing network of claim 25 wherein the j-th input node has a feedback resistor connected between its output port and its inverting input port, and a connectivity resistor connected at its inverting input port, and wherein the feedback resistor and the connectivity resistor are related to the corresponding element Ai,j of the information storage matrix by the relationship - | Ai,j | = RF / Ri,j.
27. The parallel distributed processing network of claim 19 wherein each of the nodes has an inverting and a noninverting input terminal, and wherein the i-th output node is connected to the inverting input port of the j-th input node when the corresponding element Ai,j of the information storage matrix is less than zero.
28. The parallel distributed processing network of claim 27 wherein the j-th input node has a feedback resistor connected between its output port and its inverting input port, and a connectivity resistor connected at its inverting input port, and wherein the feedback resistor and the connectivity resistor are related to the corresponding element Ai,j of the information storage matrix by the relationship - | Ai,j | = RF / Ri,j.
29. The parallel distributed processing network of claim 1 wherein the network is implemented in hardware elements.
30. The parallel distributed processing network of claim 1 wherein the network is implemented using a general purpose digital computer operating in accordance with a program.
31. The parallel distributed processing network of claim 2 wherein the network is implemented in hardware elements.
32. The parallel distributed processing network of claim 2 wherein the network is implemented using a general purpose digital computer operating in accordance with a program.
33. The parallel distributed processing network of claim 1 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
34. The parallel distributed processing network of claim 2 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
35. The parallel distributed processing network of claim 5 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
36. The parallel distributed processing network of claim 3 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
37. The parallel distributed processing network of claim 9 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
38. The parallel distributed processing network of claim 10 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
39. The parallel distributed processing network of claim 13 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
40. The parallel distributed processing network of claim 14 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
41. The parallel distributed processing network of claim 16 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
42. The parallel distributed processing network of claim 15 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
43. The parallel distributed processing network of claim 20 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
44. The parallel distributed processing network of claim 19 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
45. The parallel distributed processing network of claim 23 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
46. The parallel distributed processing network of claim 21 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
47. The parallel distributed processing network of claim 27 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
48. The parallel distributed processing network of claim 25 wherein the information storage matrix [A] is the matrix product [A] = [T] [Λ] [T]-1 when [T]-1 of the matrix [T] exists.
49. The parallel distributed processing network of claim 33 wherein the network is implemented in hardware elements.
50. The parallel distributed processing network of claim 33 wherein the network is implemented using a general purpose digital computer operating in accordance with a program.
51. The parallel distributed processing network of claim 34 wherein the network is implemented in hardware elements.
52. The parallel distributed processing network of claim 34 wherein the network is implemented using a general purpose digital computer operating in accordance with a program.
53. A method of forming a parallel distributed processing network comprising a plurality of nodes, each having an input and an output, the method comprising the steps of:
(a) creating an [N x N] similarity transformation matrix [T] the columns of which are formed from a predetermined number (M) of [N x 1] target vectors plus a predetermined number (Q) of [N x 1] arbitrary vectors, each target vector representing one of the outputs of the parallel distributed processing network, where M <= N and Q = (N - M);
(b) creating an [N x N] diagonal matrix [Λ], each element along the diagonal of the matrix [Λ] corresponding to a predetermined one of the vectors in the similarity transformation matrix, the relative value of each element along the diagonal of the matrix [Λ] corresponding to the rate of convergence of the parallel distributed processing network toward the corresponding target vector;
(c) creating an [N x N] information storage matrix [A] satisfying the matrix equation [A] [T] = [T] [Λ];
(d) evaluating the information storage matrix to determine the values of the elements Ai,j of the information storage matrix [A]; and
(e) implementing the parallel distributed processing network by arranging the nodes in a single layer so that the output of each of the nodes is connected to the inputs of some predetermined ones of the nodes in the network by a predetermined connection weight, the element Ai,j of the information storage matrix [A] defining the connection weight between the j-th input node and the i-th output node.
54. The method of claim 53 wherein the evaluation step (d) is performed using Gaussian Elimination.
55. The method of claim 53 wherein the evaluation step (d) is performed using the Delta Rule.
56. The method of claim 53 wherein, in step (e), the output of each of the nodes is connected to the input of all of the nodes in the network, including itself.
57. The method of claim 53 wherein the implementing step (e) is performed using a general purpose digital computer operating in accordance with a program.
58. The method of claim 53 wherein the implementing step (e) is performed by interconnecting hardware components.
59. The method of claim 56 wherein the implementing step (e) is performed using a general purpose digital computer operating in accordance with a program.
60. The method of claim 56 wherein the implementing step (e) is performed by interconnecting hardware components.
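For readers tracing the matrix construction of claims 33-48 and the method of claims 53-54, the following is a minimal numerical sketch in Python/NumPy. The network size, the particular target and arbitrary vectors, and the diagonal entries are illustrative assumptions, not values taken from the patent; the only relationships exercised are [A] [T] = [T] [Λ] (equivalently [A] = [T] [Λ] [T]-1 when the inverse exists) and its evaluation by Gaussian elimination.

# A minimal numerical sketch of claims 53-54 (all concrete values assumed).
import numpy as np

N, M = 4, 2                      # network size and number of target vectors (assumed)

# (a) similarity transformation matrix [T]: M target vectors plus Q = N - M
#     arbitrary vectors as columns (chosen here only for illustration)
targets = np.array([[1.0,  1.0, -1.0, -1.0],
                    [1.0, -1.0,  1.0, -1.0]]).T          # shape (N, M)
arbitrary = np.array([[1.0,  0.0,  0.0,  1.0],
                      [0.0,  1.0,  1.0,  0.0]]).T        # shape (N, N - M)
T = np.hstack([targets, arbitrary])                       # shape (N, N)

# (b) diagonal matrix [Lambda]: one diagonal element per column of [T]; the
#     relative values set how strongly the dynamics favor each target vector
Lam = np.diag([2.0, 2.0, -1.0, -1.0])

# (c)-(d) information storage matrix [A] from [A][T] = [T][Lambda], evaluated
#     by Gaussian elimination (claim 54): solve T^T A^T = (T Lambda)^T
A = np.linalg.solve(T.T, (T @ Lam).T).T
assert np.allclose(A @ T, T @ Lam)                        # [A][T] = [T][Lambda]

# (e) A[i, j] is the connection weight from the j-th input node to the
#     i-th output node of the single-layer network
print(np.round(A, 3))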
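The resistor relationship of claim 28 can be sketched in the same spirit. Assuming an inverting operational-amplifier realization in which RF is the feedback resistor of a node and Ri,j the connectivity resistor for the element Ai,j, so that |Ai,j| = RF / Ri,j, each connectivity resistor follows from a chosen feedback value; the 100 kOhm figure, the helper name connectivity_resistors, and the sample matrix are assumptions for illustration only.

# Hypothetical resistor sizing under the relationship |Ai,j| = RF / Ri,j.
def connectivity_resistors(A, RF=100e3):
    """Return Ri,j (in ohms) so that RF / Ri,j equals |Ai,j|.

    A zero element means no connection and is returned as float('inf')
    (open circuit); the sign of Ai,j would be realized elsewhere in the
    circuit, e.g. by which amplifier input the connection drives.
    """
    return [[RF / abs(a) if a != 0.0 else float("inf") for a in row]
            for row in A]

R = connectivity_resistors([[1.0, -0.5],
                            [0.25, 0.0]])
# R[0][1] == 200000.0 ohms, since |A[0][1]| = 0.5 and RF = 100 kOhm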
CA002017835A 1989-06-02 1990-05-30 Parallel distributed processing network characterized by an information storage matrix Abandoned CA2017835A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US36080489A 1989-06-02 1989-06-02
US360804 1989-06-02

Publications (1)

Publication Number Publication Date
CA2017835A1 (en) 1990-12-02

Family

ID=23419470

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002017835A Abandoned CA2017835A1 (en) 1989-06-02 1990-05-30 Parallel distributed processing network characterized by an information storage matrix

Country Status (4)

Country Link
EP (1) EP0474747A4 (en)
JP (1) JPH04505678A (en)
CA (1) CA2017835A1 (en)
WO (1) WO1990015390A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5517667A (en) * 1993-06-14 1996-05-14 Motorola, Inc. Neural network that does not require repetitive training
US6054710A (en) * 1997-12-18 2000-04-25 Cypress Semiconductor Corp. Method and apparatus for obtaining two- or three-dimensional information from scanning electron microscopy
JP6183980B1 (en) * 2016-12-02 2017-08-23 国立大学法人東京工業大学 Neural network circuit device, neural network, neural network processing method, and neural network execution program

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4731747A (en) * 1986-04-14 1988-03-15 American Telephone And Telegraph Company, At&T Bell Laboratories Highly parallel computation network with normalized speed of response
US4752906A (en) * 1986-12-16 1988-06-21 American Telephone & Telegraph Company, At&T Bell Laboratories Temporal sequences with neural networks
US4809193A (en) * 1987-03-16 1989-02-28 Jourjine Alexander N Microprocessor assemblies forming adaptive neural networks

Also Published As

Publication number Publication date
EP0474747A4 (en) 1993-06-02
JPH04505678A (en) 1992-10-01
EP0474747A1 (en) 1992-03-18
WO1990015390A1 (en) 1990-12-13

Similar Documents

Publication Publication Date Title
Jin et al. Absolute stability conditions for discrete-time recurrent neural networks
US4941122A (en) Neural network image processing system
Li et al. Analysis and synthesis of a class of neural networks: Linear systems operating on a closed hypercube
Saxe et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Silva et al. Acceleration techniques for the backpropagation algorithm
US5016188A (en) Discrete-time optimal control by neural network
Campbell Delay independent stability for additive neural networks
Marks et al. Alternating projection neural networks
Turchetti Stochastic models of neural networks
CA2017835A1 (en) Parallel distributed processing network characterized by an information storage matrix
Yu et al. A new acceleration technique for the backpropagation algorithm
Avdelas et al. Neural networks for computing in the elastoplastic analysis of structures
Woods Back and counter propagation aberrations
Lin et al. A new neural network structure composed of small CMACs
Salam et al. An all-MOS analog feedforward neural circuit with learning
Lan et al. Solving linear quadratic discrete-time optimal controls using neural networks
Ersoy et al. Parallel, self-organizing, hierarchical neural networks. II
Wang et al. Identification of discrete linear system in state space form using neural network
Chakravarthy et al. A neural network-based associative memory for storing complex-valued patterns
Galushkin et al. Neuromathematics: the methods of solving problems on neurocomputers
Kenue Modified backpropagation neural network with applications to image compression
DE4018779A1 (en) Neural network with learning capability - has inputs processed by power series based transformation units
Youse'zadeh et al. Neural Networks Modeling of Discrete Time Chaotic Maps
Kuroe et al. A learning method for vector field approximation by neural networks
Michel et al. Synthesis techniques for discrete time neural network models

Legal Events

Date Code Title Description
FZDE Discontinued