JP4845350B2

JP4845350B2 - Parallel processing unit

Info

Publication number: JP4845350B2
Application number: JP2004166134A
Authority: JP
Inventors: 優和真継; 克彦森
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2004-06-03
Filing date: 2004-06-03
Publication date: 2011-12-28
Anticipated expiration: 2024-06-03
Also published as: JP2005346470A

Description

The present invention relates to a parallel processing technique applicable to information processing such as pattern recognition.

Conventionally, time-division processing is known as a method of realizing a large-scale multilayer neural network with a small-scale circuit (see, for example, Patent Documents 1 and 2). When a multilayer neural network is implemented as a mixed or merged analog/digital circuit, it is often difficult to lay out all components such as neurons and synapses in parallel on a two-dimensional plane and thereby achieve parallel processing. The prior art therefore performs time-division processing using a small number of neuron or synapse circuit elements. Methods that divide a large-scale network structure into small-scale network modules have also been disclosed (see, for example, Patent Documents 3 and 4).
Patent Document 1: Japanese Patent No. 2679730; Patent Document 2: Japanese Patent No. 3110434; Patent Document 3: Japanese Patent No. 3035106; Patent Document 4: Japanese Patent No. 3210319

However, among the conventional techniques, the time-division approach reuses the same hardware in a time-shared manner to solve a large-scale problem. Processing time therefore grows with the complexity and scale of the problem, and the parallel-processing advantage inherent in neural networks cannot be fully exploited.

In addition, the more frequently the receptive field structure (synaptic weight distribution) of each neuron is loaded from memory, or the more frequently intermediate results are read from the memory holding the intermediate processing results of each layer, the longer the processing time becomes because of memory access time. Furthermore, when switching elements are used to switch neuron outputs and weight data, the power required for switching accumulates and the overall power consumption becomes very large.

Further, the method of dividing the network into small-scale network modules has the drawback that the overall circuit scale is not reduced.

The present invention has been made in view of the above problems, and its object is to greatly improve computational efficiency when the processing of each layer of a hierarchical neural network is performed in a time-division manner with a plurality of arithmetic elements.

In order to achieve the object of the present invention, a parallel processing apparatus of the present invention has, for example, the following configuration.

That is, in a hierarchical neural network having a local receptive field structure common among neuron elements, the apparatus comprises division multiplexing means for dividing the output signals from neuron elements belonging to the preceding hierarchical level, which serve as the data input to the synaptic connections that define the receptive field, such that each divided signal has a portion that overlaps in time series with other divided signals.

With the configuration of the present invention, computational efficiency can be greatly improved when the processing of each layer of a hierarchical neural network is performed in a time-division manner with a plurality of arithmetic elements.

Hereinafter, the present invention will be described in detail according to preferred embodiments with reference to the accompanying drawings.

[First Embodiment]
FIG. 1 is a block diagram showing the basic configuration of the parallel processing apparatus according to the present embodiment. As shown in the figure, the main components of the apparatus are a data input control circuit 1, a neuron array circuit block 2, a synapse array circuit block 3, a processing result holding memory 4, a division multiplexed signal generation circuit block 5, a receptive field structure control circuit block 6, and an overall control circuit block 7.

When a multilayer neural network is implemented as a mixed or merged analog/digital circuit, it is often difficult to lay out all components such as neurons and synapses in parallel on a two-dimensional plane and thereby achieve parallel processing. Therefore, this embodiment also performs time-division processing using a small number of neuron or synapse circuit elements. That is, the processing of each layer is performed in a time-division manner.

In the time-division processing of the prior art, processing becomes slower and power consumption increases as the receptive field structure (synaptic weight distribution) of each neuron is loaded from memory more frequently, or as intermediate results are read more frequently from the memory that holds the intermediate processing results of each layer. By using the multiplexed data representation described below, the loss of processing speed caused by memory access and the increase in power consumption caused by switching energy can be avoided.

In FIG. 1, the data input control circuit 1 is a control circuit for inputting image data or the like from a sensor, a database, or the like, and contains a primary storage memory.

In the neuron array circuit block 2, a plurality of neuron circuits belonging to a given layer of the hierarchical processing structure are arranged. In this embodiment, what the neuron array circuit block 2 realizes during a given time range is one layer of the multilayer neural network (or the neurons involved in detecting one feature class within one layer). Neurons belonging to other layers (or neurons involved in detecting other feature classes) are realized in different time slots.

In the synapse array circuit block 3, synaptic connection circuits between neurons are arranged in a two-dimensional array. The case described here is one in which synaptic connections exist between different layers. The synapse array circuit block 3 realizes the synaptic connections to one particular hierarchical level.

The connection structure between the synapse array circuit block 3 and the neuron circuits in the neuron array circuit block 2 is controlled by the receptive field structure control circuit block 6. The receptive field structure control circuit block 6 stores receptive field structure data for each feature class in its internal memory.

The processing result holding memory 4 temporarily holds the output from the neuron array circuit block 2. The division multiplexed signal generation circuit block 5 divides and multiplexes the output signals from the neuron array in time and supplies them to the synapse array circuit block 3.

The overall control circuit 7 controls the operation of each circuit block and thereby controls signal input and output from lower layers to upper layers in the multilayer neural network.

Note that the multilayer neural network realized by the parallel processing apparatus according to this embodiment is not limited to the feedforward type and may include recurrent or feedback connections.

FIG. 5 is a diagram showing the configuration of a model of the multilayer neural network realized by the parallel processing apparatus according to this embodiment.

The neural network shown in the figure hierarchically handles information involved in the recognition (detection) of objects, geometric features, and the like in local regions of the input data. Its basic structure is the so-called convolutional network structure (LeCun, Y. and Bengio, Y., 1995, "Convolutional Networks for Images, Speech, and Time Series," in Handbook of Brain Theory and Neural Networks (M. Arbib, Ed.), MIT Press, pp. 255-258). The output from the final (uppermost) layer is the recognition result: the category of the recognized object and its position in the input data.

In FIG. 5, the data input layer 101 receives local region data from a photoelectric conversion element such as a CMOS sensor or a CCD element. The feature detection layer 102a (1,0) detects local low-order features of the image pattern input from the data input layer 101 (geometric features such as specific orientation components and specific spatial frequency components, and optionally color component features). Detection is performed in local regions centered on each position of the full image (or on each of a set of predetermined sampling points spanning the full image), at a plurality of scale levels or resolutions at the same location, for as many feature categories as are defined. To this end, the layer consists of neuron elements that each have a receptive field structure corresponding to the feature type (for example, the slope of the line segment when a line segment of a given orientation is extracted as a geometric feature) and that generate a pulse train according to the degree to which the feature is present.

The feature integration layer 103a (2,0) consists of neuron elements that have a predetermined receptive field structure and generate pulse trains. (Hereinafter, "receptive field" means the range of connections to the output elements of the immediately preceding layer, and "receptive field structure" means the distribution of the connection weights.) It integrates the outputs of a plurality of neuron elements of the feature detection layer 102a (1,0) that lie within the same receptive field 105, through operations such as subsampling by local averaging or maximum output detection. The receptive fields of the neurons in a feature integration layer share a common structure among the neurons of that layer. This is called weight sharing.

Each of the feature detection layers ((1,1), (1,2), ..., (1,M)) that follow the feature detection layer 102a, and each of the feature integration layers ((2,1), (2,2), ..., (2,M)) that follow the feature integration layer 103a, has a predetermined receptive field structure and functions in the same way as the layers described above. That is, the feature detection layers ((1,1), ...) detect a plurality of different features in the feature detection module 104a (and likewise 104b), and the feature integration layers ((2,1), ...) integrate, in the feature integration module 106a (and likewise 106b), the detection results for a plurality of features from the preceding feature detection layer.

Note, however, that each feature detection layer is connected (wired) so as to receive the cell element outputs of the preceding feature integration layer belonging to the same channel. The subsampling performed in the feature integration layer 103 averages, or detects the maximum of, the outputs from a local region (the local receptive field of the feature integration layer neuron) of the feature detection cell population of the same feature category.
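The subsampling described above can be illustrated with a minimal sketch. It assumes non-overlapping square pooling regions and plain numeric outputs instead of pulse trains; the function name `subsample` and its parameters are hypothetical and do not appear in the patent.

```python
def subsample(feature_map, size, mode="mean"):
    """Pool a 2-D list over non-overlapping size x size regions.

    Assumption: regions do not overlap. The patent does not fix the
    stride of the feature integration layer's receptive fields.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Collect the outputs inside one local receptive field.
            block = [feature_map[r + i][c + j]
                     for i in range(size) for j in range(size)]
            if mode == "max":
                out_row.append(max(block))                # maximum output detection
            else:
                out_row.append(sum(block) / len(block))   # local averaging
        pooled.append(out_row)
    return pooled
```

Because the same pooling rule is applied at every position, every feature integration neuron in this sketch has the same receptive field structure, which is the weight sharing the text refers to.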

FIGS. 8A and 8B show the configuration of the connection means that carries the outputs (inputs, from the viewpoint of the receiving cell) from a group of neurons (n_i) of feature integration (or feature detection) cells that form the receptive field of a given feature detection (or feature integration) cell (a neuron in the feature detection layer). The signal transmission unit 203 forms a local common bus line, on which pulse signals from a plurality of neurons are transmitted arranged in time series.

In a so-called excitatory connection, the synaptic connection circuit amplifies the pulse signal; an inhibitory connection, conversely, attenuates it. When information is conveyed by pulse signals, amplification and attenuation can be realized by amplitude modulation, pulse width modulation, phase modulation, or frequency modulation of the pulse signal. In this embodiment the synaptic connection circuit is used mainly as a pulse phase modulation element: amplification is converted into an effective advance of the pulse arrival time, as a quantity specific to the feature, and attenuation into an effective delay. Known circuits for pulse phase modulation include the configurations disclosed in JP-A-10-327054, JP-A-5-37317, and JP-A-6-61808.

When the synaptic connection circuit performs pulse phase modulation, it gives each pulse an arrival position (phase) on the time axis at the destination neuron that is specific to the individual feature. In addition to that phase, it can also apply a phase shift corresponding to the input signal level (itself given as a phase) multiplied by a coefficient corresponding to a predetermined synaptic weight. Qualitatively, an excitatory connection advances the phase of the arriving pulse relative to a reference phase, and an inhibitory connection delays it.

In FIG. 8A, each neuron element n_j (1 ≤ j ≤ k) outputs a pulse signal (spike train) and performs so-called integrate-and-fire input/output processing. The neuron circuit is described first. Each neuron element is an extended model based on the so-called integrate-and-fire neuron. It is the same as an integrate-and-fire neuron in that it fires and outputs a pulse-like signal when the spatiotemporal linear sum of its input signals (pulse trains corresponding to action potentials) exceeds a threshold.
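As a rough illustration of the integrate-and-fire behavior just described, the following sketch accumulates inputs in discrete time steps and fires when the sum exceeds a threshold. The discrete time steps, the reset to a fixed value after firing, and the name `integrate_and_fire` are assumptions made for illustration; the patent's extended neuron model is not specified at this level.

```python
def integrate_and_fire(inputs, threshold, reset=0.0):
    """Return the time steps at which the neuron fires.

    inputs: per-step summed input (positive = excitatory,
            negative = inhibitory).
    Assumption: the potential returns to `reset` after each spike.
    """
    potential = reset
    spike_times = []
    for t, x in enumerate(inputs):
        potential += x                 # linear summation over time
        if potential > threshold:      # fire when the threshold is exceeded
            spike_times.append(t)
            potential = reset
    return spike_times
```

For example, with inputs `[2, 2, 2, 1, 3, 2]` and a threshold of 5, the potential first exceeds the threshold at step 2 and again at step 5.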

FIG. 8B shows an example of a basic configuration illustrating the operating principle of a pulse generation circuit (CMOS circuit) serving as a neuron element, which can perform weighted integration of the input pulse signals within a predetermined time window.

Here the circuit is configured to receive excitatory and inhibitory inputs as input signals. The mechanism that controls the pulse firing timing of each neuron element is not the focus of the present application, so its description is omitted.

Next, a method for generating division multiplexed signals in a neural network having a hierarchical structure will be described.

The lower-layer outputs within the receptive field of a given upper-layer neuron form a subset of all neuron outputs in the lower layer. In the so-called convolutional network structure, neurons belonging to the same feature class at the same hierarchical level share a common receptive field structure, and the subsets of lower-layer neuron outputs that are input to the individual upper-layer neurons are data sets that partially overlap one another. Therefore, different data sets of lower-layer neuron outputs are division multiplexed and input to a synapse array circuit corresponding to a single receptive field structure.

The data input to the synapse circuit at a given position in a given receptive field structure is determined by the position of the feature detection layer neuron (which corresponds to a position on the input image data), and that data has already been read in. Consequently, when signals are input to the neurons of the feature detection layer in a predetermined order, the feature integration layer neuron output to be supplied to each synapse circuit in the receptive field is determined by the position of each subsequent feature detection layer neuron. The signal input to each synapse circuit can therefore be generated in advance as time-series data determined by the scanning order of the feature detection layer neurons. In other words, the feature integration layer output data supplied to each synapse circuit can be partially reproduced as time-series data by sampling, in a predetermined manner, the previously stored output data from all neurons.

In this embodiment, the outputs from all neurons (n·m in total) belonging to a given feature class of the feature integration layer preceding the feature detection layer are read out sequentially in time series (for example, in raster order), giving the n·m time-series data shown in FIG. 3A, and are stored as read data in the processing result holding memory 4 (SDRAM, MRAM, FRAM, or the like).

The division multiplexed signal generation circuit block 5 then divides the feature integration layer neuron outputs belonging to the receptive field of each feature detection layer neuron (the data read from the processing result holding memory 4) into segments that overlap one another, as shown in FIG. 3A. The segments are either joined in time into a single time series, as shown in FIG. 3B, or output in parallel, as shown in FIG. 3C, and are branched to the individual synapses (synapse 1, synapse 2, ...) of the synapse array circuit. With this method, partial reproduction of an arbitrary waveform can be performed with high accuracy and at high speed. A predetermined signal conversion may be applied to the data read from the processing result holding memory 4. For example, when the neuron circuits and synapse circuits handle pulse signals, the read data may be converted into signal data whose pulse phase or pulse width corresponds to the data value.
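The division into overlapping segments can be sketched in one dimension. The sketch assumes a 1-D series, a stride of one between neighboring receptive fields, and plain numeric values; the names `overlapping_segments` and `time_division_multiplex` are hypothetical and this is one possible reading of FIGS. 3A to 3C, not the patent's circuit.

```python
def overlapping_segments(series, num_synapses):
    """Cut one segment per synapse out of the stored series.

    Segment s holds the values that synapse s sees as the receiving
    neuron position advances, so neighboring segments overlap.
    Assumption: 1-D data and a stride of one position per step.
    """
    seg_len = len(series) - num_synapses + 1
    return [series[s:s + seg_len] for s in range(num_synapses)]


def time_division_multiplex(segments):
    """Join the segments end to end into a single time series."""
    return [value for segment in segments for value in segment]
```

For a stored series `[10, 20, 30, 40, 50]` and three synapses, the segments are `[10, 20, 30]`, `[20, 30, 40]`, and `[30, 40, 50]`. Keeping them as a list corresponds to the parallel output form, and `time_division_multiplex` gives the serial form.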

As shown in FIG. 4A, partial cutout data whose intervals do not overlap one another may instead be stored in a plurality of memories and output in parallel. In this case, as shown for example in FIG. 4B, the division multiplexed signal generation circuit block 5 needs to have a segmented sampling unit 59, a memory data transfer control unit 56, a plurality of memories 57 (M1, M2, ..., Mk), pulse signal generators 55 (PG1, PG2, ..., PGk), D/A converters 52 (DA1, DA2, ..., DAk), and a parallel output port 54.

Data can be transferred between adjacent memories. Each time the position of the feature detection layer neuron of a given feature class advances, the data addresses in each memory are shifted and the non-overlapping data is transferred from the adjacent memory, which updates the partial cutout data to be output from each memory. Each piece of partial data then undergoes D/A conversion and pulse signal generation and is output in parallel. The contents held in each memory before and after a transfer are thus overlapping interval data, so that multiplexed data transfer is performed as a whole.
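One way to picture the transfer between adjacent memories is a chain of shift buffers. The sketch below is an interpretation under stated assumptions: each memory is modeled as a queue, one value moves per step, and the last memory is refilled from an external source. The names `make_memories` and `shift_step` are hypothetical.

```python
from collections import deque


def make_memories(series, k):
    """Split the series into k non-overlapping, equal-length memories."""
    length = len(series) // k
    return [deque(series[i * length:(i + 1) * length]) for i in range(k)]


def shift_step(memories, refill=None):
    """Advance every memory by one position.

    Each memory drops its oldest value and receives the value that its
    right-hand neighbor drops. Assumption: the last memory is topped up
    with `refill` when one is given.
    """
    heads = [m.popleft() for m in memories]
    for i in range(len(memories) - 1):
        memories[i].append(heads[i + 1])
    if refill is not None:
        memories[-1].append(refill)
```

Starting from `[0, 1]`, `[2, 3]`, `[4, 5]`, one step with `refill=6` yields `[1, 2]`, `[3, 4]`, `[5, 6]`. Each memory's contents before and after the step overlap, while the memories never hold the same value at the same time.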

The vertical axis of the data in FIG. 3B is the output level. In the representation of neuron outputs by pulse phase (with values from 0 to 2π), a smaller phase value represents a higher output level. Although FIG. 3B is drawn as a continuous curve, the data is actually discrete, obtained by sampling the output of each neuron in the receptive field. In the pulse phase representation, as shown in FIG. 6, each interval of a fixed time width corresponds to a different neuron of the lower layer, and the position within the interval (an analog quantity) expresses the pulse phase.
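The slot-based pulse phase representation can be sketched as follows. The linear mapping from output level to phase, the normalization of levels to the range 0 to 1, and the name `encode_pulse_times` are assumptions for illustration; the text only states that a smaller phase means a higher output and that each slot belongs to one lower-layer neuron.

```python
import math


def encode_pulse_times(levels, slot_width):
    """Place one pulse per neuron inside that neuron's time slot.

    levels: outputs normalized to [0, 1], one per lower-layer neuron.
    Assumption: phase = (1 - level) * 2*pi, so a higher level gives a
    smaller phase and therefore an earlier pulse within the slot.
    """
    times = []
    for k, level in enumerate(levels):
        if not 0.0 <= level <= 1.0:
            raise ValueError("levels must lie in [0, 1]")
        phase = (1.0 - level) * 2.0 * math.pi
        offset = phase / (2.0 * math.pi) * slot_width
        times.append(k * slot_width + offset)   # slot k starts at k * slot_width
    return times
```

With a slot width of 4.0, levels of 1.0, 0.5, and 0.25 give pulses at the start of slot 0, the middle of slot 1, and three quarters of the way through slot 2.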

The partial cutout of the time-series data is performed by the multiplexing sampling circuit 51 inside the division multiplexed signal generation circuit block 5 shown in FIG. 2A. The multiplexing sampling circuit 51 performs segmented sampling, that is, it extracts predetermined interval data from a series of data sets. As shown in FIG. 2, the main components of the division multiplexed signal generation circuit block 5 are a function generation circuit 50, the multiplexing sampling circuit 51, a D/A converter 52, a demultiplexer 53, and a pulse signal generator 55.

Here, the data read from the processing result holding memory 4 is input, the multiplexing sampling circuit 51 samples the plurality of time-series data sets to be input to the individual synapse circuits as shown in FIG. 3, and the function generator 50 generates a digital waveform. The D/A converter 52 and the pulse signal generator 55 then generate a pulse phase modulated signal whose phase corresponds to the neuron output level of the preceding hierarchical level. The phase reference of this pulse signal is given by a predetermined timing signal (clock signal).

The partial cutout data is, among the outputs from the lower-layer neurons belonging to the receptive fields of the upper-layer neurons, the time series of the data to be input to the synapse circuit at one particular position in the receptive field structure (which is common among neurons) as the two-dimensionally arranged upper-layer neuron array circuit is raster scanned. The synapse numbers (synapse 1, synapse 2, ...) on the synapse array circuit are assigned in raster scan order.
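For a two-dimensional layer, the time series seen by one synapse can be written down directly. This sketch assumes the lower-layer outputs are stored in raster order, the upper layer is scanned with a stride of one, and only positions where the receptive field fits entirely inside the lower layer are visited. The function name and its parameters are hypothetical.

```python
def synapse_series(flat, n_rows, n_cols, rf_h, rf_w, a, b):
    """Time series for the synapse at offset (a, b) in the receptive field.

    flat: lower-layer outputs in raster order (n_rows * n_cols values).
    The upper-layer neurons are visited in raster order, and for each
    one the lower-layer value under offset (a, b) is emitted.
    Assumption: stride 1 and no padding at the borders.
    """
    out_rows = n_rows - rf_h + 1
    out_cols = n_cols - rf_w + 1
    return [flat[(i + a) * n_cols + (j + b)]
            for i in range(out_rows)
            for j in range(out_cols)]
```

On a 3 x 4 layer holding the values 0 to 11 with a 2 x 2 receptive field, the synapse at offset (0, 0) receives `[0, 1, 2, 4, 5, 6]` and the synapse at offset (1, 1) receives `[5, 6, 7, 9, 10, 11]`; the two series share the values 5 and 6, which shows the partial overlap.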

When the multiplexing sampling circuit 51 is configured to output the partial cutout data simultaneously in parallel from a plurality of ports (for example, one column of the synapse array circuit), the input and output ports of the synapse array circuit block 3 and the division multiplexed signal generation circuit 5 (in FIG. 2B, the parallel output port 54 of the division multiplexed signal generation circuit 5) are made connectable in parallel.

If the corresponding synapse circuit blocks are close to each other (that is, if the neurons receiving the signals modulated by the synapses are located close to each other), these time-series data sets are data sampled from mutually overlapping intervals of the data shown in FIG. 3A.

The multiplexing sampling circuit 51 cuts out partially overlapping intervals as segment data from the output data (assumed to be sampled in time series) of all neurons involved in a given feature class of the feature integration layer. It then either joins the segments in time to generate a single time series (see FIG. 3B) or outputs the segments simultaneously in parallel (see FIG. 3C).

In this embodiment, a data representation for the outputs of a plurality of neurons in which a plurality of data sets, each a partial interval of the original complete data, are handled together in this way is called a division multiplexed representation. In particular, the former form, in which the data sets are shifted in time, is called a time-division multiplexed representation.

The time-series data generated by the multiplexing sampling circuit 51 and the function generation circuit 50 is D/A converted by the D/A converter 52, and the demultiplexer 53 branches each time segment to a predetermined synapse circuit. The branching is carried out by a switching circuit built into the demultiplexer 53, whose switch changes over each time it receives update timing data for a time segment, synchronized with a predetermined clock signal.
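The clocked branching can be illustrated with a small software analogue. It assumes that every time segment has the same length in clock ticks and that segments are routed to synapses in cyclic order; the name `demultiplex` and its parameters are hypothetical.

```python
def demultiplex(stream, segment_length, num_synapses):
    """Route a serial stream to synapses, switching once per segment.

    Assumption: each time segment lasts `segment_length` clock ticks,
    and the switch advances to the next synapse at each segment boundary.
    """
    outputs = [[] for _ in range(num_synapses)]
    for tick, value in enumerate(stream):
        target = (tick // segment_length) % num_synapses
        outputs[target].append(value)
    return outputs
```

Feeding the serial stream `[10, 20, 30, 20, 30, 40, 30, 40, 50]` with a segment length of 3 delivers `[10, 20, 30]` to the first synapse, `[20, 30, 40]` to the second, and `[30, 40, 50]` to the third.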

When the position of a signal on the time axis has been set in advance to fall within a predetermined range according to the position of the synaptic connection, the synapse position is identified from the position of the pulse signal on the time axis. The demultiplexer 53, operating in synchronization with the clock signal, branches the signal so that the output signal from the feature integration layer is input to the appropriate synapse circuit.

The processing in the division multiplexed signal generation circuit block 5 may also be implemented by generating a digital waveform in a digital circuit by the look-up table (LUT) method (using the LUT generation circuit 50 in FIG. 2A) and converting the required range of the signal into an analog waveform with a D/A converter.

With the division multiplexed data representation described above, there is no need to repeat input/output operations for writing data to or reading data from memory; a single memory access per feature class suffices. This avoids the slowdown caused by memory access time and makes faster processing possible.
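A back-of-the-envelope count shows why a single pass over memory helps. The baseline below, in which every upper-layer neuron reads its whole receptive field from memory separately, is an assumed comparison point rather than a figure from the patent, and the function name `read_counts` is hypothetical.

```python
def read_counts(n_rows, n_cols, rf_h, rf_w):
    """Compare memory reads for one feature class.

    naive: every upper-layer neuron fetches its full receptive field
           (assumed baseline, stride 1, no padding).
    single_pass: each stored lower-layer output is read exactly once.
    """
    upper_neurons = (n_rows - rf_h + 1) * (n_cols - rf_w + 1)
    naive = upper_neurons * rf_h * rf_w
    single_pass = n_rows * n_cols
    return naive, single_pass
```

For a 32 x 32 lower layer and a 5 x 5 receptive field, the assumed baseline performs 19600 reads, while the single pass performs 1024.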

本実施形態では、同じクラスの特徴を検出する特徴検出（統合）層ニューロン間で受容野が共通の構造を有することを利用し、単一のニューロンのある特徴クラスに関する一受容野分のシナプスアレイ回路を用いて上述した分割多重化信号をシナプス回路に入力し、変調を受けたパルス列信号の時間窓積分を各ニューロン回路で行う。 The present embodiment exploits the fact that feature detection (integration) layer neurons detecting features of the same class share a common receptive field structure. Using a synapse array circuit that covers one receptive field of a single neuron for a given feature class, the division multiplexed signal described above is input to the synapse circuits, and each neuron circuit performs time-window integration of the modulated pulse train signals.

本実施形態では、シナプスアレイ回路ブロック３は、ニューロンアレイ回路ブロック２の出力先ニューロンを切り替える出力分岐手段としてのデマルチプレクサを内蔵し、次の階層レベルまたは次の特徴クラスに属するニューロンを実現するニューロンアレイブロック２へのシナプスアレイ回路ブロック３からの信号伝播は、出力先（即ち、出力先のニューロン）が一定の順序で切り替わるデマルチプレクサを介して行われる。即ち、各シナプス回路から出力される変調後のパルス列データは、それぞれのパルスが異なるニューロン回路に出力される。 In the present embodiment, the synapse array circuit block 3 incorporates a demultiplexer serving as output branching means that switches the output destination neuron in the neuron array circuit block 2. Signals propagate from the synapse array circuit block 3 to the neuron array block 2, which realizes the neurons belonging to the next hierarchical level or the next feature class, through this demultiplexer, whose output destination (that is, the destination neuron) is switched in a fixed order. In other words, each pulse of the modulated pulse train data output from a synapse circuit is output to a different neuron circuit.

同一階層レベルで複数の異なる特徴クラスの検出に関与するニューロン回路のアレイを同一基盤上に並列配置することができない場合には、検出する特徴のクラスごとにニューロンアレイ回路ブロック内部の構成を各特徴クラスの検出に適する構成に切り替えて（或いは内部状態をリセットして）シーケンシャルに処理を行えばよい。シナプス回路構成を変える受容野構造の制御およびニューロン回路の特性制御を実現するためには、FPGA（Field Programmable Gate Array）やFPAA（Field Programmable Analog Array）などの回路構成を用いてもよい。 If the arrays of neuron circuits involved in detecting a plurality of different feature classes at the same hierarchical level cannot all be arranged in parallel on the same substrate, processing may be performed sequentially by switching the internal configuration of the neuron array circuit block to one suited to each feature class to be detected (or by resetting its internal state). To control the receptive field structure by changing the synapse circuit configuration and to control the characteristics of the neuron circuits, a circuit configuration such as an FPGA (Field Programmable Gate Array) or an FPAA (Field Programmable Analog Array) may be used.

また、上述した並列処理装置をパターン認識手段としてカメラその他の画像入力手段、或いはプリンタ及びディスプレイその他の画像出力手段に搭載することができる。その結果、低消費電力で小規模な回路構成により、特定被写体の認識または検出を行って所定の動作、例えば画像入力手段については、特定被写体を中心とするフォーカシング、露出補正、ズーミング、或いは色補正などの処理を行うことができる。画像出力手段についても特定被写体に関する最適色補正などの処理を自動的に行うことができる。 The parallel processing device described above can also be mounted, as pattern recognition means, on a camera or other image input means, or on a printer, display, or other image output means. As a result, a small, low-power circuit configuration can recognize or detect a specific subject and perform a predetermined operation. For image input means, such operations include focusing, exposure correction, zooming, or color correction centered on the specific subject. For image output means as well, processing such as optimum color correction for the specific subject can be performed automatically.

次に、本実施形態に係る並列処理装置を適用したパターン検出（認識）装置を画像入力装置の一つである撮像装置に搭載させることにより、特定被写体へのフォーカシングや特定被写体の色補正、露出制御を行う場合について、図１１を参照して説明する。図１１は、本実施形態に係るパターン検出（認識）装置を撮像装置に用いた例の構成を示す図である。 Next, with reference to FIG. 11, a case will be described in which a pattern detection (recognition) device to which the parallel processing device according to the present embodiment is applied is mounted on an imaging device, which is one type of image input device, to perform focusing on a specific subject, color correction of the specific subject, and exposure control. FIG. 11 is a diagram showing the configuration of an example in which the pattern detection (recognition) device according to the present embodiment is used in an imaging device.

ここで撮像装置１１０１は、撮影レンズおよびズーム撮影用駆動制御機構を含む結像光学系１１０２、CCD又はＣＭＯＳイメージセンサー１１０３、撮像パラメータの計測部１１０４、映像信号処理回路１１０５、記憶部１１０６、撮像動作の制御、撮像条件の制御などの制御用信号を発生する制御信号発生部１１０７、EVFなどファインダーを兼ねた表示ディスプレイ１１０８、ストロボ発光部１１０９、記録媒体１１１０などを具備し、更に上述した時分割多重化処理を行う並列処理装置（パターン認識装置として機能させる）を認識用並列処理装置１１１１として備える。 Here, the imaging device 1101 comprises an imaging optical system 1102 including a photographing lens and a drive control mechanism for zoom photography, a CCD or CMOS image sensor 1103, an imaging parameter measurement unit 1104, a video signal processing circuit 1105, a storage unit 1106, a control signal generation unit 1107 that generates control signals for controlling the imaging operation, the imaging conditions, and the like, a display 1108 that also serves as a viewfinder such as an EVF, a strobe light emitting unit 1109, a recording medium 1110, and so on. It further includes, as a recognition parallel processing device 1111, the parallel processing device described above that performs time-division multiplexing processing (and functions as a pattern recognition device).

この撮像装置１１０１は、例えば撮影された映像中から予め登録された人物の検出（存在位置、サイズの検出を伴う）を認識用並列処理装置１１１１により行う。そして、その人物の位置、サイズ情報が認識用並列処理装置１１１１から制御信号発生部１１０７に入力されると、同制御信号発生部１１０７は、撮像パラメータ計測部１１０４からの出力に基づき、その人物に対するピント制御、露出条件制御、ホワイトバランス制御などを最適に行う制御信号を発生する。 The imaging device 1101 uses the recognition parallel processing device 1111 to detect, for example, a pre-registered person in the captured video (including detection of the person's position and size). When the position and size information of the person is input from the recognition parallel processing device 1111 to the control signal generation unit 1107, the control signal generation unit 1107 generates, based on the output from the imaging parameter measurement unit 1104, control signals that optimally perform focus control, exposure condition control, white balance control, and the like for that person.

上述した並列処理装置をパターン検出（認識）に適用し、このように撮像装置に用いた結果、小型・低消費電力な回路で、高速に人物検出とそれに基づく撮影の最適制御を行うことができるようになる。 By applying the parallel processing device described above to pattern detection (recognition) and using it in an imaging device in this way, person detection and the optimum shooting control based on it can be performed at high speed with a small, low-power circuit.

以上説明したように本実施形態によれば、並列に複数の演算素子が結合してなる並列処理装置において複数の演算素子の中間的な出力を一括して多重化するデータ表現を用いることにより、演算効率を大幅に向上させることができる。 As described above, according to the present embodiment, by using a data expression that multiplexes intermediate outputs of a plurality of arithmetic elements at once in a parallel processing device in which a plurality of arithmetic elements are coupled in parallel. The calculation efficiency can be greatly improved.

特に、受容野構造が一部共通化される階層型神経回路網（例えば、Convolutional ネットワーク構造）において、その共通化された受容野をもつ各ニューロンに入力される複数のデータセットを時分割で処理する場合、それぞれのデータセットを一括して関数、曲線、またはルックアップテーブル形式で時分割多重化表現することにより、各データセットの入出力に伴うメモリアクセスを最小限の回数で済ませることが出来、処理速度を向上させることができる。 In particular, in a hierarchical neural network in which the receptive field structure is partly shared (for example, a convolutional network structure), when the plurality of data sets input to the neurons having the shared receptive field are processed by time division, representing the data sets collectively in time-division-multiplexed form as a function, a curve, or a look-up table keeps the number of memory accesses needed for the input and output of each data set to a minimum and improves the processing speed.
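The idea of reading a feature map once and presenting overlapping receptive fields as a single time series can be sketched in Python as follows. This is a simplified one-dimensional model under assumed names (`multiplex_receptive_fields`, `shared_weight_neurons`, `field_size`, `stride`); the embodiment itself works with pulse signals and analog circuits rather than lists of numbers.

```python
def multiplex_receptive_fields(feature_map, field_size, stride):
    """Arrange overlapping receptive-field windows as one time series.

    The feature map is traversed once; each window becomes one time
    segment destined for a different neuron of the same feature class.
    """
    series = []
    for start in range(0, len(feature_map) - field_size + 1, stride):
        series.extend(feature_map[start:start + field_size])
    return series


def shared_weight_neurons(series, weights):
    """Apply one shared synapse array to every segment (one output per neuron)."""
    n = len(weights)
    return [
        sum(w * x for w, x in zip(weights, series[i:i + n]))
        for i in range(0, len(series), n)
    ]


feature_map = [1, 2, 3, 4, 5]
series = multiplex_receptive_fields(feature_map, field_size=3, stride=1)
outputs = shared_weight_neurons(series, weights=[1, 0, -1])
```

Because all neurons of the class share the same weights, a single synapse array can serve every segment in turn, which is the property the time-division scheme relies on.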

［第２の実施形態］ [Second Embodiment]
本実施形態では多層神経回路網を実装する回路構成において、データ構造の多重化表現を異なるニューロンからの信号を周波数ドメインでの多重化により行う。 In the present embodiment, in a circuit configuration that implements a multilayer neural network, the multiplexed representation of the data structure is obtained by multiplexing signals from different neurons in the frequency domain.

また、下位層の複数のニューロン出力（異なるベースバンド周波数の信号の混合した多重化信号）を検出する上位層ニューロンにおいて周波数ドメインでのフィルタ処理により各受容野からの信号を分離する。以下、データ構造の周波数軸上での多重化処理について説明する。 In addition, signals from each receptive field are separated by filtering in the frequency domain in an upper layer neuron that detects a plurality of lower layer neuron outputs (a multiplexed signal in which signals of different baseband frequencies are mixed). Hereinafter, the multiplexing process on the frequency axis of the data structure will be described.

本実施形態において図５に示すような同じ特徴クラスの検出に関与するニューロンの局所的受容野構造が共通化する階層型神経回路網においては、特徴統合層出力を異なる特徴検出層ニューロン（ただし、同じ特徴クラスに属する）に入力される受容野単位で異なるベースバンド周波数を割り当てても良い。即ち、各受容野に該当する時系列化された出力データ（異なるニューロンへ入力される）の「波形」信号を変調して異なるベースバンド周波数を有する搬送波信号にのせる。図１２は、周波数多重化した信号の各成分（異なるニューロンへ出力される）のスペクトルを概念的に示す図である。 In the present embodiment, in a hierarchical neural network such as that shown in FIG. 5, in which the neurons involved in detecting the same feature class share a common local receptive field structure, a different baseband frequency may be assigned to the feature integration layer outputs for each receptive field, that is, for each of the different feature detection layer neurons (all belonging to the same feature class) to which they are input. Specifically, the “waveform” signal of the time-series output data corresponding to each receptive field (and input to a different neuron) is modulated onto a carrier signal having a different baseband frequency. FIG. 12 is a diagram conceptually showing the spectrum of each component (output to a different neuron) of the frequency-multiplexed signal.
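A minimal numerical sketch of frequency multiplexing and separation is given below. It makes several simplifying assumptions that are not in the specification: each receptive field contributes a constant amplitude rather than a time-varying waveform, the carriers are cosines at distinct integer frequencies below half the sample count, and correlation against each carrier stands in for the frequency-domain filter. The names `fdm_mix` and `fdm_separate` are hypothetical.

```python
import math


def fdm_mix(amplitudes, carriers, n_samples):
    """Sum one cosine carrier per receptive field into a single signal."""
    return [
        sum(a * math.cos(2 * math.pi * f * n / n_samples)
            for a, f in zip(amplitudes, carriers))
        for n in range(n_samples)
    ]


def fdm_separate(mixed, carriers):
    """Recover each amplitude by correlating with its carrier.

    Distinct integer carrier frequencies in (0, n_samples / 2) are
    orthogonal over the window, so the other channels cancel out.
    """
    n_samples = len(mixed)
    return [
        2.0 / n_samples * sum(
            s * math.cos(2 * math.pi * f * n / n_samples)
            for n, s in enumerate(mixed))
        for f in carriers
    ]


amplitudes = [0.2, 0.7, 0.5]   # one value per receptive field (assumed)
carriers = [3, 5, 8]           # cycles per window, one per destination neuron
mixed = fdm_mix(amplitudes, carriers, n_samples=64)
recovered = fdm_separate(mixed, carriers)
```

Each entry of `recovered` approximates the amplitude that was placed on the corresponding carrier, which mirrors how an upper-layer neuron would pick out the signal of its own receptive field.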

周波数多重化の例として、光信号を層間で伝播する信号として用い、かつ、光波を信号伝播キャリアとし、各ニューロンからの出力のデータ構造が波長多重化されるようにした場合について次に説明する。図７に波長多重化を行う場合の要部構成図を示す。波長多重化技術については、雑誌オプトロニクス（１９９６年５月号）の特集記事（１１１〜１４０頁）、光技術コンタクト（２００１年１０月号）の特集記事（３〜３１頁）などに記載されている。 As an example of frequency multiplexing, the following describes a case in which optical signals are used as the signals propagating between layers, a light wave serves as the signal propagation carrier, and the data structure of the output from each neuron is wavelength-multiplexed. FIG. 7 shows the configuration of the main part when wavelength multiplexing is performed. Wavelength multiplexing technology is described in, for example, the feature articles in the magazine Optronics (May 1996 issue, pp. 111-140) and in Optical Technology Contact (October 2001 issue, pp. 3-31).

この場合、多層神経回路網の各層は、所定の波長範囲で透明なサファイア基盤１３０上に構成されるCMOS回路１５０（IEEE Circuits and Systems Magazine, vol.１, No.３, ２００１, pp.２２-３０.の記事：”Silicon on Sapphire CMOS for Optoelectronic Microsystems,”を参照）、波長多重化用光源１２０（波長の異なる複数の固定波長光源、或いは波長可変光源となる半導体レーザアレイなど）、波長選択性を有する光フィルタ１００、および光電変換部（光検出器）１１０などにより構成される。CMOS回路１５０はニューロンアレイ回路および（または）シナプスアレイ回路を主たる構成要素とする。 In this case, each layer of the multilayer neural network is composed of a CMOS circuit 150 formed on a sapphire substrate 130 that is transparent in a predetermined wavelength range (see the article “Silicon on Sapphire CMOS for Optoelectronic Microsystems,” IEEE Circuits and Systems Magazine, vol. 1, no. 3, 2001, pp. 22-30), a light source 120 for wavelength multiplexing (a plurality of fixed-wavelength light sources with different wavelengths, or a semiconductor laser array or the like serving as a wavelength-tunable light source), an optical filter 100 having wavelength selectivity, a photoelectric conversion unit (photodetector) 110, and the like. The main components of the CMOS circuit 150 are a neuron array circuit and/or a synapse array circuit.

層間結合部１４０には、光ファイバ、回折格子型アレイ光導波路に代表される波長選択性を有する光媒体網、或いは、光音響素子などの光偏向機能を有する空間光変調素子などを用いる。CMOS回路１５０がニューロンアレイ回路を含むが、シナプスアレイ回路ブロック２を含まない場合には、層間結合部１４０に波長によって利得の異なる光増幅装置から構成されるシナプスアレイ回路が含まれるものとする。この場合、波長可変または波長固定の光フィルタを通過後の光信号に対して波長によって利得の異なる光増幅装置をシナプス荷重に相当するデバイスとして用いてもよい。 For the interlayer coupling unit 140, an optical fiber, an optical medium network having wavelength selectivity, typified by a diffraction-grating arrayed optical waveguide, or a spatial light modulation element having a light deflection function, such as an acousto-optic element, is used. When the CMOS circuit 150 includes a neuron array circuit but does not include the synapse array circuit block 2, the interlayer coupling unit 140 is assumed to include a synapse array circuit composed of optical amplifiers whose gain differs with wavelength. In this case, optical amplifiers whose gain differs with wavelength may be applied to the optical signal after it has passed through a wavelength-tunable or fixed-wavelength optical filter, serving as the devices that correspond to the synaptic weights.

このようにシナプスアレイ回路ブロック２は、周波数軸上で多重化された信号（複数の受容野からの出力セット）からフィルタを用いて分離された各信号（電気信号または光波信号のいずれでもよい）についてシナプス荷重値と他のニューロン素子からの出力信号との所定の積和演算を各信号の属する周波数ドメインで行い、デマルチプレクサその他の分岐出力用の手段を用いて異なる周波数ドメインで求められた積和演算結果（電気信号または光波信号のいずれでもよい）をその周波数ドメインに対応する出力先ニューロンに分岐出力する。なお、パルス信号（電気信号）についてのシナプス荷重値との積和演算については、文献（A．F．Murray et al., “Pulse-Stream VLSI Neural Networks Mixing Analog and Digital Techniques,”，IEEE Trans. On Neural Networks, vol.２, pp.１９３-２０４., １９９１）に記載されている技術を用いることができる。 In this way, for each signal (either an electrical signal or a light-wave signal) separated by a filter from the signal multiplexed on the frequency axis (the set of outputs from a plurality of receptive fields), the synapse array circuit block 2 performs a predetermined product-sum operation between the synaptic weight values and the output signals from other neuron elements in the frequency domain to which that signal belongs. Using a demultiplexer or other branch output means, it then branches and outputs the product-sum results obtained in the different frequency domains (again, either electrical or light-wave signals) to the output destination neurons corresponding to those frequency domains. For the product-sum operation between pulse signals (electrical signals) and synaptic weight values, the technique described in the literature (A. F. Murray et al., “Pulse-Stream VLSI Neural Networks Mixing Analog and Digital Techniques,” IEEE Trans. on Neural Networks, vol. 2, pp. 193-204, 1991) can be used.

ニューロン素子はCMOS回路１５０上にアナログ・デジタル混載（融合）回路として構成され、その出力信号はニューロン素子の属する受容野に応じて（結局のところ、出力先ニューロンに応じて）異なる波長の光信号に変換され、波長多重化信号として出力される。そのために、ニューロン素子の出力部に結合して光源（半導体レーザアレイなどにより構成される）１２０が設定されている。 The neuron element is configured as an analog / digital mixed (fusion) circuit on the CMOS circuit 150, and its output signal is an optical signal having a different wavelength according to the receptive field to which the neuron element belongs (after all, depending on the output destination neuron). And output as a wavelength multiplexed signal. For this purpose, a light source (configured by a semiconductor laser array or the like) 120 is set in combination with the output part of the neuron element.

図５の階層型神経回路網の特徴検出層ニューロンがニューロンアレイ回路ブロック２において実現されているとき、これに結合するシナプスアレイ回路ブロック３の各シナプス回路は、特徴検出層のある特徴クラスを検出するニューロンの受容野構造を形成するものであり、特徴統合層の単一または複数の特徴クラスに属するニューロンとの結合構造の一部または全部を形成する。 When the feature detection layer neurons of the hierarchical neural network of FIG. 5 are realized in the neuron array circuit block 2, the synapse circuits of the synapse array circuit block 3 coupled to it form the receptive field structure of the neurons that detect a certain feature class in the feature detection layer, and form part or all of the connection structure with neurons belonging to one or more feature classes of the feature integration layer.

シナプス回路ブロックが特徴検出層ニューロンの一受容野を構成する結合構造の全部を形成する場合には、特徴検出層のある特徴クラスに属する各ニューロンに入力されるべき多重化された信号は、互いに重複する受容野に属し、異なる特徴検出層ニューロンに入力されるべき同一の特徴統合層ニューロン出力である。 When the synapse circuit block forms all of the connection structure that constitutes one receptive field of the feature detection layer neuron, the multiplexed signals to be input to each neuron belonging to a feature class of the feature detection layer are mutually connected. It is the same feature integration layer neuron output that belongs to overlapping receptive fields and should be input to different feature detection layer neurons.

［第３の実施形態］ [Third Embodiment]
本実施形態では、階層型神経回路網のニューロンの受容野が部分的に共通の構造を有する場合に、共通する受容野構造を有する複数のニューロンに入力される複数のデジタルデータセットの構造を一括してビットの深さ方向に多重化したデータ表現形式となるようにしている。 In the present embodiment, when the receptive fields of the neurons in the hierarchical neural network have a partially common structure, the plurality of digital data sets input to the plurality of neurons sharing that receptive field structure are collectively given a data representation format that is multiplexed in the bit depth direction.

本実施形態の分割多重化信号生成回路ブロックは、図９に示すように、区分化サンプリング部５９、パラメータ近似部６５、ビット深さ方向データ変換部６０、区分データ合成部６１などから構成される。 As shown in FIG. 9, the division multiplexed signal generation circuit block according to the present embodiment includes a segmented sampling unit 59, a parameter approximating unit 65, a bit depth direction data converting unit 60, a segmented data synthesizing unit 61, and the like. .

パラメータ近似部６５は、例えば、各区分データの「波形」を近似するパラメトリックに表される曲線の当てはめを行い、その結果のパラメータデータ（ｋ個あるとする）をセットにして一つの区分データとする。 The parameter approximation unit 65, for example, fits a parametrically expressed curve that approximates the “waveform” of each segment data, and treats the resulting set of parameter data (assumed to be k items) as one segment data.

データの多重化表現は、ビット深さ方向変換部６０と区分データ合成部６１で行われる。受容野構造の特定位置に対応するデータがＱビットで表現されるとき、受容野構造が共通する出力先のニューロンの数がｋとすると、そのデータ表現を図１０の上図のグラフに示すようにｋ分割し、各出力先ニューロンへの各区分データの表現をそれぞれ、［Ｑ／Ｎ］＝ｐビットで行う（ここに［ｘ］はｘを丸めて整数化することを意味する）。 The multiplexed representation of data is performed by the bit depth direction converting unit 60 and the segmented data synthesizing unit 61. When the data corresponding to a specific position of the receptive field structure is expressed by Q bits, if the number of output destination neurons that share the receptive field structure is k, the data representation is shown in the upper graph of FIG. And each segment data to each output destination neuron is expressed by [Q / N] = p bits (where [x] means rounding x to an integer).

即ち、ビット深さ方向データ変換部６０は、図１０の下図に示すように第一の区分データには、０〜ｐビットが、第二の区分データにはｐ＋１〜２ｐビットが割り当てられるといった要領で、ビット深さ方向に複数の区分データの分割表現を行う。 That is, the bit depth direction data conversion unit 60 assigns 0 to p bits to the first segment data and p + 1 to 2p bits to the second segment data as shown in the lower diagram of FIG. Thus, a divided representation of a plurality of segmented data is performed in the bit depth direction.

各区分データはそれぞれが、ｓ個のパラメータデータを表すように、さらにビット深さ方向にｓ分割表現される。なお、ビット深さ方向の分割は、後で行われるシナプス回路による変換の結果が与えられた区分データのビット幅ｐに収まるように設定しておく必要があることは言うまでもない。また、各区分データは受容野構造が共通する複数ニューロンのそれぞれに入力されるべきデータであり、それぞれの受容野に属する他のニューロンからの出力である。 Each segment data is further expressed in s divisions in the bit depth direction so as to represent s parameter data. Needless to say, the division in the bit depth direction needs to be set so that the result of the conversion by the synapse circuit performed later falls within the bit width p of the given divided data. Each segment data is data to be input to each of a plurality of neurons having a common receptive field structure, and is an output from other neurons belonging to each receptive field.
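The bit-depth multiplexing can be illustrated with a small Python sketch that packs k segment values of p bits each into one Q-bit word. Several choices here are assumptions for illustration only: p is taken as the floor of Q / k, segment 0 occupies the lowest bits, values are unsigned integers, and the function names are hypothetical.

```python
def pack_segments(values, p):
    """Pack non-negative integers, p bits each, into one word.

    Assumption: segment 0 occupies the lowest p bits, segment 1 the
    next p bits, and so on in the bit depth direction.
    """
    word = 0
    for index, value in enumerate(values):
        if not 0 <= value < (1 << p):
            raise ValueError("segment value does not fit in p bits")
        word |= value << (index * p)
    return word


def unpack_segments(word, k, p):
    """Split a packed word back into its k segment values."""
    mask = (1 << p) - 1
    return [(word >> (index * p)) & mask for index in range(k)]


Q, k = 16, 4            # word width and number of destination neurons (assumed)
p = Q // k              # bits available per segment
word = pack_segments([3, 9, 0, 15], p)
segments = unpack_segments(word, k, p)
```

Packing in this way lets one stored word carry the inputs for all k neurons that share the receptive field structure, at the cost of reduced precision per segment.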

区分データ合成部６１は、以上のようにビット深さ方向に変換された区分データを一例として図１０の下図に示すような一つの合成データを生成する。なお、図９には示していないが、区分データ合成部６１は、合成されたデータに対して所定のD／A変換その他の変換を施してアナログ信号を得てもよい。 The segment data synthesis unit 61 combines the segment data converted in the bit depth direction as described above into a single composite data item, an example of which is shown in the lower part of FIG. 10. Although not shown in FIG. 9, the segment data synthesis unit 61 may also apply a predetermined D/A conversion or other conversion to the composite data to obtain an analog signal.

シナプスアレイ回路ブロック３は、ビット深さ方向に多重化された信号の各分割データについて、シナプス荷重値に対応する所定の演算（シナプス荷重値と分割データとの積算など）を行う。ビット深さが同じ区分データについての積和演算結果を得ると、ビット深さ方向に多重化してある各演算結果をシフトレジスタなどにより分割して抽出し、デマルチプレクサなどの分岐出力部（図１に図示せず）により各ビット深さレベルに対応したニューロン（ニューロンアレイ回路ブロック２内にある）に分岐出力する。 The synapse array circuit block 3 performs, on each divided data of the signal multiplexed in the bit depth direction, a predetermined operation corresponding to the synaptic weight value (such as the product of the synaptic weight value and the divided data). Once the product-sum results for segment data of the same bit depth have been obtained, the results, which remain multiplexed in the bit depth direction, are separated and extracted by a shift register or the like, and a branch output unit such as a demultiplexer (not shown in FIG. 1) branches and outputs them to the neurons (in the neuron array circuit block 2) corresponding to each bit depth level.
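One way to picture a product-sum on bit-depth-multiplexed data is to multiply whole packed words by weights and split the fields apart afterwards with shifts and masks. The sketch below assumes unsigned integer fields, non-negative integer weights, and per-field results that fit in p bits (the condition the description places on the bit-depth division); the function name and the overflow check are illustrative additions, not the patented circuit.

```python
def packed_multiply_accumulate(words, weights, k, p):
    """Weighted sum of packed words, returned as k separate field results.

    Each word holds k fields of p bits. Multiplying a whole word by a
    weight scales every field at once, provided no field result
    overflows its p bits, so that is verified first.
    """
    mask = (1 << p) - 1
    for i in range(k):
        total = sum(((word >> (i * p)) & mask) * w
                    for word, w in zip(words, weights))
        if total > mask:
            raise OverflowError("field result exceeds p bits")
    accumulator = sum(word * w for word, w in zip(words, weights))
    # Shift-and-mask extraction, in the role the text gives to a shift register.
    return [(accumulator >> (i * p)) & mask for i in range(k)]


# Two receptive-field positions, each word carrying data for four neurons.
words = [0x0321, 0x0112]    # fields [1, 2, 3, 0] and [2, 1, 1, 0]
weights = [3, 2]            # one shared weight per receptive-field position
results = packed_multiply_accumulate(words, weights, k=4, p=4)
```

Each element of `results` is the product-sum destined for one neuron, so a single pass over the packed words serves all neurons sharing the weights.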

なお、上述した曲線当てはめを行わない場合には、パラメータ近似部６５は必要ではないが、有限個（ｋ個とする）のデータセットで表され、かつそのデータセットの全体を多重化表現することには変わりはないものとする。 When the curve fitting described above is not performed, the parameter approximation unit 65 is unnecessary. Even so, the data is still represented by a finite number (k) of data sets, and the data sets as a whole are still represented in multiplexed form.

以上の説明により、中間的なデータセットの全体の周波数多重化表現またはビット深さ方向の多重化表現を用いることにより、処理速度をさらに向上させることができる。 As described above, the processing speed can be further improved by using the frequency-multiplexed representation of the entire intermediate data set or the multiplexed representation in the bit depth direction.

なお、上記実施形態では、階層構造でもって下層から上層に向けて情報処理を行う処理系としてニューラルネットワークを用いたが、これに限定するものではない。 In the above embodiment, a neural network is used as a processing system that performs information processing from a lower layer to an upper layer with a hierarchical structure. However, the present invention is not limited to this.

本発明の第１の実施形態に係る並列処理装置の基本構成を示すブロック図である。 A block diagram showing the basic configuration of the parallel processing device according to the first embodiment of the present invention.
分割多重化信号生成回路ブロック５の構成を示す図である。 A diagram showing the configuration of the division multiplexed signal generation circuit block 5.
分割多重化信号生成回路ブロック５の構成を示す図である。 A diagram showing the configuration of the division multiplexed signal generation circuit block 5.
特徴検出層の前段にある特徴統合層の所定の特徴クラスに属する全てのニューロンからの出力を処理する内容を説明するための図である。 A diagram for explaining the processing of the outputs from all neurons belonging to a predetermined feature class of the feature integration layer preceding the feature detection layer.
特徴検出層の前段にある特徴統合層の所定の特徴クラスに属する全てのニューロンからの出力を処理する内容を説明するための図である。 A diagram for explaining the processing of the outputs from all neurons belonging to a predetermined feature class of the feature integration layer preceding the feature detection layer.
特徴検出層の前段にある特徴統合層の所定の特徴クラスに属する全てのニューロンからの出力を処理する内容を説明するための図である。 A diagram for explaining the processing of the outputs from all neurons belonging to a predetermined feature class of the feature integration layer preceding the feature detection layer.
区間が互いに重複しない部分的切り出しデータを複数のメモリに格納して並列出力する例を説明する図である。 A diagram explaining an example in which partially cut-out data whose sections do not overlap one another are stored in a plurality of memories and output in parallel.
図４Ａに示す如く部分切り出しデータを処理する場合の分割多重化信号生成回路ブロック５の構成を示す図である。 A diagram showing the configuration of the division multiplexed signal generation circuit block 5 when partially cut-out data is processed as shown in FIG. 4A.
本実施形態に係る並列処理装置が実現する多層神経回路網のモデルの構成を示す図である。 A diagram showing the configuration of the model of the multilayer neural network realized by the parallel processing device according to the present embodiment.
各ニューロンへの信号のパルス位相表現を説明する図である。 A diagram explaining the pulse phase representation of the signals to each neuron.
波長多重化を行う場合の要部構成図を示す図である。 A diagram showing the configuration of the main part when wavelength multiplexing is performed.
ある特徴検出（或いは特徴統合）細胞（特徴検出層内のニューロン）に対する受容野を形成する特徴統合（或いは特徴検出）細胞のニューロン群（ｎｉ）からの出力（当該細胞から見ると入力）に関与する結合手段の構成を示す図である。 A diagram showing the configuration of the coupling means involved in the outputs (inputs, as seen from the cell in question) from the neuron group (ni) of feature integration (or feature detection) cells that form the receptive field for a certain feature detection (or feature integration) cell (a neuron in the feature detection layer).
ある特徴検出（或いは特徴統合）細胞（特徴検出層内のニューロン）に対する受容野を形成する特徴統合（或いは特徴検出）細胞のニューロン群（ｎｉ）からの出力（当該細胞から見ると入力）に関与する結合手段の構成を示す図である。 A diagram showing the configuration of the coupling means involved in the outputs (inputs, as seen from the cell in question) from the neuron group (ni) of feature integration (or feature detection) cells that form the receptive field for a certain feature detection (or feature integration) cell (a neuron in the feature detection layer).
本発明の第３の実施形態の分割多重化信号生成回路ブロックの構成を示す図である。 A diagram showing the configuration of the division multiplexed signal generation circuit block of the third embodiment of the present invention.
第３の実施形態におけるデータ表現を説明する図である。 A diagram explaining the data representation in the third embodiment.
本発明の第１の実施形態に係るパターン検出（認識）装置を撮像装置に用いた例の構成を示す図である。 A diagram showing the configuration of an example in which the pattern detection (recognition) device according to the first embodiment of the present invention is used in an imaging device.
周波数多重化した信号の各成分（異なるニューロンへ出力される）のスペクトルを概念的に示す図である。 A diagram conceptually showing the spectrum of each component (output to a different neuron) of the frequency-multiplexed signal.

Claims

A parallel processing apparatus with a multiplexed data structure, for a hierarchical neural network having a local receptive field structure common among neuron elements, characterized by comprising division multiplexing means that divides output signals from neuron elements belonging to the preceding hierarchical level, as data input to the synaptic connections that define the receptive field, such that each divided signal has a portion overlapping in time series with another divided signal.

The parallel processing apparatus with a multiplexed data structure according to claim 1, wherein the hierarchical neural network has a hierarchical structure in which feature detection layers and feature integration layers are arranged alternately, the feature detection layer and the feature integration layer respectively detecting and integrating at least one feature.

The parallel processing apparatus with a multiplexed data structure according to claim 1, wherein the structure of the data input to the synaptic connection is formed by arranging, as one time-series data, signals representing integration results relating to a preset feature from the preceding hierarchical level, as signals corresponding to respective positions within preset, mutually overlapping regions of the input data of the hierarchical neural network.

The parallel processing apparatus with a multiplexed data structure according to claim 3, wherein the structure of the data input to the synaptic connection is formed by extracting signals representing integration results relating to a preset feature from the preceding hierarchical level in correspondence with a plurality of preset sampling point positions over the entire input data to the hierarchical neural network, and arranging, from the extracted signals, sets of data corresponding to mutually partially overlapping sections of the input data as one time-series data.

The parallel processing apparatus with a multiplexed data structure according to claim 3, wherein the structure of the data input to the synaptic connection is formed by extracting signals representing integration results relating to a preset feature from the preceding hierarchical level in correspondence with a plurality of preset sampling point positions over the entire input data to the hierarchical neural network, arranging in parallel, from the extracted signals, sets of the signals corresponding to a plurality of mutually non-overlapping partial sections of the input data, and transferring part of the data of mutually adjacent partial sections between the signals of those partial sections, as one time-series data.

The parallel processing apparatus with a multiplexed data structure according to claim 4, wherein the hierarchical neural network has a structure in which a plurality of synaptic connection circuits and a plurality of neuron circuits are arranged in a preset manner; the data input to the synaptic connection circuits is time-series data in which signals representing integration results relating to a preset feature from the preceding hierarchical level are arranged in the time axis direction, each data item being associated with a preset sampling position of the input data such that data items corresponding to different sampling positions are arranged in the time axis direction; and the time-series data input to mutually adjacent synaptic connection circuits include mutually overlapping data.

The parallel processing apparatus with a multiplexed data structure according to claim 6, wherein the time-series data is formed by associating signals representing integration results relating to a preset feature from the preceding hierarchical level with different sampling positions of the input data, and arranging partial data, set in advance from the whole of the data thus extracted in time series, as one time-series data.

A parallel processing apparatus with a multiplexed data structure, for a hierarchical neural network having a local receptive field structure common among neuron elements, comprising:
synaptic connections that define receptive fields; and
division multiplexed signal generating means for dividing and converting output signals from neuron elements belonging to the preceding hierarchical level, as data input to the synaptic connections, such that each divided signal has a portion overlapping with another divided signal.

The parallel processing apparatus with a multiplexed data structure according to claim 8, wherein the division multiplexed signal generating means includes:
sampling means for sampling a plurality of the output signals and extracting the output signals of a plurality of mutually overlapping sections; and
a time-series data generation circuit that integrates a plurality of segmented data to generate one time-series data.

The parallel processing apparatus with a multiplexed data structure according to claim 8, wherein the division multiplexed signal generating means includes:
sampling means for sampling a plurality of the output signals and extracting the output signals of a plurality of non-overlapping sections;
a plurality of memories each storing segmented data;
memory control means for controlling data transfer between the plurality of memories such that some segmented data overlaps before and after each data transfer; and
parallel output means for outputting the segmented data in the memories in parallel each time the data transfer is performed.

The parallel processing apparatus with a multiplexed data structure according to claim 9 or 10, wherein the segmented data is data obtained by outputting, in time series, the outputs of other neuron elements belonging to a preset receptive field of the neuron element.