JPH01229360A - Learning processing system in Boltzmann machine

Info

Publication number: JPH01229360A
Application number: JP63055489A
Authority: JP
Inventors: Akira Kawamura (川村 旭); Nobuo Watabe (渡部 信雄); Takashi Kimoto (木本 隆); Kazuo Asakawa (浅川 和雄); Shigemi Osada (長田 茂美); Hideki Yoshizawa (吉沢 英樹); Minoru Sekiguchi (関口 実)
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1988-03-09
Filing date: 1988-03-09
Publication date: 1989-09-13

Abstract

PURPOSE: To monitor the operating state of the network during learning by providing a monitoring part, so that the progress of learning and whether learning is operating normally can be determined and parameters that establish an optimum operating state can be searched for. CONSTITUTION: The network 10 includes at least visible units 11 and hidden units 12, and the weight W_ij between units S_i and S_j can be set. When the weights W_ij are set, the probability distribution P^-(V_α) of the patterns that actually occur is adjusted toward the occurrence probability distribution P^+(V_α) that corresponds to the learning pattern parameters given by the initial setting part 21 of the learning processing part 20. That is, the weight updating part 22 adjusts the weights W_ij so that P^-(V_α) approaches the pattern occurrence probability P(T_α) of the external environment. The monitoring part 30 monitors, during the processing by the updating part 22, the progress of learning and whether learning is operating normally. Because both the quality of the operating state and the progress of learning can be monitored, parameters that establish the optimum state can be searched for.

Description

[Detailed Description of the Invention]

[Summary]

The invention relates to a learning processing method for a Boltzmann machine, which is one way of constructing a neurocomputer.

Its object is to monitor the operating state of the network during learning, so that the progress of learning is known and parameters that establish the optimum operating state can be searched for.

The method applies to a Boltzmann machine that has a network, which includes at least visible units and hidden units and in which the weight W_ij between units can be set, and a learning processing section that adjusts the weights W_ij by learning.

A monitor section is provided, and it is configured to monitor both the progress of learning and whether the operating state is good or bad.

[Industrial application field]

The present invention relates to a learning processing method for a Boltzmann machine, which is one way of constructing a neurocomputer.

Some aspects of the operation of a Boltzmann machine cannot be explained from the principle of the learning rule. In particular, because the procedure for minimizing the measure G is simple, the parameters must be chosen well for learning to be fast. In addition, with a poor annealing schedule the state transitions of the units no longer follow the Boltzmann distribution, and the mathematical guarantee of correct operation is lost. It is therefore necessary to monitor the learning process, confirm that operation is normal, check the progress of learning, and set appropriate parameters.

[Conventional technology]

a) Overview of the Boltzmann machine

A Boltzmann machine is a learning machine with a network structure.

By repeatedly presenting the patterns of the external environment that are to be stored, the machine makes the connection weights between units in the network learn the probabilistic regularity with which the patterns occur. After learning is complete, the weights are fixed, part of a pattern is presented, and the remaining part is recalled.

b) Network structure and behavior

The network consists of units that take one of two states, 0 or 1, expressed as follows.

$$S_i = 0, 1 \quad (i = 1, 2, \ldots, n) \tag{1}$$

Units i and j are connected by a weight W_ij, as shown in FIG. 2. The weights are symmetric (W_ij = W_ji). Each unit also has a threshold θ_i.

If the states of all units are viewed as a vector, there are 2^n all-unit state vectors, as shown in FIG. 3. An energy is defined for each of them as follows.

$$E_\beta = -\sum_{i<j} W_{ij} S_i S_j + \sum_i \theta_i S_i \tag{2}$$

The network moves probabilistically among the all-unit states β = 1 to 2^n. In thermal equilibrium, the probability that the all-unit state is β at a given time is as follows, where T is the temperature.

Boltzmann distribution:

$$P_\beta \propto \exp\left(-\frac{E_\beta}{T}\right) \tag{3}$$

These can be interpreted as follows. When the weight W_ij is positive, units S_i and S_j support each other in being 1; when it is negative, they inhibit each other from being 1. The larger the threshold θ_i, the less likely S_i is to be 1.

The energy E_β can be regarded as indicating the degree of conflict in these interactions between units. An all-unit state with high energy is unstable and tends to move to a more stable all-unit state with lower energy. Thermal equilibrium is the state in which this conflict has settled, and the network visits all-unit states with lower energy more frequently.

The important point is that, when some pattern is represented by all or part of the units in the network and the Boltzmann distribution holds in thermal equilibrium, the probability of appearance of the pattern can be controlled by adjusting the weights W_ij.
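As a minimal sketch of equations (2) and (3), the following Python code enumerates all 2^n all-unit states of a very small network and normalizes the Boltzmann factors. The function names and the two-unit example values are illustrative assumptions, not part of the patent, and exhaustive enumeration is only practical for tiny n.

```python
import itertools
import math


def energy(state, weights, thresholds):
    """Equation (2): E = -sum_{i<j} W_ij S_i S_j + sum_i theta_i S_i."""
    n = len(state)
    pair_term = sum(weights[i][j] * state[i] * state[j]
                    for i in range(n) for j in range(i + 1, n))
    threshold_term = sum(thresholds[i] * state[i] for i in range(n))
    return -pair_term + threshold_term


def boltzmann_distribution(weights, thresholds, temperature):
    """Equation (3), normalized over all 2^n all-unit states."""
    n = len(thresholds)
    states = list(itertools.product((0, 1), repeat=n))
    factors = [math.exp(-energy(s, weights, thresholds) / temperature)
               for s in states]
    total = sum(factors)
    return {s: f / total for s, f in zip(states, factors)}


# Hypothetical two-unit network with a positive (mutually supporting) weight.
W = [[0.0, 2.0], [2.0, 0.0]]
theta = [0.5, 0.5]
dist = boltzmann_distribution(W, theta, temperature=1.0)
most_likely = max(dist, key=dist.get)  # the state with the lowest energy
```

With these assumed values the state (1, 1) has the lowest energy and therefore the highest probability, which matches the interpretation of a positive weight given above.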

c) Learning method

Consider a hypothetical external environment that has terminals taking the states 0 and 1. FIG. 4 is an explanatory diagram of the external environment.

The external environment is characterized by the appearance probability P(T_α) of the pattern vector T_α that appears as the state of its terminals.

The goal of learning is to build a model of the pattern generation mechanism of the external environment inside the network, by repeatedly presenting patterns and adjusting the weights.

The learning method is described below.

First, there are two types of units in the network, as shown in FIG. 5. In FIG. 5, 10 denotes the network, 11 the visible units, and 12 the hidden units. One type is the "visible unit" 11, which has a terminal connected directly to the outside and whose state can be set and changed from outside.

The other is the "hidden unit" 12, which has no direct connection to the outside and whose state cannot be set or changed from outside. FIG. 7 is an explanatory diagram of the phases of learning. As shown there, learning, that is, the procedure for changing the weights, consists of two phases.

In phase + shown in the figure, the visible units 11 of the network are connected to the terminals of the external environment 13 so that their states are set and changed according to the given probability P(T_α); only the states of the hidden units 12 are allowed to change, and the whole network is brought to thermal equilibrium. With the vector representing the state of the visible units 11 as shown in FIG. 6, let P^+(V_α) be the probability that this vector is V_α. Then

$$P^+(V_\alpha) = P(T_\alpha) \tag{4}$$

In phase −, no unit in the network has its state set or changed from outside; the states of all units are allowed to change, and the network is brought to thermal equilibrium. Let P^-(V_α) be the probability that the state of the visible units 11 is V_α in this phase. The goal of learning is to build a model of the pattern generation mechanism of the external environment 13 inside the network 10. In other words, the parameters that control P^-(V_α), namely the connection weights W_ij, are adjusted so that the pattern occurrence probability P^-(V_α) of the freely running network 10 comes as close as possible to the pattern occurrence probability of the external environment, P(T_α) = P^+(V_α).

This can be formulated by defining the following measure G, which expresses the degree of difference between the two distributions, and minimizing it.

$$G = \sum_\alpha P^+(V_\alpha) \ln \frac{P^+(V_\alpha)}{P^-(V_\alpha)} \tag{5}$$

This is called the Kullback divergence and is a measure commonly used when estimating the parameters of probability distributions. Note that P^-(V_α) is a function of the weights W_ij at that point in time.

The measure G is non-negative and takes its minimum value, 0, if and only if P^-(V_α) = P^+(V_α).
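The Kullback divergence G between the distribution P^+ of the patterns to be learned and the distribution P^- produced by the freely running network can be computed as in the following sketch. The dictionary representation of the distributions, the function name, and the example values are assumptions made for illustration only.

```python
import math


def kullback_divergence(p_plus, p_minus):
    """G = sum_a P+(a) * ln(P+(a) / P-(a)).

    Patterns with P+(a) == 0 contribute nothing to the sum. If P- assigns
    zero probability to a pattern that P+ can produce, G is infinite.
    """
    g = 0.0
    for pattern, p in p_plus.items():
        if p == 0.0:
            continue
        q = p_minus.get(pattern, 0.0)
        if q == 0.0:
            return math.inf
        g += p * math.log(p / q)
    return g


# Hypothetical distributions over the states of two visible units.
p_plus = {(0, 0): 0.5, (1, 1): 0.5}
p_minus_uniform = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

g_far = kullback_divergence(p_plus, p_minus_uniform)  # 0.5*ln 2 + 0.5*ln 2
g_same = kullback_divergence(p_plus, dict(p_plus))    # identical distributions
```

In this example G equals ln 2 for the uniform P^- and 0 when the two distributions coincide, in line with the statement that G is non-negative and is 0 only when P^- = P^+.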

To find the weights W_ij that minimize the measure G, the steepest descent method is used. This requires the partial derivative ∂G/∂W_ij, and it has been proven that it can be obtained from the following formula (see the references below).

$$\frac{\partial G}{\partial W_{ij}} = -\frac{1}{T}\left(P^+_{ij} - P^-_{ij}\right) \quad (i, j = 1, \ldots, n) \tag{6}$$

Here P^+_ij and P^-_ij are the probabilities that S_i and S_j are both 1 at the same time in phase + and phase −, respectively. These values can be estimated by observing the network 10 while it operates.

Using the above formula, the weights are changed in the direction in which the measure G decreases most steeply. The change ΔW_ij in a weight can be written as follows.

$$\Delta W_{ij} = \varepsilon \left(P^+_{ij} - P^-_{ij}\right) \tag{7}$$

The parameter ε (> 0) must be set appropriately.
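A minimal sketch of equation (7): the co-occurrence probabilities are estimated by counting how often S_i and S_j are both 1 in observed all-unit states, and the weight change is their difference scaled by ε. The sample lists, the helper names, and the value of ε are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cooccurrence(samples):
    """Estimate P_ij as the fraction of samples in which S_i = S_j = 1 (i != j)."""
    n = len(samples[0])
    counts = [[0] * n for _ in range(n)]
    for s in samples:
        for i in range(n):
            for j in range(n):
                if i != j and s[i] == 1 and s[j] == 1:
                    counts[i][j] += 1
    return [[c / len(samples) for c in row] for row in counts]


def weight_change(p_plus_ij, p_minus_ij, epsilon):
    """Equation (7): delta W_ij = epsilon * (P+_ij - P-_ij)."""
    n = len(p_plus_ij)
    return [[epsilon * (p_plus_ij[i][j] - p_minus_ij[i][j]) for j in range(n)]
            for i in range(n)]


# Hypothetical all-unit states observed in phase + and in phase -.
plus_samples = [(1, 1, 0), (1, 1, 1), (0, 0, 0), (1, 1, 0)]
minus_samples = [(1, 0, 0), (0, 1, 1), (0, 0, 0), (1, 1, 0)]

delta_w = weight_change(cooccurrence(plus_samples),
                        cooccurrence(minus_samples),
                        epsilon=0.1)
```

Here units 0 and 1 are on together in 3 of 4 phase + samples but only 1 of 4 phase − samples, so their weight is increased by 0.1 × (0.75 − 0.25) = 0.05, while the weight between units 1 and 2 is left unchanged.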

d) Weight update algorithm

FIG. 8 is a flowchart of the weight update procedure. FIGS. 8(A), 8(B), and 8(C) together form a single figure. A weight update consists of the following three procedures.

○ In phase +, obtain an estimate of P^+_ij.

○ In phase −, obtain an estimate of P^-_ij.

○ Update the weights by the following formula.

$$W_{ij} = W_{ij} + \varepsilon\left(P^+_{ij} - P^-_{ij}\right) \tag{8}$$

1) The difference between the two phases, phase + and phase −, is as follows.

In phase +, the visible units 11 are set and changed to the learning patterns, and only the hidden units 12 change state freely.

In phase −, no unit has its value set or changed, and all units change state freely.

2) Within each phase, the following procedure is repeated as many times as there are learning patterns.

○ Initialize the units. Units that are to be set and changed to the learning pattern have their values set accordingly. The other units are set randomly to 0 or 1.

○ Carry out the state transition procedure according to the configured annealing schedule, that is, the sequence of temperatures and the number of iterations at each temperature. When annealing is finished, the Boltzmann distribution at thermal equilibrium is realized.

○ At the target final temperature, carry out the state transition procedure several times and count how often S_i and S_j are 1 at the same time. The counts are accumulated over all learning patterns.

3) At the end of each phase, convert the frequencies into probabilities to obtain the estimates of P^+_ij and P^-_ij.

e) State transition procedure

FIG. 9 is a flowchart of the state transition procedure.

The algorithm shown in the figure is used as the state transition of the units, in order to realize the Boltzmann distribution at thermal equilibrium for the given weights and temperature.

○ Select a unit at random from among the units that are allowed to change freely. Let it be the k-th unit.

○ Compute the energy difference ΔE_k between the states S_k = 0 and S_k = 1.

$$\Delta E_k = \sum_i W_{ki} S_i - \theta_k \tag{9}$$

○ Set S_k to 1 with probability

$$P_k = \frac{1}{1 + \exp(-\Delta E_k / T)} \tag{10}$$

and to 0 otherwise.

The above procedure is repeated enough times that, in one state transition procedure, every unit is selected once on average.
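The state transition procedure can be sketched as follows. The energy gap is taken as the difference E(S_k = 0) − E(S_k = 1) implied by the energy definition in equation (2), which is an assumption of this sketch, as are the function names, the `free_units` argument used to keep clamped units fixed, and the example values.

```python
import math
import random


def sigmoid(x):
    """Numerically safe 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def energy_gap(k, state, weights, thresholds):
    """Delta E_k = sum_i W_ki S_i - theta_k (positive when S_k = 1 has lower energy)."""
    return sum(weights[k][i] * state[i]
               for i in range(len(state)) if i != k) - thresholds[k]


def state_transition(state, weights, thresholds, temperature, free_units, rng):
    """One state transition procedure.

    Units are picked at random from free_units as many times as there are
    free units, so each free unit is selected once on average.
    """
    for _ in range(len(free_units)):
        k = rng.choice(free_units)
        p_k = sigmoid(energy_gap(k, state, weights, thresholds) / temperature)
        state[k] = 1 if rng.random() < p_k else 0
    return state


# Hypothetical example: unit 0 is clamped to 1, unit 1 is free and strongly
# supported by unit 0, so it is switched on with probability P_k close to 1.
W = [[0.0, 50.0], [50.0, 0.0]]
theta = [0.0, 0.0]
result = state_transition([1, 0], W, theta, temperature=1.0,
                          free_units=[1], rng=random.Random(0))
```

Passing only the hidden units as `free_units` corresponds to phase +, and passing all units corresponds to phase −.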

f) Supplementary explanation

The principle by which the Boltzmann distribution at thermal equilibrium is realized is as follows.

Equation (10) is called the sigmoidal function, and its graph is shown in FIG. 10.

At T = 0 the sigmoidal function P_k is a step function: when ΔE_k > 0, that is, when S_k = 1 has the lower energy, S_k is always set to 1, and conversely when ΔE_k < 0, S_k is always set to 0.

This is a deterministic algorithm called the hill-climbing method, which changes the state in the direction that lowers the energy of the whole network. Its drawback is that it becomes trapped in local minima caused by the unevenness of the energy over the state space.

The sigmoidal function P_k has the shape of a step function smoothed by the parameter T, the temperature.

The units then have energy fluctuations proportional to the temperature T. This allows the network to escape from local minima, and a Boltzmann distribution centered on the minimum point is realized.

In practice, however, when the temperature is low the energy fluctuations may be too small to cross the potential barrier around a local minimum. Escape from the local minimum is then not impossible, but it takes a long time. On the other hand, if the temperature T is too high, the states of the units fluctuate too much. In the extreme case T = ∞, the state changes of the units are completely random.

The weight update procedure therefore adopts the idea of simulated annealing: start from a temperature higher than the target temperature, make state transitions while roughly searching for regions of low energy, then lower the temperature step by step to reduce the fluctuations and settle near the lowest-energy state. In this way the Boltzmann distribution at thermal equilibrium at the target temperature is obtained more reliably and with fewer iterations than by iterating at the target temperature throughout.
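The annealing idea can be sketched as a list of (temperature, repetitions) pairs that is walked from the highest temperature down to the target temperature. The schedule values, the example networks, and the function names below are hypothetical; the sketch lets all units change freely, as in phase −.

```python
import math
import random


def sigmoid(x):
    """Numerically safe 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def state_transition(state, weights, thresholds, temperature, rng):
    """One state transition procedure in which every unit is free."""
    n = len(state)
    for _ in range(n):
        k = rng.randrange(n)
        gap = sum(weights[k][i] * state[i]
                  for i in range(n) if i != k) - thresholds[k]
        state[k] = 1 if rng.random() < sigmoid(gap / temperature) else 0


def anneal(state, weights, thresholds, schedule, rng):
    """Run the schedule: a list of (temperature, repetitions), hottest first."""
    for temperature, repetitions in schedule:
        for _ in range(repetitions):
            state_transition(state, weights, thresholds, temperature, rng)
    return state


# Hypothetical schedule ending at the target temperature 10.
schedule = [(20.0, 2), (15.0, 2), (12.0, 2), (10.0, 4)]
W = [[0.0, 1.0, -1.0], [1.0, 0.0, 0.5], [-1.0, 0.5, 0.0]]
theta = [0.0, 0.0, 0.0]
rng = random.Random(0)
final_state = anneal([rng.randint(0, 1) for _ in range(3)], W, theta, schedule, rng)

# A second, deliberately cold schedule on a two-unit network whose units
# strongly favour being 1: the network settles in the lowest-energy state.
cold_state = anneal([0, 0], [[0.0, 5.0], [5.0, 0.0]], [-5.0, -5.0],
                    [(5.0, 5), (0.1, 20)], random.Random(1))
```

A slower schedule (more temperatures, more repetitions per temperature) approximates the Boltzmann distribution better at the cost of more state transitions, which is the trade-off the text describes.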

References

1. D. H. Ackley, G. E. Hinton, T. J. Sejnowski, "A Learning Algorithm for Boltzmann Machines," Cognitive Science 9, 1985, 147-169.

2. G. E. Hinton, T. J. Sejnowski, "Learning and Relearning in Boltzmann Machines," Parallel Distributed Processing Vol. 1 (The MIT Press), 282-317.

[Problems to be Solved by the Invention]

The parameters that control the operation of a Boltzmann machine are the annealing schedule and the change width ε of the weights.

These parameters are interrelated and cannot be separated completely, but they fall broadly into two groups. The first concerns the realization of state transitions that follow the Boltzmann distribution at thermal equilibrium. The second concerns the search for weights W_ij that minimize the measure G.

First, consider the realization of state transitions that follow the Boltzmann distribution at thermal equilibrium. If the state transition procedure were repeated an infinite number of times, thermal equilibrium could be reached at any temperature.

In practice, however, it must be reached in a finite number of iterations, and in as few as possible. The higher the temperature, the larger the fluctuations, so the network is more likely to escape from local energy minima and reaches thermal equilibrium more easily. The idea of simulated annealing is therefore introduced: state transitions start at a high temperature and the network is allowed to settle at the target temperature. Ideally, then, one should start from a higher temperature, make more state transitions at each temperature, and lower the temperature more slowly.

In the end, a compromise must be found between processing speed and how well the Boltzmann distribution is realized.

The search for weights W_ij that minimize the measure G is a multidimensional minimization problem. The Boltzmann machine searches by changing the weights along the gradient direction of the measure G. It does not check whether the measure G has become larger or smaller as a result of changing the weights.

As a general tendency, therefore, a local minimum can be found, but a weight change may also increase the measure G, in which case oscillation occurs. With such a simple search method, the change width ε of the weights also has a large effect on search efficiency. In general, the smaller the change width ε, the more reliable the search should be, but the slower it becomes. An optimum value of the change width ε therefore exists, and it varies dynamically.

The parameters must be determined with the above in mind, but the interactions among them are strong, and it is difficult to search for parameters that yield correct answers efficiently.

With a conventional Boltzmann machine, the only way to judge whether learning was good or bad was whether the output for a given input was correct. Moreover, when learning progressed poorly, there was no means of determining the cause.

The object of the present invention is to monitor the operating state of the network during learning, so that the progress of learning is known and parameters that establish the optimum operating state can be searched for.

[Means to solve the problem]

FIG. 1 shows the basic configuration of the present invention. In the figure, 10 is the network, 11 a visible unit, 12 a hidden unit, 20 the learning processing section, 21 the initial setting section, 22 the weight updating section, 23 the convergence determination section, 24 the learning completion result holding section, and 30 the monitor section; G, GI, and GB denote the measures used in the monitor section of the present invention.

The network 10 shown corresponds to the network shown in FIGS. 5 and 7. The learning processing section 20 performs learning as described with reference to FIGS. 7 and 8 and sets the weight W_ij between each pair of units S_i and S_j shown, as described in connection with FIG. 2. That is, the weight updating section 22 adjusts the weights W_ij so that P^-(V_α) approaches P^+(V_α), that is, P(T_α), in accordance with the learning pattern parameters given by the initial setting section 21. The convergence determination section 23 checks whether the weights W_ij have converged and, if they have, has the result held in the learning completion result holding section 24.

The monitor section 30 monitors the processing by the weight updating section 22, both for the progress of learning and for whether learning is operating normally. As measures for this purpose it calculates the following measure G, GI, or GB, and monitors whether the operating state is good or bad and how far learning has progressed.

○ G: the Kullback divergence between the appearance probability distribution P^+(V_α) of the patterns to be learned and the probability distribution P^-(V_α) with which they actually appeared (V_α: state vector of the visible units).

○ GI: the Kullback divergence between the appearance probability distribution P^+(V_α) of the patterns to be learned and the ideal value PI(W_ij; V_α) of the Boltzmann distribution corresponding to the current weights.

○ GB: the Kullback divergence between the ideal Boltzmann distribution PI(W_ij; U_β) corresponding to the current weights and the probability distribution P^-(U_β) of the patterns that actually occurred (U_β: state vector of all units).

[Operation]

○ The measure G is the very evaluation function that the Boltzmann machine tries to minimize, and it indicates the progress of learning. The change width ε of the weights is adjusted to an appropriate value on the basis of how this value changes at each weight update. In monitoring with the measure G, if the value oscillates rather than decreasing monotonically, the change width ε needs to be made smaller.
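One possible way to act on this observation is sketched below. The text only states that ε should be reduced when G oscillates; the specific rule used here (shrink ε whenever the most recent G is larger than the one before it) and the shrink factor of 0.5 are assumptions made for illustration.

```python
def adjust_change_width(g_history, epsilon, shrink=0.5):
    """Return a reduced epsilon if G rose at the latest weight update.

    g_history holds the measure G after each weight update, oldest first.
    A rise in G is treated as a sign of oscillation (assumed rule).
    """
    if len(g_history) >= 2 and g_history[-1] > g_history[-2]:
        return epsilon * shrink
    return epsilon


# G keeps falling: epsilon is left unchanged.
eps_steady = adjust_change_width([0.9, 0.7, 0.6], epsilon=0.2)

# G fell and then rose again: epsilon is halved.
eps_reduced = adjust_change_width([0.9, 0.5, 0.8], epsilon=0.2)
```

In a learning loop this function would be called once per weight update, with the G value supplied by the monitor section.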

○ The measure GI also indicates the progress of learning. Unlike the measure G, it can be calculated without the annealing procedure, but it requires an amount of computation on the order of 2^v.

○ The measure GB indicates how close the probability distribution that actually occurs is to the Boltzmann distribution; that is, it indicates whether the operating state is good or bad. In monitoring with the measure GB, a large value indicates that the state transitions deviate greatly from the Boltzmann distribution, so a more gradual annealing schedule is needed.
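The two measures can be sketched for a very small network by computing the ideal Boltzmann distribution exactly. Several points are assumptions of this sketch rather than statements from the patent: the ideal distribution is obtained by enumerating all 2^n all-unit states, the argument order of each Kullback divergence follows the order in which the text names the two distributions, and all function names and example values are hypothetical.

```python
import itertools
import math


def ideal_boltzmann(weights, thresholds, temperature):
    """Exact Boltzmann distribution over all 2^n all-unit states."""
    n = len(thresholds)
    states = list(itertools.product((0, 1), repeat=n))
    factors = []
    for s in states:
        pair = sum(weights[i][j] * s[i] * s[j]
                   for i in range(n) for j in range(i + 1, n))
        e = -pair + sum(thresholds[i] * s[i] for i in range(n))
        factors.append(math.exp(-e / temperature))
    total = sum(factors)
    return {s: f / total for s, f in zip(states, factors)}


def visible_marginal(dist, visible):
    """Sum an all-unit distribution down to the visible units."""
    marginal = {}
    for state, p in dist.items():
        key = tuple(state[i] for i in visible)
        marginal[key] = marginal.get(key, 0.0) + p
    return marginal


def kullback(p, q):
    """sum_x p(x) ln(p(x)/q(x)); infinite if q is 0 where p is not."""
    g = 0.0
    for key, pk in p.items():
        if pk > 0.0:
            qk = q.get(key, 0.0)
            if qk == 0.0:
                return math.inf
            g += pk * math.log(pk / qk)
    return g


def measure_gi(p_plus, weights, thresholds, temperature, visible):
    """GI: divergence of P+ from the ideal distribution of the visible units."""
    ideal = ideal_boltzmann(weights, thresholds, temperature)
    return kullback(p_plus, visible_marginal(ideal, visible))


def measure_gb(p_minus_all, weights, thresholds, temperature):
    """GB: divergence of the ideal distribution from the observed P-(U)."""
    return kullback(ideal_boltzmann(weights, thresholds, temperature), p_minus_all)


# Hypothetical two-unit network with zero weights: the ideal distribution is uniform.
W = [[0.0, 0.0], [0.0, 0.0]]
theta = [0.0, 0.0]
gi = measure_gi({(1,): 1.0}, W, theta, 1.0, visible=[0])      # ln 2
gb = measure_gb(ideal_boltzmann(W, theta, 1.0), W, theta, 1.0)  # 0: observed == ideal
```

With the assumed argument order, GB becomes infinite if some all-unit state never occurs in the observed distribution, so a practical monitor would need enough samples or some smoothing of the observed frequencies.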

[Example]

FIGS. 11(A) and 11(B) correspond to separate positions in FIG. 1 and together show the flowchart of one embodiment. The phase − shown in FIG. 11(A) replaces the phase − processing shown in FIG. 8(B) of FIG. 8 described above.

FIG. 11(B) corresponds to the monitor section 30 shown in FIG. 1; it calculates the measure G, GI, or GB and monitors, at each point in time, whether the operating state is good or bad and how far learning has progressed.

FIGS. 12(A) and 12(B) likewise together show the flowchart of one embodiment.

The phase − shown in FIG. 12(A) corresponds to the phase − processing shown in FIG. 8(B) of FIG. 8 described above. Needless to say, FIGS. 12(A) and 12(B) correspond to the monitor section 30 shown in FIG. 1; they calculate the measure G, GI, or GB and monitor, at each point in time, whether the operating state is good or bad and how far learning has progressed.

[Effect of the invention]

As described above, according to the present invention, whether the operating state is good or bad and how far learning has progressed can be monitored, which makes fast learning possible while applying weight changes according to, for example, equation (7).

In the embodiment shown in FIG. 11, the timing of monitoring and the number of measurements of the frequency with which units become 1, used for probability estimation, can be set independently of the weight update procedure. The embodiment shown in FIG. 12 has the advantages that the operating state during the weight update itself can be measured and that the annealing procedure can be shared with the weight update procedure.

[Brief explanation of the drawing]

FIG. 1 shows the basic configuration of the present invention. FIG. 2 shows the constituent units of a Boltzmann machine network. FIG. 3 lists the state vectors of all units. FIG. 4 is an abstract representation of the external environment. FIG. 5 shows the visible units and hidden units. FIG. 6 shows the state vector of the visible units. FIG. 7 shows the two phases of learning. FIG. 8 is a flowchart of the weight update procedure. FIG. 9 is a flowchart of the state transition procedure. FIG. 10 is a graph of the sigmoid function. FIG. 11 is a flowchart of one embodiment. FIG. 12 is a flowchart of another embodiment.

In the figures, 10 denotes the network, 11 a visible unit, 12 a hidden unit, 20 the learning processing section, and 30 the monitor section.

Claims

(1) A learning processing method in a Boltzmann machine that has a network (10), which includes at least visible units (11) and hidden units (12) and is configured so that the weight W_ij between units S_i and S_j can be set, and a learning processing unit (20), which adjusts the weights W_ij by learning in accordance with the pattern appearance probability P^+(V_α) corresponding to the learning patterns that set and change the states of the visible units (11), characterized in that a monitor unit (30) is provided, and the monitor unit (30) has a function of monitoring at least the progress of the learning and whether the operating state is good or bad.

(2) The learning processing method in a Boltzmann machine according to claim (1), characterized in that the monitor unit (30) uses, as the measure G, the Kullback divergence (formula omitted) between the appearance probability distribution P^+(V_α) of the patterns to be learned and the probability distribution P^-(V_α) with which they actually appeared (where V_α is the state vector of the visible units).

(3) The learning processing method in a Boltzmann machine according to claim (1) or (2), characterized in that the monitor unit (30) uses, as the measure GI, the Kullback divergence (formula omitted) between the appearance probability distribution P^+(V_α) of the patterns to be learned and the ideal value PI(W_ij; V_α) of the Boltzmann distribution for the current weights.

(4) The learning processing method in a Boltzmann machine according to any one of claims (1) to (3), characterized in that the monitor unit (30) uses, as the measure GB, the Kullback divergence (formula omitted) between the ideal Boltzmann distribution PI(W_ij; U_β) for the current weights and the probability distribution P^-(U_β) of the patterns that actually occurred (where U_β is the state vector of all units).