JPH03225482A

JPH03225482A - Neural network for representing the position, the size, or both of a target object

Info

Publication number: JPH03225482A
Application number: JP2018941A
Authority: JP
Inventors: Yoshiki Uchikawa; 内川　嘉樹; Kazuhisa Gohara; 一寿郷原
Original assignee: Nagoya University NUC
Current assignee: Nagoya University NUC
Priority date: 1990-01-31
Filing date: 1990-01-31
Publication date: 1991-10-04
Anticipated expiration: 2010-04-19
Also published as: JPH0736200B2

Abstract

PURPOSE: To perform robust and accurate pattern recognition in the presence of superimposed noise, positional deviation, and size deviation by providing a position network and a size network that detect, respectively, the approximate position and size of a pattern of arbitrary shape and size on a two-dimensional screen. CONSTITUTION: In the position network, forty neuron units in an intermediate layer are connected in a network to the neuron units of an input layer and to two neuron units of an output layer that correspond to x-y orthogonal coordinates. The position network detects the position of the pattern, for example its center of gravity (x, y). In the size network, forty neuron units in an intermediate layer are connected in a network to the neuron units of an input layer and to one neuron unit of an output layer that corresponds to one side of a square circumscribing the target pattern. The size network detects the size of the pattern, for example the length (d) of one side of the square enclosing the pattern. In this way, the position and the size of the target pattern can be detected.

Description

DETAILED DESCRIPTION OF THE INVENTION [Purpose] [Industrial Field of Application] The present invention relates to neural networks, used in image recognition, pattern recognition, and similar tasks, that represent the position, the size, or both of a target object.

[Prior Art] In recent years, research on neural networks that imitate the brain has become active, and applications are being attempted in many fields. The back-propagation model (hereinafter, the BP model), a hierarchically structured neural network model, has been applied to pattern recognition tasks such as handwritten character recognition because of its learning ability and its ability to approximate nonlinear functions. Neural networks based on the conventional BP model are robust to deformation of the pattern to be recognized and to noise, that is, they can compensate for them correctly, but they give almost no consideration to deviations in position and size from the learned patterns. In other words, the conventional BP model has not eliminated positional and size deviations.

[Problem to Be Solved by the Invention] As described above, the BP model has the problem that, depending on the state of the target pattern, it cannot correctly detect the pattern's position and size.

An object of the present invention is to provide a neural network that can accurately detect the position, the size, or both of a target pattern within a two-dimensional field of view.

“Configuration” [Means for Solving the Problem] The neural network for representing the position of a target object according to the present invention is a BP model, which defines the connections between neurons and a learning rule, comprising: input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; output-layer neuron units that output the position of the target pattern in the input two-dimensional image; and intermediate-layer neuron units provided between the input-layer neuron units and the output-layer neuron units. It is characterized in that it detects the position of the pattern, which has an arbitrary shape and size, on the input two-dimensional image.

The neural network for representing the size of a target object according to the present invention is a BP model, which defines the connections between neurons and a learning rule, comprising: input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; output-layer neuron units that output the size of the target pattern in the input two-dimensional image; and intermediate-layer neuron units provided between the input-layer neuron units and the output-layer neuron units. It is characterized in that it detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image.

The neural network for representing the position and size of a target object according to the present invention is a BP model, which defines the connections between neurons and a learning rule, characterized by comprising a position net and a size net. The position net consists of first input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; first output-layer neuron units that output the size of the target pattern in the input two-dimensional image; and first intermediate-layer neuron units provided between the first input-layer neuron units and the first output-layer neuron units; and it detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image. The size net consists of second input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; second output-layer neuron units that output the size of the target pattern in the input two-dimensional image; and second intermediate-layer neuron units provided between the first input-layer neuron units and the second output-layer neuron units; and it detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image.

The neural network for representing the position and size of a target object according to the present invention is a BP model, which defines the connections between neurons and a learning rule, characterized by comprising a position net, a size net, and a framing net. The position net consists of first input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; first output-layer neuron units that output the size of the target pattern in the input two-dimensional image; and first intermediate-layer neuron units provided between the first input-layer neuron units and the first output-layer neuron units; and it detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image. The size net consists of second input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; second output-layer neuron units that output the size of the target pattern in the input two-dimensional image; and second intermediate-layer neuron units provided between the second input-layer neuron units and the second output-layer neuron units; and it detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image. The framing net consists of third input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; third output-layer neuron units, one for each of the four directions (up, down, left, right), whose teacher value is 0 when the frame is to be moved inward and 1 when it is to be moved outward; and third intermediate-layer neuron units provided between the third input-layer neuron units and the third output-layer neuron units; and it corrects the frame generated from the outputs of the position net and the size net so that the range in which the pattern exists can be narrowed down.

[Function] Position, size, and framing have been considered in conventional pattern recognition as normalization steps in the preprocessing for feature extraction, but all of them presuppose sequential processing on address-specified memory. A major feature of the present invention is that it treats these preprocessing steps, too, as pattern-transformation problems based on parallel processing and realizes them through network learning.

With this feature, devising the learning patterns to cover deformation and noise makes it possible to approach human recognition, and the position and size of the target pattern can be detected accurately.

That is, the position net of the present invention solves the problem of detecting the position of a pattern within the field of view as a pattern transformation by the BP model, and the size net of the present invention solves the problem of detecting the size of a pattern within the field of view in the same way, so the position and size of the target pattern can be detected accurately.

The framing net used in the present invention sets an approximate frame circumscribing the pattern, corrects the generated frame, and narrows down the range in which the pattern exists. It can therefore acquire through learning the function of correcting the position and size of the target pattern, and the influence of noise can be removed.

[Example] An embodiment of the position net of the present invention is described with reference to FIG. 1. FIG. 1(a) shows the input two-dimensional image, which is divided into 6 parts in the X direction and 6 parts in the Y direction, for a total of 36 grid cells. The input signal from each grid cell is fed to one of the 36 input-layer neuron units shown in FIG. 1(b). The 40 neuron units of the intermediate layer are connected in a network to the input-layer neuron units and also to the two output-layer neuron units, which correspond to the x-y orthogonal coordinates. The position net detects the position of the pattern, for example its center of gravity (x, y).
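The 36-40-2 structure described above can be sketched as a plain forward pass. This is only an illustration: the sigmoid output function is the one given later in the simulation section, while the bias term, the random weight range, and the names `make_layer`, `forward_layer`, and `position_net` are assumptions that do not come from the patent. The weights here are untrained, so the outputs carry no meaning beyond lying in (0, 1).

```python
import math
import random
from typing import List, Tuple


def sigmoid(x: float) -> float:
    """Output function f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    # One row per unit; the last entry of each row is a bias (assumption).
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward_layer(weights: List[List[float]], inputs: List[float]) -> List[float]:
    # Weighted sum of the inputs plus bias, passed through the sigmoid.
    return [sigmoid(sum(w * v for w, v in zip(row[:-1], inputs)) + row[-1])
            for row in weights]


def position_net(image: List[List[float]],
                 hidden_w: List[List[float]],
                 output_w: List[List[float]]) -> Tuple[float, float]:
    """Map a 6x6 intensity grid to two outputs (x, y), each in (0, 1)."""
    flat = [v for row in image for v in row]   # 36 input units
    hidden = forward_layer(hidden_w, flat)     # 40 intermediate units
    x, y = forward_layer(output_w, hidden)     # 2 output units
    return x, y


rng = random.Random(0)
hidden_w = make_layer(36, 40, rng)
output_w = make_layer(40, 2, rng)
image = [[0.0] * 6 for _ in range(6)]
image[2][3] = 1.0                              # a single lit grid cell
x, y = position_net(image, hidden_w, output_w)
```

The size net and framing net would differ only in the number of output units (1 and 4 respectively).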

An embodiment of the size net of the present invention is described with reference to FIG. 2. FIG. 2(a) shows the input two-dimensional image, which is divided into 6 parts in the X direction and 6 parts in the Y direction, for a total of 36 grid cells.

The input signal from each grid cell is fed to one of the 36 input-layer neuron units shown in FIG. 2(b). The 40 neuron units of the intermediate layer are connected in a network to the input-layer neuron units and to the single output-layer neuron unit, which corresponds to one side of a square circumscribing the target pattern. The size net detects the size of the pattern, for example the length d of one side of the square enclosing the pattern.
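To train the position net and size net described above, each learning pattern needs teacher values (x, y) and d. The patent does not say how these were computed, so the following is a hedged sketch under stated assumptions: the position is taken as the center of gravity of a noise-free binary pattern, d as the side of the smallest axis-aligned square covering its bounding box, and all three values are scaled to [0, 1] to match the range of a sigmoid output. The function name `teacher_values` is hypothetical.

```python
from typing import List, Tuple


def teacher_values(pattern: List[List[int]]) -> Tuple[float, float, float]:
    """Return (x, y, d) for a noise-free binary pattern, each scaled to [0, 1]."""
    rows, cols = len(pattern), len(pattern[0])
    cells = [(r, c) for r in range(rows) for c in range(cols) if pattern[r][c]]
    if not cells:
        raise ValueError("pattern is empty")
    # Center of gravity of the lit cells (cell centers sit at index + 0.5).
    cx = sum(c + 0.5 for _, c in cells) / len(cells)
    cy = sum(r + 0.5 for r, _ in cells) / len(cells)
    # Side of the smallest axis-aligned square that covers the bounding box.
    width = max(c for _, c in cells) - min(c for _, c in cells) + 1
    height = max(r for r, _ in cells) - min(r for r, _ in cells) + 1
    d = max(width, height)
    # Scaling to [0, 1] is an assumption made to fit sigmoid outputs.
    return cx / cols, cy / rows, d / max(rows, cols)


# A 3-row by 2-column block inside a 6x6 grid.
pattern = [[0] * 6 for _ in range(6)]
for r in (1, 2, 3):
    for c in (2, 3):
        pattern[r][c] = 1
x, y, d = teacher_values(pattern)
```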

An embodiment of the framing net of the present invention is described with reference to FIG. 3. FIG. 3(a) shows the input two-dimensional image, which is divided into 6 parts in the X direction and 6 parts in the Y direction, for a total of 36 grid cells.

The input signal from each grid cell is fed to one of the 36 input-layer neuron units shown in FIG. 3(b). The 40 neuron units of the intermediate layer are connected in a network to the input-layer neuron units and to the four output-layer neuron units, which correspond to the four directions up, down, left, and right.

For each of the four directions (up, down, left, right), the framing net outputs 1 for a direction in which the pattern is judged to protrude from the frame and 0 otherwise. When the output is 1 the frame is moved inward, and when it is 0 the frame is moved outward.
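The frame update can be sketched as below, following the rule exactly as this paragraph states it (output 1: move that edge inward; output 0: move it outward). The 0.5 threshold, the one-pixel step, the clamping to the image bounds, the output order (up, down, left, right), and the name `adjust_frame` are all assumptions for illustration.

```python
from typing import Tuple

Frame = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive pixel indices


def adjust_frame(frame: Frame,
                 outputs: Tuple[float, float, float, float],
                 image_size: int,
                 step: int = 1,
                 threshold: float = 0.5) -> Frame:
    """Move each frame edge by one step according to the framing-net outputs."""
    left, top, right, bottom = frame
    out_up, out_down, out_left, out_right = outputs

    def inward(output: float) -> int:
        # +1: move the edge toward the frame center; -1: move it away.
        return 1 if output >= threshold else -1

    top += step * inward(out_up)
    bottom -= step * inward(out_down)
    left += step * inward(out_left)
    right -= step * inward(out_right)

    # Keep the frame inside the image and non-degenerate (assumption).
    last = image_size - 1
    left = max(0, min(left, last))
    top = max(0, min(top, last))
    right = max(left, min(right, last))
    bottom = max(top, min(bottom, last))
    return left, top, right, bottom


new_frame = adjust_frame((4, 4, 11, 11), (0.9, 0.1, 0.2, 0.8), image_size=16)
```

Applying this function repeatedly corresponds to the iterative narrowing of the frame described for the architecture.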

A pattern recognition architecture that applies the position net and the size net is described with reference to FIG. 4. As shown in FIG. 4, the pattern recognition architecture of the present invention is built around four networks: the position net, the size net, the framing net, and the recognition net.

The two networks in the part designated by reference numeral 10 in FIG. 4 extract, respectively, the position and the size of a single pattern present in the input image. The two networks are used in parallel: the position net has only the function of extracting the pattern's position, and the size net has only the function of extracting the pattern's size.

A frame is then fitted to the pattern in the input image on the basis of the outputs of the position net and the size net.

Next, to further improve the accuracy of the frame generated from the outputs of the position net and the size net, the part indicated by reference numeral 12 applies the framing net repeatedly, several times, narrowing down the range in which the pattern exists little by little. The image inside the frame corrected according to the framing net's output is then used as the input to the recognition net.

The part indicated by reference numeral 14 recognizes what the pattern is, and the judgment based on the output of the recognition net is the final answer.

At this stage it is desirable to examine the output of the recognition net several times while the framing net changes the frame.

The architecture used here is characterized by dividing the functions among four networks and simplifying each of them. As shown in FIGS. 1, 2, 3, and 5, each network is based on a three-layer BP model. The functions and characteristics of each net are described below.

(a) Position net. The position net is a network that, given image information in two-dimensional space, extracts one pattern present in it and outputs its position coordinates (x, y). The "position" output by the position net differs from the "center of gravity" of the entire input image that is commonly used in pattern recognition. For example, when noise heavily biased toward the upper left is superimposed, the center of gravity lies far to the upper left of the actual pattern position. With proper generalization, however, the output can be brought close to the position of the presented pattern even in the presence of noise.
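The effect of upper-left noise on the whole-image center of gravity can be checked with a small worked example. The 6x6 grid, the pattern, and the noise blob below are made-up illustration data, not data from the patent, and `center_of_gravity` is a hypothetical helper.

```python
from typing import List, Tuple


def center_of_gravity(image: List[List[float]]) -> Tuple[float, float]:
    """Intensity-weighted mean (x, y) over the whole image, in cell indices."""
    total = sum(sum(row) for row in image)
    x = sum(c * v for row in image for c, v in enumerate(row)) / total
    y = sum(r * v for r, row in enumerate(image) for v in row) / total
    return x, y


# Pattern: a 2x2 block in the lower right of a 6x6 grid.
clean = [[0.0] * 6 for _ in range(6)]
for r in (4, 5):
    for c in (4, 5):
        clean[r][c] = 1.0

# Same pattern with a 2x2 noise blob in the upper-left corner.
noisy = [row[:] for row in clean]
for r in (0, 1):
    for c in (0, 1):
        noisy[r][c] = 1.0

clean_cog = center_of_gravity(clean)   # (4.5, 4.5): on the pattern
noisy_cog = center_of_gravity(noisy)   # (2.5, 2.5): pulled toward the noise
```

The whole-image center of gravity moves two cells toward the noise in each axis, which is the failure the trained position net is meant to avoid.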

(b) Size net. The size net is a network that, given image information in two-dimensional space, extracts one pattern present in it and outputs the side length d of a square large enough to enclose that pattern. A square frame is fitted to the input pattern on the basis of the size net's output. Square frames were used here even for vertically or horizontally elongated patterns, but it has also been confirmed that, if necessary, the size net can instead output the vertical and horizontal side lengths of a rectangle (two outputs).

(c) Framing net. The framing net is a network that corrects the frame generated from the outputs of the position net and the size net and narrows down the range in which the pattern exists. It takes the image information inside the frame as input and outputs, for each of the up, down, left, and right directions, the result of judging whether the pattern inside the frame touches the frame. If the output of the unit for a given direction is close to 1, the pattern is touching the frame in that direction and the frame is moved inward; otherwise the frame is moved outward.

(d) Recognition net. The recognition net is a network that classifies the pattern inside the fitted frame into a category. Given the image information within the frame produced by parts 10 and 12 of FIG. 4, it outputs the category of the pattern present there.

To demonstrate the effectiveness of the pattern recognition architecture based on the position net and the size net, a computer simulation was carried out in which handwritten digits from "1" to "5" were recognized. The results are described below.

The networks used in this simulation all have a three-layer structure, and the output function of each unit in the intermediate and output layers is the sigmoid function f(x) = 1 / (1 + exp(-x)). The number of units in each layer was chosen as shown in Table 1 on the basis of an evaluation of generalization ability.

Table 1
                   Input layer   Intermediate layer   Output layer
Position net            36               40                 2
Size net                36               40                 1
Framing net             36               40                 4
Recognition net         64               50                 5

For the position net and the size net, the input image is quantized with weighting to 6 pixels high by 6 pixels wide; for the framing net, the image inside the frame is quantized to 6 by 6 pixels; and for the recognition net, the image inside the frame is quantized to 8 pixels high by 8 pixels wide. The quantized images are the inputs to the respective networks.
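The patent does not spell out the weighting used in this quantization. One plausible reading, shown here purely as an assumption, is area-weighted averaging: each grid cell takes the mean of the source pixels it covers, with partially covered pixels weighted by their overlap. The function name `quantize` is hypothetical.

```python
import math
from typing import List


def quantize(image: List[List[float]], grid: int) -> List[List[float]]:
    """Area-weighted average of `image` onto a grid x grid array."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * grid for _ in range(grid)]
    for gr in range(grid):
        r0, r1 = gr * rows / grid, (gr + 1) * rows / grid
        for gc in range(grid):
            c0, c1 = gc * cols / grid, (gc + 1) * cols / grid
            acc = 0.0
            for r in range(int(r0), min(rows, math.ceil(r1))):
                # Fraction of source row r that lies inside this grid cell.
                wr = min(r + 1, r1) - max(r, r0)
                for c in range(int(c0), min(cols, math.ceil(c1))):
                    wc = min(c + 1, c1) - max(c, c0)
                    acc += image[r][c] * wr * wc
            out[gr][gc] = acc / ((r1 - r0) * (c1 - c0))
    return out


# A 12x12 image with one lit 2x2 block maps onto exactly one 6x6 grid cell.
src = [[0.0] * 12 for _ in range(12)]
for r in (0, 1):
    for c in (0, 1):
        src[r][c] = 1.0
small = quantize(src, 6)

# A uniform 9x9 image stays uniform even though 9 is not a multiple of 6.
uniform = quantize([[1.0] * 9 for _ in range(9)], 6)
```

The same function with `grid=8` would produce the 64 inputs of the recognition net.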

With these settings, a system was built from the trained networks. After the position and size were extracted, the framing net applied 10 corrections. Framing and recognition were then repeated alternately, and when the recognition net produced the same output three times in a row, that output was taken as the answer and the recognition task ended.
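The control flow of this procedure can be sketched as follows. The three callables stand in for the trained networks and are hypothetical: `locate` combines the position and size nets into an initial frame, `correct` applies one framing-net correction, and `classify` runs the recognition net on the framed image. The `max_rounds` cap is an added safeguard that the patent does not mention.

```python
from typing import Any, Callable, List, Optional, Tuple

Frame = Tuple[int, int, int, int]


def run_recognition(image: Any,
                    locate: Callable[[Any], Frame],
                    correct: Callable[[Any, Frame], Frame],
                    classify: Callable[[Any, Frame], int],
                    n_corrections: int = 10,
                    n_agree: int = 3,
                    max_rounds: int = 100) -> Optional[int]:
    """Frame the pattern, refine the frame, then classify until stable."""
    frame = locate(image)                    # position net + size net
    for _ in range(n_corrections):           # 10 framing corrections
        frame = correct(image, frame)
    history: List[int] = []
    for _ in range(max_rounds):              # alternate framing and recognition
        frame = correct(image, frame)
        history.append(classify(image, frame))
        # Stop once the last n_agree outputs are identical.
        if len(history) >= n_agree and len(set(history[-n_agree:])) == 1:
            return history[-1]
    return None                              # no stable answer within the cap


# Stub networks, used only to exercise the control flow.
calls = {"correct": 0}
labels = iter([1, 2, 2, 2, 5])


def stub_locate(img: Any) -> Frame:
    return (0, 0, 5, 5)


def stub_correct(img: Any, frame: Frame) -> Frame:
    calls["correct"] += 1
    return frame


def stub_classify(img: Any, frame: Frame) -> int:
    return next(labels)


answer = run_recognition(None, stub_locate, stub_correct, stub_classify)
```

With these stubs the loop stops on the fourth recognition round, when the output 2 has appeared three times in a row.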

FIG. 6 shows examples of the simulation results. FIG. 6(a) shows the inputs to the system, and FIG. 6(b) shows the corresponding outputs, that is, the outputs of the recognition net.

These results show that the transformation is invariant to position, size, deformation, and noise.

They also show that the system can handle inputs that are difficult for conventional pattern recognition methods, such as patterns overlaid with blob-like noise and patterns written outside the designated area.

"Effects" According to the present invention, a position net and a size net are provided that detect, respectively, the approximate position and size of a pattern of arbitrary shape and size on a two-dimensional screen, so pattern recognition that is highly robust and accurate against superimposed noise, positional deviation, and size deviation can be performed.

[Brief explanation of drawings]

FIG. 1 is a diagram explaining the function and configuration of the position net of the present invention; FIG. 2 is a diagram explaining the function and configuration of the size net of the present invention; FIG. 3 is a diagram explaining the function and configuration of the framing net of the present invention; FIG. 4 is a diagram explaining the configuration of the pattern recognition architecture of the present invention; FIG. 5 is a diagram explaining the function and configuration of the recognition net; and FIG. 6 is a diagram showing simulation results of pattern recognition using the position net, size net, and framing net of the present invention. 10: processing by the position net and size net; 12: processing by the framing net; 14: processing by the recognition net.

Claims

(1) A neural network for representing the position of a target object, being a BP model that defines connections between neurons and a learning rule, comprising: input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; output-layer neuron units that output the position of a target pattern in the input two-dimensional image; and intermediate-layer neuron units provided between the input-layer neuron units and the output-layer neuron units, characterized in that the neural network detects the position of the pattern, which has an arbitrary shape and size, on the input two-dimensional image.

(2) A neural network for representing the size of a target object, being a BP model that defines connections between neurons and a learning rule, comprising: input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell; output-layer neuron units that output the size of a target pattern in the input two-dimensional image; and intermediate-layer neuron units provided between the input-layer neuron units and the output-layer neuron units, characterized in that the neural network detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image.

(3) A neural network for representing the position and size of a target object, being a BP model that defines connections between neurons and a learning rule, characterized by comprising: a position net that consists of first input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell, first output-layer neuron units that output the size of a target pattern in the input two-dimensional image, and first intermediate-layer neuron units provided between the first input-layer neuron units and the first output-layer neuron units, and that detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image; and a size net that consists of second input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell, second output-layer neuron units that output the size of the target pattern in the input two-dimensional image, and second intermediate-layer neuron units provided between the first input-layer neuron units and the second output-layer neuron units, and that detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image.

(4) A neural network for representing the position and size of a target object, being a BP model that defines connections between neurons and a learning rule, characterized by comprising: a position net that consists of first input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell, first output-layer neuron units that output the size of a target pattern in the input two-dimensional image, and first intermediate-layer neuron units provided between the first input-layer neuron units and the first output-layer neuron units, and that detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image; a size net that consists of second input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell, second output-layer neuron units that output the size of the target pattern in the input two-dimensional image, and second intermediate-layer neuron units provided between the second input-layer neuron units and the second output-layer neuron units, and that detects the size of the pattern, which has an arbitrary shape and size, on the input two-dimensional image; and a framing net that consists of third input-layer neuron units that receive an input two-dimensional image divided into a grid, each taking the intensity signal of one grid cell, third output-layer neuron units, one for each of the four directions (up, down, left, right), whose teacher value is 0 when the frame is to be moved inward and 1 when it is to be moved outward, and third intermediate-layer neuron units provided between the third input-layer neuron units and the third output-layer neuron units, and that corrects the frame generated from the outputs of the position net and the size net so that the range in which the pattern exists can be narrowed down.