JPH08272971A

JPH08272971A - Object recognizing method

Info

Publication number: JPH08272971A
Application number: JP7074820A
Authority: JP
Inventors: Katsuji Imai; 勝次今井
Original assignee: Toyota Motor Corp
Current assignee: Toyota Motor Corp
Priority date: 1995-03-31
Filing date: 1995-03-31
Publication date: 1996-10-18

Abstract

PURPOSE: To enable accurate recognition of a line object with little influence from noise, by inputting the image data of each divided slit region into a neural network and having the neural network output the position of the line object. CONSTITUTION: Input image data stored in an image storage memory 12 are read into a computer 14 slit by slit. Each slit is a strip-shaped region whose longitudinal direction is approximately perpendicular to the direction in which the line object 16 contained in the input image data extends. The input image data are divided into these slits, and the image of each slit is supplied to the neural network. A slit is not a single line of pixels one pixel wide, but a strip-shaped region with a fixed width larger than the noise. The neural network can therefore easily discriminate between noise and the line object 16 to be recognized, and the coordinates of the line object can be output while the influence of noise is kept small.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a method for recognizing an object, and in particular to a recognition method capable of detecting the state of a line object, such as its length and the angle at which it is placed.

[0002]

[Prior Art] Automated mounting inspection of wire bonding and automated routing of wire harnesses have long been difficult to realize. One conceivable way to achieve them is to recognize the shape of the wire by image processing and to manipulate the wire on that basis. Recognizing the shape of each wire individually is expected to give more reliable inspection accuracy.

[0003] A conventional technique that could be used to recognize the distorted shape of such a line object (string, wire harness, and the like) is thinning, which is in common use as a character recognition technique.

[0004] For example, Japanese Patent Laid-Open No. 5-248839 describes a method for recognizing a linear tissue. The method described there basically extracts feature quantities of the linear tissue by means of thinning, one of the classical image processing techniques mentioned above.

[0005] When line objects such as wire harnesses, ropes, rubber tubes, and paper strings are to be assembled or packed automatically, the object must be caught with a hook or grasped with a robot hand. The information indispensable for such work is where to grasp. When a line object of low rigidity is placed on a table and left without sufficient tension applied, residual stress distorts it. The shape of such distortion is generally indeterminate and cannot be predicted. The shape of the distortion that arises after the line object has been placed on the table must therefore be recognized by touch or by vision.

[0006] For visual recognition, the line object to be recognized must have a uniform thickness, must have no pattern on its surface, and must be uniformly illuminated. Moreover, its shape can be recognized only if no foreign object ever appears in the image. Only when these conditions are met can a single line representing the desired line object be obtained by applying thinning to the whole image of the line object.

[0007] In practice, however, such ideal conditions almost never arise. Foreign objects often appear in the image together with the target, and the object to be recognized often carries a pattern. When thinning is applied to such an image, multiple lines are produced even if there is only one target line object. Recognizing a line object by simple thinning alone is therefore almost impossible in practice. This drawback has long been recognized and research has been done to overcome it, but no effective recognition algorithm has yet been developed.

[0008]

[Problems to Be Solved by the Invention] The thinning approach has the drawback that even extremely small noise shows up as a line. It is therefore necessary to judge, for each of the resulting lines, whether it belongs to the target object.

[0009] The present invention has been made in view of the above problem, and its object is to provide an object recognition method capable of accurately recognizing a line object.

[0010]

[Means for Solving the Problems] To solve the above problem, a first aspect of the present invention is an object recognition method comprising: a dividing step of dividing image data containing a line object into a plurality of slit regions whose longitudinal direction is approximately perpendicular to the direction in which the line object extends; and a coordinate value output step of sequentially inputting the image data belonging to the slit regions obtained in the dividing step into a neural network and causing the neural network to sequentially output the position of the line object along the longitudinal direction of each slit region; wherein the line object is recognized by outputting the coordinate values.

[0011] To solve the above problem, a second aspect of the present invention is the object recognition method of the first aspect, wherein each of the plurality of slit regions in the dividing step overlaps the adjacent slit regions.

[0012] To solve the above problem, a third aspect of the present invention is the object recognition method of the first or second aspect, wherein the neural network is obtained through: a noise mixing step of mixing noise into image data that contain a line object and serve as teacher data; a teacher data dividing step of dividing the noise-mixed image data into a plurality of slit regions whose longitudinal direction is approximately perpendicular to the direction in which the line object extends; and a neural network learning step of learning with the divided slit regions as teacher data.

[0013]

[Operation] The neural network in the first aspect of the present invention receives the image data contained in a strip region and outputs the position of the line object within that image data. The coordinates of the line object can therefore be output while the influence of noise is kept small.

[0014] In the second aspect of the present invention, each of the strip regions overlaps the adjacent strip regions. The neural network can therefore output more accurate coordinates.

[0015] In the third aspect of the present invention, the noise mixing step mixes noise into the image data that contain the recognition target and serve as teacher data, and the teacher data dividing step divides that image data into a plurality of strip regions. The neural network then learns with these strip regions as teacher data.

[0016] The teacher data naturally include a curve representing the line object to be recognized, but in the present invention noise is additionally mixed into the image data containing this line object. Training the neural network with such noise-mixed data as teacher data yields a neural network that is robust to noise.

[0017] In the first and second aspects of the present invention, the image data applied to the neural network are divided into strip regions. Accordingly, in the third aspect as well, the image data created by mixing in noise are divided into strip regions before being applied to the neural network.

[0018]

[Embodiments] A preferred embodiment of the present invention is described below with reference to the drawings.

[0019] As described above, the present inventor considers that thinning, a conventional classical image processing technique, only complicates the problem and makes recognition of line objects difficult. The inventor therefore proposes an apparatus that obtains the shape of a line object without going through a thinning procedure.

[0020] As shown in FIG. 1, the line object recognition apparatus of a preferred embodiment of the present invention includes an image input device 10 and an image storage memory 12 that stores the input image output by the image input device 10. The image data obtained by the image input device 10 are ultimately supplied to a computer 14, where the coordinates of the target line object 16 are calculated. In the computer 14 a neural network is implemented, for example in software, and outputs the coordinate values of the line object 16 for the image data stored in the image storage memory 12. Although the neural network in this embodiment is implemented in software within the computer 14, dedicated hardware may also be used.

[0021] The image input device 10 obtains image data of the line object 16 to be recognized, which is placed on a test table 18. Its detailed configuration is shown in FIG. 2. As shown in FIG. 2, the image input device 10 consists of a camera 10a and an auxiliary mechanism 10b for moving the camera 10a. Although not shown in FIG. 2, the image data output by the camera 10a are supplied to the image storage memory 12 through a predetermined interface.

[0022] First, the target line object 16 is photographed by the camera 10a. If the line object 16 is too long for its full length to be captured within the camera's resolution, the camera may be given a structure that rotates, or moves linearly, along the longitudinal direction of the line object. If the target line object 16 is strongly curved, it is preferable to add a structure that lets the camera make a similar motion to the left and right.

[0023] As shown in FIG. 2, the auxiliary mechanism 10b for moving the camera 10a consists of a vertical camera movement 10b(i) and a horizontal camera movement 10b(ii), of which normally only the horizontal camera movement 10b(ii) operates. The following description assumes that the vertical camera movement 10b(i) does not move. The role of the vertical camera movement 10b(i) is to bring the line object back into the field of view when it strays outside it. In other words, the following description assumes that the line object 16 stays within the field of view.

[0024] A conventional CCD camera or the like can be used as the camera 10a, but any existing camera that can convert visual information into an electrical signal will do.

[0025] The image data captured by the camera 10a in this way are stored in the image storage memory 12, which temporarily holds the image data sent from the camera 10a. Various storage devices, such as a magnetic disk or an IC card, can be used. The image storage memory 12 may also be provided inside the computer 14. In the following description, the image data stored in the image storage memory 12 are called the input image data.

[0026] FIG. 3 is a diagram illustrating the principle of the method of recognizing the line object 16 that characterizes this embodiment. As shown in FIG. 3, the input image data are read into the computer 14 slit by slit. A slit is a strip-shaped region whose longitudinal direction is approximately perpendicular to the direction in which the line object contained in the input image data extends. The input image data are divided into a plurality of such slits, and the image of each slit is supplied to the neural network. The characteristic feature of this embodiment is that the image data are supplied to the neural network slit by slit. Note that the short side of the slit is set longer than the noise. In other words, the slit is not a single column of pixels one pixel wide but a strip-shaped region with a fixed width larger than the noise. The neural network is therefore expected to distinguish easily between noise and the line object to be recognized: the line object 16 always crosses the slit region, whereas the noise is smaller than the slit width, so the two can be clearly told apart.

[0027] As shown in FIG. 3, the long side of the slit is equal to the length of one side of the input image. In this embodiment the input image data are binarized with a predetermined threshold and used as a binary image. This reduces the amount of information in the image data and simplifies the neural network. Image data with gray levels could in principle be handled as they are, but the neural network would then become so large that implementation is considered almost impossible.

[0028] As shown in FIG. 3, the neural network outputs the coordinate value of the line object to be recognized. This coordinate value represents the position along the long side of the slit. In this embodiment, the long-side direction of the slit is taken as the Y axis and the short-side direction as the X axis. Combining the mean X coordinate of the pixels in the slit image with the coordinate value Y(t) obtained when that slit image is supplied to the neural network gives the coordinates of the line object to be recognized. A coordinate value is obtained for each of the slits, so that a set of coordinates of positions the line object may occupy is finally obtained as the recognition result.

[0029] As the slit position slides, the output of the neural network changes. Observing this change reveals the state of the line object shown in the input image. That is, the slice patterns of a horizontally long image are converted into a time-series signal, and by processing the converted signal the shape of the target line object can be obtained as a function without going through thinning.

[0030] In this embodiment the input image data are divided into a plurality of slits (regions), and each slit overlaps the adjacent slits slightly. For example, the slit can be 256 pixels along its long side and 64 pixels along its short side. In that case it is preferable that, of the 64 pixels along the short side, about 8 pixels, for example, overlap the adjacent slit. Overlapping the pixels partially in this way makes highly accurate recognition of the line object possible.
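The overlapping division described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_slits`, the row-major list-of-lists image layout, and the choice of an image 288 pixels wide are assumptions made for the example; only the 64-pixel width and 8-pixel overlap come from the text.

```python
def split_into_slits(image, slit_width=64, overlap=8):
    """Split a row-major image (list of rows) into vertical slits.

    Each slit spans the full image height and `slit_width` columns;
    consecutive slits share `overlap` columns. (Hypothetical helper.)
    """
    if not 0 <= overlap < slit_width:
        raise ValueError("overlap must be in [0, slit_width)")
    width = len(image[0])
    step = slit_width - overlap  # horizontal advance from one slit to the next
    slits = []
    for x0 in range(0, width - slit_width + 1, step):
        # Take columns x0 .. x0 + slit_width - 1 from every row.
        slits.append([row[x0:x0 + slit_width] for row in image])
    return slits


# Example: 256 rows x 288 columns, each pixel holding its own column index.
image = [[x for x in range(288)] for _ in range(256)]
slits = split_into_slits(image)
print(len(slits), len(slits[0]), len(slits[0][0]))  # 5 256 64
```

With a step of 64 − 8 = 56 pixels, the last 8 columns of one slit are the first 8 columns of the next.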

[0031] The characteristic feature of this embodiment is that the input image data are divided into slits wider than the noise and the pixel data belonging to each slit are supplied to the neural network. This configuration allows the line object 16 to be recognized without being affected by noise. Furthermore, in this embodiment the slits are not chosen completely independently of one another. Providing a small overlapping portion between adjacent slits makes the change from slit to slit smooth, and as a result the accuracy of the neural network output can be improved.

[0032] Because the output of the neural network is the value of a sigmoid function, it is generally a number between 0 and 1. Multiplying it by the length of the Y axis of the input image data gives the concrete coordinate value.

[0033] The full range (0 to 1) of the sigmoid function need not be used. For example, when the range 0.2 to 0.8 is used, the concrete coordinate value is obtained as (coordinate value) = ((neural network output − 0.2) / 0.6) × (length of the Y axis). Specifically, it is preferable to use the range 0.2 to 0.8.
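As a small worked example of the rescaling above, the following sketch maps a network output in the range 0.2 to 0.8 onto a pixel coordinate. The function and parameter names are illustrative assumptions; the formula itself is the one stated in the paragraph.

```python
def output_to_coordinate(output, y_length, lo=0.2, hi=0.8):
    """Map a sigmoid output in [lo, hi] to a coordinate in [0, y_length].

    Implements (output - lo) / (hi - lo) * y_length; with lo = 0.2 and
    hi = 0.8 the divisor is 0.6, as in the text. (Hypothetical helper.)
    """
    return (output - lo) / (hi - lo) * y_length


# An output in the middle of the used range lands near the middle of a
# 256-pixel Y axis.
print(round(output_to_coordinate(0.5, 256), 6))
```

Restricting the range keeps the targets away from the flat tails of the sigmoid, where very large weighted sums would be needed to reach the output.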

[0034] As described above, this embodiment is characterized in that the input image data are divided into slit image data and the coordinate values are calculated by a neural network. This embodiment uses a neural network of the type described in D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, "Parallel Distributed Processing", MIT Press (1986). The configuration of the neural network used is shown in FIG. 4.

[0035] A neural network is made up of a plurality of units (neurons). Every unit has the following function. First, every unit has connections on both its input side and its output side, that is, two kinds of connection: (1) connections that receive input from outside or from preceding units, and (2) connections that deliver output to the outside or to following units. Each of these connections has a connection weight of arbitrary value. Each received signal is multiplied by the weight of the connection over which it arrived, and the sum of the products is calculated. This sum is then converted by a nonlinear transformation (for example, a sigmoid function), and the converted value becomes the output value of the unit.
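The unit computation described above (weighted sum followed by a sigmoid) can be sketched as below. The names `unit_output` and `layer_output` are illustrative, and the bias term is an assumption that the text does not mention; it defaults to zero so that the sketch reduces to the plain weighted sum.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def unit_output(inputs, weights, bias=0.0):
    """Output of one unit: sigmoid of the weighted sum of its inputs.

    `bias` is an assumption not stated in the text (default 0).
    """
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)


def layer_output(inputs, weight_rows, biases):
    """Outputs of a whole layer, one row of weights per unit."""
    return [unit_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Two inputs feeding a layer of two units.
hidden = layer_output([1.0, 0.0], [[0.0, 5.0], [2.0, -1.0]], [0.0, 0.0])
print(hidden[0])  # 0.5, because the weighted sum for the first unit is 0
```

Chaining `layer_output` calls (input to intermediate, intermediate to output) gives a forward pass through a layered network of this kind.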

[0036] As shown in FIG. 4, the units fall into three types: input units 30, which receive external input; output units 40, which send signals to the outside; and intermediate units 50, which have no contact with the outside at all. The units are divided into groups of one or more units each, and the groups are arranged in layers.

[0037] As long as there is at least one layer of units that perform the nonlinear transformation, there may be more than one intermediate layer. Connections other than those shown in FIG. 4 are also permitted in the neural network. Furthermore, the neural network may be implemented either in dedicated hardware or in software, and either may be chosen in view of cost and processing speed.

[0038] The line object recognition algorithm of this embodiment is shown in the flowchart of FIG. 5.

[0039] First, in step ST5-1, the captured image is binarized. This is done to emphasize the line object 16 being sought and to make the pixel gray levels uniform. Any conventionally known technique can be used to set the threshold for binarization. A poorly chosen threshold may cause foreign objects to be emphasized as well and to remain in the binarized image. However, the neural network has been given, through learning, the ability to ignore small noise, so it does not matter if some foreign objects or noise remain in the binarized image.
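A minimal sketch of fixed-threshold binarization follows. The text leaves the thresholding technique open, so the fixed threshold of 128, the assumption that the line object is brighter than the background, and the function name are all illustrative choices rather than the patent's method.

```python
def binarize(image, threshold=128):
    """Turn a grayscale image (rows of 0-255 values) into a 0/1 image.

    Pixels at or above `threshold` become 1, others 0. The threshold
    value and the bright-object polarity are assumptions.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in image]


gray = [
    [10, 200, 30],
    [127, 128, 255],
]
print(binarize(gray))  # [[0, 1, 0], [0, 1, 1]]
```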

[0040] For ease of explanation, the image after binarization is hereinafter called the binary image. In accordance with the recognition principle of this embodiment described above, the input image itself could be divided by slits, that is, scanned by the slit. However, a binary image has a narrower range of gray levels and therefore places a smaller burden on the neural network. The following description of this embodiment accordingly assumes that the binary image is scanned.

[0041] In step ST5-2, the slit position is initialized. In its initial state the slit is at the far left of the binary image (as viewed). In the following description, the portion of the binary image that appears within the slit region is called the slit image. Let the resolution of the binary image be N (horizontal) × M (vertical) pixels, the slit size be L × N pixels, and the step width of one slit scan be P pixels. If the step count of the slit scan is T, the horizontal coordinate X of the upper-left point of the region of the binary image currently covered by the slit is X = P × (T − 1), where 0 ≤ X ≤ M − P. The end point of the scan is defined by taking Tmax as the largest value of T that satisfies this condition.
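The scan positions X = P × (T − 1) can be enumerated as in the sketch below. The stopping rule used here, that the slit must lie entirely inside the image, is an assumption made for the example and is expressed with hypothetical parameter names (`image_width`, `slit_width`, `step`) rather than the symbols of the text.

```python
def slit_positions(image_width, slit_width, step):
    """Left-edge X coordinates for step counts T = 1, 2, ..., T_max.

    X = step * (T - 1); scanning stops once the slit would extend past
    the right edge of the image (an assumed stopping rule).
    """
    positions = []
    t = 1
    while step * (t - 1) + slit_width <= image_width:
        positions.append(step * (t - 1))
        t += 1
    return positions


# A 288-pixel-wide image, 64-pixel slits, 56-pixel step (8-pixel overlap).
positions = slit_positions(288, 64, 56)
t_max = len(positions)
print(positions, t_max)  # [0, 56, 112, 168, 224] 5
```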

[0042] The slit image is input to the neural network in step ST5-3. The slit image is projected onto the input layer of the neural network, that is, the value of each pixel in the slit image is supplied to the input layer. The resolution of the slit image may be changed at this point. For example, if the slit is 256 × 64 pixels, the resolution may be reduced to about 12 × 3 pixels by grouping predetermined sets of pixels and taking their average. This is because the more units the neural network has, the slower the software runs and the harder a hardware implementation becomes.
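The resolution reduction by averaging groups of pixels might look like the following sketch. How the pixels are grouped is not specified in the text, so the near-equal block partition used here and the function name `downsample` are assumptions.

```python
def downsample(slit, out_rows=12, out_cols=3):
    """Reduce a slit image by averaging pixels over a grid of blocks.

    The input is split into out_rows x out_cols blocks of nearly equal
    size and each block is replaced by its mean. Assumes the input has
    at least out_rows rows and out_cols columns.
    """
    in_rows, in_cols = len(slit), len(slit[0])
    result = []
    for r in range(out_rows):
        r0 = r * in_rows // out_rows
        r1 = (r + 1) * in_rows // out_rows
        row_out = []
        for c in range(out_cols):
            c0 = c * in_cols // out_cols
            c1 = (c + 1) * in_cols // out_cols
            block = [slit[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            row_out.append(sum(block) / len(block))
        result.append(row_out)
    return result


# A 256 x 64 binary slit reduced to 12 x 3 inputs for the network.
slit = [[1] * 64 for _ in range(256)]
small = downsample(slit)
print(len(small), len(small[0]))  # 12 3
```

The 36 averaged values, rather than 16,384 raw pixels, would then be fed to the input units.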

[0043] The output of the neural network is calculated in step ST5-4. For each step count T, the neural network outputs the position coordinate y(T) of the point at which the line object 16 passes through the center of the slit at that step count, and a judgment value j(T) indicating whether a line is present in the frame. How y(T) is taken is shown in FIG. 6. As shown in FIG. 8, the center of the line object 16, which has a certain thickness, is taken as y(T).

[0044] The horizontal position coordinate can be expressed as x(T) = P × (T − 1/2). The values y(T) and j(T) are stored in the image storage memory 12 in order of the step count T. Scanning the line object thus yields, albeit discretely, the output value y(T) corresponding to each scan position x. In other words, in this embodiment the position of each small portion of the line object 16 shown in the image is detected over the whole of the line object 16. In addition, obtaining j(T) makes it easy to detect where the two ends of the line object 16 are located.
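The way the per-step outputs combine into a list of points can be sketched as follows. The cut-off of 0.5 on the judgment value, the rescaling of the position output from the 0.2 to 0.8 range, and the name `trace_line` are assumptions for illustration; x uses the formula P × (T − 1/2) given above.

```python
def trace_line(outputs, step, y_length, presence_threshold=0.5,
               lo=0.2, hi=0.8):
    """Turn per-step network outputs into (x, y) points on the line.

    `outputs` is a list of (y_out, j_out) pairs for T = 1, 2, ...
    Steps whose judgment value j_out is below `presence_threshold`
    (an assumed cut-off) are treated as containing no line.
    """
    points = []
    for t, (y_out, j_out) in enumerate(outputs, start=1):
        if j_out < presence_threshold:
            continue
        x = step * (t - 0.5)                        # x(T) = P * (T - 1/2)
        y = (y_out - lo) / (hi - lo) * y_length     # rescale to pixels
        points.append((x, y))
    return points


outputs = [(0.5, 0.1), (0.5, 0.9), (0.8, 0.9), (0.2, 0.2)]
points = trace_line(outputs, step=56, y_length=256)
ends = (points[0], points[-1]) if points else None  # two ends of the line
print([x for x, _ in points])  # [84.0, 140.0]
```

The first and last steps at which a line is judged present give the two ends of the line object.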

[0045] The general method of calculating the output of a neural network of the type used in this embodiment is described in D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, "Parallel Distributed Processing", MIT Press (1986). This embodiment follows the method described there as it is. Where an improvement to that method is proposed, such as faster learning, it has no adverse effect on the performance of the object recognition apparatus of this embodiment as long as the improvement addresses the problem of searching for a minimum of the error function. M. I. Jordan has also shown that a comparable degree of accuracy is obtained with a method in which the output value y(T) is fed back into one of the input units (a recurrent neural network) (see FIG. 13).

[0046] Effective learning is also necessary to obtain accurate output from the neural network. The learning of the neural network used in this embodiment likewise follows D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, "Parallel Distributed Processing", MIT Press (1986).

[0047] To obtain the inputs and outputs used for learning, a pseudo input image and a teacher function are created by the procedure described below. The pseudo input image is scanned with slits of the same size as in actual use, and the coordinate value at each step count is calculated from the teacher function and supplied as the target output.

[0048] The pseudo input image (an example is shown in FIG. 7) is created by generating a pseudo line object to be recognized under the same conditions as the actual input image (line thickness, gray level, resolution), embedding it in the image, and then artificially adding noise of a certain magnitude.

[0049] The method of creating the pseudo input image is described below.

[0050] First, a curve representing the shape of a pseudo line object is created. Each such curve is created as a spline function that does not extend beyond the pseudo image. Taking the resolution of the pseudo input image to be U × V pixels, the concrete procedure is as follows. First, the number of reference points is chosen in the range from 0 to n (the value of n depends on the resolution of the input image and on the degree of deformation to be learned). For example, FIG. 8 shows eight reference points placed in an N × M image. Next, the reference points set in FIG. 8 are interpolated with an arbitrary spline function; an example of the interpolated result is shown in FIG. 9. The curve represented by this spline function is then thickened to a predetermined width, which may also be varied each time by a random number. This state is shown in FIG. 10. Finally, random noise is added to the image to complete the pseudo input image. The final result is shown in FIG. 11.
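The four stages above (reference points, spline interpolation, thickening, noise) can be sketched in plain Python. Several simplifications here are assumptions rather than part of the described method: the reference points are equally spaced in the lateral direction with random heights, the spline is a Catmull-Rom cubic, the curve is clamped so it stays inside the image, the image is binary, and the noise flips individual pixels. All names and parameter values are hypothetical.

```python
import random


def catmull_rom(ys, h, x):
    """Interpolate equally spaced reference values ys (spacing h) at position x."""
    i = min(int(x // h), len(ys) - 2)
    t = x / h - i
    p0 = ys[max(i - 1, 0)]          # end points are duplicated at the borders
    p1, p2 = ys[i], ys[i + 1]
    p3 = ys[min(i + 2, len(ys) - 1)]
    return 0.5 * (2 * p1 + (p2 - p0) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3)


def make_pseudo_image(U, V, n_points=8, thickness=3, noise_rate=0.02, seed=0):
    """Return (image, curve): V rows by U columns of 0/1 pixels and the centre line."""
    rng = random.Random(seed)
    margin = thickness
    # Stage 1: reference points (random heights, equally spaced laterally).
    refs = [rng.uniform(margin, V - 1 - margin) for _ in range(n_points)]
    h = (U - 1) / (n_points - 1)
    # Stage 2: spline through the reference points, clamped to stay in the image.
    curve = [min(max(catmull_rom(refs, h, u), margin), V - 1 - margin)
             for u in range(U)]
    # Stage 3: thicken the curve to the given width.
    image = [[0] * U for _ in range(V)]
    for u, c in enumerate(curve):
        for v in range(V):
            if abs(v - c) <= thickness / 2:
                image[v][u] = 1
    # Stage 4: add random noise by flipping pixels.
    for v in range(V):
        for u in range(U):
            if rng.random() < noise_rate:
                image[v][u] ^= 1
    return image, curve


image, curve = make_pseudo_image(U=64, V=32)
```

Because the centre line `curve` is produced along with the image, the target values for learning are available at no extra cost.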

[0051] Creation of the teacher function. The teacher function consists of the values that the spline function, set when the pseudo input image described above was generated, takes at each slit position. That is, the teacher function is obtained as a by-product while the pseudo input image is being created.
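A minimal sketch of how training pairs might be assembled from a pseudo input image and its teacher function: each slit supplies the input pixels, and the curve evaluated at the slit centre x(T) = P × (T − 1/2) supplies the target. The non-overlapping slits, the row-major pixel order, the normalisation of the target to the range 0 to 1, and the straight-line example curve are all assumptions made for illustration.

```python
def slit_center(T, P):
    """Lateral coordinate of the centre of the T-th slit (T = 1, 2, ...)."""
    return P * (T - 0.5)


def make_training_pairs(image, curve_fn, P):
    """Cut the image into slits of width P and pair each with its teacher value."""
    V, U = len(image), len(image[0])
    pairs = []
    for T in range(1, U // P + 1):
        left = (T - 1) * P
        # Slit pixels in row-major order: V rows of P pixels each.
        pixels = [image[v][u] for v in range(V) for u in range(left, left + P)]
        # Teacher value: curve height at the slit centre, scaled to [0, 1].
        target = curve_fn(slit_center(T, P)) / (V - 1)
        pairs.append((pixels, target))
    return pairs


def example_curve(x):
    """Hypothetical teacher curve: a straight line rising to the right."""
    return 2 + 0.25 * x


# An 8-row by 12-column pseudo image containing the curve, without noise.
image = [[1 if v == round(example_curve(u)) else 0 for u in range(12)]
         for v in range(8)]
pairs = make_training_pairs(image, example_curve, P=4)
```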

[0052] The output of the neural network is stored in memory. This storage is performed in step ST5-5 of FIG. 5. The outputs are stored in the output storage memory in slit-position order (that is, each value is simply written to the next write position in turn). The storage format is not restricted.

[0053] Next, in step ST5-6, T is incremented. This has the effect of shifting the slit position by one step.

[0054] In step ST5-7, it is determined whether the slit position has reached the end point. If it has not, the processing from step ST5-3 onward is executed at the next slit position. If it has, the processing of step ST5-8, namely the signal processing of the neural network output, is performed.

[0055] The signal processing of the neural network output is performed in step ST5-8. When scanning of the line object is complete, that is, when the end point has been reached, the stored output values of the neural network are processed.
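The loop formed by steps ST5-3 through ST5-8 can be sketched as below. The network is replaced by a stand-in that returns the brightness-weighted row centroid of the slit; it is not a neural network and is used only so that the sketch runs. The slit layout (non-overlapping, width P, row-major), the end-point test, and all names are assumptions; the image is assumed to be at least one slit wide.

```python
def make_centroid_stub(P):
    """Stand-in for the trained network: row centroid of the lit pixels in a slit."""
    def network(slit):
        total = sum(slit)
        if total == 0:
            return None  # nothing visible in this slit
        return sum((k // P) * value for k, value in enumerate(slit)) / total
    return network


def recognize(image, network, P, postprocess):
    """Scan the image slit by slit, store each output, then post-process."""
    V, U = len(image), len(image[0])
    stored = []  # plays the role of the output storage memory
    T = 1
    while True:
        # ST5-3 onward: read the slit at the current position and run the network.
        left = (T - 1) * P
        slit = [image[v][u] for v in range(V) for u in range(left, left + P)]
        y = network(slit)
        stored.append(y)        # ST5-5: store the output in slit-position order
        T += 1                  # ST5-6: move the slit by one step
        if T * P > U:           # ST5-7: has the slit position reached the end?
            break
    return postprocess(stored)  # ST5-8: process the stored outputs


# A 6-row by 8-column image with a horizontal line on row 3.
image = [[1 if v == 3 else 0 for _ in range(8)] for v in range(6)]
result = recognize(image, make_centroid_stub(4), P=4, postprocess=list)
```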

[0056] To represent the neural network output y(T) as a graph over the scanning position T, the stored values are plotted on the T-Y plane; the resulting graph looks, for example, like FIG. 12. As FIG. 12 shows, the points representing the neural network output y(T) form a set of discrete points that includes the errors characteristic of neural network outputs.

[0057] To obtain a continuous curve from these discrete points, they are interpolated with a spline function. The shape of the line object 16 is then obtained as a smooth function (curve). Because the shape data is available as a function, numerical data is easy to generate from it, and the result can readily be used for automatic control.
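One way to carry out such an interpolation is a natural cubic spline through the stored points. The text does not say which spline is used, so the natural end conditions (zero second derivative at both ends) are an assumption, as are the function name and the sample data.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function interpolating (xs, ys) with a natural cubic spline.

    xs must be strictly increasing and contain at least two points.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = (3 * (ys[i + 1] - ys[i]) / h[i]
                    - 3 * (ys[i] - ys[i - 1]) / h[i - 1])
    # Forward sweep of the tridiagonal system for the second-derivative terms.
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        pivot = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / pivot
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / pivot
    # Back substitution gives the polynomial coefficients of each segment.
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def spline(x):
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return spline


# Hypothetical stored outputs y(T) at slit centres x(T).
xs = [2.0, 6.0, 10.0, 14.0, 18.0]
ys = [0.40, 0.44, 0.52, 0.50, 0.47]
shape = natural_cubic_spline(xs, ys)
midpoint_value = shape(8.0)  # height of the line object between two slits
```

The returned function can be evaluated at any lateral position, which is what makes the shape convenient to pass on to a control routine.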

[0058]

[Effects of the Invention] As described above, according to the first aspect of the invention, image data is divided into a plurality of slit areas, the image data of each slit area is input to a neural network, and the position of the line object is output from the neural network. This yields an object recognition method that is robust to noise and can recognize a line object accurately.

[0059] According to the second aspect of the invention, each of the plurality of slit areas overlaps its adjacent slit areas, so the data changes more smoothly from slit to slit, which yields a more accurate object recognition method.
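Overlapping slit areas can be generated by stepping the slit by less than its own width. The sketch below assumes slits described by half-open column ranges; the function name and the sizes are hypothetical.

```python
def slit_ranges(U, width, pitch):
    """Return half-open column ranges [left, right) of slits across U columns.

    Adjacent slits overlap by (width - pitch) columns when pitch < width.
    """
    if not 0 < pitch <= width:
        raise ValueError("pitch must be positive and no larger than width")
    return [(left, left + width) for left in range(0, U - width + 1, pitch)]


ranges = slit_ranges(U=16, width=4, pitch=2)
# Number of columns shared by each pair of neighbouring slits.
overlaps = [a_right - b_left
            for (_, a_right), (b_left, _) in zip(ranges, ranges[1:])]
```

Setting `pitch` equal to `width` gives the non-overlapping case.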

[0060] According to the third aspect of the invention, the neural network is trained using teacher data created by mixing noise into image data containing a line object and dividing the noise-mixed image data into slit areas. This allows effective learning and, in turn, yields an object recognition method capable of highly accurate recognition.

[Brief description of drawings]

FIG. 1 is a configuration diagram showing the overall configuration of the object recognition device of this embodiment.

FIG. 2 is a configuration diagram showing the detailed configuration of the image input device 10.

FIG. 3 is an explanatory diagram illustrating the principle of object recognition according to this embodiment.

FIG. 4 is a configuration diagram illustrating the configuration of the neural network used in this embodiment.

FIG. 5 is a flowchart showing the flow of operations of the object recognition method in this embodiment.

FIG. 6 is an explanatory diagram illustrating a method of calculating the value of y(T).

FIG. 7 is an explanatory diagram showing an example of a pseudo input image used for learning of the neural network used in this embodiment.

FIG. 8 is an explanatory diagram illustrating creation of a pseudo input image.

FIG. 9 is an explanatory diagram illustrating creation of a pseudo input image.

FIG. 10 is an explanatory diagram illustrating creation of a pseudo input image.

FIG. 11 is an explanatory diagram illustrating creation of a pseudo input image.

FIG. 12 is an explanatory diagram showing a plot of the neural network output at each slit position.

FIG. 13 is a schematic configuration diagram of a recurrent neural network.

[Explanation of symbols]

10 image input device, 12 image storage memory, 14 computer, 16 line object, 18 test stand.

Claims

[Claims]

1. An object recognition method comprising: a dividing step of dividing image data including a line object into a plurality of slit areas whose longitudinal direction is substantially perpendicular to the direction in which the line object extends; and a coordinate value output step of sequentially inputting the image data belonging to the plurality of slit areas obtained by the dividing step into a neural network and causing the neural network to sequentially output the position of the line object in the longitudinal direction of each slit area, wherein the line object is recognized from the output coordinate values.

2. The object recognition method according to claim 1, wherein each of the plurality of slit areas in the dividing step overlaps an adjacent slit area.

3. The object recognition method according to claim 1, wherein the neural network has been trained by: a noise mixing step of mixing noise into image data including a line object that serves as teacher data; a step of dividing the noise-mixed image data into a plurality of slit areas whose longitudinal direction is substantially perpendicular to the direction in which the line object extends; and a neural network learning step of learning with the plurality of divided slit areas as teacher data.