JPH02257381A

JPH02257381A - Sorting of image, sorting of and identifier for character in image and fine line conversion of image

Info

Publication number: JPH02257381A
Application number: JP1324184A
Authority: JP
Inventors: Henry S Baird; ヘンリー　エス．ベアード; John S Denker; ジョン　スチュワート　デンカー; Hans P Graf; ハンス　ピーター　グラフ; Donnie Henderson; ドニー　ヘンダーソン; Richard E Howard; リチャード　イー．ハワード; Wayne E Hubbard; ウェイン　イー．ハバード; Lawrence D Jackel; ローレンス　ディー．ジャッカル
Original assignee: American Telephone and Telegraph Co Inc
Current assignee: AT&T Corp
Priority date: 1988-12-20
Filing date: 1989-12-15
Publication date: 1990-10-18
Also published as: CA2002542A1; CA2002542C

Abstract

PURPOSE: To improve the precision of the automatic recognition of the character pattern of a prescribed picture by a machine by executing the skeletonization of the picture and the correction of the inclination of the picture. CONSTITUTION: A pixel picture is formed by scanning a character to the recognized, and this picture is given scaling, and is 'purified', and is corrected in its inclination, and is skeletonized, and the line-thinned picture is formed. A picture map is formed by executing the extraction of a feature by a neural network by using this line-thinned picture, and the features are combined, and a 'higher rank (super) feature' is formed. After that, the picture is made into a block roughly, and the dimension number of a feature vector is reduced. Finally, a sorter specifies the character by the feature vector. Thus, the reliable automatic character recognition can be realized.

Description

[Detailed description of the invention]

［産業上の利用分野］本発明は、パターン分析及びパターン認識に係り、さら
に詳細には、画素（ビクセル）配列から成る画像におけ
るパターン又はシンボル認識のシステムに関する。INDUSTRIAL APPLICATION FIELD OF THE INVENTION The present invention relates to pattern analysis and pattern recognition, and more particularly to a system for pattern or symbol recognition in images consisting of pixel (pixel) arrays.

[Prior art]

所定画像の文字パターンを機械が自動的に認識し、分析
しかつ分類することが望ましいような極めて多種類の適
用例が存在する。コンピュータをベースにした情報の収
集、処理、取扱、記憶及び伝送システムの爆発的な発展
は、これらの要望の実現を可能にする技術を提供してい
る。パターン認識を実施するために汎用コンピュータに
対して精巧なプログラムが作成されてきたが、それらの
成功レベルには限界が見られた。それらの成功は標準印
刷フォントを認識する領域においてはほとんど達成され
ている。過去に遡る１９６０年代初期の頃の文字認識技術は認識
されるべき文字の曲線を追跡する方法を用いていた。そ
れは直観的にアピールするものであったが、残念ながら
、文字が歪んでいたり余計なストローク（筆跡）を含ん
だりするときは失敗する例が多かった。バキス（Ｂａｋｌｓ）他（ＩＢＭ）は「手書き数字の機
械認識の実験的研究（Ａｎ　Ｅｘｐｅｒｉｍｅｎｔａｌ
　５ｔｕｄｙｏｒ　Ｍａｃｈｉｎｅ　Ｒｅｃｏｇｎｉｔ
ｉｏｎ　ｏｒＨａｎｄ　Ｐｒ１ｎｔｅｄ　Ｎｕ＊ｅｒａ
ｌｓ）Ｊ、ＩＥＥＥトランザクションズ　オン　システ
ムズ　サイエンス　アンド　サイバネティクス　（ＩＥ
ＥＥ　　Ｔｒａｎｓａｃｔｌｏｎｓ　　ｏｎ　　Ｓｙｓ
ｔｅｍｓ　　５ｃｌｅｎｃｅ　　ａｎｄＣｙｂｅｒｎｅ
ｔｌｃｓ）第５ＳＣ−４巻、第２号、１９８８年７月号
、という題名の論文に手書き数字の認識方法について報
告している。記載のシステムにおいて数字は２５Ｘ３２
の２値化マトリツクスに変換される。８００ビツトベク
トル（２５Ｘ’１２）の次元数を約１００に減少するた
めに特徴（ｆｅａｔｕｒｅ）が抽出され、１００ビツト
ベクトルはいくつかの類別器（ｃａｔｅｇ。ｒＩｚｅｒ）に送られる。文字のある種の「正規化（ｎ
。ｒｇ＋ａ１１ｚａｔ１ｏｎ）Ｊもまた行われる。著者ら
は使用した手書きサンプルに応じて８Ｂないし９９．７
％の認識率を報告している。商業用に適用するための希
望レベルに比較して認識率が低いので、著者らは、「追
跡するコースは、曲線追跡型測定を・・・自動特徴選択
及び平行決定論理と組合せるべきであるように思われる
。」と結論づけている。フォローアツプ作業と思われる研究の中で、アール・ジ
ー県キャセイ（Ｒ，Ｇ、Ｃａ５ｅｙ）はバキス他の「正
規化」を、対象とする文字の傾斜修正（ｄｅｓｋｅｖｌ
ｎｇ）の方法へ拡張した実験を記載している。「手書き文字のモーメント正規化（Ｎｏ■ａｎｔ　Ｎｏ
ｒｓａＬｌｚａｔｉｏｎ　ｏｆ　Ｈａｎｄｐｒｉｎｔ　
Ｃｈａｒａｃｔｏｒｓ）　Ｊ、ＩＢＭジャーナル　オン
　リサーチ　ディベロブメント（ＩＢＭ　Ｊｏｕｒｎａ
ｌ　ｏｒ　Ｒｅ５ｅａｒｃｈ　Ｄｅｖｅｌｏｐｍｅｎｔ
）、１９７０年９月号、５４８−５５７頁参照。キャセ
イは、バキス他によって示唆されたような曲線追跡と、
及びテンプレート（型板）マツチング、クラスタリング
、自己相関、重み付き相互相関及び領域分割ｎ個要素を
含む決定方法体系と、を組合せた特徴認識法を使用した
。下記の論文の中で、ネイラー（Ｎａｙｌｏｒ）（同じく
ＩＢＭ）は、コンピュータ、対話形式グラフィクコンソ
ール及び傾斜正規化を使用したＯＣＲ（光学文字認識）
システムに関して報告している。「文字認識システムの対話形式設計におけるいくつかの
研究（Ｓｏｍｅ　５ｔｕｄｉｅｓ　ｉｎ　ｔｈｅ　Ｉｎ
ｔｅｒａｃｔｉｖｅＤｅｓｌｇｎ　ｏｆ　Ｃｈａｒａｃ
ｔｅｒ　Ｒｅｃｏｇｎｌｔｌｏｎ　！３ｙｓｔｅｍｓ）
Ｊ　ＩＥＥＥ　　）ランザクシジンズ　オン　コンピュ
ーターズ（ＩＥＥＥ　Ｔｒａｎｓａｃｔｉｏｎｓ　ｏｎ
　Ｃｏ５ｐｕｔｅｒｓ）　、１９７１年９月号、１０７
５−１０８６頁。彼のシステムの目的は、抽出されるべ
き特徴を識別するための適切な論理を開発することであ
った。１９８１年３月３１日付で発行された米国特許明細書第
４．２５９，６６１号には、トッド（Ｔｏｄｄ）によっ
て他の抽出特徴アプローチ法が記載されでいる。トッド
の方法によれば、文字の輪郭によって形成される矩形領
域があらかじめ定義されたサイズに正規化され、次に部
分領域に分割される。部分領域の各々内の画像の「黒度
（ｄａｒｋｎｅｓｓ）　Ｊが評価され、黒度評価の集合
から「特徴ベクトル」が形成される。特徴ベクトルは記
憶されているところの文字を表わす特徴ベクトルのセッ
トと比較され、最もよくマツチしたものが認識された文
字として選択される。［５ＰＴＡ：２値化パターンを細線化するための提唱ア
ルゴリズム（ＳＰｔＡ　：　Ａ　Ｐｒｏｐｏｓｅｄ　Ａ
ｌｇｏｒｉｔｈａ　ｆｏｒ　Ｔｈｉｎｎｉｎｇ　Ｂｉｎ
ａｒｙ　Ｐａｔｔｅｒｎｓ）Ｊ、Ｉ　ＥＥＥトランザク
ション　オン　システムズ　マン　アンド　サイバネテ
ィクス（ＩＥＥＥ　Ｔｒａｎｓａｃｔｉｏｎ　ｏｎＳｙ
ｓｔｅｍｓ、Ｍａｎｓ、ａｎｄ　Ｃｙｂｅｒｎｅｔｉｃ
ｓ）　ｓ第ＳＭＣ−１４巻、第３号、１９８４年５７６
月号、４０９−４１８頁、という題名の論文において、
ナカッシエ（Ｎａｃｃａｃｈｅ）他はＯＣＲの問題に対
する他の方法を発表している。この方法は、パターンは
幅の広いストロークで構成されることが多いことと、及
びパターンを骨格化（ｓｋｅｌｅｔｏｎｌｚｌｎｇ）す
ることが有利であることと、を提示している。ナカッシ
ュ他によって説明されているように、「骨格化はパター
ンが細線化（ｔｈｉｎｎｉｎｇ）されて線画が形成され
るまでパターンのエツジに沿って黒点を反復削除するこ
とによって（すなわち黒点を白点に変えること）からな
る。」初期パターンがその中実軸に細線化されれば理想
的である。この論文は１４の異なる既知の骨格化アルゴ
リズムを簡単に説明し、次にそれ自身のアルゴリズム（
ＳＰＴＡ）を提案している。５ＰＴＡを含む説明されて
いる全ての骨格化アルゴリズムは、画像上に３行×３列
の四角窓（通常３×３窓という）を通過させるという考
え方に基づいている。四角な３×３窓が画像上を通過す
るときに、アルゴリズムは、センター画素を包囲する８
つの近傍画素を評価し、その評価に基づいて黒い中心点
を白に変換したりまたはそれを変更することなく黒のま
ま残したりする。パターン分類は、関連分野において最近進歩を示した他
の、方面からも大きな支援を受けている。特に、１９８７年４月２１日付けで発行された米国特許
明細書第４，６６０．１８８号に開示されているホップ
フィールド（Ｈｏｐｆｌｅｌｄ）による研究によって、
・高度化並列計算回路網（「神経回路網」）が脚光を浴
びてきた。特にガリクセン（Ｇｕｌｌｌｅｈｓｅｎ）他
によって、「神経回路網によるパターン分類：記号認識
のための実質的システム（Ｐａｔｔｅｒｎ　Ｃ１ａｓｓ
ｉｆｉｅａｔｌｏｎ　ｂｙ　Ｎｅｕｒａｌ　Ｎｅｔｗｏ
ｒｋｓ：Ａｎ　Ｅｘｐｅｒｉｍｅｎｔａｌ　Ｓｙｓｔｅ
ｍ　ｆｏｒ　１ｅｏｎ　Ｒｅｃｏｇｎｉｔｉｏｎ）Ｊ、
Ｉ　ＥＥＥ第１回神経回路国際会議の論文集（カルディ
ル（Ｃａｒｄｌｌ）他Ｗ）　、ＩＶ−７２５−７３２頁
、に報告された研究は、文字分類化法に主題を集中させ
ている。彼等が説明しているシステムは、ある種の画像
処理は用いいるが特徴抽出は用いていない。その代わり
に彼等は、神経回路網が「逆伝搬（ｂａｃｋ　ｐｒｏｐ
ａｇａｔｉｏｎ）」訓練法（ｔｒａｉｎｉｎｇ　ｐｒｏ
ｃｅｓｓ）を介して獲得する本来の分類知能に完全に頼
っている。報告されているシステムは一見うまく働きそ
うであるが、著者らによって示唆されているように、研
究しなければならない多くの問題が残されている。シス
テムの性能は許容限度を下回っている。その他多くの文字分類の技法、アプローチ（研究方法）
及びアルゴリズムが存在する。しかしながら、本開示の
目的のために上記の参考文献は最も関係の深い従来技術
の合理的な説明を提供している。文字認識（すなわち分
類）問題を解決するために注がれた全ての努力にもかか
わらず、既存のシステムは、信頼性ある自動文字認識を
実現し得る精度を提供していない、ということを言うだ
けにとどめておこう。［発明の概要］本発明は、技法の正しい組合わせを選択し、それらの技
法を速度及び精度に関して高性能となるように修正する
ことによ・って、高い効率のＯＣＲシステムを提供する
。本発明の一般的アプローチは、無意味な変動を除去し
有意味な変動を捕獲してから分類を行なうことにある。特に、本発明の原理に従えば、認識される文字が走査さ
れることで画素（ビクセル）画像が形成され、その画像
がスケーリングされ、「清浄化」される。スケーリング
され清浄化された画像は、傾斜修正され骨格化されて細
線化画像が形成される。この細線化画像によりニニーラ
ル・ネットワークが特徴の抽出を行ない、画像マツプを
形成する。そして、特徴が組み合わされて、「上位（ス
ーパー）の特徴」を形成する。その後、その画像は粗く
ブロック化され、特徴ベクトルの次元数が低減される。その特徴ベクトルにより、最後に、分類器が文字の特定
を行う。［実施例］第１図は文字または記号の分類のための我々の方法の流
れ図を示す。ブロック１０において、文字画像が捕獲さ
れ、かつ半導体記憶装置のようなフレームバッファ内に
記憶されるのが有利である。画像は遠隔地から電子伝送を介して取得されてもよいし
、またはそれは「局所的に」走査カメラを用いて取得さ
れてもよい。通常の実施例によるいずれの画像取得源に
かかわらず、画像は画素の順序づけ集合（配列）によっ
て表わされる。各画素の値は画像の特定の小領域から放
射する光（明るさ、色など）に対応する。画素値は記憶
装置内に記憶される。スマツジ（汚れ）及び異質のストロークが文字の付近に
見出されることが多く、これらの存在は認識過程をより
難しくさせざるを得ない。我々の発明によれば、ブロッ
ク２０がブロックＩＯに続き、その機能は画像を清浄化
することである。これは画像から無意味な変化性を除去
するための我々の努力の第１のステップである。通常、数字のような記号または文字の画像は、（隣接す
る）画素の１つの大きな群及びゼロの場合もある小数の
小さな群を含む。我々の清浄化アルゴリズムは基本的に
はこのような群の全てを識別し、最も大きな群以外は全
て削除する。もし削除された群の合計が初期画像のある
パーセントより大きい割合を構成するのであればそれは
画像が変則（異状）であることを示すので、この事実が
後に使用されるように注記される。この説明の文脈内で
、画像記号は明背景上の暗ストロークで構成されている
と推定される。「反転」画像ももちろん同様な装置で処
理することが可能である。上記の清浄化アルゴリズムは
また、画像内に期待される記号セットが非結合ストロー
クを要求するような記号を含まないことを前提とする。数字〇−９及びラテン語アルファベット（小文字ｌ及び
ｊは除く）はこのようなセットを形成するが、大抵の他
のアルファベット（ヘブライ語、中国語、日本語、ハン
グル語、アラビア語など）は多くの非結合ストロークを
含む。このような他のセットに対しては、領域の全集合
ではなくて各非結合領域だけを見るようなやや異なる清
浄化アルゴリズムが適用されなければならないであろう
。これらの異質領域を検出し、かつ識別するために適用可
能な多くの方法がある。我々が使用する方法はブラシフ
ァイア（ｂｒｕｓｈ　ｆｉｒｅ）に類似する。我々の方法によれば、「黒い」画素群を見出すために画
像は頂部から底部までラスター走査される。このような
群が見出されたとき（すなわち以前に考慮されなかった
黒画素に出会ったとき）、走査が中断されて「ブラシフ
ァイア」が点火される。すなわち、出会った画素に対し
て識別子でマーキングされ、マーキングは拡張過程を開
始させる。拡張過程において、８つの直接近傍画素の各
々が考慮される。黒であるこれらの近傍画素も同様に識
別子でマーキングされ、各マーキングはそれ自身の拡張
過程を開始させる。このように、最初に出会った「黒」
の群の画素は、全ての群を選択された識別子によって迅
速に識別させる。過程内のこの点において画像の走査が
再開され、これにより他の群を発見し、かつ（異なる識
別子）で識別することが可能である。走査が完了して全
ての「黒領域」が識別されたときに、領域計算が実施可
能である。上に示したように、最大群を除く全ての群が
画像から削除される（すなわち暗から明に反転されるか
またはオフにされる）。この時点で、文字認識技術においては判定を下すのを拒
否するよりは文字を不正確に識別するミスをしないこと
の方がより重要なであることが判るであろう。この理由
から、不連続ストロークのない数字あるいは他の文字セ
ットを識別するように設計されたシステムにおいては、
領域除去しきい値はかなり低いレベルに設定されるべき
である。通常は、（前記の０−９文字セット及びラテン語アルフ
ァベットにおいては）画像の有意部分を含む画素は厳密
な意味で隣接しているであろうことが期待される。他方
で、おそらくは領域が極めて僅かだけ分離されたときは
これは例外とすべきであって、（例えばインクの出にく
いペンで書いたりまたは目の粗い紙面上に書いたりした
ときのように）文字ストロークが偶然に破断される可能
性があると、外部情報が人に信じさせるようにする。こ
のような偶発性に備えるために、「ファイヤ（ｒｌｒｅ
）　Ｊを拡散（ｓｐｒｅａｄ）するための我々の方法は
、８つの直接近傍画素（８つの画素とは大きな窓の隅と
大きな窓の辺の中心画素である）から少し離れた８つの
追加の画素を含むように近傍画素を定義するためのオン
シジンを含む。結局、我々は「ファイア」が「ファイア
ブレーク（ｆ’ｉｒｅ　ｂｒｅａｋ）　Ｊを飛び越すこ
とを可能にする。ブロック２５における画像を所定サイズにサイズ調整（
スケーリング；　ｓｃａｌｉｎｇ）をする過程が清浄化
過程に続く。サイズＨ！！はもちろん画像の無意味な変
化性も除去する。サイズ調整の前に行われる清浄化の過
程は、スマツジを含んだまま画像をＴイズ調整したくな
いという要望から行われるものである。スケーリング過
程は多数の異なるアルゴリズムのいずれを使用してもよ
い。例えば、１つのアルゴリズムによれば、画像は、画
像の一方の次元（ディメンション）サイズが所定サイズ
に到達するまで両次元方向に等倍率でサイズ調整が可能
である。他のアルゴリズムは、両次元方向のサイズ調整
倍率の間の最大差にある程度の制限を設けた上で両次元
方向に独立にサイズ調整を行う。両方のアプローチの方法はうまく行われるので、アルゴ
リズムの゛選択及びそれの実行は読者に任せる。我々は
文字画像の各々を最初に記載のアルゴリズムを用いて例
えば１８Ｘ３０の画素配列のような適切な数の画素にサ
イズ調整する。人々は一般に文字を斜めに書く。その傾斜は人によって
異なる。文字の傾斜すなわちスキュー（ｓｋｅｖ）は情
報をもたないところの手書き文字の他の無意味な変化性
であり、従って我々はそれを除去する。第１図に戻って、ブロック２５のに続くブロック３０は
画像の傾斜修正（デスキューイング；　ｄｅｓｋｅｖｌ
　ｎｇ）を行う。言い換えるとブロック３ｏの機能は、
全ての文字をより均等に直立化することである。ブロック３０は、画像の傾斜修正を行うために多くの通
常の手順のいずれかを使用可能である。このような手順
の１つは、画像に対して次の形の変換を行う。ここでＸ及びｙは画像の初期座標であり、Ｘｏ及びｙ。は原点を定義し、Ｕ及びＶは変換された画像の座標であ
り、及びｍ、及び”ｙｙは次式％式％）及びによって計算された画像モーメントである。上記におい
て、Ｂ　（ｘ、ｙ）は、位置Ｘｓ’／における画素が「
黒」であるときに１の値をとり、そうでないときには０
の値をとる。この機能の効果は、ｘｙモーメントを本質
的にＯに減少させることである。サイズ調整（ブロック２５）及び傾斜修正（ブロック３
０）は両方とも線形変換である。清浄化された画像に複
合変換を行って直接傾斜修正された画像を形成すること
は有利であろう。この複合操作を行えば、サイズ調整さ
れた画像を画素の配列として陽形式表示をしなくてもす
むようなことが可能になる。これは（計算）ノイズ源を
排除することになる。第１図においてブロック３０の後に続くブロック４０は
画像を細線化（ｔｈｉｎｎｉｎｇ）する。画像の細線化
もまた画像の無意味な変化性を除去する。上記のように
、従来技術による骨格化（ｓｋｅｌｅｔｏｎｉｚａｔＳ
ｏｎ）の方法は画像上に通過される３ｘ３窓を使用する
。３８３窓の中心点は、もしある条件が満たされればオ
フにされる；そしてこれらの条件は、大抵の方法の場合
、種々の事前定義窓条件を用いた反復テストを含む。例
えば、ペン・ラン（Ｂｅｎ−Ｌａｎ）及びモントト０４
０１ｔｏｔｏ）アルゴリズムは、暗中心は、もしそれが
次の条件：すなわち１）その画素が４近傍のうち少なく
とも１つの明近傍を有すること；及び、２）その近傍が８つの事前定義３×３窓のいずれともマ
ツチしないこと；を満足するならば、削除される（すなわちオフとされる
かまたは明とされる）。４近傍とは、考慮対象画素の東、北、西、または南にあ
る画素である。最近までプロセッサはいずれにしても一時に１つの業務
のみは処理可能であるので、上記に類似のアルゴリズム
はソフトウェア実行上全く使用可能である。しかしなが
ら、これらのアルゴリズムはそれらの手順の性質上どう
しても処理が遅くなる。さらに、従来技術によるこれら
のテストの各々は、パターンのある特徴は目標とするが
他の特徴は目標としない。異なる文字（例えば垂直な線
と水平な線）のストロークを細線化するためには、異な
るテストが用いられなければならない。さらに、従来技
術によるテストを用いるとき、特定の画素が確実に検出
される前にはこれらのテストの少なくともあるものが順
次に実行されることが必要であり；そしてこれらのテス
トが実行されない限り画素はオフにすることができない
。第２図の実施例はこの問題を示す。第２図において、テンプレート１００及び１１０は２つ
の３Ｘ３画素窓である。テンプレート１００内の３つの
頂部画素はオン（ＯＮ）画素を探すことを指示するため
に丸陰影がつけられている。中心画素及び底部の行の中
心内の画素はオフ（ＯＦＦ）画素を探すことを指示する
ために斜線陰影が付けられている。残りの画素は「ドン
ト　ケア（ｄｏｎ　’　ｔｃａｒｅ）　Ｊ状態を指示す
るために空白である。テンプレート１００は暗区画（画
素１０４及び１０５）の上部の明区画（画素１０１５１
Ｇ２及び１０３　）のエツジ状態を探すと、暗区画が少
なくとも２つの画素の厚さがあるに相違ないと予告を出
す。このような状態に遭遇すると、中心画素（１０４）
はオンからオフに（暗から明に）変えられる。従って、
テンプレート１００は、頂部から始めてきてたった１つ
のオン行が残されるまでオン領域をかじりとる機構を提
供する。テンプレート１１０も同様な働きをするが、但しこちら
は、第１及び第２行の中心画素がオン画素を求めながら
オフ画素を求める底部の行を有することが異なる。テン
プレート１１０は底部からオン（暗）領域をかじりとる
。水平な線を細線化し垂直な線を細線化しないところの上
記テンプレートは画像上に多数の異なるテンプレートを
通過させることが好ましいことを示し、この場合具なる
テンプレートは画像の異なる特徴を感知する。（速度の
点から）種々のテンプレートを同時に通過させることも
また好ましい。しかしながら、第２図の画像セグメント１０Ｂにおいて
もしテンプレート１００及び１１０が適用されると、描
写されている２画素幅の水平な線は完全に除去されてし
まうので、この場合は適用できない。頂部の行はテンプレート１００によって除去され、底部
の行はテンプレート１１０によって除去されるであろう
。もし細線化を効率的に実施すべきであるならば、異なる
テンプレート間の相互依存性は打破られなければならな
い。予期しなかったことであるが、この相互依存性は３Ｘ３
より大きい窓を使用することによって打破ることが可能
であることが判った。従って、我々は３Ｘ３より大きい
テンプレートを少なくとも幾つか含むテンプレートセッ
トを使用する。あるものは３Ｘ３であり、あるものは３
×４であり、あるものは４×３であり、またあるものは
５×５である。集合の特徴は、テンプレートが画像上を
同時に通過可能であることである。この可能性は、他の
テンプレートが画像を独立して変更できる能力に有害な
影響を与えることなく１つのテンプレートに応答して画
像を変更させることが可能なテンプレートの特殊選択に
よって実現される。かなりユニークなこのテンプレート
のセットを第３図に示す。第３図に示したテンプレートのセットは十分なセットで
あることを我々は発見した。他のセットも可能であるこ
とは当然であるが、我々の発明によればこのようなセッ
トは３Ｘ３より大きいテンプレートを少なくとも１つ含
むことが特徴である。図示のテンプレートの作動を説明するために、まずテン
プレート１２０及び１４０から始める。これらのテンプ
レートは第２図のテンプレート１００及びｌ１０に対応
する。テンプレート１２０は５×５配列として示される
でいるが、外側の列及び行は「ドント　ケア」状態であ
るので本質的にはこれは３Ｘ３窓を形成する。テンプレ
ート１２０内の画素１２１及び１２２はオン画素か否か
をテストし、−方テンプレート１００内のそれに対応す
る位置の画素は「ドント　ケア」にセットされていると
いう点で、テンプレート１２０はテンプレート１００と
は異なる。すなわち、テンプレート１２０は確実に、か
じとられた（明にされた）画素が両方向に伸長する線の
上方にあるようにする。一方テンプレート１４０は、そ
れが実際上３Ｘ４テンプレートである点でテンプレート
１１０と異なる。それは３８３テンプレート１１０に類
似の３×３部分を含み（画素１４１及び１４２を除く）
かつそれはまた第１行の中心に画素１４３を含む。画素
が（底部から）かじり取られることが許される前に、画
素１４３は結局水平な線が３画素幅であることを要求す
る。テンプレート１３０及び１５０はテンプレート対１２０
及び１４０類似のテンプレート対を形成する。テンプレ
ート１３０及び１５０は垂直な線を細線化する。テンプレート１６−Ｑ　、　１７０．１８０及び１９０
はそれぞれ右、左、上及び下を向く。「膝（ｋｎｅｅｓ
）　Ｊを細線化し；テンプレート２００　、２１０　、
２２０及び２３０は傾斜する線を上部から及び下部から
細線化する−などである。テンプレート１６０−２３０
は全て５×５テンプレートであることがわかるであろう
。第１図に戻ると、骨格化ブロック４０の後に特徴抽出フ
ロック５０が続く。操作は類似しているが、骨格化は機
械的な点から特徴抽出とは異なる。骨格化においては、
不要な画素が識別されてそれらが暗から明に変えられる
。特徴抽出においては、文字の分類を助ける比較的巨視
的な特徴が識別される。識別された巨視的な特徴は文字
のサイズまたは厚さに依存しないような特徴の種類であ
るが、文字にその特定の「サイン（ｓｉｇｎａｔｕｒｅ
　）　Ｊを与えるものである。従って、ブロック５０が
識別すべく探求するものがこれらの特徴である。特徴抽出は、操作的には画像上に一部の窓を通過させる
ことにより達成される。我々のシステムにおける各窓は
７×７テンプレートであり、各テンプレートは、端点、
対角線、水平線、垂直線；などのような特定の特徴の存
在を検出する。検出は、４９画素（７Ｘ　７）の大部分
がテンプレートにフィツトするとき特徴が存在すると結
論づけられるという意味の主要規則によって行われる。我々のシステムにおいては、第４図に示すような４９種
類の異なる７Ｘ７テンプレートを使用する。各テンプレ
ートに対しては、我々は、テンプレートのパターンが画
像にマツチするところの画像配列内の座標を基本的に指
示する「特徴マツプ」を形成する。第４図の４９種類のテンプレートに対応する４９種類の
特徴マツプを開発した後、ブロック６０において我々は
特徴マツプの論理組合せ（ＡＮＤ及びＯＲ）である多数
の上位（スーパー）特徴マツプを形成する。このように
して、我々はセットを４９マツプから（１８Ｘ３０画素
配列の）１８マツプに減少する。減少数は発見的に決定
された。我々は（我々が配列を記憶する記憶装置内で）配列を構
成しかつ我々はその検出特徴を配列内の適切な位置内に
置くので、この検出された特徴を「マツプ」と呼んでい
る。このように我々は特徴の存在及びその位置を記録す
る。「ヒツト（命中）」位置の指示を記録するために他
の機構を使用可能であるが、概念的にマツプ形式で考え
るのがより簡単である。１８Ｘ３０配列は分類の目的のためにはあまりにも詳細
過ぎることが解った。細部は実際に文字をマスクするこ
とが可能であるが、（「木を見ては森を見ることが出来
ない」という諺のように）分類作業を一層困難にする。従って、ブロック７０は粗いブロック化（ｂｌｏｃｋｌ
ｎｇ）を行って１８Ｘ３０の特徴マツプを僅か３Ｘ５の
特徴マツプに減少する。この結果１８個の３×５のマツ
プに対応する２７０ビツトからなる最終マツプまたはベ
クトルが得られる。最後に、ブロック８０は分類アルゴリズムを実行して、
与えられた２７０ビツトから最も確からしい候補を決定
する。識別されるべき文字に対してどのテンプレートが
最も確からしく対応するかがわかると、最低ハミング（
Ｈａ＋＋ｎｉｎｇ）距離を決定するような簡単なアルゴ
リズムで十分であろう。重要な点は勿論これらのテンプ
レートを決定することにあり；そしてその態様は、その
技術が現在取扱っているところの（逆伝搬というような
）学習方法論を要求する。［ハードウェア実施態様］第１図は我々のＯＣＲシステムの過程を示すが、それは
またハードウェアの実態化もよく表わしている。信号の
流れの実際の詳細は特定の設計と共に変わるであろうが
、それは完全に通常の回路設計技術の範囲内に入るもの
である。以下の説明のために、我々のシステムはバイブ
ライン方式で作動しかつ各電子回路ブロックはどの画素
が考慮の対象とされるかに関する必要な識別と共に必要
な信号及び制御を以下の回路ブロックに与えるものと仮
定しよう。前にも示唆したように、ブロック１０は分類されるべき
画像の特定の供給源に付随の通常の装置を含む。それは
単に、市販の「フレームグラバ−（ｆｒａｍｅ　ｇｒａ
ｂｂｅｒ）Ｊ及び記憶装置ニ結合すレタヒデオカメラで
あってもよい。分類過程が開始されるときに記憶装置が
アクセスされて中心画素と２４の近傍画素とを検索し、
検索された信号の集合がブロック２０に供給される。ブロック２０及び３０はこの、場合、付録に示した簡単
なプログラムを用いてＳＵＮワークステーション上で実
行される。画像信号と必要に応じて一時的な計算結果と
を記憶するために、ローカルメモリがマイクロプロセッ
サに含まれている。実際にはいずれのマイクロプロセッ
サでも同様に使用可能であるが、もしマイクロプロセッ
サで得られる速度よりも速い速度が必要であるならば、
必要な計算を実行するために通常のように特殊なハード
ウェアを設計可能である。実際に、必要な演算は単に加
算、減算、比較及び初歩的な乗算であるので、極めて高
い処理能力を提供するパイプライン方式アーキテクチャ
が容易に設計可能である。ブロック３０の出力は一連の信号セットであり、各々は
それに付属の中心画素とその近傍画素とを有する。ブロ
ック４０は第５図の神経回路網を用いて実行され、神経
回路網は、直列結合のスイッチ４００とテンプレートマ
ツチ回路網４１０としきい値回路網４２０とを含む。あ
る瞬間に５×５の窓によって覆われる画像の２５個の画
素の値に対応する入力信号が４１０の入力端にあるスイ
ッチ４００に供給される。スイッチ４００は、これらの
値が回路網に同時に供給されることを確実に行わせる。回路網４１０は２５個の入力リード線と記憶されるテン
プレートの数に等しい多数の出力リード線とを含む。回路網４１０内において、全ての入力リード線は１列の
プリセット結合ノード（節）を介して各出力リード線に
結合されている。このような結合ノードの各列（例えば
ノード４１１−４１４を含む列）は、記憶されている１
つのテンプレートに対応する。従って、各出力リードの信号は、入力信号がいずれのテ
ンプレートのものであるかという所属関係を示す。さらに詳細には、結合ノードは、励起状態（Ｅ）、阻止
状態（１）及び「ドント　ケア」状態（Ｄ）を知るため
に３種類の「多様性」を有する。マツチ（整合）または
ミスマツチ（不整合）に対する応答は、以下の真理値表
に従って各多様性毎に異なる。入　力　　シナプス　　出　力ＯＥ　　　　　　０１　　　　　Ｅ　　　　　　１１　　　　ｌ　　　　−２０Ｄ　　　　　　０１　　　　　Ｄ　　　　　　Ｏこの真理値表を実行するノード４１１はゲート付増幅器
を用いて容易に実現可能である。ノードがＥノードか、Ｉノードか、またはＤノードかに
関する情報は、（多様性が要望されたとき）各ノードに
付属された２個のフリップフロップセット内に記憶可能
である。代替態様として、その情報はノードの配列に付
属のリンクの配列に「接続させる（ｈａｒｄｖｌｒｅｄ
）Ｊことが可能である。テンプレートのプログラミング（すなわち結合）は適切
なリンクのバーンスルー（ｂｕｒｎ−１ｈｒｏｇｈ）に
よって達成可能である。もしテンプレートが完、全に不
変であるならば、当然に、テンプレート情報を直接ノー
ドの配列の集積回路マスク内に設計することが可能であ
る。出力線の電流はインピーダンスに流れ、その流れは、回
路網４１０の各出力線の電圧を、入力信号のセット内の
１と励起ノードとの間のマツチングの程度に比例したレ
ベルまで上昇させる。電圧はまた、入力信号のセット内
の１と阻止ノードとの間のマツチングの程度によって減
少されることも当然である。回路網４１０の出力線はしきい値回路４２０に供給され
るが、そこにはそのインピーダンスをオプションで配置
可能である。回路網４２０は１セツトのしきい値を回路
網４１０の出力信号に供給する。特に回路網４２０は、
回路網４２０の入力リード線に応答する１つの入力と増
幅器４２１−４２４の第２の入力に接続される多数のソ
ース（例えば４２５−４２７　）を有する１セツトの２
人力増幅器（例えば４２１−４２４）を含む。ソースの
各々は異なる電流を供給し、それに応じて各増幅器４２
１−４２４は、そのリード線がソース４２５−４２７に
対して有する特定結合に関連したその第２のリード線上
に電圧を発生する。このようにして、回路網４２０内に
おける異なる増幅器に異なるしきい値を供給可能である
。回路網４２０の出力リード線は増幅器４２１−４２４の
出力であり、それらは増幅器の入力信号がしきい値を超
えているかまたは超えていないかに応じて論理値１また
は０をとる。ブロック５０は第５図に示したような神経回路網で構成
される。しかしながら、ブロック５０はブロック４０の
５Ｘ５テンプレートとは異なり７Ｘ７テンプレートを取
扱うので、データをバッファ記憶するために２つの神経
回路網の間に記憶装置５５が挿入される。ブロック６０は１８個の特徴マツプを発生する。それは
単にブロック５０の出力を受入れて、中心画素の一致性
を特徴づける信号と共に適切な情報を記憶装置内に記憶
する。その結果は１８メモリセグメントであり、各セグ
メントは画像内に見出された特徴に関する情報を含む。従って、このような各セグメントは我々の特徴マツプの
１つである。ブロック７０の粗いブロック化（ｃｏａｒｓｅ　ｂｌｏ
ｃｋｉｎｇ）は、恐らくは同一の物理的記憶装置内にあ
る１８個の追加の小メモリセグメントを用いることによ
って達成される。ブロック７０は、これらの小メモリセ
グメント内に、大メモリセグメントの適切に選択された
部分内に見出された特徴に関する情報を記憶する。初期
画像のサイズが１８画素Ｘ３０画素であるとき、モジュ
ラス５で動作するカウンタを用いればその選択は容易に
達成可能であり、ここでカウンタの全値（ｆｕｌｌ　ｖ
ａｌｕｅ）は大セグメントをアクセスするのに使用され
、一方モジュラスらで除した後の全数（ｗｈｏｌｅ　ｎ
ｕＩｌｂｅｒ）は１８個の小メモリセグメント内のセル
を識別するのに使用される。小メモリセグメントの２７０個の記憶位置はブロック７
０の出力を形成し、結果的に画像内に含まれた文字を表
わすベクトルを形成する。実行される必要がある最後の機能は、与えられた特徴ベ
クトルに対する最も確からしい候補文字を選定するとこ
ろのある回路網にこのベクトルを供給することである。これがブロック８０の機能である。ブロック８０は多数の方法で実行される。例えば、前記
の米国特許第４．６（ｉｏ、１８８号におけるホップフ
ィールドの内容照合方式の教示を使用するのが有利であ
ろう。この教示に従って、彼の回路のフィードバック回
路網に、主題セット内の文字に関する情報を入力可能で
ある。記憶されたこのような情報を用いて、内容照合形
（連想）記憶装置は、供給された特徴ベクトルに最も近
い文字の特徴ベクトルを識別する。ホップフィールドの
回路網は極めてロバス）　（ｒｏｂｕｓｔ；適応作動性
が強い）であるので、たとえ入力がかなり歪んで入るよ
うに見えるときでさえも「正確な」選別を行う。しかし
ながら、ホップフィールド回路に対してフィードバック
をすると、記憶されたベクトルが全てフィードバック回
路網内に分配されて相互に混合されるので、ホップフィ
ールド回路に対するフィードバック回路網の設計はやや
難しい。この困難さは、我々が例えば「４」という文字
をいかに認識するかと、または「４」と認識することが
可能なときと確信がなくて決定することを拒否するとき
との限界と、を我々は正確には解らないという事実によ
って複合化される。それにもかかわらず、我々は１目見
ただけで「４」という文字を判ってしまう！本研究は、正しい決定に到達するために、試行錯誤によ
り分類回路に「学習」を行なわせることによってこの問
題を解決することを試みる。このような「学習」のため
の可能性を有する１つの構成が第６図に示されている。現行技術においては、この技法は一般に「逆伝搬法（ｂ
ａｃｋ　ｐｒｏｐａｇａｔｌｏｎ）」と呼ばれている。これは、例えば「認識の微細構造内の探索（Ｅｘｐｌｏ
ｒａｔｉｏｎｓ　１ｎ　ｔｈｅ　Ｍｉｃｒｏｓｔｒｕｃ
ｔｕｒｅ　ｏｆ’　Ｃｏｇｎｌｔｌｏｎ）Ｊ　、Ｍ　Ｉ
　Ｔ出版、１９８６年の第８章「平行分布処理（Ｐａｒ
ａｌｌｅｌ　Ｄｉｓｔｒｌｂｕｔｅｄ　Ｐｒｏｃｅｓｓ
ｉｎｇ）Ｊ　、デー・イー・ラメルハルト（Ｄ、Ｅ。Ｒｕ＋＋＋ｅｌｈａｒｔ）、ジエー・エル・マック拳り
レランド（Ｊ、Ｌ、Ｍｅ　Ｃ１ｅｌｌａｎｄ）編、の中
の［誤り伝搬による学習内部を表示（Ｌｅａｒｎｉｎｇ
　Ｉｎｔｅｒｎａｌ　Ｒｅｐｒｅｓｅｎｔａｔｌｏｎｓ
　ｂｙ　Ｅｒｒｏｒ　Ｐｒｏｐａｇａｔｌｏｎ）　Ｊ内
でデー中イー・ラメルハルトによって説明されている。第６図は直列結合され光相互結合回路網８１及び８２を
含む。入力信号セットが回路網８１の入力端に供給され
、出力信号セットが回路網８２の出力端に現われる。各
回路網は複数本の入力及び出力リード線を有し、各入力
リード線は出力リード線の全てに接続されている。さら
に詳細には、各入力リード線ｉは結合ウェイト（重み）
Ｗｌｊを介して各出力リード線ｊに接続されている。我
々の適用例においては、回路網８１は２７０本の入力リ
ード線と４０本の入力リード線とを有する。回路網８２
は４０本の入力リード線と１０本の出力リード線とを有
する。回路網８１の入力リード線の本数は特徴ベクトルの長さ
によって決定される。回路網８２の出力本数は分類セッ
ト内の文字の数によって決定される。中間リード線の本
数（この場合は４０本）は発見的に決定される。第６図の回路の訓練（ｔｒａｉｎｉｎｇ）は、既知の文
字の開発された特徴ベクトルを供給しかつ供給された既
知の文字の対応する回路網８２の指定の出力リード線に
おける出力信号を最大化するために両方の回路網８１及
び８２内のウェイトを調節することによって実行可能で
ある。分類されるべきセット内が全ての文字の利用可能
なサンプルがこのようにして回路網に供給され、その度
ごとに適切な出力リード線における信号を最大化するた
めに相互結合回路網内のウェイトが調節される。このよ
うにして、両方の回路網に対して１セツトのウェイトＷ
１ｊが形成される。結合ウェイＷ１ｊは本質的にアナログ量でありかつ回路
はアナログ式で作動すると明言することが適切であろう
。すなわち、回路網８１の任意の出力リード線における
電圧は、その出力リード線に結合された「点火された（
ｆｉｒｅｄ　ｕｐ）　Ｊウェイトの寄与分の和である。各ウェイトは、そのウェイトが結合されている入力リー
ド線上の２値化信号によって「点火」される。従って、
リード線ｊにおける出力は、〒Ｂｉｗｌｊに等しく、ここでＢ、は１番目の入力リード線の値（０
または１）である。このような学習回路網の概念はかなりよく理解されてい
るが、この様なアナログ回路を効率的かつコンパクトに
実現するという課題が残っている。このような回路に対する要求は単純なものではない。例
えば、もし回路網の最適化が達成されるべきであるなら
ば、最小ウェイト変化または修正はかなり小さくなけれ
ばならない。上記の繰返し改善方法はよりよいウェイト
が良好なものの近傍に発見されるかも知れないという発
見的仮定に基づいているが、その発見的仮定は細分化が
十分に細かくないときは失敗する。小さな回路網８１に
対しては、少なくとも８ビツトのアナログ深度が必要で
ある。回路網が大きくなるほどより細かい細分化が必要
となるであろう。ウェイトもまた正と負との両方の値を
表わさなければならず、変化は容易に可逆可能でなけれ
ばならない。学習及び訓練セクションの間に、ウェイト
に対する変更回数は十分に大きいものであってよい。従
って、実際的な回路はウェイトの迅速修正を可能にする
ものでなければならない。これらの揮々の要求を考慮した上で、ＭＯＳＶＬＳＩ技
法を用いて効率的なアナログ結合ウェイト回路、または
強度回路を製作した。第６図における各結合ウェイトは単なる黒丸で示されて
いるが、第７図はこれらの丸を実行するための回路を示
す。さらに詳細には、第７図は、入力線８３と出力線８
４とにその結合を有する１つの結合ウェイトと、いくつ
かの通常の回路とを示す。まず、第７図の相互結合ウェイト部分は、コンデンサ８
０１及び８０２と、小ＭＯＳスイッチ８０３及び８０４
と、やや大ＭＯ５）ランジスタ８０５と、差動増幅器８
０Ｂと掛算器（マルチプライヤ）８０７を含む。第２番
目に、第７図の回路は、電荷結合スイッチ８０ｇと検知
スイッチ８０９と種々の制御リード線とを含む。回路は以下のように作動する。コンデンサ８０１及び８
０２が異なる電圧レベルに充電され、その電圧レベル差
が差動増幅器８０６の出力電圧に反映される。増幅器８
０６はコンデンサ８０１及び８０２に接続されたそれの
２つの入力を有する。結合ウェイトを表わす増幅器８０
６の出力は掛算器８０７に接続される。掛算器８０７は
通常の相互コンダクタンス増幅器であればいかなるもの
でもよい。掛算器８０７にはまた相互結合回路網の入力
線８３も接続されている。変換器８０７の出力は相互結
合回路網の出力リード線に接続される。このように、掛算器８０７は、入力リード線における信
号と結合ウェイトの値との積である電流を出力リード線
に送る。結合ウェイトは、コンデンサ８０１および８０
２の間の電圧差に応答して増幅器８０Ｂによって形成さ
れた差電圧によって表わされる。コンデンサ８０１及び８０２上の電圧の差は（ＯＣＲシ
ステム内に含まれる動作に比較して）長時間維持され、
かつ回路が適度に低い温度に保持されているときはりフ
レッシングは必要ないことが解った。例えば、７７°Ｋ
においては、時間と共に検出可能な損失は見られなかっ
た。我々の回路の１つの有利性は、ウェイトはｖＣ８０１−
ＶＣ８０２に比例し、従ってたとえ電荷損失があっても
、それが両方のコンデンサにおいて同じであるときは結
果的にウェイトに変化はないことがわかるであろう。それにもかかわらず、コンデンサ８０１及び８０２上へ
情報をリフレッシュするための経路は明確に設けられな
ければならない。さらに、コンデンサ８０１及び８０２
上に電圧（電荷）値を設定しかつ上記の「学習」手順を
可能にするようにその設定値を修正するための経路が設
けられなければならない。この場所に残りのスイッチ及
び制御部が入ってくるのである。結合ウェイトを希望レベルにもたらすためにはスイッチ
８０８を僅かの開閉じて固定電圧レベルを電圧源８１Ｂ
からコンデンサ８０１に付加させる。電圧は固定電荷に
対応する。その後スイッチ８０８はオフにされる。この
点において、コンデンサ８０１は増幅器８０６の非反転
入力に接続されて正電圧を与え、一方コンデンサ８０２
は増幅器８０Ｂの反転入力に接続されているので、結合
ウェイトは最大の正レベルとなる。結合ウェイトの変更
は以下のように行われる。まず、トランジスタ８０３及び８０４がオンにされる。トランジスタ８０３はトランジスタ８０５に比較して極
めて小さく、従って現象を理解しやすくするために、ト
ランジスタ８０３は単なるスイッチと考えてよい。比較
すると、トランジスタ８０５は長くてかつ狭く、従って
それがオンにされたとき、それはコンデンサと考えてよ
い。スイッチ８０３が閉じられかつトランジスタ８０５
（それがｎチャネルデバイスであると仮定して）がオン
にされたとき、コンデンサ８０１の電荷はコンデンサ８
０１とオンにされたトランジスタ８０５上の反転電荷と
の間に分配される。次にトランジスタ８０３がオフにさ
れ、これによりトランジスタ８０５内に電荷を捕獲する
。次にトランジスタ８０４がオンにされ、ここでもしト
ランジスタ８０５がゆっくりとオフにされるならば、そ
のチャネル内の可動電荷はスイッチ８０４を通過してコ
ンデンサ８０２内に拡散するであろう。従って、上記のステップは電荷量をコンデンサ８０１か
らコンデンサ８０２へ移動させる。これは、コンデンサ
の電圧変化及び相互結合ウェイトの変化に対応する。上記のシーケンス（手順）は、結合ウェイトを希望レベ
ルにもたらすのに必要な回数だけ繰返すことが可能であ
る。このようにして、訓練期間の間結合ウェイトの最適
化が進められ、この結果回路網８１及び８２内の各結合
ウェイトは正しいレベルに設定される。上記の説明は回路の訓練態様に関するものである。学習
過程が一旦終了したら、ｌ）ウェイトの値を決定するた
めの手段と、及び２）時間経過に伴う損失をリフレッシ
ュするための手段と、などが提供されるべきである。こ
れは、検知スイッチ８０９と、Ａ／Ｄ変換器と、Ｄ／Ａ
変換器と不揮発記憶装置と利用して達成される。相互結合回路網内のウェイト値を決定するために、入力
リード線は全て一時に１つずつオンにされる。リードが
オンにされる毎に、その人力リード線に結合されている
ウェイトの検知スイッチ８０９が順次にオンにされて各
増幅器の電圧を検知バス８１０上に出現させる。その電
圧がＡ／Ｄ変換器８１１に供給され、その結果として得
られたディジタル情報が記憶装置８１２内に記憶される
。このようにして、全てのウェイトがディジタル形式に
変換されて記憶装置８１２内に記憶される。リフレッシ
ュ動作の間に、各結合ウェイトは上記のように分離され
るが、このときに検知バスａＬＱ上の電圧出力は、記憶
装置８１２のディジタル出力がそれに供給されるＤ／Ａ
変換器８１３のアナログ電圧と、増幅器８１４内で比較
される。記憶装置８１２はリフレッシュされた結合ウェ
イトに対応するディジタル出力を放出するようにさせら
れることは当然である。比較結果に基づき、スイッチ要
素８０３．８０４及び８０５のシーケンスが増幅き８１
４の出力信号によって制御されて、コンデンサ８０１の
電圧をコンデンサ８０２に対して増加させたりまたは減
少させたりする。バス８１０の出力をＡ／Ｄ変換器８１
１に供給するかまたは比較増幅器８１４に供給するか第１表の制御はスイッチ８１５によって行われる。両方のコンデンサ８０１及び８０２を完全に放電させる事が必要な場合には、電圧源８１Ｂの電圧をゼロに減少させてスイッチ８０３８０４及び８０５をオンにすればよい。／／ｆｉｒｅ、ｃ　９／７／８８　　ＬＤＪ〃１／　ｃｈｅｃｋ　ｆｏｒ　ｂｒｏｋｅｎ　ｉｍｇｇｅ
ｓ／／　ｒｅｔｕｒｎｓ　−１ｉｆ　ｃｏｍｐｌｅｔｅ
ｌｙ　ｂｌａｎｋ／／　ｒｅｔｕｒｎｓ　Ｏｉｆ　ｃｏ
ｎｎｅｃｔｅｄ／／　ｒｅｔｕｒｎｓ　ｌ　ｉｆ　ｃｏ
ｎｎｅｃｔｅｄ　ｅｘｃｅｐｔ　ｆｏｒ　ｓａｍｌｌ　
ｆｌｙｓｐｅｅｋｓ／／　ｒｅｔｕｒｎｓ　２　ｉｆ　
ｂａｄｌｙ　ｄｉｓｃｏｎｎｅｃｔｅｄ／／　ｕｓｅｓ
　ｒｅｃｕｒｓｉｖｅ　ｂｒｕｓｈｆｉｒｅ　ａ１ｇｏ
ｒｉｔｈｍ／／　Ｄｉａｇｎｏｓｔｉｃ　ｏｕｔｐｕｔ
：　ｐｒｉｎｔｓ　ｎｕｍｂｅｒ　ｏｆ　ｓｅｇｍｅｎ
ｔｓ。／／　ａｎｄ　１ｏｃａｔｉｏｎ　Ｂｄ　ｃｏｄｅ　ｏ
ｆ　ｔｈｅ　ｌａｒｇｅｓｔ７７５ｉｄｅ　ｅｆｆｅｃ
ｔ：　５ｅｔｓ　ｕｐ　Ｌｓｅｇ　（ｉｎ　ｒｅｃ２ｃ
ｏｍ）１／　ＩＭＰＯＲＴＡＮＴ　ＡＳＳＵＭＦｒ１０
Ｎ百ｍｌ　ａｓｓｕｍｅｓ　ｉｍｇ、ｐｉｘ　ｂｌａｃ
ｋ　ｐｉｘｅｌｓ　ａｒｅ　ＰＯ３ｍＶＥ／／　ａｎｄ
　ｗｈｉｔｅ　ｐｉｘｅｌｓ　ａｒｅ　ｚｅｒｏ　−／
／　Ｉｆ　ｙｏｕ　ｃａｎ’ｔ　ｇｕａｎｎｒｅｅ　ｔ
ｈａｔ、　ｃａｌｌ　ｆｉｒｅＱ　ｍ５ｒｅａｔｉ　ｏ
ｆ　ｆｗＱ１／　ｎｅｇａｔｉｖｅ　ｐｉｘｅｌｓ　ｗ
ｉｌｌ　ｃａｕｓｅ　Ｉｒ０ｕｂｌｅ／／　Ｔｈ１ｓ　
ｒｏｕｔｉｎｅ　ｍｏｄｉｆｉｅｓ　ｆｎｇ、ｐｉｘ　
！　！＃１ｎｃｌｕｄｅ″ｅｒｒ１．ｈ” 指ｎｃｌｕｄｅ　”ｆｉｒｅ、ｈ″ １ｎｌｉｎｅ　ｉｎｔ　１ｍ１ｎ（ｉｒｕ　ａ、　ｉｎ
ｔ　ｂ）　（ｘ℃ｔｕｒｎ（ａ＜ｂ？ａ：　ｂ）；）ｉ
ｎｌｉｎｅ　ｉｎｔ　ｉｍａｘ（ｉｎｔ　ｔｙ　ｉｒ＋
＋　ｂ）（ｒｅｍｒｎ（ａｙｂ？　ａ　：　ｂ）；　１
ｓｔａｏｃ　ｉｎｔ　ｘｄｌ；　／／　ｃｏｐｙ　ｏｆ
　ｉｍｇ、ｘｓｔａｔｉｃ　ｉｎｔ　ｙｄｌ；　／／　
ａｎｄ　ｉｍｇ、ｙ第２表５ｔａＩｉｃ　ｃｈａｒ”　”　ｐｉｘ　１　；／／　
ｙｄ　ｉｍｇ、ｐｉｘｓｔａｔｉｃ　Ｐａ１ｒ”　ｐｉ
；５ｒａｏｃ　Ｐａ１ｒ”　ｐ２；５ｔａｔｉｃ　ｉｎｔ　１ｉｓｔ　５ｉｚｅ　＝　−１
；５ｔａｔｉｃ　Ｓｅｇ　ｍｙｓｅｇＪｓｅｇｅｍｅｎ
ｔ　ｂｅｉｎｇ　ｐｒｏｃｅｓｓｅｄＳＣｇ　Ｌｓｅｇ
；　　　／／　ｌｏｎｇｅｓｔ　ｓｅｇｍｅｎｔ　ｔｈ
ｅｒｅ　ｉｓｍｔｉｃ　ｉｎｔ　ｎｓｅｇ；５ｔａｔｉｃ　ｉｎｔ　ｔｏｔｐｉｘ；／／　ｔｏｔａ
ｌ　ｐｉｘｅｌｓ　ｉｎ　ｉｍａｇｅｉｎｔ＆ｅｌσｍ
ａｇｅ　ｉｍｇ）（／／　ｍａｋｅ　５ｕｒｅ　ｗｅ　ｈａｖｅ　ａｌｌｃ
ａｔｅｄ　ｅｎｏｕｇｈ　ｒｏｏｍ　ｔｏ　ｋｅｅｐ　
ｏｗ　ｐａｉｒ−■Ｓ家ｉｎｔ　１ｓｉｚ　ｗ　ｉｍｇ
、ｘ　”　ｉｍｇ、ｙ；／／　ｎｕｘ　ｐｏｓｓｉｂｌ
ｅ　５ｉｚｅ　ｏｆ■ｓｔ　ｒｅｑｕｉｒｅｄｉｆ　（
ｌｓｉｚ　＞　１ｉｓｔ−ｓｉｚｅ）　（ｉｆ（ｐｌＮｄｅｌｅｔｅ　ｐ　ｉ　：ｄｅｌｅｔｅ　ｐ２；］ｐｉ　ｗ　ｎｅｗ　Ｐａ１ｒ［ｌ５ｉｚｌ；ｐ２　＝　
ｎｅｗ　Ｐａ１ｒ［１ｓｉｚ］；１ｉｓｔ　５ｉｚｅ　
１＋１ｌｓｉｚ；Ｌｓｅｇ、１ｙｓｔ　ｚ　Ｏ；　／／
　ｎｏ　ｌｏｎｇｅｓｔ　ｓｅｇｍｅｎｔ　ｙｅｔＬｓ
ｅｇ、５ｉｚｅ　ｚ　Ｏ；／／　ｆｉｒｓｔ　ｓｅｇ　
ｗｉｌｌ　ｂｅａｔ　ｔｈｉｓ　ｆｏｒ　ｓｕｒｅｎｓ
ｅｇ冨０；ｒｏｔｐｉｘ　＝　０；第３表／／　ｆｉｎｄ　ｆｉｒｓｔ　ｂｌＩＬｃｋ　ｐｉｘｅ
ｌ、　ｓｏ　ｗｅ　ｃａｒ＋　１ｎｉｔｉａｔｒ、　ｔ
ｈｅ　ｂｕｒｎ　を反ｒＣ；ｉｎｔ　ｘｘ；　ｉｎｔ　
ｙｙ；ｆｏｒ　（ｙｙ　ｗａ　（）；　ｙｙ　＜　ｆｎｇ、ｙ
；　ｙｙ＋＋）　（ｒｏｔ　（ｘｘ　ｓｗ　Ｏ；　ｘｘ
　＜　ｉｍｇ、ｘ；　ｘｘ＋＋）　（ｉｆ（ｔｍｇ−Ｐ
ｉＸ［ｙｙｌ［ＸＸｌ　＞　Ｏ）　ｔ／／／　ｆｐｒｉ
ｎｌ（ｓｔｄｅｒｒ、　”ｆｉｒｓｔｘ　ｓ−％ｄ　ｆ
ｉｒｓｔｙ　−％ｄ　Ｏ，ｘｘ、　ｙｙ）：／／　＊　
ｌｏｔ　ｏｆ　ｔｈｅｓｅ　ｔｈｉｎｇｓ　ｍｉｇｈｔ
　Ｉｏｇｉｃｉｌｙ　ｂｅ　ａｒｇｕｍｅｎｔｓ　＋ｏ
　ｂｕｒｎα７７　ｂｕｔ　５ｔａ６ｃ　ｖａｒｉａｂ
ｌｅｓ　ｔｙ　ｆａｓｔｅｒ　＆　ｓｉｍｐｌｅｒｎｓ
ｅｇ＋＋；　／／　ｃｏｕｎｔ　ｔＭｓ　ｓｅｇｍｅｎ
ｔｍｙｓｅｇ、ａｓｈｅｓ　＋ｗ＋　−ｎｓｅ４ｍｙｇ
ｇ、１ｉｓｔ　ｍ　（Ｌｓｅｇ、１ｉｓｔ　１ｍ　ｐｉ
）　？　ｐｉ　：　ｐ２；ｍｙｓｅｇ、５ｉｚｅ　ｗａ
　Ｏ；ｘｄｌ　ｗａ　ｉｍｇ、ｘ；　ｙｄｌ　ｗａ　ｉｍｇ、
ｙ；　ｐｔｘｌ　Ｍ　ｉｍｇ、ｐｉｘ；ｂｕｒｎ（ｘｘ
、ｙｙ’）、７７　ｂｕｍ、　ｂａｂｙ、　ｂｕｒｎｔ
ｉｆ　（ｍｙｓｅｇ、５ｉｚｅ　＞　Ｌｓｃｇ、５ｉｚ
ｅ）　Ｌｓｃｇ　ｇ　ｍｙｓｅｇ；】川ｆｄｅｆ　ＴＥＳＴｆｐｒｉｎｔｆ（ｓｔｄｅｒｒ、　”Ｓａｗ％ｄ　ｓｅ
ｇｍｅｎｔｓ’、　ｎｓｅｇ）；ｉｆ　（ｎｓｅｇ）　
ｆｐｒｉｎｔｆ（ｓｔｄｅｒｒ、　’Ｌｏｎｇｅｓｔ　
（ｃｏｄｅ　％ｄ）　５ｔａｎｓ　ａｔ　％ｄ　％ｄＯ
。Ｌｓｅｇ、ｕｈｅｓ、　Ｌｓｅｇ、１ｉｓ＋［Ｏｌ、ｘ
、　Ｌｓｅｇ、１ｉｓｔ（Ｏｌ、ｙ）；＃ｅｎｄｉｆｉｆ　（ｎｓｅｇ　ｍ　Ｏ）　ｒｅｒｕｍ　−１；ｉｆ
　（ｎｓｅｇ　Ｗ　１）　ｒｅｔｕｒｎ　Ｏ；ｆｌｏａ
ｔ　ｆｒａｃ　ｔａ　ｆｌｏａｔ（Ｌｓｅｇ、５ｉｚｅ
）　／　ｆｌｏａｔＱｏｔｐｉｘ）；ｃｏｎｓ＋　ｆｌ
ｏａｔ　ｍ１ｎｆｒａｃ　ｗｍ　、９；ｉｆ　（ｆｒａ
ｃ　）ｓ　ｍ１ｎｆｒａｃ）　ｒｅｔｕｒｎ　１；ｒｅ
ｔｕｒｎ　２；第４（１）表／／　ｔｈｅ　ｍａｇｉｅＩＩＩｒｅｃｕｒｓｉｖｅ　
ｂｕｒｎｉｎｇ　ｒｏｕｔｉｎｅ／／　ｔｕｒｎｓ　ｔ
ｏ　ａｓｈｅｓ　ａｌｌ　ｐｏｉｎｔｓ　ｔｈａｔ　ａ
ｒｅ　８−ｃｏｎｎｅｃｔｅｄ　ｔｏ　ｔｈｅ　１ｎｉ
ｔｉａｌ　ｐｏｉｎｔｖｏｉｄ　ｂｕｒｎ（ｉｎｔ　ｘｃｅｎｔ。ｔｒｕ　ｙｃｅｎｔ　　　／／　ｃｅｒｕｅｒ　ｏｆ　
３　ｘ　３　ｒｅｇｉｏｎ　ｏｒ　１ｒｕｅｒｅｓｔ／
／　ｉｆ　ｔｈｉｓ　ｐｏｉｎｔ　ｉｓ　ｏｆｆ−ｓｃ
ａｌｅ：１ｆ（ｘｃｅｎｔ＜Ｏｙｃｅｎｔａ　　ｘｃｅ
ｎｔ＞寓ｘｄｌ　　ｙｃｅｎｏｓａ−ｙｄｌ）ｒｅｔｕ
ｒｎ；１７　Ｎ０ＴＥ：　ｔｈｉｓ　ｉｓ　１ｎｄｅｅ
ｄ　ａ　ｃｈｅｃｋ　ｆｏｒ　＞　Ｏ。／／　ｎｏｔ　ｊｕｓｒ　ｎｏｎｚｅｒｏ、　ｓｏ　ｔ
ｈｉｎｇｓ　ｄｏｎ’ｔ　ｂｕｒｎ　ｔｗｉｃｅ。１ｆ（ｐｉｘｌ［ｙｃｅｎｔｌ（ｘｃｅｎｔｌ　＞　Ｏ
）［ｉｎｔ　ｔｏｐ　ｍ　ｍｙｓｅｇ、５ｉｚｅ＋＋；
／／　ｋｅｅｐ　ｔｒａｃｋ　ｏｆ　ｌｅｎｇｔｈ　ｏ
ｆ　ｓｅｇｍｅｎｔｔｏｔｐＬｘ＋＋；／１ｃｏｕｎｔ
ｔｏｔａｌｐｉｘｅｌｓｐｔｘｌ［ｙｃｅｒｕ］［ｘｃ
ｅｒｕ］ｗａｍｙｓｅｇ、ａｓｈｅｓ；／／１ｕｒｎ　
ｔｈｉｓｐｏｉｎｔｔｏａｓｈｅｓｂｕｒｎ（ｘｃｅｎ
ｔ＋１．　ｙｃｅｎｔ＋１）；／／　１ｇｎ１ｔｅ　ｎ
ｅｉｇｂｏｒｓｂｕｒｎ（ｘｃｅｎｔ＋１．　ｙｃｅｎ
ｔ　）；ｂｕｒｎ（ｘｃｅｎｔ＋１．ｙｃｅｎトｌ）よ
りｕｍ（ｘｃｅｎｔ　、　ｙｃｅｎｔ＋１）；ｂｕｒｎ
（ｘｃｅｎｔ　、　ｙｃｅｎｔ−１）；ｂｕｒｎ（ｘｃ
ｅｎｔ−１，ｙｃｅｎｔ＋１）；ｂｕｒｎ（ｘｃｅｎト
１．　ｙｃｅｎｔ　）；ｂｕｒｎ（ｘｃｅｎｔ−１，ｙ
ｃｅｎｔ−１）；＃ｄｅｆｉｎｅ　ｊｕｍｐｂｒｅａｋ
ｓ　ＹＥＳ＃ｉｆｄｅｆｊｕｍｐｂｒｅａｋｓ第５表／／　ｓａｍｅ　ａｓ　ａｂｏｖｅ、　ｂｕｔ　ｄｏｅ
ｓ　ｎｏｔ　＋ｌｓｓｕｍｅ　ｔｈａｔ　ｂｌａｃｋ　
ｐｉｘｅｌｓ　ａｒｅ　ｐｏｓｉｔｉｖｅ；／ｌ　ｎｏ
ｎ−ｚｅｒｏ　５ｕｆｆｉｃｅｓ。ｉｎｔ　ｆｉｒｅ（Ｉｍａｇｅ　ｉｍｇ）（ｆｏｒ　（
ｔｎｔ　ｙｙ　＝　０；　ｙｙ　＜　ｉｍｇ、ｙ；　ｙ
ｙ＋＋）　（ｆｏｒ　（ｉｎｔ　ｘｘ　＝　Ｏ；　ｘｘ
　＜　ｉｍｇ、ｘ；　ｘｘ＋＋）　［ｉｍｇ、ｐｉｘ［
ｙｙｌ［ｘｘｌ　＝　ｉｍｇ、ｐｉｘ［ｙｙｌ［ｘｘｌ
　！＝　Ｏ；ｒｅｔｕｒｎ　（ｆｉｒｅｌ（ｉｍｇ）　
）；第４（２）表ｉｎｔ　ｊｕｍｐ　ｘ　ｉｍａｘ（ｘｄｌ、　ｙｄｌ）
；ｊｕｍｐ　ｗ　ｊｕｍｐ　／　２０；ｉｆ　（ｊｕｍｐ　＜　３）　ｊｕｍｐ　＝　３；ｉｆ
θｕｍｐ　＞　１）　（ｂｕｒｎ（ｘｃｅｎｔ＋ｊｕｍｐ、ｙｃｅｎｔ−ｊｕｍ
ｐ）；ｂｕｒｎ（ｘｃｅｎｔ　、ｙｃｅｎｔ−ｊｕｍｐ
）；ｂｕｒｎ（ｘｃｅｎｔ−ｊｕｍｐ、　ｙｃｅｎｔ−
ｊｕｍｐ）；＃ｅｎｄｉｆ／／　ｉｆ　ｔｈｉｓ　ｐｏｉｎｔ　ＮＯＴ　ｓｅｔ、
　ｏｒ　ａｌｒｅａｄｙ　ｂｕｍｅｄ、　ｄｏ　ｎｏｔ
ｈｉｎｇｒｅｔｕｒｎ；りし６表／ｌ　ｄｏ　ｍｏｓｔ　ｏｆ　ｔｈｅ　ｗｏｒｋ　ｆｏ
ｒ　ｌｉｍａｒ　ｔｒａｎｓｆｏｒｍａｔｉｏｎ　ｐｒ
ｏｇｒａｍ／／　ｐｅｒｆｏｒｍ　ｌｉｒ＋ｅａｒ　ｔ
ｒａｎｓｆｏｒｍａｏｏｎｓ　ｏｎ　ｐｏｓｔ　ｏｆｆ
ｉｃｅ　ｄａｔａ／／　ｉ、ｅ、　ｃｏｎｖｅｒｔ　ｔ
ｏ　５ｔａｎｄａｒｄ　５ｉｚｅ　ａｎｄ　ａｓｐｅｃ
ｔ　ｒａｔｉ。／／　ｓｅｅ　ｌｉｎ、ｐｌａｎ　ｆｏｒ　ｅｘｔｅｎ
ｄｅｄ　ｄｉｓｃｕｓｓｉｏｎ／／　Ｎｏｔｅ：　ｘｙ
ｐＬｘ［］［］　ｗｉｎ　ｃｏｎｔａｉｎ　ｓｍａｌｌ
　ｉｎｔｅｇｅｒｓ　Ｏ，，９／／　ｆｏｒ　ｇｒａｙ
ｌｅｖｅｌｓ　ｂｅｌｏｗ　ｔｈｒｅｓｈｏｌｄ、　ｙ
ｏｕ　ｇｅｔ　ｚｅｒｏ；／／　ｆｏｒ　ｇｒａｙｌｅ
ｖｅｌｓ　ａｂｏｖｅ　ｔｈｒｅｓｈｏｌｄ、　ｙｏｕ
　ｇｅｔ　ｔｈｅ　ｇｒａｙｌｅｖｅｌ　ｎｕｍ反ｒ／
／　Ｔｈ１ｓ　ｇｉｖｅｓ　ｙｏｕ　ｔｈｅ　ｏｐｔｉ
ｏｎ　ｏｆ　ｔｒｅａｔｉｎｇ　ｉｔ　ａｓ　ａ　ｂｏ
ｏｌｅａｎ／／　ｉｆ　ｙｏｕ　ｄｏｎ’ｔ　ｃａｒｅ
　ａｂｏｕｔ　ｇｒａｙｌｅｖｅｌｓ。／／　Ｃａｕｅｒ　ａｌｌｏｃａｔｅｓ　ｔｈｅ　ａｒ
ｒａｙ；　ｗｅ　ｆｉｌｌ　ｉｔ。＃ｔｎｃｌｕｄｅ　＜５ｔｃｕｏ、ｈ＞＃ｔｎｃｌｕｄ
ｅ＜ｒｍｔｈ、ｈ＞ｉｎｌｉｎｅｆｌｏａｔｆｍｉｎ（ｆｌｏａｔａ、ｆｌ
ｏａｔｂ）（ｒｅｔｕｒｎ（ａ＜ｂ？ａ：　ｂ）；１ｉ
ｎｌｉｎｅ　ｆｌｏａｔ　ｆｍａ、ｔ（ｆｌｏａｔ　ａ
、　ｆｌｏａｔ　ｂ）（ｒｅｔｕｒｎ（ａ＞ｂ？　ａ　
：　ｂ）；　）ｉｎｌｉｎｅｉｒｕｉｍｉｎ（ｔｎｔａ
、１ｒｕｂ）（ｒｅｔｕｒｎ（ａ＜ｂ？ａ　：　ｂ）；
）ｉｎｌｉｎｅ　ｉｎｔ　ｉｍａｘ（ｉｎｔ　ＩＬ、　
ｉｎｔ　ｂ）（ｒｅｔｕｒｇ（ａ＞ｂ？　ａ　：　ｂ）
；　］第７表ｖｏｉｄｄｏ　ｔｉｎ（ｃｏｎｓｔ　Ｉｍａｇｅ　ｒａｗＪｌ　１ｎｐｕｔ　ｉ
ｍａｇｅｃｏｎｓｔ　ｉｎｔ　ｋｎｏｗｎ−ｆｉｔ、／
／　１１１１＝＞　ｃｈａｒ　ａｌｒｅａｄｙ　ｆｉｌ
ｌｓ　ｂｏｘ　ｐｄｉｍ　ｂｙ　ｑｄｉｍ。Ｉｍａｇｅ　ｄｅｓ、　　／／　ｒｅｓ＋ｒｌｔ　：　
ａｒｒａｙ　ｏｆ　ｓｍａｌｌ　ｉｎｔｅｇｅｒｓＦＩ
ＬＥ”　ｐｍｍ−ｆｐＪ／　ｐａｒｕｎｅｔｅｒ　ｆｉ
ｌｅ　ｆｉｌｅｐｏｉｎｔｅｒ／／　Ｏ＝＞　ａｌｌ　
ｐａｒａｍｅｔｅｒｓ　ｔａｋｅ　ｄｅｆａｕｌｔ　ｖ
ａｌｕｅｓｃｃｎｓｔ　ｃｈａｒ”　ｓｎａｍｅ／／　
ｆｉｌｅｎａｍｅ、　ｆｏｒ　ｉｎｆｏｒｍａｔｉｏｎ
ａｌ　ｍｅｓｓａｇｅｓ／ｌ　ｐｒｏｖｉｄｅ＋”’　
ｉｆ　ｙｏｕ　ｃａｎ’ｔ　ｄｏ　ｂｅｔｔｅｒ］（Ｐａｉｒ”　ｂｌ；ｉｎｔ　１）Ｉ）、　（Ｉｑ；ｆｏｒ　（Ｑｑ＝　Ｏ；　ｑｑ　＜　ｒａＷ、）’；　
Ｑｑ＋＋）　（ｆｏｒ　（ｐｐＩｌｌ　Ｏ；　ｐＰ　＜
　ｒａｗ、ｘ；　ｐｐ＋＋）　ｉｆ（ｒａｗ、ｐｉｘ（
ｑｑｌ［ｐｐｌ）　（ｂｌ［１ｂ１１．ｘ　−ｐｐ；ｂｌ［１ｂ１１．ｙ　−ｑｑ；ｉｂｌ＋＋；】ｄｏ−１ｉｎ　１（ｒａｗ、ｘ、　ｒａｗ、ｙ、　ｂｌ
、　ｉｂｌ、　ｋｒ＋ｏｗｎ−ｆｉｔ、　ｄｅｓ、　ｐ
ａｒａｍ　ｆｐ、　ｓｎａｍｅ）；ｄｅｌｅｔｅ（ｂｌ
）；第９表／／　ｆｉｎｄ　ｒａｗ　ｂｏｕｎｄｉｎｇ　ｂｏｘｉ
ｎｔ　ｐＯ，ｑＯ，ｐ２．　ｑ２；ｉｎｔ　ｉｂｌ；ｉｎｔ　ｐｐ＋　ｑＱ；ｉｆ　（ｋｎｏｗｎ　ｆｉｔ）　（ｐＯ＝　Ｏ；　ｑＯ＝　Ｏ；ｐ２＝ｐ己ｍ；　ｑ２　＝　ｑ山ｍ；）　ｅｌｓｅ　（ｐｏ　＝　ｂｌ［０］、ｘ；　ｑＯ＝　ｂｌ［０］、ｙ
；ｐ２　＝　ｐｏ；　ｑ２　＝　ｑｏ；ｆｏｒ　（ｉｂｌ　ｚ　１；　ｉｂｌ　＜　ｎｂｌ；　
ｉｂｌ←）　（ｐｐ　＝　ｂｌ［１ｂ１１．ｘ；　ｑｑ
　＝　ｂｌ［１ｂ１１．ｙ；ｐＯ＝　１ｍ１ｎ（ｐＯ・
ＰＰ）；ｐ２判ｒｎａｘ（ｐ２．　ｐｐ）；ｑｏ　＝　ｉｍ紬（ｑＯ，ｑｑ）；ｑ２　＝　ｉｍａｘ（ｑ２．　ｑｑ）；ｐ２→；ｑ２＋
＋；第８表ｖｏｉｄ　ｄｏ−ｆｉｎ　１（ｃｏｎｓｔ　ｉｎｔ　ｐｄｉｍＪＩ　５ｉｚｅ　ｏｆ　
１ｎｐｕｔ　ａｒｒａｙｃｏｎｓｔ　ｉｎｔ　ｑｄｉｍ
、／／　、。ｃｏｎｓｔ　Ｐａ１ｒ”　ｂｌ、／／　１ｎｐｕｔ　：
　１ｉｓｔ　ｏｆ　ｂｌａｃｋ　ｐｉｘｅｌｓｃｏｎｓ
ｔ　ｉｎｔ　ｎｂｌ、　／／　５ｉｚｅ　ｏｆ　５ａｉ
ｄ　１ｉｓｔ／／　Ｏ＝＋）　ａｌｌ　ｐａｒａｍｅｔ
ｅｒｓ　ｔａｋｅ　ｄｅｆａｕｌｔ　ｖａｌｕｅｓ／／
　ｐｒｏｖｉｄｅ　””　ｉｆ　ｙｏｕ　ｃａｎ’ｔ　
ｄｏ　ｂｅｕｅｒＦｇｅｔ（ｋｅｒｎｅｌ、　２）Ｊｃ
ｏｎｖｏｌｕｔｉｏｎ　ｋｅｒｎｅｌ　（ｕｎｉｔｓ　
ｏｆ　ＰＱ　ｒｏｗｓ／ｃｏｌｓ）Ｉｇｅｔ（ｍｉｎｇ
ｒａｙ、　３）Ｊｌ　ｔｈｉｓ　ｏｒ　ｌａｒｇｅｒ：
　ｒｅｔｕｒｎ　ｇｒａｙｌｅｖｅｌ、　ｅｌｓｅ　ｒ
ｅｔｕｒｎ　ｚｅｒ。ｆｌｏａｔ　ｐｋｅｍ　ｍ　ｋｅｒｎｅｌ；ｆｌｏａｔ
　ｑｋｅｍ　ｍ　ｋｅｒｎｅｌ；ｃｏｎｓｔ　ｉｎｔ　
ｘｏ　ｍ　Ｏ；ｃｏｎｓｔ　ｉｎｃ　ｙＯｗ　Ｏ；ｃｏｎｓｔ　ｆｌｏａｔ　ｘｍｉｄ　ｍ　（ｄｅｓ、ｘ
　−ｘＯ）　／　２．０；ｃｏｎｓｔ　ｆｌｏａｔ　ｙ
ｍｉｄ　ｗ　（ｄｅｓ、ｙ　−ｙＯ）　／　２．０；第
１０表／／　ｃａｌｃｕｌａｔｅ　ｓｏｍｅ　ｍｏｍｅｎｔｓ
：ｆｌｏａｔ”　ｘｙｆｌｔ　ｘ　ｎｅｗ−ｆｌｏａｔ
（ｄｅｓ、ｙ、　ｄｅｓ、ｘ）；／／　ｎｏｔｅ　ｔｈ
ａｔ　ｗｅ　ａｒｅ　ｔｒｅａｔｉｎｇ　ｔｈｅ　ｐｉ
ｘｅｌｓ　ａｓ　ＢＯＸＥＳ　ｏｆ　ｉｎｋ、　ｎｏｔ
　ｐｏｉｎｔｓ／／　ｓｏ　ｔｈｅ　（０，０）　ｐｉ
ｘｅｌ　ｅｘｔｅｎｄｓ　ｆｒｏｍ　（０，０）　ｔｏ
　（１−ｅｐｓ、　１−ｅｐｓ）／ｌ　ａｎｄ　ｈａｓ
　ｉｔｓ　ｃｅｎｔｅｒ　ａｔ　（，５，，５）ｆｌｏ
ａｔ　ｐｍｉｄ　ｍ　（ｐｏ　＋ｐ２）　／　２．０；
ｆｌｏａｔ　ｑｍｉｄ　ｘ　（ｑＯ＋ｑ２）　／　２．
０；／／　ｂｕｔ　ｉｆ　ｗｅ　５ｈｉｆｔ　ｔｈｅ　
ｍ１ｄｄｌｅ　ｈａｌｆ　ａ　ｂｉｔ。／／　ｗｅ　ｃａｎ　ｐｒｅｔｅｎｄ　ｔｈｅ　（０，
０）　ｐｉｘｅｌ　ｉｓ　ｃｅｒｕｅｒｅｄ　ａｔ　（
０，０）ｆｌｏａｔ　ｐＩｊ１Ｘ＋＋ａ　ｐｍｉｄ　−
，５：ｆｌｏａｔ　ｑｍｘ　ｍ　ｑｍｉｄ　−，５；ｆ
ｌｏａｔ　ｍｐｑ　ｗｓ　Ｏ，；／ｌ　ＰＱ　ｒｎｏｍ
ｅｎｒｆｌｏａｔ　ｍｑｑ　ｗ　Ｏ，；／／　ＱＱ　ｍ
ｏｍｅｎｔｆｏｒ　（ｉｂｌ　ｚ　Ｏ；　ｉｂｌ　＜　
ｎｂｌ；　ｉｂｌ＋−＋）　［ｐｐ　−ｂｌ［１ｂ１１
．ｘ；　ｑｑ　−ｂｌ［１ｂ１１．ｙ；ｍｐｑ　４ｍ　
（ｑｑ　−ｑｍｘ）”（ｐｐ　−ｐｍｘ）；ｍｑｑ　＋
＝　（ｑｑ　−ｑ皿ｔ）”（ｑｑ　−ｑｍｘ）；ｆｌｏ
ａｔ　ｔｈｅｔａ　ｗ　ｍｐｑ　／　ｍｑｑ；／／　Ｎ
ｏｔｅ：　５ｉｎｃｅ　ｐｉｘｅｌｓ　ａｒｅ　ｎｕｍ
ｂｅｒｅｄ　ｆｒｏｍ　ＵＰＰＥＲ１ｅｆｔ。／／　ｐｏｓｉｔｉｖｅ　ｔｈｅｔａ　ｃｏｒｒｅｓｐ
ｏｎｄｓ　ｔｏ　”／ｌ　ｎｅｇａｔｉｖｅ　ｔｈｅｔ
ａ　ｃｏｒｒｅｓｐｏｎｄｓ　ｔｏ　”／’／／　（ｐ
ｃ、　ｑｔ）　ｉｓ　ｔｈｅ　ｃｏｏｒｄｉｎａｔｅ　
ｗｈｅｒｅ　（ｐ、ｑ）　ｗｉｌｌ　ｇｏ　ｗｈｅｎ　
ｔｈｅ　ｃｈａｒ　ｉｓ　ｄｅｓｋｅｗｅｄ。／／　Ｔｈ１ｓ　ｉｓ　ｎｏｔ　ｑｕｉｔｅ　ｔｈｅ　
ｓａｍｅ　ａｓ　（ｘ、ｙ）　５ｉｎｃｅ　ｔｈｅ　１
ａｔｔｅｒ／／　ｈａｓ　５ｉｚｅ　ｃｈａｎｇｅｓ　
ａｓ　ｗｅｌｌ。第１１表第１２表Ｊｌ　Ｃａ１ｃｕｌａｔｅ　ｍｉｎ　ａｎｄ　ｍａｘ　
ｈｏｎｚ　ｃ■―１ｎａｔｅｓ。／／　ｍｅａｓｕｒｅｄ　ｒｅｌａｏｖｅ　ｔｏ　ａ　
１ｉｎｅ　ｐｕ耐１ｅｌ　ｔｏ　ｔｈｅ　５ｉｄｅｓ　
ｏｆ由ｅ　ｐａｒａｌｌｅｌｏｇｒａｍ〃There are a wide variety of applications in which it is desirable for machines to automatically recognize, analyze, and classify character patterns in a given image. The explosive development of computer-based information collection, processing, handling, storage, and transmission systems is providing the technology that enables the realization of these desires. Although sophisticated programs have been written for general purpose computers to perform pattern recognition, their level of success has been limited. Their success has mostly been achieved in the area of recognizing standard printing fonts. Character recognition technology dating back to the early 1960's used a method of tracing the curve of the character to be recognized. This was intuitively appealing, but unfortunately it often failed when the letters were distorted or contained unnecessary strokes. Bakls et al. (IBM) published ``An Experimental Study on Machine Recognition of Handwritten Numbers''.
5tudyor Machine Recognize
ion or Hand Pr1nted Nu＊era
ls) J, IEEE Transactions on Systems Science and Cybernetics (IE
EE Transactlons on Sys
tems 5clence and Cyberne
tlcs) Vol. 5SC-4, No. 2, July 1988 issue, reports on a method for recognizing handwritten digits. In the system described, the numbers are 25X32
is converted into a binary matrix. Features are extracted to reduce the dimensionality of the 800-bit vector (25X'12) to about 100, and the 100-bit vector is sent to several classifiers. Some kind of “normalization” of characters (n
. rg+a11zat1on)J is also performed. The authors ranged from 8B to 99.7B depending on the handwriting samples used.
% recognition rate is reported. Due to the low recognition rate compared to the desired level for commercial applications, the authors conclude that ``tracking courses should combine curve-following measurements...with automatic feature selection and parallel decision logic.'' It seems so,'' he concludes. In what appears to be a follow-up study, R.G. Cathy (R, G, Ca5ey) has applied the "normalization" of Bakis et al.
An experiment extended to the method of ng) is described. "Moment normalization of handwritten characters (No ant No
rsaLlzation of Handprint
Characters) J, IBM Journal on Research Development (IBM Journal
l or Research Development
), September 1970, pp. 548-557. Cathay uses curve tracking as suggested by Bakis et al.
and a decision method system including template matching, clustering, autocorrelation, weighted cross-correlation, and region segmentation with n elements. In the paper below, Naylor (also from IBM) describes OCR (optical character recognition) using a computer, an interactive graphics console, and gradient normalization.
Reporting on the system. ``Some studies in the interactive design of character recognition systems''
teractiveDeslgn of Charac
ter Recognltlon! 3 systems)
IEEE Transactions on Computers
Co5puters), September 1971 issue, 107
Pages 5-1086. The purpose of his system was to develop appropriate logic to identify the features to be extracted. Another extracted feature approach is described by Todd in US Pat. No. 4,259,661, issued March 31, 1981. According to Todd's method, the rectangular area formed by the outline of the character is normalized to a predefined size and then divided into sub-areas. The "darkness" J of the image within each of the sub-regions is evaluated and a "feature vector" is formed from the set of darkness estimates. The feature vector is compared to a stored set of feature vectors representing the character, and the best match is selected as the recognized character. [5PTA: Proposed algorithm for thinning a binarized pattern (SPtA: A Proposed A
lgoritha for Thinning Bin
ary Patterns) J, IEEE Transactions on Systems Man and Cybernetics (IEEE Transactions on Systems Man and Cybernetics)
stems, mans, and cybernetic
s) s SMC-14, No. 3, 1984 576
In a paper titled, Monthly Issue, pp. 409-418,
Naccache et al. have published other approaches to the OCR problem. This method provides that patterns are often composed of wide strokes and that it is advantageous to skeletonize the patterns. As explained by Nakash et al., "Skeletonization is performed by iteratively removing black points along the edges of the pattern (i.e., turning black points into white points) until the pattern is thinned to form a line drawing. It is ideal if the initial pattern is thinned to its solid axis. This paper briefly describes 14 different known skeletonization algorithms and then describes its own algorithm (
SPTA). All described skeletonization algorithms, including 5PTA, are based on the idea of passing a 3 row by 3 column square window (commonly referred to as a 3×3 window) over the image. When a square 3x3 window passes over the image, the algorithm uses 8 pixels surrounding the center pixel.
Evaluate the two neighboring pixels and, based on the evaluation, convert the black center point to white or leave it black without changing it. Pattern classification has also received significant support from other areas that have shown recent advances in related fields. In particular, the work by Hopfleld disclosed in U.S. Pat. No. 4,660.188, issued April 21, 1987,
・Advanced parallel computing networks (“neural networks”) have been in the spotlight. In particular, Gullehsen et al., ``Pattern classification using neural networks: a practical system for symbol recognition
ifieatlon by Neural Netwo
rks: An Experimental System
m for 1eon Recognition)J,
The work reported in the Proceedings of the IEEE 1st International Conference on Neural Networks (Cardll et al. W), pages IV-725-732, focuses on character classification methods. The system they describe uses some kind of image processing but no feature extraction. Instead, they believe that neural networks are
training pro
It relies completely on the original classification intelligence acquired through cess). Although the reported system appears to work well, many issues remain to be explored, as suggested by the authors. System performance is below acceptable limits. Many other character classification techniques and approaches (research methods)
and algorithms exist. However, for purposes of this disclosure, the above references provide a reasonable description of the most pertinent prior art. refers to the fact that, despite all the efforts put into solving character recognition (i.e., classification) problems, existing systems do not provide the accuracy that would allow reliable automatic character recognition. Let's just keep it that way. SUMMARY OF THE INVENTION The present invention provides a highly efficient OCR system by selecting the right combination of techniques and modifying those techniques to be high performing in terms of speed and accuracy. The general approach of the present invention is to remove meaningless variations and capture meaningful variations before classification. In particular, in accordance with the principles of the present invention, a character to be recognized is scanned to form a pixel image, and that image is scaled and "cleaned." The scaled and cleaned image is tilt corrected and skeletonized to form a thinned image. A nine-dimensional network extracts features from this thinned image to form an image map. The features are then combined to form "super features." The image is then coarsely blocked to reduce the dimensionality of the feature vectors. Finally, a classifier identifies the character using the feature vector. EXAMPLE FIG. 1 shows a flowchart of our method for character or symbol classification. At block 10, a character image is advantageously captured and stored in a frame buffer, such as a semiconductor memory device. The image may be acquired via electronic transmission from a remote location, or it may be acquired "locally" using a scanning camera. Regardless of the source of image acquisition according to typical embodiments, the image is represented by an ordered set (array) of pixels. The value of each pixel corresponds to the light (brightness, color, etc.) emitted from a particular subregion of the image. Pixel values are stored in storage. Smudges and extraneous strokes are often found near the characters, and their presence cannot but make the recognition process more difficult. According to our invention, block 20 follows block IO, the function of which is to clean the image. This is the first step in our effort to remove meaningless variability from images. Typically, an image of a symbol or character, such as a number, contains one large group of (adjacent) pixels and a small group of decimals, which may be zero. Our cleaning algorithm essentially identifies all such groups and removes all but the largest group. If the sum of the deleted groups constitutes more than a certain percentage of the initial image, it indicates that the image is anomaly, and this fact is noted for later use. Within the context of this description, it is assumed that image symbols are composed of dark strokes on a light background. "Inverted" images can of course also be processed with similar equipment. The cleaning algorithm described above also assumes that the expected symbol set in the image does not contain symbols that require uncombined strokes. The numbers 0-9 and the Latin alphabet (excluding lowercase l and j) form such a set, but most other alphabets (Hebrew, Chinese, Japanese, Hangul, Arabic, etc.) Contains unbound strokes. For such other sets, a somewhat different cleaning algorithm would have to be applied, looking only at each disjoint region rather than the entire set of regions. There are many methods that can be applied to detect and identify these foreign regions. The method we use is similar to brush fire. According to our method, the image is raster scanned from top to bottom to find the "black" pixels. When such a group is found (ie, a black pixel not previously considered is encountered), the scan is interrupted and a "brushfire" is ignited. That is, the encountered pixels are marked with an identifier, and the marking starts the dilation process. In the dilation process, each of the eight immediate neighboring pixels is considered. Those neighboring pixels that are black are similarly marked with an identifier, each marking starting its own dilation process. In this way, I first encountered "black"
The groups of pixels allow all groups to be quickly identified by the selected identifier. At this point in the process, scanning of the image is resumed so that other groups can be found and identified (with different identifiers). When the scan is complete and all "black areas" have been identified, area calculations can be performed. As shown above, all groups except the largest group are removed from the image (ie, inverted from dark to light or turned off). At this point, it can be seen that in character recognition technology it is more important not to make the mistake of incorrectly identifying a character than to refuse to make a decision. For this reason, in systems designed to identify numbers or other character sets without discontinuous strokes,
The region removal threshold should be set to a fairly low level. Normally it would be expected (in the 0-9 character set and the Latin alphabet mentioned above) that the pixels comprising a significant portion of the image would be strictly contiguous. On the other hand, perhaps this should be an exception when the areas are only very slightly separated, and the letters (e.g. when written with a pen that does not produce ink well or on a coarse paper surface) External information leads one to believe that a stroke can be broken accidentally. In order to prepare for such contingencies,
) Our method for spreading J is to spread 8 additional pixels a little further away from the 8 immediate neighbors (the 8 pixels are the large window corner and the center pixel of the large window side). Contains an oncisine for defining a neighboring pixel to include. In the end, we allow "fire" to jump over "f'ire break". Resize the image in block 25 to a predetermined size (
A scaling process follows the cleaning process. Size H! ! Of course, it also removes the meaningless variability of the image. The cleaning process performed before size adjustment is performed because it is desired not to adjust the T size of the image while containing smudges. The scaling process may use any of a number of different algorithms. For example, according to one algorithm, an image can be resized by the same factor in both dimensions until one dimension of the image reaches a predetermined size. Other algorithms perform size adjustment in both dimensions independently with some limit placed on the maximum difference between the size adjustment magnifications in both dimensions. Both approaches work well, so we leave the choice of algorithm and its implementation to the reader. We first size each character image to the appropriate number of pixels, such as an 18x30 pixel array, using the algorithm described. People generally write letters diagonally. The slope differs from person to person. Character slant or skew is another meaningless variability of handwriting that carries no information, so we remove it. Returning to FIG. 1, block 30 following block 25 performs deskewing of the image.
ng). In other words, the function of block 3o is
The goal is to make all the letters more evenly upright. Block 30 can use any of a number of conventional procedures to perform skew correction of the image. One such procedure performs a transformation of the following form on the image. Here, X and y are the initial coordinates of the image, and Xo and y. defines the origin, U and V are the coordinates of the transformed image, and m, and yy are the image moments calculated by ), the pixel at position Xs'/ is "
Takes a value of 1 when the color is black, and 0 otherwise.
takes the value of The effect of this function is to reduce the xy moments to essentially O. Size adjustment (block 25) and tilt correction (block 3)
0) are both linear transformations. It would be advantageous to perform a composite transform on the cleaned image to directly form a tilt corrected image. By performing this compound operation, it is possible to avoid the need to display the resized image in explicit form as an array of pixels. This will eliminate sources of (computational) noise. Block 40, which follows block 30 in FIG. 1, thins the image. Image thinning also removes meaningless variability in the image. As mentioned above, skeletonization according to the prior art (skeletonizatS
The method on) uses a 3x3 window that is passed over the image. The center point of the H.383 window is turned off if certain conditions are met; and for most methods, these conditions involve iterative testing with various predefined window conditions. For example, Ben-Lan and Monto 04
01toto) The algorithm detects a dark center if it has the following conditions: 1) its pixel has at least one bright neighbor out of four; and 2) its neighborhood has eight predefined 3x3 windows. does not match any of the following; if it satisfies, it is deleted (i.e., turned off or cleared). A four-neighborhood is a pixel to the east, north, west, or south of the considered pixel. Until recently, processors could only handle one task at a time anyway, so algorithms similar to the above were perfectly usable in software implementations. However, these algorithms are inevitably slow due to the nature of their procedures. Moreover, each of these prior art tests targets certain features of the pattern but not others. Different tests must be used to thin the strokes of different characters (e.g. vertical and horizontal lines). Furthermore, when using prior art tests, it is necessary that at least some of these tests be performed sequentially before a particular pixel can be reliably detected; and unless these tests are performed, the pixel cannot be turned off. The embodiment of FIG. 2 illustrates this problem. In FIG. 2, templates 100 and 110 are two 3×3 pixel windows. The three top pixels in template 100 are circle-shaded to indicate looking for ON pixels. The center pixel and the pixels within the center of the bottom row are shaded to indicate looking for OFF pixels. The remaining pixels are blank to indicate a "don't care" condition. Template 100 has a bright section (pixel 10151) above a dark section (pixels 104 and 105).
Looking for edge conditions of G2 and 103) predicts that the dark region must be at least two pixels thick. When such a condition is encountered, the center pixel (104)
can be changed from on to off (dark to light). Therefore,
Template 100 provides a mechanism for starting from the top and nibbling away the on area until only one on row is left. Template 110 works similarly, except that it has a bottom row that searches for off pixels while the center pixels of the first and second rows search for on pixels. Template 110 scrapes the on (dark) area from the bottom. The above template, where horizontal lines are thinned and vertical lines are not thinned, indicates that it is preferable to pass a number of different templates over the image, where each template is sensitive to different features of the image. It is also preferable (from a speed point of view) to pass the various templates simultaneously. However, if templates 100 and 110 were applied to image segment 10B of FIG. 2, the two pixel wide horizontal lines depicted would be completely removed, so this would not be applicable in this case. The top row will be removed by template 100 and the bottom row will be removed by template 110. If thinning is to be performed efficiently, interdependencies between different templates must be broken. Unexpectedly, this interdependence is 3X3
It has been found that it is possible to defeat this by using a larger window. Therefore, we use a template set that includes at least some templates larger than 3X3. Some things are 3X3 and some things are 3
×4, some are 4×3, and some are 5×5. A feature of the set is that the templates can be passed over the image simultaneously. This possibility is realized by the special selection of templates that allow images to be modified in response to one template without detrimentally affecting the ability of other templates to modify images independently. This rather unique set of templates is shown in Figure 3. We have found that the set of templates shown in Figure 3 is a sufficient set. According to our invention, such a set is characterized in that it contains at least one template larger than 3X3, although other sets are of course possible. To explain the operation of the illustrated templates, we begin with templates 120 and 140. These templates correspond to templates 100 and 110 of FIG. Template 120 is shown as a 5x5 array, but the outer columns and rows are "don't care" so essentially this forms a 3x3 window. Template 120 differs from template 100 in that pixels 121 and 122 in template 120 are tested to see if they are on pixels, and the corresponding pixels in template 100 are set to "don't care." is different. That is, template 120 ensures that steered (lightened) pixels are above lines that extend in both directions. Template 140, on the other hand, differs from template 110 in that it is actually a 3X4 template. It contains a 3x3 section similar to 383 template 110 (except for pixels 141 and 142)
And it also includes pixel 143 in the center of the first row. Pixel 143 eventually requires the horizontal line to be three pixels wide before the pixel is allowed to be scraped (from the bottom). Templates 130 and 150 are template pair 120
and 140 similar template pairs are formed. Templates 130 and 150 thin vertical lines. Template 16-Q, 170.180 and 190
point to the right, left, top, and bottom, respectively. "knees"
) Thin J; templates 200, 210,
220 and 230 thin the inclined line from the top and from the bottom, and so on. Template 160-230
It will be seen that all are 5x5 templates. Returning to FIG. 1, the skeletonization block 40 is followed by a feature extraction block 50. Although the operations are similar, skeletonization differs from feature extraction in mechanical terms. In skeletonization,
Unwanted pixels are identified and they are changed from dark to bright. In feature extraction, relatively macroscopic features are identified that aid in character classification. The macroscopic features identified are those types of features that do not depend on the size or thickness of the letter, but that give the letter its particular "signature."
) gives J. It is therefore these characteristics that block 50 seeks to identify. Feature extraction is operationally accomplished by passing a partial window over the image. Each window in our system is a 7x7 template, and each template has endpoints,
Detecting the presence of certain features such as diagonal lines, horizontal lines, vertical lines; etc. Detection is done by a main rule meaning that a feature is concluded to be present when the majority of 49 pixels (7×7) fit into the template. In our system, we use 49 different 7X7 templates as shown in FIG. For each template, we form a "feature map" that essentially indicates the coordinates within the image array where the template's pattern matches the image. After developing the 49 feature maps corresponding to the 49 templates of FIG. 4, at block 60 we form a number of super feature maps that are logical combinations (AND and OR) of the feature maps. In this way, we reduce the set from 49 maps to 18 maps (in an 18x30 pixel array). The number of reductions was determined heuristically. We call this detected feature a "map" because we construct the array (in the memory where we store the array) and we place the detected feature within the appropriate position within the array. In this way we record the presence of features and their locations. Although other mechanisms can be used to record "hit" location indications, it is conceptually easier to think in map form. The 18X30 array was found to be too detailed for classification purposes. Although the details can actually mask the text, they make the classification task more difficult (as the saying goes, "You can't see the forest for the trees"). Therefore, block 70 is coarsely blocked (blockl
ng) to reduce the 18×30 feature map to only a 3×5 feature map. This results in a final map or vector of 270 bits corresponding to 18 3.times.5 maps. Finally, block 80 executes a classification algorithm to
Determine the most likely candidate from the given 270 bits. Once you know which template most likely corresponds to the character to be identified, you can use the minimum humming (
A simple algorithm such as determining the distance (Ha++ning) would be sufficient. The key, of course, is in determining these templates; and that aspect requires learning methodologies (such as backpropagation) that the technology currently deals with. Hardware Implementation FIG. 1 shows the process of our OCR system, but it also closely represents the hardware implementation. The actual details of signal flow will vary with the particular design, but are entirely within the scope of normal circuit design techniques. For the purposes of the following discussion, our system operates in a vibline manner and each electronic circuit block provides the necessary signals and control to the following circuit blocks, along with the necessary identification as to which pixel is to be considered. Let's assume that. As previously indicated, block 10 includes the usual equipment associated with the particular source of images to be classified. It is simply a commercially available "frame grabber".
It may also be a retardo video camera that combines a camera (Bber) J and a storage device. When the classification process is initiated, a storage device is accessed to retrieve the center pixel and 24 neighboring pixels;
The set of searched signals is provided to block 20. Blocks 20 and 30 are executed in this case on a SUN workstation using the simple program shown in the appendix. Local memory is included in the microprocessor for storing image signals and, if necessary, temporary calculation results. In fact, any microprocessor can be used equally well, but if you need faster speeds than you can get with a microprocessor,
Specialized hardware can typically be designed to perform the required calculations. In fact, since the required operations are simply additions, subtractions, comparisons, and rudimentary multiplications, pipelined architectures that provide extremely high throughput can easily be designed. The output of block 30 is a series of signal sets, each having a center pixel and its neighboring pixels associated with it. Block 40 is implemented using the neural network of FIG. 5, which includes a series-coupled switch 400, a template match network 410, and a threshold network 420. An input signal corresponding to the values of the 25 pixels of the image covered by the 5×5 window at a given moment is applied to switch 400 at the input of 410 . Switch 400 ensures that these values are provided to the network simultaneously. Network 410 includes 25 input leads and a number of output leads equal to the number of templates stored. Within network 410, all input leads are coupled to each output lead through a column of preset coupling nodes. Each such column of combined nodes (e.g., the column containing nodes 411-414) has a stored one
Compatible with two templates. Therefore, the signal of each output lead indicates which template the input signal belongs to. More specifically, the coupling node has three types of "diversity" to know the excited state (E), the blocked state (1) and the "don't care" state (D). The response to a match or mismatch is different for each variety according to the truth table below. Input Synapse Output OE 0 1 E 1 1 l -2 0D 0 1 D O The node 411 that executes this truth table can be easily realized using a gated amplifier. Information regarding whether a node is an E-node, I-node, or D-node can be stored in two flip-flop sets attached to each node (when diversity is desired). Alternatively, the information is "connected" to an array of links attached to the array of nodes.
) J is possible. Programming (i.e., binding) of templates can be accomplished by burn-1hrogh of the appropriate links. If the template were completely and completely immutable, it would naturally be possible to design the template information directly into the integrated circuit mask of the array of nodes. Current in the output line flows through the impedance, and that flow raises the voltage on each output line of network 410 to a level proportional to the degree of matching between one in the set of input signals and the excitation node. It will be appreciated that the voltage will also be reduced by the degree of matching between one in the set of input signals and the blocking node. The output line of network 410 is fed to a threshold circuit 420, where its impedance can optionally be placed. Network 420 provides a set of thresholds to the output signal of network 410. In particular, the circuit network 420 is
A set of two sources having one input responsive to the input lead of network 420 and a number of sources (e.g. 425-427) connected to second inputs of amplifiers 421-424.
Includes human power amplifiers (e.g. 421-424). Each of the sources supplies a different current and each amplifier 42 accordingly
1-424 develops a voltage on its second lead related to the particular coupling that lead has to sources 425-427. In this manner, different amplifiers within circuitry 420 can be provided with different thresholds. The output leads of network 420 are the outputs of amplifiers 421-424, and they assume a logic 1 or 0 value depending on whether the amplifier input signal exceeds or does not exceed a threshold. Block 50 is composed of a neural network as shown in FIG. However, since block 50 handles 7X7 templates as opposed to the 5X5 templates of block 40, storage 55 is inserted between the two neural networks to buffer the data. Block 60 generates 18 feature maps. It simply accepts the output of block 50 and stores the appropriate information in memory along with a signal characterizing the consistency of the center pixel. The result is 18 memory segments, each segment containing information about features found within the image. Each such segment is therefore one of our feature maps. Coarse blo block 70
cking) is accomplished by using 18 additional small memory segments, possibly within the same physical storage device. Block 70 stores information within these small memory segments regarding features found within appropriately selected portions of the large memory segments. When the size of the initial image is 18 pixels by 30 pixels, the selection is easily achieved using a counter operating with modulus 5, where the full value of the counter (full v
alue) is used to access the large segment, while the whole number (whole n
uIlber) is used to identify cells within the 18 small memory segments. The 270 storage locations of the small memory segment are block 7
produces an output of 0, resulting in a vector representing the characters contained within the image. The last function that needs to be performed is to feed this vector to some circuitry that will select the most likely candidate character for the given feature vector. This is the function of block 80. Block 80 may be performed in a number of ways. For example, it would be advantageous to use the teachings of Hopfield's content matching scheme in the aforementioned U.S. Pat. With such information stored, the content-matching (associative) memory identifies the feature vector of the character that is closest to the supplied feature vector. The network is extremely robust (strongly adaptive) so that it makes "accurate" selections even when the input appears to be highly distorted. However, the design of the feedback network for the Hopfield circuit is somewhat difficult because feedback to the Hopfield circuit causes all stored vectors to be distributed within the feedback network and mixed with each other. This difficulty limits how we recognize the letter ``4'', for example, and the limits of when it is possible to recognize ``4'' and when we are unsure and refuse to decide. is compounded by the fact that it is not known exactly. Despite this, we can recognize the letter "4" just by looking at it! This research attempts to solve this problem by having the classification circuit perform "learning" through trial and error in order to arrive at the correct decision. One configuration with the possibility for such "learning" is shown in FIG. In the current state of the art, this technique is generally referred to as "backpropagation"
ack propagatlon). This is, for example, ``Exploration within the fine structure of recognition (Exploration)''.
rations 1n the Microstruc
ture of' Cognltlon) J, M I
T Publishing, 1986, Chapter 8 “Parallel distribution processing (Par
allel Distrlbuted Process
ing) J., D.E. Ru+++elhart, J.L., Me C1elland, eds. [Learning
Internal Representatlons
by Error Propagatlon) by E. Ramelhardt in J. FIG. 6 includes optical interconnect networks 81 and 82 coupled in series. An input signal set is applied to the input of network 81 and an output signal set appears at the output of network 82. Each network has multiple input and output leads, with each input lead connected to all of the output leads. More specifically, each input lead i has a connection weight
It is connected to each output lead j via Wlj. In our application, network 81 has 270 input leads and 40 input leads. circuit network 82
has 40 input leads and 10 output leads. The number of input leads of network 81 is determined by the length of the feature vector. The number of outputs of network 82 is determined by the number of characters in the classification set. The number of intermediate lead wires (40 in this case) is determined heuristically. Training the circuit of FIG. 6 provides developed feature vectors of known characters and maximizes the output signal at a designated output lead of the circuitry 82 corresponding to the provided known character. This can be done by adjusting the weights in both networks 81 and 82. The available samples of all the characters in the set to be sorted are thus fed into the network, each time using weights in the interconnection network to maximize the signal on the appropriate output lead. is adjusted. In this way, one set of weights W for both networks
1j is formed. It may be appropriate to state that the coupling way W1j is essentially an analog quantity and that the circuit operates in an analog manner. That is, the voltage on any output lead of network 81 will be the same as the ignited voltage coupled to that output lead.
fired up) is the sum of the contributions of the J weights. Each weight is "fired" by a binary signal on the input lead to which it is coupled. Therefore,
The output at lead j is equal to 〒Biwlj, where B is the value of the first input lead (0
Or 1). Although the concept of such learning networks is fairly well understood, the challenge remains of implementing such analog circuits efficiently and compactly. The requirements for such circuits are not simple. For example, if optimization of the network is to be achieved, the minimum weight change or modification must be fairly small. The iterative improvement method described above is based on the heuristic assumption that better weights may be found in the vicinity of good ones, but that heuristic assumption fails when the subdivision is not fine enough. For small networks 81, an analog depth of at least 8 bits is required. Larger networks will require finer subdivision. The weights must also represent both positive and negative values, and changes must be easily reversible. During the learning and training sections, the number of changes to the weights may be sufficiently large. Therefore, a practical circuit must allow for quick modification of weights. Taking these critical demands into consideration, we fabricated an efficient analog coupling weight circuit or strength circuit using MOSVLSI technology. Although each connection weight in FIG. 6 is simply shown as a black circle, FIG. 7 shows the circuitry for implementing these circles. More specifically, FIG. 7 shows input line 83 and output line 8.
Figure 4 shows one coupling weight with its coupling to 4 and some conventional circuits. First, the mutual coupling weight part in FIG.
01 and 802 and small MOS switches 803 and 804
, somewhat large MO5) transistor 805, and differential amplifier 8
0B and a multiplier 807. Second, the circuit of FIG. 7 includes a charge coupled switch 80g, a sense switch 809, and various control leads. The circuit operates as follows. Capacitors 801 and 8
02 are charged to different voltage levels, and the voltage level difference is reflected in the output voltage of the differential amplifier 806. amplifier 8
06 has its two inputs connected to capacitors 801 and 802. Amplifier 80 representing the coupling weights
The output of 6 is connected to multiplier 807. Multiplier 807 may be any conventional transconductance amplifier. Multiplier 807 is also connected to input line 83 of the cross-coupling network. The output of converter 807 is connected to the output lead of the interconnect network. Multiplier 807 thus sends a current to the output lead that is the product of the signal on the input lead and the value of the combination weight. The coupling weight is capacitors 801 and 80
is represented by the differential voltage formed by amplifier 80B in response to the voltage difference between 2 and 2. The voltage difference on capacitors 801 and 802 is maintained for a long time (compared to the operations involved within the OCR system);
It has also been found that freshening is not necessary when the circuit is maintained at a moderately low temperature. For example, 77°K
There was no detectable loss over time. One advantage of our circuit is that the weights are vC801-
It will be seen that it is proportional to VC802, so even if there is a charge loss, there is no change in weight as a result when it is the same on both capacitors. Nevertheless, a clear path must be provided for refreshing the information onto capacitors 801 and 802. Furthermore, capacitors 801 and 802
A path must be provided for setting the voltage (charge) value on the top and modifying that setting to enable the "learning" procedure described above. This is where the rest of the switches and controls come in. To bring the coupling weights to the desired level, switch 808 is slightly opened or closed to set a fixed voltage level to voltage source 81B.
to the capacitor 801. Voltage corresponds to a fixed charge. Switch 808 is then turned off. At this point, capacitor 801 is connected to the non-inverting input of amplifier 806 to provide a positive voltage, while capacitor 802
is connected to the inverting input of amplifier 80B, so the coupling weight is at the maximum positive level. Connection weights are changed as follows. First, transistors 803 and 804 are turned on. Transistor 803 is extremely small compared to transistor 805, so to make the phenomenon easier to understand, transistor 803 can be considered as a simple switch. By comparison, transistor 805 is long and narrow, so when it is turned on, it can be thought of as a capacitor. Switch 803 is closed and transistor 805
When turned on (assuming it is an n-channel device), the charge on capacitor 801 is
01 and the inverted charge on transistor 805, which is turned on. Transistor 803 is then turned off, thereby trapping charge within transistor 805. Transistor 804 is then turned on, and if transistor 805 is then turned off slowly, the mobile charge in its channel will diffuse through switch 804 and into capacitor 802. Therefore, the above steps transfer an amount of charge from capacitor 801 to capacitor 802. This corresponds to capacitor voltage changes and mutual coupling weight changes. The above sequence can be repeated as many times as necessary to bring the combination weights to the desired level. In this way, optimization of the connection weights proceeds during the training period, so that each connection weight in networks 81 and 82 is set to the correct level. The above description relates to the training aspect of the circuit. Once the learning process is finished, a means should be provided: l) for determining the values of the weights, and 2) for refreshing losses over time, etc. This includes a detection switch 809, an A/D converter, and a D/A
This is achieved using a converter and non-volatile storage. To determine the weight values within the interconnect network, all input leads are turned on one at a time. Each time a lead is turned on, the sense switches 809 of the weights coupled to that human lead are turned on in sequence, causing each amplifier's voltage to appear on the sense bus 810. The voltage is supplied to A/D converter 811 and the resulting digital information is stored in storage device 812. In this manner, all weights are converted to digital form and stored in storage 812. During a refresh operation, each combined weight is isolated as described above, but at this time the voltage output on sense bus aLQ is connected to the D/A to which the digital output of storage device 812 is applied.
The analog voltage of converter 813 is compared in amplifier 814 . Of course, storage device 812 is caused to emit a digital output corresponding to the refreshed combination weights. Based on the comparison result, the sequence of switch elements 803, 804 and 805 is amplified 81
4 to increase or decrease the voltage on capacitor 801 relative to capacitor 802. The output of bus 810 is converted to A/D converter 81.
1 or to comparison amplifier 814 is controlled by switch 815. If it is necessary to completely discharge both capacitors 801 and 802, the voltage on voltage source 81B can be reduced to zero and switches 803, 804 and 805 can be turned on. //fire, c 9/7/88 LDJ〃 1/ check for broken image
s// returns -1if complete
ly blank // returns Oif co
connected // returns l if co
connected except for samll
flyspeaks // returns 2 if
badly disconnected // uses
recursive brushfire a1go
rithm//Diagnostic output
: prints number of segment
ts. // and 1 location Bd code o
f the largest775ide effec
t: 5ets up Lseg (in rec2c
om)1/IMPORTANT ASSUMFr10
N100ml assume img, pix blac
k pixels are PO3mVE// and
white pixels are zero −/
/ If you can't guannree
hat, call fireQ m5reatio
f fwQ1/ negative pixels w
ill cause // Th1s
routine modifications fng, pix
! ! #1nclude"err1.h" Finger nclude "fire, h" 1nline int 1m1n(iru a, in
t b) (x℃turn(a<b?a: b);)i
nline int imax(int ty ir+
+ b) (remrn(ayb? a : b); 1
staoc int xdl; // copy of
img, xstatic int ydl; //
and img, yTable 2 5taIic char” ” pix 1 ;//
yd img, pixstatic Pa1r” pi
; 5raoc Pa1r"p2; 5tatic int 1ist 5ize = -1
;5tatic Seg mysegJsegmen
t being processed SCg Lseg
; // longest segment th
are ismtic int nseg; 5tatic int totpix; // tota
l pixels in imageint&elσm
age img) ( // make 5ure we have allc
enough room to keep
ow pair-■S family int 1 size w img
, x ” img, y; // nux possible
e 5ize of■st required (
lsiz > 1ist-size) (if(plN delete p i : delete p2; ] pi w new Pa1r[l5izl; p2 =
new Pa1r[1size];1ist 5ize
1+1lsiz; Lseg, 1yst z O; //
no longest segment yetLs
eg, 5ize z O; // first seg
will beat this for surens
eg 0; rotpix = 0; Table 3 // find first blILck pixe
l, so we car+ 1nitiatr, t
int xx; int
yy; for (yy wa (); yy < fng, y
; yy++) (rot (xx sw O; xx
< img, x; xx++) (if(tmg-P
iX[yyl[XXl > O) t/// fpri
nl(stderr, "firstx s-%d f
irsty -%d O, xx, yy):// *
lots of these things might
Iogicily be arguments +o
burnα77 but 5ta6c variab
les ty faster & simplerns
eg++; // count tMs segmen
tmyseg, ashes +w+ -nse4myg
g, 1ist m (Lseg, 1ist 1m pi
)? pi: p2; myseg, 5ize wa
O; xdl wa img, x; ydl wa img,
y; ptxl M img, pix; burn(xx
, yy'), 77 bum, baby, burnt
if (myseg, 5ize > Lscg, 5ize
e) Lscg g myseg;] 川fdef TEST fprintf(stderr, ”Saw%d se
gments', nseg); if (nseg)
fprintf(stderr, 'Longest
(code %d) 5tans at %d %dO
. Lseg, uhes, Lseg, 1is+[Ol, x
, Lseg, 1ist (Ol, y); #endif if (nseg m O) rerum -1; if
(nseg W 1) return O; float
t frac ta float (Lseg, 5ize
) / floatQotpix); cons+ fl
oat m1nfrac wm, 9;if (fra
c)s m1nfrac) return 1;re
turn 2; Table 4 (1) // the magieIIIrecursive
burning routine // turns
o ashes all points that a
re 8-connected to the 1ni
tial point void burn ( int xcent. truycent // ceruer of
3 x 3 region or 1ruest/
/ if this point is off-sc
ale:1f(xcent<Oycenta xce
nt>fable xdl ycenosa-ydl)retu
rn;17 N0TE: this is 1ndee
d a check for > O. // not jusr nonzero, so t
Hings don't burn twice. 1f(pixl[ycentl(xcentl > O
) [int top m myseg, 5ize++;
// keep track of length o
f segmenttotpLx++;/1count
totalpixelsptxl[yceru][xc
eru] wamyseg, ashes; //1urn
this point to ashes burn(xcen
t+1. ycent+1); // 1gn1ten
eigborsburn(xcent+1.ycen
t);burn(xcent+1.ycentl) to um(xcent, ycent+1);burn
(xcent, ycent-1); burn(xc
ent-1, ycent+1); burn(xcent1. ycent); burn(xcent-1, ycent
cent-1); #define jumpbreak
s YES #ifdefjumpbreaks Table 5 // same as above, but doe
s not +lssume that black
pixels are positive;/l no
n-zero 5offices. int fire(Image img)(for (
tnt yy = 0; yy < img, y; y
y++) (for (int xx = O; xx
< img, x; xx++) [img, pix[
yyl[xxl = img, pix[yyl[xxl
! = O;return (fire(img)
); Table 4 (2) int jump x imax (xdl, ydl)
;jump w jump / 20; if (jump < 3) jump = 3; if
θump > 1) ( burn(xcent+jump, ycent-jum
p);burn(xcent, ycent-jump
); burn(xcent-jump, ycent-
jump); #endif // if this point NOT set,
or already bumed, do not
hingreturn; 6 tables/l do most of the work for
r limar transformation pr
ogram//perform lir+ear t
transforms on post off
ice data // i, e, convert t
o 5standard 5ize and aspec
trati. // see lin, plan for extend
ded discussion // Note: xy
pLx[][] win contain small
integers O,,9//for gray
levels below threshold, y
you get zero; // for grayle
vels above threshold, you
get the graylevel num anti-r/
/ Th1s gives you the opti
on of treating it as a bo
olean // if you don't care
About gray levels. // Cauer allocates the ar
ray; we fill it. #tnclude <5tcuo, h>#tnclude
e<rmth,h> inlinefloatfmin(floata, fl
oatb)(return(a<b?a: b);1i
nline float fma, t(float a
, float b) (return(a>b? a
: b); )inlineiruimin(tnta
, 1rub) (return(a<b?a : b);
) inline int imax(int IL,
int b) (returg(a>b? a : b)
]Table 7 voiddo tin(const Image rawJl 1nput i
mageconst int known-fit, /
/ 1111 => char ready fil
ls box pdim by qdim. Image des, // res+rlt:
array of small integersFI
LE” pmm-fpJ/ paruneter fi
le filepointer // O=> all
parameters take default v
aluesccnst char” sname//
filename, for information
al messages/l provide+"'
if you can't do better] (Pair"bl; int 1) I), (Iq; for (Qq= O; qq <raW,)';
Qq++) (for (ppIll O; pP <
raw, x; pp++) if(raw, pix(
qql[ppl) (bl[1b11.x -pp; bl[1b11.y -qq; ibl++; ] do-1in 1(raw, x, raw, y, bl
, ibl, kr+own-fit, des, p
aram fp, sname); delete(bl
); Table 9 // find raw bounding boxi
nt pO, qO, p2. q2; int ibl; int pp+ qQ; if (known fit) (pO=O; qO=O; p2=pselfm; q2=qmountm; ) else (po=bl[0],x; qO=bl [0],y
;p2 = po; q2 = qo; for (ibl z 1; ibl <nbl;
ibl←) (pp = bl[1b11.x; qq
= bl[1b11. y; pO = 1m1n (pO・
PP); p2 size rnax (p2. pp); qo = im Tsumugi (qO, qq); q2 = imax (q2. qq); p2→; q2+
+; Table 8 void do-fin 1 (const int pdimJI 5ize of
1nput arrayconst int qdim
, // ,. const Pa1r” bl, // 1nput:
1st of black pixelscons
t int nbl, // 5ize of 5ai
d 1ist// O=+) all parameters
ers take default values//
provide "" if you can't
do beuerFget(kernel, 2)Jc
involution kernel (units)
of PQ rows/cols)Iget(ming
ray, 3) Jl this or larger:
return graylevel, else
etern zer. float pkem kernel; float
qkem kernel;const int
xo m O; const inc yOw O; const float xmid m (des, x
-xO) / 2.0; const float y
mid w (des, y −yO) / 2.0; Table 10 // calculate some moments
:float” xyflt x new-float
(des, y, des, x); // note th
at we are treating the pi
xels as BOXES of ink, not
points// so the (0,0) pi
xel extends from (0,0) to
(1-eps, 1-eps)/l and has
its center at (,5,,5)flo
at pmid m (po + p2) / 2.0;
float qmid x (qO+q2) / 2.
0; // but if we 5hift the
m1ddle half a bit. // we can pretend the (0,
0) pixel is cerured at (
0,0) float pIj1X++a pmid −
,5:float qmx m qmid-,5;f
loat mpq ws O,;/l PQ rnom
enrfloat mqq w O,;// QQ m
omentfor (ibl z O; ibl <
nbl; ibl+-+) [pp-bl[1b11
．． x; qq -bl[1b11. y;mpq 4m
(qq −qmx)”(pp −pmx); mqq +
= (qq - q plate t)" (qq - qmx); flo
at theta w mpq / mqq; // N
ote: 5ince pixels are num
Bered from UPPER1ft. // positive theta corresp
onds to ”/l negative thet
a corresponds to ”/'// (p
c, qt) is the coordinate
where (p, q) will go when
the char is deskewed. // Th1s is not quit the
same as (x,y) 5ince the 1
atter // has 5ize changes
as well. Table 11 Table 12 Jl Calculate min and max
honz c■-1nates. // measured relative to a
1ine pu resistance 1el to the 5ides
of Yue parallelogram〃

【山ｅ　ｒｅ
ｆｅｒｅｎｃｅ　１ｉｎｅ　ｇｏｅｓ　ｔｈｒｏｕｇｈ
　（ｐｍｉｄ、　ｑｍｉｄ）／／　ｗｈｉｃｈ　ｉｓ　
＋ｈｅ　ｍ１ｄｄｌｅ　ｏｆ　ｔｈｅ　ｒａｗ　ＰＯｒ
ｅｃｔａｎｇｌｅ）＃ｄｅｆｉｎｅｐｍａｐ（ｐ、ｑ）
（ｐ−ｐｍｘ　−（ｑ−ｑｍｘ）申ｔｈｅＩａ）ｆｌｏ
ａｔ　ｐｔＯｍ　ｐｒｎａｐ（ｂｌ［ｏ］、ｘ、　ｂｌ
［ｏ］、ｙ）　ＪＩｅｆｔｍｏｓ＋　ｂｌａｃｋ　ｐｉ
ｘｅｌｆｌｏａｔ　ｐｔ２　ｍ　ｐｔＯ；／／　ｒｉｇ
ｈｔｍｏ＊　ｂｌａｃｋ　ｐｉｘｅｌｆｏｒ　（ｉｂｌ
　ｍ　１　；　ｉｂｌ’＜ｎｂｌ；　ｉｂｌ＋＋）　（
ｆｌｏａｔ　ｘｘｘ　ｗａ　ｐｍａｐ（ｂｌ［１ｂ１１
．ｘ、　ｂｌ［１ｂ１１．ｙ）　；ｐｔｏ　ｗ　ｆｍｉ
ｎ（ｐｔＯ，ｘｘｘ）；ｐｔ２　ｔｚ　ｆｍ紅（ＰＱυ
■）；／／　Ｔｈｅ　ｐｏｉｎｔｓ　ｗｅ　ｊｕｓｔ　ｃａｌ
ｃｕｂｕｅｄ　ａｒｅ　ｃｅｎｔｅｒｓ　ｏｆ　ｐａｒ
ａｌｌｅｌｏ炉面ｂｏｘｅｓ／／　Ｃａ１ｃｕｌａｔｅ
　ｈｏｗ　ｍｕｃｈ　ｔｈｅ　ｂｏｘ　５ｔｉｃｋｓ　
ｏｕｔ　ｆｒｏｍ山ａｒｅ。ｐｔ２−ｗ　、５”ｆａｂｓ（ｔｈｅｔａ）　＋　、５
０１；／／、００１　ｔｏ　ｃａｔｃｈ　ｒｏｕｎｄｏ
ｆｆ　ｅｒｒｏｒｓｐｔｏ　−ｍ　、５”ｆａｂｓ（ｔ
ｈｅｔａ）　＋　、５０１；ｆｌｏａｔｄｓｗｍｐｔ２
−ｐｔｏＪｌｄｅｓｋｅｗｅｄｗｆｄｔｈｆｌｏａｔ　
ｋｆ　ｍ　ｄｓｗ　／　（ｐ２−ｐｏｐ；／／　ｋｒｕ
ｎｃｈ　ｆａｃｔａｒ／／　ｕｓｕａｌｌｙ　（ｂｕｔ
　ｎｏｔ　Ｉｗｍｙｓ）　１ｅｓｓ　ｔｈａｎ　１／／
　（ｎｏｔ　ｕｓｅｄ　ｉｎ　ｆｕｎｈｅｒ　ｃａｌｃ
ｕｌａｔｉｏｎｓ）ｆｌｏａｔ　ｃｏｎｗｉｄ　ｍ　ｄ
ｓｗ　＋　ｐｋｅｒｎ−１、Ｊｌ　ｙｅｔｅｒｖｉ　ｗ
ｉｄｅｒ　ｔｏ　ｍａｋｅ　ｒｏｏｍ　ｆｏｒ　ｃｏｎ
ｖｏｌｕｔｉｏｎｆｌｏａｔ　ｃｏｎｈｇｔ　Ｗ　ｑ２
−　ｑＯ＋　ｑｋｅｍ　−Ｉ　Ｊｌ　ａｎｄ　７ｅｒ、
　ａｌｓｏ　ｆｏｒ　ｃｏｎｖｏｌｕｔｉｏｎｆｌｏａ
ｔ　ｑａｎｉｄ　ｍ　ｑｍｉｄ　＋　（ｑｋｅｒｎ−１
，）　／Ｚ、Ｊｌ　ｈｅｉｇｈｔ　ｏｆ　ｍ１ｄｄｌｅ
　ｏｆ　ｃｈａｒｆｌｏａｔ　ｄｆａｔ　ｓｗ　ｄｅｓ
、ｘ　／　（ｆｌｏａｔ）　ｄｅｓ、ｙ；／／　ｄｅｓ
ｉｒｅｄ　ｆａｔｎｅｕ　ｎｔｉ。ｆｌｏａｔ　ｎｆａｔ　ｍ　ｆａｔ／ｄｆａｒ、／／ｎ
ｏｒｍａｌｉｚｅｄ　ｆａｔｎｅｕ　ｒａｔｉ。ｉｆ　（ｉｎｆｏ）ｆｐｒｉｎｔｆ（ｓｔｄｏｕｔ、　”％ｓｗ：％ｄ、　
ｈ：％ｄ、山：　％５．２ｆ、　ｄｓｗ：　％５．２ｆ
、Ｍ：％５．２ｆ、　ｎｆａｔ：％５．２ｆＯ。５ｎｘｒｒｈｅ、　ｐ２−ｐｏ、　ｑ２−ｑｏ、　ｔｈ
ｅｍ、　ｄｔｗ、　ｋｆ、　ｎｆａｔ）；Ｊｌ　ｃａｌ
ｃｕｌａｔｅ　ｔｈｅ　ｃｏｅｆｆｉｃｉｅｎｔｓ　ｏ
ｆ　ｔｈｅ　１ｉｎｅａｒ　ｔｒａｎｓｆｏｒｍａｔｉ
ｏｎ：ｆｌｏａｔ　ｄｏｏ、　ｄｏｔ、　ｄ０２．　ｄ
ｌｏ、　ｄｉ　１．　ｄｉ２；ｉｆ　（ｉｎｆｌａｔｅ
）　（／／　ｏｌｄ　”１ｎｆｌａｔｉｏｎａｒｙ”　
ｓｃｈｅｍｅｆｌｏａｔ　ｆｉｆ　＋ａ　ｎｆａｔ　＞
　ｆｃｏｒｎ　？１　、／ｎｆａｔ　：　ＬＪｆｃｏｍ
Ｊｌ　ｆａｔｎｅｓｓ　ｉｎｃｒｅａｓｉｎｇ　ｆａｃ
ｔｏｒｄｌｏ　＝　Ｏ，：　　／／　ｐｕｒｅ　ｓｋｅ
ｗｄｉ　１　ｘ　（ｄｅｓ、ｙ−ｙＯ）　／　ｃｏｎｈ
ｇｔ；／／　ｍａｋｅ　ｏｕｔｐｕｔ　ｃｈａｒ　ｆｉ
ｌｌ　ｉｔｓ　ｂｏｘ　ｖｅｒｔｉｃａｌｌｙｄｏｏ＝
ｄｌｌ”ｆｉｆ；ｄｏｌ−−ｄｌｌ傘ｆｉｆ　”　ｔｈｅｔａ；ｄ０２　
＝　ｘｍｉｄ　−ｄｏｏｎｏｔｍｉｄ　−ｄｏ１傘ｑｔ
ｍｉｄ；／／ｃｅｎｔｅｒｐｏｉｎｔ　ｉｓ　ｆｉｘｅ
ｄ　ｐｏｉｎｔｄ１２　＝　ｙｍｉｄ　−ｄ１０傘ｐｔ
ｍｉｄ　−ｄｉ１傘ｑｔｍｉｄ；】ｅｌｓｅ　（／／　ｎｅｖｅｒ　１ｎｆｌａｔｅ　ｓｃ
ｈｅｍｅｆｌｏａｔ　ｙｇｒｏｗ　ｘ　（ｄｅｓ、ｙ−
＞ｒｏ）　／　ｃｏｎｈｇｔ；ｆｌｏａｔ　ｘｇｒｏｗ
　ｚ　（ｄｅｓ、ｘ−ｘｏ）　／　ｃｏｎｗｉｄ；ｆｌ
ｏａｔ　ｄｏｇｒｏｗ　＝　ｆｍｉｎ（ｘｇｒｏｗ、　
ｙｇｒｏｗ）；／／　ｇｏｗ　ａｓ　１ｉｔｔｌｅ　ａ
ｓ　ｐｏｓｓｉｂｌｅ７７　ｉ、ｅ、　５ｌｕｉｎｋ　
ｓｏ　ＢＯＴＨｆｉｔｄｏｇｒｏｗ　ｚ　ｆｍｉｎ（ｔ
、ｏ、　ｄｏｇｒｏｗ）；／／　ｂｕｔ　ＮＥＶＥＲｒ
ｅａｌｌｙ　１ｎｆｌａｔｅｄｌｏ　ｘ　Ｏ；　　／／
　ｐｕｒｅ　ｓｋｅｗｄｌｌ　−ｄｏｇｒｏｗ；／／　
５ｈｒｉｎｋ　ｙｄｏｏ　ｘ　ｄｏｇｒｏｗ；／／　ａ
ｎｄ　ｘ　ｅｑｕａｌｌｙｄｏｌ　ｍ　−ｄｏｇｒｏｗ
　”　ｔｈｅｔａ；ｄ０２　ｗ　ｘｍｉｄ　−ｄｏｏ”
ｐｔｍｉｄ　−ｄｏｌ　”ｑｔｍｉｄ；／／　ｃｅｎｔ
ｅｒ　ｐｏｉｎｔ　ｉｓ　ｆｉｘｅｄ　ｐｏｉｎｔｄ１
２−　ｙｍｉｄ　−ｄ１０傘ｐｔｍｉｄ−ｄｌｌ傘ｑｔ
ｍｉｄ；第１３表／／　ｍａｋｅ　ｔｈｅ　ｏｕｔｐｕｔ　５ｐａｃｅ　
ａｌｌ　ｗｈｉｔｅ而ｘｘ、面ｙｙ；ｆｏｒ　（ｙｙ　＝　ｙＯ；　ｙｙ　＜　ｄｅｓ、ｙ；
　ｙｙ＋＋）　（ｆｏｒ　（ｘｘ　ｗ　ｘｏ；　ｘｘ　
＜　ｄｅｓ、ｘ；　ｘｘ＋＋）　（ｘｙｆｌｔ［ｙｙｌ
［ｘｘｌ　！　Ｏ，；／／　ａｎｄ　５ｔａｒｔ　ｂｅ
ｑｕｅａｔｈｉｎｇ　ｂｌａｃｋｎｅｓｓｉｎｔ　５ｐ
ｉｐ　＝　ｉｍａｘ（１，１ｎｔ（１０”ｄｌ　ｌ））
；／／　５ｐｉｐ　ｗ　５ｃａｎｓ　ｐｅｒ　１ｎｐｕ
ｔ　ｐｉｘｅｌｆｌｏａｔ　５ｔｅｐ　ａ　ｐｋｅｒｎ
　／　５ｐｉｐ；ｆｌｏａｔ　ｏｆｆｓｅｔ　＝　５ｔ
ｅｐ　／　２．；ｉｎｔ　ｉｉ；ｆｌｏａｔ　ｆｙ、　ｆｑ；ｉｎｔ　ｉｙ；ｆｌｏａｔ　ｆｘＯ，ｆｘ２；ｉｎｔ　ｉｘＯ，ｉｘ２；ｆｌｏａｔ　ｒｘｏ、　ｒｘ２；第１４表＃ｄｅｆｉｎｅ　ｏｏｐｓ（Ｘ）　（ｒｐｒｉｎｔｆ（
ｓｔｄｅｒｒ、　’ｏｏｐｓ　（Ｘ）％Ｓ％「％ｆ’ｏ
、　ｓｎａｍｅ、　ｆｘｏ、　ｆｘ２）；　ｃａｎ６ｎ
ｕｅ；１ｆｏｒ　（ｉｂｌ　ｗ　Ｏ；　ｉｂｌ　＜　ｎ
ｂｌ；　ｉｂ＋＋＋）　［／／　１ｏｏｐ　ｏｖｅｒ　
ｂｌａｃｋ　１ｎｐｕｔ　ｐｉｘｅｌｓｐｐ　−ｂｌ［
１ｂ１１．ｘ；　ｑｑ　−ｂｌ［１ｂ１１．ｙ；ｆｘｏ
　−ｄｏｏ傘ｐｐ　　　＋　　ｄｏｌ”ｑｑ＋ｄ０２；
／／ｓｕｎ　ｏｆｓｃａｎ　１ｉｎｅ　（ｉｎ　ｘ　５
ｐａｃｅ）ｒｘ２　＝　ｄｏｏ”（ｐｋｅｒｎ＋ｐｐ）
　＋ｄＯ１”ｑｑ＋ｄ０２；／／　ｅｎｄ　ｏｆ　５ｃ
ａｘｌｌｉｎｅ　（−、）ｉｘｏ　ｍ　１ｎｔ（ｆｘｏ
）；／／　ｉｎｔｅｇｅｒ　ｐａｎｓｉｘ２　ｗ　１ｎ
ｔ（ｆｘ２）；ｒｘｏ　ｍ　ｆｘＯ−ｉｘｏ；／／　ｒｅｍａｉｎｄｅ
ｒｓｒｘ２纂ｆｘ２−　ｉｘ２；ｉｆ　（ｉｘｏ　＜　Ｏ）　０ＯｐＳ（１）　；／／　
ｅｒｒｏｒ　ｃｈｅｃｋｓ　＆　ｄｅｆｅｎｃｅｉｆ　
（ｉｘｏ　＞ｗｍ　ｄｃｉ、ｘ）　ｏｏｐｓ（２）　；
「中Ｃ２＜　Ｏ）　ｏｏｐｓ（３）　；ｉｆ　（ｉｘ２
　＞　ｄｃｓ、ｘ）　ｏｏｐｓ（４）　；ｆｏｒ　（ｉ
ｉ　ｍ　Ｏ；　ｉｔ　＜　５ｐｉｐ；　ｉｉ＋＋）（／
／　１ｏｏｐ　ｏｖｅｒ　５ｃａｎ　１ｉｎｅｓ　ｐｅ
ｒ　１ｎｐｕｔ　ｐｉｘｅ！ｆｑ　ｓ＋＋　ｑｑ　＋　
ｏｆｆｓｅｔ　＋５ｔｅｐ”ｉｉ；ｒｙ寓ｄ１０”ｐｐ
＋ｄｉ１傘ｆｑ＋ｄ１２；ｉｙ　ｗ　６ｎｔ）　ｆｙ；ｉｆ　（ｉｙ　＜　Ｏ）　ｏｏｐｓ（５）　；ｉｆ　（
ｉｙ　％　ｄｅｓ、ｙ）　ｏｏｐｓ（６）　；ｘｙｆｋ
［ｉｙｌ［ｉｘＯト１ｒｘｏ；ｘｙｆｌｔ［ｉｙｌ［ｉ
ｘ２］　＋−ｎｃ２；ｆｏｒ　（ｉｎｔ　ｊｊ　ｍ　ｉ
ｘｏ；　ｊｊ　＜　ｉｘ２；　ｊｊ＋＋）　（ｘｙｆｌ
ｔ［１ｙｌｌｊ］　＋Ｉ！＝　１．０；第１５表／ｌ　ｃｌｉｐ　ｔｈｅ　ｂｏｕｏｍ　ｏｆｆ　ｏｆ　
ｔｈｅ　ｇｒａｙ−ｓｃａｌｅ／／　ｕｓｉｎｇ　ｑｕ
ｅｓｔｉｏｎａｂｌｅ　ｔｈｒｅｓｈｏｌｄ　ｓｃｈｅ
ｍｅ／／　＆ｓｔ、　ｆｉｎｄ　ｂｌａｃｋｅｓｔ　ｏ
ｕｔｐｕｔ　ｐｉｘｅｌｆｌｏａｔ　ｚｍａｘ　ｍ　Ｏ
；ｆｏｒ　（ｙｙ　ｘ　ｙＯ；　ｙｙ　＜　ｄｅｓ、ｙ；
　ｙｙ＋＋）　（ｆｏｒ　（ｘｘ　ｗ　ｘＯ；　ｘｘ　
＜　ｄｅｓ、ｘ；　ｘｘ→）（ｚｍａｘ　ｘ　ｆｍａｘ
（ｚｍａｘ、　ｘｙｆｌｔ［ｙｙｌ［ｘｘｌ）；／／　
ｔｈｅｎ　ｃｌｉｐ　ａｗａｙ、、、。ｆｌｏａｔ　ｚｚ　ｍ　ｘｙｆｌｔ［ｙｙｌ［ｘｘｌ　
；ｉｎｔ　ｇｒ　Ｗ　１ｎｔ（９，９”　ｚｚ　／　ｚ
ｍａｘ）；ｉｆ　（ｇｒ　＜　ｍｉｎｇｒａｙ）　ｇｒ
　ｘ　Ｏ；ｄｅｓ−１）ｉＸ［＞’ｙ］［ｘｘｌ　ｓ　
ｇｒ；Ｗ席’ＩＩ／／１１．妨ｍ席ｍＨ盾倣ｑ／／　ｐ
ｕｔ　ａｗａｙ　ａｌｌ　ｏｕｒ　ｔｏｙｓ：ｄｅｌｅ
ｔｅ２（ｘｙｆｌｔ）；[Mountain e re
ference 1ine goes through
(pmid, qmid) // which is
+he m1ddle of the raw POr
ectangle) #definepmap(p, q)
(p-pmx −(q-qmx)intheIa)flo
at ptOm prnap(bl[o], x, bl
[o], y) JIeftmos+ black pi
xelfloat pt2 m ptO; // rig
htmo* black pixelfor (ibl
m 1 ; ibl'<nbl; ibl++) (
float xxx wa pmap(bl[1b11
．． x, bl[1b11. y) ;pto w fmi
n (ptO, xxx); pt2 tz fm red (PQυ
■); // The points we just cal
cubed are centers of par
allelo furnace boxes // Calculate
how much the box 5ticks
out from the mountain are. pt2-w, 5”fabs(theta) +, 5
01; //, 001 to catch round
ff errorspto-m, 5”fabs(t
heta) + , 501; floatdswmpt2
-ptoJldeskewedwfdthfloat
kf m dsw / (p2-pop; // kru
nch factor // usually (but
not Iwmys) 1ess than 1//
(not used in funher calc
ulations) float conwid m d
sw + pkern-1, Jl yetervi w
ider to make room for con
volumefloat conhgt W q2
- qO+ qkem -I Jl and 7er,
also for convolution floor
t qanid m qmid + (qkern-1
,) /Z, Jl height of m1ddle
of charfloat dfat sw des
, x / (float) des, y; // des
ired fatneu nti. float nfat m fat/dfar, //n
normalized fatneu rati. if (info) fprintf(stdout, ”%sw:%d,
h:%d, mountain:%5.2f, dsw:%5.2f
, M:%5.2f, nfat:%5.2fO. 5nxrrhe, p2-po, q2-qo, th
em, dtw, kf, nfat); Jl cal
curate the coefficients o
f the 1inear transformation
on: float doo, dot, d02. d
lo, di 1. di2; if (inflate
) (// old “1nflationary”
schemefloat fif +a nfat >
fcorn? 1, /nfat: LJfcom
Jl fatness increasing fac
tordlo = O, : // pure ske
wdi 1 x (des, y-yO) / conh
gt; // make output char fi
ll its box vertically =
dll” fif; dol--dll umbrella fif ” theta; d02
= xmid - doonotmid - do1 umbrella qt
mid;//centerpoint is fixed
d point d12 = ymid - d10 umbrella pt
mid -di1 umbrella qtmid; ] else (// never 1nflate sc
hemefloat ygrow x (des, y-
>ro) / conhgt; float xgrow
z (des, x-xo) / conwid; fl
oat dogrow = fmin(xgrow,
ygrow）；／／gow as 1little a
s possible77 i, e, 5luink
so BOTHfitdogrow z fmin(t
, o, dogrow);// but NEVERr
ally 1nflatedlo x O; //
pure skewdll -dogrow;//
5hrink ydoo x dogrow; // a
nd x equallydol m -dogrow
"theta;d02 w xmid -doo"
ptmid -dol ”qtmid; // cent
er point is fixed pointd1
2- ymid-d10 umbrella ptmid-dll umbrella qt
mid;Table 13//make the output 5pace
all white and xx, face yy; for (yy = yO; yy < des, y;
yy++) (for (xx w xo; xx
< des, x; xx++) (xyflt[yyl
[xxl! O,; // and 5 tart be
queathing blacknessint 5p
ip = imax(1,1nt(10"dl l))
;// 5pip w 5cans per 1npu
t pixelfloat 5tep a pkern
/ 5pip; float offset = 5t
ep/2. ; int ii; float fy, fq; int iy; float fxO, fx2; int ixO, ix2; float rxo, rx2; Table 14 #define oops (X) (rprintf(
stderr, 'oops (X)%S%'%f'o
, sname, fxo, fx2); can6n
ue;1for (ibl w O; ibl < n
bl; ib+++) [// 1oop over
black 1nput pixelspp -bl[
1b11. x; qq -bl[1b11. y;fxo
-doo umbrella pp + dol”qq+d02;
//sun ofscan 1ine (in x 5
pace) rx2 = doo” (pkern+pp)
+dO1"qq+d02; // end of 5c
axlline (-,)ixo m 1nt(fxo
); // integer pansix2 w 1n
t(fx2); rxo m fxO-ixo; //remainde
rsrx2 compilation fx2- ix2; if (ixo < O) 0OpS(1); //
error checks & defense
(ixo > wm dci, x) oops(2);
"Medium C2 < O) oops (3); if (ix2
> dcs, x) oops(4) ;for (i
i m O; it <5pip; ii++) (/
/ 1oop over 5can 1ines pe
r 1nput pixe! fq s++ qq +
offset +5tep”ii;ry f10”pp
+di1 umbrella fq+d12;iy w 6nt) fy; if (iy < O) oops(5);if (
iy % des, y) oops(6) ;xyfk
[iyl[ixOto1rxo;xyflt[iyl[i
x2] +-nc2;for (int jj m i
xo; jj <ix2; jj++) (xyfl
t[1yllj] +I! = 1.0; Table 15/l clip the bouom off of
the gray-scale // using qu
estionable threshold sche
me// &st, find blackest o
output pixel float zmax m O
; for (yy x yO; yy < des, y;
yy++) (for (xx w xO; xx
< des, x; xx→)(zmax x fmax
(zmax, xyflt[yyl[xxl);//
then clip away... float zz m xyflt[yyl[xxl
;int gr W 1nt(9,9” zz / z
max);if (gr < mingray) gr
x O; des-1) iX[>'y][xxl s
gr; W seat 'II//11. Saddle m seat mH shield imitation q//p
ut away all our toys: dele
te2(xyflt);

[Brief explanation of drawings]

第１図は分類法の一般的な流れ図；第２図は独立の３×３テンプレートを使用したときに発
生する問題例；第３図は３×３より大きいテンプレートを含むところの
本発明で使用される細線化テンプレートのセット：第４図は特徴抽出テンプレートのセット；第５図は第３
図及び第４図のテンプレートに関連して使用される神経
回路網決定回路の構造；第６図はアナログ値結合ウェイ
トを有する２層神経回路網の構造；及び、第７図はアナログ値結合ウェイト神経回路網用の一実施
態様を示す。１−４９・・・特徴抽出テンプレート１２０．１３０，１４０．１５０−３５０・・・細線化
テンプレート１２１．１２２．１４１．１４２・・・画
素８０Ｌ、８Ｑ２・・・コンデンサ８０３．８０４，８０８，８０９，８１５．８２８・・
・スイッチ８０５．８２５．８２７・・・ＭＯＳ）ラン
ジスタ８０Ｂ、８１４・・・比較器手段８０７・・・掛算手段出　願　人：アメリカン　テレフォンアンドテレグラフ　カムパニーＦＩＧ。ＦＩＧ、　　４（フフーグ）ＦＩＧ、４ＦＩＧ、　　４（っプ１で）ＦＩＧ。ＦＩＧ。Figure 1 is a general flowchart of the classification method; Figure 2 is an example of a problem that occurs when using independent 3x3 templates; Figure 3 is used in the present invention where templates larger than 3x3 are included. A set of thinning templates to be used: Figure 4 is a set of feature extraction templates; Figure 5 is a set of thinning templates.
Structure of a neural network determination circuit used in conjunction with the templates of FIGS. and 4; FIG. 6 is a structure of a two-layer neural network with analog value connection weights; and FIG. 1 shows an embodiment for a neural network. 1-49... Feature extraction template 120.130, 140.150-350... Thinning template 121.122.141.142... Pixel 80L, 8Q2... Capacitor 803.804, 808, 809, 815.828...
・Switch 805.825.827...MOS) Transistor 80B, 814...Comparator means 807...Multiplying means Applicant: American Telephone & Telegraph Company FIG. FIG, 4 (fufugu) FIG, 4 FIG, 4 (in p1) FIG. FIG.

Claims

[Claims]

(1) A method for classifying images, comprising: processing a signal corresponding to the image to form a processed image; and extracting features to form features of the processed image. and a step of classifying; the processing step includes skeletalizing the image.
1. A method for classifying images, comprising a partial step of etonization and a partial step of deskewing the image.

(2) The processing step includes size adjustment of the image (
The method of claim 1, further comprising a partial step of scaling.

3. The method of claim 2, wherein the skeletonization substep follows the size adjustment and skew correction substeps.

4. The method of claim 2, wherein the skeletonization partial step precedes the size adjustment and skew correction partial steps.

5. The method of claim 2, wherein the processing step further includes a substep of cleaning the image before the size adjustment substep.

(6) the step of processing further comprises the substep of removing from the image discontinuous strokes whose area is less than 10% of the area occupied by other strokes in the image; The image classification method according to claim 2, characterized in that:

(7) The classification step is characterized in that the consistency of the symbols included in the image is determined based on a stored set of weights obtained from a training set of given sample symbols. The image classification method according to claim 1.

(8) the feature extraction step includes forming feature identifiers that describe features of the processed image and their locations within the processed image; A method according to claim 1, characterized in that, between the step of classification, the method comprises a step of coarse blocking, encoding the feature identifiers into a reduced set of feature identifiers.

(9) said step of feature extraction; forming a plurality of feature maps describing features of said processed image and locations of those features within said processed image; 2. The method of claim 1, further comprising: combining the feature maps into a super-feature map corresponding to composite features.

(10) The method of classifying images according to claim 9, wherein the high-level feature map is formed through combinatorial logic that combines some of the feature maps.

(11) A method for classifying characters in an image; capturing the image in the form of an array of image elements; resizing the array of image elements to a selected size; and removing unnecessary image portions from the image. and processing the image to thereby form a processed image; skewing and skeletonizing the processed image to form a thinner and more upright modified shape of the image; extracting features from said thinner and more upright image characterizing said image and forming feature identifiers to represent features of said image and their locations; and a coarser feature map. coarse blocking of the feature map to form
and classifying the coarse feature map as representing one of a known set of symbols or as representing an unknown symbol. Method.

(12) In an apparatus for identifying characters embedded in a given image; a first means for capturing said image in the form of an array of image elements; and a first means for capturing said image in the form of an array of image elements; second means for identifying the presence of characters in said image in response to means and for adjusting said characters by deskewing and skeletonizing said image; third means for extracting features of the adjusted image in response to the second means to form an associated feature map; and third means for extracting features of the adjusted image in response to the third means; a fourth means for expressing a roughly blocked modified shape;
and means for classifying the coarsely blocked shape of the feature map as representing one of a known number of characters or an unknown character; Character identification device.

(13) The partial steps of skeletonization include: passing a plurality of templates in parallel over the image; 2. The method of claim 1, further comprising: unconditionally determining whether the selected portion of the image can be deleted from the image.

14. The image classification method according to claim 13, further comprising the step of deleting the portion of the image according to the unconditionally determining step.

(15) The step of passing the plurality of templates includes passing the template over the image within the step of sequentially aligning the template with different parts of the image. How to classify images.

(16) The method of classifying images according to claim 13, wherein at least one of the templates forms an array of more than 3 pixels by 3 pixels.

(17) A method for thinning a pixel comprised of pixels arranged in an array; passing a plurality of templates over the image simultaneously;
and unconditionally determining, for each template, whether a selected portion of the image can be deleted from the image based on a comparison between the template and the image. Image thinning method.

(18) The step of passing the plurality of templates includes passing the in-plate over the image within the step of sequentially aligning the template with different parts of the image. Image thinning method.

(19) The image classification method according to claim 13, wherein at least one of the templates forms an array of 5 pixels x 5 pixels.

(20) The method of classifying images according to claim 13, wherein the selected portion includes more than one pixel.