JPH10240869A

JPH10240869A - Character recognition dictionary creation device and character recognition dictionary creation method

Info

Publication number: JPH10240869A
Application number: JP9048240A
Authority: JP
Inventors: Osamu Taniguchi; 修谷口; Minoru Maeda; 稔前田; Seishirou Takeuchi; 斎之郎竹内; Ichiro Kurihara; 一郎栗原
Original assignee: Nippon Steel Corp
Current assignee: Nippon Steel Corp
Priority date: 1997-03-03
Filing date: 1997-03-03
Publication date: 1998-09-11

Abstract

(57) [Abstract]

[Problem] To provide a character recognition dictionary creation method and a character recognition dictionary creation device that can automatically generate a large number of recognition sample character data and thereby reduce the user's labor.

[Solution] In a character recognition dictionary creation unit 1, an image data input unit 10 takes in, from outside, image data of the characters that serve as the basis. A cutout unit 11 cuts the character portions out of the image data captured by the image data input unit 10 to obtain basic image data. A disturbance processing unit 12 applies predetermined image processing that simulates various disturbances to the basic image data to generate recognition sample character data. In this way, a large number of recognition sample character data are generated automatically from a small number of basic image data.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a character recognition dictionary creation device and a character recognition dictionary creation method for creating a character recognition dictionary used when a computer or the like performs character recognition.

[0002]

[Prior art] In the field of factory automation (FA), a computer or the like is sometimes made to automatically recognize characters indicating a model number, product number, or the like that are printed or stamped on parts or products moving along a production line. In that case, the computer performs pattern matching between the image data of a character read by a television camera or the like and the recognition sample character data for each character stored in advance in storage means. Even for the same character in the same typeface, various disturbances arise when it is read on a production line or the like, so improving recognition accuracy requires preparing a large number of recognition sample character data for each character type. In character recognition on an actual production line, once a character has been recognized, its pattern is registered and becomes recognition sample character data for subsequent pattern matching, so the recognition sample character data gradually increase as recognition is repeated. However, when the character recognition device is first installed, a certain number of recognition sample character data must already be available. In the recognition of printed or stamped characters in the general FA field, several tens to several hundreds of recognition sample character data are prepared in advance for each character type, and the whole collection of such data, prepared for each of the many character types to be recognized, is provided as a character recognition dictionary.

[0003]

[Problem to be solved by the invention] To improve recognition accuracy in character recognition, it is necessary to increase the number of recognition sample character data per character type that are available when the character recognition device is first installed. Conventionally, however, a person had to create the recognition sample character data in the character recognition dictionary one by one, or capture them as image data from actual products or the like, and store them in the computer. Consequently, increasing the number of recognition sample character data per character type to improve recognition accuracy required considerable labor and placed a heavy burden on the user.

[0004] The present invention has been made in view of the above circumstances, and its object is to provide a character recognition dictionary creation device and a character recognition dictionary creation method that can automatically generate a large number of recognition sample character data and thereby reduce the user's labor.

[0005]

[Means for solving the problem] To solve the above problem, the character recognition dictionary creation device according to the present invention comprises: storage means for storing a character recognition dictionary; character input means for inputting character data that serve as the source of the recognition sample character data to be registered in the character recognition dictionary; and image processing means that applies image processing simulating disturbances applied to the character to be recognized to the character data input by the character input means, thereby generating recognition sample character data from that character data, and registers the generated data in the character recognition dictionary in the storage means.

[0006] The character recognition dictionary creation method according to the present invention is characterized in that image processing simulating disturbances applied to the character to be recognized is applied to the character data that serve as the source of the recognition sample character data, recognition sample character data are generated from the character data input by the character input means, and these are registered in a character recognition dictionary in storage means, thereby creating the character recognition dictionary.

[0007] With the above configuration, the present invention creates recognition sample character data by applying image processing that simulates disturbances applied to the character to be recognized to the character data that serve as the source of the recognition sample character data registered in the character recognition dictionary. The creation of recognition sample character data can therefore be automated, and a large number of recognition sample character data can be created quickly from a small number of source character data.

[0008]

[Embodiments of the invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a schematic block diagram of a character recognition device that includes a character recognition dictionary creation device according to one embodiment of the present invention. The character recognition device of FIG. 1 consists mainly of a character recognition dictionary creation unit 1 and a character recognition unit 2. The image data input unit 10 takes in, from outside, image data of the characters that serve as the basis. A television camera, an image scanner, or the like can be used as the image data input unit 10. The image signal captured by the image data input unit 10 is sent to the character recognition dictionary creation unit 1.

[0009] In the character recognition dictionary creation unit 1, the cutout unit 11 cuts the portion corresponding to each character out of the image data captured by the image data input unit 10. This work is carried out through operator input. In the following, the character image data cut out in this way are called "basic image data". The disturbance processing unit 12 applies predetermined image processing that simulates various disturbances to the basic image data to generate recognition sample character data. In this way, a large number of recognition sample character data are generated automatically from a small number of basic image data. To simulate disturbances that arise from several causes at once, more than one image process can also be applied to a single item of basic image data, as described later. The recognition sample character data generated in this way are subjected by the dictionary creation processing unit 13 to predetermined processing that makes them usable as sample data for the intended character recognition, and are registered in the character recognition dictionary 14 as a database.

[0010] The dictionary created in this way is used in the actual character recognition work of the character recognition unit 2. For example, image data of characters printed or stamped on a product or the like are captured by an image data input unit 20 installed near the production line. The recognition unit 21 reads recognition sample character data one after another from the character recognition dictionary 14 and performs pattern matching or similar processing against the character to be recognized that was captured by the image data input unit 20, thereby recognizing the character type. The recognized character is output and displayed as needed by the character output unit 22, for example on a CRT. A character recognized in this way is fed back as new recognition sample character data and registered in the character recognition dictionary 14. The recognition sample character data therefore increase gradually as the device is used repeatedly, and this learning effect progressively improves the recognition rate. The image data input unit 20 may be shared with the image data input unit 10 described above.

[0011] The processing in the disturbance processing unit 12 of FIG. 1 is now described in detail. The left column of the table in FIG. 2 lists some of the disturbances that can occur when the image data input unit 20 of FIG. 1 captures image data of a character to be recognized, and the entries in each row of the middle column give the causes of the disturbance shown to their left. For example, the disturbance "image blur" can be caused by defocus of the camera of the image data input unit 20 or by ink bleeding. The disturbances "line thickening" and "line thinning" can be caused by a shift in the binarization level. The disturbance "dirt" can be caused by adhering dirt or by signal noise. The disturbance "character enlargement or reduction" can be caused by a change in the distance between the object and the camera. The disturbance "character tilt" can be caused by tilt of the object or of the camera.

[0012] Each image process listed in the right column of FIG. 2 is the type of image processing used to simulate the disturbance listed in the left column of the same row. For example, image blur, which can arise from defocus of the camera of the image data input unit 20 or from ink bleeding when the image of a character to be recognized is actually captured, can be simulated by applying smoothing to the basic image data. Likewise, line thickening can be simulated by dilation of the basic image data, line thinning by erosion, dirt by generating pseudo noise on the basic image data, character enlargement and reduction by image enlargement and reduction, and character tilt by rotation of the basic image data.

[0013] FIG. 3 is a table summarizing some of the image processes that the disturbance processing unit 12 of FIG. 1 actually applies to the basic image data. In the table, "no disturbance" is the case in which the basic image data are output as they are, without any processing. The table contains two "smoothing" entries: the first takes, for each pixel, the average over its 9 neighboring pixels, whereas the second takes the average over 16 neighboring pixels. Of the two "thickening" entries, the first thickens in 4 directions and the second in 8 directions. Of the two "thinning" entries, the first thins in 4 directions and the second in 8 directions. A "pseudo noise" ratio of "0.02" means that the image is altered at a rate of 2%. "Enlargement" enlarges the image by a factor of α in height and width; the user can set the value of α freely. "Reduction" reduces the image to 1/β in height and width; the user can also set the value of β freely. Of the two "rotation" entries, the first rotates clockwise by an angle θ and the second rotates counterclockwise by the angle θ. FIG. 3 is only one example of image processing that can simulate disturbances; in practice many more image processes can be provided.

[0014] FIG. 4 illustrates the concept of the processing performed by the disturbance processing unit 12 included in the character recognition dictionary creation device of FIG. 1, and shows that disturbance simulation can be applied to the basic image data by image processing units arranged in multiple stages (three in this case). FIG. 4 comprises a control unit 30, a first-stage image processing unit 31, second-stage image processing units 31_1 to 31_N, and third-stage image processing units 31_11 to 31_1N, ..., 31_N1 to 31_NN. The first-stage image processing unit 31 has N outputs, each provided with one of the switches S_1 to S_N. The N outputs are connected through the switches S_1 to S_N to the second-stage image processing units 31_1 to 31_N. Each second-stage image processing unit, like the first stage, has N outputs, which are provided with the switches S_11 to S_1N, ..., S_N1 to S_NN. The second-stage image processing units 31_1 to 31_N are connected through the switches S_11 to S_1N, ..., S_N1 to S_NN to the third-stage image processing units 31_11 to 31_1N, ..., 31_N1 to 31_NN, and the N outputs of each third-stage image processing unit are likewise connected to the switches S_111 to S_11N, ..., S_NN1 to S_NNN. All of the switches in every stage, S_1 to S_N, S_11 to S_1N, ..., S_N1 to S_NN, and S_111 to S_11N, ..., S_NN1 to S_NNN, are controlled by the control unit 30.

[0015] Each image processing unit in the three stages shown in FIG. 4 performs the image processing exemplified in FIG. 3 as well as other image processing that simulates disturbances. However, the parameters shown under "ratio" in the table of FIG. 3 are varied from stage to stage. For pseudo noise, for example, the ratio is 0.02 in the first-stage image processing unit, 0.08 in the second stage, and 0.06 in the third stage. For enlargement, α = 2 in the first stage, α = 4 in the second stage, and α = 8 in the third stage. For reduction, β = 2 in the first stage, β = 4 in the second stage, and β = 8 in the third stage. For rotation, θ = 5° in the first stage, θ = 10° in the second stage, and θ = 15° in the third stage. Only the signals of the image processing outputs whose switches have been turned on by the control unit 30 are sent to the image processing unit of the next stage. Accordingly, when the control unit 30 has turned on all the switches and each image processing unit offers N types of processing, the three stages of image processing yield N cubed (N³) recognition sample character data.

[0016] When some disturbance cannot occur on the actual production line or the like, the control unit 30 of FIG. 4 can turn off the corresponding switches so that the image processing simulating that disturbance is not performed. For example, when the conditions of the production line make it clear that image rotation cannot occur as a disturbance, the relevant switches are turned off so that the rotation processing is not performed. This reduces the generation of useless recognition sample character data and makes the recognition work in the recognition unit 21 of FIG. 1 more efficient.

[0017] FIG. 5 is a table showing the results of a character recognition experiment using a character recognition dictionary created by the conventional method, that is, by reading in and registering recognition sample character data one by one (the conventional dictionary), and a character recognition dictionary created by the device of this embodiment (the dictionary of this embodiment). The characters to be recognized were stamped characters captured as images from video and then subjected to the image processing simulating the disturbances shown in FIG. 3. The recognition rate was 88.67% with the conventional dictionary, whereas the dictionary of this embodiment achieved a high recognition rate of 99.46%.

[0018] The present invention is not limited to the above embodiment, and various modifications are possible within the scope of its gist. For example, the above embodiment describes the case in which "no disturbance", "smoothing", "thickening", "thinning", "pseudo noise", "enlargement", "reduction", and "rotation" are performed as the image processing that simulates disturbances on the character to be recognized, but these are only examples, and image processing such as "differentiation" may be performed together with them. In addition, the above embodiment deals with recognizing characters printed or stamped on products or the like moving along a production line, but the present invention can also be applied to handwritten characters.

[0019]

[Effects of the invention] As described above, according to the present invention, recognition sample character data are created by applying image processing that simulates disturbances applied to the character to be recognized to the character data that serve as the source of the recognition sample character data registered in the character recognition dictionary. The creation of recognition sample character data can therefore be automated, and a large number of recognition sample character data can be created quickly from a small number of source character data, so the burden on the user is greatly reduced compared with the conventional approach in which the user creates the recognition sample character data one by one. Furthermore, by selecting appropriate image processing, the disturbances applied to the character to be recognized can be simulated more closely, so that recognition performed with the recognition sample character data created by that image processing is more accurate. The invention thus provides a character recognition dictionary creation device and a character recognition dictionary creation method that improve recognition accuracy.

[Brief description of the drawings]

FIG. 1 is a schematic block diagram of an embodiment of the character recognition dictionary creation device according to the present invention.

FIG. 2 is a table listing the types of disturbance that can affect a character to be recognized and their causes.

FIG. 3 is a table summarizing some of the image processes actually applied to the basic image data in the disturbance processing unit of the embodiment.

FIG. 4 is a diagram illustrating the concept of the processing performed by the disturbance processing unit included in the character recognition dictionary creation device of the embodiment.

FIG. 5 is a table showing the results of a character recognition experiment using a character recognition dictionary created by the conventional method and a character recognition dictionary created by the device of the embodiment.

[Explanation of symbols]

1: character recognition dictionary creation unit
2: character recognition unit
10, 20: image data input unit
11: cutout unit
12: disturbance processing unit
13: dictionary creation processing unit
14: character recognition dictionary
21: recognition unit
22: character output unit
30: control unit
31: first-stage image processing unit
31_1 to 31_N: second-stage image processing units
31_11 to 31_1N, ..., 31_N1 to 31_NN: third-stage image processing units
S_1 to S_N: first-stage switches
S_11 to S_1N, ..., S_N1 to S_NN: second-stage switches
S_111 to S_11N, ..., S_NN1 to S_NNN: third-stage switches

Continuation of front page: (72) Inventor: Ichiro Kurihara, c/o Nippon Steel Corporation, 2-6-3 Otemachi, Chiyoda-ku, Tokyo

Claims

[Claims]

1. A character recognition dictionary creation device comprising: storage means for storing a character recognition dictionary; character input means for inputting character data that serve as the source of recognition sample character data to be registered in the character recognition dictionary; and image processing means that applies image processing simulating a disturbance applied to a character to be recognized to the character data input by the character input means, generates recognition sample character data from the character data input by the character input means, and registers the generated data in the character recognition dictionary in the storage means.

2. The character recognition dictionary creating apparatus according to claim 1, wherein said image processing means performs a plurality of different image processes on one character data input by said character input means.

3. The character recognition dictionary creation device according to claim 1, wherein the image processing means performs the same type of image processing a plurality of times, with different parameters, on one item of character data input by the character input means.

4. A character recognition dictionary creation method, characterized in that image processing simulating a disturbance applied to a character to be recognized is applied to character data that serve as the source of recognition sample character data to generate recognition sample character data, and the generated data are registered in a character recognition dictionary in storage means, thereby creating the character recognition dictionary.

5. The character recognition dictionary creation method according to claim 4, wherein, as the image processing, a plurality of different image processes are performed on the character data that serve as the source of the recognition sample character data registered in the character recognition dictionary.

6. The character recognition dictionary creation method according to claim 4 or 5, wherein, as the image processing, the same type of image processing is performed a plurality of times, with different parameters, on the character data that serve as the source of the recognition sample character data registered in the character recognition dictionary.