JP5679936B2 - Word recognition device, word recognition method, and paper sheet processing device provided with word recognition device

Info

Publication number: JP5679936B2
Application number: JP2011193116A
Authority: JP
Inventors: 浜村　倫行; 入江　文平; 匡哉前田; 英朴
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2011-09-05
Filing date: 2011-09-05
Publication date: 2015-03-04
Anticipated expiration: 2031-09-05
Also published as: JP2013054583A

Description

Embodiments of the present invention relate to a word recognition device, a word recognition method, and a paper sheet processing apparatus including the word recognition device.

Paper sheet processing apparatuses, such as mail sorting machines that process paper sheets such as mail, are already in practical use. Such an apparatus takes in the paper sheets loaded into its input unit one at a time and acquires an image from each sheet. The apparatus includes a word recognition device, which recognizes words on the paper sheet based on the acquired image. Based on the recognition result, the apparatus identifies the address or other information written on the paper sheet and sorts the sheet into a predetermined sorting pocket.

Two approaches to word recognition are generally known: the analytical method (analytic approach) and the overall method (holistic approach). The two are complementary, so a word recognition device can recognize words with higher accuracy by using both.

When recognizing a word by the analytical method, the word recognition device extracts a plurality of cut point candidates from the word image and, based on the extracted cut point candidates, generates a plurality of character candidates, some of which overlap one another. The device then calculates an evaluation value for the analytical method using the posterior probability ratio and, based on that value, selects the correct combination from among the character candidates.

When recognizing a word by the overall method, the word recognition device uses, for example, a hidden Markov model (HMM).

Japanese Patent No. 4601835

One way to combine the analytical method and the overall method is to run recognition with the overall method first, divide the word into characters based on the result, and then recognize each divided character with the analytical method, thereby verifying the word recognition result. However, if the overall method misrecognizes the word, the subsequent character recognition fails, so the word recognition device cannot achieve sufficient accuracy.

An object is therefore to provide a word recognition device and a word recognition method that can recognize words with higher accuracy, and a paper sheet processing apparatus including the word recognition device.

A word recognition device according to an embodiment includes: a word dictionary that stores a plurality of words; image receiving means for receiving an image including words; word image extracting means for extracting a word image for each word from the image; character candidate extracting means for extracting character candidates from the word image; character recognition means for performing character recognition on the character candidates; analytical matching means for calculating a first evaluation value for each word stored in the word dictionary based on the result of the character recognition by the character recognition means; feature extracting means for extracting a feature from the word image; word model generating means for generating a word model for each word stored in the word dictionary; overall matching means for calculating, for each word model, a second evaluation value indicating the probability that the feature appears; feature probability calculating means for calculating a feature probability, that is, the probability that the feature appears; integrated evaluation value calculating means for calculating a third evaluation value by multiplying the first evaluation value, the second evaluation value, and the reciprocal of the feature probability; and output means for outputting the third evaluation value calculated by the integrated evaluation value calculating means.

FIG. 1 is a diagram for explaining an example of a paper sheet processing apparatus according to an embodiment.
FIG. 2 is a diagram for explaining an example of a word recognition device according to an embodiment.
FIG. 3 is a diagram for explaining processing of the word recognition device according to the embodiment.
FIG. 4 is a diagram for explaining processing of the word recognition device according to the embodiment.
FIG. 5 is a diagram for explaining processing of the word recognition device according to the embodiment.
FIG. 6 is a diagram for explaining processing of the word recognition device according to the embodiment.
FIG. 7 is a diagram for explaining processing of the word recognition device according to the embodiment.
FIG. 8 is a diagram for explaining processing of the word recognition device according to the embodiment.
FIG. 9 is a diagram for explaining processing of the word recognition device according to the embodiment.

(First embodiment)
Hereinafter, a paper sheet processing apparatus and a light detection apparatus according to an embodiment will be described in detail with reference to the drawings.

FIG. 1 shows a configuration example of a paper sheet processing apparatus 100 according to an embodiment.
The paper sheet processing apparatus 100 reads an image from a paper sheet, recognizes destination information, the position where a postage stamp is affixed, and the like from the read image, postmarks the paper sheet, and sorts it. The paper sheet processing apparatus 100 includes a supply unit 200, a separation roller 210, a conveyance path 220, an image reading unit 400, a stamping unit 460, a printing unit 470, a main control unit 500, a sorting processing unit 300, a word recognition unit 600, an operation unit 700, a display unit 800, and an input/output unit 900.

The main control unit 500 controls the operation of each unit of the paper sheet processing apparatus 100 in an integrated manner. The main control unit 500 includes a CPU, a buffer memory, a program memory, a nonvolatile memory, and the like. The CPU performs various arithmetic processes. The buffer memory temporarily stores the results of calculations performed by the CPU. The program memory and the nonvolatile memory store the various programs executed by the CPU, control data, and the like. The main control unit 500 performs various processes by having the CPU execute the programs stored in the program memory.

The supply unit 200 stocks the paper sheets 1 to be taken into the paper sheet processing apparatus 100. The supply unit 200 receives the paper sheets 1 together as a stack.

The separation roller 210 is installed, for example, at the lower end of the supply unit 200. When paper sheets 1 are loaded into the supply unit 200, the separation roller 210 contacts the lower end, in the stacking direction, of the loaded paper sheets 1. By rotating, the separation roller 210 takes the paper sheets 1 set in the supply unit 200 into the paper sheet processing apparatus 100 one at a time, starting from the lower end in the stacking direction.

The separation roller 210 takes in, for example, one paper sheet 1 per rotation. This allows the separation roller 210 to take in the paper sheets 1 at a constant pitch. Each paper sheet 1 taken in by the separation roller 210 is introduced into the conveyance path 220.

The conveyance path 220 is a conveyance unit that conveys the paper sheet 1 to each unit in the paper sheet processing apparatus 100. The conveyance path 220 includes a conveyance belt, a drive pulley, and the like (none shown). The conveyance path 220 drives the drive pulley with a drive motor (not shown), and the drive pulley moves the conveyance belt.

The conveyance path 220 uses the conveyance belt to convey the paper sheet 1 taken in by the separation roller 210 at a constant speed in the direction of arrow a (conveyance direction a). In the following description, the side of the conveyance path 220 closer to the separation roller 210 is called the upstream side, and the opposite side is called the downstream side.

The image reading unit 400 acquires an image from the paper sheet 1 conveyed along the conveyance path 220. The image reading unit 400 includes, for example, an illumination device and an optical sensor. The illumination device irradiates the paper sheet 1 conveyed along the conveyance path 220 with light. The optical sensor includes a light receiving element such as a charge-coupled device (CCD) and an optical system (lens). The optical sensor receives the light reflected by the paper sheet 1 through the optical system, forms an image on the CCD, and acquires an electrical signal (image). By acquiring images continuously from the paper sheet 1 as it is conveyed, the image reading unit 400 acquires an image of the entire paper sheet 1. The image reading unit 400 supplies the acquired image to the main control unit 500. The image reading unit 400 may instead include a video camera or the like.

Based on the image received from the image reading unit 400, the main control unit 500 performs processing for determining the conveyance destination of the paper sheet 1. To this end, the main control unit 500 has the word recognition unit 600 recognize the words in the image and thereby identifies the destination address (destination information) and the like. The main control unit 500 generates an image such as a two-dimensional code or a barcode based on the destination information and supplies the generated image to the printing unit 470.

The main control unit 500 also identifies the position on the paper sheet 1 where a postage stamp or the like is affixed, and controls the operation of the stamping unit 460 based on the identified position.

Under the control of the main control unit 500, the stamping unit 460 impresses a mark such as a date stamp on the paper sheet 1. That is, the main control unit 500 controls the stamping unit 460 so that the mark is impressed at a position overlapping the position where the postage stamp is affixed. For example, the stamping unit 460 impresses the mark as a tally impression, so that it straddles the postage stamp and the paper sheet.

Under the control of the main control unit 500, the printing unit 470 prints an image such as a two-dimensional code or a barcode. That is, the printing unit 470 prints the two-dimensional code, barcode, or other image supplied from the main control unit 500. For example, the printing unit 470 prints the image with an ink containing a phosphor or the like that emits excitation light when irradiated with ultraviolet light.

Under the control of the main control unit 500, the sorting processing unit 300 sorts and stacks the paper sheets 1. The sorting processing unit 300 includes a plurality of gates and stackers, for example a first gate 310, a first stacker 320, a second gate 330, and a second stacker 340, and further includes additional gates and stackers. A stacker is provided, for example, for each piece of destination information, and a gate is provided for each stacker.

The main control unit 500 sorts the paper sheets 1 by controlling each gate of the sorting processing unit 300. As a result, the sorting processing unit 300 can stack the paper sheets 1 in a different stacker for each piece of destination information.

The first gate 310 and the second gate 330 are provided on the conveyance path 220 downstream of the image reading unit 400, the stamping unit 460, and the printing unit 470. The first gate 310 and the second gate 330 each operate under the control of the main control unit 500, which controls them according to the destination information recognized by the processing described above.

The first gate 310 switches the conveyance destination of the paper sheet 1 between the first stacker 320 and the second gate 330. The second gate 330 switches the conveyance destination of the paper sheet 1 between the second stacker 340 and the other stackers.
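The gate cascade described above can be sketched as a walk along the gates in conveyance order. This is a minimal illustration only: the function name `route_sheet` and the representation of stackers as a list of destination strings are assumptions, not part of the patent.

```python
from typing import Optional


def route_sheet(destination: Optional[str],
                stacker_destinations: list[str]) -> Optional[int]:
    """Return the index of the gate that diverts the sheet into its stacker.

    Gates are visited in conveyance order. Each gate either diverts the
    sheet into its own stacker or passes it on to the next gate.
    None means that no gate diverted the sheet (for example, a sheet
    whose destination could not be identified).
    """
    for gate_index, stacker_destination in enumerate(stacker_destinations):
        if destination == stacker_destination:
            return gate_index
    return None


# Hypothetical layout: first stacker holds OSAKA, second holds TOKYO.
print(route_sheet("TOKYO", ["OSAKA", "TOKYO"]))
```

A sheet that passes every gate would end up in whatever stacker sits at the end of the path; that handling is left out of the sketch.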

So that the word recognition unit 600 can recognize the words in the image, the main control unit 500 supplies the image received from the image reading unit 400 to the word recognition unit 600.

The word recognition unit 600 recognizes the words in the received image and outputs the recognition result to the main control unit 500. The main control unit 500 identifies the destination information and the like based on the recognition result from the word recognition unit 600.

The main control unit 500 also includes a memory that holds the images of paper sheets 1 whose destination information could not be identified. The sorting processing unit 300 likewise includes a stacker that collects the paper sheets 1 whose destination information could not be identified.

The operation unit 700 accepts various operation inputs from the operator. The operation unit 700 generates an operation signal based on the operation input by the operator and transmits the generated signal to the main control unit 500.

For example, the paper sheet processing apparatus 100 may have a video coding system (VCS) function. In that case, the main control unit 500 causes the display unit 800 to display the image of a paper sheet 1 whose destination information could not be identified. The operator reads the displayed image and enters the destination information through the operation unit 700. In this way, the paper sheet processing apparatus 100 can obtain the correct destination information.

The display unit 800 displays various screens under the control of the main control unit 500. For example, the display unit 800 displays operation guidance, processing results, and the like to the operator. As described above, the display unit 800 may also display the image of a paper sheet 1 whose destination information was not identified. The operation unit 700 and the display unit 800 may be formed integrally as a touch panel.

The input/output unit 900 exchanges data with an external device or a storage medium connected to the paper sheet processing apparatus 100. For example, the input/output unit 900 includes a disk drive, a USB connector, a LAN connector, or another interface capable of transmitting and receiving data. The paper sheet processing apparatus 100 can acquire data from an external device or storage medium connected to the input/output unit 900, and can also transmit processing results to such a device or medium.

FIG. 2 shows an example of the configuration of the word recognition unit 600 according to the first embodiment.
The word recognition unit 600 includes an image receiving unit 601, a word extraction unit 602, a character candidate extraction unit 603, a character recognition unit 604, a feature extraction unit 605, an analytical matching unit 610, an overall matching unit 620, a feature probability calculation unit 630, a VCS 640, a first word image storage unit 641, a model learning unit 642, a model storage unit 643, a word model generation unit 644, a word dictionary 645, a prior probability calculation unit 651, a prior probability storage unit 652, a prior probability input unit 653, an integrated evaluation value calculation unit 660, and a prior probability multiplication unit 670.

The operation of the word recognition unit 600 is broadly divided into a recognition phase and a learning phase. The recognition phase is described first.

Using the units listed above, the word recognition unit 600 performs analytical matching, performs overall matching, calculates the feature probability, and integrates these results with the prior probability of each word. In this way, the word recognition unit 600 can calculate an evaluation value (posterior probability) for each word.

In pattern recognition in general, it is optimal to assign a pattern to the category with the maximum posterior probability, because doing so minimizes the classification error. Accordingly, by outputting the word with the highest calculated posterior probability as the recognition result, the word recognition unit 600 can identify a single word and transmit it to the main control unit 500. Alternatively, the word recognition unit 600 may output the evaluation value of each word to the main control unit 500 as the recognition result. In that case, the main control unit 500 can identify the destination information by taking into account the posterior probabilities of a plurality of words and their combinations with other words.

For example, let Y denote the full set of character recognition results within a word candidate in analytical matching, and let X denote the feature extracted from the image for use in overall matching. The word recognition unit 600 then calculates the posterior probability P(w | Y, X) of a word w based on Equation 1.

The left side of Equation 1 is the posterior probability of a word, conditioned on the set of character recognition results from analytical matching and the set of feature extraction results used for overall matching. In other words, the posterior probability P(w | Y, X) is the per-word evaluation value obtained when analytical matching and overall matching are used together.

The left side of Equation 1 can be expanded by Bayes' theorem into the right side of the first line. Furthermore, by treating the result of analytical matching and the result of overall matching as independent of each other, the right side of the first line can be approximated by the right side of the second line.

In Equation 1, P(Y | w) / P(Y) is the result of analytical matching (the posterior probability ratio), P(X | w) is the result of overall matching (the likelihood), P(X) is the calculated feature probability, and P(w) is the prior probability of the word w.
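Equation 1 itself is not reproduced in this text. From the description of its terms (Bayes' theorem on the first line, the independence approximation on the second), it can be reconstructed as follows. Treat this as a reconstruction from the surrounding prose rather than the patent's exact typesetting.

```latex
\begin{aligned}
P(w \mid Y, X) &= \frac{P(Y, X \mid w)\, P(w)}{P(Y, X)} \\
               &\approx \frac{P(Y \mid w)}{P(Y)} \cdot \frac{P(X \mid w)}{P(X)} \cdot P(w)
\end{aligned}
```

The second line assumes both P(Y, X | w) ≈ P(Y | w) P(X | w) and P(Y, X) ≈ P(Y) P(X), which is what treating the two matching results as independent amounts to.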

The word recognition unit 600 calculates each of the above terms and evaluates Equation 1 to obtain the posterior probability of each word.
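The integration step can be sketched in a few lines, assuming the four terms have already been computed for every dictionary word. The function names, the dictionary-of-floats representation, and the numbers in the usage line are illustrative assumptions, not values from the patent.

```python
def integrate_scores(analytic_ratio: dict[str, float],
                     holistic_likelihood: dict[str, float],
                     prior: dict[str, float],
                     feature_prob: float) -> dict[str, float]:
    """Approximate P(w | Y, X) for each word w.

    analytic_ratio[w]      : P(Y | w) / P(Y), from analytical matching
    holistic_likelihood[w] : P(X | w), from overall matching
    prior[w]               : P(w)
    feature_prob           : P(X), shared by all words
    """
    if feature_prob <= 0.0:
        raise ValueError("feature_prob must be positive")
    return {
        word: analytic_ratio[word] * holistic_likelihood[word]
        / feature_prob * prior[word]
        for word in analytic_ratio
    }


def best_word(scores: dict[str, float]) -> str:
    """Return the word with the maximum approximate posterior."""
    return max(scores, key=scores.get)


# Made-up inputs for two dictionary words.
scores = integrate_scores({"ham": 8.0, "hat": 2.0},
                          {"ham": 0.02, "hat": 0.01},
                          {"ham": 0.2, "hat": 0.2},
                          feature_prob=0.04)
print(best_word(scores))
```

Because P(X) is the same for every word, dividing by it does not change which word wins; it matters when the scores themselves are passed on as evaluation values.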

Analytical matching is described first. The image receiving unit 601 of the word recognition unit 600 receives an image of the paper sheet 1 (paper sheet image) from the main control unit 500. FIG. 3 shows an example of a paper sheet image. As shown in FIG. 3, the image receiving unit 601 receives a paper sheet image that includes the words written on the paper sheet 1. FIG. 3 shows an example in which the destination and other information are written in English words, but the present embodiment is also applicable when the destination on the paper sheet 1 is written in Japanese or another language. The image receiving unit 601 transmits the received paper sheet image to the word extraction unit 602.

The word extraction unit 602 extracts word candidates (word images) from the paper sheet image received by the image receiving unit 601. For example, the word extraction unit 602 applies image processing to the paper sheet image to identify and extract regions that are likely to be separable as words. FIG. 4 shows an example of word candidates. As shown in FIG. 4, the word extraction unit 602 extracts the word candidates in the paper sheet image.

For example, the word extraction unit 602 extracts word candidates by recognizing the spaces between words. Alternatively, the word extraction unit 602 may extract word candidates by detecting delimiting keywords such as "city" (市) or "town" (町). The process of extracting word candidates is not limited to these methods; any method may be used. The word extraction unit 602 transmits the extracted word candidates to the character candidate extraction unit 603 and the feature extraction unit 605.
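As one possible illustration of the space-based approach, the sketch below splits a text line into word candidates wherever a run of ink-free columns is at least `min_gap` wide. The patent does not specify this procedure; the column-projection input, the function name `split_words`, and the threshold are all assumptions.

```python
def split_words(column_ink: list[int], min_gap: int = 3) -> list[tuple[int, int]]:
    """Split a text line into word spans using gaps in a column projection.

    column_ink[x] is the number of ink pixels in column x.
    Returns (start, end) column ranges with end exclusive.
    A run of at least min_gap empty columns ends the current word.
    """
    spans = []
    start = None  # first column of the word being built
    gap = 0       # length of the current run of empty columns
    for x, ink in enumerate(column_ink):
        if ink > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                spans.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Trim trailing empty columns from the last word.
        spans.append((start, len(column_ink) - gap))
    return spans


# The one-column gap at x=3 is treated as spacing inside a word.
print(split_words([0, 1, 1, 0, 1, 0, 0, 0, 2, 2, 0]))
```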

The character candidate extraction unit 603 extracts character candidates from a word candidate. It applies image processing to the word candidate (word image) to identify and extract regions that are likely to be separable as characters. FIG. 5 shows an example of the process of extracting character candidates from a word candidate. As shown in FIG. 5, the character candidate extraction unit 603 extracts a plurality of cut point candidates from the word candidate and, based on the extracted cut point candidates, extracts a plurality of character candidates, some of which overlap one another. That is, the character candidate extraction unit 603 identifies regions that are likely to be recognizable as a single character and extracts them as character candidates. The character candidate extraction unit 603 transmits the extracted character candidates to the character recognition unit 604.
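A simple way to obtain overlapping character candidates from cut points is to take every span between two cut points that are at most a few intervals apart. The sketch below does that; the function name `character_candidates` and the `max_merge` limit are assumptions made for illustration, since the patent does not state how candidates are enumerated.

```python
def character_candidates(cut_points: list[int],
                         max_merge: int = 2) -> list[tuple[int, int]]:
    """Enumerate character candidates as (left, right) column spans.

    cut_points includes the left and right borders of the word image.
    Each candidate covers between 1 and max_merge adjacent intervals,
    so neighbouring candidates can overlap. The position of a candidate
    in the returned list can serve as its serial number.
    """
    points = sorted(cut_points)
    last = len(points) - 1
    candidates = []
    for i in range(last):
        for j in range(i + 1, min(i + max_merge, last) + 1):
            candidates.append((points[i], points[j]))
    return candidates


# Three intervals give three single-interval and two merged candidates.
print(character_candidates([0, 10, 20, 30]))
```

A path through the word is then a sequence of candidates that tile the word image without gaps or overlaps, which is what the analytical matching step searches for.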

The character recognition unit 604 performs character recognition on each character candidate and obtains a character recognition result. That is, the character recognition unit 604 obtains the character recognition result by comparing the image of the character candidate with a character recognition dictionary prepared in advance. The character recognition unit 604 transmits the character recognition result for each character candidate to the analytical matching unit 610.

The word dictionary 645 stores the words to be recognized as a list. FIG. 6 shows an example of the word dictionary 645. When recognizing a word, the word recognition unit 600 selects the correct word from the list in the word dictionary 645. The word dictionary 645 supplies the word list to the analytical matching unit 610.

Based on the character recognition result for each character candidate transmitted from the character recognition unit 604, the analytical matching unit 610 calculates a posterior probability ratio for each word stored in the word dictionary 645. In doing so, the analytical matching unit 610 searches for the correct path through the plurality of character candidates extracted by the character candidate extraction unit 603.

For example, let c_i be the i-th character of a word w, f(i) the serial number of the character candidate corresponding to the i-th character, y_f(i) the character recognition result of that character candidate, and N the number of characters in the word w. The posterior probability ratio P(Y | w) / P(Y) of the word w is then approximated as shown in Equation 2.

For example, when the target word is "ham", c_1 = "h", c_2 = "a", c_3 = "m", and N = 3. Here, P(y_f(i) | c_i) / P(y_f(i)) is the posterior probability ratio of the i-th character.

The analytical matching unit 610 calculates the posterior probability ratio P(Y | w) / P(Y) of the word w by multiplying together the posterior probability ratios of the i-th characters for i = 1 to N. That is, for each word in the word list, the analytical matching unit 610 evaluates Equation 2 based on the character recognition results to obtain the posterior probability ratio of that word.
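Equation 2 is likewise not reproduced in this text. Based on the definitions of c_i, f(i), y_f(i), and N and on the product described above, it can be reconstructed as follows (a reconstruction from the prose, not the patent's exact typesetting).

```latex
\frac{P(Y \mid w)}{P(Y)} \approx \prod_{i=1}^{N} \frac{P\left(y_{f(i)} \mid c_i\right)}{P\left(y_{f(i)}\right)}
```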

なお、解析的マッチング部６１０は、文字確率計算部６１１、第１の演算部６１２、第２の演算部６１３を備える。文字確率計算部６１１は、数式２の右辺の各因子の分子を計算する。即ち、文字確率計算部６１１は、Ｐ（ｙ_ｆ（ｉ）｜ｃ_ｉ）をある単語ｗの各文字毎に算出する。 The analytical matching unit 610 includes a character probability calculation unit 611, a first calculation unit 612, and a second calculation unit 613. The character probability calculation unit 611 calculates the numerator of each factor on the right side of Equation 2. That is, the character probability calculation unit 611 calculates P (y _{f (i)} | c _i ) for each character of a certain word w.

第１の演算部６１２は、数式２の右辺の各因子を計算する。即ち、第１の演算部６１２は、右辺の分母であるＰ（ｙ_ｆ（ｉ））を算出し、算出した値で分子であるＰ（ｙ_ｆ（ｉ）｜ｃ_ｉ）を割る。なお、Ｐ（ｙ_ｆ（ｉ））は、文字認識結果ｙ_ｆ（ｉ）の出現する確率である。 The first calculation unit 612 calculates each factor on the right side of Equation 2. That is, the first calculation unit 612 calculates P (y _{f (i)} ) that is the denominator of the right side, and divides P (y _{f (i)} | c _i ) that is the numerator by the calculated value. Note that P (y _{f (i)} ) is the probability that the character recognition result y _{f (i)} will appear.

第２の演算部６１３は、数式２の右辺を計算する。即ち第２の演算部６１３は、第１の演算部６１２の演算結果である数式２の右辺の各因子を全て掛け合わせる。これにより、解析的マッチング部６１０は、単語ｗの事後確率比Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）を算出することができる。解析的マッチング部６１０は、算出した事後確率比Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）を統合評価値算出部６６０に出力する。 The second calculation unit 613 calculates the right side of Equation 2. That is, the second calculation unit 613 multiplies all the factors on the right side of Formula 2 that is the calculation result of the first calculation unit 612. Thereby, the analytical matching unit 610 can calculate the posterior probability ratio P (Y | w) / P (Y) of the word w. The analytical matching unit 610 outputs the calculated posterior probability ratio P (Y | w) / P (Y) to the integrated evaluation value calculation unit 660.
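
The division of work among the units 611, 612, and 613 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `word_posterior_ratio` and the numeric inputs are hypothetical, and the product is accumulated in the log domain only to avoid underflow for long words, which the description does not require.

```python
import math

def word_posterior_ratio(char_likelihoods, char_evidences):
    """Approximate P(Y|w)/P(Y) as a product of per-character ratios.

    char_likelihoods[i] plays the role of P(y_f(i) | c_i) (unit 611),
    char_evidences[i]  plays the role of P(y_f(i)).
    """
    if len(char_likelihoods) != len(char_evidences):
        raise ValueError("one likelihood and one evidence value per character")
    log_ratio = 0.0
    for numerator, denominator in zip(char_likelihoods, char_evidences):
        # Unit 612: one factor of the right side, numerator / denominator.
        log_ratio += math.log(numerator) - math.log(denominator)
    # Unit 613: the product of all factors (a sum in the log domain).
    return math.exp(log_ratio)

# Hypothetical values for the word "ham" (N = 3).
ratio = word_posterior_ratio([0.6, 0.5, 0.4], [0.1, 0.2, 0.1])
```

With these made-up inputs the three factors are 6, 2.5, and 4, so the word-level ratio is their product.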

次に、全体的マッチングについて説明する。図２の特徴抽出部６０５は、上記したように、単語抽出部６０２から単語候補の画像を受け取る。特徴抽出部６０５は、受け取った単語候補の画像に基づいて、ベクトルの集合である特徴Ｘを抽出する。 Next, overall matching will be described. The feature extraction unit 605 in FIG. 2 receives the image of the word candidate from the word extraction unit 602 as described above. The feature extraction unit 605 extracts a feature X that is a set of vectors based on the received word candidate images.

例えば、特徴抽出部６０５は、単語候補の画像をぼかした後の輝度勾配情報を１２８次元のベクトルとして特徴Ｘを抽出する。特徴抽出部６０５は、単語候補の画像中の注目する領域（注目領域）を画像中の左から右にずらしながら複数の特徴を抽出する。 For example, the feature extraction unit 605 extracts the feature X by representing the luminance gradient information of the blurred word candidate image as 128-dimensional vectors. The feature extraction unit 605 extracts a plurality of features while shifting a region of interest in the word candidate image from left to right across the image.
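
The left-to-right scan can be illustrated with a small sketch. It is deliberately simplified and rests on assumptions: the image is a plain list of pixel rows, blurring is omitted, and each feature vector holds raw horizontal pixel differences rather than the 128-dimensional gradient descriptor mentioned above. The function name and parameters are hypothetical.

```python
def sliding_window_features(image, window_width, step):
    """Return one feature vector per window position, scanning left to right.

    image is a list of equally long rows of grayscale values.
    """
    width = len(image[0])
    features = []
    for left in range(0, width - window_width + 1, step):
        vector = []
        for row in image:
            window = row[left:left + window_width]
            # Horizontal gradient: differences between neighbouring pixels.
            vector.extend(window[k + 1] - window[k]
                          for k in range(window_width - 1))
        features.append(vector)
    return features

# A tiny 2 x 6 example image (made-up values).
image = [
    [0, 0, 9, 9, 0, 0],
    [0, 9, 9, 9, 9, 0],
]
X = sliding_window_features(image, window_width=3, step=1)
```

Here X is the ordered set of vectors that the later matching stages consume, one per position of the region of interest.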

また、特徴抽出部６０５は、単語候補の画像の画素の濃度値を特徴として用いる構成であってもよい。またさらに、特徴抽出部６０５は、単語候補の画像をより簡易化することにより得られるパターンの濃度値を特徴として用いる構成であってもよい。 The feature extraction unit 605 may be configured to use the density value of the pixel of the word candidate image as a feature. Furthermore, the feature extraction unit 605 may be configured to use a pattern density value obtained by further simplifying a word candidate image as a feature.

上記の処理により、特徴抽出部６０５は、１つの単語候補の画像から複数個の特徴を抽出する。特徴抽出部６０５は、抽出した特徴Ｘを全体的マッチング部６２０、及び特徴確率計算部６３０に出力する。 Through the above processing, the feature extraction unit 605 extracts a plurality of features from one word candidate image. The feature extraction unit 605 outputs the extracted feature X to the overall matching unit 620 and the feature probability calculation unit 630.

モデル格納部６４３は、各文字毎の文字モデル、または単語毎の単語モデルなどを格納している。なお、モデル格納部６４３は、単語辞書６４５内の各単語に対応する単語モデルを格納する構成であってもよい。また、モデル格納部６４３は、任意の単語に対応する任意単語モデルを格納する構成であってもよい。 The model storage unit 643 stores a character model for each character or a word model for each word. The model storage unit 643 may be configured to store a word model corresponding to each word in the word dictionary 645. Further, the model storage unit 643 may be configured to store an arbitrary word model corresponding to an arbitrary word.

単語モデル生成部６４４は、モデル格納部６４３に格納されている文字モデル及び単語モデルを用いて、単語辞書６４５内の各単語に対応する単語モデルを生成する。単語モデル生成部６４４は、生成した単語モデルを全体的マッチング部６２０に出力する。 The word model generation unit 644 generates a word model corresponding to each word in the word dictionary 645 using the character model and the word model stored in the model storage unit 643. The word model generation unit 644 outputs the generated word model to the overall matching unit 620.

例えば、単語モデル生成部６４４は、モデル格納部６４３に格納されている文字モデルを読み出し、単語辞書６４５内の単語に応じて文字モデルを連結させることにより、単語モデルを生成する。なお、単語辞書６４５内の単語に対応する単語モデルがモデル格納部６４３に格納されている場合、単語モデル生成部６４４は、モデル格納部６４３に格納されている単語モデルをそのまま全体的マッチング部６２０に出力する。 For example, the word model generation unit 644 reads the character models stored in the model storage unit 643 and generates a word model by concatenating the character models according to a word in the word dictionary 645. When a word model corresponding to a word in the word dictionary 645 is already stored in the model storage unit 643, the word model generation unit 644 outputs that stored word model as it is to the overall matching unit 620.
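
The concatenation step can be sketched as follows, assuming for illustration that a character model is just an ordered list of state labels. A real character model would also carry transition and output parameters. All names here are hypothetical.

```python
def build_word_model(word, char_models, stored_word_models=None):
    """Return a left-to-right list of (character, state) pairs for word."""
    if stored_word_models and word in stored_word_models:
        # A ready-made word model is passed through unchanged.
        return stored_word_models[word]
    word_model = []
    for ch in word:
        # Append this character's states after those of the previous characters.
        word_model.extend((ch, state) for state in char_models[ch])
    return word_model

# Hypothetical character models with different numbers of states.
char_models = {"h": ["s0", "s1"], "a": ["s0"], "m": ["s0", "s1", "s2"]}
ham_model = build_word_model("ham", char_models)
```

The resulting state sequence for "ham" is the states of "h", then "a", then "m", in that order.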

全体的マッチング部６２０は、特徴抽出部６０５により抽出された特徴Ｘと、単語モデル生成部６４４から出力された単語モデルとに基づいて、尤度Ｐ（Ｘ｜ｗ）を計算する。尤度Ｐ（Ｘ｜ｗ）は、特徴抽出部６０５により抽出された特徴Ｘが単語モデル生成部６４４から出力された単語モデルから出力される確率である。なお、尤度Ｐ（Ｘ｜ｗ）は、数式１の右辺の第２因子の分子と同じものである。 The overall matching unit 620 calculates a likelihood P (X | w) based on the feature X extracted by the feature extraction unit 605 and the word model output from the word model generation unit 644. The likelihood P (X | w) is a probability that the feature X extracted by the feature extraction unit 605 is output from the word model output from the word model generation unit 644. The likelihood P (X | w) is the same as the numerator of the second factor on the right side of Equation 1.

全体的マッチング部６２０は、ビタビアルゴリズム（Ｖｉｔｅｒｂｉａｌｇｏｒｉｔｈｍ）を用いることにより、尤度Ｐ（Ｘ｜ｗ）を算出する。 The overall matching unit 620 calculates a likelihood P (X | w) by using a Viterbi algorithm.

ビタビアルゴリズムは、モデルパラメータが既知である場合に、与えられた配列を出力した可能性（尤度）が最も高い状態列を計算するアルゴリズムである。即ち、ビタビアルゴリズムは、特徴Ｘを結果として生じる隠された事象の系列を探す動的計画法アルゴリズムである。 The Viterbi algorithm is an algorithm that calculates a state sequence having the highest possibility (likelihood) of outputting a given array when model parameters are known. That is, the Viterbi algorithm is a dynamic programming algorithm that searches for a sequence of hidden events that result in feature X.

全体的マッチング部６２０は、ビタビアルゴリズムにより、単語モデル生成部６４４から出力された単語モデルを既知のパラメータとして、特徴Ｘが出現する確率としての尤度Ｐ（Ｘ｜ｗ）を算出する。即ち、尤度Ｐ（Ｘ｜ｗ）は、単語ｗに対応する単語モデルから特徴Ｘが出現する確率を示す。全体的マッチング部６２０は、算出した尤度Ｐ（Ｘ｜ｗ）を統合評価値算出部６６０に出力する。 The overall matching unit 620 calculates a likelihood P (X | w) as a probability that the feature X appears using the Viterbi algorithm with the word model output from the word model generation unit 644 as a known parameter. That is, the likelihood P (X | w) indicates the probability that the feature X appears from the word model corresponding to the word w. The overall matching unit 620 outputs the calculated likelihood P (X | w) to the integrated evaluation value calculation unit 660.
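
As an illustration of the Viterbi computation described above, the following is a minimal sketch for a hidden Markov model with discrete outputs. It returns the log probability of the single best state path, which is the quantity the Viterbi algorithm yields and which is used here as the likelihood score. The discrete outputs, the parameter layout, and all names are assumptions made for brevity; the feature vectors in the description are continuous.

```python
import math

NEG_INF = float("-inf")

def _log(p):
    # log(0) is treated as minus infinity so forbidden transitions drop out.
    return math.log(p) if p > 0.0 else NEG_INF

def viterbi_log_likelihood(observations, start_p, trans_p, emit_p):
    """Log probability of the best state path for a non-empty observation list.

    start_p[s]    : probability of starting in state s
    trans_p[a][b] : probability of moving from state a to state b
    emit_p[s][o]  : probability that state s outputs symbol o
    """
    n_states = len(start_p)
    scores = [_log(start_p[s]) + _log(emit_p[s][observations[0]])
              for s in range(n_states)]
    for obs in observations[1:]:
        # For each state keep only the best predecessor (dynamic programming).
        scores = [
            max(scores[prev] + _log(trans_p[prev][s]) for prev in range(n_states))
            + _log(emit_p[s][obs])
            for s in range(n_states)
        ]
    return max(scores)

# Hypothetical two-state left-to-right model.
start_p = [1.0, 0.0]
trans_p = [[0.5, 0.5], [0.0, 1.0]]
emit_p = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}]
score = viterbi_log_likelihood(["a", "b"], start_p, trans_p, emit_p)
```

For this toy model the best path stays in the first state for "a" and moves to the second state for "b", giving a path probability of 0.9 × 0.5 × 0.8.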

次に、特徴確率の計算について説明する。図２の特徴確率計算部６３０は、特徴抽出部６０５により抽出された特徴Ｘと、モデル格納部６４３に格納されている任意単語モデルとに基づいて、任意の単語から特徴Ｘが出力される特徴確率Ｐ（Ｘ）を算出する。 Next, calculation of the feature probability will be described. The feature probability calculation unit 630 in FIG. 2 outputs a feature X from an arbitrary word based on the feature X extracted by the feature extraction unit 605 and the arbitrary word model stored in the model storage unit 643. Probability P (X) is calculated.

任意の単語をｃ＊とした場合、特徴確率Ｐ（Ｘ）は、Ｐ（Ｘ）＝Ｐ（Ｘ｜ｃ＊）と表すことが出来る。即ち、特徴確率Ｐ（Ｘ）は、任意の単語ｃ＊から特徴Ｘが出力される確率を示す。この為、特徴確率計算部６３０は、上記したビタビアルゴリズムを用いて特徴確率Ｐ（Ｘ）を算出することができる。なお、特徴確率Ｐ（Ｘ）は、単語に因らず特徴Ｘ毎に一定の値である。 When an arbitrary word is c *, the feature probability P (X) can be expressed as P (X) = P (X | c *). That is, the feature probability P (X) indicates the probability that the feature X is output from an arbitrary word c *. Therefore, the feature probability calculation unit 630 can calculate the feature probability P (X) using the Viterbi algorithm described above. The feature probability P (X) is a constant value for each feature X regardless of the word.

任意の単語ｃ＊に対応するモデル（任意単語モデル）は、例えばエルゴディック隠れマルコフモデル（ｅｒｇｏｄｉｃＨＭＭ）を用いた方法により生成される。任意単語モデルは、予め生成されてモデル格納部６４３に格納される。即ち、特徴確率計算部６３０は、モデル格納部６４３に格納されている任意単語モデルを取得し、任意単語モデルと、特徴Ｘとに基づいて、上記したビタビアルゴリズムにより特徴確率Ｐ（Ｘ）を算出する。特徴確率計算部６３０は、算出した特徴確率Ｐ（Ｘ）を統合評価値算出部６６０に出力する。 A model corresponding to an arbitrary word c * (arbitrary word model) is generated by, for example, a method using an ergodic hidden Markov model (ergodic HMM). The arbitrary word model is generated in advance and stored in the model storage unit 643. That is, the feature probability calculation unit 630 acquires the arbitrary word model stored in the model storage unit 643 and calculates the feature probability P (X) by the Viterbi algorithm described above, based on the arbitrary word model and the feature X. The feature probability calculation unit 630 outputs the calculated feature probability P (X) to the integrated evaluation value calculation unit 660.

図７は、エルゴディックモデルの例を示す。任意単語モデルは、図７に示すような任意の状態間の遷移を許したエルゴディックモデルを用いてパラメータを学習することにより生成される。 FIG. 7 shows an example of an ergodic model. The arbitrary word model is generated by learning parameters using an ergodic model that allows transition between arbitrary states as shown in FIG.

また、任意単語モデルは、たとえば、各文字の文字モデルを用いて構成することもできる。構成方法の一例として、全文字の文字モデルを並列接続し、その任意回数の繰り返しを可能とするようモデル末尾から先頭への遷移を許すことで、任意の文字列を表すことができる。このモデルの例を図８に示す。並列に接続された文字モデル間の遷移確率は均等でもよいし、任意の値を設定してもよい。構成方法はこれに限るものではなく、任意の構成方法が考えられる。たとえば、「ｏｒ」「ｅｒ」「ｓｔ」などの頻繁に登場する部分的な文字列が存在する場合、部分的な文字列を表すモデルを並列接続に加えてもよい。 Moreover, the arbitrary word model can also be comprised using the character model of each character, for example. As an example of the configuration method, an arbitrary character string can be represented by connecting the character models of all characters in parallel and allowing the transition from the model end to the head so that the arbitrary number of repetitions is possible. An example of this model is shown in FIG. Transition probabilities between character models connected in parallel may be equal, or an arbitrary value may be set. The configuration method is not limited to this, and an arbitrary configuration method is conceivable. For example, if there are frequently appearing partial character strings such as “or”, “er”, and “st”, a model representing the partial character string may be added to the parallel connection.

なお、図７に示した例は、図８に比べ状態数が少なく済むため、尤度の算出処理を高速に行うことができるという利点がある。一方、図７に示した例は文字モデルとは別にエルゴディックモデルのパラメータを記憶する必要があるため、文字モデルのパラメータを流用できる図８に示す例の方がメモリの容量を抑えることができるという利点がある。 The example shown in FIG. 7 has an advantage that the likelihood calculation process can be performed at high speed because the number of states is smaller than that in FIG. On the other hand, since the example shown in FIG. 7 needs to store the ergodic model parameters separately from the character model, the example shown in FIG. 8 that can divert the parameters of the character model can reduce the memory capacity. There is an advantage.

統合評価値算出部６６０は、解析的マッチング部６１０、全体的マッチング部６２０、及び特徴確率計算部６３０の算出結果を統合する。統合評価値算出部６６０は、解析的マッチング部６１０により算出された事後確率比Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）と、全体的マッチング部６２０により算出された尤度Ｐ（Ｘ｜ｗ）と、特徴確率計算部６３０により算出された特徴確率Ｐ（Ｘ）とに基づいて、統合評価値｛Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）｝・｛Ｐ（Ｘ｜ｗ）／Ｐ（Ｘ）｝を算出する。 The integrated evaluation value calculation unit 660 integrates the calculation results of the analytical matching unit 610, the overall matching unit 620, and the feature probability calculation unit 630. It calculates the integrated evaluation value {P (Y | w) / P (Y)} · {P (X | w) / P (X)} based on the posterior probability ratio P (Y | w) / P (Y) calculated by the analytical matching unit 610, the likelihood P (X | w) calculated by the overall matching unit 620, and the feature probability P (X) calculated by the feature probability calculation unit 630.

即ち、統合評価値算出部６６０は、事後確率比Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）と、尤度Ｐ（Ｘ｜ｗ）と、特徴確率Ｐ（Ｘ）の逆数とを乗算する。統合評価値算出部６６０は、算出した統合評価値｛Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）｝・｛Ｐ（Ｘ｜ｗ）／Ｐ（Ｘ）｝を事前確率乗算部６７０に出力する。 That is, the integrated evaluation value calculation unit 660 multiplies the posterior probability ratio P (Y | w) / P (Y), the likelihood P (X | w), and the inverse of the feature probability P (X). The integrated evaluation value calculation unit 660 outputs the calculated integrated evaluation value {P (Y | w) / P (Y)} · {P (X | w) / P (X)} to the prior probability multiplication unit 670.

事前確率乗算部６７０は、統合評価値算出部６６０により算出された統合評価値｛Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）｝・｛Ｐ（Ｘ｜ｗ）／Ｐ（Ｘ）｝に単語毎の事前確率Ｐ（ｗ）を乗算する。これにより、単語認識部６００は、上記の数式１の演算結果としての事後確率Ｐ（ｗ｜Ｙ、Ｘ）を得ることができる。 Prior probability multiplication section 670 applies the integrated evaluation value {P (Y | w) / P (Y)} · {P (X | w) / P (X)} calculated by integrated evaluation value calculation section 660 for each word. Is multiplied by the prior probability P (w). Thereby, the word recognizing unit 600 can obtain the posterior probability P (w | Y, X) as the calculation result of Equation 1 above.

事前確率格納部６５２は、単語毎の事前確率Ｐ（ｗ）をテーブルとして格納する。事前確率Ｐ（ｗ）は、紙葉類１にある単語が記載されている頻度を示す確率である。この値を調整してテーブルを作成することにより、住所として不適当な単語の事後確率Ｐ（ｗ｜Ｙ、Ｘ）を抑えることができる。 The prior probability storage unit 652 stores the prior probability P (w) for each word as a table. The prior probability P (w) is a probability indicating how frequently a given word is written on the paper sheet 1. By adjusting these values when creating the table, the posterior probability P (w | Y, X) of a word that is inappropriate as an address can be suppressed.

例えば、紙葉類１上のバーコードなどが「１１１１１１１１」などの単語として認識される場合がある。このような場合であっても、「１１１１１１１１」などの単語に事前確率Ｐ（ｗ）として低い値を予め設定しておくことにより、単語認識部６００が単語「１１１１１１１１」の事後確率Ｐ（ｗ｜Ｙ、Ｘ）として高い値を算出することを防ぐことができる。即ち、誤認識しやすい単語などに対して事前確率Ｐ（ｗ）として低い値を予め設定しておくことにより、単語認識部６００が誤認識を起こすことを防ぐことができる。 For example, a barcode on the paper sheet 1 may be recognized as a word such as “11111111”. Even in such a case, by setting a low value as a prior probability P (w) in a word such as “11111111” in advance, the word recognition unit 600 can determine the posterior probability P (w |) of the word “11111111”. Y, X) can be prevented from being calculated as a high value. That is, by setting in advance a low value as the prior probability P (w) for a word that is easily misrecognized, it is possible to prevent the word recognition unit 600 from causing erroneous recognition.

また、例えば、全ての単語の出現頻度が一律である場合、事前確率Ｐ（ｗ）は一定の値であればよい。 For example, when the appearance frequency of all the words is uniform, the prior probability P (w) may be a constant value.

事前確率乗算部６７０は、事前確率格納部６５２に単語毎に格納されている事前確率Ｐ（ｗ）を読み出し、統合評価値｛Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）｝・｛Ｐ（Ｘ｜ｗ）／Ｐ（Ｘ）｝に乗算する。事前確率乗算部６７０は、乗算の結果、即ち事後確率Ｐ（ｗ｜Ｙ、Ｘ）を主制御部５００に出力する。 The prior probability multiplication unit 670 reads the prior probability P (w) stored for each word in the prior probability storage unit 652 and multiplies the integrated evaluation value {P (Y | w) / P (Y)} · {P (X | w) / P (X)} by it. The prior probability multiplication unit 670 outputs the result of the multiplication, that is, the posterior probability P (w | Y, X), to the main control unit 500.
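
The integration and prior multiplication steps, and the effect of a low prior on a barcode-like string, can be sketched as follows. All numbers are made up for illustration, and the function name `posterior_score` is hypothetical.

```python
def posterior_score(analytic_ratio, holistic_likelihood, feature_probability, prior):
    """Combine the four quantities into the posterior P(w | Y, X)."""
    # Unit 660: {P(Y|w)/P(Y)} * {P(X|w)/P(X)}
    integrated = analytic_ratio * holistic_likelihood / feature_probability
    # Unit 670: multiply by the per-word prior P(w).
    return integrated * prior

# (P(Y|w)/P(Y), P(X|w), P(X), P(w)) per candidate word; illustrative values only.
candidates = {
    "ham": (60.0, 1e-4, 1e-5, 0.02),
    "11111111": (80.0, 2e-4, 1e-5, 1e-6),  # barcode-like string, low prior
}
scores = {word: posterior_score(*values) for word, values in candidates.items()}
best_word = max(scores, key=scores.get)
```

In this example "11111111" has the larger integrated evaluation value, but its low prior pushes its posterior below that of "ham", which is the behaviour the prior table is meant to produce.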

上記の処理により、主制御部５００は、単語毎の認識結果（評価値）を取得することができる。主制御部５００は、複数の単語の事後確率と、他の単語との組み合わせとを考慮して宛先情報を特定することができる。例えば、主制御部５００は、宛先情報として適当な単語の組み合わせを推測することができる。 Through the above processing, the main control unit 500 can acquire a recognition result (evaluation value) for each word. The main control unit 500 can specify the destination information in consideration of the posterior probabilities of a plurality of words and combinations with other words. For example, the main control unit 500 can infer a combination of words suitable as destination information.

上記したように単語認識部６００は、上記の各部により、解析的マッチングにより事後確率比Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）を算出し、全体的マッチングにより尤度Ｐ（Ｘ｜ｗ）を算出し、特徴確率の計算により特徴確率Ｐ（Ｘ）を算出する。単語認識部６００は、事後確率比Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）と、尤度Ｐ（Ｘ｜ｗ）と、特徴確率Ｐ（Ｘ）と、予め単語毎に設定された事前確率Ｐ（ｗ）とを統合することにより、単語毎の事後確率Ｐ（ｗ｜Ｙ、Ｘ）を算出することができる。 As described above, the word recognizing unit 600 calculates the posterior probability ratio P (Y | w) / P (Y) by analytical matching by the above-described units, and calculates the likelihood P (X | w) by overall matching. The feature probability P (X) is calculated by calculating the feature probability. The word recognition unit 600 includes a posteriori probability ratio P (Y | w) / P (Y), likelihood P (X | w), feature probability P (X), and a prior probability P set in advance for each word. By integrating (w), the posterior probability P (w | Y, X) for each word can be calculated.

なお、単語認識部６００は、最も高い事後確率Ｐ（ｗ｜Ｙ、Ｘ）が算出された単語を認識結果として主制御部５００に出力する構成であってもよい。この場合、単語認識部６００は、一つの単語を認識結果として特定し、主制御部５００に伝送することができる。 Note that the word recognition unit 600 may be configured to output the word for which the highest posterior probability P (w | Y, X) is calculated to the main control unit 500 as a recognition result. In this case, the word recognition unit 600 can identify one word as a recognition result and transmit it to the main control unit 500.

また、上記したように、単語認識部６００は、単語毎の事後確率Ｐ（ｗ｜Ｙ、Ｘ）を認識結果として主制御部５００に出力する構成であってもよい。この場合、主制御部５００は、複数の単語の事後確率Ｐ（ｗ｜Ｙ、Ｘ）と、他の単語との組み合わせとを考慮して宛先情報を特定することができる。 Further, as described above, the word recognition unit 600 may be configured to output the posterior probability P (w | Y, X) for each word to the main control unit 500 as a recognition result. In this case, the main control unit 500 can specify the destination information in consideration of the posterior probabilities P (w | Y, X) of a plurality of words and combinations with other words.

次に、学習フェーズについて説明する。 Next, the learning phase will be described.
図２に示すＶＣＳ６４０は、たとえば、単語認識部６００により宛先情報が認識されなかった紙葉類１の正しい宛先情報を紙葉類処理装置１００のオペレータに入力させる為のモジュールである。ＶＣＳ６４０は、例えば図１に示す操作部７００及び表示部８００により構成される。また、例えば、単語認識部６００は、操作部７００及び表示部８００とは別に操作及び表示が可能なモジュールをＶＣＳ６４０として備える構成であってもよい。 The VCS 640 shown in FIG. 2 is, for example, a module for having the operator of the paper sheet processing apparatus 100 input the correct destination information of a paper sheet 1 whose destination information was not recognized by the word recognition unit 600. The VCS 640 includes, for example, the operation unit 700 and the display unit 800 illustrated in FIG. 1. Alternatively, the word recognition unit 600 may include, as the VCS 640, a module capable of operation and display that is separate from the operation unit 700 and the display unit 800.

ＶＣＳ６４０は、宛先情報を特定できなかった紙葉類１の画像を表示する。ＶＣＳ６４０は、表示させた紙葉類１の画像をオペレータに読み取らせて宛先情報を入力させる。例えば、ＶＣＳ６４０は、単語候補毎にオペレータに正しい単語を入力させる。これにより、ＶＣＳ６４０は、単語画像と正しい宛先情報（正解）とを対応付けることができる。 The VCS 640 displays an image of the paper sheet 1 whose destination information could not be specified. The VCS 640 causes the operator to read the displayed image of the paper sheet 1 and input destination information. For example, the VCS 640 causes the operator to input a correct word for each word candidate. As a result, the VCS 640 can associate the word image with the correct destination information (correct answer).

ＶＣＳ６４０は、単語画像及び正しい宛先情報（正解）を、第１の単語画像蓄積部６４１と事前確率計算部６５１とに出力する。 The VCS 640 outputs the word image and correct destination information (correct answer) to the first word image storage unit 641 and the prior probability calculation unit 651.

まず、単語モデルの学習について説明する。第１の単語画像蓄積部６４１は、ＶＣＳ６４０により入力された単語画像と正解とを対応付けて蓄積する。 First, word model learning will be described. The first word image storage unit 641 stores the word image input by the VCS 640 and the correct answer in association with each other.

モデル学習部６４２は、第１の単語画像蓄積部６４１に蓄積されている単語画像とその正解を用いて、各文字モデル、各単語モデル、及び任意文字モデルのいずれかまたは複数を学習する。 The model learning unit 642 learns one or more of each character model, each word model, and an arbitrary character model using the word image stored in the first word image storage unit 641 and its correct answer.

モデル学習部６４２は、例えば、バウムウェルチアルゴリズム（Ｂａｕｍ−Ｗｅｌｃｈａｌｇｏｒｉｔｈｍ）を用いてモデルの学習を行う。バウムウェルチアルゴリズムは、隠れマルコフモデルにおける未知のパラメータを探すアルゴリズムである。バウムウェルチアルゴリズムは、モデルが出力した配列からモデルパラメータを推定することができる。 The model learning unit 642 performs model learning using, for example, the Baum-Welch algorithm. The Baum-Welch algorithm finds the unknown parameters of a hidden Markov model; it can estimate the model parameters from sequences output by the model.

モデル学習部６４２は、例えば、第１の単語画像蓄積部６４１に蓄積されている単語画像とその正解を用いて、バウムウェルチアルゴリズムによりモデルを生成する。モデル学習部６４２は、生成したモデルをモデル格納部６４３に出力する。モデル格納部６４３は、受け取ったモデルを格納する。 For example, the model learning unit 642 generates a model by the Baum Welch algorithm using the word images stored in the first word image storage unit 641 and the correct answer thereof. The model learning unit 642 outputs the generated model to the model storage unit 643. The model storage unit 643 stores the received model.

なお、モデル学習部６４２は、既にモデル格納部６４３に格納されているモデルを更新する構成であってもよい。 The model learning unit 642 may be configured to update a model already stored in the model storage unit 643.

次に、事前確率の学習について説明する。事前確率計算部６５１は、ＶＣＳ６４０により入力された単語画像の正しい宛先情報に基づいて、単語毎の頻度をカウントする。即ち、事前確率計算部６５１は、宛先情報に含まれる単語の数を単語毎にカウントして集計することにより、単語毎の事前確率Ｐ（ｗ）を算出する。事前確率計算部６５１は、算出した単語毎の事前確率Ｐ（ｗ）を事前確率格納部６５２に格納する。 Next, learning of prior probabilities will be described. The prior probability calculation unit 651 counts the frequency for each word based on the correct destination information of the word image input by the VCS 640. That is, the prior probability calculation unit 651 calculates the prior probability P (w) for each word by counting and counting the number of words included in the destination information for each word. The prior probability calculation unit 651 stores the calculated prior probability P (w) for each word in the prior probability storage unit 652.
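
The counting described above amounts to a relative-frequency estimate, sketched below. The function name and the sample word list are hypothetical, and no smoothing for unseen words is applied because the description does not mention any.

```python
from collections import Counter

def estimate_priors(confirmed_words):
    """Return P(w) for each word as its relative frequency in the input."""
    counts = Counter(confirmed_words)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

# Words taken from operator-confirmed destination information (made-up sample).
priors = estimate_priors(["ham", "ham", "eggs", "ham"])
```

The resulting table is what the prior probability storage unit 652 would hold, and individual entries could then be overwritten by hand as described for the prior probability input unit 653.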

事前確率入力部６５３は、事前確率格納部６５２に格納されている事前確率Ｐ（ｗ）を変更することができる。事前確率入力部６５３は、例えば図１に示す操作部７００により入力された操作に基づいて事前確率格納部６５２に格納されている事前確率Ｐ（ｗ）を操作に応じた値に書き換える。 Prior probability input section 653 can change prior probability P (w) stored in prior probability storage section 652. The prior probability input unit 653 rewrites the prior probability P (w) stored in the prior probability storage unit 652 to a value corresponding to the operation based on, for example, an operation input by the operation unit 700 illustrated in FIG.

また、事前確率入力部６５３は、操作部７００とは別に操作が可能なモジュールにより入力された操作に基づいて事前確率格納部６５２に格納されている事前確率Ｐ（ｗ）を操作に応じた値に書き換える構成であってもよい。 The prior probability input unit 653 may also be configured to rewrite the prior probability P (w) stored in the prior probability storage unit 652 to a value corresponding to an operation input through a module that can be operated separately from the operation unit 700.

これにより、上記したような誤認識しやすい単語などに対して事前確率Ｐ（ｗ）として低い値を設定することができる。これにより、単語認識部６００が誤認識を起こすことを防ぐ事ができる。 Thereby, a low value can be set as the prior probability P (w) for the above-described words that are easily misrecognized. Thereby, it is possible to prevent the word recognition unit 600 from causing erroneous recognition.

このような構成によると、単語認識部６００は、解析的手法（解析的マッチング）と全体的手法（全体的マッチング）とを併用することができる。また、単語認識部６００は、特徴確率Ｐ（Ｘ）を上記したように、任意単語モデルに基づいて算出することにより、より高い精度で事後確率Ｐ（ｗ｜Ｙ、Ｘ）を算出することができる。この結果、より高い精度で単語の認識を行うことができる単語認識装置、単語認識方法、及び単語認識装置を備える紙葉類処理装置を提供することができる。 According to such a configuration, the word recognizing unit 600 can use an analytical technique (analytic matching) and an overall technique (global matching) in combination. Further, the word recognition unit 600 can calculate the posterior probability P (w | Y, X) with higher accuracy by calculating the feature probability P (X) based on the arbitrary word model as described above. it can. As a result, it is possible to provide a word recognition device, a word recognition method, and a paper sheet processing device including the word recognition device that can recognize words with higher accuracy.

なお、上記の実施形態では、解析的マッチングと全体的マッチングとは、どちらが先に行われてもよい。また、単語認識部６００が解析的マッチングと全体的マッチングとを並列的に処理することが出来る構成を備える場合、解析的マッチングと全体的マッチングとを並列的に処理する構成であってもよい。 In the above embodiment, either the analytical matching or the overall matching may be performed first. In addition, when the word recognition unit 600 includes a configuration capable of processing the analytical matching and the global matching in parallel, the configuration may be such that the analytical matching and the global matching are processed in parallel.

なお、上記の実施形態では、単語認識部６００は、一つの単語を認識結果として特定する場合、最も高い事後確率Ｐ（ｗ｜Ｙ、Ｘ）が算出された単語を認識結果として主制御部５００に出力すると説明したが、この構成に限定されない。単語画像が同じである場合、特徴確率Ｐ（Ｘ）は一定である為、単語認識部６００は、数式１のＰ（Ｘ）を任意の値として事後確率Ｐ（ｗ｜Ｙ、Ｘ）を算出する構成であってもよい。即ち、単語認識部６００は、事後確率比Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）、尤度Ｐ（Ｘ｜ｗ）、及び事前確率Ｐ（ｗ）に基づいて事後確率Ｐ（ｗ｜Ｙ、Ｘ）を算出することができる。 In the above embodiment, when the word recognition unit 600 specifies one word as the recognition result, it has been described as outputting the word for which the highest posterior probability P (w | Y, X) is calculated to the main control unit 500 as the recognition result; however, the configuration is not limited to this. Because the feature probability P (X) is constant for a given word image, the word recognition unit 600 may be configured to calculate the posterior probability P (w | Y, X) with P (X) in Equation 1 set to an arbitrary value. In other words, the word recognition unit 600 can calculate the posterior probability P (w | Y, X) based on the posterior probability ratio P (Y | w) / P (Y), the likelihood P (X | w), and the prior probability P (w).

また、上記したように、各単語毎の事後確率Ｐ（ｗ｜Ｙ、Ｘ）を上位である主制御部５００に出力する場合、単語認識部６００は、特徴確率Ｐ（Ｘ）を算出し、数式１に基づいて事後確率Ｐ（ｗ｜Ｙ、Ｘ）を算出する。これにより、単語認識部６００は、各単語の評価としての事後確率Ｐ（ｗ｜Ｙ、Ｘ）を主制御部５００に出力することができる。主制御部５００は、単語毎の事後確率Ｐ（ｗ｜Ｙ、Ｘ）と、各単語の組み合わせとを考慮し、より高い精度で宛先情報を特定することが出来る。 Further, as described above, when outputting the posterior probability P (w | Y, X) for each word to the main control unit 500, which is the upper level, the word recognition unit 600 calculates the feature probability P (X), A posteriori probability P (w | Y, X) is calculated based on Equation 1. Thereby, the word recognition unit 600 can output the posterior probability P (w | Y, X) as the evaluation of each word to the main control unit 500. The main control unit 500 can specify the destination information with higher accuracy in consideration of the posterior probability P (w | Y, X) for each word and the combination of each word.

また、上記した実施形態では、単語認識部６００は、事後確率比Ｐ（Ｙ｜ｗ）／Ｐ（Ｙ）、尤度Ｐ（Ｘ｜ｗ）、及び事前確率Ｐ（ｗ）に基づいて事後確率Ｐ（ｗ｜Ｙ、Ｘ）を算出する構成として説明したが、この構成に限定されない。例えば、事前確率を考慮する必要がない場合、または事前確率が一定の値である場合、単語認識部６００は、数式１の事前確率Ｐ（ｗ）を無視する、または所定の値に置き換えて事後確率Ｐ（ｗ｜Ｙ、Ｘ）を算出する構成であってもよい。 In the embodiment described above, the word recognition unit 600 determines the posterior probability based on the posterior probability ratio P (Y | w) / P (Y), the likelihood P (X | w), and the prior probability P (w). Although described as a configuration for calculating P (w | Y, X), it is not limited to this configuration. For example, when it is not necessary to consider the prior probability or when the prior probability is a constant value, the word recognition unit 600 ignores the prior probability P (w) of Equation 1 or replaces it with a predetermined value, and the posterior The probability P (w | Y, X) may be calculated.

（第２の実施形態） (Second Embodiment)
図９は、第２の実施形態に係る単語認識部６００の構成の例を示す。 FIG. 9 shows an example of the configuration of the word recognition unit 600 according to the second embodiment.
単語認識部６００は、画像受取部６０１、単語抽出部６０２、文字候補抽出部６０３、文字認識部６０４、特徴抽出部６０５、解析的マッチング部６１０、全体的マッチング部６２０、特徴確率計算部６３０、ＶＣＳ６４０、第１の単語画像蓄積部６４１、モデル学習部６４２、モデル格納部６４３、単語モデル生成部６４４、単語辞書６４５、事前確率計算部６５１、事前確率格納部６５２、事前確率入力部６５３、統合評価値算出部６６０、事前確率乗算部６７０、第２の単語画像蓄積部６８１、パラメータ学習部６８２、及びパラメータ格納部６８３を具備する。 The word recognition unit 600 includes an image receiving unit 601, a word extraction unit 602, a character candidate extraction unit 603, a character recognition unit 604, a feature extraction unit 605, an analytical matching unit 610, an overall matching unit 620, a feature probability calculation unit 630, a VCS 640, a first word image storage unit 641, a model learning unit 642, a model storage unit 643, a word model generation unit 644, a word dictionary 645, a prior probability calculation unit 651, a prior probability storage unit 652, a prior probability input unit 653, an integrated evaluation value calculation unit 660, a prior probability multiplication unit 670, a second word image storage unit 681, a parameter learning unit 682, and a parameter storage unit 683.

また、特徴確率計算部６３０は、先頭特徴確率計算部６３１、条件特徴確率計算部６３２、同時確率特徴計算部６３３、前特徴確率計算部６３４、及び総積計算部６３５を具備する。なお、第１の実施形態と同様の構成には同じ参照符号を付し、詳細な説明を省略する。 The feature probability calculation unit 630 includes a head feature probability calculation unit 631, a conditional feature probability calculation unit 632, a joint probability feature calculation unit 633, a previous feature probability calculation unit 634, and a total product calculation unit 635. The same components as those in the first embodiment are denoted by the same reference numerals, and detailed description thereof is omitted.

なお、第２の実施形態に係る単語認識部６００の動作は、認識フェーズと学習フェーズとに大きく分けられる。まず、認識フェーズについて説明する。 Note that the operation of the word recognition unit 600 according to the second embodiment is roughly divided into a recognition phase and a learning phase. First, the recognition phase will be described.

特徴確率計算部６３０は、特徴抽出部６０５により抽出された特徴Ｘと、パラメータ格納部６８３により格納されているパラメータとに基づいて、特徴確率Ｐ（Ｘ）を算出する。上記したように、特徴抽出部６０５は、単語候補の画像に基づいて、ベクトルの集合である特徴Ｘを抽出する。この特徴Ｘは、Ｔ個の特徴ベクトルｘ_１、ｘ_２、ｘ_３・・・ｘ_Ｔを有する。この場合、特徴ベクトルｘ_ｔは、ｔ番目の特徴ベクトルを示す。また、特徴ベクトルｘ_ｔ−１は、特徴ベクトルｘ_ｔのひとつ前の特徴ベクトルを示す。 The feature probability calculation unit 630 calculates the feature probability P (X) based on the feature X extracted by the feature extraction unit 605 and the parameters stored in the parameter storage unit 683. As described above, the feature extraction unit 605 extracts the feature X, which is a set of vectors, based on the word candidate image. This feature X has T feature vectors x ₁ , x ₂ , x ₃ , ..., x _T . Here, the feature vector x _t denotes the t-th feature vector, and the feature vector x _t−1 denotes the feature vector immediately preceding x _t .

上記のように仮定した場合、特徴確率計算部６３０は、下記の数式３に基づいて特徴確率Ｐ（Ｘ）を算出する。

Assuming the above, the feature probability calculation unit 630 calculates the feature probability P (X) based on the following Equation 3.
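
Equation 3 itself is not reproduced in this text. Based on the description that follows, in which a first factor P (x ₁ ) is multiplied by the conditional probabilities P (x _t | x _t−1 ) for t = 2 to T, it presumably has the following form. This is a reconstruction from the surrounding description, not a copy of the published equation.

```latex
P(X) = P(x_1) \prod_{t=2}^{T} P\left(x_t \mid x_{t-1}\right)
```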

特徴確率計算部６３０の先頭特徴確率計算部６３１は、数式３の右辺の第１因子Ｐ（ｘ_１）を計算する。第１因子Ｐ（ｘ_１）は、１番目の特徴ベクトルとしてｘ_１が抽出される確率（先頭特徴確率）を示す。先頭特徴確率計算部６３１は、パラメータ格納部６８３により格納されているパラメータに基づいて第１因子Ｐ（ｘ_１）を計算する。 The head feature probability calculation unit 631 of the feature probability calculation unit 630 calculates the first factor P (x ₁ ) on the right side of Equation 3. The first factor P (x ₁ ) indicates the probability that x ₁ is extracted as the first feature vector (head feature probability). The head feature probability calculation unit 631 calculates the first factor P (x ₁ ) based on the parameters stored in the parameter storage unit 683.

パラメータ格納部６８３は、複数の単語画像に基づいて学習により算出されたパラメータを蓄積する。このパラメータは、単語画像に基づいて抽出された特徴Ｘが有する特徴ベクトルｘ_１、ｘ_２、ｘ_３・・・の成す確率分布を示すものである。即ち、パラメータ格納部６８３は、各特徴ベクトルの成す確率分布のパラメータを記憶する。パラメータ格納部６８３は、例えば、混合ガウス分布でモデル化されている場合であれば、各ガウス分布の混合率、平均ベクトル、または共分散行列などを格納する。 The parameter storage unit 683 accumulates parameters calculated by learning based on a plurality of word images. This parameter indicates the probability distribution formed by the feature vectors x ₁ , x ₂ , x ₃ ... Of the feature X extracted based on the word image. That is, the parameter storage unit 683 stores the probability distribution parameters formed by the feature vectors. The parameter storage unit 683 stores, for example, the mixing ratio, average vector, or covariance matrix of each Gaussian distribution if the model is modeled with a mixed Gaussian distribution.

条件特徴確率計算部６３２は、数式３の右辺の第２因子のΠの中身である個別因子Ｐ（ｘ_ｔ｜ｘ_ｔ−１）を計算する。Ｐ（ｘ_ｔ｜ｘ_ｔ−１）は、先頭の特徴ベクトルを除く各特徴ベクトルが特徴ベクトルの１つ前に並ぶ特徴ベクトルを条件として出現する条件付き確率を示す。即ち、特徴確率計算部６３０は、Ｐ（ｘ_ｔ｜ｘ_ｔ−１）をＴ−１の組み合わせに応じてそれぞれ算出する。 The conditional feature probability calculation unit 632 calculates the individual factor P (x _t | x _t−1 ) inside the product (Π) that forms the second factor on the right side of Equation 3. P (x _t | x _t−1 ) is the conditional probability that each feature vector other than the first appears given the feature vector immediately preceding it. That is, the feature probability calculation unit 630 calculates P (x _t | x _t−1 ) for each of the T−1 pairs of adjacent feature vectors.

なお、Ｐ（ｘ_ｔ｜ｘ_ｔ−１）は、下記の数式４に示すように表すことが出来る。

Note that P (x _t | x _t−1 ) can be expressed as shown in Equation 4 below.
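
Equation 4 itself is not reproduced in this text. Since the following paragraphs name the joint probability P (x _t , x _t−1 ) as the numerator and P (x _t−1 ) as the denominator of its right side, it presumably is the usual definition of a conditional probability. This is a reconstruction from the surrounding description, not a copy of the published equation.

```latex
P\left(x_t \mid x_{t-1}\right) = \frac{P\left(x_t, x_{t-1}\right)}{P\left(x_{t-1}\right)}
```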

条件特徴確率計算部６３２は、同時確率特徴計算部６３３と前特徴確率計算部６３４とを具備する。同時確率特徴計算部６３３は、数式４の右辺の分子であるＰ（ｘ_ｔ，ｘ_ｔ−１）を計算する。Ｐ（ｘ_ｔ，ｘ_ｔ−１）は、１つ前のベクトルｘ_ｔ−１と特徴ベクトルｘ_ｔとが同時に出現する確率（同時確率）を示す。同時確率特徴計算部６３３は、パラメータ格納部６８３により格納されているパラメータに基づいてＰ（ｘ_ｔ，ｘ_ｔ−１）を計算する。 The conditional feature probability calculation unit 632 includes a joint probability feature calculation unit 633 and a previous feature probability calculation unit 634. The joint probability feature calculation unit 633 calculates P (x _t , x _t−1 ), which is the numerator on the right side of Equation 4. P (x _t , x _t−1 ) indicates the probability (simultaneous probability) that the previous vector x _t−1 and the feature vector x _t appear simultaneously. The joint probability feature calculation unit 633 calculates P (x _t , x _t−1 ) based on the parameters stored in the parameter storage unit 683.

The previous feature probability calculation unit 634 calculates P(x_{t-1}), which is the denominator on the right side of Equation 4. P(x_{t-1}) is the probability that the preceding vector x_{t-1} appears (the previous feature probability). The previous feature probability calculation unit 634 calculates P(x_{t-1}) based on the parameters stored in the parameter storage unit 683.

The total product calculation unit 635 multiplies the output of the head feature probability calculation unit 631 by all of the outputs of the conditional feature probability calculation unit 632. That is, the total product calculation unit 635 multiplies P(x_1) by every P(x_t | x_{t-1}) for t = 2 to T. In this way, the feature probability calculation unit 630 evaluates the right side of Equation 3 and thereby obtains the feature probability P(X). The feature probability calculation unit 630 outputs the calculated feature probability P(X) to the integrated evaluation value calculation unit 660. The subsequent processing is the same as in the first embodiment.
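The flow through units 631 to 635 can be sketched in a few lines of Python. This is only an illustration under strong assumptions that are not in the source: each feature vector is reduced to a scalar, a single Gaussian stands in for the Gaussian mixture, and the function names and the parameters `mean`, `var`, and `rho` (the correlation between adjacent features) are hypothetical.

```python
import math


def gauss1(x, mean, var):
    """Density of a 1-D Gaussian (assumed stand-in for P(x_1) and P(x_{t-1}))."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gauss2(x, y, mean, var, rho):
    """Density of a bivariate Gaussian with shared mean/variance and correlation rho
    (assumed stand-in for the joint probability P(x_t, x_{t-1}))."""
    dx, dy = x - mean, y - mean
    q = (dx * dx - 2.0 * rho * dx * dy + dy * dy) / (var * (1.0 - rho * rho))
    norm = 2.0 * math.pi * var * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * q) / norm


def feature_probability(xs, mean=0.0, var=1.0, rho=0.5):
    """P(X) = P(x_1) * prod_{t=2..T} P(x_t, x_{t-1}) / P(x_{t-1})."""
    p = gauss1(xs[0], mean, var)                   # role of unit 631: P(x_1)
    for prev, cur in zip(xs, xs[1:]):
        joint = gauss2(cur, prev, mean, var, rho)  # role of unit 633: numerator
        marginal = gauss1(prev, mean, var)         # role of unit 634: denominator
        p *= joint / marginal                      # unit 632 factor, accumulated as in unit 635
    return p


# Illustrative scalar features for one word image (made-up values).
p_x = feature_probability([0.1, 0.5, -0.3])
```

With `rho=0.0` the joint density factorizes, so the result reduces to a plain product of the individual densities, which is a convenient sanity check for the sketch.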

Next, the learning phase will be described.
The VCS 640 shown in FIG. 9 has the same configuration as the VCS 640 shown in FIG. 2. However, the VCS 640 according to the second embodiment need only be able to collect at least the word images of paper sheets 1 whose words were not recognized by the word recognition unit 600. The VCS 640 outputs the word images to the second word image storage unit 681.

The second word image storage unit 681 stores the word images input from the VCS 640. Alternatively, the second word image storage unit 681 may directly store the image of the paper sheet 1 read by the image reading unit 400 shown in FIG. 1. The second word image storage unit 681 may also be configured in the same way as the first word image storage unit 641.

The parameter learning unit 682 learns the parameters used by the head feature probability calculation unit 631, the joint probability feature calculation unit 633, and the previous feature probability calculation unit 634, based on the word images stored in the second word image storage unit 681. That is, the parameter learning unit 682 calculates a plurality of feature vectors x_1, x_2, x_3, ... from each word image and learns the parameters of the probability distribution from these feature vectors. The parameter learning unit 682 stores the learned parameters in the parameter storage unit 683.
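As a rough illustration of this learning step, the sketch below estimates distribution parameters from feature sequences. It is not the learning procedure of the embodiment: it assumes scalar features and a single Gaussian instead of a Gaussian mixture, and the function name `learn_parameters` and the returned keys are hypothetical.

```python
def learn_parameters(sequences):
    """Estimate mean, variance, and adjacent-feature correlation from scalar
    feature sequences (one sequence per word image; an assumed simplification)."""
    values = [x for seq in sequences for x in seq]
    pairs = [(a, b) for seq in sequences for a, b in zip(seq, seq[1:])]
    if not pairs:
        raise ValueError("need at least one sequence with two or more features")
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    if var == 0.0:
        raise ValueError("features are constant; variance is zero")
    # Covariance between each feature and the one immediately before it.
    cov = sum((a - mean) * (b - mean) for a, b in pairs) / len(pairs)
    # Clamp so that a joint Gaussian built from rho stays well defined.
    rho = max(-0.999, min(0.999, cov / var))
    return {"mean": mean, "var": var, "rho": rho}


# Two made-up feature sequences standing in for two word images.
params = learn_parameters([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
```

The returned dictionary plays the role of the contents of the parameter storage unit 683 in this simplified setting.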

With this configuration, the word recognition unit 600 learns in advance, from the features of word images, the parameters of the probability distribution of the feature vectors. The word recognition unit 600 calculates the feature probability P(X) based on the learned parameters. By using the feature probability P(X) calculated in this way, the word recognition unit 600 can calculate the posterior probability P(w | Y, X) with higher accuracy. As a result, it is possible to provide a word recognition device, a word recognition method, and a paper sheet processing apparatus including the word recognition device that can recognize words with higher accuracy.

In the embodiments described above, the word recognition unit 600 has been described as performing the calculations of Equations 1 to 4 as they are, but the configuration is not limited to this. The word recognition unit 600 may instead take the logarithm of each term of Equations 1 to 4 and calculate with the logarithms. Using logarithms in this way replaces multiplications with additions and divisions with subtractions.
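The log-domain variant can be sketched as follows. The example assumes that the third evaluation value is the product of the first evaluation value, the second evaluation value, and the reciprocal of the feature probability, as recited in the claims; the function name and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def log_third_evaluation(log_first, log_second, log_p_first, log_joints, log_prevs):
    """Log of (first * second / P(X)), using only additions and subtractions."""
    # log P(X) = log P(x_1) + sum_t [log P(x_t, x_{t-1}) - log P(x_{t-1})]
    log_p_x = log_p_first + sum(j - p for j, p in zip(log_joints, log_prevs))
    # Multiplication becomes addition; division by P(X) becomes subtraction.
    return log_first + log_second - log_p_x


# Made-up probabilities for one candidate word and a three-vector feature.
first, second = 0.2, 1e-6
p_first = 0.05
joints = [0.002, 0.001]   # P(x_t, x_{t-1}) for t = 2, 3
prevs = [0.05, 0.04]      # P(x_{t-1}) for t = 2, 3

# Direct evaluation with multiplications and divisions.
direct = first * second / (p_first * (joints[0] / prevs[0]) * (joints[1] / prevs[1]))

# Same quantity in the log domain.
log_value = log_third_evaluation(
    math.log(first),
    math.log(second),
    math.log(p_first),
    [math.log(j) for j in joints],
    [math.log(p) for p in prevs],
)
```

Exponentiating `log_value` recovers `direct` up to rounding. Beyond replacing the arithmetic operations, working with logarithms also helps avoid floating-point underflow when many small probabilities are multiplied.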

The functions described in the above embodiments are not limited to hardware implementations; they can also be realized by having a computer read a program that describes each function in software. Each function may also be implemented by selecting either software or hardware as appropriate.

The present invention is not limited to the above embodiments as they are; at the implementation stage, the constituent elements can be modified without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the embodiments. For example, some constituent elements may be deleted from all of the constituent elements shown in an embodiment. Furthermore, constituent elements from different embodiments may be combined as appropriate.

DESCRIPTION OF SYMBOLS: 1 ... paper sheet, 100 ... paper sheet processing apparatus, 200 ... supply unit, 210 ... separation roller, 220 ... conveyance path, 300 ... sorting processing unit, 400 ... image reading unit, 500 ... main control unit, 600 ... word recognition unit, 601 ... image receiving unit, 602 ... word extraction unit, 603 ... character candidate extraction unit, 604 ... character recognition unit, 605 ... feature extraction unit, 610 ... analytical matching unit, 611 ... character probability calculation unit, 612 ... first operation unit, 613 ... second operation unit, 620 ... overall matching unit, 630 ... feature probability calculation unit, 631 ... head feature probability calculation unit, 632 ... conditional feature probability calculation unit, 633 ... joint probability feature calculation unit, 634 ... previous feature probability calculation unit, 635 ... total product calculation unit, 640 ... VCS, 641 ... first word image storage unit, 642 ... model learning unit, 643 ... model storage unit, 644 ... word model generation unit, 645 ... word dictionary, 651 ... prior probability calculation unit, 652 ... prior probability storage unit, 653 ... prior probability input unit, 660 ... integrated evaluation value calculation unit, 670 ... prior probability multiplication unit, 681 ... second word image storage unit, 682 ... parameter learning unit, 683 ... parameter storage unit, 700 ... operation unit, 800 ... display unit, 900 ... input/output unit.

Claims

A word recognition device comprising:
a word dictionary that stores a plurality of words;
image receiving means for receiving an image including a word;
word image extraction means for extracting a word image for each word from the image;
character candidate extraction means for extracting character candidates from the word image;
character recognition means for performing character recognition on the character candidates;
analytical matching means for calculating a first evaluation value for each word stored in the word dictionary based on a result of the character recognition by the character recognition means;
feature extraction means for extracting a feature from the word image;
word model generation means for generating a word model for each word stored in the word dictionary;
overall matching means for calculating, for each word model, a second evaluation value indicating a probability that the feature appears;
feature probability calculation means for calculating a feature probability that the feature appears;
integrated evaluation value calculation means for calculating a third evaluation value by multiplying the first evaluation value, the second evaluation value, and the reciprocal of the feature probability; and
output means for outputting the third evaluation value calculated by the integrated evaluation value calculation means.

The word recognition device according to claim 1, wherein the feature probability calculation means calculates, as the feature probability, a probability that the feature appears from an arbitrary word model representing an arbitrary word.

The word recognition device according to claim 2, wherein the arbitrary word model allows transition between arbitrary states in the model.

The word recognition device according to claim 3, further comprising:
first word image storage means for storing word images and a correct answer for each word image; and
model learning means for learning the word model and the arbitrary word model using the word images and the correct answers stored by the first word image storage means.

The word recognition device according to claim 2, further comprising model storage means for storing a character model for each character,
wherein the word model generation means generates the arbitrary word model using the character model and the word model for each word stored in the word dictionary.

The word recognition device according to claim 5, further comprising:
first word image storage means for storing word images and a correct answer for each word image; and
model learning means for learning the character model using the word images and the correct answers stored by the first word image storage means.

The word recognition device according to claim 1, wherein
the feature extraction means extracts, as the feature, a plurality of feature vectors forming an ordered sequence, and
the feature probability calculation means calculates a leading feature probability that the leading feature vector of the feature appears, calculates, for each feature vector other than the leading feature vector, a conditional probability that the feature vector appears given the feature vector immediately preceding it, and calculates the feature probability based on the leading feature probability and the conditional probabilities.

The word recognition device according to claim 7, wherein the feature probability calculation means calculates, for each feature vector, a previous feature probability that the feature vector immediately preceding it appears and a joint probability that the feature vector and the feature vector immediately preceding it appear together, and calculates the feature probability based on the leading feature probability, the joint probabilities, and the previous feature probabilities.

The word recognition device according to claim 8, further comprising:
second word image storage means for storing word images; and
parameter learning means for learning, using the word images stored by the second word image storage means, a parameter used for calculating the leading feature probability and a parameter used for calculating the conditional probability.

The word recognition device according to any one of claims 1 to 9, further comprising:
prior probability storage means for storing a prior probability for each word; and
posterior probability calculation means for calculating a fourth evaluation value based on the prior probability stored in the prior probability storage means and the third evaluation value,
wherein the output means outputs the fourth evaluation value calculated by the posterior probability calculation means.

The word recognition device according to claim 10, further comprising prior probability input means for inputting a value of the prior probability,
wherein the prior probability storage means changes the stored value of the prior probability to the value input by the prior probability input means.

The word recognition device according to claim 10 or 11, further comprising prior probability calculation means for receiving a recognition result of a word specified by a paper sheet processing apparatus provided with the word recognition device, and calculating the value of the prior probability for each word based on the received recognition result,
wherein the prior probability storage means changes the stored value of the prior probability to the value calculated by the prior probability calculation means.

A word recognition method used in a word recognition device having a word dictionary that stores a plurality of words, the method comprising:
receiving an image including a word;
extracting a word image for each word from the image;
extracting character candidates from the word image;
performing character recognition on the character candidates;
calculating a first evaluation value for each word stored in the word dictionary based on a result of the character recognition;
extracting a feature from the word image;
generating a word model for each word stored in the word dictionary;
calculating, for each word model, a second evaluation value indicating a probability that the feature appears;
calculating a feature probability that the feature appears;
calculating a third evaluation value by multiplying the first evaluation value, the second evaluation value, and the reciprocal of the feature probability; and
outputting the third evaluation value.

A paper sheet processing apparatus comprising:
take-in means for taking in paper sheets;
conveying means for conveying the paper sheets;
image reading means for reading an image including a word on a paper sheet;
a word dictionary that stores a plurality of words;
word image extraction means for extracting a word image for each word from the image;
character candidate extraction means for extracting character candidates from the word image;
character recognition means for performing character recognition on the character candidates;
analytical matching means for calculating a first evaluation value for each word stored in the word dictionary based on a result of the character recognition by the character recognition means;
feature extraction means for extracting a feature from the word image;
word model generation means for generating a word model for each word stored in the word dictionary;
overall matching means for calculating, for each word model, a second evaluation value indicating a probability that the feature appears;
feature probability calculation means for calculating a feature probability that the feature appears;
integrated evaluation value calculation means for calculating a third evaluation value by multiplying the first evaluation value, the second evaluation value, and the reciprocal of the feature probability;
a recognition unit for recognizing destination information of the paper sheet based on the third evaluation value calculated by the integrated evaluation value calculation means; and
a sorting processing unit for sorting the paper sheets based on the destination information recognized by the recognition unit.