JP2006185342A

JP2006185342A - Information processor, method and program for classifying character string, and recording medium

Info

Publication number: JP2006185342A
Application number: JP2004380567A
Authority: JP
Inventors: Itsuki Shimokooriyama; 敬己下郡山
Original assignee: Canon Software Inc
Current assignee: Canon IT Solutions Inc
Priority date: 2004-12-28
Filing date: 2004-12-28
Publication date: 2006-07-13

Abstract

PROBLEM TO BE SOLVED: To correctly acquire the character string of each part of a business card or the like, even when it contains a personal name that closely resembles a place name, without having to designate in advance or at recognition time what information each character string represents.

SOLUTION: When classifying each character string on a business card read by a scanner 200 by semantic attribute, the CPU of the information processing apparatus 100 analyzes each character string with reference to DBs 105 to 107, assigns scores indicating the likelihood of assigning the character string to each semantic attribute using a plurality of scoring methods, selects one of the combination patterns of assigning the character strings to the semantic attributes on the basis of the total of the assigned scores for each pattern, and classifies each character string into a semantic attribute according to the selected pattern.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to an information processing apparatus, a character string classification method, a program, and a recording medium for character recognition and for classifying the semantic information category represented by each character string obtained as a character recognition result.

Systems have conventionally existed that handle images of a fixed format: the position at which each kind of character string appears is registered in advance or specified by the user, so that the system recognizes that a name is written at one particular position and an address at another.

Patent Document 1 describes a character recognition device that, when performing character recognition on a business card, assigns specific classifications to the information on the card by exploiting the fact that each line contains a word serving as an identifier of a particular classification (a characteristic word such as a personal name or “株式” (stock)).
Patent Document 1: Japanese Patent Laid-Open No. 10-78997

However, because the conventional system described above relies on the position of the character strings on the business card, character string position patterns must be registered in advance. Otherwise, the character strings are divided into blocks and the user is forced to specify manually which block is the name, which is the address, and so on, which is very cumbersome.

Some systems also registered approximate position information in advance, for example that on a horizontal business card the company name runs from the upper left to the middle and the name is printed in large type in the center.

In that case, however, when a business card with a new layout appears, it cannot be recognized unless a new character string position pattern is registered.

Furthermore, the character recognition device of Patent Document 1 determines the semantic attribute of each character string individually. Consequently, a personal name that closely resembles a place name, such as “秋田県” (read Akita Agata) or “吉野村政”, is very likely to be misrecognized as a place name.

Therefore, even with the character recognition technique of Patent Document 1, accurately classifying which character string is the address and which is the name ultimately requires checking by the user, which is very cumbersome.

The present invention has been made to solve the above problems. Its object is to provide an information processing apparatus, a character string classification method, a program, and a recording medium that can acquire the character string of each part of a business card or the like as correct information, without the need to designate in advance or at recognition time what information each part represents. To this end, when classifying by semantic attribute each character string in a character string group composed of a plurality of character strings, each assigned to one of a plurality of semantic attributes, the invention analyzes each character string and assigns, by a plurality of scoring methods, scores indicating the likelihood of assigning the character string to each semantic attribute; selects one of the combination patterns of assigning the character strings to the semantic attributes on the basis of the total of the scores assigned for each pattern; and classifies each character string into a semantic attribute according to the selected pattern. In other words, the semantic attribute of each character string is decided in consideration of the character string group as a whole.

The present invention is an information processing apparatus that classifies, by semantic attribute, each character string in a character string group composed of a plurality of character strings each assigned to one of a plurality of semantic attributes, the apparatus comprising: one or more score assigning means for analyzing a character string and assigning a score indicating the likelihood of assigning the character string to each semantic attribute; and classifying means for selecting one of the combination patterns of assigning the character strings to the semantic attributes on the basis of the total of the scores assigned by the score assigning means for each pattern, and for classifying each character string into a semantic attribute according to the selected pattern.

According to the present invention, when classifying by semantic attribute each character string in a character string group composed of a plurality of character strings each assigned to one of a plurality of semantic attributes, each character string is analyzed and scores indicating the likelihood of assigning it to each semantic attribute are assigned by a plurality of scoring methods; one of the combination patterns of assigning the character strings to the semantic attributes is selected on the basis of the total of the scores assigned for each pattern; and each character string is classified into a semantic attribute according to the selected pattern. This yields effects such as the following: there is no need to designate in advance or at recognition time what information each part of a business card or the like represents, and the character string of each part can be acquired as correct information.

Because the semantic attribute of a character string is determined in consideration of the entire character string group rather than from its position on the document, a personal name that closely resembles a place name, such as “秋田県” (Akita Agata) or “吉野村政”, can be classified as a name without being misrecognized as a place name.

As a result, when the data is passed to a subsequent application such as a business card database, the labor of manually checking, specifying, or correcting whether a given character string is a name, an address, and so on can be saved.

[First Embodiment]
Hereinafter, details of the present invention will be described with reference to the drawings.

FIG. 1 is a system configuration diagram showing an example of a system to which the information processing apparatus according to the first embodiment of the present invention is applicable.

In FIG. 1, reference numeral 100 denotes an information processing apparatus serving as a character string recognition apparatus. The information processing apparatus 100 performs OCR on the character strings in image data input from an image input device 200 (for example, a scanner or a digital camera). From the OCR result, it determines character string information based on the dictionary data stored in the dictionary DB 106. From the character string information determined by the dictionary, it further determines character string information using the rules stored in the rule DB 105. Finally, from the character string information determined by the rules, it determines character string information by a probability calculation such as an HMM (Hidden Markov Model), based on the appearance probability data stored in the probability information DB 107.

FIG. 2 is a block diagram showing an example of the hardware configuration of the information processing apparatus shown in FIG. 1.

In FIG. 2, reference numeral 101 denotes a CPU. It loads the program according to the present invention, stored in a ROM 103 or a hard disk (HD) 104 (which may be any other storage device, such as a flexible disk, CD-ROM, or DVD-ROM), onto a RAM 102 and executes it, thereby realizing the character string recognition method of the present invention and controlling the computer as a whole. The RAM 102 is used as the work area of the CPU 101.

Reference numeral 109 denotes a communication interface, which enables data to be exchanged with the image input device 200 (for example, a scanner or a digital camera) via a network or another communication medium.

Reference numeral 110 denotes an input device, such as a keyboard or a pointing device such as a mouse. Reference numeral 111 denotes a display device, such as a CRT, LCD, or SED monitor.

The rule DB 105, dictionary DB 106, and probability information DB 107 shown in FIG. 1 are built in the HD 104 of the information processing apparatus 100. Each function of the information processing apparatus 100 (the OCR function and the character string determination functions) is realized by the CPU 101 loading a program stored in the HD 104 onto the RAM 102 and executing it.

The information processing apparatus of the present invention may be implemented as a computer, or as a scanner, a multifunction peripheral, or the like.

The character string recognition method of the information processing apparatus of the present invention is described below with reference to FIGS. 3 to 15.

FIG. 3 is a flowchart showing an example of a first control processing procedure in the information processing apparatus of the present invention, and corresponds to the character string recognition processing. The processing of this flowchart is realized by the CPU 101 of the information processing apparatus 100 loading a program stored in the HD 104 onto the RAM 102 and executing it. In the figure, S001 to S008 denote the steps.

The following description takes recognition of the character strings on a business card as an example.

The CPU 101 starts the processing of this flowchart when image data (OCR target image) 301 of a document (here, a business card) read by the image input device 200, such as a scanner or a digital camera, is input, or when character string recognition is requested for image data stored in the HD 104.

In step S001, the CPU 101 reads the image data 301 to be recognized from memory (HD, RAM, or the like), performs OCR on it, and extracts the character strings together with their position information (extracted character string group 302). Character strings on the same row of a horizontal business card, or in the same column of a vertical one, are extracted as a single character string. Where they are separated by more than a certain distance, however, they are split into separate character strings.

For example, when OCR is performed on the business card 500 shown in FIG. 4, “課長” (section manager) and “山形県” (Yamagata Agata, a personal name) are extracted as separate character strings. The fact that the two were on the same line is nevertheless stored in memory for use when the rules described later are applied. In the example of FIG. 4, seven character strings 501 to 507 are extracted from the business card 500.

FIG. 4 is a schematic diagram showing an example of a character string recognition target image and the character strings extracted from it in the information processing apparatus of the present invention, specifically for the case where character strings are extracted from a business card.

The CPU 101 then stores the character strings obtained by OCR in memory (HD, RAM, or the like) as the extracted character string group 302.

In the example of FIG. 4, the following are stored: “ヤマモト電気株式会社” (Yamamoto Electric Co., Ltd.) 501, “○○事業本部” (○○ Business Headquarters) 502, “△△研究開発部第一開発課” (△△ Research and Development Department, First Development Section) 503, “課長” (section manager) 504, “山形県” 505 (together with a note that it was on the same line as “課長”), “〒１１１-００１１東京都○○区三好１−２−３” 506, and “ＴＥＬ０３-１２３４-５６７８” 507.
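The splitting behavior of step S001 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Fragment` type, the coordinates, and the `max_gap` threshold are hypothetical, since the description only says that strings on the same line are split when separated by "a certain distance".

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One OCR'd character with its position (hypothetical structure)."""
    text: str
    row: int        # index of the text line the fragment sits on
    x_start: float
    x_end: float


def split_row(fragments, max_gap):
    """Join fragments of one row into strings, starting a new string
    whenever the horizontal gap to the previous fragment exceeds max_gap."""
    groups = []
    for frag in sorted(fragments, key=lambda f: f.x_start):
        if groups and frag.x_start - groups[-1][-1].x_end <= max_gap:
            groups[-1].append(frag)
        else:
            groups.append([frag])
    # Keep the row index so later rules can tell the strings shared a line.
    return [{"text": "".join(f.text for f in g), "row": g[0].row} for g in groups]


row = [
    Fragment("課", 3, 10, 20), Fragment("長", 3, 21, 31),
    Fragment("山", 3, 60, 70), Fragment("形", 3, 71, 81), Fragment("県", 3, 82, 92),
]
strings = split_row(row, max_gap=5)
```

With these made-up coordinates, “課長” and “山形県” come out as two strings that both carry row 3, which mirrors the note stored for character string 505.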

Next, the CPU 101 performs the processing of steps S002 to S006 on every character string extracted in step S001 in order to determine the semantic attribute of each.

A semantic attribute is a category of information such as “name”, “address”, “telephone number”, or “job title” (here, a category of the information written on a business card). The processing determines which of these semantic attributes each character string extracted in step S001 belongs to, and does so for all of the character strings obtained in step S001. Steps S002 to S006 are described in detail below.

In step S002, the CPU 101 reads (focuses on) one of the character strings extracted in step S001, searches the dictionary DB 106 shown in FIG. 5 using that character string as the keyword, assigns scores to the character string based on the search result, and holds them in memory.

FIG. 5 is a diagram showing an example of the dictionary DB in the information processing apparatus of the present invention.

As shown in FIG. 5, the dictionary DB 106 consists of the items character string 401, reading 402, classification 403, and score (likelihood) 404.

Searching this dictionary DB 106 with, for example, “山形県” as the key shows that its classification 403 is either “address (part of)” or “personal name”, and that its score 404 is “80” as an address and “20” as a personal name. From this processing alone, “山形県” would therefore be regarded as more likely to be an address. This stage is not a final decision: both possibilities, address and personal name, are held in memory together with their scores.
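The dictionary lookup can be sketched as a table scan that returns every classification registered for a string. This is a sketch under assumptions: the in-memory list stands in for the dictionary DB 106, the attribute labels and the reading shown for the prefecture entry are illustrative, and exact matching is a simplification (the description does not say how partial matches such as an address containing a place name are handled).

```python
# Rows follow the FIG. 5 layout: (character string, reading, classification, score).
# The contents are illustrative stand-ins for the dictionary DB 106.
DICTIONARY = [
    ("山形県", "やまがたけん", "address", 80),
    ("山形県", "やまがたあがた", "name", 20),
    ("課長", "かちょう", "title", 100),
]


def dictionary_scores(text):
    """Return {classification: score} for every dictionary row matching text."""
    scores = {}
    for string, _reading, classification, score in DICTIONARY:
        if string == text:  # exact match only; an assumption of this sketch
            scores[classification] = scores.get(classification, 0) + score
    return scores


candidates = dictionary_scores("山形県")
```

Both candidates are kept with their scores rather than picking the larger one, because the decision is deferred to the later steps.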

The description now returns to the flowchart of FIG. 3.

Next, in step S004, the CPU 101 applies all of the rules in the rule DB 105 to the character string of interest, assigns scores accordingly, and holds them in memory.

FIG. 6 is a diagram showing an example of the rule DB in the information processing apparatus of the present invention.

As shown in FIG. 6, the rule DB 105 consists of the items rule number 601, pattern part 602, score assignment part 603, classification 604, and score 605.

In FIG. 6, “○” stands for a character or a character string. For example, in the pattern part 602 of rule 2, the portion preceding “株式会社” (Co., Ltd.) corresponds to “ヤマモト電気” in the “ヤマモト電気株式会社” shown in FIG. 4. The pattern parts of the rules in FIG. 6 are assumed to be written as regular expressions, but they may instead use a description format that follows another rule-driven engine technology.

As the score assignment part 603 of rule 2 indicates, the score applies to the “entire character string”, that is, to the whole of “ヤマモト電気株式会社”. As the classification 604 of rule 2 indicates, the string is a “company name”, and as the score 605 indicates, it receives a score of “100” as a “company name”.

Because “山形県” comes after “課長”, the pattern part and score assignment part of rule 1 give it a score of “100” for the classification “name”.

That “課長” is a job title can be determined because the dictionary search of FIG. 5 has attached that attribute to it. The rule does not, however, rely on “山形県” being registered in the dictionary as a personal name, so the rule in this example is also useful for unknown personal names that are not in the dictionary.

Similarly, “東京都○○区三好１−２−３” matches rule 9 and is given a score of “100” as an “address”.

As described above, “ヤマモト電気株式会社” matches rule 2 and is given a score of “100” as a “company name”.

Likewise, “ＴＥＬ０３−１２３４−５６７８” matches rule 4 and is given a score of “100” as a “telephone number”.
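Rule application of this kind can be sketched with regular expressions, as the description itself suggests. The actual pattern parts of FIG. 6 are not reproduced in the text, so the two regular expressions below are illustrative guesses for rules 2 and 4, and the `follows_title_on_same_line` flag is a hypothetical way of passing in the layout context that rule 1 needs. Full-width characters are folded to ASCII with NFKC normalization before matching, which is a choice of this sketch.

```python
import re
import unicodedata

# (rule number, pattern, classification, score); patterns are illustrative guesses.
RULES = [
    (2, re.compile(r".+株式会社"), "company", 100),
    (4, re.compile(r"TEL\s*\d{2,4}[-\u2212]\d{2,4}[-\u2212]\d{4}"), "phone", 100),
]


def rule_scores(text, follows_title_on_same_line=False):
    """Return {classification: score} from every rule whose pattern matches."""
    normalized = unicodedata.normalize("NFKC", text)  # e.g. full-width TEL -> TEL
    scores = {}
    for _number, pattern, classification, score in RULES:
        if pattern.fullmatch(normalized):
            scores[classification] = scores.get(classification, 0) + score
    # Rule 1 looks at layout context (a string following a job title),
    # not at the characters of the string itself.
    if follows_title_on_same_line:
        scores["name"] = scores.get("name", 0) + 100
    return scores


company = rule_scores("ヤマモト電気株式会社")
phone = rule_scores("ＴＥＬ０３-１２３４-５６７８")
person = rule_scores("山形県", follows_title_on_same_line=True)
```

Because rule 1 never inspects the characters of the string, the same call would score an unregistered name in the same position, which is the property the description points out.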

Next, in step S005, the CPU 101 assigns a score to the character string of interest using a probabilistic model and holds it in memory.

As a known technique, a probabilistic model such as the HMM (hidden Markov model) described in Japanese Patent Laid-Open No. 2004-46775 can be used to obtain the probability that the character string appears.

For example, if the “semantic attribute” of a character string on a business card is regarded as the internal state and the “character string” itself as the externally observable symbol, the process that generates the group of character strings on a business card can be approximated by a hidden Markov model.

When the format of a business card (the pattern in which character strings corresponding to each semantic attribute appear) is expressed with this hidden Markov model, it is represented by the probabilities of transition from one semantic attribute to another (for example, the probability that a “name” follows a “job title”, or that a “telephone number” follows an “address”) and by the appearance probabilities of character strings for each semantic attribute (for example, the probability of “山形県” appearing as a “name”, or as a “place name”).

In this way, by using a hidden Markov model, how natural the sequence of “character strings” on a business card is can be expressed by the magnitude of the state transition probabilities.
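How transition and appearance probabilities combine can be illustrated with a small calculation. The following is a minimal sketch, not the patent's or the cited publication's implementation: every probability value is made up for illustration, and the function name `sequence_log_prob` and the attribute labels are assumptions.

```python
import math


def sequence_log_prob(attributes, strings, start, transition, emission):
    """Log-probability of observing `strings` while passing through the
    hidden states `attributes` (one semantic attribute per string)."""
    logp = math.log(start[attributes[0]]) + math.log(emission[attributes[0]][strings[0]])
    for prev, cur, s in zip(attributes, attributes[1:], strings[1:]):
        logp += math.log(transition[prev][cur]) + math.log(emission[cur][s])
    return logp


# All numbers below are hypothetical.
start = {"title": 1.0}
transition = {"title": {"name": 0.8, "address": 0.2}}
emission = {
    "title": {"課長": 0.5},
    "name": {"山形県": 0.001},     # rare as a personal name
    "address": {"山形県": 0.003},  # more common as a place name
}

strings = ["課長", "山形県"]
as_name = sequence_log_prob(["title", "name"], strings, start, transition, emission)
as_address = sequence_log_prob(["title", "address"], strings, start, transition, emission)
```

With these made-up values the "title then name" reading scores higher even though “山形県” on its own is more probable as a place name, because the transition from a job title to a name is much more likely than the transition to an address.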

In this embodiment, such a probabilistic model is used to obtain the probability that the character string of interest appears. A coefficient is then determined such that the maximum probability maps to the maximum score (for example, “100” when scores are on a 100-point scale), and the appearance probability of the character string of interest is normalized to a score on that scale.
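One possible reading of this normalization is a simple rescaling by a single coefficient. The sketch below assumes that the maximum is taken over the candidate attributes of one character string; the description does not state over which set of probabilities the maximum is taken, so this is an assumption, as are the function name and the input values.

```python
def normalize_to_scale(probabilities, full_score=100):
    """Scale probabilities so that the largest one becomes full_score."""
    peak = max(probabilities.values())
    if peak == 0:
        return {attr: 0.0 for attr in probabilities}
    coefficient = full_score / peak  # the "coefficient" of the description
    return {attr: p * coefficient for attr, p in probabilities.items()}


# Hypothetical probabilities for one character string.
model_scores = normalize_to_scale({"name": 0.0004, "address": 0.0003})
```

After scaling, the probabilistic scores are on the same 100-point scale as the dictionary and rule scores, so the three can be added together.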

The probabilistic model applicable to the present invention is not limited to a hidden Markov model. Any model may be used as long as it probabilistically predicts the pattern in which character strings corresponding to each semantic attribute appear.

Scoring by the probabilistic model may also be made optional, so that it is not executed.

Next, in step S006, a score calculation is performed that sums, for the character string of interest, the scores assigned in steps S003 to S005. This is described below with reference to FIG. 7.

FIG. 7 is a diagram for explaining the score calculation processing in the information processing apparatus of the present invention.

Suppose, for example, that the character string of interest is “山形県”. The dictionary DB 106 has given it a score of “80” as an “address” and “20” as a “name”, and the rule DB 105 has given it a score of “100” as a “name”. To keep the explanation simple, it is assumed here that no score from the probabilistic model has been assigned.

In this case, the totals from the score calculation are “80” as an “address” and “120 (= 20 + 100)” as a “name”.
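The summation of step S006 amounts to adding the per-attribute scores from each scoring method. A minimal sketch using the values from the “山形県” example follows; the function name and the attribute labels are assumptions.

```python
from collections import Counter


def total_scores(*score_sources):
    """Add up per-attribute scores from any number of scoring methods."""
    total = Counter()
    for scores in score_sources:
        total.update(scores)  # Counter.update adds values key by key
    return dict(total)


dictionary_part = {"address": 80, "name": 20}  # from the dictionary DB 106
rule_part = {"name": 100}                      # from the rule DB 105
model_part = {}                                # probabilistic model omitted here

totals = total_scores(dictionary_part, rule_part, model_part)
```

Passing an empty mapping for the probabilistic model reflects the simplification made in the example, where that score is not assigned.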

The information processing apparatus of the present invention does not decide at this point alone that “山形県” is a personal name. The final character string recognition result is determined by applying the optimal assignment processing of step S008, described later, to this score calculation result.

In step S007, the CPU 101 determines whether the processing of steps S002 to S006 has been completed for all character strings in the extracted character string group 302. If it has not, the processing returns to step S002 and the next character string is processed.

If, on the other hand, the CPU 101 determines in step S007 that the processing of steps S002 to S006 has been completed for all character strings in the extracted character string group 302, the processing proceeds to step S008.

FIG. 8 shows the dictionary-based scoring results for all character strings in the extracted character string group 302. FIG. 9 shows the result of further adding the rule-based scores to these results and performing the score calculation for all character strings in the extracted character string group 302.

As shown in FIGS. 8 and 9, the score calculation result for “ヤマモト電気株式会社” is “20 (dictionary-based)” as a “name” (901 in FIG. 9) and “100 (dictionary-based) + 100 (rule-based)” as a “company name” (903 in FIG. 9).

For “課長”, the score is “100 (dictionary-based)” as a “job title”.

For “山形県”, the score is “20 (dictionary-based) + 100 (rule-based)” as a “name” (902 in FIG. 9) and “80 (dictionary-based)” as an “address” (904 in FIG. 9).

For “〒１１１-００１１東京都○○区三好１−２−３”, the score is “80 (dictionary-based) + 100 (rule-based)” as an “address” (905 in FIG. 9).

For “ＴＥＬ０３-１２３４-５６７８”, the score is “100 (rule-based)” as a “telephone number” (906 in FIG. 9).

Thus, in this example, “ヤマモト電気株式会社” and “山形県” are each shown to have more than one possible semantic attribute. In such a case, one attribute is decided for each by the optimal assignment processing shown in FIG. 10.

FIG. 10 is a flowchart showing an example of a second control processing procedure in the information processing apparatus of the present invention, and corresponds to the optimal assignment processing shown in step S008 of FIG. 3. The processing of this flowchart is realized by the CPU 101 of the information processing apparatus 100 loading a program stored in the HD 104 onto the RAM 102 and executing it. In the figure, S1101 to S1110 denote the steps.

First, in step S1101, the CPU 101 determines whether the extracted character string group 302 contains a character string that has two or more semantic attributes (that is, a character string with more than one category whose score is not "0" in the score calculation result shown in FIG. 9). If it determines that no such character string exists, then in step S1110 it associates each character string in the extracted character string group 302 with the corresponding semantic attribute (the category whose score is not "0"), outputs the associated "character string" and "semantic attribute" pairs as output information, and ends the processing of this flowchart.

On the other hand, if the CPU 101 determines in step S1101 that the extracted character string group 302 contains a character string that has two or more semantic attributes (a character string with more than one category whose score is not "0"), then in step S1102 the CPU 101 calculates the total score for each combination pattern of the character string group. That is, it extracts the patterns shown in FIG. 11 and calculates the total score of each pattern.
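The branch taken in step S1101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the nested-dictionary layout, the function names `ambiguous_strings` and `direct_assignment`, and the English labels (including "postal address line" for the address string) are hypothetical, while the scores are the dictionary-based and rule-based values quoted above for FIG. 9.

```python
# Combined scores (dictionary-based + rule-based) as quoted in the text for FIG. 9.
# The nested-dict layout and the English labels are assumptions for illustration.
scores = {
    "Yamamoto Electric Co., Ltd.": {"name": 20, "company name": 100 + 100},
    "section manager": {"title": 100},
    "Yamagata Prefecture": {"name": 20 + 100, "address": 80},
    "postal address line": {"address": 80 + 100},
    "TEL 03-1234-5678": {"telephone number": 100},
}


def ambiguous_strings(scores):
    """Step S1101: strings with two or more non-zero semantic attributes."""
    return [
        text
        for text, by_attribute in scores.items()
        if sum(1 for value in by_attribute.values() if value != 0) >= 2
    ]


def direct_assignment(scores):
    """Step S1110: pair each string with its first non-zero attribute (None if all are 0)."""
    return {
        text: next((a for a, value in by_attribute.items() if value != 0), None)
        for text, by_attribute in scores.items()
    }


candidates = ambiguous_strings(scores)
if candidates:
    # At least one string is ambiguous, so processing would continue at S1102.
    print("needs optimum allocation:", candidates)
else:
    print(direct_assignment(scores))
```

With the scores quoted in the text, the two ambiguous strings are the company name and the name that resembles a prefecture, so the sketch takes the S1102 branch.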

FIG. 11 is a diagram showing an example of the calculation of the total score for each pattern in the information processing apparatus of the present invention.

As shown in FIG. 11, four patterns A, B, C, and D are extracted from the score calculation result shown in FIG. 9. Semantic attributes such as telephone number and title, which are already associated with a character string that has only that one semantic attribute, are omitted.

When the patterns are extracted, the character strings are arranged so that no single character string is assigned to more than one semantic attribute within the same pattern.

The total score is then calculated for each pattern.

As a result, in the example shown in FIG. 11, the CPU 101 determines in step S1103 that pattern C, which has the highest score, is the most appropriate, and selects pattern C as the relationship between semantic attributes and character strings.

Then, in step S1104, the CPU 101 associates the character strings with the semantic attributes according to the pattern selected in step S1103, outputs the associated "character string" and "semantic attribute" pairs as output information, and ends the processing of this flowchart.
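Steps S1102 to S1104 can be sketched as a small exhaustive search. This is an illustrative sketch, not the embodiment's implementation. The text states only that one character string is never assigned to several semantic attributes within a pattern; the additional rule used here, that each semantic attribute receives at most one character string, is an assumption inferred from the pattern totals quoted later for the dictionary-only case. The function name `select_pattern` and the label "postal address line" are hypothetical.

```python
from itertools import product

# Scores quoted in the text for the strings that compete for
# "name", "company name" and "address" (dictionary-based + rule-based).
scores = {
    "Yamamoto Electric Co., Ltd.": {"name": 20, "company name": 200},
    "Yamagata Prefecture": {"name": 120, "address": 80},
    "postal address line": {"address": 180},
}


def select_pattern(scores):
    """Enumerate assignment patterns and return the one with the highest total."""
    strings = list(scores)
    # Each string takes one of its candidate attributes or stays unassigned (None).
    options = [list(scores[text]) + [None] for text in strings]
    best_total, best_pattern = -1, None
    for combo in product(*options):
        used = [attribute for attribute in combo if attribute is not None]
        if len(used) != len(set(used)):
            continue  # assumed rule: an attribute is given to at most one string
        total = sum(
            scores[text][attribute]
            for text, attribute in zip(strings, combo)
            if attribute is not None
        )
        if total > best_total:
            best_total, best_pattern = total, dict(zip(strings, combo))
    return best_total, best_pattern


total, pattern = select_pattern(scores)
for text, attribute in pattern.items():
    print(f"{text} -> {attribute}")
```

Under these assumptions the sketch assigns "Yamagata Prefecture" to "name", the company string to "company name", and the postal line to "address", which is the assignment the text describes for pattern C. Exhaustive enumeration is adequate for the handful of strings on a business card; a larger input would call for a dedicated assignment algorithm.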

Therefore, in the end, "Yamamoto Electric Co., Ltd." is recognized as "company name", "section manager" as "title", "Yamagata Prefecture" as "name", "〒111-0011 Tokyo, ○○-ku, Miyoshi 1-2-3" as "address", and "TEL 03-1234-5678" as "telephone number".

In this way, the business card shown in FIG. 4 is recognized correctly.

In the example of the business card shown in FIG. 4, the title "section manager" appears before "Yamagata Prefecture". When scores were assigned using the rule DB 105, this raised the score of the "name" attribute and had a large influence on "Yamagata Prefecture" being recognized as a "name".

The case in which the title "section manager" does not appear before "Yamagata Prefecture" is therefore considered below with reference to FIGS. 12 to 14.

FIG. 12 shows another example of the dictionary-based score assignment results for all character strings in the extracted character string group 302. FIG. 13 shows another example of the score calculation results for all character strings in the extracted character string group 302 after rule-based scores have been added to these results.

As shown in FIGS. 12 and 13, the score calculation result for "Yamamoto Electric Co., Ltd." is a score of "20 (dictionary-based)" as "name" and a score of "100 (dictionary-based) + 100 (rule-based)" as "company name".

The score calculation result for "section manager" is a score of "100 (dictionary-based)" as "title".

The score calculation result for "Yamagata Prefecture" is a score of "20 (dictionary-based)" as "name" and a score of "80 (dictionary-based)" as "address" (the rule DB 105 is not applied because there is no title).

The score calculation result for "〒111-0011 Tokyo, ○○-ku, Miyoshi 1-2-3" is a score of "80 (dictionary-based) + 100 (rule-based)" as "address".

For "TEL 03-1234-5678", the score is "100 (rule-based)" as "telephone number".

The resulting total score patterns are shown in FIG. 14.

FIG. 14 is a diagram showing another example of the calculation of the total score for each pattern in the information processing apparatus of the present invention.

As shown in FIG. 14, four patterns A, B, C, and D are extracted from the score calculation result shown in FIG. 13.

When the total score of each pattern is calculated, pattern C again has the highest score in the example shown in FIG. 14.

Therefore, even when the title "section manager" does not appear before "Yamagata Prefecture", "Yamamoto Electric Co., Ltd." is correctly recognized as "company name", "section manager" as "title", "Yamagata Prefecture" as "name", "〒111-0011 Tokyo, ○○-ku, Miyoshi 1-2-3" as "address", and "TEL 03-1234-5678" as "telephone number".

As described above, the apparatus uses a dictionary, rules, or probabilistic processing to identify the meaning represented by a character string in the image data. Rather than determining the semantic attribute of each character string in isolation, it considers the semantic attributes of a plurality of character strings (those of the entire business card). It can thereby determine automatically (without human intervention) which semantic attribute the information written in each part of the image data corresponds to (that is, which semantic information category it belongs to).

Accordingly, when the data is passed to a subsequent application such as a business card database, the labor of manually checking, specifying, or correcting whether a given character string is a name or an address can be saved, and an excellent character string recognition environment can be constructed.

Score assignment by rules and score assignment by probabilistic processing such as an HMM are optional, and a system that omits one or both of them can also be constructed.

For example, the CPU 101 performs control so that step S004 in FIG. 3 is skipped when rules are not used, and step S005 in FIG. 3 is skipped when probabilistic processing is not used.
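The optional nature of the rule-based and probabilistic steps can be sketched by treating each scoring method as a pluggable function. This is a hedged sketch rather than the embodiment's code: `combined_scores`, `toy_dictionary`, and `toy_rules` are hypothetical names, and the toy scorers simply return the values quoted in the text for "Yamagata Prefecture" instead of consulting a real dictionary DB or rule DB.

```python
def combined_scores(text, dictionary_scorer, rule_scorer=None, probability_scorer=None):
    """Sum the per-attribute scores of every scoring method that is enabled."""
    totals = dict(dictionary_scorer(text))
    for scorer in (rule_scorer, probability_scorer):
        if scorer is None:
            continue  # method not used: the corresponding step is skipped
        for attribute, value in scorer(text).items():
            totals[attribute] = totals.get(attribute, 0) + value
    return totals


def toy_dictionary(text):
    # Stand-in for a dictionary lookup (values quoted in the text).
    return {"name": 20, "address": 80} if text == "Yamagata Prefecture" else {}


def toy_rules(text):
    # Stand-in for the rule that raises "name" when a title precedes the string.
    return {"name": 100} if text == "Yamagata Prefecture" else {}


with_rules = combined_scores("Yamagata Prefecture", toy_dictionary, rule_scorer=toy_rules)
dictionary_only = combined_scores("Yamagata Prefecture", toy_dictionary)
print(with_rules)
print(dictionary_only)
```

Passing `None` for a scorer plays the role of skipping step S004 or S005; the remaining scores still feed the same pattern selection.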

Using the example above, the case in which only dictionary-based scores are assigned (that is, rules and probabilistic processing are not applied) is considered below.

From the dictionary-based score assignment example shown in FIG. 12, the same patterns A, B, C, and D as in the example above are extracted in this case as well.

However, the total score of pattern A is "100 = 20 + 80", the total score of pattern B is "100 = 20 + 80", the total score of pattern C is "200 = 20 + 100 + 80", and the total score of pattern D is "180 = 100 + 80".

Therefore, even when only dictionary-based scores are assigned (rules and probabilistic processing are not applied), pattern C is still determined to have the highest score and the strings are recognized correctly.
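The dictionary-only totals can be reproduced with a short enumeration. The composition of patterns A to D is not spelled out in the text, so this sketch rests on an assumption: a pattern is taken to be a maximal assignment in which each string has at most one attribute and each attribute has at most one string. The function name `maximal_patterns` and the label "postal address line" are hypothetical; the scores are the dictionary-based values quoted above.

```python
from itertools import product

# Dictionary-based scores only, as quoted in the text.
dictionary_scores = {
    "Yamamoto Electric Co., Ltd.": {"name": 20, "company name": 100},
    "Yamagata Prefecture": {"name": 20, "address": 80},
    "postal address line": {"address": 80},
}


def maximal_patterns(scores):
    """List (total, assignment) for every maximal one-to-one assignment."""
    strings = list(scores)
    options = [list(scores[text]) + [None] for text in strings]
    patterns = []
    for combo in product(*options):
        used = {attribute for attribute in combo if attribute is not None}
        if len(used) != sum(attribute is not None for attribute in combo):
            continue  # some attribute would be given to two strings
        # Maximal: every unassigned string has all of its attributes taken already.
        if any(
            attribute is None and not set(scores[text]) <= used
            for text, attribute in zip(strings, combo)
        ):
            continue
        total = sum(
            scores[text][attribute]
            for text, attribute in zip(strings, combo)
            if attribute is not None
        )
        patterns.append((total, dict(zip(strings, combo))))
    return patterns


patterns = maximal_patterns(dictionary_scores)
totals = sorted(total for total, _ in patterns)
print(totals)  # [100, 100, 180, 200]
```

Under this reading the enumeration yields exactly four patterns whose totals match the four values given in the text, with the 200-point pattern assigning "Yamagata Prefecture" to "name".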

[Second Embodiment]
The first embodiment described a configuration in which OCR is performed on image data that is the target of character string recognition and the semantic attribute of each OCR-recognized character string is determined. However, the processing of steps S002 to S008 in FIG. 3 is not limited to OCR results of image data; it may be applied to any data that consists of a plurality of character strings.

For example, the processing may be applied to a CSV file such as the one shown in FIG. 15.

FIG. 15 is a diagram showing an example of a CSV file containing character strings to be recognized in the information processing apparatus of the present invention.

As shown in FIG. 15, the first row of this CSV file contains an address in the first column, as indicated by 1201, a name in the second column, as indicated by 1202 (here 1202 is the name "秋田県 (Agata)", written with the same characters as Akita Prefecture), a company name in the third column, and a telephone number in the fourth column.

In the second row, a telephone number is in the third column, as indicated by 1203.

In the third row, a name (the name "○○田×子") is in the first column, as indicated by 1205, and an address is in the second column, as indicated by 1204.

Thus, even if the column items differ from row to row, and even if a name resembling an address is included, performing the processing of steps S002 to S008 shown in FIG. 3 for each row makes it possible to classify the data in each column of each row by semantic attribute (semantic information category), in the same way as in the business card example of the first embodiment.
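Row-by-row application to CSV data can be sketched as below. This is only an outline under stated assumptions: `classify_csv` and `toy_classify_row` are hypothetical names, the sample rows are invented for illustration, and the toy classifier recognizes nothing but a telephone-number shape, standing in for the full scoring and pattern selection of steps S002 to S008.

```python
import csv
import io
import re

# Hypothetical telephone-number shape; a stand-in for the real scoring methods.
PHONE = re.compile(r"^\d{2,4}-\d{2,4}-\d{4}$")


def toy_classify_row(row):
    """Label each cell of one row; only telephone numbers are recognized here."""
    return [
        (cell, "telephone number" if PHONE.match(cell) else "unclassified")
        for cell in row
    ]


def classify_csv(csv_text, classify_row=toy_classify_row):
    """Treat every CSV row as its own character string group."""
    return [classify_row(row) for row in csv.reader(io.StringIO(csv_text))]


# Invented sample: the telephone number sits in a different column in each row.
sample = "Tokyo,Akita Agata,Example Corp,03-1234-5678\nExample Corp,Tokyo,03-9876-5432\n"
result = classify_csv(sample)
for row in result:
    print(row)
```

Because each row is classified independently, the column in which an attribute appears does not need to be the same across rows.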

Configurations that combine any or all of the modifications described in the above embodiments are also included in the present invention.

The structure and contents of the various data described above are not limited to those shown; it goes without saying that they may take various structures and contents depending on the application and purpose.

Although one embodiment has been described above, the present invention can be embodied as, for example, a system, an apparatus, a method, a program, or a recording medium. Specifically, it may be applied to a system composed of a plurality of devices, or to an apparatus consisting of a single device.

As described above, when each character string in a character string group composed of a plurality of character strings, each assigned to one of a plurality of semantic attributes, is classified by semantic attribute, each character string is analyzed and given, by one or a plurality of methods, a score indicating the likelihood of assigning that character string to each semantic attribute. One of the combination patterns of assigning the character strings to the semantic attributes is then selected based on the total of the scores given for each combination pattern, and the character strings are classified into the semantic attributes according to the selected combination pattern (that is, the semantic attribute of each character string is decided in consideration of the character string group as a whole). As a result, there is no need to specify, in advance or at the time of recognition, what information the character string in each part of a business card or the like represents, and the character string of each part of the business card or the like can be acquired as correct information.

As a result, when the data is passed to a subsequent application such as a business card database, the labor of manually checking, specifying, or correcting whether a given character string is a name or an address can be saved.

The configuration of the data processing programs readable by the character recognition apparatus according to the present invention is described below with reference to the memory map shown in FIG. 16.

FIG. 16 is a diagram explaining the memory map of a recording medium (storage medium) that stores the various data processing programs readable by the character recognition apparatus according to the present invention.

Although not specifically illustrated, information for managing the program group stored on the recording medium, such as version information and the creator, may also be stored, as may information that depends on the OS or the like on the program reading side, such as icons for identifying and displaying programs.

Data belonging to the various programs is also managed in the directory. In addition, when a program or data to be installed is compressed, a program for decompressing it may also be stored.

The functions shown in FIGS. 3 and 10 in this embodiment may be carried out by a host computer using a program installed from outside. In that case, the present invention applies even when a group of information including the program is supplied to the output device from a recording medium such as a CD-ROM, flash memory, or FD, or from an external recording medium via a network.

As described above, it goes without saying that the object of the present invention is also achieved by supplying a system or apparatus with a recording medium on which software program code realizing the functions of the above embodiments is recorded, and having the computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the recording medium.

In this case, the program code read from the recording medium itself realizes the novel functions of the present invention, and the recording medium storing that program code constitutes the present invention.

As the recording medium for supplying the program code, for example, a flexible disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, DVD-ROM, magnetic tape, nonvolatile memory card, ROM, EEPROM, or silicon disk can be used.

It also goes without saying that the invention covers not only the case in which the functions of the above embodiments are realized by the computer executing the program code it has read, but also the case in which an OS (operating system) or the like running on the computer performs part or all of the actual processing based on the instructions of that program code, and the functions of the above embodiments are realized by that processing.

Furthermore, it goes without saying that the invention also covers the case in which the program code read from the recording medium is written into a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above embodiments are realized by that processing.

The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. It goes without saying that the present invention is also applicable when it is achieved by supplying a program to a system or apparatus. In this case, the system or apparatus can enjoy the effects of the present invention by reading into it a recording medium that stores a program represented by software for achieving the present invention.

Furthermore, the system or apparatus can enjoy the effects of the present invention by using a communication program to download and read a program represented by software for achieving the present invention from a server, database, or the like on a network.

FIG. 1 is a system configuration diagram showing an example of a system to which an information processing apparatus according to an embodiment of the present invention can be applied.
FIG. 2 is a block diagram showing an example of the hardware configuration of the information processing apparatus shown in FIG. 1.
FIG. 3 is a flowchart showing an example of the first control processing procedure in the information processing apparatus of the present invention.
FIG. 4 is a schematic diagram showing an example of a character string recognition target image in the information processing apparatus of the present invention and the character strings extracted from that image.
FIG. 5 is a diagram showing an example of the configuration of the dictionary DB in the information processing apparatus of the present invention.
FIG. 6 is a diagram showing an example of the configuration of the rule DB in the information processing apparatus of the present invention.
FIG. 7 is a diagram explaining the score calculation processing for the character string shown in FIG. 3.
FIG. 8 is a diagram showing the dictionary-based score assignment results for all character strings in the extracted character string group.
FIG. 9 is a diagram showing the score calculation results for all character strings in the extracted character string group.
FIG. 10 is a flowchart showing an example of the second control processing procedure in the information processing apparatus of the present invention.
FIG. 11 is a diagram showing an example of the calculation of the total score for each pattern in the information processing apparatus of the present invention.
FIG. 12 is a diagram showing another example of the dictionary-based score assignment results for all character strings in the extracted character string group.
FIG. 13 is a diagram showing another example of the score calculation results for all character strings in the extracted character string group.
FIG. 14 is a diagram showing another example of the calculation results of the total score for each pattern in the information processing apparatus of the present invention.
FIG. 15 is a diagram showing an example of a CSV file containing character strings to be recognized in the information processing apparatus of the present invention.
FIG. 16 is a diagram explaining the memory map of a recording medium (storage medium) that stores the various data processing programs readable by the character recognition apparatus according to the present invention.

Explanation of symbols

200 Image input device (scanner)
100 Information processing apparatus (character string recognition apparatus)
101 CPU
102 RAM
103 ROM
104 HD
105 Dictionary DB
106 Rule DB
107 Probability information DB
109 Communication interface
110 Input device
111 Display device

Claims

An information processing apparatus for classifying, by semantic attribute, each character string in a character string group composed of a plurality of character strings each assigned to any one of a plurality of semantic attributes, the apparatus comprising:
one or a plurality of score assigning means for analyzing the character string and assigning a score indicating the likelihood of assigning the character string to each of the semantic attributes; and
classifying means for selecting one of the combination patterns of assignment of the character strings to the semantic attributes, based on the total value, for each combination pattern, of the scores assigned by the score assigning means, and for classifying the character strings into the semantic attributes according to the selected combination pattern.

The information processing apparatus according to claim 1, wherein the one or plurality of score assigning means includes dictionary score assigning means comprising:
first storage means for storing dictionary data in which character strings are associated with the semantic attributes; and
first assigning means for assigning a score for each semantic attribute of the character string based on the dictionary data.

The information processing apparatus according to claim 1, wherein the one or plurality of score assigning means includes dictionary score assigning means comprising:
second storage means for storing rule data in which each character string rule that associates a character string with a semantic attribute has a score; and
second assigning means for assigning a score for each semantic attribute of the character string based on the rule data.

The information processing apparatus according to claim 1, wherein the one or plurality of score assigning means includes score assigning means comprising third assigning means for assigning a score for each semantic attribute of the character string based on a probability model corresponding to a format of the character string group.

The information processing apparatus according to claim 1, wherein the classifying means calculates the score when one character string has two or more semantic attributes.

The information processing apparatus according to claim 1, wherein each character string is a character string read from image data.

The information processing apparatus according to claim 1, wherein each of the character strings is a character string read from image data corresponding to a business card, and the semantic attributes include a name, a title, an address, a telephone number, and a company name.

A character string classification method in an information processing apparatus capable of classifying, by semantic attribute, each character string in a character string group composed of a plurality of character strings each assigned to any one of a plurality of semantic attributes, the method comprising:
one or a plurality of score assigning steps of analyzing, for each of the character strings, the character string and assigning a score indicating the likelihood of assigning the character string to each of the semantic attributes; and
a classifying step of selecting one of the combination patterns of assignment of the character strings to the semantic attributes, based on the total value, for each combination pattern, of the scores assigned in the score assigning steps, and classifying the character strings into the semantic attributes according to the selected combination pattern.

A program for causing a computer to execute the character string classification method according to claim 8.

A recording medium storing a computer-readable program for executing the character string classification method according to claim 8.