JP2008003740A - Input correction method, postscript information processing method, postscript information processor, and program

Info

Publication number: JP2008003740A
Application number: JP2006170877A
Authority: JP
Inventors: Teruka Saito; 照花斎藤
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2006-06-21
Filing date: 2006-06-21
Publication date: 2008-01-10
Anticipated expiration: 2026-06-21
Also published as: JP4873138B2

Abstract

PROBLEM TO BE SOLVED: To accurately identify handwritten input information that needs correction, and to correct it efficiently, quickly, and accurately, in a system that performs automatic data processing on the basis of the input information.

SOLUTION: An input correction method obtains not only a feature quantity related to the reliability of the recognition processing itself for the postscript information to be processed, but also feature quantities related to the reliability of the various processes that precede and bear on the recognition processing (S300), and calculates a sub-reliability for each process from the corresponding feature quantity (S320). The method determines the final reliability of the postscript information of interest from the sub-reliabilities (S304). It identifies difficult-to-recognize information, for which the reliability of the recognition processing is below a prescribed level, by determining whether the final reliability is below that level (S306), and prompts correction of the difficult-to-recognize information by presenting recognition performance information for improving its recognition performance (S308, S310).

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an input correction method, and to a postscript information processing method, a postscript information processing apparatus, and a program to which the input correction method is applied. More specifically, it relates to a mechanism for correcting input additional information, used, for example, when additional information handwritten onto a document (also called postscript information or annotation) is separated from the document body and used for various kinds of information processing related to the document body.

With today's remarkable progress in information processing technology, there are mechanisms that automate various kinds of document processing. For example, further additional information is handwritten onto an original document on which predetermined information is already printed, and the resulting annotated document is taken as the processing target so that predetermined data processing is executed automatically on the basis of the handwritten additional information.

For example, there are automatic form processing mechanisms in which information is handwritten on a standard slip (a so-called form) and the entered information is processed (see, for example, Patent Documents 1 and 2). There are also personal information management (specifically, schedule management) mechanisms in which appointments are handwritten mainly in a notebook or memo pad provided with date and schedule entry fields, and the entered appointments are converted into electronic data (see Patent Document 3).

In educational settings such as schools and cram schools, a mechanism has also been considered that performs automatic scoring and tallying on the sheets (educational teaching materials) obtained when a grader marks answer sheets filled in by students or examinees (see Patent Document 4).

Patent Document 1: JP-A-5-342239
Patent Document 2: JP-A-6-274156
Patent Document 3: JP-A-5-216932
Patent Document 4: JP-A-11-31046

For example, Patent Document 1 proposes a mechanism in which identification marks designating the type of a table and the row and column positions of each data item in the table are added, and the table data is printed as a data table form bearing those identification marks. When a mark reading unit reads the marks on the data table form, the form is identified from the identification data, the corresponding table data is shown on a display unit, and the relevant portion is displayed in an editable state, so that data can be corrected and collated efficiently with little effort.

Patent Document 2 proposes a mechanism in which an output document is converted into image data, the converted image data is displayed, form-defining position coordinates are specified on the displayed image, form graphic data is derived from the specified coordinates, and the form is displayed superimposed on the image data shown on the display unit on the basis of the derived form graphic data, so that a form for a desired output document can be created easily.

Patent Document 3 proposes, as a mechanism for personal information management, scanning a written-on page of a notebook or the like, detecting characters and ruled lines by a known method, collating the arrangement of the detected characters and ruled lines with prestored knowledge of the notebook's format, and generating data representing the attribute of each entry.

Information handwritten in a notebook or memo (in this example, appointments handwritten in a notebook) is automatically converted into electronic data, and each entry can be registered in a database in association with its generated attribute. When the user requests output in a certain format, the contents of the database can therefore be printed at the corresponding locations on paper by referring to a prestored output format, which is convenient.

Patent Document 4 proposes a mechanism in which the handwritten entry fields of an answer sheet are defined as input areas on a tablet, and format data that identifiably defines the input areas to be tallied is stored in a storage device. With the answer sheet laid over the tablet, information written in a handwritten entry field is captured as handwriting data entered in the corresponding tablet input area and subjected to character recognition, which reduces the labor of keying in data after scoring or other entry work.

On the other hand, when data processing is executed on the basis of handwritten additional information, the issues are how to recognize the input handwritten information (characters and figures) and how to efficiently correct automatic recognition results when they are unreliable. Various mechanisms for automatically recognizing and correcting such handwritten information have been considered (see, for example, Patent Document 5).

Patent Document 5: JP 2004-152115 A

In the mechanism described in Patent Document 5, a certainty factor for the content is calculated for each of a plurality of items included in the acquired data, and the presentation method is changed dynamically using the calculated certainty factors. By dynamically changing the data correction method according to the certainty factor of each item, the contents of closely related items can be corrected (for example, correcting an address from a postal code), and the contents of doubtful items, that is, items with low certainty factors, can always be corrected preferentially. Operator-assisted input correction can thus be performed efficiently, quickly, and accurately regardless of the data input method.

However, the mechanism of Patent Document 5 calculates the "certainty factor" solely from the pattern recognition similarity obtained when each item is recognized, so it cannot take into account events that arise in the various preprocessing steps preceding recognition and that affect the recognition processing. As a result, for example, the presence or absence of local noise influences the certainty factor, and its accuracy becomes a problem. If something goes wrong at the preprocessing stage, the input to the recognition processing is already degraded, and a recognition result that is "highly reliable but wrong" may be produced.

The present invention has been made in view of the above circumstances. Its object is to provide a mechanism that allows postscript information used for automatic data processing to be corrected efficiently, quickly, and accurately, and that can identify with high accuracy the postscript information requiring correction.

In the mechanism according to the present invention, feature quantities related to the reliability of recognition processing are acquired for the additional information of interest, and a sub-reliability for each process involved in the recognition is calculated from the corresponding feature quantity. A final reliability for the additional information of interest is determined from the calculated sub-reliabilities. Difficult-to-recognize information, for which the reliability of the recognition processing is below a certain level, is identified by determining whether the final reliability is below that level, and recognition performance information for improving the recognition performance of the identified difficult-to-recognize information is presented.

Thereafter, the corrected additional information entered in response to the presented recognition performance information is reflected in the data processing. The correction can be reflected, for example, by directly changing the data stored after recognition, such as through a database operation.

Here, the "feature quantities related to the reliability of recognition processing" include not only feature quantities related to the reliability of the recognition processing itself but also feature quantities related to the reliability of the various processes that precede and bear on the recognition processing.

That is, difficult-to-recognize information whose recognition reliability is below a certain level is identified by considering not only the recognition processing itself but also events that can arise in the preceding preprocessing steps and affect recognition.
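The flow described here (feature quantities, per-process sub-reliabilities, a final reliability, then a threshold test) can be sketched as follows. This is a minimal illustration only. The stage names, the assumption that each sub-reliability already lies in [0, 1], the use of the minimum as the combining rule, and the threshold value 0.6 are all hypothetical, since this part of the text does not specify them.

```python
from dataclasses import dataclass

# Hypothetical value for the "certain level"; the text gives no number.
THRESHOLD = 0.6


@dataclass
class Annotation:
    label: str
    # Stage name -> sub-reliability in [0, 1]. Stage names are illustrative.
    sub_reliabilities: dict


def final_reliability(sub_reliabilities):
    """Combine sub-reliabilities into one value.

    Taking the minimum is an assumed rule: one weak stage is enough
    to make the overall result doubtful.
    """
    return min(sub_reliabilities.values())


def find_hard_to_recognize(annotations, threshold=THRESHOLD):
    """Return labels of annotations whose final reliability is below threshold."""
    return [
        a.label
        for a in annotations
        if final_reliability(a.sub_reliabilities) < threshold
    ]


annotations = [
    Annotation("Q1 mark", {"preprocessing": 0.95, "segmentation": 0.90, "recognition": 0.92}),
    Annotation("Q2 mark", {"preprocessing": 0.40, "segmentation": 0.85, "recognition": 0.97}),
]
hard = find_hard_to_recognize(annotations)
print(hard)
```

In this made-up data, the second annotation has a high recognition score but a low preprocessing score, which is the "highly reliable but wrong" situation the text warns about. A rule that looked only at the recognition stage would not flag it.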

The mechanism according to the present invention can also be implemented in software on a computer, and a program for this purpose, or a recording medium storing the program, can also be extracted as an invention. The program may be provided stored on a computer-readable storage medium, or may be distributed via wired or wireless communication means.

According to the present invention, not only feature quantities related to the reliability of the recognition processing itself but also feature quantities related to the reliability of the various preceding processes that bear on recognition are acquired, and difficult-to-recognize information whose recognition reliability is below a certain level is identified.

Events that arise in the preprocessing steps preceding recognition and that affect the recognition processing can thereby be taken into account. It is therefore unnecessary to check all postscript information: postscript information liable to be misrecognized can be corrected efficiently, quickly, and accurately, and difficult-to-recognize information can be identified with high accuracy. In short, the postscript information to be corrected can be identified with high accuracy and corrected efficiently, quickly, and accurately.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

<Example of processing target document>
FIG. 1 is a diagram showing an example of a document to be processed in an information processing system including a postscript information processing apparatus according to the present invention.

The document shown in FIG. 1 is an educational teaching material 80 used as a paper medium in educational settings. As a typical example, it has question text 82 and answer fields 84 (the parts shown in parentheses in this example). FIG. 1(A) shows the original document 8A, and FIG. 1(B) schematically shows the state after postscript information has been entered (annotated document 8B). FIG. 1(C) shows an example of the information registered in the database.

Specifically, the educational teaching material 80 corresponds to a paper test, practice worksheet, or the like used at an educational institution. The educational teaching material 80 need only have at least the answer fields 84. The question text 82 need not be printed, as in the case where answers are written in the answer fields 84 to questions read aloud by the grader.

As shown in FIG. 1(A), in addition to the question text 82 and the answer fields 84, in which answers (an example of first-type additional information) are entered, the educational teaching material 80 has a score field 83 (item score fields 83a for individual questions and a tally field 83b consisting of partial-score and total-score fields), an identification information field 85 for information that identifies the educational teaching material 80 (an example of first-type additional information), and an answerer information field 86 for answerer information, that is, information about the person who fills in the answer fields 84 (an example of first-type additional information). The answer fields 84, the identification information field 85, and the answerer information field 86 are all examples of entry fields for first-type additional information.

In the identification information field 85, for example, the subject, title, or applicable grade of the educational teaching material 80 is printed in advance. In addition to or instead of these, code information for identifying the educational teaching material 80 may be embedded.

The code information may be embedded using known techniques. One specific example is a technique such as that called "iTone (registered trademark)", in which digital information is embedded in a halftone image by varying the form (position, shape, etc.) of the pixels that make up a line screen or dot screen used for gradation rendering. The answerer information field 86, on the other hand, allows the answerer's class 86a, attendance number 86b, name 86c, and so on to be entered.

Score allocation information for each answer field 84 is entered in the score field 83 (in particular, the item score fields 83a). Score allocation information specifies how many points are allotted to the answer field 84 at each position on the educational teaching material 80. The allotted points may differ from one answer field 84 to another or may be uniform.

Such an educational teaching material 80 can be obtained by printing it on a printer from the electronic data of the corresponding original (teaching material original). The electronic data of the teaching material original can be created, for example, with application software such as a word processor on a personal computer or similar computing device, and is stored in advance in a predetermined database or the like.

The electronic data of the teaching material original may be in any data format as long as it allows the layout of the answer fields 84, the identification information field 85, and so on of the educational teaching material 80 to be determined, and can be held in a predetermined database. For example, it is not limited to application document data created with document creation software and may be image data.

In an educational setting, the educational teaching material 80, which is an example of the original document 8A shown in FIG. 1(A), is distributed to students, examinees, and the like. First-type additional information, such as the name in the answerer information field 86 and the answers in the answer fields 84, is entered in the designated fields by the students, and the sheets are then collected. After that, as with the filled-in teaching material 81 shown in FIG. 1(B), which is an example of the annotated document 8B, a grader such as a teacher enters second-type additional information: scoring symbols (correct/incorrect marks) 87 for the answers written in the answer fields 84, and comments 88 expressed as other figures or text associated with the scoring symbols 87. Because the postscript information processing apparatus 10 performs automatic scoring, the grader does not fill in the score field 83C at this point.

Usually, the pen color used by students to enter the first-type additional information differs from the pen color used by the grader to enter the second-type additional information, and both differ from the colors preprinted on the educational teaching material 80.

The scoring symbols 87 include, for example, "○" or another figure (such as an ellipse) indicating a correct answer, "×" or another figure (such as a check mark) indicating an incorrect answer, and "△" or another figure indicating a partially correct answer.

The comment 88 has no direct relation to the first data processing, which is based on the scoring symbols 87. It is used in second data processing that either supports (reinforces) the result of the first data processing or is entirely unrelated to it.

The automatic teaching material scoring system performs predetermined data processing on the filled-in teaching material 81, which is an example of the annotated document 8B. Separate data processing is performed for each kind of second-type additional information (in this example, the scoring symbols 87 and the comments 88). In this example, automatic scoring based on the scoring symbols 87 is performed as the first data processing. Its result yields, for example, the score information to be entered in the score field 83C. Second data processing, distinct from the automatic scoring, is performed on the basis of the content of the comments 88.
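As a rough illustration of the first data processing, the sketch below turns recognized scoring symbols and per-question point allocations into item scores and a total. The function name, the sample data, and the credit fractions are hypothetical. The text only says that the symbols mean correct, incorrect, and partially correct; awarding full, zero, and half credit is an assumption made for this example.

```python
# Assumed fraction of the allotted points awarded for each scoring symbol.
# The document does not state these values; 0.5 for "△" in particular is a guess.
CREDIT = {"○": 1.0, "×": 0.0, "△": 0.5}


def score_sheet(points_by_question, symbol_by_question):
    """Return (item score per question, total score)."""
    item_scores = {
        question: points_by_question[question] * CREDIT[symbol]
        for question, symbol in symbol_by_question.items()
    }
    return item_scores, sum(item_scores.values())


# Hypothetical allocations and recognized symbols for three questions.
points = {1: 10, 2: 10, 3: 20}
symbols = {1: "○", 2: "×", 3: "△"}

item_scores, total = score_sheet(points, symbols)
print(item_scores, total)
```

The item scores correspond to what would go into the item score fields, and the total to the tally field, under the assumptions above.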

As the second data processing, distinct from the automatic scoring, the content of the comments 88 is, for example, classified according to predetermined conditions, and the classified comments (not necessarily all of the content; part of it will do) are registered and accumulated in a student guidance database in association with the scoring result for the corresponding answer field 84.

In this way, the second data processing makes it possible to retrieve information from the student guidance database and use it for later student guidance. For example, displaying all scoring results together with their comments allows the scoring results to be checked along with finer evaluation categories such as excellent, good, average, and poor. Displaying only the comments for incorrect answers allows analysis of a student's abilities, such as frequent misreading of questions, frequent mistakes in writing answers, or frequent calculation errors. The cautionary notes and corrections given in the comments 88 could also be used for student guidance.
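The use of the student guidance database for ability analysis can be sketched as a simple filter and count. The record layout, field names, and category labels below are hypothetical; the text does not describe the database schema.

```python
from collections import Counter

# Hypothetical student guidance records: one per answer field, linking the
# scoring result to the classified comment.
records = [
    {"question": 1, "result": "correct", "category": None, "comment": ""},
    {"question": 2, "result": "incorrect", "category": "calculation error", "comment": "Check the carry."},
    {"question": 3, "result": "incorrect", "category": "misread question", "comment": "Read the units."},
    {"question": 4, "result": "incorrect", "category": "calculation error", "comment": "Sign mistake."},
]


def incorrect_comment_summary(records):
    """Count comment categories over incorrect answers only."""
    incorrect = [r for r in records if r["result"] == "incorrect"]
    return Counter(r["category"] for r in incorrect)


summary = incorrect_comment_summary(records)
print(summary.most_common())
```

With this made-up data, calculation errors dominate the incorrect answers, which is the kind of tendency the text suggests a teacher could act on.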

When the educational teaching material 80 (original image) is registered in the database so that data processing can be executed on the scoring symbols 87 and comments 88, the following are normally registered as well: entry field position/area information 38, which indicates the position of each answer field 84, the question number, the score allocation, and other information needed when processing the scoring symbols 87; and classification criterion information needed when processing the comments 88. So that the character portions of the educational teaching material 80 (for example, question text including question numbers, and score allocations) can be referenced as text, the educational teaching material 80 itself is preferably stored as text data or as document file data equivalent to text data.

For example, as shown in FIG. 1(C), answer field position/area information, which is an example of the entry field position/area information 38, consists of the number of each question on the educational teaching material 80 (an example of first attribute information), the points allotted to the answer to that question (an example of second attribute information), the xy coordinates of a predetermined point (for example, the upper-left vertex) of the area treated as the answer field 84 for that question, and the width (W) and height (h) of its circumscribed rectangle. These are held in a predetermined storage area in a table format that associates them with one another.
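One way to picture this table is as a list of records, one per answer field. The sketch below is an assumption-laden illustration: the class name, field names, and coordinate values are hypothetical, and the point-in-rectangle lookup is an assumed use of the table (for example, deciding which answer field a detected scoring symbol belongs to), not something this passage specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AnswerField:
    question_no: int  # first attribute information
    points: int       # second attribute information
    x: int            # x coordinate of the upper-left vertex
    y: int            # y coordinate of the upper-left vertex
    width: int        # W of the circumscribed rectangle
    height: int       # h of the circumscribed rectangle

    def contains(self, px: int, py: int) -> bool:
        """True if the point (px, py) lies inside this field's rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def field_at(fields, px: int, py: int) -> Optional[AnswerField]:
    """Return the first answer field containing the point, or None."""
    for field in fields:
        if field.contains(px, py):
            return field
    return None


# Hypothetical table rows; real values would come from the registered original.
fields = [
    AnswerField(question_no=1, points=10, x=100, y=200, width=80, height=30),
    AnswerField(question_no=2, points=10, x=100, y=260, width=80, height=30),
]

hit = field_at(fields, 120, 270)
print(hit.question_no if hit else None)
```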

When the original educational teaching material 80 corresponding to the filled-in teaching material 81 is not registered in the document management server, a blank educational teaching material 80 is read with the document input device, and the positions of the question text and answer fields 84, the score allocations, and so on are determined from it.

When automatic data processing is executed on postscript information that is classified into several kinds, with separate data processing for each kind, each piece of postscript information must be recognized and identified separately from the others. Unless the kinds are separated correctly, the respective data processing cannot be executed properly.

For example, in the case of the filled-in teaching material 81 shown in FIG. 1, there is automatic scoring based on the scoring symbols 87 and construction of the student guidance database based on the comments 88. To realize such separate processing, the scoring symbols 87 and comments 88 must be distinguished in recognition and related processing before the final data processing is performed.

In grading, however, comments 88 may be written with the same pen as the scoring symbols 87. Merely extracting the components of the predetermined color corresponding to the pen color, for example through color component recognition applied to the extraction result of the difference extraction unit 132, therefore cannot properly separate the two or identify what was written. If the scoring symbols 87 and comments 88 cannot be separated properly, automatic scoring is adversely affected. Moreover, although the postscript information includes, besides the scoring symbols 87 for automatic scoring, comments 88 that could be used for student guidance, once the filled-in teaching material 81 is returned to the student the comments serve no purpose other than being read by that student.

To avoid this, the plural pieces of additional information in the annotated document 8B are preferably classified (separated) into several kinds on the basis of their various features, such as writing position, image feature quantities, or reliability at recognition time. More accurate separation is obtained by referring to multiple features rather than a single feature of the additional information. A detailed description of these points is omitted.

<System Configuration>
FIG. 2 is a diagram showing a configuration example of one embodiment of an information processing system that includes the postscript information processing apparatus according to the present invention. The information processing system is shown here as applied to an automatic teaching material scoring system that processes educational teaching materials 80 such as answer sheets.

As illustrated, the automatic teaching material scoring system 1 comprises a postscript information processing apparatus 10 forming the core of the system; a document input device 20 that digitizes the annotated teaching material 81, which is the document to be processed, and inputs it to the postscript information processing apparatus 10; a document management server 30 that stores electronic data of the template 6 and the educational teaching material 80 (more precisely, its original image) corresponding to the annotated teaching material 81; and a processing result storage server 40 that stores the results of information processing (in this example, automatic scoring and the like). These are connected to one another over a wired or wireless network.

The document input device 20 reads the annotated teaching material 81 using a known optical image reading technique and obtains image data from it. The annotated teaching material 81 is an educational teaching material 80 in which answers have been entered in the answer fields 84, a name and the like have been entered in the answerer information field 86, and a grader has entered scoring symbols 87 (specifically, figures such as "○" and "×") for the answers in the answer fields 84.

The document input device 20 may be any device capable of converting the annotated teaching material 81, which is the document to be processed, into electronic data. For example, it may be implemented with a copier, a multifunction machine, or a scanner that functions as an image reading device. If an automatic document feeder (ADF) is attached, images of multiple educational teaching materials can conveniently be read in succession.

Tests and the like using the educational teaching material 80 are not limited to paper media. For example, when a test is given and scored on a tablet PC, the annotated teaching material 81 is available as electronic data from the outset, and in that case the document input device 20 is unnecessary in the system configuration.

The document management server 30 registers, as the original document information database DB2 on a predetermined storage medium such as a hard disk device or an optical disk device, the original image of the educational teaching material 80 corresponding to the annotated teaching material 81 in association with identification information (for example, subject, title, and applicable school grade) or an identification code for specifying that original image.

The document management server 30 also stores the original educational teaching material 80 (original image) corresponding to the annotated teaching material 81. In addition, as shown in FIG. 1(C), it holds in a predetermined storage area, in table form as part of the original document information database DB2, entry field position/area information 38 indicating the question numbers, point allocations, and other information needed when processing data on the scoring symbols 87 and comments 88.

The processing result storage server 40 may be any device that is network-connected to the postscript information processing apparatus 10 and can manage the automatic scoring and tabulation results for the annotated teaching material 81. Examples include a processing result database device and a processing result file server device.

The postscript information processing apparatus 10, which forms the core of the automatic teaching material scoring system 1, includes a read image processing unit 110 that performs predetermined signal processing on the image data of the annotated teaching material 81 input from the document input device 20, and an original document specifying unit 120 that, based on the processing by the read image processing unit 110, identifies the original document 8A from which the read image input from the document input device 20 derives. In the automatic teaching material scoring system 1, the original document specifying unit 120 functions as a teaching material specifying unit 122 that specifies the educational teaching material 80, which is an example of the original document 8A.

The postscript information processing apparatus 10 also includes a postscript information extraction unit 130 that extracts postscript information (annotations) from the image data processed by the read image processing unit 110; a data-processing-target postscript information specifying unit 150 that recognizes and specifies the written content and entry position based on the data-processing-target postscript information extracted by the postscript information extraction unit 130; and a data processing unit 170 that performs data processing based on the written content of the postscript information specified by the specifying unit 150.

As a component specific to this embodiment, the postscript information processing apparatus 10 further includes a recognition performance information presentation processing unit 190. This unit deals with difficult-to-recognize information, that is, handwritten input information such as handwritten characters and figures whose recognition reliability, determined from information indicating recognition performance (specifically, the recognition rate), is below a certain level. For such information, the unit presents to the user recognition performance information for improving the recognition performance.

The recognition performance information presentation processing unit 190 identifies, among the scoring symbols 87 and comments 88 entered as postscripts, difficult-to-recognize information whose recognition reliability is below a certain level, and presents on the user terminal 171 recognition performance information for improving the recognition performance for that information. The data processing unit 170 executes data processing such as automatic scoring and automatic comment classification using the corrected scoring symbols 87 and comments 88 that the user enters in response to this presentation.

To make correction of the postscript information subject to recognition efficient, the recognition performance information presentation processing unit 190 of this embodiment is characterized as follows. It collects information related to the recognition rate from each of multiple processes in the functional units of the postscript information processing apparatus 10. From the postscript information whose content was automatically recognized, it extracts the items with particularly low reliability, that is, difficult-to-recognize information that was harder to recognize than a certain degree. It then presents recognition performance information for those items to prompt correction. Eliminating the need to check every piece of postscript information for correction makes correcting misrecognized postscript information more efficient.

For example, to obtain the effect that "not every figure needs to be checked", the reliability-based warnings must be highly accurate, because a warning that misses errors cannot be trusted. If something goes wrong at the preprocessing stage, the input to the recognition process is already degraded, and the recognizer may output a result that has high reliability yet is wrong.

Therefore, the efficiency gain of not having to check every figure becomes attainable only by collecting information from multiple processes rather than from the recognition process alone, calculating a highly accurate reliability from it, and issuing highly accurate warnings on that basis. This point is described in detail later.
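The idea of combining per-process scores into one reliability and flagging only low-reliability items can be sketched as follows. This is a minimal illustration, not the method of this specification: the stage names, the use of the minimum as the combining rule, and the threshold value are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Annotation:
    label: str                           # recognized content, e.g. "○"
    sub_reliabilities: Dict[str, float]  # per-stage scores in [0, 1] (stage names are hypothetical)


def final_reliability(ann: Annotation) -> float:
    # Assumption: the weakest stage bounds the overall reliability,
    # so a degraded preprocessing step lowers the final score even
    # when the recognizer itself is confident.
    return min(ann.sub_reliabilities.values())


def hard_to_recognize(anns: List[Annotation], threshold: float = 0.8) -> List[Annotation]:
    # Only items below the threshold are presented for correction.
    return [a for a in anns if final_reliability(a) < threshold]


annotations = [
    Annotation("○", {"extraction": 0.95, "shaping": 0.90, "recognition": 0.97}),
    Annotation("×", {"extraction": 0.40, "shaping": 0.90, "recognition": 0.99}),
]
flagged = hard_to_recognize(annotations)
print([a.label for a in flagged])
```

In this sketch the second item is flagged even though its recognition score is high, which is the "high reliability but wrong" case that a recognition-only score would miss.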

Although not illustrated, the read image processing unit 110 has an image data analysis unit and a distortion correction unit. The image data analysis unit analyzes the image data input from the document input device 20 using known image processing techniques such as layout analysis, character/figure separation, character recognition, code information recognition, figure processing, and color component recognition (detailed descriptions of each are omitted). The distortion correction unit corrects image distortion in the image data input from the document input device 20, such as skew and the scaling ratio in the main scanning or sub-scanning direction. The distortion correction unit may correct the image distortion (skew, scaling, and so on) by comparing and collating the input image data with the corresponding original image in the document management server 30.

Although not illustrated, the teaching material specifying unit 122 has, for example, an identification information analysis unit and a code information analysis unit. Based on the analysis results of the image data analysis unit, the identification information analysis unit analyzes identification information such as the subject, title, or applicable school grade written in the identification information field 85. The code information analysis unit analyzes code information, likewise embedded in the identification information field 85, that specifies the educational teaching material 80.

The teaching material specifying unit 122 checks the identification information (for example, subject, title, and applicable school grade) or identification code specified from the analysis results of the image data analysis unit against the information on the original images of the educational teaching materials 80 held in the document management server 30 (likewise identification information or identification codes). If the corresponding original image is not held in the document management server 30, the unit determines that the electronic data to be compared with the image data obtained by the document input device 20 cannot be specified, and outputs an identification error signal.
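The lookup just described might be sketched as below. The dictionary standing in for the original document information database, the key layout, and the exception standing in for the identification error signal are hypothetical choices for illustration only.

```python
class IdentificationError(LookupError):
    """Stands in for the identification error signal (hypothetical name)."""


def find_original_image(database, subject, title, grade):
    # The key mirrors the identification information named in the text:
    # subject, title, and applicable school grade.
    key = (subject, title, grade)
    if key not in database:
        # No registered original image: the comparison target cannot be specified.
        raise IdentificationError(f"no original image registered for {key}")
    return database[key]


# Hypothetical database contents.
db = {("math", "Unit 3 quiz", 5): "original_0001.png"}
found = find_original_image(db, "math", "Unit 3 quiz", 5)
```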

The teaching material specifying unit 122 need only be able to identify the original educational teaching material 80 corresponding to the image data input from the document input device 20 (corresponding to the annotated teaching material 81). It therefore need only include whichever of the identification information analysis unit and the code information analysis unit suits the format of the identification information written or embedded in the identification information field 85 of the annotated teaching material 81. It does not necessarily need both.

The postscript information extraction unit 130 has a difference extraction unit 132. Using known image processing techniques, the difference extraction unit 132 compares the image data whose distortion has been corrected by the distortion correction unit with the original image (corresponding to the educational teaching material 80) that the teaching material specifying unit 122 has specified as corresponding to the image data input from the document input device 20 (corresponding to the annotated teaching material 81), and extracts the difference between them.
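As a rough illustration of difference extraction, the sketch below keeps the pixels that carry ink in the scan but not in the original. It assumes both images are already aligned, equally sized, and binarized (1 = ink, 0 = background). The specification only refers to known image processing techniques, so this particular rule is an assumption, and a practical system would also need some tolerance for residual misalignment.

```python
def extract_difference(scanned, original):
    """Return a binary image of pixels that are ink in the scan only."""
    return [
        [1 if s and not o else 0 for s, o in zip(scan_row, orig_row)]
        for scan_row, orig_row in zip(scanned, original)
    ]


original = [[0, 1, 0],
            [0, 1, 0]]
scanned = [[1, 1, 0],
           [0, 1, 1]]
difference = extract_difference(scanned, original)
```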

The postscript information extraction unit 130 also has an answerer extraction unit 134 and a data-processing postscript information extraction processing unit 140. Based on the extraction result of the difference extraction unit 132, the answerer extraction unit 134 extracts answerer information (an example of first-type additional information) from the annotated teaching material 81 read by the document input device 20. Likewise based on that extraction result, the extraction processing unit 140 extracts second-type additional information used for data processing, such as the scoring symbols 87 and comments 88 in the annotated teaching material 81.

The answerer extraction unit 134 has a handwritten information clipping unit 136 and a character recognition (OCR; Optical Character Reader) unit 138. Based on the extraction result of the difference extraction unit 132, the handwritten information clipping unit 136 clips out, as-is, the images of the numbers and characters handwritten by the answerer in the class 88a, attendance number 88b, and name 88c fields of the answerer information field 86. The character recognition unit 138, based on the extraction result of the difference extraction unit 132 (preferably, on the handwritten information clipped by the handwritten information clipping unit 136), converts the handwritten entries in the answerer information field 86 into character data that can be processed and edited on the postscript information processing apparatus 10.

The answerer extraction unit 134 need only be able to extract the answerer information entered in the answerer information field 86 of the annotated teaching material 81 read by the document input device 20, so it need only include either the handwritten information clipping unit 136 or the character recognition unit 138. When the character recognition unit 138 is not provided, or for the portions of the comment 88 that the character recognition unit 138 could not recognize, the extracted answerer information is handled as an image as-is.

The data-processing postscript information extraction processing unit 140 has a data-processing-target postscript information extraction unit 142 and a postscript information shaping unit 146. Referring to information on the postscript color of interest, and based on the extraction result of the difference extraction unit 132, the extraction unit 142 extracts, from the postscript information extracted by the postscript information extraction unit 130, the data-processing-target postscript information 9a (in this example, the scoring symbols 87 and comments 88) in the annotated document 8B (in this example, the annotated teaching material 81) read by the document input device 20. The shaping unit 146 shapes the extracted data-processing-target postscript information so that it can withstand data processing. In this embodiment, the extraction unit 142 functions both as a scoring symbol extraction unit that extracts the scoring symbols 87 and as a comment extraction unit that extracts the comments 88.

The data-processing-target postscript information extraction unit 142 need only be able to extract the data-processing-target postscript information 9a, and preferably classifies the postscript information further using color as an index. For example, through color component recognition applied to the extraction result of the difference extraction unit 132, it may extract the components of the predetermined color corresponding to the pen color the grader used to enter the scoring symbols 87 and comments 88. Scoring symbols 87 and comments 88 on the annotated teaching material 81 are generally entered with a red pen, and in that case extraction focusing on the red component suffices.

However, pens called red come in similar shades ranging from pinkish to orangish, a red pen is not necessarily used for the scoring symbols 87 and comments 88, and the scoring symbols 87 and comments 88 are sometimes entered in different pen colors. It is therefore preferable to make the pen color used for entering the scoring symbols 87 and comments 88 configurable in the data-processing-target postscript information extraction unit 142, which functions as the scoring symbol extraction unit and comment extraction unit, so as to improve extraction performance.

For this reason, the postscript color actually used is specified, and extraction focusing on that color is performed with reference to the specified color information. When the pen color actually used is known, the data-processing-target postscript information extraction unit 142 can narrow the permissible extraction range. This allows the scoring symbols 87 and comments 88 to be extracted while being distinguished from other postscript information with high accuracy.
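The effect of narrowing the permissible extraction range can be illustrated with a small sketch. The RGB representation, the per-channel tolerance test, and the sample values are assumptions chosen for the example; the specification does not fix a color space or a matching rule.

```python
def extract_by_pen_color(pixels, pen_rgb, tolerance):
    """Mark pixels whose every RGB channel is within `tolerance` of the pen color."""
    def matches(pixel):
        return all(abs(c - t) <= tolerance for c, t in zip(pixel, pen_rgb))
    return [[matches(p) for p in row] for row in pixels]


# One row of pixels: a red stroke, an orange stroke, and black printed text.
row = [[(220, 30, 40), (250, 120, 40), (0, 0, 0)]]
pen = (225, 25, 35)  # hypothetical configured pen color

wide = extract_by_pen_color(row, pen, tolerance=120)   # generic "reddish" range
narrow = extract_by_pen_color(row, pen, tolerance=20)  # range narrowed to the known pen
print(wide, narrow)
```

With the wide range both the red and the orange strokes are picked up, whereas the narrowed range keeps only the stroke written with the configured pen.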

The postscript information shaping unit 146 preferably includes an extracted-line-segment break correction unit 148. This unit corrects the data-processing-target postscript information extracted by the extraction unit 142 by connecting line segments to one another so as to eliminate breaks between the extracted segments.

In general, figures such as double lines, wavy lines, flower circles (hanamaru), and arrows, as well as comment text, entered on the annotated document 8B are sometimes written over content that is already there. For example, on the annotated teaching material 81, scoring symbols 87 such as "○" and "×" may be written over the frames that delimit the question text 82 and answer fields 84 or over the answers written in the answer fields 84, and comments 88 may be added as figures or characters. As a result, when the extraction unit 142 extracts the predetermined color component, the overlapping portions are removed, and the extracted figures and characters may contain breaks.

For this reason, the extracted-line-segment break correction unit 148 applies, as appropriate, thinning, endpoint extraction, endpoint connection (so-called linking), or straight-line approximation of line figures to extraction results that ought to be figures or characters such as "○", "×", lines, and other marks. These processes can be performed using known techniques, so their detailed description is omitted here (see, for example, "Image Processing and Recognition" (画像の処理と認識), by Takeshi Agui, published by Shokodo).

The data-processing-target postscript information specifying unit 150 operates on the difference extraction result of the difference extraction unit 132, and more specifically on the data-processing-target postscript information 9a corrected by the extracted-line-segment break correction unit 148. It has a first data-processing postscript information recognition unit 154 that recognizes the content of the postscript information for first data processing in the annotated document 8B separately from the postscript information for second data processing, and a second data-processing postscript information recognition unit 164 that recognizes the content of the postscript information for second data processing separately from that for first data processing.

In the data-processing-target postscript information specifying unit 150, the first recognition unit 154 recognizes the content of the scoring symbols 87 on the annotated teaching material 81 separately from the comments 88, and the second recognition unit 164 recognizes the content of the comments 88 separately from the scoring symbols 87.

Each of the recognition units 154 and 164 has the following: figure shape recognition units 156, 166 that recognize the content of the data-processing postscript information by performing shape recognition on the information corrected by the extracted-line-segment break correction unit; character recognition units 157, 167 that recognize that content by performing character recognition on the corrected information; and entry position recognition units 158, 168 that recognize where on the original document 8A (annotated document 8B) the content recognized by the figure shape recognition units 156, 166 or the character recognition units 157, 167 was entered. The figure shape recognition units 156, 166 and the character recognition units 157, 167 together constitute a separation recognition processing unit 155 that recognizes the scoring symbols 87 and the comments 88 separately.

When the character recognition units 157, 167 are not provided, or for portions of the data-processing postscript information that the character recognition units 157, 167 could not recognize, the extracted data-processing postscript information is handled as an image as-is.

As illustrated, the figure shape recognition units 156 and 166, the character recognition units 157 and 167, and the entry position recognition units 158 and 168 may each be implemented as a single functional unit that provides the functions of both members of the pair, or each may be provided independently as a separate functional unit.

For example, in a configuration that processes the annotated teaching material 81, the first data-processing postscript information recognition unit 154 treats the scoring symbols 87 as the postscript information for first data processing and functions as a scoring symbol recognition unit. In this case, the figure shape recognition unit 156 for the scoring symbols 87 need only be able to recognize, from the standpoint of figures, whether a scoring symbol 87 indicates "correct (○)", "incorrect (×)", or "partially correct (△)". For example, it may recognize the shape by pattern matching against the figure shapes "○", "×", and "△". Alternatively, it may calculate feature quantities of the figure to be recognized and recognize the shape from them. Usable feature quantities include, for example, the number of holes and the proportion of the circumscribed rectangle's area that the figure occupies.
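The two feature quantities named above, the number of holes and the area ratio within the circumscribed rectangle, can be computed as in the following sketch. It assumes a small binary grid (1 = ink) containing a single figure. Counting holes as enclosed background regions found by flood fill is one common approach and is an assumption here, as is the use of 4-connectivity for the background.

```python
from collections import deque


def fill_ratio(grid):
    """Ink pixel count divided by the area of the circumscribed rectangle."""
    ys = [y for y, row in enumerate(grid) for v in row if v]
    xs = [x for row in grid for x, v in enumerate(row) if v]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return len(xs) / (width * height)


def count_holes(grid):
    """Count background regions that are fully enclosed by ink."""
    h, w = len(grid), len(grid[0])
    # Pad with background so the outside forms exactly one connected region.
    padded = [[0] * (w + 2)] + [[0] + list(r) + [0] for r in grid] + [[0] * (w + 2)]
    seen = [[False] * (w + 2) for _ in range(h + 2)]
    regions = 0
    for sy in range(h + 2):
        for sx in range(w + 2):
            if padded[sy][sx] or seen[sy][sx]:
                continue
            regions += 1
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h + 2 and 0 <= nx < w + 2
                            and not padded[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions - 1  # exclude the outer background


circle = [[1, 1, 1],
          [1, 0, 1],
          [1, 1, 1]]
cross = [[1, 0, 1],
         [0, 1, 0],
         [1, 0, 1]]
print(count_holes(circle), count_holes(cross))
```

On these toy shapes the circle-like figure has one hole and the cross-like figure has none, so the hole count alone already separates the two; the area ratio gives a second, independent cue.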

Similarly, the character recognition unit 157 for the scoring symbols 87 need only be able to recognize, from the standpoint of characters, whether a scoring symbol 87 indicates "correct (○)", "incorrect (×)", or "partially correct (△)". If the scoring symbols 87 are assumed to be figures only, the character recognition unit 157 can be omitted.

The entry position recognition unit 158 for the scoring symbols 87 may recognize the position at which each scoring symbol 87 was entered on the annotated teaching material 81 by, for example, coordinate analysis on the educational teaching material 80.

In the configuration of this embodiment, the recognition performance information presentation processing unit 190 collects information related to the recognition rate from each of multiple processes in the functional units of the postscript information processing apparatus 10 and, from the postscript information whose content was automatically recognized, presents only the difficult-to-recognize information with particularly low reliability to prompt correction. There is therefore no need to check every piece of postscript information for correction, and misrecognized postscript information can be corrected efficiently. As a result, the recognition rate of the recognition units 154 and 164 can of course also be improved.

なお、図形形状認識部１５６は、採点記号８７に関する形状認識の際には、「○」や「×」などの採点記号８７を示す図形を構成する連続画素群を１つに纏めて取り扱うために、その連続画素群に対して識別子を付与すべく、一般的な画像処理技術であるラベリング処理を行なう。このことから、記入位置認識部１５８による位置認識の際にも、そのラベリング処理の結果を利用して、「○」や「×」などの採点記号８７を示す図形を構成する連続画素群を１つの纏まりとして取り扱う。 Note that, when recognizing the shape of a scoring symbol 87, the figure shape recognition unit 156 performs labeling, a general image processing technique, to assign an identifier to each group of connected pixels, so that the connected pixels constituting a figure such as “○” or “×” can be handled together as one unit. Accordingly, the entry position recognition unit 158 also uses the result of this labeling when recognizing positions, handling the connected pixel group that constitutes a figure representing a scoring symbol 87, such as “○” or “×”, as a single group.
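
The labeling step mentioned above can be illustrated with a minimal connected-component labeling sketch. The text only says that a general labeling technique is used; the 8-neighbour connectivity, the binary-grid input, and the name `label_components` are assumptions made for this example.

```python
from collections import deque

def label_components(grid):
    """Assign an identifier (1, 2, ...) to each group of 8-connected ink pixels.

    Returns (labels, count), where labels has the same shape as grid and
    holds 0 for background pixels.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or labels[r][c]:
                continue
            count += 1                      # new connected pixel group found
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] and not labels[nr][nc]):
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels, count

# Assumed toy page: a diagonal stroke on the left and a vertical stroke on the right.
PAGE = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
```

Each label then identifies one mark, so later steps such as position recognition can operate on a whole mark rather than on individual pixels.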

また、記入位置認識部１５８は、採点記号８７の記入位置の認識処理に当たって、付加情報記入済教材８１上に複数の採点記号８７が記入されていることが一般的であるから、その複数の採点記号８７のそれぞれについて順次予め定められた走査順で検出される採点記号８７について、順にその位置を認識していく。 In addition, since the entry position recognition unit 158 generally recognizes the entry position of the scoring symbol 87, a plurality of scoring symbols 87 are generally entered on the additional information filled teaching material 81. For each of the symbols 87, the positions of the scoring symbols 87 detected in the predetermined scanning order are sequentially recognized.

各採点記号８７に関する位置認識は、たとえば「○」や「×」などの採点記号８７を示す図形（あるいは文字）の外接矩形情報を算出し、さらにその外接矩形の中心座標を算出することによって行なうことが考えられる。具体的には、認識対象となる図形もしくは文字（連続画素群）に対して外接矩形を抽出するとともに、その外接矩形の所定点（たとえば左上頂点）のｘｙ座標、並びに、その外接矩形の幅（ｗ）および高さ（ｈ）を算出する。そして、これらの算出結果から、中心ｘ座標＝ｘ＋ｗ／２、中心ｙ座標＝ｙ＋ｈ／２を算出し、その算出結果を連続画素群の位置、すなわち採点記号８７の記入位置の認識結果とする。 The position of each scoring symbol 87 can be recognized, for example, by calculating the circumscribed rectangle of the figure (or character) representing the scoring symbol 87, such as “○” or “×”, and then calculating the center coordinates of that rectangle. Specifically, a circumscribed rectangle is extracted for the figure or character (connected pixel group) to be recognized, and the xy coordinates of a predetermined point of the rectangle (for example, the upper-left vertex) and the rectangle's width (w) and height (h) are calculated. From these, center x coordinate = x + w/2 and center y coordinate = y + h/2 are calculated, and the result is taken as the position of the connected pixel group, that is, as the recognized entry position of the scoring symbol 87.
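
As a small worked illustration of the center calculation above, the sketch below derives the circumscribed rectangle of a connected pixel group and applies center x = x + w/2, center y = y + h/2. The pixel-list input and the convention that width and height count pixels inclusively (hence the `+ 1`) are assumptions of this example.

```python
def circumscribed_rectangle(pixels):
    """Return (x, y, w, h): upper-left vertex, width and height of the
    smallest axis-aligned rectangle containing all (x, y) pixels."""
    xs = [px for px, _ in pixels]
    ys = [py for _, py in pixels]
    x, y = min(xs), min(ys)
    w = max(xs) - x + 1   # inclusive pixel count (assumed convention)
    h = max(ys) - y + 1
    return x, y, w, h

def center_of(rect):
    """Center coordinates: (x + w/2, y + h/2)."""
    x, y, w, h = rect
    return x + w / 2, y + h / 2

# Assumed sample: three pixels belonging to one labeled mark.
MARK_PIXELS = [(10, 20), (14, 20), (12, 27)]
```

For `MARK_PIXELS` the rectangle is x = 10, y = 20, w = 5, h = 8, so the recognized position is (12.5, 24.0).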

一方、第２データ処理用追記情報認識部１６４は、コメント８８を第２のデータ処理用の追記情報とするコメント認識部として機能する。この場合、コメント８８についての図形形状認識部１６６は、コメント８８の追記内容を図形の側面から認識することができればよく、たとえば「１重線」や「２重線」や「（１重または２重の）波線」などの線を示す図形形状とのパターンマッチングによって線に関する形状認識を行なえばよい。あるいは、認識対象図形の特徴量を算出し、その特徴量から線の形状を認識してもよい。特徴量としては、たとえば、線数や外接矩形に対する画素密度などを使用することができる。 On the other hand, the second data processing additional information recognition unit 164 functions as a comment recognition unit that uses the comment 88 as the second data processing additional information. In this case, the graphic shape recognition unit 166 for the comment 88 only needs to be able to recognize the additional content of the comment 88 from the side surface of the graphic. For example, “single line”, “double line”, “(single or double) The shape of the line may be recognized by pattern matching with a graphic shape indicating a line such as a “double wavy line”. Alternatively, the feature amount of the recognition target figure may be calculated, and the line shape may be recognized from the feature amount. As the feature amount, for example, the number of lines, the pixel density for the circumscribed rectangle, or the like can be used.

また、コメント８８について文字認識処理部１６７は、付加情報記入済教材８１におけるコメント８８の記入内容を文字の側面から認識することができればよい。また、コメント８８についての記入位置認識部１６８は、たとえば、教育用教材８０上における座標解析によって、付加情報記入済教材８１上のコメント８８の追記内容の記入位置を認識すればよい。 In addition, the character recognition processing unit 167 only needs to be able to recognize the content of the comment 88 in the additional information filled teaching material 81 from the side of the character. In addition, the entry position recognition unit 168 for the comment 88 may recognize the entry position of the content to be added to the comment 88 on the additional information-added teaching material 81 by, for example, coordinate analysis on the educational material 80.

なお、図形形状認識部１６６は、コメント８８に関する形状認識の際には、「２重線」や「花丸」などのコメント８８を示す図形を構成する連続画素群を１つに纏めて取り扱うために、その連続画素群に対して識別子を付与すべく、一般的な画像処理技術であるラベリング処理を行なう。このことから、記入位置認識部１６８による位置認識の際にも、そのラベリング処理の結果を利用して、「２重線」や「花丸」などのコメント８８を示す図形を構成する連続画素群を１つの纏まりとして取り扱う。 Note that, when the shape recognition unit 166 recognizes the shape of the comment 88, the continuous shape pixel group constituting the graphic indicating the comment 88 such as “double line” and “flower circle” is handled as one. In addition, a labeling process, which is a general image processing technique, is performed to give an identifier to the continuous pixel group. Therefore, even when the position is recognized by the entry position recognizing unit 168, a group of continuous pixels constituting a figure indicating the comment 88 such as “double line” or “flower circle” is used by using the result of the labeling process. Are treated as one group.

また、記入位置認識部１６８は、コメント８８の記入位置の認識処理に当たって、付加情報記入済教材８１上に複数のコメント８８が記入されていることが一般的であるから、その複数のコメント８８のそれぞれについて順次予め定められた走査順で検出されるコメント８８について、順にその位置を認識していく。 The entry position recognition unit 168 generally has a plurality of comments 88 written on the additional information filled teaching material 81 in the process of recognizing the entry position of the comment 88. The positions of the comments 88 detected sequentially in a predetermined scanning order are sequentially recognized.

各コメント８８に関する位置認識は、たとえばコメント文や「２重線」や「花丸」などのコメント８８の文字や図形の外接矩形情報を算出し、さらにその外接矩形の中心座標を算出することによって行なうことが考えられる。具体的には、認識対象となる文字や図形（連続画素群）に対して外接矩形を抽出するとともに、その外接矩形の所定点（たとえば左上頂点）のｘｙ座標、並びに、その外接矩形の幅（ｗ）および高さ（ｈ）を算出する。そして、これらの算出結果から、中心ｘ座標＝ｘ＋ｗ／２、中心ｙ座標＝ｙ＋ｈ／２を算出し、その算出結果を連続画素群の位置、すなわちコメント８８の記入位置の認識結果とする。 The position of each comment 88 can be recognized, for example, by calculating the circumscribed rectangle of the characters or figure of the comment 88, such as a comment sentence, a “double line”, or a “hanamaru” (flower circle), and then calculating the center coordinates of that rectangle. Specifically, a circumscribed rectangle is extracted for the character or figure (connected pixel group) to be recognized, and the xy coordinates of a predetermined point of the rectangle (for example, the upper-left vertex) and the rectangle's width (w) and height (h) are calculated. From these, center x coordinate = x + w/2 and center y coordinate = y + h/2 are calculated, and the result is taken as the position of the connected pixel group, that is, as the recognized entry position of the comment 88.

また、この位置認識の際には、各コメント８８は、ある位置の解答欄８４への採点記号８７と対応して、その近傍に記入されることが多いので、記入位置認識部１５８による採点記号８７についての位置認識と協働して処理を行なうのがよい。こうすることで、双方の位置情報の各解答欄８４との対応付け、結果としては、採点記号８７とコメント８８との関連付けが容易になる。 Also, in this position recognition, each comment 88 is often written in the vicinity thereof in correspondence with the scoring mark 87 for the answer column 84 at a certain position. The process should be performed in cooperation with the position recognition for 87. By doing so, it becomes easy to associate the position information of each position information with each answer field 84, and as a result, to associate the scoring symbol 87 with the comment 88.

データ処理部１７０は、文書入力装置２０から入力された追記済文書８Ｂの画像データについて、その追記済文書８Ｂに記入された第１のデータ処理対象追記情報に関する第１のデータ処理を実行する第１データ処理部１７０_1と、第２のデータ処理対象追記情報に関する第２のデータ処理を実行する第２データ処理部１７０_2を有する。 For the image data of the added document 8B input from the document input device 20, the data processing unit 170 has a first data processing unit 170_1 that executes first data processing on the first data processing target additional information entered in the added document 8B, and a second data processing unit 170_2 that executes second data processing on the second data processing target additional information.

各データ処理部１７０_1，１７０_2は、データ処理対象の追記情報の記入位置を、当該記入欄の位置情報を保存している装置（テンプレート情報データベースＤＢ１や文書原本情報データベースＤＢ２として機能する文書管理サーバ３０）にアクセスして、記入欄位置領域情報３８、記入欄位置領域情報６８、あるいはテンプレート関連付け情報６９を参照して取得しつつ、追記情報の記入位置と追記情報とを対応付けながらデータ処理を実行する。 Each of the data processing units 170_1 and 170_2 obtains the entry position for the additional information to be processed by accessing the device that stores the position information of the entry fields (the document management server 30, which functions as the template information database DB1 and the document original information database DB2) and referring to the entry field position area information 38, the entry field position area information 68, or the template association information 69, and executes the data processing while associating each entry position with the corresponding additional information.

第１データ処理部１７０_1は、追記済文書８Ｂの一例である付加情報記入済教材８１の画像データについて、その付加情報記入済教材８１に記入された採点記号８７を元に採点集計を行なう採点集計部１７２と、採点集計部１７２による採点集計の結果を、解答者抽出部１３４が抽出した解答者情報と関連付けて出力する集計結果出力部１７４とを備えている。採点集計結果と解答者情報とが関連付けられた状態の処理結果を特に採点認識結果と称する。 The first data processing unit 170_1 performs scoring and summarization on the image data of the additional information filled teaching material 81 which is an example of the added document 8B based on the scoring symbol 87 written in the additional information filled teaching material 81. Unit 172, and a totaling result output unit 174 that outputs the result of scoring totaling by scoring totaling unit 172 in association with the answerer information extracted by answerer extracting unit 134. A processing result in a state in which the scoring result and the answerer information are associated is particularly referred to as a scoring recognition result.

採点集計部１７２は、図形形状認識部１５６による採点記号８７の追記内容の図形の側面からの認識結果や文字認識処理部１５７による採点記号８７の追記内容の文字情報の側面からの認識結果と、記入位置認識部１５８による採点記号８７の記入位置の認識結果と、文書入力装置２０が保持蓄積している付加情報記入済教材８１に対応する教育用教材８０の電子データ（原本画像）に含まれる教育用教材８０（付加情報記入済教材８１）の各解答欄８４についての配点欄８３で規定されている配点情報とに基づいて、文書入力装置２０が読み取った付加情報記入済教材８１について、付加情報記入済教材８１に記入された採点記号８７に関する採点処理および集計処理（纏めて採点集計という）を行なう。 The scoring tabulation unit 172 recognizes the result of the additional writing of the scoring symbol 87 by the graphic shape recognition unit 156 from the side of the graphic and the recognition result of the additional information of the scoring symbol 87 by the character recognition processing unit 157 from the side of the character information. The result of recognition of the entry position of the scoring symbol 87 by the entry position recognition unit 158 and the electronic data (original image) of the educational material 80 corresponding to the additional information filled educational material 81 held and accumulated in the document input device 20 are included. The additional information-added teaching material 81 read by the document input device 20 is added based on the scoring information defined in the scoring column 83 for each answer column 84 of the teaching material 80 (additional information-added teaching material 81). A scoring process and a summing process (collectively referred to as scoring summarization) regarding the scoring symbols 87 entered in the information-filled teaching material 81 are performed.

集計結果出力部１７４は、採点集計部１７２により集計された採点集計結果と解答者抽出部１３４が抽出した解答者情報と関連付けて、処理結果保存サーバ４０（処理結果データベース装置や処理結果ファイルサーバ装置など）に登録する。あるいは、採点結果の点数を付加情報記入済教材８１の集計欄８３ｂに記入し用紙上に返却答案８１ｂとして出力して生徒などに返却できるようにする。 The totaling result output unit 174 associates the scoring totaling result produced by the scoring totaling unit 172 with the answerer information extracted by the answerer extraction unit 134 and registers them in the processing result storage server 40 (a processing result database device, a processing result file server device, or the like). Alternatively, it enters the resulting score in the totaling column 83b of the additional information filled teaching material 81 and outputs it on paper as a return answer sheet 81b so that it can be returned to the student or the like.

また、第２データ処理部１７０_2は、追記済文書８Ｂの一例である付加情報記入済教材８１の画像データについて、その付加情報記入済教材８１に記入されたコメント８８を元に分類処理を行なうコメント分類処理部１７６と、コメント分類処理部１７６による分類結果を集計結果出力部１７４が出力した採点認識結果や各解答に関連付けて出力するコメント処理結果出力部１７８とを備えている。 In addition, the second data processing unit 170_2 performs a classification process on the image data of the additional information filled teaching material 81, which is an example of the added document 8B, based on the comment 88 entered in the additional information filled teaching material 81. A classification processing unit 176 and a comment processing result output unit 178 that outputs the classification result by the comment classification processing unit 176 in association with the scoring recognition result output by the total result output unit 174 and each answer are provided.

コメント分類処理部１７６は、図形形状認識部１６６によるコメント８８の追記内容の図形の側面からの認識結果や文字認識処理部１６７によるコメント８８の追記内容の文字情報の側面からの認識結果と、記入位置認識部１６８によるコメント８８の記入位置の認識結果と、コメント８８の追記内容と対応するように予め規定されている分類情報とに基づいて、文書入力装置２０が読み取った付加情報記入済教材８１について、その付加情報記入済教材８１に記入されたコメント８８の分類処理を行なう。 The comment classification processing unit 176 inputs the recognition result from the side of the figure of the additional content of the comment 88 by the graphic shape recognition unit 166 and the recognition result from the side of the character information of the additional content of the comment 88 by the character recognition processing unit 167. The additional information filled teaching material 81 read by the document input device 20 based on the recognition result of the position where the comment 88 is entered by the position recognizing unit 168 and the classification information defined in advance so as to correspond to the additional content of the comment 88. The comment 88 entered in the additional information filled teaching material 81 is classified.

コメント処理結果出力部１７８は、コメント分類処理部１７６による分類結果を各解答欄や集計結果出力部１７４が出力した採点認識結果と関連付けて、処理結果保存サーバ４０（処理結果データベース装置や処理結果ファイルサーバ装置など）に登録する。 The comment processing result output unit 178 associates the classification result produced by the comment classification processing unit 176 with each answer column and with the scoring recognition result output by the totaling result output unit 174, and registers it in the processing result storage server 40 (a processing result database device, a processing result file server device, or the like).

なお、採点記号８７の記入は、一般に教育用教材８０上の複数の解答欄８４のそれぞれに対応して行なわれ、またコメント８８の記入は、採点記号８７の記入に付随してその採点記号８７の近傍に必要に応じて記入され、かつ採点記号８７，コメント８８は教師などの採点官によって手書きでされるため、各解答欄８４に対する記入位置が必ずしも一義的に定まっている訳ではない。 Note that the scoring symbol 87 is generally entered corresponding to each of the plurality of answer fields 84 on the educational material 80, and the comment 88 is entered in conjunction with the scoring symbol 87 entry. Since the scoring symbol 87 and the comment 88 are handwritten by a scoring officer such as a teacher, the entry position in each answer column 84 is not necessarily uniquely determined.

その一方で、採点記号８７の採点集計に当たっては、各解答欄８４と採点記号８７の記入位置との対応を明確にする必要がある。採点記号８７に関する採点集計は、各解答欄８４に対応する採点記号８７の記入結果を明確にした上で、採点記号８７の内容（正解か不正解か一部正解かなど）および各解答欄８４についての配点に基づいて行なわれるからである。同様に、コメント８８についての分類処理に当たっては、各解答欄８４（つまり採点記号８７）とコメント８８の記入位置との対応を明確にする必要がある。コメント８８に関する分類処理は、各解答欄８４に対応するコメント８８の記入結果を明確にした上で、コメント８８の内容に基づいて行なわれるからである。 On the other hand, when scoring the scoring symbols 87, it is necessary to clarify the correspondence between the answer columns 84 and the entry positions of the scoring symbols 87. In the scoring for the scoring symbol 87, after clarifying the entry result of the scoring symbol 87 corresponding to each answer column 84, the contents of the scoring symbol 87 (whether the answer is correct, incorrect or partially correct) and each answer column 84 It is because it is performed based on the scoring about. Similarly, in the classification process for the comment 88, it is necessary to clarify the correspondence between each answer column 84 (that is, the scoring symbol 87) and the entry position of the comment 88. This is because the classification process related to the comment 88 is performed based on the content of the comment 88 after clarifying the entry result of the comment 88 corresponding to each answer column 84.

このことから、採点集計部１７２やコメント分類処理部１７６は、以下に述べるような手順で、採点記号８７の採点集計やコメント８８の分類処理を行なう。たとえば、採点集計部１７２やコメント分類処理部１７６は、記入位置認識部１５８，１６８で特定される「○」や「×」などの採点記号８７やコメント８８の外接矩形が、付加情報記入済教材８１上で解答欄８４となる領域との重なるものがあるか否かを判定し、重なる解答欄８４と採点記号８７やコメント８８とを互いに対応付け、その採点記号８７やコメント８８を解答欄８４に対して記入された採点記号８７やコメント８８の判定結果とする第１の対応付け手法を採用することができる。 For this reason, the scoring totaling unit 172 and the comment classification processing unit 176 perform the scoring totaling of the scoring symbols 87 and the classification of the comments 88 by the procedure described below. For example, they can adopt a first association method: determine whether the circumscribed rectangle of a scoring symbol 87 such as “○” or “×”, or of a comment 88, specified by the entry position recognition units 158 and 168 overlaps any area serving as an answer column 84 on the additional information filled teaching material 81; associate the overlapping answer column 84 with that scoring symbol 87 or comment 88; and take that scoring symbol 87 or comment 88 as the one determined to have been entered for that answer column 84.

ただし、１つの採点記号８７やコメント８８が複数の解答欄８４の領域に重なる場合には、何れに対応させるべきかを特定することはできないので、第１の対応付け手法による対応付けについての判定が不能であると判断する。また、注目する採点記号８７やコメント８８の外接矩形が、何れの解答欄８４の領域にも重ならない場合にも、何れに対応させるべきかを特定することはできないので、第１の対応付け手法による対応付けについての判定が不能であると判断する。 However, when one scoring symbol 87 or comment 88 overlaps the areas of the plurality of answer columns 84, it is not possible to specify which one should be associated with, so determination about association by the first association method Is determined to be impossible. In addition, since the circumscribed rectangle of the scoring symbol 87 or the comment 88 of interest does not overlap with any answer field 84, it cannot be specified which one should be associated with, so the first association method It is determined that the determination regarding the association by cannot be made.
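
The first association method and its two undecidable cases (overlap with several answer columns, overlap with none) might be sketched as below. Rectangles are assumed to be `(x, y, w, h)` tuples, the answer columns are assumed to be held in a dictionary keyed by a hypothetical field identifier, and `None` stands for "determination not possible".

```python
def overlaps(a, b):
    """True if two (x, y, w, h) rectangles share a region of positive area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def associate_by_overlap(mark_rect, answer_fields):
    """First association method: accept only an unambiguous single overlap."""
    hits = [field_id for field_id, rect in answer_fields.items()
            if overlaps(mark_rect, rect)]
    if len(hits) == 1:
        return hits[0]
    return None  # overlaps zero or several answer columns: cannot decide

# Assumed layout: two answer columns stacked vertically.
ANSWER_FIELDS = {"Q1": (0, 0, 100, 40), "Q2": (0, 50, 100, 40)}
```

A caller would typically fall back to another association method whenever this function returns `None`.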

また、採点集計部１７２やコメント分類処理部１７６は、記入位置認識部１５８，１６８で特定される採点記号８７やコメント８８の外接矩形と、付加情報記入済教材８１上で解答欄８４となる領域との重なり面積を求め、その面積（外接矩形に対する面積比でも同様）が最も大きくなる採点記号８７やコメント８８と解答欄８４とを互いに対応付け、その採点記号８７やコメント８８を解答欄８４に対して記入された採点記号８７やコメント８８の判定結果とする第２の対応付け手法を採用することができる。 In addition, the scoring totaling unit 172 and the comment classification processing unit 176 are a circumscribed rectangle of the scoring symbol 87 and the comment 88 specified by the entry position recognition units 158 and 168, and an area that becomes the answer column 84 on the additional information filled teaching material 81. The scoring symbol 87 and the comment 88 and the answer column 84 having the largest area (the same as the area ratio to the circumscribed rectangle) are associated with each other, and the scoring symbol 87 and the comment 88 are stored in the answer column 84. On the other hand, it is possible to employ a second matching method that makes the determination result of the scoring symbols 87 and comments 88 entered.

この第２の対応付け手法を採用すると、１つの採点記号８７やコメント８８が複数の解答欄８４の領域に重なる場合に第１の対応付け手法にては対応付けの特定ができない場合でも、重なり面積の大小に基づいて、何れに対応させるべきかを判定することができる。ただし、重なり面積の外接矩形に対する比が所定閾値未満の場合には、重なる部分が小さいことから、対応付けについての判定が不能であると判断する。 When this second association method is adopted, even if one scoring symbol 87 or comment 88 overlaps the areas of the plurality of answer columns 84, even if the association cannot be specified by the first association method, the overlapping is performed. Based on the size of the area, it can be determined which one should be handled. However, when the ratio of the overlapping area to the circumscribed rectangle is less than the predetermined threshold, it is determined that the determination regarding the association is impossible because the overlapping portion is small.
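
The second association method can be sketched in the same style: pick the answer column with the largest overlap area, and give up when the overlap is too small relative to the mark's circumscribed rectangle. The text says only that a predetermined threshold is used; the default `min_ratio=0.2` and the identifiers here are assumptions of this example.

```python
def overlap_area(a, b):
    """Area of the intersection of two (x, y, w, h) rectangles (0 if disjoint)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)

def associate_by_area(mark_rect, answer_fields, min_ratio=0.2):
    """Second association method: largest overlap wins, subject to a ratio threshold."""
    mark_area = mark_rect[2] * mark_rect[3]
    best_id, best_area = None, 0
    for field_id, rect in answer_fields.items():
        area = overlap_area(mark_rect, rect)
        if area > best_area:
            best_id, best_area = field_id, area
    if best_id is None or best_area / mark_area < min_ratio:
        return None  # no overlap, or overlap too small to trust
    return best_id

# Assumed layout: two answer columns stacked vertically.
ANSWER_FIELDS = {"Q1": (0, 0, 100, 40), "Q2": (0, 50, 100, 40)}
```

With this layout, a mark at `(80, 25, 30, 30)` overlaps both columns but is assigned to `"Q1"`, whose overlap (300 of 900 pixels) is the larger one.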

あるいは、採点集計部１７２やコメント分類処理部１７６は、記入位置認識部１５８，１６８で特定される各採点記号８７やコメント８８の中心座標位置と各解答欄８４の中心座標位置の距離を求め、その距離が最も小さくなる採点記号８７やコメント８８と解答欄８４とを互いに対応付け、その採点記号８７やコメント８８を解答欄８４に対して記入された採点記号８７やコメント８８の判定結果とする第３の対応付け手法を採用することができる。 Alternatively, the scoring totaling unit 172 and the comment classification processing unit 176 obtain the distance between the center coordinate position of each scoring symbol 87 and comment 88 specified by the entry position recognition units 158 and 168 and the center coordinate position of each answer column 84, The scoring symbol 87 or comment 88 with the smallest distance is associated with the answer column 84, and the scoring symbol 87 or comment 88 is used as the determination result of the scoring symbol 87 or comment 88 entered in the answer column 84. A third association technique can be employed.

この第３の対応付け手法を採用すると、注目する採点記号８７やコメント８８の外接矩形が何れの解答欄８４の領域にも重ならない場合に第１の対応付け手法にては対応付けの特定ができない場合や、採点記号８７やコメント８８が解答欄８４からずれて記入されて重なる部分が小さく、重なり面積の外接矩形に対する比が所定閾値未満の場合に第２の対応付け手法にては対応付けの特定ができない場合でも、何れに対応させるべきかを判定することができる。ただし、各解答欄８４との間の各距離の差が所定閾値未満の場合には、距離差が小さいことから、対応付けについての判定が不能であると判断する。 When this third matching method is adopted, the first matching method specifies the correspondence when the circumscribed rectangle of the scoring symbol 87 or the comment 88 of interest does not overlap the area of any answer column 84. If the scoring symbol 87 or the comment 88 is written out of the answer column 84 and the overlapping portion is small and the ratio of the overlapping area to the circumscribed rectangle is less than a predetermined threshold, the second association method associates. Even if it is not possible to identify the user, it is possible to determine which one should be handled. However, if the difference between the distances from the answer columns 84 is less than the predetermined threshold, it is determined that the determination about the association is impossible because the distance difference is small.
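
A minimal sketch of the third association method follows: the mark is assigned to the answer column whose center is nearest, unless the distances are too close to tell apart. Reading "the difference between the distances" as the gap between the nearest and second-nearest columns, and the default `min_gap=10.0`, are assumptions of this example.

```python
import math

def center_of(rect):
    """Center of an (x, y, w, h) rectangle."""
    x, y, w, h = rect
    return x + w / 2, y + h / 2

def associate_by_distance(mark_rect, answer_fields, min_gap=10.0):
    """Third association method: nearest center wins unless the runner-up is too close."""
    mx, my = center_of(mark_rect)
    ranked = sorted(
        (math.hypot(mx - cx, my - cy), field_id)
        for field_id, (cx, cy) in
        ((fid, center_of(rect)) for fid, rect in answer_fields.items())
    )
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[1][0] - ranked[0][0] < min_gap:
        return None  # distance difference below threshold: cannot decide
    return ranked[0][1]

# Assumed layout: two answer columns stacked vertically.
ANSWER_FIELDS = {"Q1": (0, 0, 100, 40), "Q2": (0, 50, 100, 40)}
```

Unlike the overlap-based methods, this one still yields a result for a mark written entirely outside every answer column, such as `(120, 10, 20, 20)` in the layout above.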

そして、各採点記号８７の解答欄８４への対応付けを行なった後は、採点記号８７が「○」であれば、これに対応する解答欄８４についての配点情報から特定される配点を加算し、また採点記号８７が「×」であれば、これに対応する解答欄８４についての配点加算を行なわず、このような採点集計を付加情報記入済教材８１上の全ての解答欄８４について行なう。 Then, after associating each scoring symbol 87 with the answer column 84, if the scoring symbol 87 is “◯”, the score specified from the scoring information for the corresponding answer column 84 is added. If the scoring symbol 87 is “x”, the scoring is not added to the answer field 84 corresponding to this, and such scoring is performed for all the answer fields 84 on the additional information filled teaching material 81.
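
Once each scoring symbol has been associated with an answer column, the totaling described above reduces to a simple sum. The sketch below is illustrative only: the dictionaries and the `credit` table are hypothetical, and because the passage specifies only “○” (add the allotted points) and “×” (add nothing), any credit for “△” is left to the caller.

```python
def total_score(marks, points, credit=None):
    """Sum the allotted points of every answer column marked as correct.

    marks:  answer column id -> recognized scoring symbol
    points: answer column id -> points allotted to that column
    credit: symbol -> fraction of the points to add (assumed table)
    """
    if credit is None:
        credit = {"○": 1.0, "×": 0.0}
    total = 0.0
    for field_id, symbol in marks.items():
        total += points[field_id] * credit[symbol]
    return total

# Assumed sample sheet with three answer columns.
MARKS = {"Q1": "○", "Q2": "×", "Q3": "○"}
POINTS = {"Q1": 10, "Q2": 20, "Q3": 5}
```

For the sample sheet the total is 15 points (10 for Q1 plus 5 for Q3).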

なお、付加情報記入済教材８１上で解答欄８４となる領域は、各解答欄８４についての配点情報として、または当該配点情報と同様に、付加情報記入済教材８１に対応する文書管理サーバ３０に登録されている原本画像に含まれる記入欄位置領域情報３８によって特定されるものとする。 In addition, the area that becomes the answer column 84 on the additional information filled teaching material 81 is assigned to the document management server 30 corresponding to the additional information filled teaching material 81 as the score information for each answer column 84 or in the same manner as the score information. The entry field position area information 38 included in the registered original image is specified.

また、各コメント８８の解答欄８４への対応付けを行なった後は、コメント８８に対応する分類基準から特定される分類先を特定し、このようなコメント分類処理を付加情報記入済教材８１上の全てのコメント８８について行なう。 In addition, after associating each comment 88 with the answer column 84, the classification destination identified from the classification criteria corresponding to the comment 88 is identified, and such comment classification processing is performed on the additional information filled-in teaching material 81. This is done for all the comments 88.

なお、採点集計部１７２での採点集計処理やコメント分類処理部１７６での分類処理に当たっては、完全なる自動処理にしてもよいが、ユーザ端末１７１のＣＲＴ（Cathode Ray Tube）やＬＣＤ（Liquid Crystal Display）などで構成された表示部に処理過程や処理結果を表示して、適宜、操作者が処理過程や処理結果をキーボードやマウスなどの指示入力部を介して訂正できるようにしてもよい。 Note that the scoring and summarizing process in the scoring and summarizing unit 172 and the classification process in the comment classification processing unit 176 may be complete automatic processing, but the CRT (Cathode Ray Tube) or LCD (Liquid Crystal Display) of the user terminal 171 may be used. ) Or the like may be displayed on the display unit configured so that the operator can appropriately correct the processing step or the processing result via an instruction input unit such as a keyboard or a mouse.

また、本実施形態の特徴部分である認識性能情報提示処理部１９０は、手書き入力情報の履歴を収集しデータベースとしての処理結果保存サーバ４０に保存・蓄積しておく追記情報認識履歴保持部１９２と、一定の度合いよりも認識の困難であった特に信頼度の低い難認識情報を抽出する難認識追記情報抽出部１９４と、所定の表示態様に従って認識性能情報をユーザ端末１７１上で提示する認識性能情報提示部１９８とを有している。 The recognition performance information presentation processing unit 190, which is a characteristic part of the present embodiment, collects the history of handwritten input information and stores and stores it in the processing result storage server 40 as a database. A difficult recognition additional record information extracting unit 194 for extracting difficult recognition information with particularly low reliability that is difficult to recognize than a certain degree, and a recognition performance for presenting recognition performance information on the user terminal 171 in accordance with a predetermined display mode And an information presentation unit 198.

追記情報認識履歴保持部１９２は、後述する第１〜第３の履歴収集保存手法の何れかもしくはその任意の組合せの履歴収集保存手法を採用して、手書き入力情報の履歴を収集し、この収集した手書き入力情報の履歴を処理結果保存サーバ４０に保存する。 The additional information recognition history holding unit 192 collects the history of handwritten input information by adopting a history collection and storage method of any one or any combination of first to third history collection and storage methods described later, and collects this history The history of the handwritten input information is stored in the processing result storage server 40.

手書き入力情報の履歴を収集しデータベースに保存しておく際には、どの時点で記入されたものを履歴として残しておくかによって、様々な履歴収集保存手法を採ることができる。たとえば、通常の自動データ処理の過程で認識した手書き入力情報のデータを蓄積し保存していく第１の履歴収集保存手法を採ることが考えられる。第１の履歴収集保存手法を採る際には、手書き入力情報のみを元の文書原本８Ａから分離して保存することも考えられるし、手書き入力情報を含む所定範囲の画像情報をも一緒にして、つまり周囲の画像ごとに保存することも考えられる。 When the history of handwritten input information is collected and stored in a database, various history collection and storage methods can be adopted depending on which entries, made at which point in time, are to be kept as history. For example, a first history collection and storage method may be adopted in which the handwritten input information recognized in the course of normal automatic data processing is accumulated and stored. With the first method, the handwritten input information alone may be separated from the document original 8A and stored, or it may be stored together with the image information of a predetermined range that contains it, that is, together with the surrounding image.

第１の履歴収集保存手法を採ると、通常の処理過程で履歴を取って保存していくことができるので、形状や位置や修正状況などのデータを利用することができる。また、特別な手間を掛けずに、採点法などの記入態様をチェックすることができる、しかも、認識の正誤も反映させることができる。ただし、当初は履歴が少なく、また、ユーザによる入力態様の改善も期待できないので認識率が低く、修正の手間が掛かる可能性はある。 If the first history collection and storage method is adopted, the history can be collected and stored in a normal processing process, so that data such as the shape, position, and correction status can be used. Further, it is possible to check the entry mode such as the scoring method without taking any special effort, and also to reflect the correctness of recognition. However, since the history is small at the beginning and improvement of the input mode by the user cannot be expected, the recognition rate is low, and there is a possibility that it takes time and effort for correction.

また、通常の自動データ処理の過程ではなく、その処理開始前に、練習用答案などの練習用の文書原本８Ａに入力してもらい、その入力情報のデータを蓄積し保存しておく第２の履歴収集保存手法を採ることが考えられる。この第２の履歴収集保存手法を採ると、練習用の記入情報を指定することができるから、修正を考慮しなくてよい。用意された練習用答案などに追記し、通常の自動採点処理を行なう過程で、記入形状も指定されるからである。また、図形形状や文字形状と位置を判定し、認識の正誤も含めて結果を提示できる利点もある。 Also, it is not a normal automatic data processing process, but before the processing is started, the input is made to the practice document original 8A such as a practice answer, and the input information data is accumulated and stored. It is possible to adopt a history collection and storage method. If this second history collection and storage method is adopted, entry information for practice can be designated, so that correction need not be considered. This is because, in the process of adding to the prepared practice answer and performing the normal automatic scoring process, the entry form is also specified. In addition, there is an advantage that a figure shape, a character shape, and a position can be determined, and the result can be presented including correct or incorrect recognition.

加えて、事前に採点法などの記入態様を練習することができるので、正しい形と位置とを確認でき、認識性能のよい状態での記入が期待でき、その結果として、実際の自動データ処理時には修正を要しない利点が得られる。また、白紙部分に記入させることもできるので、指定された記入情報に対する手書き入力情報のみを保存すればよいので、蓄積容量を低減できる。 In addition, because the entry style, such as the scoring method, can be practiced in advance, the correct shape and position can be confirmed and entries that are easy to recognize can be expected; as a result, there is the advantage that no correction is needed during actual automatic data processing. Furthermore, because the entries can be made on a blank area, only the handwritten input information for the designated entry information needs to be stored, which reduces the required storage capacity.

また、自動データ処理には使用しなかった過去の追記済文書８Ｂ（たとえば過去の採点答案）における入力情報のデータを蓄積し保存する第３の履歴収集保存手法を採ることが考えられる。なお、自動データ処理を行なっていないので、白紙の文書原本８Ａを付加して読み取ることで差分を抽出して、手書き入力情報のみを抽出するようにする。また、自動データ処理の全行程は行なわずに、形状認識や文字認識のみを行ない、結果を提示するだけにすればよい。 Further, it is conceivable to adopt a third history collection and storage method for accumulating and storing input information data in a past additionally written document 8B (for example, a past scoring answer) that has not been used for automatic data processing. Since automatic data processing is not performed, the difference is extracted by adding and reading the blank document original 8A, and only the handwritten input information is extracted. In addition, it is only necessary to perform shape recognition and character recognition without presenting the entire process of automatic data processing and present the result.
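
The difference extraction mentioned above (reading the blank document original together with the filled document and keeping only what was added) might look like the following in its simplest form. This sketch assumes both scans are already binarized, have the same size, and are perfectly aligned; real scans would need registration first, which is not shown. The name `extract_handwriting` is hypothetical.

```python
def extract_handwriting(filled, blank):
    """Keep only the pixels that are ink in the filled scan but not in the blank original."""
    return [
        [1 if f and not b else 0 for f, b in zip(filled_row, blank_row)]
        for filled_row, blank_row in zip(filled, blank)
    ]

# Assumed toy scans: the blank original has a printed line in the top row,
# and the filled copy adds a handwritten stroke in the bottom row.
BLANK = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
FILLED = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
```

The result contains only the added stroke, which can then be passed to shape or character recognition.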

この第３の履歴収集保存手法を採ると、過去の追記文書を利用できるため、はじめて自動データ処理を使用する場合でも、手書き入力情報に関して認識性能を改善させるための修正手法を提示することができ、実際の自動データ処理時には修正が少なくて済む利点が得られる。また、練習用の文書原本８Ａを作成する手間を省くことができる。ただし、結果の正誤は不明のため、正誤を踏まえた判定はできない。 By adopting this third history collection and storage method, past appended documents can be used, so even when automatic data processing is used for the first time, a correction method for improving recognition performance with respect to handwritten input information can be presented. In the actual automatic data processing, there is an advantage that less correction is required. Further, it is possible to save the trouble of creating the original document 8A for practice. However, since the correctness of the result is unknown, determination based on correctness cannot be made.

難認識追記情報抽出部１９４は、後述する各種の手法の何れかもしくはその任意の組合せの難認識追記情報抽出手法を採用して、各種の手書き入力情報の内、信頼度が一定レベルよりも低い自動認識が困難であった難認識情報を抽出する。 The difficult recognition additional record information extraction unit 194 employs a difficult recognition additional record information extraction method of any of the various methods described later or any combination thereof, and the reliability of the various handwritten input information is lower than a certain level. Extract difficult recognition information that was difficult to recognize automatically.
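
The extraction of hard-to-recognize information can be pictured as a threshold filter over per-item reliabilities. The concrete extraction methods are described later in the document and are not reproduced here; in this sketch the per-process reliabilities, the rule of taking their minimum as the final reliability, the threshold `0.6`, and all identifiers are assumptions chosen only to make the idea concrete.

```python
def hard_to_recognize(items, threshold=0.6):
    """Return (item id, final reliability) for items below the threshold,
    least reliable first.

    items: item id -> {process name: reliability in [0, 1]}
    """
    flagged = []
    for item_id, sub_reliabilities in items.items():
        final = min(sub_reliabilities.values())  # assumed combining rule
        if final < threshold:
            flagged.append((item_id, final))
    return sorted(flagged, key=lambda pair: pair[1])

# Assumed sample: reliabilities reported by shape and position recognition.
ITEMS = {
    "mark-1": {"shape": 0.95, "position": 0.90},
    "mark-2": {"shape": 0.40, "position": 0.80},
    "comment-1": {"shape": 0.70, "position": 0.55},
}
```

Only the flagged items would then be presented to the user for correction, so the remaining items need no manual check.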

認識性能情報提示部１９８は、後述する各種の手法の何れかの認識性能情報提示手法を採用して、認識性能情報をユーザ端末１７１上でユーザに提示する。 The recognition performance information presentation unit 198 employs any of the various recognition performance information presentation methods described later, and presents the recognition performance information to the user on the user terminal 171.

＜全体の処理手順＞ <Overall procedure>
図３および図４は、情報処理システムの一実施形態である教材自動採点システム１における教材処理方法の処理動作の手順を説明する図である。ここで、図３は、その全体概要をシステム構成図と対応付けて示しており、また図４は、教材処理手順を示すフローチャートである。 FIGS. 3 and 4 are diagrams explaining the procedure of the processing operation of the teaching material processing method in the teaching material automatic scoring system 1, which is an embodiment of the information processing system. FIG. 3 shows the overall outline in association with the system configuration diagram, and FIG. 4 is a flowchart showing the teaching material processing procedure.

First, the educational teaching material 80 is completed and registered in the document original information database DB2 (S104). When a template 6 in which no point allocation is set for the entry fields 6a is used, point allocation information is also set for each entry field 6a designated as an answer field 84.

Thereafter, when a test is administered, the educational teaching material 80 is read from the document original information database DB2, printed, and distributed to students or examinees (S106). After the test, the grader adds scoring symbols 87 and comments 88 to the students' answers (S108).

When the postscript information processing apparatus 10 (which corresponds to a teaching material processing apparatus in the automatic teaching material scoring system 1) is used, the starting point is a filled-in teaching material 81 that carries two kinds of additional information. The first kind is entered by a student or the like: the name and similar details in the answerer information field 86 and the answers in the answer fields 84. The second kind is entered by a teacher or the like: scoring symbols 87 such as "○" and "×" and comments 88 on the answers entered in the answer fields 84. The document input device 20 reads the filled-in teaching material 81 (S110) and inputs image data representing it to the postscript information processing apparatus 10 (S112). The image data obtained by this reading is temporarily held in a memory or the like used as a work area.

If the document input device 20 uses an ADF (automatic document feeder), a plurality of filled-in teaching materials 81 that should be processed together as one group, such as those of one class, can be read in a single batch scan, and the image data for each filled-in teaching material 81 can be input to the postscript information processing apparatus 10 in succession.

For the image data of each filled-in teaching material 81 received from the document input device 20, the postscript information processing apparatus 10 sequentially executes additional information extraction and separation processing, additional information identification processing, and final-stage processing of the additional information, namely automatic scoring and automatic comment processing, as follows.

For example, prior to the automatic scoring and automatic comment processing in the data processing unit 170, the image data analysis unit of the read image processing unit 110 analyzes the image data obtained from one filled-in teaching material 81 (S122), and the teaching material specifying unit 122 identifies, based on the analysis result, the original educational teaching material 80 corresponding to that filled-in teaching material 81 (S124).

This identification (S124) can be performed through title analysis by the identification information analysis unit, for example of a title such as "Science", "Grade 5", "1. Changes in Weather and Temperature", or through code analysis by the code information analysis unit of the code information embedded in the identification information field 85. Through this identification, the teaching material specifying unit 122 can specify the electronic data (original image) of the educational teaching material 80 to be compared with the image data of the filled-in teaching material 81 obtained by the document input device 20.

This identification could be performed in turn for each of the filled-in teaching materials 81 read by the document input device 20. In general, however, the filled-in teaching materials 81 processed together as one group are all the same teaching material, so it suffices to perform the identification only for the first one processed in the batch.

When the teaching material specifying unit 122 has specified the educational teaching material 80 corresponding to each filled-in teaching material 81, the document management server 30 retrieves, in accordance with that result, the original image (electronic data) of the relevant educational teaching material 80 from its stored data and passes it to the difference extraction unit 132 (S126).

The distortion correction unit corrects distortion in the image data obtained from a filled-in teaching material 81 (S128). This correction compensates for image distortion that can arise when the document input device 20 reads the image, and is intended to improve the accuracy of the subsequent comparison with the original image and the difference extraction performed by the difference extraction unit 132.

The difference extraction unit 132 compares the original image (educational teaching material 80) passed from the document management server 30 with the image data (filled-in teaching material 81) input from the document input device 20 and corrected by the distortion correction unit, and extracts the difference between them (S130). The difference extraction unit 132 passes the extracted difference information 9 to the answerer extraction unit 134 and the data-processing postscript information extraction processing unit 140.
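The difference extraction of S130 can be pictured with a minimal sketch. It assumes that both images are already aligned, have the same size, and are held as nested lists of RGB tuples; the function name `extract_difference`, the `tolerance` parameter, and the white background value are illustrative assumptions and do not appear in the specification.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def extract_difference(original: Image, scanned: Image,
                       tolerance: int = 30,
                       background: Pixel = (255, 255, 255)) -> Image:
    """Keep scanned pixels that differ from the original; blank out the rest.

    Hypothetical helper: the per-channel tolerance absorbs small scanner
    noise, which is an assumption rather than a value from the source.
    """
    same_shape = len(original) == len(scanned) and all(
        len(o) == len(s) for o, s in zip(original, scanned))
    if not same_shape:
        raise ValueError("images must have identical dimensions")

    diff: Image = []
    for orig_row, scan_row in zip(original, scanned):
        row = []
        for o, s in zip(orig_row, scan_row):
            # A pixel counts as "added" when any channel moved past the tolerance.
            changed = max(abs(a - b) for a, b in zip(o, s)) > tolerance
            row.append(s if changed else background)
        diff.append(row)
    return diff
```

In this sketch, printed content that is present in both images drops out, and only handwriting added after printing survives, which corresponds to the difference information 9.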

As shown, for example, in the central part of FIG. 3, this difference extraction yields difference information 9 consisting only of the first-type additional information entered by the answerer in the answerer information field 86 and the answer fields 84, and the second-type additional information entered by the grader, such as the scoring symbols 87 and comments 88 for the answer fields 84.

The answerer extraction unit 134 extracts the answerer information in the filled-in teaching material 81 read by the document input device 20, for example through character recognition of the difference information 9 by the character recognition processing unit (S132). This makes it possible to identify the class, attendance number, name, and so on of the person who entered the answers in that filled-in teaching material 81.

In the data-processing postscript information extraction processing unit 140, the postscript writing member specifying unit 141 first specifies the postscript color, that is, the pen color used to enter the postscript information subject to data processing (S141). Based on the specified postscript color, the data-processing-target postscript information extraction unit then extracts the postscript information for data processing from the difference extraction result of the difference extraction unit 132 (S142).

In this example, to extract the scoring symbols 87 and comments 88 added to the answer fields 84, only the portion of the difference information 9 having a predetermined color component, specifically a red component for example, is extracted. When the difference extraction result consists of pixel data, for example, the predetermined color component can be extracted by examining the color component data that makes up each pixel.
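As a rough illustration of extracting a predetermined color component from pixel data, the following sketch marks pixels whose red channel is strong while green and blue are weak. The threshold values and the names `is_red` and `extract_red_mask` are assumptions made for illustration; the specification does not give concrete values.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def is_red(pixel: Pixel, min_red: int = 150, max_other: int = 100) -> bool:
    """Return True when the pixel looks like red pen ink (assumed thresholds)."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def extract_red_mask(diff_image: List[List[Pixel]]) -> List[List[bool]]:
    """Build a boolean mask of the red-component pixels in the difference image."""
    return [[is_red(p) for p in row] for row in diff_image]
```

A white background pixel has a high red channel as well, so the check on the green and blue channels is what keeps it out of the mask.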

The extracted line break correction unit applies postscript information shaping processing, such as thinning, end point extraction, end-point connection, or straight-line approximation of line figures, to the extraction result of the data-processing-target postscript information extraction unit as appropriate (S146). It passes the break-corrected extraction result for the scoring symbols 87 to the first data-processing postscript information recognition unit 154, which functions as a scoring symbol recognition unit, and the break-corrected extraction result for the comments 88 to the second data-processing postscript information recognition unit 164, which functions as a comment recognition unit.

In the data-processing-target postscript information specifying processing unit 150, the figure shape recognition units 156 and 166 and the character recognition processing units 157 and 167, which constitute the separation recognition processing unit 155, first cooperate to separate the scoring symbols 87 from the comments 88 by referring to the position information of the answer fields 84 stored in the document management server 30 (S162). Processing to specify the entered content and the entry position is then executed separately for the separated scoring symbols 87 and comments 88.

For a scoring symbol 87 separated from the comments 88 (S163, scoring symbol branch), the figure shape recognition unit 156 and the character recognition processing unit 157 perform shape recognition or character recognition on the entered content and specify the grader's scoring result indicated by the scoring symbol 87, such as whether it means "correct" or "incorrect" (S164). At this point, a correction instruction from the user is accepted (S165). The entry position recognition unit 158 then recognizes the position at which the scoring symbol 87 was entered on the filled-in teaching material 81 (S166).

After the entry position recognition unit 158 has recognized the entry position of the scoring symbol 87, the scoring totaling unit 172 performs scoring and totaling (S168) based on three inputs: the recognition result for the content of the scoring symbol 87 from the figure shape recognition unit 156 and the character recognition processing unit 157; the recognition result for the entry position of the scoring symbol 87 from the entry position recognition unit 158; and the point allocation information for each answer field 84 of the educational teaching material 80, which is included in the original image (educational teaching material 80) corresponding to the filled-in teaching material 81 and held in the document management server 30.

The totaling result output unit 174 stores the scoring and totaling results in the processing result storage server 40 (S169). Alternatively, the resulting score is entered in the totaling field 83b of the filled-in teaching material 81, which is returned to the student or the like as a returned answer sheet 81b.

The scoring result for each filled-in teaching material 81 (the per-question scoring result) takes, for example, the table format shown in FIG. 3, which associates the number of each question on the filled-in teaching material 81, the correct/incorrect determination for the answer to that question, and the score based on that determination. The totaling result likewise takes, for example, the table format shown in FIG. 3, which associates the attendance number and answerer information with the score information (the item scores and total score entered in the totaling field 83b).
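The scoring and totaling of S168 and the per-question result table can be sketched as follows. This is only one possible reading: it assumes each answer field is an axis-aligned rectangle and that a scoring symbol belongs to the field containing its center point. The class names `AnswerField` and `RecognizedSymbol` and the function `score_sheet` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AnswerField:
    question: int
    box: Tuple[int, int, int, int]  # (left, top, right, bottom), assumed layout
    points: int                     # point allocation for this answer field


@dataclass
class RecognizedSymbol:
    label: str                      # "correct" or "incorrect" (recognition result)
    position: Tuple[int, int]       # recognized entry position (x, y)


def score_sheet(fields: List[AnswerField], symbols: List[RecognizedSymbol]):
    """Return (per-question rows, total score) for one filled-in sheet."""
    rows: List[Tuple[int, Optional[str], int]] = []
    for field in fields:
        left, top, right, bottom = field.box
        label = None
        for symbol in symbols:
            x, y = symbol.position
            if left <= x <= right and top <= y <= bottom:
                label = symbol.label
                break
        # Points are awarded only when the grader marked the answer correct.
        earned = field.points if label == "correct" else 0
        rows.append((field.question, label, earned))
    total = sum(earned for _, _, earned in rows)
    return rows, total
```

Each row holds the question number, the correct/incorrect determination, and the score, mirroring the per-question table; a field with no symbol is reported with `None` and zero points so that it can be reviewed.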

The scoring and totaling results for the correct/incorrect marks entered on each filled-in teaching material 81 are output to a file as per-question scoring results, and the per-question totals are also output to a file. The processing result storage server 40 can therefore manage or use the scoring and totaling results for the filled-in teaching materials 81, for example in list form.

For a comment 88 separated from the scoring symbols 87 (S163, comment branch), the figure shape recognition unit 166 and the character recognition processing unit 167 perform shape recognition or character recognition on the entered content and specify the content of the comment 88 added by the grader (S170). At this point, a correction instruction from the user is accepted (S171). The entry position recognition unit 168 then recognizes the position at which the comment 88 was entered on the filled-in teaching material 81 (S172).

After the entry position recognition unit 168 has recognized the entry position of the comment 88, the comment classification processing unit 176 classifies the comment 88 (S174) based on the recognition result for its content from the figure shape recognition unit 166 and the character recognition processing unit 167, the recognition result for its entry position from the entry position recognition unit 168, and the classification criterion information 39 held in the document management server 30.

The comment processing result output unit 178 associates each comment 88 classified by the comment classification processing unit 176 with the scoring result of the scoring symbol 87 located near it (S178) and stores it in the processing result storage server 40 (S179). The classification result for each comment 88 takes, for example, the table format shown in FIG. 3, which associates each comment with the nearby scoring symbol 87. Storage need not actually be in table form; each comment 88 may instead be stored together with association information that links the comment 88 to the corresponding scoring and totaling result.
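The association of S178 can be sketched as a nearest-neighbor lookup. The sketch assumes that "near" means the smallest Euclidean distance between the recognized entry positions, which the specification does not state explicitly; the function name `associate_comments` and the tuple layouts are likewise illustrative.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Symbol = Tuple[int, str, Point]   # (question number, scoring result, position)
Comment = Tuple[str, Point]       # (recognized comment text, position)


def associate_comments(
    comments: List[Comment], symbols: List[Symbol]
) -> List[Tuple[str, Optional[int], Optional[str]]]:
    """Link each comment to the scoring symbol whose position is closest."""
    table = []
    for text, comment_pos in comments:
        nearest = min(symbols,
                      key=lambda s: math.dist(comment_pos, s[2]),
                      default=None)
        if nearest is None:
            # No scoring symbol on the sheet: keep the comment unlinked.
            table.append((text, None, None))
        else:
            question, result, _ = nearest
            table.append((text, question, result))
    return table
```

Each returned row pairs a comment with a question number and its scoring result, which is one way to realize the table that relates comments to nearby scoring symbols.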

Because the data is stored so that the scoring and totaling results for the scoring symbols 87 entered on each filled-in teaching material 81 correspond to the comments 88, the processing result storage server 40 can manage or use the scoring and totaling results and the comments 88 together, for example in list form, and the comments 88 can be used for analyzing student ability and for student guidance.

As described above, the automatic teaching material scoring system 1, shown as one embodiment of the information processing system, compares the image data read from a filled-in teaching material 81 bearing scoring symbols 87 and comments 88 with the data of the original educational teaching material 80, that is, the teaching material without the first-type additional information entered by a student or the like (such as answers in the answer fields 84) and without the second-type additional information entered by the grader (such as scoring symbols 87 and comments 88 on the answers). From the difference between the two, the system separates the scoring symbols 87 and the comments 88, specifies their content, and executes scoring and totaling for the scoring symbols 87 and classification for the comments 88.

In the data processing for the scoring symbols 87, the content of the comments 88 is therefore excluded from the difference result extracted by the difference extraction unit 132, and only the scoring symbols 87 are separated and their content specified. Even if both were written with the same pen, automatic scoring is not adversely affected.

In addition, because the scoring results can be totaled automatically, the scoring work for the filled-in teaching materials 81 is reduced. When the filled-in teaching material 81 is obtained on paper, processing can be based on the image data read from it by the document input device 20. The system can therefore be built simply from a scan function, provided for example by a copier, multifunction machine, or scanner, and the information storage, image processing, and arithmetic processing functions of a computer such as a personal computer (PC); no dedicated equipment is required.

Furthermore, because the image data of the filled-in teaching material 81 is compared with electronic data held by the document management server 30, storing electronic data for a variety of educational teaching materials 80 in the document management server 30 ensures that a wide range of filled-in teaching materials 81 can be handled. Storing the electronic data in the document management server 30 in advance also removes the need to input the electronic data to be compared each time a comparison with the image data acquired from the document management server 30 is performed, which results in faster scoring.

Likewise, in the data processing for the comments 88, the content of the scoring symbols 87 is excluded from the difference result extracted by the difference extraction unit 132, and only the comments 88 are separated and their content specified. Even if both were written with the same pen, classification of the comments 88 is not adversely affected. Moreover, because each comment 88 is stored in the processing result storage server 40 in association with its scoring symbol 87, the comments 88 can be used not only by students to review them but also by the grader for ability analysis and student guidance.

<<Recognition Performance Information Presentation Processing>>
FIGS. 5 to 16 illustrate details of the processing of the recognition performance information presentation processing unit 190, namely methods for keeping a history of handwritten input information such as handwritten characters and figures, and methods for presenting the user with recognition performance information for improving recognition performance.

In this embodiment, it is determined whether the handwritten input information entered by hand on the document original 8A is suitable for automatic recognition. For input that poses a problem for recognition performance, support information (recognition performance information) is presented that lets users judge for themselves how to write so that the recognition rate improves.

The recognition performance information may be anything that lets the user determine, concretely and easily, how to correct the writing to improve recognition performance, and it can be presented in various ways. For example, information suggesting what to fix and how may be presented. This makes it easy to determine how to write so that recognition performance improves.

Alternatively, the recognition reliability of past entries may be presented. This makes it easy to judge what level of recognition performance a given way of writing achieves, and on that basis the user can write in a way that is recognized better.

Whichever presentation mode is adopted, the user can be encouraged to enter handwritten input suited to recognition, and as a result the recognition rate can be improved.

In particular, in this embodiment the difficult-to-recognize postscript information extraction unit 194 collects information related to the recognition rate from a plurality of processes in the functional units of the postscript information processing apparatus 10 and specifies, among the postscript information whose content was automatically recognized, only the difficult-to-recognize information of particularly low reliability. For the difficult-to-recognize information so specified, the recognition performance information presentation unit 198 presents recognition performance information for improving its recognition and prompts the user to correct it.

By presenting only the low-reliability, difficult-to-recognize information for which automatic recognition performs particularly poorly, the number of checks is reduced. There is no need to check every item of postscript information for whether correction is required, so misrecognized postscript information can be corrected efficiently.

In addition, the difficult-to-recognize information is specified using not only the feature quantities related to the reliability of the recognition processing itself but also the feature quantities related to the reliability of the various processes that precede and bear on the recognition processing. The postscript information that needs correction (difficult-to-recognize information) can therefore be specified with high accuracy.

When recognition performance information for handwritten input is presented to the user, a history of the handwritten input information is first kept. With reference to the reliability obtained when each item was recognized, recognition performance information is presented for the items with the stronger need for improvement. In other words, recognition performance information is presented only for handwritten input information that was harder to recognize than a certain degree (difficult-to-recognize information). This keeps recognition performance at or above a certain level. It is sufficient for recognition performance to reach that level; the handwritten input information need not match the ideal reference information exactly.

Handwritten input almost always deviates from the reference information. If recognition performance information were presented for every item of handwritten input entered in the past, more information than recognition performance requires might be presented. Limiting the presentation to difficult-to-recognize information prevents this. Correction work can then be carried out efficiently, and the risk of correcting the user's handwriting more than necessary is avoided.

Writing habits generally differ from user to user, and so does the improvement that should be presented. When a history of handwritten input information is kept, it is therefore preferable to keep it separately for each user.
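A per-user history of this kind can be sketched as follows. The class `HandwritingHistory`, its method names, and the use of a numeric threshold are assumptions made for illustration; the specification only states that a history is preferably kept for each user and that items with the stronger need for improvement are presented.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class HandwritingHistory:
    """Hypothetical per-user store of (entry, recognition reliability) pairs."""

    def __init__(self) -> None:
        self._by_user: Dict[str, List[Tuple[str, float]]] = defaultdict(list)

    def record(self, user_id: str, entry: str, reliability: float) -> None:
        """Append one recognized handwritten entry to the user's history."""
        self._by_user[user_id].append((entry, reliability))

    def needs_feedback(self, user_id: str,
                       threshold: float) -> List[Tuple[str, float]]:
        """Return the user's entries below the threshold, least reliable first."""
        history = self._by_user.get(user_id, [])
        hard = [(entry, rel) for entry, rel in history if rel < threshold]
        return sorted(hard, key=lambda item: item[1])
```

Keeping the histories separate means that one user's writing habits do not influence the feedback shown to another user.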

<Extraction of Difficult-to-Recognize Information Based on Reliability Information>
When recognition performance information is to be presented only for difficult-to-recognize information, that is, information harder to recognize than a certain degree, the separation recognition processing unit 155 performs its normal recognition processing, and the difficult-to-recognize postscript information extraction unit 194 extracts the difficult-to-recognize information based on the reliability obtained during that recognition processing, in other words the similarity to the candidate information (reference information).

In this embodiment, for each item of postscript information of interest, the difficult-to-recognize postscript information extraction unit 194 acquires feature quantities related to the reliability of the recognition processing from each of the processes that bear on that recognition, including the recognition processing itself in the data-processing-target postscript information specifying processing unit 150 (for example, the processes of the functional units from the difference extraction unit 132 through the data-processing-target postscript information specifying processing unit 150). Based on these feature quantities it calculates a sub-reliability for each process, specifies a final reliability for the item from the calculated sub-reliabilities, and specifies difficult-to-recognize information by determining whether the final reliability is below a certain level.

That is, the unit collects not only the reliability of the recognition processing in the separation recognition processing unit 155 and the entry position recognition units 158 and 168, but also information (feature quantities) related to the recognition rate and reliability of each process in the preceding functional units. It calculates a sub-reliability for each process from the collected feature quantities and obtains the final reliability from those per-process sub-reliabilities. Difficult-to-recognize information requiring correction is then identified on the basis of this final reliability.

The mechanism described in Japanese Patent Application Laid-Open No. 2004-152115 calculates a certainty factor for the content of each of a plurality of items included in acquired data and dynamically changes how corrections are presented according to the calculated certainty factors. However, that mechanism calculates the certainty factor solely from the pattern-recognition similarity obtained when each item is recognized. It therefore cannot take into account events that affect recognition, such as the presence or absence of local noise, that can arise in the various preprocessing steps preceding the recognition processing.

In contrast, the difficult recognition additional record information extraction unit 194 of the present embodiment also acquires feature quantities related to recognition from the functional units preceding the data processing target additional information specifying processing unit 150 and calculates the similarity comprehensively to identify difficult-to-recognize information. It can therefore identify difficult-to-recognize information more accurately than when the identification relies only on the recognition similarity in the data processing target additional information specifying processing unit 150 or on a certainty factor based on that similarity.

In other words, feature quantities related to recognition reliability are acquired not only from the recognition similarity in the data processing target additional information specifying processing unit 150 but also from various processes including those preceding that unit. A sub-reliability is calculated for each process from these feature quantities, the final reliability of the additional information of interest is determined from the sub-reliabilities, and difficult-to-recognize information is identified by judging whether the final reliability is lower than a certain level. Difficult-to-recognize information can therefore be identified accurately, taking into account factors such as local noise that can arise in the preprocessing steps preceding the recognition processing in the data processing target additional information specifying processing unit 150.

<History collection processing>
When the per-process sub-reliability of each functional unit is calculated, feature quantities related to the recognition rate and reliability of each process in the preceding functional units are collected from the entire sequence of steps leading to registration of the recognition results of the separation recognition processing unit 155 in the database.

Recognition rate and reliability can show different tendencies depending on the type of the postscript information itself (character, figure, or line) and its shape, and they can also depend on the state of the postscripted document 8B being processed. So that the final reliability can reflect these factors, past statistical information is retained, and the reliability is calculated from the sub-reliabilities and the past statistical information.

For example, the recognition rate (reliability) may differ for each scoring symbol 87: in past automatic data processing, the recognition rate (reliability) of "○" was roughly 0.95, while that of "×" was roughly 0.9.

The recognition rate (reliability) is also affected by whether the postscripted document 8B contains photographs or figures (clip art). Errors are particularly likely where a mark touches a photograph or figure (clip art), so the recognition rate (reliability) can differ greatly between such parts and the rest of the document. For this reason, it must be possible to distinguish where in the additional-information-filled teaching material 81 (educational teaching material 80), which serves as the postscripted document 8B (original document 8A), postscript information such as the scoring symbol 87 and the comment 88 is written.

For example, FIG. 5 illustrates one technique for collecting feature quantities related to the recognition rate and reliability of each process. In the example of FIG. 5, a question that uses a photograph (question 4) is included, and its answer column 84 is placed close to the photograph. A history of the recognition rate (reliability) is therefore kept for each question of the postscripted document 8B (additional-information-filled teaching material 81); for example, the reliability of the scoring symbol 87 for question 4 is 0.7, while that of the scoring symbols 87 in the other answer columns 84 is 0.9 to 0.95.
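The per-question history described above can be pictured with a small sketch. This is only an illustration under assumptions: the class name `ReliabilityHistory`, the key layout (question, symbol), and the use of a simple average are hypothetical and are not specified in the description.

```python
from collections import defaultdict
from statistics import mean
from typing import Optional


class ReliabilityHistory:
    """Hypothetical store of past reliabilities, keyed by (question, symbol)."""

    def __init__(self) -> None:
        self._records = defaultdict(list)

    def record(self, question: str, symbol: str, reliability: float) -> None:
        # Keep every observed reliability so statistics can be derived later.
        self._records[(question, symbol)].append(reliability)

    def average(self, question: str, symbol: str) -> Optional[float]:
        # Returns None when nothing has been recorded for this key.
        values = self._records.get((question, symbol))
        return mean(values) if values else None


history = ReliabilityHistory()
history.record("Q4", "circle", 0.7)    # answer column next to a photograph
history.record("Q1", "circle", 0.95)
history.record("Q1", "circle", 0.9)
```

How such statistics are combined with the sub-reliabilities is left open here, since the description does not fix a formula.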

<Reliability information collection processing and reliability integration processing>
To obtain the final reliability from the per-process sub-reliabilities, that is, to integrate the per-process sub-reliabilities, the difficult recognition additional record information extraction unit 194 may adopt, for example, a first integration method in which, for the postscript information of interest (scoring symbol 87 or comment 88), the lowest reliability among the processes is taken as the final reliability. Repeating this first integration method for all postscript information extracts the postscript information whose sub-reliability value is lower than a certain level. The method of calculating the sub-reliability of each process is described later under the third integration method.

The recognition performance information presentation unit 198 presents recognition performance information to the writer so as to prompt correction of the postscript information, extracted by the difficult recognition additional record information extraction unit 194, whose sub-reliability value is lower than a certain level. Because the writer is prompted to correct only postscript information whose recognition performance falls short of a certain level, there is no need to check every piece of postscript information for whether correction is required, and the work of correcting postscript information that may be misrecognized becomes more efficient.

The first integration method can cope with cases in which a single process commits an error that seriously affects the recognition result. For example, if the lower half of a triangle overlaps a photograph and the difference extraction process fails to extract that lower half, the remaining strokes may, depending on how the mark was written, look like a cross, and the subsequent processes may judge it to be a cross "with confidence". With the first integration method, the sub-reliability of the difference extraction process is taken into account, so such an error can be avoided.
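A minimal sketch of the first integration method follows, using the triangle-over-photograph case as input. The stage names, the numeric sub-reliabilities, and the 0.7 threshold are illustrative assumptions; only the rule "take the lowest sub-reliability as the final reliability" comes from the description.

```python
from typing import Dict, Tuple


def integrate_min(sub_reliabilities: Dict[str, float]) -> Tuple[str, float]:
    """Return the weakest stage and its sub-reliability as the final reliability."""
    stage = min(sub_reliabilities, key=sub_reliabilities.get)
    return stage, sub_reliabilities[stage]


# Hypothetical values: extraction lost half the triangle, yet the recognizer
# is "confident" that the remaining strokes form a cross.
triangle_over_photo = {
    "difference_extraction": 0.3,
    "break_correction": 0.8,
    "shape_recognition": 0.95,
}

weakest_stage, final_reliability = integrate_min(triangle_over_photo)
THRESHOLD = 0.7  # assumed level below which correction is prompted
needs_correction = final_reliability < THRESHOLD
recognition_only_flag = triangle_over_photo["shape_recognition"] < THRESHOLD
```

With these numbers the mark is flagged because of the difference extraction stage, whereas a check on the recognition stage alone would let it pass.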

Alternatively, the difficult recognition additional record information extraction unit 194 may adopt a second integration method in which, for the postscript information of interest (scoring symbol 87 or comment 88), the per-process sub-reliabilities (the results of the plurality of processes) are weighted and summed to calculate the final reliability. Repeating this second integration method for all postscript information extracts the postscript information whose final reliability value is lower than a certain level. The method of calculating the per-process sub-reliabilities with weighting taken into account is described later under the third integration method.

As one example of weighting, predetermined values may be assigned at random, regardless of the processing order. Alternatively, the weights may be changed gradually according to the processing order, becoming heavier toward the upstream processes or, conversely, heavier toward the downstream processes.

Weighting the upstream processes more heavily emphasizes the initial processing, and is therefore effective when, as in FIG. 5, an upstream stage of the processing is expected to affect recognition strongly. Weighting the downstream processes more heavily emphasizes the processing performed after distortion and the like have been corrected, and is therefore effective when the shape of the written figure itself is the main cause (for example, a rounded triangle).
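The order-dependent weighting can be sketched as follows. The linear ramp and the normalization of the weights to sum to 1 are assumptions made for illustration; the description only states that the weights change gradually with the processing order.

```python
from typing import List


def linear_weights(n_stages: int, heavier: str = "upstream") -> List[float]:
    """Weights that ramp linearly along the processing order and sum to 1."""
    if heavier not in ("upstream", "downstream"):
        raise ValueError("heavier must be 'upstream' or 'downstream'")
    if heavier == "upstream":
        raw = list(range(n_stages, 0, -1))   # e.g. 3, 2, 1
    else:
        raw = list(range(1, n_stages + 1))   # e.g. 1, 2, 3
    total = sum(raw)
    return [r / total for r in raw]


def weighted_reliability(subs: List[float], weights: List[float]) -> float:
    """Second integration method: weighted sum of per-process sub-reliabilities."""
    return sum(s * w for s, w in zip(subs, weights))


# Sub-reliabilities listed from the most upstream to the most downstream stage.
subs = [0.4, 0.8, 0.95]
upstream_heavy = weighted_reliability(subs, linear_weights(3, "upstream"))
downstream_heavy = weighted_reliability(subs, linear_weights(3, "downstream"))
```

For an item whose upstream stage is weak, the upstream-heavy weighting yields the lower final reliability, so the item is more likely to fall below the correction threshold.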

As another example of weighting, when a data set of feature quantities paired with correct recognition results is available, the optimal weights for that data set can be calculated by multiple regression analysis. This method has the advantage of high accuracy when the postscript information to be input resembles the data set (for example, the same writer or the same original document).
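One way to picture the regression-based weighting is an ordinary least-squares fit, sketched below in plain Python via the normal equations. The absence of an intercept term, the function names, and the synthetic data are assumptions; in practice the targets would be derived from the recorded correct recognition results.

```python
from typing import List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(features: List[List[float]], targets: List[float]) -> List[float]:
    """Least-squares weights w minimizing the squared error of features . w."""
    k = len(features[0])
    xtx = [[sum(row[i] * row[j] for row in features) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(features, targets)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic check: targets generated from known weights should be recovered.
true_weights = [0.5, 0.3, 0.2]
features = [[0.9, 0.8, 0.7], [0.4, 0.9, 0.95], [0.95, 0.3, 0.6], [0.6, 0.6, 0.2]]
targets = [sum(f * w for f, w in zip(row, true_weights)) for row in features]
fitted = fit_weights(features, targets)
```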

The recognition performance information presentation unit 198 presents recognition performance information to the writer so as to prompt correction of the postscript information whose final reliability value, extracted by the difficult recognition additional record information extraction unit 194 with weighting taken into account, is lower than a certain level. Because the writer is prompted to correct only postscript information whose recognition performance falls short of a certain level, there is no need to check every piece of postscript information for whether correction is required, and the work of correcting postscript information that may be misrecognized becomes more efficient.

With the second integration method, the weights can be changed freely, so they can be tuned to the characteristics of the writer, the original document, and so on to improve accuracy.

The difficult recognition additional record information extraction unit 194 may also adopt a third integration method in which misrecognition types (types of mistake) that lower the reliability are set in advance and, during data processing, the lowest reliability among the applicable misrecognition types is taken as the final reliability. For this purpose, the combination of reliabilities used for the judgment and their weights are set for each misrecognition type. Based on the presumed misrecognition type, a combination of processes and feature quantities is selected, and the reliability is calculated by weighted addition. Then, based on the result of the addition, the misrecognition type whose reliability is lower than a predetermined level, or is the lowest, is selected. In this way, the difficult recognition additional record information extraction unit 194 can identify difficult-to-recognize information that has the selected misrecognition type and whose recognition reliability is lower than a certain level.

Here, to "select a combination of processes and feature quantities based on the presumed misrecognition type and calculate the reliability by weighting", the combination of feature quantities characteristic of a given type is defined in advance, and the data is checked against it. In the simplest form, each process contributes 1 if its feature quantity matches the pattern characteristic of the type and 0 otherwise; the contributions are summed and compared with a certain level.
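The simple 1-or-0 matching can be sketched as below. The two types and their conditions paraphrase the tendencies this section gives for extraction mistakes and missing-interpolation mistakes; the feature names, thresholds, and the required level of 2 are hypothetical.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]
Check = Callable[[Features], bool]

# Hypothetical per-type patterns: each check contributes 1 when it matches.
ERROR_TYPES: Dict[str, List[Check]] = {
    "extraction_miss": [
        lambda f: f["recognition_reliability"] < 0.7,   # low shape/character reliability
        lambda f: f["dark_background_overlap"] > 0.3,   # large overlap with dark background
    ],
    "interpolation_miss": [
        lambda f: f["original_contact"] > 0.3,          # large contact with original image
        lambda f: f["candidate_endpoints"] > 2,         # several possible associations
    ],
}


def type_scores(features: Features) -> Dict[str, int]:
    """Count, per misrecognition type, how many of its checks match."""
    return {name: sum(1 for check in checks if check(features))
            for name, checks in ERROR_TYPES.items()}


def matching_types(features: Features, level: int = 2) -> List[str]:
    """Types whose match count reaches the assumed level."""
    return [name for name, score in type_scores(features).items() if score >= level]


observed = {
    "recognition_reliability": 0.5,
    "dark_background_overlap": 0.6,
    "original_contact": 0.1,
    "candidate_endpoints": 2,
}
scores = type_scores(observed)
suspected = matching_types(observed)
```

Replacing the 1-or-0 contributions with per-type weights would give the weighted addition mentioned for the third integration method.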

FIGS. 6 to 10 illustrate the processing of the recognition performance information presentation processing unit 190 and the misrecognition types associated with each process. The misrecognition types to be judged preferably include the following: types closely tied to the processing of individual functional units, such as scan mistakes associated with the document input device 20, extraction mistakes associated with the difference extraction unit 132 and the data processing target additional information extraction unit 142, missing-interpolation mistakes associated with the extraction line segment break correction unit 148, shape deformation associated with the deformation processing units 156a, 166a, 157a, and 167a, and positional deviation associated with the entry position recognition units 158 and 168; types caused by the postscript information itself, such as multiple postscripts, corrected postscripts, and pen color, thickness, or faintness; and types caused by the postscripted document 8B itself, such as stains on the paper.

For example, the "extraction mistake" type tends to occur when the reliability of shape recognition or character recognition is low and the amount of overlap with a dark background is large. One reason is that the difference extraction in the difference extraction unit 132 can be affected by whether the postscript information touches the original image, which depends on the positional relationship between the images of the postscripted document 8B and the original document 8A, and also by whether the postscript information overlaps a dark background such as a photograph or figure. Another reason is that, when the data processing target additional information extraction unit 142 extracts postscript information for data processing by extracting the component of the pen color used for writing (for example, a red pen), the result can be affected by the difference between the target color and the actual pen color, and also by whether the red-extracted image touches the original image, which depends on their positional relationship.

For example, as shown in FIG. 6(A), when the scoring symbol 87 in the image subjected to postscript information extraction is written over a dark background, the overlapping part is missing from the result extracted by the difference extraction unit 132 and the data processing target additional information extraction unit 142, as shown in FIG. 6(B). The extraction line segment break correction unit 148 corrects the result so that the missing part is bridged.

Specifically, the extraction line segment break correction unit 148 applies thinning to the predetermined-color component extracted by the data processing target additional information extraction unit 142, that is, to an extraction result that should be a figure such as "○" or "×", and then performs endpoint extraction. If the figure has a break, the endpoints at the break are thereby extracted. Once the endpoints have been extracted, selected ones among them are connected.

For example, when the figure shown in FIG. 6(B) is extracted, even if endpoints B and C both lie within a predetermined distance of endpoint A, the nearest one, endpoint B, is connected to endpoint A, which corrects the break in the "○" image as shown in FIG. 6(C). As a result, although the writer actually drew a more or less proper "○", the image input to the separation recognition processing unit 155 is a distorted "○" in which the part overlapping the dark background has been lost.
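The nearest-endpoint rule can be sketched as follows. The coordinates and the distance limit are made-up values, and the function name `nearest_endpoint` is hypothetical; only the rule of connecting the closest endpoint within a predetermined distance is taken from the description.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def nearest_endpoint(start: str, endpoints: Dict[str, Point],
                     max_distance: float) -> Optional[str]:
    """Name of the closest other endpoint within max_distance, or None."""
    sx, sy = endpoints[start]
    best_name: Optional[str] = None
    best_dist = math.inf
    for name, (x, y) in endpoints.items():
        if name == start:
            continue
        dist = math.hypot(x - sx, y - sy)
        # Only endpoints inside the allowed radius are connection candidates.
        if dist <= max_distance and dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


# B and C are both within reach of A; B is closer, so A is joined to B.
endpoints = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (0.0, 5.0)}
partner = nearest_endpoint("A", endpoints, max_distance=6.0)
no_partner = nearest_endpoint("A", endpoints, max_distance=2.0)
```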

The separation recognition processing unit 155 should identify this distorted image as "○" by figure recognition or character recognition. However, because the loss of the overlapping part greatly changes the features of the figure, the recognition rate (reliability) falls and a recognition error becomes more likely.

Therefore, as shown in FIG. 6(D), so that the difficult recognition additional record information extraction unit 194 can identify this "extraction mistake" misrecognition type, the difference extraction in the difference extraction unit 132 and the specific-color-component extraction in the data processing target additional information extraction unit 142 obtain the following as feature quantities related to recognition rate and reliability and report them to the difficult recognition additional record information extraction unit 194: the extraction quality information J10 and J11, obtained from the extracted amount and the like; the amount of overlap J12 of the scoring symbol 87 or comment 88 with a dark background; and the amount of contact J13 between the original image and the red-extracted image, obtained from their positional relationship and the like.

Based on the reported quality information J10 and J11, overlap amount J12, and contact amount J13, the difficult recognition additional record information extraction unit 194 calculates the sub-reliability T132 of the difference extraction process and the sub-reliability T142 of the specific-color-component extraction process, for example by adding 1 for each feature quantity that satisfies its criterion and 0 for each that does not.
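A sketch of this 1-or-0 scoring is given below. Which feature quantities feed T132 and which feed T142 is not spelled out in the text, so the split used here (J10 and J12 for T132, J11 and J13 for T142), the criteria, and the sample values are all assumptions.

```python
from typing import Callable, Dict

Criterion = Callable[[float], bool]


def sub_reliability(features: Dict[str, float],
                    criteria: Dict[str, Criterion]) -> int:
    """Add 1 for each feature that satisfies its criterion, 0 otherwise."""
    return sum(1 if check(features[name]) else 0
               for name, check in criteria.items())


# Hypothetical reported feature quantities.
features = {"J10": 0.9, "J11": 0.5, "J12": 0.1, "J13": 0.4}

# Assumed criteria: quality should be high, overlap and contact should be small.
t132 = sub_reliability(features, {
    "J10": lambda quality: quality >= 0.8,
    "J12": lambda overlap: overlap <= 0.2,
})
t142 = sub_reliability(features, {
    "J11": lambda quality: quality >= 0.8,
    "J13": lambda contact: contact <= 0.2,
})
```

The same helper would apply to other stages by passing a different set of criteria.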

The "missing-interpolation mistake" type tends to occur when the amount of contact with the original image is large and a plurality of associations are made. This is because the connection of endpoints during break correction in the extraction line segment break correction unit 148 can be affected by the distance between endpoints. Cases in which a plurality of associations are made include one in which what was originally a single image is split by a part overlapping the original image, and one in which what were originally two images become connected when the break at a part overlapping the original image is corrected.

FIG. 7 shows an example of the case in which what was originally a single image is split by a part overlapping the original image. As shown in FIG. 7(A), when the scoring symbol 87 in the image subjected to postscript information extraction is written over a relatively small dark background such as a line segment, the overlapping part is missing from the result extracted by the difference extraction unit 132 and the data processing target additional information extraction unit 142, as shown in FIG. 7(B). The extraction line segment break correction unit 148 corrects the result so that the missing part is bridged.

For example, when the figure shown in FIG. 7(B) is extracted, the "○" image is split at the part overlapping the relatively small dark background such as a line segment, and four endpoints A to D appear. The extraction line segment break correction unit 148 tries to connect these endpoints according to a predetermined condition. Even if the basic condition is to connect nearest endpoints to each other, the endpoints may for some reason not be connected that way. This is especially likely when a single image representing a character or figure is split by an overlapping part so that a plurality of associations are made: it is difficult to tell whether the pieces were originally one image or were separate from the start, and the unit tries to connect endpoints within each separated piece.

Thus, for example, endpoints A and C of the split-off upper half may be connected while endpoints B and D of the lower half are not, as shown in FIG. 7(C1), or endpoints A and C of the upper half and endpoints B and D of the lower half may both be connected, as shown in FIG. 7(C2).

Either way, although the writer actually drew a more or less proper "○", the image input to the separation recognition processing unit 155 is a distorted "○" (close to a semicircle) in which the part overlapping the dark background has been lost.

The separation recognition processing unit 155 should identify this distorted image as "○" by figure recognition or character recognition. However, because the loss of the part overlapping the original image greatly changes the features of the figure, the recognition rate (reliability) falls and a recognition error becomes more likely.

FIG. 8 shows an example of the case in which what were originally two images become connected when the break at a part overlapping the original image is corrected. As shown in FIG. 8(A), when a plurality of scoring symbols 87 in the image subjected to postscript information extraction are written over a dark background such as the frame of the answer column 84, each scoring symbol 87 is extracted by the difference extraction unit 132 and the data processing target additional information extraction unit 142 with its overlapping part missing, as shown in FIG. 8(B). The extraction line segment break correction unit 148 corrects the result so that the missing parts are bridged.

For example, when the figure shown in FIG. 8(B) is extracted, each of the two "○" images is split at the part overlapping the relatively small dark background such as a frame line, and four endpoints appear for each: A1 to D1 and A2 to D2. The extraction line segment break correction unit 148 tries to connect these endpoints according to a predetermined condition.

Suppose that whether pieces were originally one image or were separate from the start is decided by whether the distance between endpoints arising on a plurality of image components (split or not) is at or below a predetermined value, and that the basic condition is to connect nearest endpoints to each other. Then, first, the extracted parts of each "○" image are connected at their nearest endpoints, and each original "○" is almost reproduced, as shown in FIG. 8(C).
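The distance test for deciding whether two components were originally one image can be sketched as below. The coordinates, the gap limit, and the function names are illustrative assumptions; the description states only that the decision depends on whether the endpoint distance is at or below a predetermined value.

```python
import math
from itertools import product
from typing import List, Tuple

Point = Tuple[float, float]


def min_endpoint_gap(a: List[Point], b: List[Point]) -> float:
    """Smallest distance between any endpoint of component a and any of b."""
    return min(math.hypot(ax - bx, ay - by)
               for (ax, ay), (bx, by) in product(a, b))


def originally_one_image(a: List[Point], b: List[Point], max_gap: float) -> bool:
    """Treat two components as one split image when their endpoints are close."""
    return min_endpoint_gap(a, b) <= max_gap


# Endpoints of an upper and lower half split by a thin frame line, plus the
# endpoints of a separate mark written farther to the right.
upper_half = [(0.0, 0.0), (10.0, 0.0)]
lower_half = [(0.0, 2.0), (10.0, 2.0)]
other_mark = [(30.0, 0.0), (40.0, 0.0)]

same = originally_one_image(upper_half, lower_half, max_gap=3.0)
different = originally_one_image(upper_half, other_mark, max_gap=3.0)
```

When two separate marks are written close together across the same frame line, their endpoint gap can also fall under the limit, which is the situation that leads to the unwanted connection discussed next.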

However, the endpoints of the two "○" image components may, for some reason, be further connected to each other. This is likely when a plurality of originally separate images are each split by the same background so that a plurality of associations are made. Even for images that were separate from the start, it is difficult to tell whether they were originally one image that has been split or were separate, so the unit may try to connect endpoints across the separated images.

Thus, for example, as shown in FIG. 8(C), the left and right "○" image components are connected by line segments intended to correct the parts overlapping the frame, and together they become a single image.

As a result, although the writer actually drew two distinct marks, each more or less a proper "○", the parts overlapping the dark background are joined by line segments, and the image input to the separation recognition processing unit 155 is a distorted "○" in which the two marks have been merged into one.

The separation recognition processing unit 155 should identify each mark separately as "○" from this distorted image by figure recognition or character recognition. However, because the line segments that correct the parts overlapping the original image greatly change the features of the figure, the recognition rate (reliability) falls and a recognition error becomes more likely.

For this reason, as shown in FIGS. 7(D) and 8(D), so that the difficult-to-recognize postscript information extraction unit 194 can identify this "missing interpolation error" type of misrecognition, the extracted line segment break correction unit 148 obtains, during break correction processing, interpolation length information J20 indicating the interpolation length used in the correction, the number J21 of end points that are connection candidates, and the like as feature quantities related to the recognition rate and reliability, and notifies the difficult-to-recognize postscript information extraction unit 194 of the interpolation length information J20 and the other quantities.

Based on the notified interpolation length information J20 and the other quantities, the difficult-to-recognize postscript information extraction unit 194 calculates the sub-reliability T148 of the break correction processing, for example by scoring each quantity as 1 if it satisfies its criterion and 0 if it does not, and summing the scores.
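
The criterion-counting rule described above can be sketched as follows. This is only an illustrative sketch: the threshold values, the direction of each criterion (a short interpolation and few candidate end points are treated as "satisfying the criterion"), and all names are assumptions introduced here, not values given in the description.

```python
from dataclasses import dataclass


@dataclass
class BreakCorrectionFeatures:
    """Feature quantities notified by the break correction step."""
    interpolation_length: float  # corresponds to J20
    candidate_endpoints: int     # corresponds to J21


# Hypothetical criteria; the description does not specify concrete values.
MAX_INTERPOLATION_LENGTH = 12.0
MAX_CANDIDATE_ENDPOINTS = 2


def sub_reliability_t148(features: BreakCorrectionFeatures) -> int:
    """Score 1 per satisfied criterion, 0 otherwise, and sum the scores."""
    score = 0
    score += 1 if features.interpolation_length <= MAX_INTERPOLATION_LENGTH else 0
    score += 1 if features.candidate_endpoints <= MAX_CANDIDATE_ENDPOINTS else 0
    return score
```

With two criteria the sub-reliability ranges from 0 (neither criterion met) to 2 (both met); a lower value marks the break correction as a more likely source of misrecognition.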

The "multiple postscript" type of misrecognition tends to occur when the reliability of shape recognition or character recognition is low and, because adjacent entry fields are close together, a plurality of scoring symbols 87 or comments 88 are associated. This is because, during graphic recognition processing in the graphic shape recognition units 156 and 166 of the separation recognition processing unit 155 and during character recognition processing in the character recognition processing units 157 and 167, whether pieces of postscript information touch one another affects the reliability of recognition.

For example, as shown in FIG. 9(A), a typical case is one in which two answer fields 84 are arranged on two lines without sufficient distance between them. When the grader enters a scoring symbol 87 for the student answer in each answer field 84, the two scoring symbols 87 touch because there is not enough space.

For this reason, as shown in FIG. 9(B), so that the difficult-to-recognize postscript information extraction unit 194 can identify this "multiple postscript" type of misrecognition, the separation recognition processing unit 155 first obtains contact information J30 indicating whether scoring symbols 87 touch one another and contact information J31 indicating whether comments 88 touch one another. In the graphic recognition processing performed by the graphic shape recognition unit 156, reliability information J32 and J33 on the graphic recognition results for the scoring symbols 87 and the comments 88 is obtained as feature quantities related to the graphic recognition rate and graphic reliability for the scoring symbols 87 and the comments 88. Likewise, in the character recognition processing performed by the character recognition processing units 157 and 167, reliability information J34 and J35 on the character recognition results for the scoring symbols 87 and the comments 88 is obtained as feature quantities related to the character recognition rate and character reliability for the scoring symbols 87 and the comments 88. These pieces of information are notified to the difficult-to-recognize postscript information extraction unit 194.

Based on the contact information J30, reliability information J32, and reliability information J34 notified for the scoring symbol 87, the difficult-to-recognize postscript information extraction unit 194 calculates the sub-reliability T156 of the graphic recognition processing and the sub-reliability T157 of the character recognition processing, for example by scoring each item as 1 if it satisfies its criterion and 0 if it does not, and summing the scores. Similarly, based on the contact information J31, reliability information J33, and reliability information J35 for the comment 88, it calculates the sub-reliability T168 of the graphic recognition processing and the sub-reliability T167 of the character recognition processing in the same way.

The "postscript correction" type of misrecognition tends to occur when the reliability of shape recognition and character recognition is high but the postscript information is far from the entry fields, so that a plurality of scoring symbols 87 or comments 88 are associated with one entry field. This is because, when the data processing unit 170 determines which entry field the postscript information of interest corresponds to, based on the entry position identified by the entry position recognition units 158 and 168, and then executes data processing, whether one piece of postscript information relates to a plurality of entry positions affects the data processing result.

For example, as shown in FIGS. 10(A) and 10(B), a typical case is one in which two answer fields 84 are arranged on two lines with a fairly wide distance between them. When the grader enters a scoring symbol 87 for the student answer in each answer field 84, the wide spacing can lead to a scoring symbol 87 intended for one of the fields being entered between the two answer fields 84.

As shown in FIG. 10(A), when the postscript information is large and straddles the upper and lower answer fields 84, the areas over which its circumscribed rectangle overlaps the region of each answer field 84 are roughly equal, and its distances to the two answer fields 84 are also roughly equal. It is therefore difficult to determine which answer field 84 the postscript information entered between the two should be associated with. In other words, a plurality of scoring symbols 87 may be associated with one answer field 84.

As shown in FIG. 10(B), when the postscript information is small and fits between the upper and lower answer fields 84, the area over which its circumscribed rectangle overlaps the region of each answer field 84 is roughly zero, and its distances to the two answer fields 84 are also roughly equal. Here too it is difficult to determine which answer field 84 the postscript information entered between the two should be associated with. In other words, a plurality of scoring symbols 87 may be associated with one answer field 84.

For this reason, as shown in FIG. 10(C), so that the difficult-to-recognize postscript information extraction unit 194 can identify this "postscript correction" type of misrecognition, the entry position recognition units 158 and 168 obtain the following during entry position recognition processing as feature quantities related to the recognition rate and reliability: multiple-postscript information J40 and J41, indicating whether one piece of postscript information may be associated with a plurality of entry fields or a plurality of pieces of postscript information may be associated with one entry field; distance information J42 and J43, indicating the distance between each piece of postscript information and each entry field; and the like. These pieces of information are notified to the difficult-to-recognize postscript information extraction unit 194.
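
The two ambiguous cases of FIGS. 10(A) and 10(B) can be illustrated with a small geometric check. The sketch below assumes axis-aligned rectangles, vertically stacked answer fields, and a relative tolerance of 20 percent; the `Rect` type, the function names, and the tolerance are hypothetical and are not taken from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; y grows downward as in image coordinates."""
    left: float
    top: float
    right: float
    bottom: float


def overlap_area(a: Rect, b: Rect) -> float:
    """Area shared by the two rectangles (0 if they do not intersect)."""
    width = min(a.right, b.right) - max(a.left, b.left)
    height = min(a.bottom, b.bottom) - max(a.top, b.top)
    return max(width, 0.0) * max(height, 0.0)


def vertical_gap(a: Rect, b: Rect) -> float:
    """Vertical distance between the rectangles (0 if they overlap vertically)."""
    return max(b.top - a.bottom, a.top - b.bottom, 0.0)


def roughly_equal(x: float, y: float, tolerance: float) -> bool:
    """Relative comparison with a floor of 1.0 so that two zeros compare equal."""
    return abs(x - y) <= tolerance * max(x, y, 1.0)


def is_ambiguous(mark: Rect, field_a: Rect, field_b: Rect,
                 tolerance: float = 0.2) -> bool:
    """True when neither overlap area nor distance favors one answer field."""
    same_overlap = roughly_equal(overlap_area(mark, field_a),
                                 overlap_area(mark, field_b), tolerance)
    same_distance = roughly_equal(vertical_gap(mark, field_a),
                                  vertical_gap(mark, field_b), tolerance)
    return same_overlap and same_distance
```

A flag of this kind plays the role of the multiple-postscript information (J40, J41), and the gap values play the role of the distance information (J42, J43); how the actual units derive them is not specified here.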

Based on the notified information J40 and J42, the difficult-to-recognize postscript information extraction unit 194 calculates the sub-reliability T158 of the entry position recognition processing for the scoring symbol 87 in the entry position recognition unit 158, for example by scoring each item as 1 if it satisfies its criterion and 0 if it does not, and summing the scores. In the same way, based on the notified information J41 and J43, it calculates the sub-reliability T168 of the entry position recognition processing for the comment 88 in the entry position recognition unit 168.

For each piece of postscript information of interest (scoring symbol 87 or comment 88), the difficult-to-recognize postscript information extraction unit 194 extracts, from among the sub-reliabilities obtained for the respective processes, the misrecognition types whose sub-reliability is below a certain level. Here, only the misrecognition type with the lowest sub-reliability may be extracted. Repeating this third integration method for all postscript information extracts the postscript information that has a misrecognition type whose sub-reliability is below the certain level.
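
A minimal sketch of this extraction step is shown below. It assumes that the sub-reliabilities have already been grouped by misrecognition type into a dictionary and that one common threshold applies to all types; both the grouping and the function name are assumptions made for illustration.

```python
def low_reliability_types(sub_reliabilities: dict[str, float],
                          threshold: float,
                          lowest_only: bool = False) -> list[str]:
    """Return misrecognition types whose sub-reliability is below the threshold.

    The result is ordered from least to most reliable. With lowest_only=True,
    only the single least reliable type is returned.
    """
    low = {name: value for name, value in sub_reliabilities.items()
           if value < threshold}
    ordered = sorted(low, key=low.get)
    return ordered[:1] if lowest_only else ordered
```

For example, with sub-reliabilities of 0 for "missing interpolation error", 1 for "multiple postscript", and 2 for "postscript correction" and a threshold of 2, the first two types are extracted, or only "missing interpolation error" when `lowest_only` is set.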

The recognition performance information presentation unit 198 presents recognition performance information to the writer so as to prompt correction of the postscript information, extracted by the difficult-to-recognize postscript information extraction unit 194, that has a misrecognition type whose sub-reliability is below the certain level.

Because the writer is prompted to correct only the postscript information that has a misrecognition type whose recognition performance falls short of the certain level, there is no need to check every piece of postscript information for whether correction is required, and the work of correcting postscript information that may cause misrecognition becomes more efficient. When a misrecognition type whose reliability is below the predetermined level is presented, the writer can improve the current entry from the standpoint of that misrecognition type. When two such misrecognition types are presented for one piece of postscript information, the writer can improve the entry from the standpoint of each type. When only the misrecognition type with the lowest reliability is presented, the writer can improve the entry from the standpoint of that least reliable type.

With this third integration method, a concrete improvement method can be prepared for each type, so the writer can more easily pay attention to how entries are made, and the improvement is more effective.

<Recognition performance information presentation method>
FIGS. 11 to 16 are diagrams illustrating how the recognition performance information presentation unit 198 presents to the user recognition performance information for improving recognition performance.

When the recognition performance information presentation unit 198 presents recognition performance information to the user (writer), it is preferable to present information suited to each writer. As one presentation method, as shown in FIG. 11 for example, the recognition performance information presentation unit 198 may adopt a first presentation method in which, for items whose reliability is at or below a certain level, the way the recognition performance information is presented is further varied according to the reliability. "Reliability" here means the final reliability.

The first presentation method of "changing the display method according to the reliability" is intended to let the final reliability level and the misrecognition type be distinguished easily by eye, and within that aim various presentation techniques can be used. For example, items can be displayed with different gradations according to the final reliability or the misrecognition type. As another example, a frame can be drawn around items whose reliability is at or below a certain level, with a thicker frame line for lower reliability.
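
As one way to picture the frame-thickness variant, the sketch below maps a final reliability to a frame line width. The linear mapping, the width range of 1 to 5, and the function name are assumptions chosen for illustration; the description only states that lower reliability should give a thicker frame.

```python
def frame_width(reliability: float, threshold: float,
                min_width: int = 1, max_width: int = 5) -> int:
    """Return the frame line width for an item, or 0 for no frame.

    Items above the threshold get no frame. At or below it, the width grows
    linearly from min_width (at the threshold) to max_width (at reliability 0).
    """
    if reliability > threshold:
        return 0
    if threshold <= 0:
        return max_width
    shortfall = (threshold - reliability) / threshold
    shortfall = min(max(shortfall, 0.0), 1.0)  # clamp to [0, 1]
    return min_width + round(shortfall * (max_width - min_width))
```

The same shortfall value could equally drive a gradation or a color scale instead of a line width.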

From the difference in display, the viewer (writer) can tell which of all the recognized and identified postscript information has a reliability below the certain level. Thus, even with this first presentation method, recognition performance information is in effect presented to the writer so as to prompt correction only of postscript information whose reliability is below the certain level. The writer can then improve the way such postscript information is entered.

With this first presentation method, items that most need correction are made the most conspicuous, which prevents serious oversights. In addition, showing the differences in reliability lets the writer compare entries and learn what kinds of figures are unsuitable for recognition, which helps improve how entries are made.

The recognition performance information presentation unit 198 may also adopt a second presentation method in which, out of all the postscript information that has undergone recognition processing, recognition performance information for improving recognition performance is presented only for difficult-to-recognize information whose reliability is below a certain level. For example, recognition performance information is presented only for postscript information whose final reliability, as determined by the difficult-to-recognize postscript information extraction unit 194 using the first or second integration method, is below the predetermined level, or for postscript information that, according to the third integration method, has a misrecognition type whose reliability is below the predetermined level or is the lowest.

Here, "presenting recognition performance information for improving recognition performance only for difficult-to-recognize information whose reliability is below a certain level" means, in short, presenting recognition performance information so that "difficult-to-recognize information with low reliability" can be identified reliably, by presenting it separately from "other postscript information" among all the additional information that has undergone recognition processing. For example, all information other than the "difficult-to-recognize information with low reliability" may be hidden.

Recognition performance information can thus be presented to the writer so as to prompt correction only of difficult-to-recognize information whose reliability is below the certain level, and the writer can improve the way such information is entered.

With this second presentation method, recognition performance information is presented only for difficult-to-recognize information whose reliability is below the certain level, so the writer can quickly review just the information that most needs correction and carry out the correction work efficiently.

When the second presentation method is used, postscript information extracted as a target for recognition performance information, because its recognition performance is poor and it needs correction, is preferably highlighted so that its location can be recognized at a glance. Highlighting allows difficult-to-recognize information whose reliability is below the certain level to be presented separately from other information.

For example, as shown in FIG. 12, the postscript information may be enclosed in a frame (FIG. 12(A)), marked with an arrow or other mark (FIG. 12(B)), tagged with an electronic sticky note (FIG. 13), or, although not illustrated, displayed with its line thickness or its color varied according to the reliability, so that difficult-to-recognize information whose reliability is below the certain level is displayed separately from other information.

Enclosing the information in a frame makes the extent easy to identify, so it is easy to see, particularly when a plurality of figures have become connected as in FIG. 8(C). With an arrow or other mark, the size can be set freely, so the mark is unlikely to cover the content of the document original, and visibility does not drop much even when entries are dense. With electronic sticky notes, the notes on documents other than the top one remain visible even when several documents are stacked, so the total amount and the positions of low-reliability postscript information are easy to check.

Alternatively, for postscript information extracted as a target for recognition performance information because its recognition performance is poor and it needs correction, a list of links may be created and displayed as shown in FIG. 14. When an entry in the list is clicked, the additional-information-filled teaching material 81 is displayed together with the postscript information at the corresponding location. In this case, the postscript information at that location is preferably displayed using the highlighting shown in FIG. 12.

The list of links presents only information through which the difficult-to-recognize information can be reached, such as the question text corresponding to an answer field 84 in which a scoring symbol 87 or comment 88 with a reliability at or below the certain level is entered. Starting from this list of links, the writer can display the required difficult-to-recognize information and then begin the correction work.

On the additional-information-filled teaching material 81, all postscript information is shown, not just the difficult-to-recognize information. The list of links, however, presents only information through which the difficult-to-recognize information can be reached, and this form is also regarded as included in "presenting recognition performance information for improving recognition performance only for difficult-to-recognize information whose reliability is below a certain level".

Alternatively, for postscript information extracted as a target for recognition performance information because its recognition performance is poor and it needs correction, the additional-information-filled teaching material 81 may be displayed as shown in FIG. 15, and the display may jump from one piece of postscript information needing correction to the next, either within the same additional-information-filled teaching material 81 or into another one. The jump is triggered by clicking a jump-to-next button, clicking a "send to next" button, a predetermined gesture such as a double click, a predetermined key operation (such as an arrow key), or the like. In this case, the postscript information at the jump destination is preferably displayed using the highlighting shown in FIG. 12.
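
The jump behavior can be sketched as a cursor over the flagged items. The `FlaggedItem` fields, the ordering by document and field index, and the wrap-around after the last item are all assumptions made for this illustration; the description does not say how the items are ordered or what happens at the end of the sequence.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FlaggedItem:
    """A piece of postscript information judged to need correction."""
    document_id: str   # which filled-in teaching material it belongs to
    field_index: int   # which answer field it is associated with
    reliability: float


class JumpCursor:
    """Steps through flagged items each time a jump trigger is received."""

    def __init__(self, items: Iterable[FlaggedItem]) -> None:
        self._items = sorted(items, key=lambda i: (i.document_id, i.field_index))
        self._index = -1

    def next(self) -> Optional[FlaggedItem]:
        """Return the next item to display, wrapping around at the end."""
        if not self._items:
            return None
        self._index = (self._index + 1) % len(self._items)
        return self._items[self._index]
```

A button click, gesture, or key press would each call `next()` and then display and highlight the returned item.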

The recognition performance information presentation unit 198 may also adopt a third presentation method in which, as shown in FIG. 16, for postscript information that needs correction because its reliability falls short of the certain level (difficult-to-recognize information), the reason for that determination is presented together with the reliability as recognition performance information.

"Reliability" here means the final reliability. The "reason for determination" is preferably identified by using the various sub-reliabilities notified to the difficult-to-recognize postscript information extraction unit 194 and taking the process with a poor sub-reliability as the reason.

With this third presentation method, the writer is not only notified of postscript information whose reliability is below the certain level but is also shown, as recognition performance information, the reason for that determination, and can improve the entry of such postscript information in a way that matches the presented reason. Because the writer can see concretely what to watch for when making entries, the method can be used to learn how to make entries. The writer can reliably grasp what is inappropriate, such as the shape of the handwritten input, and reliably learn accurate shapes for figures and characters.

In each of the first to third presentation methods described above, information related to the recognition rate is collected from the various processing stages, postscript information whose reliability is below a certain level is identified from the collected information, and, for that postscript information, recognition performance information for improving the recognition performance of handwritten input, that is, recognition performance information prompting correction of the current postscript information, is presented. There is therefore no need to check all postscript information, and the work of correcting entries that would lead to misrecognition can be carried out efficiently.

In addition, because the recognition performance information prompting correction of the current postscript information whose recognition reliability is below a certain level (difficult-to-recognize information) is presented while the additional-information-filled teaching material 81 is displayed, there is the further advantage that, unlike the mechanism described in Japanese Patent Laid-Open No. 2000-105796, no dedicated correction screen needs to be created.

<Reliability information collection and presentation processing: specific example>
FIGS. 17 and 18 are diagrams showing a specific example of reliability information collection and presentation processing that takes as its processing target an educational teaching material 80, which is an example of the document original 8A. FIG. 17 shows the overall outline in association with the system configuration diagram, and FIG. 18 is a flowchart showing the reliability information collection and presentation processing procedure.

The example here is one in which an answer sheet is created as the educational teaching material 80 and, after an examination using the answer sheet, automatic scoring processing based on the scoring symbols 87 and comment classification processing based on the comments 88 are performed.

When the examination is conducted, the educational teaching material 80 is read from the document original information database DB2, printed on a printer, and distributed to the students or examinees (S206). After the examination, the grader adds scoring symbols 87 and comments 88 to the students' answers (S208).

In automatic data processing, the document input device 20 first reads each answerer's additional-information-filled teaching material 81, on which the grader has entered scoring symbols 87 and comments 88 (S210), and inputs image data representing the additional-information-filled teaching material 81 to the postscript information processing apparatus 10 (S212). The document input device 20 temporarily holds the image data obtained by this image reading in a memory or the like used as a work area.

The postscript information processing apparatus 10 (teaching material processing apparatus) receives the read image data of the additional-information-filled teaching material 81, identifies the document original identification code embedded in the halftone image (S224), and obtains original information, such as the original image of the corresponding educational teaching material 80 and the entry field position area information 38, from the document management server 30, which functions as the document original information database DB2 (S225).

Then, as automatic data processing, the scoring symbols 87 and comments 88 entered on the additional-information-filled teaching material 81 are extracted by the difference extraction unit 132 and the data processing postscript information extraction processing unit 140 (S242), their content and entry positions are identified by the separation recognition processing unit 155 and the entry position recognition units 158 and 168 (S262), and the data processing unit 170 then executes automatic scoring and tallying for the scoring symbols 87, automatic comment classification for the comments 88, and the like (S266).

During this processing, the separation recognition processing unit 155, the entry position recognition units 158 and 168, or the data processing unit 170 accepts correction instructions from the user (S310). When the automatic data processing is subsequently completed (S264-YES), the data processing unit 170 registers the scoring and tallying results and the comment classification results in the processing result storage server 40 (S268).

In the course of this processing, the difficult-to-recognize postscript information extraction unit 194 of the recognition performance information presentation processing unit 190 collects, from each functional unit, the feature quantities related to the recognition rate of the respective process (S300) and, based on the collected feature quantities, calculates a sub-reliability for each process (S302).

For example, the difference extraction unit 132 extracts quality information J10 for the difference extraction process between the educational teaching material 80 (an example of the document original 8A) and the additional-information-filled teaching material 81 (an example of the postscripted document 8B), together with the amount J12 by which the scoring symbols 87 and comments 88 overlap a dark background, and notifies the difficult-to-recognize postscript information extraction unit 194 of them. The data-processing-target postscript information extraction unit 142 of the data-processing postscript information extraction processing unit 140 extracts quality information J11 for the process of extracting the scoring symbols 87 and comments 88, as the postscript information to be processed, from the difference information 9 extracted by the difference extraction unit 132. It also extracts the contact amount J13 between the original image and the red extracted image, which is obtained from, for example, the positional relationship between the two images, and notifies the extraction unit 194 of both. Based on the notified quality information J10 and J11, overlap amount J12, and contact amount J13, the extraction unit 194 calculates the sub-reliability T132 of the difference extraction process and the sub-reliability T142 of the specific color component extraction process.

The extracted line segment break correction unit 148 extracts interpolation length information J20, which indicates the length interpolated during break correction, the number J21 of end points that are candidates for connection, and the like, and notifies the difficult-to-recognize postscript information extraction unit 194 of them. Based on the notified interpolation length information J20 and the like, the extraction unit 194 calculates the sub-reliability T148 of the break correction process (missing-portion interpolation).
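The specification does not state how J20 and J21 are mapped to T148. As a minimal sketch only, the following assumes a score in [0, 1] that falls linearly with the interpolated length and drops further when more than two end points competed for the connection. The function name, the `max_length` parameter, and the scoring rule itself are all hypothetical.

```python
def gap_interpolation_sub_reliability(interp_length, endpoint_candidates,
                                      max_length=20.0):
    """Hypothetical mapping from J20/J21-style features to a score in [0, 1].

    interp_length       -- length bridged by break correction (J20-like)
    endpoint_candidates -- number of candidate end points (J21-like)
    max_length          -- assumed length at which the score reaches 0
    """
    if interp_length < 0 or endpoint_candidates < 0:
        raise ValueError("features must be non-negative")
    # Assumption: longer bridged gaps are less trustworthy (linear falloff).
    length_score = max(0.0, 1.0 - interp_length / max_length)
    # Assumption: more than two candidates means the connection was ambiguous.
    ambiguity = max(0, endpoint_candidates - 2)
    candidate_score = 1.0 / (1.0 + ambiguity)
    return length_score * candidate_score
```

With these assumed parameters, a stroke that needed no interpolation and had exactly two end points scores 1.0, while a gap of half the maximum length scores 0.5.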

The separation recognition processing unit 155 first extracts contact information J30, which indicates whether scoring symbols 87 touch one another, and contact information J31, which indicates whether comments 88 touch one another, and notifies the difficult-to-recognize postscript information extraction unit 194 of them.

The graphic shape recognition unit 156 of the separation recognition processing unit 155 extracts reliability information J32 for the graphic recognition result of the scoring symbols 87 and notifies the difficult-to-recognize postscript information extraction unit 194 of it. Likewise, the graphic shape recognition unit 166 extracts reliability information J33 for the graphic recognition result of the comments 88 and notifies the extraction unit 194 of it.

The character recognition processing unit 157 of the separation recognition processing unit 155 extracts reliability information J34 for the character recognition result of the scoring symbols 87 and notifies the difficult-to-recognize postscript information extraction unit 194 of it. Likewise, the character recognition processing unit 167 extracts reliability information J35 for the character recognition result of the comments 88 and notifies the extraction unit 194 of it.

Based on the notified contact information J30, reliability information J32, and reliability information J34 for the scoring symbols 87, the difficult-to-recognize postscript information extraction unit 194 calculates the sub-reliability T156 of the graphic recognition process and the sub-reliability T157 of the character recognition process. Based on the contact information J31, reliability information J33, and reliability information J35 for the comments 88, it likewise calculates the sub-reliability T166 of the graphic recognition process and the sub-reliability T167 of the character recognition process.

The entry position recognition unit 158 extracts multiple-postscript information J40 and distance information J42 for the scoring symbols 87 and notifies the difficult-to-recognize postscript information extraction unit 194 of them. The entry position recognition unit 168 extracts multiple-postscript information J41 and distance information J43 for the comments 88 and notifies the extraction unit 194 of them.

The difficult-to-recognize postscript information extraction unit 194 calculates the sub-reliability T158 of the entry position recognition process for the scoring symbols 87 based on the notified information J40 and J42, and the sub-reliability T168 of the entry position recognition process for the comments 88 based on the notified information J41 and J43.

Furthermore, based on the sub-reliabilities of the individual processes obtained as described above and on past statistical information, the difficult-to-recognize postscript information extraction unit 194 calculates the final reliability Tfinal, for example according to equation (1), using a different weight α for each process (S304).
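Equation (1) is not reproduced in this section, so its exact form is not known here. Purely as an illustration of combining sub-reliabilities with per-process weights, the sketch below assumes a weighted average normalized by the sum of the weights. The function name and the dictionary keys are hypothetical labels, and the weights would in practice have to come from the past statistical information mentioned above.

```python
def final_reliability(sub_reliabilities, weights):
    """Weighted average of sub-reliabilities (assumed form, not equation (1)).

    sub_reliabilities -- dict mapping a process label to its score in [0, 1]
    weights           -- dict mapping the same labels to a weight (alpha)
    """
    missing = set(sub_reliabilities) - set(weights)
    if missing:
        raise KeyError(f"no weight for: {sorted(missing)}")
    total_weight = sum(weights[label] for label in sub_reliabilities)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[label] * score
                       for label, score in sub_reliabilities.items())
    return weighted_sum / total_weight


# Illustrative values only: break correction is weighted twice as heavily.
subs = {"T132": 0.9, "T148": 0.5, "T156": 0.8}
alphas = {"T132": 1.0, "T148": 2.0, "T156": 1.0}
t_final = final_reliability(subs, alphas)
```

Normalizing by the weight total keeps the result in the same range as the inputs, which makes a single fixed threshold meaningful regardless of how many processes contributed.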

The recognition performance information presentation unit 198, operating within the correction software for automatic scoring results that is incorporated in the automatic scoring software, identifies low-reliability postscript information whose reliability Tfinal is at or below a predetermined threshold (S306). For such low-reliability postscript information, it presents on the user terminal 171 recognition performance information that prompts correction of the identified postscript information, as information for improving the recognition performance of handwritten input (S308). As described earlier, the display should make it easy to judge that the postscript information needs correction and how to correct it, for example by highlighting the postscript information or by showing the reason for the determination.
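The selection in S306 and the "reason for the determination" shown in S308 can be sketched as follows. This is an assumption-laden illustration: the `Postscript` record, the default threshold, and the idea of reporting the weakest sub-reliability as the reason are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Postscript:
    """Hypothetical record for one recognized scoring symbol or comment."""
    item_id: str
    t_final: float
    sub_reliabilities: dict = field(default_factory=dict)


def flag_low_reliability(items, threshold=0.6):
    """Return (item_id, reason) pairs for items at or below the threshold.

    The reason is the label of the lowest sub-reliability, or None when no
    sub-reliabilities were recorded for the item.
    """
    flagged = []
    for item in items:
        if item.t_final <= threshold:
            subs = item.sub_reliabilities
            reason = min(subs, key=subs.get) if subs else None
            flagged.append((item.item_id, reason))
    return flagged
```

A user interface could then highlight each flagged item and map the reason label to a short message, for example that break correction bridged a long gap.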

The person who entered the scoring symbols 87 and comments 88 checks the recognition performance information presented on the user terminal 171 and performs correction on the user terminal 171 only for those scoring symbols 87 and comments 88 (that is, the characters and figures representing them) whose reliability value is below the threshold and which are therefore likely to be misrecognized (S310).

When all correction work is complete (S264-YES), the corrected postscript information is reflected in the data processing. For example, the finalized scoring and tallying data and the finalized comment classification data are registered in the processing result storage server 40 (S264). The user's corrections are accepted and reflected by directly changing the data after recognition, for example through a database operation (S268). In other words, the data already stored in the database may simply be changed directly.
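Changing stored results directly can be pictured as a plain database update. The sketch below uses Python's standard `sqlite3` module with a hypothetical `results` table. The schema, column names, and function names are assumptions for illustration and say nothing about how the processing result storage server 40 is actually implemented.

```python
import sqlite3


def create_results_table(conn):
    """Create the hypothetical table that holds recognition results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " item_id TEXT PRIMARY KEY,"
        " value TEXT NOT NULL,"
        " corrected INTEGER NOT NULL DEFAULT 0)"
    )


def apply_corrections(conn, corrections):
    """Overwrite stored values with user corrections.

    corrections -- iterable of (item_id, corrected_value) pairs
    """
    with conn:  # commits on success, rolls back if an update fails
        conn.executemany(
            "UPDATE results SET value = ?, corrected = 1 WHERE item_id = ?",
            [(value, item_id) for item_id, value in corrections],
        )
```

Keeping a `corrected` flag alongside the value is one way to distinguish automatically recognized results from those a user has confirmed.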

In this example procedure, the correctness of recognition is reflected during normal automatic data processing, and correction is prompted only for recognized handwritten input whose recognition performance is poor. The user can therefore, without special effort, check how the scoring symbols 87, comments 88, and the like were written and correct only the items that need it, and the corrected postscript information can be used to update the results of automatic scoring and automatic comment classification appropriately.

<Postscript information processing apparatus; computer configuration>
FIG. 19 is a block diagram showing another configuration example of the postscript information processing apparatus 10. It shows a more realistic hardware configuration, built around a microprocessor or the like, in which an electronic computer such as a personal computer executes the postscript information processing in software.

That is, in this embodiment, the mechanism for executing data processing on postscript information need not be built from hardware processing circuits. It can also be realized in software on an electronic computer, based on program code that implements the functions.

Accordingly, a program suitable for realizing the mechanism of the present invention in software on an electronic computer, or a computer-readable storage medium storing such a program, can also be extracted as an invention. Executing the mechanism in software has the advantage that processing procedures and the like can be changed easily without changing the hardware.

When an electronic computer is made to execute the data processing functions for postscript information in software, the programs constituting that software are installed from a recording medium onto one of the following: a computer built into dedicated hardware (such as an embedded microcomputer); an SOC (System On a Chip), which realizes a desired system by mounting the functions of a CPU (Central Processing Unit), logic circuits, storage devices, and the like on one chip; or a general-purpose personal computer or the like that can execute various functions once various programs are installed.

The recording medium causes a change in the state of magnetic, optical, electrical, or other energy in a reading device provided among the computer's hardware resources, in accordance with the content of the program, and conveys that content to the reading device in the form of the corresponding signal.

For example, the recording medium may be package media (portable storage media) that are distributed separately from the computer in order to provide the program to users and on which the program is recorded, such as a magnetic disk (including a flexible disk, FD), an optical disc (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), a magneto-optical disc (including an MD (Mini Disc)), or a semiconductor memory. It may also be a ROM, hard disk, or the like on which the program is recorded and which is provided to the user already built into the computer.

The programs constituting the software need not be provided via a recording medium. They may also be provided over a wired or wireless communication network.

For example, the same effect as with a hardware processing circuit is achieved when a storage medium recording the program code of software that realizes the data processing functions for postscript information is supplied to a system or apparatus, and the computer (or CPU or MPU) of that system or apparatus reads and executes the program code stored on the storage medium. In this case, the program code read from the storage medium itself realizes the data processing functions for postscript information.

The data processing functions for postscript information need not be realized solely by the computer executing the program code it has read. An OS (operating system; basic software) or the like running on the computer may perform part or all of the actual processing based on the instructions of the program code, and the functions may be realized by that processing.

Furthermore, the program code read from the storage medium may be written to a memory provided in a function expansion card inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided in that card or unit performs part or all of the actual processing based on the instructions of the program code, and the data processing functions for postscript information are realized by that processing.

The program is provided as a file describing the program code that realizes the data processing functions for postscript information. It need not be provided as a single program file, however, and may be provided as individual program modules according to the hardware configuration of the computer-based system.

For example, the computer system 900 has a controller unit 901 and a recording/reading control unit 902 for reading data from and recording data to a given storage medium, such as a hard disk device, a flexible disk (FD) drive, a CD-ROM (Compact Disk ROM) drive, or a semiconductor memory controller.

The controller unit 901 has a CPU (Central Processing Unit) 912; a ROM (Read Only Memory) 913, which is a read-only storage unit; a RAM (Random Access Memory) 915, which can be written and read at any time and is an example of a volatile storage unit; and a RAM 916 (referred to as NVRAM), which is an example of a nonvolatile storage unit.

Here, a "volatile storage unit" means a storage unit whose contents are lost when the apparatus is powered off. A "nonvolatile storage unit" means a storage unit that continues to hold its contents even when the main power supply of the apparatus is turned off. Anything that can continue to hold its contents qualifies: the semiconductor memory element need not itself be nonvolatile, and a volatile memory element may be made to behave as "nonvolatile" by providing a backup power supply.

The storage unit is also not limited to semiconductor memory elements and may be built using media such as magnetic disks or optical discs. For example, a hard disk device can be used as a nonvolatile storage unit. A configuration that reads information from a recording medium such as a CD-ROM can likewise serve as a nonvolatile storage unit.

The computer system 900 also has an instruction input unit 903, which serves as the user interface; a display output unit 904, which presents given information such as guidance screens during operation and processing results to the user; and an interface unit (IF unit) 909, which provides the interface between the functional units.

To present data processing results to the user in printed form, an image forming unit 906 that outputs the processing results to a given output medium (for example, printing paper) may also be provided.

As the instruction input unit 903, the operation key unit 985b of the user interface unit 985 can be used, for example. Alternatively, a keyboard, mouse, or the like can be used.

The display output unit 904 includes a display control unit 919 and a display device. As the display device, the operation panel unit 985a of the user interface unit 985 can be used, for example. Alternatively, another display unit such as a CRT (Cathode Ray Tube) or LCD (Liquid Crystal Display) can be used.

For example, the display control unit 919 displays guidance information, the whole image captured by the document input device 20, and the like on the operation panel unit 985a or the display unit. The display is also used as the display device for notifying the user of various information. If the display unit has a touch panel on its display surface, it can also serve as an instruction input unit 903 through which given information is entered with a fingertip, pen, or the like.

The interface unit 909 includes the system bus 991, which is the transfer path for processing data (including image data) and control data, as well as, for example, a printer IF unit 996, which provides the interface to the image forming unit 906 and other printers, and a communication IF unit 999, which mediates the exchange of communication data with a network.

In this configuration, the CPU 912 controls the entire system via the system bus 991. The ROM 913 stores the control program for the CPU 912 and the like. The RAM 915 is composed of SRAM (Static Random Access Memory) or the like and stores program control variables, data for various processes, and so on. The RAM 915 also includes an area for temporarily storing data computed according to a given application program, data acquired from outside, and the like.

For example, a program that causes a computer to execute the data processing functions for postscript information is distributed on a recording medium such as a CD-ROM. The program may instead be stored on an FD. An MO drive may be provided and the program stored on an MO, or the program may be stored on another recording medium, such as a nonvolatile semiconductor memory card like a flash memory. The program may also be obtained or updated by downloading it from another server or the like over a network such as the Internet.

Besides FDs and CD-ROMs, the recording media that can be used to provide the program include optical recording media such as DVDs, magnetic recording media such as MDs, magneto-optical recording media such as PDs, tape media, magnetic recording media, and semiconductor memories such as IC cards and miniature cards. An FD, CD-ROM, or the like, as an example of a recording medium, can store some or all of the functions needed to realize the data processing functions for postscript information.

The hard disk device includes an area that stores data for the various processes performed by the control program and temporarily stores large amounts of data acquired by the apparatus itself or from outside.

With this configuration, in response to a command entered by the operator through the operation key unit 985b, the postscript information processing program is installed into the RAM 915 from a readable recording medium, such as a CD-ROM, that stores a program for executing the postscript information processing method described above. The program is then started by an operator command entered through the operation key unit 985b or by automatic processing. For example, when realizing the automatic teaching material scoring system 1, the teaching material processing program describes processing steps such as recognizing the difference extraction result for a predetermined color component, specifically the red component for example, as the entered scoring symbols 87 and comments 88 and separating the two, and this program is started.
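The idea of treating the red-component difference as the grader's entries can be illustrated with a small pure-Python sketch. It assumes that the original and scanned images are already aligned, have the same size, and are given as rows of (R, G, B) tuples. The function names and the `min_red` and `margin` thresholds are hypothetical, and a real implementation would also need distortion correction and noise handling.

```python
def is_red(pixel, min_red=150, margin=60):
    """Assumed test for 'red ink': strong R that dominates G and B."""
    r, g, b = pixel
    return r >= min_red and r - max(g, b) >= margin


def extract_red_postscript(original, scanned):
    """Return a boolean mask of pixels that differ from the original and are red.

    original, scanned -- aligned images as lists of rows of (R, G, B) tuples
    """
    same_shape = len(original) == len(scanned) and all(
        len(o_row) == len(s_row) for o_row, s_row in zip(original, scanned)
    )
    if not same_shape:
        raise ValueError("images must have the same size")
    return [
        [s != o and is_red(s) for o, s in zip(o_row, s_row)]
        for o_row, s_row in zip(original, scanned)
    ]
```

The resulting mask marks candidate pixels only. Separating scoring symbols from comments and recognizing their content would be later stages.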

The CPU 912 performs the computations involved in the postscript information processing method according to this postscript information processing program, stores the results in a storage device such as the RAM 915 or a hard disk, and outputs them as needed to the operation panel unit 985a or to a display device such as a CRT or LCD. By using a recording medium on which a program for executing the postscript information processing method is recorded, a postscript information processing system can be built on general-purpose equipment without changing the existing system.

The postscript information processing apparatus 10 is not limited to such a computer-based configuration. It can also be built from a combination of dedicated hardware that performs the processing of the functional units shown in FIG. 2.

For example, rather than performing all the processing of the functional parts for the data processing functions in software, a processing circuit 908 may be provided that performs some of these functional parts in dedicated hardware. Software can flexibly handle parallel and sequential processing, but as the processing becomes more complex the processing time grows, so reduced processing speed becomes a problem.

Performing the processing in hardware processing circuits, by contrast, makes it possible to build an accelerator system designed for higher speed. Such an accelerator system can prevent a drop in processing speed and achieve high throughput even when the processing is complex.

For example, to realize the data processing functions for postscript information, the processing circuit 908 may implement the following in hardware: a read image processing unit 908a corresponding to the read image processing unit 110 shown in FIG. 2, a document original specifying unit 908b corresponding to the document original specifying unit 120, a postscript information extraction unit 908c corresponding to the postscript information extraction unit 130, a data-processing-target postscript information specifying processing unit 908d corresponding to the data-processing-target postscript information specifying processing unit 150, a data processing unit 908e corresponding to the data processing unit 170, and so on.

The present invention has been described above using embodiments, but its technical scope is not limited to the scope described in those embodiments. Various changes or improvements can be made to the embodiments without departing from the gist of the invention, and forms incorporating such changes or improvements are also included in the technical scope of the present invention.

The above embodiments do not limit the claimed invention, and not all combinations of the features described in the embodiments are necessarily essential to the solution provided by the invention. The embodiments include inventions at various stages, and various inventions can be extracted by appropriately combining the disclosed constituent features. Even if some constituent features are removed from all those shown in the embodiments, the configuration without those features can be extracted as an invention as long as the effect is obtained.

For example, distortion correction and break correction are not necessarily essential.

The additional-information-filled teaching material 81 used in the automatic scoring and tallying mechanism has been described as an example of the postscripted document 8B to be processed, but it is only an example. Any type of document may be used as long as it contains handwritten postscript information to be processed. For example, it may be a filled-in form used in an automatic form processing mechanism (see JP-A-5-342239 and JP-A-6-274156) or an additional-information-filled notebook used in a personal information management mechanism (in particular, also called schedule management) (see JP-A-5-216932).

Further, as described above, the annotated document 8B to be processed is not limited to a paper medium. Even when the annotated document 8B is obtained in the form of electronic data from the beginning, the mechanism described above can be applied in the same way, as long as the document contains additional information that is subject to data processing.

A diagram showing an educational teaching material, which is an example of a document to be processed in the postscript information processing according to the present invention.
A diagram showing a configuration example of one embodiment of an information processing system including the postscript information processing apparatus according to the present invention.
A diagram showing an overall outline of the teaching material processing method in the teaching material automatic scoring system, in association with the system configuration diagram.
A flowchart showing the overall processing procedure in the teaching material automatic scoring system.
A diagram showing an example of how the recognition performance information presentation processing unit collects feature quantities related to the recognition rate and reliability of each process.
A diagram explaining the "extraction miss" type of misrecognition handled by the recognition performance information presentation processing unit.
A diagram (part 1) explaining the "missing interpolation error" type of misrecognition handled by the recognition performance information presentation processing unit.
A diagram (part 2) explaining the "missing interpolation error" type of misrecognition handled by the recognition performance information presentation processing unit.
A diagram explaining the "multiple additions" type of misrecognition handled by the recognition performance information presentation processing unit.
A diagram explaining the "additional correction" type of misrecognition handled by the recognition performance information presentation processing unit.
A diagram explaining the first presentation method, by which the recognition performance information presentation unit presents to the user recognition performance information for improving recognition performance.
A diagram (part 1) explaining the second presentation method, by which the recognition performance information presentation unit presents to the user recognition performance information for improving recognition performance.
A diagram (part 2) explaining the second presentation method, by which the recognition performance information presentation unit presents to the user recognition performance information for improving recognition performance.
A diagram (part 3) explaining the second presentation method, by which the recognition performance information presentation unit presents to the user recognition performance information for improving recognition performance.
A diagram (part 4) explaining the second presentation method, by which the recognition performance information presentation unit presents to the user recognition performance information for improving recognition performance.
A diagram explaining the third presentation method, by which the recognition performance information presentation unit presents to the user recognition performance information for improving recognition performance.
A diagram showing a specific reliability information collection and presentation process for educational teaching materials, in association with the system configuration diagram.
A flowchart showing the reliability information collection and presentation processing procedure.
A diagram showing an example of a hardware configuration in which the postscript information processing apparatus is implemented using an electronic computer.

Explanation of symbols

1 ... teaching material automatic scoring system, 8A ... original document, 8B ... annotated document, 9 ... difference information, 10 ... postscript information processing apparatus, 20 ... document input device, 30 ... document management server, 40 ... processing result storage server, 80 ... educational teaching material, 81 ... teaching material with additional information entered, 87 ... scoring symbol, 88 ... comment, 110 ... read image processing unit, 120 ... original document specifying unit, 122 ... teaching material specifying unit, 130 ... additional information extraction unit, 132 ... difference extraction unit, 134 ... answerer extraction unit, 136 ... handwritten information cutout unit, 138 ... character recognition processing unit, 140 ... data-processing additional information extraction processing unit, 142 ... data-processing-target additional information extraction unit, 146 ... additional information shaping unit, 148 ... extracted line segment break correction unit, 150 ... data-processing-target additional information specifying processing unit, 154 ... first data-processing additional information recognition unit, 155 ... separation recognition processing unit, 156 ... graphic shape recognition unit, 157 ... character recognition processing unit, 158 ... entry position recognition unit, 164 ... second data-processing additional information recognition unit, 166 ... graphic shape recognition unit, 167 ... character recognition processing unit, 156a, 166a, 157a, 167a ... deformation processing unit, 168 ... entry position recognition unit, 170 ... data processing unit, 170_1 ... first data processing unit, 170_2 ... second data processing unit, 171 ... user terminal, 172 ... scoring tallying unit, 174 ... tally result output unit, 176 ... comment classification processing unit, 178 ... comment processing result output, 190 ... recognition performance information presentation processing unit, 192 ... additional information recognition history holding unit, 194 ... difficult-to-recognize additional information extraction unit, 198 ... recognition performance information presentation unit

Claims

1. An input correction method for correcting additional information that has been additionally written and is subjected to recognition processing, the method comprising:
acquiring, for the additional information of interest, a feature quantity related to the reliability of the recognition processing in each of various processes related to the recognition processing, including the recognition processing itself; calculating, based on each feature quantity, a sub-reliability for each of the processes; specifying a final reliability for the additional information of interest based on the calculated sub-reliabilities; identifying difficult-to-recognize information, for which the reliability of the recognition processing is lower than a certain level, by determining whether the final reliability is lower than the certain level; and presenting recognition performance information for improving the recognition performance of the identified difficult-to-recognize information.

2. A postscript information processing method that takes as its processing target a document in which additional information has been added to an original document, recognizes the content of the added additional information, and performs predetermined data processing based on the recognized information, the method comprising:
acquiring, for the additional information of interest, a feature quantity related to the reliability of the recognition processing in each of various processes related to the recognition processing, including the recognition processing itself; calculating, based on each feature quantity, a sub-reliability for each of the processes; specifying a final reliability for the additional information of interest based on the calculated sub-reliabilities; identifying difficult-to-recognize information, for which the reliability of the recognition processing is lower than a certain level, by determining whether the final reliability is lower than the certain level; presenting recognition performance information for improving the recognition performance of the identified difficult-to-recognize information; and
reflecting, in the data processing, corrected additional information entered in response to the presentation of the recognition performance information.

3. A postscript information processing apparatus that takes as its processing target a document in which additional information has been added to an original document, recognizes the content of the added additional information, and performs predetermined data processing based on the recognized information, the apparatus comprising:
a difficult-to-recognize additional information extraction unit that acquires, for the additional information of interest, a feature quantity related to the reliability of the recognition processing in each of various processes related to the recognition processing, including the recognition processing itself, calculates, based on each feature quantity, a sub-reliability for each of the processes, specifies a final reliability for the additional information of interest based on the calculated sub-reliabilities, and identifies difficult-to-recognize information, for which the reliability of the recognition processing is lower than a certain level, by determining whether the final reliability is lower than the certain level; a recognition performance information presentation unit that presents recognition performance information for improving the recognition performance of the difficult-to-recognize information identified by the difficult-to-recognize additional information extraction unit; and
a data processing unit that reflects, in the data processing, corrected additional information entered in response to the presentation of the recognition performance information by the recognition performance information presentation unit.

4. The postscript information processing apparatus according to claim 3, wherein the difficult-to-recognize additional information extraction unit specifies, as the final reliability, the lowest of the sub-reliabilities of the respective processes.

5. The postscript information processing apparatus according to claim 3, wherein the difficult-to-recognize additional information extraction unit specifies the final reliability by an addition process in which a predetermined weight is assigned to the sub-reliability of each process.

6. The postscript information processing apparatus according to claim 3, wherein types of misrecognition with low reliability are defined in advance, and the difficult-to-recognize additional information extraction unit identifies the difficult-to-recognize information by selecting a type of misrecognition based on the result of an addition process applied to the sub-reliabilities of the respective processes.

7. The postscript information processing apparatus according to claim 3, wherein the recognition performance information presentation unit changes the method of presenting the recognition performance information according to the reliability.

8. The postscript information processing apparatus according to any one of claims 3 to 6, wherein, for all of the additional information subjected to recognition processing, the recognition performance information presentation unit presents the difficult-to-recognize information, which is additional information whose reliability is lower than the certain level, so that it is distinguished from the other additional information.

9. The postscript information processing apparatus according to any one of claims 3 to 6, wherein the recognition performance information presentation unit presents, as the recognition performance information, the reliability together with the reason for the determination.

10. A program for using a computer to perform processing that takes as its processing target a document in which additional information has been added to an original document, recognizes the content of the added additional information, and performs predetermined data processing based on the recognized information,
the program causing the computer to function as:
a difficult-to-recognize additional information extraction unit that acquires, for the additional information of interest, a feature quantity related to the reliability of the recognition processing in each of various processes related to the recognition processing, including the recognition processing itself, calculates, based on each feature quantity, a sub-reliability for each of the processes, specifies a final reliability for the additional information of interest based on the calculated sub-reliabilities, and identifies difficult-to-recognize information, for which the reliability of the recognition processing is lower than a certain level, by determining whether the final reliability is lower than the certain level; a recognition performance information presentation unit that presents recognition performance information for improving the recognition performance of the difficult-to-recognize information identified by the difficult-to-recognize additional information extraction unit; and
a data processing unit that reflects, in the data processing, corrected additional information entered in response to the presentation of the recognition performance information by the recognition performance information presentation unit.
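
As an informal illustration only, and not part of the claims, the reliability determination recited above can be sketched in Python. The sketch assumes that each sub-reliability is a score between 0 and 1 and that the process names, weights, and threshold are supplied by the caller; every identifier in it (`Annotation`, `final_reliability_min`, `final_reliability_weighted`, `find_hard_to_recognize`) is hypothetical and does not appear in the specification. It shows the two ways of deriving the final reliability (taking the lowest sub-reliability, or a weighted addition) and the comparison against a certain level.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Annotation:
    """One piece of additional information (hypothetical container)."""
    label: str
    # Sub-reliability per process, e.g. extraction, break correction,
    # recognition. Scores are assumed to lie in [0, 1].
    sub_reliabilities: Dict[str, float]


def final_reliability_min(subs: Dict[str, float]) -> float:
    """Final reliability taken as the lowest sub-reliability."""
    return min(subs.values())


def final_reliability_weighted(subs: Dict[str, float],
                               weights: Dict[str, float]) -> float:
    """Final reliability as a weighted addition of the sub-reliabilities."""
    return sum(weights[name] * score for name, score in subs.items())


def find_hard_to_recognize(
    annotations: List[Annotation],
    threshold: float,
    weights: Optional[Dict[str, float]] = None,
) -> List[Tuple[str, float, str]]:
    """Return (label, final reliability, weakest process) for every
    annotation whose final reliability is below the threshold."""
    flagged = []
    for ann in annotations:
        if weights is None:
            final = final_reliability_min(ann.sub_reliabilities)
        else:
            final = final_reliability_weighted(ann.sub_reliabilities, weights)
        if final < threshold:
            # The weakest process can serve as a reason shown to the user.
            weakest = min(ann.sub_reliabilities, key=ann.sub_reliabilities.get)
            flagged.append((ann.label, final, weakest))
    return flagged


if __name__ == "__main__":
    items = [
        Annotation("answer-1", {"extraction": 0.9, "break_correction": 0.4,
                                "recognition": 0.8}),
        Annotation("answer-2", {"extraction": 0.9, "break_correction": 0.9,
                                "recognition": 0.95}),
    ]
    for label, final, weakest in find_hard_to_recognize(items, threshold=0.5):
        print(f"{label}: final reliability {final:.2f}, weakest process: {weakest}")
```

With the lowest-value rule, a single weak process is enough to flag an item, whereas the weighted addition lets strong processes offset a weak one; which behaviour is preferable depends on how the threshold and weights are chosen.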