JP2023002091A

JP2023002091A - Information processing system, method and program

Info

Publication number: JP2023002091A
Application number: JP2021103109A
Authority: JP
Inventors: 直樹岸川; Naoki Kishikawa
Original assignee: PFU Ltd
Current assignee: PFU Ltd
Priority date: 2021-06-22
Filing date: 2021-06-22
Publication date: 2023-01-10

Abstract

To provide an information processing system, method and program capable of relieving a user's burden in correcting character strings obtained by character recognition.

SOLUTION: An information processor 1 as an information processing system includes: a character recognition result acquisition section 22 which acquires a recognized character string obtained by character recognition of a character string image; a prediction object specification section 25 which specifies one or more prescribed positions of the recognized character string as one or more prediction object positions; a prediction section 26 which predicts a correct answer character to enter each prediction object position specified by the prediction object specification section 25; and a replacement candidate character generation section 27 which generates one or more replacement candidate character strings including at least one of the correct answer characters as a character string for replacement with at least a part of the recognized character string.

SELECTED DRAWING: Figure 2

Description

The present disclosure relates to character recognition technology.

Conventionally, various techniques have been proposed for reducing the user's workload when correcting the results of character recognition (see Patent Documents 1 to 4).

JP-A-60-254387
JP-A-5-342402
JP-A-7-28956
JP-A-5-298495

Conventionally, character strings obtained by character recognition are corrected on a character-by-character basis. In addition, even when candidate characters for correction are presented, they are intended to correct misrecognition in character recognition, so the glyphs of the presented candidate characters resemble one another. Having the user pick out the correct character from the presented candidate characters therefore imposes a heavy cognitive burden on the user.

In view of the above problem, an object of the present disclosure is to reduce the user's burden in correcting character strings obtained by character recognition.

An example of the present disclosure is an information processing system including: character recognition result acquisition means for acquiring a recognized character string obtained by character recognition of a character string image; guess target designation means for designating a predetermined location in the recognized character string as a guess target location; guessing means for guessing a correct character that should occupy the guess target location designated by the guess target designation means; and replacement candidate generation means for generating a replacement candidate character string that includes the correct character, as a character string for replacing at least part of the recognized character string.

The present disclosure can be understood as an information processing device, a system, a method executed by a computer, or a program to be executed by a computer. The present disclosure can also be understood as such a program recorded on a recording medium readable by a computer or by another device, machine, or the like. Here, a recording medium readable by a computer or the like refers to a recording medium that stores information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action and from which a computer or the like can read that information.

According to the present disclosure, it is possible to reduce the user's burden in correcting character strings obtained by character recognition.

FIG. 1 is a schematic diagram showing the configuration of a system according to an embodiment.
FIG. 2 is a diagram showing an outline of the functional configuration of an information processing device according to the embodiment.
FIG. 3 is a diagram showing an example of a confirmation/correction screen displayed in the embodiment.
FIG. 4 is a flowchart showing an overview of the flow of candidate presentation processing according to the embodiment.
FIG. 5 is a diagram showing an example of query strings generated in the embodiment.
FIG. 6 is a diagram showing an example of a replacement candidate character string list displayed on the confirmation/correction screen in the embodiment.
FIG. 7 is a diagram showing an example of how a replacement candidate character string is selected from the replacement candidate character string list in the embodiment.

Hereinafter, embodiments of the information processing device, system, method, and program according to the present disclosure will be described with reference to the drawings. The embodiments described below are examples, however, and do not limit the information processing device, system, method, and program according to the present disclosure to the specific configurations described below. In implementation, a specific configuration suited to the mode of implementation may be adopted as appropriate, and various improvements and modifications may be made.

This embodiment describes a case in which the information processing device, system, method, and program according to the present disclosure are implemented in a system that assists quality confirmation work for digitized forms. However, the information processing device, system, method, and program according to the present disclosure can be used broadly for character recognition technology, and the application of the present disclosure is not limited to the examples shown in the embodiment.

Conventionally, character recognition (OCR) results are confirmed and corrected as follows: the user visually compares the character string image that was subjected to character recognition with the character string recognized by OCR and, when a misrecognized character is found in the recognized character string, moves the cursor to the position of the misrecognized character, deletes it, and inputs the correct character, repeating this operation for every misrecognized character. Furthermore, when the misrecognized character is a single-byte character, the correct character can be input simply by, for example, pressing its key, but inputting a multi-byte character such as a kanji or kana requires conversion processing (for Japanese, Japanese conversion processing). Conversion processing includes operations such as inputting conversion elements, finding and selecting the correct character from the conversion candidates presented by the system based on those elements, and confirming the selection. The user's work in confirming and correcting a recognized character string that contains multi-byte characters is therefore more cumbersome than for a recognized character string that contains only single-byte characters.

To address this problem, a technique has conventionally been proposed in which, when the user selects a misrecognized character, a list of candidate characters that were not adopted during character recognition is displayed and the user selects the correct character from that list. Even with this technique, however, misrecognized characters are corrected one character at a time, so when there are multiple misrecognized characters the user must display the list and make a selection for each of them. Furthermore, because the candidate characters are intended to correct misrecognition in character recognition, the candidate character list contains characters with similar glyphs, and distinguishing the correct character imposes a heavy cognitive burden on the user. In addition, the content of the candidate character list depends on the recognition accuracy of the OCR system employed, so the list may not contain the correct character at all.

For this reason, the system described in this embodiment makes it possible to obtain correct characters from outside the candidate characters of character recognition as well, and presents replacement candidate character strings that contain the correct characters for one or more misrecognized characters and that are natural phrases, from which the user makes a selection. This reduces the user's burden in correcting character strings obtained by character recognition. When implementing the technology according to the present disclosure, however, it is not necessary to adopt all of the configurations described below and thereby solve all of the problems described above. Some of those problems may be solved by adopting only part of the configurations described below.

<System configuration>
FIG. 1 is a schematic diagram showing the configuration of the system according to this embodiment. The system according to this embodiment includes a scanner 3 and an information processing device 1 that are communicably connected to each other via a network or other communication means.

The information processing device 1 is a computer including a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a storage device 14 such as an EEPROM (Electrically Erasable and Programmable Read Only Memory) or an HDD (Hard Disk Drive), an input device 15 such as a keyboard, mouse, or touch panel, an output device 16 such as a display, a communication unit 17, and the like. Components of the specific hardware configuration of the information processing device 1 may be omitted, replaced, or added as appropriate according to the mode of implementation. The information processing device 1 is also not limited to a device housed in a single enclosure. It may be realized by a plurality of devices using so-called cloud or distributed computing technology.

The scanner 3 is a device that acquires image data by imaging an original set by the user, such as a document, business card, receipt, or photograph/illustration. Although this embodiment uses the scanner 3 as an example of a device for acquiring the target image, the device used to acquire the image is not limited to a so-called scanner. For example, the image may be obtained by imaging the target with a digital camera or with a camera sensor built into a smartphone or tablet.

The scanner 3 according to this embodiment has a function of transmitting image data obtained by imaging to the information processing device 1 via a network. The scanner 3 may further have a user interface that enables character input/output and item selection, such as a touch panel display or a keyboard, as well as a web browsing function and a server function. The communication means, hardware configuration, and the like of a scanner that can employ the method according to this embodiment are not limited to those illustrated here.

FIG. 2 is a diagram showing an outline of the functional configuration of the information processing device 1 according to this embodiment. A program recorded in the storage device 14 is read into the RAM 13 and executed by the CPU 11 to control each piece of hardware provided in the information processing device 1, whereby the information processing device 1 functions as an information processing device including a character recognition unit 21, a character recognition result acquisition unit 22, an output unit 23, a selection reception unit 24, a guess target designation unit 25, a guessing unit 26, a replacement candidate generation unit 27, a natural phrase determination unit 28, a correction unit 29, and a replacement unit 30. In this embodiment and the other embodiments described later, each function of the information processing device 1 is executed by the CPU 11, which is a general-purpose processor, but some or all of these functions may be executed by one or more dedicated processors.

The character recognition unit 21 executes character recognition (OCR) processing on a character string image included in input image data and outputs a recognized character string. Here, the image data may be image data obtained by imaging with the scanner 3 or the like, or may be an image generated as image data from the start. In this embodiment, the character recognition unit 21 obtains one or more candidate characters for each character in the character string image and adopts the candidate with the highest confidence to create and output the recognized character string; it can also output, as reference data, the candidate characters that were not adopted in the recognized character string because their confidence ranked second or lower. Although this embodiment describes an example in which the information processing device 1 includes the character recognition unit 21, the information processing device 1 only needs to be able to acquire the recognized character string resulting from character recognition and need not include the character recognition unit 21. The character recognition unit 21 may instead be provided in a device other than the information processing device 1, such as the scanner 3, an external device, or a server.

The character recognition result acquisition unit 22 acquires the recognized character string obtained by character recognition of the character string image. In this embodiment, in addition to the recognized character string, the character recognition result acquisition unit 22 also acquires the candidate characters that were not adopted in the recognized character string during character recognition.
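The relationship between the recognized character string and the unadopted candidate characters can be illustrated with a minimal Python sketch. The input format (a list of `(character, confidence)` pairs per position) and the function name `split_ocr_result` are assumptions made for illustration; the publication does not specify how the character recognition unit 21 represents its output.

```python
def split_ocr_result(per_char):
    """Split per-position OCR candidates into a recognized string and the rest.

    per_char: list with one entry per character position; each entry is a
              non-empty list of (character, confidence) pairs (assumed format).
    Returns (recognized_string, unadopted), where unadopted maps a position
    to the candidates ranked second or lower, best first.
    """
    recognized = []
    unadopted = {}
    for index, candidates in enumerate(per_char):
        # Rank candidates by confidence, highest first.
        ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
        # The top-ranked candidate is adopted into the recognized string.
        recognized.append(ranked[0][0])
        # Everything else is kept as reference data for later steps.
        if len(ranked) > 1:
            unadopted[index] = [char for char, _ in ranked[1:]]
    return "".join(recognized), unadopted


text, others = split_ocr_result([
    [("c", 0.9)],
    [("o", 0.5), ("a", 0.4)],
    [("t", 0.8), ("l", 0.1)],
])
```

Here `text` is the string shown to the user, while `others` holds the characters that later steps may try in place of the adopted ones.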

The output unit 23 displays the character string image side by side with the character string recognized from it.

FIG. 3 is a diagram showing an example of the confirmation/correction screen 5 displayed on the display of the information processing device 1 in this embodiment. On the confirmation/correction screen 5, the character string image 51 that was read and the recognized character string 52 recognized from it are displayed side by side (one above the other in the case of horizontal writing). The recognized character string 52 may be displayed in any font.

The selection reception unit 24 receives the user's selection of a correction target character. The user viewing the confirmation/correction screen 5 compares the character string image 51 with the recognized character string 52 displayed beside it to confirm whether the result of character recognition by the character recognition unit 21 is correct. If the result contains a misrecognized character, the user designates that character (hereinafter the "correction target character") as a correction target by placing the cursor 53 on it, which puts the correction target character in a selected state. The cursor 53 may be operated by input using direction keys or the like, or by input using a pointing device.

The guess target designation unit 25 designates a guess target location by generating a query string in which the character at a predetermined location in the recognized character string 52 (hereinafter the "guess target location") is replaced with a predetermined mask character. Here, the guess target designation unit 25 generates a query string in which at least the correction target character designated by the user is treated as a guess target location and replaced with the mask character.

In this embodiment, to further reduce the user's operation burden and to obtain correction candidates that are more natural as a whole, the guess target designation unit 25 generates query strings in which the correction target character and arbitrary characters following it are replaced with mask characters. In this way, even when there are misrecognized characters other than the correction target character explicitly designated by the user, a natural replacement candidate character string containing the correct characters for multiple characters can be proposed without waiting for the user to select each misrecognized character, and multiple characters can be corrected together. The guess target designation unit 25 generates a plurality of query strings, one for each combination of the masked correction target character and the masked characters chosen from those that follow it.

Furthermore, in addition to replacement with mask characters, the guess target designation unit 25 may generate query strings by replacing characters in the recognized character string 52 with their corresponding candidate characters. In this case, the guess target designation unit 25 generates a plurality of query strings obtained by combining replacement with mask characters and replacement with candidate characters. As described above, candidate characters are the characters, acquired by the character recognition result acquisition unit 22, that were not adopted in the recognized character string 52 during character recognition. This makes it possible to guess the correct characters while also taking into account characters that were raised as candidates in the processing by the character recognition unit 21 but not adopted.

The guessing unit 26 guesses the correct character that should occupy the guess target location designated by the guess target designation unit 25. In this embodiment, the guessing unit 26 guesses the correct character by guessing the character that should occupy the guess target location indicated by a mask character in the generated query string.

The replacement candidate generation unit 27 generates, as a character string for replacing at least part of the recognized character string 52, a replacement candidate character string in which the mask characters in the query string are replaced with the results guessed by the guessing unit 26 (the correct characters). When a plurality of query strings have been generated, the replacement candidate generation unit 27 generates a plurality of replacement candidate character strings, one for each query string.

The natural phrase determination unit 28 determines whether a replacement candidate character string is a natural phrase. The natural phrase determination unit 28 also calculates an index indicating the degree to which the replacement candidate character string is a natural phrase, which is used as a priority when displaying a plurality of replacement candidate character strings.

When a replacement candidate character string is determined not to be a natural phrase, the correction unit 29 corrects it by deleting its last character. When the correction unit 29 has corrected a replacement candidate character string, the natural phrase determination unit 28 further determines whether the corrected replacement candidate character string is a natural phrase.
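The interplay between the correction unit 29 and the natural phrase determination unit 28 can be sketched in Python as follows. This is only an illustration under stated assumptions: `is_natural` stands in for the learned natural phrase determination, the function name `trim_until_natural` is hypothetical, and repeating the trim up to `max_trims` times is an assumption, since this passage describes a single correction followed by a re-check.

```python
def trim_until_natural(candidate, is_natural, max_trims=3):
    """Delete trailing characters until `is_natural` accepts the string.

    candidate:  replacement candidate character string
    is_natural: callable returning True for natural phrases (stand-in for
                the natural phrase determination unit)
    max_trims:  assumed bound on how many trailing characters may be deleted
    Returns the accepted string, or None if no accepted string was found.
    """
    current = candidate
    for _ in range(max_trims + 1):
        if not current:
            return None  # nothing left to judge
        if is_natural(current):
            return current
        # Correction: delete the last character, then judge again.
        current = current[:-1]
    return None


# Toy judge: a phrase is "natural" if it appears in a tiny word list.
known_phrases = {"cat", "invoice"}
result = trim_until_natural("catxy", lambda s: s in known_phrases)
```

In this toy run, "catxy" and "catx" are rejected and "cat" is accepted after two deletions.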

The replacement unit 30 replaces the corresponding character string in the recognized character string 52 with the replacement candidate character string selected by the user.

<Process flow>
Next, the flow of processing executed by the information processing device 1 according to this embodiment will be described. The specific content and order of the processing described below are one example for carrying out the present disclosure. The specific processing content and order may be selected as appropriate according to the embodiment of the present disclosure.

In this embodiment, character recognition by the character recognition unit 21 is executed first. The character recognition result acquisition unit 22 then acquires the recognized character string 52 and the candidate characters (characters not adopted in the recognized character string 52 because their confidence ranked second or lower) output by the character recognition unit 21, and the output unit 23 outputs the character string image 51 that was subjected to character recognition side by side with the recognized character string 52 acquired from the character recognition unit 21, for confirmation and correction by the user (see FIG. 3). The user visually compares the character string image 51 with the recognized character string 52 and, on finding a misrecognized character in the recognized character string 52, moves the cursor 53 to the position of that character. The selection reception unit 24 accepts the user's operation of moving the cursor 53 as the selection of a correction target character and puts the misrecognized character in a selected state.

FIG. 4 is a flowchart showing an overview of the flow of candidate presentation processing according to this embodiment. The processing shown in this flowchart is triggered when a user operation puts any character (a misrecognized character) in the recognized character string 52 into a selected state.

In step S101, query strings for the guessing process are generated. The guess target designation unit 25 generates one or more query strings to be input to a learning model for character guessing, which the guessing unit 26 uses in the correct character guessing process described later. In this embodiment, the guess target designation unit 25 generates query strings by replacing one or more characters from the correction target character selected by the user onward with combinations of mask characters and candidate characters. Here, the mask character is a special character that indicates a guess target location to the learning model for character guessing, and "@" is used in this embodiment. Other symbols may be used as the mask character, however, and a method other than mask characters may be adopted to specify the guess target. As described above, candidate characters are the characters acquired by the character recognition result acquisition unit 22 that were not adopted in the recognized character string 52 because their confidence was determined to rank second or lower in the character recognition process.

In this embodiment, the guess target designation unit 25 generates one or more query strings by replacing one or more characters from the correction target character onward with every possible combination of mask characters and candidate characters. An upper limit may be set on the number of query strings generated, however. This allows the load of the candidate presentation processing shown in this flowchart to be matched to the processing capability of the information processing device 1. The upper limit on the number of characters from the correction target character onward to include in a query string, the range of candidate characters used to generate query strings (down to which confidence rank candidate characters are used), the upper limit on the number of mask characters in a query string, and similar settings may also be configurable.
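The combination scheme described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the publication's implementation: the function name `generate_queries`, the parameters `span` and `limit`, the choice to always mask the user-selected character, and the choice to let each following character stay as recognized, be masked, or be swapped for an unadopted candidate are all assumptions for the example.

```python
from itertools import islice, product

MASK = "@"  # mask character used in the embodiment


def generate_queries(recognized, target, candidates, span=3, limit=1000):
    """Build query strings for the characters from `target` onward.

    recognized: recognized character string from OCR
    target:     index of the correction target character
    candidates: dict mapping an index to its unadopted candidate characters
    span:       assumed cap on how many characters may be replaced
    limit:      assumed cap on the number of generated query strings
    """
    end = min(len(recognized), target + span)
    options = []
    for i in range(target, end):
        if i == target:
            # The user-selected character is always a guess target.
            choices = [MASK]
        else:
            # Keep the recognized character, mask it, or try a candidate.
            choices = [recognized[i], MASK] + list(candidates.get(i, []))
        options.append(choices)
    head, tail = recognized[:target], recognized[end:]
    # Enumerate every combination lazily and stop at the upper limit.
    combos = islice(product(*options), limit)
    return [head + "".join(combo) + tail for combo in combos]


queries = generate_queries("abcd", 1, {2: ["x"]}, span=2)
```

Because the number of combinations grows multiplicatively with the span and the number of candidates per position, the `limit` and `span` caps play the role of the configurable upper limits mentioned in the text.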

FIG. 5 is a diagram showing an example of the query strings generated by the information processing device 1 in this embodiment. In the example shown in FIG. 5, one or more characters from the correction target character onward are replaced with multiple combinations of the mask character (the special character "@" in the figure) and candidate characters (the bold characters such as "云", "雲", and "パ" in the figure), producing a plurality of query strings (numbered 1 to 625 in the figure). When generation of the query strings for the guessing process is complete, the processing proceeds to step S102.

In step S102, the correct characters for the locations indicated by the mask characters in the query strings are guessed, and replacement candidate character strings containing the correct characters are generated. The guessing unit 26 gives the one or more query strings generated in step S101 as input to the learning model for character guessing, which was created by learning in advance, and obtains as output from the model the guessed correct characters for the locations indicated by the mask characters in the query strings.

The guess result may be output as the correct characters themselves, or as a character string corresponding to the query string and containing the correct characters (a replacement candidate character string). When correct characters are obtained as the guess result, the replacement candidate generation unit 27 generates a replacement candidate character string in which the mask characters in the query string are replaced with the correct characters. When a plurality of query strings were generated in step S101, a plurality of replacement candidate character strings are obtained, one for each query string. The processing then proceeds to step S103. If only one replacement candidate character string is obtained, however, the processing of steps S103 to S108 may be skipped.
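The step of turning guessed characters into replacement candidate character strings can be sketched in Python. The publication uses a learning model for character guessing whose details are not given here, so `toy_predict` below is purely a stand-in that matches the masked query against a small word list; it, `fill_masks`, and `make_candidates` are hypothetical names for illustration.

```python
MASK = "@"  # mask character used in the embodiment


def fill_masks(query, predicted):
    """Replace each mask in `query`, left to right, with a guessed character."""
    if query.count(MASK) != len(predicted):
        raise ValueError("need exactly one guessed character per mask")
    remaining = iter(predicted)
    return "".join(next(remaining) if ch == MASK else ch for ch in query)


def toy_predict(query, vocabulary):
    """Stand-in for the learned model: guess masks from a matching word.

    Returns the characters for the masked positions taken from the first
    vocabulary word that agrees with every unmasked character, or None.
    """
    for word in vocabulary:
        if len(word) == len(query) and all(
            q in (MASK, w) for q, w in zip(query, word)
        ):
            return [w for q, w in zip(query, word) if q == MASK]
    return None


def make_candidates(queries, vocabulary):
    """Produce de-duplicated replacement candidates, one per solvable query."""
    results = []
    for query in queries:
        predicted = toy_predict(query, vocabulary)
        if predicted is None:
            continue  # the stand-in model found no plausible characters
        candidate = fill_masks(query, predicted)
        if candidate not in results:
            results.append(candidate)
    return results


candidates = make_candidates(["c@t", "c@@", "d@g"], ["cat", "dog"])
```

Different query strings can lead to the same candidate, which is why the sketch removes duplicates; whether the described system does so is not stated in this passage.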

In steps S103 to S107, it is determined whether each replacement candidate character string is a natural phrase. The natural phrase determination unit 28 supplies each of the replacement candidate character strings obtained in step S102 as input to a natural-phrase determination learning model created by prior training, and obtains as output, for each replacement candidate character string, a determination of whether it is a natural phrase and an index indicating the degree to which it is a natural phrase (step S103). In this embodiment, the index is the probability that the replacement candidate character string is a natural phrase (for example, a value between 0.0 and 1.0). Other forms of index, such as a score or a rank, may be adopted instead. A replacement candidate character string may be determined to be a natural phrase when its index is equal to or greater than a predetermined threshold (for example, 0.6). The threshold used for this determination may be set arbitrarily.

When a replacement candidate character string is determined not to be a natural phrase (NO in step S104), the correction unit 29 corrects it by deleting its last character (step S105). If the corrected replacement candidate character string satisfies one of the determination skip conditions (YES in step S106), the process proceeds to step S107. The determination skip conditions are: (a) the last character of the corrected replacement candidate character string is the character corresponding to the correction target character that the user selected as a misrecognized character (in other words, the character corresponding to the first mask character in the query string); or (b) the corrected replacement candidate character string matches a replacement candidate character string that has already undergone natural phrase determination. If the corrected replacement candidate character string satisfies neither condition (NO in step S106), the process returns to step S103, and the natural phrase determination unit 28 determines whether the corrected replacement candidate character string is a natural phrase (step S103). That is, steps S103 to S106 are repeated, deleting the last character each time, until the target replacement candidate character string is determined to be a natural phrase or one of the determination skip conditions is satisfied.
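The loop over steps S103 to S106 for a single candidate can be sketched as below. The scoring function stands in for the natural-phrase determination learning model and is supplied by the caller; `refine_candidate`, `score_fn`, and `judged` are hypothetical names. The text does not say what is kept when a skip condition fires, so returning the trimmed string with no score is an assumption of this sketch.

```python
def refine_candidate(candidate, target_index, score_fn, judged, threshold=0.6):
    """Trim `candidate` from the end until it is judged natural or a skip condition holds.

    target_index -- index of the correction target character (first mask position)
    score_fn     -- stand-in for the learning model: str -> probability in [0.0, 1.0]
    judged       -- set of strings already scored (shared across candidates)
    Returns (string, score), with score None when a skip condition ended the loop.
    """
    while True:
        score = score_fn(candidate)            # S103: natural phrase determination
        judged.add(candidate)
        if score >= threshold:                 # S104: YES
            return candidate, score
        candidate = candidate[:-1]             # S105: delete the last character
        ends_at_target = len(candidate) - 1 <= target_index
        if ends_at_target or candidate in judged:   # S106: skip conditions (a), (b)
            return candidate, None             # assumption: no score on skip


scores = {"総合展示会": 0.8}
result = refine_candidate("総合展示会パネX", 3, lambda s: scores.get(s, 0.1), set())
print(result)
```

Sharing the `judged` set across all candidates of one request is what lets condition (b) avoid scoring the same trimmed string twice.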

When the target replacement candidate character string is determined to be a natural phrase, or one of the determination skip conditions is satisfied, it is determined whether natural phrase determination has finished for all replacement candidate character strings generated in step S102 (step S107). If an undetermined replacement candidate character string remains (NO in step S107), the natural phrase determination unit 28 executes steps S103 to S106 on the next undetermined replacement candidate character string. That is, steps S103 to S106 are repeated, changing the target replacement candidate character string each time, until natural phrase determination has finished for all replacement candidate character strings generated in step S102. When it has finished for all of them (YES in step S107), the process proceeds to step S108.

In steps S108 and S109, the replacement candidate character strings are output in a predetermined priority order. The output unit 23 sorts the replacement candidate character strings obtained through step S107 in descending order of the degree/probability of being a natural phrase, as obtained from the natural-phrase determination learning model, and then sorts them in descending order of string length (step S108). As a result, the replacement candidate character strings are ordered by descending string length, and candidates of equal length are ordered by descending degree/probability of being a natural phrase.
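The two-pass ordering of step S108 gives the stated result only if the second sort is stable, that is, if it preserves the existing order of equal-length candidates. Python's `sorted` is stable, so the two passes can be written directly. The `(text, probability)` tuple layout and the sample probabilities are assumptions made for this sketch.

```python
def sort_candidates(candidates):
    """Order (text, probability) pairs as in step S108.

    Pass 1: descending probability of being a natural phrase.
    Pass 2: descending string length; the stable sort keeps the pass-1
            order among candidates of equal length.
    """
    by_probability = sorted(candidates, key=lambda c: c[1], reverse=True)
    return sorted(by_probability, key=lambda c: len(c[0]), reverse=True)


ranked = sort_candidates([
    ("総合展示会", 0.8),
    ("総合展示会パネル", 0.9),
    ("総合展覧会", 0.7),
    ("総合展示会パネラ", 0.3),
])
for text, probability in ranked:
    print(text, probability)
```

A single pass with the key `(-len(text), -probability)` would produce the same order; the two-pass form is shown because it mirrors the description.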

The output unit 23 then causes the display device to display a replacement candidate character string list containing the one or more generated replacement candidate character strings, alongside the character string image 51 that was subjected to character recognition and the recognized character string 52 acquired from the character recognition unit 21 (step S109). Because the replacement candidate character strings were sorted in step S108 as described above, displaying them in sorted order lets the output unit 23 give priority to longer replacement candidate character strings and, based on the index calculated by the natural phrase determination unit 28, to those that are more natural as phrases. The user can therefore select a longer and more natural phrase with few operations and correct more misrecognized characters at once. The processing shown in this flowchart then ends.

FIG. 6 shows an example of the replacement candidate character string list 54 displayed on the confirmation/correction screen 5 in this embodiment. As described with reference to FIG. 3, the confirmation/correction screen 5 displays the character string image 51 that was read and the recognized character string 52 recognized from that image side by side. The replacement candidate character string list 54 is displayed, in the order sorted in step S108, below the recognized character string 52 that contains the misrecognized character the user has selected with the cursor 53. In this embodiment, the candidate presentation process is triggered when a character in the recognized character string 52 becomes selected through movement of the cursor 53, and the replacement candidate character string list 54 is displayed automatically. The waiting time between selection and display of the replacement candidate character string list 54 can be changed by a setting. Within each replacement candidate character string in the list 54, the output unit 23 highlights (in bold in FIG. 6) the characters that differ from the recognized character string 52, that is, the characters that would be corrected.
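Finding the characters to highlight amounts to a position-by-position comparison between a replacement candidate and the recognized character string. The sketch below assumes the candidate is aligned with the start of the recognized string and has the same character positions, as in the example of FIG. 6 and FIG. 7; the square brackets are only a plain-text stand-in for bold display, and both function names are hypothetical.

```python
def diff_indices(recognized, candidate):
    """Indices in `candidate` whose character differs from `recognized`."""
    return [
        i for i, ch in enumerate(candidate)
        if i >= len(recognized) or recognized[i] != ch
    ]


def render_with_highlight(recognized, candidate):
    """Wrap differing characters in brackets as a stand-in for bold text."""
    changed = set(diff_indices(recognized, candidate))
    return "".join(f"[{ch}]" if i in changed else ch for i, ch in enumerate(candidate))


print(render_with_highlight("総合展本会ハネル", "総合展示会パネル"))
```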

FIG. 7 shows an example of a replacement candidate character string being selected from the replacement candidate character string list 54 in this embodiment. The output unit 23 highlights the portion of the recognized character string 52 that corresponds to the currently selected replacement candidate character string. In the example of FIG. 7, the replacement candidate character string "総合展示会" with index number 4 is selected, and the corresponding portion "総合展本会" of the recognized character string 52 is highlighted as the range to be replaced.

In this embodiment, the user can select the desired replacement candidate character string by various operation methods, such as: (1) changing the selected replacement candidate character string with the mouse wheel and confirming the selection with a click; (2) changing the selected replacement candidate character string with the keyboard arrow keys and confirming the selection with the Enter key; or (3) confirming the selection by pressing the number key corresponding to the index number displayed near the replacement candidate character string (in the examples of FIG. 6 and FIG. 7, the numbers 1 to 5 shown to the left of the replacement candidate character strings). The specific operation method for selecting the desired replacement candidate character string from the replacement candidate character string list 54 is not limited to the examples in this embodiment. For example, the desired replacement candidate character string may be selected using a pointing device.

When an operation by the user selecting and confirming a desired replacement candidate character string from the displayed replacement candidate character string list 54 is accepted, the replacement unit 30 replaces (corrects) the corresponding portion of the recognized character string 52 with the selected replacement candidate character string. In the example of FIG. 7, if the user selects the replacement candidate character string with index number 1, the replacement unit 30 replaces the entire recognized character string "総合展本会ハネル" with the replacement candidate character string "総合展示会パネル". That is, the system according to this embodiment presents "総合展示会パネル", a natural phrase that is meaningful as a whole, to the user as the highest-priority candidate, so that the two misrecognized characters "本" and "ハ" in the recognized character string "総合展本会ハネル" can be corrected at the same time. If the user instead selects the replacement candidate character string with index number 4, the replacement unit 30 replaces the portion "総合展本会" of the recognized character string "総合展本会ハネル", which corresponds to that candidate, with the replacement candidate character string "総合展示会". In this case too, presenting the natural, meaningful phrase "総合展示会" for selection allows the misrecognized character "本" in "総合展本会" to be corrected.
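The replacement performed by the replacement unit 30 can be sketched as a slice substitution. The `start` offset is an assumption added for generality; in the example of FIG. 7 both candidates begin at the first character of the recognized character string. `apply_replacement` is a hypothetical name.

```python
def apply_replacement(recognized, candidate, start=0):
    """Overwrite the portion of `recognized` that `candidate` corresponds to."""
    end = start + len(candidate)
    return recognized[:start] + candidate + recognized[end:]


recognized = "総合展本会ハネル"
print(apply_replacement(recognized, "総合展示会パネル"))  # index number 1: whole string
print(apply_replacement(recognized, "総合展示会"))        # index number 4: "ハ" remains
```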

When replacement (correction) of the corresponding portion of the recognized character string 52 with the replacement candidate character string is complete, the user resumes checking the recognized character string 52. If the user finds another misrecognized character and selects it, the candidate presentation process is executed again. For example, in the example of FIG. 7, if the user selects the replacement candidate character string with index number 4, the misrecognized character "ハ" remains; the user can move the cursor 53 to select "ハ" and run the candidate presentation process again. The user checks and corrects the recognized character string 52 by repeating this series of operations until no more misrecognized characters are found.

<Learning model>
The specific type of learning model usable as the character-guessing learning model in the technology of the present disclosure is not limited; any model that takes a query string as input and can output a guess of the correct character for the position indicated by a mask character may be used. For example, the correct character is output as the result of multi-class classification over all characters of Shift_JIS level 1 and level 2. An example of the pre-training flow for creating a character-guessing learning model usable in this embodiment is described below.

In pre-training, a large number of phrases (nouns or compound words) are collected; an arbitrary proportion of the characters in each phrase (at least one character) is randomly selected and replaced with the mask character; the learning model guesses the character at each masked position; the error between the original character and the guessed character is calculated; and the parameters of the learning model are updated by backpropagation. Repeating this series of steps an arbitrary number of times creates the character-guessing learning model. The error may be calculated using, for example, one-hot vectors. The proportion of characters replaced with the mask character can be set arbitrarily. For example, masking a proportion of characters matched to the misrecognition rate of the OCR system used in deployment can improve guessing accuracy. The characters to be masked may be selected at random as described above, or may be selected so that characters toward the end of the string are masked at a higher rate than those toward the beginning. The reason is that character recognition results are checked and corrected in order from the front of the string to the back, so over the whole correction task the rear part of the string is masked more often; biasing the masking in this way yields a learning model that can guess the rear part of a string accurately.
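Preparing one training sample with masking biased toward the end of the string might look like the sketch below. Only the data preparation is shown; the model and the backpropagation step are out of scope. The linear weighting controlled by `back_bias`, the rounding rule for the number of masked characters, and the function name `mask_phrase` are assumptions, since the text fixes none of them.

```python
import random

MASK = "＠"  # mask character as drawn in FIG. 5


def mask_phrase(phrase, rate, rng, back_bias=1.0, mask=MASK):
    """Mask about `rate` of the characters (at least one), favoring later positions.

    Returns (masked_phrase, labels), where labels maps each masked
    position to the original character the model should guess.
    """
    n = len(phrase)
    k = min(n, max(1, round(rate * n)))
    remaining = list(range(n))
    chosen = []
    for _ in range(k):
        # Weight grows linearly from 1.0 at the front to 1.0 + back_bias at the back.
        weights = [1.0 + back_bias * (i / max(n - 1, 1)) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(pick)
        chosen.append(pick)
    masked = "".join(mask if i in chosen else ch for i, ch in enumerate(phrase))
    labels = {i: phrase[i] for i in sorted(chosen)}
    return masked, labels


masked, labels = mask_phrase("総合展示会パネル", 0.25, random.Random(0))
print(masked, labels)
```

Setting `back_bias=0.0` reduces this to uniform random masking, the other option the description mentions.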

The specific type of learning model usable as the natural-phrase determination learning model in the technology of the present disclosure is not limited; any model that takes an arbitrary character string as input and can output a determination of whether it is a meaningful, natural phrase, the degree to which it is a natural phrase, the probability that it is a natural phrase, or the like may be used. An example of the pre-training flow for creating a natural-phrase determination learning model usable in this embodiment is described below.

In pre-training, a large number of phrases (nouns or compound words) are collected, and negative examples are created by randomly selecting a plurality of characters in a phrase and replacing them with other random characters. Phrases that are not modified are used as positive examples as they are. Positive and negative examples may be assigned by sorting each collected phrase into the positive or negative set with 50% probability. The phrases prepared as positive and negative examples are then given as input to the learning model, which guesses whether each input phrase is a natural phrase; the error between the guessed probability of being a natural phrase (a value between 0.0 and 1.0) and the correct answer (1.0 for a positive example, 0.0 for a negative example) is calculated; and the parameters of the learning model are updated by backpropagation. Repeating this series of steps an arbitrary number of times creates the natural-phrase determination learning model.
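The positive/negative example preparation can be sketched as follows. The replacement alphabet, the default of two replaced characters, and the name `make_example` are assumptions for illustration; the description only requires that a plurality of characters be replaced with other random characters and that the split be 50/50.

```python
import random


def make_example(phrase, rng, alphabet, n_replace=2):
    """Return (text, label): label 1.0 for a positive example, 0.0 for a negative one.

    `alphabet` must contain at least two distinct characters so that a
    replacement different from the original character always exists.
    """
    if rng.random() < 0.5:
        return phrase, 1.0                      # unmodified phrase: positive example
    chars = list(phrase)
    positions = rng.sample(range(len(chars)), k=min(n_replace, len(chars)))
    for i in positions:
        # Pick a random character that differs from the one being replaced.
        chars[i] = rng.choice([c for c in alphabet if c != chars[i]])
    return "".join(chars), 0.0                  # corrupted phrase: negative example


rng = random.Random(42)
for _ in range(3):
    print(make_example("総合展示会", rng, "あいうえお"))
```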

<Variation>
In the embodiment described above, the guessing target portion is designated by replacement with a mask character, but the method of designating the guessing target portion is not limited to this example. For example, the guessing target portion may be designated by a character number in the recognized character string (the number of characters from the beginning of the string).

1 Information processing device

Claims

An information processing system comprising:
character recognition result acquisition means for acquiring a recognized character string obtained by character recognition of a character string image;
guessing target designation means for designating a predetermined portion of the recognized character string as a guessing target portion;
guessing means for guessing a correct character for the guessing target portion designated by the guessing target designation means; and
replacement candidate generating means for generating a replacement candidate character string including the correct character, as a character string to replace at least part of the recognized character string.

The information processing system according to claim 1, wherein
the guessing target designation means designates the guessing target portion by generating a query string in which the character at the guessing target portion of the recognized character string is replaced with a predetermined mask character,
the guessing means guesses the character for the guessing target portion designated by the mask character in the generated query string, and
the replacement candidate generating means generates the replacement candidate character string in which the mask character in the query string is replaced with the guess result of the guessing means.

The information processing system according to claim 2, further comprising selection acceptance means for accepting a user's selection of a correction target character, wherein
the guessing target designation means generates a query string in which at least the correction target character is replaced with the mask character.

The information processing system according to claim 3, wherein
the guessing target designation means generates a query string in which the correction target character and any characters after the correction target character are replaced with the mask character.

The information processing system according to claim 4, wherein
the guessing target designation means generates a plurality of query strings by combining replacement of the correction target character with the mask character and replacement of arbitrary characters after the correction target character with the mask character, and
the replacement candidate generating means generates a plurality of replacement candidate character strings corresponding to the plurality of query strings.

The information processing system according to any one of claims 2 to 5, wherein
the character recognition result acquisition means acquires, in addition to the recognized character string, candidate characters that were not adopted in the recognized character string during character recognition, and
the guessing target designation means generates the query string by, in addition to the replacement with the mask character, replacing the characters in the recognized character string that correspond to the candidate characters with those candidate characters.

The information processing system according to claim 6, wherein
the guessing target designation means generates a plurality of query strings by combining the replacement with the mask character and the replacement with the candidate characters, and
the replacement candidate generating means generates a plurality of replacement candidate character strings corresponding to the plurality of query strings.

The information processing system according to any one of claims 1 to 7, further comprising natural phrase determination means for determining whether the replacement candidate character string is a natural phrase.

The information processing system according to claim 8, further comprising correction means for correcting the replacement candidate character string by deleting its last character when the replacement candidate character string is determined not to be a natural phrase, wherein
when the replacement candidate character string has been corrected by the correction means, the natural phrase determination means further determines whether the corrected replacement candidate character string is a natural phrase.

The information processing system according to claim 8 or 9, wherein
the natural phrase determination means further calculates an index indicating the degree to which the replacement candidate character string is a natural phrase, as a priority used when displaying a plurality of the replacement candidate character strings.

The information processing system according to any one of claims 1 to 10, further comprising output means for displaying the generated replacement candidate character string on a display device.

The information processing system according to claim 11, wherein
the output means, when displaying a plurality of the replacement candidate character strings, displays longer replacement candidate character strings preferentially.

The information processing system according to claim 11 or 12, wherein
the output means highlights the characters in the replacement candidate character string that differ from the recognized character string.

The information processing system according to any one of claims 11 to 13, wherein
the output means further displays the recognized character string and highlights the portion of the recognized character string that corresponds to the replacement candidate character string selected by the user.

The information processing system according to any one of claims 11 to 14, wherein
the output means displays the character string image side by side with the recognized character string of that character string image.

The information processing system according to any one of claims 1 to 15, further comprising replacement means for replacing the corresponding character string in the recognized character string with the replacement candidate character string selected by the user.

A method in which a computer executes:
a character recognition result acquisition step of acquiring a recognized character string obtained by character recognition of a character string image;
a guessing target designation step of designating a predetermined portion of the recognized character string as a guessing target portion;
a guessing step of guessing a correct character for the guessing target portion designated in the guessing target designation step; and
a replacement candidate generating step of generating a replacement candidate character string including the correct character, as a character string to replace at least part of the recognized character string.

A program for causing a computer to function as:
character recognition result acquisition means for acquiring a recognized character string obtained by character recognition of a character string image;
guessing target designation means for designating a predetermined portion of the recognized character string as a guessing target portion;
guessing means for guessing a correct character for the guessing target portion designated by the guessing target designation means; and
replacement candidate generating means for generating a replacement candidate character string including the correct character, as a character string to replace at least part of the recognized character string.