JP2021015490A

JP2021015490A - Image processing device and method for improving recognition accuracy

Info

Publication number: JP2021015490A
Application number: JP2019130255A
Authority: JP
Inventors: 亮浜田; Akira Hamada
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2019-07-12
Filing date: 2019-07-12
Publication date: 2021-02-12

Abstract

PROBLEM TO BE SOLVED: To effectively improve the pattern recognition accuracy of an image processing device through machine learning. SOLUTION: An image processing device includes a pattern recognition part that recognizes an image related to image processing or a user performing an operation related to the image processing; a recognition evaluation part that evaluates the correctness of a recognition result or receives such an evaluation from the user; a recognition complement part that presents all or part of a recognition object judged, on the basis of the evaluation, to have insufficient recognition accuracy, and acquires the user's input about its correctness; and a machine learning part that uses the acquired input to perform learning related to recognition of all or part of the recognition object. The recognition complement part presents the recognition object during a waiting period in which the user waits for the image processing to end or for a response to the operation. SELECTED DRAWING: Figure 3

Description

The present invention relates to an image processing device having a function of recognizing an image related to image processing or a user who performs an operation related to the image processing, and to a method for improving the recognition accuracy of that recognition function.

OCR (optical character recognition) is an existing technique for recognizing characters in document images. As ways of improving the recognition rate, it is known to use a user dictionary in addition to a dictionary prepared in advance, and a character recognition device is known that learns from the correction when the user corrects a recognition result (see, for example, Patent Document 1). However, Patent Document 1 does not describe specifically at what timing or by what method corrections from the user are accepted.
Separately, for users of image forming devices installed in stores such as convenience stores, a mechanism has been proposed that uses the waiting time of the image forming device to present a questionnaire for advertising or sales promotion purposes and reduces the printing charge for users who answer it (see, for example, Patent Document 2).

Patent Document 1: Japanese Unexamined Patent Publication No. 2009-193387
Patent Document 2: Japanese Unexamined Patent Publication No. 2016-143110

Image processing devices having a function of recognizing characters in a scanned document image are known. Image processing devices that authenticate a user and accept operations from the authenticated user are also known. Various methods can be applied to user authentication; in addition to conventional methods using a personal identification number or a card, authentication means that apply pattern recognition technology to, for example, images or voice have become widespread in recent years. Character recognition and voice recognition technically fall within the category of pattern recognition.
Pattern recognition technology has advanced remarkably in recent years, and recognition accuracy has improved markedly. Nevertheless, misrecognition has not been eliminated. Moreover, the correct answer is sometimes not unique in the first place. In handwritten character recognition, for example, the character the user intended (the correct answer) may differ even for substantially identical patterns, depending on the writer's habits and the circumstances in which the character was written.
To raise recognition accuracy, it is considered effective to perform machine learning so as to hold correct answers suited to the usage situation (for example, for each user). However, if learning is carried out in a way that the user finds bothersome, the learning function may go unused and recognition accuracy may not improve.
The present invention has been made in consideration of the above circumstances, and provides a technique capable of effectively improving the pattern recognition accuracy of an image processing device through machine learning.

The present invention provides an image processing device including: a pattern recognition unit that recognizes an image related to image processing or a user who performs an operation related to the image processing; a recognition evaluation unit that evaluates the correctness of a recognition result or receives an evaluation from the user; a recognition complementing unit that presents all or part of a recognition target judged, based on the evaluation, to have insufficient recognition accuracy and requests the user's input regarding its correctness; and a machine learning unit that uses the obtained input to perform learning related to the recognition for all or part of the recognition target, wherein the recognition complementing unit presents the recognition target during a waiting period in which the user waits for the end of the image processing or for a response to the operation.

From a different viewpoint, the present invention provides a recognition accuracy improvement method in which a computer executes processing including: a step of recognizing an image related to image processing or a user who performs an operation related to the image processing; a step of evaluating the correctness of a recognition result or receiving an evaluation from the user; a recognition complementing step of presenting all or part of a recognition target judged, based on the evaluation, to have insufficient recognition accuracy and requesting the user's input regarding its correctness; and a step of using the obtained input to perform learning related to the recognition for all or part of the recognition target, wherein the recognition complementing step is executed during a waiting period in which the user waits for the end of the image processing or for a response to the operation.

In the image processing device according to the present invention, the recognition complementing unit presents all or part of a recognition target judged to have insufficient recognition accuracy, and requests the user's input regarding its correctness, during the waiting period in which the user waits for the end of the image processing or for a response to the operation. The accuracy of pattern recognition in the image processing device can therefore be improved effectively through machine learning without making the user feel bothered.

FIG. 1 is a block diagram showing the configuration of the image processing device in this embodiment.
FIG. 2 is a perspective view showing the appearance of the digital multifunction device shown in FIG. 1.
FIG. 3 is a flowchart showing an example of the processing executed by the control unit with respect to learning of a recognition target in this embodiment.
FIG. 4 is a first explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 1.
FIG. 5 is a second explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 1.
FIG. 6 is a third explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 1.
FIG. 7 is a fourth explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 1.
FIG. 8 is a fifth explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 1.
FIG. 9 is a sixth explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 1.
FIG. 10 is a first explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 2.
FIG. 11 is a second explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 2.
FIG. 12 is a third explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 2.
FIG. 13 is a first explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 3.
FIG. 14 is a second explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 3.
FIG. 15 is a third explanatory diagram showing an example in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in Embodiment 3.

Hereinafter, the present invention will be described in more detail with reference to the drawings. The following description is illustrative in all respects and should not be construed as limiting the invention.
(Embodiment 1)
<< Configuration of image processing device >>
FIG. 1 is a block diagram showing the configuration of a digital multifunction device, which is one aspect of the image processing device according to this embodiment. FIG. 2 is a perspective view showing the appearance of the digital multifunction device shown in FIG. 1.
In this embodiment, a digital multifunction device is given as an example of the image processing device, but the image processing device is not limited to this. For example, it may be a scanner, a copier, a facsimile machine, or a printer, and it is not limited even to these as long as it is a device related to image processing.

As shown in FIG. 1, the image processing device 100 includes an operation unit 10, a control unit 11, a display unit 12, a printing unit 13, a communication interface circuit 14, a scanner unit 15, and an image data generation circuit 16. It is also connected to an external information processing device 20 via the communication interface circuit 14. In this embodiment, the information processing device 20 is a personal computer that stores document images read by the image processing device 100 and on which the user processes the stored document images. However, the information processing device 20 is not limited to a personal computer, and may be, for example, a smartphone or a file server.

As shown in FIG. 2, the image processing device 100 also includes a paper feed tray 17a, discharge trays 18a, 18b, and 18c, and a manual feed tray 17b.
The control unit 11 is connected to the printing unit 13 and to the scanner unit 15 by a bus so that they can communicate with each other.
The control unit 11 controls the operation of each part of the image processing device 100 shown in FIG. 1. Specifically, it is built around a CPU or MPU (hereinafter collectively referred to as the CPU) together with hardware resources such as memory, an input/output interface circuit, and a timer circuit.
At least part of the ROM included in the control unit 11 may be rewritable non-volatile memory. The CPU included in the control unit 11 reads the control program stored in the ROM and loads it into the RAM as needed. It then executes processing according to the control program loaded into the RAM.

In accordance with the control program stored in the ROM, the CPU causes the display unit 12 to display the user interface and accepts operation inputs that the user makes on the operation unit 10. The CPU also controls the hardware of the image processing device 100 according to the control program to realize functions such as print processing.
The control unit 11 includes a pattern recognition unit 11a, a recognition evaluation unit 11b, a recognition complementing unit 11c, and a machine learning unit 11d. The functions of these units are realized when the CPU executes the control program stored in the ROM.
The pattern recognition unit 11a performs image recognition on images read by the scanner unit 15. However, it is not limited to this: instead of, or in addition to, images read by the scanner unit 15, it may recognize, for example, an image of the user captured by a camera (not shown in FIG. 1) or the user's voice picked up by a microphone (not shown in FIG. 1).

The recognition evaluation unit 11b evaluates the correctness of pattern recognition, such as image recognition or voice recognition, performed by the pattern recognition unit 11a. Alternatively, it obtains the user's evaluation of the correctness of the recognition.
To evaluate correctness, recognition targets that were pattern-recognized in the past are classified so that identical or similar targets fall into the same group, and the fluctuation in the correct answers entered for targets belonging to the same group is evaluated.
For example, suppose that the pattern recognition unit 11a performs character recognition on a document image read by the scanner unit 15. Suppose further that, when the user judges that the character recognition result contains a misrecognition, the device provides a function that lets the user correct the misrecognized portion. The user may correct the misrecognition by operating the display unit 12, or by operating the information processing device 20, which can communicate with the device via the communication interface circuit 14.

The recognition evaluation unit 11b stores each recognition target on which character recognition was performed together with its recognition result, grouping identical or similar recognition targets. Where the user has corrected a misrecognition, the correction is stored as well. Recognition targets belonging to a group with many corrections relative to the number of recognitions are regarded as having a high correction rate and are given a low evaluation. Conversely, recognition targets belonging to a group with few corrections relative to the number of recognitions are given a high evaluation. In this way, for example, the correctness of recognition for the targets in each group can be evaluated.
Alternatively, the recognition evaluation unit 11b may obtain an evaluation of recognition correctness from the user instead of performing the evaluation itself. One conceivable way of obtaining the evaluation is to have the user select recognition targets that the user considers to be frequently misrecognized.
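
For illustration only, the group-based evaluation described above could be sketched in Python as follows. The class and method names, the string group key, and the definition of the score as one minus the correction rate are assumptions made for this sketch; the description does not specify a formula or a data structure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Counts kept for one group of identical or similar recognition targets."""
    recognitions: int = 0
    corrections: int = 0

    @property
    def score(self) -> float:
        """Share of recognitions the user left uncorrected (assumed metric)."""
        if self.recognitions == 0:
            return 1.0  # assumption: a group with no history is not penalized
        return 1.0 - self.corrections / self.recognitions


class RecognitionEvaluator:
    """Hypothetical stand-in for the recognition evaluation unit 11b."""

    def __init__(self) -> None:
        self._groups = defaultdict(GroupStats)

    def record(self, group_key: str, corrected: bool) -> None:
        """Store one recognition and whether the user corrected its result."""
        stats = self._groups[group_key]
        stats.recognitions += 1
        if corrected:
            stats.corrections += 1

    def score(self, group_key: str) -> float:
        """Return the evaluation of a group; lower means more corrections."""
        return self._groups[group_key].score
```

A group that was recognized four times and corrected twice would receive a score of 0.5 under this assumed metric, lower than a group that was never corrected.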

As a specific example, the read image and the character recognition result are displayed side by side on the display unit 12 or the information processing device 20. If there is a misrecognition, the user can correct that portion and edit the recognition result. The recognition evaluation unit 11b stores the recognition targets that the user corrected. Then, for example during the user's waiting period, the recognition targets that were corrected in the past are listed on the display unit 12 or the information processing device 20 as feedback candidates. Alternatively, the feedback candidates are displayed one at a time in sequence. The user is then asked to select, from the displayed candidates, the recognition targets that the user considers to be frequently misrecognized as targets for feedback input.
Here, the waiting period is a period in which the user waits for image processing executed by the image processing device 100 (hereinafter also referred to as a job), a communication response, or the like to finish. The period from when the image processing device 100 receives a job start instruction until it returns from power saving mode is also included in the waiting period. Job types include, for example, copy, scan, print, and fax jobs. These jobs involve document scanning or data printing.
During the user's waiting period, the recognition complementing unit 11c presents to the user those already-recognized recognition targets that are judged to have insufficient recognition accuracy, and requests input regarding the correctness of the recognition. The recognition targets presented are not limited to the document image or user recognition related to the job that the image processing device 100 is currently executing. For example, they include recognition targets related to jobs executed in the past.
The machine learning unit 11d performs machine learning related to the pattern recognition executed by the pattern recognition unit 11a.
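
The selection made by the recognition complementing unit 11c can be pictured with the following minimal Python sketch. The threshold of 0.8, the limit on the number of candidates, and all names are hypothetical; the description states only that targets judged to have insufficient recognition accuracy are presented, including targets from past jobs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecognitionTarget:
    target_id: str      # hypothetical identifier of a stored image fragment
    group_score: float  # evaluation of the target's group, 0.0 (poor) to 1.0 (good)


def select_feedback_candidates(
    targets: List[RecognitionTarget],
    threshold: float = 0.8,  # assumed cut-off for "insufficient accuracy"
    limit: int = 5,          # assumed cap so a short waiting period is not overloaded
) -> List[RecognitionTarget]:
    """Pick the targets to present during the waiting period, worst first."""
    weak = [t for t in targets if t.group_score < threshold]
    weak.sort(key=lambda t: t.group_score)
    return weak[:limit]
```

Sorting the weakest groups first is a design choice of this sketch: if the waiting period is short, the feedback that is collected goes to the targets where learning is likely to help most.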

The operation unit 10 is provided on the housing of the image processing device 100 and consists of a plurality of operation buttons that accept user operations, a touch panel arranged on the display surface of the display unit 12, and the like. The control unit 11 recognizes signals indicating input operations on the operation unit 10.

The display unit 12 is, for example, a liquid crystal display device. It can display various information, images, and the like based on, for example, input received by the operation unit 10. The control unit 11 generates and updates the content to be displayed, and the display unit 12 displays the information and images accordingly.
A scanner unit 15 that reads the image of a document is connected to the control unit 11.
Under the control of the control unit 11, the scanner unit 15 performs image reading for copy, fax, and scan jobs. That is, it reads a document image and converts it into an image signal.
The image data generation circuit 16 generates image data based on the image signal output by the scanner unit 15.

The paper feed tray 17a consists of a plurality of trays that separately hold paper of various sizes.
The manual feed tray 17b is a tray from which paper of various sizes and types can be fed.
Under the control of the control unit 11, a paper feed mechanism (not shown in FIGS. 1 and 2) feeds paper from the designated paper feed tray into the device and conveys it to the printing unit 13.
Under the control of the control unit 11, the printing unit 13 prints the designated image data on paper fed from the paper feed tray 17a or the manual feed tray 17b.

A paper ejection mechanism (not shown in FIGS. 1 and 2) ejects paper printed by the printing unit 13 to one of the discharge trays 18a, 18b, and 18c.
The communication interface circuit 14 is an interface for exchanging data with external equipment via a network. In this embodiment, the image processing device 100 communicates with the information processing device 20 connected via the network. The information processing device 20 stores image data read by the scanner unit 15 and generated by the image data generation circuit 16. It also supplies stored image data to the printing unit 13 for printing.

<< Machine learning related to pattern recognition >>
Next, the processing executed by the control unit with respect to learning of recognition targets will be described.
FIG. 3 is a flowchart showing an example of that processing in this embodiment. FIGS. 4 to 9 are explanatory diagrams showing examples in which input regarding the correctness of recognition is requested during the waiting period for the end of copying in this embodiment.
The processing of the control unit is described below following the flowchart of FIG. 3, with reference to FIGS. 4 to 9.

As shown in FIG. 3, the control unit 11 determines whether the user's waiting period has arrived (step S11).
In this embodiment, the user's waiting period is the period in which a user who has come to the image processing device 100 and started a copy job waits for the copy job to finish.
That is, the user is assumed to configure the copy settings using the operation unit 10 and then instruct the start of the copy job.

FIG. 4 shows the copy job operation screen displayed on the display unit 12, with the user operating the [Start] key with a finger 30 to start the copy job.
At the left edge of the operation screen shown in FIG. 4 is a setting key 31, which is used to make various copy job settings and which displays the current settings. In the center is a copy count area 32, in which the number of copies is set and displayed.
Below it is a machine illustration 33, which shows whether a document is present and the paper size loaded in each tray; touching it displays the paper setting screen for each tray. A [Job Status] key 34 is located in the upper right corner of the operation screen; touching it displays the jobs being executed and those waiting to be executed.

A [Reset] key 35 and a [Start] key 36 are located below the [Job Status] key 34. Touching the [Reset] key 35 resets all settings related to the copy job, including the number of copies. Touching the [Start] key 36 starts execution of the copy job as configured.
The control unit 11 handles the display of these items and the responses to operations on them.

As shown in FIG. 4, when the control unit 11 recognizes that the [Start] key 36 has been touched, it responds by replacing the [Start] key 36 with a [Cancel Copy] key 37. When the [Cancel Copy] key 37 is touched, the control unit 11 cancels the copy job in progress.
Having started the copy job, the control unit 11 executes it by causing the scanner unit 15 to read the document and the printing unit 13 to print the image of the read document. The period from the reading of the document until printing finishes is a waiting period for the user, who is waiting for the copy job to end (Yes in step S11). In other words, once the copy setting operations are finished, the user enters a waiting period spent beside the image processing device 100 until the started copy job ends.

When the waiting period begins, the control unit 11, acting as the recognition complementing unit 11c, causes the display unit 12 to display a feedback input screen for pattern recognition (step S13).
FIG. 6 shows an example of the screen displayed in this case. Here, a word selected by the recognition complementing unit 11c from the image of a document on which character recognition was performed in the past is displayed. That is, in the embodiment shown in FIG. 6, the recognition complementing unit 11c displays some of the characters in the document image that is the recognition target. In this embodiment, the characters to be recognized are printed type rather than handwriting.
The recognition complementing unit 11c presents to the user a plurality of words (nine in the example shown in FIG. 6) as candidates for the correct recognition result for the displayed characters, together with the message "Please select from below the same characters as those displayed in the frame." It then waits for input from the user (step S15).
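
As a rough illustration of the feedback screen, the following Python sketch models one prompt as an image reference plus a list of candidate words, and turns the user's choice into a labeled pair that a learning component could consume. The names `FeedbackPrompt`, `image_id`, and `answer`, and the idea of returning an (image, label) pair, are assumptions of this sketch; the description does not say how candidates are generated or in what form the input is passed to the machine learning unit 11d.

```python
from dataclasses import dataclass
from typing import Tuple

# Message shown with the candidates, as quoted in the description of FIG. 6.
MESSAGE = "Please select from below the same characters as those displayed in the frame."


@dataclass(frozen=True)
class FeedbackPrompt:
    image_id: str                # hypothetical reference to the cropped word image
    candidates: Tuple[str, ...]  # candidate words offered to the user

    def answer(self, choice_index: int) -> Tuple[str, str]:
        """Convert the user's selection into an (image_id, label) pair."""
        if not 0 <= choice_index < len(self.candidates):
            raise IndexError("choice is outside the candidate list")
        return (self.image_id, self.candidates[choice_index])
```

For example, a prompt offering "modern", "modem", and "modest" for one word image yields the pair (image, "modem") when the user taps the second candidate.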

Until it receives feedback input from the user (No in step S17), the control unit 11 checks whether the waiting period has ended (step S19) and whether a timeout has occurred (step S23).
If the waiting period ends while input is awaited (No in step S19), that is, if the copy job finishes while input is awaited, the control unit 11, acting as the recognition complementing unit 11c, closes the input-waiting screen shown in FIG. 6 (step S21) and ends the process. The control unit 11 then causes the display unit 12 to display a standby screen that awaits the settings for the next copy job.

On the other hand, if a timeout occurs while feedback input is awaited (Yes in step S23), that is, if there is no input from the user within a predetermined period, the control unit 11, acting as the recognition complementing unit 11c, switches the characters displayed in step S13 to the next recognition target for which feedback is sought (step S25). For example, if a timeout occurs while the screen shown in FIG. 6 is displayed, the recognition complementing unit 11c switches to displaying the next character recognition target, shown in FIG. 9. It then requests the user's feedback input for the newly displayed characters (step S13).
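The loop over steps S13 to S25 can be illustrated with a minimal Python sketch. The function name `run_feedback_session`, its callback parameters, and the default timeout are hypothetical and do not appear in the publication. The sketch polls in a tight loop for simplicity, whereas an actual device would more likely be event-driven, and it simply returns when the targets run out, a case the description does not address.

```python
from typing import Callable, Iterable, Optional, Tuple


def run_feedback_session(
    targets: Iterable[str],
    poll_input: Callable[[str], Optional[str]],
    job_running: Callable[[], bool],
    clock: Callable[[], float],
    timeout_s: float = 10.0,
) -> Optional[Tuple[str, str]]:
    """Return (target, answer) if the user answers, otherwise None."""
    for target in targets:                  # S13 / S25: show the (next) target
        shown_at = clock()
        while True:                         # S15: wait for input
            answer = poll_input(target)     # S17: has input been received?
            if answer is not None:
                return target, answer       # continue with the validity check (S27)
            if not job_running():           # S19: has the waiting period ended?
                return None                 # S21: close the input screen
            if clock() - shown_at >= timeout_s:   # S23: timeout?
                break                       # S25: switch to the next target
    return None                             # assumption: stop when targets run out
```

Passing the clock and the job-status check as callables keeps the sketch independent of any particular hardware interface and makes the timeout path easy to exercise.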

The recognition evaluation unit 11b determines the characters for which feedback input is requested, for example, as follows. The recognition evaluation unit 11b stores each recognition target on which character recognition has been performed, grouping similar recognition targets together. Targets corrected by the user are stored along with the content of the correction. As a concrete correction procedure, the scanned image and the character recognition result are displayed side by side on the display unit 12, and if there is a misrecognition, the user can correct that part and edit the recognition result.
The recognition evaluation unit 11b gives a low evaluation to recognition targets belonging to a group with many corrections relative to the number of recognitions, on the grounds that the correction rate is high. Conversely, it gives a high evaluation to recognition targets belonging to a group with few corrections relative to the number of recognitions.
Then, when the waiting period arrives, recognition targets are displayed on the display unit 12 in order starting from the lowest evaluation. Once the user provides feedback input and the correct answer for a recognition target is obtained, the evaluation of that recognition target is set to a perfect score. This prevents feedback from being requested repeatedly for a recognition target whose correct answer has already been entered.
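The evaluation scheme described above might be sketched as follows. The class names `Group` and `Target`, the score formula (one minus the correction rate), and the treatment of groups with no recognitions are assumptions made for illustration. The publication only states that a higher correction rate yields a lower evaluation and that an answered target receives a perfect score.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Group:
    recognitions: int = 0   # how often targets in this group were recognized
    corrections: int = 0    # how often the user corrected the result

    def score(self) -> float:
        if self.recognitions == 0:
            return 1.0      # assumption: no evidence yet, treat as unproblematic
        return 1.0 - self.corrections / self.recognitions


@dataclass
class Target:
    target_id: str
    group: str
    answered: bool = False  # True once the user has supplied the correct answer


def evaluation(target: Target, groups: Dict[str, Group]) -> float:
    """Perfect score once answered, otherwise the score of the target's group."""
    return 1.0 if target.answered else groups[target.group].score()


def presentation_order(targets: List[Target], groups: Dict[str, Group]) -> List[Target]:
    """Sort targets so that the lowest evaluation is presented first."""
    return sorted(targets, key=lambda t: evaluation(t, groups))
```

Because answered targets are scored 1.0, they sink to the end of the ordering, which mirrors the stated aim of not asking about them again.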

Another way of determining the characters for which feedback input is requested is as follows. The recognition evaluation unit 11b stores the recognition targets that the user has corrected. When the user's waiting period arrives, the recognition targets corrected by the user in the past are displayed, either as a list or one after another, on the display unit 12 or on the screen of the information processing device 20. The user is then asked to select from among them the recognition targets that the user considers to be frequently misrecognized, and feedback input is requested for the selected targets.
The following describes the processing performed when the user provides input (Yes in step S17) during the loop of steps S15, S17, S19, and S23, or the loop of steps S13, S15, S17, S19, S23, and S25, in which the user's feedback input is awaited.

For example, when the user selects one of the candidates displayed on the screen of FIG. 6 (see FIG. 7) and confirms the recognition result (see FIG. 8), the control unit 11, acting as the machine learning unit 11d, learns that the selected candidate is the correct answer for the displayed characters.
In this embodiment, however, the selected candidate is not accepted unconditionally as the correct answer; instead, it is determined whether the input is valid (step S27).
Of course, a mode in which learning is performed on all inputs without determining their validity is also possible. In that case, however, learning may be skewed by erroneous user input. Preferably, as shown in step S27, the control unit 11 determines whether the input from the user is to be treated as valid and accepts only valid input as the correct answer.

Whether an input is valid can be determined, for example, as follows. The recognition target to be learned is presented to the user together with an image for which the correct recognition result is known in advance (a test image). Only when the user's input for the test image matches that correct answer is the candidate selected for the other recognition target also judged to be the correct recognition result.
Alternatively, feedback on the same recognition target is requested at different times or from different users. When the proportion of identical feedback exceeds a predetermined threshold, it is judged that valid input has been obtained for that recognition target.
Conversely, when the proportion of identical feedback for a recognition target is at or below the threshold, the obtained input is judged not to be valid, and no learning is performed using the received input. The proportion is nevertheless updated based on the received input and stored in preparation for the next input.
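The two validity checks described above (step S27) can be sketched in Python under stated assumptions. `gated_answer` and `AgreementTracker` are hypothetical names. The "proportion of identical feedback" is interpreted here as the share of the most frequent answer. The `min_votes` parameter is an added assumption, so that a single answer does not count as agreement by itself.

```python
from collections import Counter
from typing import Dict, Optional


def gated_answer(test_answer: str, test_truth: str, target_answer: str) -> Optional[str]:
    """Test-image check: accept the target answer only if the test image was answered correctly."""
    return target_answer if test_answer == test_truth else None


class AgreementTracker:
    """Agreement check: accept an answer once enough identical feedback has accumulated."""

    def __init__(self, threshold: float = 0.7, min_votes: int = 3) -> None:
        self.threshold = threshold      # the predetermined threshold
        self.min_votes = min_votes      # assumption: not specified in the source
        self.votes: Dict[str, Counter] = {}

    def ratio(self, target_id: str) -> float:
        counter = self.votes.get(target_id)
        if not counter:
            return 0.0
        return counter.most_common(1)[0][1] / sum(counter.values())

    def record(self, target_id: str, answer: str) -> Optional[str]:
        """Always store the answer; return the agreed label only if the input is valid."""
        counter = self.votes.setdefault(target_id, Counter())
        counter[answer] += 1            # the proportion is updated even when invalid
        if sum(counter.values()) < self.min_votes:
            return None
        if self.ratio(target_id) > self.threshold:
            return counter.most_common(1)[0][0]   # valid: learn this label (S29)
        return None                               # not valid: skip learning (S31)
```

Returning `None` while still recording the vote corresponds to the description's point that the proportion is updated and kept for the next input even when no learning takes place.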

The recognition complementing unit 11c determines the recognition targets for which feedback input is requested based on this proportion. That is, among recognition targets whose proportion is at or below the threshold, feedback input is requested more frequently for those with smaller values. In this way, feedback input is requested more often for recognition targets with a small proportion than for those with a large proportion, but no learning is performed on the obtained input until the proportion of identical feedback exceeds the threshold. Once that proportion exceeds the threshold, learning is performed.
If the input is judged valid in step S27, the machine learning unit 11d performs learning based on the received input (step S29) and ends the process.
If, on the other hand, the input is judged not to be valid, the machine learning unit 11d ends the process without performing learning based on the received input (step S31).
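One way to request feedback "more frequently" for targets with a smaller proportion is weighted random selection. This is an illustrative choice, not something the publication specifies. The linear weighting and the names `request_weights` and `pick_target` are assumptions.

```python
import random
from typing import Dict, Optional


def request_weights(ratios: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Weight every target at or below the threshold; a smaller ratio gives a larger weight."""
    eps = 1e-6  # keeps targets exactly at the threshold selectable
    return {t: (threshold - r) + eps for t, r in ratios.items() if r <= threshold}


def pick_target(ratios: Dict[str, float], threshold: float, rng: random.Random) -> Optional[str]:
    """Pick the next target to present, or None if every target exceeds the threshold."""
    weights = request_weights(ratios, threshold)
    if not weights:
        return None
    ids = list(weights)
    return rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]
```

Targets whose proportion already exceeds the threshold receive no weight at all, since learning has been performed for them and further feedback is not needed.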

In the example shown in FIG. 6, the recognition result candidates include the correct answer, and the user selects it from among them. However, there may be cases in which the correct answer is not among the candidates. To handle such cases, options such as "Other" or "None of these is correct" may be provided in addition to the list of candidates.
When "Other" or "None of these is correct" is selected, the recognition complementing unit 11c provides a means by which the user can enter the intended correct answer and by which that input is accepted.
For example, the user may be allowed to enter handwriting, or soft keys for input may be displayed on the screen so that the user can perform kana-kanji conversion input with them.
Alternatively, the candidates shown in FIG. 6 may be displayed as first-level candidates, those most likely to be correct, and when "Other" or "None of these is correct" is selected, second-level candidates, those next most likely to be correct, may be displayed.
Alternatively, if the candidates do not fit on one screen, a page-forward button or a scroll bar may be displayed so that candidates not shown on the current screen can be brought into view.
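The tiered presentation of candidates can be sketched as splitting a probability-ranked list into levels. The function name `candidate_levels` and the assumption that the recognizer exposes a per-candidate probability are illustrative. The default of nine per level simply follows the nine candidates in the FIG. 6 example.

```python
from typing import Dict, List


def candidate_levels(scores: Dict[str, float], per_level: int = 9) -> List[List[str]]:
    """Rank candidates by estimated probability and split them into display levels."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [ranked[i:i + per_level] for i in range(0, len(ranked), per_level)]
```

Under this sketch, selecting "Other" on the first screen would advance to the next list in the returned sequence, and free-form entry would be offered once the levels are exhausted.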

(Embodiment 2)
Embodiment 1 was described using the example of character recognition on images of printed type. However, the target of character recognition and learning is not limited to printed type and may be handwritten characters.
FIGS. 10 to 12 are explanatory diagrams showing an example in which, in this embodiment, input on the correctness of handwritten character recognition is requested during the waiting period before copying ends.
When the user selects one of the candidates displayed on the screen shown in FIG. 10 (see FIG. 11) and confirms the recognition result (see FIG. 12), the control unit 11, acting as the machine learning unit 11d, learns that the selected candidate is the correct answer for the displayed handwritten characters. This makes it possible to improve recognition accuracy for a user's idiosyncratic handwriting and for handwritten characters that are hard to distinguish.

(Embodiment 3)
Embodiment 1 was described using the example in which the pattern recognition unit 11a performs character recognition on the image read by the scanner unit 15.
In this embodiment, the image processing device 100 authenticates the user and accepts operations from the authenticated user. As the means of user authentication, the pattern recognition unit 11a performs biometric authentication.
As a specific example of biometric authentication, the case in which the image processing device 100 includes a camera and performs face authentication of the user is described. The biometric authentication is not limited to this, however; fingerprint authentication, iris authentication, and voice authentication are also applicable. Face, fingerprint, and iris authentication are pattern recognition performed on images, whereas voice authentication is pattern recognition performed on audio. In any case, machine learning can be applied to image recognition, to speech recognition, and to other kinds of pattern recognition.

FIGS. 13 to 15 are explanatory diagrams showing an example in which user feedback input is requested for face authentication. In place of, or in addition to, FIGS. 6 to 8 described in Embodiment 1, the recognition complementing unit 11c displays the screens shown in FIGS. 13 to 15 during the waiting period, and the machine learning unit 11d performs learning based on the user's input. This makes it possible to improve the accuracy of face authentication.
Furthermore, beyond user authentication, image recognition other than the character recognition described in Embodiment 1 may be performed on images read by the scanner unit 15 or received via the communication interface circuit 14.
For example, logos or seal impressions contained in images received by facsimile may be recognized in order to sort the received documents by sender.

(Embodiment 4)
For the recognition of printed characters described in Embodiment 1, it is preferable to obtain feedback input from unspecified users. Targeting unspecified users rather than specific ones should yield more feedback in the same period, and the feedback should be more objective.
In contrast, pattern recognition for user authentication, as described in Embodiment 3, concerns information specific to each user. It is therefore preferable to perform learning in association with each individual user.

For the handwritten character recognition described in Embodiment 2, the user, an administrator, or the like may be allowed to choose whether learning is performed as pertaining to unspecified users or as information specific to the user.
If the handwritten characters to be recognized are the user's own, it is preferable to perform learning as information specific to that user. If they are someone else's handwriting, it is preferable to perform learning as pertaining to unspecified users. The user may be asked to indicate whether the handwriting is the user's own or someone else's.
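The choice between shared and per-user learning described in this embodiment can be summarized as a small routing function. The names `Kind`, `SHARED`, and `training_scope` are hypothetical, and the sketch assumes that each piece of feedback is tagged with the kind of recognition and the identifier of the user who gave it.

```python
from enum import Enum


class Kind(Enum):
    PRINTED = "printed"          # Embodiment 1: printed characters
    HANDWRITTEN = "handwritten"  # Embodiment 2: handwritten characters
    BIOMETRIC = "biometric"      # Embodiment 3: user authentication

SHARED = "__shared__"            # hypothetical key for the model shared by all users


def training_scope(kind: Kind, user_id: str, own_handwriting: bool = False) -> str:
    """Return the key of the model that should learn from this feedback."""
    if kind is Kind.BIOMETRIC:
        return user_id           # authentication data is specific to each user
    if kind is Kind.HANDWRITTEN and own_handwriting:
        return user_id           # the user's own handwriting: learn per user
    return SHARED                # printed type or someone else's handwriting
```

The `own_handwriting` flag stands in for the selection the user is asked to make about whose handwriting is being shown.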

As described above,
(i) The image processing device according to the present invention comprises: a pattern recognition unit that recognizes an image related to image processing or a user who performs an operation related to the image processing; a recognition evaluation unit that evaluates the correctness of the recognition result or receives an evaluation from the user; a recognition complementing unit that presents all or part of a recognition target judged, based on the evaluation, not to have sufficient recognition accuracy, and requests the user's input regarding correctness; and a machine learning unit that uses the obtained input to perform learning related to the recognition for all or part of the recognition target, wherein the recognition complementing unit presents the recognition target during a waiting period in which the user waits for the end of the image processing or for a response to the operation.

In the present invention, the image related to image processing is the image on which the image processing device performs image processing. Specific examples are a scanned document image or an image to be printed.
The user's operation related to image processing may be performed directly on the image processing device, but is not limited to this. For example, the user may operate an information processing device such as a PC or smartphone that can communicate with the image processing device over a network, and the image processing device may accept the operation via that communication.

Receiving an evaluation from the user regarding the correctness of the recognition result may mean, for example when the recognition target is a document image containing written text, receiving an evaluation that the recognition result is wrong for some of the characters in that text. Alternatively, an evaluation of the recognition result for the text as a whole may be received. The recognition complementing unit determines, based on the evaluation obtained, whether the recognition accuracy for all or part of the recognition target is sufficient.
The evaluation of the correctness of the recognition result may be received from the user who performed the image processing operation on the image being evaluated. However, it may also be received from other users. In other words, the user who gives the evaluation is not always limited to the user who performed the image processing operation on the image being evaluated. Depending on what is evaluated, a user performing an operation for some other image processing may give the evaluation.

Because recognition accuracy can be regarded as a statistical measure of whether recognition results are correct or incorrect, it is highly preferable that the recognition complementing unit decide whether the recognition accuracy is sufficient on the basis of multiple evaluations. That is, it is highly preferable to make the determination using multiple evaluations of the same or similar recognition targets.
The recognition complementing unit asks the user to enter the correct answer for all or part of a recognition target. Various means of entering the correct answer are conceivable. For example, a plurality of candidates may be presented, and one of them selected as the correct answer.
The waiting period is a period in which the user waits for image processing to end, for a response to an operation, or the like.
For example, when the recognition target is the image of a document and character recognition is performed on it, the recognition complementing unit may present a recognition target other than that document during the waiting period for image processing of the document. That is, it may present a recognition target processed in the past rather than the current document. Of course, the document currently being processed may also be presented as the recognition target.

Preferred aspects of the present invention are described further below.
(ii) The recognition target may be an image related to the image processing or to the user, and the recognition complementing unit may further include a display unit that presents all or part of the image.
In this way, a recognition target whose recognition accuracy is insufficient can be displayed on the display unit of the image processing device so that the user can enter the correct answer.

(iii) The recognition target may be audio related to the user, and the recognition complementing unit may further include an audio output unit that presents all or part of the audio.
In this way, a recognition target whose recognition accuracy is insufficient can be output from the audio output unit of the image processing device so that the user can enter the correct answer. Various means of entering the correct answer for the audio are conceivable, and the reply need not be spoken. For example, a plurality of candidate answers may be displayed as text on the display unit of the image processing device so that the user can select the correct one, or the user may type the correct answer as text.
One case in which the recognition target is audio related to the user is a scenario in which speaker-dependent speech recognition is applied to user authentication and a predetermined word or sentence serves as a password or passphrase.

(iv) The waiting period may be a period that excludes both the power saving mode and any state from which a shift to the power saving mode is possible.
In this way, the energy-saving effect is not diminished by performing learning.
A state from which a shift to the power saving mode is possible is one in which the device is not in the power saving mode but could enter it, for example, standby while awaiting a job instruction.
In contrast, states from which a shift to the power saving mode is not possible include, for example, the state of returning from the power saving mode after a job execution instruction is received during that mode, and the state in which a job involving image processing such as copying, scanning, or printing is being executed. These states belong to the waiting period in which the user waits for the job to end, yet the device cannot shift to the power saving mode from them.
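The restriction in aspect (iv) amounts to a simple predicate over device states. The following sketch uses a hypothetical `DeviceState` enumeration; the four states are taken from the examples in the text, and a real device would likely distinguish more.

```python
from enum import Enum, auto


class DeviceState(Enum):
    POWER_SAVING = auto()   # in the power saving mode
    STANDBY = auto()        # awaiting a job instruction; may enter power saving
    WAKING = auto()         # returning from power saving after a job instruction
    JOB_RUNNING = auto()    # executing a copy, scan, or print job

# States from which a shift to the power saving mode is possible.
CAN_ENTER_POWER_SAVING = {DeviceState.STANDBY}


def may_request_feedback(state: DeviceState) -> bool:
    """True only in waiting-period states that cannot lead into power saving."""
    return state is not DeviceState.POWER_SAVING and state not in CAN_ENTER_POWER_SAVING
```

With this predicate, feedback screens appear only while the device must stay awake anyway, so requesting feedback does not postpone a shift to the power saving mode.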

(v) The recognition complementing unit may identify the user who enters the input regarding correctness, and the machine learning unit may perform the learning separately for each identified user.
In this way, because learning is performed per user, recognition accuracy can be improved in a manner that accommodates each user's particular habits and characteristics.

(vi) The recognition complementing unit may determine whether the obtained input should be used for learning, and learning may be performed only on input so determined.
In this way, when a user provides biased or erroneous input, this can be detected and learning can be performed using only appropriate input.

(vii) As an aspect different from the image processing device described above, the present invention provides a recognition accuracy improving method in which a computer executes a process comprising: a step of recognizing an image related to image processing or a user who performs an operation related to the image processing; a step of evaluating the correctness of the recognition result or receiving an evaluation from the user; a recognition complementing step of presenting all or part of a recognition target judged, based on the evaluation, not to have sufficient recognition accuracy, and requesting the user's input regarding correctness; and a step of using the obtained input to perform learning related to the recognition for all or part of the recognition target, wherein the recognition complementing step is executed during a waiting period in which the user waits for the end of the image processing or for a response to the operation.

Preferred aspects of the present invention also include combinations of any of the aspects described above.
Besides the embodiments described above, various modifications of the present invention are possible. Such modifications should not be construed as falling outside the scope of the present invention. The present invention is intended to include all modifications within the meaning and scope equivalent to the claims.

10: Operation unit, 11: Control unit, 11a: Pattern recognition unit, 11b: Recognition evaluation unit, 11c: Recognition complementing unit, 11d: Machine learning unit, 12: Display unit, 13: Printing unit, 14: Communication interface circuit, 15: Scanner unit, 16: Image data generation circuit, 17a: Paper feed tray, 17b: Bypass tray, 18a, 18b, 18c: Output trays, 20: Information processing device
30: Finger, 31: Setting keys, 32: Copy count area, 33: Machine illustration, 34: [Job Status] key, 35: [Reset] key, 36: [Start] key, 37: [Cancel Copy] key
100: Image processing device

Claims

An image processing device comprising:
a pattern recognition unit that recognizes an image related to image processing or a user who performs an operation related to the image processing;
a recognition evaluation unit that evaluates the correctness of a recognition result or receives an evaluation from the user;
a recognition complementing unit that presents all or part of a recognition target judged, based on the evaluation, not to have sufficient recognition accuracy, and requests the user's input regarding correctness; and
a machine learning unit that uses the obtained input to perform learning related to the recognition for all or part of the recognition target,
wherein the recognition complementing unit presents the recognition target during a waiting period in which the user waits for the end of the image processing or for a response to the operation.

The image processing device according to claim 1, wherein the recognition target is an image related to the image processing or to the user, and
the recognition complementing unit further includes a display unit that presents all or part of the image.

The image processing device according to claim 1, wherein the recognition target is audio related to the user, and
the recognition complementing unit further includes an audio output unit that presents all or part of the audio.

The image processing device according to claim 1, wherein the waiting period is a period that excludes both the power saving mode and any state from which a shift to the power saving mode is possible.

The image processing device according to any one of claims 1 to 4, wherein the recognition complementing unit identifies the user who enters the input regarding correctness, and
the machine learning unit performs the learning for each identified user.

The image processing device according to any one of claims 1 to 4, wherein the recognition complementing unit determines whether the obtained input should be used for learning, and learning is performed only on input determined to be so used.

A recognition accuracy improving method in which a computer executes a process comprising:
a step of recognizing an image related to image processing or a user who performs an operation related to the image processing;
a step of evaluating the correctness of a recognition result or receiving an evaluation from the user;
a recognition complementing step of presenting all or part of a recognition target judged, based on the evaluation, not to have sufficient recognition accuracy, and requesting the user's input regarding correctness; and
a step of using the obtained input to perform learning related to the recognition for all or part of the recognition target,
wherein the recognition complementing step is executed during a waiting period in which the user waits for the end of the image processing or for a response to the operation.