JP2020024763A

JP2020024763A - Reflection detection system

Info

Publication number: JP2020024763A
Application number: JP2019210704A
Authority: JP
Inventors: 祐長谷川; Yu Hasegawa
Original assignee: Panasonic Corp
Current assignee: Panasonic Corp
Priority date: 2019-11-21
Filing date: 2019-11-21
Publication date: 2020-02-13

Abstract

To generate a highly accurate reflection detection model capable of detecting a reflection image area, which indicates where light is reflected in a captured image, even when an arbitrary captured image is input. SOLUTION: A learning processing method in an AI server includes the steps of: generating a false image B' of an original image A on the basis of the original image A including a reflection area indicating a reflection position of light; evaluating authenticity of the false image B' according to comparison of the false image B' with learning images B1, B2 and B3 generated such that a reflection area g1 in the original image A is distinguishable from other image areas; generating a false image A' of the original image A on the basis of the false image B'; evaluating authenticity of the false image A' according to comparison between the false image A' and the original image A; and generating a learned model used for detecting the reflection area g1 in any captured image on the basis of evaluation results of the authenticity of the false image B' and the false image A'. SELECTED DRAWING: Figure 7

Description

The present disclosure relates to a reflection detection system.

Patent Document 1 discloses an image evaluation device that, in order to identify pixels usable for processing such as character reading, acquires a luminance image indicating the luminance value of each pixel in a captured image, determines a luminance threshold based on the frequency distribution of the pixel luminance values, and applies processing that emphasizes pixels with high luminance values to the luminance image to generate a high-luminance-portion enhanced image. Based on the result of determining, for each pixel included in the high-luminance-portion enhanced image, whether its luminance value exceeds the luminance threshold, the image evaluation device identifies pixels whose luminance exceeds the threshold as high-luminance pixels.
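The thresholding idea summarized above can be illustrated with a minimal sketch. The source does not state how the threshold is derived from the frequency distribution, so the percentile rule used here, the parameter `keep_fraction`, and the function names are assumptions for illustration only; the emphasis step is omitted.

```python
from typing import List

LuminanceImage = List[List[int]]  # lum[y][x], integer luminance in 0..255


def luminance_threshold(lum: LuminanceImage, keep_fraction: float = 0.95) -> int:
    """Pick a threshold from the luminance histogram.

    Assumption: the threshold is the lowest level at which the cumulative
    histogram covers `keep_fraction` of all pixels. The source only says the
    threshold is based on the frequency distribution.
    """
    hist = [0] * 256
    for row in lum:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    cumulative = 0
    for level, count in enumerate(hist):
        cumulative += count
        if cumulative >= keep_fraction * total:
            return level
    return 255


def high_luminance_mask(lum: LuminanceImage, threshold: int) -> List[List[bool]]:
    """Mark each pixel whose luminance exceeds the threshold."""
    return [[value > threshold for value in row] for row in lum]
```

A fixed global threshold of this kind responds only to brightness, which is consistent with the limitation noted in the text that follows: it does not by itself identify a light reflection area in a photograph.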

Patent Document 1: JP 2017-162030 A

However, Patent Document 1 gives no consideration to detecting a light reflection image area that arises in a captured image when, for example, an image captured by a mobile terminal such as a smartphone includes a portion where light such as illumination light or external light is reflected.

The present disclosure has been devised in view of the conventional situation described above. Its object is to provide a reflection detection system that can generate a highly accurate reflection detection model capable of detecting a reflection image area, which indicates where light is reflected in a captured image, even when an arbitrary captured image is input, and that properly ensures the reliability of the reflection image area detected in an arbitrary captured image.

The present disclosure provides a reflection detection system in which a server device, which holds a captured image to be subjected to learning processing and including a reflection image area indicating a light reflection point, and a mobile terminal having an imaging unit and a display unit are communicably connected to each other. The server device includes a processor and a memory. The processor, in cooperation with the memory: generates a first similar image of the captured image based on the captured image; evaluates the authenticity of the first similar image according to a comparison between the first similar image and a learning image generated such that the reflection image area in the captured image is distinguishable from other image areas; generates a second similar image of the captured image based on the first similar image; evaluates the authenticity of the second similar image according to a comparison between the second similar image and the captured image; generates, based on the authenticity evaluation results for the first similar image and the second similar image, a reflection detection model used to detect the reflection image area in an arbitrary captured image; and, upon acquiring an arbitrary captured image captured by the imaging unit, uses the reflection detection model to detect the reflection image area in that captured image, generates an output image in which the reflection image area is processed so as to be distinguishable from other image areas, and transmits the output image to the mobile terminal. The mobile terminal uses the output image transmitted from the server device to display, on the display unit, a result of character recognition performed on the other image areas of the output image, that is, the areas other than the reflection image area.

According to the present disclosure, a highly accurate reflection detection model capable of detecting a reflection image area, which indicates where light is reflected in a captured image, can be generated even when an arbitrary captured image is input, and the reliability of the reflection image area detected in an arbitrary captured image can be properly ensured.

FIG. 1 is a block diagram showing the hardware configuration of the reflection detection system according to Embodiment 1.
FIG. 2 is a flowchart illustrating an example of an operation procedure for preparing and preprocessing an original image A.
FIG. 3 is a flowchart illustrating an example of an operation procedure for generating a learning image B1.
FIG. 4 is a flowchart illustrating an example of an operation procedure for generating a learning image B2.
FIG. 5 is a flowchart illustrating an example of an operation procedure for generating a learning image B3.
FIG. 6 is a diagram showing the original image A, the preprocessed image B0, and the learning images B1, B2, and B3.
FIG. 7 is a flowchart illustrating an example of a learning operation procedure of the AI server.
FIG. 8 is a flowchart illustrating an example of an operation procedure by which the AI server detects a reflection point.
FIG. 9 is a flowchart illustrating an example of a translation operation procedure of the smartphone.
FIG. 10 is a diagram showing an example of a shooting screen of the smartphone on which a captured image is displayed.
FIG. 11 is a diagram showing an example of a confirmation screen of the smartphone on which a superimposed image is displayed.
FIG. 12 is a diagram showing an example of a translation result screen displayed on the smartphone.
FIG. 13 is a diagram showing another example of a translation result screen displayed on the smartphone.
FIG. 14 is a diagram showing an example of a shooting screen of the smartphone on which another captured image is displayed.
FIG. 15 is a diagram showing an example of a confirmation screen of the smartphone on which a superimposed image including a partially character-recognizable range is displayed.
FIG. 16 is a diagram showing an example of a confirmation screen in which the partially character-recognizable range has been changed.
FIG. 17 is a diagram showing an example of a translation result screen displayed on the smartphone.
FIG. 18 is a diagram showing another example of a translation result screen displayed on the smartphone.

(Background leading to Embodiment 1)
For example, a traveler such as a foreign visitor may use a mobile terminal such as their own smartphone at a travel destination to capture an image of a subject containing text whose content the traveler wants to check. In response to the traveler's operation, the mobile terminal performs character recognition on the text contained in the captured image and converts the character recognition result into the traveler's native language using a preinstalled translation application. The traveler can thereby check the content of the text contained in any image captured by the mobile terminal.

However, as described above, if a light reflection image area exists in the captured image, the text in that area cannot be recognized. Accordingly, when a character-unrecognizable area (that is, a light reflection image area) is detected in the translation result for the text of an arbitrary captured image displayed on the mobile terminal, clearly indicating that area in the captured image would make it possible to offer applications such as translation that are more helpful to travelers such as foreign visitors.

Embodiment 1 below therefore describes examples of a learning processing method, a server device, and a reflection detection system that can generate a highly accurate reflection detection model capable of detecting a reflection image area, which indicates where light is reflected in a captured image, even when an arbitrary captured image is input, and that properly ensure the reliability of the reflection image area detected in an arbitrary captured image.

Hereinafter, an embodiment that specifically discloses the learning processing method, the server device, and the reflection detection system according to the present disclosure will be described in detail with reference to the drawings as appropriate. Unnecessarily detailed description may, however, be omitted. For example, detailed description of well-known matters and redundant description of substantially identical configurations may be omitted. This is to keep the following description from becoming unnecessarily redundant and to facilitate understanding by those skilled in the art. The accompanying drawings and the following description are provided so that those skilled in the art can fully understand the present disclosure, and are not intended to limit the subject matter recited in the claims.

FIG. 1 is a block diagram showing the hardware configuration of the reflection detection system 5 according to Embodiment 1. The reflection detection system 5 includes an AI (artificial intelligence) server 10, a smartphone 30, and a translation server 50. The AI server 10, the smartphone 30, and the translation server 50 are communicably connected to one another via a network 70.

The AI server 10, as an example of the server device, includes a processor 11, an AI processing unit 13, a memory 15, a storage 17, and a communication unit 18.

The processor 11 is configured using, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array). The processor 11 functions as a controller that governs the operation of the AI server 10, and performs control processing for overseeing the operation of each unit of the AI server 10 as a whole, data input/output processing with each unit of the AI server 10, data arithmetic (calculation) processing, and data storage processing. The processor 11 operates according to programs and data stored in the memory 15. The processor 11 uses the memory 15 during operation and may temporarily store in the memory 15 data or information that it has generated or acquired.

The AI processing unit 13 is a processor configured using, for example, a GPU (Graphics Processing Unit) suited to real-time image processing of an arbitrary captured image transmitted from the smartphone 30 (for example, detection of light reflection points in the captured image and generation of an output image using the learned model, both described later). The AI processing unit 13 executes machine learning based on the CycleGAN technique using an original image and learning images, described later, to generate a learned model, and stores the data of the learned model (that is, learned model data) in the storage 17. The AI processing unit 13 has a memory 13z. When executing, for example, the process of detecting light reflection points in an arbitrary captured image transmitted from the smartphone 30, the AI processing unit 13 reads the learned model data stored in the storage 17 and temporarily loads the learned model into the memory 13z. The AI processing unit 13 receives an arbitrary captured image captured by the smartphone 30 and, using some of the functions of the learned model (for example, the function of a fake image generator that generates, from an original image, a fake image resembling that original image, and the function of a fake image discriminator that evaluates the authenticity of the generated fake image; details are given later), outputs a visualized image that includes the image area of the detected light reflection points.

The memory 15 is configured using, for example, a RAM (Random Access Memory) and a ROM (Read Only Memory), and temporarily holds the programs and data necessary for executing the operation of the AI server 10 as well as data or information generated during operation. The RAM is, for example, a work memory used while the AI server 10 operates. The ROM stores and holds in advance, for example, programs and data for controlling the AI server 10.

The storage 17 is a recording device configured using, for example, an HDD (Hard Disk Drive) or an SSD (Solid State Drive). The storage 17 stores, for example, data or information generated or acquired by the processor 11 or the AI processing unit 13. The storage 17 stores the learned model data generated by the AI processing unit 13 (see FIG. 1).

The communication unit 18 is connected to the network 70 using, for example, a wired LAN (Local Area Network) or a wireless LAN. The communication unit 18 can communicate with the translation server 50 connected to the network 70, and can also communicate with the smartphone 30 carried by a traveler such as a foreign visitor (an example of a user). The communication unit 18 receives an arbitrary captured image transmitted from the smartphone 30 (that is, a captured image of an arbitrary subject bearing text whose content the traveler wants to check). The communication unit 18 transmits the output image (described later) generated as a result of the light reflection point detection process to the smartphone 30 or the translation server 50.

The smartphone 30 includes a processor 31, an imaging unit 32, a display unit 33, an input unit 34, a communication unit 35, and a memory 36. The smartphone 30 is carried by, for example, a traveler such as a foreign visitor and is held in the hand during use. At least an application capable of executing character recognition processing (a character recognition application) and an application capable of executing translation processing (a translation application) are installed on the smartphone 30 in advance in an executable state.

The processor 31 is configured using, for example, a CPU, an MPU, a DSP, or an FPGA. The processor 31 functions as a controller that governs the operation of the smartphone 30, and performs control processing for overseeing the operation of each unit of the smartphone 30 as a whole, data input/output processing with each unit of the smartphone 30, data arithmetic (calculation) processing, and data storage processing. The processor 31 operates according to programs and data stored in the memory 36. The processor 31 uses the memory 36 during operation and may temporarily store in the memory 36 data or information that it has generated or acquired.

The imaging unit 32 includes a condensing lens and a solid-state imaging element such as a CCD (Charge Coupled Device) image sensor or a CMOS (Complementary Metal Oxide Semiconductor) image sensor. While the smartphone 30 is powered on, the imaging unit 32 continuously outputs to the processor 31 the captured video data of the subject obtained through imaging by the solid-state imaging element. The subject is, for example, an information medium such as a signboard or an advertisement containing text whose content a traveler such as a foreign visitor wants to check, although it goes without saying that the subject is not limited to such media.

The display unit 33 is configured using, for example, an LCD (Liquid Crystal Display) or an organic EL (Electroluminescence) display. In addition to indicating the current state of the smartphone 30, it displays various screens (for example, the shooting screen shown during imaging by the imaging unit 32 (the so-called preview screen), a confirmation screen described later, and a screen showing a translation result).

The input unit 34 accepts various input operations by the user (for example, the aforementioned traveler such as a foreign visitor) and outputs a signal corresponding to each input operation to the processor 31. The display unit 33 and the input unit 34 may be configured as a known touch panel TP.

The communication unit 35 is configured using a communication circuit capable of wireless communication with the AI server 10 and the translation server 50 connected to the network 70. The communication unit 35 is connected to the network 70 via a mobile communication network (not shown), for example 4G (fourth-generation mobile communication system) or 5G (fifth-generation mobile communication system). The communication unit 35 transmits the data of an arbitrary captured image captured by the imaging unit 32 to the AI server 10.

The memory 36 is configured using, for example, a RAM and a ROM, and temporarily holds the programs and data necessary for executing the operation of the smartphone 30 as well as data or information generated during operation. The RAM is, for example, a work memory used while the smartphone 30 operates. The ROM stores and holds in advance, for example, programs and data for controlling the smartphone 30.

Note that the smartphone 30 is merely an example of a device having an imaging function and a communication function. The device is not limited to a smartphone and may be a camera, a tablet terminal, a notebook PC, a surveillance camera, or the like that can connect to the network 70.

The translation server 50 includes a processor 51, a memory 52, a storage 53, and a communication unit 54. The translation server 50 may be, for example, a cloud server connected to the network 70, or may be configured as an on-premises server installed at a business site (not shown) of the operator where the AI server 10 is located. The translation server 50 translates character information corresponding to the text in a captured image or an output image transmitted from the smartphone 30 or the AI server 10 into a predetermined language (for example, a language set in advance by the user of the smartphone 30), and returns the character information corresponding to the translation result to the smartphone 30.

The processor 51 is configured using, for example, a CPU, an MPU, a DSP, or an FPGA. The processor 51 functions as a controller that governs the operation of the translation server 50, and performs control processing for overseeing the operation of each unit of the translation server 50 as a whole, data input/output processing with each unit of the translation server 50, data arithmetic (calculation) processing, and data storage processing. The processor 51 operates according to programs and data stored in the memory 52. The processor 51 uses the memory 52 during operation and may temporarily store in the memory 52 data or information that it has generated or acquired.

The memory 52 is configured using, for example, a RAM and a ROM, and temporarily holds the programs and data necessary for executing the operation of the translation server 50 as well as data or information generated during operation. The RAM is, for example, a work memory used while the translation server 50 operates. The ROM stores and holds in advance, for example, programs and data for controlling the translation server 50.

The storage 53 is a recording device configured using, for example, an HDD or an SSD. The storage 53 stores, for example, data or information generated or acquired by the processor 51. The storage 53 also includes a dictionary DB 53z in which dictionary data for the official language of each country, referred to during translation processing, is registered in advance. The translation server 50 may periodically update the contents of the dictionary DB 53z by communicating at regular intervals with a dedicated dictionary data management server (not shown) connected via the network 70 or another network (not shown).

The communication unit 54 is connected to the network 70 using a wired LAN, a wireless LAN, or the like. The communication unit 54 can communicate with the AI server 10 and the smartphone 30 connected to the network 70. When the communication unit 54 receives the character information of a character recognition result from the smartphone 30, the received character information is translated into a predetermined language, which is set in advance or on each occasion so as to correspond to the official language of the user of the smartphone 30, and the character information of the translation result is returned to the smartphone 30.

Although Embodiment 1 describes the case in which the translation server 50 translates the character information of the character recognition result, the smartphone 30 may instead launch its installed translation application and translate the character information of the character recognition result into the predetermined language.

Next, the operation of the reflection detection system 5 according to Embodiment 1 described above will be explained with reference to the drawings.

The reflection detection system 5 according to Embodiment 1 performs character recognition processing on the character information contained in an image, captured by the smartphone 30, of an advertisement or the like bearing text, and translates the recognized character information into a predetermined language. When the image captured by the smartphone 30 contains a reflection of light such as illumination light or external light, the reflection detection system 5 detects the region containing the reflection (hereinafter sometimes referred to as the "reflection region") and outputs a visible-light image (hereinafter sometimes referred to as the "output image") in which the reflection region is rendered more readily distinguishable than the other regions. In Embodiment 1, the AI server 10 performs machine learning using CycleGAN, an AI model that has attracted attention in recent years, and generates an AI model (that is, a learned model) for detecting the reflection region described above in a captured image of an arbitrary subject captured by the smartphone 30. Machine learning with CycleGAN uses both the captured image serving as the original image and the learning images generated from that original image.
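As a rough illustration of the learning cycle summarized in the abstract (original image A, false image B', false image A'), the following is a minimal sketch rather than the disclosed implementation. The generators `g_ab` and `g_ba` and the discriminator `d_b` are stand-in callables, images are flattened lists of floats, and the loss choices (a log loss for the authenticity of B', an L1 comparison of A' with A, and the weight `lam`) are assumptions borrowed from the general CycleGAN formulation; the source does not specify them.

```python
import math
from typing import Callable, List

Image = List[float]  # flattened image with values in [0, 1] (illustrative)


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two images of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def generator_objective(
    a: Image,
    g_ab: Callable[[Image], Image],   # hypothetical generator: A -> B'
    g_ba: Callable[[Image], Image],   # hypothetical generator: B' -> A'
    d_b: Callable[[Image], float],    # hypothetical discriminator: P(image is a real learning image)
    lam: float = 10.0,                # assumed weight of the A' vs A comparison
) -> float:
    fake_b = g_ab(a)        # false image B' generated from original image A
    rec_a = g_ba(fake_b)    # false image A' generated from B'
    # Authenticity of B': low loss when the discriminator rates B' as real.
    adversarial = -math.log(max(d_b(fake_b), 1e-12))
    # Authenticity of A': modeled here as closeness of A' to A.
    cycle = l1_distance(rec_a, a)
    return adversarial + lam * cycle
```

In an actual training loop the generator parameters would be updated to reduce this value while the discriminator is trained on the learning images B1, B2, and B3 against generated images; that update step depends on the chosen framework and is omitted here.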

(Generation of learning images)
First, the generation of learning images by the AI server 10 will be described. FIG. 2 is a flowchart illustrating an example of an operation procedure for preparing and preprocessing the original image A. A user (for example, a traveler such as a foreign visitor; the same applies hereinafter) uses the smartphone 30 to capture an image of printed matter such as an advertisement (an example of a subject) and thereby prepares an original image, which is the captured image (the original image A in FIG. 6) (S1). For the purposes of describing Embodiment 1, the original image A is assumed to contain a light reflection region caused by external light, illumination light, or the like.

The user performs predetermined preprocessing on the original image A and obtains a preprocessed image B0 (S2). The predetermined preprocessing of the original image A is, for example, a process in which the user, operating an image editing application (described below) installed on the smartphone 30 or a PC (not shown), fills the light reflection region appearing in part of the captured image with a predetermined color. For example, for a captured image displayed on the screen of the smartphone 30, the user fills the reflection region with red using an image editing application (for example, a drawing tool or image processing software). In the preprocessed image (that is, the preprocessed image B0 in FIG. 6), a marker region mk filled with red is drawn. The smartphone 30 transmits the data of the preprocessed image B0 to the AI server 10. The AI server 10 stores the data of the preprocessed image B0 received from the smartphone 30 in the storage 17.
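In the text the fill is done by hand in an image editing application. As a rough programmatic equivalent, the sketch below paints a marker over a region of an image. The rectangular region, the RGB tuple representation, and the name `paint_marker` are all assumptions for illustration; a real marker region mk may have any shape.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Image = List[List[Pixel]]      # image[y][x]

RED: Pixel = (255, 0, 0)       # marker color used in the text's example


def paint_marker(image: Image, region: Tuple[int, int, int, int],
                 color: Pixel = RED) -> Image:
    """Return a copy of `image` with the inclusive rectangle (x0, y0, x1, y1) filled.

    The rectangle is clipped to the image bounds; the input image is not modified.
    """
    x0, y0, x1, y1 = region
    result = [row[:] for row in image]
    for y in range(max(0, y0), min(len(result), y1 + 1)):
        for x in range(max(0, x0), min(len(result[y]), x1 + 1)):
            result[y][x] = color
    return result
```

Returning a copy keeps the original image A available alongside the preprocessed image B0, which matters because the learning described in this document uses both.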

ＡＩサーバ１０は、前処理後の画像Ｂ０を用いて、複数の学習画像を生成する。ここでは、ＡＩサーバ１０が３枚の学習画像Ｂ１，Ｂ２，Ｂ３を生成する例を説明するが、任意の枚数の学習画像を生成してもよい。多くの学習画像を用意することで、ＡＩサーバ１０における学習済みモデルを生成する処理（言い換えると、学習済みモデルに用いられる学習パラメータの更新）の精度（つまり、学習精度）が向上する。 The AI server 10 generates a plurality of learning images using the preprocessed image B0. Here, an example in which the AI server 10 generates three learning images B1, B2, and B3 will be described, but an arbitrary number of learning images may be generated. By preparing a large number of learning images, the accuracy (that is, learning accuracy) of the process of generating the learned model in the AI server 10 (in other words, updating the learning parameters used for the learned model) is improved.

図３は、学習画像Ｂ１を生成する動作手順の一例を説明するフローチャートである。図３に示す処理は、例えばＡＩサーバ１０のＡＩ処理部１３により実行される。ＡＩサーバ１０のＡＩ処理部１３は、ストレージ１７に記憶された前処理後の画像Ｂ０から１画素の画素値を取得する（Ｓ１１）。ＡＩ処理部１３は、１画素の画素値の取得の際に、例えば元画像Ａと同一サイズを有する前処理後の画像Ｂ０に対して２次元座標（つまり、ＸＹ座標）を設定し、Ｘ方向及びＹ方向に画素単位に取得対象の画素を移動しながら該当する画素の画素値を取得する。 FIG. 3 is a flowchart illustrating an example of an operation procedure for generating the learning image B1. The process illustrated in FIG. 3 is executed by, for example, the AI processing unit 13 of the AI server 10. The AI processing unit 13 of the AI server 10 acquires the pixel value of one pixel from the preprocessed image B0 stored in the storage 17 (S11). When acquiring the pixel value of one pixel, the AI processing unit 13 sets two-dimensional coordinates (that is, XY coordinates) on the preprocessed image B0, which has, for example, the same size as the original image A, and acquires the pixel value of the corresponding pixel while moving the acquisition target pixel in the X direction and the Y direction in pixel units.

ＡＩ処理部１３は、取得された１画素の画素値に基づいて、その画素が塗り潰された画素であるか否かを判別する（Ｓ１２）。塗り潰された画素である場合（Ｓ１２、ＹＥＳ）、ＡＩ処理部１３は、この画素の画素値を所定の色に設定（例えば赤色で塗り潰すように設定）し、反射領域の出力画素（つまり、図３により生成される学習画像Ｂ１内の対応する画素）と設定する（Ｓ１３）。 The AI processing unit 13 determines, based on the acquired pixel value of the one pixel, whether or not the pixel is a filled pixel (S12). If the pixel is a filled pixel (S12, YES), the AI processing unit 13 sets the pixel value of this pixel to a predetermined color (for example, fills it with red) and sets it as an output pixel of the reflection area (that is, the corresponding pixel in the learning image B1 generated according to FIG. 3) (S13).

一方、取得された１画素が塗り潰された画素でない場合（Ｓ１２、ＮＯ）、ＡＩ処理部１３は、この画素を白色に設定（例えば白色で塗り潰すように設定）し、非反射領域の出力画素（前述参照）とする（Ｓ１４）。 On the other hand, if the acquired pixel is not a filled pixel (S12, NO), the AI processing unit 13 sets this pixel to white (for example, fills it with white) and sets it as an output pixel of the non-reflection area (see above) (S14).

ステップＳ１３又はステップＳ１４の処理後、ＡＩ処理部１３は、ステップＳ１１において取得された１画素が終端の画素であるか（つまり、前処理後の画像Ｂ０の終端の画素に到達したか）否かを判別する（Ｓ１５）。終端の画素でない場合（Ｓ１５、ＮＯ）、ＡＩ処理部１３は、前処理後の画像Ｂ０に対し、取得対象の画素の位置をＸ方向又はＹ方向に１画素分移動する（Ｓ１６）。ＡＩ処理部１３の処理はステップＳ１１に戻り、ステップＳ１６において移動された次の１画素を対象として取得して同様の処理を繰り返す。 After the processing in step S13 or step S14, the AI processing unit 13 determines whether or not the pixel acquired in step S11 is the last pixel (that is, whether the last pixel of the preprocessed image B0 has been reached) (S15). If the pixel is not the last pixel (S15, NO), the AI processing unit 13 moves the position of the pixel to be acquired by one pixel in the X direction or the Y direction on the preprocessed image B0 (S16). The processing of the AI processing unit 13 then returns to step S11, acquires the next pixel selected in step S16, and repeats the same processing.

一方、終端の画素である場合（Ｓ１５、ＹＥＳ）、ＡＩ処理部１３は、ステップＳ１１，Ｓ１２，Ｓ１３，Ｓ１４，Ｓ１６，Ｓ１５の一連の処理により得られた画像を学習画像Ｂ１（図６参照）として生成してメモリ１３ｚに保存する（Ｓ１７）。この後、ＡＩ処理部１３は、学習画像Ｂ１の生成処理を終了する。 On the other hand, if the pixel is the last pixel (S15, YES), the AI processing unit 13 generates the image obtained by the series of steps S11, S12, S13, S14, S16, and S15 as the learning image B1 (see FIG. 6) and stores it in the memory 13z (S17). Thereafter, the AI processing unit 13 ends the generation processing of the learning image B1.
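
The FIG. 3 flow can be illustrated with a minimal Python sketch. It is not the implementation disclosed here: the image is assumed to be a list of rows of (r, g, b) tuples, and the names `is_marker`, `make_b1`, and the tolerance `tol` are hypothetical. In particular, the test used to decide that a pixel is "filled" (close to pure red) is an assumption, since the description does not specify it.

```python
# Minimal sketch of the FIG. 3 flow (S11 to S17). All names are hypothetical.
RED = (255, 0, 0)
WHITE = (255, 255, 255)


def is_marker(pixel, tol=40):
    """Assumed 'filled pixel' test: the pixel is close to pure red."""
    r, g, b = pixel
    return r >= 255 - tol and g <= tol and b <= tol


def make_b1(b0):
    """Build learning image B1 from the preprocessed image B0."""
    b1 = []
    for row in b0:                      # move in the Y direction
        out_row = []
        for pixel in row:               # move in the X direction (S11, S16)
            if is_marker(pixel):        # S12
                out_row.append(RED)     # S13: reflection-area output pixel
            else:
                out_row.append(WHITE)   # S14: non-reflection-area output pixel
        b1.append(out_row)
    return b1                           # S17: completed learning image B1
```

A tolerance is used instead of an exact match because an image saved by a drawing tool may not contain exactly (255, 0, 0); that choice is also an assumption.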

図４は、学習画像Ｂ２を生成する動作手順の一例を説明するフローチャートである。図４に示す処理は、例えばＡＩサーバ１０のＡＩ処理部１３により実行される。ＡＩサーバ１０のＡＩ処理部１３は、元画像Ａから１画素の画素値を取得する（Ｓ２１）。ＡＩ処理部１３は、前処理後の画像Ｂ０から、元画像Ａの１画素に対応する（つまり、ＸＹ座標が同じである）１画素の画素値を取得する（Ｓ２２）。ＡＩ処理部１３は、その取得された１画素の画素値に基づいて、ステップＳ２２において取得された前処理後の画像Ｂ０の１画素が、塗り潰された画素であるか（つまり、光の反射領域にある画素であるか）否かを判別する（Ｓ２３）。 FIG. 4 is a flowchart illustrating an example of an operation procedure for generating the learning image B2. The process illustrated in FIG. 4 is executed by, for example, the AI processing unit 13 of the AI server 10. The AI processing unit 13 of the AI server 10 acquires the pixel value of one pixel from the original image A (S21). The AI processing unit 13 acquires, from the preprocessed image B0, the pixel value of the one pixel corresponding to that pixel of the original image A (that is, the pixel with the same XY coordinates) (S22). Based on the acquired pixel value, the AI processing unit 13 determines whether or not the pixel of the preprocessed image B0 acquired in step S22 is a filled pixel (that is, a pixel in the light reflection area) (S23).

塗り潰された画素である場合（Ｓ２３、ＹＥＳ）、ＡＩ処理部１３は、元画像Ａの１画素値から輝度値を計算する（Ｓ２４）。例えば、ＡＩ処理部１３は、赤色成分をｒ、緑色成分をｇ、青色成分をｂ、輝度値ｙとすると、「ｙ＝０．２９９ｒ＋０．５８７ｇ＋０．１１４ｂ」の式により輝度値ｙを算出可能であり、以下同様である。ＡＩ処理部１３は、元画像Ａの１画素に対応する出力画素（つまり、図４に示す動作により生成される学習画像Ｂ２内の対応する画素）のＲ画素に、ステップＳ２４で計算された輝度値を設定する（Ｓ２５）。ＡＩ処理部１３は、出力画素のＧ，Ｂ画素にそれぞれ輝度値０を設定する（Ｓ２６）。 If the pixel is a filled pixel (S23, YES), the AI processing unit 13 calculates a luminance value from the pixel value of the one pixel of the original image A (S24). For example, letting r be the red component, g the green component, b the blue component, and y the luminance value, the AI processing unit 13 can calculate the luminance value y by the formula “y = 0.299r + 0.587g + 0.114b”; the same applies hereinafter. The AI processing unit 13 sets the luminance value calculated in step S24 in the R pixel of the output pixel corresponding to the one pixel of the original image A (that is, the corresponding pixel in the learning image B2 generated by the operation illustrated in FIG. 4) (S25). The AI processing unit 13 sets a luminance value of 0 in each of the G and B pixels of the output pixel (S26).

一方、ステップＳ２２において取得された前処理後の画像Ｂ０の１画素が塗り潰された画素でない場合（Ｓ２３、ＮＯ）、ＡＩ処理部１３は、元画像Ａの１画素の画素値を出力画素（前述参照）の画素値に設定する（Ｓ２７）。 On the other hand, if the pixel of the preprocessed image B0 acquired in step S22 is not a filled pixel (S23, NO), the AI processing unit 13 sets the pixel value of the one pixel of the original image A as the pixel value of the output pixel (see above) (S27).

ステップＳ２６又はステップＳ２７の処理後、ＡＩ処理部１３は、ステップＳ２１において取得された画素が終端の画素であるか（つまり、元画像Ａの終端の画素に到達したか）否かを判別する（Ｓ２８）。終端の画素でない場合（Ｓ２８、ＮＯ）、ＡＩ処理部１３は、元画像Ａに対し、取得対象の画素の位置をＸ方向又はＹ方向に１画素分移動する（Ｓ２９）。ＡＩ処理部１３の処理はステップＳ２１に戻り、ステップＳ２９において移動された次の１画素を対象として取得して同様の処理を繰り返す。 After the processing in step S26 or step S27, the AI processing unit 13 determines whether or not the pixel acquired in step S21 is the last pixel (that is, whether the last pixel of the original image A has been reached) (S28). If the pixel is not the last pixel (S28, NO), the AI processing unit 13 moves the position of the pixel to be acquired by one pixel in the X direction or the Y direction on the original image A (S29). The processing of the AI processing unit 13 then returns to step S21, acquires the next pixel selected in step S29, and repeats the same processing.

一方、終端の画素である場合（Ｓ２８、ＹＥＳ）、ＡＩ処理部１３は、ステップＳ２１，Ｓ２２，Ｓ２３，Ｓ２４，Ｓ２５，Ｓ２６，Ｓ２７，Ｓ２８の一連の処理により得られた画像を学習画像Ｂ２（図６参照）として生成してメモリ１３ｚに保存する（Ｓ３０）。この後、ＡＩ処理部１３は学習画像Ｂ２の生成処理を終了する。 On the other hand, if the pixel is the last pixel (S28, YES), the AI processing unit 13 generates the image obtained by the series of processes of steps S21, S22, S23, S24, S25, S26, S27, and S28 as the learning image B2 (see FIG. 6) and stores it in the memory 13z (S30). Thereafter, the AI processing unit 13 ends the process of generating the learning image B2.
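
As a rough illustration of the FIG. 4 flow, the following Python sketch applies the luminance formula y = 0.299r + 0.587g + 0.114b inside the marker area and copies the original pixel elsewhere. The list-of-tuples image layout, the function names, the red-marker test, and the rounding of the luminance to an integer are all assumptions made for the example and are not stated in the description.

```python
# Minimal sketch of the FIG. 4 flow (S21 to S30). All names are hypothetical.

def is_marker(pixel, tol=40):
    """Assumed 'filled pixel' test: the pixel of B0 is close to pure red."""
    r, g, b = pixel
    return r >= 255 - tol and g <= tol and b <= tol


def luminance(pixel):
    """y = 0.299r + 0.587g + 0.114b, rounded to an integer (assumption)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def make_b2(a, b0):
    """Build learning image B2 from original image A and preprocessed image B0."""
    b2 = []
    for row_a, row_m in zip(a, b0):
        out_row = []
        for pixel_a, pixel_m in zip(row_a, row_m):   # S21, S22: same XY position
            if is_marker(pixel_m):                   # S23
                y = luminance(pixel_a)               # S24
                out_row.append((y, 0, 0))            # S25, S26: R = y, G = B = 0
            else:
                out_row.append(pixel_a)              # S27: keep the original pixel
        b2.append(out_row)
    return b2
```

Compared with B1, this variant keeps the brightness of the original image inside the reflection area, so the red channel still carries how strong the reflection was at each pixel.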

図５は、学習画像Ｂ３を生成する動作手順の一例を説明するフローチャートである。図５に示す処理は、例えばＡＩサーバ１０のＡＩ処理部１３により実行される。ＡＩサーバ１０のＡＩ処理部１３は、元画像Ａから１画素の画素値を取得する（Ｓ３１）。ＡＩ処理部１３は、前処理後の画像Ｂ０から、元画像Ａの１画素に対応する（つまり、ＸＹ座標が同じである）１画素の画素値を取得する（Ｓ３２）。ＡＩ処理部１３は、ステップＳ３１において取得された元画像Ａの１画素の画素値から、例えば上述した算出式を用いて輝度値を計算する（Ｓ３３）。 FIG. 5 is a flowchart illustrating an example of an operation procedure for generating the learning image B3. The process illustrated in FIG. 5 is executed by, for example, the AI processing unit 13 of the AI server 10. The AI processing unit 13 of the AI server 10 acquires a pixel value of one pixel from the original image A (S31). The AI processing unit 13 acquires a pixel value of one pixel corresponding to one pixel of the original image A (that is, the XY coordinates are the same) from the preprocessed image B0 (S32). The AI processing unit 13 calculates a luminance value from the pixel value of one pixel of the original image A acquired in step S31 using, for example, the above-described calculation formula (S33).

ＡＩ処理部１３は、その取得された１画素の画素値に基づいて、ステップＳ３２において取得された前処理後の画像Ｂ０の１画素が、塗り潰された画素であるか（つまり、光の反射領域にある画素であるか）否かを判別する（Ｓ３４）。塗り潰された画素である場合（Ｓ３４、ＹＥＳ）、ＡＩ処理部１３は、元画像Ａの１画素に対応する出力画素（つまり、図５に示す動作により生成される学習画像Ｂ３内の対応する画素）のＲ画素の輝度値を、ステップＳ３３において計算された輝度値に設定する（Ｓ３５）。ＡＩ処理部１３は、出力画素（前述参照）のＧ，Ｂ画素に、それぞれ輝度値０を設定する（Ｓ３６）。 Based on the acquired pixel value, the AI processing unit 13 determines whether or not the pixel of the preprocessed image B0 acquired in step S32 is a filled pixel (that is, a pixel in the light reflection area) (S34). If the pixel is a filled pixel (S34, YES), the AI processing unit 13 sets the luminance value of the R pixel of the output pixel corresponding to the one pixel of the original image A (that is, the corresponding pixel in the learning image B3 generated by the operation illustrated in FIG. 5) to the luminance value calculated in step S33 (S35). The AI processing unit 13 sets a luminance value of 0 in each of the G and B pixels of the output pixel (see above) (S36).

一方、ステップＳ３２において取得された前処理後の画像Ｂ０の１画素が塗り潰された画素でない場合（Ｓ３４、ＮＯ）、ＡＩ処理部１３は、出力画素（前述参照）のＲ，Ｇ，Ｂ画素のそれぞれに、ステップＳ３３において計算された輝度値を設定する（Ｓ３７）。 On the other hand, if the pixel of the preprocessed image B0 acquired in step S32 is not a filled pixel (S34, NO), the AI processing unit 13 sets the luminance value calculated in step S33 in each of the R, G, and B pixels of the output pixel (see above) (S37).

ステップＳ３６又はステップＳ３７の処理後、ＡＩ処理部１３は、ステップＳ３１において取得された画素が終端の画素であるか（つまり、元画像Ａの終端の画素に到達したか）否かを判別する（Ｓ３８）。終端の画素でない場合（Ｓ３８、ＮＯ）、ＡＩ処理部１３は、元画像Ａに対し、取得対象の画素の位置をＸ方向又はＹ方向に１画素分移動する（Ｓ３９）。ＡＩ処理部１３の処理はステップＳ３１に戻り、ステップＳ３９において移動された次の１画素を対象として取得して同様の処理を繰り返す。 After the processing in step S36 or step S37, the AI processing unit 13 determines whether or not the pixel acquired in step S31 is the last pixel (that is, whether the last pixel of the original image A has been reached) (S38). If the pixel is not the last pixel (S38, NO), the AI processing unit 13 moves the position of the pixel to be acquired by one pixel in the X direction or the Y direction on the original image A (S39). The processing of the AI processing unit 13 then returns to step S31, acquires the next pixel selected in step S39, and repeats the same processing.

一方、終端の画素である場合（Ｓ３８、ＹＥＳ）、ＡＩ処理部１３は、ステップＳ３１，Ｓ３２，Ｓ３３，Ｓ３４，Ｓ３５，Ｓ３６，Ｓ３７，Ｓ３８の一連の処理後の画像を学習画像Ｂ３（図６参照）として生成してメモリ１３ｚに保存する（Ｓ４０）。この後、ＡＩ処理部１３は学習画像Ｂ３の生成処理を終了する。 On the other hand, if the pixel is the last pixel (S38, YES), the AI processing unit 13 generates the image obtained by the series of processes of steps S31, S32, S33, S34, S35, S36, S37, and S38 as the learning image B3 (see FIG. 6) and stores it in the memory 13z (S40). Thereafter, the AI processing unit 13 ends the process of generating the learning image B3.
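
The FIG. 5 flow differs from FIG. 4 only in how non-reflection pixels are written: they become gray (R = G = B = luminance) instead of keeping their original color. A minimal Python sketch under the same assumptions as before (hypothetical names, list-of-tuples images, an assumed red-marker test, integer rounding of the luminance) might look as follows.

```python
# Minimal sketch of the FIG. 5 flow (S31 to S40). All names are hypothetical.

def is_marker(pixel, tol=40):
    """Assumed 'filled pixel' test: the pixel of B0 is close to pure red."""
    r, g, b = pixel
    return r >= 255 - tol and g <= tol and b <= tol


def luminance(pixel):
    """y = 0.299r + 0.587g + 0.114b, rounded to an integer (assumption)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def make_b3(a, b0):
    """Build learning image B3 from original image A and preprocessed image B0."""
    b3 = []
    for row_a, row_m in zip(a, b0):
        out_row = []
        for pixel_a, pixel_m in zip(row_a, row_m):   # S31, S32
            y = luminance(pixel_a)                   # S33: always computed
            if is_marker(pixel_m):                   # S34
                out_row.append((y, 0, 0))            # S35, S36: red with brightness y
            else:
                out_row.append((y, y, y))            # S37: gray with brightness y
        b3.append(out_row)
    return b3
```

The result is a grayscale version of the original image in which only the marked reflection area is tinted red.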

図６は、元画像Ａ、前処理後の画像Ｂ０、学習画像Ｂ１，Ｂ２，Ｂ３を示す図である。元画像Ａは、広告や飲食店のメニュー等を被写体としてユーザの操作に基づいてスマートフォン３０により撮像された撮像画像である。元画像Ａには、照明光や外光等の光による反射領域ｇ１が存在し、反射領域ｇ１の近傍では、文字認識が不可である（言い換えると、文字情報が判読できない）。 FIG. 6 is a diagram showing an original image A, an image B0 after preprocessing, and learning images B1, B2, and B3. The original image A is a captured image captured by the smartphone 30 based on a user operation with an advertisement, a menu of a restaurant or the like as a subject. In the original image A, there is a reflection area g1 due to light such as illumination light or external light, and character recognition is not possible near the reflection area g1 (in other words, character information cannot be read).

前処理後の画像Ｂ０は、元画像Ａに対して前処理（図２参照）を行った画像である。前処理後の画像Ｂ０は、ユーザが描画ツールや画像処理ソフトを使用して反射領域を赤色で塗り潰したマーカ領域ｍｋが含まれる。 The image B0 after the preprocessing is an image obtained by performing the preprocessing (see FIG. 2) on the original image A. The image B0 after the preprocessing includes a marker area mk in which the reflection area is filled with red by the user using a drawing tool or image processing software.

学習画像Ｂ１は、前処理後の画像Ｂ０に対し、マーカ領域ｍｋを所定の色（ここでは、赤色）に設定し、その他の領域を背景色（白色）に設定した画像である。なお、マーカ領域ｍｋに設定される所定の色は、赤色でなく、青色等の任意の色でもよい。また、背景色は、白色に限らず、緑色や青色等、撮像画像にあまり含まれない色でもよい。 The learning image B1 is an image in which the marker area mk is set to a predetermined color (here, red) and the other areas are set to a background color (white) with respect to the preprocessed image B0. Note that the predetermined color set in the marker area mk may be an arbitrary color such as blue instead of red. Further, the background color is not limited to white, but may be a color that is not much included in the captured image, such as green or blue.

学習画像Ｂ２は、元画像Ａから輝度値を算出し、マーカ領域ｍｋでＲ，Ｇ，Ｂ成分のうち、Ｒ成分を算出した輝度値に置換し、Ｇ，Ｂ成分を輝度値０に設定し、その他の領域を元画像Ａの画素値にした画像である。 The learning image B2 is an image obtained by calculating a luminance value from the original image A and, in the marker area mk, replacing the R component with the calculated luminance value and setting the G and B components to a luminance value of 0, while the other areas keep the pixel values of the original image A.

学習画像Ｂ３は、元画像Ａから輝度値を算出した後、マーカ領域ｍｋでＲ成分を輝度値に置換し、その他の領域でＲ，Ｇ，Ｂ成分を輝度値に置換した画像である。 The learning image B3 is an image in which, after calculating the luminance value from the original image A, the R component is replaced with the luminance value in the marker area mk, and the R, G, B components are replaced with the luminance value in the other areas.

（学習済モデルを生成するための機械学習）
図７は、ＡＩサーバ１０の学習の動作手順の一例を説明するフローチャートである。図７に示す処理は、例えばＡＩサーバ１０のＡＩ処理部１３により実行される。ＡＩサーバ１０のＡＩ処理部１３は、ＡＩモデル（例えば前述したＣｙｃｌｅＧＡＮ）において使用されるパラメータ（以下、「学習パラメータ」という）を設定する（Ｓ５１）。 (Machine learning to generate a trained model)
FIG. 7 is a flowchart illustrating an example of a learning operation procedure of the AI server 10. The process illustrated in FIG. 7 is executed by, for example, the AI processing unit 13 of the AI server 10. The AI processing unit 13 of the AI server 10 sets parameters (hereinafter referred to as "learning parameters") used in the AI model (for example, the above-described CycleGAN) (S51).

学習パラメータは、例えばＡＩモデルを形成するニューラルネットワークを学習する際のＬｅａｒｎｉｎｇＲａｔｅ（つまり、学習率）である。実施の形態１の機械学習では、例えばＣｙｃｌｅＧＡＮを用いたＡＩモデルの学習パラメータを最適化する。ＣｙｃｌｅＧＡＮを用いたＡＩモデルは、例えば、Ｂ´生成器、偽Ｂ評価器、Ｂ−Ｂ´類似度評価器、Ａ´生成器、偽Ａ評価器、及びＡ−Ａ´類似度評価器を含む。また、ＣｙｃｌｅＧＡＮを用いたＡＩモデルでは、元画像Ａ、元画像Ａの偽画像Ａ´、学習画像Ｂ、学習画像Ｂの偽画像Ｂ´が用いられる。このＡＩモデルでは、Ｂ´生成器の学習パラメータが最適化される。Ｂ´生成器は、元画像Ａあるいは偽画像Ａ´から偽画像Ｂ´を生成する。また、この学習モデルでは、Ａ´生成器の学習パラメータが最適化される。Ａ´生成器は、学習画像Ｂあるいは偽画像Ｂ´から偽画像Ａ´を生成する。学習画像Ｂには、図６に示した学習画像Ｂ１，Ｂ２，Ｂ３が用いられる。 The learning parameter is, for example, the learning rate used when training the neural network that forms the AI model. In the machine learning according to the first embodiment, for example, the learning parameters of an AI model using CycleGAN are optimized. The AI model using CycleGAN includes, for example, a B' generator, a false B evaluator, a B-B' similarity evaluator, an A' generator, a false A evaluator, and an A-A' similarity evaluator. The AI model using CycleGAN uses the original image A, the fake image A' of the original image A, the learning image B, and the fake image B' of the learning image B. In this AI model, the learning parameters of the B' generator are optimized. The B' generator generates a fake image B' from the original image A or the fake image A'. The learning parameters of the A' generator are also optimized in this learning model. The A' generator generates a fake image A' from the learning image B or the fake image B'. The learning images B1, B2, and B3 shown in FIG. 6 are used as the learning image B.

ＡＩ処理部１３は、元画像Ａから偽画像Ｂ´を生成する（Ｓ５２）。つまり、ＡＩ処理部１３は、ＡＩモデルのＢ´生成器（偽画像生成器）に元画像Ａを入力して偽画像Ｂ´を生成する。そして、ＡＩ処理部１３は、偽画像Ｂ´の生成精度を評価する（Ｓ５３）。この評価の結果に基づいて、Ｂ´生成器の精度指標となる生成精度指標ＫＢ１が更新される。ＡＩ処理部１３は、偽Ｂ評価器（偽画像判別器）により、Ｂ´生成器で生成した偽画像Ｂ´の真偽を評価する（Ｓ５４）。つまり、偽Ｂ評価器が、Ｂ´生成器で生成された偽画像Ｂ´の真偽を判定する。この判定の結果、偽Ｂ評価器の精度指標となる判別精度指標ＫＢ２が更新される。 The AI processing unit 13 generates a fake image B' from the original image A (S52). That is, the AI processing unit 13 inputs the original image A to the B' generator (fake image generator) of the AI model to generate the fake image B'. The AI processing unit 13 then evaluates the generation accuracy of the fake image B' (S53). Based on the result of this evaluation, the generation accuracy index KB1, which serves as the accuracy index of the B' generator, is updated. The AI processing unit 13 uses the false B evaluator (fake image discriminator) to evaluate the authenticity of the fake image B' generated by the B' generator (S54). That is, the false B evaluator determines whether the fake image B' generated by the B' generator is real or fake. As a result of this determination, the discrimination accuracy index KB2, which serves as the accuracy index of the false B evaluator, is updated.

ＡＩ処理部１３は、偽画像Ｂ´から偽画像Ａ´を生成する（Ｓ５５）。つまり、ＡＩ処理部１３は、ＡＩモデルのＡ´生成器に偽画像Ｂ´を入力して偽画像Ａ´を生成する。ＡＩ処理部１３は、生成した偽画像Ａ´の類似度を評価する（Ｓ５６）。つまり、Ａ−Ａ´類似度評価器は、偽画像Ａ´と元画像Ａの類似度を計算する。類似度の計算結果、元画像Ａと再構築された偽画像Ａ´の再構築精度指標ＫＡ３が更新される。 The AI processing unit 13 generates a fake image A' from the fake image B' (S55). That is, the AI processing unit 13 inputs the fake image B' to the A' generator of the AI model to generate the fake image A'. The AI processing unit 13 evaluates the similarity of the generated fake image A' (S56). That is, the A-A' similarity evaluator calculates the similarity between the fake image A' and the original image A. Based on the calculated similarity, the reconstruction accuracy index KA3 between the original image A and the reconstructed fake image A' is updated.

また、ＡＩ処理部１３は、学習画像Ｂから偽画像Ａ´を生成する（Ｓ５７）。つまり、ＡＩ処理部１３は、Ａ´生成器（偽画像生成器）に学習画像Ｂを入力して偽画像Ａ´を生成する。そして、ＡＩ処理部１３は、偽画像Ａ´の生成精度を評価する（Ｓ５８）。この評価の結果に基づいて、Ａ´生成器の精度指標となる生成精度指標ＫＡ１が更新される。ＡＩ処理部１３は、偽Ａ評価器（偽画像判別器）によりＡ´生成器で生成した偽画像Ａ´の真偽を評価する（Ｓ５９）。つまり、偽Ａ評価器は、Ａ´生成器で生成された偽画像Ａ´の真偽を判定する。この判定の結果、偽Ａ評価器の精度指標となる判別精度指標ＫＡ２が更新される。 The AI processing unit 13 also generates a fake image A' from the learning image B (S57). That is, the AI processing unit 13 inputs the learning image B to the A' generator (fake image generator) to generate the fake image A'. The AI processing unit 13 then evaluates the generation accuracy of the fake image A' (S58). Based on the result of this evaluation, the generation accuracy index KA1, which serves as the accuracy index of the A' generator, is updated. The AI processing unit 13 uses the false A evaluator (fake image discriminator) to evaluate the authenticity of the fake image A' generated by the A' generator (S59). That is, the false A evaluator determines whether the fake image A' generated by the A' generator is real or fake. As a result of this determination, the discrimination accuracy index KA2, which serves as the accuracy index of the false A evaluator, is updated.

ＡＩ処理部１３は、偽画像Ａ´から偽画像Ｂ´を生成する（Ｓ６０）。つまり、ＡＩ処理部１３は、Ｂ´生成器に偽画像Ａ´を入力して偽画像Ｂ´を生成する。ＡＩ処理部１３は、生成した偽画像Ｂ´の類似度を評価する（Ｓ６１）。つまり、Ｂ−Ｂ´類似度評価器は、偽画像Ｂ´と学習画像Ｂの類似度を計算する。類似度の計算結果、学習画像Ｂと再構築された偽画像Ｂ´の再構築精度指標ＫＢ３が更新される。 The AI processing unit 13 generates a fake image B' from the fake image A' (S60). That is, the AI processing unit 13 inputs the fake image A' to the B' generator to generate the fake image B'. The AI processing unit 13 evaluates the similarity of the generated fake image B' (S61). That is, the B-B' similarity evaluator calculates the similarity between the fake image B' and the learning image B. Based on the calculated similarity, the reconstruction accuracy index KB3 between the learning image B and the reconstructed fake image B' is updated.
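
The two round trips described above (A to B' to A' in S52, S55, and S56, and B to A' to B' in S57, S60, and S61) can be sketched in Python as follows. This is only an illustration of the data flow: the generators are passed in as plain callables (hypothetical placeholders for the B' generator and A' generator), and the similarity is measured as a mean absolute difference, which is the L1 cycle-consistency measure commonly used with CycleGAN. The description itself does not state which similarity measure the A-A' and B-B' similarity evaluators use, so that choice is an assumption.

```python
# Sketch of the two reconstruction cycles. Names and the L1 measure are assumptions.

def l1_distance(img_x, img_y):
    """Mean absolute difference over all channels; 0 means identical images."""
    total = 0
    count = 0
    for row_x, row_y in zip(img_x, img_y):
        for pixel_x, pixel_y in zip(row_x, row_y):
            for cx, cy in zip(pixel_x, pixel_y):
                total += abs(cx - cy)
                count += 1
    return total / count


def cycle_errors(a, b, gen_b, gen_a):
    """Return (A vs. reconstructed A', B vs. reconstructed B') errors.

    gen_b: callable standing in for the B' generator (A-like image -> B-like image)
    gen_a: callable standing in for the A' generator (B-like image -> A-like image)
    """
    fake_b = gen_b(a)        # S52: A -> B'
    rec_a = gen_a(fake_b)    # S55: B' -> A'
    fake_a = gen_a(b)        # S57: B -> A'
    rec_b = gen_b(fake_a)    # S60: A' -> B'
    return l1_distance(a, rec_a), l1_distance(b, rec_b)   # S56, S61
```

A smaller error means that the reconstructed image is closer to its source; how such values are mapped onto the reconstruction accuracy indices KA3 and KB3 is not specified here.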

ＡＩ処理部１３は、上述した生成精度指標ＫＡ１、判別精度指標ＫＡ２、再構築精度指標ＫＡ３、生成精度指標ＫＢ１、判別精度指標ＫＢ２、及び再構築精度指標ＫＢ３を基に、ＡＩモデルの学習パラメータ（例えば、Ｂ´生成器の学習パラメータとＡ´生成器の学習パラメータ）を更新する（Ｓ６２）。 The AI processing unit 13 updates the learning parameters of the AI model (for example, the learning parameters of the B' generator and the learning parameters of the A' generator) based on the above-described generation accuracy index KA1, discrimination accuracy index KA2, reconstruction accuracy index KA3, generation accuracy index KB1, discrimination accuracy index KB2, and reconstruction accuracy index KB3 (S62).

ＡＩ処理部１３は、全ての元画像Ａと学習画像Ｂ（例えば、学習画像Ｂ１，Ｂ２，Ｂ３）を用いて、上記ステップＳ５２〜Ｓ６２の学習処理を行ったか否かを判別する（Ｓ６３）。つまり、ＡＩ処理部１３は、全ての元画像Ａと学習画像Ｂのデータが学習済となったか否かを判別する。なお、図６に示した元画像Ａと学習画像Ｂ（Ｂ１，Ｂ２，Ｂ３）は、一例であり、多くの元画像Ａと学習画像Ｂを用いることが学習精度の向上のためには望ましい。 The AI processing unit 13 determines whether or not the learning processing in steps S52 to S62 has been performed using all the original images A and the learning images B (for example, the learning images B1, B2, and B3) (S63). That is, the AI processing unit 13 determines whether or not the data of all the original images A and the learning images B has been learned. Note that the original image A and the learning images B (B1, B2, B3) shown in FIG. 6 are merely examples, and it is desirable to use many original images A and the learning images B in order to improve learning accuracy.

学習済でないデータがある場合（Ｓ６３、ＮＯ）、ＡＩ処理部１３は、次のデータを取得する（Ｓ６４）。ＡＩ処理部１３の処理はステップＳ５２に戻り、同様の処理（つまり、ステップＳ５２，Ｓ５３，Ｓ５４，Ｓ５５，Ｓ５６，Ｓ５７，Ｓ５８，Ｓ５９，Ｓ６０、Ｓ６１，Ｓ６２，Ｓ６３，Ｓ６４の一連の処理）を繰り返す。 When there is data that has not yet been learned (S63, NO), the AI processing unit 13 acquires the next data (S64). The processing of the AI processing unit 13 then returns to step S52 and repeats the same processing (that is, the series of steps S52, S53, S54, S55, S56, S57, S58, S59, S60, S61, S62, S63, and S64).

一方、全てのデータが学習済となった場合（Ｓ６３、ＹＥＳ）、ＡＩ処理部１３は、学習済みモデル（つまり、学習済みのＣｙｃｌｅＧＡＮを用いたＡＩモデル）を生成し、生成した学習済みモデルのデータをストレージ１７に保存する（Ｓ６５）。この後、ＡＩ処理部１３は、図７に示す学習処理を終了する。 On the other hand, when all the data has been learned (S63, YES), the AI processing unit 13 generates a learned model (that is, an AI model using the trained CycleGAN) and stores the data of the generated learned model in the storage 17 (S65). Thereafter, the AI processing unit 13 ends the learning process illustrated in FIG. 7.

図８は、ＡＩサーバ１０の反射箇所の検出の動作手順の一例を説明するフローチャートである。図８に示す処理は、例えばＡＩサーバ１０のＡＩ処理部１３により実行される。ＡＩサーバ１０のＡＩ処理部１３は、スマートフォン３０により撮像された撮像画像を検出対象画像として取得し、メモリ１３ｚに記憶する（Ｓ７１）。ＡＩ処理部１３は、ストレージ１７に保存された学習済みモデルデータを読み出し、ＡＩネットワークとしてメモリ１３ｚに展開して取り込む（Ｓ７２）。 FIG. 8 is a flowchart illustrating an example of an operation procedure of detecting a reflection point of the AI server 10. The process illustrated in FIG. 8 is executed by, for example, the AI processing unit 13 of the AI server 10. The AI processing unit 13 of the AI server 10 acquires a captured image captured by the smartphone 30 as a detection target image and stores it in the memory 13z (S71). The AI processing unit 13 reads out the learned model data stored in the storage 17, expands it into the memory 13z as an AI network, and takes it in (S72).

ＡＩ処理部１３は、学習済みモデルの一部であるＢ´生成器に対し、検出対象画像（撮像画像）を入力し、反射領域が可視化された画像を出力する（Ｓ７３）。反射領域が可視化された画像は、例えばＡＩ処理部１３における学習済みモデル（ＡＩモデル）を用いた処理実行時に反射領域が赤く描画され、その他の領域がグレーで描画された画像である。 The AI processing unit 13 inputs a detection target image (captured image) to the B ′ generator, which is a part of the learned model, and outputs an image in which the reflection area is visualized (S73). The image in which the reflection area is visualized is an image in which the reflection area is drawn in red and the other areas are drawn in gray, for example, when the AI processing unit 13 executes a process using the learned model (AI model).

ＡＩ処理部１３は、画像の色成分の強度比を基に、非反射領域か反射領域かを判断し、反射領域情報を取得する（Ｓ７４）。非反射領域の画像は、後述するように、文字認識処理及び翻訳処理のそれぞれの対象とされる。反射領域の画像は、文字認識処理及び翻訳処理の対象外とされる。この後、ＡＩ処理部１３は、図８に示すＡＩ反射検出処理を終了する。 The AI processing unit 13 determines, based on the intensity ratio of the color components of the image, whether each area is a non-reflection area or a reflection area, and acquires reflection area information (S74). As described later, the image of the non-reflection area is subjected to the character recognition processing and the translation processing. The image of the reflection area is excluded from the character recognition processing and the translation processing. Thereafter, the AI processing unit 13 ends the AI reflection detection processing shown in FIG. 8.
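
Step S74 can be pictured with a small Python sketch. Because the visualized image draws the reflection area in red and everything else in gray, a pixel can be treated as a reflection pixel when its R value clearly dominates its G and B values. The function name `reflection_mask`, the boolean-mask output format, and the threshold `ratio=2.0` are assumptions made for illustration; the description only says that the decision is based on the intensity ratio of the color components.

```python
# Sketch of S74: classify pixels by the ratio of R to the other components.
# The threshold and the boolean-mask output format are assumptions.

def reflection_mask(visualized, ratio=2.0):
    """Return a 2D list of booleans: True where the pixel looks 'red' (reflection)."""
    mask = []
    for row in visualized:
        mask_row = []
        for r, g, b in row:
            # max(..., 1) avoids treating pure black (0, 0, 0) as red
            mask_row.append(r > ratio * max(g, b, 1))
        mask.append(mask_row)
    return mask
```

Gray pixels have R roughly equal to G and B, so they fall below the ratio and are classified as non-reflection pixels.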

（スマートフォンの翻訳動作）
図９は、スマートフォン３０の翻訳動作手順の一例を説明するフローチャートである。図９に示す処理は、例えばスマートフォン３０のプロセッサ３１により主に実行される。スマートフォン３０のプロセッサ３１は、ユーザの操作を受け付けると、文字認識・翻訳アプリを起動する（Ｓ８１）。ユーザが広告等の被写体に対し、シャッタ操作（つまり、撮像開始操作）を行うと、撮像部３２は、被写体を撮像する。プロセッサ３１は、撮像部３２で撮像された撮像画像ＧＺ１（図１０参照）を取得し、メモリ３６に記憶する（Ｓ８２）。通信部３５は、メモリ３６に記憶された撮像画像ＧＺ１を、ネットワーク７０を介して、ＡＩサーバ１０に送信する（Ｓ８３）。 (Translation operation of smartphone)
FIG. 9 is a flowchart illustrating an example of a translation operation procedure of smartphone 30. The process illustrated in FIG. 9 is mainly executed by, for example, the processor 31 of the smartphone 30. The processor 31 of the smartphone 30 activates the character recognition / translation application upon receiving a user operation (S81). When the user performs a shutter operation (that is, an imaging start operation) on a subject such as an advertisement, the imaging unit 32 captures an image of the subject. The processor 31 acquires the captured image GZ1 (see FIG. 10) captured by the imaging unit 32, and stores it in the memory 36 (S82). The communication unit 35 transmits the captured image GZ1 stored in the memory 36 to the AI server 10 via the network 70 (S83).

図１０は、撮像画像ＧＺ１が表示されたスマートフォン３０の撮影画面ＧＭ１の一例を示す図である。撮像画像ＧＺ１内には、例えば２箇所に照明光による反射領域ｇ１が現れたとする。また、撮影画面ＧＭ１には、撮像画像ＧＺ１に矩形窓ｗｋ１が重畳して表示される。撮影画面ＧＭ１には、矩形窓ｗｋ１に隠れて表示されないが、撮像画像ＧＺ１には、コーヒー、紅茶の文字情報が含まれる（図１２Ｂ参照）。また、撮影画面ＧＭ１には、カメラのシャッタボタン（つまり、撮像開始ボタン）を示すシャッタアイコンｓｔが表示される。 FIG. 10 is a diagram illustrating an example of the shooting screen GM1 of the smartphone 30 on which the captured image GZ1 is displayed. It is assumed that reflection areas g1 caused by illumination light appear at, for example, two places in the captured image GZ1. On the shooting screen GM1, a rectangular window wk1 is displayed superimposed on the captured image GZ1. Although they are hidden behind the rectangular window wk1 and not displayed on the shooting screen GM1, the captured image GZ1 also includes the character information for coffee and black tea (see FIG. 12B). In addition, a shutter icon st representing the shutter button of the camera (that is, an imaging start button) is displayed on the shooting screen GM1.

ＡＩサーバ１０の通信部１８は、スマートフォン３０から撮像画像を受信する。ＡＩ処理部１３は、受信した撮像画像に対し、図８に示したＡＩ反射検出処理を行って反射領域情報を取得する。通信部１８は、ＡＩ処理部１３で得られた反射領域情報をスマートフォン３０に送信する。 The communication unit 18 of the AI server 10 receives a captured image from the smartphone 30. The AI processing unit 13 performs the AI reflection detection process illustrated in FIG. 8 on the received captured image to acquire reflection area information. The communication unit 18 transmits the reflection area information obtained by the AI processing unit 13 to the smartphone 30.

スマートフォン３０の通信部３５は、ネットワーク７０を介して、ＡＩサーバ１０から反射領域情報を受信する（Ｓ８４）。プロセッサ３１は、受信された反射領域情報を基に、メモリ３６に記憶された撮像画像に対し、特定の色（例えば赤色）で表された反射位置ｍｃを重畳させ、反射位置ｍｃが重畳した重畳画像ＧＺ２を生成し、表示部３３に表示する（Ｓ８５）。 The communication unit 35 of the smartphone 30 receives the reflection area information from the AI server 10 via the network 70 (S84). The processor 31 superimposes the reflection position mc represented by a specific color (for example, red) on the captured image stored in the memory 36 based on the received reflection region information, and superimposes the reflection position mc. The image GZ2 is generated and displayed on the display unit 33 (S85).
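
Step S85 can be illustrated with a short Python sketch that paints the reflection positions onto the captured image. The format of the reflection area information is not specified in the description; here it is assumed to be a boolean mask with the same size as the captured image, and the function name `overlay_reflection` is hypothetical.

```python
# Sketch of S85: superimpose reflection positions on the captured image.
# The boolean-mask format of the reflection area information is an assumption.

def overlay_reflection(captured, mask, color=(255, 0, 0)):
    """Return a new image in which masked pixels are replaced by `color`."""
    superimposed = []
    for row_pixels, row_mask in zip(captured, mask):
        superimposed.append(
            [color if flagged else pixel
             for pixel, flagged in zip(row_pixels, row_mask)]
        )
    return superimposed
```

The captured image itself is left unchanged, so the original pixels remain available in memory after the superimposed image is displayed.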

図１１は、重畳画像ＧＺ２が表示されたスマートフォン３０の確認画面ＧＭ２の一例を示す図である。プロセッサ３１は、反射位置が重畳した重畳画像ＧＺ２に対し、文字認識を行う（Ｓ８６）。プロセッサ３１は、文字認識処理の結果をメモリ３６に記憶する。認識された文字には、文字認識できたことを表すマーキングとして文字掛けｈｍが施される。文字掛けｈｍが施されると、表示部３３の画面に表示される文字の表示形態が変化する。例えば、文字の色が文字認識前の黒色から文字を囲むグレーに変化する。 FIG. 11 is a diagram illustrating an example of the confirmation screen GM2 of the smartphone 30 on which the superimposed image GZ2 is displayed. The processor 31 performs character recognition on the superimposed image GZ2 on which the reflection positions are superimposed (S86). The processor 31 stores the result of the character recognition process in the memory 36. Each recognized character is given a character overlay hm as a marking indicating that the character has been recognized. When the character overlay hm is applied, the display form of the character shown on the screen of the display unit 33 changes. For example, the color changes from the black shown before character recognition to a gray that surrounds the character.

また、プロセッサ３１は、確認画面ＧＭ２の下方に矩形窓ｗｋ２を表示し、矩形窓ｗｋ２に翻訳の有無を確認するメッセージを表示する。ここでは、タッチパネルＴＰの画面の下方に設定された表示領域には、「Ｔｒａｎｓｌａｔｅｔｈｅｄｉｓｐｌａｙ．ＩｓｉｔＯＫ？」のメッセージが表示される。また、タッチパネルＴＰの画面の下方には、入力部３４としてＹＥＳボタン３４ｚ及びＮＯボタン３４ｙが配置される。ユーザは、文字認識の結果、翻訳を行う場合、ＹＥＳボタン３４ｚを押下する。また、ユーザは、翻訳を行わない場合、ＮＯボタン３４ｙを押下する。 Further, the processor 31 displays a rectangular window wk2 below the confirmation screen GM2, and displays a message for confirming the presence or absence of translation in the rectangular window wk2. Here, a message “Translate the display. Is it OK?” Is displayed in the display area set below the screen of the touch panel TP. Further, below the screen of the touch panel TP, a YES button 34z and a NO button 34y are arranged as the input unit 34. When performing translation as a result of character recognition, the user presses the YES button 34z. In addition, when not performing the translation, the user presses the NO button 34y.

プロセッサ３１は、ユーザの操作を受け付け、翻訳を開始するか否かを判別する（Ｓ８７）。翻訳を開始する場合、通信部３５は、プロセッサの指示に従い、メモリ３６に文字認識の結果得られた文字情報を、ネットワーク７０に接続された翻訳サーバ５０に送信する。翻訳サーバ５０の通信部５４は、スマートフォン３０から送信された文字情報を、受信する。翻訳サーバ５０のプロセッサ５１は、ストレージ５３の辞書ＤＢ５３ｚを参照し、文字情報を予め指定された国の言語（例えば、外国人自身の母国語）で翻訳処理する。通信部５４は、翻訳処理の結果をスマートフォン３０に送信する。 The processor 31 accepts the user's operation and determines whether or not to start translation (S87). When translation is to be started, the communication unit 35, in accordance with an instruction from the processor 31, transmits the character information obtained by character recognition and stored in the memory 36 to the translation server 50 connected to the network 70. The communication unit 54 of the translation server 50 receives the character information transmitted from the smartphone 30. The processor 51 of the translation server 50 refers to the dictionary DB 53z in the storage 53 and translates the character information into the language of a country designated in advance (for example, the foreigner's own native language). The communication unit 54 transmits the result of the translation process to the smartphone 30.

スマートフォン３０の通信部３５は、翻訳サーバ５０から翻訳結果を受信する。プロセッサ３１は、翻訳結果を表示部３３の画面に表示する（Ｓ８８）。なお、ここでは、翻訳サーバが翻訳を行ったが、スマートフォン３０がインストール済みの翻訳アプリを起動し、自装置で翻訳を行ってもよい。 The communication unit 35 of the smartphone 30 receives a translation result from the translation server 50. The processor 31 displays the translation result on the screen of the display unit 33 (S88). Here, the translation server performed the translation, but the smartphone 30 may activate the installed translation application and perform the translation on its own device.

図１２Ａは、スマートフォン３０に表示された翻訳結果画面ＧＭ３の一例を示す図である。翻訳結果画面ＧＭ３の下方に配置された、矩形窓ｗｋ３で囲まれた領域には、翻訳結果が表示される。ここでは、文字情報である「カレー」、「烏龍茶」に対し、それぞれ翻訳結果である「Ｃｕｒｒｙ」、「Ｏｏｌｏｎｇ」が表示される。また、反射位置ｍｃが重畳され、文字認識されなかった「たこ焼き」、「焼きそば」の画像に対しては、翻訳が行われないので、何も標示されない。なお、ここでは、日本語から英語へと翻訳されたが、翻訳前の言語及び翻訳後の言語は、日本語、英語、中国語、ドイツ語、フランス語等、任意の組み合わせが可能である。翻訳アプリは、スマートフォン３０に設定された所有者の国籍を判別し、該当する国の言語で翻訳を行う。 FIG. 12A is a diagram illustrating an example of the translation result screen GM3 displayed on the smartphone 30. The translation result is displayed in the area surrounded by the rectangular window wk3, which is arranged in the lower part of the translation result screen GM3. Here, the translation results "Curry" and "Oolong" are displayed for the character information "カレー" (curry) and "烏龍茶" (oolong tea), respectively. For the images of "たこ焼き" (takoyaki) and "焼きそば" (yakisoba), on which the reflection position mc is superimposed and whose characters were not recognized, no translation is performed, so nothing is displayed. Although the translation here is from Japanese to English, the source and target languages can be any combination of Japanese, English, Chinese, German, French, and the like. The translation application determines the nationality of the owner set in the smartphone 30 and translates into the language of the corresponding country.
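
The effect described above, where text under a reflection position is left untranslated, can be mimicked with a Python sketch that drops any text box overlapping the reflection mask. This is an illustrative alternative and not the method described here: in the description, character recognition simply fails on characters hidden by the superimposed reflection position mc. The box format (text, x0, y0, x1, y1) with inclusive pixel coordinates, the boolean mask, and the function name are all hypothetical.

```python
# Illustrative filter: keep only text boxes that do not touch the reflection mask.
# Box format and mask format are assumptions, not part of the description.

def texts_outside_reflection(boxes, mask):
    """boxes: list of (text, x0, y0, x1, y1) with inclusive pixel coordinates."""
    kept = []
    for text, x0, y0, x1, y1 in boxes:
        overlaps = any(
            mask[y][x]
            for y in range(y0, y1 + 1)
            for x in range(x0, x1 + 1)
        )
        if not overlaps:
            kept.append(text)   # only these strings would be sent for translation
    return kept
```

With such a filter, only the strings outside the reflection area (for example "カレー" and "烏龍茶" in FIG. 12A) would be passed on to the translation step.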

The user can save the translation result by performing a predetermined operation on the touch panel TP. One example of the predetermined operation is a double tap on the region surrounded by the rectangular window wk3 displayed on the translation result screen GM3.

The processor 31 accepts a user operation and determines whether to save the translation result (S89). When the translation result is to be saved, the processor 31 stores it in the memory 36 (S90). The processor 31 then determines whether an application end operation has been performed (S91). If no application end operation has been performed, the process returns to step S82. If an application end operation has been performed, or if the translation result is not saved in step S89, the processor 31 ends this process.

(Other translation result screen)
FIG. 12B is a diagram illustrating an example of another translation result screen GM4 displayed on the smartphone 30. No rectangular window is displayed on the translation result screen GM4; instead, the character recognition result image GZ4 and the translation result image GZ5 are displayed for comparison. The character recognition result image GZ4 contains the recognized character information 「カレー」 (curry), 「烏龍茶」 (oolong tea), 「コーヒー」 (coffee), and 「紅茶」 (black tea). The translation result image GZ5 contains the translated character information "Curry", "Oolong", "Coffee", and "Black tea".

(Other screen display example on the smartphone)
As another usage example, a case where the user captures an image of a meal menu with the smartphone 30 is described. FIG. 13 is a diagram illustrating an example of a shooting screen GM6 of the smartphone 30 on which another captured image GZ6 is displayed. Like the shooting screen GM1 shown in FIG. 10, the shooting screen GM6 displays the captured image GZ6, the rectangular window wk6, and the shutter icon st. The captured image GZ6 contains character information such as the meal menu, chicken curry, pork curry, beef curry, and drink menu items. In the image near 「チキンカレー」 (chicken curry), a light reflection area g2 overlaps the 「レー」 portion of the word.

FIG. 14A is a diagram illustrating an example of a confirmation screen GM7 of the smartphone 30 on which a superimposed image GZ7 containing a partially recognizable range is displayed. As a result of character recognition on the captured image GZ6, the meal menu, pork curry, beef curry, and drink menu items are recognized on the confirmation screen GM7. The recognized characters are given a character overlay hm as a marking that indicates successful recognition. As described above, applying the character overlay hm changes the display form of the characters shown on the screen of the display unit 33.

In contrast, the reflection position mc is superimposed on the region containing 「チキンカレー」 (chicken curry); in this region the reflection position mc is displayed close to the text. A range in which part of the word, though not the whole of it, can be recognized (a partially recognizable range) is displayed so as to be identifiable by a marker mr. Here, the partially recognizable range is the 「チキンカ」 portion of 「チキンカレー」. This range is given, as the marker mr, orange shading for example (shown as hatching in the figure). Left and right cursors ks are displayed on the touch panel TP so as to bracket the 「チキンカ」 portion. When the user drags a cursor ks, for example with a finger, the partially recognizable range changes.

FIG. 14B is a diagram illustrating an example of the confirmation screen GM8 in which the partially recognizable range has been changed. Judging that translating 「チキンカ」 would produce a mistranslation, the user moves the cursor ks one character to the left in the figure with a finger. The partially recognizable range changes to the 「チキン」 (chicken) portion. As a result, when 「チキン」 is translated, the user can infer chicken curry from the result.
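
The cursor adjustment of FIGS. 14A and 14B can be illustrated with a minimal Python sketch. The `RecognizableRange` class, its field names, and the clamping rule are hypothetical and are not part of the disclosure; the sketch only assumes that the marker mr covers a contiguous span of characters whose right edge follows the right cursor ks.

```python
from dataclasses import dataclass


@dataclass
class RecognizableRange:
    """Hypothetical model of the marker mr: a [start, end) span over a word."""
    text: str
    start: int
    end: int

    def selected(self) -> str:
        """Return the characters currently covered by the marker."""
        return self.text[self.start:self.end]

    def move_right_cursor(self, delta: int) -> None:
        """Move the right cursor by delta characters.

        The position is clamped so that at least one character stays
        selected and the cursor never leaves the word.
        """
        self.end = max(self.start + 1, min(len(self.text), self.end + delta))


# FIG. 14A: the partially recognizable range covers the first four characters.
word = RecognizableRange(text="チキンカレー", start=0, end=4)
before = word.selected()

# FIG. 14B: the user drags the right cursor one character to the left.
word.move_right_cursor(-1)
after = word.selected()
```

In this sketch `before` holds 「チキンカ」 and `after` holds 「チキン」, which matches the change from confirmation screen GM7 to GM8.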

In FIGS. 14A and 14B, as in FIG. 11, rectangular windows wk7 and wk8 are displayed in the lower part of the confirmation screens GM7 and GM8, respectively, each showing a message asking whether to translate. When the user presses the YES button 34z displayed in the lower part of the touch panel TP, translation is performed on the confirmation screen GM8.

FIG. 15A is a diagram illustrating an example of the translation result screen GM9 displayed on the smartphone 30. The translation result is displayed in the region surrounded by the rectangular window wk9 in the lower part of the translation result screen GM9. Here, the translation results "food menu", "chicken", "pork curry", "beef curry", and "drink menu" are displayed for the character information meal menu, 「チキン」 (chicken), pork curry, beef curry, and drink menu, respectively.

(Other translation result screen)
FIG. 15B is a diagram illustrating an example of another translation result screen GM10 displayed on the smartphone 30. The region surrounded by the rectangular window wk10 displayed in the lower part of the translation result screen GM10 is blank. On the translation result screen GM10, the translation results "food menu", "chicken", "pork curry", "beef curry", and "drink menu" are displayed over the character information (meal menu, chicken curry, pork curry, beef curry, and drink menu), overwriting it. However, the region near the reflection position mc is not translated and is displayed as it is.

In this way, even when the image captured by the smartphone 30 contains a reflection position, the translation result is displayed in a form the user can read.

As described above, the learning processing method in the AI server 10 according to the first embodiment includes a step of generating a false image B' (an example of a first similar image) of an original image A (an example of a captured image to be subjected to learning processing) on the basis of the original image A, which includes a reflection area g1 (an example of a reflection image area) indicating a light reflection position (an example of a reflection point). The learning processing method also includes a step of evaluating the authenticity of the false image B' by comparing it with learning images B1, B2, and B3, which are generated so that the reflection area g1 in the original image A (an example of a captured image) is distinguishable from other image areas. The learning processing method also includes a step of generating a false image A' (an example of a second similar image) of the original image A on the basis of the false image B'. The learning processing method also includes a step of evaluating the authenticity of the false image A' by comparing the false image A' with the original image A. The learning processing method also includes a step of generating, on the basis of the authenticity evaluation results for the false image B' and the false image A', a learned model (an example of a reflection detection model) used for detecting the reflection area g1 in an arbitrary captured image.
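
The order of the four steps (A to B', evaluate B', B' to A', evaluate A') can be sketched as follows. This is only a toy illustration under stated assumptions: `generate` is a hypothetical per-pixel affine map standing in for a generator network, `authenticity` is a hypothetical mean-squared-error-based score standing in for a learned evaluator, and the placeholder `learning_b` is not one of the learning images B1 to B3 described in the disclosure.

```python
import numpy as np


def generate(image: np.ndarray, weight: float, bias: float) -> np.ndarray:
    """Toy generator (assumption): per-pixel affine map clipped to [0, 1]."""
    return np.clip(weight * image + bias, 0.0, 1.0)


def authenticity(candidate: np.ndarray, reference: np.ndarray) -> float:
    """Toy authenticity score (assumption) in (0, 1]; 1.0 means identical."""
    return float(np.exp(-np.mean((candidate - reference) ** 2)))


def evaluate_cycle(original_a, learning_b, params_ab, params_ba):
    """Run the four steps in the order given in the text."""
    false_b = generate(original_a, *params_ab)   # generate B' from A
    score_b = authenticity(false_b, learning_b)  # compare B' with a learning image
    false_a = generate(false_b, *params_ba)      # generate A' from B'
    score_a = authenticity(false_a, original_a)  # compare A' with A
    return false_b, false_a, score_b, score_a


rng = np.random.default_rng(0)
original_a = rng.random((8, 8))   # stand-in for an original image A
learning_b = 1.0 - original_a     # placeholder target, not an actual B1 to B3

false_b, false_a, score_b, score_a = evaluate_cycle(
    original_a, learning_b, params_ab=(-1.0, 1.0), params_ba=(-1.0, 1.0)
)
```

With the parameters chosen here both toy generators happen to invert the image, so both scores come out at or very near their maximum; with other parameters the scores drop, which is the signal a training procedure would use.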

As a result, even when an arbitrary captured image is input from the smartphone 30, the AI server 10 can generate a highly accurate reflection detection model capable of detecting the reflection image area that indicates a light reflection point in that captured image, and can properly ensure the reliability of the reflection image area detected in any captured image.

In the learning processing method, the step of generating the learned model includes a step of updating the learning parameters (an example of parameters) used by the learned model on the basis of the authenticity evaluation results for the false image B' and the false image A', and a step of generating the learned model using the updated learning parameters. As a result, the AI server 10 can generate a highly accurate learned model whose learning parameters have been updated on the basis of the authenticity evaluation results for the false images and the original image, and can improve the learning effect of the learned model.
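
A parameter update driven by the two evaluation results might look like the sketch below. Everything in it is an assumption made for illustration: the four scalar parameters, the squared-error loss, and the finite-difference gradient with step halving are stand-ins, and the disclosure does not state which optimizer or loss the AI server 10 uses.

```python
import numpy as np


def combined_loss(params, original_a, learning_b):
    """Toy loss (assumption): error of B' against the learning image plus
    error of A' against the original image A."""
    w_ab, b_ab, w_ba, b_ba = params
    false_b = w_ab * original_a + b_ab
    false_a = w_ba * false_b + b_ba
    return float(np.mean((false_b - learning_b) ** 2)
                 + np.mean((false_a - original_a) ** 2))


def update_parameters(params, original_a, learning_b, step=0.5, eps=1e-6):
    """One update: finite-difference gradient, with the step halved until
    the loss does not increase."""
    base = combined_loss(params, original_a, learning_b)
    grad = np.zeros_like(params)
    for i in range(len(params)):
        shifted = params.copy()
        shifted[i] += eps
        grad[i] = (combined_loss(shifted, original_a, learning_b) - base) / eps
    while step > 1e-12:
        candidate = params - step * grad
        if combined_loss(candidate, original_a, learning_b) <= base:
            return candidate
        step *= 0.5
    return params  # no acceptable step found; keep the current parameters


rng = np.random.default_rng(0)
original_a = rng.random((8, 8))   # stand-in for an original image A
learning_b = 1.0 - original_a     # placeholder target, not an actual B1 to B3

params = np.array([0.5, 0.0, 0.5, 0.0])
initial_loss = combined_loss(params, original_a, learning_b)
for _ in range(50):
    params = update_parameters(params, original_a, learning_b)
final_loss = combined_loss(params, original_a, learning_b)
```

Because a step is accepted only when the loss does not increase, `final_loss` can never exceed `initial_loss` in this sketch. When several original images A are available, the same update could be repeated once per image.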

In the learning processing method, the step of generating the false image B' includes, when there are a plurality of original images A (an example of captured images to be subjected to learning processing), a step of generating a corresponding false image B' for each original image A. As a result, the AI server 10 can generate a plurality of false images corresponding to a plurality of different original images A and can update the learning parameters for each original image A, which further improves the reliability of the learned model.

The learning processing method further includes a step of generating the learning image B1 by assigning red (an example of a first color) to the reflection area g1 in the original image A (an example of a captured image) and white (an example of a second color) to the image areas other than the reflection area g1 in the original image A. As a result, the AI server 10 can easily generate a learning image in which the light reflection area and the other areas in the captured image input from the smartphone 30 are clearly distinguished.
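
The construction of the learning image B1 can be sketched with NumPy as follows. The function name `make_learning_image_b1`, the boolean mask representation of the reflection area g1, and the 8-bit RGB encoding are assumptions for illustration; the disclosure specifies only the two colors.

```python
import numpy as np

RED = np.array([255, 0, 0], dtype=np.uint8)        # first color
WHITE = np.array([255, 255, 255], dtype=np.uint8)  # second color


def make_learning_image_b1(reflection_mask: np.ndarray) -> np.ndarray:
    """Return an H x W x 3 RGB image that is red inside the reflection
    mask and white everywhere else."""
    mask = reflection_mask.astype(bool)
    return np.where(mask[..., None], RED, WHITE).astype(np.uint8)


# Hypothetical 4 x 6 image whose reflection area g1 is a 2 x 3 block.
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 2:5] = True
b1 = make_learning_image_b1(mask)
```

The mask itself would have to come from annotation of the original image A; how that annotation is obtained is outside this sketch.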

The learning processing method further includes a step of generating the learning image B2 by setting the pixel value of each R pixel (an example of any one color of a pixel) constituting the reflection area g1 in the original image A (an example of a captured image) to the luminance value of the corresponding pixel in the original image A, and setting the pixel value of each pixel constituting the image areas other than the reflection area g1 in the original image A to the pixel value of the corresponding pixel in the original image A. As a result, the AI server 10 can easily generate a learning image in which the light reflection area and the other areas in the captured image input from the smartphone 30 are clearly distinguished.
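
One possible reading of the learning image B2 is sketched below. Two points are assumptions that the text does not settle: the luminance is computed with the ITU-R BT.601 weights (0.299, 0.587, 0.114), and the G and B values inside the reflection area g1 are set to zero so that the area appears in shades of red. The function name and the mask representation are likewise hypothetical.

```python
import numpy as np


def make_learning_image_b2(original_a: np.ndarray,
                           reflection_mask: np.ndarray) -> np.ndarray:
    """Return a copy of the H x W x 3 uint8 image in which the R channel
    inside the mask carries the pixel luminance; pixels outside the mask
    keep their original values."""
    mask = reflection_mask.astype(bool)
    rgb = original_a.astype(np.float64)
    # Assumption: BT.601 luma weights; the source does not name a formula.
    luminance = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    b2 = original_a.copy()
    b2[mask, 0] = np.round(luminance[mask]).astype(np.uint8)
    # Assumption: G and B are zeroed inside the reflection area.
    b2[mask, 1:] = 0
    return b2


# Hypothetical uniform image with a 2 x 3 reflection area.
original = np.full((4, 6, 3), (200, 100, 50), dtype=np.uint8)
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 2:5] = True
b2 = make_learning_image_b2(original, mask)
```

Unlike B1, this variant keeps the brightness structure of the reflection, which may help a model distinguish strong reflections from weak ones.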

The learning processing method further includes a step of generating the learning image B3 by setting the pixel value of each R pixel (an example of any one color of a pixel) constituting the reflection area g1 in the original image A (an example of a captured image) to the luminance value of the corresponding pixel in the original image A, and setting the pixel value of each RGB pixel (an example of all colors) constituting the image areas other than the reflection area g1 in the original image A to the pixel value of the corresponding pixel in the original image A. As a result, the AI server 10 can easily generate a learning image in which the light reflection area and the other areas in the captured image input from the smartphone 30 are clearly distinguished.

In the reflection detection system 5 according to the first embodiment, the AI server 10 described above (an example of a server device) and the smartphone 30 (an example of a mobile terminal), which has the imaging unit 32 and the display unit 33, are communicably connected to each other. Upon acquiring an arbitrary captured image captured by the imaging unit 32, the AI server 10 uses the learned model (an example of a reflection detection model) to detect the light reflection area (an example of a reflection image area) in the captured image, generates an output image in which the light reflection area in the captured image is processed so as to be distinguishable from other image areas, and transmits the output image to the smartphone 30. Using the output image transmitted from the AI server 10, the smartphone 30 displays on the display unit 33 the result of character recognition of the image areas of the output image other than the light reflection area.
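
The restriction of character recognition to areas outside the reflection area can be sketched as a filter over text boxes. The `TextBox` type, the box coordinates, and the rule that any overlap with the reflection mask excludes a box are assumptions for illustration; the sketch does not model the partially recognizable range of FIG. 14A, and it does not call any real character recognition library.

```python
from typing import NamedTuple

import numpy as np


class TextBox(NamedTuple):
    """Hypothetical text region in image coordinates (x1 and y1 exclusive)."""
    text: str
    x0: int
    y0: int
    x1: int
    y1: int


def split_by_reflection(boxes, reflection_mask):
    """Split boxes into those clear of the reflection area and those that
    overlap it (assumption: any overlapping pixel blocks the whole box)."""
    readable, blocked = [], []
    for box in boxes:
        overlaps = bool(reflection_mask[box.y0:box.y1, box.x0:box.x1].any())
        (blocked if overlaps else readable).append(box)
    return readable, blocked


# Hypothetical 40 x 100 image with a reflection area on the right-hand side.
mask = np.zeros((40, 100), dtype=bool)
mask[10:20, 60:90] = True

boxes = [
    TextBox("カレー", 5, 0, 40, 8),
    TextBox("たこ焼き", 50, 10, 95, 18),
    TextBox("烏龍茶", 5, 25, 40, 33),
]
readable, blocked = split_by_reflection(boxes, mask)
```

Here `readable` contains the curry and oolong tea boxes and `blocked` contains the takoyaki box, mirroring the behavior described for the translation result screen GM3.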

This allows a user of the smartphone 30 (for example, a traveler such as a foreigner) to transmit to the AI server 10 a captured image of a subject, such as an advertisement, whose content the user wants to check, and to have the smartphone 30 perform character recognition and translation on the AI server 10's processing result for that captured image, thereby obtaining the translation of the character information recognized as text. In other words, the reflection detection system 5 can provide a user-friendly character recognition and translation application for travelers such as foreigners, and can properly improve user convenience.

Although various embodiments have been described above with reference to the accompanying drawings, the present disclosure is not limited to these examples. It is apparent that those skilled in the art can conceive of various changes, modifications, substitutions, additions, deletions, and equivalents within the scope of the claims, and it is understood that these also belong to the technical scope of the present disclosure. In addition, the components of the various embodiments described above may be combined in any way without departing from the spirit of the invention.

The present disclosure is useful because it makes it possible to generate a highly accurate reflection detection model capable of detecting a reflection image area that indicates a light reflection point in a captured image.

5 Reflection detection system
10 AI server
13 AI processing unit
30 Smartphone
32 Imaging unit
33 Display unit

Claims

A reflection detection system in which a server device, which holds a captured image to be subjected to learning processing and including a reflection image region indicating a light reflection point, and a mobile terminal having an imaging unit and a display unit are communicably connected to each other, wherein
the server device comprises a processor and a memory,
the processor, in cooperation with the memory,
generates a first similar image of the captured image on the basis of the captured image,
evaluates the authenticity of the first similar image by comparing the first similar image with a learning image generated so that the reflection image region in the captured image is distinguishable from other image regions,
generates a second similar image of the captured image on the basis of the first similar image,
evaluates the authenticity of the second similar image by comparing the second similar image with the captured image,
generates, on the basis of the authenticity evaluation results for the first similar image and the second similar image, a reflection detection model used for detecting the reflection image region in an arbitrary captured image, and,
upon acquiring an arbitrary captured image captured by the imaging unit, detects the reflection image region in that captured image using the reflection detection model, generates an output image in which the reflection image region in the captured image is processed so as to be distinguishable from other image regions, and transmits the output image to the mobile terminal, and
the mobile terminal displays on the display unit, using the output image transmitted from the server device, a result of character recognition of the image regions other than the reflection image region in the output image.