JP6985977B2 - Output device, output method, output program and output system

Info

Publication number: JP6985977B2
Application number: JP2018093284A
Authority: JP
Inventors: 基大町; 俊宏熊谷; 雄太郎上岡; 彩花平野; 宏司町田; 直晃山下
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2018-05-14
Filing date: 2018-05-14
Publication date: 2021-12-22
Anticipated expiration: 2038-05-14
Also published as: JP2019200485A

Description

The present invention relates to an output device, an output method, an output program, and an output system.

Information processing that makes use of models trained with neural networks is now widespread. For example, techniques are known that use a DNN (Deep Neural Network) having neurons connected in multiple stages to perform various classification tasks such as language recognition and image recognition.

A technique is also known in which an image to be processed is input to a neural network and an intermediate image is extracted from an intermediate layer of the network, thereby generating a composite image that the neural network uses to recognize a predetermined target in the image.

Japanese Patent No. 6214073

However, with the above conventional technique, it is difficult to dynamically modify the result output from a model. Specifically, the conventional technique generates information used for recognizing a predetermined target in an image, and it is difficult to apply it to processing that modifies the output result of the neural network itself.

The present application has been made in view of the above, and its object is to provide an output device, an output method, an output program, and an output system capable of dynamically modifying the result output from a model.

The output device according to the present application includes: an input unit that inputs a first image to be processed into a model that is a neural network outputting an image; an intermediate output unit that outputs an intermediate image, which is an image in an intermediate layer of the model to which the first image has been input; a reception unit that receives intervention information, which is information reflecting an intervention process performed on the intermediate image; and a result output unit that outputs a second image from the output layer of the model based on the intervention information received by the reception unit.

According to one aspect of the embodiment, the result output from the model can be dynamically modified.

FIG. 1 is a diagram showing an example of output processing according to the embodiment.
FIG. 2 is a diagram showing an execution example of the output processing according to the embodiment.
FIG. 3 is a diagram showing a configuration example of the output system according to the embodiment.
FIG. 4 is a diagram showing a configuration example of the output device according to the embodiment.
FIG. 5 is a diagram showing an example of the model storage unit according to the embodiment.
FIG. 6 is a diagram showing an example of the image storage unit according to the embodiment.
FIG. 7 is a diagram showing a configuration example of the display control device according to the embodiment.
FIG. 8 is a diagram showing an example of the intermediate image storage unit according to the embodiment.
FIG. 9 is a diagram showing an example of the intervention information storage unit according to the embodiment.
FIG. 10 is a diagram (1) showing an example of the intervention process according to the embodiment.
FIG. 11 is a diagram (1) showing the procedure of the intervention process according to the embodiment.
FIG. 12 is a diagram (2) showing an example of the intervention process according to the embodiment.
FIG. 13 is a diagram (2) showing the procedure of the intervention process according to the embodiment.
FIG. 14 is a schematic diagram showing the procedure of the output processing according to the embodiment.
FIG. 15 is a flowchart showing the processing procedure according to the embodiment.
FIG. 16 is a diagram showing a configuration example of an output device according to a modified example.
FIG. 17 is a hardware configuration diagram showing an example of a computer that realizes the functions of the output device.

Hereinafter, modes for implementing the output device, output method, output program, and output system according to the present application (hereinafter referred to as "embodiments") will be described in detail with reference to the drawings. These embodiments do not limit the output device, output method, output program, and output system according to the present application. In each of the following embodiments, the same parts are designated by the same reference numerals, and duplicate explanations are omitted.

[1. Example of output processing]
First, an example of output processing according to the embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of output processing according to the embodiment. FIG. 1 illustrates an example in which the output device 100 according to the embodiment inputs an image to be processed (hereinafter sometimes referred to as the "first image" for distinction) into a model trained as a neural network that outputs an image, and, through the output processing according to the embodiment, outputs an image (hereinafter sometimes referred to as the "second image" for distinction) from the output layer of the model.

The output device 100 shown in FIG. 1 is a server device that executes the output processing according to the embodiment. The output device 100 executes this output processing using a model M01 trained as a neural network that outputs an image.

The display control devices 30_1 and 30_2 shown in FIG. 1 are server devices that control the display of images output from the intermediate layers of the model M01 on output devices such as displays. When there is no need to distinguish between the display control devices 30_1 and 30_2, they are collectively referred to as the display control device 30.

In the embodiment, the output device 100 holds the model M01, which is trained as a neural network in which a plurality of nodes are connected in multiple stages. For example, the model M01 may be a DNN, an LSTM (Long Short-Term Memory), a convolutional neural network, a recurrent neural network, or the like. The model M01 may also combine the functions of convolutional and recurrent neural networks.

The model M01 is a model trained so that, when the first image is input to the input layer, the second image is output from the output layer by way of a plurality of intermediate layers. That is, the model M01 is trained so that, when certain input information is input, it outputs other output information obtained by transforming that input information. For example, the model M01 is a model that has learned the features of a plurality of human faces in advance. In this case, when the output device 100 inputs a face image of a given user into the model M01, it outputs, as the second image, an image in which the features of the plurality of human faces learned when the model M01 was generated are synthesized.

In general, in a neural network, an input signal is multiplied by the various connection coefficients set in each of a plurality of intermediate layers, and an output signal is calculated. In this case, the user can check the output result but cannot intervene in the arithmetic processing performed by the intermediate layers. After checking the output result, the user therefore has to input a different input signal (here, a face image of the user) or retrain the model. That is, it has conventionally been difficult for a user to make fine corrections to the output result of a neural network or to intervene in its calculation process.

Therefore, the output device 100 according to the embodiment reflects an intervention process by the user in the output result through the output processing described below. Specifically, the output device 100 outputs an intermediate image, which is an image in an intermediate layer of the model M01 to which the first image has been input. The output device 100 then receives intervention information, which is information reflecting an intervention process performed on the intermediate image. Further, the output device 100 outputs the second image from the output layer of the model M01 based on the received intervention information. In this way, the output device 100 gives the user a means of intervening in the processing of the intermediate layers and adds the intervention information resulting from that intervention to the arithmetic processing, which allows the user to fine-tune the computation performed by the trained model. The output processing according to the embodiment is described below, following the flow shown in FIG. 1.

First, the user photographs his or her own face using an input device such as a camera 50 and creates a face image. The user then transmits the captured face image to the output device 100 (step S11). For example, the user may transmit the face image via a camera 50 equipped with a network function for outputting images, or via a terminal device or the like connected to the camera 50. The camera 50 may create a face image that has undergone a predetermined trimming process. For example, based on existing face recognition technology, the camera 50 may perform a trimming process that leaves only the region showing the user's face and deletes information such as the background, and then transmit the resulting face image to the output device 100.

The output device 100 inputs the face image acquired from the user into the model M01 (step S12). Specifically, the output device 100 inputs the information of each pixel forming the face image into the input layer of the model M01 as an input signal. In this case, the input layer of the model M01 has a number of nodes corresponding to the number of pixels constituting the face image and to the color information of those pixels (for example, the three RGB channels).
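The mapping from pixels to input-layer nodes described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `image_to_input_vector`, the row-major ordering, and the scaling of channel values to the range 0 to 1 are hypothetical and are not specified in the source.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def image_to_input_vector(image: List[List[Pixel]]) -> List[float]:
    """Flatten an H x W RGB image into H * W * 3 input-node values.

    Scaling each channel to [0, 1] is an assumption made for this sketch.
    """
    vector: List[float] = []
    for row in image:
        for pixel in row:
            for channel in pixel:
                vector.append(channel / 255.0)
    return vector


# A hypothetical 2 x 2 stand-in for a face image.
tiny_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
input_nodes = image_to_input_vector(tiny_image)
```

With a 2 x 2 image and three color channels, the sketch yields 2 x 2 x 3 = 12 input-node values, one per pixel and channel.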

In the model M01, the nodes of the input layer are multiplied by predetermined connection coefficients (weight values) to calculate the nodes of the first intermediate layer. The structure of the neural network, such as the values of the connection coefficients by which each input-layer node is multiplied and which connection coefficient, corresponding to which edge, is applied to which node, is determined at the training stage. Because the structure of neural networks is existing technology, its description is omitted.
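As a rough illustration of the weighted computation described above, the following sketch calculates the node values of one layer from the previous layer. It assumes a fully connected layer and deliberately omits bias terms and activation functions, because the source mentions only multiplication by connection coefficients; the name `dense_forward` and the example weights are hypothetical.

```python
from typing import List


def dense_forward(inputs: List[float], weights: List[List[float]]) -> List[float]:
    """Compute next-layer node values as weighted sums of the inputs.

    weights[j][i] is the connection coefficient on the edge from
    input node i to next-layer node j.
    """
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


inputs = [1.0, 0.5, 0.0]
weights = [
    [0.2, 0.4, 0.6],   # coefficients feeding next-layer node 0
    [1.0, -1.0, 0.5],  # coefficients feeding next-layer node 1
]
first_intermediate = dense_forward(inputs, weights)
```

In a trained model the coefficient values would come from the training stage rather than being written by hand as they are here.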

Subsequently, the output device 100 outputs an intermediate image based on the information held by the first intermediate layer of the model M01 (step S13). Specifically, the output device 100 generates an intermediate image 61 based on the values held by the nodes of the first intermediate layer and outputs the generated intermediate image 61. The output device 100 transmits the output intermediate image 61 to the display control device 30_1. The display control device 30_1 performs control so that the intermediate image 61 is displayed on a display 60, which is an example of an output device (step S14).

In the example of FIG. 1, the intermediate image 61 is drawn so that the parts making up the user's face (eyes, nose, and so on) can be recognized. In this case, the display control device 30_1 is assumed to have generated the intermediate image 61 by reproducing the same pixel arrangement as the first image based on the nodes of the first intermediate layer. In practice, however, the intermediate image output from each intermediate layer may be an image in which the user cannot recognize the subject. For example, when the model M01 is a model trained as a convolutional neural network, the intermediate image reflects the information of the multiple filters used in the convolution to extract feature values. In this case, the intermediate image appears as a mosaic-like image in which the user's face is divided into tiles whose size is that of the image produced by each filter (for example, 16 pixels x 16 pixels).
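One simple way to turn the node values of an intermediate layer into a displayable image is sketched below. The min-max scaling to 8-bit grayscale and the row-major layout are assumptions chosen for illustration; the source does not state how node values are converted into pixels, and `nodes_to_image` is a hypothetical name.

```python
from typing import List


def nodes_to_image(values: List[float], width: int) -> List[List[int]]:
    """Arrange node values as rows of 8-bit grayscale pixels (0..255)."""
    if not values or width <= 0 or len(values) % width != 0:
        raise ValueError("values must be non-empty and divisible by width")
    lo, hi = min(values), max(values)
    span = hi - lo
    # A layer with identical values everywhere is rendered as mid-gray.
    gray = [128 if span == 0 else round(255 * (v - lo) / span) for v in values]
    return [gray[i:i + width] for i in range(0, len(gray), width)]


# Four hypothetical node values laid out as a 2 x 2 intermediate image.
intermediate_image = nodes_to_image([0.0, 0.2, 1.0, 0.4], width=2)
```

For a convolutional layer, the same conversion could be applied per filter output and the results tiled side by side, which would produce the mosaic-like appearance described above.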

The display 60 is installed at a position where the user can see it. That is, the user can view the intermediate image 61, which is the image obtained when the face image of the user is processed by the first intermediate layer.

In the example of FIG. 1, the display 60 is a touch-panel display. That is, the user can intervene by performing touch operations on the display 60. For example, the user intervenes by touching any part of the intermediate image 61 displayed on the display 60 with a finger 65 (step S15). When the user performs a touch operation, the display control device 30_1 controls the display so that the touched portion (the display 66 shown in FIG. 1) is filled in black.

Here, the display control device 30_1 acquires intervention information based on the user's touch operation (step S16). Intervention information is information reflecting an intervention process performed on an intermediate image. Specifically, the intervention information indicates the pixels of the intermediate image 61 that were touched by the user. That is, the pixels indicated by the intervention information are the pixels corresponding to the portion that the user filled in black on the display 60. More specifically, the intervention information is represented, for example, by coordinate information identifying the pixels touched by the user.
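The conversion from touch operations to coordinate-based intervention information might look like the sketch below. It assumes that the intermediate image is stretched to fill the whole display and that touches arrive as (x, y) positions in display pixels; these assumptions, and the name `touches_to_intervention`, are illustrative and not taken from the source.

```python
from typing import Iterable, List, Set, Tuple

Coord = Tuple[int, int]  # (row, col) in the intermediate image


def touches_to_intervention(
    touches: Iterable[Tuple[float, float]],
    display_size: Tuple[int, int],  # (width, height) in display pixels
    image_size: Tuple[int, int],    # (width, height) of the intermediate image
) -> List[Coord]:
    """Map display touch positions to the image pixels they cover."""
    disp_w, disp_h = display_size
    img_w, img_h = image_size
    touched: Set[Coord] = set()
    for x, y in touches:
        if not (0 <= x < disp_w and 0 <= y < disp_h):
            continue  # ignore touches that fall outside the display area
        col = int(x * img_w / disp_w)
        row = int(y * img_h / disp_h)
        touched.add((row, col))
    return sorted(touched)


intervention = touches_to_intervention(
    [(10.0, 10.0), (12.0, 11.0), (630.0, 470.0), (700.0, 10.0)],
    display_size=(640, 480),
    image_size=(16, 16),
)
```

Nearby touches that land on the same image pixel collapse into a single coordinate, so the intervention information stays compact even during a continuous drag.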

The display control device 30_1 transmits the acquired intervention information to the output device 100 (step S17). Based on the received intervention information, the output device 100 intervenes in the computation of the model M01 (step S18). Specifically, the output device 100 identifies, among the nodes of the first intermediate layer, the nodes corresponding to the intervention information. That is, the output device 100 identifies the nodes corresponding to the pixels of the intermediate image 61 on which the user performed a touch operation. The output device 100 then masks the identified nodes. Specifically, the output device 100 performs the computation for the next intermediate layer (the second intermediate layer) using only the nodes other than the identified nodes.
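The masking step can be sketched as follows: nodes named in the intervention information are skipped when the next layer is computed. The sketch assumes a one-to-one, row-major correspondence between intermediate-image pixels and nodes and a fully connected next layer without bias or activation; `masked_forward` and the example values are hypothetical.

```python
from typing import List, Set, Tuple


def masked_forward(
    layer_values: List[float],
    width: int,
    masked_coords: Set[Tuple[int, int]],
    weights: List[List[float]],
) -> List[float]:
    """Compute the next layer using only the nodes that are not masked."""
    # Convert (row, col) pixel coordinates into flat node indices.
    masked_indices = {row * width + col for row, col in masked_coords}
    outputs: List[float] = []
    for row_weights in weights:
        total = 0.0
        for i, (w, x) in enumerate(zip(row_weights, layer_values)):
            if i in masked_indices:
                continue  # a masked node contributes nothing
            total += w * x
        outputs.append(total)
    return outputs


first_layer = [1.0, 2.0, 3.0, 4.0]  # a 2 x 2 intermediate layer
weights = [
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, -1.0, 2.0],
]
unmasked = masked_forward(first_layer, 2, set(), weights)
masked = masked_forward(first_layer, 2, {(0, 1), (1, 0)}, weights)
```

Skipping a node in the sum is equivalent to setting its value to zero before the multiplication, which is how such a mask would typically be applied in a vectorized implementation.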

Through step S18, the output device 100 determines the values of the nodes of the second intermediate layer of the model M01. The output device 100 then outputs an intermediate image 71 corresponding to the second intermediate layer (step S19) and transmits it to the display control device 30_2. The display control device 30_2 displays the intermediate image 71 transmitted from the output device 100 on a display 70 (step S20).

In the example of FIG. 1, a camera 75 is installed near the display 70. The camera 75 is installed so that it can capture, for example, the user operating the camera 50 or touching the display 60. The camera 75 continuously captures the user's situation in real time and transmits the captured images to the display control device 30_2. In other words, the user performs an intervention based on the images captured by the camera 75 (step S21).

The display control device 30_2 converts the image captured by the camera 75 into binarized information based on a predetermined threshold value. This binarized information has the same size (that is, the same number of pixels) as the intermediate image 71. The display control device 30_2 then acquires the resulting binarized information as intervention information (step S22). For example, the display control device 30_2 acquires, as the intervention information, the coordinate information of the pixels of the binarized information that fall on the black side.
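A minimal sketch of the binarization step follows. It assumes the camera frame has already been converted to grayscale and resized to the dimensions of the intermediate image, and it treats values below the threshold as the black side; the threshold value of 128 and the function names are hypothetical.

```python
from typing import List, Tuple


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Return 1 for black-side pixels (below threshold) and 0 otherwise."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def black_coords(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """List the (row, col) coordinates of the black-side pixels."""
    return [
        (r, c)
        for r, row in enumerate(binary)
        for c, bit in enumerate(row)
        if bit == 1
    ]


# A hypothetical 2 x 2 grayscale frame, already matched to the image size.
gray_frame = [
    [10, 200],
    [130, 90],
]
binary_frame = binarize(gray_frame)
intervention = black_coords(binary_frame)
```

The resulting coordinate list has the same form as the touch-based intervention information, so both kinds of intervention could be handled by the same masking routine.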

The display control device 30_2 transmits the acquired intervention information to the output device 100 (step S23). Based on the received intervention information, the output device 100 intervenes in the computation of the model M01 (step S24). Specifically, the output device 100 identifies, among the nodes of the second intermediate layer, the nodes corresponding to the intervention information. For example, the output device 100 superimposes the binarized information on the intermediate image 71 and identifies the nodes corresponding to the pixels of the binarized information that fall on the black side. The output device 100 then masks the identified nodes. Specifically, the output device 100 performs the computation for the next intermediate layer (the third intermediate layer) using only the nodes other than the identified nodes.

The output device 100 repeats the output of an intermediate image and the acquisition of intervention information, as described above, until the output layer of the model M01 is reached. When the intervention process for the last intermediate layer has been completed, the output device 100 performs the computation from the last intermediate layer to the output layer and causes the second image to be output from the output layer (step S25).
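The layer-by-layer repetition described above can be summarized in a short loop. This is a sketch under simplifying assumptions: every layer is fully connected without bias or activation, and the intervention for each intermediate layer is supplied by a callback that returns the flat indices of the nodes to mask. The names `run_with_interventions` and `get_intervention` are hypothetical.

```python
from typing import Callable, List, Set

Weights = List[List[float]]
# Called with (intermediate layer number, node values); returns indices to mask.
InterventionSource = Callable[[int, List[float]], Set[int]]


def forward(values: List[float], weights: Weights, masked: Set[int]) -> List[float]:
    """Weighted sums over the previous layer, skipping masked nodes."""
    return [
        sum((w * x for i, (w, x) in enumerate(zip(row, values)) if i not in masked), 0.0)
        for row in weights
    ]


def run_with_interventions(
    first_image: List[float],
    layers: List[Weights],
    get_intervention: InterventionSource,
) -> List[float]:
    values = list(first_image)
    masked: Set[int] = set()  # nothing is masked at the input layer
    for depth, weights in enumerate(layers):
        values = forward(values, weights, masked)
        if depth < len(layers) - 1:
            # Intermediate layer: expose its values and collect the mask
            # that applies to the computation of the following layer.
            masked = get_intervention(depth + 1, values)
    return values  # output-layer values, i.e. the second image


layers = [
    [[1.0, 0.0], [0.0, 1.0]],  # input -> first intermediate layer
    [[1.0, 1.0], [2.0, 1.0]],  # first -> second intermediate layer
    [[1.0, 1.0]],              # second intermediate layer -> output layer
]
baseline = run_with_interventions([3.0, 4.0], layers, lambda depth, values: set())
intervened = run_with_interventions(
    [3.0, 4.0], layers, lambda depth, values: {0} if depth == 1 else set()
)
```

In this toy example, masking node 0 of the first intermediate layer changes the final output from 17.0 to 8.0, which illustrates how an intervention at an early layer propagates to the second image.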

The output device 100 performs control so that a second image 81, which is the output result, is displayed on a display 80. The second image 81 is an image in which the features of the plurality of face images learned when the model M01 was trained are combined with the face image input by the user. The second image 81 also reflects the intervention process in each intermediate layer. By displaying the second image 81 on the display 80, the output device 100 presents the output result to the user (step S26).

The processes from step S12 to step S26 are executed continuously from the start of the input in step S11 until a predetermined time has elapsed. That is, the user can check how the second image 81 changes while touching the display 60 or changing the figure (silhouette) captured by the camera 75. This allows the user to intervene interactively in the computation of the model M01 while checking the result output from the model M01. In other words, the user can dynamically modify the output result of the model.

Next, a situation in which the output processing by the output device 100 shown in FIG. 1 is actually performed will be described with reference to FIG. 2. FIG. 2 is a diagram showing an execution example of the output processing according to the embodiment. As shown in FIG. 2, the user sits at a position where the camera 50 can capture his or her face image. The user then checks the second image 81 displayed on the display 80 while touching the intermediate image 61 displayed on the display 60 or changing the figure captured by the camera 75. As shown in FIG. 2, in the execution example of the output processing, other displays may additionally be installed to show intermediate images output from intermediate layers other than the first and second intermediate layers.

As described above, the output device 100 according to the embodiment inputs the first image to be processed into the model M01, which is a neural network that outputs an image. The output device 100 then outputs the intermediate image 61 and the intermediate image 71, which are images in the intermediate layers of the model M01 to which the first image has been input. Next, the output device 100 receives intervention information, which is information reflecting the intervention processes performed on the intermediate image 61 and the intermediate image 71. Further, the output device 100 outputs the second image 81 from the output layer of the model M01 based on the received intervention information.

In this way, the output device 100 according to the embodiment is configured to incorporate, into the computation of the model M01, the intervention information generated by intervention processes such as touch operations on the intermediate image 61 and the like, which allows the user to intervene dynamically in the processing of the neural network. As a result, the output device 100 can dynamically modify the result output from the model. In addition, by accepting intervention operations from the user while presenting the second image, which is an output result that changes in real time, the output device 100 can offer the user entertainment in the form of enjoying the changes in the second image. Furthermore, by masking the inputs at the locations where the intermediate image was touched and reflecting the result in the second image, the output device 100 lets the user experience internal processing of the model that normally cannot be perceived, such as how a change in a given part of an intermediate image affects the output result.

The examples of FIGS. 1 and 2 use touch operations and image information captured by a camera as the intervention processes performed by the user, but the intervention process is not limited to these. For example, the intervention process may be performed using a pointing device such as a mouse, or by voice input or the like.

The configuration and processing of the output device 100 that performs the output processing described above, and of the output system 1 that includes the output device 100, are described in more detail below.

[2. Output system configuration]
Next, the configuration of the output system 1 according to the embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram showing a configuration example of the output system 1 according to the embodiment. As shown in FIG. 3, the output system 1 includes an input device 10, an output device 20, a display control device 30, and the output device 100. The devices included in the output system 1 are connected so that they can communicate by wire or wirelessly via a network N (for example, the Internet), which is a communication network. The number of each device included in the output system 1 is not limited to that shown in FIG. 3. For example, the output system 1 may include a plurality of input devices 10.

The input device 10 is a device (information processing device) used for inputting various kinds of information. For example, the input device 10 is the camera 50, the camera 75, or the like shown in FIG. 1. The input device 10 may also be a desktop PC (Personal Computer) equipped with a camera function or a microphone function, a notebook PC, a mobile phone such as a smartphone, a tablet terminal, a PDA (Personal Digital Assistant), or the like. The input device 10 transmits the input information to the output device 100, the display control device 30, and so on.

The output device 20 is a device that outputs various kinds of information. For example, the output device 20 is the display 60, the display 70, the display 80, or the like shown in FIG. 1. The input device 10 and the output device 20 need not be separate devices and may be integrated. For example, when the output device 20 is a display equipped with a touch panel, that display serves as the output device 20 and also functions as the input device 10. The output device 20 outputs (displays) information such as images based on the information transmitted from the output device 100 and the display control device 30.

The display control device 30 is a server device that acquires the intermediate image that the output device 100 has output from an intermediate layer, and performs control so that the intermediate image is displayed on the output device 20. The display control device 30 also acquires intervention information for the intermediate image based on the information input to the input device 10. Specifically, the display control device 30 acquires the intervention information based on a touch operation by the user, image information obtained by capturing the user, and the like. The display control device 30 then transmits the acquired intervention information to the output device 100.

As described above, the output device 100 is a server device that executes output processing in which it outputs an intermediate image from an intermediate layer of the model, receives intervention information for the intermediate image, and outputs a second image from the output layer of the model based on the received intervention information.

[3. Output device configuration]
Next, the configuration of the output device 100 according to the embodiment will be described with reference to FIG. 4. FIG. 4 is a diagram showing a configuration example of the output device 100 according to the embodiment. As shown in FIG. 4, the output device 100 includes a communication unit 110, a storage unit 120, and a control unit 130. The output device 100 may also include an input unit (for example, a keyboard, a mouse, etc.) that receives various operations from an administrator or the like who uses the output device 100, and an output unit (for example, a liquid crystal display, etc.) for outputting various kinds of information.

(About communication unit 110)
The communication unit 110 is realized by, for example, a NIC (Network Interface Card) or the like. The communication unit 110 is connected to the network N by wire or wirelessly, and transmits and receives information to and from the input device 10, the output device 20, and the display control device 30 via the network N.

(About the storage unit 120)
The storage unit 120 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. The storage unit 120 according to the embodiment includes a model storage unit 121 and an image storage unit 122.

(About model storage unit 121)
The model storage unit 121 stores information about the model held by the output device 100. FIG. 5 is a diagram showing an example of the model storage unit 121 according to the embodiment. As shown in FIG. 5, the model storage unit 121 has items such as "model ID", "input data", "connection coefficient", and "output data".

The "model ID" indicates identification information that identifies the model. In the following description, identification information such as that shown in FIG. 5 may be used as a reference sign. For example, the model whose identification information is "M01" may be referred to as "model M01".

"Input data" indicates the format (mode) of the data input to the model. The "connection coefficient" indicates the connection coefficients (weight values) in the model. "Output data" indicates the format of the data output from the model. In the example shown in FIG. 5, "input data", "connection coefficient", and "output data" are represented conceptually by labels such as "A01", but in practice, specific information corresponding to each item is stored. For example, the "input data" item stores the specific format of data that can be input to the model (for example, the number of pixels of the input image, and the fact that color information is represented by three channels). The "connection coefficient" item stores the number of intermediate layers in the model, the number of edges connecting the nodes of each intermediate layer, information indicating which nodes are connected to which nodes by edges, the value of the connection coefficient of each edge, and the like. The "output data" item stores the specific format of the data output as the output result (for example, the number of pixels of the output image, and the fact that color information is represented by three channels).
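As a minimal sketch of how one record of the model storage unit 121 might be held in memory, the following Python fragment mirrors the items of FIG. 5. The class names `ImageFormat` and `ModelRecord`, the field names, and the nested-list layout of the connection coefficients (which assumes fully connected layers) are illustrative assumptions and are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass
class ImageFormat:
    """Format of an image accepted or produced by a model (hypothetical)."""
    width: int
    height: int
    channels: int = 3  # e.g. color information expressed in three channels


@dataclass
class ModelRecord:
    """One row of the model storage unit (field names are assumptions)."""
    model_id: str               # "model ID", e.g. "M01"
    input_format: ImageFormat   # "input data"
    # "connection coefficient": weights[k][j][i] is the coefficient of the
    # edge from node i of layer k to node j of layer k + 1 (fully connected
    # layers are assumed here purely for illustration).
    weights: list
    output_format: ImageFormat  # "output data"

    @property
    def num_intermediate_layers(self) -> int:
        # With L weight matrices there are L - 1 layers between input and output.
        return max(len(self.weights) - 1, 0)
```

A record for model M01 with one intermediate layer could then be built as `ModelRecord("M01", ImageFormat(2, 2), [w_in, w_out], ImageFormat(2, 2))`, where `w_in` and `w_out` are hypothetical weight matrices.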

That is, FIG. 5 shows, as an example of the information stored in the model storage unit 121, that the model M01 identified by the model ID "M01" is a model whose input data is "A01", whose connection coefficient is "B01", and whose output data is "C01".

Although not shown in FIG. 5, the model storage unit 121 may also store data sets, such as training data for training the model.

The model according to the embodiment may be a neural network having a single intermediate layer, or may have any of various other structures, such as a DNN having a plurality of intermediate layers.

(About the image storage unit 122)
The image storage unit 122 stores information on the images input to the model. FIG. 6 is a diagram showing an example of the image storage unit 122 according to the embodiment. As shown in FIG. 6, the image storage unit 122 has items such as "image ID", "number of pixels", and "color information".

The "image ID" indicates identification information that identifies an image. "Number of pixels" indicates the number of pixels included in the image. "Color information" indicates the color information of each pixel. In the example shown in FIG. 6, "number of pixels" and "color information" are represented conceptually by labels such as "E01", but in practice, specific information corresponding to each item is stored. For example, the "number of pixels" item stores the specific number of pixels constituting the image. The "color information" item stores specific information indicating the color of each pixel (for example, values for the three RGB channels).

That is, FIG. 6 shows, as an example of the information stored in the image storage unit 122, that the image D01 whose image ID is "D01" has the number of pixels "E01" and the color information "F01".

(About control unit 130)
Returning to FIG. 4, the description continues. The control unit 130 is a controller, and is realized by, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or the like executing various programs (corresponding to an example of the output program) stored in a storage device inside the output device 100, using the RAM as a work area. Alternatively, the control unit 130 may be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

Through information processing according to the model stored in the storage unit 120, the control unit 130 performs, on the first image input to the input layer of the model (more precisely, on the input signals corresponding to the pixels constituting the image), an operation based on the connection coefficients of the model (that is, the weight values corresponding to the features learned by the model), and outputs the second image (more precisely, the output signals corresponding to the pixels constituting the image) from the output layer of the model.

As shown in FIG. 4, the control unit 130 according to the embodiment includes an acquisition unit 131, an input unit 132, a calculation unit 133, an intermediate output unit 134, a reception unit 135, and a result output unit 136, and realizes or executes the information processing functions and operations described below. The internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 4, and may be any other configuration that performs the information processing described later. Likewise, the connection relationship between the processing units of the control unit 130 is not limited to that shown in FIG. 4, and may be another connection relationship.

(About acquisition unit 131)
The acquisition unit 131 acquires various kinds of information. For example, the acquisition unit 131 acquires a trained model for executing the output processing according to the embodiment. The acquisition unit 131 stores the acquired model in the model storage unit 121.

The acquisition unit 131 also acquires a face image of the user. For example, the acquisition unit 131 acquires the face image of the user via an input device 10 such as a camera that has captured the user. The acquisition unit 131 stores the acquired image in the image storage unit 122, and sends the acquired face image to the input unit 132.

(About input unit 132)
The input unit 132 inputs a first image to be processed into a model that has been trained and generated as a neural network that outputs images. In the embodiment, the input unit 132 inputs, as the first image, a face image obtained by capturing the user's face into the model.

For example, the input unit 132 has a function as an encoder that encodes the first image, a function of generating a vector by applying a predetermined matrix to the encoded information, and the like. That is, the input unit 132 converts the first image, which is the input data, into a format that can be input to the model trained as a neural network, and inputs the converted information to the model.

(About calculation unit 133)
The calculation unit 133 multiplies the information input to the model by the input unit 132 (that is, the value of each node) by the connection coefficients to calculate the values of the nodes of the next intermediate layer. For example, to obtain the value of a node in the next intermediate layer, the calculation unit 133 multiplies the value of each preceding-layer node connected to that node by the connection coefficient of the edge between the two nodes, and calculates the sum of these products. The calculation unit 133 then inputs the calculated sum into a predetermined activation function to calculate the value of the node in the next layer.
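The calculation described above can be sketched as follows. This is a minimal illustration that assumes fully connected layers and uses a sigmoid as the "predetermined activation function"; the function names `sigmoid` and `next_layer` and the nested-list weight layout are assumptions made for this example only.

```python
import math


def sigmoid(x: float) -> float:
    """One possible activation function (the embodiment does not fix one)."""
    return 1.0 / (1.0 + math.exp(-x))


def next_layer(prev_values, weights, activation=sigmoid):
    """Compute the node values of the next layer.

    prev_values: values of the nodes in the preceding layer.
    weights[j][i]: connection coefficient of the edge from preceding node i
                   to next-layer node j.
    """
    values = []
    for row in weights:
        # Sum of (preceding node value x connection coefficient of the edge).
        total = sum(w * v for w, v in zip(row, prev_values))
        # The sum is passed through the activation function.
        values.append(activation(total))
    return values
```

For instance, `next_layer([1.0, 2.0], [[0.5, -0.25]])` evaluates the single next-layer node as the sigmoid of `0.5 * 1.0 + (-0.25) * 2.0`.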

(About the intermediate output unit 134)
The intermediate output unit 134 outputs an intermediate image, which is an image at an intermediate layer of the model to which the first image has been input. Specifically, the intermediate output unit 134 generates the intermediate image based on the values of the nodes of the intermediate layer, and outputs the generated intermediate image.
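The embodiment does not specify how node values are turned into pixel values. One plausible sketch, assuming a single-channel layer whose node count equals `width * height` and a simple min-max normalization to 8-bit grayscale, is shown below; the function name `to_grayscale_image` and the normalization itself are assumptions.

```python
def to_grayscale_image(node_values, width, height):
    """Map intermediate-layer node values to rows of 0-255 gray levels."""
    if len(node_values) != width * height:
        raise ValueError("node count must equal width * height")
    lo, hi = min(node_values), max(node_values)
    span = hi - lo
    # Min-max normalization; a constant layer is rendered as all black.
    pixels = [0 if span == 0 else round(255 * (v - lo) / span)
              for v in node_values]
    # Split the flat pixel list into rows of `width` pixels each.
    return [pixels[r * width:(r + 1) * width] for r in range(height)]
```

Because the normalization is relative to the minimum and maximum of the layer, the resulting image shows only the relative strength of the nodes, which is consistent with an intermediate image that need not resemble the final output.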

The structure of the intermediate image does not necessarily have to match that of the final output result (the second image). For example, depending on the structure of the neural network, the number of nodes in an intermediate layer may differ from the number of nodes in the output layer. In that case, image information such as the number of pixels may differ between the intermediate image and the second image.

Further, the intermediate image does not have to be in a form that a user viewing it can recognize as a face. For example, when the model has the structure of a convolutional neural network, the intermediate image may appear as a mosaic-like image divided into regions the size of the filters used in the convolution to extract features. In this case, the intermediate image may show characteristic portions of the user's face (for example, the vicinity of the eyes or nose) arranged like a mosaic.

The intermediate output unit 134 also outputs an intermediate image for each intermediate layer. Specifically, the intermediate output unit 134 outputs a first intermediate image, which is an image at the first intermediate layer of the model to which the first image has been input. Then, when the reception unit 135 described later receives first intervention information reflecting an intervention process on the first intermediate image, the intermediate output unit 134 outputs a second intermediate image, which is an image at the next intermediate layer of the model to which the first intervention information has been input. For example, if the model has m intermediate layers (m is an arbitrary number), the intermediate output unit 134 may output intermediate images for the first through m-th layers.

(About reception unit 135)
The reception unit 135 receives intervention information, which is information reflecting an intervention process on the intermediate image. Specifically, the reception unit 135 receives the intervention information indicating the intervention process via the display control device 30, which has received the user's intervention process on the intermediate image.

For example, the reception unit 135 receives intervention information in which part or all of the information of the intermediate image has been dropped by the intervention process on the intermediate image. Specifically, the reception unit 135 receives intervention information indicating that, in the intermediate image displayed on the output device 20, the information of the pixels corresponding to the portion touched by the user is to be dropped. The reception unit 135 also receives intervention information indicating that, in the intermediate image displayed on the output device 20, the information of the pixels corresponding to the portion overlapped by the binarized image of the user is to be dropped.
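The second form of intervention, in which a binarized image of the user is overlaid on the intermediate image, can be sketched as follows. The sketch assumes that the captured image is already grayscale, has the same resolution as the displayed intermediate image, and that pixels darker than a fixed threshold count as the user's shadow; the function name `shadow_coordinates` and the threshold value are assumptions.

```python
def shadow_coordinates(gray_rows, threshold=128):
    """Return the (x, y) coordinates judged black after binarization.

    gray_rows: rows of 0-255 gray levels from the image of the user.
    Pixels below `threshold` are treated as shadow, that is, as the places
    where intermediate-image information is to be dropped.
    """
    return {
        (x, y)
        for y, row in enumerate(gray_rows)
        for x, value in enumerate(row)
        if value < threshold
    }
```

The returned coordinate set can serve directly as the "missing part" description carried by the intervention information.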

The reception unit 135 sends the received intervention information to the calculation unit 133. Based on the received intervention information, the calculation unit 133 calculates the node values of the intermediate layer that follows the intermediate layer from which the intermediate image was output. That is, the reception unit 135 keeps receiving intervention information until the computation of the model reaches the output layer, and the calculation unit 133 repeats the computation up to the output layer of the model.
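The layer-by-layer loop described above might look like the following sketch. It assumes fully connected layers, a ReLU activation, and that "dropping" information means setting the affected node values to zero; none of these choices is stated in the embodiment. The callback `get_missing` is a hypothetical stand-in for the round trip through the display control device 30.

```python
def relu(x: float) -> float:
    return max(0.0, x)


def forward_with_interventions(first_image, layers, get_missing, activation=relu):
    """Run the model layer by layer, applying interventions in between.

    first_image: flat list of input node values.
    layers: list of weight matrices; layers[k][j][i] connects node i of
            layer k to node j of layer k + 1.
    get_missing(index, values): returns the set of node indices whose
            information the user dropped at intermediate layer `index`.
    """
    values = list(first_image)
    last = len(layers) - 1
    for index, weights in enumerate(layers):
        values = [activation(sum(w * v for w, v in zip(row, values)))
                  for row in weights]
        if index < last:
            # An intermediate layer: show it, collect the intervention,
            # and zero out the dropped nodes (zeroing is an assumption).
            missing = get_missing(index, values)
            values = [0.0 if i in missing else v for i, v in enumerate(values)]
    return values  # node values of the output layer (the second image)
```

With no intervention (`get_missing` returning an empty set) the function reduces to an ordinary forward pass, which makes the effect of each intervention easy to compare against the unmodified output.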

(About the result output unit 136)
The result output unit 136 outputs the second image from the output layer of the model based on the intervention information received by the reception unit 135. Specifically, the result output unit 136 acquires the information of the nodes of the output layer, obtained as a result of the calculation unit 133 performing the computation based on the intervention information received by the reception unit 135. The result output unit 136 then outputs the second image, which is the computation result of the model, based on the nodes of the output layer. For example, the result output unit 136 outputs the second image as image data whose number of pixels and color information are set in a format that can be displayed on the output device 20.

[4. Display control device configuration]
Next, the configuration of the display control device 30 according to the embodiment will be described with reference to FIG. 7. FIG. 7 is a diagram showing a configuration example of the display control device 30 according to the embodiment. As shown in FIG. 7, the display control device 30 includes a communication unit 31, a storage unit 33, and a control unit 32. The display control device 30 may also include an input unit (for example, a keyboard, a mouse, etc.) that receives various operations from an administrator or the like who uses the display control device 30, and an output unit (for example, a liquid crystal display, etc.) for outputting various kinds of information.

(About communication unit 31)
The communication unit 31 is realized by, for example, a NIC or the like. The communication unit 31 is connected to the network N by wire or wirelessly, and transmits and receives information to and from the input device 10, the output device 20, and the output device 100 via the network N.

(About the storage unit 33)
The storage unit 33 is realized by, for example, a semiconductor memory element such as a RAM or a flash memory, or a storage device such as a hard disk or an optical disk. The storage unit 33 according to the embodiment includes an intermediate image storage unit 34 and an intervention information storage unit 35.

(About the intermediate image storage unit 34)
The intermediate image storage unit 34 stores the intermediate images transmitted from the output device 100. FIG. 8 is a diagram showing an example of the intermediate image storage unit 34 according to the embodiment. As shown in FIG. 8, the intermediate image storage unit 34 has items such as "intermediate image ID", "number of pixels", and "color information".

The "intermediate image ID" indicates identification information that identifies an intermediate image. "Number of pixels" and "color information" correspond to the items of the same names shown in FIG. 6.

That is, FIG. 8 shows, as an example of the information stored in the intermediate image storage unit 34, that the intermediate image G01 whose intermediate image ID is "G01" has the number of pixels "H01" and the color information "J01".

(About the intervention information storage unit 35)
The intervention information storage unit 35 stores intervention information. FIG. 9 is a diagram showing an example of the intervention information storage unit 35 according to the embodiment. As shown in FIG. 9, the intervention information storage unit 35 has items such as "intervention information ID" and "missing part information".

The "intervention information ID" indicates identification information that identifies the intervention information. The "missing part information" indicates which of the pixels included in the intermediate image are dropped by the user's operation. In the example shown in FIG. 9, the "missing part information" is represented conceptually by a label such as "L01", but in practice, specific information corresponding to the missing part is stored in this item. For example, the missing part information stores the specific coordinates in the intermediate image that correspond to the portion touched by the user, or the specific coordinates of the portion determined to be black (shadow) in the binarized image of the user.

That is, FIG. 9 shows, as an example of the information stored in the intervention information storage unit 35, that the intervention information K01 whose intervention information ID is "K01" has the missing part information "L01".

(About control unit 32)
Returning to FIG. 7, the description continues. The control unit 32 is a controller, and is realized by, for example, a CPU, an MPU, or the like executing various programs stored in a storage device inside the display control device 30, using the RAM as a work area. Alternatively, the control unit 32 may be realized by an integrated circuit such as an ASIC or an FPGA.

As shown in FIG. 7, the control unit 32 according to the embodiment includes a reception unit 36, a display control unit 37, a generation unit 38, and a transmission unit 39, and realizes or executes the information processing functions and operations described below. The internal configuration of the control unit 32 is not limited to the configuration shown in FIG. 7, and may be any other configuration that performs the information processing described later. Likewise, the connection relationship between the processing units of the control unit 32 is not limited to that shown in FIG. 7, and may be another connection relationship.

(About receiver 36)
The receiving unit 36 receives various kinds of information. For example, the receiving unit 36 receives information about the intermediate image from the output device 100. Specifically, the receiving unit 36 receives information about the pixels constituting the intermediate image (the number of pixels, color information, etc.). The receiving unit 36 stores the received intermediate image in the intermediate image storage unit 34.

(About the display control unit 37)
The display control unit 37 displays the intermediate image output by the intermediate output unit 134 of the output device 100 on an arbitrary display device (the output device 20 in the embodiment).

The display control unit 37 also displays, on the display device, an intervention image, which is an image in which the intervention information generated by the generation unit 38 described later is reflected in the intermediate image. Specifically, when a touch operation is performed on the intermediate image being displayed on the output device 20, the display control unit 37 identifies the pixels on which the touch operation was performed and performs control so that those pixels are displayed in black. The intervention image is thus displayed on the output device 20 by blacking out part of the intermediate image. This allows the user to see which position in the intermediate image he or she has touched.
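Blacking out the touched pixels can be sketched as below, assuming the intermediate image is held as rows of RGB tuples and the touched positions as a set of (x, y) coordinates. The function name `render_intervention_image` and both data layouts are assumptions for illustration.

```python
def render_intervention_image(intermediate_image, touched, black=(0, 0, 0)):
    """Return a copy of the intermediate image with touched pixels in black.

    intermediate_image: rows of (r, g, b) tuples.
    touched: set of (x, y) coordinates on which a touch operation occurred.
    The input image is left unmodified so that the original intermediate
    image remains available.
    """
    return [
        [black if (x, y) in touched else pixel for x, pixel in enumerate(row)]
        for y, row in enumerate(intermediate_image)
    ]
```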

(About the generator 38)
The generation unit 38 generates intervention information based on the intervention process performed on the intermediate image displayed by the display control unit 37. Specifically, the generation unit 38 identifies the coordinates of the pixels in the intermediate image for which an intervention process was received from the user, and generates intervention information indicating the identified coordinates. The generation unit 38 stores the generated intervention information in the intervention information storage unit 35.

More specifically, the generation unit 38 generates intervention information in which the information of the selected portion of the intermediate image is dropped, based on the user's selection operation on the output device 20 displaying the intermediate image. The user's selection operation is, for example, a touch operation on an output device 20 having a touch sensor, or an operation in which the user selects pixels constituting the intermediate image by using another pointing device. This allows the user to choose, while viewing the intermediate image, which information to drop, and thereby to influence the processing of the neural network.

The processing performed by the generation unit 38 will be described with reference to FIGS. 10 and 11. FIG. 10 is a diagram (1) showing an example of the intervention process according to the embodiment. FIG. 10 shows an example in which the user performs an intervention process by a touch operation on the intermediate image 61 displayed on the display 60.

In the example of FIG. 10, the generation unit 38 detects the user's touch operation via the display 60 (step S31). For example, the user touches a part of the intermediate image 61 with a finger 65. The generation unit 38 identifies the missing part based on the coordinate information corresponding to the portion touched by the user (step S32). Specifically, the generation unit 38 identifies, as the missing part, the coordinate position corresponding to the indication 66 that marks the portion touched by the finger 65. The generation unit 38 then generates intervention information, which is information in which part of the information of the intermediate image 61 has been dropped. The intervention information may be expressed as information indicating only the missing part, or as image information in which part of the information has been removed from the intermediate image 61.
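Steps S31 and S32 can be sketched as a mapping from a touch position to a set of pixel coordinates. The sketch assumes that the touch sensor reports a position already expressed in intermediate-image pixel coordinates and that a finger covers a small disc of pixels; the function name `touch_to_missing` and the `radius` parameter are hypothetical.

```python
def touch_to_missing(touch_x, touch_y, radius, width, height):
    """Return the (x, y) pixels within `radius` of the touch position.

    Coordinates outside the image (0 <= x < width, 0 <= y < height) are
    discarded, so a touch near the border yields a clipped region.
    """
    missing = set()
    for y in range(max(0, touch_y - radius), min(height, touch_y + radius + 1)):
        for x in range(max(0, touch_x - radius), min(width, touch_x + radius + 1)):
            # Keep only pixels inside the disc around the touch point.
            if (x - touch_x) ** 2 + (y - touch_y) ** 2 <= radius ** 2:
                missing.add((x, y))
    return missing
```

The returned set corresponds to the first representation mentioned above, in which the intervention information indicates only the missing part.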

Next, the flow of the intervention information generated by the generation unit 38 is described with reference to FIG. 11. FIG. 11 is a diagram (1) showing the procedure of the intervention process according to the embodiment. As shown in FIG. 11, the receiving unit 36 first receives the intermediate layer output of the m-th layer (m is an arbitrary integer) from the output device 100. In FIG. 11, "R^{N(m)×W(m)×H(m)}" denotes the image information of the m-th intermediate layer, where "N" is the number of channels, "W" is the coordinate along the horizontal axis of the image, and "H" is the coordinate along the vertical axis. "H(m)" denotes an element of the intermediate layer output of the m-th layer.

The display control unit 37 shapes the received intermediate layer output and converts it into a format that the output device 20 (display) can show. The shaped image is denoted by, for example, "R^{W×H}", and each of its elements (the information of each pixel) is denoted by "X".

Meanwhile, as shown in FIG. 11, the generation unit 38 identifies the coordinate position touched by the user via the touch sensor on which the shaped intermediate image is displayed, and generates intervention information. "M" in FIG. 11 denotes an element of the intervention information. For example, FIG. 11 indicates that the intervention information is the coordinates "M" contained in the image denoted by the intermediate image "R^{W×H}". That is, in FIG. 11 the intervention information is information indicating the coordinates "M" that the user touched in the intermediate image.

Then, the transmission unit 39 transmits the intervention information generated by the generation unit 38 to the output device 100. The display control unit 37 also performs a calculation for the intervention process based on the intervention information and the intermediate image. For example, the display control unit 37 computes the element-wise product of the element "X" of the intermediate image and the element "M" of the intervention information to obtain the intermediate image after the intervention (the "○" in FIG. 11 denotes the element-wise product). The display control unit 37 then displays the post-intervention intermediate image on the display.
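The touch-based intervention can be illustrated with a minimal sketch. It assumes, purely for illustration, that the intervention information "M" is held as a mask of the same size as the intermediate image, with 0 at touched pixels and 1 elsewhere, so that the element-wise product of "X" and "M" drops the touched pixels. The function names `touch_mask` and `apply_mask` and the `radius` parameter are hypothetical and do not appear in the specification.

```python
from typing import List, Tuple

Image = List[List[float]]  # rows of pixel values; image[y][x]


def touch_mask(width: int, height: int,
               touched: List[Tuple[int, int]], radius: int = 0) -> Image:
    """Build a mask M: 0.0 at touched pixels (within `radius`), 1.0 elsewhere."""
    mask = [[1.0] * width for _ in range(height)]
    for tx, ty in touched:
        for y in range(max(0, ty - radius), min(height, ty + radius + 1)):
            for x in range(max(0, tx - radius), min(width, tx + radius + 1)):
                mask[y][x] = 0.0
    return mask


def apply_mask(x: Image, m: Image) -> Image:
    """Element-wise product of the intermediate image X and the mask M."""
    return [[xv * mv for xv, mv in zip(x_row, m_row)]
            for x_row, m_row in zip(x, m)]


intermediate = [[0.5, 0.2, 0.9],
                [0.1, 0.8, 0.3]]
mask = touch_mask(width=3, height=2, touched=[(1, 0)])  # user touched x=1, y=0
after = apply_mask(intermediate, mask)
```

In this sketch only the touched pixel becomes zero; a larger `radius` would approximate the contact area of a finger.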

The generation unit 38 also generates intervention information in which part or all of the information of the intermediate image is dropped, based on binarized information generated by controlling an imaging device (camera) installed on the output device 20 and on the intermediate image displayed on the output device 20. Because the output device 100 can thus change the output result of the neural network every time the user moves, it can present unexpected output results and entertain the user.

上記の生成部３８が行う処理について、図１２及び図１３を用いて説明する。図１２は、実施形態に係る介入処理の一例を示す図（２）である。図１２では、ユーザの近傍に設置されたカメラ７５によってユーザが撮像され、撮像された画像に基づいて介入情報が生成される例を示す。 The processing performed by the generation unit 38 will be described with reference to FIGS. 12 and 13. FIG. 12 is a diagram (2) showing an example of the intervention process according to the embodiment. FIG. 12 shows an example in which a user is imaged by a camera 75 installed in the vicinity of the user and intervention information is generated based on the captured image.

In the example of FIG. 12, the generation unit 38 acquires, via the camera 75, an image 90 in which the user is captured (step S41). The generation unit 38 then converts the acquired image 90 into binarized data (step S42). For example, the generation unit 38 sets pixels whose brightness is lower than a predetermined threshold to "0" and pixels whose brightness is higher than the threshold to "1".

The generation unit 38 also acquires the intermediate image 91 received from the output device 100 (step S43) and superimposes the binarized data of the image 90 on the intermediate image 91 (step S44). It then identifies, as the data to be dropped, the pixels of the intermediate image 91 that overlap pixels judged to be "0" when the image 90 was binarized. The generation unit 38 generates the intervention information based on the data dropped in this way.
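Steps S42 and S44 can be sketched as follows. This is only an illustration under assumptions: brightness is taken to be an integer from 0 to 255, the threshold value 128 is arbitrary, and a pixel exactly at the threshold is treated as "0" because the description does not specify that case. The names `binarize` and `pixels_to_drop` are hypothetical.

```python
from typing import List, Tuple

Gray = List[List[int]]    # brightness per pixel, assumed to be 0..255
Binary = List[List[int]]  # 0 or 1 per pixel


def binarize(gray: Gray, threshold: int = 128) -> Binary:
    """Step S42: 1 where brightness exceeds the threshold, 0 otherwise."""
    return [[1 if v > threshold else 0 for v in row] for row in gray]


def pixels_to_drop(binary: Binary) -> List[Tuple[int, int]]:
    """Step S44: (x, y) positions whose binarized value is 0."""
    return [(x, y)
            for y, row in enumerate(binary)
            for x, b in enumerate(row)
            if b == 0]


camera_image = [[10, 200],
                [130, 90]]
binary = binarize(camera_image)
dropped = pixels_to_drop(binary)
```

Here the dark pixels of the camera image mark the positions that would be dropped from an intermediate image of the same size.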

次に、図１３を用いて、生成部３８が生成した介入情報の流れを示す。図１３は、実施形態に係る介入処理の手順を示す図（２）である。図１３に示すように、まず、受信部３６は、図１１と同様、出力装置１００からｍ層目（ｍは任意の整数）の中間層出力を受信する。 Next, with reference to FIG. 13, the flow of the intervention information generated by the generation unit 38 is shown. FIG. 13 is a diagram (2) showing the procedure of the intervention process according to the embodiment. As shown in FIG. 13, first, as in FIG. 11, the receiving unit 36 receives the intermediate layer output of the mth layer (m is an arbitrary integer) from the output device 100.

また、図１３に示すように、生成部３８は、カメラから取得されたカメラ画像を２値化して、２値化画像を得る。そして、生成部３８は、２値化画像を中間画像に重畳可能なようにリサイズする。そして、生成部３８は、２値化情報に基づいて、介入情報（図１３に示す要素「Ｙ」により示される）を生成する。送信部３９は、生成部３８によって生成された介入情報を出力装置１００に送信する。また、表示制御部３７は、中間画像の要素「Ｘ」と介入情報の要素「Ｙ」との要素積を算出して、介入後の中間画像に関する情報を演算する。そして、表示制御部３７は、介入後の中間画像をディスプレイに表示する。なお、生成部３８は、中間画像の要素「Ｘ」と２値化情報の要素積を介入情報としてもよい。 Further, as shown in FIG. 13, the generation unit 38 binarizes the camera image acquired from the camera to obtain a binarized image. Then, the generation unit 38 resizes the binarized image so that it can be superimposed on the intermediate image. Then, the generation unit 38 generates intervention information (indicated by the element “Y” shown in FIG. 13) based on the binarization information. The transmission unit 39 transmits the intervention information generated by the generation unit 38 to the output device 100. Further, the display control unit 37 calculates the element product of the element "X" of the intermediate image and the element "Y" of the intervention information, and calculates the information regarding the intermediate image after the intervention. Then, the display control unit 37 displays the intermediate image after the intervention on the display. The generation unit 38 may use the element product of the element "X" of the intermediate image and the binarization information as intervention information.
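The resize and element-wise product of FIG. 13 can be sketched as below. The description does not state how the binarized image is resized, so nearest-neighbor sampling is used here as an assumption. The names `resize_nearest` and `element_product` are hypothetical.

```python
from typing import List

Binary = List[List[int]]
Image = List[List[float]]


def resize_nearest(binary: Binary, out_w: int, out_h: int) -> Binary:
    """Resize a binarized image to out_w x out_h by nearest-neighbor sampling."""
    in_h, in_w = len(binary), len(binary[0])
    return [[binary[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)]
            for y in range(out_h)]


def element_product(x: Image, y: Binary) -> Image:
    """Element-wise product of intermediate image X and intervention information Y."""
    return [[xv * yv for xv, yv in zip(x_row, y_row)]
            for x_row, y_row in zip(x, y)]


binary_camera = [[0, 0, 1, 1],
                 [0, 0, 1, 1],
                 [1, 1, 0, 0],
                 [1, 1, 0, 0]]
intermediate = [[0.4, 0.6],
                [0.7, 0.9]]

y_info = resize_nearest(binary_camera, out_w=2, out_h=2)
after = element_product(intermediate, y_info)
```

The 4×4 binarized camera image is reduced to the 2×2 size of the intermediate image before the product is taken, so each intermediate pixel is kept or dropped according to the camera region it overlaps.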

Here, an outline of the calculation that the output device 100 performs based on the intervention information acquired from the display control device 30 is described with reference to FIG. 14. FIG. 14 is a schematic diagram showing the procedure of the output process according to the embodiment. As shown in FIG. 14, after the user's face image is input, the output device 100 performs the calculation of the first layer and obtains the output of the first layer (the first intermediate image); "f" denotes the activation function. The output device 100 then executes the intervention process for the first layer (the same process as the "intervention process" shown in FIGS. 11 and 13) and obtains the first-layer output that has undergone the intervention process.

After that, the output device 100 performs the calculation of the second layer and obtains the output of the second layer (the second intermediate image). By repeating these steps, the output device 100 carries the calculation through to the output layer. In the example shown in FIG. 14, the output device 100 computes with a DNN that has a plurality of intermediate layers, up to an M-th layer, and therefore obtains the DNN output from the output layer as the final result. In the embodiment, the DNN output is image information.
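The layer-by-layer calculation with intervention can be sketched as follows. This is a simplified illustration, not the model of the embodiment: each layer is assumed to be a small fully connected layer on a flat vector, the activation function "f" is assumed to be ReLU, and the intervention is assumed to be a multiplicative mask applied to an intermediate layer's output. All names are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Layer = Tuple[List[List[float]], List[float]]  # (weight matrix, bias vector)


def relu(v: Vector) -> Vector:
    """Assumed activation function f."""
    return [max(0.0, a) for a in v]


def dense(weights: List[List[float]], bias: List[float], x: Vector) -> Vector:
    """Weighted sum plus bias for one layer."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def forward_with_intervention(x: Vector, layers: List[Layer],
                              masks: Dict[int, Vector]) -> Vector:
    """Run all layers; after intermediate layer i, multiply its output by masks[i]."""
    h = x
    for i, (w, b) in enumerate(layers):
        h = relu(dense(w, b, h))
        is_output_layer = i == len(layers) - 1
        if not is_output_layer and i in masks:
            h = [hv * mv for hv, mv in zip(h, masks[i])]
    return h


layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),  # first (intermediate) layer
          ([[1.0, 1.0]], [0.0])]                   # output layer
plain = forward_with_intervention([2.0, 3.0], layers, masks={})
masked = forward_with_intervention([2.0, 3.0], layers, masks={0: [1.0, 0.0]})
```

Dropping the second element of the first-layer output changes the final output, which is the effect that the intervention process relies on.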

(About the transmission unit 39)
The transmission unit 39 transmits various kinds of information. For example, the transmission unit 39 transmits the intervention information generated by the generation unit 38 to the output device 100.

[5. Processing procedure]
Next, the procedure of the processing performed by the output device 100 according to the embodiment is described with reference to FIG. 15. FIG. 15 is a flowchart showing the processing procedure according to the embodiment.

図１５に示すように、出力装置１００は、カメラ等の入力デバイス１０を介して、ユーザの顔画像を取得する（ステップＳ１０１）。出力装置１００は、取得した顔画像をモデルに入力する（ステップＳ１０２）。 As shown in FIG. 15, the output device 100 acquires a user's face image via an input device 10 such as a camera (step S101). The output device 100 inputs the acquired face image to the model (step S102).

その後、出力装置１００は、所定時間（例えば、ユーザがニューラルネットワークの出力の変化を体験する体験時間として設定された時間）が経過したか否かを判定する（ステップＳ１０３）。所定時間が経過していない場合（ステップＳ１０３；Ｎｏ）、出力装置１００は、出力した中間画像に対する介入情報を表示制御装置３０から受け付ける（ステップＳ１０５）。続けて、出力装置１００は、介入情報に基づいて、次の層の情報を算出する（ステップＳ１０６）。具体的には、出力装置１００は、介入処理を経た前段の出力に基づいて、次の層を構成する各ノードの値を算出する。 After that, the output device 100 determines whether or not a predetermined time (for example, a time set as an experience time for the user to experience a change in the output of the neural network) has elapsed (step S103). When the predetermined time has not elapsed (step S103; No), the output device 100 receives the intervention information for the output intermediate image from the display control device 30 (step S105). Subsequently, the output device 100 calculates the information of the next layer based on the intervention information (step S106). Specifically, the output device 100 calculates the value of each node constituting the next layer based on the output of the previous stage that has undergone the intervention process.

その後、出力装置１００は、算出した層（次の層）が出力層であるか否かを判定する（ステップＳ１０７）。出力層でない場合（ステップＳ１０７；Ｎｏ）、出力装置１００は、次の中間層の中間画像を出力する処理を繰り返す（ステップＳ１０４）。 After that, the output device 100 determines whether or not the calculated layer (next layer) is an output layer (step S107). If it is not an output layer (step S107; No), the output device 100 repeats a process of outputting an intermediate image of the next intermediate layer (step S104).

一方、出力装置１００は、次の層が出力層である場合（ステップＳ１０７；Ｙｅｓ）、出力層からモデルの出力結果である第２画像を出力する（ステップＳ１０８）。そして、出力装置１００は、出力した第２画像を出力デバイス２０等に表示する（ステップＳ１０９）。その後、所定時間が経過した場合に（ステップＳ１０３；Ｙｅｓ）、出力装置１００は、一連の出力処理を終了する。 On the other hand, when the next layer is an output layer (step S107; Yes), the output device 100 outputs a second image which is an output result of the model from the output layer (step S108). Then, the output device 100 displays the output second image on the output device 20 or the like (step S109). After that, when the predetermined time has elapsed (step S103; Yes), the output device 100 ends a series of output processes.
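The flow of steps S101 to S109 can be sketched as a loop. This is one possible reading of the flowchart, under several assumptions: the same face image is re-entered on every pass, the intervention information is a multiplicative mask, and the layer calculations, the intervention source, the display, and the time check are supplied as callbacks. All names are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def run_output_loop(
    first_image: Vector,
    layer_fns: List[Callable[[Vector], Vector]],
    receive_intervention: Callable[[int, Vector], Vector],
    display: Callable[[Vector], None],
    time_is_up: Callable[[], bool],
) -> int:
    """Repeat the forward pass until time_is_up() is True; return the pass count."""
    passes = 0
    while not time_is_up():                        # S103
        h = layer_fns[0](first_image)              # S102: first layer output
        for i in range(1, len(layer_fns)):
            mask = receive_intervention(i - 1, h)  # S104, S105
            h = [a * m for a, m in zip(h, mask)]
            h = layer_fns[i](h)                    # S106; last i is the output layer (S107)
        display(h)                                 # S108, S109
        passes += 1
    return passes


shown: List[Vector] = []
ticks = iter([False, False, True])  # fake clock: two passes, then stop
passes = run_output_loop(
    first_image=[1.0, 2.0],
    layer_fns=[lambda v: [a * 2 for a in v], lambda v: [sum(v)]],
    receive_intervention=lambda layer_index, h: [1.0, 0.0],
    display=shown.append,
    time_is_up=lambda: next(ticks),
)
```

Passing the time check as a callback keeps the sketch testable without waiting for a real timer.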

Although omitted from FIG. 15, the output device 100 may execute a process that initializes the intervention information accumulated so far at an arbitrary timing (for example, immediately before the user's face image is first input to the model).

[6. Modifications]
The output system 1 according to the embodiment described above may be implemented in various forms other than that embodiment. Other embodiments of the devices included in the output system 1 are therefore described below.

[6-1. Configuration of the output device]
The embodiment above showed an example in which the output device 100 outputs the intermediate image to the display control device 30 and receives the intervention information via the display control device 30. These processes may instead be performed by the output device 100 alone.

This point is described with reference to FIG. 16. FIG. 16 is a diagram showing a configuration example of an output device 200 according to a modification. As shown in FIG. 16, compared with the output device 100, the output device 200 further includes an intermediate image storage unit 123, an intervention information storage unit 124, a display control unit 137, and a generation unit 138.

中間画像記憶部１２３は、表示制御装置３０に係る中間画像記憶部３４と同様の情報を記憶する。介入情報記憶部１２４は、表示制御装置３０に係る介入情報記憶部３５と同様の情報を記憶する。また、表示制御部１３７は、表示制御装置３０に係る表示制御部３７と同様の処理を実行する。生成部１３８は、表示制御装置３０に係る生成部３８と同様の処理を実行する。 The intermediate image storage unit 123 stores the same information as the intermediate image storage unit 34 related to the display control device 30. The intervention information storage unit 124 stores the same information as the intervention information storage unit 35 related to the display control device 30. Further, the display control unit 137 executes the same processing as the display control unit 37 related to the display control device 30. The generation unit 138 executes the same processing as the generation unit 38 related to the display control device 30.

すなわち、出力装置１００は、表示制御装置３０が実行する処理を自装置で実行してもよい。これにより、出力装置１００は、簡易なシステム設計で実施形態に係る出力処理を実行することができる。 That is, the output device 100 may execute the process executed by the display control device 30 by its own device. As a result, the output device 100 can execute the output process according to the embodiment with a simple system design.

[6-2. Devices of the output system]
Each device included in the output system 1 may be realized in various modified forms. For example, the output system 1 may include a PC for image capture that controls the camera 50 capturing the user. The output device 100 may include a display that shows the second image, which is the output result of the model. In this case, the output device 100 is realized by an information processing terminal that also serves as a display device, such as a notebook PC or a tablet.

[6-3. Intervention process]
The embodiment above gave a touch operation by the user and image information captured by a camera as examples of the intervention process. However, the intervention process may be any of various user operations performed on the intermediate image, or may use audio data of the user's voice, for example.

また、上記実施形態では、出力装置１００は、介入情報として、中間画像として表示された一部の情報を欠落させた情報を利用することを示した。しかし、介入情報は、必ずしも中間画像の一部を欠落させたものではなく、中間画像の一部を変化させたものであってもよい。例えば、介入情報は、中間画像として示された画素の情報を任意に変化（例えば、明度や色情報を増減させる等）させたものであってもよい。 Further, in the above embodiment, it is shown that the output device 100 uses the information in which a part of the information displayed as the intermediate image is omitted as the intervention information. However, the intervention information is not necessarily the one in which a part of the intermediate image is omitted, but may be the one in which a part of the intermediate image is changed. For example, the intervention information may be information obtained by arbitrarily changing the information of the pixels shown as the intermediate image (for example, increasing / decreasing the brightness and the color information).
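An intervention that changes pixel values rather than dropping them can be sketched as below. The rectangular region and the multiplicative gain are assumptions chosen for illustration; the description only states that pixel information such as brightness or color may be increased or decreased. The name `scale_region` is hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]  # image[y][x]


def scale_region(image: Image, region: Tuple[int, int, int, int],
                 gain: float) -> Image:
    """Multiply pixels inside region (x0, y0, x1, y1; end-exclusive) by gain."""
    x0, y0, x1, y1 = region
    return [[v * gain if x0 <= x < x1 and y0 <= y < y1 else v
             for x, v in enumerate(row)]
            for y, row in enumerate(image)]


image = [[1.0, 2.0],
         [3.0, 4.0]]
dimmed = scale_region(image, region=(0, 0, 1, 2), gain=0.5)  # left column halved
erased = scale_region(image, region=(0, 0, 1, 2), gain=0.0)  # same as dropping
```

With a gain of 0 this reduces to the dropping behavior of the embodiment, so the two kinds of intervention can share one code path in this sketch.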

[7. Others]
Of the processes described in the embodiment above, all or part of a process described as being performed automatically can also be performed manually, and all or part of a process described as being performed manually can also be performed automatically by a known method. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed arbitrarily unless otherwise specified.

Each component of each illustrated device is a functional concept and does not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of the devices is not limited to the illustrated one, and all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

例えば、図４に示したモデル記憶部１２１や画像記憶部１２２は、出力装置１００が保持せずに、外部のストレージサーバ等に保持されてもよい。この場合、出力装置１００は、ストレージサーバにアクセスすることで、モデルや画像情報等を取得する。 For example, the model storage unit 121 and the image storage unit 122 shown in FIG. 4 may not be held by the output device 100 but may be held by an external storage server or the like. In this case, the output device 100 acquires the model, image information, and the like by accessing the storage server.

Also, for example, the output device 100 described above may be split into a front-end server side, which mainly handles exchanges with external devices such as acquiring the intervention information from the display control device 30 and outputting the second image to the output device 20, and a back-end server side, which mainly executes the arithmetic processing that uses the model.

[8. Hardware configuration]
The output device 100, the input device 10, the output device 20, the display control device 30, and the like according to the embodiment described above are realized by, for example, a computer 1000 configured as shown in FIG. 17. The output device 100 is used as an example below. FIG. 17 is a hardware configuration diagram showing an example of the computer 1000 that realizes the functions of the output device 100. The computer 1000 has a CPU 1100, a RAM 1200, a ROM (Read Only Memory) 1300, an HDD (Hard Disk Drive) 1400, a communication interface (I/F) 1500, an input/output interface (I/F) 1600, and a media interface (I/F) 1700.

ＣＰＵ１１００は、ＲＯＭ１３００又はＨＤＤ１４００に格納されたプログラムに基づいて動作し、各部の制御を行う。ＲＯＭ１３００は、コンピュータ１０００の起動時にＣＰＵ１１００によって実行されるブートプログラムや、コンピュータ１０００のハードウェアに依存するプログラム等を格納する。 The CPU 1100 operates based on a program stored in the ROM 1300 or the HDD 1400, and controls each part. The ROM 1300 stores a boot program executed by the CPU 1100 when the computer 1000 is started, a program depending on the hardware of the computer 1000, and the like.

The HDD 1400 stores programs executed by the CPU 1100, data used by those programs, and the like. The communication interface 1500 receives data from other devices via the communication network 500 (corresponding to the network N shown in FIG. 3) and sends it to the CPU 1100, and also transmits data generated by the CPU 1100 to other devices via the communication network 500.

The CPU 1100 controls output devices such as a display and a printer and input devices such as a keyboard and a mouse via the input/output interface 1600. The CPU 1100 acquires data from the input devices via the input/output interface 1600, and outputs the data it has generated to the output devices via the input/output interface 1600.

The media interface 1700 reads a program or data stored in the recording medium 1800 and provides it to the CPU 1100 via the RAM 1200. The CPU 1100 loads the program from the recording medium 1800 onto the RAM 1200 via the media interface 1700 and executes the loaded program. The recording medium 1800 is, for example, an optical recording medium such as a DVD (Digital Versatile Disc) or a PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.

例えば、コンピュータ１０００が出力装置１００として機能する場合、コンピュータ１０００のＣＰＵ１１００は、ＲＡＭ１２００上にロードされたプログラムを実行することにより、制御部１３０の機能を実現する。また、ＨＤＤ１４００には、記憶部１２０内の各データが格納される。コンピュータ１０００のＣＰＵ１１００は、これらのプログラムを記録媒体１８００から読み取って実行するが、他の例として、他の装置から通信網５００を介してこれらのプログラムを取得してもよい。 For example, when the computer 1000 functions as the output device 100, the CPU 1100 of the computer 1000 realizes the function of the control unit 130 by executing the program loaded on the RAM 1200. Further, each data in the storage unit 120 is stored in the HDD 1400. The CPU 1100 of the computer 1000 reads these programs from the recording medium 1800 and executes them, but as another example, these programs may be acquired from another device via the communication network 500.

[9. Effects]
As described above, the output device 100 according to the embodiment has the input unit 132, the intermediate output unit 134, the reception unit 135, and the result output unit 136. The input unit 132 inputs a first image to be processed into a model that is a neural network that outputs an image. The intermediate output unit 134 outputs an intermediate image, which is an image in an intermediate layer of the model to which the first image has been input. The reception unit 135 receives intervention information, which is information reflecting the intervention process performed on the intermediate image. The result output unit 136 outputs a second image from the output layer of the model based on the intervention information received by the reception unit 135.

このように、実施形態に係る出力装置１００は、ニューラルネットワークの演算の過程において、中間層から出力した中間画像に対する介入情報を受け付けることで、ニューラルネットワークにユーザが介入することを可能にする。これにより、出力装置１００は、モデルから出力される結果を動的に修正することができる。 As described above, the output device 100 according to the embodiment enables the user to intervene in the neural network by receiving the intervention information for the intermediate image output from the intermediate layer in the process of the calculation of the neural network. As a result, the output device 100 can dynamically correct the result output from the model.

また、中間出力部１３４は、第１画像が入力されたモデルの第１中間層における画像である第１中間画像を出力し、その後、受付部１３５によって第１中間画像に対する介入処理を反映させた第１介入情報が受け付けられた場合には、第１介入情報が入力されたモデルの次段の中間層における画像である第２中間画像を出力する。 Further, the intermediate output unit 134 outputs a first intermediate image which is an image in the first intermediate layer of the model to which the first image is input, and then the reception unit 135 reflects the intervention process for the first intermediate image. When the first intervention information is accepted, the second intermediate image, which is an image in the intermediate layer of the next stage of the model in which the first intervention information is input, is output.

In this way, when the model has a plurality of intermediate layers, the output device 100 according to the embodiment may output an intermediate image for each intermediate layer. Because the output device 100 can thus accept user intervention at various stages of the neural network, it can reflect the user's intervention in the output result in finer detail.

また、受付部１３５は、中間画像に対する介入処理によって中間画像の一部又は全部の情報を欠落させた情報である介入情報を受け付ける。 In addition, the reception unit 135 receives intervention information, which is information in which part or all of the information of the intermediate image is omitted by the intervention process for the intermediate image.

In this way, the output device 100 according to the embodiment receives intervention information in which information of the intermediate image is dropped. In other words, the output device 100 receives information that drops a characteristic portion extracted by the neural network. Because the output device 100 can thereby remove some features of the input image from the calculation of the neural network, it can show the user, in an easily understood way, how the intervention process changes the output result.

The output device 200 according to the modification further includes the display control unit 137, which displays the intermediate image output by the intermediate output unit 134 on a display device, and the generation unit 138, which generates intervention information based on an intervention process performed on the intermediate image displayed by the display control unit 137. The reception unit 135 receives the intervention information generated by the generation unit 138.

このように、変形例に係る出力装置２００は、中間画像を表示する処理や、介入情報を生成する処理を自装置で実行してもよい。これにより、出力装置２００は、より簡易的なシステムで出力処理を実行することができる。 As described above, the output device 200 according to the modified example may execute the process of displaying the intermediate image and the process of generating the intervention information by the own device. As a result, the output device 200 can execute the output process with a simpler system.

また、生成部１３８は、中間画像が表示された表示装置に対するユーザの選択操作に基づいて、中間画像において選択された箇所の情報を欠落させた介入情報を生成する。 Further, the generation unit 138 generates intervention information in which the information of the selected portion in the intermediate image is omitted, based on the user's selection operation on the display device on which the intermediate image is displayed.

In this way, by generating the dropped-information data based on the user's selection operation, the output device 200 according to the modification can show the user changes in the second image (the output result) that are linked to the user's movements, and can therefore offer entertainment such as enjoying how the second image changes. The output device 200 can also let the user experience the internal processing of the model, which normally cannot be perceived, such as which parts of the intermediate image affect the output result, and how, when they change.

The generation unit 138 also generates intervention information in which part or all of the information of the intermediate image is dropped, based on binarized information generated by controlling an imaging device installed on the display device and on the intermediate image displayed on the display device.

このように、変形例に係る出力装置２００は、カメラ等に撮像された風景等に基づいて介入情報を生成してもよい。これにより、出力装置２００は、ユーザが意図しない変化を出力結果に反映させることができるため、印象的なデモンストレーション等を行うことができる。 As described above, the output device 200 according to the modified example may generate intervention information based on the landscape or the like captured by the camera or the like. As a result, the output device 200 can reflect changes not intended by the user in the output result, so that an impressive demonstration or the like can be performed.

また、表示制御部１３７は、生成部１３８によって生成された介入情報を中間画像に反映させた画像である介入画像を表示装置に表示する。 Further, the display control unit 137 displays an intervention image, which is an image in which the intervention information generated by the generation unit 138 is reflected in the intermediate image, on the display device.

In this way, the output device 200 according to the modification presents the user with an intermediate image in which the intervention information is reflected, so that the location the user touched can be seen. Because the output device 200 can thereby make the user perceive which touched location changes the output result in what way, it can give the user a tangible sense of the model's calculation and of the changes caused by intervention.

Some embodiments of the present application have been described in detail above with reference to the drawings, but they are examples, and the present invention can be carried out in other forms to which various modifications and improvements are applied based on the knowledge of those skilled in the art, including the aspects described in the disclosure of the invention.

The output device 100 described above may be realized by a plurality of server computers, and its configuration can be changed flexibly; for example, depending on the function, it may be realized by calling an external platform or the like through an API (Application Programming Interface) or network computing.

The term "unit (section, module, unit)" used in the claims can be read as "means", "circuit", or the like. For example, the acquisition unit can be read as acquisition means or an acquisition circuit.

1 Output system
10 Input device
20 Output device
30 Display control device
100 Output device
110 Communication unit
120 Storage unit
121 Model storage unit
122 Image storage unit
123 Intermediate image storage unit
124 Intervention information storage unit
130 Control unit
131 Acquisition unit
132 Input unit
133 Calculation unit
134 Intermediate output unit
135 Reception unit
136 Result output unit
137 Display control unit
138 Generation unit

Claims

An output device comprising:
an input unit that inputs a first image to be processed into a model that is a neural network that outputs an image;
an intermediate output unit that outputs an intermediate image, which is an image in an intermediate layer of the model to which the first image has been input;
a reception unit that receives intervention information, which is information reflecting intervention processing on the intermediate image; and
a result output unit that outputs a second image from an output layer of the model based on the intervention information received by the reception unit,
wherein the reception unit receives the intervention information generated based on intervention processing on the intermediate image that was output by the intermediate output unit and is displayed on a display device.

The output device according to claim 1,
wherein the intermediate output unit outputs a first intermediate image, which is an image in a first intermediate layer of the model to which the first image has been input, and then, when the reception unit receives first intervention information reflecting intervention processing on the first intermediate image, outputs a second intermediate image, which is an image in the next-stage intermediate layer of the model to which the first intervention information has been input.

The output device according to claim 1 or 2,
wherein the reception unit receives the intervention information, which is information in which part or all of the information of the intermediate image has been omitted by the intervention processing on the intermediate image.

The output device according to any one of claims 1 to 3, further comprising:
a display control unit that displays the intermediate image output by the intermediate output unit on the display device; and
a generation unit that generates the intervention information based on intervention processing on the intermediate image displayed by the display control unit,
wherein the reception unit receives the intervention information generated by the generation unit.

The output device according to claim 4,
wherein the generation unit generates, based on a user's selection operation on the display device on which the intermediate image is displayed, the intervention information in which the information of the selected portion of the intermediate image is omitted.

The output device according to claim 4 or 5,
wherein the generation unit generates the intervention information, in which part or all of the information of the intermediate image is omitted, based on the intermediate image displayed on the display device and on binarization information generated by controlling an imaging device installed on the display device.

The output device according to any one of claims 4 to 6,
wherein the display control unit displays, on the display device, an intervention image, which is an image in which the intervention information generated by the generation unit is reflected in the intermediate image.

An output method executed by a computer, the method comprising:
an input step of inputting a first image to be processed into a model that is a neural network that outputs an image;
an intermediate output step of outputting an intermediate image, which is an image in an intermediate layer of the model to which the first image has been input;
a reception step of receiving intervention information, which is information reflecting intervention processing on the intermediate image; and
a result output step of outputting a second image from an output layer of the model based on the intervention information received in the reception step,
wherein the reception step receives the intervention information generated based on intervention processing on the intermediate image that was output in the intermediate output step and is displayed on a display device.

An output program that causes a computer to execute:
an input procedure of inputting a first image to be processed into a model that is a neural network that outputs an image;
an intermediate output procedure of outputting an intermediate image, which is an image in an intermediate layer of the model to which the first image has been input;
a reception procedure of receiving intervention information, which is information reflecting intervention processing on the intermediate image; and
a result output procedure of outputting a second image from an output layer of the model based on the intervention information received in the reception procedure,
wherein the reception procedure receives the intervention information generated based on intervention processing on the intermediate image that was output in the intermediate output procedure and is displayed on a display device.

An output system comprising an output device and a display control device,
wherein the output device comprises:
an input unit that inputs a first image to be processed into a model that is a neural network that outputs an image;
an intermediate output unit that outputs an intermediate image, which is an image in an intermediate layer of the model to which the first image has been input;
a reception unit that receives intervention information, which is information reflecting intervention processing on the intermediate image; and
a result output unit that outputs a second image from an output layer of the model based on the intervention information received by the reception unit,
the display control device comprises:
a display control unit that displays the intermediate image output by the intermediate output unit on a display device; and
a generation unit that generates the intervention information based on intervention processing on the intermediate image displayed by the display control unit, and
the reception unit receives the intervention information generated by the generation unit.