JP2023086549A - Information processing apparatus, information processing method in information processing apparatus, and program

Info

Publication number: JP2023086549A
Application number: JP2021201134A
Authority: JP
Inventors: 修二奥野; Shuji Okuno
Original assignee: Axell Corp
Current assignee: Axell Corp
Priority date: 2021-12-10
Filing date: 2021-12-10
Publication date: 2023-06-22
Anticipated expiration: 2041-12-10
Also published as: JP7418019B2; JP2024024680A; JP7548634B2

Abstract

PROBLEM TO BE SOLVED: To provide an information processing apparatus that, in artificial intelligence using a CNN, performs analysis or recognition with high accuracy while suppressing increases in data volume and processing load.
SOLUTION: An information processing apparatus 1A includes: a CNN 114, which comprises a convolutional neural network including a convolutional layer and executes convolution processing on data having multiple channels; a first converter 112, which performs nonlinear conversion on data input to the information processing apparatus 1A and inputs the result to the CNN 114; and/or an inverse converter 115, which performs nonlinear conversion on data output from the CNN 114 and causes the result to be output from the information processing apparatus 1A. The first converter 112 and/or the inverse converter 115 performs the nonlinear conversion on the data separately for each channel.
SELECTED DRAWING: Figure 2

Description

The present invention relates to an information processing apparatus and an information processing method that process data using a convolutional neural network (CNN).

In recent years, convolutional neural networks (hereinafter "CNNs") have been widely used to analyze and recognize data with artificial intelligence (AI), for example in various kinds of analysis and recognition of image data, audio data, and the like. As a conventional artificial intelligence system using such a CNN, an invention is known in which, to improve the accuracy of analysis and recognition by the CNN, a converter that nonlinearly performs spatial conversion on data having a plurality of parameters as discrete values, such as digital color image data in the RGB color space, is provided in the stage preceding the CNN (see, for example, Patent Document 1).

Patent Document 1: Japanese Patent No. 6476531

However, CNNs are used for diverse purposes, such as data recognition, data analysis, and improving the precision of data. Depending on the type of data and the purpose, the effect of CNN processing may be enhanced by nonlinearly converting only specific parameters among the plurality of parameters. Patent Document 1, however, nonlinearly converts all of the parameters of the data to be converted, so the processing load may become excessive and the processing accuracy may decrease.

The present invention has been made in view of these problems. Its object is to provide an information processing apparatus, an information processing method, and a program that, in artificial intelligence using a CNN, can perform analysis and recognition with high accuracy while preventing the amount of data and the processing load from becoming excessive.

To solve this problem, the invention according to claim 1 is an information processing apparatus comprising data processing means that includes a convolutional neural network including a convolutional layer and that performs convolution processing on data having a plurality of channels, the apparatus comprising: conversion means that performs nonlinear conversion on data input to the information processing apparatus and inputs the converted data to the data processing means; and/or inverse conversion means that performs nonlinear conversion on data output from the data processing means and causes the converted data to be output from the information processing apparatus, wherein the conversion means and/or the inverse conversion means includes first nonlinear processing means that performs the nonlinear conversion on the data separately for each channel.

The invention according to claim 2 is characterized in that, in addition to the configuration according to claim 1, the conversion means and/or the inverse conversion means includes a processing layer group consisting of at least three processing layers, the processing layer group including an input layer having one node, an intermediate processing layer that is provided after the input layer and is a convolutional layer or dense layer having a plurality of nodes, and an output layer that is provided after the intermediate processing layer and is a convolutional layer or dense layer having one or more nodes, and in that the processing layer group is provided for each channel of the data input to the convolutional neural network.

The invention according to claim 3 is characterized in that, in addition to the configuration according to claim 2, the intermediate processing layer consists of a single layer.

The invention according to claim 4 is characterized in that, in addition to the configuration according to claim 2, the intermediate processing layer consists of a plurality of layers.

The invention according to claim 5 is characterized in that, in addition to the configuration according to any one of claims 1 to 4, the conversion means and/or the inverse conversion means includes second nonlinear processing means that performs the nonlinear conversion by combining a plurality of the channels.

The invention according to claim 6 is characterized in that, in addition to the configuration according to any one of claims 1 to 5, the apparatus includes storage means that stores a conversion table in which the manner of conversion used in the first nonlinear processing means is recorded, and the first nonlinear processing means performs the nonlinear conversion using the conversion table acquired from the storage means.

The invention according to claim 7 is characterized in that, in addition to the configuration according to any one of claims 1 to 6, a skip connection is used in the conversion means and/or the inverse conversion means.

The invention according to claim 8 is an information processing method in an information processing apparatus, the method comprising a data processing procedure in which convolution processing is performed on data having a plurality of channels in a convolutional neural network including a convolutional layer, the method comprising: a conversion procedure in which nonlinear conversion is performed on data input to the information processing apparatus and the converted data is input to the processing of the data processing procedure; and/or an inverse conversion procedure in which nonlinear conversion is performed on data output by the processing of the data processing procedure and the converted data is output from the information processing apparatus, wherein the conversion procedure and/or the inverse conversion procedure includes a first nonlinear processing procedure in which the nonlinear conversion is performed on the data separately for each channel.

The invention according to claim 9 is a program that causes a computer to function as the information processing apparatus according to any one of claims 1 to 7.

According to the present invention, in artificial intelligence using a CNN, analysis and recognition can be performed with high accuracy while preventing the amount of data and the processing load from becoming excessive.

FIG. 1 is a functional block diagram showing the overall configuration of the information processing apparatus according to Embodiment 1.
FIG. 2 is a functional block diagram schematically showing the detailed configuration of the image processing unit of the information processing apparatus.
FIG. 3 is a functional block diagram schematically showing the detailed configuration of the image processing unit of the information processing apparatus.
FIG. 4 is a functional block diagram showing the detailed configuration of the first converter of the information processing apparatus.
FIG. 5 is a functional block diagram outlining a modification of the first converter of the information processing apparatus.
FIG. 6 is a functional block diagram showing the detailed configuration of the second converter of the information processing apparatus.
FIG. 7 is a block diagram and time chart schematically showing the configuration and processing procedure (data processing procedure) of the CNN of the information processing apparatus.
FIG. 8 is a functional block diagram showing the configuration of the first converter of the information processing apparatus according to Embodiment 2.
FIG. 9 is a functional block diagram showing part of the configuration of the image processing unit of the information processing apparatus according to Embodiment 3.
FIG. 10 is a functional block diagram showing part of the configuration of the image processing unit of the information processing apparatus according to Embodiment 4.
FIG. 11 is a functional block diagram showing part of the configuration of the image processing unit of the information processing apparatus according to Embodiment 5.
FIG. 12 is a functional block diagram showing part of the configuration of the image processing unit of the information processing apparatus according to Embodiment 6.
FIG. 13 is a functional block diagram showing part of the configuration of the image processing unit of the information processing apparatus according to Embodiment 7.
FIG. 14 shows, as an example of the present invention, (A) a functional block diagram showing part of the configuration of the image processing unit of an information processing apparatus as conventional example 1, (B) a functional block diagram showing part of the configuration of the image processing unit of an information processing apparatus as conventional example 2, and (C) a functional block diagram showing part of the configuration of the image processing unit of the information processing apparatus as the present invention.

[Embodiment 1 of the invention]
FIGS. 1 to 7 show an information processing apparatus according to Embodiment 1 and an information processing method in that apparatus. Embodiment 1 of the present invention is described below with reference to the drawings.

[Basic configuration]
First, the configuration of the information processing apparatus according to Embodiment 1 is described.

The information processing apparatus 1A of Embodiment 1, shown in FIG. 1, includes artificial intelligence (hereinafter simply "AI"). It uses the AI to analyze and recognize various kinds of data and to restore the data used for the analysis and recognition. The information processing apparatus 1A performs data processing using a CNN on digital data.

In the following description of Embodiment 1, the information processing apparatus 1A analyzes, recognizes, and restores image data as the digital data. The image data input to the information processing apparatus 1A of Embodiment 1 is assumed to be 256-level RGB color model image data, that is, image data having the three parameters R value, G value, and B value.

However, the data handled by the information processing apparatus 1A is not limited to image data; it may be, for example, audio data as digital data or various kinds of digital data other than audio. The information processing apparatus 1A may also convert analog data into digital data before performing the various kinds of processing.

The image data handled in Embodiment 1 may also be image data other than RGB color model data, for example image data obtained by converting the RGB color model into a different color space such as YUV or YCbCr, or image data having four or more parameters (for example, image data having the four parameters RGBY). In that case, the functional means of the information processing apparatus 1A described below are configured according to the types and number of parameters.

[Functional Means of Information Processing Apparatus]
As shown in FIG. 1, the information processing apparatus 1A of Embodiment 1 includes, as functional means, a control unit 10, an image processing unit 11, a storage unit 12 as "storage means", a communication unit 13, a display unit 14, and an operation unit 15. The operation of the information processing apparatus 1A is described below as that of a single server computer, but the processing may instead be distributed among a plurality of computers.

The control unit 10 uses a processor such as a CPU (Central Processing Unit), memory, and the like to control the components of the apparatus and realize various functions. The image processing unit 11 uses a processor such as a GPU (Graphics Processing Unit) or a dedicated circuit, together with memory, to execute image processing in response to control instructions from the control unit 10. The control unit 10 and the image processing unit 11 may be configured as a single piece of hardware (SoC: System on a Chip) that integrates processors such as a CPU and GPU, memory, and also the storage unit 12 and the communication unit 13.

The storage unit 12 is any of various storage media, for example a hard disk or flash memory. The storage unit 12 stores an image processing program 1P, a CNN library 1L for DL (Deep Learning) that in particular provides the CNN function, and a converter library 2L. The storage unit 12 also stores information that is created for each training run and defines the CNN 114, the first converter 112, the second converter 113, and the inverse converter 115, as well as parameter information including the weight coefficients of each layer in the trained CNN 114.

The storage unit 12 also stores a conversion table 121. The conversion table 121 is read into the first converter 112 and used for arithmetic processing in the first converter 112 (described in detail in [Conversion table] below).

The communication unit 13 is a communication module that realizes a communication connection to a communication network such as the Internet. The communication unit 13 uses a network card, a wireless communication device, or a carrier communication module.

The display unit 14 uses a liquid crystal panel, an organic EL (Electro Luminescence) display, or the like. The display unit 14 can display images through processing in the image processing unit 11 as instructed by the control unit 10.

The operation unit 15 includes a user interface such as a keyboard or mouse. Physical buttons provided on the housing may be used, as may software buttons or the like displayed on the display unit 14. The operation unit 15 notifies the control unit 10 of information on operations performed by the user.

The reading unit 16 uses, for example, a disk drive and can read an image processing program 2P, a CNN library 3L, and a converter library 4L stored on a recording medium 2 such as an optical disc. The image processing program 1P, CNN library 1L, and converter library 2L stored in the storage unit 12 may be copies, made in the storage unit 12 by the control unit 10, of the image processing program 2P, CNN library 3L, and converter library 4L read from the recording medium 2 by the reading unit 16.

Based on the image processing program 1P stored in the storage unit 12, the control unit 10 of the information processing apparatus 1A functions as an image processing execution unit 101, which serves as a "learning execution unit". The image processing unit 11 uses memory to function as the CNN 114 (CNN engine) based on the CNN library 1L, definition data, and parameter information stored in the storage unit 12, and to function as the first converter 112 and the second converter 113 based on the converter library 2L and filter information. Depending on the types of the first converter 112 and the second converter 113, the image processing unit 11 may also function as the inverse converter 115.

[Functional Means of Image Processing Execution Unit]
As shown in FIG. 2, the image processing execution unit 101 includes, as functional means, an input unit 111; the first converter 112 as "conversion means" and "first nonlinear processing means"; the second converter 113 as "conversion means" and "second nonlinear processing means"; the CNN 114 as "data processing means"; the inverse converter 115 as "inverse conversion means"; and an output unit 116. Using these functional means, the image processing execution unit 101 supplies data to each of them and acquires the data output from each.

Specifically, the image processing execution unit 101 inputs the image data, which is the input data supplied to the input unit 111 in response to the user's operation of the operation unit 15, to the first converter 112, and inputs the image data output from the first converter 112 to the second converter 113. The image processing execution unit 101 inputs the data output from the second converter 113 to the CNN 114. The image processing execution unit 101 inputs the data output from the CNN 114 to the inverse converter 115 as necessary and inputs the data output from the inverse converter 115 to the output unit 116; the output unit 116 outputs this data as output data, which is stored in the storage unit 12. The image processing execution unit 101 may also pass the output data to the image processing unit 11 to be rendered as an image and output to the display unit 14.
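The data flow described above can be sketched as a simple chain of stages. This is only an illustration: the function `run_pipeline`, the type aliases, and the toy stand-in stages are hypothetical names introduced here, and each stage is reduced to a function on a single pixel rather than a trained network.

```python
from typing import Callable, List, Optional

Pixel = List[float]               # one value per channel, e.g. [R, G, B]
Stage = Callable[[Pixel], Pixel]  # any stage maps a pixel to a pixel


def run_pipeline(pixel: Pixel,
                 first_converter: Stage,
                 second_converter: Stage,
                 cnn: Stage,
                 inverse_converter: Optional[Stage] = None) -> Pixel:
    """Chain the stages in the order described for execution unit 101."""
    x = first_converter(pixel)   # per-channel nonlinear conversion (112)
    x = second_converter(x)      # cross-channel nonlinear conversion (113)
    x = cnn(x)                   # main CNN processing (114)
    if inverse_converter is not None:
        x = inverse_converter(x)  # applied only when needed (115)
    return x


# Toy stand-ins, purely illustrative (not the trained networks of the text)
def toy_first(p: Pixel) -> Pixel:
    return [v / 255.0 for v in p]

def toy_second(p: Pixel) -> Pixel:
    mean = sum(p) / len(p)
    return [mean] * len(p)

def toy_cnn(p: Pixel) -> Pixel:
    return list(p)

def toy_inverse(p: Pixel) -> Pixel:
    return [v * 255.0 for v in p]


result = run_pipeline([255.0, 0.0, 0.0], toy_first, toy_second, toy_cnn, toy_inverse)
```

Making the inverse converter an optional argument mirrors the statement that the output of the CNN 114 is passed to the inverse converter 115 only as necessary.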

The CNN 114 has multiple stages of convolutional layers and pooling layers defined by the definition data, together with a fully connected layer (see FIG. 7). It extracts features from the input data and performs classification based on the extracted features (described in detail in [CNN Configuration and Processing Procedure] below).

Like the CNN 114, the first converter 112 and the second converter 113 each include a convolutional layer and a multi-channel layer and perform nonlinear conversion on input data. Here, nonlinear conversion refers to processing that nonlinearly distorts input values, such as color space conversion or level correction. The inverse converter 115 includes a convolutional layer and a multi-channel layer and performs inverse conversion. It serves to undo the distortion introduced by the first converter 112 as the "first nonlinear processing means" and by the second converter 113 as the "second nonlinear processing means". However, the conversion performed by the inverse converter 115 is not limited to one that is symmetrical to the first converter 112 and the second converter 113.

[First converter]
FIGS. 3 and 4 schematically show the configuration of the first converter 112 of Embodiment 1.

The first converter 112 performs nonlinear conversion on the data separately for each channel. A channel here means the R value, G value, or B value (a color channel) in the image data of an RGB color model color image. In other words, this image data is three-channel data.

As shown in FIG. 4, the first converter 112 includes an R converter 112r, a G converter 112g, and a B converter 112b. The R converter 112r is composed of a first layer (input layer) 112r1 having one node; a second layer (intermediate processing layer) 112r2, which is a convolutional layer (CONV) having a plurality of nodes that form a dense layer; and a third layer (output layer) 112r3 having one node. The G converter 112g and the B converter 112b have the same configuration as the R converter 112r. That is, the G converter 112g has a first layer 112g1, a second layer 112g2, and a third layer 112g3, and the B converter 112b has a first layer 112b1, a second layer 112b2, and a third layer 112b3.

As shown in FIGS. 3 and 4, the second layer 112r2 of the R converter 112r, which is the intermediate processing layer, has, for example, 256 nodes 1120_001, 1120_002, ..., 1120_255, 1120_256. Processing accuracy scales with the number of nodes: more nodes give higher accuracy, but more nodes also require more computation. As shown in FIG. 3, the G converter 112g and the B converter 112b likewise each have 256 nodes 1120_001, 1120_002, ..., 1120_256.
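The cost side of this trade-off can be made concrete with a rough parameter count. The sketch below is illustrative only: it assumes each 1×1 convolution has one weight per input-output pair plus one bias per output node, which the text does not state, and the function name `converter_param_count` is hypothetical.

```python
def converter_param_count(hidden_nodes: int, in_nodes: int = 1, out_nodes: int = 1) -> int:
    """Weights and biases of an in -> hidden -> out stack of 1x1 convolutions.

    Assumes one bias per output node of each layer (not stated in the source).
    """
    first = in_nodes * hidden_nodes + hidden_nodes   # input layer -> intermediate layer
    second = hidden_nodes * out_nodes + out_nodes    # intermediate layer -> output layer
    return first + second


# One per-channel converter with 256 intermediate nodes, and three of them (R, G, B)
per_channel = converter_param_count(256)   # 769 under the stated assumptions
all_channels = 3 * per_channel             # 2307 for the three channels together
```

Under these assumptions the count for a one-input, one-output converter is 3h + 1 for h intermediate nodes, so the computation per sample grows linearly with the node count.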

The first converter 112 performs nonlinear conversion on its input, that is, processing that nonlinearly distorts the input sample values (conversion procedure, first nonlinear processing procedure). The second layers 112r2, 112g2, and 112b2 of the R converter 112r, G converter 112g, and B converter 112b of the first converter 112 are not limited to being configured as dense layers and may instead be configured as convolutional layers.

[Specific Configuration of First Converter]
FIG. 4 is a functional block diagram showing the specific configuration of the first converter 112 of Embodiment 1.

The R converter 112r of the first converter 112 has a first-layer node 112r1 as the input layer, a second layer 112r2 as the intermediate processing layer, and a third layer 112r3 as the output layer. In the second layer 112r2, convolution with 1×1 filters outputs the convolution results as 256 nodes 1121_001, 1121_002, ..., 1121_255, 1121_256; elu activation function processing is then applied, giving the outputs 1122_001, 1122_002, ..., 1122_255, 1122_256. The third layer 112r3, which is the output layer of the R converter 112r of the first converter 112, includes a convolution node 112r3_1 and an output node 112r3_2. The convolution node 112r3_1 convolves, with a 1×1 filter, the elu-processed outputs of the nodes 1122_001, 1122_002, ..., 1122_255, 1122_256 of the second layer 112r2 (the intermediate processing layer), and applies elu activation function processing to the convolution result. The output node 112r3_2 outputs the result of the processing in the convolution node 112r3_1.
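The per-channel structure described above (one input node, 256 intermediate nodes with elu, one output node with elu) can be sketched for a single pixel as follows. This is a minimal illustration, not the patented implementation: the class `ChannelConverter`, the helper `convert_pixel`, the random placeholder weights, the zero biases, and the assumption that channel values are scaled to the range 0 to 1 are all introduced here for illustration. In practice the weights would be obtained by training.

```python
import math
import random
from typing import List, Sequence


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit; alpha = 1.0 is an assumed default."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


class ChannelConverter:
    """1 -> hidden -> 1 stack applied to the value of a single channel."""

    def __init__(self, hidden: int = 256, seed: int = 0) -> None:
        rng = random.Random(seed)  # placeholder weights, not trained values
        self.w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-1.0, 1.0) / hidden for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, v: float) -> float:
        # Second layer: a 1x1 convolution fans the input out to `hidden` nodes, then elu
        h = [elu(w * v + b) for w, b in zip(self.w1, self.b1)]
        # Third layer: a 1x1 convolution collapses the nodes to one value, then elu
        s = sum(w * a for w, a in zip(self.w2, h)) + self.b2
        return elu(s)


def convert_pixel(pixel: Sequence[float],
                  converters: Sequence[ChannelConverter]) -> List[float]:
    """Apply a separate converter to each channel; channels never mix."""
    return [c.forward(v) for c, v in zip(converters, pixel)]


rgb_converters = [ChannelConverter(seed=s) for s in (1, 2, 3)]  # R, G, B
converted = convert_pixel([0.2, 0.5, 0.8], rgb_converters)
```

Because each channel has its own weights and sees only its own value, changing the G value of a pixel changes only the G output, which is the sense in which the conversion is performed separately for each channel.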

elu (Exponential Linear Unit) is an activation function, and using it allows data to be deformed nonlinearly. The first converter 112 uses elu as its activation function because, compared with other activation functions such as ReLU (described later), processing with elu deforms the curve of the input data (for example, a characteristic curve parameterized by the magnitude of the RGB values and the magnitude of lightness) more smoothly. That is, the shape of the curve after the activation function is applied does not change greatly from its shape before processing.
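For reference, the standard definitions of the two activation functions are given below. The parameter α is not specified in the text; α = 1 is a common default and is assumed here.

```latex
\[
\mathrm{elu}(x) =
\begin{cases}
x, & x > 0 \\
\alpha\,(e^{x} - 1), & x \le 0
\end{cases}
\qquad
\mathrm{ReLU}(x) = \max(0,\, x)
\]
```

With α = 1, the derivative of elu is continuous at x = 0 (both one-sided derivatives equal 1), whereas the derivative of ReLU jumps from 0 to 1 there. This is consistent with the smoother deformation attributed to elu above.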

Although not shown in FIGS. 3 and 4 and elsewhere, the G converter 112g and the B converter 112b of the first converter 112 have the same configuration as the R converter 112r.

In the R converter 112r of the first converter 112, at least one of the elu activation function processing units 1122_001, 1122_002, ..., 1122_255, 1122_256 of the second layer 112r2 and the elu activation function processing unit 112r3_2 of the third layer may be omitted, and any function other than the elu activation function may be used. The same applies to the G converter 112g and B converter 112b of the first converter 112, to the second converter 113, and to the first inverse conversion unit 115a and second inverse conversion unit 115b of the inverse converter 115.

In the R converter 112r, G converter 112g, and B converter 112b shown in FIGS. 3 and 4, the number of output channels (number of nodes) of the third layer, which is the output layer, equals the number of input channels, but this is not a limitation; the number may be decreased or increased. The same applies to the second converter 113, to the first inverse conversion unit 115a of the inverse converter 115, and to the R inverse conversion unit 115br, G inverse conversion unit 115bg, and B inverse conversion unit 115bb of the second inverse conversion unit 115b.

[Modification of Configuration of First Converter]
FIG. 5 is a functional block diagram outlining a modification of the configuration of the first converter 112 of Embodiment 1.

The figure outlines a modification of the R converter 112r of the first converter 112. In FIG. 5, the first converter 112 includes, in the third layer 112r3, a convolution node 112r3_4, a skip connection 112r3_5, and an activation function processing node 112r3_6. The convolution node 112r3_4 convolves the output of the second layer 112r2 with a 1×1 filter. The skip connection 112r3_3 inputs the data output from the first layer 112r1 to the third layer 112r3 without passing it through the processing of the second layer 112r2. The activation function processing node 112r3_6 adds the data processed by the convolution node 112r3_4 to the data supplied by the skip connection 112r3_3 and applies elu activation function processing to the sum. Providing the skip connection 112r3_3 makes it possible to appropriately avoid the vanishing gradient problem that can arise in machine learning.
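The modification described above amounts to adding the input value to the convolution result before the final elu. A minimal sketch for one channel value follows; the function name `forward_with_skip` and its parameter names are hypothetical, and the weights are passed in explicitly rather than trained.

```python
import math
from typing import Sequence


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit; alpha = 1.0 is an assumed default."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def forward_with_skip(v: float,
                      w1: Sequence[float], b1: Sequence[float],
                      w2: Sequence[float], b2: float) -> float:
    """Per-channel converter whose third layer adds a skip connection."""
    # Second layer: 1x1 convolution to the intermediate nodes, then elu
    hidden = [elu(w * v + b) for w, b in zip(w1, b1)]
    # Convolution node of the third layer: 1x1 convolution of the second-layer output
    conv_out = sum(w * h for w, h in zip(w2, hidden)) + b2
    # Activation node: add the first-layer output carried by the skip connection, then elu
    return elu(conv_out + v)


# With all third-layer weights at zero, only the skip path contributes
identity_like = forward_with_skip(0.5, [1.0, -1.0], [0.0, 0.0], [0.0, 0.0], 0.0)
```

In this sketch the output still depends directly on the input even when the convolution path contributes nothing, which gives the gradient a direct route back to the input and illustrates why a skip connection helps against vanishing gradients.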

Although not shown, similar skip connections can also be provided in the G converter 112g and the B converter 112b to obtain the same effect. The same applies to the first converter 112 in [Embodiment 2 of the Invention] through [Embodiment 8 of the Invention] described later.

[Second Converter]
FIGS. 3 and 6 schematically show the configuration of the second converter 113 of Embodiment 1.

The second converter 113 consists of first layers 1131r, 1131g, and 1131b having a plurality of nodes (for example, three); second layers 1132_001, 1132_002, ..., 1132_255, 1132_256, which serve as intermediate processing layers and perform convolution (CONV) with 1×1 filters; and third layers 1133_1, 1133_2, and 1133_3, which produce a three-channel output by convolution with 1×1 filters.

In Embodiment 1, the number of nodes (three) in the first layers 1131r, 1131g, 1131b and in the third layers 1133_1, 1133_2, 1133_3 of the second converter 113 equals the number of converters that make up the first converter 112 (the R converter 112r, the G converter 112g, and the B converter 112b). In other words, the number of nodes in these first and third layers corresponds to the three categories R, G, and B of color information in the RGB color model.

Note that the number of nodes in the first layers 1131r, 1131g, 1131b and the third layers 1133_1, 1133_2, 1133_3 of the second converter 113 need not equal the number of converters 112r, 112g, 112b that make up the first converter 112. In Embodiment 1, the first layers 1131r, 1131g, 1131b and the third layers 1133_1, 1133_2, 1133_3 of the second converter 113 have the same number of nodes, but the numbers may differ. Furthermore, the second layers 1132_001, 1132_002, ..., 1132_255, 1132_256 of the second converter 113 are not limited to a dense layer and may instead be, for example, a convolutional layer.
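The 3 → 256 → 3 structure of the second converter can be illustrated per pixel as follows. A 1×1 convolution looks at one pixel at a time, so each output channel is simply a weighted sum of all input channels at that pixel. The sketch below is hypothetical: the function names, the random weights, and the ELU activation in the intermediate layer are assumptions, not details taken from the specification.

```python
import math
import random


def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def conv1x1(pixel, weights, biases):
    """1x1 convolution at one pixel: every output channel mixes all input channels."""
    return [sum(w * v for w, v in zip(row, pixel)) + b
            for row, b in zip(weights, biases)]


def second_converter(rgb, w_hidden, b_hidden, w_out, b_out):
    """3 -> HIDDEN -> 3 nonlinear transformation of one RGB pixel (sketch)."""
    hidden = [elu(h) for h in conv1x1(rgb, w_hidden, b_hidden)]  # activation assumed
    return conv1x1(hidden, w_out, b_out)


# Hypothetical random weights; 256 intermediate nodes as in the description.
random.seed(0)
HIDDEN = 256
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(HIDDEN)]
b_hidden = [0.0] * HIDDEN
w_out = [[random.uniform(-1, 1) / HIDDEN for _ in range(HIDDEN)] for _ in range(3)]
b_out = [0.0] * 3

out = second_converter([0.2, 0.5, 0.8], w_hidden, b_hidden, w_out, b_out)
```

Unlike the per-channel first converter, every intermediate node here receives R, G, and B together, so the transformation acts on the color as a whole.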

[Inverse Transformer]
FIG. 3 schematically shows the configuration of the inverse transformer 115 of Embodiment 1.

The inverse transformer 115 includes a first inverse transform unit 115a and a second inverse transform unit 115b, the latter serving as the "first nonlinear processing means".

The first inverse transform unit 115a has the same configuration as the second converter 113 and performs the inverse of the transformation performed by the second converter 113 (inverse transformation procedure). Specifically, the first inverse transform unit 115a consists of first layers 115a1_1, 115a1_2, and 115a1_3 having a plurality of nodes (for example, three); second layers 115a2_001, 115a2_002, ..., 115a2_255, 115a2_256 configured as a dense layer (DENSE) having more nodes than the first layers; and third layers 115a3_1, 115a3_2, and 115a3_3 having fewer nodes than the second layers (for example three, the same number as the first layers 115a1_1, 115a1_2, and 115a1_3).

The second inverse transform unit 115b has the same configuration as the first converter 112 and performs the inverse of the transformation performed by the first converter 112 (inverse transformation procedure). The second inverse transform unit 115b applies a nonlinear transformation to the data separately for each channel. As with the first converter 112, the channels here are the R, G, and B values in the image data of a color image in the RGB color model.

Specifically, the second inverse transform unit 115b includes an R inverse transform unit 115br corresponding to the R converter 112r, a G inverse transform unit 115bg corresponding to the G converter 112g, and a B inverse transform unit 115bb corresponding to the B converter 112b. The R inverse transform unit 115br consists of a first layer 115br1 having one node; second layers 115br2_001, 115br2_002, ..., 115br2_256 configured as a dense layer with a plurality of nodes (here, 256); and a third layer 115br3 having one node. The G inverse transform unit 115bg and the B inverse transform unit 115bb are configured in the same way as the R inverse transform unit 115br, with first layers 115bg1 and 115bb1, second layers 115bg2_001, 115bg2_002, ..., 115bg2_256, and third layers 115bg3 and 115bb3.

Like the second converter 113, the first inverse transform unit 115a applies a nonlinear transformation to its input, nonlinearly distorting the input sample values. Similarly, like the R converter 112r, the G converter 112g, and the B converter 112b of the first converter 112, the R inverse transform unit 115br, the G inverse transform unit 115bg, and the B inverse transform unit 115bb of the second inverse transform unit 115b apply a nonlinear transformation to their inputs, nonlinearly distorting the input sample values (first nonlinear processing procedure).

As noted in [Functional Means of the Image Processing Execution Unit] above, the processing of the first inverse transform unit 115a is not necessarily the exact inverse of the processing of the second converter 113, and the processing of the second inverse transform unit 115b is not necessarily the exact inverse of the processing of the first converter 112.

When the output data of machine learning by the information processing apparatus 1A has the same format as the input data (for example, when image data is output in response to image data input), including the inverse transformer 115 allows more appropriate processing. When the output data has a different format from the input data (for example, when an image recognition result is output as characters, symbols, or similar data in response to image data input), the inverse transformer 115 is often unnecessary. Accordingly, depending on the type of data processed by the information processing apparatus 1A and the form in which results are output, the inverse transformer 115 of Embodiment 1 may be omitted from the information processing apparatus 1A (see [Embodiments 4, 5, and 7 of the Invention] and others described later).

[Conversion Table]
The R converter 112r, the G converter 112g, and the B converter 112b, which make up the first converter 112 of Embodiment 1, each use a conversion table 121 in their arithmetic processing. As shown in FIG. 2, the conversion table 121 is stored in the storage unit 12, and the first converter 112 reads it from the storage unit 12 and uses it for computation.

Specifically, the conversion table 121 records, for each of the converters 112r, 112b, and 112g, 256 operation patterns, 256 being the number of nodes in the second layers 1120_001, 1120_002, ..., 1120_255, 1120_256. Each of the converters 112r, 112b, and 112g uses the conversion table 121 to perform processing equivalent to the actual computation.

Processing with the conversion table 121 is possible because, in the configuration of Embodiment 1, the number of distinct operations in the R converter 112r, the G converter 112g, and the B converter 112b is effectively limited to the number of nodes. The operation patterns are therefore few and can easily be recorded as the conversion table 121.

The first converter 112 and the second converter 113 require convolution operations (binary operations). In the second converter 113, the values input to the second-layer nodes vary very widely, and it is difficult to build a table that covers all of those variations. By contrast, in the R converter 112r, the G converter 112g, and the B converter 112b that make up the first converter 112, and in the R inverse transform unit 115br, the G inverse transform unit 115bg, and the B inverse transform unit 115bb that make up the second inverse transform unit 115b, each of the first layers 112r1, 112g1, and 112b1 has a single node, so the second layers 1120_001, 1120_002, ..., 1120_255, 1120_256 operate on a single source value. Each node of the second layers therefore has few variations, and the computation results of each node can easily be tabulated. This reduces the computational cost of the R converter 112r, the G converter 112g, and the B converter 112b to almost zero.
When a table is used in the inverse transform units 115br, 115bg, and 115bb, the output of the inverse transform unit may be set to, for example, 256 gradations. One option is to set in the table the numerical value corresponding to each gradation together with its output value and to use the table entry closest to the given value. Another is to set in the table a numerical range for each gradation together with the output value for that range, search for the range that contains the input data value, and obtain the output value from it.
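The table-based shortcut can be sketched as follows. The specification does not spell out the table layout, so this example makes an assumption: samples are 8-bit, so a per-channel converter with a single input node can be evaluated once for each of the 256 possible input levels and replaced by a lookup. All names and weights are hypothetical. The second helper illustrates the "closest table entry" search described for the inverse transform units.

```python
import bisect
import math


def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def make_converter(w2, b2, w3, b3):
    """Build a 1 -> len(w2) -> 1 per-channel converter as a function of one sample."""
    def convert(x):
        hidden = [elu(w * x + b) for w, b in zip(w2, b2)]
        return sum(w * h for w, h in zip(w3, hidden)) + b3
    return convert


def build_table(convert, levels=256):
    """Evaluate the converter once per input level (8-bit input is an assumption)."""
    return [convert(level / (levels - 1)) for level in range(levels)]


def nearest_entry(value, sorted_values):
    """Index of the entry closest to value; sorted_values must be ascending."""
    i = bisect.bisect_left(sorted_values, value)
    if i == 0:
        return 0
    if i == len(sorted_values):
        return len(sorted_values) - 1
    before, after = sorted_values[i - 1], sorted_values[i]
    return i if after - value < value - before else i - 1


# Hypothetical weights, for illustration only.
convert = make_converter([0.5, -0.25, 1.0], [0.0, 0.1, -0.2], [1.0, 0.5, 0.25], 0.0)
table = build_table(convert)

pixel = 128
converted = table[pixel]  # a single lookup replaces the layer arithmetic
```

With a single input node, the table needs only one entry per input level, whereas the second converter would need an entry for every combination of its three inputs, which is why tabulation is practical only on the per-channel side.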

By performing the arithmetic processing of the R converter 112r, the G converter 112g, and the B converter 112b of Embodiment 1 with the conversion table 121, it is possible to provide an information processing apparatus 1A that, with a simple configuration, reliably keeps the processing load of the computation from becoming excessive. Moreover, even when the computational resources available to the CNN 114 are limited, using the first converter 112, which can be built with very few computational resources, can improve the accuracy of machine learning.

In particular, when the machine learning application of the information processing apparatus 1A of Embodiment 1 carries a heavy processing load, such as super-resolution (converting low-resolution image data to high resolution), the computational cost of the convolution operations accounts for a negligibly small share of the overall processing of the CNN 114. When the application carries a light processing load, such as image recognition, that share is high. Reducing the computational cost with the conversion table 121 is therefore particularly effective when the computation in the CNN 114 is lightweight.

[Configuration and Processing Procedure of the CNN]
FIG. 7 is a block diagram and time chart schematically showing the configuration and processing procedure (data processing procedure) of the CNN 114 of the information processing apparatus 1A of Embodiment 1.

As shown in FIG. 7, the CNN 114 has an input unit 1140 that receives data and an output unit 1147 that outputs data, together with a plurality of layers each consisting of a convolutional layer and a pooling layer (here five: a first layer 1141, a second layer 1142, a third layer 1143, a fourth layer 1144, and a fifth layer 1145) and one fully connected layer 1146. These layers schematically represent the configuration and processing of the CNN 114. The number of convolution and pooling layers may be more or fewer than five.

In the CNN 114 of Embodiment 1, the convolutional layer 1141_1 of the first layer 1141 first performs convolution using filters (not shown). This extracts the features of the image data (the features of the images and figures shown in the image data) and generates, for each filter, image data whose two-dimensional size is smaller than that of the original image data. The pooling layer 1141_2 then generates image data in which the two-dimensional size of the image data generated by the convolutional layer is further reduced.

In FIG. 7, the convolutional layer 1141_1 of the first layer 1141 generates 64 sets of convolution data using 64 types of filters, and the pooling layer 1141_2 generates new image data by reducing the two-dimensional size of those 64 sets of convolution data. In the second layer 1142, the convolutional layer 1142_1 applies 128 types of filters to the 64 types of image data generated in the first layer 1141 to generate 128 types of convolution data, and the pooling layer 1142_2 generates new image data by reducing the two-dimensional size of those 128 types of convolution data.

The third layer 1143, the fourth layer 1144, and the fifth layer 1145 perform the same processing. In the third layer 1143, the convolutional layer 1143_1 and the pooling layer 1143_2 generate 256 types of convolution data and new image data. In the fourth layer 1144 and the fifth layer 1145, the convolutional layers 1144_1 and 1145_1 and the pooling layers 1144_2 and 1145_2 generate 512 types of convolution data and new image data.

The fully connected layer 1146 converts the data processed by the first layer 1141 through the fifth layer 1145 into first-order (one-dimensional) data and recognizes the features of the image shown in each set of image data. The fully connected layer 1146 may apply the ReLU (Rectified Linear Unit) activation function and Batch Normalization. However, the fully connected layer 1146 may also use any activation function other than ReLU.
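The layer structure described above can be traced with a short shape calculation. The filter counts (64, 128, 256, 512, 512) come from the description; the 224×224 input size, the 3×3 unpadded filters, and the 2×2 pooling are assumptions chosen only to make the example concrete.

```python
# Number of filters in the convolutional layer of each of the five layers.
FILTERS = [64, 128, 256, 512, 512]


def trace_shapes(height, width, kernel=3, pool=2):
    """Return (channels, height, width) after each convolution + pooling layer.

    Assumes unpadded stride-1 convolution (which shrinks each side by
    kernel - 1) followed by non-overlapping pool x pool pooling.
    """
    shapes = []
    for channels in FILTERS:
        height, width = height - kernel + 1, width - kernel + 1  # convolution
        height, width = height // pool, width // pool            # pooling
        shapes.append((channels, height, width))
    return shapes


shapes = trace_shapes(224, 224)          # hypothetical 224x224 input image
channels, height, width = shapes[-1]
flattened = channels * height * width    # length of the vector fed to the fully connected layer

for index, shape in enumerate(shapes, start=1):
    print(f"layer {index}: {shape}")
print("flattened length:", flattened)
```

Each layer trades spatial resolution for channel depth, and the fully connected layer then operates on the flattened result.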

[Learning Procedure of the Information Processing Apparatus]
In the information processing apparatus 1A of Embodiment 1, the image processing execution unit 101 performs learning using the first converter 112, the second converter 113, and the inverse transformer 115 as part of a CNN that includes the CNN 114. Specifically, during learning, the image processing execution unit 101 performs processing that minimizes the error between the output data obtained by inputting training data to the entire CNN 114 and the known classification (output) of the training data, and updates the weights in the first converter 112, the second converter 113, or the inverse transformer 115. The parameters of the CNN 114 and the weights of the first converter 112 and the second converter 113 obtained through this learning are stored in the storage unit 12 as mutually corresponding parameters. When using the trained CNN 114, the image processing execution unit 101 uses the definition information that defines the CNN 114, the parameters stored in the storage unit 12, and the corresponding weights of the first converter 112 and the second converter 113. The input data is passed through the first converter 112 and the second converter 113, and the resulting data is input to the CNN 114. When the inverse transformer 115 is used, the definition information and parameters that define the trained CNN 114 obtained by learning, together with the corresponding weights, are used in the same way.

Placing the first converter 112 and the second converter 113 upstream of the stage in which the CNN 114 extracts features by convolution can further emphasize the features of the image data that are to be extracted. This is expected to improve the learning efficiency and learning accuracy of the CNN 114.

[Other Configurations]
In the hardware configuration of the information processing apparatus 1A of Embodiment 1, the communication unit 13, the display unit 14, the operation unit 15, and the reading unit 16 are not essential. For example, when the image processing program 1P, the CNN library 1L, and the converter library 2L stored in the storage unit 12 are obtained from an external server device (not shown) or the like, the communication unit 13 need not be used once they have been downloaded. Similarly, the reading unit 16 need not be used after the image processing program 1P, the CNN library 1L, and the converter library 2L have been read from an external storage medium (not shown). The communication unit 13 and the reading unit 16 may also be the same device using serial communication such as USB (Universal Serial Bus).

The configuration of the information processing apparatus 1A may also be distributed over a network (not shown). For example, the functions of the CNN 114, the first converter 112, the second converter 113, and the inverse transformer 115 described above may be provided on a Web server (not shown) on a network (not shown) and made available to a Web client device (not shown) that has a display unit and a communication unit. In this case, the communication unit 13 is used to receive requests from the Web client device (not shown) and to send the processing results.

The error used during learning should be an appropriate function chosen according to the input and output data and the purpose of learning, such as squared error, absolute error, or cross-entropy error. For example, when the output is a classification, cross-entropy error is used. Learning is not restricted to an error function; other criteria may also be used, allowing flexible operation. An external CNN (not shown) may also be used to evaluate the error function itself.
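For reference, the three error functions named above can be written as follows. These are the standard textbook definitions, shown here as means over the elements; the function names and the small `eps` guard against `log(0)` are choices made for this sketch.

```python
import math


def squared_error(target, output):
    """Mean squared error between two equal-length sequences."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


def absolute_error(target, output):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - o) for t, o in zip(target, output)) / len(target)


def cross_entropy_error(target, probabilities, eps=1e-12):
    """Cross-entropy between a one-hot (or soft) target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, probabilities))


# Image-to-image output (e.g. super-resolution): compare pixel values.
pixel_loss = squared_error([0.2, 0.4, 0.6], [0.25, 0.35, 0.6])

# Classification output: compare a one-hot label with predicted class probabilities.
class_loss = cross_entropy_error([0, 1, 0], [0.25, 0.5, 0.25])
```

Squared or absolute error suits outputs in the same format as the input, while cross-entropy suits outputs that are class probabilities.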

[Operation and Effects]
The information processing apparatus 1A of Embodiment 1 makes it easy to apply an appropriate correction when nonlinear correction is performed on input data or signals.

This is because the information processing apparatus 1A of Embodiment 1 provides the second converter 113 and the inverse transformer 115 before and after the CNN 114 to apply a nonlinear spatial transformation to the data input to the information processing apparatus 1A and, in addition, provides the first converter 112 upstream of the second converter 113 to apply nonlinear processing individually to the R data, G data, and B data that make up the image data, which can increase the features of the input image data.

With this configuration, the information processing apparatus 1A of Embodiment 1 can increase the features available to machine learning through the nonlinear transformation of the first converter 112, which can raise the recognition rate of machine learning or enable high-definition image generation.

One conceivable use of the processing of the information processing apparatus 1A of Embodiment 1 is applying processing such as gamma correction to color image data in the RGB color space.

Consider, for example, image data that has R, G, and B parameters for each pixel. Suppose a nonlinear correction such as gamma correction (a correction like an individual color space conversion) is to be applied to at least one of the R, G, and B values, for example the R value, and a nonlinear correction such as gamma correction is also to be applied to the RGB values as a whole. In that case, one of the converters that make up the first converter 112, for example the R converter 112r, can nonlinearly transform the R values in the image data, and the second converter 113 can nonlinearly transform the RGB values as a whole.

Such processing makes it possible to apply a correction such as a nonlinear transformation to some of the parameters that make up the image data (for example, the R parameter of RGB) and also to apply a correction such as a nonlinear transformation to all of those parameters. This makes it easy to apply multifaceted and accurate corrections to data and signals such as image data.
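The two-stage correction described above can be illustrated with fixed gamma curves standing in for the learned converters. This is only an analogy under stated assumptions: the gamma values are arbitrary, the function names are hypothetical, and the overall stage here treats each channel independently, whereas the second converter 113 can also mix the channels.

```python
def gamma_correct(value, gamma):
    """Gamma-style nonlinear correction of one sample in the range [0, 1]."""
    return value ** (1.0 / gamma)


def correct_pixel(rgb, r_gamma=2.2, overall_gamma=1.8):
    """Correct the R value alone, then correct all of R, G, and B together."""
    r, g, b = rgb
    r = gamma_correct(r, r_gamma)                       # stage 1: R channel only
    return tuple(gamma_correct(v, overall_gamma)        # stage 2: all three values
                 for v in (r, g, b))


corrected = correct_pixel((0.5, 0.5, 0.5))
```

Starting from equal R, G, and B values, the result has a larger R value than G or B, because only R passed through both stages.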

The configuration of Embodiment 1 is considered especially effective when a good result is sought by sequentially applying a transformation (such as a nonlinear transformation) to the data of a specific parameter of data or signals having a plurality of parameters and then a transformation (such as a nonlinear transformation) to the data of all of the parameters.

Note that when the processing load in the CNN 114 is increased, by increasing the number of convolutional and pooling layers in the CNN 114 or the number of convolution channels (the number of convolutions), the improvement in machine learning recognition rate obtained from per-channel nonlinear processing with the first converter 112 (such as nonlinear processing applied individually to the R data, G data, and B data) tends to fall short of expectations. The information processing apparatus 1A of Embodiment 1 is therefore considered most effective when the computation in the CNN 114 is lightweight. That is, even when the computational resources available to the CNN 114 are limited, the information processing apparatus 1A of Embodiment 1 can improve the accuracy of machine learning by using the first converter 112, which can be built with very few computational resources.

In the information processing apparatus 1A of Embodiment 1, the first converter 112 includes processing layer groups (the R converter 112r, the G converter 112g, and the B converter 112b), each consisting of at least three processing layers, and the second inverse transform unit 115b likewise includes processing layer groups (the R inverse transform unit 115br, the G inverse transform unit 115bg, and the B inverse transform unit 115bb), each consisting of at least three processing layers. Each processing layer group includes an input layer with one node, a second layer following the input layer that is a convolutional layer or dense layer with a plurality of nodes, and a third layer following the second layer that is a convolutional layer or dense layer with one node. One such group is provided for each channel of the data input to the convolutional neural network (the three color channels R, G, and B). As a result, for data having a plurality of channels and a plurality of parameters, nonlinear processing can be applied per channel and per parameter, further improving the accuracy of machine learning.

In the information processing apparatus 1A of Embodiment 1, the second layer of the first converter 112 and of the second inverse transform unit 115b consists of a plurality of layers, which further improves the accuracy of machine learning for multi-channel data such as the R, G, and B color channels.

By using the second converter 113, the information processing apparatus 1A of Embodiment 1 also applies a nonlinear transformation jointly across the multiple parameters of data such as R, G, and B values (whether all three RGB values or a subset, such as only the R and G values). This makes it easy to apply varied nonlinear processing and further improves the accuracy of machine learning.

By combining the first converter 112 and the second converter 113 to perform nonlinear transformation, the information processing apparatus 1A of Embodiment 1 can easily apply varied nonlinear processing.

By performing nonlinear transformation with the conversion table 121, the information processing apparatus 1A of Embodiment 1 can perform highly accurate machine learning while reducing the processing load.

Because the information processing apparatus 1A of Embodiment 1 includes the image processing execution unit 101, which learns the parameters of the convolutional neural network based on the results of convolution processing, it can perform highly accurate machine learning using the results of convolution processing on data suited to machine learning.

［変形例］
なお、この実施の形態１の情報処理装置１Ａは、下記に示す変形例のように構成することもできる。これらの構成をとることにより、データの内容や処理の内容に応じた適切な態様で、精度の高い機械学習を行うことが可能となる。 [Modification]
Note that the information processing apparatus 1A of the first embodiment can also be configured as in the modifications described below. These configurations make it possible to perform highly accurate machine learning in a manner suited to the content of the data and of the processing.

（変形例１）
ＣＮＮ１１４の前段に設けられる第一の変換器１１２や第二の変換器１１３の出力側のチャンネル数を、入力側のチャンネル数以上とすることができる。例えば、第１の変換器のＲ変換器１１２ｒの出力層で２チャンネル以上の出力を得るようにしても良い。Ｇ変換器１１２ｇ、Ｂ変換器１１２ｂも同様の構成とすることができる。これにより、第一の変換器１１２に入力されたＲＧＢの３チャンネルのデータは４チャンネル以上のデータとして出力される。 (Modification 1)
The number of channels on the output side of the first converter 112 and the second converter 113 provided in the preceding stage of the CNN 114 can be made equal to or greater than the number of channels on the input side. For example, outputs of two or more channels may be obtained in the output layer of the R converter 112r of the first converter. The G converter 112g and the B converter 112b can also have the same configuration. As a result, the RGB 3-channel data input to the first converter 112 is output as 4-channel or more data.
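
As a rough illustration of (Modification 1), the sketch below gives each per-channel converter two outputs, so that a three-channel RGB pixel leaves the converter stage as six channels. The two output functions (a square root and a square) and the function names are hypothetical stand-ins chosen only to keep the example short; the specification does not prescribe them.

```python
def root_curve(value):
    """Hypothetical first output of a per-channel converter."""
    return value ** 0.5

def square_curve(value):
    """Hypothetical second output of a per-channel converter."""
    return value * value

def expand_channels(pixel, converters):
    """Apply each channel's own list of output functions and concatenate the results.

    pixel:      one value per input channel, e.g. (r, g, b)
    converters: one list of functions per input channel
    """
    out = []
    for value, funcs in zip(pixel, converters):
        out.extend(f(value) for f in funcs)
    return out

# Each of the R, G, and B converters produces two output channels.
converters = [[root_curve, square_curve]] * 3
expanded = expand_channels((0.25, 0.5, 1.0), converters)
```

Each input channel is still processed independently of the others; only the number of outputs per channel changes.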

（変形例２）
ＣＮＮ１１４の前段に設けられる第一の変換器１１２や第二の変換器１１３の途中のチャンネル数を、入力側のチャンネル数以上とすることができる。例えば、Ｒ変換器１１２ｒの第１層１１２ｒ１から、図示された第２層１１２０_００１，・・・１１２０_２５６とは別系統の第２層（図示せず）にもデータを送る構成とできる。Ｇ変換器１１２ｇ、Ｂ変換器１１２ｂも同様の構成とすることができる。これにより、入力されたＲＧＢの３チャンネルのデータを第１の変換器１１２内で４チャンネル以上のデータとして処理を行える。 (Modification 2)
The number of channels at an intermediate point within the first converter 112 and the second converter 113 provided before the CNN 114 can be made equal to or greater than the number of channels on the input side. For example, the first layer 112r1 of the R converter 112r can also send data to a second layer (not shown) in a branch separate from the illustrated second layer 1120_001, ... 1120_256. The G converter 112g and the B converter 112b can be configured in the same way. As a result, the input three-channel RGB data can be processed as data of four or more channels within the first converter 112.

（変形例３）
ＣＮＮ１１４の前段に設けられる第一の変換器１１２や第二の変換器１１３の中間処理層を多層化することができる。例えば第一の変換器１１２のＲ変換器１１２ｒの中間処理層を、第２層１１２０_００１，・・・１１２０_２５６の後や前に第２層α、第２層βのような構成（第２層の個々のノードの前後に連続した別のノード）を設けた構成とすることができる。Ｇ変換器１１２ｇ、Ｂ変換器１１２ｂも同様の構成とすることができる。 (Modification 3)
The intermediate processing layers of the first converter 112 and the second converter 113 provided before the CNN 114 can be made multi-layered. For example, the intermediate processing layer of the R converter 112r of the first converter 112 can be given additional layers, such as a second layer α and a second layer β, placed after or before the second layer 1120_001, ... 1120_256 (that is, further nodes connected in series before or after each individual node of the second layer). The G converter 112g and the B converter 112b can be configured in the same way.

（変形例４）
ＣＮＮ１１４の後段に設けられる逆変換器１１５の入力側のチャンネル数を、出力側のチャンネル数以上とすることができる。例えば、逆変換器１１５に入力されるデータを４チャンネル以上とし、出力されるデータをＲＧＢの３チャンネルとすることができる。 (Modification 4)
The number of channels on the input side of the inverse converter 115 provided after the CNN 114 can be made equal to or greater than the number of channels on the output side. For example, the data input to the inverse converter 115 can have four or more channels, and the output data can have the three RGB channels.

（変形例５）
ＣＮＮ１１４の後段に設けられる逆変換器１１５の中間処理層のチャンネル数を、入力側のチャンネル数以上とすることができる（上記（変形例２）の構成を逆変換器１１５の第一の逆変換部１１５ａや第二の逆変換部１１５ｂに適用した構成となる。）。 (Modification 5)
The number of channels in the intermediate processing layer of the inverse converter 115 provided after the CNN 114 can be made equal to or greater than the number of channels on the input side (this applies the configuration of (Modification 2) above to the first inverse conversion unit 115a and the second inverse conversion unit 115b of the inverse converter 115).

（変形例６）
ＣＮＮ１１４の後段に設けられる逆変換器１１５の中間処理層を多層化することができる。（上記（変形例３）の構成を逆変換器１１５の第一の逆変換部１１５ａや第二の逆変換部１１５ｂに適用した構成となる。）。 (Modification 6)
The intermediate processing layers of the inverse converter 115 provided after the CNN 114 can be made multi-layered (this applies the configuration of (Modification 3) above to the first inverse conversion unit 115a and the second inverse conversion unit 115b of the inverse converter 115).

（変形例７）
第一の変換器１１２のＲ変換器１１２ｒ、Ｇ変換器１１２ｇ、Ｂ変換器１１２ｂの少なくとも何れか一つを、１チャンネル入力１チャンネル出力ではなく、多チャンネル入力や、多チャンネル出力とすることもできる。例えばＲ変換器１１２ｒの第１層１１２ｒ１、第３層１１２ｒ３を２つ以上のノードとして構成することもできる。このように構成しても、Ｒ変換器１１２ｒ、Ｇ変換器１１２ｇ、Ｂ変換器１１２ｂがそれぞれ独立したデータ処理を行う構成が維持されていれば図１に示す第一の変換器１１２の機能は実現できる。ただし、入力側（第１層１１２ｒ１，１１２ｇ１，１１２ｂ１）が１チャンネルの場合のみ、変換テーブル１２１を適用した演算が事実上可能である。 (Modification 7)
At least one of the R converter 112r, the G converter 112g, and the B converter 112b of the first converter 112 may have multi-channel input or multi-channel output instead of one-channel input and one-channel output. For example, the first layer 112r1 and the third layer 112r3 of the R converter 112r can each be configured with two or more nodes. Even in this configuration, the function of the first converter 112 shown in FIG. 1 can be realized as long as the R converter 112r, the G converter 112g, and the B converter 112b each continue to process data independently. However, computation using the conversion table 121 is practical only when the input side (first layers 112r1, 112g1, 112b1) has one channel.
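
The restriction on the conversion table in (Modification 7) can be made concrete with a short calculation. Assuming each input channel takes one of 256 discrete levels (8-bit data, which is an assumption of this sketch), a table must hold one entry for every combination of input values, so its size grows exponentially with the number of input channels. The function name is hypothetical.

```python
def table_entries(levels, in_channels):
    """Number of entries a lookup table needs to cover every input combination."""
    if levels < 1 or in_channels < 1:
        raise ValueError("levels and in_channels must be positive")
    return levels ** in_channels

one_channel = table_entries(256, 1)     # 256 entries
two_channels = table_entries(256, 2)    # 65,536 entries
three_channels = table_entries(256, 3)  # 16,777,216 entries
```

With one input channel the table is small enough to keep in memory and index directly; with two or three input channels the table grows by factors of 256, which is consistent with the statement that table-based computation is practical only for a one-channel input side.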

（変形例８）
第二の変換器１１３は、入力側のチャンネルと出力側のチャンネルが、元のチャンネル数と同一でなくてもよい。たとえば、第二の変換器１１３の第１層１１３１ｒ，１１３１ｇ，１１３１ｂや、第３層１１３３_１，１１３３_２，１１３３_３は、チャンネル数が３つよりも多くても少なくてもよい。即ち、入力部１１１に入力された画像データのＲＧＢ３チャンネルよりもそれらのチャンネル数が多くても少なくてもよい。 (Modification 8)
In the second converter 113, the numbers of input-side and output-side channels need not be the same as the original number of channels. For example, the first layers 1131r, 1131g, 1131b and the third layers 1133_1, 1133_2, 1133_3 of the second converter 113 may have more or fewer than three channels. That is, the number of those channels may be greater or less than the three RGB channels of the image data input to the input unit 111.

（変形例９）
第一の変換器１１２の第２層や第二の逆変換部１１５ｂの第２層は、１層であってもよい。このように構成することで、処理負荷を軽減させたり処理速度を向上させることが可能となる。 (Modification 9)
The second layer of the first converter 112 and the second layer of the second inverse conversion unit 115b may each be a single layer. This configuration can reduce the processing load and improve the processing speed.

（変形例１０）
図５に示したように第一の変換器１１２に適用したスキップコネクションを逆変換器１１５で適用しても良い。またスキップコネクションのストリーム数は１に限るものではなく、各中間処理層の一の処理出力をスキップコネクションにより出力し、該出力と中間処理層の他の処理出力と合成するストリームと、入力層からのデータと前記中間処理層出力と合成するストリームなど、複数のストリームで構成しても良い。 (Modification 10)
The skip connection applied to the first converter 112 as shown in FIG. 5 may also be applied to the inverse converter 115. The number of skip-connection streams is not limited to one. Multiple streams may be used, for example a stream in which one processing output of each intermediate processing layer is routed through a skip connection and combined with another processing output of the intermediate processing layer, and a stream in which data from the input layer is combined with the intermediate processing layer output.
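
The two kinds of streams mentioned in (Modification 10) can be sketched as follows. The sketch assumes that "combining" means element-wise addition and uses two hypothetical stand-in stages (`stage_a`, `stage_b`) in place of learned layers; neither assumption comes from this specification.

```python
def stage_a(x):
    """Hypothetical first intermediate processing stage."""
    return 2.0 * x

def stage_b(h):
    """Hypothetical second intermediate processing stage."""
    return h * h

def converter_with_skips(x):
    """One-channel converter with two skip-connection streams.

    Stream 1: the output of stage_a bypasses stage_b and is added to stage_b's output.
    Stream 2: the input-layer data bypasses all stages and is added to the result.
    """
    h1 = stage_a(x)
    h2 = stage_b(h1) + h1   # stream 1: intermediate output combined with a later output
    return h2 + x           # stream 2: input data combined with the intermediate output

result = converter_with_skips(0.5)
```

For the input 0.5, `stage_a` gives 1.0, stream 1 gives 1.0 + 1.0 = 2.0, and stream 2 adds the original 0.5, so `result` is 2.5.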

なお、上記（変形例１）～（変形例１０）の構成は、以下の［発明の実施の形態２］～［発明の実施の形態８］にも適用可能である。 The configurations of (Modification 1) to (Modification 10) are also applicable to the following [Embodiment 2 of the invention] to [Embodiment 8 of the invention].

［発明の実施の形態２］
図８は、この発明の実施の形態２の情報処理装置１Ｂの第一の変換器１１２の構成を示す機能ブロック図である。 [Embodiment 2 of the invention]
FIG. 8 is a functional block diagram showing the configuration of first converter 112 of information processing apparatus 1B according to Embodiment 2 of the present invention.

この実施の形態２の情報処理装置１Ｂは、計算量を増やしてでも精度を高めたい場合に適用される。 The information processing apparatus 1B of the second embodiment is applied when it is desired to increase the accuracy even if the amount of calculation is increased.

具体的には、この実施の形態２の情報処理装置１Ｂは、第一の変換器１１２、第二の変換器１１３、ＣＮＮ１１４、及び逆変換器１１５の基本的な構成は実施の形態１の情報処理装置１Ａと同じだが（図２参照）、それぞれの第２層１１２０_００１，１１２０_００２，・・・１１２０_５１１，１１２０_５１２のノード数が５１２ノードとなっている。 Specifically, in the information processing apparatus 1B of the second embodiment, the basic configuration of the first converter 112, the second converter 113, the CNN 114, and the inverse converter 115 is the same as in the information processing apparatus 1A of the first embodiment (see FIG. 2), but each second layer 1120_001, 1120_002, ... 1120_511, 1120_512 has 512 nodes.

なお、情報処理装置１Ｂの第２層１１２０_００１，１１２０_００２，・・・１１２０_５１１，１１２０_５１２のノード数は、適宜増減可能である。これは、情報処理装置１Ｂの第一の変換器１１２、逆変換器１１５の第一の逆変換部１１５ａ、第二の逆変換部１１５ｂ（図３参照）においても同じである。また、このようなノード数の調整は、この実施の形態２以外のこの発明の全ての実施の形態にも同様に適用できる。 The number of nodes in the second layer 1120_001, 1120_002, ... 1120_511, 1120_512 of the information processing apparatus 1B can be increased or decreased as appropriate. The same applies to the first converter 112 of the information processing apparatus 1B and to the first inverse conversion unit 115a and the second inverse conversion unit 115b of the inverse converter 115 (see FIG. 3). Such adjustment of the number of nodes can likewise be applied to all embodiments of the present invention other than the second embodiment.

この実施の形態２においては、入力されたデータを精度良く処理することが可能となる。 In the second embodiment, input data can be processed with high accuracy.
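
The trade-off between node count and computation in this embodiment can be estimated with a small calculation. The sketch assumes that each per-channel converter is a pair of dense layers (one input node, N intermediate nodes, one output node) with bias terms; that layer type and the function name are assumptions made for illustration only.

```python
def converter_parameter_count(hidden_nodes, use_bias=True):
    """Learnable parameters of a 1 -> hidden_nodes -> 1 dense converter."""
    if hidden_nodes < 1:
        raise ValueError("hidden_nodes must be positive")
    # One weight per node on the way in, one weight per node on the way out.
    weights = hidden_nodes + hidden_nodes
    # One bias per intermediate node plus one for the single output node.
    biases = hidden_nodes + 1 if use_bias else 0
    return weights + biases

params_256 = converter_parameter_count(256)  # 256-node second layer
params_512 = converter_parameter_count(512)  # 512-node second layer
```

Under these assumptions the count is 3N + 1, so moving from 256 to 512 intermediate nodes roughly doubles the parameters and multiply-accumulate operations of each per-channel converter.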

［発明の実施の形態３］
図９は、この発明の実施の形態３の情報処理装置１Ｃの画像処理部１１の一部を示す機能ブロック図である。この情報処理装置１Ｃの画像処理部１１は、第二の変換器１１３が存在しないこと以外は実施の形態１の情報処理装置１Ａと同じ構成である。この場合、逆変換器１１５は第二の変換器１１３に対応する第一の逆変換部１１５ａを設けない構成にもできる。 [Embodiment 3 of the invention]
FIG. 9 is a functional block diagram showing part of the image processing section 11 of the information processing apparatus 1C according to Embodiment 3 of the present invention. The image processing unit 11 of this information processing apparatus 1C has the same configuration as that of the information processing apparatus 1A of the first embodiment except that the second converter 113 does not exist. In this case, the inverse converter 115 can be configured without the first inverse converter 115 a corresponding to the second converter 113 .

このような構成とすることにより、複数のパラメータを一度に用いた空間変換で非線形処理を行う必要のない場合において、適切な処理を行うことが可能となる。 With such a configuration, it is possible to perform appropriate processing when there is no need to perform nonlinear processing by spatial transformation using a plurality of parameters at once.

［発明の実施の形態４］
図１０は、この発明の実施の形態４の情報処理装置１Ｄの画像処理部１１の一部を示す機能ブロック図である。この情報処理装置１Ｄの画像処理部１１は、逆変換器１１５が存在しないこと以外は実施の形態１の情報処理装置１Ａと同じ構成である。 [Embodiment 4 of the invention]
FIG. 10 is a functional block diagram showing part of the image processing section 11 of the information processing apparatus 1D according to Embodiment 4 of the present invention. The image processing unit 11 of this information processing apparatus 1D has the same configuration as that of the information processing apparatus 1A of the first embodiment except that the inverse transformer 115 is not present.

このような構成は出力データが非線形変換処理を必要としない場合に用いられる。 Such a configuration is used when the output data does not require nonlinear transformation processing.

なお、この実施の形態４の情報処理装置１Ｄの変形例として、実施の形態１の情報処理装置１ＡのＲ逆変換部１１５ｂｒ、Ｇ逆変換部１１５ｂｇ、Ｂ逆変換部１１５ｂｂのうちの１つないし２つが存在しない構成とすることもできる。 As a modification of the information processing apparatus 1D of the fourth embodiment, one or two of the R inverse conversion unit 115br, the G inverse conversion unit 115bg, and the B inverse conversion unit 115bb of the information processing apparatus 1A of the first embodiment may be omitted.

［発明の実施の形態５］
図１１は、この実施の形態５の情報処理装置１Ｅの画像処理部１１の一部を示す機能ブロック図である。この情報処理装置１Ｅの画像処理部１１は、第二の変換器１１３と逆変換器１１５が存在しないこと以外は実施の形態１の情報処理装置１Ａと同じである。 [Embodiment 5 of the invention]
FIG. 11 is a functional block diagram showing part of the image processing section 11 of the information processing apparatus 1E of the fifth embodiment. The image processing unit 11 of this information processing apparatus 1E is the same as the information processing apparatus 1A of Embodiment 1 except that the second converter 113 and the inverse converter 115 are not present.

［発明の実施の形態６］
図１２は、この実施の形態６の情報処理装置１Ｆの画像処理部１１の一部を示す機能ブロック図である。この情報処理装置１Ｆの画像処理部１１は、第一の変換器１１２と第二の変換器１１３が逆に接続されている点が実施の形態１の情報処理装置１Ａと相違する。なお、図示しないが、逆変換器１１５を構成する第一の逆変換部１１５ａと第二の逆変換部１１５ｂが実施の形態１の情報処理装置１Ａと逆に接続されていてもよい。 [Embodiment 6 of the invention]
FIG. 12 is a functional block diagram showing part of the image processing unit 11 of the information processing apparatus 1F of the sixth embodiment. The image processing unit 11 of the information processing apparatus 1F differs from that of the information processing apparatus 1A of the first embodiment in that the first converter 112 and the second converter 113 are connected in the reverse order. Although not shown, the first inverse conversion unit 115a and the second inverse conversion unit 115b that constitute the inverse converter 115 may also be connected in the reverse order relative to the information processing apparatus 1A of the first embodiment.

このように構成することで、第二の変換器１１３による空間処理を先に行って空間処理を強調したい場合や、第一の変換器１１２による個々のパラメータの処理を後から行ってパラメータ毎の処理を強調したい場合等に、適切な処理を行うことが可能となる。なお、この情報処理装置１Ｆにおいて逆変換器１１５を設けない構成とすることもできる。 This configuration enables appropriate processing when, for example, it is desired to emphasize the spatial processing by performing the spatial processing of the second converter 113 first, or to emphasize per-parameter processing by performing the individual-parameter processing of the first converter 112 afterward. The information processing apparatus 1F may also be configured without the inverse converter 115.
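
Why the connection order matters can be seen from a minimal sketch in which the two converters are replaced by simple stand-ins: a per-channel squaring for the first converter 112 and a pairwise channel average for the second converter 113. Both stand-ins and all function names are hypothetical and are far simpler than learned converters; the sketch only shows that the two stages do not commute.

```python
def per_channel_stage(pixel):
    """Stand-in for the first converter 112: each channel is converted independently."""
    return [v * v for v in pixel]

def cross_channel_stage(pixel):
    """Stand-in for the second converter 113: each output mixes two input channels."""
    r, g, b = pixel
    return [(r + g) / 2, (g + b) / 2, (b + r) / 2]

def run_stages(stages, pixel):
    """Apply the stages in the given order."""
    for stage in stages:
        pixel = stage(pixel)
    return pixel

pixel = [0.0, 1.0, 0.5]
first_then_second = run_stages([per_channel_stage, cross_channel_stage], pixel)
second_then_first = run_stages([cross_channel_stage, per_channel_stage], pixel)
```

The first ordering yields `[0.5, 0.625, 0.125]` and the reversed ordering yields `[0.25, 0.5625, 0.0625]`, so choosing which converter comes first changes the data that reaches the CNN 114.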

［発明の実施の形態７］
図１３は、この実施の形態７の情報処理装置１Ｇの画像処理部１１の一部を示す機能ブロック図である。この情報処理装置１Ｇの画像処理部１１は、実施の形態６の情報処理装置１Ｆにおける逆変換器１１５が設けられていない構成である。このように構成することで、実施の形態６の情報処理装置１Ｆによって適切な処理が行われるデータにおいて、逆変換が必要でない場合に、適切な処理を行うことができる。 [Embodiment 7 of the invention]
FIG. 13 is a functional block diagram showing part of the image processing section 11 of the information processing apparatus 1G of the seventh embodiment. The image processing unit 11 of the information processing apparatus 1G does not include the inverse transformer 115 of the information processing apparatus 1F of the sixth embodiment. With such a configuration, appropriate processing can be performed on data to be appropriately processed by the information processing apparatus 1F according to the sixth embodiment when inverse transformation is not required.

［発明の実施の形態８］
また、図示しないが、この実施の形態の情報処理装置においては、実施の形態１の情報処理装置１Ａの構成において、ＣＮＮ１１４の前段に第一の変換器１１２、第二の変換器１１３の何れも設けられていない構成とすること、及び／又は、ＣＮＮ１１４の後段に第一の変換器１１２や第二の変換器１１３を設ける構成とすること、もできる。 [Embodiment 8 of the invention]
Although not shown, in the information processing apparatus of this embodiment, in the configuration of the information processing apparatus 1A of Embodiment 1, both the first converter 112 and the second converter 113 are placed before the CNN 114. A configuration in which they are not provided and/or a configuration in which the first converter 112 and the second converter 113 are provided after the CNN 114 is also possible.

なお、上記各実施の形態は本発明の例示であり、本発明が上記各実施の形態のみに限定されるものではないことは、いうまでもない。 It goes without saying that the above embodiments are examples of the present invention, and the present invention is not limited only to the above embodiments.

［実施例］
以下、この発明の実施例について説明する。 [Example]
Examples of the present invention will be described below.

図１４に、この発明の実施例を示す。図１４の（Ａ）が従来例１としての画像処理部１１の構成の一部を示す機能ブロック図である。この画像処理部１１では、入力されたデータをＣＮＮ１１４に直接入力している。 FIG. 14 shows an example of the present invention. FIG. 14(A) is a functional block diagram showing part of the configuration of the image processing unit 11 as Conventional Example 1. In this image processing unit 11, the input data is input directly to the CNN 114.

図１４の（Ｂ）が従来例２としての画像処理部１１の構成の一部を示す機能ブロック図である。この画像処理部１１では、入力データを第二の変換器１１３に入力したのちＣＮＮ１１４に入力している。 FIG. 14(B) is a functional block diagram showing part of the configuration of the image processing unit 11 as Conventional Example 2. In this image processing unit 11, the input data is input to the second converter 113 and then to the CNN 114.

図１４の（Ｃ）が本件発明としての画像処理部１１の構成の一部を示す機能ブロック図である。この画像処理部１１では、入力データを第一の変換器１１２に入力したのちにＣＮＮ１１４に入力している。 FIG. 14C is a functional block diagram showing part of the configuration of the image processing section 11 as the present invention. In this image processing unit 11 , the input data is input to the first converter 112 and then to the CNN 114 .

この実施例では、１０種類の絵（飛行機、自動車、鳥、猫、しか、犬、かえる、馬、船、トラック）が示された画像データを画像処理部で識別させる実験を行った。具体的には、画像処理部に上述の１０種類の絵を学習させたのち、認識対象の画像を画像処理部に読み込ませ、読み込んだ画像が１０種類の絵のうちのどれに該当するかを認識させたのち、それぞれの絵に相当するシンボルを出力させて解答させる実験を行った。 In this example, an experiment was conducted in which the image processing unit identified image data showing ten kinds of pictures (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck). Specifically, the image processing unit was first trained on the ten kinds of pictures. It then read an image to be recognized, determined which of the ten kinds of pictures the image corresponded to, and answered by outputting the symbol corresponding to that picture.

この実験は、機械学習モデルとしてＶＧＧ１６を改変したものを用い、データセットとしてＣＩＦＡＲ－１０を利用し、読み込んだ絵の数に対して正答の数を出し、validity accuracy（正答率）（％）を検証した。 In this experiment, a modified version of VGG16 was used as the machine learning model and CIFAR-10 was used as the data set. The number of correct answers was counted against the number of pictures read, and the validity accuracy (percentage of correct answers, %) was verified.
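
The percentage of correct answers described above is the number of correct answers divided by the number of pictures read, expressed in percent. A minimal sketch of that calculation follows; the function name and the four sample labels are hypothetical and are not results from the experiment.

```python
def accuracy_percent(predicted, expected):
    """Percentage of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one sample is required")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)

# Hypothetical labels for four images (not experimental data).
predicted = ["cat", "dog", "ship", "frog"]
expected = ["cat", "dog", "truck", "frog"]
score = accuracy_percent(predicted, expected)
```

Here three of the four hypothetical answers match, so `score` is 75.0.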

なお、図１４に示すとおり、各画像処理部１１には逆変換器を設けていない。これは、画像データの入力に対してシンボルを出力する構成であり、逆変換器が設けられていては認識精度が低下すると考えられたためである。 As shown in FIG. 14, each image processing unit 11 is not provided with an inverse converter. This is because it is configured to output symbols in response to the input of image data, and it was thought that the recognition accuracy would be lowered if an inverse transformer was provided.

実験の結果を下記の（表）に示す。 The results of the experiment are shown in the (table) below.

この表に示すとおり、従来例１、従来例２に比べ、本件発明は改善された正答率が得られている。よって、本件発明は、従来例に比べて高い認識率が得られることがわかる。なお、正答率の改善は１％未満と僅かではあるが、機械学習においては僅かであっても正答率を向上させることは重要な課題である。 As shown in this table, the present invention achieves an improved percentage of correct answers compared with Conventional Example 1 and Conventional Example 2. It can therefore be seen that the present invention obtains a higher recognition rate than the conventional examples. Although the improvement in the percentage of correct answers is slight, at less than 1%, improving the percentage of correct answers even slightly is an important issue in machine learning.

１Ａ，１Ｂ，１Ｃ，１Ｄ，１Ｅ，１Ｆ，１Ｇ，１Ｈ，１Ｊ，１Ｋ・・情報処理装置
１２・・・記憶部（記憶手段）
１２１・・・変換テーブル
１０１・・・画像処理実行部（学習実行部）
１１２・・・第一の変換器（変換手段、第一の非線形処理手段）
１１３・・・第二の変換器（変換手段、第二の非線形処理手段）
１１４・・・ＣＮＮ（データ処理手段）
１１５・・・逆変換器（逆変換手段）
１１２ｒ１，１１２ｇ１，１１２ｂ１，１１３１ｒ，１１３１ｇ，１１３１ｂ，１１５ａ１_１，１１５ａ１_２，１１５ａ１_３，１１５ｂｒ１，１１５ｂｇ１，１１５ｂｂ１・・・第１層（入力層）
１１２０_００１，１１２０_００２，・・・１１２０_２５５，１１２０_２５６，１１３２_００１，・・・１１３２_２５６，１１５ａ２_００１，１１５ａ２_００２，・・・１１５ａ２_２５５，１１５ａ２_２５６，１１５ｂｒ２_００１，１１５ｂｒ２_００２，・・・１１５ｂｒ２_２５５，１１５ｂｒ２_２５６，１１５ｂｇ２_００１，１１５ｂｇ２_００２，・・・１１５ｂｇ２_２５５，１１５ｂｇ２_２５６，１１５ｂｂ２_００１，１１５ｂｂ２_００２，・・・１１５ｂｂ２_２５５，１１５ｂｂ２_２５６・・・第２層（中間処理層）
１１２ｒ３，１１２ｇ３，１１２ｂ３，１１３３ｒ，１１３３ｇ，１１３３ｂ，１１５ａ３_１，１１５ａ３_２，１１５ａ３_３，１１５ｂｒ１，１１５ｂｇ３，１１５ｂｂ３・・・第３層（出力層）
1A, 1B, 1C, 1D, 1E, 1F, 1G, 1H, 1J, 1K ... Information processing apparatus
12 ... Storage unit (storage means)
121 ... Conversion table
101 ... Image processing execution unit (learning execution unit)
112 ... First converter (conversion means, first nonlinear processing means)
113 ... Second converter (conversion means, second nonlinear processing means)
114 ... CNN (data processing means)
115 ... Inverse converter (inverse conversion means)
112r1, 112g1, 112b1, 1131r, 1131g, 1131b, 115a1_1, 115a1_2, 115a1_3, 115br1, 115bg1, 115bb1 ... First layer (input layer)
1120_001, 1120_002, ... 1120_255, 1120_256, 1132_001, ... 1132_256, 115a2_001, 115a2_002, ... 115a2_255, 115a2_256, 115br2_001, 115br2_002, ... 115br2_255, 115br2_256, 115bg2_001, 115bg2_002, ... 115bg2_255, 115bg2_256, 115bb2_001, 115bb2_002, ... 115bb2_255, 115bb2_256 ... Second layer (intermediate processing layer)
112r3, 112g3, 112b3, 1133r, 1133g, 1133b, 115a3_1, 115a3_2, 115a3_3, 115br1, 115bg3, 115bb3 ... Third layer (output layer)

Claims

1. An information processing apparatus comprising data processing means that includes a convolutional neural network having a convolutional layer and performs convolution processing on data having a plurality of channels, the information processing apparatus further comprising:
conversion means for performing nonlinear conversion on data input to the information processing apparatus and inputting the converted data to the data processing means;
and/or
inverse conversion means for performing nonlinear conversion on data output from the data processing means and outputting the converted data from the information processing apparatus,
wherein the conversion means and/or the inverse conversion means comprises first nonlinear processing means for performing the nonlinear conversion on the data individually for each channel.

2. The information processing apparatus according to claim 1, wherein the conversion means and/or the inverse conversion means
comprises a processing layer group consisting of at least three processing layers,
the processing layer group includes an input layer having one node, an intermediate processing layer that is provided after the input layer and is a convolutional layer or a dense layer having a plurality of nodes, and an output layer that is provided after the intermediate processing layer and is a convolutional layer or a dense layer having one or more nodes, and
the processing layer group is provided for each channel of the data input to the convolutional neural network.

3. The information processing apparatus according to claim 2, wherein the intermediate processing layer consists of one layer.

4. The information processing apparatus according to claim 2, wherein the intermediate processing layer consists of a plurality of layers.

5. The information processing apparatus according to claim 1, wherein the conversion means and/or the inverse conversion means further comprises second nonlinear processing means for performing the nonlinear conversion by combining a plurality of the channels.

6. The information processing apparatus according to claim 1, further comprising storage means for storing a conversion table in which the conversion mode used by the first nonlinear processing means is recorded,
wherein the first nonlinear processing means performs the nonlinear conversion using the conversion table obtained from the storage means.

7. The information processing apparatus according to claim 1, wherein the conversion means and/or the inverse conversion means uses a skip connection.

8. An information processing method in an information processing apparatus, comprising a data processing procedure in which convolution processing is performed on data having a plurality of channels in a convolutional neural network including a convolutional layer, the method further comprising:
a conversion procedure for performing nonlinear conversion on data input to the information processing apparatus and passing the converted data to the data processing procedure;
and/or
an inverse conversion procedure for performing nonlinear conversion on data output by the data processing procedure and outputting the converted data from the information processing apparatus,
wherein the conversion procedure and/or the inverse conversion procedure includes a first nonlinear processing procedure in which the nonlinear conversion is performed on the data individually for each channel.

9. A program that causes a computer to function as the information processing apparatus according to any one of claims 1 to 7.