JP2024014158A

JP2024014158A - Information processing system, information processing method, and information processing program

Info

Publication number: JP2024014158A
Application number: JP2022116782A
Authority: JP
Inventors: 貴寛丹羽; 邦博後藤; 司清水; 亮暢藤井; 文洋奥村; 光樹加藤
Original assignee: Toyota Central R&D Labs Inc
Current assignee: Toyota Central R&D Labs Inc
Priority date: 2022-07-21
Filing date: 2022-07-21
Publication date: 2024-02-01

Abstract

[Problem] To provide an information processing system, an information processing method, and an information processing program for grasping the influence of a pseudo-correlation (spurious correlation) between input and output that may reduce the accuracy of a learning model. [Solution] The method includes an acquisition step A2 of acquiring a learning result of a learning model trained with a first data set, together with a correlation between a data region that is part of the first data set, serving as an input, and an output of the learning model; and an identification step A5 of identifying a pseudo-correlation region based on the first data set and the correlation. The pseudo-correlation region is a data region that, among the data regions having the correlation, has a pseudo-correlation with the output of the learning model. The method further includes a generation step A11 of performing at least one of modification and annotation on the pseudo-correlation region to generate a second data set, and a display processing step A12 of displaying information regarding the second data set. [Selected drawing] FIG. 5

Description

The present invention relates to an information processing system, an information processing method, and an information processing program.

Patent Document 1 discloses a feature amount acquisition device, a display device, a feature amount acquisition method, a similar image search method, a display method, and a program for acquiring feature amounts suitable for similar image search.

The feature amount acquisition device operates on a CNN classifier that has a plurality of layers and that outputs a result of identifying a first object by processing, in the plurality of layers, input data based on image data of an input image showing the first object and a second object around the first object. The device includes an activity calculation unit that derives, as an activity, the degree to which a unit in a certain layer among the plurality of layers influences the identification result. It also includes a feature amount acquisition unit that acquires the feature amount of the input image based on the derived activity and the image data of the input image. The feature amount is acquired such that the feature amount of a low-activity image region, which is the region of the input image corresponding to a second unit whose activity is lower than that of a first unit, is smaller than the feature amount of a high-activity image region, which is the region of the input image corresponding to the first unit.

Japanese Patent Application Publication No. 2022-33429

A learning model identifies correlations between input and output based on the data included in a data set, and performs various tasks such as prediction, determination, and classification based on the identified correlations. However, these correlations may include pseudo-correlations, which do not correspond to any correlation in reality. A pseudo-correlation may reduce the accuracy of the learning model. It is therefore desirable to grasp the influence of such pseudo-correlations.

According to one aspect of the present invention, an information processing system is provided. The information processing system includes at least one processor capable of executing a program so that the following steps are performed. In an acquisition step, the processor acquires a learning result of a learning model trained with a first data set, and a correlation between a data region that is part of the first data set, serving as an input, and an output of the learning model. In an identification step, the processor identifies a pseudo-correlation region based on the first data set and the correlation. The pseudo-correlation region is a data region that, among the data regions having the correlation, has a pseudo-correlation with the output of the learning model. In a generation step, the processor performs at least one of modification and annotation on the pseudo-correlation region to generate a second data set.

This configuration makes it easier for the user to grasp the influence of the pseudo-correlation region on the learning result obtained with the first data set.

FIG. 1 is a configuration diagram showing the information processing system 1.
FIG. 2 is a block diagram showing the hardware configuration of the information processing device 2.
FIG. 3 is a block diagram showing the hardware configuration of the user terminal 3.
FIG. 4 is a diagram showing an example of the functional units included in the processor 23.
FIG. 5 is an activity diagram showing an example of the flow of information processing executed in the information processing system 1.
FIG. 6 is a diagram showing an example of learning data D including a pseudo-correlation region R2, displayed on the display unit 34 as a result of the display processing of activity A7.
FIG. 7 is a diagram showing an example of data processing applied to the learning data D.

Embodiments of the present invention are described below with reference to the drawings. The various features shown in the embodiments described below can be combined with one another.

The program for implementing the software described in this embodiment may be provided as a non-transitory computer-readable recording medium, may be provided so as to be downloadable from an external server, or may be provided such that the program is launched on an external computer and its functions are realized on a client terminal (so-called cloud computing).

In this embodiment, a "unit" may also include, for example, a combination of hardware resources implemented by circuits in a broad sense and information processing by software that can be concretely realized by those hardware resources. This embodiment handles various kinds of information. Such information is represented by, for example, physical values of signals representing voltage or current, high and low signal levels as a set of binary bits consisting of 0s and 1s, or quantum superposition (so-called qubits), and communication and computation on it can be performed on circuits in a broad sense.

A circuit in a broad sense is a circuit realized by at least an appropriate combination of a circuit, circuitry, a processor, a memory, and the like. That is, it includes an application specific integrated circuit (ASIC), a programmable logic device (for example, a simple programmable logic device (SPLD), a complex programmable logic device (CPLD), or a field programmable gate array (FPGA)), and the like.

1. Hardware configuration
This section describes the hardware configuration.

<Information processing system 1>
FIG. 1 is a configuration diagram showing the information processing system 1. The information processing system 1 includes an information processing device 2, a user terminal 3, and a database DB1. The information processing device 2, the user terminal 3, and the database DB1 are configured to be able to communicate with one another through a telecommunication line. In one embodiment, the information processing system 1 consists of one or more devices or components. For example, if it consists only of the information processing device 2, the information processing system 1 can be the information processing device 2 itself. These components are described below.

<Database DB1>
The database DB1 stores various data sets DS. A data set DS may include open data such as MNIST, Fashion-MNIST, or ImageNet, or may be data provided only to a limited set of users.

<Information processing device 2>
FIG. 2 is a block diagram showing the hardware configuration of the information processing device 2. The information processing device 2 includes a communication unit 21, a storage unit 22, and a processor 23, and these components are electrically connected via a communication bus 20 inside the information processing device 2. Each component is described further below.

The communication unit 21 is preferably a wired communication means such as USB, IEEE 1394, Thunderbolt (registered trademark), or wired LAN network communication, but may also include wireless LAN network communication, mobile communication such as 3G/LTE/5G, BLUETOOTH (registered trademark) communication, and the like as necessary. That is, it is more preferable to implement the communication unit 21 as a set of these multiple communication means. In other words, the information processing device 2 may exchange various kinds of information with the outside via the communication unit 21 and a network.

The storage unit 22 stores the various kinds of information defined in the above description. It can be implemented, for example, as a storage device such as a solid state drive (SSD) that stores various programs related to the information processing device 2 and executed by the processor 23, or as a memory such as a random access memory (RAM) that stores information temporarily needed for program computation (arguments, arrays, and the like). The storage unit 22 stores various programs, variables, and the like related to the information processing device 2 and executed by the processor 23.

The processor 23 processes and controls the overall operation of the information processing device 2. The processor 23 is, for example, a central processing unit (CPU), not shown. The processor 23 realizes various functions of the information processing device 2 by reading predetermined programs stored in the storage unit 22. That is, information processing by the software stored in the storage unit 22 is concretely realized by the processor 23, which is an example of hardware, and can thereby be executed as the functional units included in the processor 23. These are described in more detail in the next section. The processor 23 is not limited to a single processor; a plurality of processors 23 may be provided, one for each function, or a combination of these arrangements may be used.

<User terminal 3>
FIG. 3 is a block diagram showing the hardware configuration of the user terminal 3. The user terminal 3 includes a communication unit 31, a storage unit 32, a processor 33, a display unit 34, and an HMI device 35, and these components are electrically connected via a communication bus 30 inside the user terminal 3. Descriptions of the communication unit 31, the storage unit 32, and the processor 33 are omitted because they are the same as those of the corresponding units of the information processing device 2.

The display unit 34 may be included in the housing of the user terminal 3 or may be externally attached. The display unit 34 displays a graphical user interface (GUI) screen that the user can operate. It is preferably implemented with a display device such as a CRT display, a liquid crystal display, an organic EL display, or a plasma display, chosen according to the type of the user terminal 3.

The HMI device 35 is a human-machine interface device. The HMI device 35 may be included in the housing of the user terminal 3 or may be externally attached. For example, the HMI device 35 may be integrated with the display unit 34 and implemented as a touch panel. With a touch panel, the user can input tap operations, swipe operations, and the like. A switch button, a mouse, a QWERTY keyboard, a voice recognition device, a gesture detection device, a gaze detection device, a biological signal detection device, an imaging device, or the like may of course be used instead of the touch panel. That is, the HMI device 35 accepts operation inputs made by the user. In response, the HMI device 35 transfers a signal corresponding to the operation input to the processor 33 via the communication bus 30, and the processor 33 can execute predetermined control and computation as necessary. The HMI device 35 can also be said to include an input unit configured to accept input from the user.

2. Functional configuration of information processing device 2
FIG. 4 is a diagram showing an example of the functional units included in the processor 23. As shown in FIG. 4, the processor 23 includes an acquisition unit 231, a similarity calculation unit 232, an identification unit 233, a display processing unit 234, a reception unit 235, and a generation unit 236. This section gives an overview of these functional units. The details of each functional unit are described together with the information processing explained later.

The acquisition unit 231 is configured to be able to acquire information from the user terminal 3 or other devices. For example, the acquisition unit 231 acquires a data set DS, or a learning model M trained with the data set DS, from an information source such as the database DB1. The acquisition unit 231 is also configured to be able to acquire various kinds of information by reading information stored in a storage area that is at least part of the storage unit 22 and writing the read information to a work area that is at least part of the storage unit 22. The storage area is, for example, the part of the storage unit 22 implemented as a storage device such as an SSD. The work area is, for example, the part implemented as a memory such as a RAM. Acquisition by the acquisition unit 231 includes acquiring the output results of the functional units included in the processor 23.

The similarity calculation unit 232 is configured to be able to calculate various kinds of information based on the results acquired by the acquisition unit 231. For example, the similarity calculation unit 232 is configured to be able to calculate the similarity between a plurality of pieces of learning data D included in the data set DS, the similarity between a plurality of classes into which the learning data D is classified, and the like.

The identification unit 233 is configured to be able to identify various kinds of information based on the acquisition results of the acquisition unit 231, the calculation results of the similarity calculation unit 232, and the like.

The display processing unit 234 is configured to be able to display various kinds of information. The information can be presented to the user via the display unit 34 of the user terminal 3 or another device. In that case, for example, the display processing unit 234 performs control so that visual information such as a screen, an image including a still image or a moving image, an icon, or a message is displayed on the display unit 34 of the user terminal 3. The display processing unit 234 may generate only the rendering information for displaying the visual information on the user terminal 3. The display processing unit 234 may also present the output information to the user without going through the user terminal 3 or another device.

The reception unit 235 is configured to be able to accept various designations from the user. A designation may be, for example, input through the user terminal 3 or input directly to the information processing device 2.

The generation unit 236 is configured to be able to generate various kinds of information. For example, the generation unit 236 regenerates the data set DS based on the acquired data set DS and the accepted designation.

The learning unit 237 is configured to be able to train the learning model M using the acquired data set DS. To have a device other than the processor 23 perform the training, the learning unit 237 may transmit a command instructing the start of training of the learning model M with the data set DS, together with the information needed for that training, for example learning conditions such as hyperparameters and the number of training iterations.

3. Information processing
This section describes the information processing executed in the information processing system 1 described above.

3.1. Flow of information processing
FIG. 5 is an activity diagram showing an example of the flow of information processing executed in the information processing system 1. The information processing may include arbitrary exception handling that is not shown. Exception handling includes interrupting the information processing and omitting individual processes. A selection or input made in the information processing may be based on a user operation or may be made automatically without a user operation.

[Activity A1]
In activity A1, the processor 33 accepts an instruction from the user to start training and transmits the instruction to the processor 23 of the information processing device 2. The instruction includes information for identifying the data set DS to be used for training. For example, the instruction may include the data set DS itself, as transmitted by the user. Hereinafter, for convenience of explanation, the processor 23 of the information processing device 2 is simply called the processor 23, and the processor 33 of the user terminal 3 is simply called the processor 33.

[Activity A2]
Next, in activity A2, the processor 23 acquires the data set DS based on the transmitted instruction. The data set DS may be input by the user or may be stored in the database DB1. If the processor 23 has acquired a data set DS in the past, it may acquire that previously acquired data set DS. Hereinafter, for convenience of explanation, the data set DS acquired by the processor 23 in activity A2 is called the first data set DS1. The first data set DS1 includes at least one piece of data. The data may be of any type, such as numerical data, time-series data, audio data, image data, or a combination of these. Hereinafter, for convenience of explanation, the data included in the first data set DS1 is called learning data D. The learning data D is used as training data input to the learning model M.

[Activity A3]
Next, the process proceeds to activity A3, and the processor 23 trains the learning model M using the data set DS. Here, the processor 23 trains the learning model M using the first data set DS1 acquired in activity A2. The acquisition unit 231 thereby acquires the learning result of the learning model M trained with the first data set DS1. The learning model M outputs a predetermined result when the learning data D included in the data set DS is input. The specific form of the result output by the learning model M, such as a prediction result from regression analysis or a result of classification into classes, is determined as appropriate according to the learning conditions.

The processor 23 ends the training when an evaluation index of the learning model M satisfies a predetermined condition. Any evaluation index may be used, but it is preferably an index indicating how well the learning model M fits, such as the kappa coefficient, the coefficient of determination, the Akaike information criterion (AIC), or the R2 score. The predetermined condition is, for example, that at least one of these evaluation indices falls below a predetermined threshold. The condition for ending the training is not limited to this and may be chosen freely. For example, the processor 23 may end the training once the iterative processing has been performed for a prescribed number of epochs. When the condition for ending the training is satisfied, the processor 23 may also accept the user's choice of whether or not to end the training.
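The two end conditions described above (an evaluation index falling below a threshold, or a prescribed number of epochs being reached) can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: `train_until`, `step`, and `toy_step` are hypothetical names, and the toy metric that halves every epoch is an assumption made only so the example runs.

```python
from typing import Callable, Optional, Tuple


def train_until(step: Callable[[int], float],
                threshold: float,
                max_epochs: int) -> Tuple[int, Optional[float]]:
    """Call `step` once per epoch until the index drops below `threshold`
    or `max_epochs` epochs have been run.

    `step(epoch)` is a hypothetical callback that trains for one epoch
    and returns the current value of the evaluation index.
    Returns (number of epochs run, last index value).
    """
    metric = None
    for epoch in range(1, max_epochs + 1):
        metric = step(epoch)
        if metric < threshold:      # predetermined condition satisfied
            return epoch, metric
    return max_epochs, metric       # prescribed number of epochs reached


# Toy stand-in for a training step (assumption): the index halves each epoch.
def toy_step(epoch: int) -> float:
    return 1.0 / (2 ** epoch)


epochs_run, final_metric = train_until(toy_step, threshold=0.1, max_epochs=10)
```

With the toy metric, the threshold condition ends the loop before the epoch limit is reached; lowering `max_epochs` makes the epoch limit the active condition instead.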

The mode of training is set as appropriate according to the type of learning data D, the task executed by the learning model M, and the like. The learning result of this embodiment may include arbitrary information such as the learning conditions, the number of training iterations, and the learning accuracy. Any specific training approach may be used for the learning model M, such as supervised learning, unsupervised learning, or reinforcement learning. In the case of supervised learning, linear regression, logistic regression, random forest, boosting, a support vector machine, a neural network, or the like can be adopted as the learning algorithm of the learning model M. In particular, when the learning data D includes image data, it is preferable to use a convolutional neural network (CNN). When the learning data D includes time-series data, it is preferable to use a recurrent neural network (RNN). In the case of unsupervised learning, the k-means method, principal component analysis, or the like can be adopted as the learning algorithm of the learning model M. The training may be performed by the processor 23 itself or by a device other than the processor 23.

The learning result of this embodiment includes class information on the class to which each piece of data included in the first data set DS1 (learning data D in this embodiment) belongs, as classified by the learning model M. A class may be expressed as a binary value, for example good or defective, or by three or more levels indicating the degree of deterioration (for example, large, medium, and small). The learning model M of this embodiment can therefore function, in particular, as a classifier or determiner that sorts input data into classes. When the learning model M is generated by supervised learning, such classes are generated based on the labels attached to the data set DS as labeled data. When the learning model M is generated by unsupervised learning, the classes are generated based on the labels obtained as a result of learning from the data set DS.

[Activity A4]
Next, the process proceeds to activity A4, and the similarity calculation unit 232 calculates the similarity between classes based on the acquired class information. For example, the similarity calculation unit 232 determines a representative value of the learning data D belonging to each class, calculates the distance between those representative values in the feature space, and calculates the similarity between classes based on that distance. The similarity calculation unit 232 judges that the shorter the distance between two classes, the more similar the two classes are. Any representative value may be used, for example the centroid or the median of the distribution of the learning data D. Any specific measure may be used for the distance, such as the Manhattan distance, the Euclidean distance, or cosine similarity. The specific form of the similarity is also arbitrary: the distance itself may be used as the similarity, or the similarity may be obtained by applying some calculation to the distance. The similarity between classes is not limited to one calculated from a distance and may be calculated, for example, from a confusion matrix.
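One of the options described above, using the centroid of each class as its representative value and the Euclidean distance between centroids as the basis of similarity, can be sketched as follows. The function names, the two-dimensional feature vectors, and the class labels are hypothetical and chosen only for illustration; the embodiment leaves the representative value and the distance measure open.

```python
import math
from collections import defaultdict


def class_centroids(features, labels):
    """Return {class label: centroid} for feature vectors grouped by label."""
    groups = defaultdict(list)
    for vec, label in zip(features, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def class_distances(centroids):
    """Return {(a, b): Euclidean distance} for every unordered class pair."""
    names = sorted(centroids)
    return {
        (a, b): math.dist(centroids[a], centroids[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical 2-D feature vectors for three classes (assumption).
features = [[0.0, 0.0], [0.0, 2.0], [3.0, 4.0], [3.0, 6.0], [10.0, 1.0]]
labels = ["good", "good", "worn", "worn", "broken"]

centroids = class_centroids(features, labels)
distances = class_distances(centroids)

# A shorter distance is read as higher similarity between the two classes.
most_similar = min(distances, key=distances.get)
```

Here the distance itself plays the role of the similarity measure; a derived score such as `1 / (1 + distance)` would be an equally valid choice under the description above.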

［アクティビティＡ５］
次に、処理がアクティビティＡ５に進み、特定部２３３は、計算された類似度に基づき、第１のデータセットＤＳ１の一部であるデータ領域Ｒと、学習データＤを入力とする学習モデルＭの出力と、の相関を特定する。特定部２３３が当該相関を特定することは、取得部２３１が当該相関を取得することの一態様である。以下、説明の便宜上、学習データＤのデータ領域Ｒ中にて特定された入力と出力との相関を示す領域を、相関領域Ｒ１という。特定部２３３は、データ領域Ｒと、学習データＤを入力とする学習モデルＭの出力との相関として、相関領域Ｒ１を特定する。相関領域Ｒ１は、疑似相関領域Ｒ２と真の相関領域Ｒ３とを含み得る。 [Activity A5]
Next, the process proceeds to activity A5, where the specifying unit 233 specifies, based on the calculated similarity, the correlation between a data region R that is part of the first data set DS1 and the output of the learning model M that receives the learning data D as input. The specifying unit 233 specifying this correlation is one form of the acquisition unit 231 acquiring the correlation. Hereinafter, for convenience of explanation, a region within the data region R of the learning data D in which a correlation between input and output has been specified is referred to as a correlation region R1. The specifying unit 233 specifies the correlation region R1 as the correlation between the data region R and the output of the learning model M that receives the learning data D as input. The correlation region R1 may include a pseudo-correlation region R2 and a true correlation region R3.

<疑似相関領域Ｒ２>
疑似相関領域Ｒ２は、相関を有するデータ領域Ｒのうち、学習モデルＭの出力との擬似相関を有するものである。疑似相関とは、２つの事象（本実施形態では入力と出力）との間に直接の相関性がないにも関わらず、潜在変数の存在によってあたかも因果関係があるように推測される状態をいう。疑似相関は、みかけの相関とも言われる。 <Pseudo correlation region R2>
The pseudo-correlation region R2 is a data region R that, among the correlated data regions R, has a pseudo-correlation with the output of the learning model M. A pseudo-correlation is a state in which two events (input and output in this embodiment) are inferred to have a causal relationship because of the presence of a latent variable, even though there is no direct correlation between them. A pseudo-correlation is also called an apparent (spurious) correlation.

<真の相関領域Ｒ３>
真の相関領域Ｒ３は、相関を有するデータ領域Ｒのうち、学習モデルＭの出力との直接的な相関を有するものである。直接的な相関とは、２つの事象（入力と出力）との間に直接の相関性があることをいう。真の相関領域Ｒ３は、疑似相関領域Ｒ２以外の相関領域Ｒ１であるともいえる。相関領域Ｒ１が疑似相関領域Ｒ２であるか真の相関領域Ｒ３であるか否かは、例えば、理論、複数の実験事実、経験則などから判断可能な場合がある。 <True correlation area R3>
The true correlation region R3 is a data region R that has a direct correlation with the output of the learning model M, among the correlated data regions R. Direct correlation means that there is a direct correlation between two events (input and output). It can also be said that the true correlation region R3 is a correlation region R1 other than the pseudo correlation region R2. Whether the correlation region R1 is a pseudo correlation region R2 or a true correlation region R3 may be determined from, for example, theory, a plurality of experimental facts, empirical rules, etc.

［アクティビティＡ６］
次に、処理がアクティビティＡ６に進み、特定部２３３は、第１のデータセットＤＳ１と相関とに基づき、疑似相関領域Ｒ２を特定する。詳細には、特定部２３３は、特定された相関領域Ｒ１の中から、疑似相関領域Ｒ２を特定する。例えば、特定部２３３は、予め学習された判定器を用いて、特定された相関領域Ｒ１が疑似相関領域Ｒ２であるか真の相関領域Ｒ３であるかを判定する。当該判定器の学習は、例えば、予め用意された教師ありデータを用いて行われる。当該教師ありデータは、相関領域Ｒ１が紐付けられた学習データＤと、紐付けられた相関領域Ｒ１が疑似相関領域Ｒ２であるか真の相関領域Ｒ３であるかを示すラベルと、を含む。判定器は、このような教師ありデータを入力とし、疑似相関領域Ｒ２であるか真の相関領域Ｒ３であるかを出力する。学習の具体的態様は任意であり、例えば、学習モデルＭの学習にて述べたものと同様のものが挙げられる。 [Activity A6]
Next, the process proceeds to activity A6, and the specifying unit 233 specifies the pseudo correlation region R2 based on the first data set DS1 and the correlation. Specifically, the identifying unit 233 identifies the pseudo correlation region R2 from the identified correlation region R1. For example, the identifying unit 233 uses a previously learned determiner to determine whether the identified correlation region R1 is a pseudo correlation region R2 or a true correlation region R3. The learning of the determiner is performed using, for example, supervised data prepared in advance. The supervised data includes learning data D to which the correlation region R1 is linked, and a label indicating whether the linked correlation region R1 is a pseudo correlation region R2 or a true correlation region R3. The determiner receives such supervised data as input and outputs whether it is a pseudo correlation region R2 or a true correlation region R3. The specific mode of learning is arbitrary, and for example, the same mode as described in the learning of the learning model M can be mentioned.

［アクティビティＡ７］
次に、処理がアクティビティＡ７に進み、表示処理部２３４は、表示処理を実行する。これにより、表示処理部２３４は、特定された疑似相関領域Ｒ２を、疑似相関領域Ｒ２以外のデータ領域Ｒと異なる態様で表示部３４に表示させる。疑似相関領域Ｒ２の表示態様は、ユーザが視覚的に区別可能であれば任意である。例えば、疑似相関領域Ｒ２は、他のデータ領域Ｒと異なり輪郭線によって囲われても、他のデータ領域Ｒと異なる色で表示されてもよい。本実施形態の表示処理部２３４は、表示処理により学習モデルＭの学習結果を表示部３４に表示させる。学習結果の表示態様は任意であるが、表示処理部２３４は、混同行列、データセットＤＳの散布図、クラス毎の学習データＤの集計値などを用いて、学習結果を表示部３４に表示させる。学習データＤの集計値としては、例えば、クラス毎の特徴量の分布が挙げられる。特に、学習データＤが画像データを含む場合、表示処理部２３４は、学習結果を活性化マップとして表示部３４に表示させる。活性化マップを生成する手法は任意であるが、例えば、ＣＡＭ，Ｇｒａｄ－ＣＡＭ，ＧｕｉｄｅｄＧｒａｄ－ＣＡＭなどが挙げられる。 [Activity A7]
Next, the process proceeds to activity A7, and the display processing unit 234 executes display processing. The display processing unit 234 thereby causes the display unit 34 to display the specified pseudo-correlation region R2 in a manner different from the data regions R other than the pseudo-correlation region R2. The pseudo-correlation region R2 may be displayed in any manner as long as the user can distinguish it visually. For example, unlike the other data regions R, the pseudo-correlation region R2 may be enclosed by an outline, or it may be displayed in a color different from that of the other data regions R. The display processing unit 234 of this embodiment also causes the display unit 34 to display the learning results of the learning model M through the display processing. The display mode of the learning results is arbitrary; the display processing unit 234 displays them using, for example, a confusion matrix, a scatter plot of the data set DS, or aggregated values of the learning data D for each class. An example of an aggregated value of the learning data D is the distribution of feature values for each class. In particular, when the learning data D includes image data, the display processing unit 234 causes the display unit 34 to display the learning results as an activation map. Any method can be used to generate the activation map; examples include CAM, Grad-CAM, and Guided Grad-CAM.

［アクティビティＡ８］
次に、処理がアクティビティＡ８に進み、プロセッサ３３は、アクティビティＡ７での表示処理の結果を表示部３４に表示させる。これにより、ユーザは、学習モデルＭの学習結果や、学習モデルＭの学習に用いられた学習データＤに含まれるデータ領域Ｒ、特に疑似相関領域Ｒ２を把握することができる。 [Activity A8]
Next, the process proceeds to activity A8, and the processor 33 causes the display unit 34 to display the result of the display process in activity A7. Thereby, the user can grasp the learning results of the learning model M, the data region R included in the learning data D used for learning the learning model M, especially the pseudo correlation region R2.

学習モデルＭの学習を終了する場合、本情報処理が終了する。一方、学習モデルＭの学習を継続する場合、処理がアクティビティＡ９に進む。学習モデルＭの学習を終了するか継続するかの判断は、アクティビティＡ８にて表示された学習結果をもとにユーザによって行われても、学習結果に基づきプロセッサ２３が自動で行ってもよい。 When the learning of the learning model M ends, this information processing ends. On the other hand, if learning of the learning model M is to be continued, the process proceeds to activity A9. The decision as to whether to end or continue learning the learning model M may be made by the user based on the learning results displayed in activity A8, or may be made automatically by the processor 23 based on the learning results.

［アクティビティＡ９］
アクティビティＡ９では、プロセッサ３３は、ユーザから入力される、学習データＤにおける疑似相関領域Ｒ２に対する変更及びアノテーションのうちの少なくとも１つの指定を送信する。ここで、変更は、疑似相関領域Ｒ２に対して行われる具体的なデータ処理を含む。また、プロセッサ３３は、当該データ処理を行う疑似相関領域Ｒ２を特定する情報を送信してもよい。アノテーションは、学習モデルＭによる学習において、疑似相関領域Ｒ２の使い方を規定するものである。アノテーションは、疑似相関領域Ｒ２を学習モデルＭによる学習対象から除外させることを示すメタデータであってもよいし、学習モデルＭによる学習において、疑似相関領域Ｒ２の重み付けを他の領域より小さくすることを指定するメタデータであってもよい。また、アノテーションは、学習モデルＭによる学習において、疑似相関領域Ｒ２の学習回数を所定の値未満とすることを指定するメタデータであってもよい。また、アノテーションは、学習データＤが学習モデルＭに入力される前に、実行されるデータ処理を示すものであってもよい。なお、データ処理は、全体処理、及び部分処理のうちの少なくとも１つを含み得る。 [Activity A9]
In activity A9, the processor 33 transmits a designation, input by the user, of at least one of a modification and an annotation for the pseudo-correlation region R2 in the learning data D. Here, the modification includes specific data processing performed on the pseudo-correlation region R2. The processor 33 may also transmit information specifying the pseudo-correlation region R2 on which the data processing is to be performed. The annotation defines how the pseudo-correlation region R2 is used in training of the learning model M. The annotation may be metadata indicating that the pseudo-correlation region R2 is to be excluded from the training targets of the learning model M, or metadata specifying that the pseudo-correlation region R2 is to be given a smaller weight than other regions in training of the learning model M. The annotation may also be metadata specifying that the number of times the pseudo-correlation region R2 is learned in training of the learning model M is to be less than a predetermined value. The annotation may further indicate data processing to be executed before the learning data D is input to the learning model M. Note that the data processing may include at least one of overall processing and partial processing.
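The annotation variants listed for activity A9 can be pictured as a small metadata record attached to a region. The following is a sketch under stated assumptions, not a format defined by the publication: the class name `RegionAnnotation`, its field names, the box convention, and the helper `effective_weight` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RegionAnnotation:
    """Hypothetical metadata describing how a pseudo-correlation region is used in training."""
    sample_id: str
    box: Tuple[int, int, int, int]            # (top, left, bottom, right); bottom/right exclusive
    exclude: bool = False                     # exclude the region from the training targets
    weight: float = 1.0                       # weight relative to other regions (1.0 = same)
    max_learning_count: Optional[int] = None  # region is learned fewer than this many times
    preprocessing: Optional[str] = None       # name of processing to run before model input


def effective_weight(annotation, times_learned):
    """Weight a training loop could apply to the region (an assumed interpretation)."""
    if annotation.exclude:
        return 0.0
    limit = annotation.max_learning_count
    if limit is not None and times_learned >= limit:
        return 0.0
    return annotation.weight


down_weighted = RegionAnnotation("sem_001", (0, 0, 2, 2), weight=0.25, max_learning_count=3)
excluded = RegionAnnotation("sem_002", (4, 4, 8, 8), exclude=True)
```

Keeping the record immutable (`frozen=True`) is a design choice for the sketch: the user's designation is transmitted once and then only read by the generation step.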

全体処理は、疑似相関領域Ｒ２を含む学習データＤの全体に対して行われるデータ処理である。全体処理は、例えば、第１のデータセットＤＳ１から疑似相関領域Ｒ２を含む学習データＤを除去する処理や、学習データＤの全体に対して一様に行われるパターン処理などを含む。パターン処理は、例えば、所定のパターン信号の畳み込み、予め定められたノイズパターンの減算などを含む。 The overall processing is data processing performed on the entire learning data D including the pseudo-correlation region R2. The overall process includes, for example, a process of removing the learning data D including the pseudo-correlation region R2 from the first data set DS1, a pattern process uniformly performed on the entire learning data D, and the like. The pattern processing includes, for example, convolution of a predetermined pattern signal, subtraction of a predetermined noise pattern, and the like.
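To make the two kinds of overall processing concrete, here is a minimal sketch that assumes each sample is a dictionary with hypothetical `id` and `image` keys and that an image is a list of pixel rows. Clamping the subtraction result at zero is an assumption of the example, not something stated in the text.

```python
def remove_flagged_samples(dataset, flagged_ids):
    """Overall processing: drop every sample that contains a pseudo-correlation region."""
    return [sample for sample in dataset if sample["id"] not in flagged_ids]


def subtract_pattern(image, pattern):
    """Overall processing: subtract a predetermined noise pattern from the whole image.

    Results are clamped at 0 (an assumption made for this sketch).
    """
    return [
        [max(0, pixel - noise) for pixel, noise in zip(image_row, pattern_row)]
        for image_row, pattern_row in zip(image, pattern)
    ]


first_dataset = [
    {"id": "a", "image": [[5, 6], [7, 8]]},
    {"id": "b", "image": [[1, 1], [1, 1]]},
]
# Suppose sample "b" was found to contain a pseudo-correlation region.
kept = remove_flagged_samples(first_dataset, {"b"})
denoised = subtract_pattern(first_dataset[0]["image"], [[1, 1], [9, 1]])
```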

部分処理は、学習データＤ中の疑似相関領域Ｒ２に対してのみ行われるデータ処理である。部分処理は、例えば、疑似相関領域Ｒ２の削除、規定パターンによる置換、マスク処理などを含む。また、部分処理は、疑似相関領域Ｒ２の少なくとも一部に対して行われる任意の処理を含み得る。例えば、部分処理は、疑似相関領域Ｒ２全体に対するデータ処理に限られず、疑似相関領域Ｒ２の一部に対するデータ処理を含み得る。 The partial processing is data processing performed only on the pseudo correlation region R2 in the learning data D. The partial processing includes, for example, deletion of the pseudo-correlation region R2, replacement with a prescribed pattern, mask processing, and the like. Furthermore, the partial processing may include any processing performed on at least a portion of the pseudo-correlation region R2. For example, the partial processing is not limited to data processing on the entire pseudo-correlation region R2, but may include data processing on a part of the pseudo-correlation region R2.

特に、学習データＤが画像データを含む場合、データ処理は、画像処理を含む。画像処理は、疑似相関領域Ｒ２を含む画像全体の削除、疑似相関領域Ｒ２を含む学習データＤ全体に対するパターン加算処理（例えば、スタイルトランスファー処理）などの全体処理や、学習データＤ中の疑似相関領域Ｒ２の削除や疑似相関領域Ｒ２に対するマスク処理などの部分処理を含む。 In particular, when the learning data D includes image data, the data processing includes image processing. Image processing includes overall processing such as deletion of the entire image including the pseudo-correlation region R2, pattern addition processing (for example, style transfer processing) for the entire learning data D including the pseudo-correlation region R2, and deletion of the pseudo-correlation region in the learning data D. This includes partial processing such as deletion of R2 and mask processing for the pseudo correlation region R2.

当該データ処理を行う疑似相関領域Ｒ２の指定の具体的態様は、ＨＭＩデバイス３５へのキーボード操作やタッチ操作など任意である。また、指定は、１つの疑似相関領域Ｒ２全体に対する指定に限られず、疑似相関領域Ｒ２に含まれる部分的な領域の指定を含み得る。ＨＭＩデバイス３５への操作によって、ＨＭＩデバイス３５からの応答が生成される。 The specific manner of specifying the pseudo correlation region R2 for performing the data processing is arbitrary, such as a keyboard operation or a touch operation on the HMI device 35. Further, the specification is not limited to the entire one pseudo-correlation region R2, but may include specification of a partial region included in the pseudo-correlation region R2. An operation on the HMI device 35 generates a response from the HMI device 35.

［アクティビティＡ１０］
次に、処理がアクティビティＡ１０に進み、受付部２３５は、プロセッサ３３から送信された、学習データＤにおける疑似相関領域Ｒ２に対する変更及びアノテーションのうちの少なくとも１つの指定を受け付ける。 [Activity A10]
Next, the process proceeds to activity A10, and the accepting unit 235 accepts the designation of at least one of changes and annotations to the pseudo correlation region R2 in the learning data D, transmitted from the processor 33.

［アクティビティＡ１１］
次に、処理がアクティビティＡ１１に進み、アクティビティＡ１０で受け付けた指定に基づき、疑似相関領域Ｒ２に対してデータ処理及びアノテーションのうちの少なくとも１つを行う。これにより、生成部２３６は、第２のデータセットＤＳ２を生成する。特に、第１のデータセットＤＳ１が画像データを含む場合、生成部２３６は、疑似相関領域Ｒ２を含む画像データに対して、データ処理として、画像処理を行うことにより、第２のデータセットＤＳ２を生成する。画像処理は、上記全体処理及び部分処理の少なくとも１つを含み得る。例えば、第１のデータセットＤＳ１が画像データを含む場合、生成部２３６は、疑似相関領域Ｒ２に対する部分的な画像処理（すなわち、画像処理に含まれる部分処理）を行う。 [Activity A11]
Next, the process proceeds to activity A11, where at least one of data processing and annotation is performed on the pseudo-correlation region R2 based on the designation received in activity A10. The generation unit 236 thereby generates the second data set DS2. In particular, when the first data set DS1 includes image data, the generation unit 236 generates the second data set DS2 by performing image processing, as the data processing, on the image data including the pseudo-correlation region R2. The image processing may include at least one of the overall processing and the partial processing described above. For example, when the first data set DS1 includes image data, the generation unit 236 performs partial image processing (that is, partial processing included in the image processing) on the pseudo-correlation region R2.

アクティビティＡ１１の処理の後、処理がアクティビティＡ３に戻り、学習部２３７は、第２のデータセットＤＳ２を用いて、学習モデルＭの学習を行う。これにより、第１のデータセットＤＳ１を用いて学習を行う場合に比べて、擬似相関の影響の小さい学習モデルＭが得られる。したがって、学習モデルＭの信頼性を向上することができる。その後、アクティビティＡ４からアクティビティＡ８までの処理が行われる。これにより、プロセッサ２３は、第２のデータセットＤＳ２を用いた学習モデルＭに対しても、前回の第１のデータセットＤＳ１に対する情報処理と同様に、疑似相関領域Ｒ２に対するデータ処理が実行可能に構成されている。 After the processing of activity A11, the process returns to activity A3, and the learning unit 237 trains the learning model M using the second data set DS2. As a result, a learning model M that is less affected by pseudo-correlation is obtained than when training is performed using the first data set DS1, so the reliability of the learning model M can be improved. Thereafter, the processing from activity A4 to activity A8 is performed. The processor 23 is thus configured so that, for the learning model M trained with the second data set DS2 as well, data processing on the pseudo-correlation region R2 can be executed in the same way as in the preceding information processing for the first data set DS1.
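The return from activity A11 to activity A3 forms a loop, which can be sketched as follows. This is only an outline under assumptions: the three callables stand in for the learning unit, the specifying unit, and the generation unit, the stop criterion (no pseudo-correlation region found, or a round limit) is hypothetical, and the toy "model" is just the set of feature tags present in the data.

```python
def refine(dataset, train, find_pseudo_regions, apply_changes, max_rounds=5):
    """Repeat training and data-set regeneration until no pseudo-correlation region remains."""
    history = []
    model = None
    for _ in range(max_rounds):
        model = train(dataset)                         # activity A3
        regions = find_pseudo_regions(model, dataset)  # activities A4 to A6
        history.append(len(regions))
        if not regions:                                # assumed stop criterion
            break
        dataset = apply_changes(dataset, regions)      # activities A9 to A11
    return model, dataset, history


# Toy stand-ins (hypothetical): a "model" is the set of tags it could rely on.
def toy_train(ds):
    return set().union(*(sample["tags"] for sample in ds))


def toy_find(model, ds):
    return sorted(model & {"separator_mark"})


def toy_apply(ds, regions):
    return [{"tags": sample["tags"] - set(regions)} for sample in ds]


ds1 = [{"tags": {"crack", "separator_mark"}}, {"tags": {"crack"}}]
model, ds2, history = refine(ds1, toy_train, toy_find, toy_apply)
```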

［アクティビティＡ１２］
なお、アクティビティＡ１１の処理の後、アクティビティＡ１２では、表示処理部２３４は、第２のデータセットＤＳ２に関する情報を表示部３４に表示させる。第２のデータセットＤＳ２に関する情報は、例えば、学習データＤに対して行われるデータ処理の内容、データ処理の結果、第１のデータセットＤＳ１と第２のデータセットＤＳ２との差異などを含む。これにより、ユーザが学習モデルＭの学習課程を把握しやすくなる。 [Activity A12]
Note that after the processing of activity A11, in activity A12, the display processing unit 234 causes the display unit 34 to display information regarding the second data set DS2. The information regarding the second data set DS2 includes, for example, the content of the data processing performed on the learning data D, the results of the data processing, and the differences between the first data set DS1 and the second data set DS2. This makes it easier for the user to follow the training process of the learning model M.
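One way to summarize the difference between the first and second data sets for display in activity A12 is a per-sample report. The sketch below assumes samples are dictionaries with hypothetical `id` and `image` keys and reports either the number of changed pixels or that the sample was removed; the report format is an illustration only.

```python
def changed_pixels(before, after):
    """Count pixels whose value differs between two equally sized images."""
    return sum(
        b != a
        for row_before, row_after in zip(before, after)
        for b, a in zip(row_before, row_after)
    )


def dataset_diff(first, second):
    """Map each sample id in the first data set to 'removed' or its changed-pixel count."""
    after_by_id = {sample["id"]: sample["image"] for sample in second}
    report = {}
    for sample in first:
        if sample["id"] not in after_by_id:
            report[sample["id"]] = "removed"
        else:
            report[sample["id"]] = changed_pixels(sample["image"], after_by_id[sample["id"]])
    return report


ds_first = [
    {"id": "a", "image": [[1, 2], [3, 4]]},
    {"id": "b", "image": [[5, 5], [5, 5]]},
    {"id": "c", "image": [[9, 9], [9, 9]]},
]
ds_second = [
    {"id": "a", "image": [[0, 0], [3, 4]]},  # two pixels masked
    {"id": "c", "image": [[9, 9], [9, 9]]},  # unchanged
]
report = dataset_diff(ds_first, ds_second)
```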

３．２．情報処理の詳細について
本節では、学習データＤが画像データである場合における上記情報処理の詳細について説明する。本実施形態の学習データＤは、電極材料の電子顕微鏡像（ＳＥＭ像）であり、電極材料の結晶ドメインの像を含む。各学習データＤは、電極材料の劣化度（例えば、劣化度大、劣化度中、劣化度小）と紐付けられている。第１のデータセットＤＳ１は、複数の上記学習データＤを含む。このような第１のデータセットＤＳ１に基づき学習が行われた学習モデルＭは、電極材料の電子顕微鏡像を入力することで、当該電極材料の劣化度を出力する。このとき、特定部２３３は、出力される劣化度の判断材料として使うデータ領域Ｒを、相関領域Ｒ１として特定する。その後、特定部２３３は、相関領域Ｒ１が疑似相関領域Ｒ２であるか真の相関領域Ｒ３であるかを特定し、表示処理部２３４は、アクティビティＡ７及びアクティビティＡ８にて、学習結果とともに相関領域Ｒ１に関する情報を表示部３４に表示させる。 3.2. Details of Information Processing In this section, details of the above information processing when the learning data D is image data are explained. The learning data D of this embodiment is an electron microscope image (SEM image) of an electrode material and includes an image of the crystal domains of the electrode material. Each item of learning data D is associated with the degree of deterioration of the electrode material (for example, large, medium, or small degree of deterioration). The first data set DS1 includes a plurality of such items of learning data D. The learning model M trained on this first data set DS1 receives an electron microscope image of an electrode material as input and outputs the degree of deterioration of that electrode material. At this time, the specifying unit 233 specifies, as the correlation region R1, the data region R used as the basis for judging the output degree of deterioration. Thereafter, the specifying unit 233 specifies whether the correlation region R1 is a pseudo-correlation region R2 or a true correlation region R3, and the display processing unit 234, in activity A7 and activity A8, causes the display unit 34 to display information regarding the correlation region R1 together with the learning results.

図６は、アクティビティＡ７の表示処理の結果として表示部３４に表示される、疑似相関領域Ｒ２が含まれる学習データＤの一例を示す図である。アクティビティＡ３からアクティビティＡ６までの処理の結果、特定部２３３は、当該学習データＤが２つの相関領域Ｒ１を含むことと、２つの相関領域Ｒ１の一方は疑似相関領域Ｒ２であり、相関領域Ｒ１の他方が真の相関領域Ｒ３であることを特定している。本実施形態で特定された疑似相関領域Ｒ２は、例えば、電極材料とセパレータとの接触に起因するセパレータ跡が残っている領域である。 FIG. 6 is a diagram showing an example of the learning data D including the pseudo correlation region R2, which is displayed on the display unit 34 as a result of the display processing of the activity A7. As a result of the processing from activity A3 to activity A6, the identifying unit 233 determines that the learning data D includes two correlation regions R1, one of the two correlation regions R1 is a pseudo correlation region R2, and the correlation region R1 is It is specified that the other is the true correlation region R3. The pseudo-correlation region R2 identified in this embodiment is, for example, a region where separator marks remain due to contact between the electrode material and the separator.

特定された相関領域Ｒ１のそれぞれは、アノテーションとしての矩形枠により囲われている。これにより、相関領域Ｒ１は、ユーザにより視認可能に表示される。詳細には、特定された疑似相関領域Ｒ２は実線の矩形枠により囲われており、真の相関領域Ｒ３は破線の矩形枠により囲われている。なお、相関領域Ｒ１以外のデータ領域Ｒは、このような矩形枠により囲われていない。このような表示態様により、ユーザは、学習データＤ中のどの領域が相関領域Ｒ１として劣化状態の判定に寄与しているのか、どの相関領域Ｒ１が疑似相関領域Ｒ２なのかを視認することができる。なお、プロセッサ２３は、ユーザの操作によって、特定された相関領域Ｒ１が疑似相関領域Ｒ２又は真の相関領域Ｒ３のいずれであるかを変更可能に構成されていてもよい。言い換えれば、特定部２３３は、相関領域Ｒ１のなかから、疑似相関領域Ｒ２の候補を特定してもよい。これにより、ユーザの経験則に基づき疑似相関領域Ｒ２の特定結果の修正が容易となる。当該操作は、マウス操作、タッチ操作、キーボード操作、音声、視線、ジェスチャなど、ＨＭＩデバイス３５への任意の入力を用いて実現可能である。当該変更により、相関領域Ｒ１を囲う矩形枠の表示態様が変更されてもよい。その後、アクティビティＡ９及びアクティビティＡ１０にて指定された入力態様に基づき、特定された疑似相関領域Ｒ２を含む学習データＤに対するデータ処理が行われる。 Each of the identified correlation regions R1 is surrounded by a rectangular frame as an annotation. Thereby, the correlation region R1 is displayed so as to be visible to the user. Specifically, the identified pseudo-correlation region R2 is surrounded by a solid-line rectangular frame, and the true correlation region R3 is surrounded by a broken-line rectangular frame. Note that the data areas R other than the correlation area R1 are not surrounded by such a rectangular frame. Such a display mode allows the user to visually recognize which region in the learning data D contributes to the determination of the deterioration state as the correlation region R1, and which correlation region R1 is the pseudo-correlation region R2. . Note that the processor 23 may be configured to be able to change whether the identified correlation region R1 is a pseudo correlation region R2 or a true correlation region R3 by a user's operation. In other words, the identifying unit 233 may identify candidates for the pseudo-correlation region R2 from the correlation region R1. This makes it easy to modify the identification result of the pseudo correlation region R2 based on the user's empirical rules. This operation can be realized using any input to the HMI device 35, such as mouse operation, touch operation, keyboard operation, voice, line of sight, and gesture. 
Due to this change, the display mode of the rectangular frame surrounding the correlation region R1 may be changed. Thereafter, data processing is performed on the learning data D including the identified pseudo-correlation region R2 based on the input mode specified in the activity A9 and the activity A10.

次に、上記学習データＤに対するアクティビティＡ１１のデータ処理の詳細について説明する。図７は、学習データＤに対するデータ処理の一例を示す図である。本実施形態のデータ処理は、学習データＤに含まれる画像データに対する画像処理、特に、マスク処理である。生成部２３６は、当該画像処理により、特定された疑似相関領域Ｒ２がマスクされた、少なくとも１つの学習データＤと、当該学習データＤを含む第２のデータセットＤＳ２を生成する。なお、生成部２３６は、特定された相関領域Ｒ１のうち、疑似相関領域Ｒ２に対してのみ当該データ処理を行い、真の相関領域Ｒ３に対しては行わない。このようなマスク処理が施された学習データＤを含む第２のデータセットＤＳ２を用いて学習モデルＭの再学習が行われる。そのため、第２のデータセットＤＳ２は、マスク処理によって学習モデルＭへの入力態様が変更されたデータセットＤＳといえる。なお、当該データ処理後の学習データＤを含む第２のデータセットＤＳ２の内容は、アクティビティＡ１２にて表示部３４を介してユーザに表示される。 Next, details of data processing of the activity A11 on the learning data D will be explained. FIG. 7 is a diagram illustrating an example of data processing for learning data D. The data processing in this embodiment is image processing on image data included in the learning data D, particularly mask processing. Through the image processing, the generation unit 236 generates at least one learning data D in which the identified pseudo-correlation region R2 is masked, and a second data set DS2 including the learning data D. Note that the generation unit 236 performs the data processing only on the pseudo correlation region R2 among the identified correlation regions R1, and does not perform the data processing on the true correlation region R3. The learning model M is retrained using the second data set DS2 that includes the learning data D that has been subjected to such mask processing. Therefore, the second data set DS2 can be said to be a data set DS in which the input mode to the learning model M has been changed by mask processing. Note that the contents of the second data set DS2 including the learning data D after the data processing are displayed to the user via the display unit 34 in activity A12.
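The mask processing described here can be sketched as follows, assuming that each sample carries its image as a list of pixel rows together with a list of regions labeled `"pseudo"` or `"true"`. The dictionary keys, the box convention (top, left, bottom, right with exclusive bottom and right), and the fill value 0 are assumptions of the example, not details from the publication.

```python
import copy


def mask_region(image, box, fill=0):
    """Return a copy of the image with the rectangular box overwritten by `fill`."""
    top, left, bottom, right = box
    masked = copy.deepcopy(image)
    for r in range(top, bottom):
        for c in range(left, right):
            masked[r][c] = fill
    return masked


def build_second_dataset(first_dataset, fill=0):
    """Mask only pseudo-correlation regions; true correlation regions stay untouched."""
    second = []
    for sample in first_dataset:
        image = copy.deepcopy(sample["image"])
        for region in sample["regions"]:
            if region["kind"] == "pseudo":
                image = mask_region(image, region["box"], fill)
        second.append({**sample, "image": image})
    return second


ds1 = [{
    "id": "sem_001",
    "image": [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    "regions": [
        {"kind": "pseudo", "box": (0, 0, 2, 2)},  # e.g. a separator mark
        {"kind": "true", "box": (2, 2, 3, 3)},
    ],
}]
ds2 = build_second_dataset(ds1)
```

Because the function works on copies, the first data set remains available for comparison with the second one after the processing.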

４．その他
上記情報処理の態様はあくまで一例であり、これに限られない。 4. Others The above information processing mode is just an example, and is not limited to this.

アクティビティＡ６での疑似相関領域Ｒ２の特定方法は、判定器を用いるものに限られず任意である。特定部２３３は、予め得られている入力と出力との相関を表す関係式やルックアップテーブルを用いて、入力に対する学習モデルＭの出力が、当該関係式に基づく入力と出力との所定の閾値以上大きいか否かに基づき、相関領域Ｒ１が疑似相関領域Ｒ２であるか真の相関領域Ｒ３であるかを判定してもよい。また、特定部２３３は、相関領域Ｒ１に対するユーザの操作に基づき、疑似相関領域Ｒ２を特定してもよい。ユーザの入力は、例えば、相関領域Ｒ１が疑似相関領域Ｒ２であることを示す操作と、相関領域Ｒ１が真の相関領域Ｒ３であることを示す操作と、を含む。また、判定器の学習は、当該ユーザの操作によって相関領域Ｒ１が疑似相関領域Ｒ２であるか真の相関領域Ｒ３であるかを特定された学習データＤを、教師有りデータとして行われてもよい。 The method of specifying the pseudo-correlation region R2 in activity A6 is not limited to one that uses a determiner and may be any method. The specifying unit 233 may use a relational expression or a lookup table, obtained in advance, that represents the correlation between input and output, and determine whether the correlation region R1 is a pseudo-correlation region R2 or a true correlation region R3 based on whether the output of the learning model M for an input is larger, by a predetermined threshold or more, than the input-output relationship given by the relational expression. The specifying unit 233 may also specify the pseudo-correlation region R2 based on a user operation on the correlation region R1. The user input includes, for example, an operation indicating that the correlation region R1 is a pseudo-correlation region R2 and an operation indicating that the correlation region R1 is a true correlation region R3. Furthermore, the determiner may be trained using, as labeled data, learning data D for which the user operation has specified whether the correlation region R1 is a pseudo-correlation region R2 or a true correlation region R3.
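The lookup-table alternative can be sketched as a simple threshold test. The text does not spell out the comparison in detail, so the reading used here is an assumption: a region is treated as a pseudo-correlation region when the model output deviates from the value expected from the table by the threshold or more. The table contents, the key names, and the function name are hypothetical.

```python
def classify_region(model_output, expected_output, threshold):
    """Label a correlation region by comparing the model output with the expected output.

    Assumed reading: a deviation of `threshold` or more marks a pseudo-correlation region.
    """
    deviation = abs(model_output - expected_output)
    return "pseudo" if deviation >= threshold else "true"


# Hypothetical lookup table: input condition -> expected degree of deterioration (0 to 1).
expected_by_condition = {"few_cracks": 0.25, "many_cracks": 0.75}

label_a = classify_region(0.75, expected_by_condition["few_cracks"], threshold=0.25)
label_b = classify_region(0.75, expected_by_condition["many_cracks"], threshold=0.25)
```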

第１のデータセットＤＳ１を用いて学習された学習モデルＭが存在する場合、取得部２３１は、当該学習モデルＭを取得し、取得した学習モデルＭに対する第２のデータセットＤＳ２を用いた転移学習を行ってもよい。 If there is a learning model M trained using the first dataset DS1, the acquisition unit 231 acquires the learning model M, and performs transfer learning using the second dataset DS2 on the acquired learning model M. You may do so.

第１のデータセットＤＳ１に含まれるデータは、学習データＤに限られない。例えば、第１のデータセットＤＳ１に含まれるデータは、評価データであってもよい。評価データは、例えば、学習モデルＭの性能の評価に用いられるデータである。言い換えれば、第１のデータセットＤＳ１に含まれるデータの用途は任意であり、当該データは、教師データとして用いられるものであっても、学習モデルＭの評価に用いられるものであってもよい。 The data included in the first data set DS1 is not limited to the learning data D. For example, the data included in the first data set DS1 may be evaluation data. The evaluation data is, for example, data used to evaluate the performance of the learning model M. In other words, the use of the data included in the first data set DS1 is arbitrary, and the data may be used as teacher data or used to evaluate the learning model M.

アクティビティＡ９にて、ＨＭＩデバイス３５からの応答が生成された場合、取得部２３１は、当該ＨＭＩデバイス３５からの応答を取得してもよい。この場合、特定部２３３は、さらに、取得部２３１によって取得されたＨＭＩデバイス３５からの応答と、第１のデータセットＤＳ１と、相関とに基づき疑似相関領域Ｒ２を特定してもよい。例えば、プロセッサ２３は、取得部２３１によって取得されたＨＭＩデバイス３５からの応答に基づき、アクティビティＡ５での疑似相関領域Ｒ２の特定に用いられる判定器の学習を行い、学習後の判定器を用いて疑似相関領域Ｒ２の特定を行ってもよい。この場合、プロセッサ２３は、ＨＭＩデバイス３５からの応答に基づき生成された第２のデータセットＤＳ２を第１のデータセットＤＳ１として取り扱い、当該第１のデータセットＤＳ１を用いた判定器の学習を行ってもよい。 If a response from the HMI device 35 is generated in activity A9, the acquisition unit 231 may acquire that response. In this case, the specifying unit 233 may further specify the pseudo-correlation region R2 based on the response from the HMI device 35 acquired by the acquisition unit 231, the first data set DS1, and the correlation. For example, the processor 23 may train the determiner used to specify the pseudo-correlation region R2 in activity A5 based on the response from the HMI device 35 acquired by the acquisition unit 231, and may specify the pseudo-correlation region R2 using the trained determiner. In this case, the processor 23 may treat the second data set DS2 generated based on the response from the HMI device 35 as the first data set DS1 and train the determiner using that first data set DS1.

情報処理装置２は、オンプレミス形態であってもよく、クラウド形態であってもよい。クラウド形態の情報処理装置２としては、例えば、ＳａａＳ（ＳｏｆｔｗａｒｅａｓａＳｅｒｖｉｃｅ）、クラウドコンピューティングという形態で、上述の機能や処理を提供してもよい。 The information processing device 2 may be in an on-premises form or may be in a cloud form. The cloud-based information processing device 2 may provide the above-mentioned functions and processing, for example, in the form of SaaS (Software as a Service) or cloud computing.

上記実施形態では、情報処理装置２が種々の記憶・制御を行ったが、情報処理装置２に代えて、複数の外部装置が用いられてもよい。すなわち、種々の情報やプログラムは、ブロックチェーン技術等を用いて複数の外部装置に分散して記憶されてもよい。 In the embodiment described above, the information processing device 2 performs various storage and control operations, but instead of the information processing device 2, a plurality of external devices may be used. That is, various information and programs may be distributed and stored in a plurality of external devices using blockchain technology or the like.

上記実施形態は、情報処理システム１に限定されず、情報処理方法であっても、情報処理プログラムであってもよい。情報処理方法は、情報処理システム１の各ステップを含む。情報処理プログラムは、少なくとも１つのコンピュータに、情報処理システム１の各ステップを実行させる。 The embodiment described above is not limited to the information processing system 1, and may be an information processing method or an information processing program. The information processing method includes each step of the information processing system 1. The information processing program causes at least one computer to execute each step of the information processing system 1.

上記情報処理システム１等は、次に記載の各態様で提供されてもよい。 The information processing system 1 and the like may be provided in each of the following aspects.

（１）情報処理システムであって、次の各ステップがなされるようにプログラムを実行可能な少なくとも１つのプロセッサを備え、取得ステップでは、第１のデータセットを用いた学習モデルの学習結果と、入力としての前記第１のデータセットの一部であるデータ領域と、前記学習モデルの出力と、の相関と、を取得し、特定ステップでは、前記第１のデータセットと前記相関とに基づき、疑似相関領域を特定し、ここで、前記疑似相関領域は、前記相関を有する前記データ領域のうち、前記学習モデルの出力との擬似相関を有するものであり、生成ステップでは、前記疑似相関領域に対して変更及びアノテーションのうちの少なくとも１つを行い、第２のデータセットを生成する、もの。 (1) An information processing system comprising at least one processor capable of executing a program so that each of the following steps is performed: in an acquisition step, a learning result of a learning model trained using a first data set is acquired, together with a correlation between a data region, which is part of the first data set serving as input, and an output of the learning model; in a specifying step, a pseudo-correlation region is specified based on the first data set and the correlation, the pseudo-correlation region being a data region that, among the data regions having the correlation, has a pseudo-correlation with the output of the learning model; and in a generation step, at least one of modification and annotation is performed on the pseudo-correlation region to generate a second data set.

このような構成によれば、学習モデルの学習結果は、第１のデータセットを用いる場合と第２のデータセットとで、擬似相関領域の入力対応に応じて異なる。そのため、学習モデルの学習を行う際に第２のデータセットを用いることで、ユーザが、第１のデータセットを用いた学習結果に対する疑似相関領域の影響を把握しやすくなる。 According to such a configuration, the learning results of the learning model differ depending on the input correspondence of the pseudo-correlation region depending on whether the first data set is used or the second data set is used. Therefore, by using the second data set when learning the learning model, it becomes easier for the user to understand the influence of the pseudo-correlation region on the learning results using the first data set.

（２）上記（１）に記載の情報処理システムにおいて、前記取得ステップでは、さらにＨＭＩデバイスからの応答を取得し、前記特定ステップでは、取得された前記ＨＭＩデバイスからの応答と、前記第１のデータセットと、前記相関とに基づき、前記疑似相関領域を特定する、もの。 (2) In the information processing system according to (1) above, in the acquiring step, a response from the HMI device is further acquired, and in the identifying step, the acquired response from the HMI device and the first The pseudo-correlation area is identified based on the data set and the correlation.

このような構成によれば、HMIデバイスからの応答を介して疑似相関領域の特定に、ユーザの経験則を反映させやすくなる。 According to such a configuration, it becomes easier to reflect the user's empirical rules in identifying the pseudo-correlation area via the response from the HMI device.

（３）上記（１）又は（２）に記載の情報処理システムにおいて、さらに、受付ステップでは、ユーザによる、前記変更及び前記アノテーションのうちの少なくとも１つの指定を受け付ける、もの。 (3) In the information processing system according to (1) or (2) above, the accepting step further accepts a user's designation of at least one of the change and the annotation.

このような構成によれば、特定された疑似相関領域が真に疑似相関を有する領域であるか否か、どの程度疑似相関が強いか、などの疑似相関領域に関するユーザの主観的な判断を、擬似相関領域の入力態様に反映させることができる。そのため、このように生成される第２のデータセットを用いて学習モデルの学習を行うことで、よりユーザの主観と矛盾の少ない学習結果を得ることができる。 According to such a configuration, the user's subjective judgment regarding the pseudo-correlation area, such as whether or not the identified pseudo-correlation area truly has pseudo-correlation and how strong the pseudo-correlation is, is This can be reflected in the input mode of the pseudo-correlation area. Therefore, by learning the learning model using the second data set generated in this way, it is possible to obtain learning results that are more consistent with the user's subjectivity.

（４）上記（１）～（３）の何れか１つに記載の情報処理システムにおいて、表示処理ステップでは、特定された前記疑似相関領域を、前記疑似相関領域以外の前記データ領域と異なる態様で表示させる、もの。 (4) The information processing system according to any one of (1) to (3) above, wherein, in a display processing step, the specified pseudo-correlation region is displayed in a manner different from the data regions other than the pseudo-correlation region.

このような構成によれば、ユーザは、どのデータ領域が疑似相関領域であるかを視覚的に判別しやすくなる。 According to such a configuration, it becomes easier for the user to visually determine which data area is a pseudo-correlation area.

（５）上記（１）～（４）の何れか１つに記載の情報処理システムにおいて、前記学習結果は、前記学習モデルによって分類される、前記データセットに含まれるデータが属するクラスに関するクラス情報を含み、さらに、類似度計算ステップでは、取得された前記クラス情報に基づき、前記クラス間の類似度を計算し、前記特定ステップでは、さらに、計算された前記類似度に基づき、前記相関を特定する、もの。 (5) The information processing system according to any one of (1) to (4) above, wherein the learning result includes class information on the class to which data included in the data set belongs, as classified by the learning model; in a similarity calculation step, the similarity between the classes is calculated based on the acquired class information; and in the specifying step, the correlation is further specified based on the calculated similarity.

このような構成によれば、クラス間の類似度という定量的な指標に基づき疑似相関領域が特定されるため、特定結果の客観性が向上する。 According to such a configuration, a pseudo correlation region is identified based on a quantitative index of similarity between classes, so that the objectivity of the identification result is improved.

（６）上記（１）～（５）の何れか１つに記載の情報処理システムにおいて、前記第１のデータセットが画像データを含む場合、前記生成ステップでは、前記疑似相関領域を含む前記画像データに対して画像処理を行うことにより、前記第２のデータセットを生成する、もの。 (6) In the information processing system according to any one of (1) to (5) above, when the first data set includes image data, in the generation step, the image including the pseudo-correlation area The second data set is generated by performing image processing on the data.

According to such a configuration, the difference between the first data set and the second data set becomes easier to grasp visually, and therefore it becomes easier for the user to understand the influence of the pseudo-correlation region.

(7) The information processing system according to (6) above, wherein, in the generation step, the second data set is generated by performing partial image processing on the pseudo-correlation region.

According to such a configuration, it is possible to reduce the possibility that the image processing affects regions other than the pseudo-correlation region, in particular data regions that have a true correlation. Therefore, it becomes easier for the user to grasp the influence of the pseudo-correlation region more accurately.
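Partial image processing confined to the pseudo-correlation region can be sketched as follows. The sketch assumes a grayscale image stored as a list of rows and a rectangular region, and it uses mean filling as the image processing; `mask_region` and the box format are hypothetical, and other operations (blurring, inpainting, and so on) would fit the same pattern.

```python
from typing import List, Optional, Tuple

# Hypothetical box format: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]
Image = List[List[int]]


def mask_region(image: Image, box: Box, fill: Optional[int] = None) -> Image:
    """Return a copy of `image` in which only the pixels inside `box` change.

    If `fill` is omitted, the rounded mean of the pixels outside the box is
    used, so the masked region blends into the rest of the image.
    """
    top, left, bottom, right = box
    height = len(image)
    width = len(image[0]) if image else 0

    def inside(y: int, x: int) -> bool:
        return top <= y < bottom and left <= x < right

    if fill is None:
        outside = [image[y][x] for y in range(height) for x in range(width)
                   if not inside(y, x)]
        fill = round(sum(outside) / len(outside)) if outside else 0

    # Pixels outside the box are copied unchanged; the input is not mutated.
    return [[fill if inside(y, x) else image[y][x] for x in range(width)]
            for y in range(height)]


# Illustrative 3x3 image whose bright right-hand strip plays the role of the
# pseudo-correlation region.
image = [[10, 10, 200],
         [10, 10, 200],
         [10, 10, 10]]
modified = mask_region(image, box=(0, 2, 2, 3))
print(modified)
```

Because the function returns a new image, the original sample of the first data set stays available for side-by-side comparison with the corresponding sample of the second data set.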

(8) The information processing system according to any one of (1) to (7) above, wherein, in a learning step, the learning model is further trained using the second data set.

According to such a configuration, it is possible to obtain a learning model with higher reliability than when the first data set is used.

(9) An information processing method comprising each step of the information processing system according to any one of (1) to (8) above.

(10) An information processing program that causes at least one computer to execute each step of the information processing system according to any one of (1) to (8) above.
Of course, the present disclosure is not limited to these configurations.

Finally, although various embodiments according to the present disclosure have been described, these are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. The embodiments and their modifications are included within the scope and gist of the invention, and are included within the scope of the invention described in the claims and its equivalents.

1: Information processing system
2: Information processing device
3: User terminal
20: Communication bus
21: Communication unit
22: Storage unit
23: Processor
30: Communication bus
31: Communication unit
32: Storage unit
33: Processor
34: Display unit
35: HMI device
231: Acquisition unit
232: Similarity calculation unit
233: Identification unit
234: Display processing unit
235: Reception unit
236: Generation unit
237: Learning unit
DB1: Database
R: Data region
R1: Correlation region
R2: Pseudo-correlation region
R3: Correlation region

Claims

An information processing system comprising
at least one processor capable of executing a program such that each of the following steps is performed, wherein:
in an acquisition step, a learning result of a learning model trained using a first data set, and a correlation between a data region that is part of the first data set as an input and an output of the learning model, are acquired;
in an identifying step, a pseudo-correlation region is identified based on the first data set and the correlation, the pseudo-correlation region being, among the data regions having the correlation, a data region that has a pseudo correlation with the output of the learning model; and
in a generation step, at least one of modification and annotation is performed on the pseudo-correlation region to generate a second data set.

The information processing system according to claim 1, wherein
in the acquisition step, a response from an HMI device is further acquired, and
in the identifying step, the pseudo-correlation region is identified based on the acquired response from the HMI device, the first data set, and the correlation.

The information processing system according to claim 1, wherein,
further, in a reception step, a designation of at least one of the modification and the annotation is received from a user.

The information processing system according to claim 1, wherein,
in a display processing step, the identified pseudo-correlation region is displayed in a manner different from the data regions other than the pseudo-correlation region.

The information processing system according to claim 1, wherein
the learning result includes class information regarding the classes, as classified by the learning model, to which the data included in the data set belong,
further, in a similarity calculation step, the similarity between the classes is calculated based on the acquired class information, and
in the identifying step, the correlation is further identified based on the calculated similarity.

The information processing system according to claim 1, wherein,
when the first data set includes image data,
in the generation step, the second data set is generated by performing image processing on the image data that includes the pseudo-correlation region.

The information processing system according to claim 6, wherein,
in the generation step, the second data set is generated by performing partial image processing on the pseudo-correlation region.

The information processing system according to claim 1, wherein,
further, in a learning step, the learning model is trained using the second data set.

An information processing method
comprising each step of the information processing system according to any one of claims 1 to 8.

An information processing program
that causes at least one computer to execute each step of the information processing system according to any one of claims 1 to 8.