JP7044164B2

JP7044164B2 - Information processing equipment, information processing method, program

Info

Publication number: JP7044164B2
Application number: JP2020540930A
Authority: JP
Inventors: 康介西原
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2018-09-05
Filing date: 2018-09-05
Publication date: 2022-03-30
Anticipated expiration: 2038-09-05
Also published as: WO2020049667A1; JPWO2020049667A1

Description

The present invention relates to an information processing device, an information processing method, and a program for classifying input data.

With the development of information and communication technology, the accuracy of recognizing the outside world has improved dramatically. In certain fields, the accuracy of classifying objects in moving images into known categories now surpasses that of humans. Progress in deep learning has been especially remarkable: high classification accuracy is achieved by training a neural network (NN) in advance with training data. Such recognition technology is being put to practical use in various fields, for example to judge product defects at high speed in factories and to detect pedestrians in the dark in automated driving.

However, there are still many respects in which this technology falls short of humans. Conventional recognition techniques require a large amount of training data, whereas a person can learn from a single sample. In other words, a person can classify even things never seen before: something seen for the first time is distinguished from the known categories, and the next time something of the same kind appears, it is judged to be the same as what was seen earlier. For example, even a person who does not know what a giraffe is can, after encountering one once, recognize the next giraffe as the same long-necked animal seen before.

Several methods are known for performing classification from a single unknown sample. In Non-Patent Document 1, an image is input to an NN, and the output is taken as a feature vector and used for classification (see FIG. 1). Specifically, the NN is first trained with training data so that each feature vector approaches the mean vector of its category. The closeness between vectors is defined by, for example, the Euclidean distance between feature vectors. When test data is input to an NN trained in this way, it is converted into a feature vector close to the mean vector of the category to which the test data belongs. Therefore, by calculating the distance to every mean vector, the category of the closest mean vector can be taken as the classification result.
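The nearest-mean classification described for Non-Patent Document 1 can be illustrated with a minimal sketch. The 2-D feature vectors below are hypothetical stand-ins for NN outputs, and the helper names (`mean_vector`, `nearest_mean`) are assumptions made for illustration, not names taken from the cited work.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_mean(feature, means):
    """Return the category whose mean vector is closest to `feature`."""
    return min(means, key=lambda category: euclidean(feature, means[category]))


# Hypothetical feature vectors that a trained NN might output per category.
train = {
    "cat": [[0.9, 0.1], [1.1, -0.1]],
    "dog": [[-1.0, 0.2], [-0.8, -0.2]],
}
means = {category: mean_vector(vectors) for category, vectors in train.items()}

# A test feature vector is assigned to the category of the closest mean.
predicted = nearest_mean([0.7, 0.0], means)
```

In this toy setup the test vector lies next to the "cat" mean, so that category is returned.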

For example, when an unknown image 1 that belongs to none of the categories is input, it is converted into a feature vector far from every mean vector learned from the training data. When another unknown image 2 belonging to the same category as unknown image 1 is then input, the output is a feature vector that is far from the known mean vectors and close to the feature vector of unknown image 1. It can therefore be inferred that unknown image 2 does not belong to any known category and likely belongs to the same category as unknown image 1.
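The inference about unknown images can be sketched as follows. The text does not say how "far" and "close" are decided, so the single `threshold` parameter, the `assign` function, and the returned labels are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(feature, known_means, unknown_vectors, threshold):
    """Classify a feature vector as known, as matching an earlier unknown, or as new.

    `threshold` is a hypothetical cut-off for "close"; it is not from the source.
    """
    category, dist = min(
        ((c, euclidean(feature, m)) for c, m in known_means.items()),
        key=lambda item: item[1],
    )
    if dist <= threshold:
        return ("known", category)
    for index, vector in enumerate(unknown_vectors):
        if euclidean(feature, vector) <= threshold:
            return ("unknown", index)
    return ("new", None)


known_means = {"cat": [1.0, 0.0], "dog": [-1.0, 0.0]}
unknown_vectors = []

# Unknown image 1: far from every known mean, so it is remembered as new.
image1 = [0.0, 3.0]
first = assign(image1, known_means, unknown_vectors, threshold=0.5)
if first[0] == "new":
    unknown_vectors.append(image1)

# Unknown image 2: far from known means but close to unknown image 1.
image2 = [0.1, 2.9]
second = assign(image2, known_means, unknown_vectors, threshold=0.5)
```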

The technique described in Non-Patent Document 2 also classifies the feature vectors output by an NN trained with training data (see FIG. 2). Specifically, for the feature vectors of two images, the method of Non-Patent Document 2 trains the NN so that the distance between the feature vectors becomes small when the two images belong to the same category, and so that the feature vectors are separated by a given constant value (for example, a distance of 1) when they belong to different categories. Both methods can be regarded as techniques for constructing a feature vector space in which a new category can be formed from a single sample and used for classification.

Non-Patent Document 1: J. Snell, K. Swersky, and R. S. Zemel, “Prototypical networks for few-shot learning,” in NIPS 2017.
Non-Patent Document 2: G. Koch, R. Zemel, and R. Salakhutdinov, “Siamese neural networks for one-shot image recognition,” in ICML Workshop 2015.

However, the methods described above use only the knowledge of whether feature vectors belong to the same category; the relationships between categories are not included in the optimization constraints. As a result, even when the category mean vectors are computed after training with the same training data, their positions in the feature vector space differ from one training run to the next, depending on the initial values. The learned categories can then be classified, but unknown categories cannot always be classified accurately, and the classification accuracy varies with the category positions obtained by training. In other words, the methods described above cannot improve the accuracy of classifying input data into categories.

Accordingly, an object of the present invention is to provide an information processing device, an information processing method, and a program that can solve the above problem, namely that the accuracy of classifying input data into categories cannot be improved.

An information processing device according to one aspect of the present invention comprises:
a feature extraction unit that converts different input data into respective feature vectors;
a distance calculation unit that calculates a vector distance representing the distance between the feature vectors of the different input data; and
a loss calculation unit that calculates error information based on the vector distance,
wherein the loss calculation unit calculates the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updates the method by which the feature extraction unit converts the input data into the feature vectors.

A program according to one aspect of the present invention causes an information processing device to implement:
a feature extraction unit that converts different input data into respective feature vectors;
a distance calculation unit that calculates a vector distance representing the distance between the feature vectors of the different input data; and
a loss calculation unit that calculates error information based on the vector distance,
wherein the loss calculation unit calculates the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updates the method by which the feature extraction unit converts the input data into the feature vectors.

An information processing method according to one aspect of the present invention comprises:
converting different input data into respective feature vectors;
calculating a vector distance representing the distance between the feature vectors of the different input data;
calculating error information based on the vector distance; and
further, calculating the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updating the method of converting the input data into the feature vectors.

Configured as described above, the present invention can improve the accuracy of classifying input data into categories.

FIG. 1 is a diagram for explaining the classification method of Non-Patent Document 1.
FIG. 2 is a diagram for explaining the classification method of Non-Patent Document 2.
FIG. 3 is a block diagram showing the configuration of the computer in Embodiment 1 of the present invention.
FIG. 4 is a functional block diagram showing the configuration of the processor disclosed in FIG. 3.
FIG. 5 is a diagram for explaining an example of the relationship information used in processing by the loss calculation unit disclosed in FIG. 3.
FIG. 6 is a diagram for explaining an example of the relationship information used in processing by the loss calculation unit disclosed in FIG. 3.
FIG. 7 is a flowchart showing the processing operation of the processor disclosed in FIG. 3.
FIG. 8 is a block diagram showing the configuration of the processor in Embodiment 2 of the present invention.
FIG. 9 is a flowchart showing the processing operation of the processor disclosed in FIG. 8.
FIG. 10 is a block diagram showing the configuration of the information processing device in Embodiment 3 of the present invention.

<Embodiment 1>
The first embodiment of the present invention will be described with reference to FIGS. 3 to 7. FIGS. 3 to 6 are diagrams for explaining the configuration of the information processing device, and FIG. 7 is a diagram for explaining the operation of the information processing device.

[Configuration]
The present invention is configured by a computer 1 as shown in FIG. 3. The computer 1 is an information processing device equipped with hardware such as a processor 10, which is an arithmetic unit, that is, a CPU (Central Processing Unit); a main memory 30, which is a storage device; and an interface 20 used for data input and output. Each component is described in detail below.

The interface 20 is used to exchange data through data input/output means. For example, a network device, a file device, and a sensor device are connected to the interface 20. Various kinds of information are input to and output from the processor 10 via the interface 20.

The main memory 30 stores the programs executed by the processor 10 and the associated data. The stored programs contain the instructions that the processor 10 executes to carry out the processing of each unit described later.

By executing a program stored in the main memory 30, the processor 10 constructs an input unit 1, a feature extraction unit 2, a distance calculation unit 3, and a loss calculation unit 4, as shown in FIG. 4. In this way, the units 1 to 4 of the information processing device of the present invention are realized by incorporating software into hardware such as the processor 10. However, the units 1 to 4 may instead be constructed from hardware such as an integrated circuit (IC).

The program can be stored on various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memory (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)). The program may also be supplied to a computer by various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path, such as an electric wire or optical fiber, or via a wireless communication path.

During training, the input unit 1 extracts one or more pieces of signal information, together with the teacher labels corresponding to that signal information, from the training data, and outputs the signal information to the feature extraction unit 2. During testing, it extracts signal information from the test data and outputs it to the feature extraction unit 2. The training data consists of pairs of signal information, such as images or audio, used to train the feature extraction unit 2, and a teacher label indicating the category to which that signal information belongs. The test data consists of signal information, such as images or audio, to be classified.

The feature extraction unit 2 converts the input signal information into a feature vector with a neural network (NN). For example, it uses a network model of N+M layers, consisting of N convolutional layers and M fully connected layers, and takes the output of the final layer as the feature vector. The network model is optimized based on the error information obtained from the loss calculation unit, as described later. Existing deep learning methods can be used for the optimization.

The distance calculation unit 3 calculates a vector distance representing the distance between the feature vectors converted from the two pieces of signal information being compared. The Euclidean distance, Mahalanobis distance, cosine distance, or the like can be used as the vector distance. During testing, it calculates the distance between the feature vector converted from the input data and the representative feature vector of each category, and takes the category of the closest representative as the classification result. The representative of a category is, for example, the mean of the feature vectors belonging to that category, and is calculated in advance from the training data.
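The vector distances named above can be sketched in a few lines. This is an illustrative sketch only: the Mahalanobis variant below assumes a diagonal covariance (one variance per element) to stay short, which is a simplification not stated in the text, and all function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 minus the cosine similarity; both vectors must be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mahalanobis_diagonal(a, b, variances):
    """Mahalanobis distance under the assumption of a diagonal covariance."""
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))


d_euclid = euclidean([0.0, 0.0], [3.0, 4.0])
d_cosine = cosine_distance([1.0, 0.0], [0.0, 1.0])
d_mahal = mahalanobis_diagonal([0.0, 0.0], [2.0, 0.0], [4.0, 1.0])
```

Any of these functions could serve as the distance used to find the closest category representative at test time.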

The loss calculation unit 4 obtains a loss function by referring to the given relationship information, and calculates error information from that loss function. The relationship information specifies the relationships between categories, and the distance D between categories is obtained by referring to it. The loss function is defined, for example, so that when the input data being compared belong to the same category, the loss is minimized as the distance between the feature vectors converted from those input data approaches 0, and so that when the input data being compared belong to different categories, the loss is minimized as the distance between the feature vectors approaches the distance D between the categories.

As a specific example, let feature vectors x1 and x2 be the inputs, let U(x1,x2) be the Euclidean distance serving as the vector distance between the feature vectors, and let D(x1,x2) be the distance between the categories to which the respective feature vectors belong. In this case, the loss function L(x1,x2) is defined as follows.
L(x1,x2) = δ(x1,x2)・U(x1,x2) + (1-δ(x1,x2))・(D(x1,x2)-U(x1,x2))
Here, δ(x1,x2) is a function that takes the value 1 when the feature vectors x1 and x2 belong to the same class and 0 when they belong to different classes. The loss function can also be defined with a relaxed constraint, as follows.
L'(x1,x2) = δ(x1,x2)・U(x1,x2) + (1-δ(x1,x2))・max(0, D(x1,x2)-U(x1,x2))
Here, max(a,b) is a function that returns the larger of a and b.
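The two loss functions can be written directly from the formulas above. In this sketch, δ is passed as a boolean `same_category` and D as a number `category_distance`; these parameter names are assumptions for illustration. Note that with L as written, the different-category term D - U becomes negative once U exceeds D, whereas the relaxed form L' clamps it at 0.

```python
import math


def euclidean(x1, x2):
    """U(x1, x2): Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))


def loss(x1, x2, same_category, category_distance):
    """L(x1, x2) as written in the text."""
    u = euclidean(x1, x2)
    return u if same_category else category_distance - u


def relaxed_loss(x1, x2, same_category, category_distance):
    """L'(x1, x2): the relaxed variant using max(0, D - U)."""
    u = euclidean(x1, x2)
    return u if same_category else max(0.0, category_distance - u)


# Same category: the loss equals the feature distance U.
l_same = loss([0.0, 0.0], [3.0, 4.0], True, 0.0)

# Different categories with D = 2 and U = 5 (already farther apart than D).
l_diff = loss([0.0, 0.0], [3.0, 4.0], False, 2.0)
l_diff_relaxed = relaxed_loss([0.0, 0.0], [3.0, 4.0], False, 2.0)

# Different categories with D = 2 and U = 1 (still closer than D).
l_close_relaxed = relaxed_loss([0.0, 0.0], [0.0, 1.0], False, 2.0)
```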

The loss calculation unit 4 then takes the result calculated with the loss function as the error information and, based on this error information, changes the method by which the neural network in the feature extraction unit 2 converts inputs into feature vectors. That is, the loss calculation unit 4 outputs the error information to the feature extraction unit 2, and the feature extraction unit 2 updates the weights and other parameters of the neural network based on the error information. The feature extraction unit 2 then continues training on the training data with the updated neural network.

The relationship information described above may be, for example, a value based on a major classification of the categories or on a phylogenetic diagram of the categories, as explained below, or it may express a relationship based on the similarity between categories calculated with an evaluation index specialized for a particular field. As one example, when major classifications that each contain several categories are defined, as shown in FIG. 5, the distance between data belonging to the same category can be set to 0, the distance between different categories in the same major classification to 1, and the distance between categories in different major classifications to 2. In other words, the distance between categories in different major classifications can be defined to be larger than the distance between categories in the same major classification. As a specific example with two character inputs, identical characters can be given a distance of 0, different characters belonging to the same language a distance of 1, and characters from different languages a distance of 2.
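The 0/1/2 distance rule for major classifications can be sketched as a small lookup. The character-to-language table `MAJOR_OF` is hypothetical sample data, and `category_distance` is an illustrative name.

```python
# Hypothetical mapping from category (a character) to its major classification (a language).
MAJOR_OF = {"A": "English", "B": "English", "あ": "Japanese", "い": "Japanese"}


def category_distance(cat_a, cat_b, major_of=MAJOR_OF):
    """Distance D between two categories based on their major classification."""
    if cat_a == cat_b:
        return 0  # same category
    if major_of[cat_a] == major_of[cat_b]:
        return 1  # different categories, same major classification
    return 2      # different major classifications


d_same = category_distance("A", "A")
d_same_language = category_distance("A", "B")
d_other_language = category_distance("A", "あ")
```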

When the categories are represented by a phylogenetic diagram, that is, a branching diagram with multiple branch points, as shown in FIG. 6, the distance can be defined according to the closeness of the lineages, that is, the number of branches between the categories. For example, the value of the relationship information can be defined to be smaller as the number of branches decreases. As a specific example, in a phylogenetic diagram of animal lineages, the category distance can be defined to grow as the lineages become farther apart. By referring to the relationships between categories in this way, categories can be placed in the feature vector space so that closely related categories lie close together and distantly related categories lie far apart.
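One way to turn a branching diagram into category distances is to count the edges on the path between two categories through their nearest common ancestor. The text only says the distance depends on the number of branches, so this particular counting rule, the `PARENT` table, and the function names are assumptions for illustration.

```python
# Hypothetical tree given as child -> parent links.
PARENT = {
    "lion": "felines",
    "tiger": "felines",
    "felines": "mammals",
    "horse": "mammals",
    "mammals": "animals",
    "sparrow": "birds",
    "birds": "animals",
}


def ancestors(node, parent):
    """Path from `node` up to the root, starting with the node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def branch_distance(a, b, parent=PARENT):
    """Number of edges between a and b via their nearest common ancestor."""
    steps_from_a = {n: i for i, n in enumerate(ancestors(a, parent))}
    for steps_from_b, n in enumerate(ancestors(b, parent)):
        if n in steps_from_a:
            return steps_from_a[n] + steps_from_b
    raise ValueError("categories are not in the same tree")


d_lion_tiger = branch_distance("lion", "tiger")
d_lion_horse = branch_distance("lion", "horse")
d_lion_sparrow = branch_distance("lion", "sparrow")
```

With this sample tree, closer lineages get smaller values: lion and tiger are nearer than lion and horse, which in turn are nearer than lion and sparrow.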

[Operation]
Next, the operation of the computer 1, that is, the processor 10, will be described with reference to the flowchart of FIG. 7. The description here focuses on the operation of updating the neural network with the training data.

First, the input unit 1 randomly acquires two items from the training data as input data (step S1). When batch processing with K pairs is performed, 2×K data items are input. Next, the feature extraction unit 2 converts the two data items (or 2×K data items) into feature vectors with the NN (step S2). All data are converted by the same NN.

Next, the distance calculation unit 3 calculates the distance between the two feature vectors (step S3). The vector distance is calculated, for example, as the Euclidean distance. For 2×K data items, the vector distance is calculated for each of the K pairs. The loss calculation unit 4 then refers to the relationship information, obtains the loss function, calculates the error information, and outputs it to the feature extraction unit 2 (step S4). The feature extraction unit 2 updates the NN, for example by updating the NN weights based on the error information (step S6), returns to data input, and repeats the training as described above. When the error information is calculated, it is compared with the error information calculated so far, and the processing ends once the value no longer decreases (Yes in step S5).
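The loop of steps S1 to S6 can be sketched in pure Python under strong simplifying assumptions: a linear map stands in for the NN, the gradient is estimated by finite differences rather than backpropagation, the relaxed loss L' is used, and training stops after `patience` consecutive batches without a new lowest loss. All names, hyperparameters, and sample data are hypothetical.

```python
import math
import random

GROUP = {"a": "g1", "b": "g1", "c": "g2"}  # hypothetical major classification


def category_distance(c1, c2):
    """Relationship information: 0 same, 1 same group, 2 different group."""
    if c1 == c2:
        return 0.0
    return 1.0 if GROUP[c1] == GROUP[c2] else 2.0


def extract(weights, signal):
    """Linear stand-in for the NN: feature = weights x signal (step S2)."""
    return [sum(w * s for w, s in zip(row, signal)) for row in weights]


def pair_loss(weights, pair):
    (s1, c1), (s2, c2) = pair
    f1, f2 = extract(weights, s1), extract(weights, s2)
    u = math.sqrt(sum((p - q) ** 2 for p, q in zip(f1, f2)))  # step S3
    return u if c1 == c2 else max(0.0, category_distance(c1, c2) - u)  # step S4


def batch_loss(weights, batch):
    return sum(pair_loss(weights, pair) for pair in batch) / len(batch)


def train(data, k=4, lr=0.05, eps=1e-4, patience=10, max_steps=200, seed=0):
    rng = random.Random(seed)
    dim = len(data[0][0])
    weights = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    history, best, stale = [], float("inf"), 0
    for _ in range(max_steps):
        batch = [tuple(rng.sample(data, 2)) for _ in range(k)]  # step S1
        loss = batch_loss(weights, batch)
        history.append(loss)
        if loss < best:  # step S5: has the error stopped decreasing?
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
        grad = [[0.0] * dim for _ in range(dim)]
        for i in range(dim):
            for j in range(dim):
                trial = [row[:] for row in weights]
                trial[i][j] += eps
                grad[i][j] = (batch_loss(trial, batch) - loss) / eps
        # step S6: update the weights against the estimated gradient
        weights = [[w - lr * g for w, g in zip(wr, gr)]
                   for wr, gr in zip(weights, grad)]
    return weights, history


data = [([0.0, 0.1], "a"), ([0.1, 0.0], "a"), ([1.0, 0.9], "b"),
        ([0.9, 1.0], "b"), ([3.0, 3.1], "c")]
weights, history = train(data)
```

Because batches are sampled at random, the per-batch loss is noisy; the `patience` counter is one assumed way to read "the value no longer decreases" without stopping on the first noisy batch.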

As described above, according to the present invention, the neural network can be optimized by updating its weights and other parameters while taking into account distances based on the relationships between the categories to which the training data belong, for example the similarity of the categories. Input data are thereby converted into feature vectors that are placed in the feature vector space according to the relationships between categories, so the feature vectors are arranged continuously in the feature vector space in accordance with those relationships. For example, similar input data are arranged contiguously in the category space. As a result, unknown categories are also placed according to their similarity and are separated from the known categories according to similarity, so even unknown categories can be classified accurately, and classification accuracy can be improved.

<Embodiment 2>
Next, a second embodiment of the present invention will be described with reference to FIGS. 8 and 9. FIG. 8 is a diagram for explaining the configuration of the processor, and FIG. 9 is a diagram for explaining the operation of the processor.

[Configuration]
In addition to the configuration described in Embodiment 1, the processor 10 in this embodiment includes a relationship calculation unit 5 constructed by executing a program. The following description focuses on the parts of the configuration that differ from Embodiment 1.

The relationship calculation unit 5 calculates the relationship information between categories from the training data. That is, whereas Embodiment 1 illustrated the case of using given relationship information, this embodiment performs the loss calculation with relationship information that has been calculated.

Specifically, the relationship calculation unit 5 calculates the relationship information using the variance or covariance of the feature vectors extracted from the input data. The relationship calculation unit 5 can calculate the relationship information between categories after all of the training data have been converted into feature vectors, or it can calculate the relationship information from the feature vectors converted so far each time training data is input.

When calculating the relationship information using covariance, the relationship calculation unit 5, for example, calculates a covariance matrix for each category to which the input data belong and calculates the relationship information based on the distance between the covariance matrices. The distance between matrices is calculated from the matrix norm, the eigenvalues, or the magnitudes of the eigenvectors. For example, the matrix norm (second-order norm) of a matrix A with elements a_ij is defined as follows.
||A|| = √(Σ_ij (a_ij)^2)
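A minimal sketch of the covariance-based calculation follows. It assumes the population covariance (division by the number of samples) and takes the distance between two covariance matrices to be the matrix norm of their difference; both choices, like the function names, are assumptions for illustration. The norm defined above, the square root of the sum of squared elements, is commonly known as the Frobenius norm.

```python
import math


def covariance_matrix(vectors):
    """Population covariance matrix of the feature vectors of one category."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [
        [sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / n
         for j in range(dim)]
        for i in range(dim)
    ]


def matrix_norm(a):
    """||A|| = sqrt(sum over i, j of a_ij squared)."""
    return math.sqrt(sum(x * x for row in a for x in row))


def covariance_distance(vectors_a, vectors_b):
    """Category distance as the norm of the covariance-matrix difference."""
    cov_a = covariance_matrix(vectors_a)
    cov_b = covariance_matrix(vectors_b)
    diff = [[x - y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(cov_a, cov_b)]
    return matrix_norm(diff)


# Hypothetical feature vectors for two categories.
category_a = [[0.0, 0.0], [2.0, 0.0]]  # varies only along the first element
category_b = [[0.0, 0.0], [0.0, 4.0]]  # varies only along the second element
distance_ab = covariance_distance(category_a, category_b)
```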

When the relationship is calculated using variance, the relationship information is calculated, for example, from the combination of elements with small variance. The variance of each element of the feature vector, calculated per category, differs from category to category. The relationship can be calculated as the Euclidean distance between variance vectors, which have these variances as their elements. Alternatively, a threshold can be set, a vector can be generated whose elements are 1 where the corresponding element of the variance vector is at or below the threshold and 0 otherwise, and the relationship can be calculated as the Hamming distance, binary distance, or the like between these vectors.
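The threshold-and-Hamming variant can be sketched as follows. The population variance, the threshold value 0.5, and the sample feature vectors are assumptions chosen for illustration.

```python
def variance_vector(vectors):
    """Per-element population variance of the feature vectors of one category."""
    n = len(vectors)
    columns = list(zip(*vectors))
    means = [sum(column) / n for column in columns]
    return [sum((x - m) ** 2 for x in column) / n
            for column, m in zip(columns, means)]


def binarize(variances, threshold):
    """1 where the variance is at or below the threshold, 0 otherwise."""
    return [1 if v <= threshold else 0 for v in variances]


def hamming(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical feature vectors for two categories.
category_a = [[0.0, 0.0, 1.0], [0.0, 2.0, 1.0]]
category_b = [[0.0, 5.0, 0.0], [2.0, 5.0, 0.0]]

bits_a = binarize(variance_vector(category_a), threshold=0.5)
bits_b = binarize(variance_vector(category_b), threshold=0.5)
distance_ab = hamming(bits_a, bits_b)
```

Here the two categories are stable (low variance) on different elements, which shows up as a non-zero Hamming distance.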

[Operation]
Next, the operation of the processor 10 described above will be described with reference to the flowchart of FIG. 9. As in Embodiment 1, the description covers the operation of updating the neural network with the training data.

First, as in steps S1 to S3 of Embodiment 1, training data is input, feature vectors are extracted, and the vector distance is calculated (steps S11, S12, and S13). Next, the relationship calculation unit 5 calculates the relationship information from the converted feature vectors (step S14). At this point, the feature vectors converted so far are used to obtain the variance vector of each category and the covariance matrices between categories. A vector is generated whose elements are 1 where the corresponding element of the variance vector is at or below the threshold and 0 otherwise, and the distance between categories is obtained as the Hamming distance between these vectors. Alternatively, the matrix norm between the covariance matrices is calculated to obtain the distance between categories.

After that, the loss calculation unit 4 refers to the relationship information (the distance between categories) calculated by the relationship calculation unit 5, evaluates the loss function to generate error information, and outputs it to the feature extraction unit 2 (step S15). The feature extraction unit 2 updates the NN based on the error information, for example by updating the NN weights (step S17), returns to data input, and repeats the learning as described above. When the error information is calculated, it is compared with the error information calculated so far, and the process ends once the value no longer decreases (Yes in step S16).
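The control flow of steps S15 to S17 can be sketched as a loop that stops once the error no longer decreases. The function `train_until_no_improvement`, its callbacks, the iteration cap, and the one-weight toy problem are all hypothetical stand-ins; the real processing in steps S11 to S15 is represented only by the `compute_error` callback.

```python
def train_until_no_improvement(compute_error, update, max_iters=1000):
    """Repeat learning until the error stops decreasing (step S16)."""
    best = float("inf")
    history = []
    for _ in range(max_iters):
        error = compute_error()      # stands in for steps S11 to S15
        history.append(error)
        if error >= best:            # step S16: value no longer decreases
            break
        best = error
        update(error)                # step S17: update the NN
    return history

# Toy stand-in for the NN: a single weight pulled toward 3.0.
state = {"w": 0.0}

def toy_error():
    return (state["w"] - 3.0) ** 2

def toy_update(_error):
    gradient = 2.0 * (state["w"] - 3.0)
    state["w"] -= 0.25 * gradient

history = train_until_no_improvement(toy_error, toy_update)
```

The comparison here is against the best error seen so far; comparing against only the previous iteration, or adding a tolerance, would be equally consistent with the description.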

<Embodiment 3>
Next, a third embodiment of the present invention will be described with reference to FIG. 10. FIG. 10 is a block diagram showing the configuration of the information processing apparatus according to the third embodiment. This embodiment outlines the configuration of the computer 1 described in the first embodiment.

As shown in FIG. 10, the information processing apparatus 200 in this embodiment includes:
a feature extraction unit 210 that converts different input data into respective feature vectors;
a distance calculation unit 220 that calculates a vector distance representing the distance between the feature vectors of the different input data; and
a loss calculation unit 230 that calculates error information based on the vector distance.
The loss calculation unit 230 calculates the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updates the method by which the feature extraction unit converts the input data into the feature vectors.

The feature extraction unit 210, the distance calculation unit 220, and the loss calculation unit 230 are realized by the information processing apparatus executing a program.

The information processing apparatus 200 configured as above operates to execute the following processing:
converting different input data into respective feature vectors;
calculating a vector distance representing the distance between the feature vectors of the different input data;
calculating error information based on the vector distance; and
further, calculating the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updating the method of converting the input data into the feature vectors.
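One pass through this processing can be sketched with three small functions that mirror units 210, 220, and 230. Everything concrete here is an assumption made for illustration: the feature extractor is a simple element-wise scaling rather than a neural network, the loss for a pair from different categories is taken to be the squared difference between the category distance and the vector distance, and the learning rate 0.1 and input values are arbitrary.

```python
import math

def extract(weights, x):
    """Feature extraction unit 210 (assumed form): element-wise scaling."""
    return [w * xi for w, xi in zip(weights, x)]

def vector_distance(f1, f2):
    """Distance calculation unit 220: Euclidean distance between features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def loss_and_gradient(weights, x1, x2, category_distance):
    """Loss calculation unit 230 (assumed loss): (D - d)^2 for a pair
    drawn from different categories, with its gradient per weight."""
    d = vector_distance(extract(weights, x1), extract(weights, x2))
    loss = (category_distance - d) ** 2
    # Analytic gradient of (D - d)^2 with respect to each weight; needs d > 0.
    grad = [
        -2.0 * (category_distance - d) * w * (a - b) ** 2 / d
        for w, a, b in zip(weights, x1, x2)
    ]
    return loss, grad

weights = [1.0, 1.0]
x1, x2 = [1.0, 0.0], [0.0, 0.0]      # hypothetical inputs from two categories

loss_before, grad = loss_and_gradient(weights, x1, x2, category_distance=3.0)
weights = [w - 0.1 * g for w, g in zip(weights, grad)]   # update the conversion
loss_after, _ = loss_and_gradient(weights, x1, x2, category_distance=3.0)
```

After the update the two feature vectors are farther apart, so their distance is closer to the category distance and the error is smaller than before.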

According to the above invention, the neural network can be optimized, for example by updating its weights, while taking into account a distance based on the relationship between the categories to which the learning data belong. This makes it possible to improve classification accuracy.

<Appendices>
Part or all of the above embodiments may also be described as in the following appendices. The configurations of the information processing apparatus, the information processing method, and the program according to the present invention are outlined below. However, the present invention is not limited to the following configurations.

(Appendix 1)
An information processing apparatus comprising:
a feature extraction unit that converts different input data into respective feature vectors;
a distance calculation unit that calculates a vector distance representing the distance between the feature vectors of the different input data; and
a loss calculation unit that calculates error information based on the vector distance,
wherein the loss calculation unit calculates the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updates the method by which the feature extraction unit converts the input data into the feature vectors.

(Appendix 2)
The information processing apparatus according to Appendix 1, wherein
the relationship information is information in which the distance between the categories to which the different input data respectively belong is defined, and
the loss function is defined so that, when the different input data belong to different categories, the error information is calculated as a value based on the distance between the categories and the vector distance.

(Appendix 3)
The information processing apparatus according to Appendix 2, wherein
the loss function is defined so that, when the different input data belong to the same category, the value of the error information becomes smaller as the vector distance becomes shorter, and, when the different input data belong to different categories, the value of the error information becomes smaller as the vector distance approaches the distance between the categories.
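A loss function with the two properties stated in Appendix 3 could take the following form. The squared expressions are an assumption chosen for simplicity; the appendix only constrains how the error value behaves, not its exact formula, and the name `pair_loss` is hypothetical.

```python
def pair_loss(vector_distance, same_category, category_distance=0.0):
    """Error for one pair of inputs, given the distance between their features."""
    if same_category:
        # Same category: the error shrinks as the feature vectors get closer.
        return vector_distance ** 2
    # Different categories: the error shrinks as the vector distance
    # approaches the distance defined between the two categories.
    return (category_distance - vector_distance) ** 2

same_near = pair_loss(0.5, same_category=True)
same_far = pair_loss(1.0, same_category=True)
diff_near_target = pair_loss(2.5, same_category=False, category_distance=3.0)
diff_far_from_target = pair_loss(1.0, same_category=False, category_distance=3.0)
```

With this form, a pair from different categories incurs zero error exactly when its vector distance equals the category distance, so closely related categories are placed nearer to each other in feature space than unrelated ones.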

(Appendix 4)
The information processing apparatus according to Appendix 2 or 3, wherein,
when the categories are further configured to be classified into major classifications that encompass the categories, the relationship information is defined so that its value when the different input data belong to categories classified into different major classifications is larger than its value when the different input data belong to categories classified into the same major classification.
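The ordering required by Appendix 4 can be illustrated with a small lookup. The category names, the major classifications, and the distance values 1.0 and 2.0 are hypothetical; only the requirement that the cross-classification value be the larger one comes from the text.

```python
# Hypothetical assignment of categories to major classifications.
MAJOR_OF = {"cat": "mammal", "dog": "mammal", "sparrow": "bird"}

def category_distance(c1, c2, same_major=1.0, different_major=2.0):
    """Distance between categories, larger across major classifications."""
    if c1 == c2:
        return 0.0
    if MAJOR_OF[c1] == MAJOR_OF[c2]:
        return same_major
    return different_major

within = category_distance("cat", "dog")        # same major classification
across = category_distance("cat", "sparrow")    # different major classifications
```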

(Appendix 5)
The information processing apparatus according to Appendix 2 or 3, wherein,
when the categories are set based on a branching diagram, the relationship information is defined so that its value becomes smaller as the number of branches between the categories to which the different input data belong becomes smaller.
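One way to obtain a value that shrinks with the number of branches is to count the edges on the path between two categories in the diagram. Interpreting "number of branches" as this edge count is an assumption, and the tree below and the names `PARENT`, `path_to_root`, and `branch_count` are hypothetical.

```python
# Hypothetical branching diagram, stored as child -> parent links.
PARENT = {
    "cat": "carnivore", "dog": "carnivore",
    "carnivore": "mammal", "horse": "mammal",
    "mammal": "animal", "sparrow": "bird", "bird": "animal",
    "animal": None,
}

def path_to_root(node):
    """Nodes from the given category up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def branch_count(c1, c2):
    """Edges on the path between two categories via their nearest common ancestor."""
    steps_from_c2 = {n: i for i, n in enumerate(path_to_root(c2))}
    for steps_from_c1, node in enumerate(path_to_root(c1)):
        if node in steps_from_c2:
            return steps_from_c1 + steps_from_c2[node]
    raise ValueError("categories are not in the same diagram")

near = branch_count("cat", "dog")
mid = branch_count("cat", "horse")
far = branch_count("cat", "sparrow")
```

The count itself, or any increasing function of it, could then serve as the distance between categories in the relationship information.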

(Appendix 6)
The information processing apparatus according to any one of Appendices 1 to 3, further comprising
a relationship calculation unit that calculates, based on the feature vectors converted from the input data and the categories to which the input data belong, the relationship information representing the relationship between the categories to which the input data belong.

(Appendix 7)
The information processing apparatus according to Appendix 6, wherein
the relationship calculation unit calculates the relationship information based on the variance of the feature vectors for each category or on the covariance of the feature vectors between the categories.

(Appendix 8)
A program for causing an information processing apparatus to realize:
a feature extraction unit that converts different input data into respective feature vectors;
a distance calculation unit that calculates a vector distance representing the distance between the feature vectors of the different input data; and
a loss calculation unit that calculates error information based on the vector distance,
wherein the loss calculation unit calculates the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updates the method by which the feature extraction unit converts the input data into the feature vectors.

(Appendix 9)
The program according to Appendix 8, further causing the information processing apparatus to realize
a relationship calculation unit that calculates, based on the feature vectors converted from the input data and the categories to which the input data belong, the relationship information representing the relationship between the categories to which the input data belong.

(Appendix 10)
An information processing method comprising:
converting different input data into respective feature vectors;
calculating a vector distance representing the distance between the feature vectors of the different input data;
calculating error information based on the vector distance; and
further, calculating the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updating the method of converting the input data into the feature vectors.

(Appendix 11)
The information processing method according to Appendix 10, further comprising
calculating, based on the feature vectors converted from the input data and the categories to which the input data belong, the relationship information representing the relationship between the categories to which the input data belong.

Although the present invention has been described above with reference to the embodiments and the like, the present invention is not limited to the embodiments described above. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within its scope.

1 Input unit
2 Feature extraction unit
3 Distance calculation unit
4 Loss calculation unit
5 Relationship calculation unit
10 Processor
20 Interface
30 Main memory
100 Computer
200 Information processing apparatus
210 Feature extraction unit
220 Distance calculation unit
230 Loss calculation unit

Claims

1. An information processing apparatus comprising:
a feature extraction unit that converts different input data into respective feature vectors;
a distance calculation unit that calculates a vector distance representing the distance between the feature vectors of the different input data; and
a loss calculation unit that calculates error information based on the vector distance,
wherein the loss calculation unit calculates the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updates the method by which the feature extraction unit converts the input data into the feature vectors.

2. The information processing apparatus according to claim 1, wherein
the relationship information is information in which the distance between the categories to which the different input data respectively belong is defined, and
the loss function is defined so that, when the different input data belong to different categories, the error information is calculated as a value based on the distance between the categories and the vector distance.

3. The information processing apparatus according to claim 2, wherein
the loss function is defined so that, when the different input data belong to the same category, the value of the error information becomes smaller as the vector distance becomes shorter, and, when the different input data belong to different categories, the value of the error information becomes smaller as the vector distance approaches the distance between the categories.

4. The information processing apparatus according to claim 2 or 3, wherein,
when the categories are further configured to be classified into major classifications that encompass the categories, the relationship information is defined so that its value when the different input data belong to categories classified into different major classifications is larger than its value when the different input data belong to categories classified into the same major classification.

5. The information processing apparatus according to claim 2 or 3, wherein,
when the categories are set based on a branching diagram, the relationship information is defined so that its value becomes smaller as the number of branches between the categories to which the different input data belong becomes smaller.

6. The information processing apparatus according to any one of claims 1 to 3, further comprising
a relationship calculation unit that calculates, based on the feature vectors converted from the input data and the categories to which the input data belong, the relationship information representing the relationship between the categories to which the input data belong.

7. The information processing apparatus according to claim 6, wherein
the relationship calculation unit calculates the relationship information based on the variance of the feature vectors for each category or on the covariance of the feature vectors between the categories.

8. A program for causing an information processing apparatus to realize:
a feature extraction unit that converts different input data into respective feature vectors;
a distance calculation unit that calculates a vector distance representing the distance between the feature vectors of the different input data; and
a loss calculation unit that calculates error information based on the vector distance,
wherein the loss calculation unit calculates the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updates the method by which the feature extraction unit converts the input data into the feature vectors.

9. An information processing method comprising:
converting different input data into respective feature vectors;
calculating a vector distance representing the distance between the feature vectors of the different input data;
calculating error information based on the vector distance; and
further, calculating the error information from a loss function based on relationship information representing the relationship between the categories to which the different input data respectively belong, and, based on the error information, updating the method of converting the input data into the feature vectors.

10. The information processing method according to claim 9, further comprising
calculating, based on the feature vectors converted from the input data and the categories to which the input data belong, the relationship information representing the relationship between the categories to which the input data belong.