JP7469341B2 - Method and system for classifying data using parameter size invariant classifiers for unbiased classification

Info

Publication number: JP7469341B2
Application number: JP2022001602A
Authority: JP
Inventors: ホドクチャン; ドンユンウィ
Original assignee: Line Works; Naver Corp
Current assignee: Line Works; Naver Corp
Priority date: 2021-01-22
Filing date: 2022-01-07
Publication date: 2024-04-16
Anticipated expiration: 2042-01-07
Also published as: JP2022113125A; KR20220106541A; KR102459971B1

Description

The present invention relates to a method and system for classifying data using a parameter-size-invariant classifier for unbiased classification.

Standard classifiers widely used in artificial intelligence and machine learning suffer from biased classification performance, because the magnitude (size) of the classifier's parameters is affected by the bias inherent in the training data set. For example, when input data is classified into one of multiple categories, bias arises from differences in the amount of training data available for each category.

To address this problem, a conventional technique groups categories that contain roughly the same amount of training data and performs training within each group. However, this technique is specialized for one particular task (the detection task) and cannot be applied to the classification problems of other tasks (the instance segmentation task or the classification task). It also requires searching over many hyperparameters, so applying it in practice takes considerable resources and time.

Korean Registered Patent No. 10-2180054

Provided are a data classification method and system that can resolve the classifier bias problem by removing the bias with respect to the magnitude of the classifier's parameter vector.

Provided is a data classification method for a computer device including at least one processor, the method including: generating, by the at least one processor, an embedding vector of input data; calculating, by the at least one processor, an inner product of the embedding vector and a parameter vector of a trained classifier; and calculating, by the at least one processor, a logit from which bias with respect to the magnitude of the parameter vector has been removed, by applying a norm of the parameter vector to the result of the inner product.

According to one aspect, calculating the logit may include dividing the result of the inner product by the norm of the parameter vector, thereby removing the bias with respect to the magnitude of the parameter vector.

According to another aspect, calculating the logit may include adding, to the norm-applied result of the inner product, a bias term to which a hyperbolic tangent has been applied.

Provided is a computer program for causing a computer device to execute the method.

Provided is a computer-readable recording medium on which a program for causing a computer device to execute the method is recorded.

Provided is a computer device including at least one processor implemented to execute computer-readable instructions, wherein the at least one processor generates an embedding vector of input data, calculates an inner product of the embedding vector and a parameter vector of a trained classifier, and calculates a logit from which bias with respect to the magnitude of the parameter vector has been removed, by applying a norm of the parameter vector to the result of the inner product.

The classifier bias problem can be resolved by removing the bias with respect to the magnitude of the classifier's parameter vector.

FIG. 1 is a block diagram illustrating an example of a computer device according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating an example of a data classification method according to an embodiment of the present invention.
FIG. 3 is a chart comparing performance when a standard logit representation is used and when a norm-invariant logit representation according to an embodiment of the present invention is used.
FIG. 4 is a chart comparing performance when a standard logit representation is used and when a norm-invariant logit representation according to an embodiment of the present invention is used.

Embodiments are described in detail below with reference to the accompanying drawings.

A data classification system according to embodiments of the present invention may be implemented by at least one computer device, and a data classification method according to embodiments of the present invention may be executed by at least one computer device included in the data classification system. A computer program according to an embodiment of the present invention may be installed and run on the computer device, and the computer device may execute the data classification method under the control of the running computer program. The computer program may be recorded on a computer-readable recording medium so that, in combination with the computer device, it causes the computer to execute the data classification method.

FIG. 1 is a block diagram illustrating an example of a computer device according to an embodiment of the present invention. As shown in FIG. 1, the computer device 100 may include a memory 110, a processor 120, a communication interface 130, and an input/output interface 140. The memory 110 is a computer-readable recording medium and may include a RAM (random access memory), a ROM (read only memory), and a persistent mass storage device such as a disk drive. A persistent mass storage device such as a ROM or a disk drive may instead be included in the computer device 100 as a persistent storage device separate from the memory 110. An operating system and at least one program code may be recorded in the memory 110. These software components may be loaded into the memory 110 from a computer-readable recording medium separate from the memory 110, such as a floppy (registered trademark) drive, a disk, a tape, a DVD/CD-ROM drive, or a memory card. In other embodiments, the software components may be loaded into the memory 110 through the communication interface 130 rather than from a computer-readable recording medium. For example, the software components may be loaded into the memory 110 of the computer device 100 based on a computer program installed from files received via the network 160.

The processor 120 may be configured to process the instructions of a computer program by performing basic arithmetic, logic, and input/output operations. The instructions may be provided to the processor 120 by the memory 110 or the communication interface 130. For example, the processor 120 may be configured to execute received instructions according to program code recorded in a storage device such as the memory 110.

The communication interface 130 may provide a function that allows the computer device 100 to communicate with other electronic devices (for example, the storage device described above) via the network 160. For example, requests, instructions, data, and files generated by the processor 120 of the computer device 100 according to program code recorded in a storage device such as the memory 110 may be transmitted to other devices via the network 160 under the control of the communication interface 130. Conversely, signals, instructions, data, and files from other devices may be received by the computer device 100 through the communication interface 130 via the network 160. Signals, instructions, and data received through the communication interface 130 may be passed to the processor 120 or the memory 110, and files may be recorded on a recording medium (the persistent storage device described above) that the computer device 100 may further include.

The input/output interface 140 may be a means for interfacing with the input/output device 150. For example, input devices may include a microphone, a keyboard, or a mouse, and output devices may include a display or a speaker. As another example, the input/output interface 140 may be a means for interfacing with a device that integrates input and output functions, such as a touch screen. The input/output device 150 may be integrated with the computer device 100 as a single device.

In other embodiments, the computer device 100 may include fewer or more components than those shown in FIG. 1. However, most conventional components need not be explicitly illustrated. For example, the computer device 100 may be implemented to include at least some of the input/output devices 150 described above, and may further include other components such as a transceiver or a database.

FIG. 2 is a flowchart illustrating an example of a data classification method according to an embodiment of the present invention. The data classification method according to this embodiment may be executed by the computer device 100. The processor 120 of the computer device 100 may be implemented to execute control instructions according to the operating system code and the code of at least one computer program contained in the memory 110. Here, the processor 120 may control the computer device 100 so that it performs steps 210 to 230 of the method of FIG. 2 according to the control instructions provided by the code recorded in the computer device 100.

In step 210, the computer device 100 may generate an embedding vector of the input data. For example, if the input data is an image, the computer device 100 may generate a feature vector of the image as the embedding vector.

In step 220, the computer device 100 may calculate the inner product of the embedding vector and the parameter vector of the trained classifier. Training the classifier on training data corresponds to learning the classifier's parameter vector. The computer device 100 may therefore calculate the inner product of the parameter vector (or classifier weight vector) of a classifier trained in advance on the training data of each category and the embedding vector generated in step 210.
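As an illustration of step 220, the following minimal sketch computes one inner product per category. The function names and the numeric values are hypothetical and are not taken from the patent; the sketch assumes each category has its own parameter vector of the same dimension as the embedding.

```python
def inner_product(u, v):
    """Return the inner (dot) product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def class_inner_products(embedding, parameter_vectors):
    """Compute one inner product per category, as in step 220."""
    return [inner_product(w, embedding) for w in parameter_vectors]


# Hypothetical values: a 3-dimensional embedding and two categories.
embedding = [1.0, 2.0, 2.0]
parameter_vectors = [
    [3.0, 0.0, 4.0],  # category 0
    [0.3, 0.0, 0.4],  # category 1: same direction, one tenth the magnitude
]

scores = class_inner_products(embedding, parameter_vectors)
print(scores)
```

Both parameter vectors point in the same direction, yet the raw inner products differ by a factor of ten, which is the magnitude dependence that the next step removes.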

In step 230, the computer device 100 may apply a norm of the parameter vector to the result of the inner product to calculate a logit (logistic probit) from which bias with respect to the magnitude of the parameter vector has been removed. For example, the computer device 100 may divide the result of the inner product by the norm of the parameter vector. Because the norm is the magnitude of a vector, dividing the inner product of the embedding vector and the parameter vector by that norm leaves only the direction of the parameter vector and removes its magnitude. The bias caused by differences in the amount of training data per category can therefore be removed.
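The effect of the division can be checked numerically. The sketch below assumes the Euclidean (L2) norm, which the text does not name explicitly, and uses hypothetical values. Under that assumption the quotient equals the magnitude of the embedding times the cosine of the angle between the two vectors, so rescaling the parameter vector should leave it unchanged.

```python
import math


def l2_norm(v):
    """Euclidean norm of a vector (assumed choice of norm)."""
    return math.sqrt(sum(x * x for x in v))


def normalized_score(embedding, w):
    """Inner product of w and the embedding, divided by the norm of w."""
    norm = l2_norm(w)
    if norm == 0.0:
        raise ValueError("parameter vector must be non-zero")
    dot = sum(a * b for a, b in zip(w, embedding))
    return dot / norm


# Hypothetical values for illustration only.
embedding = [1.0, 2.0, 2.0]
w = [3.0, 0.0, 4.0]
w_scaled = [10.0 * x for x in w]  # same direction, ten times the magnitude

score = normalized_score(embedding, w)
score_scaled = normalized_score(embedding, w_scaled)
print(score, score_scaled)  # the two values agree up to rounding
```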

Depending on the embodiment, the computer device 100 may add, to the norm-applied result of the inner product, a bias term to which a hyperbolic tangent (tanh) has been applied. This bias term may correspond to the bias of the classifier.
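A short sketch of why the hyperbolic tangent is useful here: however large the learned bias term becomes, its contribution to the logit stays within the interval from -1 to 1. The function name and sample values are hypothetical.

```python
import math


def bounded_bias(raw_bias):
    """Squash a raw classifier bias term into the range [-1, 1] with tanh."""
    return math.tanh(raw_bias)


# Hypothetical raw bias values, including extreme ones.
raw_values = [-50.0, -1.0, 0.0, 1.0, 50.0]
bounded = [bounded_bias(r) for r in raw_values]

for raw, value in zip(raw_values, bounded):
    print(f"raw bias {raw:7.1f} -> bounded bias {value:+.4f}")
```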

The mathematical meaning of the data classification method according to embodiments of the present invention is described below.

In general, the standard logit representation of a classifier may be expressed as in Equation (1) below.

$$ l_i = w_i^{\top} f + b_i \qquad (1) $$

Here, $l_i$ denotes the logit representation of the i-th category of the classifier, and $w_i$ and $f$ denote the C-dimensional parameter vector and embedding vector, respectively. In other words, the standard logit representation is computed from the inner product of the classifier's parameter vector and the embedding vector generated from the input data. Bias caused by differences in the amount of training data per category can become embedded in the classifier's parameter vector and is then reflected in the standard logit representation computed from it. This can in turn bias the classification performance of the classifier. $b_i$ denotes the bias term of the classifier.
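To make the magnitude effect concrete, the sketch below evaluates the standard logit representation on hypothetical values. The numbers are chosen so that one category has a parameter vector with a large norm but poor alignment with the embedding, and the other has a small norm but perfect alignment; the "frequent" and "rare" labels are only illustrative.

```python
def standard_logits(embedding, weights, biases):
    """Standard logit per category: inner product plus bias term."""
    return [
        sum(w_k * f_k for w_k, f_k in zip(w, embedding)) + b
        for w, b in zip(weights, biases)
    ]


# Hypothetical values for illustration only.
embedding = [0.0, 1.0]
weights = [
    [4.0, 3.0],  # "frequent" category: norm 5.0, only partly aligned
    [0.0, 1.0],  # "rare" category: norm 1.0, perfectly aligned
]
biases = [0.0, 0.0]

logits = standard_logits(embedding, weights, biases)
predicted = max(range(len(logits)), key=lambda i: logits[i])
print(logits, predicted)  # [3.0, 1.0] 0
```

The larger-norm vector wins even though the embedding points exactly along the other category's parameter vector.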

As described above, embodiments of the present invention can resolve this bias by using the norm of the classifier's parameter vector, for example as in Equation (2) below.

$$ l_i = \frac{w_i^{\top} f}{\lVert w_i \rVert} + \tanh(b_i) \qquad (2) $$

Here, $\lVert w_i \rVert$ denotes the norm of the classifier's parameter vector $w_i$. In other words, dividing the inner product of the classifier's parameter vector and the embedding vector of the input data by the norm of the classifier's parameter vector, which expresses the magnitude of that vector, removes the magnitude-related bias from the result of the inner product. In addition, applying a hyperbolic tangent (tanh) to the classifier's bias term limits the applied bias value to a real number between -1 and 1, which limits the influence of the bias term on the classifier. In this specification, the logit representation of Equation (2) is called the norm-invariant logit representation.
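The norm-invariant logit representation can be sketched in a few lines. This is an illustrative reading of the description, not the patentee's implementation: it assumes the Euclidean norm, and all names and values are hypothetical. With these values the standard representation would give logits of about 3.1 and 0.9 and select category 0.

```python
import math


def norm_invariant_logits(embedding, weights, biases):
    """Per category: (w . f) / ||w|| + tanh(b), assuming the L2 norm."""
    logits = []
    for w, b in zip(weights, biases):
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("parameter vector must be non-zero")
        dot = sum(w_k * f_k for w_k, f_k in zip(w, embedding))
        logits.append(dot / norm + math.tanh(b))
    return logits


# Hypothetical values for illustration only.
embedding = [0.0, 1.0]
weights = [
    [4.0, 3.0],  # large norm (5.0), only partly aligned with the embedding
    [0.0, 1.0],  # small norm (1.0), perfectly aligned with the embedding
]
biases = [0.1, -0.1]

logits = norm_invariant_logits(embedding, weights, biases)
predicted = max(range(len(logits)), key=lambda i: logits[i])
print(logits, predicted)  # category 1 is selected
```

Once the magnitude is divided out, the category whose parameter vector is best aligned with the embedding is selected, and the bounded bias terms shift each logit by less than 1.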

FIGS. 3 and 4 are charts comparing performance when a standard logit representation is used and when a norm-invariant logit representation according to an embodiment of the present invention is used.

FIG. 3 shows, for the instance segmentation task (the task of locating objects in an input image and classifying their categories), the performance obtained with the standard logit representation and with the norm-invariant logit representation according to an embodiment of the present invention. In the instance segmentation task, location may be expressed as a box (bbox) and a pixel-level mask. The performance test was carried out on the LVIS v0.5 benchmark dataset. In the table of FIG. 3, AP denotes average precision, a metric widely used to measure detection accuracy. AP_S, AP_M, and AP_L are the APs by object size (Small, Medium, and Large), and AP_r, AP_c, and AP_f are the APs by sample frequency in the training dataset (rare, common, and frequent). The table of FIG. 3 shows that the norm-invariant logit representation greatly improved performance in the low-shot cases (for example, AP_r and AP_c) for both boxes and masks, effectively resolving the imbalance among the categories of the training data and the resulting bias. For AP_f as well, performance with the norm-invariant logit representation was similar to performance with the standard logit representation.

FIG. 4 shows, for the classification task of classifying the category of an input image, the performance obtained with the standard logit representation and with the norm-invariant logit representation according to an embodiment of the present invention. The performance test was carried out on the Long-tailed CIFAR-10 benchmark dataset. In the chart of FIG. 4, a higher imbalance ratio means a more imbalanced training dataset. The chart shows that the norm-invariant logit representation greatly improved performance in the low-shot case (for example, an imbalance ratio of 100), effectively resolving the imbalance among the categories of the training data and the resulting bias. Even at lower imbalance ratios (for example, 50 and 10), performance with the norm-invariant logit representation was similar to performance with the standard logit representation.
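The text does not define the imbalance ratio. A common convention for long-tailed benchmarks, assumed here, is the ratio of the largest per-category sample count to the smallest. The sketch below uses hypothetical per-category counts.

```python
def imbalance_ratio(class_counts):
    """Largest per-category sample count divided by the smallest (assumed definition)."""
    if not class_counts or min(class_counts) <= 0:
        raise ValueError("every category needs at least one sample")
    return max(class_counts) / min(class_counts)


# Hypothetical per-category counts for a 10-category long-tailed training set.
counts = [5000, 2997, 1796, 1077, 645, 387, 232, 139, 83, 50]

ratio = imbalance_ratio(counts)
print(ratio)  # 100.0
```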

Thus, according to embodiments of the present invention, the classifier bias problem can be resolved by removing the bias with respect to the magnitude of the classifier's parameter vector.

The systems or devices described above may be implemented by hardware components or by a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an ALU (arithmetic logic unit), a digital signal processor, a microcomputer, an FPGA (field programmable gate array), a PLU (programmable logic unit), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the OS. The processing device may also access, record, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but those skilled in the art will understand that a processing device may include multiple processing elements and/or multiple types of processing elements. For example, a processing device may include multiple processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, virtual device, computer recording medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over computer systems connected by a network and recorded or executed in a distributed manner. Software and data may be recorded on one or more computer-readable recording media.

The method according to the embodiments may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, and data structures, alone or in combination. The medium may store a computer-executable program permanently or may store it temporarily for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy (registered trademark) disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and other media configured to record program instructions. Other examples include recording or storage media managed by application stores that distribute applications, or by sites and servers that supply or distribute various other software. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that a computer executes using an interpreter or the like.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, those skilled in the art can make various modifications and variations based on the above description. For example, suitable results can be achieved even if the described techniques are performed in a different order from the described method, and/or the described components such as systems, structures, devices, and circuits are coupled or combined in a different form from the described method, or are substituted or replaced by other components or equivalents.

Therefore, other embodiments that are equivalent to the claims also fall within the scope of the appended claims.

100: Computer device
110: Memory
120: Processor
130: Communication interface
140: Input/output interface
150: Input/output device
160: Network

Claims

1. A data classification method for a computer device including at least one processor, the method comprising:
generating, by the at least one processor, an embedding vector of input data;
calculating, by the at least one processor, an inner product of the embedding vector and a parameter vector of a trained classifier; and
calculating, by the at least one processor, a logit by applying a norm of the parameter vector to the result of the inner product, thereby removing bias with respect to the magnitude of the parameter vector and leaving only the directionality of the parameter vector,
wherein calculating the logit comprises:
dividing the result of the inner product by the norm of the parameter vector, and then
adding a bias to which a hyperbolic tangent has been applied so that its value is limited to between -1 and 1.

2. A computer program comprising a plurality of computer-executable instructions which, when executed by a processor of a computer device, cause the computer device to perform the method of claim 1.

3. A computer-readable recording medium having a computer program recorded thereon, wherein the computer program, when executed by a computer device, causes the computer device to perform the method of claim 1.

4. A computer device comprising at least one processor implemented to execute computer-readable instructions,
wherein the at least one processor:
generates an embedding vector of input data;
calculates an inner product of the embedding vector and a parameter vector of a trained classifier; and
calculates a logit by applying a norm of the parameter vector to the result of the inner product, thereby removing bias with respect to the magnitude of the parameter vector and leaving only the direction of the parameter vector,
and wherein, to calculate the logit, the at least one processor:
divides the result of the inner product by the norm of the parameter vector, and then
adds a bias to which a hyperbolic tangent has been applied so that its value is limited to between -1 and 1.