JP6831307B2

JP6831307B2 - Solution calculation device, solution calculation method and solution calculation program

Info

Publication number: JP6831307B2
Application number: JP2017150069A
Authority: JP
Inventors: 直貴丸茂; 具治岩田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-08-02
Filing date: 2017-08-02
Publication date: 2021-02-17
Anticipated expiration: 2037-08-02
Also published as: JP2019028883A

Description

The present invention relates to a solution calculation device, a solution calculation method, and a solution calculation program.

Machine learning problems are generally formulated as loss function minimization problems. For example, a regression problem is formulated as the minimization of a squared loss function, a Huber loss function, or the like. Similarly, a classification problem is formulated as the minimization of a logistic loss function, a hinge loss function, or the like.

Meanwhile, submodular structured sparsity constraints are used to improve the generalization performance of machine learning and the interpretability of its results. Using a submodular structured sparsity constraint makes it possible, for example, to learn while taking into account a hierarchical structure or a group structure of the parameters.

Combining the two, that is, posing a minimization problem under a submodular structured sparsity constraint, makes it possible to formulate a wide variety of machine learning problems. Various methods have been proposed for solving such problems (Non-Patent Documents 1 to 3).

Non-Patent Document 1: F. Bach. Structured sparsity-inducing norms through submodular functions. In Advances in Neural Information Processing Systems, pages 118-126, 2010.
Non-Patent Document 2: M. El Halabi, L. Baldassarre, and V. Cevher. To convexify or not? Regression with clustering penalties on graphs. In Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2013 IEEE 5th International Workshop on, pages 21-24. IEEE, 2013.
Non-Patent Document 3: R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde. Model-based compressive sensing. IEEE Transactions on Information Theory, 56(4):1982-2001, 2010.

However, the above methods have several problems. For example, the methods disclosed in Non-Patent Documents 1 and 2 yield solutions of low accuracy. The method disclosed in Non-Patent Document 3 can handle only the squared loss function as the function to be minimized, and it also requires a long time to obtain a solution.

The present invention has been made in view of the above problems, and its object is to obtain, at high speed and with high accuracy, a solution of a minimization problem under a submodular structured sparsity constraint using a general loss function.

To solve the above problem, a solution calculation device that calculates a solution of a minimization problem under a submodular structured sparsity constraint using a loss function includes: an input unit that inputs the loss function, a first submodular function, and a constraint constant; an initialization unit that initializes an iteration count t and the solution of the minimization problem; a gradient descent calculation unit that calculates a post-descent vector obtained by applying gradient descent to the solution at the t-th iteration; a submodular function minimization unit that calculates a set that minimizes a second submodular function determined from the post-descent vector calculated by the gradient descent calculation unit and the first submodular function, the set being such that the value of the first submodular function is less than or equal to the constraint constant; a solution calculation unit that calculates the solution at the (t+1)-th iteration by a truncation calculation or an exact minimization calculation, based on the solution at the t-th iteration and the set calculated by the submodular function minimization unit; and an end condition determination unit that determines whether a predetermined end condition is satisfied. When the end condition determination unit determines that the predetermined end condition is not satisfied, it adds 1 to the iteration count t and causes the gradient descent calculation unit to calculate the post-descent vector.

A solution of a minimization problem under a submodular structured sparsity constraint using a general loss function can be obtained at high speed and with high accuracy.

FIG. 1 is a diagram showing an example of the configuration of the solution calculation device according to the embodiment of the present invention.
FIG. 2 is a diagram showing an example of the hardware configuration of the solution calculation device according to the embodiment of the present invention.
FIG. 3 is a diagram showing an example of the functional configuration of the solution calculation device according to the embodiment of the present invention.
FIG. 4 is a flowchart showing an example of the overall processing executed by the solution calculation device according to the embodiment of the present invention.
FIG. 5 is a diagram showing an example of a ground-truth signal.
FIG. 6 is a diagram (part 1) showing a comparison between the present invention and the prior art.
FIG. 7 is a diagram (part 2) showing a comparison between the present invention and the prior art.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.

<Configuration of solution calculation device 10>
First, the configuration of the solution calculation device 10 according to the embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of the configuration of the solution calculation device 10 according to the embodiment of the present invention.

The solution calculation device 10 shown in FIG. 1 is a computer that calculates a solution of a minimization problem under a submodular structured sparsity constraint using a general loss function. A solution calculation program 100 is installed in the solution calculation device 10 shown in FIG. 1. The solution calculation program 100 may be a program group composed of a plurality of modules.

Using the solution calculation program 100, the solution calculation device 10 according to the embodiment of the present invention takes as input a loss function f: R^d → R, a submodular function g: 2^{{1, ..., d}} → R, and a constraint constant c, and outputs a solution vector w of the minimization problem. The output solution vector w satisfies the submodular structured sparsity constraint g(supp(w)) ≤ c and minimizes the loss function value f(w).

Here, R and R^d are the one-dimensional and d-dimensional real spaces, respectively. 2^{{1, ..., d}} is the power set of the set {1, ..., d}. The solution vector w is a d-dimensional real vector, and supp(w) denotes its support, that is, the set of indices of its nonzero elements. In the following, the solution vector is also simply called the "solution".
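As an illustration of the constraint g(supp(w)) ≤ c, the following minimal sketch checks feasibility of a vector. The group structure `GROUPS`, the function `g_groups` (the number of groups touched by a set, a coverage function and therefore submodular), and the helper names are assumptions made for this example only; the source does not prescribe a particular g.

```python
from typing import Callable, FrozenSet, List, Sequence

def supp(w: Sequence[float]) -> FrozenSet[int]:
    """Support of w: the 1-based indices of its nonzero entries."""
    return frozenset(i + 1 for i, wi in enumerate(w) if wi != 0.0)

# Hypothetical group structure over d = 4 parameters (illustration only).
GROUPS: List[FrozenSet[int]] = [frozenset({1, 2}), frozenset({3, 4})]

def g_groups(A: FrozenSet[int]) -> float:
    """Number of groups that A intersects; a coverage function, hence submodular."""
    return float(sum(1 for grp in GROUPS if grp & A))

def is_feasible(w: Sequence[float],
                g: Callable[[FrozenSet[int]], float],
                c: float) -> bool:
    """True when w satisfies the constraint g(supp(w)) <= c."""
    return g(supp(w)) <= c

w_one_group = [0.5, -1.2, 0.0, 0.0]   # nonzeros confined to the first group
w_two_groups = [0.5, 0.0, 0.3, 0.0]   # nonzeros spread over both groups
```

With c = 1, `w_one_group` is feasible while `w_two_groups` is not, which is how such a g can favor solutions whose nonzero entries concentrate in few groups.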

More specifically, the solution calculation device 10 according to the embodiment of the present invention calculates the solution w^{t+1} by repeatedly executing the following (1) to (3), incrementing the variable t indicating the iteration count by 1 each time, until a predetermined end condition is satisfied. The solution calculation device 10 then outputs the calculated solution w = w^{t+1} by means of the solution calculation program 100.

(1) Calculate a post-descent vector v by moving the current solution w^t along the gradient direction of the loss function f (gradient descent).

(2) Calculate a set S that minimizes a submodular function G determined by the post-descent vector v and the submodular function g.

(3) Calculate the next solution w^{t+1} by a truncation calculation or an exact minimization calculation, based on the current solution w^t and the set S calculated in (2) above.

The configuration of the solution calculation device 10 shown in FIG. 1 is an example, and other configurations are possible. For example, the solution calculation device 10 shown in FIG. 1 may be composed of a plurality of computers.

<Hardware configuration of solution calculation device 10>
Next, the hardware configuration of the solution calculation device 10 according to the embodiment of the present invention will be described with reference to FIG. 2. FIG. 2 is a diagram showing an example of the hardware configuration of the solution calculation device 10 according to the embodiment of the present invention.

The solution calculation device 10 shown in FIG. 2 includes an input device 11, a display device 12, an external I/F 13, a RAM (Random Access Memory) 14, a ROM (Read Only Memory) 15, a CPU (Central Processing Unit) 16, a communication I/F 17, and an auxiliary storage device 18. These hardware components are communicably connected to one another via a bus B.

The input device 11 is, for example, a keyboard, a mouse, or a touch panel, and is used by a user to input various operations. The display device 12 is, for example, a display, and displays the processing results of the solution calculation device 10.

The external I/F 13 is an interface with external devices. The external devices include a recording medium 13a. The solution calculation device 10 can read from and write to the recording medium 13a and the like via the external I/F 13. The solution calculation program 100 and the like may be recorded on the recording medium 13a.

Examples of the recording medium 13a include a flexible disk, a CD (Compact Disc), a DVD (Digital Versatile Disk), an SD memory card (Secure Digital memory card), and a USB (Universal Serial Bus) memory card.

The RAM 14 is a volatile semiconductor memory that temporarily holds programs and data. The ROM 15 is a non-volatile semiconductor memory that can retain programs and data even when the power is turned off. The ROM 15 stores, for example, OS (Operating System) settings and network settings.

The CPU 16 is an arithmetic unit that reads programs and data from the ROM 15, the auxiliary storage device 18, and the like onto the RAM 14 and executes processing.

The communication I/F 17 is an interface for connecting the solution calculation device 10 to a network. The solution calculation program 100 may be acquired (downloaded) from a predetermined server or the like via the communication I/F 17.

The auxiliary storage device 18 is, for example, an HDD (Hard Disk Drive) or an SSD (Solid State Drive), and is a non-volatile storage device that stores programs and data. The programs and data stored in the auxiliary storage device 18 include, for example, an OS, application programs that implement various functions on the OS, and the solution calculation program 100.

By having the hardware configuration shown in FIG. 2, the solution calculation device 10 according to the embodiment of the present invention can carry out the various processes described later.

<Functional configuration of solution calculation device 10>
Next, the functional configuration of the solution calculation device 10 according to the embodiment of the present invention will be described with reference to FIG. 3. FIG. 3 is a diagram showing an example of the functional configuration of the solution calculation device 10 according to the embodiment of the present invention.

The solution calculation device 10 shown in FIG. 3 includes an input unit 101, an initialization unit 102, a gradient descent calculation unit 103, a submodular function minimization unit 104, a solution calculation unit 105, an end condition determination unit 106, and an output unit 107. Each of these units is implemented by processing that the solution calculation program 100 causes the CPU 16 to execute.

The input unit 101 inputs the minimization problem (that is, the loss function f, the submodular function g, and the constraint constant c). This information may be entered by a user, supplied by another program, or read from an electronic file stored in the auxiliary storage device 18 or the like.

The initialization unit 102 initializes the variable t indicating the iteration count to "0" and initializes the current solution w^0 to the zero vector.

The gradient descent calculation unit 103 calculates the post-descent vector v obtained by applying gradient descent to the current solution w^t.

The submodular function minimization unit 104 calculates a set S that minimizes the submodular function G determined from the post-descent vector v calculated by the gradient descent calculation unit 103 and the submodular function g. Here, the set S is a subset of the set {1, ..., d}, whose elements are the natural numbers from 1 to d.

The solution calculation unit 105 calculates the next solution w^{t+1} by a truncation calculation or an exact minimization calculation, based on the current solution w^t and the set S calculated by the submodular function minimization unit 104. The solution calculation unit 105 includes a truncation calculation unit 111, which calculates the next solution w^{t+1} by the truncation calculation, and an exact minimization calculation unit 112, which calculates the next solution w^{t+1} by the exact minimization calculation.

The end condition determination unit 106 determines whether a predetermined end condition is satisfied. When it determines that the predetermined end condition is not satisfied, the end condition determination unit 106 adds "1" to the variable t indicating the iteration count.

When the end condition determination unit 106 determines that the predetermined end condition is satisfied, the output unit 107 outputs the solution w = w^{t+1} calculated by the solution calculation unit 105. The output destination of the output unit 107 may be, for example, the display device 12, another program, or another device connected to the solution calculation device 10 via a network or the like.

<Details of processing>
Next, the processing of the solution calculation device 10 according to the embodiment of the present invention will be described in detail. In the following, the overall processing executed by the solution calculation device 10 is described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the overall processing executed by the solution calculation device 10 according to the embodiment of the present invention.

First, the input unit 101 inputs the loss function f, the submodular function g, and the constraint constant c (step S101).

Next, the initialization unit 102 initializes the variable t indicating the iteration count to "0" and initializes the current solution w^0 to the zero vector (step S102).

Next, the gradient descent calculation unit 103 calculates the post-descent vector v obtained by applying gradient descent to the current solution w^t (step S103). That is, the gradient descent calculation unit 103 calculates the post-descent vector v by Equation 1 below.

v = w^t - η_t ∇f(w^t)   (Equation 1)

Here, η_t is a step size parameter. The step size parameter can be determined by grid search, backtracking, or the like.
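The gradient descent step of step S103 can be sketched as follows, assuming the usual update v = w^t - η_t ∇f(w^t) with a fixed step size. The squared loss f(w) = 0.5 * ||w - y||^2, the target vector `Y`, and the function names are hypothetical choices for illustration, not part of the source.

```python
from typing import Callable, List, Sequence

Vector = List[float]

def gradient_step(w: Sequence[float],
                  grad_f: Callable[[Sequence[float]], Vector],
                  eta: float) -> Vector:
    """Return the post-descent vector v = w - eta * grad_f(w)."""
    grad = grad_f(w)
    return [wi - eta * gi for wi, gi in zip(w, grad)]

# Illustrative loss (assumption): f(w) = 0.5 * ||w - Y||^2, whose gradient is w - Y.
Y = [1.0, -2.0, 0.0]

def grad_squared_loss(w: Sequence[float]) -> Vector:
    return [wi - yi for wi, yi in zip(w, Y)]

# One step from the zero vector (the initial solution w^0) with eta = 0.5.
v = gradient_step([0.0, 0.0, 0.0], grad_squared_loss, eta=0.5)
```

In practice η_t would be tuned by grid search or backtracking as described above; the fixed value here only keeps the sketch short.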

Next, the submodular function minimization unit 104 calculates a set S that minimizes the submodular function G determined from the post-descent vector v calculated by the gradient descent calculation unit 103 and the submodular function g (step S104). That is, the submodular function minimization unit 104 calculates the set S by Equation 2 below.

S ∈ argmin_{A ⊆ {1, ..., d}} G(A)   (Equation 2)

Here, the submodular function G is defined by Equation 3 below, where v_i (i = 1, ..., d) are the elements of the post-descent vector v.

G(A) = g(A) - α_t Σ_{i ∈ A} v_i^2   (Equation 3)

α_t is a regularization parameter, chosen small enough that the resulting set S satisfies g(S) ≤ c. This can be achieved by using parametric submodular minimization, backtracking, or the like.
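The selection of S in step S104 can be sketched as below, using the form G(A) = g(A) - α_t × (sum of v_i^2 over i ∈ A) stated in claim 1. This is only a sketch under assumptions: it enumerates all subsets (feasible for tiny d only, whereas a practical implementation would use a submodular minimization algorithm), and it implements the backtracking on α_t as simple halving from a hypothetical starting value `alpha0`.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterator, Sequence, Tuple

SetFn = Callable[[FrozenSet[int]], float]

def all_subsets(d: int) -> Iterator[FrozenSet[int]]:
    """Yield every subset of {1, ..., d} (exponential; illustration only)."""
    for k in range(d + 1):
        for comb in combinations(range(1, d + 1), k):
            yield frozenset(comb)

def minimize_G(v: Sequence[float], g: SetFn, alpha: float) -> FrozenSet[int]:
    """Brute-force minimizer of G(A) = g(A) - alpha * sum_{i in A} v_i^2."""
    def G(A: FrozenSet[int]) -> float:
        return g(A) - alpha * sum(v[i - 1] ** 2 for i in A)
    return min(all_subsets(len(v)), key=G)

def select_set(v: Sequence[float], g: SetFn, c: float,
               alpha0: float = 1.0, shrink: float = 0.5,
               max_tries: int = 100) -> Tuple[FrozenSet[int], float]:
    """Shrink alpha until the minimizer S of G satisfies g(S) <= c."""
    alpha = alpha0
    for _ in range(max_tries):
        S = minimize_G(v, g, alpha)
        if g(S) <= c:
            return S, alpha
        alpha *= shrink
    raise ValueError("no feasible set found; check that g(empty set) <= c")

def cardinality(A: FrozenSet[int]) -> float:
    """g(A) = |A|, a modular (hence submodular) function used as an example."""
    return float(len(A))

v = [3.0, 0.5, -1.5, 0.1]
S_loose, alpha_loose = select_set(v, cardinality, c=2.0)
S_tight, alpha_tight = select_set(v, cardinality, c=1.0)
```

With g(A) = |A|, an index i is selected exactly when α_t v_i^2 > 1, so tightening c from 2 to 1 forces α_t down until only the largest-magnitude entry remains.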

Next, the solution calculation unit 105 calculates the next solution w^{t+1} by a truncation calculation or an exact minimization calculation, based on the current solution w^t and the set S calculated by the submodular function minimization unit 104 (step S105).

That is, when the next solution w^{t+1} is calculated by the truncation calculation, the solution calculation unit 105 uses the truncation calculation unit 111 to calculate the next solution w^{t+1} by Equation 4 below.

On the other hand, when the next solution w^{t+1} is calculated by the exact minimization calculation, the solution calculation unit 105 uses the exact minimization calculation unit 112 to calculate the next solution w^{t+1} by Equation 5 below.

Whether the truncation calculation or the exact minimization calculation is used to calculate the next solution w^{t+1} may be decided arbitrarily.
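Equations 4 and 5 are not reproduced in this text, so the following is only one plausible reading and should be treated as an assumption: truncation keeps the entries of the post-descent vector v indexed by S and zeroes the rest, while exact minimization minimizes f over vectors whose support lies in S. The exact case is shown only for the separable squared loss f(w) = 0.5 * ||w - y||^2, where the restricted minimizer is available in closed form; a general loss would need a numerical solver.

```python
from typing import FrozenSet, List, Sequence

Vector = List[float]

def truncate(v: Sequence[float], S: FrozenSet[int]) -> Vector:
    """Assumed truncation: keep v_i for i in S (1-based), set the others to zero."""
    return [vi if (i + 1) in S else 0.0 for i, vi in enumerate(v)]

def exact_minimize_squared_loss(y: Sequence[float], S: FrozenSet[int]) -> Vector:
    """Minimizer of 0.5 * ||w - y||^2 over w with supp(w) contained in S.

    The loss is separable, so each free coordinate takes the value y_i and
    each coordinate outside S is fixed at zero.
    """
    return [yi if (i + 1) in S else 0.0 for i, yi in enumerate(y)]

S = frozenset({1, 2})
w_truncated = truncate([0.5, -1.0, 0.2], S)
w_exact = exact_minimize_squared_loss([1.0, -2.0, 0.7], S)
```

Under this reading, truncation is cheap per iteration, while exact minimization spends more work per iteration to reach the best point on the selected support.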

Next, the end condition determination unit 106 determines whether the predetermined end condition is satisfied (step S106). As the predetermined end condition, the end condition determination unit 106 may use, for example, any one of the following (a) to (d).

(a) The decrease in training error, f(w^t) - f(w^{t+1}), is smaller than a predetermined threshold.
(b) The decrease in validation error is smaller than a predetermined threshold.
(c) The minimum validation error computed from the solutions w^0, w^1, ..., w^{t+1} has not been updated for a predetermined number of iterations.
(d) The value of t is larger than a predetermined value.

If it is determined in step S106 that the predetermined end condition is not satisfied, the end condition determination unit 106 adds "1" to the variable t indicating the iteration count (step S107). Processing then returns to step S103. As a result, steps S103 to S107 are repeated until the predetermined end condition is satisfied.
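The end conditions (a) to (d) can be sketched as small predicates over the history of error values. The function names, argument names, and the convention that the error lists hold one value per solution w^0, ..., w^{t+1} are assumptions for illustration.

```python
from typing import Sequence

def small_decrease(errors: Sequence[float], threshold: float) -> bool:
    """Conditions (a) and (b): the latest decrease in error is below the threshold."""
    return len(errors) >= 2 and errors[-2] - errors[-1] < threshold

def no_recent_improvement(val_errors: Sequence[float], patience: int) -> bool:
    """Condition (c): the minimum validation error was not updated in the
    last `patience` entries of the history."""
    if len(val_errors) <= patience:
        return False
    best_index = val_errors.index(min(val_errors))  # first time the minimum was reached
    return best_index < len(val_errors) - patience

def too_many_iterations(t: int, max_t: int) -> bool:
    """Condition (d): the iteration count exceeds a predetermined value."""
    return t > max_t
```

Any one of these predicates can serve as the check in step S106; condition (c) corresponds to what is commonly called early stopping with patience.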

On the other hand, if it is determined in step S106 that the predetermined end condition is satisfied, the output unit 107 outputs the solution w = w^{t+1} calculated by the solution calculation unit 105 (step S108).

As described above, the solution calculation device 10 according to the embodiment of the present invention can calculate the solution w of a minimization problem under a submodular structured sparsity constraint using a general loss function f.
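Steps S101 to S108 can be put together in a minimal end-to-end sketch. Everything beyond the control flow described above is an assumption made to keep the example short and runnable: a fixed step size `eta`, halving of α_t from a hypothetical `alpha0`, brute-force minimization of G over all subsets (tiny d only), truncation of v as the solution update, and end condition (a) with a cap on the iteration count.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterator, List, Sequence

Vector = List[float]

def subsets(d: int) -> Iterator[FrozenSet[int]]:
    """All subsets of {1, ..., d}; exponential, for tiny illustrative d only."""
    for k in range(d + 1):
        for comb in combinations(range(1, d + 1), k):
            yield frozenset(comb)

def solve(f: Callable[[Sequence[float]], float],
          grad_f: Callable[[Sequence[float]], Vector],
          g: Callable[[FrozenSet[int]], float],
          c: float, d: int, eta: float = 0.5, alpha0: float = 1000.0,
          tol: float = 1e-9, max_iter: int = 100) -> Vector:
    w = [0.0] * d                                            # S102: w^0 = 0
    for _t in range(max_iter):
        # S103: gradient descent step
        v = [wi - eta * gi for wi, gi in zip(w, grad_f(w))]
        # S104: shrink alpha until the minimizer S of G satisfies g(S) <= c
        alpha = alpha0
        for _ in range(200):
            S = min(subsets(d),
                    key=lambda A: g(A) - alpha * sum(v[i - 1] ** 2 for i in A))
            if g(S) <= c:
                break
            alpha *= 0.5
        else:
            raise ValueError("no feasible set found; check that g(empty set) <= c")
        # S105: truncation (assumed form): keep v on S, zero elsewhere
        w_next = [vi if (i + 1) in S else 0.0 for i, vi in enumerate(v)]
        # S106: end condition (a), small decrease in training error
        if f(w) - f(w_next) < tol:
            return w_next                                    # S108
        w = w_next                                           # S107
    return w

# Illustrative problem (assumption): squared loss around Y, at most 2 nonzeros.
Y = [3.0, 0.5, -1.5, 0.1]

def squared_loss(w: Sequence[float]) -> float:
    return 0.5 * sum((wi - yi) ** 2 for wi, yi in zip(w, Y))

def grad_squared_loss(w: Sequence[float]) -> Vector:
    return [wi - yi for wi, yi in zip(w, Y)]

def cardinality(A: FrozenSet[int]) -> float:
    return float(len(A))

w_hat = solve(squared_loss, grad_squared_loss, cardinality, c=2.0, d=4)
```

On this toy problem the iterates settle on the two largest-magnitude coordinates of `Y` and leave the other two at exactly zero, which is the behavior expected of a cardinality-type constraint.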

<Comparison example>
Next, a comparison between the present invention and the prior art is described for two experiments using artificial data. The prior-art methods used were FISTA (see Non-Patent Document 1), Majorization-Minimization (hereinafter "MM"; see Non-Patent Document 2), and Model-Based CoSaMP (hereinafter "MCoSaMP"; see Non-Patent Document 3).

In the first experiment, a signal restoration problem using the Haar wavelet basis was solved. The signal shown in FIG. 5 was used as the ground-truth signal. The loss function f was the squared loss function, and the submodular function g was defined so as to induce hierarchical structured sparsity in the wavelet basis. FIG. 6 compares the test errors of the solutions obtained by the present invention and by the prior art in this case. FIG. 6 is a diagram (part 1) showing a comparison between the present invention and the prior art.
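The source does not give the exact g used in this experiment. As a hedged illustration of how a submodular function can encode hierarchical sparsity, the sketch below uses a hypothetical tree `PARENT` over seven coefficients and defines g(A) as the number of nodes in A together with all of their ancestors. This is a coverage function and therefore submodular, and it is small only for sets that sit near the root or share ancestors.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical binary tree over 7 coefficients; node 1 is the root (assumption).
PARENT: Dict[int, Optional[int]] = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}

def ancestor_closure(A: FrozenSet[int]) -> FrozenSet[int]:
    """Nodes of A together with all of their ancestors."""
    closed = set()
    for i in A:
        node: Optional[int] = i
        # Walk up to the root, stopping early once a known path is reached.
        while node is not None and node not in closed:
            closed.add(node)
            node = PARENT[node]
    return frozenset(closed)

def g_tree(A: FrozenSet[int]) -> float:
    """Size of the ancestor closure of A."""
    return float(len(ancestor_closure(A)))
```

For example, the siblings {4, 5} share the path through nodes 2 and 1 and cost 4, whereas {4, 7} lie in different subtrees and cost 5, so the constraint g(supp(w)) ≤ c favors supports that form connected subtrees.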

In the comparison shown in FIG. 6, the horizontal axis is the execution time and the vertical axis is the test error. A small test error indicates that the solution is highly accurate. As FIG. 6 shows, the present invention obtains a solution faster and with higher accuracy than the prior art, regardless of whether the truncation calculation or the exact minimization calculation is used.

In the second experiment, a multiclass classification problem was solved. The true class parameters were generated randomly so as to be the sum of a group-sparse vector and a sparse vector. The loss function f was the logistic loss function, and the submodular function g was defined so as to induce robust group sparsity in the class parameters. FIG. 7 compares the test errors of the solutions obtained by the present invention and by the prior art in this case. FIG. 7 is a diagram (part 2) showing a comparison between the present invention and the prior art.

In the comparison shown in FIG. 7, as in FIG. 6, the horizontal axis is the execution time and the vertical axis is the test error. As FIG. 7 shows, the present invention obtains a solution faster and with higher accuracy than the prior art. MCoSaMP cannot be applied to the logistic loss function and is therefore not included in the comparison shown in FIG. 7.

As described above, the solution calculation device 10 according to the embodiment of the present invention can calculate the solution w of a minimization problem under a submodular structured sparsity constraint using a general loss function f faster and with higher accuracy than the prior art.

The present invention is not limited to the specifically disclosed embodiment described above, and various modifications and changes can be made without departing from the scope of the claims.

10 Solution calculation device
100 Solution calculation program
101 Input unit
102 Initialization unit
103 Gradient descent calculation unit
104 Submodular function minimization unit
105 Solution calculation unit
106 End condition determination unit
107 Output unit
111 Truncation calculation unit
112 Exact minimization calculation unit

Claims

A solution calculation device that calculates a solution of a minimization problem under a submodular structured sparsity constraint using a loss function, the device comprising:
an input unit that inputs the loss function f: R^d → R of the minimization problem, a first submodular function g: 2^{{1, ..., d}} → R representing the submodular structured sparsity constraint, and a constraint constant c;
an initialization unit that initializes an iteration count t and the solution of the minimization problem;
a gradient descent calculation unit that calculates a post-descent vector v obtained by moving the solution at the t-th iteration down along the gradient direction of the loss function f;
a submodular function minimization unit that calculates a set S ⊆ {1, ..., d} that minimizes a second submodular function G determined from the post-descent vector v calculated by the gradient descent calculation unit and the first submodular function g, the set S being such that the value of the first submodular function g is less than or equal to the constraint constant c;
a solution calculation unit that calculates the solution at the (t+1)-th iteration by a truncation calculation or an exact minimization calculation, based on the solution at the t-th iteration and the set S calculated by the submodular function minimization unit; and
an end condition determination unit that determines whether a predetermined end condition is satisfied,
wherein the second submodular function G is given, for any set A ⊆ {1, ..., d}, by G(A) = g(A) - α_t × (sum of v_i^2 over i ∈ A), where v_i (i = 1, ..., d) are the elements of the post-descent vector v and α_t is a regularization parameter, and
wherein the end condition determination unit, when it determines that the predetermined end condition is not satisfied, adds 1 to the iteration count t and causes the gradient descent calculation unit to calculate the post-descent vector v.

The solution calculation device according to claim 1, further comprising an output unit that outputs the solution at the (t+1)-th iteration calculated by the solution calculation unit when the end condition determination unit determines that the predetermined end condition is satisfied.

The solution calculation device according to claim 1 or 2, wherein the end condition is any one of the following: the amount of reduction in training error is smaller than a predetermined threshold value; the amount of reduction in validation error is smaller than a predetermined threshold value; the minimum value of validation error is not updated during a predetermined number of iterations; or the iteration count t is larger than a predetermined value.
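As a non-limiting illustration, the four end conditions listed above could be checked as in the following Python sketch. The function name, the parameter names (`tol`, `patience`, `max_iters`), their default values, and the choice to combine all four conditions with a logical OR are assumptions for illustration; the claim only requires that the end condition be one of the listed conditions.

```python
def end_condition_satisfied(train_errors, val_errors, t,
                            tol=1e-6, patience=5, max_iters=1000):
    """Return True if any of the four listed end conditions holds.

    train_errors, val_errors: error histories, one entry per completed
    iteration (most recent last). t: current iteration count.
    """
    # (1) Reduction in training error is smaller than the threshold.
    if len(train_errors) >= 2 and train_errors[-2] - train_errors[-1] < tol:
        return True
    # (2) Reduction in validation error is smaller than the threshold.
    if len(val_errors) >= 2 and val_errors[-2] - val_errors[-1] < tol:
        return True
    # (3) Minimum validation error was not updated in the last `patience`
    #     iterations (the recent minimum did not beat the earlier minimum).
    if (len(val_errors) > patience
            and min(val_errors[-patience:]) >= min(val_errors[:-patience])):
        return True
    # (4) Iteration count t is larger than the predetermined value.
    return t > max_iters
```

A caller would append the latest training and validation errors to the two histories after each iteration and stop iterating once the function returns True.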

A solution calculation method in which a computer that calculates the solution of a minimization problem under a submodular structure sparse constraint using a loss function executes:
an input procedure of receiving the loss function $f: \mathbb{R}^d \to \mathbb{R}$ of the minimization problem, a first submodular function $g: 2^{\{1, \dots, d\}} \to \mathbb{R}$ representing the submodular structure sparse constraint, and a constraint constant c;
an initialization procedure of initializing the iteration count t and the solution of the minimization problem;
a gradient descent calculation procedure of calculating a post-descent vector v obtained by moving the t-th solution in the descent direction of the gradient of the loss function f;
a submodular function minimization procedure of calculating a set $S \subseteq \{1, \dots, d\}$ that minimizes a second submodular function G, which is determined from the post-descent vector v calculated by the gradient descent calculation procedure and the first submodular function g, the set S being such that the value of the first submodular function g is equal to or less than the constraint constant c;
a solution calculation procedure of calculating the (t+1)-th solution by a truncation calculation or an exact minimization calculation, based on the t-th solution and the set S calculated by the submodular function minimization procedure; and
an end condition determination procedure of determining whether or not a predetermined end condition is satisfied,
wherein the second submodular function G is defined, for any set $A \subseteq \{1, \dots, d\}$, as $G(A) = g(A) - \alpha_t \sum_{i \in A} v_i^2$, where $v_i$ ($i = 1, \dots, d$) are the elements of the post-descent vector v and $\alpha_t$ is a regularization parameter, and
wherein, when the end condition determination procedure determines that the predetermined end condition is not satisfied, 1 is added to the iteration count t and the post-descent vector v is calculated by the gradient descent calculation procedure.
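As a non-limiting illustration of how the recited procedures could fit together, the following Python sketch runs the loop for the special case g(A) = |A|. In that case G(A) is the sum over i ∈ A of (1 - α_t v_i²), so a minimizer subject to |A| ≤ c keeps at most c indices with the largest v_i² among those satisfying α_t v_i² > 1. The helper names, the fixed step size `eta`, the constant `alpha_t`, the reading of "truncation calculation" as zeroing the entries of v outside S, and the use of an iteration cap as the end condition are all assumptions for illustration and are not taken from the claims.

```python
def gradient_step(x, grad_f, eta):
    """Post-descent vector v = x - eta * grad f(x)."""
    return [xi - eta * gi for xi, gi in zip(x, grad_f(x))]


def select_set_cardinality(v, alpha_t, c):
    """Minimizer of G(A) = |A| - alpha_t * sum_{i in A} v_i^2 with |A| <= c.

    Only indices with alpha_t * v_i^2 > 1 lower G; keep the c largest.
    """
    candidates = [i for i in range(len(v)) if alpha_t * v[i] ** 2 > 1.0]
    candidates.sort(key=lambda i: v[i] ** 2, reverse=True)
    return set(candidates[:int(c)])


def truncate(v, S):
    """Assumed truncation calculation: keep v on S, set other entries to 0."""
    return [vi if i in S else 0.0 for i, vi in enumerate(v)]


def solve(grad_f, d, c, alpha_t=100.0, eta=0.5, max_iters=50):
    t = 0
    x = [0.0] * d                                   # initialization
    while True:
        v = gradient_step(x, grad_f, eta)           # gradient descent calculation
        S = select_set_cardinality(v, alpha_t, c)   # submodular minimization
        x = truncate(v, S)                          # (t+1)-th solution
        if t >= max_iters:                          # end condition (iteration cap)
            return x, S
        t += 1                                      # not satisfied: add 1 to t


# Hypothetical loss f(x) = 0.5 * ||x - y||^2, whose gradient is x - y.
y = [0.1, -2.0, 0.5, 3.0]
grad_f = lambda x: [xi - yi for xi, yi in zip(x, y)]
x_hat, S = solve(grad_f, d=4, c=2)
```

With this loss and c = 2, the iterates keep indices 1 and 3 and approach the corresponding entries of y, while the remaining entries stay at zero.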

A solution calculation program that causes a computer that calculates the solution of a minimization problem under a submodular structure sparse constraint using a loss function to execute:
an input procedure of receiving the loss function $f: \mathbb{R}^d \to \mathbb{R}$ of the minimization problem, a first submodular function $g: 2^{\{1, \dots, d\}} \to \mathbb{R}$ representing the submodular structure sparse constraint, and a constraint constant c;
an initialization procedure of initializing the iteration count t and the solution of the minimization problem;
a gradient descent calculation procedure of calculating a post-descent vector v obtained by moving the t-th solution in the descent direction of the gradient of the loss function f;
a submodular function minimization procedure of calculating a set $S \subseteq \{1, \dots, d\}$ that minimizes a second submodular function G, which is determined from the post-descent vector v calculated by the gradient descent calculation procedure and the first submodular function g, the set S being such that the value of the first submodular function g is equal to or less than the constraint constant c;
a solution calculation procedure of calculating the (t+1)-th solution by a truncation calculation or an exact minimization calculation, based on the t-th solution and the set S calculated by the submodular function minimization procedure; and
an end condition determination procedure of determining whether or not a predetermined end condition is satisfied,
wherein the second submodular function G is defined, for any set $A \subseteq \{1, \dots, d\}$, as $G(A) = g(A) - \alpha_t \sum_{i \in A} v_i^2$, where $v_i$ ($i = 1, \dots, d$) are the elements of the post-descent vector v and $\alpha_t$ is a regularization parameter, and
wherein, when the end condition determination procedure determines that the predetermined end condition is not satisfied, 1 is added to the iteration count t and the post-descent vector v is calculated by the gradient descent calculation procedure.