JP7370924B2 - Optimization device and optimization method

Info

Publication number: JP7370924B2
Application number: JP2020075063A
Authority: JP
Inventors: 大佑萩原; 宣隆木村; 泰樹矢野; 宏視荒
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2020-04-21
Filing date: 2020-04-21
Publication date: 2023-10-30
Anticipated expiration: 2040-04-21
Also published as: JP2021174078A

Description

The present invention relates to a technique for optimizing parameters.

In recent years, as computer performance has improved, the range of problems to which computer-based parameter optimization is applied has kept expanding. However, when the evaluation index (hereinafter, the evaluation function) depends on multiple parameters in a complicated way, the enormous time still required for parameter optimization is often a problem. In particular, when the specific functional form of the evaluation function is unknown, the value of the evaluation function for given parameter values (hereinafter, the evaluation value) can be computed but its derivative cannot. In that situation, evaluation values must be computed for multiple parameter values and compared directly, and if the time required for each evaluation is not negligible, the total computation time becomes significant.

For this reason, optimization methods are widely used that search for optimal parameters efficiently by generating the next search point in light of the evaluation values of previously computed search points, rather than generating the parameter values to be evaluated (hereinafter, search points) at random each time. For example, a group of optimization methods known as distribution prediction algorithms improves search efficiency by generating search points from a probability distribution that is explicitly constructed from evaluation value information (hereinafter, the search point generation distribution).

However, one problem with such optimization methods is that search performance deteriorates when the evaluation function has a flat region in which it hardly changes as a certain parameter varies.

Non-Patent Document 1 concerns a distribution prediction algorithm called CMA-ES (Covariance Matrix Adaptation Evolution Strategy), which searches for optimal parameters by repeatedly generating search points from a multivariate normal distribution and updating the center position and covariance matrix of the distribution based on their evaluation values. To mitigate the deterioration of search performance on ill-conditioned evaluation functions, it proposes restricting the parameters searched in each iteration to a small number of randomly selected parameters. Here, an ill-conditioned function is a multivariable function whose curvature differs greatly depending on direction; functions with flat regions are included among ill-conditioned functions. Patent Document 1 proposes introducing a learning-based determiner, built from past search points and their evaluation values, that decides for each search point whether to compute its evaluation value, thereby omitting wasteful evaluations and reducing computation time.

Patent Document 1: Japanese Patent Application Publication No. 2019-192160

Non-Patent Document 1: Koki Shimizu, Junpei Komiyama, Masashi Toyota, "Stochastic dimension selection CMA-ES for high-dimensional ill-conditioned optimization problems", DEIM Forum 2019 A4-3

In the method proposed in Non-Patent Document 1, the parameters to be searched are selected at random, so the search parameters cannot be chosen according to the characteristics of the particular ill-conditioning. In particular, Non-Patent Document 1 itself points out that, for evaluation functions that are not ill-conditioned, search performance is worse than with conventional CMA-ES.

Patent Document 1 assumes that the determiner is built by learning. In principle, therefore, it can handle various causes of degraded search performance, flatness included. However, the amount of data that must be retained for learning and the cost of learning are both large.

The present invention has been made in view of the above problems. Its object is to provide a method that achieves efficient search by avoiding the deterioration of search performance caused by flatness in particular, among the forms of ill-conditioning.

One preferred aspect of the present invention is an optimization device that optimizes parameter values. The device includes an input unit that receives the parameters to be searched and an evaluation function serving as their evaluation index, an optimization calculation unit that obtains optimal values of the parameters based on the evaluation function, and an output unit that outputs the optimal values. The optimization calculation unit has a search point generation unit that generates search points, which are parameter values to be evaluated, from a search point generation distribution; an evaluation value calculation unit that computes the evaluation values of the search points based on the evaluation function; a distribution shape update unit that updates the search point generation distribution based on the evaluation values; a distribution shape holding unit that holds quantities characterizing the search point generation distribution as time-series information; a search parameter selection unit that selects the parameters to be searched based on the time-series information; and a termination determination unit that decides whether to terminate based on a predetermined termination condition.

Another preferred aspect of the present invention is an optimization method that is executed by an information processing device including an input device, an output device, a processor, and a storage device, and that optimizes parameter values. The method executes a first step of receiving the parameters to be searched and an evaluation function serving as their evaluation index, a second step of generating search points, which are values of the parameters to be searched, from a search point generation distribution, a third step of computing the evaluation values of the search points based on the evaluation function, a fourth step of updating the search point generation distribution based on the evaluation values, a fifth step of holding quantities characterizing the search point generation distribution as time-series information, and a sixth step of selecting the parameters to be searched based on the time-series information.

According to the present invention, efficient search is possible even when the evaluation function contains a flat region of the kind that degrades search performance.
Problems, configurations, and effects other than those described above will become clear from the following description of the embodiments.

Block diagram showing an example of the functional configuration of the optimization device of the embodiment.
Block diagram showing an example of the hardware configuration of the optimization device of the embodiment.
Flowchart showing an example of the processing of the optimization device of the embodiment.
Explanatory diagram showing an example of the processing of a distribution prediction algorithm.
Explanatory diagram showing an example of redundant calculation caused by flatness.
Explanatory diagram showing an example of the search parameter selection processing of the optimization device of the embodiment.
Flowchart showing an example of the detailed processing of the search parameter selection unit.
Explanatory diagram showing an example of processing when parameters are correlated.
Diagram showing an example in which the flatness of the evaluation function is local.
Explanatory diagram showing an example of a characteristic response of the optimization device of the embodiment.
Perspective view showing an example of an output device that displays the degree of search progress in Embodiment 2.

Embodiments of the present invention are described below with reference to the drawings. Note that the embodiments are merely examples for implementing the present invention and do not limit its technical scope.

In the configurations of the invention described below, the same reference numerals are used across different drawings for identical parts or parts having similar functions, and duplicate descriptions may be omitted.

When there are multiple elements having the same or similar functions, they may be described with the same reference numeral and different subscripts. When there is no need to distinguish among them, the subscripts may be omitted.

Expressions such as "first", "second", and "third" in this specification are used to identify constituent elements and do not necessarily limit their number, order, or content. Numbers identifying constituent elements are used per context, and a number used in one context does not necessarily denote the same configuration in another context. Nor is a constituent element identified by one number precluded from also serving the function of a constituent element identified by another number.

The position, size, shape, range, and so on of each configuration shown in the drawings may not represent the actual position, size, shape, or range, in order to make the invention easier to understand. The present invention is therefore not necessarily limited to the positions, sizes, shapes, ranges, and so on disclosed in the drawings.

Publications, patents, and patent applications cited in this specification form part of the description of this specification as they are.

Constituent elements expressed in the singular in this specification include the plural unless the context clearly indicates otherwise.

One representative configuration of the embodiments described in detail below is as follows. It is an optimization device that takes as input the parameters to be searched, their evaluation function, and the initial values of the parameters and the hyperparameter values, and outputs the parameter values that maximize the evaluation value. The device includes a search point generation unit that generates search points from a search point generation distribution, an evaluation value calculation unit that computes the evaluation values of the search points, a distribution shape update unit that updates the search point generation distribution based on the evaluation values, a distribution shape holding unit that holds quantities characterizing the distribution (for example, variance values), a search parameter selection unit that decides for each parameter, based on the distribution shape information, whether to search it, and a termination determination unit that decides whether the search should end. In addition, when the selection changes the number of search parameters, the search parameter selection unit also updates those hyperparameters of the optimization method that depend on the number of search parameters to appropriate values.

With this configuration, the shape of the evaluation function can be inferred by referring to the time-series information on the shape of the search point generation distribution accumulated in the distribution shape holding unit, and the search parameter selection unit can select search parameters according to the result of that inference.

This embodiment can be used, for example, to optimize thresholds for feature point extraction in object recognition. For a picking robot used at a logistics site to recognize the positions and orientations of a wide variety of items, for instance, it is effective to optimize parameters such as thresholds. Adopting this embodiment can be expected to improve recognition accuracy and speed while making the parameter optimization itself more efficient.

FIG. 1 is a block diagram showing an example of the functional configuration of this embodiment. As shown in FIG. 1, the optimization device 1 of this embodiment consists of an input unit 11 that registers the evaluation function serving as the evaluation index of the optimization target, the parameters on which the evaluation function depends, and the initial values of those parameters and the hyperparameter values; an optimization calculation unit 12 that searches, by the loop processing described later, for parameter values that make the value of the evaluation function as large as possible; and an output unit 13 that outputs the resulting parameter values.

The optimization calculation unit 12 in turn consists of: a distribution shape holding unit 121 that holds quantities characterizing the shape of the search point generation distribution; a search parameter selection unit 122 that, based on that information, selects search parameters and updates the hyperparameters that depend on the number of search parameters; a search point generation unit 123 that generates search points from the search point generation distribution for the selected parameters; an evaluation value calculation unit 124 that computes the evaluation value at each search point; a termination determination unit 125 that decides whether to end the search based on, for example, a preset target evaluation value or a maximum number of evaluation value calculations; and a distribution shape update unit 126 that, when the search does not end, updates the search point generation distribution based on the search points and their evaluation values and stores the quantities characterizing its shape in the distribution shape holding unit 121.

Here and below, the description uses as an example an optimization method that generates search points from an explicit probability distribution, but any method whose search point generation can be reduced to sampling from a probability distribution falls within this embodiment. In the configuration of the optimization calculation unit 12 above, the loop processing continues until the termination condition of the termination determination unit 125 is satisfied, and with each loop the quantities characterizing the search point generation distribution accumulate in the distribution shape holding unit 121 as a time series. The search parameter selection unit 122 can therefore use the time-series information on the shape of the search point generation distribution as a criterion when selecting search parameters. As described later, however, data not used by the search parameter selection unit 122 may be discarded. One feature of this embodiment is its comparatively good memory efficiency: rather than holding every search point and evaluation value from every loop, it only needs to hold a comparatively small number of quantities, extracted from them, that characterize the distribution shape.
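The loop structure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the distribution is an independent Gaussian per parameter rather than the CMA-ES distribution, the selection of search parameters by unit 122 is omitted, and the function name `optimize` and all of its arguments are hypothetical.

```python
import random

def optimize(F, mean, sigma, n_points=20, n_best=5, max_loops=60,
             target=None, seed=0):
    """Maximize F with a simplified distribution-based search loop.

    Assumption: P is an independent Gaussian per parameter (mean, sigma),
    standing in for the CMA-ES distribution of the embodiment.
    """
    rng = random.Random(seed)
    d = len(mean)
    mean, sigma = list(mean), list(sigma)
    shape_history = []  # plays the role of distribution shape holding unit 121
    best_x, best_f = None, float("-inf")
    for _ in range(max_loops):
        # Search point generation (unit 123)
        points = [[rng.gauss(mean[i], sigma[i]) for i in range(d)]
                  for _ in range(n_points)]
        # Evaluation value calculation (unit 124), best evaluation first
        scored = sorted(((F(x), x) for x in points),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_f:
            best_f, best_x = scored[0]
        # Termination determination (unit 125): target value or loop budget
        if target is not None and best_f >= target:
            break
        # Distribution shape update (unit 126) from the best-ranked points;
        # the variance is measured around the old center
        top = [x for _, x in scored[:n_best]]
        new_mean = [sum(x[i] for x in top) / n_best for i in range(d)]
        new_var = [sum((x[i] - mean[i]) ** 2 for x in top) / n_best
                   for i in range(d)]
        mean, sigma = new_mean, [v ** 0.5 for v in new_var]
        # Only d numbers per loop are stored, not every point and value
        shape_history.append(new_var)
    return best_x, best_f, shape_history
```

For an evaluation function such as `lambda x: -(x[0] - 1.0) ** 2`, which is flat in its second argument, `shape_history` is the kind of time series that a selection step could inspect to notice that the second parameter is not being constrained by the evaluation values.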

The only requirement on the method by which the distribution shape update unit 126 updates the search point generation distribution is that it be unbiased. Here, unbiasedness of a distribution update means that, when the search points and their evaluation values are uncorrelated (for example, when the evaluation values are uniform or random), the change in the distribution caused by the update is zero on average. Without unbiasedness, the optimal parameter values obtained are affected by, for example, the initial values chosen for the search parameters, which is generally undesirable, although this depends on the purpose.
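The unbiasedness property can be checked numerically on a toy update rule. The sketch below is an assumption-laden illustration, not the update used by the embodiment: it moves a one-dimensional distribution center to the average of the best-ranked samples, and compares the average movement when evaluation values are pure noise with the movement when they depend on the search point. The names `updated_center`, `average_drift`, `noise`, and `slope` are hypothetical.

```python
import random

def updated_center(center, sigma, evaluate, rng, n_points=6, n_best=3):
    """Move the center to the average of the best-ranked samples."""
    points = [rng.gauss(center, sigma) for _ in range(n_points)]
    ranked = sorted(points, key=lambda x: evaluate(x, rng), reverse=True)
    return sum(ranked[:n_best]) / n_best

def average_drift(evaluate, trials=20000, seed=1):
    """Average change of the center over many independent updates."""
    rng = random.Random(seed)
    drift = 0.0
    for _ in range(trials):
        drift += updated_center(0.0, 1.0, evaluate, rng) - 0.0
    return drift / trials

def noise(x, rng):
    """Evaluation value unrelated to the search point."""
    return rng.random()

def slope(x, rng):
    """Evaluation value that increases with the search point."""
    return x
```

With `noise`, the ranking carries no information about the samples, so `average_drift(noise)` should come out close to zero; with `slope`, the center is pulled toward larger values and `average_drift(slope)` is clearly positive.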

Retaining the information accumulated in the distribution shape holding unit 121 as a result of running this embodiment, even after the optimization calculation has finished, provides the following benefit. When an optimization calculation similar to one already executed is to be run, the information held in the distribution shape holding unit 121 can be used when setting, in the input unit 11, the parameters to be searched, their initial values, and the hyperparameter values. For example, as described later, the existence and extent of a flat region of the evaluation function can be inferred from the information in the distribution shape holding unit 121, so the corresponding parameters, which are of low importance to search, can be removed in advance or their domains restricted.
So far, this embodiment has assumed that the initial values of the parameters and the hyperparameter values registered in the input unit 11 are specified directly from outside, but they may instead be determined by internal processing. For example, when there is no prior knowledge, the initial values of the parameters may be chosen at random from their domains.

FIG. 2 shows an example of a hardware configuration for realizing the functional configuration example of FIG. 1. The optimization device 1 of this embodiment is configured, for example, as a computer that has a processor 101, a memory 102, an auxiliary storage device 103, an input device 104, an output device 105, and a communication IF (Interface) 106, connected to one another by an internal communication line 107 such as a bus.

The processor 101 executes programs stored in the memory 102 to realize the functions of the optimization calculation unit 12 other than the distribution shape holding unit 121. The distribution shape holding unit 121 is realized mainly by the memory 102. When the amount of data to be held is large, or when the obtained data is to be reused in other, similar optimization calculations, the auxiliary storage device 103 also takes on this role. The memory 102 consists, for example, of a non-volatile storage element (for example, ROM (Read Only Memory)) for storing programs that do not need to be changed, and a volatile storage element (for example, RAM (Random Access Memory)) for temporarily storing the programs being executed and the data used during execution. The auxiliary storage device 103 includes a large-capacity non-volatile storage device such as a magnetic storage device (HDD (Hard Disk Drive)) and stores the programs executed by the processor 101 and the data used when the programs are executed. With these elements, the optimization program is, for example, first read from the auxiliary storage device 103, loaded into the memory 102, and then executed by the processor 101.

The input device 104 is a device such as a keyboard or mouse that accepts input from an operator, and it enables, for example, input operations to the input unit 11 of the optimization device. The output device 105 is a device such as a display or printer that outputs program execution results (for example, the output of the output unit 13) in a form the operator can recognize. The communication IF 106 is a network interface device that controls communication between the optimization device and other devices.

As described above, in this embodiment functions such as calculation and control are realized by the processor 101 executing programs stored in the memory 102 or the auxiliary storage device 103, so that the prescribed processing is carried out in cooperation with other hardware. A program executed by a computer or the like, its function, or the means of realizing that function may be called a "function", "means", "unit", or "module". For each of the search parameter selection unit 122, the search point generation unit 123, the evaluation value calculation unit 124, the termination determination unit 125, and the distribution shape update unit 126 shown in FIG. 1, the program for executing its function is assumed to be stored in the memory 102 or the auxiliary storage device 103. Functions equivalent to those implemented as programs can also be realized in hardware such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).

The above configuration may be implemented on a single computer, or any part of the processor 101, memory 102, auxiliary storage device 103, input device 104, output device 105, and communication IF 106 may be implemented on other computers connected via a network.

FIG. 3 is a flowchart showing an example of the processing performed by the optimization calculation unit 12 of this embodiment. The optimization calculation of this embodiment applies part of CMA-ES. CMA-ES is a form of evolutionary computation: a multi-point search method that treats candidate solutions of the target problem as pseudo-biological individuals and searches for a solution using a population of them. As is well known, CMA-ES generates the next-generation population by mutation based on a normal distribution, and the covariance matrix of that normal distribution is updated by a mechanism called covariance matrix adaptation.

Among the processes shown in FIG. 3, conventional CMA-ES techniques may be applied to search point generation (step S30), evaluation value calculation (step S40), termination determination (step S50), and, within step S60, the update of the search point generation distribution. Although this embodiment uses CMA-ES, various distribution prediction algorithms, including concepts such as real-valued genetic algorithms and ES (Evolution Strategy) algorithms, may be used in its place.

The flow of processing is explained following the flowchart of FIG. 3. As noted above, the optimization calculation unit 12 performs loop processing, so the explanation starts from the point at which the $k$-th loop is entered. For the sake of explanation, the parameters to be searched are represented by a $d$-dimensional vector $x = (x_1, \ldots, x_d)$, where $d$ is a natural number giving the number of parameters and each parameter takes continuous or discrete values. The evaluation function is denoted $F(x)$ and the search point generation distribution $P(S^{(k)})$, where $S^{(k)}$ is a set of quantities characterizing the distribution shape; the number of its elements is arbitrary.

First, in step S10, the search parameter selection unit 122 selects the search parameters, using the information accumulated in the distribution shape holding unit 121 as the selection criterion. That information is the history of $S^{(k)}$, including past loops. If, because the loop count $k$ is small, no data or insufficient data has accumulated in the distribution shape holding unit 121, the process moves to step S40. If the distribution shape holding unit 121 holds sufficient data, the parameters to be searched are selected. Let $I^{(k)}$ be the index set of the parameters that are searched and $J^{(k)}$ the index set of the parameters that are not.
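A possible shape for the selection in step S10 is sketched below. The actual selection criterion is not given in this passage, so the rule used here is purely a placeholder assumption: a parameter is treated as not worth searching when its variance at the end of a recent window is at least `growth` times its variance at the start of that window. The function name, the `window` and `growth` arguments, and the layout of `variance_history` (one list of per-parameter variances per loop) are all hypothetical.

```python
def select_search_parameters(variance_history, d, window=5, growth=4.0):
    """Split parameter indices into searched (I) and not searched (J).

    variance_history: per-loop lists of d variances, i.e. a history of S^(k).
    The growth rule below is an assumed placeholder criterion.
    """
    # Not enough accumulated data: keep searching every parameter
    if len(variance_history) < window:
        return list(range(d)), []
    recent = variance_history[-window:]
    searched, skipped = [], []
    for i in range(d):
        # Assumed rule: variance that keeps widening hints at a flat direction
        if recent[-1][i] >= growth * recent[0][i]:
            skipped.append(i)
        else:
            searched.append(i)
    return searched, skipped
```

The two returned lists correspond to the index sets I^(k) and J^(k); whatever criterion is actually used, the point of the sketch is that it reads only the stored distribution shape history and not the individual search points.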

In the next step, S20, the search parameter selection unit 122 first determines whether the number of search parameters has changed since the previous loop (the $(k-1)$-th loop). If $|I^{(k)}| = |I^{(k-1)}|$, the process moves to step S30 without doing anything. If $|I^{(k)}| \neq |I^{(k-1)}|$, the process moves to step S21, where the search parameter selection unit 122 updates the hyperparameters.

In CMA-ES, for example, quantities such as the number of search points generated in step S30 (described later) and the learning rates used when updating the distribution in step S60 have recommended values, depending on the number of search parameters, that make the search efficient. Such hyperparameters are changed to the values corresponding to the current number of search parameters, and the process then moves to step S30.
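Steps S20 and S21 can be sketched as follows for one such hyperparameter. The formulas used are the default population size commonly cited in the CMA-ES literature, lambda = 4 + floor(3 ln d) search points with mu = floor(lambda / 2) of them used for the update; they are brought in here as an assumption and are not stated in this document. The function names and the dictionary keys `n_points` and `n_best` are hypothetical.

```python
import math

def default_population(n_search):
    """Commonly cited CMA-ES defaults (assumed, not from this document)."""
    if n_search < 1:
        raise ValueError("at least one search parameter is required")
    lam = 4 + math.floor(3 * math.log(n_search))  # search points per loop
    mu = lam // 2                                 # points used for the update
    return lam, mu

def update_hyperparameters(current, n_prev, n_now):
    """Steps S20/S21: recompute only when the parameter count changed."""
    if n_now == n_prev:
        return current  # S20: nothing to do, go on to S30
    lam, mu = default_population(n_now)
    return {**current, "n_points": lam, "n_best": mu}
```

Learning rates for the distribution update could be handled the same way, either by formula or by a lookup keyed on the number of search parameters.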

ハイパーパラメータの更新のために、探索パラメータ選択部１２２が参照可能なデータテーブルをメモリ１０２内に準備しておく。データテーブルの内容は、例えば探索パラメータの数や範囲に対応して、探索点数や分布更新時の学習率の推奨値を格納したものでよい。 In order to update the hyperparameters, a data table that can be referenced by the search parameter selection unit 122 is prepared in the memory 102. The contents of the data table may store, for example, recommended values for the number of search points and the learning rate at the time of updating the distribution, corresponding to the number and range of search parameters.
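
The data table described above can be pictured with a small sketch. It assumes the commonly cited CMA-ES default population size, 4 + floor(3 ln n) for n search parameters, and takes half of the population as the number of points used in the update; the function and table names are hypothetical, and learning rates are left out for brevity.

```python
import math


def recommended_hyperparameters(n_search: int) -> dict:
    """Return illustrative CMA-ES-style settings for n_search active parameters."""
    if n_search < 1:
        raise ValueError("at least one search parameter is required")
    # Commonly cited CMA-ES default population size: 4 + floor(3 ln n).
    pop_size = 4 + math.floor(3 * math.log(n_search))
    # Number of top-ranked points used for the update (half the population).
    n_parents = pop_size // 2
    return {"pop_size": pop_size, "n_parents": n_parents}


# Hypothetical table for up to 10 parameters, built once and kept in memory.
HYPERPARAMETER_TABLE = {n: recommended_hyperparameters(n) for n in range(1, 11)}


def update_hyperparameters(n_now: int, n_prev: int, current: dict) -> dict:
    """Steps S20/S21: look up new values only when the parameter count changed."""
    if n_now == n_prev:
        return current
    return HYPERPARAMETER_TABLE[n_now]
```

A real table could also be keyed on parameter ranges, as the text allows; a dictionary keyed on the count alone is the simplest case.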

ハイパーパラメータは更新せずに固定値とすることもでき、その場合はステップＳ２０とステップＳ２１は省略することができる。ただし、ここでのハイパーパラメータ更新処理により、探索性能のさらなる向上が期待できる。 The hyperparameter can also be set to a fixed value without being updated, and in that case, step S20 and step S21 can be omitted. However, the hyperparameter update process here can be expected to further improve search performance.

ステップＳ３０では、探索点生成部１２３は、探索点生成を行う。探索点は下記の式（１）で表すことができる。 In step S30, the search point generation unit 123 generates a search point. The search point can be expressed by the following equation (1).
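
Equation (1) is not reproduced in this text. From the description of its terms, it can be read as the following piecewise definition; the exact notation is an assumption.

```latex
x_{i;n}^{(k)} =
\begin{cases}
\text{a value drawn from } P\!\left(S^{(k)}\right), & i \in I^{(k)},\\[2pt]
\hat{x}_{i;n}^{(k)}, & i \in J^{(k)}.
\end{cases}
```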

ここで、ｘ_ｉ；ｎ ^（ｋ）はｋループ目で生成されるｎ番目の探索点の第ｉ成分を表す。また、ｘ^＾ _ｉ；ｎ ^（ｋ）は探索を行わないパラメータに対する値である。平坦性への対応を考えている限りは、その値は重要ではない。なぜならば、平坦であるということは、その方向についてはパラメータの値を変えても評価関数の値は変わらないということであり、ｘ^＾ _ｉ；ｎ ^（ｋ）は評価関数の値に関係ないので、任意の値にとっても結果が変らないためである。例えばｘ^＾ _ｉ；ｎ ^（ｋ）の値としてＰの中心位置の対応する成分を採用することが考えられる。このように、式（１）によれば、Ｉ^（ｋ）に属する次元については、探索点生成分布Ｐ（Ｓ^（ｋ））から値を変えた探索点を選び、Ｊ^（ｋ）に属する次元については、任意の固定値にする。 Here, x_{i;n}^(k) denotes the i-th component of the n-th search point generated in the k-th loop, and x̂_{i;n}^(k) is the value assigned to a parameter that is not searched. As long as the goal is to handle flatness, this value is not important: being flat means that the evaluation function does not change when the parameter is varied in that direction, so x̂_{i;n}^(k) has no influence on the evaluation value and any choice gives the same result. For example, the corresponding component of the center position of P can be used as x̂_{i;n}^(k). Thus, according to equation (1), the dimensions belonging to I^(k) are sampled from the search point generation distribution P(S^(k)), while the dimensions belonging to J^(k) are set to an arbitrary fixed value.
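
The search point generation of step S30 can be sketched as follows. This is a minimal illustration that assumes an axis-aligned normal distribution (independent standard deviations per dimension) and fixes unsearched dimensions to the distribution center, which the text names as one possible choice. The function and argument names are hypothetical.

```python
import random


def generate_search_points(center, std, active, n_points, rng=None):
    """Sample n_points; dimensions in `active` are drawn, the rest stay at center."""
    rng = rng or random.Random()
    points = []
    for _ in range(n_points):
        point = [
            # Searched dimension (index in I): draw from the normal distribution.
            rng.gauss(center[i], std[i]) if i in active
            # Unsearched dimension (index in J): fixed to the center value.
            else center[i]
            for i in range(len(center))
        ]
        points.append(point)
    return points


# Dimension 0 is searched, dimension 1 is held at its center value 5.0.
for p in generate_search_points([0.0, 5.0], [1.0, 2.0], {0}, 3, random.Random(0)):
    print(p)
```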

続いてステップＳ４０で、評価値計算部１２４は、各探索点での評価値を計算する。つまり、Ｆ（ｘ_；ｎ ^（ｋ））をすべてのｎに関して求める。この評価値計算に時間を要することが、最適パラメータが求まるまでの計算時間が膨大になることの主な原因となる。ここで、ステップＳ１０で、選択されなかったパラメータのｉ成分は定数として扱われることになるので、計算時間が短縮される。 Subsequently, in step S40, the evaluation value calculation unit 124 calculates the evaluation value at each search point; that is, F(x_{;n}^(k)) is obtained for every n. The time required for this evaluation is the main reason why finding the optimal parameters takes an enormous amount of computation. Because the components i of parameters that were not selected in step S10 are treated as constants, the calculation time is shortened.

次のステップＳ５０では、終了判定部１２５は、探索の終了判定を行う。具体的な終了判定は目的に応じて設定することができる。例えば、ループ数ｋがある最大繰り返し数以上になったかどうかや評価値の最大値ｍａｘ（Ｆ（ｘ））がある目標値以上になったかどうか、などがある。終了条件を満たした場合はループ処理を抜けて、求まった最適パラメータ値ａｒｇｍａｘ（Ｆ（ｘ））を、出力部１３に渡す。一方、終了条件を満たさなかった場合は、ステップＳ６０に移る。 In the next step S50, the end determination unit 125 determines the end of the search. The specific end determination can be set depending on the purpose. For example, it may be determined whether the number of loops k exceeds a certain maximum number of repetitions or whether the maximum evaluation value max(F(x)) exceeds a certain target value. If the termination condition is satisfied, the loop process is exited and the determined optimal parameter value argmax(F(x)) is passed to the output unit 13. On the other hand, if the termination condition is not satisfied, the process moves to step S60.

ステップＳ６０では、分布形状更新部１２６は、ｋループ目に生成された探索点とその評価値に基づいて探索点生成分布Ｐの更新を行う。これに伴いＳ^（ｋ）が更新され、更新後の値Ｓ^{（ｋ＋１）}の中で必要な要素を分布形状保持部１２１に保存した上で、ステップＳ１０に戻る。 In step S60, the distribution shape updating unit 126 updates the search point generation distribution P based on the search points generated in the k-th loop and their evaluation values. Accordingly, S ^(k) is updated, and after storing necessary elements in the updated value S ^(k+1) in the distribution shape holding unit 121, the process returns to step S10.
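
The loop formed by steps S10 to S60 can be summarized in a short skeleton. It is only a sketch: the four callables (`select`, `generate`, `evaluate`, `update`) and the `state` object are hypothetical placeholders for the units described above, the hyperparameter update of steps S20/S21 is omitted, and the termination rule uses the two example criteria from the text (maximum loop count and target evaluation value).

```python
def optimize(evaluate, select, generate, update, state, max_loops, target):
    """Run the S10-S60 loop and return the best point and its evaluation value."""
    best_x, best_f = None, float("-inf")
    for _ in range(max_loops):
        active = select(state)                  # S10: choose parameters to search
        points = generate(state, active)        # S30: generate search points
        values = [evaluate(x) for x in points]  # S40: evaluate every point
        for x, f in zip(points, values):
            if f > best_f:
                best_x, best_f = x, f
        if best_f >= target:                    # S50: termination check
            break
        state = update(state, points, values)   # S60: update the distribution
    return best_x, best_f
```

With a toy one-dimensional setup, where the state is just a center value, three points are generated around it, and the center jumps to the best point, the loop climbs toward the maximum of F(x) = -(x - 3)^2 and stops once the target value 0 is reached.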

以上のような本実施形態の処理の中で特筆すべき特徴として、分布形状保持部１２１に保存された分布形状の時系列情報を利用して、探索するパラメータの制限を行うことがあげられる。また、他の特徴として、パラメータの制限に伴いハイパーパラメータの更新を行うことが挙げられる。その振る舞いや効果をより具体的に説明するために、以下では、探索したいパラメータの数を２（つまり、ｄ＝２）、探索点生成分布Ｐを中心位置と共分散行列で特徴づけられる多変量正規分布とする。そして、関係する各量を下記の式（２）で表すことにする。 A notable feature of the processing of this embodiment described above is that the time-series information on the distribution shape stored in the distribution shape holding unit 121 is used to restrict the parameters to be searched. Another feature is that the hyperparameters are updated when the parameters are restricted. To explain the behavior and effects more concretely, in the following the number of parameters to be searched is set to 2 (that is, d = 2), and the search point generation distribution P is assumed to be a multivariate normal distribution characterized by its center position and covariance matrix. The relevant quantities are expressed by the following equation (2).

ここで、μ^（ｋ）はｋループ目のＰの中心位置、Ｃ^（ｋ）はｋループ目のＰの共分散行列である。 Here, μ ^(k) is the center position of P in the k-th loop, and C ^(k) is the covariance matrix of P in the k-th loop.

図４を用いて、まず本実施形態が主な対象としている分布予測型アルゴリズムにおける、探索パラメータ選択処理がない場合の典型的な探索点の生成・評価値計算と探索点生成分布の更新の様子を説明する。 First, using FIG. 4, we describe how search points are generated, evaluation values are calculated, and the search point generation distribution is updated in a typical distribution prediction algorithm, which is the main target of this embodiment, when there is no search parameter selection process.

図４の(a1)では、正規分布である探索点生成分布と、そこから生成された８個の探索点が記載されている。各探索点は、評価値が大きいものを白点、小さいもの黒点としている。その評価値に基づき変形された探索点生成分布が、(a1)の探索点と共に、(b1)に示されている。(b1)から分かるように、評価値が大きい探索点が生成されやすい分布になるように、探索点生成分布が変形されている。(a2)では、(b1)の探索点生成分布とそこから生成された８個の新たな探索点が示されている。先ほどのように、その評価値に基づいた分布の更新を行った結果が(b2)である。この流れから分かるように、できるだけ評価値が大きい探索点を生成するような分布へと徐々に探索点生成分布が変形されていくことが分かる。ここで、図４は簡単なイメージ図であり、実際のアルゴリズムでは、より洗練された更新方法を採用していることには注意すべきである。 In (a1) of FIG. 4, a search point generation distribution which is a normal distribution and eight search points generated therefrom are described. For each search point, those with large evaluation values are marked as white points, and those with small evaluation values are marked as black points. The search point generation distribution modified based on the evaluation value is shown in (b1) together with the search points in (a1). As can be seen from (b1), the search point generation distribution is modified so that search points with large evaluation values are likely to be generated. (a2) shows the search point generation distribution of (b1) and eight new search points generated therefrom. As before, the result of updating the distribution based on the evaluation value is (b2). As can be seen from this flow, it can be seen that the search point generation distribution is gradually transformed into a distribution that generates search points with as large an evaluation value as possible. Here, it should be noted that FIG. 4 is a simple image diagram, and a more sophisticated updating method is adopted in the actual algorithm.
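
The kind of update pictured in FIG. 4 can be imitated with a deliberately simple rule: rank the points by evaluation value, then pull the center and the per-dimension standard deviation toward the statistics of the best ones. This is not the CMA-ES update and not necessarily what the patent uses; the blending factor `rate` and the function name are assumptions made for illustration.

```python
import math


def update_distribution(center, std, points, values, n_elite, rate=0.5):
    """Shift an axis-aligned normal distribution toward the best-rated points."""
    # Rank points from largest to smallest evaluation value and keep the top ones.
    ranked = sorted(zip(values, points), key=lambda vp: vp[0], reverse=True)
    elite = [p for _, p in ranked[:n_elite]]
    new_center, new_std = [], []
    for i in range(len(center)):
        mean_i = sum(p[i] for p in elite) / n_elite
        var_i = sum((p[i] - mean_i) ** 2 for p in elite) / n_elite
        # Blend the old shape with the elite statistics.
        new_center.append((1 - rate) * center[i] + rate * mean_i)
        new_std.append((1 - rate) * std[i] + rate * math.sqrt(var_i))
    return new_center, new_std
```

Unlike the more refined updates the text alludes to, this toy rule is not designed to be unbiased in flat directions, so it should not be combined as-is with a flatness test that relies on unbiasedness.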

図５を用いて、次に、上記のような最適化手法における、評価関数に平坦な領域が存在する場合の影響について説明する。図５では、(a)のようなｘ_１方向は上に凸だがｘ_２方向は平坦になっている評価関数を考えている。(b)のように、探索点生成分布から生成された５個の探索点に関して評価値計算を行う状況を考える。ここで、ｘ_２方向に関しては評価関数の値は変化しないため、評価関数は実質的に１変数関数とみなすことができる。実際(c)のように、全探索点のｘ_２の値を正規分布の中心位置の対応する成分の値μ_２ ^（ｋ）に射影して、評価値計算を行っても結果は変わらない。そして、各ループで生成する探索点数に探索パラメータ数に応じた推奨値があり、仮に探索パラメータ数２個・１個に対して探索点推奨個数がそれぞれ５個・３個とすると、(d)のように探索点を３個に制限することでより効率的な探索が可能となる。ここで、ｘ_２方向の分布の幅は、不偏性により、平均してみれば以降の計算で変化しないことには注意すべきである。 Next, using FIG. 5, we explain the effect of a flat region in the evaluation function on the optimization method described above. FIG. 5 considers an evaluation function that is convex upward in the x_1 direction but flat in the x_2 direction, as shown in (a). Consider a situation in which evaluation values are calculated for five search points generated from the search point generation distribution, as shown in (b). Since the value of the evaluation function does not change in the x_2 direction, the evaluation function can effectively be regarded as a function of one variable. Indeed, as shown in (c), the result does not change even if the x_2 values of all search points are projected onto μ_2^(k), the corresponding component of the center position of the normal distribution, before the evaluation values are calculated. Moreover, the number of search points generated in each loop has a recommended value that depends on the number of search parameters; if, for instance, the recommended numbers of search points for two and one search parameters are five and three, respectively, limiting the search points to three as in (d) enables a more efficient search. Note that, owing to unbiasedness, the width of the distribution in the x_2 direction does not change on average in subsequent calculations.

ここまでで、評価値を変えないパラメータがあって、評価関数に平坦な領域が存在する場合は、そのことを考慮することで効率的な探索ができることが分かるが、そのためには平坦な領域を探索途中に検知する必要がある。ここで、評価関数の平坦性を事前に調査することも考えられるが、後述のように一般的には現実的でない。本実施形態では、探索途中の平坦性の検知を実現することができる。 The discussion so far shows that, when some parameter does not affect the evaluation value and the evaluation function therefore has a flat region, the search can be made more efficient by taking this into account; to do so, however, the flat region must be detected during the search. Investigating the flatness of the evaluation function in advance is conceivable but, as described later, is generally not practical. This embodiment makes it possible to detect flatness during the search.

図６を用いて、探索途中の平坦性の検知を実現する様子を説明する。ｋ－ｒ＋１ループ目で、図６の(a)のような状況だとする。そして、ｒ回のループ処理の後に、(b)のようになったとする。今回の例では、分布形状保持部１２１にｒループ前までの正規分布のｘ_１とｘ_２方向の標準偏差が保存されているとする。ここで、不偏性のために、標準偏差の変動が小さい、もしくはある小さなしきい値以下の場合、評価関数は対応する方向に関して平坦、もしくはランダムであると判断できる。例えば、その判定条件は下記の式（３）で表現できる。 Using FIG. 6, we describe how flatness is detected during the search. Suppose that the situation at the (k−r+1)-th loop is as shown in FIG. 6(a), and that after r loops it has become as shown in (b). In this example, the distribution shape holding unit 121 is assumed to store the standard deviations of the normal distribution in the x_1 and x_2 directions going back r loops. Owing to unbiasedness, if the variation in a standard deviation is small, or no greater than some small threshold, the evaluation function can be judged to be flat or random in the corresponding direction. For example, the determination condition can be expressed by the following equation (3).

ここで、εはしきい値、ΔＳ_ｉ ^{（ｋ；ｒ）}はｉ方向標準偏差のｋ－ｒ＋１ループ目からｋループ目までの変動度合いを表す量である。後者は、例えば下記の式（４）のように定義できる。 Here, ε is a threshold value, and ΔS _i ^(k;r) is a quantity representing the degree of variation of the i-direction standard deviation from the k−r+1th loop to the kth loop. The latter can be defined, for example, as in equation (4) below.

ここで、Ｓ_ｉ ^（ｋ）はｋループ目でのｉ方向標準偏差を表す。上記の判定基準では、あるループで探索対象から外れたパラメータはそれ以降探索が行われないこととなるが、局所解に陥る危険性を考慮して、そのようなパラメータを再び探索するかどうかの判定を追加してもよい。例えば、その判定は下記の式（５）のように定義できる。 Here, S _i ^(k) represents the i-direction standard deviation at the k-th loop. According to the above criteria, parameters that are excluded from the search target in a certain loop will not be searched after that, but considering the risk of falling into a local solution, it is necessary to decide whether to search for such parameters again. Additional judgments may be added. For example, the determination can be defined as in equation (5) below.

ここで、ｃは１より大きい定数である。この式は、探索対象の全パラメータに関して、対応する標準偏差の変動がある程度小さくなった場合に、探索から除外されているパラメータをすべて探索対象へと復帰させる処理を表している。このような判定の追加により、局所解に陥る危険性を軽減できるが、探索に要する時間が増加することには注意すべきである。 Here, c is a constant greater than 1. This formula represents a process in which all parameters excluded from the search are returned to the search target when fluctuations in the corresponding standard deviations of all parameters to be searched become small to a certain extent. Although the addition of such a determination can reduce the risk of falling into a local solution, it should be noted that the time required for the search increases.
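
The determinations attributed to equations (3) to (5) can be sketched as below. Because the equations themselves are not reproduced in this text, two assumptions are made and should be read as such: the variation ΔS_i is taken to be the relative spread (max − min) / mean of the standard deviation over the last r loops, and the re-search test is taken to compare every searched direction against the looser threshold c·ε. The second choice rests on the observation that searched directions, by construction, have a variation of at least ε. All function names are hypothetical.

```python
def std_variation(history_i):
    """Assumed measure: relative spread (max - min) / mean over the window."""
    mean = sum(history_i) / len(history_i)
    return (max(history_i) - min(history_i)) / mean


def select_search_parameters(history, eps):
    """Split parameter indices into searched (I) and excluded (J) sets."""
    searched, excluded = set(), set()
    for i, history_i in enumerate(history):
        # A direction whose standard deviation barely moved is treated as flat.
        (excluded if std_variation(history_i) < eps else searched).add(i)
    return searched, excluded


def should_restore(history, searched, eps, c):
    """Assumed re-search test: all searched directions settled below c * eps."""
    return all(std_variation(history[i]) < c * eps for i in searched)


# Direction 0 is still shrinking; direction 1 has barely moved in r = 4 loops.
history = [[1.0, 0.7, 0.5, 0.3], [2.0, 2.02, 1.98, 2.0]]
searched, excluded = select_search_parameters(history, eps=0.05)
print(searched, excluded)
```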

また、以上の変動量の指標や判定基準は上記に挙げたものに制限されるものではないことにも注意すべきである。加えて、上記のような判定基準に基づく探索パラメータ選択では、平坦でない領域での探索中は、一般的に分布に有意な変動が生じるため、選択処理がない場合の探索性能を保持できるということは重要な特徴として強調しておく。いずれにせよ、それ以上探索しても評価値の安定した向上が、他の方向と比べて期待できないような、パラメータを上記のような方法で判定することができる。そして、本実施形態では、そのような処理を、分布形状保持部１２１と探索パラメータ選択部１２２を用いることで実現できる。 It should also be noted that the above-mentioned indicators and criteria for the amount of variation are not limited to those listed above. In addition, when search parameters are selected based on the criteria described above, significant fluctuations in the distribution generally occur during searches in areas that are not flat, so it is possible to maintain the search performance without selection processing. I would like to emphasize this as an important feature. In any case, the method described above can determine parameters for which a stable improvement in the evaluation value cannot be expected compared to other directions even if further search is performed. In this embodiment, such processing can be realized by using the distribution shape holding section 121 and the search parameter selection section 122.

図６の(c)の右側では、上記処理があってｘ_２方向をｋループ目以降探索しない場合を、左側は上記処理がない場合であり、不要なパラメータの除去により少ない評価点数でも同等の探索が行われている様子を示している。最後に、この例では、分布形状保持部１２１に保持する標準偏差をｒループ分としており、保持すべきデータ量を比較的少なく保つことが可能である。 The right side of FIG. 6(c) shows the case where the above processing is applied and the x_2 direction is not searched from the k-th loop onward, and the left side shows the case without the processing; removing the unnecessary parameter allows an equivalent search to be carried out with fewer evaluation points. Finally, in this example the distribution shape holding unit 121 holds standard deviations for only r loops, so the amount of data to be held can be kept relatively small.
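
Holding only r loops of standard deviations maps naturally onto a fixed-length buffer. The sketch below uses `collections.deque` with `maxlen` so that older entries are discarded automatically; the class and method names are hypothetical and stand in for the distribution shape holding unit 121.

```python
from collections import deque


class DistributionShapeStore:
    """Keep the last r standard deviations for each parameter direction."""

    def __init__(self, n_params, r):
        self._r = r
        # One bounded buffer per direction; appending beyond r drops the oldest.
        self._history = [deque(maxlen=r) for _ in range(n_params)]

    def save(self, std):
        """Record the standard deviations of the current loop."""
        for buffer, value in zip(self._history, std):
            buffer.append(value)

    def is_ready(self):
        """True once every direction has r entries, i.e. enough data to judge."""
        return all(len(buffer) == self._r for buffer in self._history)

    def history(self, i):
        """Return the stored standard deviations of direction i, oldest first."""
        return list(self._history[i])
```

The `is_ready` check corresponds to the case in step S10 where too little data has accumulated to select parameters.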

図７は、図６で示した原理を実現するため、探索パラメータ選択部１２２が実行するステップＳ１０の詳細なフローを示す図である。実線矢印は処理の流れを、点線矢印はデータの流れを示す。ここで、ｉはパラメータのインデックスであり、初期値はｉ＝１である。ステップＳ１０では、分布形状保持部１２１に保存されている探索点生成分布の分布形状の時系列情報、例えば分布の分散値をもとに、分布形状の変化が閾値以上のパラメータの選択を行う。 FIG. 7 is a diagram showing a detailed flow of step S10 executed by the search parameter selection unit 122 in order to realize the principle shown in FIG. 6. Solid line arrows indicate the flow of processing, and dotted line arrows indicate the flow of data. Here, i is the index of the parameter, and the initial value is i=1. In step S10, based on time-series information on the distribution shape of the search point generation distribution stored in the distribution shape holding unit 121, for example, the variance value of the distribution, parameters whose distribution shape changes are greater than or equal to a threshold are selected.

ステップＳ１１で、探索パラメータ選択部１２２は、分布形状保持部１２１から所定のｒループ前までの正規分布の各方向の標準偏差の履歴、すなわち情報Ｓ^（ｋ）の履歴を読み込む。 In step S11, the search parameter selection unit 122 reads the history of the standard deviation in each direction of the normal distribution up to a predetermined r loops before, from the distribution shape holding unit 121, that is, the history of the information S ^(k) .

ステップＳ１２で、探索パラメータ選択部１２２は、例えば式（４）を用いて、ｉ方向の分布の標準偏差の変動を計算することにより、分布形状の変化を測定する。 In step S12, the search parameter selection unit 122 measures the change in the distribution shape by calculating the change in the standard deviation of the distribution in the i direction using, for example, equation (4).

ステップＳ１３で、探索パラメータ選択部１２２は、例えば式（３）を用いて、標準偏差の変動が閾値より小さいパラメータを抽出し、当該パラメータは平坦性を有すると推定して検索対象から除外する。ここで、パラメータを検索対象から除外するとは、定数としてその後の処理を行うということである。この場合、パラメータが完全に平坦であれば、任意の定数を選択可能である。また、所定の範囲のみにおいて平坦性が推定される場合には、平坦性が推定される範囲で定数を選択すればよい。例えば、平坦な範囲は、探索点生成分布の幅程度(分布の中心から標準偏差の大きさの距離程度)と推定されるため、その範囲内の値から選択すればよい。分布の中心は常にその範囲内にあるため、実用的には中心値を選択すればよい。すなわち、探索パラメータ選択部１２２で選択されなかったパラメータは、探索点生成分布の中心値に対応する値に固定する。 In step S13, the search parameter selection unit 122 uses, for example, equation (3) to extract parameters whose variation in standard deviation is smaller than the threshold, estimates that the evaluation function is flat with respect to those parameters, and excludes them from the search target. Excluding a parameter from the search target means treating it as a constant in subsequent processing. If the evaluation function is completely flat in that parameter, any constant may be chosen. If flatness is estimated only within a certain range, the constant should be chosen within that range. For example, the flat range is estimated to be about the width of the search point generation distribution (about one standard deviation from the center of the distribution), so a value within that range may be selected. Since the center of the distribution always lies within this range, in practice the center value can be used. That is, parameters not selected by the search parameter selection unit 122 are fixed to the values corresponding to the center of the search point generation distribution.

ステップＳ１４で、探索パラメータ選択部１２２は、全てのパラメータについて処理を終えたかを判定する。例えば、ｉがパラメータの最大インデックス数に達したかどうかを判定して終了判定を行う。終了していなければ、ステップＳ１５で、次のパラメータの処理を行う。 In step S14, the search parameter selection unit 122 determines whether processing has been completed for all parameters. For example, the end determination is made by determining whether i has reached the maximum index number of the parameter. If not completed, the next parameter is processed in step S15.

全てのパラメータについて処理を終わっている場合には、探索パラメータ選択部１２２は、例えば式（５）を用いて、ステップＳ１６で探索対象の全パラメータあるいは所定割合のパラメータに関して、標準偏差の変動が閾値より小さいかどうかを判定する。判定の結果、標準偏差の変動が閾値より小さい場合、ステップＳ１７で、検索対象から除外した方向を再検索するかどうかの判定を行う。このとき、前述のように、除外したパラメータを全て復帰させてもよい。あるいは、除外したパラメータをオペレータに示し、オペレータが復帰させるパラメータを選択できるようにしてもよい。 When all parameters have been processed, the search parameter selection unit 122 determines in step S16, using, for example, equation (5), whether the variation in standard deviation is smaller than the threshold for all of the parameters being searched, or for a predetermined proportion of them. If it is, step S17 determines whether to search again in the directions that were excluded from the search target. At this point, all excluded parameters may be restored as described above. Alternatively, the excluded parameters may be presented to the operator so that the operator can select which ones to restore.

なお、ステップＳ１６とステップ１７が省略可能であることは、既に述べたとおりである。 Note that, as already stated, steps S16 and S17 can be omitted.

図６の説明では、２つのパラメータが独立な場合を想定して説明をしているが、互いに相関がある場合は、共分散行列の対角化を行いその固有値に注目することで、より適切に評価関数の平坦な領域を検知できる。 The description of FIG. 6 assumes that the two parameters are independent. When they are correlated, flat regions of the evaluation function can be detected more appropriately by diagonalizing the covariance matrix and examining its eigenvalues.

図８を用いて説明する。図８では、図５の(b)のような評価関数を左に４５度傾けたものを考えており、直線ｘ_１＝ｘ_２が平坦な方向になっている。そして、図８(a)の状態から探索が進み、図８(b)のような状態になったとする。前述のようにｘ_１とｘ_２方向の標準偏差（Ｃ^（ｋ）の対角成分の平方根）を考えると、どちらも変化しているため、探索対象から除かれることはない。しかし、４５度回転した座標系（ｙ_１，ｙ_２）でみると、平坦方向に対応するｙ_１方向の標準偏差に変化はない。つまり、相関のあるパラメータに対して平坦な領域をより細かく検知するためには、共分散行列の固有値の変動に着目する必要があることが分かる。 This is explained using FIG. 8. FIG. 8 considers the evaluation function of FIG. 5(b) rotated 45 degrees to the left, so that the line x_1 = x_2 is the flat direction. Suppose that the search progresses from the state in FIG. 8(a) to the state in FIG. 8(b). If, as before, we look at the standard deviations in the x_1 and x_2 directions (the square roots of the diagonal components of C^(k)), both have changed, so neither parameter is excluded from the search target. In the coordinate system (y_1, y_2) rotated by 45 degrees, however, the standard deviation in the y_1 direction, which corresponds to the flat direction, has not changed. In other words, to detect flat regions more precisely for correlated parameters, it is necessary to look at the variation in the eigenvalues of the covariance matrix.
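
The FIG. 8 argument can be checked numerically for the two-dimensional case, where the eigenvalues of a symmetric covariance matrix [[a, b], [b, d]] have the closed form (a + d)/2 ± sqrt(((a − d)/2)^2 + b^2). The sketch below is illustrative only: the function name is hypothetical, and the numbers are chosen so that the standard deviation along the flat 45-degree direction stays at 1.0 while the perpendicular one shrinks from 1.0 to 0.2.

```python
import math


def principal_std(a, b, d):
    """Return (larger, smaller) standard deviation along the principal axes
    of the symmetric 2x2 covariance matrix [[a, b], [b, d]]."""
    mid = (a + d) / 2
    rad = math.hypot((a - d) / 2, b)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(mid + rad), math.sqrt(max(mid - rad, 0.0))


# Before: isotropic distribution. After: shrunk only across the line x_1 = x_2.
before = (1.0, 0.0, 1.0)
after = (0.52, 0.48, 0.52)

# Axis-aligned standard deviations both change (1.0 -> about 0.72) ...
print(math.sqrt(before[0]), math.sqrt(after[0]))
# ... while the principal-axis view shows one direction that did not move.
print(principal_std(*before), principal_std(*after))
```

Sorting eigenvalues by size is enough for this example; when eigenvalues cross between loops, matching them by eigenvector would be needed to follow the same direction over time.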

このような機能を実装するためには、探索点生成分布の形状を特徴づける量の中に２階の行列で表現される量がある場合、探索パラメータ選択部１２２が分布形状保持部１２１に保存する量を抽出するために、行列の対角化処理を行う機能を持てばよい。あるいは、分布形状保持部１２１に探索点生成分布の形状を特徴づける量を格納する際に、同様の処理を行ってもよい。 To implement this, when the quantities characterizing the shape of the search point generation distribution include one represented as a matrix (a second-order quantity), the search parameter selection unit 122 only needs a function that diagonalizes the matrix in order to extract the quantities stored in the distribution shape holding unit 121. Alternatively, the same processing may be performed when the quantities characterizing the shape of the search point generation distribution are stored in the distribution shape holding unit 121.

今の例では２次元の共分散行列を考えているが、より高次元の場合への拡張も可能である。ただし、上記の処理では行列の対角化が毎回のループ処理で必要となるため、対角化に要する時間とそれによる効率化の度合いを比較して、実際に採用するか否かは判断する必要がある。 The present example considers a two-dimensional covariance matrix, but the approach extends to higher dimensions. However, because the above processing requires the matrix to be diagonalized in every loop, whether to adopt it in practice should be decided by weighing the time required for diagonalization against the resulting gain in efficiency.

評価関数の平坦な領域の検知を、本実施形態のように探索途中に行うのではなく、事前に調べておくことも考えられるが、評価関数の平坦性はパラメータの取る値の範囲によっても変わるため、平坦性を適切に把握するためには、各パラメータの全定義域にわたる網羅的な調査が必要となり、相応の計算コストを要する。 It is conceivable to detect a flat region of the evaluation function in advance, instead of detecting it during the search as in this embodiment, but the flatness of the evaluation function also changes depending on the range of values taken by the parameters. Therefore, in order to properly understand the flatness, a comprehensive investigation over the entire definition range of each parameter is required, which requires a corresponding amount of calculation cost.

図９は、ある一つのパラメータに対して評価関数の平坦な領域が局所的に存在する場合の例を示す図である。この図が示すように、この評価関数は領域９１に関しては平坦だが、領域９２に関しては平坦ではないという判断になる。 FIG. 9 is a diagram illustrating an example where a flat region of the evaluation function exists locally for one parameter. As shown in this figure, this evaluation function is determined to be flat for region 91, but not for region 92.

図１０を用いて、最後に、本実施形態における最適化装置の特徴的な応答の様子について述べる。本実施形態では、これ以上探索する重要性の低いパラメータを検知するために、分布形状の時系列変化に着目しているが、その判断は分布更新の不偏性に依拠しているため、そこでは平坦性とランダム性の区別までは行っていない。ただし、ランダム性に関しても、その領域での探索を続けても評価値の有意な差を見いだせないという点では、平坦性と同様に探索を継続する重要度は低い。 Finally, using FIG. 10, we describe the characteristic response of the optimization device of this embodiment. To detect parameters that are of little importance to search further, this embodiment looks at time-series changes in the distribution shape; because that judgment relies on the unbiasedness of the distribution update, it does not distinguish flatness from randomness. In the random case too, however, continuing to search the region reveals no significant difference in evaluation values, so, as with flatness, there is little value in continuing the search.

本実施形態の入力部１１では、探索したいパラメータと共にそれらに依存する評価関数を与える必要がある。その評価関数の設計の際には、しばしば外部からの入力を必要とする。例えば、流体シミュレーションでは流体の粘性などのパラメータの値を与える必要がある。また、画像認識において、ある物体の認識率を最大にするパラメータを求めたい場合、その物体の情報に加えてその物体が映った画像を与える必要がある。そのような外部入力に人為的な変更を加えることを通じて、評価関数に平坦性、もしくはランダム性をもたらすことが可能である場合が多い。 In the input unit 11 of this embodiment, it is necessary to provide the parameters to be searched and an evaluation function depending on them. When designing the evaluation function, external input is often required. For example, in fluid simulation, it is necessary to provide values for parameters such as fluid viscosity. Furthermore, in image recognition, if one wants to find a parameter that maximizes the recognition rate of a certain object, it is necessary to provide an image of the object in addition to information about the object. It is often possible to bring flatness or randomness to the evaluation function through artificial changes to such external inputs.

例えば、上記の画像認識の例では、入力画像に大きなランダムノイズをのせることで、評価関数をランダムに、入力画像の色などの特徴量を一様化することで、評価関数を平坦にできる場合がある。このような処理を加えて設計した評価関数を用いて、本実施形態の最適化装置１に入力すると、どちらの場合も探索の不要性が検知され速やかに探索が終了する。これは、例えば勾配法のような最適化手法では、平坦な場合には同様の振る舞いだが、ランダムな場合には計算が安定せず探索も終わらないという、顕著な違いが生じる。つまり、このような応答特性は、本実施形態の特徴として特筆すべきものであると言える。その他の特徴的な応答として例えば、探索したいパラメータに評価関数とは全く無関係のパラメータを任意の数、追加しても、それらは不要であることが速やかに検知されるため、探索に要する時間がその追加数によらずほとんど一定であることが挙げられる。 For example, in the image recognition example above, adding large random noise to the input image may make the evaluation function random, and making features of the input image such as color uniform may make the evaluation function flat. When an evaluation function designed with such processing is input to the optimization device 1 of this embodiment, in both cases the search is detected to be unnecessary and ends promptly. This differs markedly from an optimization method such as the gradient method, which behaves similarly in the flat case but, in the random case, becomes unstable and never finishes searching. Such response characteristics are therefore a noteworthy feature of this embodiment. As another characteristic response, even if an arbitrary number of parameters entirely unrelated to the evaluation function are added to the parameters to be searched, they are quickly detected as unnecessary, so the time required for the search remains almost constant regardless of how many are added.

本実施形態は、実施例の最適化装置を、ピッキングロボットの物体認識パラメータの最適化に適用するものである。ただし、本実施形態は、実施例の最適化装置の具体的なハードウェア上での効果の例を説明するものであって、その適用先を限定するものではないことには注意すべきである。 This embodiment applies the optimization device of the example to the optimization of object recognition parameters of a picking robot. Note, however, that this embodiment describes an example of the effect of the optimization device on specific hardware and does not limit its field of application.

近年、人手不足に伴い、物流・製造現場での人手作業を代替する自律作業ロボットの需要が高まってきている。特に、目的の物体を把持して所定の場所に置く作業が可能なピッキングロボットは、様々な場面での活躍が期待されている。ここで、そのようなピッキングロボットが、物体を正しく把持するためには、その物体の位置、姿勢、及び種類を正確に認識できなければならない。しかし、多様な物品が扱われる、もしくは扱われる物品の入れ替わりが激しい現場での運用を考えると、高い認識率を保つためには、物体認識処理に関係するパラメータを、目的の物体が変わるたびに、その物体に合わせて調整する必要がある。本実施形態では、その物体ごとのパラメータ最適化を、実施例１の最適化装置が担う。すなわち、本実施形態は、最適化装置の出力部１３に出力される最適パラメータを用いた、ピッキングロボットの物体認識及び把持の機能を持つこと以外は、その構成は実施例１と同じである。以下では、この例における、最適化装置の入出力やその効果を具体的に説明する。 In recent years, labor shortages have increased the demand for autonomous work robots that replace manual labor at logistics and manufacturing sites. In particular, picking robots, which can grasp a target object and place it at a predetermined location, are expected to be useful in a variety of situations. For such a picking robot to grasp an object correctly, it must be able to recognize the position, orientation, and type of the object accurately. However, at sites where a wide variety of articles is handled or where the articles change frequently, maintaining a high recognition rate requires the parameters related to object recognition processing to be adjusted to the target object every time it changes. In this embodiment, the optimization device of Example 1 performs this per-object parameter optimization. That is, the configuration of this embodiment is the same as that of Example 1, except that it additionally has the picking robot's object recognition and grasping functions, which use the optimal parameters output to the output unit 13 of the optimization device. The following describes the inputs and outputs of the optimization device and its effects in this example.

本実施形態のピッキングロボットの物体認識機能では、カメラで撮影されたある状況の画像（以下、シーン画像）の中に目的物体があるかどうか、そしてある場合はそれがどこにどのような姿勢で存在するかを、判断するものとする。一般に、その機能を実現するプログラムには、認識性能に影響を与えうるパラメータが複数存在する。例えば、予め取得した目的物体の特徴点とシーン画像の特徴点とのマッチングに基づく姿勢推定の場合、認識性能に影響を与えるパラメータとして、シーン画像からの特徴点取得に関するもの（例えば、特徴点とみなす輝度勾配の最小許容値）や、特徴点マッチングに関するもの（例えば、組とみなす距離の最大許容値）、などが考えられる。そのようなパラメータの最適化における評価関数は認識性能であり、具体的にはシーン画像に対する目的物体の認識率となる。ただし、目的に応じて、認識処理時間が短いほど値が大きくなるような項を加えてもよい。いずれにせよ複雑な認識処理を伴うため、評価関数の具体的な関数形は分からず、微分値の情報を利用することができない。そのため、今の場合、（本実施例が対象としているような）探索点生成分布を用いた最適パラメータ探索が有力な最適化手法となる。 The object recognition function of the picking robot of this embodiment determines whether the target object is present in an image of a given situation captured by a camera (hereinafter, a scene image) and, if so, where and in what orientation it is located. In general, a program implementing this function has multiple parameters that can affect recognition performance. For example, in pose estimation based on matching feature points of the target object acquired in advance against feature points of the scene image, the parameters affecting recognition performance include those related to extracting feature points from the scene image (for example, the minimum brightness gradient for a point to be regarded as a feature point) and those related to feature point matching (for example, the maximum distance for two points to be regarded as a pair). The evaluation function in optimizing such parameters is the recognition performance, specifically the recognition rate of the target object over the scene images. Depending on the purpose, a term that grows as the recognition processing time shrinks may be added. In either case, because complex recognition processing is involved, the specific functional form of the evaluation function is unknown and derivative information cannot be used. In this situation, therefore, optimal parameter search using a search point generation distribution (as addressed by this example) is a promising optimization method.

最適化装置の入力部１１は、目的物体の情報（例えば、特徴点やその特徴量）、シーン画像群、物体認識処理に関係するパラメータとその初期値、及びハイパーパラメータの値を受け付ける。ここで、各パラメータを探索対象に含めるか否かの事前選定は一般に困難であり、全パラメータを初期パラメータとして入力部１１に登録しておくことが望ましい。ただし、その弊害として不要なパラメータが含まれ探索性能の悪化が起こり得る。最適化計算部１２では、認識性能を最大にするパラメータを探索するが、その評価値計算ではシーン画像群に対する認識を毎回実行する必要があり、一般に所要時間が長くなるため、探索の効率化の恩恵は大きい。出力部１３に出力された最適パラメータは、出力装置もしくは通信IFを介して、ピッキングロボットの制御装置に送られ、目的物品の正確な把持を実現する。なお、最適化装置１はピッキングロボットの制御装置の一部としてもよいし、別個独立の構成としてもよい。 The input unit 11 of the optimization device receives information on the target object (for example, feature points and their feature amounts), a group of scene images, the parameters related to object recognition processing and their initial values, and the hyperparameter values. It is generally difficult to decide in advance whether each parameter should be included in the search target, so it is desirable to register all parameters in the input unit 11 as initial parameters. The drawback is that unnecessary parameters may be included, which can degrade search performance. The optimization calculation unit 12 searches for the parameters that maximize recognition performance; each evaluation value calculation requires running recognition on the group of scene images, which generally takes a long time, so the benefit of a more efficient search is large. The optimal parameters output to the output unit 13 are sent to the control device of the picking robot via the output device or the communication IF, enabling accurate grasping of the target article. The optimization device 1 may be part of the control device of the picking robot or may be a separate, independent unit.

The effects of this example in the application described above are as follows. As noted above, the input parameters in this embodiment may include unnecessary ones. Moreover, because multiple parameters depend on one another in a complicated way, it is quite possible that the evaluation function has multiple locally flat regions. In the optimization device of this embodiment, however, the search parameter selection process detects and removes parameters of low search importance during the search, which reduces the degradation of search performance. As a result, the teaching required for gripping the target object can be performed efficiently, making it easier for the picking robot to handle a wide variety of products.

This embodiment is characterized in that, in the configuration of the embodiment of Example 1, the output unit 13 has a function of receiving not only the optimal parameters obtained when the optimization calculation ends but also information obtained during the search, and displays that information externally in real time through the output device. As shown in the embodiment of Example 1, in each loop iteration the distribution shape holding unit 121 stores the quantities that characterize the shape of the search point generation distribution at that point in the search, and the search parameter selection unit 122 selects whether each parameter is a search target. The output unit 13 of this embodiment can receive this information during the search, in addition to the optimal parameters. The hardware that implements the output device in this embodiment is not restricted as long as it can continuously display the received information in a form that the operator can recognize visually; a display is one example.

FIG. 11 is a diagram showing an example of a hardware configuration for realizing this embodiment. The example depicts a computer equipped with a processor 101 and storage devices such as a memory 102 and an auxiliary storage device 103, a keyboard serving as the input device 104 for instructing, for example, execution of the optimization program, and a display 1101 serving as the output device 105. In this example, the information shown on the display 1101 changes from moment to moment during the optimization calculation. Specifically, the parameters selected by the search parameter selection unit 122 and the parameters not selected are shown on the display 1101.

In addition, a configuration may be added that allows the various information held by the distribution shape holding unit 121 to be referenced when selecting input parameters in the input unit, setting their domains and initial values, and setting the hyperparameters.

Adding the functions described above yields, for example, the following advantages over the embodiment of Example 1.

The first is that the progress of the search can be estimated more accurately than from the number of initial input parameters alone. For example, even when the evaluation value calculation in a given loop iteration yields the same result, knowing the number and kinds of parameters that have been excluded from the search makes it possible to judge that the remaining search will finish sooner than could be judged without that knowledge. Depending on the purpose, this more accurate estimate can be used to end the search early. For example, when the maximum of the evaluation function is unknown, so that whether the best evaluation value has reached a target value cannot be used as the termination criterion, whether all parameters have been excluded from the search can be adopted as an additional termination condition. This advantage also makes it conceivable to change the optimization method (for example, the method of updating the search point generation distribution) according to the number of search parameters.
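The additional termination condition mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, its arguments, and the iteration cap are hypothetical and do not come from the source, which only states that "all parameters excluded from the search" may serve as an extra termination condition when no target value is available.

```python
from typing import Optional, Set


def should_terminate(
    best_value: float,
    target_value: Optional[float],
    active_params: Set[str],
    iteration: int,
    max_iterations: int,
) -> bool:
    """Decide whether the search loop should stop (illustrative only)."""
    # Usual criterion, usable only when a target evaluation value is known.
    if target_value is not None and best_value >= target_value:
        return True
    # Additional criterion: no parameter remains a search target.
    if not active_params:
        return True
    # Assumed safety cap on the number of loop iterations.
    return iteration >= max_iterations
```

For example, `should_terminate(0.5, None, set(), 3, 100)` returns `True` even though no target value is known, because no parameter is left to search.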

The second is that displaying the quantities that characterize the shape of the search point generation distribution can make similar optimization calculations more efficient. As described above, the data accumulated in the distribution shape holding unit 121 of this embodiment can support the design of input parameters in another optimization calculation whose evaluation function is expected to have a similar shape. In particular, adopting a real-time display function as in this embodiment enables rapid input parameter design support when multiple similar optimization calculations are to be run in parallel.

As described in the examples above, in an optimization method that searches for optimal parameters by repeatedly generating search points from a distribution and updating the distribution based on their evaluation values, flat regions of the evaluation function degrade search performance when the distribution update is unbiased. The technique provided by these examples is an optimization device that, after accepting registration of the parameters to be searched and their evaluation function, repeats the generation of search points from a distribution, the calculation of their evaluation values, and the update of the distribution based on those evaluation values, and outputs parameter values that make the evaluation value as large as possible. The device has a database that holds time-series information on the distribution shape, and uses that information to decide whether to search each parameter and to update the hyperparameters accordingly, thereby making the search more efficient.
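The loop summarized above can be sketched in Python as follows. This is a simplified illustration, not the update rule of the embodiments: it assumes an independent Gaussian per parameter, a simple elite-sample update of the mean and variance (chosen here for brevity), the variance as the stored shape quantity, a relative-change threshold for parameter selection, and fixing unselected parameters at the distribution center. All names and default values are hypothetical, and hyperparameter updating is omitted.

```python
import random
from typing import Callable, Dict, List, Set, Tuple

Point = Dict[str, float]


def select_active(history: List[Point], active: Set[str],
                  window: int, threshold: float) -> Set[str]:
    """Keep parameters whose variance changed by at least `threshold`
    (relative to its earlier value) over the last `window` loops."""
    if len(history) <= window:
        return set(active)  # not enough time-series data yet
    old, new = history[-1 - window], history[-1]
    return {k for k in active if abs(new[k] - old[k]) >= threshold * old[k]}


def optimize(evaluate: Callable[[Point], float], mean: Point, var: Point,
             loops: int = 30, samples: int = 20, elite: int = 5,
             window: int = 5, threshold: float = 0.05,
             seed: int = 0) -> Tuple[Point, float, Set[str], List[Point]]:
    rng = random.Random(seed)
    mean, var = dict(mean), dict(var)
    active = set(mean)
    history = [dict(var)]  # time series of the distribution shape
    best_point, best_value = dict(mean), evaluate(mean)
    for _ in range(loops):
        if not active:
            break  # every parameter has been removed from the search
        scored = []
        for _ in range(samples):
            # Unselected parameters stay fixed at the distribution center.
            p = {k: rng.gauss(mean[k], var[k] ** 0.5) if k in active else mean[k]
                 for k in mean}
            scored.append((evaluate(p), p))
        scored.sort(key=lambda vp: vp[0], reverse=True)
        if scored[0][0] > best_value:
            best_value, best_point = scored[0]
        top = [p for _, p in scored[:elite]]
        for k in active:
            # Variance is measured around the previous mean, then the mean moves.
            var[k] = sum((p[k] - mean[k]) ** 2 for p in top) / elite
            mean[k] = sum(p[k] for p in top) / elite
        history.append(dict(var))
        active = select_active(history, active, window, threshold)
    return best_point, best_value, active, history
```

With only a few elite samples the raw variance fluctuates considerably from loop to loop, so this particular threshold test is noisy; the sketch is meant to show where the time-series information enters the loop rather than to suggest a tuned criterion.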

1 optimization device, 11 input unit, 12 optimization calculation unit, 13 output unit, 101 processor, 102 memory, 103 auxiliary storage device, 104 input device, 105 output device, 106 communication IF, 107 internal communication line, 121 distribution shape holding unit, 122 search parameter selection unit, 123 search point generation unit, 124 evaluation value calculation unit, 125 termination determination unit, 126 distribution shape updating unit

Claims

An optimization device that optimizes parameter values, comprising:
an input unit that receives a parameter to be searched and an evaluation function serving as its evaluation index;
an optimization calculation unit that calculates an optimal value of the parameter based on the evaluation function; and
an output unit that outputs the optimal value,
wherein the optimization calculation unit includes:
a search point generation unit that generates a search point, which is a value of the parameter to be evaluated, from a search point generation distribution;
an evaluation value calculation unit that calculates an evaluation value of the search point based on the evaluation function;
a distribution shape updating unit that updates the search point generation distribution based on the evaluation value;
a distribution shape holding unit that holds quantities characterizing the search point generation distribution as time-series information;
a search parameter selection unit that selects parameters to be searched based on the time-series information; and
a termination determination unit that determines termination based on a predetermined termination condition.

The optimization device according to claim 1,
wherein the search parameter selection unit selects, based on the time-series information of the distribution shape of the search point generation distribution held in the distribution shape holding unit, parameters for which the change in the distribution shape is greater than or equal to a threshold value.

The optimization device according to claim 1,
wherein the updating method of the distribution shape updating unit is such that the change in the distribution shape is zero on average for an evaluation function in which the evaluation value has no correlation with the search point.

The optimization device according to claim 1,
wherein the optimization calculation unit calculates the optimal value of the parameter by loop processing, and
the distribution shape holding unit holds, as time-series information, a quantity characterizing the shape of the search point generation distribution updated by the distribution shape updating unit in each loop iteration.

The optimization device according to claim 1,
wherein the search parameter selection unit changes the value of a hyperparameter based on the selection result of the search parameters.

The optimization device according to claim 1,
wherein the quantity characterizing the search point generation distribution is a variance value.

The optimization device according to claim 1,
wherein parameters not selected by the search parameter selection unit are fixed to constant values corresponding to the center values of the search point generation distribution.

The optimization device according to claim 1,
wherein the search parameter selection unit includes a function of determining whether to resume the search for a parameter that was once removed from the search targets.

The optimization device according to claim 1,
wherein the quantities characterizing the shape of the search point generation distribution include a quantity expressed as a second-order matrix, and the search parameter selection unit performs diagonalization of that matrix.

An optimization device using the optimization device according to claim 1,
configured as an optimization device for the object recognition function of a picking robot,
wherein the information received by the input unit includes information on a target object and a group of images showing the target object,
the output unit outputs parameter values of a recognition processing function that recognizes whether the target object is included in an image and, if it is included, the position and orientation of the object, and
the parameters are used for accurate recognition and gripping of the target object by the picking robot.

The optimization device according to claim 1,
wherein the output unit has an output device capable of receiving, during the search, calculation results that reflect the progress of the search and the shape of the evaluation function, and of displaying that information visually even during the search.

An optimization method that is executed by an information processing device including an input device, an output device, a processor, and a storage device, and that optimizes parameter values, the method comprising:
a first step of receiving a parameter to be searched and an evaluation function serving as its evaluation index;
a second step of generating a search point, which is a value of the parameter to be searched, from a search point generation distribution;
a third step of calculating an evaluation value of the search point based on the evaluation function;
a fourth step of updating the search point generation distribution based on the evaluation value;
a fifth step of holding quantities characterizing the search point generation distribution as time-series information; and
a sixth step of selecting parameters to be searched based on the time-series information.

The optimization method according to claim 12,
wherein, after the sixth step, the process returns to the second step and the loop is repeated until a termination condition is satisfied.

The optimization method according to claim 12,
wherein the second step, the third step, and the fourth step are performed based on a distribution prediction algorithm.

The optimization method according to claim 12,
further comprising a seventh step of updating hyperparameters based on the number of parameters to be searched.