JP6785741B2

JP6785741B2 - Optimizer, traffic signal control system, parameter search device, optimization method, and program

Info

Publication number: JP6785741B2
Application number: JP2017210863A
Authority: JP
Inventors: 秀剛伊藤; 恭太堤田; 達史松林; 浩之戸田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-10-31
Filing date: 2017-10-31
Publication date: 2020-11-18
Anticipated expiration: 2037-10-31
Also published as: JP2019082934A

Description

The present invention relates to an optimization device, a traffic signal control system, a parameter search device, an optimization method, and a program, and in particular to an optimization device, a traffic signal control system, a parameter search device, an optimization method, and a program for optimizing the parameters of machine learning and simulation.

In recent years, machine learning and simulation have become increasingly important. One example of a technique that uses them is the reproduction of urban traffic by moving a large number of vehicles in a simulation (Non-Patent Document 1). The performance of machine learning varies with its hyperparameters, and the output of a simulation likewise varies with its parameters. Hereinafter, hyperparameters and parameters are collectively referred to as parameters.

The parameters must be optimized to appropriate values. Optimization is performed so that an index specified in advance takes its best value. Obtaining the value of the index is referred to here as evaluation. As machine learning and simulation have grown more sophisticated in recent years, the time required for a single evaluation has increased. Techniques that reduce the number of evaluations have therefore been proposed in order to shorten the time required to optimize the parameters (Non-Patent Document 2).

Non-Patent Document 1: Krajzewicz, D., Brockfeld, E., Mikat, J., Ringel, J., Rossel, C., Tuchscheerer, W., Wagner, P., and Wosler, R.: Simulation of modern Traffic Lights Control Systems using the open source Traffic Simulation SUMO, Proceedings of the 3rd Industrial Simulation Conference 2005, pp. 299-302 (2005).
Non-Patent Document 2: Shahriari, B., Swersky, K., Wang, Z., Adams, R. P. and Freitas, de N.: Taking the human out of the loop: A review of bayesian optimization, Proceedings of the IEEE, Vol. 104, No. 1, pp. 148-175 (2016).
Non-Patent Document 3: Papageorgiou, M., Diakaki, C., Dinopoulou, V., Kotsialos, A. and Wang, Y.: Review of road traffic control strategies, Proceedings of the IEEE, Vol. 91, No. 12, pp. 2043-2067 (2003).

When the above optimization is performed, the space that the parameters can take is sometimes divided into several subspaces. In that case, the space that the parameters can take must first be restricted to one subspace, and the optimum parameters must then be selected from within that subspace. Hereinafter, such a subspace is simply referred to as a parameter space.

Traffic signal control is one example in which this kind of optimization is performed. In traffic signal control, a plan for switching the signal light colors is created for one cycle, and the signals are controlled by repeating that plan. The plan is uniquely determined by specifying parameters. In the process of optimizing these parameters, a simulation is used to carry out the evaluation (Non-Patent Document 1).

In traffic signal control, coordinated control of a plurality of signals is known to be effective in alleviating traffic congestion (Non-Patent Document 3). In this case, to determine the range over which coordinated control is performed, every signal to be controlled is assigned to exactly one subarea. Determining the subarea to which each signal belongs uniquely determines the parameter space that the parameters can take. It is therefore necessary to determine the parameter space by deciding the subarea to which each signal belongs, and then to determine the optimum parameters within that parameter space.

The present invention has been made in view of the above points, and an object of the present invention is to provide an optimization device, an optimization method, and a program capable of optimizing parameters with a small number of evaluations.

Another object of the present invention is to provide a traffic signal control system capable of optimizing signal parameters with a small number of traffic simulations.

A further object of the present invention is to provide a parameter search device capable of obtaining optimized parameters.

The optimization device according to the present invention optimizes parameters used when a calculation is performed with evaluation data as input. It includes: an evaluation unit that calculates, based on the evaluation data and the parameters, an index for evaluating the result of the calculation; an optimization unit that optimizes the parameters and a plurality of parameter spaces obtained by dividing the space that the parameters can take; and an output unit that outputs optimized parameters obtained by repeating the processing by the evaluation unit and the processing by the optimization unit. The optimization unit includes a parameter space optimization unit and a parameter optimization unit. The parameter space optimization unit learns, based on the parameters and the index, a first model for predicting an index for a parameter space; predicts, for each of the plurality of parameter spaces, a first index, which is the index for that parameter space, based on the first model; and selects the parameter space to be evaluated next by the evaluation unit based on a value calculated from the predicted first index. The parameter optimization unit learns, based on the parameters and the index, a second model for predicting an index for a parameter; predicts, for each parameter that is included in the parameter space selected by the parameter space optimization unit and has not been evaluated in the past, a second index, which is the index for that parameter, based on the second model; and selects the parameter to be evaluated next by the evaluation unit based on a value calculated from the predicted second index.

The optimization method according to the present invention optimizes parameters used when an output is produced from evaluation data as input. It includes: a step in which an evaluation unit calculates, based on the evaluation data and the parameters, an index for evaluating the output; a step in which an optimization unit optimizes the parameters and a plurality of parameter spaces obtained by dividing the space that the parameters can take; and a step in which an output unit outputs optimized parameters obtained by repeating the processing by the evaluation unit and the processing by the optimization unit. The step of optimizing by the optimization unit includes two steps. In the first, a parameter space optimization unit learns, based on the parameters and the index, a first model for predicting an index for a parameter space; predicts, for each of the plurality of parameter spaces, a first index, which is the index for that parameter space, based on the first model; and selects the parameter space to be evaluated next by the evaluation unit based on a value calculated from the predicted first index. In the second, a parameter optimization unit learns, based on the parameters and the index, a second model for predicting an index for a parameter; predicts, for each parameter that is included in the parameter space selected by the parameter space optimization unit and has not been evaluated in the past, a second index, which is the index for that parameter, based on the second model; and selects the parameter to be evaluated next by the evaluation unit based on a value calculated from the predicted second index.

According to the optimization device and the optimization method of the present invention, the evaluation unit calculates an index for evaluating the output based on the evaluation data and the parameters, the optimization unit optimizes the parameters and the parameter spaces obtained by dividing the space that the parameters can take, and the output unit outputs the optimized parameters obtained by repeating the processing by the evaluation unit and the processing by the optimization unit.

In the processing by the optimization unit, the parameter space optimization unit learns, based on the parameters and the index, a first model for predicting an index for a parameter space; predicts, for each of the plurality of parameter spaces, a first index, which is the index for that parameter space, based on the first model; and selects the parameter space to be evaluated next by the evaluation unit based on a value calculated from the predicted first index. The parameter optimization unit learns, based on the parameters and the index, a second model for predicting an index for a parameter; predicts, for each parameter that is included in the selected parameter space and has not been evaluated in the past, a second index, which is the index for that parameter, based on the second model; and selects the parameter to be evaluated next by the evaluation unit based on a value calculated from the predicted second index.

In this way, the following are repeated: the first index is predicted from the model for parameter spaces, and the parameter space to be evaluated next is selected based on a value calculated from it; the second index is predicted from the model for parameters for each not-yet-evaluated parameter in the selected parameter space, and the parameter to be evaluated next is selected based on a value calculated from it; and the index for evaluating the output is calculated from the evaluation data and the selected parameter. The parameters can thereby be optimized with a small number of evaluations.
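The repeated two-level selection described above can be sketched as follows. This is a minimal illustration of the control flow only: the two learned models are replaced by placeholder scoring rules (an optimistic best-value score for parameter spaces and a random pick among unevaluated parameters), and the names `two_level_search`, `spaces`, and `evaluate` are hypothetical, not taken from the patent.

```python
import random

def two_level_search(spaces, evaluate, n_iter, seed=0):
    """Sketch of the two-level loop (lower index values are assumed better).

    spaces:   dict mapping a parameter-space id to its list of candidate parameters
    evaluate: callable(parameter) -> index value (stands in for the evaluation unit)
    """
    rng = random.Random(seed)
    history = {}                         # parameter -> observed index
    per_space = {i: [] for i in spaces}  # space id -> observed indices

    def space_score(i):
        # Placeholder for the first model plus its acquisition value:
        # unexplored spaces first, then best value with a small count bonus.
        obs = per_space[i]
        if not obs:
            return float("inf")
        return -min(obs) + 1.0 / len(obs)

    for _ in range(n_iter):
        # Level 1: select the parameter space to evaluate next.
        open_spaces = [i for i in spaces
                       if any(p not in history for p in spaces[i])]
        if not open_spaces:
            break
        chosen = max(open_spaces, key=space_score)

        # Level 2: select a not-yet-evaluated parameter in that space
        # (placeholder for the second model: uniform random choice).
        unevaluated = [p for p in spaces[chosen] if p not in history]
        param = rng.choice(unevaluated)

        # Evaluation: compute the index and record it for both levels.
        value = evaluate(param)
        history[param] = value
        per_space[chosen].append(value)

    best = min(history, key=history.get)
    return best, history[best], history
```

In an actual implementation, `space_score` and the random choice would each be replaced by a learned surrogate model and a value computed from its prediction.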

In the optimization device according to the present invention, the parameter space optimization unit may use the first model to predict, for each of the plurality of parameter spaces, the evaluation of that parameter space; calculate a first acquisition function that takes the predicted evaluation of the parameter space as a variable; and select the parameter space that maximizes the first acquisition function as the parameter space to be evaluated next by the evaluation unit. The parameter optimization unit may use the second model to predict, for each parameter that is included in the parameter space selected by the parameter space optimization unit and has not been evaluated in the past, the evaluation of that parameter; calculate a second acquisition function that takes the predicted evaluation of the not-yet-evaluated parameter as a variable; and select the parameter that maximizes the second acquisition function as the parameter to be evaluated next by the evaluation unit.
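As an illustration of selecting the candidate that maximizes an acquisition function, the sketch below uses expected improvement for a minimization problem. The choice of expected improvement is an assumption made for this example; this passage does not name a specific acquisition function. The helper names `expected_improvement` and `select_next` are likewise hypothetical.

```python
import math

def expected_improvement(mean, std, best_so_far):
    """Expected improvement for minimization, given a predicted mean and
    standard deviation (assumed acquisition function, not from the source)."""
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(best_so_far - mean, 0.0)
    z = (best_so_far - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best_so_far - mean) * cdf + std * pdf

def select_next(predictions, best_so_far):
    """Return the candidate whose acquisition value is largest.

    predictions: dict mapping candidate -> (predicted mean, predicted std)
    """
    return max(predictions,
               key=lambda c: expected_improvement(*predictions[c], best_so_far))
```

The same selection routine could serve both levels: once with parameter spaces as candidates and once with the unevaluated parameters of the chosen space.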

In the optimization device according to the present invention, the first model and the second model may each be a probabilistic model that uses a Gaussian process.
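For readers unfamiliar with Gaussian process models, the following is a minimal sketch of Gaussian process regression that returns a predictive mean and standard deviation. It assumes a zero-mean prior and an RBF kernel with fixed hyperparameters; the kernel, the noise level, and all function names are assumptions for illustration and are not specified in this passage.

```python
import math

def rbf(a, b, length_scale=1.0):
    # Squared-exponential kernel between two points given as tuples.
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq / length_scale ** 2)

def cholesky(A):
    # Lower-triangular factor L with A = L L^T.
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    # Forward substitution: solve L y = b.
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    # Back substitution: solve L^T x = y.
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(X, y, x_new, noise=1e-6, length_scale=1.0):
    """Posterior mean and standard deviation at x_new (zero-mean prior)."""
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y))      # K^{-1} y
    k_star = [rbf(a, x_new, length_scale) for a in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_new, x_new, length_scale) - sum(t * t for t in v)
    return mean, math.sqrt(max(var, 0.0))
```

The predictive standard deviation is what lets an acquisition function trade off exploring uncertain candidates against exploiting candidates with good predicted values.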

In the optimization device according to the present invention, the parameter space optimization unit may learn the first model using the parameter space to which the parameter used by the evaluation unit for evaluation belongs, the number of evaluations performed with parameters belonging to that parameter space, and the index obtained by the evaluation unit. The parameter optimization unit may learn the second model using the parameter used when the evaluation unit performed the evaluation and the index obtained by the evaluation unit.

The traffic signal control system according to the present invention includes a control device that controls a plurality of traffic signals, and an optimization device that takes as input the evaluation data necessary for calculating traffic conditions and optimizes the signal parameters used by the control device. The control device includes an input unit that receives input of the traffic situation, and a control unit that takes the traffic situation as input and controls the plurality of traffic signals using the signal parameters obtained by the optimization device. The optimization device includes: an evaluation unit that calculates the traffic conditions from the evaluation data using the signal parameters, and calculates an index for evaluating the calculated traffic conditions; an optimization unit that optimizes the signal parameters and a plurality of parameter spaces obtained by dividing the space that the signal parameters can take; and an output unit that outputs optimized parameters obtained by repeating the processing by the evaluation unit and the processing by the optimization unit. The optimization unit includes a parameter space optimization unit and a parameter optimization unit. The parameter space optimization unit learns, based on the signal parameters and the index, a first model for predicting an index for a parameter space; predicts, for each of the plurality of parameter spaces, a first index, which is the index for that parameter space, based on the first model; and selects the parameter space to be evaluated next by the evaluation unit based on a value calculated from the predicted first index. The parameter optimization unit learns, based on the signal parameters and the index, a second model for predicting an index for a signal parameter; predicts, for each signal parameter that is included in the parameter space selected by the parameter space optimization unit and has not been evaluated in the past, a second index, which is the index for that signal parameter, based on the second model; and selects the signal parameter to be evaluated next by the evaluation unit based on a value calculated from the predicted second index.

According to the traffic signal control system of the present invention, the following are repeated: the first index is predicted from the model for parameter spaces, and the parameter space to be evaluated next is selected based on a value calculated from it; the second index is predicted from the model for parameters for each not-yet-evaluated signal parameter in the selected parameter space, and the signal parameter to be evaluated next is selected based on a value calculated from it; and the traffic conditions are calculated from the evaluation data and the selected signal parameter. The signal parameters can thereby be optimized with a small number of traffic simulations.

The parameter search device according to the present invention includes: a parameter space optimization unit that, for each of a plurality of parameter spaces obtained by dividing the space that the parameters can take, predicts a first index, which is the index for that parameter space, based on a first model for predicting the first index, and selects a parameter space based on the predicted first index; a parameter optimization unit that, for each parameter included in the parameter space selected by the parameter space optimization unit, predicts a second index, which is the index for that parameter, based on a second model for predicting the second index, and selects a parameter based on the second index; and an output unit that outputs the parameter selected by the parameter optimization unit.

According to the parameter search device of the present invention, optimized parameters can be obtained by predicting the first index for each of the plurality of parameter spaces based on the first model, selecting a parameter space based on the predicted first index, predicting the second index for each parameter included in the selected parameter space based on the second model, selecting a parameter based on the second index, and outputting the selected parameter.

The program according to the present invention is a program for causing a computer to function as each unit of the above optimization device.

According to the optimization device, the optimization method, and the program of the present invention, the following are repeated: the first index is predicted from the model for parameter spaces, and the parameter space to be evaluated next is selected based on a value calculated from it; the second index is predicted from the model for parameters for each parameter in the selected parameter space, and the parameter to be evaluated next is selected based on a value calculated from it; and the index for evaluating the output is calculated from the evaluation data and the selected parameter. Both the parameter space and the parameters within that parameter space can therefore be optimized with a small number of evaluations.

According to the traffic signal control system of the present invention, the same selection of a parameter space and then of a not-yet-evaluated signal parameter within it is repeated together with the calculation of the traffic conditions from the evaluation data and the selected signal parameter, so that the signal parameters can be optimized with a small number of traffic simulations.

According to the parameter search device of the present invention, optimized parameters can be obtained by predicting the first index for each of the plurality of parameter spaces based on the first model, selecting a parameter space based on the predicted first index, predicting the second index for each parameter included in the selected parameter space based on the second model, selecting a parameter based on the second index, and outputting the selected parameter.

A block diagram showing the configuration of the traffic signal control system according to the embodiment of the present invention.
A diagram showing an example of part of the information stored in the parameter space storage unit according to the embodiment of the present invention.
A diagram showing an example of part of the information stored in the parameter storage unit according to the embodiment of the present invention.
A diagram showing the relationship between the number of searches and the congestion loss time for each parameter space in an example of experimental results according to the embodiment of the present invention.
A flowchart showing the optimization processing routine in the optimization device according to the embodiment of the present invention.

<Configuration of the Traffic Signal Control System According to the Embodiment of the Present Invention>
Hereinafter, embodiments of the present invention will be described with reference to the drawings.

This embodiment describes a case in which the optimization device according to the embodiment is applied to the optimization of the signal parameter s using traffic simulation in traffic signal control.

In this embodiment, traffic signal control is performed by a control device. In traffic signal control, a plan for switching the signal light colors is created for one cycle, and the signals are controlled by repeating that plan. The plan is uniquely determined by specifying the signal parameter s. The optimization device according to this embodiment performs the process of optimizing the signal parameter s.

In this embodiment, to determine the range over which coordinated control of a plurality of signals is performed, every signal to be controlled is assigned to exactly one subarea. Determining the subarea to which each signal belongs uniquely determines the parameter space i that the signal parameter s can take.
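To make the link between subarea assignments and parameter spaces concrete, the sketch below enumerates every way of grouping a small set of signals into subareas, treating each grouping as one candidate parameter space. It assumes that any grouping is admissible; a real road network would likely impose constraints (for example, adjacency) that are not stated here. The function name `partitions` and the signal labels are hypothetical.

```python
def partitions(items):
    """Yield every way to split `items` into non-empty groups (subareas).

    Each yielded value is a list of groups; each signal appears in exactly
    one group, matching the rule that a signal belongs to one subarea.
    """
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing group in turn ...
        for k in range(len(part)):
            yield part[:k] + [[first] + part[k]] + part[k + 1:]
        # ... or into a new group of its own.
        yield [[first]] + part

# Each grouping of three signals corresponds to one parameter space i.
signals = ["s1", "s2", "s3"]
parameter_spaces = {i: grouping for i, grouping in enumerate(partitions(signals))}
```

The number of groupings grows very quickly with the number of signals, which is one reason evaluating every parameter space exhaustively is impractical.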

FIG. 1 is a block diagram showing the configuration of the traffic signal control system according to the embodiment of the present invention.

The traffic signal control system 1 according to this embodiment includes an optimization device 10, a control device 50, and a plurality of traffic signals (not shown).

The optimization device 10 according to this embodiment is a computer including a CPU, a RAM, and a ROM that stores a program for executing the optimization processing routine described later, and is functionally configured as follows.

As shown in FIG. 1, the optimization device 10 according to the embodiment of the present invention includes an optimization unit 100, an evaluation data storage unit 200, an evaluation unit 300, and an output unit 400.

The optimization unit 100 optimizes the signal parameter s and the parameter space i obtained by dividing the space that the signal parameter s can take.

Specifically, the optimization unit 100 includes a parameter space storage unit 110, a parameter space optimization unit 120, a parameter storage unit 130, and a parameter optimization unit 140.

The parameter space storage unit 110 stores data from evaluations performed in the past.

Specifically, the parameter space storage unit 110 stores the following: the traffic simulation count t; the feature vector x of the parameter space i selected in the t-th traffic simulation; the vector z_{i,t} whose elements are the number of times the parameter space i has been selected up to the t-th simulation and the number of times any parameter space has been selected; and the value y calculated from the set of indices obtained in the traffic simulations performed with the parameter space i up to the t-th simulation.

The feature vector x of the parameter space i may take any form. One example is a binary vector expressing whether two signals belong to the same parameter space. The value y calculated from the set of indices may likewise be obtained by any calculation, for example by taking the minimum of the set of indices or by reordering the set of indices.

The sets of the feature vectors x, the vectors z, and the values y over t = 1, 2, ... are denoted X, Z, and Y, respectively. FIG. 2 shows an example of part of the stored information.
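The binary feature vector mentioned as an example can be sketched as follows. This is a minimal illustration, not the specification's own encoding: it reads "two signals belong to the same parameter space" as "two signals are assigned to the same subarea", and the function name `partition_features` and the label-list input format are assumptions made for the example.

```python
from itertools import combinations


def partition_features(assignment):
    """Return a binary vector with one entry per unordered pair of signals.

    assignment[k] is the (hypothetical) subarea label of signal k.
    An entry is 1 when the two signals of the pair share a subarea, else 0.
    """
    return [
        1 if assignment[a] == assignment[b] else 0
        for a, b in combinations(range(len(assignment)), 2)
    ]


# Three signals: signals 0 and 1 share subarea "A", signal 2 is alone in "B".
# The entries correspond to the pairs (0, 1), (0, 2), (1, 2).
x = partition_features(["A", "A", "B"])
print(x)
```

With this encoding, two assignments that group the signals in similar ways produce feature vectors that differ in only a few entries, which is what lets a kernel on x share information between related parameter spaces.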

The parameter space storage unit 110 also stores, in advance, a correspondence table between each parameter space i and its feature vector.

When the parameter space storage unit 110 receives the parameter space i_{t+1} and the index y_{t+1} from the evaluation unit 300, it looks up the feature vector of the parameter space i_{t+1} in the correspondence table. It then updates the stored selection counts and obtains the vector z from the updated counts.

Finally, the parameter space storage unit 110 adds the feature vector to X, the vector z to Z, and the index obtained from the evaluation unit 300 to Y.
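The bookkeeping described for the parameter space storage unit 110 might look like the following sketch. The class name `ParameterSpaceStore`, its attribute names, and the choice of the running minimum as the stored value y are assumptions for illustration (the text names the minimum only as one possible calculation).

```python
class ParameterSpaceStore:
    """Hypothetical stand-in for the parameter space storage unit 110."""

    def __init__(self, features):
        # Correspondence table: parameter space id -> feature vector.
        self.features = dict(features)
        self.counts = {i: 0 for i in self.features}   # selections per space
        self.total = 0                                # selections of any space
        self.indices = {i: [] for i in self.features}
        self.X, self.Z, self.Y = [], [], []

    def update(self, space, index):
        """Record one evaluation of `space` that produced `index`."""
        self.counts[space] += 1
        self.total += 1
        self.indices[space].append(index)
        x = self.features[space]
        z = [self.counts[space], self.total]
        # Assumed example: y is the best (smallest) index seen for this space.
        y = min(self.indices[space])
        self.X.append(x)
        self.Z.append(z)
        self.Y.append(y)
        return x, z, y


store = ParameterSpaceStore({1: [1, 0], 2: [0, 1]})
store.update(1, 5.0)
store.update(2, 3.0)
print(store.update(1, 7.0))
```

Keeping the count vector z next to x means the model can represent how the value for a parameter space improves as that space is searched more often.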

The parameter space optimization unit 120 learns a first model, which predicts the index for a parameter space, based on the feature vectors X of the parameter spaces i, the vectors Z whose elements are the number of times each parameter space was selected and the number of times any parameter space was selected, and the indices Y. For each of the plurality of parameter spaces, it predicts a first index, that is, the index for that parameter space, using the first model, and selects the parameter space that the evaluation unit 300 evaluates next based on a value calculated from the predicted first index.

Specifically, the parameter space optimization unit 120 includes a parameter space model learning unit 122 and a parameter space selection unit 124.

The parameter space model learning unit 122 learns the first model based on X, Z, and Y stored in the parameter space storage unit 110.

First, the parameter space model learning unit 122 acquires X, Z, and Y, together with the correspondence table between each parameter space i and its feature vector, from the parameter space storage unit 110.

Then, the parameter space model learning unit 122 learns the first model based on X, Z, and Y. One example of the first model is a Gaussian process, which is a probabilistic model (Reference 1).

[Reference 1] Rasmussen, C. E. and Williams, C. K. I.: Gaussian processes for machine learning, MIT Press (2006).

With Gaussian process regression, an unknown index y can be inferred as a probability distribution in the form of a normal distribution for inputs x and z for which no traffic simulation has been performed. Here, the kernel function of the Gaussian process is separated into a kernel on x and a kernel on z, as in equation (1).

The kernel on x may be any kernel. One example is the Gaussian kernel given by equation (2) (Non-Patent Document 2).

Here, θ is a real-valued parameter. As one example, θ is set to the point estimate that maximizes the marginal likelihood of the Gaussian process (Reference 1).

The kernel on z may also be any kernel. As one example, the kernel given by equation (3) is used.

Here, equation (3) involves an inner product and three real-valued parameters: a, b, and a third parameter. As an example, a = 100 and b = 1 can be used. The third parameter can be set, for example, to the point estimate that maximizes the marginal likelihood of the Gaussian process (Reference 1).
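The idea of a kernel separated into an x part and a z part can be sketched as below. Equations (1) to (3) are not reproduced in this text, so everything specific here is an assumption: the Gaussian kernel is written in the common form exp(-θ‖u − v‖²), the two parts are combined by a product, and the kernel on z simply reuses the Gaussian form as a stand-in rather than the inner-product kernel with parameters a and b that the text describes.

```python
import math


def gaussian_kernel(u, v, theta=1.0):
    """Gaussian kernel in the assumed form exp(-theta * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-theta * sq_dist)


def separable_kernel(x1, z1, x2, z2, k_x=gaussian_kernel, k_z=gaussian_kernel):
    """Kernel on (x, z) built from a kernel on x and a kernel on z.

    The product combination and the choice of k_z are assumptions.
    """
    return k_x(x1, x2) * k_z(z1, z2)


def gram_matrix(X, Z, kernel=separable_kernel):
    """Kernel matrix over all stored (x, z) pairs, as a Gaussian process needs."""
    n = len(X)
    return [[kernel(X[p], Z[p], X[q], Z[q]) for q in range(n)] for p in range(n)]


K = gram_matrix([[1, 0], [0, 1]], [[1, 1], [1, 2]])
print(K)
```

Passing `k_x` and `k_z` as arguments keeps the two parts independently replaceable, which mirrors the statement that either kernel "may be any kernel".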

Then, the parameter space model learning unit 122 sends the learned first model to the parameter space selection unit 124.

Based on the first model received from the parameter space model learning unit 122, the parameter space selection unit 124 selects the parameter space i_{t+1} for which the next traffic simulation is performed.

Specifically, the parameter space selection unit 124 first performs Gaussian process regression and computes the acquisition function, which expresses the degree to which a parameter space should be simulated next, for the input (feature vector and count vector) of every parameter space.

Here, the count vector used as input is z_{i,t} with the change in value caused by a future selection of the parameter space i added to it. One example is equation (4), which gives z_{i,t} for the case where the parameter space i is selected in the next traffic simulation.

Here, T denotes the transpose.

One example of the acquisition function is the upper confidence bound given by equation (5) (Non-Patent Document 2).

Here, the two quantities appearing in equation (5) are the mean and the variance obtained by Gaussian process regression, and β_{t+1} is a parameter, for which an example setting is given.

Then, the parameter space i_{t+1} that maximizes the acquisition function, as expressed by equation (6), is selected and sent to the parameter selection unit 144.
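The selection step can be illustrated with a small sketch. It assumes the widely used form of the upper confidence bound, mean + sqrt(β · variance), and assumes the modeled quantity is oriented so that larger is better (for example, a negated loss time); equations (5) and (6) are not reproduced in this text, so neither assumption is confirmed by it. The function names and the dictionary format for the predictions are hypothetical.

```python
import math


def ucb(mean, variance, beta):
    """Upper confidence bound in the assumed form mean + sqrt(beta * variance)."""
    return mean + math.sqrt(beta * variance)


def select_space(predictions, beta):
    """Return the parameter space id whose acquisition value is largest.

    predictions maps a space id to the (mean, variance) predicted by the
    first model for that space.
    """
    return max(predictions, key=lambda i: ucb(*predictions[i], beta))


# Space 1 looks better on average; space 2 is far more uncertain.
predictions = {1: (-10.0, 1.0), 2: (-12.0, 25.0)}
print(select_space(predictions, beta=4.0))  # exploration favors space 2
print(select_space(predictions, beta=0.0))  # pure exploitation favors space 1
```

A larger β weights the variance term more heavily, so parameter spaces that have rarely been simulated are tried more readily.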

Like the parameter space storage unit 110, the parameter storage unit 130 stores the results of traffic simulations performed in the past. On request, it reads out the relevant data and sends it to the parameter model learning unit 142.

The data stored in the parameter storage unit 130 are the traffic simulation count t, the signal parameter s_t selected in the t-th traffic simulation, and the index l_t of the t-th traffic simulation. The sets of s_t and l_t over t = 1, 2, ... are denoted S and L, respectively. FIG. 3 shows an example of part of the stored information.

The parameter storage unit 130 also stores a correspondence table between each parameter space i and the space of signal parameters that can be taken under that parameter space i.

When the parameter storage unit 130 receives s_{t+1} and l_{t+1} from the evaluation unit 300, it adds them to S and L, respectively.

The parameter optimization unit 140 learns a second model, which predicts the index l for a parameter, based on the signal parameters s and the indices. For each signal parameter included in the parameter space i selected by the parameter space optimization unit 120, it predicts a second index, that is, the index for that signal parameter, using the second model, and selects the parameter that the evaluation unit 300 evaluates next based on a value calculated from the predicted second index.

Specifically, the parameter optimization unit 140 includes a parameter model learning unit 142 and a parameter selection unit 144.

The parameter model learning unit 142 learns the second model from S and L received from the parameter storage unit 130.

Specifically, the parameter model learning unit 142 first acquires, from the parameter storage unit 130, S, L, and the correspondence table between each parameter space i and the space of signal parameters that can be taken under that parameter space i.

Next, the parameter model learning unit 142 learns the second model from S and L. As with the first model, one example of the second model is a Gaussian process, which infers the unknown index l as a probability distribution in the form of a normal distribution for an input signal parameter s for which no traffic simulation has been performed. Any kernel on s may be used; as one example, the Gaussian kernel of equation (2) is used. A single Gaussian process is normally used for the model, but when there are multiple indices, multiple Gaussian processes can be used.

When signal parameters from different parameter spaces are learned with a Gaussian process, a technique such as arc-GP (Reference 2) can be used to compensate for missing parameters.

[Reference 2] Swersky, K., Duvenaud, D., Snoek, J., Hutter, F., and Osborne, M.: Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces, Proceedings of NIPS workshop on Bayesian Optimization in theory and practice (2013).

Then, the parameter model learning unit 142 sends the learned second model and the correspondence table between each parameter space i and the space of signal parameters that can be taken under that parameter space i to the parameter selection unit 144.

Based on the second model received from the parameter model learning unit 142 and the parameter space i_{t+1} for the next traffic simulation, the parameter selection unit 144 selects the signal parameter s_{t+1} for the next traffic simulation.

Specifically, the parameter selection unit 144 performs Gaussian process regression and computes the acquisition function α(s) for each signal parameter s that belongs to the parameter space i_{t+1} obtained from the parameter space selection unit 124 and has not been evaluated in the past. Here too, the upper confidence bound can be used as an example of the acquisition function (Non-Patent Document 2).

Then, the parameter selection unit 144 selects the signal parameter s_{t+1} that maximizes the acquisition function, as expressed by equation (7). Any method may be used for the maximization.

After that, the parameter selection unit 144 sends the selected signal parameter s_{t+1} and the parameter space i_{t+1} to the evaluation unit 300.
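The restriction to parameters that have not yet been evaluated can be sketched as follows. Exhaustive enumeration over a finite candidate list is only one possible maximization method, chosen here for brevity; the function name, the tuple representation of a signal parameter, and the toy acquisition function are assumptions.

```python
def select_parameter(candidates, evaluated, acquisition):
    """Pick the unevaluated candidate with the largest acquisition value.

    candidates:  signal parameters available in the selected parameter space
    evaluated:   set of signal parameters already simulated
    acquisition: callable returning the acquisition value alpha(s)
    """
    pool = [s for s in candidates if s not in evaluated]
    if not pool:
        raise ValueError("every candidate in this parameter space was evaluated")
    return max(pool, key=acquisition)


# Hypothetical one-dimensional signal parameters (e.g. a cycle length in seconds).
candidates = [(30,), (60,), (90,)]
evaluated = {(90,)}
print(select_parameter(candidates, evaluated, acquisition=lambda s: s[0]))
```

For continuous signal parameters the enumeration would be replaced by a numerical optimizer, which is consistent with the remark that any maximization method may be used.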

The evaluation data storage unit 200 stores evaluation data, that is, the data required to perform a traffic simulation.

The evaluation data may be any data required for the traffic simulation, for example the shape of the roads, the speed limit of each road, the number of vehicles, the time at which each vehicle enters the simulated section, the routes of those vehicles, and the start and end times of the traffic simulation.

The evaluation unit 300 takes the evaluation data as input, computes the traffic conditions using the signal parameters, and computes an index that evaluates the computed traffic conditions.

Specifically, the evaluation unit 300 first acquires the parameter space i_{t+1} and the signal parameter s_{t+1} from the parameter selection unit 144, and the evaluation data from the evaluation data storage unit 200.

Next, the evaluation unit 300 performs a traffic simulation that computes the traffic conditions using the evaluation data acquired from the evaluation data storage unit 200 and the signal parameter s_{t+1} acquired from the parameter selection unit 144, and computes the index. It outputs the computed index both as the index y_{t+1} for the parameter space i_{t+1} and as the index l_{t+1} for the signal parameter s_{t+1}.

One example of the index is the congestion loss time per vehicle. The congestion loss time is the total amount of time by which travel is delayed, relative to a reference time, because of congestion.
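As a small worked example of this index, the sketch below computes the congestion loss time per vehicle from per-vehicle travel times. The function name, the input format, and the clipping of negative delays to zero (a vehicle that beats the reference time contributes no loss) are assumptions; the text does not specify how the reference time is chosen.

```python
def loss_time_per_vehicle(travel_times, reference_times):
    """Average delay beyond the reference time, over all vehicles.

    travel_times[k] and reference_times[k] are in seconds for vehicle k.
    Assumption: arriving earlier than the reference counts as zero delay.
    """
    delays = [max(0.0, t - r) for t, r in zip(travel_times, reference_times)]
    return sum(delays) / len(delays)


# Vehicle delays are 20 s, 0 s, and 50 s, so the index is 70 / 3 seconds.
print(loss_time_per_vehicle([120, 90, 200], [100, 90, 150]))
```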

Then, the evaluation unit 300 sends the pair of the parameter space i_{t+1} and its index y_{t+1} to the parameter space storage unit 110, and the pair of the signal parameter s_{t+1} and the index l_{t+1} to the parameter storage unit 130.

Then, the evaluation unit 300 determines whether the number t of traffic simulations performed so far exceeds a predetermined maximum number of repetitions (for example, 1000). If t exceeds the maximum, it instructs the output unit 400 to output the optimal signal parameter. Otherwise, it instructs the optimization unit 100 to perform its processing again.

Instead of checking against a maximum number of repetitions, the processing may be repeated until the index converges.

The output unit 400 outputs the optimal signal parameter to the control device 50 for signal control.

Specifically, the output unit 400 first acquires the signal parameters for which traffic simulations have been performed so far, and their indices, from the parameter storage unit 130.

Then, the output unit 400 outputs the signal parameter s with the smallest index l to the control device 50.

The control device 50 according to the present embodiment is a computer including a CPU, a RAM, and a ROM that stores a program for controlling a plurality of traffic signals. Functionally, it is configured as follows.

As shown in FIG. 1, the control device 50 according to the embodiment of the present invention includes an input unit 500 and a control unit 510.

The input unit 500 acquires the signal parameter s from the output unit 400.

The input unit 500 also accepts input of the traffic conditions in an area containing the plurality of traffic signals.

The control unit 510 acquires the signal parameter s and the traffic conditions from the input unit 500.

Then, taking the traffic conditions as input, the control unit 510 controls the plurality of traffic signals using the signal parameter s.

Specifically, the control unit 510 issues commands to each of the plurality of traffic signals, based on the signal parameter s, to switch, hold, or flash the signal light color.

<<Experimental Results>>
Next, experimental results for the optimization device according to the present invention are shown. In this experiment, nine traffic signals were controlled, and nine kinds of parameter space i (series 1 to series 9) were used. The maximum number of repetitions was set to 1000, and the congestion loss time was adopted as the index.

FIG. 4 is a graph showing the relationship between the number of searches (that is, the number of traffic simulations) and the congestion loss time for each parameter space i. As shown in FIG. 4, immediately after the start, the parameter spaces i are selected at random, and the congestion loss time is computed for signal parameters within each parameter space i.

However, after only a small number of searches, a parameter space with a large congestion loss time (for example, series 7) is learned to be a parameter space in which a good index cannot be obtained, and its search is terminated or its number of searches is reduced.

Meanwhile, the parameter space with the smallest congestion loss time (series 3) is learned to be optimal, which shows that the parameter space is optimized with a small number of searches.

The parameters within that parameter space can then also be optimized with a small number of searches.

<Operation of the optimization device according to the embodiment of the present invention>
FIG. 5 is a flowchart showing the optimization processing routine according to the embodiment of the present invention.

When evaluation data is input to the evaluation unit 300, the optimization device 10 executes the optimization processing routine shown in FIG. 5.

First, in step S100, the evaluation unit 300 acquires the evaluation data from the evaluation data storage unit 200.

Next, in step S110, t is initialized to 1.

In step S120, the parameter space model learning unit 122 acquires, from the parameter space storage unit 110, the set X of feature vectors of the parameter spaces i, the set Z of count vectors, the set Y of index values, and the correspondence table between each parameter space i and its feature vector.

In step S130, the parameter space model learning unit 122 learns the first model based on X, Z, and Y.

In step S140, the parameter space selection unit 124 selects the parameter space i_{t+1} for the next traffic simulation, based on the first model received from the parameter space model learning unit 122.

In step S150, the parameter model learning unit 142 acquires, from the parameter storage unit 130, the set S of s_t, the set L of l_t, and the correspondence table between each parameter space i and the space of signal parameters that can be taken under that parameter space i.

In step S160, the parameter model learning unit 142 learns the second model from S and L.

In step S170, the parameter selection unit 144 selects the signal parameter s_{t+1} for the next traffic simulation, based on the second model received from the parameter model learning unit 142 and the parameter space i_{t+1} for the next traffic simulation.

In step S180, the evaluation unit 300 performs a traffic simulation that computes the traffic conditions using the evaluation data acquired from the evaluation data storage unit 200 and the signal parameter s_{t+1} acquired from the parameter selection unit 144, and computes the index. It outputs the computed index both as the index y_{t+1} for the parameter space i_{t+1} and as the index l_{t+1} for the signal parameter s_{t+1}.

In step S190, the evaluation unit 300 determines whether the number t of traffic simulations performed so far exceeds the predetermined maximum number of repetitions.

If t does not exceed the maximum (NO in step S190), t is set to t + 1 in step S200, and the processing of steps S120 to S180 is repeated.

If t exceeds the maximum (YES in step S190), in step S210 the output unit 400 outputs the signal parameter s with the smallest index l to the control device 50, and the optimization processing routine ends.
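The flow of steps S100 to S210 can be summarized in a short sketch. This is a simplified outline under stated assumptions, not the routine of FIG. 5 itself: the loop runs a fixed budget of `max_evals` evaluations rather than reproducing the exact "t exceeds the maximum" test, and the three callables (`select_space`, `select_parameter`, `simulate`) are hypothetical placeholders for the Gaussian-process-based selection units and the traffic simulator.

```python
def optimize(spaces, select_space, select_parameter, simulate, max_evals):
    """Two-level search: pick a parameter space, then a parameter inside it.

    spaces maps a parameter space id to its list of candidate signal parameters.
    """
    S, L = [], []      # evaluated signal parameters and their indices
    history = []       # (space id, index) pairs for the space-level model
    for _ in range(max_evals):                           # S110, S190, S200
        i_next = select_space(spaces, history)           # S120 to S140
        s_next = select_parameter(spaces[i_next], S, L)  # S150 to S170
        l_next = simulate(s_next)                        # S180
        S.append(s_next)
        L.append(l_next)
        history.append((i_next, l_next))
    best = min(range(len(L)), key=L.__getitem__)         # S210: smallest index
    return S[best], L[best]


# Toy stand-ins: round-robin over spaces, first unevaluated parameter,
# and a made-up index that is smallest at s = 30.
spaces = {"A": [10, 20], "B": [30, 40]}
result = optimize(
    spaces,
    select_space=lambda sp, hist: sorted(sp)[len(hist) % len(sp)],
    select_parameter=lambda cands, S, L: next(s for s in cands if s not in S),
    simulate=lambda s: abs(s - 30),
    max_evals=4,
)
print(result)
```

Replacing the two toy selectors with model-based ones changes only which candidates are visited, not the structure of the loop.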

As described above, the optimization device according to the present embodiment repeats the following processing. It predicts a first index, which is the index for a parameter space, based on a model for predicting the index for a parameter space, and selects the parameter space to be evaluated next based on a value calculated from the predicted first index. For each parameter in the selected parameter space that has not been evaluated in the past, it predicts a second index, which is the index for that parameter, based on a model for predicting the index for a parameter, and selects the parameter to be evaluated next based on a value calculated from the predicted second index. It then computes the index that evaluates the output, based on the evaluation data and the selected parameter. By repeating this processing, the parameters can be optimized with a small number of evaluations.

Likewise, the traffic signal control system according to the present embodiment predicts the first index for each parameter space, selects the parameter space to be evaluated next, predicts the second index for each signal parameter in the selected parameter space that has not been evaluated in the past, selects the signal parameter to be evaluated next, and computes the traffic conditions based on the evaluation data and the selected signal parameter. By repeating this processing, the signal parameters can be optimized with a small number of traffic simulations.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

The present embodiment has described the case where the evaluation is a traffic simulation and the parameters are signal parameters, but the invention is not limited to this case.

For example, as another embodiment, the invention can be applied to guiding crowds with guide staff. In this case, the evaluation corresponds to performing a pedestrian flow simulation, the parameter space to the placement of the guide staff, and the parameter to the guidance method.

As yet another embodiment, the invention can be applied to hyperparameter optimization for machine learning. In this case, the evaluation corresponds to training a machine learning model, the parameter space to the machine learning pipeline, and the parameter to the hyperparameters.

Although the specification describes an embodiment in which the program is installed in advance, the program can also be provided stored on a computer-readable recording medium.

10 Optimization device
50 Control device
100 Optimization unit
110 Parameter space storage unit
120 Parameter space optimization unit
122 Parameter space model learning unit
124 Parameter space selection unit
130 Parameter storage unit
140 Parameter optimization unit
142 Parameter model learning unit
144 Parameter selection unit
200 Evaluation data storage unit
300 Evaluation unit
400 Output unit
500 Input unit
510 Control unit

Claims

An optimization device that optimizes parameters used in a calculation performed with evaluation data as input, the optimization device including:
an evaluation unit that calculates, based on the evaluation data and the parameters, an index for evaluating the result of the calculation;
an optimization unit that optimizes parameter spaces, obtained by dividing the space of values that the parameters can take, and the parameters; and
an output unit that outputs the optimized parameters obtained by repeating the processing by the evaluation unit and the processing by the optimization unit,
wherein the optimization unit includes:
a parameter space optimization unit that learns, based on the parameters and the index, a first model for predicting an index for a parameter space, predicts, for each of a plurality of the parameter spaces, a first index that is the index for that parameter space based on the first model, and selects the parameter space to be evaluated next by the evaluation unit based on a value calculated from the predicted first index; and
a parameter optimization unit that learns, based on the parameters and the index, a second model for predicting an index for a parameter, predicts, for each parameter that has not been evaluated in the past among the parameters included in the parameter space selected by the parameter space optimization unit, a second index that is the index for that parameter based on the second model, and selects the parameter to be evaluated next by the evaluation unit based on a value calculated from the predicted second index.
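
The two-level selection recited above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the specification: the one-dimensional parameter grid `CANDIDATES`, the division into `N_SPACES` equal-width parameter spaces, the toy `evaluate` function, and in particular the two scoring functions, which are simple stand-ins for the learned first and second models (a per-space mean with a count-based bonus, and a nearest-neighbour prediction with a distance bonus). Larger index values are assumed to be better.

```python
import math

def evaluate(x):
    """Hypothetical index to maximize; stands in for an expensive simulation."""
    return -(x - 0.62) ** 2

CANDIDATES = [i / 100 for i in range(100)]  # discretized parameter values (assumed)
N_SPACES = 5                                # number of parameter spaces (assumed)

def space_of(x):
    """Id of the parameter space (equal-width interval) that contains x."""
    return min(int(x * N_SPACES), N_SPACES - 1)

def space_score(s, history):
    """Stand-in for the first model: mean index seen in space s plus a bonus
    that shrinks as the space is evaluated more often."""
    ys = [y for x, y in history if space_of(x) == s]
    if not ys:
        return math.inf                     # spaces never evaluated are tried first
    return sum(ys) / len(ys) + 0.1 / math.sqrt(len(ys))

def param_score(x, history):
    """Stand-in for the second model: index of the nearest evaluated parameter
    plus a bonus proportional to the distance from it."""
    if not history:
        return 0.0
    nx, ny = min(history, key=lambda h: abs(h[0] - x))
    return ny + 0.5 * abs(nx - x)

def optimize(n_iter=30):
    """Repeat: select a space, select an unevaluated parameter in it, evaluate."""
    history = []                            # (parameter, index) pairs
    for _ in range(n_iter):
        evaluated = {x for x, _ in history}
        # Only spaces that still hold an unevaluated parameter are eligible.
        open_spaces = [s for s in range(N_SPACES)
                       if any(space_of(x) == s and x not in evaluated
                              for x in CANDIDATES)]
        s = max(open_spaces, key=lambda sp: space_score(sp, history))
        pool = [x for x in CANDIDATES
                if space_of(x) == s and x not in evaluated]
        x = max(pool, key=lambda p: param_score(p, history))
        history.append((x, evaluate(x)))
    best = max(history, key=lambda h: h[1])
    return best, history

best, history = optimize(30)
print("best parameter and index:", best)
```

The sketch requires `n_iter` to be no larger than the number of candidates, since each parameter is evaluated at most once. Replacing the two scoring functions with learned probabilistic models would bring it closer to the claimed arrangement.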

The optimization device according to claim 1, wherein
the parameter space optimization unit uses the first model to predict, for each of the plurality of parameter spaces, the evaluation of that parameter space, calculates a first acquisition function whose variable is the predicted evaluation of the parameter space, and selects the parameter space that maximizes the value of the first acquisition function as the parameter space to be evaluated next by the evaluation unit, and
the parameter optimization unit uses the second model to predict, for each of the parameters that have not been evaluated in the past among the parameters included in the parameter space selected by the parameter space optimization unit, the evaluation for the parameter space to be evaluated next by the evaluation unit, calculates a second acquisition function whose variable is the predicted evaluation for the parameters that have not been evaluated in the past, and selects the parameter that maximizes the value of the second acquisition function as the parameter to be evaluated next by the evaluation unit.
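
The claim does not fix the form of the acquisition functions. As a hedged illustration, the following Python sketch shows two choices that are common in Bayesian optimization, the upper confidence bound and expected improvement, under the assumption that each prediction is a Gaussian with a mean and a standard deviation and that larger index values are better. The function names, the `kappa` weight, and the example predictions are hypothetical.

```python
import math

def ucb(mean, std, kappa=2.0):
    """Upper confidence bound: predicted mean plus kappa times the uncertainty."""
    return mean + kappa * std

def expected_improvement(mean, std, best):
    """Expected improvement over the best index observed so far (maximization)."""
    if std <= 0.0:
        return max(mean - best, 0.0)        # no uncertainty: improvement is known
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf

def select(predictions, acquisition):
    """Return the candidate whose acquisition value is largest.

    predictions maps each candidate (a parameter space or a parameter)
    to its predicted (mean, std)."""
    return max(predictions, key=lambda c: acquisition(*predictions[c]))

# Hypothetical predictions for two candidates.
predictions = {"a": (1.0, 0.1), "b": (0.8, 0.5)}
print(select(predictions, ucb))
print(select(predictions, lambda m, s: expected_improvement(m, s, best=1.0)))
```

The same `select` helper can serve both levels: once over parameter spaces with the first model's predictions, and once over unevaluated parameters with the second model's predictions.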

The optimization device according to claim 1 or 2, wherein the first model and the second model are stochastic models using a Gaussian process.
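
For readers unfamiliar with Gaussian process models, the sketch below computes the predictive mean and standard deviation of a zero-mean Gaussian process regression with a squared-exponential kernel, using only the Python standard library. The kernel, the length scale, the noise term, and the one-dimensional inputs are assumptions chosen for illustration; the claim itself only requires that the models use a Gaussian process.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel with an assumed length scale."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(xs, ys, x_new, noise=1e-6):
    """Predictive mean and standard deviation at x_new given observations (xs, ys)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))   # alpha = K^{-1} y
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(vi * vi for vi in v), 0.0)
    return mean, math.sqrt(var)

# Hypothetical observations of (parameter, index).
xs, ys = [0.1, 0.5, 0.9], [0.0, 1.0, 0.2]
print(gp_predict(xs, ys, 0.5))   # near an observation: small uncertainty
print(gp_predict(xs, ys, 0.7))   # between observations: larger uncertainty
```

The mean and standard deviation returned here are the quantities an acquisition function would take as its variables.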

The optimization device according to any one of claims 1 to 3, wherein
the parameter space optimization unit learns the first model using the parameter space to which the parameters used in the evaluation by the evaluation unit belong, the number of evaluations performed using parameters belonging to that parameter space, and the index obtained by the evaluation unit, and
the parameter optimization unit learns the second model using the parameters used in the evaluation by the evaluation unit and the index obtained by the evaluation unit.
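
One possible way to assemble the training data for the two models is sketched below in Python. The claim does not say how the evaluation count enters the first model, so the sketch assumes the count is the running number of evaluations made in a space at the time of each evaluation. The function names, the row layout, and the `space_of` mapping are hypothetical.

```python
from collections import defaultdict

def first_model_training_rows(history, space_of):
    """Build (space_id, n_evaluations_in_space, index) rows for the first model.

    history  : list of (parameter, index) pairs in evaluation order
    space_of : function mapping a parameter to the id of its parameter space
    """
    counts = defaultdict(int)
    rows = []
    for param, index in history:
        space = space_of(param)
        counts[space] += 1                  # running count for this space (assumed)
        rows.append((space, counts[space], index))
    return rows

def second_model_training_rows(history):
    """Build (parameter, index) rows for the second model."""
    return [(param, index) for param, index in history]

# Hypothetical history with two parameter spaces, [0, 0.5) and [0.5, 1).
history = [(0.05, 1.0), (0.55, 2.0), (0.07, 3.0)]
print(first_model_training_rows(history, lambda x: int(x * 2)))
print(second_model_training_rows(history))
```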

A traffic signal control system including a control device that controls a plurality of traffic signals, and an optimization device that optimizes signal parameters used by the control device, taking as input the evaluation data needed to calculate traffic conditions, wherein
the control device includes:
an input unit that accepts input of traffic conditions; and
a control unit that controls the plurality of traffic signals using the traffic conditions as input and the signal parameters obtained by the optimization device,
the optimization device includes:
an evaluation unit that, based on the evaluation data and the signal parameters, calculates the traffic conditions using the signal parameters with the evaluation data as input, and calculates an index for evaluating the calculated traffic conditions;
an optimization unit that optimizes parameter spaces, obtained by dividing the space of values that the signal parameters can take, and the signal parameters; and
an output unit that outputs the optimized parameters obtained by repeating the processing by the evaluation unit and the processing by the optimization unit,
and the optimization unit includes:
a parameter space optimization unit that learns, based on the signal parameters and the index, a first model for predicting an index for a parameter space, predicts, for each of a plurality of the parameter spaces, a first index that is the index for that parameter space based on the first model, and selects the parameter space to be evaluated next by the evaluation unit based on a value calculated from the predicted first index; and
a parameter optimization unit that learns, based on the signal parameters and the index, a second model for predicting an index for a signal parameter, predicts, for each signal parameter that has not been evaluated in the past among the signal parameters included in the parameter space selected by the parameter space optimization unit, a second index that is the index for that signal parameter based on the second model, and selects the signal parameter to be evaluated next by the evaluation unit based on a value calculated from the predicted second index.

An optimization method for optimizing parameters used to compute an output from input evaluation data, the method including:
a step in which an evaluation unit calculates, based on the evaluation data and the parameters, an index for evaluating the output;
a step in which an optimization unit optimizes parameter spaces, obtained by dividing the space of values that the parameters can take, and the parameters; and
a step in which an output unit outputs the optimized parameters obtained by repeating the processing by the evaluation unit and the processing by the optimization unit,
wherein the step of optimizing by the optimization unit includes:
a step in which a parameter space optimization unit learns, based on the parameters and the index, a first model for predicting an index for a parameter space, predicts, for each of a plurality of the parameter spaces, a first index that is the index for that parameter space based on the first model, and selects the parameter space to be evaluated next by the evaluation unit based on a value calculated from the predicted first index; and
a step in which a parameter optimization unit learns, based on the parameters and the index, a second model for predicting an index for a parameter, predicts, for each parameter that has not been evaluated in the past among the parameters included in the parameter space selected by the parameter space optimization unit, a second index that is the index for that parameter based on the second model, and selects the parameter to be evaluated next by the evaluation unit based on a value calculated from the predicted second index.

A parameter search device including:
a parameter space optimization unit that predicts, for each of a plurality of parameter spaces obtained by dividing the space of values that a parameter can take, a first index that is an index for that parameter space based on a first model for predicting the first index, and selects a parameter space based on the predicted first index;
a parameter optimization unit that predicts, for each of the parameters included in the parameter space selected by the parameter space optimization unit, a second index that is an index for that parameter based on a second model for predicting the second index, and selects a parameter based on the second index; and
an output unit that outputs the parameter selected by the parameter optimization unit.

A program for causing a computer to function as each unit of the optimization device according to any one of claims 1 to 4.