JP2018147087A

JP2018147087A - Control scheme analyzing device, method and program

Info

Publication number: JP2018147087A
Application number: JP2017039361A
Authority: JP
Inventors: 寛清武; Hiroshi Kiyotake; 匡宏幸島; Masahiro Kojima; 達史松林; Tatsufumi Matsubayashi
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-03-02
Filing date: 2017-03-02
Publication date: 2018-09-20
Anticipated expiration: 2037-03-02
Also published as: JP6668279B2

Abstract

PROBLEM TO BE SOLVED: To estimate the relation between a control scheme and the output of a simulator so as to search for an optimum scheme. SOLUTION: A simulator execution part 31 executes a simulation based on an input parameter consisting of control contents for each user, and obtains output data. A feature vector conversion part 32 converts the input parameter into a feature vector representing, for each time zone, the number of users of each control content. A relation estimation part 33 estimates the relation between the feature vector and the output data based on a plurality of data points, each of which is a combination of the result of conversion into the feature vector and the output data. SELECTED DRAWING: Figure 2

Description

The present invention relates to a control strategy analysis device, method, and program, and more particularly to a control strategy analysis device, method, and program for searching for an optimal control strategy.

In recent years, many sites have come to require control of so-called congestion, such as crowds on the way back to the station after an event such as a fireworks display, or long waiting times for attractions at theme parks. Crowds on the way back to the station after an event can cause traffic jams and crowd accidents. At large theme parks, it has been pointed out that long waiting times for attractions lead not only to lower satisfaction but also to lower revenue.

Governments and companies have taken various control measures against these problems. For example, some roads are made pedestrian-only for a certain period after an event ends, and at theme parks, waiting-time information is presented to users through an app to guide them to uncrowded attractions. In addition, as a control strategy for leveling congestion, a method of recommending to each user the attraction with the shortest waiting time has been proposed. The congestion-leveling effect has in fact been confirmed using a simulator that models a theme park (hereinafter, a theme park simulator) (see Non-Patent Document 1).

However, because the above methods rely on control strategies devised by hand, an optimal control strategy cannot be selected from the enormous number of possible control strategies. For example, when 50,000 users at a theme park each ride 4 of 30 attractions, there are (30^4)^50000 possible combinations of attraction tour routes.
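
To give a sense of the scale of (30^4)^50000, the following minimal Python sketch counts the decimal digits of that number using logarithms rather than building the integer itself. The function name and its parameters are illustrative and do not appear in the source.

```python
import math


def route_assignment_digits(num_users: int, num_attractions: int, rides_per_user: int) -> int:
    """Decimal digits of (num_attractions ** rides_per_user) ** num_users.

    Uses log10 so the huge integer is never materialized.
    """
    log10_total = num_users * rides_per_user * math.log10(num_attractions)
    return math.floor(log10_total) + 1


# The example from the text: 50,000 users, 30 attractions, 4 rides each.
digits = route_assignment_digits(50_000, 30, 4)
print(digits)
```

For the figures in the text this gives a number with 295,425 decimal digits, which is why exhaustive evaluation of every combination by simulation is out of the question.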

Meanwhile, a technique called Bayesian optimization has been devised as a method for searching a large number of machine learning parameters for those that improve learning accuracy (see Non-Patent Document 2). Under the assumption that similar parameters yield similar learning accuracy, Bayesian optimization efficiently searches for parameters that improve learning accuracy without running training for every parameter setting.

Applying Bayesian optimization to a simulator is expected to make it possible to find an optimal control strategy among a large number of control strategies.

[Non-Patent Document 1] 清水仁 (Hitoshi Shimizu), 松林達史 (Tatsushi Matsubayashi), 納谷太 (Futoshi Naya). Simulation evaluation of a waiting-time reduction method in a congestion-saturated amusement park (混雑飽和状態の遊園地における待ち時間削減手法のシミュレーション評価). SIG-DOCMAS 研究会.

[Non-Patent Document 2] J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems (NIPS), 2012.

[Non-Patent Document 3] Nello Cristianini, John Shawe-Taylor, 大北剛 (trans.). Pattern analysis by kernel methods (カーネル法によるパターン解析), 2010.

The technique of applying Bayesian optimization to a simulator regards the simulator as a function whose input value is a control strategy and whose output value is an evaluation index, and considers applying Bayesian optimization to that function. The control strategy and evaluation index referred to here may differ from problem to problem. For example, for easing congestion on the way back from an event venue, the control strategy could be a recommendation of the route each user takes from the venue to the station, and the evaluation index could be the population density on each road. For easing congestion at a theme park, the control strategy could be a recommendation of an attraction tour route for each user, and the evaluation index could be the average attraction waiting time. FIG. 7 shows a conceptual diagram of a theme park simulator.

Because Bayesian optimization is a method for optimizing a function, the input and output values must be converted into vector values. In particular, the control strategy, which is the input value, must be converted into a vector value suited to the problem. Converting to vector values raises the following problems.

[Problem 1] Converting to a high-dimensional vector degrades the prediction accuracy of the function.

[Problem 2] If necessary information is lost, it becomes difficult to convert the vector back into a control strategy.

Each problem is described below.

Problem 1 is that, when a high-dimensional vector is used as the input, the curse of dimensionality causes the number of data points needed to maintain the accuracy of function approximation to grow exponentially. A large-scale simulator takes a long time for a single run, so a large number of simulation samples cannot be obtained, and the accuracy of function approximation deteriorates as a result. For example, if the control strategy of user tour routes is represented as in FIG. 8, with 1 assigned to the tour route each user selected and 0 to all other tour routes, the number of dimensions of the feature vector is the product of the number of tour route combinations and the number of users. The vector therefore becomes extremely high-dimensional, and the prediction accuracy of the function deteriorates.

Problem 2 is that, although solving Problem 1 requires conversion to a low-dimensional vector, discarding information needed for the control strategy makes the inverse conversion to a control strategy difficult. For example, converting the tour routes into the number of users who selected the same route, as in FIG. 9, reduces the number of dimensions. This is equivalent to collapsing the matrix shown in FIG. 8 along the user direction. With such a conversion, however, converting the feature vector back into a control strategy becomes difficult. Suppose, for example, that the optimal feature vector is one in which 80 people select the tour route 1 → 2 → 3, as in FIG. 10. As control strategies, there are then as many candidates as there are ways of choosing which 80 visitors are assigned to that route. Because the average attraction waiting time varies greatly depending on how these 80 people are chosen, it becomes difficult to convert the feature vector back into a control strategy.
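
The two representations discussed above can be sketched in a few lines of Python. This is only an illustration under assumed data: the routes, the two plans, and the function names are hypothetical and are not taken from the source or its figures.

```python
def one_hot_matrix(user_routes, all_routes):
    """User-by-route matrix: 1 for the route a user selected, 0 elsewhere (FIG. 8 style)."""
    return [[1 if route == chosen else 0 for route in all_routes] for chosen in user_routes]


def collapse_users(matrix):
    """Sum over the user direction, leaving one count per route (FIG. 9 style)."""
    return [sum(column) for column in zip(*matrix)]


# Hypothetical route set and two different assignments of three users.
all_routes = [(1, 2, 3), (1, 3, 2), (2, 1, 3)]
plan_a = [(1, 2, 3), (1, 2, 3), (2, 1, 3)]
plan_b = [(2, 1, 3), (1, 2, 3), (1, 2, 3)]

matrix_a = one_hot_matrix(plan_a, all_routes)
counts_a = collapse_users(matrix_a)
counts_b = collapse_users(one_hot_matrix(plan_b, all_routes))

print(len(matrix_a) * len(all_routes))  # dimensions of the one-hot form: users x routes
print(counts_a, counts_b)               # both plans collapse to the same counts
```

The one-hot form grows with the number of users, while the collapsed form does not. The price is that `plan_a` and `plan_b` become indistinguishable after collapsing, which is the inverse-conversion difficulty described in Problem 2.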

The present invention has been made to solve the above problems, and an object thereof is to provide a control strategy analysis device, method, and program capable of efficiently searching for an optimal control strategy even when a large number of control strategies exist.

To achieve the above object, a control strategy analysis device according to a first invention includes: a simulator execution unit that executes a simulation based on input parameters consisting of control contents for each user and obtains output data; a feature vector conversion unit that converts the input parameters into a feature vector representing, for each time zone, the number of users of each control content; and a relationship estimation unit that estimates the relationship between the feature vector and the output data based on a plurality of data points, each of which is a pair of the result of the conversion into the feature vector and the output data.

The control strategy analysis device according to the first invention may further include a next input parameter determination unit and an iteration determination unit. The next input parameter determination unit uses an acquisition function that takes the feature vector as its argument and is determined according to the relationship estimated by the relationship estimation unit. The acquisition function represents the possibility of obtaining output data better than the best output data among all data points, each being a pair of the result of the conversion into the feature vector by the feature vector conversion unit and the output data, and this possibility becomes higher as the variance obtained from the data points becomes larger. Based on the acquisition function, the next input parameter determination unit determines the feature vector that maximizes the possibility, and determines the input parameters corresponding to the determined feature vector as the next input parameters. The iteration determination unit causes the execution by the simulator execution unit, the estimation by the relationship estimation unit, and the determination by the next input parameter determination unit to be repeated until a predetermined repetition condition is satisfied, and thereby obtains the input parameters for obtaining optimal output data.

In the control strategy analysis device according to the first invention, the relationship estimation unit may estimate, based on all data points each being a pair of the result of the conversion into the feature vector by the feature vector conversion unit and the output data, a relational expression between the feature vector and the mean of the output data and a relational expression between the feature vector and the variance of the output data. The next input parameter determination unit may then determine the feature vector that maximizes the possibility based on the acquisition function expressed using these two estimated relational expressions.

In the control strategy analysis device according to the first invention, the input parameters may be attraction tour routes for each user at a theme park, the simulator that performs the simulation may be a theme park simulator that reproduces users' waiting times at each attraction in the theme park, and the output data may be the average or maximum of the users' waiting times over the plurality of tour routes.

A control strategy analysis method according to a second invention includes: a step in which a simulator execution unit executes a simulation based on input parameters consisting of control contents for each user and obtains output data; a step in which a feature vector conversion unit converts the input parameters into a feature vector representing, for each time zone, the number of users of each control content; and a step in which a relationship estimation unit estimates the relationship between the feature vector and the output data based on a plurality of data points, each of which is a pair of the result of the conversion into the feature vector and the output data.

A program according to a third invention is a program for causing a computer to function as each unit of the control strategy analysis device according to the first invention.

According to the control strategy analysis device, method, and program of the present invention, a simulation is executed based on input parameters consisting of control contents for each user to obtain output data, the input parameters are converted into a feature vector representing, for each time zone, the number of users of each control content, and the relationship between the feature vector and the output data is estimated based on a plurality of data points, each of which is a pair of the result of the conversion into the feature vector and the output data. This provides the effect that an optimal control strategy can be searched for efficiently even when a large number of control strategies exist.

FIG. 1 is a diagram showing an example of converting tour routes, which are the control strategy, into a feature vector within the same time zone, for easing congestion at a theme park.

FIG. 2 is a block diagram showing the configuration of the control strategy analysis device according to the embodiment of the present invention.

FIG. 3 is a flowchart showing the control strategy analysis processing routine in the control strategy analysis device according to the embodiment of the present invention.

FIG. 4 is a schematic diagram of the processing of the Bayesian optimization execution unit 30.

FIG. 5 is a diagram showing an example of the simulator settings in an experimental example.

FIG. 6 is a diagram showing an example of the experimental results in the experimental example.

FIG. 7 is a diagram showing an image of a conventional theme park simulator.

FIG. 8 is a diagram showing an example of a feature vector that becomes high-dimensional depending on the users' tour routes in a conventional theme park simulator.

FIG. 9 is a diagram showing an example of conversion into the number of users who selected the same route in a conventional theme park simulator.

FIG. 10 is a diagram showing an example in which inverse conversion to a control strategy becomes difficult after conversion into the number of users who selected the same route in a conventional theme park simulator.

Embodiments of the present invention are described in detail below with reference to the drawings.

<Overview of the Embodiment of the Present Invention>

First, an overview of the embodiment of the present invention is given.

This embodiment proposes a control strategy analysis device that searches for an optimal control strategy by first converting each control strategy into a vector and then performing Bayesian optimization.

As stated in the problems above, the control strategy must be converted into a vector that is low-dimensional and does not discard necessary information. On the way back from an event venue or at a theme park, a large number of users can be expected to act almost simultaneously. In that situation, even if the return routes or attraction tour orders selected by users are swapped slightly among them, the effect on evaluation indices such as the population density of roads or the average attraction waiting time is very small. There is therefore no need to distinguish individual users strictly. In addition, a control strategy is determined by "when" each user sets out and "which" return route or tour route the user selects. Time information on when each user starts acting must therefore be taken into account.

With these two points in mind, the control strategy representing each user's tour route is converted into a vector.

Specifically, the control strategy is converted into the number of users who selected the same return route or the same tour route among the users who started acting in the same time zone. FIG. 1 shows an example of converting each user's tour route, which is the control strategy, into a feature vector for easing congestion at a theme park. This conversion makes it possible to convert a control strategy into a vector while resolving the above problems.

The following describes the case of using a simulator to search for a control strategy that minimizes an evaluation index. As noted earlier, the control strategy and evaluation index may differ from problem to problem.

<Configuration of the Control Strategy Analysis Device According to the Embodiment of the Present Invention>

Next, the configuration of the control strategy analysis device according to the embodiment of the present invention is described. As shown in FIG. 2, the control strategy analysis device 1 includes a simulator setting processing unit 10, a data storage unit 20, a Bayesian optimization execution unit 30, and an optimal parameter processing unit 40. The simulator setting processing unit 10 and the optimal parameter processing unit 40 are each connected to an external device 2 such as an input device.

The simulator setting processing unit 10 accepts, as input parameters, a plurality of control strategies each consisting of a tour route for each user. The simulator setting processing unit 10 also accepts input of at least one of the maximum number of simulations and a threshold on the output value used to decide whether to end the processing of the Bayesian optimization execution unit 30, and sets each in the corresponding field of the data storage unit 20.

The data storage unit 20 has a simulator setting table 21 and an optimal parameter table 22.

The simulator setting table 21 has a maximum simulation execution count field and a threshold field. The maximum number input via the simulator setting processing unit 10 is set in the maximum simulation execution count field, and the output value threshold is set in the threshold field.

The optimal parameters input by the Bayesian optimization execution unit 30 are set in the optimal parameter table 22.

The Bayesian optimization execution unit 30 has a simulator execution unit 31, a feature vector conversion unit 32, a relationship estimation unit 33, a next input parameter determination unit 34, a Bayesian optimization result storage unit 35, and an iteration determination unit 36.

The simulator execution unit 31 executes a simulation based on input parameters representing a control strategy consisting of a tour route for each user, and obtains output data.

The feature vector conversion unit 32 converts the control strategy X into a feature vector x representing, for each time zone, the number of users of each tour route.

The relationship estimation unit 33 estimates the relationship between the feature vector x and the output data based on a plurality of data points, each of which is a pair of the output data and the result of converting the control strategy X into the feature vector x representing, for each time zone, the number of users of each tour route.

The next input parameter determination unit 34 determines the control strategy X to be used as the next input parameters, based on the estimated relationship between the feature vector x and the output data. Specifically, it uses an acquisition function that takes the feature vector x as its argument and is determined according to the relationship estimated by the relationship estimation unit 33. The acquisition function represents the possibility of obtaining output data better than the best output data among all data points, each being a pair of the result of the conversion into the feature vector x by the feature vector conversion unit 32 and the output data, and this possibility becomes higher as the variance obtained from the data points becomes larger. Based on the acquisition function, the next input parameter determination unit 34 determines the feature vector x that maximizes the possibility, and determines the control strategy X corresponding to the determined feature vector x as the next input parameters.
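
The passage above describes the acquisition function only by its properties. As a minimal sketch, the following Python code uses expected improvement for a minimization problem, which is one common acquisition function with those properties. It is an assumption for illustration, not necessarily the function used in this document, and the candidate feature vectors, means, and variances are made up.

```python
import math


def expected_improvement(mean, variance, best_so_far):
    """Expected amount by which a new output undercuts best_so_far (minimization)."""
    std = math.sqrt(max(variance, 0.0))
    gap = best_so_far - mean
    if std == 0.0:
        # No uncertainty: improvement is certain or impossible.
        return max(gap, 0.0)
    z = gap / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gap * cdf + std * pdf


def pick_next(candidates, best_so_far):
    """candidates: list of (feature_vector, mean, variance); returns the best feature vector."""
    return max(candidates, key=lambda c: expected_improvement(c[1], c[2], best_so_far))[0]


# Hypothetical candidates: same predicted mean, different predicted variance.
candidates = [
    ((80, 20), 10.0, 1.0),
    ((50, 50), 10.0, 4.0),
    ((20, 80), 12.0, 0.0),
]
next_x = pick_next(candidates, best_so_far=10.0)
print(next_x)
```

With equal predicted means, the candidate with the larger predicted variance scores higher, which matches the stated property that the possibility increases with the variance.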

The Bayesian optimization result storage unit 35 stores the output data of each simulation together with the input parameters.

The iteration determination unit 36 causes the execution by the simulator execution unit 31, the estimation by the relationship estimation unit 33, and the determination by the next input parameter determination unit 34 to be repeated until a predetermined repetition condition is satisfied, and thereby obtains the input parameters for obtaining optimal output data.

The optimal parameter processing unit 40 sets, in the optimal parameter table 22 of the data storage unit 20, the input parameters that give the minimum output data among the input parameters stored in the Bayesian optimization result storage unit 35, and outputs the optimal parameters set in the optimal parameter table 22 of the data storage unit 20 to the external device 2.

The control strategy analysis device 1 according to this embodiment is configured, for example, as a computer device including a CPU (Central Processing Unit), a RAM (Random Access Memory), and a ROM (Read Only Memory) that stores various programs. The computer constituting the control strategy analysis device 1 may also include a storage unit such as a hard disk drive or nonvolatile memory. In this embodiment, the CPU reads and executes a program stored in a storage unit such as the ROM or hard disk, so that the hardware resources and the program cooperate to realize the functions described above.

<Operation of the Control Strategy Analysis Device According to the Embodiment of the Present Invention>

Next, the operation of the control strategy analysis device 1 according to the embodiment of the present invention is described. The control strategy analysis processing executed by the control strategy analysis device 1 according to this embodiment is described with reference to FIG. 3. The control strategy analysis device 1 executes the control strategy analysis processing, for example, when the user instructs its execution from the external device 2.

In step S110, the simulator setting processing unit 10 sets the maximum simulation execution count M and the termination threshold ε, both input from the external device 2, in the simulator setting table 21 of the data storage unit 20.

Next, in steps S210 to S270 below, the Bayesian optimization execution unit 30 determines, by the following method, the optimal parameters to be entered in the optimal parameter table 22 of the data storage unit 20. FIG. 4 shows a schematic diagram of the processing of the Bayesian optimization execution unit 30.

In step S210, the simulator execution unit 31 executes a simulation using either a control strategy X chosen at random from the set of control strategies or the control strategy X_next determined in step S280 described later, and obtains the evaluation index value y of the control result.

In step S220, the feature vector conversion unit 32 converts the control strategy X into a feature vector x representing, for each time zone, the number of users of each tour route. As an example, a method of converting a control strategy consisting of users' tour routes at a theme park into a feature vector is described here.

First, based on the arrival time AT_n and the tour route ROUTE_n of each user n, the control strategy is converted into a matrix as shown in equation (1) below.

... (1)

Here, # denotes the number of elements, and Δ_j denotes a time within the simulation.

Then, the vector x = (x_11, x_12, ..., x_1n, x_21, ...) obtained by flattening this matrix x_tmp is used as the feature vector corresponding to the control strategy X, which consists of the tour routes.
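
The conversion in step S220 can be sketched as follows. Equation (1) is not reproduced in this text, so the sketch assumes that entry (i, j) of x_tmp counts the users whose route is route i and whose arrival time falls in the j-th time zone, with time zones given as half-open intervals. The function name, the `bin_edges` parameter, and the sample data are hypothetical.

```python
from bisect import bisect_right


def to_feature_vector(arrival_times, routes, all_routes, bin_edges):
    """Flattened route-by-time-zone count matrix.

    Assumed layout: x_tmp[i][j] = number of users n with routes[n] == all_routes[i]
    and bin_edges[j] <= arrival_times[n] < bin_edges[j + 1].
    """
    route_index = {route: i for i, route in enumerate(all_routes)}
    num_bins = len(bin_edges) - 1
    x_tmp = [[0] * num_bins for _ in all_routes]
    for arrival, route in zip(arrival_times, routes):
        j = bisect_right(bin_edges, arrival) - 1
        if 0 <= j < num_bins:  # ignore arrivals outside the simulated period
            x_tmp[route_index[route]][j] += 1
    # Row-major flattening gives (x_11, x_12, ..., x_21, ...).
    return [count for row in x_tmp for count in row]


# Hypothetical data: two routes, two one-hour time zones, four users.
all_routes = [(1, 2, 3), (2, 1, 3)]
bin_edges = [0, 60, 120]
arrival_times = [5, 30, 70, 110]
routes = [(1, 2, 3), (1, 2, 3), (1, 2, 3), (2, 1, 3)]

x = to_feature_vector(arrival_times, routes, all_routes, bin_edges)
print(x)
```

The length of the resulting vector is the number of routes times the number of time zones, independent of the number of users.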

Hereafter, the entire set of pairs (x, y) of a control strategy's feature vector x and its evaluation index value y obtained by running simulations is denoted by D, as in equation (2) below.

... (2)

Here, the set of all feature vectors corresponding to control strategies is written as C_para. The feature vector conversion unit 32 stores the above D in the Bayesian optimization result storage unit 35.

In step S230, the simulator execution unit 31 updates the simulation execution count.

In step S240, the simulator execution unit 31 determines whether the simulation execution count has exceeded the maximum repetition count M set in step S110. If it is determined in step S240 that the simulation execution count has exceeded the maximum repetition count M (S240, Y), the process proceeds to step S310. If it is determined in step S240 that the simulation execution count has not exceeded the maximum repetition count M (S240, N), the process proceeds to step S250.

In step S250, the simulator execution unit 31 determines whether the output value of the simulation is smaller than the threshold ε set in step S110. If it is determined in step S250 that the output value of the simulation is smaller than the threshold ε (S250, Y), the process proceeds to step S310. If it is determined in step S250 that the output value of the simulation is greater than or equal to the threshold ε (S250, N), the process proceeds to step S260.

ステップＳ２６０では、関係推定部３３は、ベイズ最適化結果保管部３５の全てのデータ点に基づいて、シミュレータの入出力に関する、特徴ベクトルと出力データの平均値との関係式、及び特徴ベクトルと出力データの分散との関係式をガウス過程によって推定する。ある特徴ベクトルｘにおける推定結果は、以下（３）式であらわされる平均値μ(ｘ)、（４）式であらわされる分散値σ(ｘ)をもつガウス分布に従う。 In step S260, the relationship estimation unit 33 uses a Gaussian process to estimate, from all data points in the Bayesian optimization result storage unit 35, two relational expressions describing the input and output of the simulator: one between the feature vector and the mean of the output data, and one between the feature vector and the variance of the output data. The estimate at a given feature vector x follows a Gaussian distribution with the mean μ(x) given by equation (3) and the variance σ(x) given by equation (4) below.

・・・（３）
... (3)

・・・（４）
... (4)

（３）式、（４）式の The quantities appearing in equations (3) and (4)

は、カーネル関数と呼ばれる入力パラメータｘ_１とｘ_２の類似度ｋ(ｘ_１,ｘ_２)を定義する関数を用いて（５）式、（６）式のように書ける。 can be written as in equations (5) and (6) using a kernel function k(x_1, x_2), that is, a function that defines the similarity between input parameters x_1 and x_2.

・・・（５）
... (5)

・・・（６）
... (6)

ただし、ｋ_ｉｊ＝ｋ（ｘ_ｉ，ｘ_ｊ）であり、上付きの記号Ｔは行列の転置を表し、上付き記号−１は逆行列を表す。このカーネル関数は問題に応じて変更してよい。代表的なカーネルとして、線形カーネル、ガウスカーネルなどがある（非特許文献３参照）。 Here, k_ij = k(x_i, x_j), the superscript T denotes the matrix transpose, and the superscript −1 denotes the matrix inverse. The kernel function may be changed to suit the problem. Typical kernels include the linear kernel and the Gaussian kernel (see Non-Patent Document 3).
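To make the Gaussian-process step concrete, here is a minimal sketch of the standard zero-mean Gaussian process regression formulas, μ(x) = kᵀK⁻¹y and σ(x) = k(x, x) − kᵀK⁻¹k, with a Gaussian kernel. Equations (3) to (6) are not reproduced in this text, so treating them as these textbook formulas is an assumption. The `length_scale` and `jitter` parameters and the helper `solve` are illustrative additions. Only the standard library is used to keep the sketch self-contained; a numerical library would normally do the linear algebra.

```python
import math

def gaussian_kernel(a, b, length_scale=1.0):
    """Similarity k(a, b) between two feature vectors."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def gp_posterior(X, y, x_new, kernel=gaussian_kernel, jitter=1e-9):
    """Return (mean, variance) of the zero-mean GP posterior at x_new."""
    # K holds k_ij = k(x_i, x_j); jitter keeps it numerically invertible.
    K = [[kernel(xi, xj) + (jitter if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    k = [kernel(xi, x_new) for xi in X]
    mean = sum(ki * ai for ki, ai in zip(k, solve(K, y)))
    var = kernel(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, max(var, 0.0)

# Hypothetical one-dimensional data points (x_i, y_i).
X = [[0.0], [1.0], [2.0]]
y = [3.0, 1.0, 2.0]
mean_seen, var_seen = gp_posterior(X, y, [1.0])   # at an observed point
mean_far, var_far = gp_posterior(X, y, [10.0])    # far from all data
```

At an observed point the variance collapses towards zero, while far from the data it returns to the prior value k(x, x). That is the behaviour the acquisition function later exploits.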

ステップＳ２７０では、次入力パラメータ決定部３４は、以下（７）式に従い、次の制御策を決めるため、特徴ベクトルｘ_ｎｅｘｔを算出する。 In step S270, the next input parameter determination unit 34 calculates a feature vector x_next according to equation (7) below in order to determine the next control strategy.

・・・（７）
... (7)

ここでα（ｘ）は獲得関数と呼ばれ、特徴ベクトルｘが最小値を与える可能性を定量的に評価するための指標である。上記（７）式では、獲得関数値が最大となるｘ_ｎｅｘｔを求める。代表的な獲得関数としては、確率改善（ＰＩ）や期待値改善（ＥＩ）が用いられる（非特許文献２参照）。本実施の形態では、獲得関数α（ｘ）は、下記（８）式のように定義する。ここで、σ（ｘ）は、上記（４）式で与えられた分散であり、Φ（ｘ）は、標準正規分布における累積分布関数を表し、Ｎ（ｘ）は、標準正規分布における確率密度関数を表す。 Here, α(x) is called the acquisition function and is an index for quantitatively evaluating the possibility that the feature vector x gives the minimum value. Equation (7) finds the x_next that maximizes the acquisition function. Probability of improvement (PI) and expected improvement (EI) are typical acquisition functions (see Non-Patent Document 2). In the present embodiment, the acquisition function α(x) is defined as in equation (8) below. Here, σ(x) is the variance given by equation (4), Φ(x) denotes the cumulative distribution function of the standard normal distribution, and N(x) denotes the probability density function of the standard normal distribution.

・・・（８）
... (8)

また，γ（ｘ）は、上記（３）式及び（４）式で与えられた平均μ及び分散σを用いて下記（９）式で与えられる関数である。なお、下記（９）式におけるｆ_ｂｅｓｔは、シミュレータで得られた出力結果の最小値とする。 In addition, γ(x) is a function given by equation (9) below using the mean μ and the variance σ given by equations (3) and (4). Note that f_best in equation (9) is the minimum of the output results obtained from the simulator.

・・・（９）
... (9)
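Equations (8) and (9) are not reproduced in this text. The sketch below therefore assumes the widely used expected improvement form for minimization, γ(x) = (f_best − μ(x)) / σ(x) and α(x) = σ(x)(γ(x)Φ(γ(x)) + N(γ(x))), which matches the symbols named above. In this standard form σ is a standard deviation, whereas the description calls σ(x) a variance, so the caller is assumed to pass a standard deviation. The function names are hypothetical.

```python
import math

def std_normal_pdf(z):
    """N(z): probability density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Phi(z): cumulative distribution of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, f_best):
    """Expected improvement over f_best when minimizing.

    mu is the predicted mean, sigma the predicted standard deviation
    (assumption: the square root of the variance from equation (4)).
    """
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(f_best - mu, 0.0)
    gamma = (f_best - mu) / sigma
    return sigma * (gamma * std_normal_cdf(gamma) + std_normal_pdf(gamma))

# Hypothetical values: the acquisition grows with uncertainty and with
# how far the predicted mean lies below the best output seen so far.
ei_uncertain = expected_improvement(mu=1.0, sigma=2.0, f_best=1.0)
ei_certain = expected_improvement(mu=1.0, sigma=1.0, f_best=1.0)
ei_promising = expected_improvement(mu=0.5, sigma=1.0, f_best=1.0)
```

Both effects are visible here: a larger σ or a smaller μ raises α(x), which corresponds to preferring uncertain regions and the neighbourhood of the current minimum.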

上記図４に示すように、次に入力となる次探索点のｘ_ｎｅｘｔは、分散値が大きい点、言い換えると観測データ点が少なく探索が行われていない不確かな領域の点となっている。獲得関数は、“分散値が大きい不確かな領域”と、“現状の最小値を与えるパラメータ付近”を次の探索点に選びやすいようになっている。 As shown in FIG. 4, the next search point x_next, which becomes the next input, is a point with a large variance, in other words a point in an uncertain region where there are few observed data points and no search has yet been carried out. The acquisition function tends to select, as the next search point, either "an uncertain region with a large variance" or "the neighborhood of the parameter that gives the current minimum value".

ステップＳ２８０では、次入力パラメータ決定部３４は、ステップＳ２７０にて得られたｘ_ｎｅｘｔが実際の制御策に対応する特徴ベクトルになっているとは限らないため、以下（１０）式のように、制御策Ｘを変換した特徴ベクトルｘ^ｉｎｔの中で、ｘ_ｎｅｘｔに最も近いｘ_ｎｅｘｔ ^ｉｎｔを決定する。 In step S280, because the x_next obtained in step S270 is not necessarily a feature vector that corresponds to an actual control strategy, the next input parameter determination unit 34 determines, among the feature vectors x^int obtained by converting control strategies X, the vector x_next^int that is closest to x_next, as in equation (10) below.

・・・（１０）
... (10)

そして、ｘ_ｎｅｘｔ ^ｉｎｔに対応する制御策Ｘを、制御策Ｘ_ｎｅｘｔとする。 Then, the control strategy X corresponding to x_next^int is taken as the control strategy X_next.

このように、次入力パラメータ決定部３４は、獲得関数に基づいて導出された、可能性が最大となるｘ_ｎｅｘｔに基づいて、制御策Ｘ_ｎｅｘｔを決定する。 In this way, the next input parameter determination unit 34 determines the control strategy X_next based on x_next, which is derived from the acquisition function and maximizes the possibility.

（１０）式にて決定された、ｘ_ｎｅｘｔ ^ｉｎｔに対応する制御策は複数対応するものがあることが考えられるが、いずれか一つの制御策をランダムで決定する。そして、ステップＳ２１０に戻り、Ｘ_ｎｅｘｔをシミュレータ実行部３１に入力し、Ｘ_ｎｅｘｔを入力制御策としてシミュレーションを行う。 Several control strategies may correspond to the x_next^int determined by equation (10); in that case, one of them is chosen at random. The process then returns to step S210, where X_next is input to the simulator execution unit 31 and the simulation is run with X_next as the input control strategy.
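The snapping step and the random tie-break can be sketched together. Equation (10) is not reproduced in this text, so the Euclidean distance used here is an assumption. The function `pick_next_strategy` and the strategy labels are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def pick_next_strategy(x_next, strategies, rng=random):
    """Choose the control strategy whose feature vector is nearest to x_next.

    strategies is a list of (strategy, feature_vector) pairs, where each
    feature vector plays the role of an x^int in C_para.
    If several strategies share the nearest vector, one is drawn at random.
    """
    best = min(squared_distance(x, x_next) for _, x in strategies)
    nearest = [(s, x) for s, x in strategies
               if squared_distance(x, x_next) == best]
    return rng.choice(nearest)

# Hypothetical candidates: X2 and X3 map to the same feature vector.
strategies = [
    ("X1", [2, 0]),
    ("X2", [1, 1]),
    ("X3", [1, 1]),
    ("X4", [0, 2]),
]
chosen, x_next_int = pick_next_strategy([1.2, 0.9], strategies,
                                        rng=random.Random(0))
```

Enumerating every candidate is only practical for small strategy sets. For a large C_para a different search would be needed, which the description does not specify.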

ステップＳ３１０では、ベイズ最適化実行部３０は、ベイズ最適化処理を終了した時点でベイズ最適化結果保管部３５に含まれるデータのうち、最小値を与える制御策をデータ記憶部２０に入力し、データ記憶部２０は最適パラメータテーブル２２にそのパラメータを設定する。 In step S310, the Bayesian optimization execution unit 30 inputs, to the data storage unit 20, a control strategy that gives the minimum value among the data included in the Bayesian optimization result storage unit 35 when the Bayesian optimization processing is completed. The data storage unit 20 sets the parameters in the optimum parameter table 22.
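The control flow of steps S210 to S310 can be summarized in a short skeleton. This is a sketch under stated assumptions, not the patented implementation: `simulate` and `propose` are hypothetical callables standing in for the simulator (S210 to S220) and for the estimation and next-point steps (S260 to S280), and the feature vector is simplified to a single number in the example.

```python
def optimize(simulate, propose, x_init, max_iters, eps):
    """Run the simulate / check / propose loop and return the best pair.

    simulate(x) -> y   : hypothetical stand-in for steps S210-S220
    propose(data) -> x : hypothetical stand-in for steps S260-S280
    max_iters, eps     : the M and epsilon set in step S110
    """
    data = []            # plays the role of D in the result storage unit
    x = x_init
    runs = 0
    while True:
        y = simulate(x)
        data.append((x, y))
        runs += 1                    # S230: update the run count
        if runs > max_iters:         # S240: count exceeds M
            break
        if y < eps:                  # S250: output below the threshold
            break
        x = propose(data)
    # S310: return the stored pair with the minimum output value.
    return min(data, key=lambda pair: pair[1])

# Hypothetical toy problem: the output is smallest at x = 3 and the
# proposal simply steps to the right of the last point tried.
best = optimize(
    simulate=lambda x: (x - 3) ** 2,
    propose=lambda data: data[-1][0] + 1,
    x_init=0,
    max_iters=10,
    eps=0.5,
)
print(best)
```

The loop follows the wording of step S240 literally, so it stops only once the run count exceeds M rather than when it reaches M.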

ステップＳ４１０では、最適パラメータ処理部４０は、データ記憶部２０より入力された最適パラメータを外部装置２に出力する。出力処理は、例えば、外部装置２から出力のリクエストが入力された場合に実行すればよい。 In step S410, the optimum parameter processing unit 40 outputs the optimum parameters input from the data storage unit 20 to the external device 2. The output process may be executed when an output request is input from the external device 2, for example.

［実験結果の例］ [Example of experimental results]

テーマパークシミュレータを用いて、アトラクションの待ち時間が短くなる巡回ルートを本実施の形態の手法によって探索を行った。 The theme park simulator was used to search for a patrol route that shortens the waiting time for attractions by the method of the present embodiment.

シミュレータの設定を図５に、実験結果を図６示す。図６の結果からもわかるように、ユーザが巡回ルートをランダムに選択して巡回するというランダムサーチを行った場合の結果に比べ、アトラクション平均待ち時間が短くなる巡回ルートを発見することに成功している。 FIG. 5 shows the simulator settings, and FIG. 6 shows the experimental results. As the results in FIG. 6 show, the method succeeded in finding patrol routes with a shorter average attraction waiting time than those obtained by random search, in which users select their patrol routes at random.

本実験例では、テーマパークにおいてアトラクション平均待ち時間を短くする巡回ルートを考えたが、初めに乗るアトラクションを出発地点、最後に乗るアトラクションを到着地点、途中のアトラクションを一般道の交差点と考えることで、イベント会場から駅への帰路における一般道の混雑緩和に最適な制御策を探索することが出来る。また、混雑だけではなく様々な目的の制御に関しても制御策を探索することが出来る。 In this experimental example, we considered patrol routes that shorten the average waiting time for attractions in a theme park. By regarding the first attraction as the departure point, the last attraction as the arrival point, and the intermediate attractions as intersections on ordinary roads, the same method can search for the control strategy that best alleviates congestion on ordinary roads on the way back from an event venue to a station. Control strategies can also be searched for control with various objectives other than congestion.

以上説明したように、本発明の実施の形態に係る制御策解析装置によれば、各ユーザに対する制御内容からなる入力パラメータに基づいて、シミュレーションを実行し、出力データを得て、入力パラメータを、時間帯毎に、各制御内容のユーザ数を表す特徴ベクトルに変換し、特徴ベクトルに変換した結果と出力データとの組である複数のデータ点に基づいて、特徴ベクトルと出力データとの関係を推定することにより、制御策が大量に存在する場合であっても、効率よく、最適な制御策を探索することができる。 As described above, the control strategy analysis apparatus according to the embodiment of the present invention executes a simulation based on input parameters consisting of the control contents for each user to obtain output data, converts the input parameters, for each time zone, into a feature vector representing the number of users of each control content, and estimates the relationship between the feature vector and the output data based on a plurality of data points, each of which is a pair of the conversion result and the output data. As a result, an optimal control strategy can be searched for efficiently even when a large number of control strategies exist.

また、本発明の実施の形態の手法によって発見された巡回ルートをユーザに推薦することで、アトラクションの待ち時間や一般道の混雑を緩和に最適な制御策を発見できるようになる。 In addition, by recommending to the user the patrol route discovered by the method of the embodiment of the present invention, it becomes possible to discover an optimal control measure for alleviating the waiting time of attractions and congestion of general roads.

また、人手で制御策を考える必要がなくなるため、制御策を策定するコストを削減することが出来る。 In addition, since it is not necessary to think about the control measure manually, the cost for formulating the control measure can be reduced.

なお、本発明は、上述した実施の形態に限定されるものではなく、この発明の要旨を逸脱しない範囲内で様々な変形や応用が可能である。 The present invention is not limited to the above-described embodiment, and various modifications and applications can be made without departing from the gist of the present invention.

例えば、制御策解析装置の各構成要素の動作をプログラムとして構築し、制御策解析装置として利用されるコンピュータにインストールして実行させる、またはネットワークを介して流通させることが可能である。 For example, it is possible to construct the operation of each component of the control strategy analysis apparatus as a program, install it on a computer used as the control strategy analysis apparatus, execute it, or distribute it via a network.

DESCRIPTION OF SYMBOLS
１ 制御策解析装置 1 Control strategy analysis apparatus
２ 外部装置 2 External device
１０ シミュレータ設定処理部 10 Simulator setting processing unit
２０ データ記憶部 20 Data storage unit
２１ シミュレータ設定テーブル 21 Simulator setting table
２２ 最適パラメータテーブル 22 Optimal parameter table
３０ ベイズ最適化実行部 30 Bayesian optimization execution unit
３１ シミュレータ実行部 31 Simulator execution unit
３２ 特徴ベクトル変換部 32 Feature vector conversion unit
３３ 関係推定部 33 Relationship estimation unit
３４ 次入力パラメータ決定部 34 Next input parameter determination unit
３５ ベイズ最適化結果保管部 35 Bayesian optimization result storage unit
３６ 反復判定部 36 Iteration determination unit
４０ 最適パラメータ処理部 40 Optimal parameter processing unit

Claims

A control strategy analysis apparatus comprising:
a simulator execution unit that executes a simulation based on input parameters consisting of control contents for each user and obtains output data;
a feature vector conversion unit that converts the input parameters, for each time zone, into a feature vector representing the number of users of each control content; and
a relationship estimation unit that estimates a relationship between the feature vector and the output data based on a plurality of data points, each being a pair of the result of conversion into the feature vector and the output data.

The control strategy analysis apparatus according to claim 1, further comprising:
a next input parameter determination unit that determines the feature vector that maximizes a possibility, based on an acquisition function that is determined according to the relationship estimated by the relationship estimation unit, takes the feature vector as an argument, represents the possibility of obtaining output data better than the optimal output data among all data points that are pairs of the result of conversion by the feature vector conversion unit and the output data, and increases as the variance obtained from the data points increases, and that determines the input parameter corresponding to the determined feature vector as the next input parameter; and
an iteration determination unit that repeats the execution by the simulator execution unit, the estimation by the relationship estimation unit, and the determination by the next input parameter determination unit until a predetermined repetition condition is satisfied, thereby obtaining the input parameter that yields optimal output data.

The control strategy analysis apparatus according to claim 2, wherein
the relationship estimation unit estimates, based on all data points that are pairs of the result of conversion by the feature vector conversion unit and the output data, a relational expression between the feature vector and the mean of the output data and a relational expression between the feature vector and the variance of the output data, and
the next input parameter determination unit determines the feature vector that maximizes the possibility based on the acquisition function expressed using the estimated relational expression between the feature vector and the mean of the output data and the estimated relational expression between the feature vector and the variance of the output data.

The control strategy analysis apparatus according to any one of claims 1 to 3, wherein the input parameter is a patrol route of attractions for each user in a theme park, the simulator that executes the simulation is a theme park simulator that reproduces the waiting time of users at each attraction in the theme park, and the output data is the average value or the maximum value of the waiting times of users over a plurality of patrol routes.

A control strategy analysis method comprising:
a step in which a simulator execution unit executes a simulation based on input parameters consisting of control contents for each user and obtains output data;
a step in which a feature vector conversion unit converts the input parameters, for each time zone, into a feature vector representing the number of users of each control content; and
a step in which a relationship estimation unit estimates a relationship between the feature vector and the output data based on a plurality of data points, each being a pair of the result of conversion into the feature vector and the output data.

A program for causing a computer to function as each unit of the control strategy analysis apparatus according to any one of claims 1 to 4.