JP2024070098A - Information processing system and program

Info

Publication number: JP2024070098A
Application number: JP2022180483A
Authority: JP
Inventors: 享平野; 雄介片山; 隆鷲尾
Original assignee: Osaka University NUC; Taiyu Co Ltd; Nishimatsu Construction Co Ltd
Current assignee: Osaka University NUC; Taiyu Co Ltd; Nishimatsu Construction Co Ltd
Priority date: 2022-11-10
Filing date: 2022-11-10
Publication date: 2024-05-22

Abstract

[Problem] To provide a system, method, and program that make simulation of the behavior of an excavation machine practically feasible. [Solution] An information processing system 30 reproduces the behavior of an excavation machine and includes: a prediction unit 40 that uses control information output from a trained model used to control the excavation machine, together with adjustable parameters, to predict the operation result of the excavation machine when it is controlled by the control information; an acquisition unit 41 that acquires the operation result of the excavation machine when it is actually controlled by the control information; a verification unit 42 that verifies the predicted operation result based on the acquired operation information; and an adjustment unit 43 that adjusts the parameters in accordance with the verification result. [Selected Figure] Figure 5

Description

The present invention relates to an information processing system that reproduces the behavior of an excavation machine, and to a program for causing a computer to execute the reproduction process.

A known technology automates the control of excavation machines such as shield machines (hereinafter simply "shields") by using a trained model obtained through machine learning (see, for example, Patent Document 1). When shield excavation control is performed with a trained model, abnormal or missing values in the data input to the trained model reduce the safety and quality of the construction work. In that technology, therefore, the criterion for evaluating the risk of prediction failure is whether every measured value lies within the range of the measured values in the training data used to train the learning model (that is, whether the input is in an interpolation state). Only when the evaluation is passed are the setting values of the control data for operating the shield estimated with the trained model.

When a learning model is trained, using data whose measured values deviate from the instructed values as training data prevents the model from learning to make correct estimates, and the learning model may fail to estimate appropriate operation setting values for the measured values.

To address this, a technology has been proposed that determines, based on the degree of deviation between situation measurement data obtained by measuring the excavation situation and the instruction values that are the excavation targets, whether data under evaluation, which includes operation record data and situation measurement data, should be used as training data for a learning model that estimates the setting values for operating the excavation machine (see, for example, Patent Document 2).

Patent Document 1: JP 2019-143388 A
Patent Document 2: JP 2019-143389 A

Combining the two conventional technologies described above can improve the validity and accuracy of automating shield excavation control with a trained model.

In actual excavation work, conditions such as alignment and geology often change significantly partway through construction, and the specifications of the excavation machine also differ from project to project. The training data that a single company can collect therefore comes from diverse backgrounds. With such training data there is no prospect of improving the generalization performance of the learning model, and many cases fail the prediction failure risk evaluation.

Managing the risk of prediction failure requires simulating the case in which the output of the trained model is input to the actual machine. However, simulators built so far to reproduce shield behavior theoretically rely on complex models containing many unknown parameters. They are too complex to be realistic and have not been put to practical use.

The present invention has been made in view of the above problems and provides an information processing system that reproduces the behavior of an excavation machine, the system including:
a prediction means for predicting, using control information output from a trained model used for controlling the excavation machine and adjustable parameters, the operation result of the excavation machine when it is controlled by the control information;
an acquisition means for acquiring the operation result of the excavation machine when it is actually controlled by the control information;
a verification means for verifying the predicted operation result based on the acquired operation information; and
an adjustment means for adjusting the parameters in accordance with the verification result.

According to the present invention, simulation of the behavior of an excavation machine becomes practically feasible.

Figure 1: A diagram showing an example of the configuration of a shield as an example of an excavation machine.
Figure 2: A diagram illustrating how a shield negotiates a sharp curve.
Figure 3: A diagram explaining data assimilation.
Figure 4: A diagram showing an example of the hardware configuration of the information processing system.
Figure 5: A functional block diagram showing an example of the functional configuration of the information processing system.
Figure 6: A flowchart showing an example of shield excavation control.
Figure 7: A flowchart showing an example of a process for estimating the operation result of the shield.
Figure 8: A table summarizing the variables involved in the simulation.
Figure 9: A diagram showing an example of a probability distribution.

Figure 1 shows an example of the configuration of a shield as an example of an excavation machine. An excavation machine excavates soil, rock, natural ground, and the like. Machines that excavate the full cross section of a tunnel include shields, tunnel boring machines, and free-section excavators. The excavation machine may be any machine capable of excavating the full cross section of a tunnel, but in the following description it is assumed to be a shield.

The shield 10 is a tunnel excavation machine that advances by cutting the soil ahead of it and, while supporting the excavation face that tends to collapse, discharging out of the tunnel an amount of excavated soil that balances the amount cut. Behind the shield, tunnel lining blocks called segments are assembled into rings to complete the tunnel structure.

To cut the soil ahead of it, the shield 10 has at its tip a cutter head 11, in which multiple cutting bits are arranged cylindrically or radially on a rotatable, roughly circular face plate. The shield 10 also has shield jacks 12. It advances by pressing the shield jacks 12 against the assembled segments 13 and extending them. Multiple shield jacks 12 are arranged at predetermined intervals along the inner circumference of the skin plate 14, the steel outer shell of the shield 10.

Development of congested urban underground space requires excavating three-dimensional sharp curves. To handle such curves, the skin plate 14 of the shield 10 is divided into a front body 14a and a rear body 14b, and the shield 10 is equipped with a bending mechanism, such as bending jacks 15, that connects the front body 14a and the rear body 14b with a hinge so that the shield can bend at its middle. Like the shield jacks 12, multiple bending jacks 15 are arranged at predetermined intervals along the inner circumference of the front body 14a of the skin plate 14.

Immediately behind the cutter head 11, the shield 10 has a chamber 16 that serves as a mixing chamber in which the excavated soil is plastically fluidized so that it can be discharged, while earth pressure and water pressure are applied to support the excavation face. The excavated soil in the chamber 16 is taken out by a screw conveyor 17, carried out of the tunnel by a belt conveyor 18, and disposed of as discharged soil.

The shield 10 is equipped with an erector 19 for assembling the segments 13 inside the rear body 14b. The erector 19 grips a segment 13, carries it to a predetermined position, and installs it. The shield 10 also has a backfill injection device 20 that injects grout behind the segments 13 (between the segments 13 and the excavated tunnel wall).

The shield 10 advances by extending the shield jacks 12. By varying the extension length (stroke) of each of the shield jacks 12 arranged along the inner circumference of the skin plate 14 and, as necessary, the stroke of each of the bending jacks 15 arranged along that inner circumference, the shield 10 advances while turning toward a specified direction. The direction of advance is defined by the horizontal azimuth and the vertical angle relative to the horizontal (pitching).

Figure 2 illustrates how a shield with a bending mechanism negotiates a sharp curve. When the shield is bent by the bending jacks 15, the angle formed by the center line through the center of the cross section of the front body 14a and the center line through the center of the cross section of the rear body 14b is defined as the bending angle α. When the shield 10 negotiates a sharp curve using the bending jacks 15, the facing direction of the rear body 14b and the bending angle α are manipulated so that the shield 10 turns while adjusting its posture to follow an excavation direction that reduces the deviation between the shield 10 and the planned line.
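The definition of the bending angle α can be made concrete with a small calculation. The following is a minimal sketch, assuming that each center line is available as a 3D direction vector; the vector representation and the function name `bending_angle_deg` are illustrative assumptions and do not appear in the source.

```python
import math


def bending_angle_deg(front_axis, rear_axis):
    """Angle in degrees between the front-body and rear-body center lines.

    Both arguments are direction vectors (hypothetical representation).
    """
    dot = sum(f * r for f, r in zip(front_axis, rear_axis))
    norm_front = math.sqrt(sum(f * f for f in front_axis))
    norm_rear = math.sqrt(sum(r * r for r in rear_axis))
    if norm_front == 0.0 or norm_rear == 0.0:
        raise ValueError("center-line direction vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_alpha = max(-1.0, min(1.0, dot / (norm_front * norm_rear)))
    return math.degrees(math.acos(cos_alpha))
```

With aligned center lines the function returns 0, which corresponds to straight-ahead excavation with no bending.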

When a bus or truck, for example, makes a sharp turn, it needs a space wider than the vehicle itself to allow for the difference between the paths of its inner wheels and for overhang from the center of rotation. The shield 10 likewise needs a space enlarged beyond the excavation diameter of a straight section in order to turn, but it cannot turn unless it creates that space itself by excavating the extra volume.

For this purpose, a copy cutter 21 is provided on the side of the cutter head 11 on the front body 14a of the shield 10, as a protruding part that can extend in the radial direction of the tunnel. The copy cutter 21 is a device for creating the space in which the shield 10 can turn.

When the shield 10 negotiates a sharp curve, the jacks are operated so that the bending jacks 15 on the inside of the curve retract and those on the outside extend, and the shield 10 turns in a bent posture shaped roughly like a shallow inverted V. To do this, the space that the posture of the shield 10 will occupy must be anticipated, and that space must be excavated (over-excavated) at the moments when the copy cutter 21 is at the corresponding position.

The over-excavated space is not left as a cavity. It is filled immediately after excavation with a conditioning agent that has suitable plastic fluidity. If it were left as a cavity, the excavated wall could collapse, and the shield 10 could not receive adequate reaction force and friction from the excavated wall, which would make the shield 10 difficult to operate.

The extension and retraction of the copy cutter 21 are controlled continuously according to the rotation angle of the cutter head 11. This control can be implemented using, for example, a preset formula that gives the protrusion length of the copy cutter 21 as a function of the circumferential angle.
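As an illustration of a preset relationship between circumferential angle and protrusion length, the sketch below uses a table of angular sectors. The source only says that a formula or the like can be used, so the sector table, the millimetre unit, and the name `copy_cutter_protrusion` are assumptions made for this example.

```python
def copy_cutter_protrusion(angle_deg, sectors, default_mm=0.0):
    """Return the commanded protrusion length for a cutter-head rotation angle.

    sectors: list of (start_deg, end_deg, length_mm) with 0 <= start, end < 360.
    A sector whose start exceeds its end wraps around 0 degrees.
    """
    angle = angle_deg % 360.0
    for start, end, length_mm in sectors:
        if start <= end:
            inside = start <= angle < end
        else:
            # Wrapping sector, e.g. 330 -> 30 degrees.
            inside = angle >= start or angle < end
        if inside:
            return length_mm
    # Outside every over-excavation sector the cutter stays retracted.
    return default_mm


# Hypothetical plan: over-excavate on one side and across the crown.
plan = [(60.0, 120.0, 80.0), (330.0, 30.0, 50.0)]
print(copy_cutter_protrusion(90.0, plan))
```

A lookup table keeps the example short; a continuous formula could be substituted without changing the calling code.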

Excavation control of the shield 10 can be performed manually, but in recent years it can also be performed automatically using a trained machine learning model. The model takes as input data the measured values (observed values) of the monitoring items that track the behavior of the shield 10, and outputs the control information used for excavation control of the shield 10. To predict control information from such input data, the model is trained on training data consisting of pairs of input data and control information acquired under manual control.

The essence of machine learning is prediction within the range learned from the training data (interpolative prediction). A trained model must therefore not be used outside the range it has learned (extrapolative use), because unintended extrapolation creates a risk of prediction failure.
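The interpolation condition can be checked mechanically before a trained model is used. The following is a minimal sketch, assuming that measurements are dictionaries keyed by monitoring item; the item names and both function names are hypothetical, and the per-item minimum and maximum test follows the range check described for Patent Document 1 in the background.

```python
def training_ranges(training_rows):
    """Per-item (min, max) over the measured values in the training data."""
    if not training_rows:
        raise ValueError("training data must not be empty")
    return {
        name: (
            min(row[name] for row in training_rows),
            max(row[name] for row in training_rows),
        )
        for name in training_rows[0]
    }


def is_interpolation(measurement, ranges):
    """True only if every item is present and inside its training range."""
    for name, (low, high) in ranges.items():
        value = measurement.get(name)
        # A missing value is treated as a failed check, like an out-of-range one.
        if value is None or not (low <= value <= high):
            return False
    return True
```

A measurement that fails this check would be an extrapolative input, so the trained model's output for it should not be used for control.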

The training data collected by a single company comes from diverse backgrounds, because the specifications of the shield 10 differ depending on the ground to be excavated and because conditions such as alignment and geology often change significantly even partway through construction. Even when the same mountain is excavated, for example, geological conditions vary greatly because large rocks or groundwater flows may be encountered along the way.

A trained model finds patterns and trends in input data and makes predictions for unknown data. Because the training data collected by manually controlling the shield 10 reflects widely varying conditions such as geology, training a learning model on it provides too few examples of the patterns and trends that should be covered. Training does not converge to an adequate result, and the model cannot handle unknown data. There is consequently no prospect of improving generalization ability, that is, the ability to handle unknown data.

As a result, training data is chronically insufficient, many cases fail the prediction failure risk evaluation, and the learning model that was built with such effort is not put to use in excavation control. Making use of the learning model then requires repeated additional training to correct it, which lowers productivity.

Managing the risk of prediction failure requires a simulation that tests what would happen if the output of the trained model were actually input to the shield 10 and used to operate it. Such a simulation uses a simulator: a device or program in which natural laws and empirical rules based on existing knowledge are described as mathematical expressions and which is built to return responses to virtual operations that imitate those of the actual machine.

Because the available knowledge is limited, no simulator can reproduce every real phenomenon. Conventionally, therefore, a simulator has been regarded as practical if it realistically reproduces the phenomenon of interest. One well-known example is a flight simulator in which operating the control stick raises or lowers the nose and changes the attitude of the aircraft.

For the shield 10, there is demand for a simulator for advance evaluation before excavating an unfamiliar sharp curve. Simulating the behavior of the shield 10 requires creating conditions resembling that behavior with a mathematical model or the like. In such a model, however, the dominant contributions come from interactions with the surrounding ground, such as friction and pressure, and the ground itself contains many unknown factors, so the model becomes complex and contains many unknown parameters. Such models are hard to handle and hard to make reproduce the behavior, and they have not yet been put to practical use.

The present approach therefore abandons the attempt to reproduce a model containing many unknown parameters. Instead it adopts a technique called assimilation simulation: adjustable parameters are introduced, and when the simulation fails to reproduce the observations, the parameters are adjusted to bring the simulation into line with the actual observed results.

The concept of assimilation is explained with reference to Figure 3. The horizontal axis is elapsed time, and the vertical axis is the value of interest, such as a predicted value obtained by simulation or an observed value obtained by operating the actual machine. The unknown true state is drawn as a dashed line. Suppose that a simulator roughly reproducing the true state is prepared and a simulation is run from the initial value (a), marked by a double circle. The prediction follows the solid line, and the predicted value (b) at elapsed time t, marked by a white circle, deviates significantly from the observed value (c), marked by a black circle. The observed value (c) also deviates from the true state shown by the dashed line, but this is due to observation error.

If the simulation were continued from the predicted value (b) at elapsed time t, the result would be expected to deviate even further from the observations. The parameters are therefore adjusted in light of the observed value (c) at elapsed time t, and the simulator model is modified to reduce the deviation at elapsed time t. Obtaining this modified model is what is meant by performing data assimilation. With data assimilation, the predicted value (b) is corrected to the analysis value (d), marked by a double circle, and the prediction then proceeds along the solid line from the analysis value (d) to the predicted value (e), marked by a white circle, which is closer to the true state shown by the dashed line.
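The correction from the predicted value (b) to the analysis value (d) can be illustrated with the textbook scalar form of a data assimilation update, in which the forecast and the observation are blended according to their error variances. The source does not state which update rule is used, so this is a sketch under that assumption, and the function name and variance arguments are illustrative.

```python
def analysis_value(forecast, observation, forecast_var, observation_var):
    """Blend a forecast (b) and an observation (c) into an analysis value (d).

    The gain weights the observation more heavily when the forecast is the
    less certain of the two (standard scalar optimal-interpolation form).
    """
    total = forecast_var + observation_var
    if total <= 0.0:
        raise ValueError("at least one error variance must be positive")
    gain = forecast_var / total
    return forecast + gain * (observation - forecast)


# Equal uncertainty: the analysis value lies midway between (b) and (c).
print(analysis_value(10.0, 6.0, 1.0, 1.0))  # 8.0
```

Because the observation carries its own error, the analysis value generally lands between (b) and (c) rather than on (c) itself.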

Because there is no prospect of improving the generalization performance of the learning model used for excavation control of the shield 10, the trained model is put into excavation control once a certain level of generalization performance has been reached, and the outcome is verified by a simulation that incorporates the data assimilation described above. Depending on whether the simulation result is acceptable, it can be decided whether to use automatic excavation control with the trained model or manual control by an operator. This makes it possible to manage the risk of prediction failure under automatic excavation control by the learning model.
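The choice between automatic and manual control can be pictured as a simple gate on the simulation result. The source does not specify how acceptability is judged, so the sketch below assumes a hypothetical criterion: the largest simulated deviation from the planned line must stay within an allowable value. The function name, the millimetre unit, and the criterion are all assumptions.

```python
def choose_control_mode(simulated_deviations_mm, allowable_mm):
    """Return "automatic" if every simulated deviation is acceptable.

    An empty simulation result gives no evidence of safety, so the
    conservative answer "manual" is returned in that case.
    """
    if not simulated_deviations_mm:
        return "manual"
    worst = max(abs(d) for d in simulated_deviations_mm)
    return "automatic" if worst <= allowable_mm else "manual"
```

Falling back to manual control whenever the evidence is missing or unfavorable reflects the aim of managing prediction failure risk rather than maximizing automation.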

The present invention therefore provides an information processing system capable of running a simulation that incorporates the data assimilation described above. The information processing system may be the same system that trains the learning model used for excavation control of the shield 10 and performs prediction with the trained model.

Figure 4 shows an example of the hardware configuration of the information processing system 30. The information processing system 30 can be built from a general-purpose computer and can therefore adopt the same hardware configuration as one. As hardware, the information processing system 30 includes a CPU (Central Processing Unit) 31, a ROM (Read Only Memory) 32, a RAM (Random Access Memory) 33, an HDD (Hard Disk Drive) 34, an external device I/F 35, an input/output I/F 36, a display device 37, and an input device 38.

The CPU 31 controls the entire system and executes various applications, which can include the simulator that performs the simulation described above. The ROM 32 stores a BIOS (Basic Input/Output System), which loads the OS (Operating System) at system startup and controls input and output for peripheral devices, and firmware that controls circuits inside the system such as the HDD 34. The RAM 33 is used as main memory and provides a work area for the CPU 31. The HDD 34 stores the applications and OS executed by the CPU 31, various setting information, various data, and so on. Although the HDD 34 is used here, the storage device is not limited to it and may be an SSD (Solid State Drive) or the like.

The external device I/F 35 connects the system to external devices such as an operation panel and various environmental measuring devices (sensors), and controls communication with them. The display device 37 is a device for displaying information, such as a liquid crystal display or an organic EL (Electro Luminescence) display. The input device 38 is a device, such as a keyboard or mouse, for inputting information, selecting applications, instructing application execution, and so on. The input/output I/F 36 controls the output of information to the display device 37 and the input of information from the input device 38. In this example the display device 37 and the input device 38 are described as separate devices, but the configuration is not limited to this, and a touch panel that combines the functions of both may be used.

The information processing system 30 may include other circuits, for example a communication I/F that connects to a network such as the Internet and controls communication with devices on the network, or a short-range wireless communication circuit that enables short-range wireless communication via Bluetooth (registered trademark) or the like.

The information processing system 30 may consist of a single device or program, or of two or more devices or programs.

Figure 5 is a functional block diagram showing an example of the functional configuration of the information processing system 30. Each function of the information processing system 30 is realized by the CPU 31 executing a program, such as an application, provided in the information processing system 30. Some or all of the functions may instead be implemented in hardware such as circuits.

The information processing system 30 includes, as functional units, a prediction unit 40, an acquisition unit 41, a verification unit 42, and an adjustment unit 43. The prediction unit 40 uses control information output from the trained model used for excavation control of the shield 10, together with adjustable parameters, to predict the operation result of the shield 10 when excavation is controlled by that control information. A simulator is used to predict the operation result of the shield 10. The simulator is built as a mathematical model, such as a formula, that contains the adjustable parameters.

The trained model is a model trained with training data. The training data associates the operation amounts and the like of the shield 10 as actually operated by a person with the observed values obtained by measuring the results of those operations. Examples of the operation amounts include the bending angle α, the thrust of the shield jacks 12, the rotational torque of the cutter head 11, and the stroke of the copy cutter 21. Examples of the observed values include the azimuth of the shield 10 and pitching, which indicates its rotation angle in the vertical direction. Accordingly, the azimuth, pitching, and the like of the shield 10 are measured with sensors and input to the trained model, which outputs operation amounts and the like as control information while taking the planned tunnel line and other factors into account. Setting that control information on the shield 10 automates its excavation control.

A learning model is built through training so that it reproduces the training data as well as possible, but it may fall into overfitting, in which it matches only the training data. Overfitting occurs, for example, when there is little training data or when the model is too complex. Test data separate from the training data is therefore prepared, and the trained model is checked for whether it can reproduce that test data (the generalization performance of the model).
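The check against separate test data can be sketched as a holdout evaluation. This is a minimal example, assuming samples are (input, target) pairs and the model is any callable; the function names, the 20 percent test ratio, and the use of mean squared error are assumptions for illustration, not details from the source.

```python
import random


def split_train_test(samples, test_ratio=0.2, seed=0):
    """Shuffle a copy of the samples and hold out a fraction as test data."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def mean_squared_error(model, samples):
    """Average squared difference between model output and target."""
    return sum((model(x) - y) ** 2 for x, y in samples) / len(samples)


def generalization_gap(model, train, test):
    """Test error minus training error; a large positive gap suggests overfitting."""
    return mean_squared_error(model, test) - mean_squared_error(model, train)
```

A gap near zero indicates that the model reproduces unseen data about as well as the data it was trained on.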

The acquisition unit 41 acquires the operation result of the shield 10 when excavation is actually controlled according to the control information. The operation result of the shield 10 consists of the observed values measured using the above-mentioned sensors or the like.

The verification unit 42 verifies the operation result predicted by the prediction unit 40 based on the information acquired by the acquisition unit 41, that is, the observed values. The prediction unit 40 takes the control information output by the trained model as input, performs a simulation, and predicts the azimuth and the like that would be measured by the above-mentioned sensors or the like. The verification unit 42 compares the predicted operation result with the observed values and verifies the validity of the predicted operation result according to whether the difference between them is equal to or greater than a threshold.

The adjustment unit 43 adjusts the parameters according to the verification result. When the verification unit 42 determines that the difference between the predicted operation result and the observed values is equal to or greater than the threshold, and that the predicted result is therefore not valid, the adjustment unit 43 adjusts the parameters. The adjusted parameters are set in the mathematical model that the prediction unit 40 uses for the simulation, whereby the mathematical model is corrected.

The information processing system 30 including the acquisition unit 41, the verification unit 42, and the adjustment unit 43 incorporates a data assimilation technique and is therefore capable, in principle, of following environmental changes and discontinuous situations, which is a characteristic of assimilation. In the extreme case where no prior knowledge is incorporated at all, part of that characteristic is lost. Even so, because the simulation can be performed with a formula that is merely a linear combination of the parameters of interest, the system is easier to handle than conventional models.

The information processing system 30 is not limited to the prediction unit 40, the acquisition unit 41, the verification unit 42, and the adjustment unit 43, and may include other functional units. As such other functional units, the information processing system 30 may include, for example, a learning unit that trains the learning model using the training data, a determination unit that checks the generalization performance, and a prediction execution unit that executes prediction using the trained model.

The information processing system 30 may further include, as another functional unit, an evaluation unit that evaluates the result of control based on the control information output from the trained model by determining whether that result is within an allowable range.

FIG. 6 is a flowchart showing the flow of excavation control of the shield 10. Before excavation control is performed, the learning model, the parameters to be set in the simulation, and the like are prepared. Excavation control of the shield 10 starts at step 100. In step 101, the shield is controlled manually in order to obtain training data. In step 102, the training data is acquired, and in step 103, the learning model is trained using the acquired training data.

In step 104, the generalization performance of the learning model is checked using test data prepared separately from the training data. That is, it is checked whether the control information can be predicted with a predetermined accuracy and whether a predetermined generalization performance has been obtained. The required prediction accuracy is not limited to any particular value and can be determined arbitrarily. If the check shows that prediction is possible with the predetermined accuracy, the process proceeds to step 105; if not, the process returns to step 102 and training is repeated.
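
By way of illustration only, the hold-out check of step 104 might be sketched as follows. The function name `has_generalization`, the use of mean absolute error, and the `tolerance` argument are assumptions introduced here; the description leaves the accuracy criterion open.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical types: one observed value in, one operation amount out.
Model = Callable[[float], float]
Sample = Tuple[float, float]  # (observed value, operation amount chosen by the operator)


def has_generalization(model: Model, test_data: Sequence[Sample], tolerance: float) -> bool:
    """Return True if the mean absolute error on held-out test data is within tolerance."""
    if not test_data:
        raise ValueError("test_data must not be empty")
    total_error = sum(abs(model(obs) - target) for obs, target in test_data)
    return total_error / len(test_data) <= tolerance


def toy_model(obs: float) -> float:
    """Stand-in for a trained model (assumption): doubles its input."""
    return 2.0 * obs


# Test data kept separate from whatever was used for training.
held_out = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.0)]
passed = has_generalization(toy_model, held_out, tolerance=0.1)
```

When `passed` is false, the flow described above would return to step 102 rather than proceed to step 105.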

In step 105, observed values measured under manual control by sensors or the like attached to the shield 10 are input to the trained model, which predicts the control information. In step 106, a simulation is performed with the prediction of the trained model as input, the prediction is corrected as necessary, and either the prediction or the corrected prediction is output.

In step 107, automatic excavation control by the trained model (predictive control) is performed based on the output prediction. Even if the shield 10 advances in an implausible direction under this predictive control, the shield 10 moves slowly and there is ample time to correct its course. For this reason, predictive control is carried out at this point for the time being.

In step 108, the result of the predictive control is evaluated. If the result is within the allowable range, which indicates that the shield 10 is advancing at an acceptable angle with respect to the planned line, the process proceeds to step 109. If it is outside the allowable range, the process returns to step 101, where manual control is performed to correct, for example, the direction in which the shield 10 advances. Training data is then acquired in step 102, and the learning model is retrained in step 103.

In step 109, it is determined whether excavation is finished. If so, the process proceeds to step 110 and excavation control ends. If not, the process returns to step 105 and the control information for the next time step is predicted.
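
The branching of FIG. 6 described above can be summarized as a small transition function. This is only a sketch: the function `next_step` and its boolean argument `ok` are hypothetical names, and only the step numbers and branch targets are taken from the description.

```python
def next_step(step: int, ok: bool = True) -> int:
    """Return the step that follows `step` in the FIG. 6 flow.

    `ok` is the outcome of the decision taken at steps 104, 108 and 109
    (generalization confirmed / result within the allowable range /
    excavation finished). It is ignored for steps that have no branch.
    """
    branches = {
        104: (105, 102),  # generalization confirmed -> predict, else train again
        108: (109, 101),  # within allowable range -> end check, else manual control
        109: (110, 105),  # excavation finished -> end, else next time step
    }
    if step in branches:
        on_true, on_false = branches[step]
        return on_true if ok else on_false
    if 100 <= step < 110:
        return step + 1  # unbranched steps proceed in order
    raise ValueError(f"no successor for step {step}")
```

For example, `next_step(108, ok=False)` yields 101, which corresponds to falling back to manual control when the predictive control result is outside the allowable range.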

FIG. 7 is a flowchart showing an example of the simulation in step 106 of FIG. 6. The process starts at step 200. In step 201, a simulation is performed with the prediction of the trained model as input, by feeding that prediction into the mathematical model that includes the adjustable parameters. The simulation virtually controls the shield 10 according to the predicted control information and outputs the result of that operation. In step 202, the control information predicted by the trained model is actually set in the shield 10, the result of the excavation control is acquired as observed values, and the acquired observed values are compared with the simulation result. Whether the prediction is valid is determined according to whether the difference between the observed values and the simulation result is equal to or greater than a threshold.

If the difference is equal to or greater than the threshold and the prediction is determined not to be valid, the process proceeds to step 203 and the prediction is corrected. That is, the parameters used in the simulation are adjusted so that the simulation result approaches the observed values, and the prediction is corrected accordingly.

If the prediction is determined to be valid in step 202, that prediction is output. If it is determined not to be valid, the prediction corrected in step 203 is output. The simulation then ends at step 204.
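
A minimal sketch of one pass through steps 201 to 203 is given below. It assumes a scalar operation result and returns the simulated result as the output. The one-parameter `linear_simulator` and the direct `refit` are toy stand-ins introduced here for illustration and are not the assimilation-based adjustment of the embodiment.

```python
from typing import Callable, Tuple


def run_simulation_step(
    control: float,
    theta: float,
    observed: float,
    threshold: float,
    simulate: Callable[[float, float], float],
    adjust: Callable[[float, float, float], float],
) -> Tuple[float, float]:
    """One pass through steps 201-203; returns (output result, parameter in use)."""
    predicted = simulate(control, theta)        # step 201: run the simulator
    if abs(predicted - observed) < threshold:   # step 202: difference below threshold
        return predicted, theta                 # prediction is valid, output as is
    theta = adjust(control, theta, observed)    # step 203: adjust the parameter
    return simulate(control, theta), theta      # output the corrected result


def linear_simulator(control: float, theta: float) -> float:
    """Toy simulator (assumption): result = theta * control."""
    return theta * control


def refit(control: float, theta: float, observed: float) -> float:
    """Toy adjustment (assumption): choose theta so the simulator matches the observation."""
    return observed / control if control != 0 else theta
```

With these stand-ins, a call such as `run_simulation_step(2.0, 0.5, 1.6, 0.1, linear_simulator, refit)` takes the step 203 branch, because the simulated result 1.0 differs from the observation 1.6 by more than the threshold.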

Next, data assimilation will be described in more detail by way of an example. In procedure (1), the variables shown in FIG. 8 are prepared as the variables involved in the simulation. The variables shown in FIG. 8 are only examples: only some of them may be prepared, other variables may be prepared instead, or other variables may be prepared in addition to them.

The variables shown in FIG. 8 fall broadly into three groups: variables indicating the operation amounts of the shield 10, variables indicating the operating environment, and variables indicating the operation effects. The variables used as control information are those indicating the operation amounts and the operating environment, and the observed values measured by sensors or the like are the variables indicating the operation effects.

After the variables involved in the simulation have been prepared, in procedure (2), a formula that is assumed to hold among the prepared variables is prepared. Because it is difficult to construct a theory for the simulator of the shield 10, a hypothetical formula such as Formula 1 below can be used, for example. Formula 1 outputs the operation effect obtained when a given operation amount is applied in a given operating environment.

After the formula has been prepared, in procedure (3), time-varying parameters θ are introduced as coefficients of the variables indicating the operation amounts and the operating environment, in order to give the simulator the degrees of freedom it needs to follow the actual observation results. These are θ_X1H to θ_X10H, θ_F1H to θ_F6H, θ_X1V to θ_X10V, and θ_F1V to θ_F6V, 32 parameters in total. Formula 1 can then be rewritten as Formula 2 below.
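
Formula 2 itself is not reproduced in this text. Purely as an assumption consistent with the description (a simple linear combination with 10 operating-environment coefficients and 6 operation-amount coefficients for each of the H and V components), the operation effect might be computed as in the following sketch. The function name and the constants `N_ENV` and `N_OP` are hypothetical.

```python
from typing import Sequence

N_ENV = 10  # operating-environment variables X1..X10
N_OP = 6    # operation-amount variables F1..F6


def operation_effect(
    x: Sequence[float],
    f: Sequence[float],
    theta_x: Sequence[float],
    theta_f: Sequence[float],
) -> float:
    """Assumed linear form: sum(theta_Xi * Xi) + sum(theta_Fj * Fj) for one component."""
    if not (len(x) == len(theta_x) == N_ENV and len(f) == len(theta_f) == N_OP):
        raise ValueError("expected 10 environment terms and 6 operation terms")
    env_term = sum(t * v for t, v in zip(theta_x, x))
    op_term = sum(t * v for t, v in zip(theta_f, f))
    return env_term + op_term
```

Calling the function once with the H coefficients and once with the V coefficients would give ΔH(t) and ΔV(t), using 2 × (10 + 6) = 32 parameters in all, which matches the count stated above.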

After the parameters have been introduced, the simulation is performed as procedure (4). When the operation effects ΔH(t) and ΔV(t) are given at a time t, the state after the operation at time t can be expressed as in Formula 3 below, using ΔH and ΔV obtained in the same manner at the earlier times t-1, t-2, ..., 2, 1 and the initial values S_H(0) and S_V(0) of the state after the operation.
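
Formula 3 is likewise not reproduced here. One reading consistent with the description, stated as an assumption, is that the state is the initial value plus the accumulated operation effects, S(t) = S(0) + Δ(1) + ... + Δ(t). The sketch below implements that reading for a single component; `states_after_operation` is a hypothetical name.

```python
from itertools import accumulate
from typing import List, Sequence


def states_after_operation(initial_state: float, effects: Sequence[float]) -> List[float]:
    """Return [S(1), ..., S(t)] where S(k) = S(0) + effects[0] + ... + effects[k-1]."""
    # accumulate(..., initial=...) yields S(0) first, so drop it from the result.
    return list(accumulate(effects, initial=initial_state))[1:]
```

The same function would be applied separately to the H and V components, with S_H(0) and S_V(0) as the respective initial values.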

Here, the output S(t) of the simulator is written as in Formula 4 below.

In parameter adjustment, how well the simulator tracks reality is evaluated and corrected by comparing the simulator output with the observed values actually obtained. For this purpose, the observation error w expected in real observed values is added in advance.

In Formula 5 above, Ŷ(t) is the predicted observed value given by the simulator. As w, for example, a sample drawn from a normal distribution with mean 0 and variance ν is used.
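
As an illustration, and assuming that Formula 5 has the form Ŷ(t) = S(t) + w (the formula is not reproduced in this text), the predicted observed value might be generated as follows. The function name is hypothetical.

```python
import math
import random


def predicted_observation(state: float, variance: float, rng: random.Random) -> float:
    """Return state + w, with w drawn from a normal distribution N(0, variance)."""
    if variance < 0:
        raise ValueError("variance must be non-negative")
    # random.gauss expects a standard deviation, so convert from the variance.
    return state + rng.gauss(0.0, math.sqrt(variance))
```

Passing an explicitly seeded `random.Random` instance keeps a run reproducible, which can help when tracing the adjustment process later.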

If the simulation shows that the parameters θ need to be adjusted at time t, θ adjustment (evaluating and correcting how well the simulator tracks reality) is performed as procedure (5). The predicted observed value Ŷ(t) given by the simulator, the actual observed value Y(t), and the operation amounts F(t) and operating environment X(t) regarded as their causes are prepared as a data set. When Ŷ(t) is calculated for the first time, before any θ adjustment has been performed, values sampled from an appropriate probability distribution are used as the initial values of θ.

As the appropriate probability distribution p(x), for example, a uniform distribution over the interval -1 ≤ x ≤ 1 with a constant density of 1/2 can be used, as shown in FIG. 9. This is only an example, and the distribution is not limited to it.
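
The uniform distribution of FIG. 9 and the sampling of the initial values of θ can be sketched as follows. The function names are hypothetical; the interval and the density 1/2 are taken from the description.

```python
import random
from typing import List


def uniform_density(x: float, low: float = -1.0, high: float = 1.0) -> float:
    """Density of the uniform distribution on [low, high]; 1/2 on the default interval."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def sample_initial_theta(n: int, rng: random.Random) -> List[float]:
    """Draw n initial parameter values from the uniform distribution on [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]
```

The density is 1/2 because it must integrate to 1 over an interval of width 2. For the 32 parameters introduced in procedure (3), `sample_initial_theta(32, rng)` would give one initial set.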

After the data set has been prepared, θ is adjusted. The adjustment uses a simulator optimization technique called assimilation. Many algorithms for assimilation are publicly known, and their details are omitted here. For the simulator of the shield 10, a method using a Gaussian kernel can be applied, for example.

In assimilation, the θ contained in a formula that is unsatisfactory in its initial state is corrected step by step toward a satisfactory one. Here, θ is not a fixed, specific number but a realization sampled from a probability distribution π(θ). What is actually corrected in assimilation is therefore the probability distribution π(θ). If the distribution before correction is called the "prior distribution", the distribution after correction is the "posterior distribution".

In parameter adjustment, θ is sampled again from the posterior distribution, the process returns to procedure (3) above, and the calculation for the next time step t = t + 1 is performed. This is repeated until a satisfactory simulator is obtained.
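
The description does not specify the assimilation algorithm beyond mentioning a Gaussian kernel. The following is therefore only one possible sketch, under the assumption that π(θ) is represented by a set of samples, that each sample is weighted by a Gaussian kernel on the gap between its simulated value and the observation, and that the samples are then resampled in proportion to those weights. The names `assimilate` and `bandwidth` are hypothetical, and `bandwidth` is an example of the kind of tuning element that would need to be set empirically.

```python
import math
import random
from typing import Callable, List


def assimilate(
    particles: List[float],
    simulate: Callable[[float], float],
    observed: float,
    bandwidth: float,
    rng: random.Random,
) -> List[float]:
    """Reweight parameter samples with a Gaussian kernel and resample them."""
    weights = [
        math.exp(-((simulate(theta) - observed) ** 2) / (2.0 * bandwidth ** 2))
        for theta in particles
    ]
    if sum(weights) == 0.0:
        return list(particles)  # no sample explains the observation; keep the prior
    # Sampling with replacement in proportion to the weights approximates the posterior.
    return rng.choices(particles, weights=weights, k=len(particles))


# Toy usage (assumptions): a scalar theta, a simulator y = 2 * theta, observation 1.0.
rng = random.Random(0)
prior = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
posterior = assimilate(prior, lambda theta: 2.0 * theta, 1.0, 0.2, rng)
```

In this toy case the resampled values concentrate around θ = 0.5, the value at which the simulated output matches the observation. In the procedure described above, θ for the next time step would then be drawn from such a corrected set.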

The prediction can be corrected by adjusting the parameters θ in this way. However, the correction of the probability distribution π(θ), which is what actually has to be corrected, may fail to converge well through repetition of procedures (3) to (5) above and may remain unsatisfactory. In such a case, the correction process can be traced to identify the problem, which can then be addressed by adjusting the hyperparameters, that is, the empirical tuning elements included in the assimilation procedure.

As described above, even in a case such as shield excavation control, where it is difficult to prepare sufficient training data, using a learning model that has been trained to some extent allows the validity of its predictions to be verified by simulation. This makes it possible to manage the risk of prediction failure and to widen the range over which excavation control by a trained model can be applied. In addition, applying the assimilation technique makes the simulation of a shield practically feasible, although it has so far been regarded as difficult because it is complex and involves many unknown parameters.

The present system can be used not only as a simulator for testing trained models but also, for example, as a tool for searching for the optimal operation when a shield is operated manually or for evaluating the result of an operation in advance.

The simulation system, method, and program of the present invention have been described in detail above with reference to the embodiments shown in the drawings. However, the present invention is not limited to those embodiments. It may be modified within the scope that a person skilled in the art could conceive, including other embodiments, additions, changes, and deletions, and any such aspect is included in the scope of the present invention as long as it provides the functions and effects of the present invention.

Reference Signs List
10: Shield
11: Cutter head
12: Shield jack
13: Segment
14: Skin plate
14a: Front body
14b: Rear body
15: Center bending jack
16: Chamber
17: Screw conveyor
18: Belt conveyor
19: Erector
20: Backfill injection device
21: Copy cutter
30: Information processing system
31: CPU
32: ROM
33: RAM
34: HDD
35: External device I/F
36: Input/output I/F
37: Display device
38: Input device
40: Prediction unit
41: Acquisition unit
42: Verification unit
43: Adjustment unit

Claims

1. An information processing system that reproduces a behavior of an excavation machine, comprising:
a prediction means for predicting an operation result of the excavation machine when controlled according to control information, using the control information output from a trained model used for controlling the excavation machine and adjustable parameters;
an acquisition means for acquiring an operation result of the excavation machine when actually controlled according to the control information;
a verification means for verifying the predicted operation result based on the acquired operation result; and
an adjustment means for adjusting the parameters in accordance with a verification result.

2. The information processing system according to claim 1, wherein the verification means determines the validity of the prediction means based on whether or not a difference between an observed value, which is the acquired operation result, and a predicted value, which is the predicted operation result, is equal to or greater than a threshold value.

3. The information processing system according to claim 2, wherein the prediction means calculates the predicted value using a formula in which the time-varying control information is a variable and the time-varying parameters are coefficients.

4. The information processing system according to claim 2 or 3, wherein the adjustment means adjusts the parameters when the prediction means is determined not to be valid as a result of the verification.

5. The information processing system according to claim 4, wherein the adjustment means adjusts the parameters by modifying a probability distribution from which the parameters are sampled and resampling from the modified probability distribution.

6. The information processing system according to claim 5, wherein the adjustment of the parameters by the adjustment means, the calculation of the predicted value by the prediction means, and the acquisition of the operation result of the excavation machine by the acquisition means are repeated until the verification means determines that the prediction means is valid.

7. The information processing system according to claim 1, further comprising:
a learning means for training a learning model used for controlling the excavation machine using training information;
a determination means for applying test data to the trained model, which is the learning model after training, to determine whether the trained model has a predetermined generalization performance;
a prediction execution means for executing a prediction of the control information using the trained model when it is determined that the trained model has the predetermined generalization performance; and
an evaluation means for evaluating the result of control based on the control information output from the trained model by determining whether the result is within an allowable range.

8. A program for causing a computer to execute a process for reproducing a behavior of an excavation machine, the program causing the computer to execute:
a step of predicting an operation result of the excavation machine when controlled according to control information, using the control information output from a trained model used for controlling the excavation machine and adjustable parameters;
a step of acquiring an operation result of the excavation machine when actually controlled according to the control information;
a step of verifying the predicted operation result based on the acquired operation result;
a step of adjusting the parameters in accordance with a verification result; and
a step of predicting the operation result of the excavation machine again using the control information output from the trained model and the adjusted parameters.