JP7245750B2

JP7245750B2 - Evaluation support system, evaluation support method, and program

Info

Publication number: JP7245750B2
Application number: JP2019158054A
Authority: JP
Inventors: 嵩弓今井
Original assignee: Maeda Corp
Current assignee: Maeda Corp
Priority date: 2019-08-30
Filing date: 2019-08-30
Publication date: 2023-03-24
Anticipated expiration: 2039-08-30
Also published as: JP2021036110A

Description

The present invention relates to an evaluation support system, an evaluation support method, and a program.

Conventionally, techniques for supporting work in the construction industry have been studied. For example, Patent Document 1 describes a technique for automating face evaluation by using a learning model trained on teacher data indicating the relationship between images of a face photographed during underground observation and the evaluation results for the observation items of that face. As another example, Non-Patent Document 1 describes a technique that tracks the line of sight of an operator of a shield machine on the operation screen and analyzes the series of processes by which the operator recognizes and deals with fluctuations in the suction pressure of a pump.

Patent Document 1: JP 2019-023392 A

Non-Patent Document 1: Proceedings of the 32nd Annual Conference of the Japanese Society for Artificial Intelligence (2018), "Estimation of Cognitive Processes of Shield Machine Operators by Eye Tracking", Nao Fujimoto, Junya Morita, Yasushi Okubo, Nobuhiko Obayashi, Takehiro Shirai

When evaluating an evaluation target such as a face, the important parts on which the evaluator should focus differ depending on various conditions such as the environment of the construction site. An expert can identify the important parts from experience, but this is difficult for an inexperienced person. Patent Document 1 says nothing about important parts; the features of the face shown in the captured image are merely learned uniformly as a whole, so the accuracy of the learning model cannot be improved sufficiently. Non-Patent Document 1 merely tracks the line of sight on an operation screen that displays values such as the suction pressure of the pump, and does not identify the important parts of an evaluation target. For these reasons, the conventional techniques cannot sufficiently support the work of evaluating an evaluation target.

The present invention has been made in view of the above problems, and its object is to provide an evaluation support system, an evaluation support method, and a program capable of supporting the work of evaluating an evaluation target.

In order to solve the above problems, an evaluation support system according to one aspect of the present invention includes: teacher data acquisition means for acquiring teacher data indicating the relationship between, on the one hand, a teacher captured image in which an evaluation target at a construction site is photographed and teacher gaze point information corresponding to the teacher captured image and, on the other hand, a teacher evaluation result of the evaluation target by an evaluator; learning means for training an evaluation result output model based on the teacher data; input means for inputting, to the evaluation result output model, an input captured image and input gaze point information corresponding to the input captured image; and output evaluation result acquisition means for acquiring an output evaluation result output from the evaluation result output model.

An evaluation support method according to one aspect of the present invention includes: a teacher data acquisition step of acquiring teacher data indicating the relationship between, on the one hand, a teacher captured image in which an evaluation target at a construction site is photographed and teacher gaze point information corresponding to the teacher captured image and, on the other hand, a teacher evaluation result of the evaluation target by an evaluator; a learning step of training an evaluation result output model based on the teacher data; an input step of inputting, to the evaluation result output model, an input captured image and input gaze point information corresponding to the input captured image; and an output evaluation result acquisition step of acquiring an output evaluation result output from the evaluation result output model.

A program according to one aspect of the present invention causes a computer to function as: teacher data acquisition means for acquiring teacher data indicating the relationship between, on the one hand, a teacher captured image in which an evaluation target at a construction site is photographed and teacher gaze point information corresponding to the teacher captured image and, on the other hand, a teacher evaluation result of the evaluation target by an evaluator; learning means for training an evaluation result output model based on the teacher data; input means for inputting, to the evaluation result output model, an input captured image and input gaze point information corresponding to the input captured image; and output evaluation result acquisition means for acquiring an output evaluation result output from the evaluation result output model.
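The four means described above (acquire teacher data, train, input, obtain output) can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the `Grid` image stand-in, the `TeacherRecord` and `NearestRecordModel` names, and above all the nearest-record lookup, which merely stands in for the evaluation result output model and is not the model the specification describes.

```python
from dataclasses import dataclass

Grid = list[list[float]]  # hypothetical stand-in for an image: rows of grey values


@dataclass(frozen=True)
class TeacherRecord:
    """One teacher-data entry: (captured image, gaze point info) -> evaluation."""
    image: Grid
    gaze: Grid        # same size as image; larger value = stronger attention
    evaluation: str   # e.g. a grade written by the skilled evaluator


class NearestRecordModel:
    """Toy stand-in for the evaluation result output model (an assumption)."""

    def __init__(self) -> None:
        self._records: list[TeacherRecord] = []

    def train(self, records: list[TeacherRecord]) -> None:
        # "Learning" here is only memorising the teacher data.
        self._records = list(records)

    def predict(self, image: Grid, gaze: Grid) -> str:
        if not self._records:
            raise ValueError("model has not been trained")

        def distance(rec: TeacherRecord) -> float:
            # Pixel differences are weighted by the input gaze map, so the
            # regions marked as important dominate the comparison.
            return sum(
                g * abs(p - q)
                for row_p, row_q, row_g in zip(image, rec.image, gaze)
                for p, q, g in zip(row_p, row_q, row_g)
            )

        return min(self._records, key=distance).evaluation
```

The point of the sketch is only that the gaze point information changes which parts of the image drive the result; a real system would replace the lookup with a trained machine learning model.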

In one aspect of the present invention, the evaluation support system further includes teacher gaze point information acquisition means for acquiring the teacher gaze point information based on the evaluator's line of sight detected by line-of-sight detection means, and the teacher data indicates the relationship between the teacher captured image and the teacher gaze point information acquired by the teacher gaze point information acquisition means, on the one hand, and the teacher evaluation result, on the other.

In one aspect of the present invention, the teacher gaze point information acquisition means identifies, among the lines of sight of the evaluator detected by the line-of-sight detection means, those directed at the screen on which the teacher captured image is displayed, and acquires the teacher gaze point information based on the identified lines of sight.
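A minimal sketch of this filtering step follows. It assumes, purely for illustration, that the line-of-sight detection means reports gaze samples as (x, y) screen coordinates and that the captured image occupies a known rectangle on the screen; `ScreenRect` and `gaze_on_image` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenRect:
    """Area of the screen (in screen pixels) where the captured image is shown."""
    left: float
    top: float
    width: float
    height: float


def gaze_on_image(samples, rect, image_width, image_height):
    """Keep only gaze samples inside rect and convert them to image pixels.

    samples: iterable of (x, y) screen coordinates from the eye tracker.
    Returns a list of (column, row) integer pixel coordinates.
    """
    points = []
    for x, y in samples:
        u = (x - rect.left) / rect.width   # 0..1 across the displayed image
        v = (y - rect.top) / rect.height   # 0..1 down the displayed image
        if 0.0 <= u < 1.0 and 0.0 <= v < 1.0:
            points.append((int(u * image_width), int(v * image_height)))
    return points
```

Samples that fall outside the rectangle, for example while the evaluator looks at a keyboard or at another window, are discarded before any gaze point information is built.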

In one aspect of the present invention, the teacher data indicates the relationship between, on the one hand, feature information of the construction work at the construction site, the teacher captured image, and the teacher gaze point information and, on the other hand, the teacher evaluation result; the evaluation support system further includes feature information acquisition means for acquiring feature information of the construction work corresponding to the input captured image; and the input means inputs, to the evaluation result output model, the feature information acquired by the feature information acquisition means, the input captured image, and the input gaze point information.

In one aspect of the present invention, the learning means trains, for each piece of feature information of construction work at the construction site, an evaluation result output model based on the teacher data corresponding to that feature information; the evaluation support system further includes feature information acquisition means for acquiring feature information of the construction work corresponding to the input captured image; and the input means inputs the input captured image and the input gaze point information to the evaluation result output model corresponding to the feature information acquired by the feature information acquisition means.
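The per-feature variant amounts to keeping one model per feature value and routing each input to the matching model. The sketch below shows that routing under stated assumptions: feature information is represented as a plain hashable label, and `train_fn` is a hypothetical trainer that returns a callable model; neither detail comes from the specification.

```python
from collections import defaultdict


def group_by_feature(teacher_data):
    """Split teacher data into one training set per construction feature.

    teacher_data: iterable of (feature, image, gaze, evaluation) tuples.
    """
    groups = defaultdict(list)
    for feature, image, gaze, evaluation in teacher_data:
        groups[feature].append((image, gaze, evaluation))
    return dict(groups)


def train_per_feature(teacher_data, train_fn):
    """Train one model per feature; train_fn is a hypothetical trainer."""
    return {
        feature: train_fn(records)
        for feature, records in group_by_feature(teacher_data).items()
    }


def evaluate(models, feature, image, gaze):
    """Route the input to the model trained for the matching feature."""
    if feature not in models:
        raise KeyError(f"no model trained for feature {feature!r}")
    return models[feature](image, gaze)
```

Compared with feeding the feature information into a single model, this design keeps each model specialised but requires enough teacher data for every feature value.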

In one aspect of the present invention, the evaluation support system further includes teacher gaze point information acquisition means for acquiring, as the teacher gaze point information, the output gaze point information that is output when the teacher captured image is input to a trained gaze point information output model, and the teacher data indicates the relationship between the teacher captured image and the teacher gaze point information acquired by the teacher gaze point information acquisition means, on the one hand, and the teacher evaluation result, on the other.

In one aspect of the present invention, the evaluation support system further includes input gaze point information acquisition means for acquiring, as the input gaze point information, the output gaze point information that is output when the input captured image is input to a trained gaze point information output model, and the input means inputs, to the evaluation result output model, the input captured image and the input gaze point information acquired by the input gaze point information acquisition means.

In one aspect of the present invention, the teacher captured image and the input captured image have the same size as each other, and each of the teacher gaze point information and the input gaze point information is an image of that same size in which the gaze points are expressed by color.
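The size constraint above is easy to check before training or inference. The following sketch assumes images are held as rectangular lists of rows; `shape` and `check_pair` are hypothetical helper names introduced only for this example.

```python
def shape(grid):
    """Return (height, width) of a rectangular list-of-rows image."""
    if not grid or any(len(row) != len(grid[0]) for row in grid):
        raise ValueError("image must be a non-empty rectangular grid")
    return len(grid), len(grid[0])


def check_pair(captured, gaze, expected_shape=None):
    """Verify that a captured image and its gaze image can be overlaid.

    Raises ValueError if the two differ in size, or if they differ from
    expected_shape (the common size used for all teacher and input images).
    """
    captured_shape = shape(captured)
    if shape(gaze) != captured_shape:
        raise ValueError("gaze image and captured image differ in size")
    if expected_shape is not None and captured_shape != expected_shape:
        raise ValueError("image size differs from the common size")
    return captured_shape
```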

In one aspect of the present invention, the evaluation target is a tunnel face; the teacher captured image is an image of a tunnel face photographed at the construction site; the teacher gaze point information indicates the gaze points of the evaluator when evaluating the tunnel face; the input captured image is an image of a tunnel face photographed at the construction site or at another construction site; and the input gaze point information indicates the parts that should be looked at when evaluating the tunnel face shown in the input captured image.

According to the present invention, it is possible to support the work of evaluating an evaluation target.

FIG. 1 is a diagram showing the overall configuration of an evaluation support system according to an embodiment.
FIG. 2 is a diagram showing an example of a captured image of a face.
FIG. 3 is a diagram showing how an expert evaluates a face by looking at a captured image.
FIG. 4 is an explanatory diagram showing an overview of the gaze point image output model.
FIG. 5 is an explanatory diagram showing an overview of the evaluation result output model.
FIG. 6 is a functional block diagram showing an example of functions implemented in the evaluation support system.
FIG. 7 is a diagram showing a data storage example of the first teacher data.
FIG. 8 is a diagram showing a data storage example of the second teacher data.
FIG. 9 is a flow diagram showing the learning processing.
FIG. 10 is a flow diagram showing the evaluation support processing.
FIG. 11 is a functional block diagram according to a modification.

[1. Overall configuration of evaluation support system]
FIG. 1 is a diagram showing the overall configuration of an evaluation support system according to an embodiment. As shown in FIG. 1, the evaluation support system S includes a learning terminal 10 and a line-of-sight detection device 20, which are communicably connected to each other. The evaluation support system S may also include other computers such as a server computer.

The learning terminal 10 is a computer that executes the processing described in this embodiment, and is, for example, a personal computer, a personal digital assistant (including a tablet computer), or a mobile phone (including a smartphone). For example, the learning terminal 10 includes a control unit 11, a storage unit 12, a communication unit 13, an operation unit 14, and a display unit 15.

The control unit 11 includes at least one processor. The control unit 11 executes processing according to programs and data stored in the storage unit 12. The storage unit 12 includes a main storage unit and an auxiliary storage unit. For example, the main storage unit is volatile memory such as RAM, and the auxiliary storage unit is nonvolatile memory such as a hard disk or flash memory. The communication unit 13 includes a communication interface for wired or wireless communication and performs data communication, for example, via a network. The operation unit 14 is an input device such as a pointing device (for example, a touch panel or a mouse) or a keyboard. The operation unit 14 transmits the content of operations to the control unit 11. The display unit 15 is, for example, a liquid crystal display unit or an organic EL display unit.

The line-of-sight detection device 20 is a device that detects a human line of sight, and includes, for example, a camera, an infrared sensor, or an infrared light emitting unit. The line-of-sight detection device 20 is sometimes called an eye tracker and detects the movement of human eyes. Any method can be used to detect the line of sight: for example, a non-contact method such as the scleral reflection method or the corneal reflection method may be used, or a contact method such as the search coil method or the electro-oculography method may be used. This embodiment describes the case where the line-of-sight detection device 20 is a device external to the learning terminal 10, but the line-of-sight detection device 20 may be incorporated as part of the learning terminal 10. In addition, although a stationary line-of-sight detection device 20 is described as an example, the line-of-sight detection device 20 may be a wearable device such as a head-mounted display or smart glasses.

Note that the programs and data described as being stored in the storage unit 12 may be supplied to the learning terminal 10 via a network. The hardware configuration of the learning terminal 10 is not limited to the above example, and various hardware can be applied. For example, the learning terminal 10 may include a reading unit (for example, an optical disc drive or a memory card slot) that reads a computer-readable information storage medium, or an input/output unit (for example, a USB terminal) for direct connection to an external device. In this case, the programs and data stored in the information storage medium may be supplied to the learning terminal 10 via the reading unit or the input/output unit.

[2. Overview of evaluation support system]
The evaluation support system S supports the work of an evaluator who evaluates an evaluation target at a construction site. The evaluator is the person in charge of the evaluation, and may be an employee of the construction company or a person to whom the construction company has entrusted the evaluation work. A construction site is a place where construction work is performed, and may be above ground or underground. The construction work may be of any type, for example civil engineering work or building work.

An evaluation target is an object evaluated by the evaluator. In other words, the evaluation target is what the evaluator looks at in order to judge the progress of the construction work. For example, the evaluation target is an object that is excavated, built, or demolished by the construction work. Evaluation means observing the evaluation target. In other words, evaluation is checking the state, quality, or acceptability of the evaluation target.

In this embodiment, the processing of the evaluation support system S is described by taking as an example the evaluation of a tunnel face (hereinafter simply referred to as a face) in mountain tunnel construction. A face is an excavation surface or an excavation location. The face is one example of an evaluation target, and wherever this embodiment refers to a face, it can be read as an evaluation target. The evaluator may go to the construction site and evaluate the face by looking at it directly, but this embodiment describes the case where the evaluator evaluates the face by looking at a captured image of the face taken with a camera.

FIG. 2 is a diagram showing an example of a captured image of a face. As shown in FIG. 2, the captured image I1 shows, for example, a face photographed from the front. The evaluator visually evaluates the face shown in the captured image I1 and creates a face observation record. The face observation record may be electronic data or paper. The evaluation results of the face are entered or written in the face observation record, which contains evaluation items such as the stability of the face, the self-supporting ability of the unsupported excavated surface, compressive strength, the presence or absence of weathering or alteration, the frequency, condition, and form of fractures, the presence or absence of water inflow, and the presence or absence of deterioration due to water.

For each evaluation item, the evaluator evaluates each part of the face, such as the left side, the crown, and the right side. In principle, the evaluator must evaluate the face at least once a day, and creates the face observation record by looking at a captured image showing the latest state of the face. Creating the face observation record is therefore one factor that increases the evaluator's workload. In addition, evaluating a face requires advanced expertise and extensive experience, so it is difficult for an inexperienced evaluator to evaluate accurately. Furthermore, the evaluation of a face varies with the evaluator's viewpoint and the conditions at the site, which makes a universal evaluation difficult as well.
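The item-by-part layout of the face observation record described above can be pictured as a small table keyed by (item, part). The sketch below is only an illustration: the class name, the integer grades, and the `image_id` field are assumptions, since the text does not specify how results are encoded.

```python
from dataclasses import dataclass, field

PARTS = ("left", "crown", "right")  # parts of the face named in the text


@dataclass
class FaceObservationRecord:
    """Evaluation results for one captured image, per item and per part."""
    image_id: str
    grades: dict = field(default_factory=dict)  # (item, part) -> grade

    def set_grade(self, item: str, part: str, grade: int) -> None:
        if part not in PARTS:
            raise ValueError(f"unknown part: {part!r}")
        self.grades[(item, part)] = grade

    def missing(self, items) -> list:
        """List the (item, part) cells that have not been filled in yet."""
        return [
            (item, part)
            for item in items
            for part in PARTS
            if (item, part) not in self.grades
        ]
```

A helper such as `missing` makes it visible how many cells an evaluator has to fill in every day, which is the workload the system aims to reduce.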

The evaluation support system S therefore tracks the line of sight of an expert while the expert evaluates a face by looking at a captured image, prepares a machine learning model that has learned the parts on which the expert focuses, and uses it to support the face evaluation work. The evaluation support system S also prepares a machine learning model that has learned the face observation records created by experts, and uses it to support the face evaluation work as well.

FIG. 3 is a diagram showing how an expert evaluates a face by looking at a captured image. As shown in FIG. 3, the expert displays the captured image I1 on the display unit 15, evaluates the face, and creates a face observation record B1. The line-of-sight detection device 20 is placed, for example, below the display unit 15 so that it can detect the expert's line of sight. The detection range of the line-of-sight detection device 20 is the area in front of the screen, and the device detects the expert's line of sight. Based on the detection result of the line-of-sight detection device 20, the learning terminal 10 creates a gaze point image I2 indicating the parts on which the expert focuses.

The gaze point image I2 is an image also called a heat map, in which the parts the expert looked at are expressed by color. The color of the gaze point image I2 represents the degree of gaze, and locations with a strong degree of gaze are colored differently from locations with a weak degree of gaze. The degree of gaze is the degree to which the expert considers a location important. In other words, it can also be described as the time or frequency with which the expert gazed at the location. For example, the gaze point image I2 may indicate the degree of gaze with a color gradation, or, without a gradation, by color-coding with a plurality of colors such as red, blue, and yellow.

In FIG. 3, the depth of color is indicated by the density of halftone dots: the darker the color in the gaze point image I2, the higher the degree of gaze. The gaze point image I2 of this embodiment has the same size as the captured image I1, and positions on the face shown in the captured image I1 correspond to gaze point positions shown in the gaze point image I2. The gaze point image I2 in FIG. 3 therefore means that the expert focuses on the state of the upper right, lower left, and lower center of the face.
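One common way to turn gaze samples into such a heat map is to add a Gaussian bump per sample and normalise the result. The text does not state how the gaze point image is actually computed, so the sketch below is an assumption throughout: the Gaussian kernel, the `sigma` parameter, the blue-to-red gradient, and the function names are all illustrative.

```python
import math


def gaze_heatmap(points, width, height, sigma=1.0):
    """Build a gaze heat map the same size as the captured image.

    points: (column, row) gaze samples in image pixels; repeated or nearby
    samples add up, so longer or more frequent gazing gives larger values.
    Returns rows of floats scaled so that the strongest cell equals 1.0.
    """
    heat = [[0.0] * width for _ in range(height)]
    for px, py in points:
        for row in range(height):
            for col in range(width):
                d2 = (col - px) ** 2 + (row - py) ** 2
                heat[row][col] += math.exp(-d2 / (2.0 * sigma ** 2))
    peak = max(max(r) for r in heat)
    if peak > 0.0:
        heat = [[v / peak for v in r] for r in heat]
    return heat


def to_colour(value):
    """Map 0..1 attention to an (R, G, B) gradient from blue to red."""
    value = min(1.0, max(0.0, value))
    return (round(255 * value), 0, round(255 * (1.0 - value)))
```

Because the map has the same width and height as the captured image, each cell can be overlaid directly on the corresponding pixel of the face. The nested loops are quadratic in image size and are meant for clarity, not for full-resolution images.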

The evaluation support system S has experts evaluate various captured images I1 and creates gaze point images I2. A plurality of experts may take part in creating the gaze point images I2, and captured images I1 of the faces at each of a plurality of construction sites may be used. The evaluation support system S accumulates pairs of a captured image I1 and a gaze point image I2, and creates a gaze point image output model that generates a gaze point image I2 from a captured image I1.

FIG. 4 is an explanatory diagram showing an overview of the gaze point image output model. As shown in FIG. 4, the gaze point image output model M1 is a machine learning model trained on first teacher data D1, which stores a large number of pairs of a captured image I1 already evaluated by an expert and a gaze point image I2 created using the line-of-sight detection device 20. Known methods can be used for the machine learning itself; for example, a convolutional neural network or a recursive neural network may be used. For example, an inexperienced evaluator inputs, to the trained gaze point image output model M1, a captured image I3 of the face to be evaluated in that day's work.

The captured image I3 is an unknown image that the gaze point image output model M1 has not learned. When the captured image I3 is input, the gaze point image output model M1 outputs a gaze point image I4. Since the gaze point image output model M1 has been trained on the first teacher data D1, the gaze point image I4 indicates the parts on which an expert is presumed to focus if the expert were to evaluate the face in the captured image I3. The evaluator may carry out the face evaluation work by looking at the captured image I3 while referring to the gaze point image I4; in this embodiment, however, an evaluation result output model that automatically creates the evaluation result of the face is also provided.

FIG. 5 is an explanatory diagram showing an overview of the evaluation result output model. As shown in FIG. 5, the evaluation result output model M2 is a machine learning model trained on second teacher data D2, which stores a large number of pairs of, on the one hand, the captured image I1 and gaze point image I2 described above and, on the other hand, the face observation record B1 filled in by an expert. In this embodiment, once the captured image I3 has been input to the gaze point image output model M1 and the gaze point image I4 has been created, the captured image I3 and the gaze point image I4 are input to the evaluation result output model M2.

When the captured image I3 and the gaze point image I4 are input, the evaluation result output model M2 outputs a face observation record B2. Since the evaluation result output model M2 has been trained on the second teacher data D2, the face observation record B2 indicates the content of the face observation record that an expert is presumed to create if the expert were to evaluate the face in the captured image I3. The face observation record B2 may be used as is as that day's evaluation result, or may be used as reference information for the evaluator in charge of that day's evaluation.
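The chaining of the two models can be summarised in a few lines. In the sketch below both models are passed in as plain callables, which is an assumption made to keep the example independent of any machine learning library; `evaluate_face` is a hypothetical name.

```python
def evaluate_face(captured_image, gaze_model, evaluation_model):
    """Run the two models in sequence for one new captured image.

    gaze_model:       callable, captured image -> gaze point image (M1)
    evaluation_model: callable, (captured image, gaze point image)
                      -> face observation record (M2)
    Both callables are hypothetical stand-ins for trained models.
    """
    gaze_image = gaze_model(captured_image)                 # I3 -> I4
    record = evaluation_model(captured_image, gaze_image)   # (I3, I4) -> B2
    return gaze_image, record
```

Returning the intermediate gaze point image together with the record lets the evaluator inspect where the first model expects an expert to look, in addition to reading the proposed evaluation.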

As described above, the evaluation support system S mainly has a first configuration that acquires the gaze point image I4 using the gaze point image output model M1, and a second configuration that acquires the face observation record B2 using the evaluation result output model M2, and can thereby support the work of evaluating an evaluation target. The details of the evaluation support system S are described below.

[3. Functions implemented in this embodiment]
FIG. 6 is a functional block diagram showing an example of functions implemented in the evaluation support system S. As shown in FIG. 6, the evaluation support system S implements a data storage unit 100, a teacher gaze point image acquisition unit 101, a first teacher data acquisition unit 102, a first learning unit 103, a first input unit 104, an output gaze point image acquisition unit 105, a second teacher data acquisition unit 106, a second learning unit 107, an input gaze point image acquisition unit 108, a second input unit 109, and an output evaluation result acquisition unit 110.

The data storage unit 100, the teacher gaze point image acquisition unit 101, the first teacher data acquisition unit 102, the first learning unit 103, the first input unit 104, and the output gaze point image acquisition unit 105 are functions mainly related to the first configuration. The data storage unit 100, the second teacher data acquisition unit 106, the second learning unit 107, the input gaze point image acquisition unit 108, the second input unit 109, and the output evaluation result acquisition unit 110 are functions mainly related to the second configuration. This embodiment describes the case where each of these functions is implemented by the learning terminal 10, but each function may be implemented by another computer such as a server computer, as in the modifications described later.

[Data storage unit]
The data storage unit 100 is realized mainly by the storage unit 12. It stores the data necessary for executing the processing of the present embodiment. For example, the data storage unit 100 stores the first teacher data D1 and the second teacher data D2.

FIG. 7 is a diagram showing a data storage example of the first teacher data D1. As shown in FIG. 7, the first teacher data D1 is data indicating the relationship between a teacher captured image, in which a face at a construction site is captured, and a teacher gaze point image of the evaluator who evaluated the face. In the present embodiment, the teacher gaze point image is acquired by the teacher gaze point image acquisition unit 101 described later, so the first teacher data D1 indicates the relationship between the teacher captured image and the teacher gaze point image acquired by the teacher gaze point image acquisition unit 101. In the present embodiment, the case is described where the teacher gaze point image indicates the gaze points when the evaluator evaluated the face by looking at the teacher captured image, but the teacher gaze point image may instead indicate the gaze points when the evaluator evaluated the actual face by viewing it directly rather than through an image.

The first teacher data D1 is created at an arbitrary timing after an expert has performed an evaluation, and stores a plurality of pairs of a teacher captured image and a teacher gaze point image. The number of pairs stored in the first teacher data D1 may be arbitrary, for example about ten to several tens, or about one hundred to several tens of thousands. The first teacher data D1 can be regarded as data that defines the correspondence between the input and the output of the gaze point image output model M1: the teacher captured image corresponds to the input, and the teacher gaze point image corresponds to the output. Although FIG. 7 shows the case where image files are stored in the first teacher data D1, the feature amounts of the images may be stored instead. A feature amount may be expressed in any format, such as a vector or an array.

A teacher captured image is a captured image used as teacher data. In other words, it is a captured image used to train a machine learning model. Teacher data is also called training data or learning data. The captured image I1 shown in FIGS. 3 to 5 corresponds to a teacher captured image. A teacher captured image shows an evaluation target at a construction site. In the present embodiment, the face corresponds to the evaluation target, so a teacher captured image is an image in which a face is captured.

A teacher captured image may be data with any extension, for example a JPEG, PNG, BMP, or GIF image. In the present embodiment, the case where the teacher captured image is a color image is described, but it may be in another format such as a grayscale image or a monochrome image. The size, resolution, and number of bits of the teacher captured image may be arbitrary.

In the present embodiment, the face is captured under the same imaging conditions in every teacher captured image, and the images share the same extension, format, size, resolution, and number of bits, but these may differ from one another. The imaging conditions are the conditions under which the face is captured, for example the positional relationship between the face and the camera (the position, orientation, and height of the camera relative to the face), the proportion of the image occupied by the face, the brightness and color of the lighting at the time of capture, and settings such as the format of the image generated by the camera. The positional relationship between the face and the camera (the direction from which the face is captured) is assumed to be the same for all teacher captured images, but it may differ slightly from image to image.

A teacher gaze point image is a gaze point image used as teacher data. In other words, it is a gaze point image used to train a machine learning model. In a teacher gaze point image, the gaze points (the degree of gaze) are expressed by color. The gaze point image I2 shown in FIGS. 3 to 5 corresponds to a teacher gaze point image. In the present embodiment, the face corresponds to the evaluation target, so a teacher gaze point image is an image showing the gaze points when an evaluator evaluated a face.

A teacher gaze point image may be data with any extension, for example a JPEG, PNG, BMP, or GIF image. In the present embodiment, the case where the teacher gaze point image is a color image is described, but it may be in another format such as a grayscale image or a monochrome image. The size, resolution, and number of bits of the teacher gaze point image may be arbitrary.

In the present embodiment, the teacher captured image and the teacher gaze point image have the same extension, format, size, resolution, and number of bits, but these may differ. Also, in the present embodiment, a gaze point image is described as an example of gaze point information, but the gaze point information may be in a form other than an image that a human can view, and may be expressed, for example, as coordinate information, tabular information, or information in the form of a mathematical expression. Wherever the present embodiment refers to a gaze point image, this can be read as gaze point information. For example, the teacher gaze point image, the output gaze point image, and the input gaze point image can be read as teacher gaze point information, output gaze point information, and input gaze point information, respectively.

FIG. 8 is a diagram showing a data storage example of the second teacher data D2. As shown in FIG. 8, the second teacher data D2 is data indicating the relationship between, on the one hand, a teacher captured image in which an evaluation target at a construction site is captured together with the teacher gaze point image corresponding to that teacher captured image and, on the other hand, the teacher evaluation result of the evaluation target given by an evaluator. In the present embodiment, the teacher gaze point image is acquired by the teacher gaze point image acquisition unit 101 described later, so the second teacher data D2 indicates the relationship between the teacher captured image together with the teacher gaze point image acquired by the teacher gaze point image acquisition unit 101, and the teacher evaluation result.

The second teacher data D2 is created at an arbitrary timing after an expert's evaluation, and stores a plurality of pairs, each consisting of a teacher captured image and teacher gaze point image on one side and a teacher evaluation result on the other. The number of pairs stored in the second teacher data D2 may be arbitrary, for example about ten to several tens, or about one hundred to several tens of thousands. The second teacher data D2 can be regarded as data that defines the correspondence between the input and the output of the evaluation result output model M2: the teacher captured image and the teacher gaze point image correspond to the input, and the teacher evaluation result corresponds to the output. Although FIG. 8 shows the case where image files are stored in the second teacher data D2, the feature amounts of the images may be stored instead.

The teacher gaze point image corresponding to a teacher captured image is the gaze point image created when an expert evaluated the teacher captured image, or the gaze point image that would result if an expert were assumed to evaluate the teacher captured image. For example, if the captured image I1 shown in FIGS. 3 to 5 is a teacher captured image, the gaze point image I2 created using the line-of-sight detection device 20 is the teacher gaze point image corresponding to it. As in a modification described later, the gaze point image I4 output when a teacher captured image is input to the gaze point image output model M1 may serve as the teacher gaze point image corresponding to that teacher captured image.

A teacher evaluation result is an evaluation result used as teacher data. In other words, it is an evaluation result used to train a machine learning model. The face observation record B1 shown in FIGS. 3 and 5 corresponds to a teacher evaluation result.

A teacher evaluation result includes the evaluation result for each of a plurality of evaluation items. An evaluation item is an item that serves as a criterion for evaluation, for example the stability of the face or the self-supporting capacity of the unsupported excavated surface mentioned earlier. In the present embodiment, the case where evaluation results are expressed as numerical values is described. For example, whether an evaluation item applies may be indicated by the value "0" or "1", or the applicable value may be selected from three or more values.

For example, the numerical value indicating an evaluation result may be selected from a plurality of predetermined values, such as values corresponding to "very good", "good", "bad", and "very bad", or may be a score, such as a score out of 10 or out of 5. The evaluation result is not limited to a numerical value and may be expressed by symbols, characters, or the like. There may also be only a single evaluation item. The evaluation result can also be regarded as the content of the evaluation, or as the result of judging the progress or the quality of the construction work.
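As an illustration only, the pairs held in the first teacher data D1 and the second teacher data D2 could be represented as follows. This is a minimal sketch, not the storage format of the embodiment: the class names, field names, file names, item names, and the label-to-number mapping `GRADE_VALUES` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical mapping from grade labels to numerical values. The description
# only states that a value is chosen from a predetermined set.
GRADE_VALUES: Dict[str, int] = {"very good": 3, "good": 2, "bad": 1, "very bad": 0}


@dataclass(frozen=True)
class FirstTeacherRecord:
    """One pair in D1: teacher captured image (input) and gaze point image (output)."""
    captured_image_path: str
    gaze_image_path: str


@dataclass(frozen=True)
class SecondTeacherRecord:
    """One pair in D2: (captured image, gaze point image) as input, evaluation as output."""
    captured_image_path: str
    gaze_image_path: str
    evaluation: Dict[str, int]  # evaluation item name -> numerical value


def encode_evaluation(labels: Dict[str, str]) -> Dict[str, int]:
    """Convert per-item grade labels into the numerical form stored in a record."""
    return {item: GRADE_VALUES[label] for item, label in labels.items()}


# Example with made-up file names and two illustrative evaluation items.
record = SecondTeacherRecord(
    captured_image_path="face_0001.png",
    gaze_image_path="gaze_0001.png",
    evaluation=encode_evaluation(
        {"face_stability": "good", "excavated_surface": "very bad"}
    ),
)
```

Storing feature amounts instead of image files, as the description allows, would only change the two path fields into vectors.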

The data storage unit 100 stores not only the first teacher data D1 and the second teacher data D2 but also the gaze point image output model M1 and the evaluation result output model M2. These models are sometimes called artificial intelligence or engines. The data storage unit 100 stores the program and the parameters (coefficients) of each of the two models. The gaze point image output model M1 and the evaluation result output model M2 stored in the data storage unit 100 have already been trained by the first learning unit 103 and the second learning unit 107 described later, respectively, and their parameters and the like have been adjusted using the first teacher data D1 and the second teacher data D2, respectively.

A known algorithm can be used for the machine learning model underlying the gaze point image output model M1. For example, since the gaze point image output model M1 outputs an output gaze point image when an input captured image is input, a machine learning model used for image conversion can be used. For example, a technique for a machine learning model that converts a monochrome image into a color image may be adapted, such as the one described in "Automatic colorization of black-and-white photographs by learning global and local features using a deep network" (Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa, http://iizuka.cs.tsukuba.ac.jp/projects/colorization/ja/). As another example, CycleGAN (Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, https://arxiv.org/pdf/1703.10593.pdf), which is one of the techniques called GAN that learn conversions between multiple sets of data, may be adapted.

A known algorithm can also be used for the machine learning model underlying the evaluation result output model M2. When an input captured image and an input gaze point image are input, the evaluation result output model M2 outputs a corresponding output evaluation result. Since the output evaluation result can be regarded as a classification of the input captured image and the input gaze point image, a so-called classification learner can be used. A classification learner is a machine learning model that classifies (labels) input data; for example, it outputs 0 or 1 to indicate whether the data belongs to a given class, or outputs a score indicating the probability that the data belongs to the class. For example, a technique called Grad-CAM, which classifies objects such as dogs and cats shown in a captured image, may be adapted. As other examples, techniques called FasterRNN, Yolo, or SSD may be adapted.
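The two output styles of a classification learner mentioned above (a probability-like score, or a 0/1 decision) can be sketched as follows. This is an illustrative assumption about how raw model outputs might be post-processed, not the method of the embodiment; the function names, the use of a sigmoid, the 0.5 threshold, and the item names are all hypothetical.

```python
import math
from typing import Dict


def sigmoid(x: float) -> float:
    """Squash a raw model output into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def to_scores(raw_outputs: Dict[str, float]) -> Dict[str, float]:
    """Per-item score indicating how likely the item applies."""
    return {item: sigmoid(value) for item, value in raw_outputs.items()}


def to_binary(scores: Dict[str, float], threshold: float = 0.5) -> Dict[str, int]:
    """Per-item 0/1 decision obtained by thresholding the score."""
    return {item: 1 if score >= threshold else 0 for item, score in scores.items()}


# Example with made-up raw outputs for two illustrative evaluation items.
scores = to_scores({"face_stability": 2.0, "water_inflow": -1.0})
decisions = to_binary(scores)
```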

The data stored in the data storage unit 100 is not limited to the above examples. For example, the data storage unit 100 may store teacher input images that have not yet been evaluated by an expert. It may also store the input captured images described later. As further examples, it may store a program for training each of the gaze point image output model M1 and the evaluation result output model M2, or a program for creating a gaze point image from the detection results of the line-of-sight detection device 20.

[Teacher gaze point image acquisition unit]
The teacher gaze point image acquisition unit 101 is realized mainly by the control unit 11. It acquires a teacher gaze point image. The teacher gaze point image can be acquired by any method; in the present embodiment, the case is described where the teacher gaze point image acquisition unit 101 acquires the teacher gaze point image based on the evaluator's line of sight detected by the line-of-sight detection device 20.

The line-of-sight detection device 20 is an example of the line-of-sight detection means according to the present invention. The line-of-sight detection means is not limited to the line-of-sight detection device 20 and may be any means capable of detecting a line of sight. For example, the camera of a personal computer, tablet computer, or smartphone may serve as the line-of-sight detection means. As another example, a line-of-sight sensor built into a head-mounted display or smart glasses may serve as the line-of-sight detection means. In this case, the expert evaluates the face while wearing the head-mounted display or smart glasses. As described earlier, the present embodiment describes the case where the expert evaluates the face by looking at an image and the teacher gaze point image at that time is acquired, but the expert may instead evaluate the actual face by viewing it directly, and the teacher gaze point image at that time may be acquired.

A known tool can be used for the method of acquiring the teacher gaze point image itself. For example, the teacher gaze point image acquisition unit 101 records the positions of the gaze point in time series based on the detection results of the line-of-sight detection device 20. Based on the positions recorded in time series, it calculates the gaze time for each position on the screen. Based on the gaze time at each position, it determines the pixel value of each pixel and thereby acquires the teacher gaze point image. The relationship between gaze time and pixel value may be determined in advance; for example, it may be defined so that the longer the gaze time, the closer the pixel comes to a predetermined color, or so that the longer the gaze time, the darker the color.
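The three steps just described (record gaze positions in time series, accumulate gaze time per position, map gaze time to a pixel value) can be sketched as below. This is a simplified illustration under stated assumptions: each sample is assumed to hold its position until the next sample, gaze time is accumulated per single pixel with no smoothing, and a single 0 to 255 intensity channel stands in for the color representation. The function names and sample format are hypothetical.

```python
from typing import List, Tuple

# (timestamp in seconds, x, y), with x and y in image pixel coordinates.
GazeSample = Tuple[float, int, int]


def dwell_time_map(samples: List[GazeSample], width: int, height: int) -> List[List[float]]:
    """Accumulate gaze time per pixel from time-ordered gaze samples."""
    dwell = [[0.0] * width for _ in range(height)]
    # Assumption: a sample's position is held until the next sample arrives,
    # so the final sample contributes no time.
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        if 0 <= x < width and 0 <= y < height:
            dwell[y][x] += t1 - t0
    return dwell


def to_pixel_values(dwell: List[List[float]]) -> List[List[int]]:
    """Map gaze time to 0-255 so that a longer gaze gives a stronger value."""
    longest = max(max(row) for row in dwell)
    if longest == 0.0:
        return [[0 for _ in row] for row in dwell]
    return [[round(255 * value / longest) for value in row] for row in dwell]


# Example: the evaluator looks at pixel (1, 1) for 1.0 s, then (2, 0) for 0.25 s.
samples = [(0.0, 1, 1), (0.5, 1, 1), (1.0, 2, 0), (1.25, 2, 0)]
heatmap = to_pixel_values(dwell_time_map(samples, width=3, height=2))
```

A practical tool would usually spread each gaze sample over neighboring pixels (for example with a Gaussian kernel) and apply a color map; both are omitted here to keep the mapping from gaze time to pixel value visible.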

In the present embodiment, the teacher gaze point image acquisition unit 101 identifies, among the evaluator's lines of sight detected by the line-of-sight detection device 20, those directed at the screen on which the teacher captured image is displayed, and acquires the teacher gaze point image based on the identified lines of sight. The teacher captured image may be displayed on the entire screen or on only part of it. The position of the area of the screen of the display unit 15 in which the teacher captured image is displayed is assumed to be recorded in the data storage unit 100 in advance by initial setting. Among the lines of sight detected by the line-of-sight detection device 20, the teacher gaze point image acquisition unit 101 acquires the teacher gaze point image based on the lines of sight directed into the area where the teacher captured image is displayed (lines of sight whose gaze point is inside the area), and excludes information on lines of sight directed outside the area (lines of sight whose gaze point is outside the area) from the teacher gaze point image. Note that the teacher gaze point image may include a small number of lines of sight directed off the screen.
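The exclusion of gaze points that fall outside the display area of the teacher captured image can be sketched as a simple rectangle test. This is an illustrative assumption: the `Region` type, its fields, and the conversion to area-local coordinates are hypothetical, and the area is assumed to be an axis-aligned rectangle given in screen pixels.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    """Display area of the teacher captured image, in screen pixels (hypothetical)."""
    left: int
    top: int
    width: int
    height: int


def gaze_inside_region(points: List[Tuple[int, int]], region: Region) -> List[Tuple[int, int]]:
    """Keep only gaze points inside the region, converted to region-local coordinates."""
    inside = []
    for x, y in points:
        local_x, local_y = x - region.left, y - region.top
        if 0 <= local_x < region.width and 0 <= local_y < region.height:
            inside.append((local_x, local_y))
    return inside


# Example: a 200 x 100 image area whose top-left corner is at screen position (100, 50).
region = Region(left=100, top=50, width=200, height=100)
screen_points = [(150, 60), (99, 60), (299, 149), (300, 149), (150, 150)]
kept = gaze_inside_region(screen_points, region)
```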

[First teacher data acquisition unit]
The first teacher data acquisition unit 102 is realized mainly by the control unit 11. It acquires the first teacher data D1. In the present embodiment, the first teacher data D1 is stored in the data storage unit 100, so the first teacher data acquisition unit 102 refers to the data storage unit 100 and acquires the first teacher data D1. When the first teacher data D1 is stored in a computer other than the learning terminal 10 or in an external information storage medium, the first teacher data acquisition unit 102 acquires the first teacher data D1 stored in that computer or external information storage medium.

[First learning unit]
The first learning unit 103 is realized mainly by the control unit 11. It trains the gaze point image output model M1 based on the first teacher data D1. A known machine learning technique may be used as the training method itself, for example a training technique for a convolutional neural network or a recurrent neural network. The same applies to the second learning unit 107 described later.

The first learning unit 103 adjusts the parameters of the gaze point image output model M1 so that the input-output relationship indicated by the first teacher data D1 is obtained. For example, it converts each teacher captured image and teacher gaze point image in the first teacher data D1 into feature amounts, and adjusts the parameters of the gaze point image output model M1 so that, when the feature amount of a teacher captured image is input, the feature amount of the corresponding teacher gaze point image is output. The feature amounts of the teacher captured images and teacher gaze point images may be calculated in advance and stored in the first teacher data D1. In that case, the first learning unit 103 need not calculate the feature amounts at training time.
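The idea of adjusting parameters until the model reproduces the input-output pairs of the teacher data can be illustrated with a deliberately tiny stand-in. The sketch below is not the embodiment's model: it replaces the neural network with a linear map trained by per-sample gradient descent, and the feature vectors, learning rate, and epoch count are made-up values chosen only so the example runs.

```python
from typing import List, Tuple

Vector = List[float]
Pair = Tuple[Vector, Vector]  # (captured image features, gaze point image features)


def predict(weights: List[Vector], features: Vector) -> Vector:
    """Linear stand-in for the model: output[i] = sum_j weights[i][j] * features[j]."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def mean_squared_error(weights: List[Vector], pairs: List[Pair]) -> float:
    """Average squared difference between predicted and teacher output features."""
    total, count = 0.0, 0
    for inputs, targets in pairs:
        for p, t in zip(predict(weights, inputs), targets):
            total += (p - t) ** 2
            count += 1
    return total / count


def train(pairs: List[Pair], n_out: int, n_in: int,
          lr: float = 0.1, epochs: int = 200) -> List[Vector]:
    """Adjust the weights so that each teacher input maps to its teacher output."""
    weights = [[0.0] * n_in for _ in range(n_out)]
    for _ in range(epochs):
        for inputs, targets in pairs:
            outputs = predict(weights, inputs)
            for i in range(n_out):
                error = outputs[i] - targets[i]
                for j in range(n_in):
                    # Gradient step on the squared error of this sample.
                    weights[i][j] -= lr * error * inputs[j]
    return weights


# Made-up feature pairs that are consistent with weights [0.5, 0.2].
pairs = [([1.0, 0.0], [0.5]), ([0.0, 1.0], [0.2]), ([1.0, 1.0], [0.7])]
weights = train(pairs, n_out=1, n_in=2)
```

When the feature amounts are precomputed and stored in the teacher data, `pairs` can be read directly from storage and no feature extraction is needed inside the training loop.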

[First input unit]
The first input unit 104 is realized mainly by the control unit 11. It inputs an input captured image to the gaze point image output model M1.

An input captured image is a captured image that is input to the gaze point image output model M1. In other words, it is a captured image of a face that an evaluator (for example, an inexperienced evaluator) is to evaluate. The captured image I3 shown in FIGS. 4 and 5 corresponds to an input captured image. In the present embodiment, the case where the input captured image is not stored in the first teacher data D1 or the second teacher data D2 is described, but a teacher captured image stored in the first teacher data D1 or the second teacher data D2 may be used as the input captured image.

An input captured image shows an evaluation target at a construction site. In the present embodiment, the face corresponds to the evaluation target, so the input captured image is an image of a face at the same construction site as the face in a teacher captured image or at another construction site. The input captured image may show a face at the same construction site as the face in a teacher captured image (for example, the face after excavation has advanced beyond the face shown in the teacher captured image), or may show an entirely different face at another construction site in an entirely different location.

As with the teacher captured image, the extension, format, size, resolution, and number of bits of the input captured image may be arbitrary. In the present embodiment, the face is captured under the same imaging conditions in the teacher captured image and the input captured image, and the two have the same extension, format, size, resolution, and number of bits, but these may differ. The positional relationship between the face and the camera (the direction from which the face is captured) is also assumed to be the same for the input captured image and the teacher captured image, but it may differ slightly between them.

In the present embodiment, the case is described where the input captured image is stored in the data storage unit 100 and the first input unit 104 acquires it from there, but the input captured image can be acquired by any method. For example, the first input unit 104 may acquire the input captured image from a computer other than the learning terminal 10 or from an external information storage medium. As another example, the first input unit 104 may acquire the input captured image directly from a camera.

The first input unit 104 inputs the acquired input captured image to the gaze point image output model M1. In the present embodiment, the case is described where the algorithm for calculating the feature amount of the input captured image is built into the gaze point image output model M1, but the algorithm may be prepared separately from the model. In that case, the first input unit 104 may input the input captured image to the feature-calculation algorithm and then input the feature amount output by that algorithm to the gaze point image output model M1.

[Output gaze point image acquisition unit]
The output gaze point image acquisition unit 105 is realized mainly by the control unit 11. It acquires the output gaze point image output from the gaze point image output model M1.

An output gaze point image is a gaze point image output from the gaze point image output model M1. In other words, it indicates the portions that an expert is estimated to gaze at, assuming that the expert evaluates the input captured image. In an output gaze point image, the gaze points (the degree of gaze) are expressed by color. The gaze point image I4 shown in FIGS. 4 and 5 corresponds to an output gaze point image. In the present embodiment, the case is described where the output gaze point image is not stored in the first teacher data D1 or the second teacher data D2, but, as in a modification described later, it may be stored in the first teacher data D1 and the second teacher data D2 and used as teacher data.

In the present embodiment, the face corresponds to the evaluation target and the input captured image shows an evaluation target at a construction site, so the output gaze point image indicates the portions to look at when evaluating the face shown in the input captured image. As with the teacher gaze point image, the extension, format, size, resolution, and number of bits of the output gaze point image may be arbitrary. In the present embodiment, the teacher gaze point image and the output gaze point image have the same extension, format, size, resolution, and number of bits, but these may differ.

[Second teacher data acquisition unit]
The second teacher data acquisition unit 106 is realized mainly by the control unit 11. It acquires the second teacher data D2. In the present embodiment, the second teacher data D2 is stored in the data storage unit 100, so the second teacher data acquisition unit 106 refers to the data storage unit 100 and acquires the second teacher data D2. When the second teacher data D2 is stored in a computer other than the learning terminal 10 or in an external information storage medium, the second teacher data acquisition unit 106 acquires the second teacher data D2 stored in that computer or external information storage medium.

［第２学習部］
第２学習部１０７は、制御部１１を主として実現される。第２学習部１０７は、第２教師データＤ２に基づいて、評価結果出力モデルＭ２を学習させる。第２学習部１０７は、第２教師データＤ２が示す入力と出力の関係が得られるように、評価結果出力モデルＭ２のパラメータを調整する。例えば、第２学習部１０７は、第２教師データＤ２の教師撮影画像と教師注視点画像の各々を特徴量化し、教師撮影画像と教師注視点画像の各々の特徴量を入力した場合に、教師評価結果が出力されるように、評価結果出力モデルＭ２のパラメータを調整する。なお、第２教師データＤ２には、教師撮影画像と教師注視点画像の各々の特徴量が予め計算されて格納されていてもよい。この場合には、第２学習部１０７は、学習時に特徴量を計算しなくてよい。 [Second learning unit]
The second learning unit 107 is implemented mainly by the control unit 11. The second learning unit 107 trains the evaluation result output model M2 based on the second teacher data D2. The second learning unit 107 adjusts the parameters of the evaluation result output model M2 so as to obtain the relationship between the input and the output indicated by the second teacher data D2. For example, the second learning unit 107 converts each of the teacher captured image and the teacher gaze point image of the second teacher data D2 into feature amounts, and adjusts the parameters of the evaluation result output model M2 so that the teacher evaluation result is output when the feature amounts of the teacher captured image and the teacher gaze point image are input. Note that the feature amounts of the teacher captured image and the teacher gaze point image may be calculated in advance and stored in the second teacher data D2. In this case, the second learning unit 107 does not need to calculate feature amounts during learning.

［入力注視点画像取得部］
入力注視点画像取得部１０８は、制御部１１を主として実現される。入力注視点画像取得部１０８は、入力注視点画像を取得する。本実施形態では、入力注視点画像取得部１０８は、学習済みの注視点画像出力モデルＭ１に対し、入力撮影画像が入力された場合に出力される出力注視点画像を、入力注視点画像として取得する。 [Input gaze point image acquisition unit]
The input gaze point image acquisition unit 108 is realized mainly by the control unit 11. The input gaze point image acquisition unit 108 acquires an input gaze point image. In this embodiment, the input gaze point image acquisition unit 108 acquires, as the input gaze point image, the output gaze point image that is output when an input captured image is input to the trained point-of-regard image output model M1.

入力注視点画像は、注視点画像出力モデルＭ１に入力される注視点画像である。別の言い方をすれば、入力注視点画像は、入力撮影画像の中で見るべき部分を示す。図４－図５に示した注視点画像Ｉ４は、入力注視点画像に相当する。本実施形態では、切羽が評価対象に相当し、入力撮影画像には、工事現場における切羽が撮影されているので、入力注視点画像は、入力撮影画像に示された切羽の評価時に見るべき部分を示す。 The input gazing point image is the gazing point image input to the gazing point image output model M1. In other words, the input point-of-regard image indicates the part to be viewed in the input captured image. The point-of-regard image I4 shown in FIGS. 4 and 5 corresponds to the input point-of-regard image. In the present embodiment, the face corresponds to the evaluation target, and the input captured image shows the face at the construction site, so the input point-of-regard image indicates the part that should be looked at when evaluating the face shown in the input captured image.

本実施形態では、出力注視点画像が入力注視点画像に相当し、第２入力部１０９が、注視点画像出力モデルＭ１から出力された出力注視点画像を、入力注視点画像として取得する場合を説明するが、入力注視点画像は、任意の方法で取得可能である。例えば、注視点画像出力モデルＭ１を利用しない場合には、第２入力部１０９は、操作部１４からの操作に基づいて入力注視点画像を取得してもよいし、学習端末１０以外の他のコンピュータ又は外部情報記憶媒体から入力注視点画像を取得してもよい。 This embodiment describes the case where the output gaze point image corresponds to the input gaze point image and the second input unit 109 acquires the output gaze point image output from the gaze point image output model M1 as the input gaze point image; however, the input gaze point image can be acquired by any method. For example, when the point-of-regard image output model M1 is not used, the second input unit 109 may acquire the input point-of-regard image based on an operation on the operation unit 14, or may acquire it from a computer other than the learning terminal 10 or from an external information storage medium.

［第２入力部］
第２入力部１０９は、制御部１１を主として実現される。第２入力部１０９は、評価結果出力モデルＭ２に対し、入力撮影画像と、入力撮影画像に対応する入力注視点画像と、を入力する。入力撮影画像については、第１入力部１０４の説明で記載した通りである。入力注視点画像は、任意の方法によって取得可能であり、本実施形態では、後述する入力注視点画像取得部１０８により入力注視点画像が取得されるので、第２入力部１０９は、評価結果出力モデルに対し、入力撮影画像と、入力注視点画像取得部１０８により取得された入力注視点画像と、を入力する。 [Second input unit]
The second input unit 109 is implemented mainly by the control unit 11. The second input unit 109 inputs an input captured image and the input gaze point image corresponding to that input captured image to the evaluation result output model M2. The input captured image is as described for the first input unit 104. The input gaze point image can be acquired by any method. In this embodiment, the input gaze point image is acquired by the input gaze point image acquisition unit 108, which will be described later, so the second input unit 109 inputs the input captured image and the input gaze point image acquired by the input gaze point image acquisition unit 108 to the evaluation result output model.

第２入力部１０９は、評価結果出力モデルＭ２に対し、入力撮影画像と入力注視点画像を入力する。本実施形態では、入力撮影画像と入力注視点画像の各々の特徴量を計算するアルゴリズムが評価結果出力モデルＭ２に組み込まれている場合を説明するが、当該アルゴリズムは、評価結果出力モデルＭ２とは別に用意されていてもよい。この場合、第２入力部１０９は、特徴量を計算するアルゴリズムに対し、入力撮影画像と入力注視点画像の各々を入力し、当該アルゴリズムから出力された特徴量を評価結果出力モデルＭ２に入力すればよい。 The second input unit 109 inputs the input captured image and the input gaze point image to the evaluation result output model M2. This embodiment describes the case where an algorithm for calculating the feature amounts of the input captured image and the input gaze point image is incorporated in the evaluation result output model M2, but the algorithm may be prepared separately from the evaluation result output model M2. In that case, the second input unit 109 inputs the input captured image and the input gaze point image to the feature amount calculation algorithm, and inputs the feature amounts output from the algorithm to the evaluation result output model M2.

［出力評価結果取得部］
出力評価結果取得部１１０は、制御部１１を主として実現される。出力評価結果取得部１１０は、学習済みの評価結果出力モデルに対し、入力撮影画像と、入力注視点画像と、が入力された場合に出力される、入力撮影画像に示された切羽の出力評価結果を取得する。 [Output evaluation result acquisition unit]
The output evaluation result acquisition unit 110 is realized mainly by the control unit 11. The output evaluation result acquisition unit 110 acquires the output evaluation result for the face shown in the input captured image, which is output when the input captured image and the input gaze point image are input to the trained evaluation result output model.

出力評価結果は、評価結果出力モデルＭ２から出力される評価結果である。別の言い方をすれば、出力評価結果は、熟練者が入力撮影画像を評価したと仮定した場合の評価結果と推測された内容を示す。図５に示した切羽観察簿Ｂ２は、出力評価結果に相当する。出力評価結果は、評価項目ごとに、入力撮影画像と入力注視点画像に対応する数値を示す。当該数値は、入力撮影画像と入力注視点画像の分類結果ということができ、分類学習器におけるラベルに相当する。 The output evaluation result is the evaluation result output from the evaluation result output model M2. In other words, the output evaluation result indicates what the evaluation result is presumed to be, assuming that an expert evaluated the input captured image. The face observation book B2 shown in FIG. 5 corresponds to the output evaluation result. For each evaluation item, the output evaluation result indicates a numerical value corresponding to the input captured image and the input gaze point image. This numerical value can be regarded as the classification result for the input captured image and the input gaze point image, and corresponds to a label in a classifier.

なお、本実施形態では、出力評価結果が表示部１５に表示される場合を説明するが、出力評価結果は、任意の用途で利用されてよい。例えば、出力評価結果は、プリンタから印刷されてもよいし、電子メール等に添付された送信されてもよい。他にも例えば、出力評価結果を示すファイルが学習端末１０又は他のコンピュータに記録されてもよい。 In this embodiment, the output evaluation result is displayed on the display unit 15, but the output evaluation result may be used for any purpose. For example, the output evaluation result may be printed from a printer, or may be sent as an attachment to e-mail or the like. Alternatively, for example, a file showing output evaluation results may be recorded in the learning terminal 10 or another computer.

［４．本実施形態において実行される処理］
次に、評価支援システムＳで実行される処理を説明する。ここでは、注視点画像出力モデルＭ１と評価結果出力モデルＭ２の各々を学習させるための学習処理と、これらのモデルを利用して評価者の評価業務を支援するための評価支援処理と、について説明する。以降説明する処理は、制御部１１が記憶部１２に記憶されたプログラムに従って動作することによって実行される。また、以降説明する処理は、図６に示す機能ブロックにより実行される処理の一例である。 [4. Processing executed in the present embodiment]
Next, processing executed by the evaluation support system S will be described. Here, a learning process for training each of the point-of-regard image output model M1 and the evaluation result output model M2, and an evaluation support process for supporting the evaluator's evaluation work using these models, will be described. The processes described below are executed by the control unit 11 operating according to the program stored in the storage unit 12. The processes described below are also an example of the processing executed by the functional blocks shown in FIG. 6.

［４－１．学習処理］
図９は、学習処理を示すフロー図である。図９に示すように、まず、制御部１１は、操作部１４の検出信号に基づいて、熟練者に評価させる教師撮影画像を表示部１５に表示させる（Ｓ１００）。Ｓ１００においては、制御部１１は、記憶部１２に記憶された教師撮影画像のうち、熟練者が操作部１４を操作して選択した教師撮影画像を、表示部１５に表示させる。Ｓ１００において表示される教師撮影画像は、対応する教師注視点画像が作成されていない画像であり、第１教師データＤ１にまだ格納されていない教師撮影画像である。 [4-1. learning process]
FIG. 9 is a flow diagram showing the learning process. As shown in FIG. 9, first, the control unit 11 causes the display unit 15 to display a teacher captured image for the expert to evaluate, based on the detection signal from the operation unit 14 (S100). In S100, the control unit 11 causes the display unit 15 to display the teacher captured image that the expert selected, by operating the operation unit 14, from among the teacher captured images stored in the storage unit 12. The teacher captured image displayed in S100 is an image for which a corresponding teacher gaze point image has not yet been created, and which has not yet been stored in the first teacher data D1.

制御部１１は、視線検出装置２０による熟練者の視線の検出結果を取得する（Ｓ１０１）。Ｓ１０１においては、制御部１１は、視線検出装置２０により検出された熟練者の視線（例えば、注視点の座標）を時系列的に記憶部１２に記録する。熟練者は、表示部１５に表示された教師撮影画像に示された切羽を評価し、切羽観察簿の評価項目の評価結果を入力する。評価結果は、操作部１４から入力されてもよいし、熟練者の手元にあるタブレット型端末等から入力されて学習端末１０に送られてもよい。他にも例えば、評価結果は、紙の切羽観察簿に記入され、事後的にスキャナで取り込まれたり、操作部１４から入力されたりしてもよい。 The control unit 11 acquires the result of the line-of-sight detection device 20 detecting the expert's line of sight (S101). In S101, the control unit 11 records the expert's line of sight detected by the line-of-sight detection device 20 (for example, the coordinates of the gaze point) in the storage unit 12 in time series. The expert evaluates the face shown in the teacher captured image displayed on the display unit 15, and inputs the evaluation results for the evaluation items of the face observation book. The evaluation results may be input from the operation unit 14, or may be input from a tablet terminal or the like at the expert's hand and sent to the learning terminal 10. Alternatively, for example, the evaluation results may be written in a paper face observation book and afterwards captured by a scanner or input from the operation unit 14.

制御部１１は、操作部１４の検出信号に基づいて、熟練者による評価が完了したか否かを判定する（Ｓ１０２）。Ｓ１０２においては、制御部１１は、操作部１４や熟練者の端末等からの入力結果に基づいて、切羽観察簿の全ての評価項目が入力されて所定の終了操作が行われたか否かを判定する。 The control unit 11 determines whether or not the evaluation by the expert has been completed, based on the detection signal from the operation unit 14 (S102). In S102, the control unit 11 determines, based on the input from the operation unit 14, the expert's terminal, or the like, whether all the evaluation items of the face observation book have been input and a predetermined end operation has been performed.

熟練者による評価が完了したと判定されない場合（Ｓ１０２；Ｎ）、熟練者による評価が終了していないので、Ｓ１０１の処理に戻る。この場合、熟練者は、引き続き切羽の評価を行い、熟練者の視線の検出結果が記録される。 If it is determined that the evaluation by the expert has not been completed (S102; N), the evaluation by the expert has not been completed, so the process returns to S101. In this case, the expert continues to evaluate the face, and the detection result of the expert's line of sight is recorded.

一方、評価結果の入力を受け付けたと判定された場合（Ｓ１０２；Ｙ）、制御部１１は、熟練者の視線の検出結果に基づいて、教師注視点画像を作成する（Ｓ１０３）。Ｓ１０３においては、制御部１１は、熟練者が評価を開始してから終了するまでの間における視線の検出結果に基づいて、教師注視点画像を作成する。制御部１１は、熟練者が注視した時間が長いほど色が濃くなるように、教師注視点画像を作成する。 On the other hand, when it is determined that the input of the evaluation result has been received (S102; Y), the control unit 11 creates a teacher gaze point image based on the result of detecting the line of sight of the expert (S103). In S103, the control unit 11 creates a teacher gaze point image based on the detection results of the line of sight from the start of the evaluation by the expert until the end of the evaluation. The control unit 11 creates a teacher gaze point image such that the longer the expert gazes, the darker the color becomes.
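The creation of a teacher gaze point image in S103 can be pictured with a small sketch. The description only states that the color becomes darker the longer the expert gazed; the dwell-time accumulation, the normalization to the range 0 to 1, and the names `GazeSample` and `build_gaze_heatmap` are assumptions made for illustration, not the patented implementation.

```python
from typing import List, Tuple

# Hypothetical sample format: (timestamp in seconds, x pixel, y pixel).
GazeSample = Tuple[float, int, int]


def build_gaze_heatmap(samples: List[GazeSample], width: int, height: int) -> List[List[float]]:
    """Accumulate dwell time per pixel and scale it so the longest-watched pixel is 1.0."""
    dwell = [[0.0] * width for _ in range(height)]
    # Assumption: the time until the next sample is attributed to the current gaze position.
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        if 0 <= x < width and 0 <= y < height:
            dwell[y][x] += t1 - t0
    peak = max(max(row) for row in dwell)
    if peak == 0.0:
        return dwell  # no usable gaze data; every intensity stays 0
    return [[value / peak for value in row] for row in dwell]


samples = [(0.0, 1, 1), (0.5, 1, 1), (1.5, 2, 0), (2.0, 2, 0)]
heatmap = build_gaze_heatmap(samples, width=3, height=2)
```

A higher value in `heatmap` would be rendered as a darker color. A practical system would probably also blur the map around each gaze point, which this sketch omits.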

制御部１１は、Ｓ１００で表示させた教師撮影画像と、Ｓ１０３で作成した教師注視点画像と、のペアを第１教師データＤ１に格納する（Ｓ１０４）。制御部１１は、Ｓ１００で表示させた教師撮影画像及びＳ１０３で作成した教師注視点画像と、Ｓ１０２で完了した評価である教師評価結果と、のペアを第２教師データＤ２に格納する（Ｓ１０５）。Ｓ１０４においては、第１教師データＤ１が作成され、Ｓ１０５においては、第２教師データＤ２が作成されることになる。 The control unit 11 stores the pair of the teacher captured image displayed in S100 and the teacher gaze point image created in S103 in the first teacher data D1 (S104). The control unit 11 stores a pair consisting of the teacher captured image displayed in S100 and the teacher gaze point image created in S103 on one side, and the teacher evaluation result, which is the evaluation completed in S102, on the other side, in the second teacher data D2 (S105). In S104, the first teacher data D1 is created, and in S105, the second teacher data D2 is created.
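As a rough illustration of S104 and S105, the two sets of teacher data could be held as simple records. The class names, the use of file names to identify images, and the evaluation item names below are hypothetical; the description does not specify a storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FirstTeacherRecord:
    """One entry of D1: captured image (input) paired with gaze point image (output)."""
    captured_image: str
    gaze_image: str


@dataclass
class SecondTeacherRecord:
    """One entry of D2: captured image and gaze point image (input) paired with the evaluation (output)."""
    captured_image: str
    gaze_image: str
    evaluation: Dict[str, int]  # evaluation item -> score; item names are made up here


@dataclass
class TeacherStore:
    d1: List[FirstTeacherRecord] = field(default_factory=list)
    d2: List[SecondTeacherRecord] = field(default_factory=list)

    def store(self, captured_image: str, gaze_image: str, evaluation: Dict[str, int]) -> None:
        # S104: add the image pair to the first teacher data.
        self.d1.append(FirstTeacherRecord(captured_image, gaze_image))
        # S105: add the same images together with the expert's evaluation to the second teacher data.
        self.d2.append(SecondTeacherRecord(captured_image, gaze_image, dict(evaluation)))


store = TeacherStore()
store.store("face_001.png", "gaze_001.png", {"item_a": 2, "item_b": 1})
```

Both records are written from a single evaluation session, which mirrors how one expert evaluation feeds both models.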

制御部１１は、操作部１４の検出信号に基づいて、学習処理を実行するか否かを判定する（Ｓ１０６）。学習処理は、任意のタイミングで実行されてよく、例えば、操作部１４から所定の操作が行われた場合に実行される。なお、学習処理は、予め定められた時間が到来した場合に実行されてもよいし、熟練者による評価が行われるたびに実行されてもよい。他にも例えば、第１教師データＤ１及び第２教師データＤ２の各々に対し、新しいデータが一定数以上追加された場合に実行されてもよい。 The control unit 11 determines whether or not to execute the learning process based on the detection signal from the operation unit 14 (S106). The learning process may be executed at any timing, for example, when a predetermined operation is performed from the operation unit 14. Note that the learning process may be executed when a predetermined time has passed, or may be executed each time an expert evaluates. Alternatively, for example, it may be executed when a certain number or more of new data are added to each of the first teacher data D1 and the second teacher data D2.
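The decision in S106 can be summarized as a predicate over the triggers listed above. This is a sketch only: the `TriggerState` fields and the default threshold of 10 new records are illustrative assumptions, and running after every expert evaluation would correspond to a threshold of 1.

```python
from dataclasses import dataclass


@dataclass
class TriggerState:
    operation_requested: bool = False      # a predetermined operation on the operation unit
    scheduled_time_reached: bool = False   # a predetermined time has arrived
    new_records_d1: int = 0                # records added to D1 since the last training run
    new_records_d2: int = 0                # records added to D2 since the last training run


def should_run_learning(state: TriggerState, min_new_records: int = 10) -> bool:
    """Return True when any of the described triggers for the learning process holds."""
    if state.operation_requested or state.scheduled_time_reached:
        return True
    # The description requires the new data to have been added to each of D1 and D2.
    return (state.new_records_d1 >= min_new_records
            and state.new_records_d2 >= min_new_records)


run_now = should_run_learning(TriggerState(new_records_d1=12, new_records_d2=12))
wait = should_run_learning(TriggerState(new_records_d1=12, new_records_d2=3))
```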

学習処理を実行すると判定されない場合（Ｓ１０６；Ｎ）、本処理は終了する。この場合、再びＳ１の処理から実行され、第１教師データＤ１及び第２教師データＤ２の各々に対し、新しいデータが追加されてもよい。また、以降のＳ１０７及びＳ１０８の処理は、任意のタイミングで実行可能であり、第１教師データＤ１及び第２教師データＤ２の各々に対し、新しいデータが追加された後でなくてもよい。 If it is determined not to execute the learning process (S106; N), this process ends. In this case, the processing may be repeated from S1, and new data may be added to each of the first teacher data D1 and the second teacher data D2. Further, the subsequent processing of S107 and S108 can be executed at any timing, and does not have to be after new data is added to each of the first teacher data D1 and the second teacher data D2.

一方、学習処理を実行すると判定された場合（Ｓ１０６；Ｙ）、制御部１１は、第１教師データＤ１に基づいて、注視点画像出力モデルＭ１の学習処理を実行する（Ｓ１０７）。Ｓ１０７においては、制御部１１は、公知の学習アルゴリズムに基づいて、第１教師データＤ１が示す入力と出力の関係が得られるように、注視点画像出力モデルＭ１のパラメータを調整する。 On the other hand, when it is determined to execute the learning process (S106; Y), the control unit 11 executes the learning process of the point-of-regard image output model M1 based on the first teacher data D1 (S107). In S107, the control unit 11 adjusts the parameters of the point-of-regard image output model M1 based on a known learning algorithm so as to obtain the relationship between the input and the output indicated by the first teacher data D1.

制御部１１は、第２教師データＤ２に基づいて、評価結果出力モデルＭ２の学習処理を実行し（Ｓ１０８）、本処理は終了する。Ｓ１０８においては、制御部１１は、公知の学習アルゴリズムに基づいて、第２教師データＤ２が示す入力と出力の関係が得られるように、評価結果出力モデルＭ２のパラメータを調整する。 The control unit 11 executes the learning process of the evaluation result output model M2 based on the second teacher data D2 (S108), and ends this process. In S108, the control unit 11 adjusts the parameters of the evaluation result output model M2 based on a known learning algorithm so as to obtain the relationship between the input and the output indicated by the second teacher data D2.

［４－２．評価支援処理］
図１０は、評価支援処理を示すフロー図である。評価支援処理は、学習処理が実行された後に実行される。図１０に示すように、まず、制御部１１は、入力撮影画像を取得する（Ｓ２００）。Ｓ２００においては、記憶部１２に記憶された入力撮影画像のうち、評価者が操作部１４を操作して選択した入力撮影画像を取得する。 [4-2. Evaluation support processing]
FIG. 10 is a flowchart showing the evaluation support process. The evaluation support process is executed after the learning process has been executed. As shown in FIG. 10, first, the control unit 11 acquires an input captured image (S200). In S200, the control unit 11 acquires the input captured image that the evaluator selected, by operating the operation unit 14, from among the input captured images stored in the storage unit 12.

制御部１１は、注視点画像出力モデルＭ１に対し、Ｓ２００で取得した入力撮影画像を入力し（Ｓ２０１）、注視点画像出力モデルＭ１から出力された出力注視点画像を、入力注視点画像として取得する（Ｓ２０２）。Ｓ２０１において入力撮影画像が入力されると、注視点画像出力モデルＭ１は、入力撮影画像の特徴量を計算する。注視点画像出力モデルＭ１は、計算した特徴量に基づいて、出力注視点画像を出力する。 The control unit 11 inputs the input captured image acquired in S200 to the point-of-regard image output model M1 (S201), and acquires the output point-of-regard image output from the point-of-regard image output model M1 as the input point-of-regard image. (S202). When the input captured image is input in S201, the point-of-regard image output model M1 calculates the feature amount of the input captured image. The point-of-regard image output model M1 outputs an output point-of-regard image based on the calculated feature amount.

制御部１１は、評価結果出力モデルＭ２に対し、Ｓ２００で取得した入力撮影画像と、Ｓ２０２で取得した入力注視点画像と、を入力し（Ｓ２０３）、評価結果出力モデルＭ２から出力された、出力評価結果を取得する（Ｓ２０４）。Ｓ２０３において入力撮影画像と入力注視点画像が入力されると、評価結果出力モデルＭ２は、入力撮影画像と入力注視点画像の各々の特徴量を計算する。評価結果出力モデルＭ２は、これらの特徴量に基づいて、出力評価結果を出力する。 The control unit 11 inputs the input captured image acquired in S200 and the input gaze point image acquired in S202 to the evaluation result output model M2 (S203), and outputs the output from the evaluation result output model M2. An evaluation result is acquired (S204). When the input photographed image and the input point-of-regard image are input in S203, the evaluation result output model M2 calculates the feature amount of each of the input photographed image and the input point-of-regard image. The evaluation result output model M2 outputs output evaluation results based on these feature amounts.

制御部１１は、Ｓ２０４で取得した出力評価結果を表示部１５に表示させ（Ｓ２０５）、本処理は終了する。評価者は、表示部１５に表示された出力評価結果を評価の参考にしたり、表示部１５に表示された出力評価結果を印刷したりして、その日の切羽の評価業務を行う。 The control unit 11 causes the display unit 15 to display the output evaluation result obtained in S204 (S205), and the process ends. The evaluator uses the output evaluation result displayed on the display unit 15 as a reference for evaluation, prints the output evaluation result displayed on the display unit 15, and performs evaluation work for the face of the day.
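The flow from S200 to S204 chains the two models: the output of M1 is reused as an input to M2. The sketch below shows only that data flow. The functions `toy_m1` and `toy_m2` are trivial stand-ins invented for illustration, not trained models, and the item name `item_a` is hypothetical.

```python
from typing import Callable, Dict, List

Image = List[List[float]]
GazeModel = Callable[[Image], Image]                   # stands in for M1
EvalModel = Callable[[Image, Image], Dict[str, int]]   # stands in for M2


def run_evaluation_support(image: Image, m1: GazeModel, m2: EvalModel) -> Dict[str, int]:
    output_gaze = m1(image)        # S201: input the captured image to M1
    input_gaze = output_gaze       # S202: treat M1's output as the input gaze point image
    return m2(image, input_gaze)   # S203-S204: input both images to M2 and get the evaluation


def toy_m1(image: Image) -> Image:
    # Stand-in: mark the brightest pixel as the place to look at.
    peak = max(max(row) for row in image)
    return [[1.0 if value == peak else 0.0 for value in row] for row in image]


def toy_m2(image: Image, gaze: Image) -> Dict[str, int]:
    # Stand-in: score one item from the image content under the gaze mask.
    weighted = sum(v * g for img_row, gaze_row in zip(image, gaze)
                   for v, g in zip(img_row, gaze_row))
    return {"item_a": 1 if weighted > 0.5 else 0}


result = run_evaluation_support([[0.1, 0.9], [0.2, 0.3]], toy_m1, toy_m2)
```

Passing the models in as callables keeps the pipeline independent of how M1 and M2 are actually implemented.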

評価支援システムＳの第１の構成によれば、第１教師データＤ１に基づいて学習された注視点画像出力モデルＭ１に対し、入力撮影画像を入力して出力注視点画像を取得することによって、評価者の業務を支援することができる。例えば、出力注視点画像を入力注視点画像として評価結果出力モデルＭ２に入力して出力評価結果を得ることにより、評価者は、熟練者であればこのような評価結果になるといったヒントを得たり、出力評価結果をそのままその日の評価結果として利用したりすることができる。 According to the first configuration of the evaluation support system S, the evaluator's work can be supported by inputting an input captured image to the point-of-regard image output model M1 trained based on the first teacher data D1 and obtaining an output point-of-regard image. For example, by inputting the output point-of-regard image as the input point-of-regard image to the evaluation result output model M2 and obtaining the output evaluation result, the evaluator can obtain a hint as to what evaluation result an expert would reach, or can use the output evaluation result as it is as that day's evaluation result.

また、評価支援システムＳは、教師撮影画像と、視線検出装置２０により取得された教師注視点画像と、の関係を注視点画像出力モデルＭ１に学習させることによって、熟練者の視線の検出結果を注視点画像出力モデルＭ１に学習させ、注視点画像出力モデルＭ１の精度を高めることができる。 In addition, the evaluation support system S makes the point-of-regard image output model M1 learn the relationship between the teacher's photographed image and the teacher's point-of-regard image acquired by the line-of-sight detection device 20, thereby obtaining the detection result of the expert's line of sight. The accuracy of the point-of-regard image output model M1 can be improved by making the point-of-regard image output model M1 learn.

また、評価支援システムＳは、視線検出装置２０により検出された評価者の視線のうち、教師撮影画像が表示された画面上への視線を特定して教師注視点画像を取得することによって、切羽の評価に関係のない視線を排除し、注視点画像出力モデルＭ１の精度を効果的に高めることができる。 In addition, the evaluation support system S identifies the line of sight of the evaluator detected by the line of sight detection device 20 toward the screen on which the teacher-captured image is displayed, and acquires the teacher gaze point image. It is possible to eliminate the line of sight irrelevant to the evaluation of , effectively increasing the accuracy of the point-of-regard image output model M1.
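Restricting the detected line of sight to the screen area where the teacher captured image is displayed could look like the following sketch. It assumes gaze samples are reported in screen coordinates and that the displayed image occupies a known rectangle; the function name and the tuple layouts are hypothetical.

```python
from typing import List, Tuple

ScreenSample = Tuple[float, float, float]  # (timestamp, screen x, screen y)
ImageRect = Tuple[float, float, float, float]  # (left, top, width, height) of the displayed image


def keep_on_image_samples(samples: List[ScreenSample], image_rect: ImageRect) -> List[ScreenSample]:
    """Drop gaze samples outside the displayed image and convert the rest to image coordinates."""
    left, top, width, height = image_rect
    kept = []
    for t, sx, sy in samples:
        x, y = sx - left, sy - top
        if 0 <= x < width and 0 <= y < height:
            kept.append((t, x, y))
    return kept


raw = [(0.0, 150, 120), (0.1, 20, 30), (0.2, 499, 399), (0.3, 500, 400)]
on_image = keep_on_image_samples(raw, image_rect=(100, 100, 400, 300))
```

Samples that fall on other parts of the screen, or off the screen entirely, are discarded before the teacher gaze point image is built.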

また、評価支援システムＳは、学習済みの評価結果出力モデルＭ２に対し、入力撮影画像と出力注視点画像とを入力し、入力撮影画像に示された切羽の出力評価結果を取得することによって、熟練者であればこのような評価結果になるといった情報が出力され、評価者の業務を効果的に支援することができる。 In addition, the evaluation support system S inputs the input captured image and the output gaze point image to the learned evaluation result output model M2, and obtains the output evaluation result of the face shown in the input captured image. Information indicating such an evaluation result for an expert is output, and the work of the evaluator can be effectively supported.

また、評価支援システムＳは、教師撮影画像、入力撮影画像、教師注視点画像、及び出力注視点画像の各々を互いに同じサイズとすることによって、注視点画像出力モデルＭ１の精度を効果的に高めることができる。 In addition, the evaluation support system S effectively increases the accuracy of the point-of-regard image output model M1 by setting the size of each of the teacher's photographed image, the input photographed image, the teacher's point-of-regard image, and the output point-of-regard image to be the same. be able to.

また、評価支援システムＳは、評価対象を切羽とすることで、トンネル工事の評価者の業務を支援することができる。 In addition, the evaluation support system S can support the work of the evaluator of the tunnel construction by setting the evaluation target to be the face.

評価支援システムＳの第２の構成によれば、第２教師データＤ２に基づいて学習された評価結果出力モデルＭ２に対し、入力撮影画像と出力注視点画像を入力して出力評価結果を取得することによって、評価者の業務を支援することができる。例えば、評価者は、熟練者であればこのような評価結果になるといったヒントを得たり、出力評価結果をそのままその日の評価結果として利用したりすることができる。 According to the second configuration of the evaluation support system S, the input photographed image and the output gaze point image are input to the evaluation result output model M2 learned based on the second teacher data D2, and the output evaluation result is obtained. By doing so, it is possible to support the work of the evaluator. For example, the evaluator can obtain a hint that an expert will have such an evaluation result, or can use the output evaluation result as it is as the evaluation result of the day.

また、評価支援システムＳは、教師撮影画像と視線検出装置２０により取得された教師注視点画像と、教師評価結果と、の関係を評価結果出力モデルＭ２に学習させることによって、熟練者の視線の検出結果を評価結果出力モデルＭ２に学習させ、評価結果出力モデルＭ２の精度を高めることができる。 In addition, the evaluation support system S causes the evaluation result output model M2 to learn the relationship between the teacher's photographed image, the teacher gaze point image acquired by the line-of-sight detection device 20, and the teacher's evaluation result. The accuracy of the evaluation result output model M2 can be improved by making the evaluation result output model M2 learn the detection result.

また、評価支援システムＳは、視線検出装置２０により検出された評価者の視線のうち、教師撮影画像が表示された画面上への視線を特定して教師注視点画像を取得することによって、切羽の評価に関係のない視線を排除し、評価結果出力モデルＭ２の精度を効果的に高めることができる。 In addition, the evaluation support system S identifies the line of sight of the evaluator detected by the line of sight detection device 20 toward the screen on which the teacher-captured image is displayed, and acquires the teacher gaze point image. It is possible to eliminate the line of sight irrelevant to the evaluation of , effectively increasing the accuracy of the evaluation result output model M2.

また、評価支援システムＳは、学習済みの注視点画像出力モデルＭ１に対し、入力撮影画像を入力して出力注視点画像を入力注視点画像として取得し、評価結果出力モデルＭ２に入力することによって、入力注視点画像を自動的に取得し、評価者の業務を効果的に支援することができる。 Further, the evaluation support system S inputs an input photographed image to the learned point-of-regard image output model M1, acquires the output gaze-point image as an input gaze-point image, and inputs it to the evaluation result output model M2. , the input gaze point image can be automatically acquired to effectively support the work of the evaluator.

また、評価支援システムＳは、教師撮影画像、入力撮影画像、教師注視点画像、及び出力注視点画像の各々を互いに同じサイズとすることによって、評価結果出力モデルＭ２の精度を効果的に高めることができる。 In addition, the evaluation support system S effectively increases the accuracy of the evaluation result output model M2 by setting the size of each of the teacher captured image, the input captured image, the teacher gaze point image, and the output gaze point image to be the same size. can be done.

［５．変形例］
なお、本発明は、以上に説明した実施の形態に限定されるものではない。本発明の趣旨を逸脱しない範囲で、適宜変更可能である。 [5. Modification]
It should be noted that the present invention is not limited to the embodiments described above. Modifications can be made as appropriate without departing from the gist of the present invention.

図１１は、変形例に係る機能ブロック図である。図１１に示すように、以降説明する変形例では、実施形態で説明した機能に加えて、特徴情報取得部１１１が実現される。特徴情報取得部１１１は、学習端末１０によって実現される場合を説明するが、サーバコンピュータ等の他のコンピュータによって実現されてもよい。 FIG. 11 is a functional block diagram according to a modification. As shown in FIG. 11, in the modifications described below, a feature information acquisition unit 111 is implemented in addition to the functions described in the embodiment. The case where the feature information acquisition unit 111 is implemented by the learning terminal 10 is described, but it may be implemented by another computer such as a server computer.

［５－１．第１の構成に係る変形例］
（１－１）まず、第１の構成に係る変形例について説明する。例えば、同じトンネル工事であったとしても、山岳トンネル工事と地下トンネル工事とで評価者が見るべき場所が変わることがある。このため、工事の特徴情報を注視点画像出力モデルＭ１に学習させ、特徴情報に応じた出力注視点画像が出力されるようにしてもよい。 [5-1. Modification of First Configuration]
(1-1) First, a modification of the first configuration will be described. For example, even if the tunnel construction is the same, the location that the evaluator should look at may change between the mountain tunnel construction and the underground tunnel construction. For this reason, the point-of-regard image output model M1 may be made to learn the feature information of the construction work, and an output point-of-regard image corresponding to the feature information may be output.

特徴情報は、工事の特徴に関する情報であり、工種ということもできる。例えば、特徴情報は、工事現場の場所、工法、地盤、天候、平均降水量、機材、材料、又は作業員といった特徴である。特徴情報には、これら複数の項目の各々の特徴が示されてもよいし、何れか１つの特徴だけが示されていてもよい。特徴情報は、各項目が数値によって示されてもよいし、記号又は文字などによって示されてもよい。 The feature information is information relating to the features of construction, and can also be called a type of construction. For example, the feature information is the location of the construction site, construction method, ground, weather, average precipitation, equipment, materials, or workers. The feature information may indicate the feature of each of the plurality of items, or may indicate only one of the features. Each item of the characteristic information may be indicated by a numerical value, or may be indicated by a symbol or a character.

本変形例の第１教師データＤ１は、工事現場における工事の特徴情報及び教師撮影画像と、教師注視点画像と、の関係を示す。即ち、第１教師データＤ１には、特徴情報及び教師撮影画像と、教師注視点画像と、のペアが格納される。特徴情報及び教師撮影画像が入力に相当し、教師注視点画像が出力に相当する。第１教師データＤ１に格納される特徴情報は、教師撮影画像が示す工事の特徴情報であり、例えば、教師撮影画像を評価する熟練者によって操作部１４等から入力されたり、工事の計画書から取得されたりする。 The first teacher data D1 of this modification indicates the relationship between, on one side, the feature information of the construction at the construction site and the teacher captured image and, on the other side, the teacher gaze point image. That is, the first teacher data D1 stores pairs of the feature information and teacher captured image with the teacher gaze point image. The feature information and the teacher captured image correspond to the input, and the teacher gaze point image corresponds to the output. The feature information stored in the first teacher data D1 is the feature information of the construction shown in the teacher captured image; for example, it is input from the operation unit 14 or the like by the expert evaluating the teacher captured image, or is obtained from a construction plan.

本変形例の評価支援システムＳは、特徴情報取得部１１１を含む。特徴情報取得部１１１は、制御部１１を主として実現される。特徴情報取得部１１１は、入力撮影画像に対応する工事の特徴情報を取得する。入力撮影画像に対応する工事の特徴情報とは、入力撮影画像が示す工事の特徴情報であり、例えば、入力撮影画像を評価する評価者によって操作部１４から入力されたり、工事の計画書から取得されたりする。 The evaluation support system S of this modification includes a feature information acquisition unit 111. The feature information acquisition unit 111 is implemented mainly by the control unit 11. The feature information acquisition unit 111 acquires the feature information of the construction corresponding to the input captured image. The feature information of the construction corresponding to the input captured image is the feature information of the construction shown in the input captured image; for example, it is input from the operation unit 14 by the evaluator evaluating the input captured image, or is obtained from a construction plan.

例えば、山岳トンネルの工事現場を担当する評価者は、山岳トンネルの入力撮影画像である旨を操作部１４から入力し、地下トンネルの工事現場を担当する評価者は、地下トンネルの入力撮影画像である旨を操作部１４から入力する。なお、特徴情報は、操作部１４から入力されるのではなく、任意の方法で取得されてよい。例えば、予め入力撮影画像に特徴情報が関連付けられていてもよいし、担当者に特徴情報を予め関連付けておいてもよい。 For example, an evaluator in charge of a mountain tunnel construction site inputs from the operation unit 14 that the input captured image is of a mountain tunnel, and an evaluator in charge of an underground tunnel construction site inputs from the operation unit 14 that the input captured image is of an underground tunnel. Note that the feature information need not be input from the operation unit 14 and may be acquired by any method. For example, the feature information may be associated with the input captured image in advance, or may be associated in advance with the person in charge.

第１入力部１０４は、注視点画像出力モデルＭ１に対し、特徴情報取得部１１１により取得された特徴情報と入力撮影画像とを入力する。注視点画像出力モデルＭ１は、特徴情報と入力撮影画像との両方を特徴量化し、出力注視点画像を出力する。特徴情報と入力撮影画像の特徴量についても、ベクトルや配列等の任意の形式で表現されるようにすればよい。なお、特徴量を計算するアルゴリズムが注視点画像出力モデルＭ１の外部にあってもよい点については、実施形態で説明した通りである。 The first input unit 104 inputs the feature information acquired by the feature information acquisition unit 111 and the input captured image to the point-of-regard image output model M1. The point-of-regard image output model M1 converts both the feature information and the input captured image into feature quantities, and outputs an output point-of-regard image. The feature information and the feature amount of the input captured image may also be expressed in any format such as vector or array. As described in the embodiment, the algorithm for calculating the feature amount may be outside the point-of-regard image output model M1.
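One way to convert feature information into a feature amount alongside the image is a one-hot vector appended to the image features. The description only says that any representation such as a vector or array may be used, so the one-hot encoding, the list `CONSTRUCTION_TYPES`, and the function names below are assumptions for illustration.

```python
from typing import List

# Hypothetical set of feature information values (only the construction type is encoded here).
CONSTRUCTION_TYPES = ["mountain_tunnel", "underground_tunnel"]


def encode_feature_info(construction_type: str) -> List[float]:
    """Encode the construction type as a one-hot vector."""
    if construction_type not in CONSTRUCTION_TYPES:
        raise ValueError(f"unknown construction type: {construction_type}")
    return [1.0 if name == construction_type else 0.0 for name in CONSTRUCTION_TYPES]


def build_model_input(image_features: List[float], construction_type: str) -> List[float]:
    """Concatenate the image feature amount with the encoded feature information."""
    return list(image_features) + encode_feature_info(construction_type)


model_input = build_model_input([0.2, 0.7], "underground_tunnel")
```

Numeric items such as average precipitation could be appended as plain values in the same vector.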

変形例（１－１）によれば、工事現場における工事の特徴情報及び教師撮影画像と、教師注視点画像と、の関係を注視点画像出力モデルＭ１に学習させることによって、工事の特徴に応じた出力注視点画像を取得することができ、出力注視点画像の精度を高めることができる。 According to modification (1-1), by having the point-of-regard image output model M1 learn the relationship between the feature information of the construction at the construction site and the teacher captured image on one side and the teacher gaze point image on the other, an output gaze point image that reflects the features of the construction can be obtained, and the accuracy of the output gaze point image can be improved.

（１－２）また例えば、変形例（１－１）では、注視点画像出力モデルＭ１に特徴情報を学習させる場合を説明したが、工事の特徴情報ごとに、専用の注視点画像出力モデルＭ１を別々に用意してもよい。例えば、山岳トンネル用の注視点画像出力モデルＭ１と、地下トンネル用の注視点画像出力モデルＭ１と、を別々に用意してもよい。 (1-2) Also, for example, modification (1-1) described the case where the point-of-regard image output model M1 is made to learn the feature information, but a dedicated point-of-regard image output model M1 may instead be prepared separately for each piece of construction feature information. For example, a point-of-regard image output model M1 for mountain tunnels and a point-of-regard image output model M1 for underground tunnels may be prepared separately.

本変形例のデータ記憶部１００は、工事の特徴情報ごとに、第１教師データＤ１と注視点画像出力モデルＭ１を記憶する。例えば、データ記憶部１００は、工事現場の場所ごとに第１教師データＤ１と注視点画像出力モデルＭ１を記憶したり、工法ごとに第１教師データＤ１と注視点画像出力モデルＭ１を記憶したりする。第１教師データＤ１の作成方法自体は、実施形態で説明した通りであり、データ記憶部１００には、第１教師データＤ１が工事の特徴情報に関連付けられて格納される。 The data storage unit 100 of this modification stores the first teacher data D1 and the point-of-regard image output model M1 for each piece of construction feature information. For example, the data storage unit 100 stores the first teacher data D1 and the point-of-regard image output model M1 for each construction site location, or for each construction method. The method of creating the first teacher data D1 itself is as described in the embodiment, and the first teacher data D1 is stored in the data storage unit 100 in association with the construction feature information.

本変形例の第１学習部１０３は、工事現場における工事の特徴情報ごとに、当該特徴情報に対応する第１教師データＤ１に基づいて注視点画像出力モデルＭ１を学習させる。第１学習部１０３は、特徴情報ごとに、当該特徴情報に関連付けられた第１教師データＤ１に基づいて、当該特徴情報に関連付けられた注視点画像出力モデルＭ１を学習させる。注視点画像出力モデルＭ１の学習方法自体は、実施形態で説明した通りである。 The first learning unit 103 of this modified example learns the point-of-regard image output model M1 based on the first teacher data D1 corresponding to each feature information of construction at the construction site. The first learning unit 103 learns the point-of-regard image output model M1 associated with each piece of feature information based on the first teacher data D1 associated with the feature information. The learning method itself for the point-of-regard image output model M1 is as described in the embodiment.

The first input unit 104 inputs the input captured image to the gaze point image output model M1 corresponding to the feature information acquired by the feature information acquisition unit 111. Specifically, among the gaze point image output models M1 stored in the data storage unit 100, the first input unit 104 inputs the input captured image to the one associated with the feature information of the input captured image. The processing after the input captured image has been input is as described in the embodiment.

According to modification (1-2), preparing a gaze point image output model M1 for each piece of feature information of the construction at the construction site makes it possible to acquire an output gaze point image suited to the characteristics of the construction, which improves the accuracy of the output gaze point image.
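The per-feature arrangement of modification (1-2) can be sketched as a store that keeps one set of teacher data and one trained model per feature key. This is only an illustration under stated assumptions: the class `PerFeatureModels`, the feature keys, and the placeholder trainer `train_mean_gaze_model` (which simply averages the teacher gaze point images) are hypothetical and stand in for the training method of the embodiment.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]
Example = Tuple[Image, Image]  # (teacher captured image, teacher gaze point image)
Model = Callable[[Image], Image]


def train_mean_gaze_model(examples: List[Example]) -> Model:
    """Placeholder trainer (assumption): the returned 'model' ignores its input
    and outputs the pixel-wise mean of the teacher gaze point images."""
    height, width = len(examples[0][1]), len(examples[0][1][0])
    mean = [[sum(gaze[y][x] for _, gaze in examples) / len(examples)
             for x in range(width)] for y in range(height)]
    return lambda captured: [row[:] for row in mean]


class PerFeatureModels:
    """Hypothetical stand-in for data storage unit 100 plus units 103 and 104."""

    def __init__(self, trainer: Callable[[List[Example]], Model]) -> None:
        self._trainer = trainer
        self._teacher_data: Dict[str, List[Example]] = {}
        self._models: Dict[str, Model] = {}

    def add_teacher_data(self, feature: str, captured: Image, gaze: Image) -> None:
        # Teacher data is stored in association with the feature information.
        self._teacher_data.setdefault(feature, []).append((captured, gaze))

    def train_all(self) -> None:
        # One model per feature, trained only on that feature's teacher data.
        for feature, examples in self._teacher_data.items():
            self._models[feature] = self._trainer(examples)

    def output_gaze_image(self, feature: str, captured: Image) -> Image:
        # Select the model associated with the input image's feature information.
        return self._models[feature](captured)
```

With this layout, an input captured image tagged as a mountain tunnel is only ever passed to the model trained on mountain tunnel teacher data; a feature key with no trained model raises `KeyError`.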

(1-3) As another example, the output gaze point image output by the gaze point image output model M1 may be used for any purpose, including purposes other than being input to the evaluation result output model M2. For example, the output gaze point image may be printed by a printer or sent as an attachment to an e-mail or the like. Alternatively, a file representing the output gaze point image may be recorded on the learning terminal 10 or another computer. In this case, the output gaze point image may be used to train inexperienced evaluators: the output gaze point image is shown to the evaluator to teach which parts an expert would look at. As another example, the teacher gaze point image may be acquired without using the detection result of the line-of-sight detection device 20. For example, an expert may be shown the teacher captured image and asked to manually designate the parts that the expert focused on during evaluation.

[5-2. Modifications of the Second Configuration]
(2-1) Next, modifications of the second configuration are described. For example, modification (1-1) described having the gaze point image output model M1 learn the construction feature information; the evaluation result output model M2 may likewise be made to learn the construction feature information.

The second teacher data D2 of this modification indicates the relationship between (i) the feature information of the construction at the construction site, the teacher captured image, and the teacher gaze point image and (ii) the teacher evaluation result. That is, the second teacher data D2 stores pairs, each consisting of feature information, a teacher captured image, and a teacher gaze point image on one side and a teacher evaluation result on the other. The feature information, teacher captured image, and teacher gaze point image correspond to the input, and the teacher evaluation result corresponds to the output. The feature information stored in the second teacher data D2 is the feature information of the construction shown in the teacher captured image; for example, it is entered from the operation unit 14 or the like by the expert evaluating the teacher captured image, or obtained from the construction plan document.

The second input unit 109 inputs the feature information acquired by the feature information acquisition unit 111, the input captured image, and the input gaze point image to the evaluation result output model M2. The evaluation result output model M2 converts each of the feature information, the input captured image, and the input gaze point image into feature quantities and outputs an output evaluation result. These feature quantities may likewise be expressed in any format, such as a vector or an array. As described in the embodiment, the algorithm that computes the feature quantities may reside outside the evaluation result output model M2.

According to modification (2-1), having the evaluation result output model M2 learn the relationship between (i) the feature information of the construction at the construction site, the teacher captured image, and the teacher gaze point image and (ii) the teacher evaluation result makes it possible to acquire an output evaluation result suited to the characteristics of the construction, which improves the accuracy of the output evaluation result.
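One way to picture the three-part input of modification (2-1) is a single feature vector built outside the model. The sketch below is an assumption, not the encoding used in the embodiment: the function names, the one-hot encoding of the feature information, and the vocabulary of feature values are all hypothetical.

```python
from typing import List, Sequence

Image = List[List[int]]  # grayscale image, pixel values 0-255


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode a categorical feature value as a one-hot vector."""
    if value not in vocabulary:
        raise ValueError(f"unknown feature information: {value!r}")
    return [1.0 if value == candidate else 0.0 for candidate in vocabulary]


def flatten_normalized(image: Image) -> List[float]:
    """Flatten an image row by row and scale pixel values to the range 0-1."""
    return [pixel / 255.0 for row in image for pixel in row]


def build_m2_input(feature: str, captured: Image, gaze: Image,
                   vocabulary: Sequence[str]) -> List[float]:
    """Concatenate feature information, captured image, and gaze point image
    into one vector, in that order (the ordering is an arbitrary choice)."""
    return (one_hot(feature, vocabulary)
            + flatten_normalized(captured)
            + flatten_normalized(gaze))
```

The same function would be applied to teacher examples during training and to input images at evaluation time, so that the model sees a consistent layout.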

(2-2) As another example, modification (1-2) described preparing a dedicated gaze point image output model M1 for each piece of construction feature information; a dedicated evaluation result output model M2 may likewise be prepared separately for each piece of construction feature information. For example, an evaluation result output model M2 for mountain tunnels and an evaluation result output model M2 for underground tunnels may be prepared separately.

The data storage unit 100 of this modification stores second teacher data D2 and an evaluation result output model M2 for each piece of construction feature information. For example, the data storage unit 100 stores second teacher data D2 and an evaluation result output model M2 for each construction site location, or for each construction method. The method of creating the second teacher data D2 is as described in the embodiment, and the second teacher data D2 is stored in the data storage unit 100 in association with the construction feature information.

The second learning unit 107 trains, for each piece of feature information of the construction at the construction site, the evaluation result output model M2 based on the second teacher data D2 corresponding to that feature information. That is, for each piece of feature information, the second learning unit 107 trains the evaluation result output model M2 associated with that feature information based on the second teacher data D2 associated with it. The training method for the evaluation result output model M2 is as described in the embodiment.

The second input unit 109 inputs the input captured image and the input gaze point image to the evaluation result output model M2 corresponding to the feature information acquired by the feature information acquisition unit 111. Specifically, among the evaluation result output models M2 stored in the data storage unit 100, the second input unit 109 inputs the input captured image to the one associated with the feature information of the input captured image. The processing after the input captured image has been input is as described in the embodiment.

According to modification (2-2), preparing an evaluation result output model M2 for each piece of feature information of the construction at the construction site makes it possible to acquire an output evaluation result suited to the characteristics of the construction, which improves the accuracy of the output evaluation result.

(2-3) As another example, an output gaze point image output by the gaze point image output model M1 may be stored in the second teacher data D2 as a teacher gaze point image. That is, the second teacher data D2 need not consist solely of teacher gaze point images acquired by detecting the expert's line of sight.

The teacher gaze point image acquisition unit 101 of this modification acquires, as a teacher gaze point image, the output gaze point image that is output when a teacher captured image is input to the trained gaze point image output model M1. The method by which the teacher gaze point image acquisition unit 101 acquires the output gaze point image is as described in the embodiment. The second teacher data D2 of this modification therefore indicates the relationship between the teacher captured image, the teacher gaze point image acquired by the teacher gaze point image acquisition unit 101, and the teacher evaluation result.

According to modification (2-3), storing the output gaze point image output from the gaze point image output model M1 in the second teacher data D2 as a teacher gaze point image makes it possible to increase the amount of second teacher data D2, which improves the accuracy of the evaluation result output model M2.
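The augmentation in modification (2-3) can be sketched as follows: records that have a measured gaze point image keep it, and records without one receive the output of the trained gaze point image output model. The record types, the string-valued evaluation result, and the `gaze_is_model_output` flag are assumptions introduced for illustration only.

```python
from typing import Callable, List, NamedTuple, Optional

Image = List[List[int]]


class Record(NamedTuple):
    """A teacher captured image with its evaluation result; the measured
    gaze point image is None when no line-of-sight data was recorded."""
    captured: Image
    measured_gaze: Optional[Image]
    evaluation: str


class TeacherExample(NamedTuple):
    captured: Image
    gaze: Image
    evaluation: str
    gaze_is_model_output: bool  # hypothetical flag, kept for traceability


def build_second_teacher_data(records: List[Record],
                              gaze_model: Callable[[Image], Image]
                              ) -> List[TeacherExample]:
    """Fill in missing teacher gaze point images with the model's output."""
    examples = []
    for record in records:
        if record.measured_gaze is not None:
            gaze, from_model = record.measured_gaze, False
        else:
            gaze, from_model = gaze_model(record.captured), True
        examples.append(TeacherExample(record.captured, gaze,
                                       record.evaluation, from_model))
    return examples
```

Keeping a flag that marks model-generated gaze point images is a design choice of this sketch; it allows measured and generated examples to be weighted or audited separately later.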

(2-4) As another example, the input gaze point image input to the gaze point image output model M1 need not be an output gaze point image output by the gaze point image output model M1. For example, the input gaze point image may be acquired by having the evaluator select, on the display unit 15 showing the input captured image, the parts to be examined closely. In this case, if the evaluator is an expert, the correct parts can be selected, so the face observation record can be created automatically simply by the expert selecting those parts, which supports the evaluation work. As another example, when the parts of the face to be examined in a given day's evaluation work differ little from the previous day, the parts the expert focused on the previous day may be recorded and used as the input gaze point image. Likewise, when similar construction was carried out at another location in the past and the parts of the face to be examined differ little from that time, the parts the expert focused on then may be recorded and used as the input gaze point image.
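A manually selected region can be turned into an input gaze point image in a straightforward way. The sketch below assumes rectangular selections and a single intensity level for selected pixels; the function name, the rectangle format, and the default level of 255 are hypothetical choices, not details from the embodiment.

```python
from typing import List, Sequence, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Rect = Tuple[int, int, int, int]


def gaze_image_from_selection(width: int, height: int,
                              selections: Sequence[Rect],
                              level: int = 255) -> List[List[int]]:
    """Return a grayscale image that is `level` inside every selected
    rectangle and 0 elsewhere. Rectangles are clipped to the image."""
    image = [[0] * width for _ in range(height)]
    for left, top, right, bottom in selections:
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                image[y][x] = level
    return image
```

Selections recorded on a previous day or at a similar past site could be passed to the same function again to reproduce the input gaze point image.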

[5-3. Other Modifications]
(3) As another example, the above modifications may be combined.

As another example, the case where the teacher captured image, the input captured image, the teacher gaze point image, and the output gaze point image are all the same size has been described, but their sizes may differ from one another. Their resolutions and the like may also differ.

As another example, the algorithm of the gaze point image output model M1 is not limited to the example described in the embodiment. For example, thresholds may be set for the teacher gaze point image to assign a label to each pixel, and the labeled image may be used as teacher data so that classification is performed by semantic segmentation. That is, the output gaze point image may be created by dividing the number of color levels that the gaze point image can represent (for example, 256 levels) into a predetermined number of levels using arbitrary thresholds and treating those levels as pseudo semantic segmentation labels.
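The thresholding step described above can be sketched as a per-pixel quantization. The threshold values, the function names, and the evenly spaced intensities used to turn labels back into an image are assumptions chosen for illustration; the text leaves the thresholds and the number of levels arbitrary.

```python
from bisect import bisect_right
from typing import List, Sequence

Image = List[List[int]]  # grayscale gaze point image, values 0-255


def quantize_gaze_image(gaze: Image, thresholds: Sequence[int]) -> Image:
    """Assign each pixel a class label: label k means the pixel value is
    greater than or equal to exactly k of the thresholds."""
    bounds = sorted(thresholds)
    return [[bisect_right(bounds, pixel) for pixel in row] for row in gaze]


def labels_to_gaze_image(labels: Image, n_labels: int) -> Image:
    """Map class labels back to evenly spaced intensities in 0-255 so that
    a segmentation result can be displayed as an output gaze point image."""
    if n_labels < 2:
        raise ValueError("need at least two labels")
    return [[round(label * 255 / (n_labels - 1)) for label in row]
            for row in labels]
```

For example, the thresholds 64, 128, and 192 split the 256 levels into four classes; the label image produced by `quantize_gaze_image` would serve as the per-pixel target for a semantic segmentation model.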

As another example, the embodiment and modifications have mainly used the evaluation of a mountain tunnel face as an example, but they are also applicable to evaluating the face of an underground tunnel, in which case the processing described in the embodiment and modifications supports the evaluator's work. Furthermore, although the case where the evaluation support system S is used for face evaluation has been described, the system is applicable to the evaluation of any evaluation target.

For example, the system is applicable to construction other than tunnel construction, such as evaluating cracks in concrete or evaluating the shape and balance of a structure. The evaluation target may be, for example, a building, a bridge, a dam, a steel framework, a column, or a wall. When evaluating these targets as well, processing similar to that described in the embodiment and modifications may be used to acquire an output gaze point image indicating the parts an expert should look at, or an output evaluation result estimated to match an expert's evaluation.

As another example, the embodiment described the case where each function is implemented by the learning terminal 10, but when the evaluation support system S includes a plurality of computers, the functions may be shared among them. For example, the data storage unit 100 may be implemented by a server computer, and the learning terminal 10 may use the first teacher data D1, the second teacher data D2, the gaze point image output model M1, and the evaluation result output model M2 stored on the server computer.

S: evaluation support system; 10: learning terminal; 11: control unit; 12: storage unit; 13: communication unit; 14: operation unit; 15: display unit; 20: line-of-sight detection device; B1, B2: face observation record; D1: first teacher data; D2: second teacher data; I1, I3: captured image; I2, I4: gaze point image; M1: gaze point image output model; M2: evaluation result output model; 100: data storage unit; 101: teacher gaze point image acquisition unit; 102: first teacher data acquisition unit; 103: first learning unit; 104: first input unit; 105: output gaze point image acquisition unit; 106: second teacher data acquisition unit; 107: second learning unit; 108: input gaze point image acquisition unit; 109: second input unit; 110: output evaluation result acquisition unit; 111: feature information acquisition unit.

Claims

1. An evaluation support system comprising:
teacher data acquisition means for acquiring teacher data indicating a relationship between (i) a teacher captured image of an evaluation target captured at a construction site and teacher gaze point information corresponding to the teacher captured image and (ii) a teacher evaluation result of the evaluation target by an evaluator;
learning means for training an evaluation result output model based on the teacher data;
input means for inputting, to the evaluation result output model, an input captured image and input gaze point information corresponding to the input captured image; and
output evaluation result acquisition means for acquiring an output evaluation result output from the evaluation result output model.

2. The evaluation support system according to claim 1, further comprising teacher gaze point information acquisition means for acquiring the teacher gaze point information based on the line of sight of the evaluator detected by line-of-sight detection means,
wherein the teacher data indicates the relationship between the teacher captured image, the teacher gaze point information acquired by the teacher gaze point information acquisition means, and the teacher evaluation result.

3. The evaluation support system according to claim 2, wherein the teacher gaze point information acquisition means:
identifies, from among the lines of sight of the evaluator detected by the line-of-sight detection means, a line of sight directed at a screen on which the teacher captured image is displayed; and
acquires the teacher gaze point information based on the identified line of sight.

4. The evaluation support system according to any one of claims 1 to 3, wherein
the teacher data indicates the relationship between (i) feature information of construction at the construction site, the teacher captured image, and the teacher gaze point information and (ii) the teacher evaluation result,
the evaluation support system further comprises feature information acquisition means for acquiring feature information of the construction corresponding to the input captured image, and
the input means inputs the feature information acquired by the feature information acquisition means, the input captured image, and the input gaze point information to the evaluation result output model.

5. The evaluation support system according to any one of claims 1 to 4, wherein
the learning means trains, for each piece of feature information of construction at the construction site, the evaluation result output model based on the teacher data corresponding to that feature information,
the evaluation support system further comprises feature information acquisition means for acquiring feature information of the construction corresponding to the input captured image, and
the input means inputs the input captured image and the input gaze point information to the evaluation result output model corresponding to the feature information acquired by the feature information acquisition means.

6. The evaluation support system according to any one of claims 1 to 5, further comprising teacher gaze point information acquisition means for acquiring, as the teacher gaze point information, output gaze point information that is output when the teacher captured image is input to a trained gaze point information output model,
wherein the teacher data indicates the relationship between the teacher captured image, the teacher gaze point information acquired by the teacher gaze point information acquisition means, and the teacher evaluation result.

7. The evaluation support system according to any one of claims 1 to 6, further comprising input gaze point information acquisition means for acquiring, as the input gaze point information, output gaze point information that is output when the input captured image is input to a trained gaze point information output model,
wherein the input means inputs the input captured image and the input gaze point information acquired by the input gaze point information acquisition means to the evaluation result output model.

8. The evaluation support system according to any one of claims 1 to 7, wherein
the teacher captured image and the input captured image are the same size as each other, and
each of the teacher gaze point information and the input gaze point information is an image of the same size as the teacher captured image and the input captured image, in which gaze points are represented by color.

9. The evaluation support system according to any one of claims 1 to 8, wherein
the evaluation target is a tunnel face,
the teacher captured image is an image of a tunnel face captured at the construction site,
the teacher gaze point information indicates the gaze points of the evaluator when evaluating the tunnel face,
the input captured image is an image of a tunnel face captured at the construction site or at another construction site, and
the input gaze point information indicates the parts to be looked at when evaluating the tunnel face shown in the input captured image.

10. An evaluation support method comprising:
a teacher data acquisition step of acquiring teacher data indicating a relationship between (i) a teacher captured image of an evaluation target captured at a construction site and teacher gaze point information corresponding to the teacher captured image and (ii) a teacher evaluation result of the evaluation target by an evaluator;
a learning step of training an evaluation result output model based on the teacher data;
an input step of inputting, to the evaluation result output model, an input captured image and input gaze point information corresponding to the input captured image; and
an output evaluation result acquisition step of acquiring an output evaluation result output from the evaluation result output model.

11. A program for causing a computer to function as:
teacher data acquisition means for acquiring teacher data indicating a relationship between (i) a teacher captured image of an evaluation target captured at a construction site and teacher gaze point information corresponding to the teacher captured image and (ii) a teacher evaluation result of the evaluation target by an evaluator;
learning means for training an evaluation result output model based on the teacher data;
input means for inputting, to the evaluation result output model, an input captured image and input gaze point information corresponding to the input captured image; and
output evaluation result acquisition means for acquiring an output evaluation result output from the evaluation result output model.