JP2019200512A - Structure deterioration detection system

Info

Publication number: JP2019200512A
Application number: JP2018093639A
Authority: JP
Inventors: 祐貴井上; Suketaka Inoue; 洋登永吉; Hirotaka Nagayoshi; 俊介大田; Shunsuke Ota; 健太郎大西; Kentaro Onishi; 賀仁成田; Norihito Narita; 孝史野口; Takashi Noguchi; 良一植田; Ryoichi Ueda; 真人仲村柄; Masato Nakamura; 大介勝又; Daisuke Katsumata
Original assignee: Hitachi Systems Ltd
Current assignee: Hitachi Systems Ltd
Priority date: 2018-05-15
Filing date: 2018-05-15
Publication date: 2019-11-21
Anticipated expiration: 2038-05-15
Also published as: JP7150468B2

Abstract

To provide a technique that shortens the total time, including the computation time required for learning and diagnosis on a computer and the user's work time, and that reduces the user's workload. SOLUTION: A computer system 1 (application 10) of a structure deterioration detection system executes first processing, which takes a first image of a structure 5 as input and uses deep learning to output a second image representing a deterioration diagnosis result, and second processing, which visualizes information including the second image and displays it on a GUI screen 21. The CNN constituting the deep learning model 31 includes a widening convolution layer. At training time, the first processing inputs a first image patch of a predetermined first input size to the model 31 to obtain a first diagnosis result image of a first output size. At diagnosis time, the first processing inputs a second image patch of a second input size, taken from a variable-size target image, to the model 31 to obtain a second diagnosis result image of a second output size. SELECTED DRAWING: Figure 1

Description

The present invention relates to technology such as an information processing system for detecting states such as deterioration of a structure, and in particular to technology for learning and diagnosing such states using machine learning.

Structures such as various buildings and infrastructure facilities (including, for example, houses, buildings, roads, railways, bridges, tunnels, electrical facilities, water supply facilities, and communication facilities) develop states of deterioration or damage, such as cracks, rust and corrosion, peeling, and adhesion of foreign matter (sometimes collectively referred to as "deterioration"), as a result of aging, disasters, and the like. Countermeasure work such as inspection and repair is therefore necessary for maintenance. However, this work faces social issues such as a shortage of personnel and high costs. In response, systems that use a computer to diagnose and detect states such as deterioration of a structure (sometimes referred to as structure deterioration detection systems) have been developed and are expected to be effective.

In conventional examples of structure deterioration detection, the surface of the target structure is imaged with a camera, and a person (operator) visually diagnoses and detects deterioration in the captured images. Alternatively, the images are input to a computer, which estimates and detects deteriorated locations by image processing or the like. In particular, systems have been developed in which a computer uses machine learning to learn features from images (also called training) and to diagnose them.

Japanese Patent No. 6294529 (Patent Document 1) is an example of prior art relating to structure deterioration detection using machine learning. Patent Document 1 describes, as a "crack detection processing device" and the like, detecting cracks from a road surface image using machine learning and processing block images obtained by dividing the road surface image.

Deep learning is one type of machine learning. In recent years, techniques have been developed for learning and diagnosing images in deep learning using a convolutional neural network (CNN) or the like. Non-Patent Document 1 describes, as one example of a CNN method, performing training in units of image patches and performing diagnosis (inference) in units of variable-size input images.

Patent Document 1: Japanese Patent No. 6294529

Non-Patent Document 1: P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, “OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks”, arXiv:1312.6229 [cs], Dec. 2013. <URL: https://arxiv.org/pdf/1312.6229.pdf>

In the structure deterioration detection system of the conventional example, a computer uses machine learning, including input of teacher information, to learn, diagnose, and detect features such as deterioration from input images. In that case, particularly when deep learning is used, computing the CNN model (also called a network) takes a long time, although this depends on computer performance. The computation time is long because a large number of images must be processed and the model computation must be performed many times. In addition, the system as a whole takes a long time, including both this computation time and the work time of the user (operator).

The system also imposes considerable work on the user. For example, the user views a diagnosis result image produced by the computer's machine learning on a screen and performs correct-answer labeling work, entering for each pixel whether the deterioration estimation result is correct. Reflecting this labeling information in the model as teacher information improves diagnostic accuracy. However, the labeling work is laborious, and a reduction in the user's workload is also required.

An object of the present invention is to provide, with respect to structure deterioration detection system technology, a technique that can shorten the total time, including the computation time required for learning and diagnosis on a computer and the user's work time, and reduce the user's workload, while ensuring the accuracy of deterioration detection.

A representative embodiment of the present invention is a structure deterioration detection system having the following configuration. A structure deterioration detection system according to one embodiment is configured on a computer system and detects deterioration, including cracks, on the surface of a structure. The computer system performs first processing, which takes as input a first image in which the surface of the structure is captured and uses deep learning to output a second image including information representing the diagnosis result of the deterioration, and second processing, which visualizes information including the first image and the second image, displays it on a screen, and accepts input operations by the user. The convolutional neural network constituting the deep learning model includes a widening convolution layer that computes a widening convolution filter. The first processing includes a training process that, at training time, inputs a first image patch of a predetermined first input size to the model based on training image data to obtain a first diagnosis result image of a first output size, and a diagnosis process that, when diagnosing a target image of the structure, cuts out second image patches of a second input size from the target image, whose variable size is equal to or larger than the first input size, and inputs each second image patch to the model to obtain a respective second diagnosis result image of a second output size.

According to a representative embodiment of the present invention, with respect to structure deterioration detection system technology, the total time, including the computation time required for learning and diagnosis on a computer and the user's work time, can be shortened and the user's workload can be reduced, while the accuracy of deterioration detection is ensured.

FIG. 1 is a diagram showing the configuration of the structure deterioration detection system of Embodiment 1 of the present invention.
FIG. 2 is a diagram showing the configuration of the structure deterioration detection software in Embodiment 1.
FIG. 3 is a diagram showing image patches, the DL-CNN model, and the like in Embodiment 1.
FIG. 4 is a diagram showing an example of a widening convolution filter in Embodiment 1.
FIG. 5 is a diagram showing the widening convolution processing in Embodiment 1.
FIG. 6 is a diagram showing the CNN model and its computation in Embodiment 1.
FIG. 7 is a diagram showing image size relationships and the like in Embodiment 1.
FIG. 8 is a diagram showing the processing flow at training time in Embodiment 1.
FIG. 9 is a diagram explaining weak point images and the like in Embodiment 1.
FIG. 10 is a diagram showing examples of weak point images in Embodiment 1.
FIG. 11 is a diagram showing MIL rotation processing in Embodiment 1.
FIG. 12 is a diagram showing the processing flow of the first processing at diagnosis time in Embodiment 1.
FIG. 13 is a diagram showing the processing flow of the second processing when the visualization screen is displayed in Embodiment 1.
FIG. 14 is a diagram showing a display example of the visualization screen in Embodiment 1.
FIG. 15 is a diagram showing examples of various images in Embodiment 1.
FIG. 16 is a diagram showing examples of binarized images and the like in Embodiment 1.
FIG. 17 is a diagram explaining model input size setting in Embodiment 1.
FIG. 18 is a diagram showing the DL-CNN model and the like in a structure deterioration detection system of a comparative example.
FIG. 19 is a diagram showing the model and its computation in the structure deterioration detection system of the comparative example.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In all the drawings for describing the embodiments, the same parts are in principle denoted by the same reference signs, and repeated description thereof is omitted.

[Issues]
The underlying technology, issues, and the like are supplemented below.

(1) When the conventional structure deterioration detection system is one in which a person diagnoses deterioration from images without machine learning such as deep learning, the inspection workflow of a user such as an operator is as follows. (1-1) The user photographs the structure to be inspected, such as an infrastructure facility, on site with a camera. For example, 1000 images are captured for a given target structure. (1-2) The user visually checks each of the captured images and diagnoses whether it contains a deteriorated location such as a crack. For example, all 1000 captured images must be inspected visually, and 80 of them are extracted as images containing deteriorated locations. In this conventional example, the user's diagnosis work is laborious and burdensome and takes a long time.

(2) When the conventional structure deterioration detection system is one in which a computer learns and diagnoses deterioration from images using machine learning, the user's inspection workflow is as follows. (2-1) As before, images of the target structure captured with a camera, for example 1000 images, are prepared and input to the computer. (2-2) The computer executes deterioration diagnosis by machine learning on each of the images. Model computation is applied to each image, and images containing estimated deterioration locations (diagnosis result images), for example 100 images, are obtained as diagnosis result information. The model computation must be repeated according to the number of images, which takes a long computation time. (2-3) On the screen, the user visually checks the estimated deterioration locations in each diagnosis result image and makes a final determination as to whether each location is actually deteriorated. The user also checks on the screen whether the deterioration estimation results in the image are correct, enters correct-answer labeling information, and reflects it in the model. In this conventional example, the process takes a long time, including both the computer's computation time and the user's work time.

(Embodiment 1)
The structure deterioration detection system of Embodiment 1 of the present invention will be described with reference to FIGS. 1 to 19. The structure deterioration detection system of Embodiment 1 is configured on a computer system and detects deterioration, including cracks, on the surface of a structure. The computer system performs first processing (deterioration diagnosis processing), which takes as input a first image (image group) in which the surface of the structure is captured and uses deep learning (sometimes abbreviated as DL) to output a second image including information representing the deterioration diagnosis result. The computer system also performs second processing (visualization processing), which visualizes information including the first image and the second image, displays it on a screen, and accepts input operations by the user. The CNN constituting the deep learning model includes a widening convolution layer that computes a widening convolution filter. The first processing includes a training process that, at training time, inputs a first image patch of a predetermined first input size to the model based on training image data to obtain a first diagnosis result image of a first output size. The first processing also includes a diagnosis process that, when diagnosing a target image of the target structure, cuts out second image patches of a second input size from the target image, whose variable size is equal to or larger than the first input size, and inputs each second image patch to the model to obtain a respective second diagnosis result image of a second output size.

The user's inspection workflow with the structure deterioration detection system of Embodiment 1 is as follows. (3-1) As before, images of the target structure captured with a camera, for example 1000 images, are prepared and input to the computer. (3-2) The system executes deterioration diagnosis by deep learning on each of the images. Images containing estimated deterioration locations, for example 100 images, are obtained as diagnosis result images. A single model computation yields diagnosis result information of the second output size (multiple pixels both vertically and horizontally). In conventional examples in which the stride is set to 2 or more in certain layers of the CNN, the output size becomes smaller than the input size by a multiplicative factor, and the output becomes sparse. To make the output dense, the model must therefore be applied several tens of times, as in Non-Patent Document 1, or once per pixel, as in the conventional example. In the system of Embodiment 1, by contrast, the stride is set to 1 in all layers of the CNN, so a dense output is produced with a single application of the model, and the computation time per image is kept short. The model computation for the deterioration diagnosis is performed for each image, either as training or as actual diagnosis during inspection work, and the model is trained (updated) each time.
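The relationship between stride and output density described above can be illustrated with a small size calculation. The following is a minimal sketch, not the patent's implementation: the layer lists, kernel sizes, and the names `ConvLayer`, `output_size`, and `output_step` are hypothetical, and the sketch assumes that the widening convolution behaves like a dilated convolution without padding.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayer:
    kernel: int        # number of filter taps along one axis
    stride: int = 1    # step between neighbouring filter positions
    dilation: int = 1  # spacing between taps (assumed model of "widening")


def output_size(input_size: int, layers: List[ConvLayer]) -> int:
    """Output width along one axis for unpadded convolutions."""
    size = input_size
    for layer in layers:
        span = layer.dilation * (layer.kernel - 1) + 1  # pixels covered by the filter
        if size < span:
            raise ValueError("input is smaller than the filter span")
        size = (size - span) // layer.stride + 1
    return size


def output_step(layers: List[ConvLayer]) -> int:
    """Distance, in input pixels, between neighbouring output pixels."""
    step = 1
    for layer in layers:
        step *= layer.stride
    return step


# Hypothetical networks: one with stride-2 layers, one with stride 1 everywhere.
strided = [ConvLayer(3, stride=2), ConvLayer(3, stride=2), ConvLayer(3)]
dense = [ConvLayer(3), ConvLayer(3, dilation=2), ConvLayer(3, dilation=4)]

for name, net in (("strided", strided), ("dense", dense)):
    print(name, output_size(64, net), output_size(128, net), output_step(net))
```

With a step of 1, every additional input pixel adds one output pixel, so a larger patch at diagnosis time yields a correspondingly larger dense result from one model application. With a step of 4, the same input produces one output value per 4 input pixels, which is the sparse case that requires repeated model applications.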

(3-3) The system displays the diagnosis result images on a screen in the application on the computer. There are, for example, 100 images to be inspected visually, corresponding to the diagnosis result images. The user visually checks each image on the screen and makes a final determination as to whether each estimated deterioration location is actually deteriorated. For example, 80 images are extracted as images containing deteriorated locations according to the user's final determination. The user also performs correct-answer labeling work on the screen, indicating for the diagnosis result images of the training images whether the deterioration estimation results are correct, and the labeled images are reflected in the model. This improves diagnostic accuracy.

As described above, the system can narrow down the images that the user must inspect visually by means of deterioration diagnosis using deep learning. In the above example, the number is reduced from 1000 to 100. This reduces the man-hours, time, cost, and the like of the user's inspection work. The system does not fully automate deterioration diagnosis; instead, it automates part of the diagnosis using deep learning. The system visualizes the diagnosis result images on a screen in the application and supports the user so that work including final determination by visual confirmation and correct-answer labeling is carried out efficiently. This shortens the time required for deterioration detection while ensuring accuracy.

[Structure deterioration detection system]
FIG. 1 shows the overall configuration including the structure deterioration detection system of Embodiment 1. The configuration in FIG. 1 includes a computer system 1, a structure 5, and a camera 4. The structure 5 is the target of deterioration diagnosis and corresponds to various buildings, infrastructure facilities, and the like. The camera 4 images the surface of the structure 5 based on operation by a user (operator) and obtains image data including an image 41 (still image or moving image). The image 41 may contain locations of deterioration 42. In Embodiment 1, the deterioration 42 includes at least cracks.

The structure deterioration detection system of Embodiment 1 is mainly constituted by the computer system 1. The computer system 1 is a system including arbitrary computers, for example a system in which a computer 2, which is a PC, and a computer 3, which is a server, are connected via a communication network 6. The computer system 1 may be composed of a plurality of computers. The computer 2 is, for example, a PC serving as a client terminal device used by a user (operator) who performs work related to structure deterioration detection in any organization, such as the administrator of the structure 5 or a business operator contracted to perform inspections. There may be a plurality of users. The computer 3 is, for example, a server device provided on a system such as a cloud computing system or a data center operated by a business operator. The computer 2 and the computer 3 may each include a GPU (Graphics Processing Unit).

The computer 2 and the computer 3 are provided with structure deterioration detection software 10 (also referred to as the application). A client program 20 of the structure deterioration detection software 10 is installed on the computer 2, and a server program 30 of the structure deterioration detection software 10 is installed on the computer 3. The client program 20 implements the client function, and the server program 30 implements the server function. The client program 20 and the server program 30 cooperate with each other by client-server communication via the communication network 6.

The structure deterioration detection software 10 implements a deterioration detection function and a visualization function based on program processing by a CPU or the like. The deterioration detection function uses a deep learning model (DL-CNN) 31 including a CNN to learn and diagnose deterioration such as cracks in images. The processing of the deterioration detection function includes correct-answer labeling processing. In the deterioration detection function, the input size of image patches to the CNN model 31 at training time is a fixed first input size, and the input size of image patches when diagnosing a target image is a second input size (equal to or larger than the first input size) based on the variable-size image. The visualization function visualizes information including the target images and the deterioration diagnosis result images, displays it on a GUI screen 21, and accepts input operations by the user.
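The patch handling at diagnosis time can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure defined in the patent: the function names `patch_origins` and `crop_boxes`, the `margin` parameter (the assumed difference between the model's input and output size), and the choice to place the last patch flush with the image border are all hypothetical.

```python
from typing import List, Tuple


def patch_origins(length: int, patch: int, margin: int) -> List[int]:
    """Start offsets along one axis so that the model outputs tile the axis.

    Each patch of width `patch` is assumed to yield an output of width
    `patch - margin`, so consecutive patches are shifted by that amount.
    """
    if patch <= margin:
        raise ValueError("patch must be larger than the input/output margin")
    if length < patch:
        raise ValueError("target image is smaller than the patch")
    step = patch - margin
    origins = list(range(0, length - patch + 1, step))
    if origins[-1] != length - patch:
        origins.append(length - patch)  # last patch sits flush with the border
    return origins


def crop_boxes(width: int, height: int, patch: int, margin: int,
               first_input_size: int) -> List[Tuple[int, int, int, int]]:
    """Boxes (left, top, right, bottom) of the second image patches."""
    if patch < first_input_size:
        # The text states that the second input size is at least the first.
        raise ValueError("second input size must be >= first input size")
    return [(x, y, x + patch, y + patch)
            for y in patch_origins(height, patch, margin)
            for x in patch_origins(width, patch, margin)]


# Hypothetical numbers: a 164 x 100 target image, 64-pixel patches, margin 14.
for box in crop_boxes(164, 100, patch=64, margin=14, first_input_size=64):
    print(box)
```

Choosing a larger second input size reduces the number of boxes, and therefore the number of model applications per target image, at the cost of more memory per application.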

The client program 20 of the computer 2 provides the user with a GUI screen (visualization screen) 21 serving as a graphical user interface (GUI), based on communication with the service of the server program 30. The client program 20 handles user operation input and display processing for the GUI screen 21. If the computer 2 has a touch panel display device, the GUI screen 21 may be a touch-operable screen.

The client program 20 of the computer 2 receives image data of images of the structure 5 captured by the camera 4 and stores it in a storage device on the computer 2 side or in a DB 32 or the like on the computer 3 side. As the storage device or the DB 32, image memory handled by a CPU or GPU, non-volatile memory for storage, various storage devices, a DB server, and the like can be used. The image data input to the computer 2 is accompanied by image information such as attribute information. In addition to an ID (identification information), the shooting date and time, and the like, the image information includes information such as the pixel count of the camera 4 and the angle of view (or the focal length, sensor size, and the like from which the angle of view can be calculated). The image information may include object distance information (the distance between the camera 4 and the surface of the structure 5). Alternatively, the computer 2 may set image information such as the camera pixel count in the image data based on user operation. The computer 2 may calculate the object distance and the like by image processing. The object distance and the like may also be measured using a distance sensor or the like separate from the camera 4 and input together with the image data.
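As an illustration of how the angle of view can be derived from the focal length and sensor size mentioned above, the following sketch uses the standard pinhole-camera relation. The function names and the example values are hypothetical, and the additional step of converting the object distance into a surface length per pixel is an assumption about how such image information might be used, not something stated in the source.

```python
import math


def angle_of_view_deg(focal_length_mm: float, sensor_size_mm: float) -> float:
    """Angle of view along one sensor axis under a pinhole-camera model."""
    return math.degrees(2.0 * math.atan(sensor_size_mm / (2.0 * focal_length_mm)))


def mm_per_pixel(distance_mm: float, focal_length_mm: float,
                 sensor_size_mm: float, pixels: int) -> float:
    """Approximate surface length covered by one pixel at the object distance.

    Assumes the camera faces the surface head-on and ignores lens distortion.
    """
    covered_mm = distance_mm * sensor_size_mm / focal_length_mm  # = 2 d tan(angle / 2)
    return covered_mm / pixels


# Hypothetical camera: 36 mm wide sensor, 50 mm lens, 6000 pixels across, 2 m away.
print(angle_of_view_deg(50.0, 36.0))
print(mm_per_pixel(2000.0, 50.0, 36.0, 6000))
```

A per-pixel scale of this kind indicates how many pixels a crack of a given physical width would occupy in the image.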

The server program 30 of the computer 3 constitutes the deep learning model (DL-CNN) 31 including a CNN and performs processing such as learning and diagnosis of deterioration by computation using the model 31. The server program 30 handles the computation using the deep learning model 31, which generally has a high computational load. The server program 30 also transmits screen data for the GUI screen 21 to the computer 2. The server program 30 stores and manages various data and information, including the model 31, in a DB (database) 32.

The data and information in the DB 32 include image data, structure data, and diagnosis data. The image data is data such as the image group captured by the camera 4, the training image group, and the diagnosis result image group. The structure data is management information, three-dimensional object data, and the like relating to the structure 5. The diagnosis data summarizes the results of inspection work, including the results of diagnosing deterioration of the structure 5 by the structure deterioration detection system of Embodiment 1 and by the user.

The computer 2, which is a PC, includes known elements such as a CPU, ROM, RAM, non-volatile memory, input devices such as a mouse and keyboard, output devices such as a liquid crystal display device, an input/output interface device, and a communication interface device. The computer 3, which is a server, includes known elements such as a CPU, ROM, RAM, non-volatile memory, input devices, output devices, an input/output interface device, and a communication interface device.

The configuration is not limited to the example of the computer system 1 in FIG. 1. For example, the processing of the computer 2 and the processing of the computer 3 may be integrated into a single computer. The main processing may also be performed on a portable information terminal device with a camera function carried by the user. The technical means for acquiring the image group of the structure 5 is not limited to manual photography with the camera 4, and various means can be applied.

[Structure deterioration detection software]
FIG. 2 shows the configuration of the structure deterioration detection software (application) 10. The application 10 has a data set 101, a network configuration 102 (corresponding to the model 31), and a GUI screen 103 (corresponding to the GUI screen 21). The application 10 also has a first processing function 11 (training/diagnosis function), a second processing function 12 (visualization function), a camera image input function 13, a MIL rotation function 14, a model input size setting function 15, an evaluation/narrowing function 16, and the like.

The data set 101 includes, as image data, an original image group 211 and a weak point image group 212. The original image group 211 consists of images of the surface of the structure 5 (the inspection object) captured by the camera 4, and includes both images that contain actual cracks and images that do not. The weak point image group 212, described later, is an image group for training the model 31 on its weak points. The image data of the data set 101 includes training images, correct-answer (annotated) images, diagnosis target images, diagnosis result images, and the like, all based on images from the camera 4. A training image is used to train the model 31 before diagnosis during inspection work. A correct-answer image is an image in which the user has entered correct-answer annotations on a diagnosis result image. A diagnosis target image is an image actually diagnosed during inspection work; its size is variable rather than fixed. A diagnosis result image is the image output by the model 31 as the result of its computation.

The camera image input function 13 receives image data and image information from the camera 4 and manages them as part of the data set 101. The application 10 creates and manages management information 213, which includes the image information.

The first processing function 11 (training/diagnosis function) manages the network configuration 102 (model 31) used for training and diagnosis. The network configuration 102 has a MIL rotation 221 and a dilated (widening) convolution 222. During training, the first image input to the model 31 is an image patch of a predetermined first input size. The first input size is set by the model input size setting function 15 and varies according to the parameters of the model 31 (such as the filter size of each layer); once the parameters of the model 31 have been set, the first input size is fixed at the corresponding predetermined value. The second image output from the model 31 is a diagnosis result image (deterioration diagnosis result information). One diagnosis result image has a predetermined first output size corresponding to the first input size of one image patch. Each pixel of the diagnosis result image holds a probability value representing the likelihood of deterioration.

The first processing function 11 includes the MIL rotation function 14, which performs the MIL rotation 221. MIL stands for Multiple Instance Learning, the concept of training a model by inputting multiple instances. The MIL rotation 221 is processing specific to the first embodiment that takes the characteristics of deterioration into account; it includes generating multiple images by rotating the original image so that the model can handle the direction of the deterioration. The MIL rotation 221 can be turned on or off through user operation and settings: it is performed when on and skipped when off.
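As a rough illustration of the rotation step, the following sketch generates rotated copies of one image. The restriction to 90-degree steps, the names `rotate90` and `mil_rotations`, and the list-of-rows image representation are assumptions made for this example; the description only states that the original image is rotated to produce multiple images and that the function can be switched on or off.

```python
def rotate90(image):
    """Rotate a 2D image (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def mil_rotations(image, enabled=True):
    """Return the image instances derived from one original image.

    enabled=False mimics the off state: only the original is returned.
    The 0/90/180/270 degree set is an assumption of this sketch.
    """
    instances = [[list(row) for row in image]]
    if not enabled:
        return instances
    for _ in range(3):
        # Each further instance is the previous one rotated by 90 degrees.
        instances.append(rotate90(instances[-1]))
    return instances
```

With the function enabled, one original image yields four instances; with it disabled, the original is passed through unchanged.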

The dilated (widening) convolution 222 is a type of convolution processing that uses a dilated convolution filter. The filter of the dilated convolution 222 is defined as having a stride of 1 and a dilation (dilate number) of 2 or more. Some layers of the model 31 may use non-dilated convolution (a filter dilation of 1).

The dilated convolution 222 includes a fully connected convolution layer (Fully Convolutional Networks), which is implemented as a dilated convolution layer. The fully connected convolution layer classifies and outputs the feature information (feature quantities) extracted through the dilated convolution layers.

The GUI screen 103 displays the target image 231, training images, and the diagnosis result image 232, as well as the small-region-removed image 233, the straight-line-removed image 234, the correct-answer image (third image) 235, and the like. The diagnosis result image 232 may be a multi-tone image or a binarized image. When the user changes the threshold on the GUI screen 103, the binarized image corresponding to that threshold is displayed. User operations on the GUI screen 103 likewise display the small-region-removed image 233 and the straight-line-removed image 234. When the user enters correct-answer annotations on the GUI screen 103, the correct-answer image 235 is displayed. For each pixel of the diagnosis result image, the user enters, as correct-answer information, whether the deterioration estimate produced by the model 31 is correct. The correct-answer image containing this information is reflected in the model 31 as teacher information.

The evaluation/narrowing function 16 evaluates the diagnosis result images and, from among multiple diagnosis result images, narrows them down to those containing locations that the deep learning (DL) estimated as highly likely to be deteriorated. In doing so, the evaluation/narrowing function 16 may binarize the per-pixel probability values using a threshold on the estimated deterioration probability and extract the locations at or above the threshold as deteriorated locations. The evaluation/narrowing function 16 may also narrow down the images by judging that the deteriorated area or the degree of deterioration is large when, for example, the number of deteriorated pixels (at or above the probability threshold) in an image is at or above a predetermined count threshold. Narrowing down extracts, for example, 100 images from 1000 diagnosis result images. The user can give priority to the narrowed-down images when checking visually and making the final judgment. The user can also select and check any of the images that were not selected.
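The two-threshold narrowing described above can be sketched as follows. This is only an illustration: the function names, the default threshold values, and the representation of a diagnosis result image as a list of rows of probabilities are assumptions, not details given in the description.

```python
def count_deteriorated_pixels(prob_image, prob_threshold):
    """Count pixels whose estimated deterioration probability is at or above the threshold."""
    return sum(1 for row in prob_image for p in row if p >= prob_threshold)


def narrow_down(result_images, prob_threshold=0.5, pixel_count_threshold=1):
    """Return the indices of diagnosis result images that pass both thresholds.

    prob_threshold binarizes each pixel; pixel_count_threshold decides whether
    the deteriorated area in an image is large enough to be selected.
    Both default values are hypothetical.
    """
    selected = []
    for index, prob_image in enumerate(result_images):
        count = count_deteriorated_pixels(prob_image, prob_threshold)
        if count >= pixel_count_threshold:
            selected.append(index)
    return selected
```

Returning indices rather than images keeps the unselected images available, so the user can still open any of them on request.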

[Deep learning]
Known deep learning and CNNs are briefly explained below. A CNN is fundamentally based on applying matrix products and activation functions to its input. For image input, however, [number of dimensions of the image input] = [number of pixels] × 3 (where 3 corresponds to the R, G, and B color pixels), so the number of input nodes of the network is very large (see FIG. 18, described later). By the nature of images, meaningful information is concentrated in adjacent pixels, so a CNN computes matrix products between adjacent pixels. Rather than computing a matrix product with parameters as large as the image input dimension (for example, the number of vertical pixels × the number of horizontal pixels), a CNN uses small parameters (the corresponding filters) such as 3 × 3 or 5 × 5, which keeps the total number of model parameters down. Because a product cannot be taken between matrices of mismatched sizes, a CNN uses convolution instead of an ordinary matrix product. By applying convolution with convolution filters to the input many times (across multiple layers), a CNN extracts higher-order features as feature maps. The number and size of the filters, the depth of the layers, and so on are all called hyperparameters and must be designed or set by a person.
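To make the parameter-count argument concrete, the sketch below compares the weights of a hypothetical fully connected first layer with those of a small convolution filter bank. The function names and the example sizes in the usage note are illustrative assumptions, and bias terms are ignored.

```python
def input_dimension(height, width, channels=3):
    """Number of input nodes for an image: pixels times color channels (R, G, B)."""
    return height * width * channels


def dense_weight_count(height, width, hidden_units, channels=3):
    """Weights needed if every input node were connected to every hidden unit."""
    return input_dimension(height, width, channels) * hidden_units


def conv_weight_count(kernel_size, num_filters, channels=3):
    """Weights of a bank of kernel_size x kernel_size filters, shared across all positions."""
    return kernel_size * kernel_size * channels * num_filters
```

For a 232 × 232 RGB patch, `input_dimension(232, 232)` gives 161472 input nodes. Connecting them densely to 24 hidden units would need 3875328 weights, whereas 24 filters of size 3 × 3 need 648.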

In conventional deep learning, reflecting teacher information in the model based on correct-answer annotation work by the user is an effective way to increase accuracy. The structure deterioration detection system of the first embodiment likewise improves diagnostic accuracy by reflecting teacher information (correct-answer images) obtained from annotation work in the model 31.

[Comparative example: DL-CNN model]
FIG. 18 shows the image patch, the DL-CNN model, and related items in a structure deterioration detection system that serves as a comparative example for the first embodiment. (A) shows a diagnosis target image 181. The diagnosis target image 181 is, for example, a square image with m pixels in the vertical direction (y), m pixels in the horizontal direction (x), and m × m = M pixels in total. (B) shows an image patch 182 of a predetermined size for input to the DL-CNN model 183. The image patch 182 is a square image with n vertical pixels, n horizontal pixels, and n × n = N pixels in total. The diagnosis target image 181 in (A) is larger than the image patch 182. The size of the diagnosis target image 181 varies and often differs from the input size of the model 183. Each pixel consists of R, G, and B color pixels. One image patch 182 is cut out of the diagnosis target image 181 for each pixel; that is, M image patches 182 are cut out for the M pixels of the diagnosis target image 181. The center pixel of an image patch 182 is the pixel whose deterioration probability is computed by the model 183.

The model 183 in (C) has a known network configuration consisting of an input layer, multiple hidden layers, a fully connected layer, an output layer, and the like. The hidden layers include convolution layers and pooling layers. The pixel value of each pixel of the input image (image patch 182) is input to the corresponding node of the input layer, so the number of input nodes is n × n × 3. Between the nodes of the input layer and the nodes of a hidden layer, convolution operations using a convolution filter and similar processing are performed. The convolution filter 184 of the conventional example is, for example, 3 × 3 in size with no dilation. The convolution filter 184 is smaller than the image patch 182 and is applied to the pixels of the image patch 182 with a predetermined stride. The stride of the conventional convolution filter 184 is 2 or more. The stride corresponds to the distance moved between center pixels as the filter is applied repeatedly. Through the fully connected layer and the output layer of the model 183, the estimated deterioration probability for the center pixel of the image patch 182 is output.

As described above, diagnosing the whole diagnosis target image 181 in the comparative example requires repeating the computation of the model 183 M times, once for each of the M pixels and the corresponding M image patches 182. The computation time therefore grows with M and becomes long.

[DL-CNN model]
FIG. 3 shows the image patches, the DL-CNN model 31, and related items in the structure deterioration detection system of the first embodiment. (A) shows a target image (first image) 301. The target image 301 is, for example, a rectangular image with c1 pixels in the vertical direction (y), c2 pixels in the horizontal direction (x), and c1 × c2 pixels in total. (B) shows an image patch 302 of a predetermined input size for input to the model 31. The image patch 302 is a square image with n vertical pixels, n horizontal pixels, and n × n = N pixels in total. The target image 301 is larger than the image patch 302. The size of the target image 301 is variable and is equal to or larger than the input size of the model 31. Multiple image patches 302 are cut out of the target image 301 as needed. The details of the cutting are described later.

The model 31 in (C) has a network configuration consisting of an input layer, multiple hidden layers, a fully connected convolution layer, an output layer, and the like. The hidden layers include dilated convolution layers. The pixel value of each pixel of the input image patch 302, according to its input size (n × n), is input to the corresponding node of the input layer, so the number of input nodes is n × n × 3. Between the nodes of the input layer and the nodes of a hidden layer, dilated convolution operations using a dilated convolution filter and similar processing are performed. Among the layers of the CNN model 31, non-dilated convolution filters may be applied in, for example, the first few layers. The dilated convolution filter 304 has, for example, 3 × 3 computed pixels (the hatched portion) within an overall size of 5 × 5, with a dilation of 2. The dilation corresponds to the spacing between computed pixels. The dilated convolution filter 304 is smaller than the image patch 302 and is applied to the pixels of the image patch 302 with a stride of 1. Through the fully connected convolution layer and the output layer of the model 31, estimated deterioration probabilities for the pixels of the image patch 302 are output. Furthermore, in the first embodiment, the model 31 outputs a diagnosis result image (image patch 303) made up of multiple pixels vertically and horizontally, each holding an estimated deterioration probability.

In the first embodiment, the output of the model 31 is configured as an image patch 303 with an output size of q × q. The output size grows with the input size: it is the input size reduced by an amount determined by a fixed value E (in the numerical examples below, q = n − E + 1 per side). The fixed value E is the size of the rectangle needed to estimate the deterioration of one pixel, and is also the minimum input size. The fixed value E depends on the parameters of the model 31 (such as the filter size of each layer). During training, an image patch of the minimum input size E (the image patch 302 of the first input size) is used, for example, so the first output size during training is q × q = 1 × 1. During diagnosis, an image patch 302 of a second input size larger than the first input size used in training is used, so the second output size during diagnosis is q × q = 2 × 2 or larger. For example, if the minimum input size (fixed value E) is 75 × 75, training uses a 75 × 75 image patch 302 and the output size of the image patch 303 is 1 × 1. If diagnosis uses a larger image patch 302, for example 100 × 100, the output size of the image patch 303 is 26 × 26.
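The relation between input size and output size can be checked with a short sketch. The formula q = n − E + 1 is inferred here from the two numerical examples in the text (75 gives 1, 100 gives 26) rather than quoted from it, and the function name `output_size` is hypothetical.

```python
def output_size(input_size, min_input_size):
    """Side length q of the diagnosis result patch for an n x n input patch.

    min_input_size is the fixed value E: the patch side needed to estimate
    one output pixel. The relation q = n - E + 1 is an inference from the
    numerical examples, not a formula stated in the description.
    """
    if input_size < min_input_size:
        raise ValueError("input patch is smaller than the minimum input size E")
    return input_size - min_input_size + 1
```

With E = 75, a 75 × 75 training patch gives `output_size(75, 75) == 1`, and a 100 × 100 diagnosis patch gives `output_size(100, 75) == 26`.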

In FIG. 3, the image patch 302 is cut out of the target image 301 at diagnosis time. This is done in consideration of computational efficiency and limits in the computer system 1 (such as GPU memory) and is not required in theory. The first input size used for training and the second input size used for diagnosis are independent, so the second input size can be chosen regardless of the first input size. The second input size therefore has a high degree of freedom relative to the first input size.

As described above, diagnosing the whole target image 301 in the first embodiment does not require repeating the computation of the model 31 M times, once per pixel; fewer computations are needed than in the comparative example of FIG. 18. The computation time is therefore shorter.
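The difference in the number of model computations can be estimated with the sketch below. It assumes that the target image is covered by non-overlapping output regions of side q, one model computation per region; the function names and the image size in the usage note are hypothetical.

```python
import math


def per_pixel_calls(height, width):
    """Comparative example: one model computation per pixel, M = height * width."""
    return height * width


def tiled_calls(height, width, output_size):
    """Embodiment sketch: one model computation per q x q output region.

    Edge regions that extend past the image are assumed to be padded,
    hence the rounding up.
    """
    return math.ceil(height / output_size) * math.ceil(width / output_size)
```

For a hypothetical 1000 × 1500 image, `per_pixel_calls(1000, 1500)` is 1500000, while `tiled_calls(1000, 1500, 142)` is 88. Each call in the second case processes a larger patch, so the ratio of call counts overstates the speed-up somewhat.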

[Dilated convolution filter]
FIG. 4 shows examples of the dilated convolution filter used in the first embodiment. The dilated convolution filter 401 in (A) has the same configuration as the dilated convolution filter 304 in FIG. 3, with a dilation of 2. This filter has 3 × 3 computed pixels and an overall size, including the dilation, of 5 × 5. A predetermined operation on the pixel values (the corresponding node values) of the center pixel and the eight pixels around it yields a pixel value (node value) of the next layer. The dilated convolution filter 402 in (B) has a dilation of 4. This filter has 3 × 3 computed pixels and an overall size, including the dilation, of 9 × 9. Other dilated convolution filters may also be applied.
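The overall filter sizes quoted above follow from the standard relation between kernel size and dilation, sketched below. The function names are hypothetical; the relation reproduces the 5 × 5 and 9 × 9 examples in the text.

```python
def effective_filter_size(kernel_size, dilation):
    """Overall side length covered by a dilated filter: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def sample_offsets(kernel_size, dilation):
    """Offsets, relative to the center pixel, of the computed pixels along one axis."""
    half = kernel_size // 2
    return [(i - half) * dilation for i in range(kernel_size)]
```

A 3 × 3 filter with a dilation of 2 samples offsets −2, 0, and 2 along each axis and spans 5 pixels; with a dilation of 4 it spans 9 pixels.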

[Dilated convolution operation]
FIG. 5 shows an example of the dilated convolution operation in the first embodiment, using the dilated convolution filter 402 of FIG. 4. Each pixel of the first image 500 is drawn as a square. Starting from, for example, the upper-left pixel 501 of the first image 500, the operation moves along the x direction one pixel at a time (stride of 1); when the first row is finished, it moves on in the y direction and repeats the same processing. The filter 511, which is the dilated convolution filter 402, is applied with the first pixel 501 (x1, y1) as the center pixel. Next, the filter 512, also the dilated convolution filter 402, is applied with the adjacent second pixel 502 (x2, y1) as the center pixel. As illustrated, the dilated convolution filter 402 is applied in the same way up to the last, M-th pixel. When applying the filter, suitable values may be used, for example as padding, for positions that fall outside the original first image 500. Repeating this dilated convolution operation yields each pixel value (node value) of the next layer's image.
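The scan described above can be written as a minimal single-channel sketch. It assumes a square odd-sized kernel, a stride of 1, a constant padding value for positions outside the image, and the cross-correlation form commonly used in CNNs; the function name and these choices are assumptions for illustration, not the embodiment's implementation.

```python
def dilated_conv2d(image, kernel, dilation, pad_value=0.0):
    """Apply a dilated filter centered on every pixel of a 2D image (stride 1).

    image and kernel are lists of rows. Positions that fall outside the
    image use pad_value, so the output has the same size as the input.
    """
    height, width = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):          # move down row by row
        for x in range(width):       # move along x one pixel at a time
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    inside = 0 <= yy < height and 0 <= xx < width
                    value = image[yy][xx] if inside else pad_value
                    acc += kernel[i][j] * value
            output[y][x] = acc
    return output
```

With a 3 × 3 kernel of ones and a dilation of 2 on a 5 × 5 image of ones, the center output is 9 (all nine samples lie inside the image) and the corner output is 4 (the other five samples fall in the zero padding).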

[Comparative example: model computation]
FIG. 19 schematically shows the DL-CNN model 183 and its computation in the comparative example of FIG. 18. Here the model 183 consists of k layers, of which the first layer L1, the second layer L2, the (k−1)-th layer Lk−1, and the k-th layer Lk are shown. The first layer L1 corresponds to the input image patch 182, with a vertical and horizontal pixel size of a1 × a1 = n × n. The depth is 3 (corresponding to the R, G, and B color pixels). The region of that size (n × n × 3) is drawn as a rectangular cuboid. For example, a1 = n = 232.

As described above, convolution with a convolution filter is applied to each pixel of the image of the first layer L1. The convolution process CNV1, which obtains the image of the second layer L2 from the image of the first layer L1, uses a convolution filter G1. As in FIG. 18, the convolution filter G1 has, for example, no dilation, a size of 3 × 3, and a stride of 2 or more, and the number of filter types g2 is, for example, 24. The broken lines schematically show how applying the convolution filter G1 at one pixel of the first layer L1 yields one pixel value of the second layer L2.

The image of the second layer L2 is reduced in size as a result of the convolution process CNV1. Its size is written a2 × a2 × g2, where a2 is, for example, 116. Convolution and similar processing are applied in the same way from the third layer onward, extracting features and further reducing the size. For example, in the (k−1)-th layer Lk−1 the image size is 3 × 3 × gk−1, where the number of filter types gk−1 is, for example, 45. The fully connected convolution process FCCNV1 is applied to that image, using filters of size 1 × 1 in two types. As a result, the k-th layer Lk holds image information of size 3 × 3 × 2, with an estimated deterioration probability for each pixel. This image information is output.

[Model computation]
FIG. 6 schematically shows the DL-CNN model 31 and its computation in the first embodiment. Here the model 31 consists of j layers, of which the first layer L1, the second layer L2, the (j−1)-th layer Lj−1, and the j-th layer Lj are shown. The first layer L1 corresponds to the input image patch 302, with a vertical and horizontal pixel size of b1 × b1 = n × n and a depth of 3. The region corresponding to the size (input size) of the input image patch 302 (n × n × 3) is drawn as a rectangular cuboid. For example, b1 = n = 232, the same as a1 = n = 232 in the comparative example.

As described above, dilated convolution with a dilated convolution filter is applied to each pixel of the image of the first layer L1. The dilated convolution process DCNV1, which obtains the image of the second layer L2 from the image of the first layer L1, uses a dilated convolution filter F1. Like the dilated convolution filter 402 of FIG. 4, the dilated convolution filter F1 has, for example, a size of 3 × 3, a dilation of 4, and a stride of 1, and 24 filter types are used. In this example the number of filter types f2 of the second layer L2 is 24.

The figure schematically shows how applying the dilated convolution filter F1 at one pixel of the first layer L1 yields one pixel value of the second layer L2. The image of the second layer L2 is reduced in size as a result of the dilated convolution process DCNV1. Its size is written b2 × b2 × f2, where, for example, b2 = 224 and f2 = 24. Dilated or non-dilated convolution and similar processing are applied in the same way from the third layer onward, extracting features and further reducing the size. For example, in the (j−1)-th layer Lj−1 the image size is bj−1 × bj−1 × fj−1, where the number of filter types fj−1 is 45 in this example. The fully connected convolution process FCCNV1 is applied to that image, using filters of size 1 × 1 in two types. As a result, the j-th layer Lj holds an image of size bj × bj × 2 (the diagnosis result image corresponding to the image patch 303), with an estimated deterioration probability for each pixel. This image is output. The size bj is, for example, 142, which is larger than the size of 3 in the comparative example.
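The size reduction from layer to layer can be traced with the sketch below, which assumes unpadded ("valid") convolutions. It reproduces the step from 232 to 224 for a 3 × 3 filter with a dilation of 4. The three-layer stack in the usage note is purely hypothetical, since the text does not list the intermediate layers of the model 31, and the function names are likewise assumptions.

```python
def valid_output_size(input_size, kernel_size, dilation=1, stride=1):
    """Output side length of an unpadded convolution."""
    effective = dilation * (kernel_size - 1) + 1
    return (input_size - effective) // stride + 1


def propagate_sizes(input_size, layers):
    """Side length after each layer; layers is a list of (kernel_size, dilation), stride 1."""
    sizes = [input_size]
    for kernel_size, dilation in layers:
        sizes.append(valid_output_size(sizes[-1], kernel_size, dilation))
    return sizes


def minimum_input_size(layers):
    """Smallest input side that still yields a 1 x 1 output (stride-1 layers only)."""
    return 1 + sum(d * (k - 1) for k, d in layers)
```

For the hypothetical stack `[(3, 4), (3, 2), (1, 1)]`, `propagate_sizes(232, ...)` gives `[232, 224, 220, 220]` and `minimum_input_size(...)` gives 13, so the final size equals 232 − 13 + 1.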

As described above, in the first embodiment a single computation of the model 31 yields a diagnosis result image with an output size of bj × bj, as in the image of the j-th layer. The diagnosis result image for a diagnosis target image is therefore obtained with fewer than M computations.

[Image size relationships]
FIG. 7 shows the relationships among image sizes during training and diagnosis in the first embodiment. (A) shows the image sizes during training. As in FIG. 6, the first image patch, which is the input first image, has a first input size of b1 × b1, for example 91 × 91. The first diagnosis result image, which is the output second image, has a first output size of bj × bj, for example 1 × 1. The first image patch input to the model 31 is fixed at the first input size.

(B) shows the image sizes during diagnosis. The target image 701, which is the first image (drawn as a solid rectangle), has a variable input size equal to or larger than the first input size, with c1 × c2 pixels (vertical × horizontal). The target image 701 may be non-square. Here c1, c2 ≥ b1.

The first processing function 11 cuts out a plurality of images from the target image 701 at the second input size (b1 × b1). The second input size (b1 × b1) used for diagnosis may differ from the first input size (b1 × b1) used for training. For example, the first processing function 11 may choose the second input size to suit the size of the GPU memory, making it as large as possible while staying below the maximum memory usage rate. This makes full use of the computer's performance and shortens the calculation time. The first processing function 11 divides the c1 × c2 region of the target image 701 into regions of the second output size (bj × bj) corresponding to the second input size (b1 × b1). This accounts for the output size being smaller than the input size. The figure shows square regions of the second input size (b1 × b1) and the second output size (bj × bj), in particular six regions (1) to (6). The regions of the second output size are shown with broken lines.

For each divided region of the second output size (bj × bj), the first processing function 11 cuts out the corresponding region of the second input size (b1 × b1). Adjacent cut-out regions of the second input size partially overlap. The first processing function 11 cuts out a plurality of images so as to cover all pixels of the target image 701. This example shows a case in which six cut-out images provide full coverage. Where a cut-out region extends beyond the target image 701, the excess is filled with suitable pixel values, for example by padding.

This yields a plurality of cut-out images, for example images 711 to 716. Each cut-out image serves as a second image patch. The first processing function 11 inputs each of the second image patches to the model 31 and applies the calculation, which in this example is performed six times. As a result, a plurality of diagnosis result images (second diagnosis result images), for example images 721 to 726, are obtained. The first processing function 11 joins the obtained second diagnosis result images into a single second diagnosis result image 702.
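The cut-out, per-patch calculation, and joining described for FIG. 7 (B) can be sketched as follows. This is a minimal illustration, not the patented implementation: `diagnose_tiled` and `center_crop_model` are hypothetical names, the stand-in model simply returns the central region of its input, zero padding is assumed, and the border is assumed to be split evenly on both sides.

```python
import math
import numpy as np


def diagnose_tiled(image, model, in_size, out_size, pad_value=0.0):
    """Cut overlapping in_size patches, run `model`, and stitch out_size results."""
    if (in_size - out_size) % 2:
        raise ValueError("in_size - out_size must be even in this sketch")
    margin = (in_size - out_size) // 2          # border the model consumes per side
    c1, c2 = image.shape
    rows, cols = math.ceil(c1 / out_size), math.ceil(c2 / out_size)
    # Pad so that every out_size cell has a full in_size patch around it.
    padded = np.full((rows * out_size + 2 * margin,
                      cols * out_size + 2 * margin), pad_value, dtype=float)
    padded[margin:margin + c1, margin:margin + c2] = image
    result = np.zeros((rows * out_size, cols * out_size), dtype=float)
    for r in range(rows):
        for c in range(cols):
            y, x = r * out_size, c * out_size
            patch = padded[y:y + in_size, x:x + in_size]   # overlaps its neighbours
            result[y:y + out_size, x:x + out_size] = model(patch)
    return result[:c1, :c2], rows * cols


def center_crop_model(patch, out_size=4):
    """Stand-in for model 31: returns the central out_size x out_size region."""
    m = (patch.shape[0] - out_size) // 2
    return patch[m:m + out_size, m:m + out_size]


image = np.arange(8 * 12, dtype=float).reshape(8, 12)
stitched, n_calls = diagnose_tiled(image, center_crop_model, in_size=10, out_size=4)
print(n_calls, stitched.shape)
```

With an 8 × 12 target and a 4 × 4 output size, the grid is 2 × 3, so six patches are processed, mirroring the six-region example. Because the stand-in model is a centre crop, the stitched result reproduces the input, which gives a simple check that patches and outputs are aligned.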

[Training-time processing]
FIG. 8 shows the processing flow at training time in the application 10 (in particular, the first processing function 11) of the computer system 1. This training processing includes training image data creation processing. FIG. 8 has steps S1 to S9, which are described in order below.

(S1) In S1, the computer system 1 (in particular, the camera image input function 13) inputs the original image group 211 of FIG. 2. The original image group 211 includes a plurality of images of the structure 5 and, in particular, training image data containing actual deterioration such as cracks. For example, the computer 2 inputs the image data from the camera 4 and stores it in the DB 32. The computer 3 reads images one by one from the image group stored in the DB 32 into the image memory.

(S2) In S2, the computer system 1 displays, on the visualization screen (GUI screen 21), the diagnosis result image obtained by inputting an original image into the model 31, and correct-answer labeling of that image is performed through the user's manual operation. For example, when a pixel estimated as deteriorated is not actually deteriorated, or a pixel estimated as non-deteriorated is actually deteriorated, a value indicating an incorrect answer is entered. The computer system 1 stores the labeled image (third image), to which the correct-answer information has been input, as part of the training images.

(S3) In S3, the computer system 1 inputs the weak point image group 212 of FIG. 2. The weak point image group 212 consists of images that cause false detection in the calculation of the model 31, for example images containing groups of straight lines on the wall surface of the structure 5 or surrounding plants.

(S4) In S4, the computer system 1 applies edge detection and similar processing to the weak point image group 212 to perform automatic correct-answer labeling, and stores the resulting labeled weak point images as part of the training images (negative sample images). The automatic labeling sets a value representing non-deterioration in, for example, the pixels corresponding to groups of straight lines or plants in the image.

(S5) In S5, the computer system 1 cuts out the required number of first image patches of the training first input size (n × n) from each image of the training image data based on S2 and S4 (FIG. 7).

(S6) In S6, to increase the number of training images, the computer system 1 applies data augmentation to the first image patches of S5 and generates a plurality of training images (variations). The data augmentation is chosen according to the type and characteristics of the deterioration, such as cracks. It includes known processing such as noise addition and flipping, and does not include shift processing (translating the pixel region).
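A minimal sketch of the augmentation in S6 is shown below, assuming grayscale patches with values in [0, 1]. The function name `augment`, the choice of two flips plus one noisy copy, and the noise level are illustrative assumptions; the description only states that noise addition and flipping are used and that shifts are excluded.

```python
import numpy as np


def augment(patch, rng, noise_std=0.02):
    """Return variations of one training patch: flips and additive noise, no shifts.

    Any label map paired with the patch would need the same flips applied.
    """
    noisy = np.clip(patch + rng.normal(0.0, noise_std, size=patch.shape), 0.0, 1.0)
    return [patch, np.fliplr(patch), np.flipud(patch), noisy]


rng = np.random.default_rng(0)
patch = rng.random((91, 91))          # one 91 x 91 first image patch
variants = augment(patch, rng)
print(len(variants), variants[0].shape)
```

Every variation keeps the 91 × 91 first input size, so the patches can be fed to the model unchanged.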

(S7) In S7, the computer system 1 performs the MIL rotation process when the MIL rotation function 14 is turned on in the user settings (FIG. 11). In the MIL rotation process, the original image is rotated in steps of a predetermined rotation angle θ that equally divides, for example, a 180-degree range, and a plurality of rotated images are generated.

(S8) In S8, the MIL rotation function 14 of the computer system 1 inputs each of the first image patches generated above to the model 31 and applies the calculation, that is, executes a diagnosis for training, and obtains a diagnosis result image for each. The MIL rotation function 14 then obtains an image that integrates the plurality of diagnosis result images into one, and stores the result.

(S9) In S9, the computer system 1 (in particular, the evaluation/narrowing function 16) performs an evaluation process on the diagnosis result images, and the user makes the final determination. The user views the diagnosis result image on the GUI screen 21, checks whether the deterioration estimation result of the model 31 is correct, and inputs correct-answer information. The computer system 1 stores the labeled image.

The evaluation process of S9 also includes evaluating and determining a specific rotation direction for the structure 5 (identified by a structure ID). Based on the image resulting from the MIL rotation process of S8 (image g40 in FIG. 11), the computer system 1 determines the specific rotation direction (the corresponding rotation angle θ) and stores that information. As an example of the evaluation process, the user tags in advance the pixels corresponding to deterioration in each of a plurality of images and stores them as evaluation data. The quality of the model 31 is then calculated against that evaluation data.

[Weak point images]
FIG. 9 illustrates the setting of weak point images and the labeled images. FIG. 9 shows the original image group 211, the weak point image group 212, the model 31, diagnosis result images (second images), and so on. The application 10 performs training by inputting original images and weak point images into the model 31.

When an image is input to the model 31 to diagnose and detect deterioration, locations other than deteriorated portions such as cracks (non-deteriorated locations) are sometimes falsely detected as deteriorated. Examples of such false detections include groups of straight lines that are part of the original design of the wall surface of the structure 5, and surrounding plants. In general machine learning, such false detections can usually be handled by increasing the amount of training data to raise accuracy. The system of Embodiment 1, however, aims to shorten calculation time, so it provides a countermeasure mechanism rather than simply increasing the amount of training data (the number of images).

The system of Embodiment 1 has a function that reduces false detections and enables efficient learning even when the amount of training image data is limited. In this system, the false-detection weak points of the DL-CNN model 31 are extracted and set as weak point images, and the weak point images are input to the model 31 for training so that it overcomes those weak points. The data set 101 is extended to use the weak point image group 212.

As in the example of FIG. 10, a weak point image is an image containing a group of straight lines or an image containing plants. From the diagnosis result images obtained by inputting original images to the model 31, the application 10 extracts, based on evaluation, locations with many false detections and sets them as weak point images. In other words, a weak point image is a negative sample image representing a case that contains no deterioration. Alternatively, the application 10 can set an image arbitrarily specified by the user as a weak point image. The weak point image group 212 thus set is added, as a weak-point-emphasis data set, to the data set 101 of the regular original image group 211, which is thereby extended into a new data set 101.

A weak point image may be extracted automatically from the diagnosis result image of an original image, or may be an image entirely unrelated to the structure 5 that the user specifies arbitrarily. For example, from among a plurality of diagnosis result images, a diagnosis result image 901 containing a falsely detected location is extracted, the region of the falsely detected location (for example, a group of straight lines or plants) within the diagnosis result image 901 is extracted, and that region is processed to create a weak point image. Because a weak point image does not have to be an image newly created or labeled by hand, the user's effort for weak point images is kept low.

In FIG. 9, for a diagnosis result image 902 that contains an estimated deterioration location among the diagnosis result images of the training result, the user performs correct-answer labeling on the screen by entering whether each pixel of the estimated deterioration location is correct or incorrect. This creates a labeled image with the correct-answer information, which becomes part of the data set 101.

FIG. 10 shows examples of weak point images. (A) is an image containing a group of straight lines, and (B) is an image containing plants. These images contain no deterioration such as cracks. When such images are input to the model 31, it reacts oversensitively to the straight lines or plants and misdiagnoses them as deterioration such as cracks (estimated deterioration locations). Such images are therefore used as weak point images to train the model 31. After the model 31 has learned from the weak point images, false detection of wall-surface line groups, surrounding plants, and the like as deterioration is reduced when actual target images are diagnosed.

As an example of how to create and set a weak point image, as in S4 of FIG. 8, the computer system 1 may apply known edge detection processing to an input image (for example, the image of FIG. 10 (A)), extract the region of the straight-line group, process that region to a predetermined size, and set the result as a weak point image.
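The following sketch illustrates one way such a weak point image could be produced. It is an assumption-laden illustration: the gradient-magnitude edge detector, the threshold, the bounding-box crop, and the names `edge_mask` and `make_weak_point_sample` are all hypothetical, and resizing to the predetermined size is omitted. The label map marks every pixel as non-deteriorated, as described for the negative sample images of S4.

```python
import numpy as np


def edge_mask(gray, threshold=0.2):
    """Mark pixels whose gradient magnitude exceeds `threshold` (simple edge detector)."""
    gy, gx = np.gradient(gray.astype(float))
    return np.hypot(gx, gy) > threshold


def make_weak_point_sample(gray, threshold=0.2):
    """Crop the edge region and pair it with an all-'non-deteriorated' label map."""
    mask = edge_mask(gray, threshold)
    if not mask.any():
        return None                                   # nothing line-like found
    ys, xs = np.nonzero(mask)
    crop = gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    labels = np.zeros(crop.shape, dtype=np.uint8)     # 0 = non-deteriorated (assumed encoding)
    return crop, labels


# Synthetic wall with one design line (a bright vertical stripe) and no cracks.
wall = np.zeros((16, 16))
wall[:, 8] = 1.0
crop, labels = make_weak_point_sample(wall)
print(crop.shape, int(labels.sum()))
```

The crop covers the stripe and its two edge columns, and the all-zero label map tells the model that nothing in this region is deterioration.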

[MIL rotation function]
With DL-CNN, when diagnosing a general object (for example, a person) in an image, there is a direction of gravity (usually downward in the image), so the orientation of the object in the image can be judged within certain limits. In contrast, the main direction of a deteriorated portion in an image is basically unknown. For example, within the 360-degree range in the image plane, it is basically unknown along which angular direction a crack mainly extends. Usually, therefore, a model that handles all angular directions regardless of the direction in the image plane (referred to as the first model) is created so that deterioration can be detected at any angle within 360 degrees.

If the main angular direction of deterioration in every image could be aligned to the same specific angular direction (for example, the in-plane vertical direction), there would be no need to handle all angular directions as the first model does, and a model specialized for that angular direction (referred to as the second model) would suffice. That is, the second model can achieve comparable performance with fewer parameters than the first model. However, having the user manually align the angular direction of deterioration in the images would take time and effort and would defeat the purpose, so the system handles this automatically.

Suppose, for example, that the second model is a model for deterioration running in the in-plane vertical direction. When the deteriorated portion in the input image runs mainly along the in-plane vertical direction, the second model detects that deterioration with high probability (in other words, it outputs a high deterioration estimation probability). When the deteriorated portion runs in an angular direction that deviates from the in-plane vertical direction, the second model detects it only with low probability (in other words, it outputs a low deterioration estimation probability).

The system of Embodiment 1 therefore supports both the first model and the second model and, in particular, has the MIL rotation function 14 for performing learning for the second model. When the MIL rotation function 14 is used, the training target image is rotated by a predetermined rotation angle θ, as in the example of FIG. 11, to generate a plurality of images (for example, images g1 to g3), and each image is input to the model 31 for a trial diagnosis. From the deterioration estimation probabilities of the diagnosis result images of these images, the evaluation/narrowing function 16 selects the angular direction of the image with the highest probability value as the specific angular direction. At diagnosis time, the system of Embodiment 1 diagnoses the target image using the second model corresponding to that specific angular direction. This allows deterioration such as cracks to be detected with high accuracy while keeping the user's work small.
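The selection of the specific angular direction can be sketched as below. The description does not say how the per-pixel probabilities of one diagnosis result image are reduced to a single score, so the use of the mean here is an assumption, as are the function name `select_angle` and the example angles and values.

```python
import numpy as np


def select_angle(prob_maps):
    """Return the rotation angle whose diagnosis result scores highest.

    `prob_maps` maps a rotation angle (degrees) to a per-pixel probability map.
    The score is the mean probability, which is an assumed aggregation.
    """
    scores = {angle: float(np.mean(p)) for angle, p in prob_maps.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical trial-diagnosis results for three rotation angles.
prob_maps = {
    0: np.full((4, 4), 0.10),
    45: np.full((4, 4), 0.35),
    90: np.full((4, 4), 0.80),
}
best, scores = select_angle(prob_maps)
print(best)
```

Here the 90-degree rotation gives the strongest response, so it would be stored as the specific rotation direction for this structure.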

[MIL rotation processing]
FIG. 11 shows an example of the MIL rotation process performed by the MIL rotation function 14 at training time in Embodiment 1. The image g0 is an example input image, in which the character "あ" appears in its normal orientation. Let θ be the rotation angle used in the MIL rotation process. The value of the rotation angle θ is set in advance in the application 10. The MIL rotation function 14 performs a rotation process 801 that rotates the image g0 in a plurality of directions using the rotation angle θ. This example shows three rotation angles, 0 degrees, θa degrees, and θb degrees, but other choices are possible. Rotating the image g0 yields an image g1 rotated by 0 degrees, an image g2 rotated by θa degrees, an image g3 rotated by θb degrees, and so on. The image g1 is unrotated because its angle is 0 degrees. The image g2 is a square that encloses the region whose sides have become slanted by the rotation of θa degrees.

The MIL rotation function 14 performs a diagnosis process (DNN 802) that inputs each rotated image (images g1 to g3) to the DL-CNN model 31 and applies the calculation. DNN 802 yields an image for each rotation direction, for example images g11, g12, and g13. These images contain the region whose sides became slanted by the rotation. Because the results of the plurality of images must be integrated, each image needs to be rotated back. The reverse rotation angle is the negative of the rotation angle θ. The MIL rotation function 14 applies a reverse rotation process 803 for the rotation angle θ to each image, that is, rotations of 0 degrees, −θa degrees, and −θb degrees. This yields images such as g21, g22, and g23. For example, the image g22 is a square that encloses the region whose sides have become slanted by the rotation of −θa degrees.

The MIL rotation function 14 performs a cropping process 804 that cuts out, from each reverse-rotated image (images g21 to g23), the region corresponding to the size of the original image. This yields images corresponding to feature maps, for example images g31, g32, and g33. For example, the image g32 is cut out from the image g22. The image g31 is denoted f(x, y, 0), the image g32 is denoted f(x, y, a), and the image g33 is denoted f(x, y, b). Each cropped image has a deterioration estimation probability value for each pixel.

The MIL rotation function 14 applies a max pooling process 805 to the images of all rotation angles θ (images g31 to g33). For each pixel at corresponding positions in the images, the max pooling process 805 takes the maximum value, that is, the value with the highest probability of deterioration. The output of this process is a single image g40, denoted max(f(x, y, i)). As described above, inputting each rotated version of the original image to the model 31 produces rotated diagnosis result information, so the reverse rotation and cropping are performed in order to integrate the plurality of diagnosis results.
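The rotate, diagnose, rotate back, and max-pool sequence of FIG. 11 can be sketched as follows. To stay within NumPy, this sketch restricts the rotation to multiples of 90 degrees via `np.rot90`, so no enclosing square or cropping step is needed; arbitrary angles θa and θb would require an interpolating rotation plus the cropping process 804. The stand-in model `vertical_line_model` and the function name `mil_rotation` are hypothetical.

```python
import numpy as np


def vertical_line_model(img):
    """Stand-in for a direction-specific model: fires only on vertical runs of 3 pixels."""
    padded = np.pad(img, ((1, 1), (0, 0)))                 # zero rows above and below
    return np.minimum(np.minimum(padded[:-2], padded[1:-1]), padded[2:])


def mil_rotation(img, model, quarter_turns=(0, 1)):
    """Rotate, diagnose, rotate back, then take the per-pixel maximum over all angles."""
    maps = []
    for k in quarter_turns:
        rotated = np.rot90(img, k)            # rotation process (k * 90 degrees)
        result = model(rotated)               # diagnosis on the rotated image
        maps.append(np.rot90(result, -k))     # reverse rotation to the original frame
    return np.max(np.stack(maps), axis=0)     # max pooling across angles


img = np.zeros((7, 7))
img[0:5, 1] = 1.0      # vertical line
img[6, 2:7] = 1.0      # horizontal line
out = mil_rotation(img, vertical_line_model)
print(out[2, 1], out[6, 4], vertical_line_model(img)[6, 4])
```

The direction-specific model on its own misses the horizontal line, while the pooled result responds to both lines, which is the effect the max pooling over rotation angles is meant to achieve.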

The rotation angle θ of the MIL rotation process is described in more detail below. In principle, any angle can be used as the rotation angle θ. An angle that equally divides the 360-degree range in the image plane is preferable, but there is no particular restriction. Because a crack is a roughly linear pattern, its direction can be regarded as unchanged even when the image containing it is rotated by 180 degrees. In Embodiment 1, as an implementation example, the MIL rotation function 14 therefore generates a plurality of images by rotating in equal steps of the predetermined rotation angle θ within a 180-degree range. According to experiments, results with a good balance of accuracy and processing speed were obtained with equal division within the 180-degree range, from two divisions with θ = 90 degrees (two images, at 0 and 90 degrees) up to four divisions with θ = 45 degrees (four images, at 0, 45, 90, and 135 degrees).
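The equal division of the 180-degree range can be written out directly. The helper below is a small illustration; the function name `rotation_angles` and the error handling are assumptions, not part of the description.

```python
def rotation_angles(theta_deg, span_deg=180):
    """Angles 0, theta, 2*theta, ... that divide `span_deg` into equal parts."""
    count = round(span_deg / theta_deg)
    if count < 1 or abs(count * theta_deg - span_deg) > 1e-9:
        raise ValueError("theta must divide the span evenly")
    return [i * theta_deg for i in range(count)]


print(rotation_angles(90))   # two-way division
print(rotation_angles(45))   # four-way division
```

For θ = 90 this gives 0 and 90 degrees, and for θ = 45 it gives 0, 45, 90, and 135 degrees, matching the two cases reported in the experiments.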

[Diagnosis-time processing]
FIG. 12 shows the processing flow for diagnosing a target image (actual diagnosis), for example during inspection work. This diagnosis processing includes diagnosis processing performed in advance in the computer system 1. The user specifies the diagnosis processing on the screen and waits until the calculation finishes. FIG. 12 has steps S21 to S27, which are described in order below.

(S21) The computer system 1 inputs the target image of the structure 5 to be diagnosed (identified by a structure ID). For example, the computer 3 performs preparation processing such as reading the target image from the DB 32 and loading it into the image memory.

(S22) When performing diagnosis with the model B, which has learned the specific rotation direction described above, the computer system 1 (in particular, the MIL rotation function 14) rotates the diagnosis target image in that specific rotation direction (rotation angle θ) and takes the square enclosing the rotated region. The computer system 1 uses this rotated image as the image to be input to the model 31 (model B).

(S23) The computer system 1 (in particular, the first processing function 11) divides the target image, which has the variable input size (FIG. 7, c1 × c2), at a second input size matched to, for example, the GPU memory and processing size, and obtains a plurality of images (second image patches).

(S24) The computer system 1 inputs the second image patches one after another to the DL-CNN model 31 and applies the calculation, that is, executes the diagnosis process. As a result, the second diagnosis result images are obtained one after another.

(S25) The computer system 1 repeats the processing of S24 while checking whether the model calculation (diagnosis process) has finished for all of the second image patches, and proceeds to S26 when all are finished.

(S26) The computer system 1 rotates each of the second diagnosis result images back in accordance with the specific rotation direction described above, cuts out from each reverse-rotated image the image region corresponding to the original size (second output size), and uses the results as the second diagnosis result images (deterioration probability images).

(S27) The computer system 1 rearranges and joins the second diagnosis result images (deterioration probability images) in accordance with the division performed in S23, obtains a single image (second diagnosis result image), and stores it.

[Visualization screen display processing]
FIG. 13 shows the processing flow of the visualization screen display by the second processing function 12 of the computer system 1. This processing runs in real time in response to user operations on the visualization screen. The user performs input operations on the GUI screen 21 of the computer 2 in FIG. 2, and requests are sent to the computer 3. The computer 3 processes each request, generates screen data, and responds to the computer 2. The computer 2 then displays the visualization screen based on the screen data. FIG. 13 has steps S31 to S38, which are described in order below.

(S31) The computer system 1 inputs the diagnosis result image specified by the user's operation. For example, the computer 3 loads the image data of the diagnosis result image read from the DB 32 into the image memory.

(S32) The computer system 1 (in particular, the second processing function 12) displays a diagnosis result image of the type specified by the user's operation in a predetermined area of the visualization screen. The computer system 1 also displays, on the screen, controls for the operation thresholds (deterioration probability threshold and region size threshold), command buttons, and similar components, together with information such as the IDs of the target structure 5 and the target image. A display example of the visualization screen is shown in FIG. 14. When the user operates a threshold control on the screen, the computer system 1 changes the threshold.

(S33) According to the deterioration probability threshold (binarization threshold), the computer system 1 generates a binarized image in which the multi-level probability value of each pixel in the diagnosis result image is converted to a binary value, and uses it as the diagnosis result image.

(S34) According to the region size threshold, the computer system 1 extracts small regions, that is, regions of adjacent deterioration pixels in the binarized image whose area (number of pixels) is equal to or smaller than the region size threshold. It generates a small-region-removed image from which those small regions have been removed and uses it as the diagnosis result image.

(S35) When the user designates straight line removal (for example by turning on the straight line removal button, or by a predetermined key input corresponding to a straight line removal command), the computer system 1 extracts linear regions in the binarized image of the diagnosis result image. It generates a straight-line-removed image from which those linear regions have been removed and uses it as the diagnosis result image.

(S36) The computer system 1 displays the diagnosis result image reflecting the processing of S33 to S35 in a predetermined area of the screen. By viewing the diagnosis result image on the visualization screen and changing the thresholds as appropriate, the user can check the presence and location of deterioration and make a final determination.

(S37) The computer system 1 also accepts correct-answer labeling work performed by the user on the diagnosis result image. For each pixel of the diagnosis result image, the user can specify whether the deterioration estimation result is correct. The computer system 1 saves the correct-answer image (third image) containing the entered labeling information as part of the data set 101.

(S38) The computer system 1 ends the screen display in response to an end operation or the like by the user on the screen. Otherwise, it repeats the processing from S31.

[Visualization screen (GUI screen)]
FIG. 14 shows a display example of the visualization screen (GUI screen 21) provided by the second processing function 12. The user visually checks the image of the deterioration diagnosis result on the visualization screen and performs the final determination of deterioration detection and the correct-answer labeling work. The second processing function 12 displays the diagnosis result image (deterioration probability image) output by the model 31 in predetermined image areas of the visualization screen, for example the first image area 141 and the second image area 142.

In the diagnosis result image data, each pixel holds the probability value (0 to 1) of the deterioration estimation result as a gradation value (for example, 0 to 255). The per-pixel probability value can be displayed directly as a multi-gradation pixel color. Alternatively, the probability value of each pixel can be binarized using the binarization threshold and displayed as a binary value indicating whether or not the pixel is deteriorated. In this example, the first image area 141 displays the multi-gradation diagnosis result image (corresponding to (B) of FIG. 15), and the second image area 142 displays an image in which the binarized diagnosis result image is superimposed on the original image (corresponding to (D) of FIG. 15).
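The mapping between probability values, gradation values, and binary values can be illustrated with a minimal sketch. It assumes NumPy arrays as the image representation; the function names `to_gray_levels` and `binarize` and the sample values are hypothetical and do not come from the patent.

```python
import numpy as np

def to_gray_levels(prob):
    """Map probabilities in [0, 1] to 8-bit gradation values in [0, 255]."""
    return np.round(np.clip(prob, 0.0, 1.0) * 255).astype(np.uint8)

def binarize(prob, threshold):
    """Return 1 where the probability is at or above the threshold, else 0."""
    return (prob >= threshold).astype(np.uint8)

# Hypothetical 2x2 deterioration probability image.
prob = np.array([[0.10, 0.50],
                 [0.49, 0.90]])
gray = to_gray_levels(prob)             # multi-gradation image for display
binary = binarize(prob, threshold=0.5)  # binarized image for display
```

Treating a value exactly equal to the threshold as deteriorated follows the rule given for image 153 in the description of FIG. 15.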

Other types of images such as those in FIG. 15 can also be selected and displayed in the first image area 141 and the second image area 142. Within one visualization screen, one image may be displayed in one image area, or two or more images may be displayed side by side in two or more image areas as in the example of FIG. 14. Two or more images may also be displayed in one image area by switching between them alternately. User operations such as enlarging, reducing, and shifting the image in an image area are also possible.

The visualization screen also has an area for displaying information such as the IDs of the structure 5 and the image, and for command buttons and the like. In this area, for example, the structure ID and image ID are displayed and can be selected by user operation. This area also allows operations for selecting and configuring functions of the application 10. For example, the type of image to display in an image area can be selected. This area also contains, for example, a straight line removal button 147. When the straight line removal button 147 is turned on, the straight-line-removed image 234 (S35 in FIG. 13) is displayed in the image area.

The visualization screen also provides a deterioration probability threshold slider 144 and a region size threshold slider 145 as GUI components. With the deterioration probability threshold slider 144, the user can vary the deterioration probability threshold (binarization threshold) within a predetermined range. When the slider 144 is operated, the deterioration probability threshold is changed and the binarized image on the screen is updated in real time (S33 in FIG. 13).

Alternatively, the computer system 1 may generate the images for the respective binarization thresholds in advance, hold them in image memory, and display the corresponding image on the screen in response to the user's operation. The images for different binarization thresholds may also be displayed side by side on the screen.
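The precomputation option could look like the following minimal sketch. It assumes the thresholds are given as 8-bit gradation values and keeps one binarized image per threshold in a dictionary. The function name `precompute_binarized`, the sample image, and the chosen threshold levels are hypothetical.

```python
import numpy as np

def precompute_binarized(gray, levels):
    """Build one binarized image per threshold level (8-bit gradation values)."""
    return {level: (gray >= level).astype(np.uint8) for level in levels}

# Hypothetical 2x2 multi-gradation diagnosis result image.
gray = np.array([[10, 200],
                 [128, 64]], dtype=np.uint8)
cache = precompute_binarized(gray, levels=range(0, 256, 64))  # 0, 64, 128, 192

# A slider move then becomes a dictionary lookup instead of a recomputation.
shown = cache[128]
```

The trade-off in this sketch is memory against responsiveness: each additional threshold level costs one more image held in memory.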

With the region size threshold slider 145, the user can vary the region size threshold within a predetermined range. When the slider 145 is operated, the region size threshold is changed and the binarized image on the screen is updated in real time using the small-region-removed image 233 (S34 in FIG. 13).

[Image examples]
FIG. 15 shows examples of the various images. The images (A) to (D) of FIG. 15 correspond to one another. (A) shows the original image to be diagnosed, an example image 151 that contains crack deterioration (inside the broken-line frame) on a wall surface of the structure 5. (B) shows the multi-gradation diagnosis result image. In this image 152, the estimated deterioration probability of each pixel is expressed in multiple gradations. For example, low gradations are shown in blue and high gradations in red, giving a heat-map-like image. (C) shows the narrowed-down, binarized diagnosis result image. In this image 153, the pixel values are binarized based on the multi-gradation image 152 of (B) and the deterioration probability threshold. For example, a pixel whose original multi-gradation value is below the deterioration probability threshold is given the value 0 and shown in black, and a pixel whose value is equal to or above the threshold is given the value 1 and shown in red. The display colors can be set by the user. (D) shows an image in which the deteriorated portions of the binarized image of (C) are superimposed on the image 151 of (A). In this example, the crack deterioration runs roughly along the vertical direction (y) of the image. During the correct-answer labeling work, the deteriorated regions may instead be made transparent and shown only by their outlines.
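The superimposed image of (D) can be sketched as follows. This is an illustration only: it assumes an RGB NumPy array for the original image and a 0/1 array for the binarized image, and the function name `overlay_degradation` and the sample data are hypothetical.

```python
import numpy as np

def overlay_degradation(original_rgb, binary, color=(255, 0, 0)):
    """Paint pixels flagged as deteriorated (binary == 1) onto a copy of the original."""
    out = original_rgb.copy()
    out[binary.astype(bool)] = color
    return out

original = np.full((2, 3, 3), 128, dtype=np.uint8)  # flat gray RGB image
binary = np.array([[0, 1, 0],
                   [0, 1, 0]], dtype=np.uint8)      # vertical crack in column 1
overlaid = overlay_degradation(original, binary)
```

The `color` parameter stands in for the user-configurable display color mentioned in the text.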

FIG. 16 shows examples of deterioration and other regions in the binarized diagnosis result image. The image 161 of (A) is an example in which the binarized diagnosis result image is superimposed on the original image. Besides the portions estimated to be deteriorated, the image 161 also contains some false detections, such as the linear region 163 indicated by the broken-line frame. Depending on the region size threshold, the image 161 also contains small regions (a small region group) 164 of deteriorated portions. (B) is an example of the image 162 obtained by applying linear region removal and small region removal to the image 161 of (A). In the image 162, the linear regions and small regions have been removed, which makes it easier to check the deteriorated portions and make the final determination. For example, the crack deterioration portion 165 indicated by the dash-dot frame can be identified, and it can be seen to run roughly along one direction.

[Small region removal]
Deterioration such as a crack is expected to be detected as a pixel region that is continuously connected to at least some extent. Removing regions with a small area (small regions) from the deteriorated portions of the diagnosis result image therefore makes the user's final determination of deterioration detection easier. The computer system 1 (in particular the second processing function 12) evaluates, in terms of the number of pixels or the like, the area (size) of each deterioration region formed by adjacent deterioration (value 1) pixels in the diagnosis result image, in particular the binarized image. When the area of a deterioration region is equal to or smaller than the region size threshold, that region (small region) is not treated as a deteriorated portion and is removed. The result is displayed as the small-region-removed image 233, as in the example of FIG. 16.
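Small region removal as described above can be sketched with a connected-component search. The sketch assumes 8-connectivity for "adjacent" pixels, which the text does not specify, and uses a plain breadth-first search rather than any particular library routine. The function name `remove_small_regions` and the sample image are hypothetical.

```python
from collections import deque
import numpy as np

def remove_small_regions(binary, size_threshold):
    """Clear 8-connected regions of nonzero pixels whose pixel count is <= size_threshold."""
    h, w = binary.shape
    out = binary.copy()
    seen = np.zeros((h, w), dtype=bool)
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] == 0 or seen[sy, sx]:
                continue
            # Breadth-first search collects one connected region.
            region = [(sy, sx)]
            seen[sy, sx] = True
            queue = deque(region)
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] != 0 and not seen[ny, nx]):
                            seen[ny, nx] = True
                            region.append((ny, nx))
                            queue.append((ny, nx))
            # Regions at or below the threshold are not treated as deterioration.
            if len(region) <= size_threshold:
                for y, x in region:
                    out[y, x] = 0
    return out

binary = np.array([[1, 0, 0, 0, 1],
                   [0, 0, 0, 0, 1],
                   [0, 1, 0, 0, 1],
                   [0, 0, 0, 0, 1]], dtype=np.uint8)
cleaned = remove_small_regions(binary, size_threshold=1)
```

In this example the two isolated pixels are removed and the four-pixel vertical region in the last column is kept.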

[Linear region removal]
When a region of the diagnosis result image estimated to be a deteriorated portion is close to a straight line in shape, it is considered likely not to be deterioration such as a crack. It may be, for example, a straight groove or frame originally formed on the wall surface as part of its design. Removing linear regions from the deteriorated portions in the diagnosis result image therefore makes the user's final determination of deterioration detection easier. The computer system 1 (in particular the second processing function 12) applies a known straight line detection algorithm, for example the probabilistic Hough transform, to the diagnosis result image, in particular the binarized image. Linear regions are thereby extracted from the diagnosis result image. The computer system 1 removes the extracted linear regions and displays the result as the straight-line-removed image 234, as in the example of FIG. 16.
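The idea of line removal can be illustrated with a small sketch. Note the substitution: the text names the probabilistic Hough transform as one example (available in OpenCV as `cv2.HoughLinesP`), whereas this sketch uses a plain, non-probabilistic Hough accumulator written in NumPy so that it stays self-contained. The function name `remove_straight_lines`, the parameters `min_votes`, `n_theta`, and `tolerance`, and the sample image are all hypothetical.

```python
import numpy as np

def remove_straight_lines(binary, min_votes, n_theta=180, tolerance=0.5):
    """Clear pixels lying on lines that collect at least min_votes in a Hough accumulator."""
    ys, xs = np.nonzero(binary)
    out = binary.copy()
    if ys.size == 0:
        return out
    thetas = np.deg2rad(np.arange(n_theta) * (180.0 / n_theta))
    # rho = x*cos(theta) + y*sin(theta) for every foreground pixel and every angle.
    rho = xs[:, None] * np.cos(thetas)[None, :] + ys[:, None] * np.sin(thetas)[None, :]
    rho_idx = np.round(rho).astype(int)
    offset = int(np.ceil(np.hypot(*binary.shape)))  # shift so indices are non-negative
    acc = np.zeros((2 * offset + 1, n_theta), dtype=int)
    theta_idx = np.broadcast_to(np.arange(n_theta), rho_idx.shape)
    np.add.at(acc, (rho_idx + offset, theta_idx), 1)
    # Every accumulator cell with enough votes is treated as a detected line.
    for r, t in zip(*np.nonzero(acc >= min_votes)):
        on_line = np.abs(rho[:, t] - (r - offset)) <= tolerance
        out[ys[on_line], xs[on_line]] = 0
    return out

binary = np.zeros((9, 9), dtype=np.uint8)
binary[1, :] = 1                                 # straight groove across row 1
for y, x in [(4, 2), (5, 3), (6, 2), (7, 3)]:    # short zigzag crack
    binary[y, x] = 1
cleaned = remove_straight_lines(binary, min_votes=9)
```

In this sketch `min_votes` plays the role of a minimum line length in pixels. A long, nearly straight crack would also be removed by it, so the value has to be chosen with the expected crack shapes in mind.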

[Model input size setting]
FIG. 17 shows the model input size setting performed by the model input size setting function 15 in Embodiment 1. This setting process determines and sets a suitable first input size of the first image patch for the model 31 of FIG. 3 and other figures. (A) shows a plurality of images in the first image corresponding to images from the camera 4. The first image for a given structure 5 (for example, structure ID = STR1) includes, for example, images P#1, P#2, and P#3. These images may differ in size and resolution. The image information accompanying each image comprises the size (SZ), camera pixel count (PN), camera angle of view (AN), object distance (DS), and resolution (DF). The image information is, for example, attached to the image data as attribute information, or created or set by the computer system 1. The application 10 performs the setting process using the image information of the image data acquired from the camera 4. Note that even for crack deterioration of the same 1 mm size, the number of pixels occupied by the deteriorated portion differs between camera images taken at different distances from the deterioration (object). For this reason, the object distance (DS) is also stored.

(B) shows the correction processing applied to the first image of (A). The sizes of the images of (A) are corrected so that their resolution becomes constant. Let this constant resolution be the resolution DFC. For example, the size SZ3 of the image P#3 is corrected to the size SZ3C so that its resolution DF3 becomes the resolution DFC. The corrected images C#1, C#2, and C#3 all have the same resolution DFC. The camera image input function 15 includes a function for performing this correction processing.
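The size correction can be made concrete with a short sketch. The text does not define the unit of resolution, so the sketch assumes that resolution means millimetres of surface per pixel, and that the camera faces the surface squarely, so that the imaged width is 2 × DS × tan(AN / 2). The function names and all numeric values are hypothetical.

```python
import math

def resolution_mm_per_px(distance_mm, view_angle_deg, pixels_across):
    """Surface width covered by one pixel for a camera facing the surface squarely."""
    footprint = 2.0 * distance_mm * math.tan(math.radians(view_angle_deg) / 2.0)
    return footprint / pixels_across

def corrected_size(size, resolution, target_resolution):
    """Scale (width, height) so that the image ends up at target_resolution."""
    scale = resolution / target_resolution
    return tuple(max(1, round(side * scale)) for side in size)

# Hypothetical images: name -> ((width, height) in pixels, resolution DF in mm per pixel).
images = {"P#1": ((4000, 3000), 0.5),
          "P#2": ((2000, 1500), 1.0),
          "P#3": ((6000, 4000), 0.25)}
DFC = 1.0  # assumed common target resolution in mm per pixel
corrected = {name: corrected_size(size, df, DFC) for name, (size, df) in images.items()}
```

Under these assumptions an image taken at a finer resolution than DFC is shrunk, so that a crack of a given physical width occupies the same number of pixels in every corrected image.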

(C) shows the setting of the first input size of the DL-CNN model 31. From the sizes (SZ1C, SZ2C, SZ3C) of the corrected images of (B), for example a square of the minimum size is selected. The size of that square is determined as the first input size (b1 × b1) of the first image patch input to the model 31, and this information is set. The resulting model 31 is suited to the diagnosis of the target structure 5 (structure ID = STR1). Using the model input size setting function 15 in this way allows a suitable input size to be set and enables more accurate diagnosis. The user can also specify and set any suitable input size and output size through a model setting item (not shown) on the visualization screen.
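One possible reading of "a square of the minimum size" is the largest square that still fits inside every corrected image, that is, the smallest side length found across all of them. The sketch below follows that reading, which is an assumption; the function name `first_input_size` and the sizes are hypothetical.

```python
def first_input_size(corrected_sizes):
    """Side b1 of the largest square that fits inside every corrected image."""
    return min(min(width, height) for width, height in corrected_sizes)

# Hypothetical corrected sizes SZ1C, SZ2C, SZ3C as (width, height) in pixels.
b1 = first_input_size([(2000, 1500), (2000, 1500), (1500, 1000)])
```

In practice the chosen size would also be limited by GPU memory, which is why the text lets the user override it on the visualization screen.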

[Effects]
As described above, the structure deterioration detection system of Embodiment 1 shortens the overall time, including the computation time required for learning and diagnosis on the computer and the user's working time, and reduces the user's workload, while maintaining the accuracy of deterioration detection. Managers of structures can carry out inspection and repair work efficiently. Deterioration diagnosis using deep learning becomes possible even on systems with limited computing performance. Because the system can handle image groups of a structure that vary in size, resolution, and so on, training and diagnosis work, including the acquisition of the image groups, can be carried out efficiently. Because learning takes the characteristics of deterioration such as cracks into account, diagnostic accuracy can be maintained. Because a suitable image input size can be selected, computation can be made as efficient as possible for the performance (GPU and so on) of the computer system 1.

The following are also possible as other embodiments. An image based on a three-dimensional object model of the structure 5 may be displayed in the visualization screen, with the binarized image or the like pasted onto the surfaces of the three-dimensional object so that positions and orientations correspond. In addition, images for each diagnosis date and time in a time series may be associated with each position on the structure 5, so that the progress of deterioration and the like can be checked on the screen.

The present invention has been described specifically above based on the embodiments, but it is not limited to those embodiments and can be modified in various ways without departing from its gist.

DESCRIPTION OF SYMBOLS: 1 ... computer system, 2 ... computer, 3 ... computer, 4 ... camera, 5 ... structure, 6 ... communication network, 10 ... structure deterioration detection software (application), 20 ... client program, 21 ... GUI screen, 30 ... server program, 31 ... DL-CNN (model), 32 ... DB, 41 ... image, 42 ... deterioration.

Claims

A structure deterioration detection system configured on a computer system for detecting deterioration, including cracks, on the surface of a structure, wherein
the computer system performs:
a first process of taking as input a first image obtained by imaging the surface of the structure and outputting, using deep learning, a second image including information representing a diagnosis result of the deterioration; and
a second process of visualizing information including the first image and the second image, displaying the information on a screen, and accepting input operations by a user,
the convolutional neural network constituting the deep learning model includes a widening convolution layer that computes a widening convolution filter, and
the first process includes:
a training process of, at training time, inputting a first image patch of a predetermined first input size to the model based on training image data and obtaining a first diagnosis result image of a first output size; and
a diagnosis process of, at diagnosis time for a target image of the structure, cutting out second image patches of a second input size from the target image having a variable size equal to or larger than the first input size, inputting each second image patch to the model, and obtaining a second diagnosis result image of a second output size for each.

The structure deterioration detection system according to claim 1, wherein
the widening convolution filter of the model has a stride of 1 and a dilation of 2 or more,
the second output size is a plurality of pixels both vertically and horizontally, and
the second image has an estimated deterioration probability value for each pixel.

The structure deterioration detection system according to claim 1, wherein
the second image has an estimated deterioration probability value for each pixel,
the second process includes a process of creating, based on the user's operations on the second image on the screen, a third image in which correct-answer information indicating whether or not the diagnosis result is correct has been entered for each pixel, and
the first process uses the third image for the training process.

The structure deterioration detection system according to claim 1, wherein
the first process includes a data expansion process for creating the training image data, and
the data expansion process includes an inversion process and a noise addition process that do not involve a shift process on the image.

The structure deterioration detection system according to claim 1, wherein
the first process includes a process of performing the training process using, as the training image data, weak point emphasis data for learning the model's weak points of erroneous detection, based on a diagnosis result by the model or a setting by the user.

The structure deterioration detection system according to claim 5, wherein
the weak point emphasis data includes a plant image or a line group image.

The structure deterioration detection system according to claim 1, wherein
the first process includes a rotation process, and
the rotation process includes a process of generating a plurality of images by rotating the first image patch, inputting each of the plurality of images to the model, extracting from the resulting diagnosis result images the portion with the largest estimated deterioration probability value for each pixel, and integrating the extracted portions into one diagnosis result image.

The structure deterioration detection system according to claim 7, wherein
the rotation process includes a process of generating the plurality of images by rotating the first image patch at a predetermined rotation angle θ that equally divides a 180-degree range.

The structure deterioration detection system according to claim 1, wherein
the computer system corrects the sizes of a plurality of images in the first image relating to the target structure so that their resolution becomes constant, and determines and sets the first input size based on the sizes of the plurality of images at the constant resolution.

The structure deterioration detection system according to claim 1, wherein
the second process includes:
a process of displaying on the screen, as the second image, a multi-gradation image in which the estimated deterioration probability value of each pixel of the first diagnosis result image or the second diagnosis result image is expressed in multiple gradations;
a process of displaying on the screen a first component for variably setting a first threshold for binarizing the estimated deterioration probability value, and changing the first threshold based on the user's operation of the first component; and
a process of generating a binarized image by binarizing the estimated deterioration probability values of the second image according to the changed first threshold, and displaying the binarized image on the screen.

The structure deterioration detection system according to claim 10, wherein
the second process includes:
a process of displaying on the screen a second component for variably setting a second threshold relating to the size of a deterioration region in which pixels estimated to be deteriorated are adjacent, and changing the second threshold based on the user's operation of the second component; and
a process of generating an image in which, according to the changed second threshold, small deterioration regions have been removed from the deterioration regions in the binarized image of the second image, and displaying the image on the screen.

The structure deterioration detection system according to claim 10, wherein
the second process includes:
a process of displaying on the screen a third component relating to removal of linearly shaped regions among the deterioration regions in which pixels estimated to be deteriorated are adjacent, and accepting removal of the linearly shaped regions based on the user's operation of the third component; and
a process of generating, in response to the acceptance, an image in which the linearly shaped regions have been extracted and removed from the deterioration regions in the binarized image of the second image, and displaying the image on the screen.