KR20240015368A

KR20240015368A - Flood level prediction model management system

Info

Publication number: KR20240015368A
Application number: KR1020220093146A
Authority: KR
Inventors: 정회경; 조민우
Original assignee: 배재대학교 산학협력단
Priority date: 2022-07-27
Filing date: 2022-07-27
Publication date: 2024-02-05
Also published as: WO2024025210A1

Abstract

A flood level prediction model management system according to an embodiment of the present invention may include: a model input unit to which an LSTM model or a GRU model is input; a data input unit to which a meteorological dataset or a water level dataset is input; and a performance comparison unit that compares, for the model input to the model input unit, performance for each type of dataset input to the data input unit.

Description

Flood level prediction model management system {FLOOD LEVEL PREDICTION MODEL MANAGEMENT SYSTEM}

The present invention relates to a flood level prediction model management system that can determine the optimal flood level prediction model by comparing the performance of water level prediction models according to their input data, in order to derive a model that predicts water level, a key parameter of flooding.

Worldwide, damage from natural disasters is increasing because of abnormal climate caused by global warming, and the frequency and intensity of natural disasters are expected to keep rising. Flood damage in particular is growing, and if floods can be predicted, the economic losses and casualties they cause can be reduced.

To reduce flood damage, floods must be predicted accurately so that people and property in the affected areas can be evacuated in time. However, flood prediction is difficult because many variables must be considered and the factors involved are correlated in both space and time.

Flood prediction methods fall broadly into hydrological models and data-driven intelligent models.

A hydrological model analyzes hydrological characteristics and physically describes runoff and confluence. It is based on fluid mechanics and derives the confluence equations by combining physical laws such as conservation of mass, momentum, and energy. However, such a model requires deep hydrological knowledge from the researcher, and because the terrain changes over time through erosion and similar processes, it is difficult to use over the long term. In addition, building the large volume of input data is difficult, and the nonlinearity arising from the many variables leads to low prediction accuracy. Finally, because basin characteristics differ from river to river, a separate model must be built for each river, so the approach lacks generality.

A data-driven intelligent model, by contrast, predicts water level and runoff by analyzing observed data. One data-driven intelligent model applied in hydrology to predict and manage floods is the Artificial Neural Network (ANN). In one study, an ANN model took water level and meteorological data as input and applied Harmony Search (HS) and Differential Evolution (DE) to estimate water flow. HS and DE were used to update the parameters of the architecture and to select important features, which prevented overfitting. The model outperformed Radial Basis Function Neural Network (RBFNN) and Multi-Layer Perceptron (MLP) models, demonstrating that an ANN can be used to predict water flow. Many other studies have predicted key flood variables, for example with ANN-based water level and runoff prediction models. However, most of this work uses hydrological and meteorological data, both of which are time series, as input. An ANN suffers from insufficient memory when operating on sequential and time series data, and it is difficult to find optimal parameters during training.

The applicant proposes the present invention to solve the problems described above.

Korean Patent Publication No. 10-2020-0087347 (published 2020.07.21.)
Korean Patent No. 10-2403270 (registered 2022.05.24.)
Korean Patent No. 10-2409155 (registered 2022.06.10.)
Korean Patent No. 10-2308526 (registered 2021.09.28.)
Korean Patent No. 10-2159620 (registered 2020.09.18.)

The present invention is proposed to solve the problems described above. It provides a flood level prediction model management system that uses water level data and meteorological data as input data and, by comparing the performance of the input models according to the input data, can derive the LSTM-GRU-based water level prediction model with the best performance.

To achieve the above object, a flood level prediction model management system according to an embodiment of the present invention may include: a model input unit to which an LSTM model or a GRU model is input; a data input unit to which a meteorological dataset or a water level dataset is input; and a performance comparison unit that compares, for the model input to the model input unit, performance for each type of dataset input to the data input unit.

A Multi LSTM model composed of two LSTM layers, a Multi GRU model composed of two GRU layers, and an LSTM-GRU model composed of an LSTM and a GRU may be input to the model input unit.

A dataset containing water level data, a dataset containing water level data and AWS meteorological data, and a dataset containing water level data and ASOS meteorological data may be input to the data input unit.

The performance comparison unit may combine the three models input to the model input unit with the three datasets input to the data input unit, and run experiments or performance comparisons on the resulting nine combinations of input data and model.
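The nine combinations are simply the Cartesian product of the three models and the three datasets. The following is a minimal sketch of that enumeration; the list names and the function `experiment_grid` are illustrative assumptions and are not part of the specification.

```python
from itertools import product

# Labels follow the text; the identifiers themselves are illustrative.
MODELS = ["Multi LSTM", "Multi GRU", "LSTM-GRU"]
DATASETS = ["water level", "water level + AWS", "water level + ASOS"]


def experiment_grid(models=MODELS, datasets=DATASETS):
    """Return every (model, dataset) pair to be trained and evaluated."""
    return list(product(models, datasets))


if __name__ == "__main__":
    for model, dataset in experiment_grid():
        print(f"{model:<10} | {dataset}")
```

Each pair would then be trained and scored independently, so that differences can be attributed either to the model or to the input data.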

The performance comparison unit may use the mean squared error (MSE) as the base loss function for training the model input to the model input unit, and may use the Nash-Sutcliffe efficiency (NSE) and mean absolute error (MAE) as auxiliary metrics on the actual test data to compare observed and predicted values.

The performance comparison unit may use MSE, NSE, and MAE as performance comparison metrics for the water level prediction model input to the model input unit, and may evaluate performance according to the water level prediction model and the meteorological data using these three metrics.

The performance comparison unit may judge the performance of the model input to the model input unit by comparing the MSE, NSE, and MAE metrics together with the error in the predicted peak water level.
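For reference, the three metrics can be computed as in the sketch below, which uses their usual textbook definitions. The specification does not give a formula for the peak water level error, so `peak_error` (absolute difference between the observed and predicted maxima) is an assumed definition, and all function names are illustrative.

```python
def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def peak_error(observed, predicted):
    """Assumed definition: absolute gap between observed and predicted peaks."""
    return abs(max(observed) - max(predicted))
```

Lower MSE, MAE, and peak error indicate a better model, whereas NSE is better the closer it is to 1.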

The system may include a model decision unit that determines and presents the best-performing model among the Multi LSTM, Multi GRU, and LSTM-GRU models according to the comparison results of the performance comparison unit, and the model decision unit may select the LSTM-GRU model trained on ASOS meteorological data and water level data as the water level prediction model.
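One way a model decision unit could pick a winner from the comparison results is sketched below. The ranking rule (lowest MSE, ties broken by lower MAE and then higher NSE) is an assumption made for illustration, as are the function name `select_best` and the shape of the `results` dictionary; the specification does not state the exact rule.

```python
def select_best(results):
    """Return the (model, dataset) key with the lowest MSE.

    results maps (model, dataset) to a dict with 'mse', 'mae', and 'nse'.
    Ties on MSE are broken by lower MAE, then by higher NSE (assumed rule).
    """
    return min(
        results,
        key=lambda k: (results[k]["mse"], results[k]["mae"], -results[k]["nse"]),
    )


if __name__ == "__main__":
    # Placeholder scores for illustration only; not results from the patent.
    demo = {
        ("model A", "dataset X"): {"mse": 0.30, "mae": 0.40, "nse": 0.70},
        ("model B", "dataset Y"): {"mse": 0.10, "mae": 0.20, "nse": 0.90},
    }
    print(select_best(demo))
```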

The flood level prediction model management system according to the present invention can predict water level more easily by using only meteorological and water level datasets.

The flood level prediction model management system according to the present invention can present a model for data-driven water level prediction using actual observation data.

The flood level prediction model management system according to the present invention can identify how performance differs between models depending on the input data and depending on the configuration of the model used for time series prediction, and can present the best-performing and most suitable water level prediction model.

FIG. 1 is a diagram schematically showing the configuration of a flood level prediction model management system according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining a method of managing a flood level prediction model using the system of FIG. 1.
FIG. 3 is a graph showing the rainfall damage and population of an exemplary region to which the system of FIG. 1 is applied.
FIG. 4 is a graph visualizing the entire upstream and downstream datasets of the test bed region to which the system of FIG. 1 is applied.
FIG. 5 is a graph showing quality information for the AWS and ASOS data input to the system of FIG. 1.
FIG. 6 is a diagram showing the locations of the water level stations and meteorological stations in the test bed region to which the system of FIG. 1 is applied.
FIG. 7 is a diagram showing the structures of the LSTM and GRU models applied to the system of FIG. 1.
FIG. 8 is a diagram showing the structure of the LSTM-GRU model applied to the system of FIG. 1.
FIG. 9 is a graph showing loss values against the number of training epochs for each model in the system of FIG. 1.
FIGS. 10 to 12 are graphs showing the difference between observed and predicted values according to the training data for each model in the system of FIG. 1.

The advantages and/or features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only to make the disclosure complete and to fully convey the scope of the invention to a person of ordinary skill in the art to which the invention pertains, and the invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

In addition, to describe the technical components of the invention efficiently, the preferred embodiments below omit, as far as possible, system functional configurations that are already provided or that are commonly provided in the technical field to which the invention pertains, and focus on the functional configurations that must be added for the invention. A person of ordinary skill in the art will readily understand the functions of the conventionally used components among those omitted and not shown below, and will also clearly understand the relationship between the omitted components and the components added for the invention.

FIG. 1 is a diagram schematically showing the configuration of a flood level prediction model management system according to an embodiment of the present invention, FIG. 2 is a diagram for explaining a method of managing a flood level prediction model using the system of FIG. 1, FIG. 3 is a graph showing the rainfall damage and population of an exemplary region to which the system of FIG. 1 is applied, FIG. 4 is a graph visualizing the entire upstream and downstream datasets of the test bed region to which the system of FIG. 1 is applied, FIG. 5 is a graph showing quality information for the AWS and ASOS data input to the system of FIG. 1, FIG. 6 is a diagram showing the locations of the water level stations and meteorological stations in the test bed region to which the system of FIG. 1 is applied, FIG. 7 is a diagram showing the structures of the LSTM and GRU models applied to the system of FIG. 1, FIG. 8 is a diagram showing the structure of the LSTM-GRU model applied to the system of FIG. 1, FIG. 9 is a graph showing loss values against the number of training epochs for each model in the system of FIG. 1, and FIGS. 10 to 12 are graphs showing the difference between observed and predicted values according to the training data for each model in the system of FIG. 1.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

Referring to FIG. 1, a flood level prediction model management system 100 (hereinafter, 'model management system') according to an embodiment of the present invention may include a model input unit 120, a data input unit 140, a performance comparison unit 160, a model decision unit 180, and a database 190. The model input unit 120, data input unit 140, performance comparison unit 160, and model decision unit 180 may together constitute a water level prediction model unit 110. That is, the model management system 100 according to an embodiment of the present invention may include the water level prediction model unit 110 and the database 190.

The model management system 100 according to an embodiment of the present invention can present the model best suited to predicting the water level during a flood.

One of the main causes of the floods that the model management system 100 according to an embodiment of the present invention targets is torrential rain, which brings a large amount of rainfall in a short time. To select a test bed that had suffered heavy rainfall damage, the present invention used the regional heavy-rain damage statistics published by the Ministry of the Interior and Safety of the Republic of Korea. FIG. 3(a) shows the amount and frequency of rainfall damage for each city in the Republic of Korea.

Referring to FIG. 3(a), the Gyeonggi-do region ranks second among all cities in the amount of rainfall damage but first in frequency of occurrence. Geographically, Gyeonggi-do surrounds the capital of the Republic of Korea, which has the country's highest population density, and Gyeonggi-do itself also has a large population. FIG. 3(b) shows the population density and population of each city as of 2020.

As shown in FIG. 3, based on past cases of heavy-rain damage, the model management system 100 according to an embodiment of the present invention selects a region expected to suffer heavy damage when a flood occurs (for example, Gyeonggi-do) as the test bed region, and selects Yeoju Weir in Yeoju-si, Gyeonggi-do, where both upstream and downstream data are measured, as an example test bed.

A meteorological dataset and a water level dataset may be input to the data input unit 140 of the model management system 100 according to an embodiment of the present invention.

The water level dataset was taken from the Water Resources Management Comprehensive Information System of the Republic of Korea, using data from the water level stations upstream and downstream of Yeoju Weir in Yeoju-si, Gyeonggi-do. Water in a river flows from upstream to downstream, and excessive runoff caused by rainfall leads to flooding. The data input unit 140 of the model management system according to an embodiment of the present invention uses a water level dataset and a meteorological dataset as input data, measured at 1-hour intervals from October 2, 2013 to November 12, 2020. The upstream and downstream datasets each consist of 71,136 rows, and FIG. 4 visualizes both in full. FIG. 4(a) shows the dataset upstream of Yeoju Weir, and FIG. 4(b) shows the dataset downstream of it.

The upstream and downstream datasets share the same behavior: the water level stays at a moderate level outside the summer and rises sharply in the summer, when rainfall is heavy. As FIG. 4 shows, however, the rise during rainfall, relative to the level in periods of little rainfall, is considerably larger upstream of Yeoju Weir than downstream.

For the meteorological data used to train the water level prediction model, the model management system 100 according to an embodiment of the present invention uses two datasets provided by the Korea Meteorological Administration, both from surface weather observation: the Automatic Weather System (AWS), a disaster-prevention weather observation system, and the Automated Synoptic Observing System (ASOS). The AWS was designed to automate observations formerly made by human observers. It automatically handles the whole process of real-time measurement, computation, storage, and display, and observes atmospheric pressure, temperature, humidity, wind direction, wind speed, precipitation, and other variables in real time. From both the ASOS and AWS datasets, the temperature, humidity, and precipitation parameters are used as input data. Like the water level dataset, the meteorological data were measured at 1-hour intervals.
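Because the water level and meteorological series share a 1-hour interval, they can be joined on their timestamps to build the combined datasets. The sketch below shows one possible way to do this. The record layout, the function name `merge_hourly`, and the choice to drop hours missing from either source are all assumptions for illustration, not details given in the specification.

```python
from datetime import datetime

# The three meteorological parameters named in the text.
FEATURES = ("temperature", "humidity", "precipitation")


def merge_hourly(water_level, weather):
    """Join water level and weather readings on shared hourly timestamps.

    water_level: dict mapping datetime -> water level
    weather:     dict mapping datetime -> dict with the FEATURES keys
    Returns time-sorted rows of
    (timestamp, level, temperature, humidity, precipitation).
    Hours present in only one source are dropped (an assumed policy).
    """
    rows = []
    for ts in sorted(water_level.keys() & weather.keys()):
        rows.append((ts, water_level[ts]) + tuple(weather[ts][f] for f in FEATURES))
    return rows
```

Calling `merge_hourly` once with the AWS readings and once with the ASOS readings would give the two combined datasets, and the water level series alone would serve as the third.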

FIG. 5 is a graph showing the quality information for the AWS and ASOS data provided by the Korea Meteorological Administration. Summers in Korea are hot and humid, with heavy rainfall and typhoons, and FIG. 5 shows that during this period the accuracy of the unattended AWS data is lower than in other periods and lower than that of ASOS.

The nearest AWS station is about 8 km from Yeoju Weir, the actual test bed, and the nearest ASOS station is about 20 km from it. FIG. 6 shows the locations of the water level stations and the meteorological stations in the test bed region: FIG. 6(a) shows the water level and meteorological stations around Yeoju Weir, and FIG. 6(b) shows the location of Gyeonggi-do, where Yeoju Weir is located.

Summer torrential rain covers a narrow area and delivers sustained heavy rainfall, so the distance between the test bed and the meteorological observation point can affect the performance of the water level prediction model. The model management system 100 according to an embodiment of the present invention therefore runs a performance comparison experiment with each of the two meteorological datasets as input data and derives the optimal water level prediction model. [Table 1] lists the hydrological and meteorological stations and the datasets used.

Referring to [Table 1], under Station, Yeojubo upstream and Yeojubo downstream are the hydrological stations (Hydrology) upstream and downstream of Yeoju Weir, and AWS Yeoju and ASOS Icheon are the meteorological stations (Meteorology). Latitude and Longitude are the latitude and longitude of each hydrological or meteorological station. The hydrological stations measure water level, and the meteorological stations measure temperature, humidity, and precipitation.

The datasets listed in [Table 1] may be input to the data input unit 140 of the model management system 100 according to an embodiment of the present invention.

The model management system 100 according to an embodiment of the present invention derives a water level prediction model from various forms of input data, such as water level datasets and meteorological datasets. It compares performance not only across different input data for a single model but also across different forms of input data for different models, and can thereby derive the optimal water level prediction model.

To this end, various neural network models that process time series data may be selected and input to the model input unit 120 of the model management system 100 according to an embodiment of the present invention.

First, a Long Short-Term Memory (LSTM) model may be input to the model input unit 120. LSTM is one of the Recurrent Neural Network (RNN) architectures used to process time series data. RNNs are used mainly for temporally correlated data: they take into account the correlation between the preceding and current data and, through a structure in which the signal loops back, predict future data from past data. Their drawback is that they cannot retain past data for long. LSTM is an architecture introduced to address this drawback. It has six parameters in total and, through a structure built from four gates, handles long-term as well as short-term memory. The structure of the LSTM is shown in FIG. 7(a).

An LSTM network has the same chain structure as an RNN, but its repeating module consists not of a single tanh layer but of four layers that exchange information with one another. The state of an LSTM cell is divided into two vectors: h_t, the short-term state, and C_t, the long-term state. Information can be added to or removed from the cell state through sigmoid gates, each of which resembles a layer, or a series of matrix operations, with its own weights. Because the gates allow information from the distant past to be retained, the LSTM is designed to solve the long-term dependency problem.

The first step in an LSTM network is to identify the unnecessary information to be dropped from the cell. This is the role of the forget gate, which applies a sigmoid function to the output of the previous LSTM cell at time t-1, h_(t-1), and the current input at time t, x_t. The sigmoid output lies between 0 and 1: the larger the value, the more completely the information from the previous state is retained, and the smaller the value, the more of it is forgotten. This determines which part of the previous state is dropped.
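In the commonly used notation, the forget gate step can be written as the equation below. The weight matrix W_f, the bias b_f, and the symbol f_t for the gate output are standard symbols assumed here; they do not appear in the text above.

```latex
f_t = \sigma\left(W_f \, [h_{t-1}, x_t] + b_f\right), \qquad 0 < f_t < 1
```

Here [h_{t-1}, x_t] denotes the concatenation of the previous output and the current input, and an element of f_t close to 1 keeps the corresponding element of the previous cell state almost unchanged.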

After the forget gate, the information to be stored is selected. The forget gate discards part of the previous cell state (c_(t-1)), and new information is then added, with each element judged for how valuable it is as new information. New information is not accepted unconditionally but selectively. The gate that performs this role is called the input gate. A sigmoid function is applied to the previous hidden state (h_(t-1)) and the current input (x_t), and a tanh activation is applied to the same inputs. The sigmoid output lies between 0 and 1 and indicates how much of the new information to admit; the tanh output lies between -1 and 1 and represents the weighted importance of the candidate values. The input gate finally takes the Hadamard product of the two values, and the resulting new memory is added to the previous cell state (C_(t-1)) to give C_t.

Once the input gate has judged the value of the new information and selected what to remember, the next step selects the output. This is done by the output gate, which applies a sigmoid function to the current input (x_t) and the previous hidden state (h_(t-1)) and takes the Hadamard product of the result with the tanh of the current cell state (C_t). This filters the values and yields the hidden state.

Here, σ denotes the sigmoid function, W a weight matrix, and b a bias. c_t is the cell state at the current time step and c_(t-1) the cell state at the previous time step, and ⊙ denotes the Hadamard product. Equation (1) describes the forget gate. The input gate is described by equations (2) and (3), and the cell state is updated through equation (4). Next comes the output gate, described by equation (5), and the final hidden state is updated through equation (6).
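The numbered equations are not reproduced in this text. Assuming the standard LSTM formulation, which matches the gate-by-gate description above, equations (1) to (6) can be written as follows. The weight and bias subscripts (W_f, b_f, and so on), the bracket notation for concatenation, and the candidate symbol \tilde{c}_t are notational assumptions, not taken from the original.

```latex
\begin{align}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \tag{1}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \tag{2}\\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \tag{3}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{4}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \tag{5}\\
h_t &= o_t \odot \tanh(c_t) \tag{6}
\end{align}
```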

In addition, a Gated Recurrent Unit (GRU) model may be input to the model input unit 120. GRU is an RNN architecture that keeps the LSTM's solution to the long-term dependency problem while reducing the computation needed to update the hidden state; its performance is comparable to that of LSTM. To solve the long-term dependency problem, LSTM needs more parameters than a conventional RNN, so overfitting can occur when data is insufficient. GRU mitigates this drawback by modifying the LSTM structure. The structure of the GRU is shown in (b) of FIG. 7.

Referring to (b) of FIG. 7, the structure is clearly simpler than the LSTM structure shown in (a) of FIG. 7. The main difference is that GRU merges the forget gate and input gate of LSTM into a single update gate. It also merges the cell state and hidden state, giving a simpler structure than LSTM, and because it has fewer parameters than LSTM its computational cost is lower.

The part labeled r in (b) of FIG. 7 is the reset gate, which determines how much of the network's previous hidden state is reset. The reset gate output is multiplied by the past hidden state, and the product is used to compute the candidate hidden state. The part labeled z is the update gate. It performs the functions of the forget gate and input gate of LSTM and determines how much of the current information to use. The update gate output is combined with the candidate hidden state computed earlier to determine the final hidden state. Equation (7) corresponds to the reset gate, equation (8) to the update gate, and equations (9) and (10) determine the candidate hidden state and the final hidden state, respectively.
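Equations (7) to (10) are likewise not reproduced in this text. Assuming the standard GRU formulation, they can be sketched as below. The subscripted weights and biases and the candidate symbol \tilde{h}_t are notational assumptions. The convention in (10), where z_t weights the candidate state, is chosen to match the statement that the update gate decides how much of the current information to use; some references swap the roles of z_t and 1 - z_t.

```latex
\begin{align}
r_t &= \sigma\left(W_r\,[h_{t-1}, x_t] + b_r\right) \tag{7}\\
z_t &= \sigma\left(W_z\,[h_{t-1}, x_t] + b_z\right) \tag{8}\\
\tilde{h}_t &= \tanh\left(W_h\,[r_t \odot h_{t-1}, x_t] + b_h\right) \tag{9}\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t \tag{10}
\end{align}
```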

Since neither GRU nor LSTM can be said to perform better in general, the performance comparison unit 160 of the model management system 100 according to an embodiment of the present invention runs experiments with both LSTM and GRU and compares their performance.

Meanwhile, the performance comparison unit 160 of the model management system 100 according to an embodiment of the present invention uses three indicators to compare the performance of water level prediction models: mean squared error (MSE), Nash-Sutcliffe coefficient of efficiency (NSE), and mean absolute error (MAE).

MSE is used to evaluate regression models and is the mean of the squared differences between observed and predicted values. Because the differences are squared, the indicator is sensitive to outliers. MSE was chosen because an outlier in a hydrological model's prediction can lead to casualties.

NSE is widely used to evaluate hydrological models and takes values from -∞ to 1. The closer the value is to 1, the better the model performs.

MAE is the mean of the absolute errors between observed and predicted values and has the advantage of giving an intuitive view of model performance. The performance indicators are defined by equations (11) to (13).
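Equations (11) to (13) are not reproduced in this text. As a minimal sketch, assuming the standard definitions of the three indicators, they can be computed as follows. The function names and the toy water levels are illustrative assumptions, not values from the experiments.

```python
def mse(observed, predicted):
    """Mean squared error: mean of (O_i - P_i)^2."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n


def mae(observed, predicted):
    """Mean absolute error: mean of |O_i - P_i|."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum((O_i - mean(O))^2)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / variance_sum


# Toy water levels in cm (made-up numbers, for illustration only).
observed = [100.0, 120.0, 150.0, 130.0]
predicted = [110.0, 115.0, 140.0, 135.0]
print(mse(observed, predicted), mae(observed, predicted), nse(observed, predicted))
```

A perfect prediction gives an NSE of exactly 1, and a model that is no better than predicting the observed mean gives 0, which is why NSE has no lower bound.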

Using these three indicators, the performance comparison unit 160 of the model management system 100 according to an embodiment of the present invention runs experiments for each LSTM-based and GRU-based model on each type of input data and compares their performance.

The models used in the experiments may include a Multi LSTM model consisting of two LSTM layers, a Multi GRU model consisting of two GRU layers, and an LSTM-GRU model consisting of an LSTM layer and a GRU layer. Accordingly, the Multi LSTM, Multi GRU, and LSTM-GRU models may be input to the model input unit 120 of the model management system 100 according to an embodiment of the present invention.

The datasets used to train the models input to the model input unit 120 may include a dataset consisting only of water level data (S1), a dataset consisting of water level data and AWS weather data (S2), and a dataset consisting of water level data and ASOS weather data (S3). These datasets (S1, S2, S3) may be input to the data input unit 140.

The performance comparison unit 160 of the model management system 100 according to an embodiment of the present invention compares each model on each of the three datasets, so that nine models in total are compared. Information on each experiment is given in [Table 2].

In [Table 2], S1 is the scenario that uses the water level dataset as training data, S2 the scenario that uses the water level dataset and the AWS weather dataset, and S3 the scenario that uses the water level dataset and the ASOS dataset. The performance comparison unit 160 therefore runs performance comparison experiments for nine cases, one for each of the three models in each of the three scenarios.
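The nine cases are the Cartesian product of the three scenarios and the three models. A minimal sketch of enumerating them is shown below. The model tags and the naming pattern are assumptions modeled on the case name S3_LSTM_GRU that appears in the text, and the scenario descriptions are shorthand.

```python
from itertools import product

# Scenario labels follow the text; the descriptions are shorthand.
SCENARIOS = {
    "S1": "water level only",
    "S2": "water level + AWS weather",
    "S3": "water level + ASOS weather",
}
# Model tags are assumed; only "S3_LSTM_GRU" is spelled out in the text.
MODELS = ["LSTM", "GRU", "LSTM_GRU"]


def build_cases(scenarios=SCENARIOS, models=MODELS):
    """Return one case name per (scenario, model) pair, e.g. 'S3_LSTM_GRU'."""
    return [f"{s}_{m}" for s, m in product(scenarios, models)]


cases = build_cases()
print(len(cases), cases)
```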

Three water level prediction model structures are used in the model management system 100 according to an embodiment of the present invention; the overall configuration of the models is shown in FIG. 8.

Referring to FIG. 8, all three models use the past 20 hours of data as input. When training on the data of scenario S1 the input has shape [None, 2], and for S2 and S3 it has shape [None, 5]. The output passes through a final Dense layer, so that the data for the 21st hour is predicted from the past 20 hours of data.
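Predicting the 21st hour from the previous 20 hours amounts to a sliding-window transformation of the hourly series. The sketch below illustrates one way to build such windows. The function name, the choice of column 0 as the water level, and the toy data are assumptions for illustration.

```python
def make_windows(rows, window=20):
    """Split a time-ordered list of feature rows into (inputs, targets).

    Each input is `window` consecutive rows; the target is the water level
    (assumed here to be column 0) of the row that follows the window.
    """
    inputs, targets = [], []
    for start in range(len(rows) - window):
        inputs.append(rows[start:start + window])
        targets.append(rows[start + window][0])
    return inputs, targets


# Toy hourly series with 2 features per row, as in the S1 input shape.
rows = [[float(t), float(t) * 0.5] for t in range(30)]
x, y = make_windows(rows)
print(len(x), len(x[0]), len(x[0][0]), y[0])
```

A series of N rows yields N - 20 samples, because the first 20 rows serve only as input history.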

In the Multi LSTM model both hidden layers are LSTM layers, and in the Multi GRU model both hidden layers are GRU layers. In the LSTM-GRU model the first hidden layer is an LSTM layer and the second is a GRU layer.

The water level dataset and the weather datasets are input to the data input unit 140 of the model management system 100 according to an embodiment of the present invention, and the performance comparison unit 160 runs experiments on nine cases obtained by combining the three models with the three data scenarios (S1, S2, S3). The training and test data were collected at 1-hour intervals from October 2, 2013 to November 12, 2021. Of the 71,136 rows in total, 56,908 rows (80%) were used as training and validation data, and the remaining 14,208 rows were used as test data. For a flood prediction model it is important to predict water levels that rise rapidly to high values, so the performance comparison unit 160 can support the validity of the model by checking whether the maximum of all measured water levels is present in the test data.
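The split and the peak check described above can be sketched as follows, assuming the data is split chronologically without shuffling (the text does not state how the split was made). The function names and the toy levels are hypothetical.

```python
def chronological_split(samples, n_train):
    """Split time-ordered samples without shuffling: first n_train, then rest."""
    return samples[:n_train], samples[n_train:]


def peak_in_test(levels, n_train):
    """True if the overall maximum water level falls in the test portion."""
    peak_index = max(range(len(levels)), key=lambda i: levels[i])
    return peak_index >= n_train


total_rows = 71_136
n_train = int(total_rows * 0.8)  # training/validation row count stated in the text
print(n_train)

# Toy levels in cm (made-up) with the peak near the end of the record.
levels = [310.0, 305.0, 320.0, 330.0, 900.0, 340.0]
train, test = chronological_split(levels, n_train=4)
print(peak_in_test(levels, n_train=4))
```

The stated counts sum to 56,908 + 14,208 = 71,116, which is 20 fewer than 71,136. One possible explanation, not stated in the source, is that the 20-hour input window leaves 20 fewer samples than rows.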

The performance comparison unit 160 uses MSE as the loss function for model training and uses NSE and MAE as auxiliary indicators to compare observed and predicted values on the test data. All models are trained under identical conditions: the Adam optimizer, 256 units for the LSTM and GRU models, and 200 training epochs. Training is carried out for the nine cases listed in [Table 2], and the training and validation information for each model is given in [Table 3].

FIG. 9 shows the loss value against the number of training epochs while each model is trained under each scenario (S1, S2, S3).

Referring to FIG. 9, the change in loss with the number of training epochs can be seen. In each graph of FIG. 9 the horizontal axis is the number of epochs and the vertical axis is the loss.

In the graphs of (a) and (b) of FIG. 9, blue represents training data and orange represents test data. In the graphs of (c) to (i) of FIG. 9, blue represents training data and orange represents validation data.

According to the performance comparison unit 160, the Multi LSTM model has the most parameters, as expected from the characteristics of LSTM, and the Multi GRU model has the fewest. The Multi GRU model also took the least training time, although training time was not found to be proportional to the number of parameters. Referring to FIG. 9, the loss converges close to 0 as the number of epochs increases in every case, which indicates that training went well. For S3_LSTM_GRU, corresponding to (h) of FIG. 9, the loss rises sharply partway through training, which is attributed to overfitting; nevertheless, the number of epochs was kept at 200 for fairness with the other models.
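The ordering of model sizes follows from the layer structure: an LSTM layer has four weight blocks and a GRU layer has three. The sketch below counts trainable weights with the textbook formulas, assuming 256 units per layer, a 5-feature input, and a single-output Dense layer. These counts are illustrative and are not the values of [Table 3]; some framework implementations keep an additional bias vector per GRU gate, which changes the totals slightly but not the ordering.

```python
def lstm_params(input_dim, units):
    """Four weight blocks (forget, input, candidate, output), each with bias."""
    return 4 * (units * (input_dim + units) + units)


def gru_params(input_dim, units):
    """Three weight blocks (reset, update, candidate), each with one bias."""
    return 3 * (units * (input_dim + units) + units)


def stack_params(layer_kinds, input_dim, units=256):
    """Total weights of stacked recurrent layers plus a 1-unit Dense layer."""
    count = {"lstm": lstm_params, "gru": gru_params}
    total, dim = 0, input_dim
    for kind in layer_kinds:
        total += count[kind](dim, units)
        dim = units  # the next layer consumes this layer's hidden state
    return total + (units + 1)  # Dense layer: weights + bias


for name, kinds in [("Multi LSTM", ["lstm", "lstm"]),
                    ("LSTM-GRU", ["lstm", "gru"]),
                    ("Multi GRU", ["gru", "gru"])]:
    print(name, stack_params(kinds, input_dim=5))
```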

According to the performance comparison unit 160, the MSE used as the loss function responded differently to the input data depending on the model. For the Multi-LSTM model the input data made little difference, whereas for the Multi-GRU and LSTM-GRU models performance varied with the input data. For the Multi-GRU model, the S2 and S3 cases, whose training data include weather data, performed worse than the S1 case and were among the poorest of the nine cases. This is attributed to the GRU, as a lighter version of LSTM, being less able than the LSTM model to learn higher-dimensional data effectively. For the LSTM-GRU model, by contrast, the S1 case, whose training data have the fewest dimensions, showed the worst validation MSE. The best case overall was S3_LSTM_GRU, with an MSE of 0.15 in training and 0.20 in validation.

The test dataset contains the highest water level recorded during the entire data collection period, and the performance comparison unit 160 judges model performance by comparing the three indicators (MSE, NSE, MAE) together with the error in predicting the highest water level. The highest observed water level is 3552 cm. The test results are given in [Table 4], and the differences between observed and predicted values for each model and training dataset can be seen in the graphs of FIGS. 10 to 12.
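The text does not define the highest water level prediction error precisely. One plausible reading, assumed here, is the absolute difference between the observed peak and the prediction at the same time step. In the sketch below only the 3552 cm peak comes from the text; the other values and the function name are made up for illustration.

```python
def peak_level_error(observed, predicted):
    """Absolute error at the time step where the observed level is highest."""
    peak_index = max(range(len(observed)), key=lambda i: observed[i])
    return abs(observed[peak_index] - predicted[peak_index])


# Toy series in cm; only the 3552 cm peak value comes from the text.
observed = [3100.0, 3400.0, 3552.0, 3300.0]
predicted = [3120.0, 3350.0, 3500.0, 3310.0]
print(peak_level_error(observed, predicted))
```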

FIG. 10 shows graphs comparing observed values (orange) and predicted values (blue) for the Multi LSTM model (a), the Multi GRU model (b), and the LSTM-GRU model (c) in scenario S1.

FIG. 11 shows graphs comparing observed values (orange) and predicted values (blue) for the Multi LSTM model (a), the Multi GRU model (b), and the LSTM-GRU model (c) in scenario S2.

FIG. 12 shows graphs comparing observed values (orange) and predicted values (blue) for the Multi LSTM model (a), the Multi GRU model (b), and the LSTM-GRU model (c) in scenario S3.

The test results from the performance comparison unit 160 were similar to those from training and validation. The Multi LSTM model showed no large differences across input data, but its NSE was best with the S1 data. Its error in predicting the maximum water level contained in the test data was also smallest in the S1 case, at 81.77 cm. With the S2 data, the NSE was 0.802, slightly lower than in the S1 and S3 cases, and the maximum water level prediction error was 106.22 cm, the poorest result among the LSTM cases.

The Multi GRU model performed worst of the three models in every case. This mirrors the validation results and is attributed to the GRU model failing to learn effectively when the input data have many dimensions. The S1 data gave the best result among the GRU cases, but that result was still poorer than those of the other models. In the S2 and S3 cases the NSE values were 0.31 and 0.356, respectively, the lowest of the nine cases, and the maximum water level prediction errors were 150.14 for S2 and 207.17 for S3, the largest errors observed.

The LSTM_GRU model showed the best performance on average among the three models. In the S1 case its NSE was slightly lower, and its maximum water level prediction error slightly larger, than those of the LSTM model. In the S2 and S3 cases, however, its NSE values were 0.935 and 0.42 and its maximum water level errors 98.06 and 47.16, respectively, far better than the other models. With the S3 data it achieved the best results of all nine cases on every evaluation indicator, including MSE, NSE, MAE, and maximum water level prediction error.

Accordingly, based on these experimental results, the model decision unit 180 of the model management system 100 according to an embodiment of the present invention can determine that, judged by model performance, the weather data best suited to the test bed, Yeoju Weir (Yeojubo), is the ASOS data of the S3 case.

From the experimental results of the S3 case, the model decision unit 180 determines that the LSTM-GRU model trained on ASOS weather data and water level data is the most suitable water level prediction model.
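The selection step can be sketched as picking the case with the smallest error. The example below uses only the maximum water level prediction errors quoted in the text; the three cases reported only in [Table 4] are omitted. The case names other than S3_LSTM_GRU, the cm unit for values quoted without one, and the single-indicator selection rule are assumptions, since the text compares several indicators together.

```python
# Peak water level errors quoted in the text (cm assumed where no unit is
# given); the remaining three cases appear only in [Table 4] and are omitted.
peak_error_cm = {
    "S1_LSTM": 81.77,
    "S2_LSTM": 106.22,
    "S2_GRU": 150.14,
    "S3_GRU": 207.17,
    "S2_LSTM_GRU": 98.06,
    "S3_LSTM_GRU": 47.16,
}


def select_best(errors):
    """Return the case name with the smallest error."""
    return min(errors, key=errors.get)


best_case = select_best(peak_error_cm)
scenario, model = best_case.split("_", 1)
print(best_case, scenario, model)
```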

Meanwhile, all data related to the performance comparison experiments of the three models under the three dataset scenarios (S1, S2, S3) may be stored in the database 190.

As shown in FIG. 2, the flood level prediction model management method performed by the flood level prediction model management system 100 according to an embodiment of the present invention may include a step (S110) of inputting the models to be tested into the model input unit 120, a step (S120) of inputting datasets into the data input unit 140, a step (S130) of running experiments on the dataset scenarios for each model through the performance comparison unit 160 and comparing performance, and a step (S140) in which the model decision unit 180 determines and presents the model with the best performance.

The flood level prediction model management method according to an embodiment of the present invention may be performed by the flood level prediction model management system 100 described above.

As described above, the flood level prediction model management system 100 according to an embodiment of the present invention uses only a weather dataset and a water level dataset, and can therefore predict the water level more easily.

The system (device) described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the operating system. A processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but those skilled in the art will recognize that a processing device may include multiple processing elements and/or multiple types of processing elements. For example, a processing device may include multiple processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by a processing device or to provide instructions or data to a processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment or may be known to and usable by those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

As described above, the embodiments of the present invention have been described with reference to specific details, such as specific components, and to limited embodiments and drawings. These are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the field to which the present invention belongs can make various modifications and variations from this description. Accordingly, the spirit of the present invention should not be limited to the described embodiments, and the claims set out below, as well as everything equal or equivalent to those claims, fall within the scope of the spirit of the present invention.

100: Flood level prediction model management system
110: Water level prediction model unit
120: Model input unit
140: Data input unit
160: Performance comparison unit
180: Model decision unit
190: Database

Claims

A flood level prediction model management system comprising:
a model input unit to which an LSTM model or a GRU model is input;
a data input unit to which a meteorological dataset or a water level dataset is input; and
a performance comparison unit that compares, for the model input to the model input unit, performance for each type of dataset input to the data input unit.

The flood level prediction model management system according to claim 1,
wherein the models input to the model input unit are
a Multi LSTM model consisting of two LSTM layers, a Multi GRU model consisting of two GRU layers, and an LSTM-GRU model consisting of an LSTM and a GRU.

The flood level prediction model management system according to claim 2,
wherein the datasets input to the data input unit are
a dataset including water level data, a dataset including water level data and AWS weather data, and a dataset including water level data and ASOS weather data.

The flood level prediction model management system according to claim 3,
wherein the performance comparison unit
combines the three models input to the model input unit with the three datasets input to the data input unit, and tests or compares performance across a total of nine combinations of input data and model.
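
The nine combinations described in this claim can be enumerated as a simple cross product. The following is a minimal sketch only: the string labels and the function name `experiment_grid` are illustrative assumptions, not identifiers taken from the specification.

```python
from itertools import product

# Labels are illustrative stand-ins for the three models and three datasets
# named in the claims; they are not identifiers from the specification.
MODELS = ["Multi LSTM", "Multi GRU", "LSTM-GRU"]
DATASETS = ["water level", "water level + AWS", "water level + ASOS"]


def experiment_grid(models=MODELS, datasets=DATASETS):
    """Return every (model, dataset) pairing to be trained and evaluated."""
    return list(product(models, datasets))


grid = experiment_grid()
for model_name, dataset_name in grid:
    print(f"{model_name} trained on {dataset_name}")
```

Each pair would then be trained and scored separately, so that the effect of the model and the effect of the input data can be compared independently.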

The flood level prediction model management system according to claim 4,
wherein the performance comparison unit
uses MSE as the basic loss function for training the model input to the model input unit, and compares observed and predicted values on actual test data using the NSE and MAE indicators as auxiliary indicators.
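
The three indicators named here can be computed as follows. This is a sketch that assumes the standard textbook definitions of mean squared error, mean absolute error, and Nash-Sutcliffe efficiency (1 minus the ratio of squared residuals to the variance of the observations about their mean); the function names and sample values are illustrative and are not taken from the specification.

```python
def mse(observed, predicted):
    """Mean squared error between observed and predicted water levels."""
    errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(errors) / len(errors)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted water levels."""
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(errors) / len(errors)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit, 0.0 matches the
    skill of always predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Made-up sample values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(mse(observed, predicted), mae(observed, predicted), nse(observed, predicted))
```

Lower MSE and MAE indicate a better fit, whereas a higher NSE does, so the indicators need to be read in opposite directions when models are ranked.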

The flood level prediction model management system according to claim 4,
wherein the performance comparison unit
uses MSE, NSE, and MAE as performance comparison indicators for the water level prediction model input to the model input unit, and evaluates performance through these three indicators according to the water level prediction model input to the model input unit and the weather data.

The flood level prediction model management system according to claim 5 or 6,
wherein the performance comparison unit
determines the performance of the model input to the model input unit by comparing the MSE, NSE, and MAE indicators and the error in the predicted highest water level.
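
The claim does not state how the error in the highest water level is measured. One possible reading, shown here purely as an assumption, is the signed difference between the maximum of the predicted series and the maximum of the observed series over the test period; the function name and sample values are hypothetical.

```python
def peak_level_error(observed, predicted):
    """Signed error of the predicted peak: positive means the model
    overestimates the highest water level, negative means it underestimates.

    Assumption: the peaks are compared by value only, regardless of the
    time step at which each series reaches its maximum.
    """
    return max(predicted) - max(observed)


# Made-up sample values, for illustration only.
observed = [1.2, 3.4, 2.8]
predicted = [1.0, 3.1, 3.0]
print(peak_level_error(observed, predicted))
```

An alternative reading would compare the prediction at the time step of the observed peak, which also penalizes timing errors.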

The flood level prediction model management system according to claim 7,
further comprising a model decision unit that determines and presents the best-performing model among the Multi LSTM model, the Multi GRU model, and the LSTM-GRU model according to the performance comparison results of the performance comparison unit,
wherein the model decision unit determines, as the water level prediction model, the LSTM-GRU model that uses ASOS weather data and water level data as training data.
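
A model decision step of this kind could be sketched as a selection over the scored combinations. The ranking rule below (lowest MSE, ties broken by highest NSE) is an assumption, since the claims do not specify how the indicators are weighed against each other. The function name `choose_best` and the scores are hypothetical placeholders, not results reported in the specification.

```python
def choose_best(results):
    """Return the (model, dataset) key with the lowest MSE.

    Ties on MSE are broken in favor of the higher NSE. This ranking rule
    is an illustrative assumption, not the rule used in the patent.
    """
    return min(results, key=lambda k: (results[k]["mse"], -results[k]["nse"]))


# Placeholder scores for illustration only; NOT results from the patent.
placeholder_results = {
    ("model A", "dataset X"): {"mse": 0.30, "nse": 0.70},
    ("model B", "dataset X"): {"mse": 0.10, "nse": 0.90},
    ("model B", "dataset Y"): {"mse": 0.10, "nse": 0.95},
}
best = choose_best(placeholder_results)
print(best)
```

In practice the error in the predicted highest water level could be added to the sort key, since the preceding claim lists it alongside the three indicators.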