KR102642421B1

KR102642421B1 - Apparatus and method for air quality modeling based on artificial intelligence

Info

Publication number: KR102642421B1
Application number: KR1020220190402A
Authority: KR
Inventors: 우정헌; 어양담; 김진석; 이승우
Original assignee: 건국대학교 산학협력단
Priority date: 2022-12-30
Filing date: 2022-12-30
Publication date: 2024-02-28

Abstract

An artificial intelligence-based air quality modeling device and method are disclosed. An artificial intelligence-based air quality modeling method according to an embodiment of the present application may include: collecting learning data including emission data of air pollutants and concentration data corresponding to the air pollutants; determining a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data from the emission data as input, and an analysis range of the analysis model with respect to the learning data; building the analysis model using the learning data based on the determined learning technique and analysis range; and acquiring, as input data, analysis target data including the emission data for an analysis target space, and deriving, using the analysis model, output data including the concentration data for the analysis target space.

Description

Artificial intelligence-based air quality modeling device and method {APPARATUS AND METHOD FOR AIR QUALITY MODELING BASED ON ARTIFICIAL INTELLIGENCE}

The present application relates to an artificial intelligence-based air quality modeling device and method.

In Korea, deteriorating air quality caused by air pollutants has emerged as a serious issue. National-level research is therefore being actively conducted, and efforts are under way to improve air quality through a variety of policies. Air quality modeling is essential for establishing and verifying such air quality improvement policies.

However, conventional air quality modeling simulates the process by which emitted air pollution precursors move through the atmosphere and appear as environmental damage such as pollution levels, and this requires high computational cost and time for steps such as emissions processing, running a meteorological model, and running an atmospheric chemical transport model.

Accordingly, to establish air quality improvement policies efficiently and carry them out effectively, it is essential to develop a high-speed air quality model that allows the efficiency of a policy to be reviewed through rapid, cost-effective air quality modeling of air pollutants.

The background technology of the present application is disclosed in Korean Registered Patent Publication No. 10-2443982.

The present application is intended to solve the problems of the prior art described above. Its purpose is to provide an artificial intelligence-based air quality modeling device and method that perform artificial intelligence-based real-time air quality modeling in order to reduce the cost of 3D atmospheric environment modeling, which requires enormous computational cost and time, and that can simulate changes in air quality from the changes in emissions that result from air quality improvement policies.

However, the technical problems to be solved by the embodiments of the present application are not limited to those described above, and other technical problems may exist.

As a technical means for achieving the above technical problem, an artificial intelligence-based air quality modeling method according to an embodiment of the present application may include: collecting learning data including emission data of air pollutants and concentration data corresponding to the air pollutants; determining a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data from the emission data as input, and an analysis range of the analysis model with respect to the learning data; building the analysis model using the learning data based on the determined learning technique and analysis range; and acquiring, as input data, analysis target data including the emission data for an analysis target space, and deriving, using the analysis model, output data including the concentration data for the analysis target space.

In addition, in the determining step, the analysis range may be determined as either an entire data range, in which the entire space corresponding to each item of the learning data is the learning target, or a per-grid data range, in which a grid space obtained by dividing that space into a plurality of preset partition spaces is the learning target.

In addition, in the determining step, the learning technique may be determined as either a machine learning-based first learning technique or a deep learning-based second learning technique.

In addition, the first learning technique may include at least one of multiple linear regression, support vector regression, kernel regression, Gaussian regression, tree regression, random forest, and neural network regression.

In addition, the second learning technique may include at least one of a deep neural network, a recursive neural network, a convolutional neural network, LSTM (Long Short-Term Memory), and GRU (Gated Recurrent Unit).

In addition, in the building step, at least one hyperparameter applied to the second learning technique using the entire data range may be set to a value different from that of the corresponding hyperparameter applied to the second learning technique using the per-grid data range.

In addition, in the deriving step, prediction information on PM 2.5 data for the analysis target space may be derived as the output data using the analysis model.

Meanwhile, an analysis model learning method for artificial intelligence-based air quality modeling according to an embodiment of the present application may include: collecting learning data including emission data of air pollutants and concentration data corresponding to the air pollutants; determining a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data from the emission data as input, and an analysis range of the analysis model with respect to the learning data; and building the analysis model using the learning data based on the determined learning technique and analysis range.

Meanwhile, an artificial intelligence-based air quality modeling device according to an embodiment of the present application may include: a collection unit that collects learning data including emission data of air pollutants and concentration data corresponding to the air pollutants; a learning setting unit that determines a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data from the emission data as input, and an analysis range of the analysis model with respect to the learning data; a learning execution unit that builds the analysis model using the learning data based on the determined learning technique and analysis range; and an analysis unit that acquires, as input data, analysis target data including the emission data for an analysis target space and derives, using the analysis model, output data including the concentration data for the analysis target space.

In addition, the learning setting unit may determine the analysis range as either an entire data range, in which the entire space corresponding to each item of the learning data is the learning target, or a per-grid data range, in which a grid space obtained by dividing that space into a plurality of preset partition spaces is the learning target.

In addition, the learning setting unit may determine the learning technique as either a machine learning-based first learning technique or a deep learning-based second learning technique.

In addition, the learning execution unit may set at least one hyperparameter applied to the second learning technique using the entire data range to a value different from that of the corresponding hyperparameter applied to the second learning technique using the per-grid data range.

Meanwhile, an analysis model learning device for artificial intelligence-based air quality modeling according to an embodiment of the present application may include: a collection unit that collects learning data including emission data of air pollutants and concentration data corresponding to the air pollutants; a learning setting unit that determines a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data from the emission data as input, and an analysis range of the analysis model with respect to the learning data; and a learning execution unit that builds the analysis model using the learning data based on the determined learning technique and analysis range.

The means for solving the problem described above are merely illustrative and should not be construed as limiting the present application. In addition to the exemplary embodiments described above, additional embodiments may exist in the drawings and the detailed description of the invention.

According to the means for solving the problem described above, it is possible to provide an artificial intelligence-based air quality modeling device and method that perform artificial intelligence-based real-time air quality modeling in order to reduce the cost of 3D atmospheric environment modeling, which requires enormous computational cost and time, and that can simulate changes in air quality from the changes in emissions that result from air quality improvement policies.

However, the effects obtainable from the present application are not limited to those described above, and other effects may exist.

FIG. 1 is a schematic configuration diagram of an air quality modeling system according to an embodiment of the present application.
FIG. 2 is a conceptual diagram illustrating the analysis range of an artificial intelligence-based air quality modeling device according to an embodiment of the present application.
FIG. 3 is a conceptual diagram illustrating the training process of an artificial intelligence-based air quality modeling device according to an embodiment of the present application and the prediction process using the built analysis model.
FIG. 4 is a diagram showing examples of the first learning technique and the second learning technique applicable to an artificial intelligence-based air quality modeling device according to an embodiment of the present application.
FIG. 5 is a table showing examples of the hyperparameters applied to the second learning technique using the entire data range or the per-grid data range.
FIG. 6 is a table showing the performance of an analysis model built through learning based on the entire data range.
FIG. 7 is a table showing the performance of an analysis model built through learning based on the per-grid data range.
FIG. 8 is a schematic configuration diagram of an artificial intelligence-based air quality modeling device according to an embodiment of the present application.
FIG. 9 is an operation flowchart of an artificial intelligence-based air quality modeling method according to an embodiment of the present application.
FIG. 10 is an operation flowchart of an analysis model learning method for artificial intelligence-based air quality modeling according to an embodiment of the present application.

Hereinafter, embodiments of the present application are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present application pertains can readily implement them. However, the present application may be implemented in many different forms and is not limited to the embodiments described herein. To describe the present application clearly, parts of the drawings unrelated to the description are omitted, and similar parts are given similar reference numerals throughout the specification.

Throughout this specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" or "indirectly connected" with another element in between.

Throughout this specification, when a member is said to be located "on", "above", "at the top of", "under", "below", or "at the bottom of" another member, this includes not only the case where the member is in contact with the other member but also the case where a further member exists between the two.

Throughout this specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a schematic configuration diagram of an air quality modeling system according to an embodiment of the present application.

Referring to FIG. 1, an air quality modeling system 10 according to an embodiment of the present application may include an artificial intelligence-based air quality modeling device 100 according to an embodiment of the present application (hereinafter, "air quality modeling device 100"), a user terminal 200, and a database 300.

Also referring to FIG. 1, the air quality modeling device 100 may operate to take as input analysis target data 1, which includes emission data for an analysis target space, and to output output data 2, which includes concentration data for that analysis target space. For example, the air quality modeling device 100 disclosed herein may derive prediction information on PM 2.5 data for the analysis target space as the output data, using the artificial intelligence-based analysis model described in detail below. However, it is not limited thereto; depending on the implementation, it may provide as output data prediction information on the concentration data of various pollutants such as sulfur oxides, nitrogen oxides, hydrocarbons, carbon monoxide, fine dust, ultrafine dust, dust, and secondary pollutants.

The air quality modeling device 100, the user terminal 200, and the database 300 may communicate with one another through a network 20. The network 20 refers to a connection structure that allows information to be exchanged between nodes such as terminals and servers. Examples of the network 20 include, but are not limited to, a 3GPP (3rd Generation Partnership Project) network, an LTE (Long Term Evolution) network, a 5G network, a WiMAX (Worldwide Interoperability for Microwave Access) network, the Internet, a LAN (Local Area Network), a wireless LAN (Wireless Local Area Network), a WAN (Wide Area Network), a PAN (Personal Area Network), a Wi-Fi network, a Bluetooth network, a satellite broadcasting network, an analog broadcasting network, and a DMB (Digital Multimedia Broadcasting) network.

The user terminal 200 may be any kind of wireless communication device, for example a smartphone, a smart pad, a tablet PC, or a terminal based on PCS (Personal Communication System), GSM (Global System for Mobile Communications), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (Wideband Code Division Multiple Access), or WiBro (Wireless Broadband Internet). In the description of the embodiments of the present application, the user terminal 200 may be a device that receives user input on the selection of learning data for building the analysis model, the setting of independent and dependent variables, the analysis range, the selection of a learning technique, and the like, and passes it to the air quality modeling device 100, or that shows the output data derived by the analysis model of the air quality modeling device 100 on a display or the like.

In the description of the embodiments of the present application, the database 300 may be a server or device that stores the learning data used to train the artificial intelligence-based analysis model mounted on the air quality modeling device 100, the output data derived by the analysis model in response to analysis target data input to the air quality modeling device 100, and the like.

Hereinafter, the specific functions and operations of the air quality modeling device 100 are described in detail with reference to FIGS. 2 to 5.

The air quality modeling device 100 may collect learning data including emission data of air pollutants and concentration data corresponding to the air pollutants. Specifically, the learning data collected by the air quality modeling device 100 may include information on gridded emission fields and concentration fields for air pollutant emission change scenarios. In other words, the air quality modeling device 100 may collect, as learning data, a data set that pairs emission data, which contains information on the gridded emission field of air pollutants formed in a specific space under an emission change scenario, with concentration data, which contains information on the concentration field (gridded concentration field) of secondary pollutants and the like formed in that space under the same scenario.
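As a minimal sketch of what one item of such learning data could look like, the following Python container pairs one emission field with one concentration field per scenario. The names `ScenarioSample`, `grid_shape`, and the numeric values are hypothetical and chosen only for illustration; the source does not specify a storage format.

```python
from dataclasses import dataclass
from typing import List, Tuple

Grid = List[List[float]]  # rows x columns of per-cell values (assumed layout)


def grid_shape(grid: Grid) -> Tuple[int, int]:
    """Return (rows, cols) and reject ragged grids."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    if any(len(row) != cols for row in grid):
        raise ValueError("grid rows must have equal length")
    return (rows, cols)


@dataclass
class ScenarioSample:
    """One emission-change scenario: gridded emissions and the resulting concentrations."""
    scenario_id: str
    emission_field: Grid
    concentration_field: Grid

    def __post_init__(self) -> None:
        # Both fields describe the same space, so they must share one grid shape.
        if grid_shape(self.emission_field) != grid_shape(self.concentration_field):
            raise ValueError("emission and concentration grids must share one shape")


# Made-up numbers, for illustration only.
sample = ScenarioSample(
    scenario_id="baseline",
    emission_field=[[1.0, 0.0], [2.5, 0.5]],
    concentration_field=[[12.0, 8.0], [15.5, 9.0]],
)
```

Validating the shapes at construction time keeps later steps (whole-domain or per-grid training) from silently pairing cells that do not correspond.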

In addition, the air quality modeling device 100 may determine a learning technique for building an artificial intelligence-based analysis model that is trained to derive concentration data from emission data as input, and an analysis range of the analysis model with respect to the learning data.

FIG. 2 is a conceptual diagram illustrating the analysis range of an artificial intelligence-based air quality modeling device according to an embodiment of the present application.

Referring to FIG. 2, the air quality modeling device 100 may determine the analysis range as either an entire data range, in which the entire space corresponding to each item of the learning data is the learning target, or a per-grid data range, in which a grid space obtained by dividing that space into a plurality of preset partition spaces is the learning target.

More specifically, (a) of FIG. 2 shows a processing technique based on the entire data for the space corresponding to the learning data, and (b) of FIG. 2 shows a processing technique based on the per-grid data for each grid space obtained by dividing the space corresponding to the learning data into a plurality of partition spaces.

In this regard, as shown in (a) of FIG. 2, the air quality modeling device 100 disclosed herein may build (train) the analysis model based on a learning data set that includes the entire emission data 11 of the space corresponding to each item of learning data and the entire concentration data 22 of that space. As another example, as shown in (b) of FIG. 2, the air quality modeling device 100 may build the analysis model based on a learning data set that includes, out of the entire emission data 11 corresponding to each item of learning data, grid emission data 11' for a specific grid space and grid concentration data 22' for that grid space.
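The difference between the two analysis ranges can be sketched in plain Python as two ways of turning the same pair of fields into training examples. This is an illustrative reading only: the function names are hypothetical, and treating each grid space as a single cell with one emission value is an assumption, since the source does not fix the size of a grid space.

```python
from typing import List, Tuple

Grid = List[List[float]]


def whole_domain_pair(emission: Grid, concentration: Grid) -> Tuple[List[float], List[float]]:
    """Entire data range: one example per scenario, built from the full flattened fields."""
    x = [value for row in emission for value in row]
    y = [value for row in concentration for value in row]
    return x, y


def per_cell_pairs(emission: Grid, concentration: Grid) -> List[Tuple[float, float]]:
    """Per-grid data range: one (emission, concentration) example per grid cell."""
    return [
        (e, c)
        for e_row, c_row in zip(emission, concentration)
        for e, c in zip(e_row, c_row)
    ]
```

With a 2 x 2 field, `whole_domain_pair` yields a single example with four inputs and four outputs, while `per_cell_pairs` yields four scalar examples, which is why the two ranges can call for different model settings.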

FIG. 3 is a conceptual diagram illustrating the training process of an artificial intelligence-based air quality modeling device according to an embodiment of the present application and the prediction process using the built analysis model.

Specifically, (a) of FIG. 3 shows the training process of the analysis model performed by the air quality modeling device 100, and (b) of FIG. 3 shows the prediction process for the analysis target space that the air quality modeling device 100 performs using the analysis model.

According to an embodiment of the present application, the air quality modeling device 100 may receive, through the user terminal 200, a user input that selects training data (X_train, Y_train) and test data (X_test, Y_test) from the data sets stored in the database 300, and designate the training data and test data accordingly. It may also receive, through the user terminal 200, a user input that selects which features of the learning data the analysis model is to learn, and designate the independent and dependent variables accordingly.
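A minimal sketch of this selection step, assuming the stored data sets are keyed by a scenario identifier and the user supplies two lists of identifiers. The function name `select_sets` and the dictionary layout are hypothetical; only the names X_train, Y_train, X_test, and Y_test come from the text.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def select_sets(
    dataset: Dict[str, Tuple[Vector, Vector]],
    train_ids: Sequence[str],
    test_ids: Sequence[str],
) -> Tuple[List[Vector], List[Vector], List[Vector], List[Vector]]:
    """Build (X_train, Y_train, X_test, Y_test) from user-chosen scenario ids."""
    overlap = set(train_ids) & set(test_ids)
    if overlap:
        # A scenario used for both training and testing would inflate test scores.
        raise ValueError(f"ids selected for both training and test: {sorted(overlap)}")
    X_train = [dataset[i][0] for i in train_ids]  # independent variable: emissions
    Y_train = [dataset[i][1] for i in train_ids]  # dependent variable: concentrations
    X_test = [dataset[i][0] for i in test_ids]
    Y_test = [dataset[i][1] for i in test_ids]
    return X_train, Y_train, X_test, Y_test
```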

Meanwhile, referring to FIG. 3, when the air quality modeling device 100 trains the analysis model, the learning data may be selected for training in one of several types, such as all data, all data excluding 0, or grid (cell) data excluding 0, and the applicable learning techniques may differ depending on the data type of the learning data.

For example, learning data of the grid (cell) data excluding 0 type may be used selectively in order to apply a deep learning-based convolutional neural network, or, depending on the hardware memory specifications of the air quality modeling device 100, training on all data or on all data excluding 0 may not be possible when machine learning methods such as kernel regression or Gaussian regression are applied. However, the present application is not limited thereto.
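The "excluding 0" data types can be illustrated with a small filter. This sketch assumes that "0" refers to cells whose emission value is zero; the source does not say whether the test is on emissions, concentrations, or both, and the name `drop_zero_emission` is hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (cell emission, cell concentration)


def drop_zero_emission(pairs: List[Pair]) -> List[Pair]:
    """Keep only the cells with a non-zero emission value (assumed meaning of 'excluding 0')."""
    return [(e, c) for e, c in pairs if e != 0.0]
```

Filtering in this way shrinks the training set, which is one plausible reason memory-heavy methods could become feasible on the reduced data.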

FIG. 4 is a diagram showing examples of the first learning technique and the second learning technique applicable to an artificial intelligence-based air quality modeling device according to an embodiment of the present application.

Referring to FIG. 4, the air quality modeling device 100 may determine the learning technique as either a machine learning-based first learning technique, which includes at least one of multiple linear regression, support vector regression, kernel regression, Gaussian regression, tree regression, random forest, and neural network regression, or a deep learning-based second learning technique, which includes at least one of a deep neural network, a recursive neural network, a convolutional neural network, LSTM (Long Short-Term Memory), and GRU (Gated Recurrent Unit).
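The two families of techniques listed above can be captured in a small lookup table, sketched below. The identifiers (`Family`, `TECHNIQUES`, `family_of`, and the snake_case keys) are hypothetical names introduced for illustration; the grouping itself follows the text.

```python
from enum import Enum


class Family(Enum):
    MACHINE_LEARNING = "first learning technique"
    DEEP_LEARNING = "second learning technique"


# Grouping as described in the text; the key spellings are illustrative.
TECHNIQUES = {
    "multiple_linear_regression": Family.MACHINE_LEARNING,
    "support_vector_regression": Family.MACHINE_LEARNING,
    "kernel_regression": Family.MACHINE_LEARNING,
    "gaussian_regression": Family.MACHINE_LEARNING,
    "tree_regression": Family.MACHINE_LEARNING,
    "random_forest": Family.MACHINE_LEARNING,
    "neural_network_regression": Family.MACHINE_LEARNING,
    "deep_neural_network": Family.DEEP_LEARNING,
    "recursive_neural_network": Family.DEEP_LEARNING,
    "convolutional_neural_network": Family.DEEP_LEARNING,
    "lstm": Family.DEEP_LEARNING,
    "gru": Family.DEEP_LEARNING,
}


def family_of(name: str) -> Family:
    """Return the family a technique belongs to, rejecting unknown names."""
    try:
        return TECHNIQUES[name]
    except KeyError:
        raise ValueError(f"unknown learning technique: {name}") from None
```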

Meanwhile, according to an embodiment of the present application, when applying the first learning technique based on support vector regression, the air quality modeling device 100 may use, among nonlinear kernel functions, the RBF (Radial Basis Function) kernel function, which is known to perform well among the machine learning techniques used in the field of air quality analysis.
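For reference, the standard textbook form of the RBF kernel and of the resulting support vector regression predictor is given below. The symbols are not defined in the source and are introduced here as assumptions: \gamma > 0 is the kernel width parameter, x_i are the training inputs, \alpha_i and \alpha_i^{*} are the dual coefficients, and b is the bias term.

```latex
k(x, x') = \exp\!\left(-\gamma \,\lVert x - x' \rVert^{2}\right), \qquad \gamma > 0

f(x) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^{*}\right) k(x_i, x) + b
```

Because k depends only on the distance between x and x', the predictor responds most strongly to training scenarios whose emissions are close to the queried emissions.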

Meanwhile, the details of each specific type of the machine learning-based first learning technique or the deep learning-based second learning technique described above (multiple linear regression, support vector regression, kernel regression, Gaussian regression, tree regression, random forest, deep neural network, recurrent neural network, convolutional neural network, LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit), etc.) are obvious to those skilled in the art, and a detailed description thereof is therefore omitted.

The air quality modeling device 100 may build an artificial intelligence-based analysis model using the learning data it has collected, based on the determined learning technique and analysis range.

FIG. 5 is a table showing examples of hyperparameters applied to the second learning technique using the entire data range or the grid-specific data range.

Referring to FIG. 5, at least one hyperparameter may be given different values for the second learning technique using the entire data range and for the second learning technique using the grid-specific data range.

Specifically, referring to FIG. 5, for the second learning technique (image processing technique) using the entire data range, compared with the second learning technique using the grid-specific data range, the epoch value may be set relatively larger (100 > 50), the batch size (Batch_size) value may be set relatively larger (64 > 32), and the learning rate value may be set relatively smaller (0.001 < 0.1); however, the present application is not limited thereto.
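The example values quoted above from FIG. 5 can be captured in a small configuration sketch. The class name, the scope keys `"full"` and `"grid"`, and the lookup function are hypothetical; only the numeric values come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    epochs: int
    batch_size: int
    learning_rate: float


# Values as quoted in the description of FIG. 5 (examples, not limits).
HYPERPARAMETERS = {
    "full": Hyperparameters(epochs=100, batch_size=64, learning_rate=0.001),
    "grid": Hyperparameters(epochs=50, batch_size=32, learning_rate=0.1),
}


def hyperparameters_for(scope: str) -> Hyperparameters:
    """Look up the example hyperparameters for an analysis range."""
    try:
        return HYPERPARAMETERS[scope]
    except KeyError:
        raise ValueError(f"unknown analysis range: {scope}") from None
```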

The air quality modeling device 100 may obtain, as input data, analysis target data including emission data for an analysis target space. In addition, the air quality modeling device 100 may derive output data including concentration data for the analysis target space using the built (trained) analysis model.

For example, the output data that the analysis model of the air quality modeling device 100 outputs in response to the input analysis target data may be concentration data in the form of a real-time concentration field of a specific pollutant (e.g., fine dust) over the analysis target space.

Hereinafter, experimental examples of the performance of the artificial intelligence-based analysis model of the air quality modeling device 100 disclosed herein are described with reference to FIGS. 6 and 7.

FIG. 6 is a table showing the performance of an analysis model built through learning based on the entire data range, and FIG. 7 is a table showing the performance of an analysis model built through learning based on the grid-specific data range.

Referring to FIGS. 6 and 7, it can be confirmed that the air quality modeling device 100 disclosed herein drastically reduces the time required for atmospheric environment simulation compared with deriving concentration data by applying a conventional numerical equation-based three-dimensional atmospheric environment model or the like, and that regression performance metrics such as RMSE (Root Mean Squared Error) and the R² score also indicate excellent performance.
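The two metrics named above have standard definitions, sketched here in plain Python. The function names and the sample values in the comment are illustrative only; they are not results from FIGS. 6 and 7.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Raises ZeroDivisionError if y_true is constant (SS_tot == 0).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up example: a perfect prediction gives RMSE 0 and R² 1.
observed = [1.0, 2.0, 3.0]
print(rmse(observed, observed), r2_score(observed, observed))
```

A lower RMSE and an R² closer to 1 both indicate that predicted concentrations track the reference concentrations more closely.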

Meanwhile, referring to FIGS. 6 and 7, the types of artificial intelligence-based learning techniques that can be selected when training the analysis model may be determined differently depending on the analysis range of the learning data.

For example, machine learning-based first learning techniques such as kernel regression and Gaussian regression may be selectively applicable only to the grid-specific data range. As another example, the deep learning-based second learning techniques selectable for the entire data range may include DNN, RNN, LSTM, and the like, while the deep learning-based second learning techniques selectable for the grid-specific data range may include DNN, RNN, LSTM, GRU, CNN, and the like.
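The selection rules in this example can be sketched as a small compatibility check. This sketch reads the lists literally, although the description says "and the like", so treating GRU and CNN as grid-only is an assumption; all names are hypothetical.

```python
# Assumption: the example lists are taken literally for illustration.
GRID_ONLY_FIRST = frozenset({"kernel_regression", "gaussian_regression"})
SECOND_BY_SCOPE = {
    "full": frozenset({"DNN", "RNN", "LSTM"}),
    "grid": frozenset({"DNN", "RNN", "LSTM", "GRU", "CNN"}),
}


def is_selectable(technique: str, scope: str) -> bool:
    """Check whether a technique may be chosen for an analysis range."""
    if scope not in SECOND_BY_SCOPE:
        raise ValueError(f"unknown analysis range: {scope}")
    if technique in GRID_ONLY_FIRST:
        return scope == "grid"
    all_second = SECOND_BY_SCOPE["full"] | SECOND_BY_SCOPE["grid"]
    if technique in all_second:
        return technique in SECOND_BY_SCOPE[scope]
    # Techniques the paragraph does not mention are left unrestricted here.
    return True
```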

FIG. 8 is a schematic configuration diagram of an artificial intelligence-based air quality modeling device according to an embodiment of the present application.

Referring to FIG. 8, the air quality modeling device 100 may include a collection unit 110, a learning setting unit 120, a learning execution unit 130, and an analysis unit 140. For reference, in the description of the embodiments of the present application, the analysis model learning device 100 for artificial intelligence-based air quality modeling may be understood as a device that includes, among the components of the air quality modeling device 100, the collection unit 110, the learning setting unit 120, and the learning execution unit 130, which perform the function of training the artificial intelligence-based air quality analysis model.

The collection unit 110 may collect learning data including emission data of air pollutants and concentration data corresponding to the air pollutants.

The learning setting unit 120 may determine a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data from the emission data as input, and an analysis range of the analysis model with respect to the learning data.

Specifically, the learning setting unit 120 may determine the analysis range to be either an entire data range, in which the entire space corresponding to each item of learning data is the learning target, or a grid-specific data range, in which grid spaces obtained by dividing that space into a plurality of preset partition spaces are the learning target.
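The difference between the two analysis ranges can be sketched on a toy two-dimensional field. The partition sizes, the field values, and the function name are assumptions for illustration; the present application does not specify how the partition spaces are laid out.

```python
from typing import List

Grid = List[List[float]]


def split_into_cells(field: Grid, cell_rows: int, cell_cols: int) -> List[Grid]:
    """Divide a 2-D field into equally sized rectangular partitions (row-major order)."""
    n_rows, n_cols = len(field), len(field[0])
    if n_rows % cell_rows or n_cols % cell_cols:
        raise ValueError("field size must be a multiple of the cell size")
    cells = []
    for r in range(0, n_rows, cell_rows):
        for c in range(0, n_cols, cell_cols):
            cells.append([row[c:c + cell_cols] for row in field[r:r + cell_rows]])
    return cells


# Toy 4 x 4 emission field with values 0.0 .. 15.0.
field = [[float(4 * r + c) for c in range(4)] for r in range(4)]
full_scope = [field]                          # entire data range: one learning target
grid_scope = split_into_cells(field, 2, 2)    # grid-specific range: four targets
```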

In addition, according to an embodiment of the present application, the learning setting unit 120 may determine the learning technique to be either a machine learning-based first learning technique including at least one of multiple linear regression, support vector regression, kernel regression, Gaussian regression, tree regression, random forest, and neural network regression, or a deep learning-based second learning technique including at least one of a deep neural network, a recurrent neural network, a convolutional neural network, LSTM (Long Short-Term Memory), and GRU (Gated Recurrent Unit).

The learning execution unit 130 may build the artificial intelligence-based analysis model using the learning data collected by the collection unit 110, based on the determined learning technique and analysis range.

The analysis unit 140 may obtain, as input data, analysis target data including emission data for an analysis target space. In addition, the analysis unit 140 may derive output data including concentration data for the analysis target space using the built (trained) analysis model.
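The division of labor among the four units described above can be sketched as a small pipeline. This is a toy stand-in under stated assumptions: the data are made up, a one-variable least-squares fit replaces the actual learning techniques, and every function and class name is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Sample = Tuple[float, float]  # (emission, concentration); toy 1-D stand-in


@dataclass(frozen=True)
class LearningSetting:
    technique: str  # e.g. "linear_regression"
    scope: str      # "full" or "grid"


def collect() -> List[Sample]:
    """Collection unit: toy data following concentration = 2 * emission + 1."""
    return [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def configure() -> LearningSetting:
    """Learning setting unit: choose the technique and the analysis range."""
    return LearningSetting(technique="linear_regression", scope="full")


def train(samples: List[Sample], setting: LearningSetting) -> Callable[[float], float]:
    """Learning execution unit: fit a least-squares line to the samples."""
    if setting.technique != "linear_regression":
        raise NotImplementedError(setting.technique)
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda emission: slope * emission + intercept


def analyze(model: Callable[[float], float], emissions: List[float]) -> List[float]:
    """Analysis unit: map emission inputs to predicted concentrations."""
    return [model(e) for e in emissions]


model = train(collect(), configure())
predicted = analyze(model, [4.0])
```

Keeping training (`collect`, `configure`, `train`) separate from inference (`analyze`) mirrors the distinction the description draws between the analysis model learning device and the full modeling device.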

Hereinafter, the operation flow of the present application is briefly described based on the details given above.

FIG. 9 is an operation flowchart of an artificial intelligence-based air quality modeling method according to an embodiment of the present application.

The artificial intelligence-based air quality modeling method shown in FIG. 9 may be performed by the air quality modeling device 100 described above. Therefore, even where omitted below, the description given for the air quality modeling device 100 applies equally to the description of the artificial intelligence-based air quality modeling method.

Referring to FIG. 9, in step S11, the collection unit 110 may collect learning data including emission data of air pollutants and concentration data corresponding to the air pollutants.

Next, in step S12, the learning setting unit 120 may determine a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data from the emission data as input, and an analysis range of the analysis model with respect to the learning data.

Specifically, in step S12, the learning setting unit 120 may determine the analysis range to be either an entire data range, in which the entire space corresponding to each item of learning data is the learning target, or a grid-specific data range, in which grid spaces obtained by dividing that space into a plurality of preset partition spaces are the learning target.

In addition, according to an embodiment of the present application, in step S12, the learning setting unit 120 may determine the learning technique to be either a machine learning-based first learning technique including at least one of multiple linear regression, support vector regression, kernel regression, Gaussian regression, tree regression, random forest, and neural network regression, or a deep learning-based second learning technique including at least one of a deep neural network, a recurrent neural network, a convolutional neural network, LSTM (Long Short-Term Memory), and GRU (Gated Recurrent Unit).

Next, in step S13, the learning execution unit 130 may build the artificial intelligence-based analysis model using the learning data collected in step S11, based on the determined learning technique and analysis range.

Next, in step S14, the analysis unit 140 may obtain, as input data, analysis target data including emission data for an analysis target space.

Next, in step S15, the analysis unit 140 may derive output data including concentration data for the analysis target space using the analysis model built (trained) in step S13.

In the above description, steps S11 to S15 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present application. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.

FIG. 10 is an operation flowchart of an analysis model learning method for artificial intelligence-based air quality modeling according to an embodiment of the present application.

The analysis model learning method for artificial intelligence-based air quality modeling shown in FIG. 10 may be performed by the air quality modeling device 100 described above. Therefore, even where omitted below, the description given for the air quality modeling device 100 applies equally to the description of FIG. 10.

Referring to FIG. 10, in step S21, the collection unit 110 may collect learning data including emission data of air pollutants and concentration data corresponding to the air pollutants.

Next, in step S22, the learning setting unit 120 may determine a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data from the emission data as input, and an analysis range of the analysis model with respect to the learning data.

Specifically, in step S22, the learning setting unit 120 may determine the analysis range to be either an entire data range, in which the entire space corresponding to each item of learning data is the learning target, or a grid-specific data range, in which grid spaces obtained by dividing that space into a plurality of preset partition spaces are the learning target.

In addition, according to an embodiment of the present application, in step S22, the learning setting unit 120 may determine the learning technique to be either a machine learning-based first learning technique including at least one of multiple linear regression, support vector regression, kernel regression, Gaussian regression, tree regression, random forest, and neural network regression, or a deep learning-based second learning technique including at least one of a deep neural network, a recurrent neural network, a convolutional neural network, LSTM (Long Short-Term Memory), and GRU (Gated Recurrent Unit).

Next, in step S23, the learning execution unit 130 may build the artificial intelligence-based analysis model using the learning data collected in step S21, based on the determined learning technique and analysis range.

In the above description, steps S21 to S23 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present application. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.

The artificial intelligence-based air quality modeling method and the analysis model learning method for artificial intelligence-based air quality modeling according to an embodiment of the present application may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine language code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

In addition, the artificial intelligence-based air quality modeling method and the analysis model learning method for artificial intelligence-based air quality modeling described above may also be implemented in the form of a computer program or application that is stored in a recording medium and executed by a computer.

The foregoing description of the present application is for illustrative purposes, and those of ordinary skill in the art to which the present application pertains will understand that it can be easily modified into other specific forms without changing its technical idea or essential features. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise, components described as distributed may be implemented in a combined form.

The scope of the present application is indicated by the claims set out below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present application.

10: Air quality modeling system
100: Artificial intelligence-based air quality modeling device
110: Collection unit
120: Learning setting unit
130: Learning execution unit
140: Analysis unit
200: User terminal
300: Database
20: Network

Claims

In the artificial intelligence-based air quality modeling method,
Collecting learning data selected from a plurality of types, including emission data of air pollutants and concentration data according to the air pollutants, and including total data, total data excluding 0, and cell data excluding 0;
Determining a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data using the emission data as input, and an analysis range of the analysis model for the learning data;
Constructing the analysis model using the learning data based on the determined learning technique and analysis range; and
Obtaining analysis target data including the emission data for the analysis target space as input data, and using the analysis model to derive output data including the concentration data for the analysis target space,
Including,
The determining step is,
The analysis range is determined as the entire data range in which the entire space corresponding to each of the learning data is the learning target, or the data range for each grid in which the space is divided into a plurality of preset partition spaces as the learning target,
The determining step is,
The learning technique is determined as a first learning technique based on machine learning or a second learning technique based on deep learning,
For kernel regression or Gaussian regression among the first learning techniques, learning data of a type other than the total data and the total data excluding 0 is selected from the plurality of types, and for a convolutional neural network among the second learning techniques, the cell data excluding 0 is selected from the plurality of types as the learning data,
The construction step is,
At least one hyperparameter is applied with different values for the second learning technique using the entire data range and for the second learning technique using the grid-specific data range,
A modeling method in which, among the hyperparameters, the epoch and batch size are set larger for the second learning technique using the entire data range than for the second learning technique using the grid-specific data range, and the learning rate is set smaller for the second learning technique using the entire data range than for the second learning technique using the grid-specific data range.

delete

According to claim 1,
The first learning technique is,
A modeling method comprising at least one of multiple linear regression, support vector regression, kernel regression, Gaussian regression, tree regression, random forest, and neural network regression.

According to claim 1,
The second learning technique is,
A modeling method comprising at least one of a deep neural network, a recurrent neural network, a convolutional neural network, a Long Short-Term Memory (LSTM), and a Gated Recurrent Unit (GRU).

delete

According to claim 1,
The derivation step is,
A modeling method that uses the analysis model to derive prediction information about PM 2.5 data in the analysis target space as the output data.

In the analysis model learning method for artificial intelligence-based air quality modeling,
Collecting learning data selected from a plurality of types, including emission data of air pollutants and concentration data according to the air pollutants, and including total data, total data excluding 0, and cell data excluding 0;
Determining a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data using the emission data as input, and an analysis range of the analysis model for the learning data; and
Building the analysis model using the learning data based on the determined learning technique and analysis range,
Including,
The determining step is,
The analysis range is determined as the entire data range in which the entire space corresponding to each of the learning data is the learning target, or the data range for each grid in which the space is divided into a plurality of preset partition spaces as the learning target,
The determining step is,
The learning technique is determined as a first learning technique based on machine learning or a second learning technique based on deep learning,
For kernel regression or Gaussian regression among the first learning techniques, learning data of a type other than the total data and the total data excluding 0 is selected from the plurality of types, and for a convolutional neural network among the second learning techniques, the cell data excluding 0 is selected from the plurality of types as the learning data,
The construction step is,
At least one hyperparameter is applied with different values for the second learning technique using the entire data range and for the second learning technique using the grid-specific data range,
A learning method in which, among the hyperparameters, the epoch and batch size are set larger for the second learning technique using the entire data range than for the second learning technique using the grid-specific data range, and the learning rate is set smaller for the second learning technique using the entire data range than for the second learning technique using the grid-specific data range.

In an artificial intelligence-based air quality modeling device,
A collection unit that includes emission data of air pollutants and concentration data according to the air pollutants, and collects learning data selected from a plurality of types including total data, total data excluding 0, and cell data excluding 0;
a learning setting unit that determines a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data using the emission data as input, and an analysis range of the analysis model for the learning data;
a learning execution unit that builds the analysis model using the learning data based on the determined learning technique and analysis range; and
An analysis unit that obtains analysis target data including the emission data for the analysis target space as input data and uses the analysis model to derive output data including the concentration data for the analysis target space;
Including,
The learning setting unit,
The analysis range is determined as the entire data range in which the entire space corresponding to each of the learning data is the learning target, or the data range for each grid in which the space is divided into a plurality of preset partition spaces as the learning target,
The learning setting unit,
The learning technique is determined as a first learning technique based on machine learning or a second learning technique based on deep learning,
For kernel regression or Gaussian regression among the first learning techniques, learning data of a type other than the total data and the total data excluding 0 is selected from the plurality of types, and for a convolutional neural network among the second learning techniques, the cell data excluding 0 is selected from the plurality of types as the learning data,
The learning execution unit,
At least one hyperparameter is applied with different values for the second learning technique using the entire data range and for the second learning technique using the grid-specific data range,
A modeling device in which, among the hyperparameters, the epoch and batch size are set larger for the second learning technique using the entire data range than for the second learning technique using the grid-specific data range, and the learning rate is set smaller for the second learning technique using the entire data range than for the second learning technique using the grid-specific data range.

delete

According to claim 9,
The first learning technique is,
A modeling device comprising at least one of multiple linear regression, support vector regression, kernel regression, Gaussian regression, tree regression, random forest, and neural network regression.

According to claim 9,
The second learning technique is,
A modeling device comprising at least one of a deep neural network, a recurrent neural network, a convolutional neural network, a Long Short-Term Memory (LSTM), and a Gated Recurrent Unit (GRU).

delete

In the analysis model learning device for artificial intelligence-based air quality modeling,
A collection unit that includes emission data of air pollutants and concentration data according to the air pollutants, and collects learning data selected from a plurality of types including total data, total data excluding 0, and cell data excluding 0;
a learning setting unit that determines a learning technique for building an artificial intelligence-based analysis model that is trained to derive the concentration data using the emission data as input, and an analysis range of the analysis model for the learning data; and
a learning execution unit that builds the analysis model using the learning data based on the determined learning technique and analysis range;
Including,
The learning setting unit,
The analysis range is determined as the entire data range in which the entire space corresponding to each of the learning data is the learning target, or the data range for each grid in which the space is divided into a plurality of preset partition spaces as the learning target,
The learning setting unit,
The learning technique is determined as a first learning technique based on machine learning or a second learning technique based on deep learning,
For kernel regression or Gaussian regression among the first learning techniques, learning data of a type other than the total data and the total data excluding 0 is selected from the plurality of types, and for a convolutional neural network among the second learning techniques, the cell data excluding 0 is selected from the plurality of types as the learning data,
The learning execution unit,
At least one hyperparameter is applied with different values for the second learning technique using the entire data range and for the second learning technique using the grid-specific data range,
A learning device in which, among the hyperparameters, the epoch and batch size are set larger for the second learning technique using the entire data range than for the second learning technique using the grid-specific data range, and the learning rate is set smaller for the second learning technique using the entire data range than for the second learning technique using the grid-specific data range.