KR102186942B1

KR102186942B1 - Method and apparatus for predicting ultrafine dust information

Info

Publication number: KR102186942B1
Application number: KR1020190066675A
Authority: KR
Inventors: 이수원; 이득우
Original assignee: 숭실대학교산학협력단
Priority date: 2019-05-13
Filing date: 2019-06-05
Publication date: 2020-12-04
Also published as: KR20200131141A

Abstract

Disclosed are a method and apparatus for predicting ultrafine dust information. The ultrafine dust information prediction method includes: collecting raw data including weather data and fine dust data of a first time point; performing data preprocessing on the raw data of the first time point; and inputting the preprocessed raw data of the first time point into a pre-trained ultrafine dust information prediction model to obtain prediction information on the ultrafine dust concentration at a second time point, which is later than the first time point.

Description

Method and apparatus for predicting ultrafine dust information {METHOD AND APPARATUS FOR PREDICTING ULTRAFINE DUST INFORMATION}

The following embodiments relate to a technology for predicting ultrafine dust information.

Interest in fine dust has recently grown worldwide. Fine dust refers to particulate matter suspended in the air. It is a cause of respiratory disease and can also cause failures in industrial facilities and mechanical equipment. Fine dust is classified by size into two types: PM (Particulate Matter) 10, with a particle diameter of 10 µm (micrometers), and PM2.5, with a particle diameter of 2.5 µm. Of these, PM2.5 penetrates deep into the body, reaching the blood vessels and even the brain, so it can cause more serious health problems than exposure to PM10. It is therefore important to accurately predict, and prepare for, days on which the PM2.5 level is bad. However, most studies in Korea on predicting daily average fine dust have targeted PM10.

Even among studies that target PM2.5 fine dust, most assume special circumstances such as volcanic eruptions, and there is no PM2.5 prediction research that uses everyday information such as weather, air pollutants, or the influence of China. Research is therefore needed on a method that can predict PM2.5 fine dust under ordinary conditions.

According to an embodiment, a method for predicting ultrafine dust information may include: collecting raw data including weather data and fine dust data of a first time point; performing data preprocessing on the raw data of the first time point; and inputting the preprocessed raw data of the first time point into a pre-trained ultrafine dust information prediction model to obtain prediction information on the ultrafine dust concentration at a second time point, which is later than the first time point.

The ultrafine dust information prediction model may be a support vector machine (SVM) and may be trained on training data including weather data and fine dust data of a period earlier than the first time point.

The ultrafine dust information prediction model may receive preprocessed training data and output an output value including a prediction of the ultrafine dust concentration, and the gamma parameter of the support vector machine may be adjusted based on the output value and the training data.

The collecting of the raw data of the first time point may include collecting data on the temperature, precipitation, wind volume, PM (Particulate Matter) 10 fine dust level, and PM2.5 fine dust level at the first time point for a predetermined area and for Baengnyeong Island.

The performing of the data preprocessing may include performing principal component analysis (PCA) on the raw data.

The prediction information on the ultrafine dust concentration at the second time point may include information in which the daily average ultrafine dust concentration of the predetermined area at the second time point is determined to be one of good, normal, and bad.

According to an embodiment, a learning method for training an ultrafine dust information prediction model may include: collecting raw data including weather data and fine dust data of a period earlier than a first time point; performing data preprocessing on the raw data; inputting the preprocessed raw data, as training data, into the ultrafine dust information prediction model to obtain an output value including a prediction of the ultrafine dust concentration; and adjusting a parameter of the ultrafine dust information prediction model based on the output value and the training data.

The adjusting of the parameter of the ultrafine dust information prediction model may include adjusting the gamma parameter of the ultrafine dust information prediction model, which is a support vector machine, to adjust the curvature of the decision boundary of the support vector machine.

According to an embodiment, an ultrafine dust information prediction apparatus that performs the ultrafine dust information prediction method may include: a data collection unit that collects raw data including weather data and fine dust data of a first time point; a data preprocessing unit that performs data preprocessing on the weather data of the first time point; and an ultrafine dust information prediction unit that receives the preprocessed weather data of the first time point and predicts the ultrafine dust concentration at a second time point.

According to an embodiment, a learning apparatus for training an ultrafine dust information prediction model may include: a data collection unit that collects raw data including weather data and fine dust data of a period earlier than a first time point; a data preprocessing unit that performs data preprocessing on the raw data; and a learning unit that inputs the preprocessed raw data, as training data, into the ultrafine dust information prediction model to obtain an output value including a prediction of the ultrafine dust concentration, and adjusts a parameter of the ultrafine dust information prediction model based on the output value and the training data.

According to an embodiment, the overall daily average PM2.5 fine dust level of a predetermined area such as Seoul can be predicted using information such as weather data, air pollutant data, and data on the influence of Chinese weather on Korean weather.

According to an embodiment, information that citizens need for daily life and outdoor activities can be provided in advance by predicting information on the ultrafine dust concentration.

According to an embodiment, the approach can be applied not only to predicting the ultrafine dust concentration but also to other fields in which results are predicted by learning from numerical values.

FIG. 1 is a diagram illustrating the overall configuration of an ultrafine dust information prediction system according to an embodiment.
FIG. 2 is a flowchart illustrating the operation of a learning method for an ultrafine dust information prediction model according to an embodiment.
FIG. 3 is a flowchart illustrating the operation of an ultrafine dust information prediction method according to an embodiment.
FIG. 4 is a diagram illustrating the configuration of a learning apparatus for training an ultrafine dust information prediction model according to an embodiment.
FIG. 5 is a diagram illustrating the configuration of an ultrafine dust information prediction apparatus according to an embodiment.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. However, various changes may be made to the embodiments, and the scope of the patent application is not restricted or limited by these embodiments. All changes, equivalents, and substitutes of the embodiments should be understood to fall within the scope of rights.

The terms used in the embodiments are for description only and should not be construed as limiting. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the embodiments belong. Terms defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this application.

In the description with reference to the accompanying drawings, the same components are given the same reference numerals regardless of the drawing, and redundant descriptions thereof are omitted. In describing the embodiments, detailed descriptions of related known technologies are omitted when they are judged to unnecessarily obscure the gist of the embodiments.

FIG. 1 is a diagram illustrating the overall configuration of an ultrafine dust information prediction system according to an embodiment.

Referring to FIG. 1, the ultrafine dust information prediction system may predict the concentration of ultrafine dust having a particle diameter of 2.5 micrometers based on past weather data and fine dust data. The system may perform the prediction based on raw data including past and present weather data and fine dust data.

In an embodiment, the ultrafine dust information prediction system may include a data preprocessing unit 110 and an ultrafine dust information prediction model 120. The data preprocessing unit 110 may receive raw data including past weather data and fine dust data and perform data preprocessing on it. The data preprocessing unit 110 may perform principal component analysis on the raw data. Here, principal component analysis may refer to a technique that treats the direction in which the variance of the data is greatest as a principal component and reduces the dimensionality by using the principal components as new axes. The raw data may also be scaled, for example, to values between 0 and 1.
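The scaling step mentioned above can be sketched as follows. This is a minimal illustration only, assuming column-wise min-max scaling; the function name `min_max_scale`, the sample values, and the handling of constant columns are assumptions and are not taken from the specification. The principal component analysis step is not shown here.

```python
from typing import List


def min_max_scale(rows: List[List[float]]) -> List[List[float]]:
    """Scale each column of `rows` to the range [0, 1]."""
    if not rows:
        return []
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled_row = []
        for value, low, high in zip(row, lows, highs):
            span = high - low
            # A constant column carries no information; map it to 0.0 (assumption).
            scaled_row.append((value - low) / span if span else 0.0)
        scaled.append(scaled_row)
    return scaled


# Hypothetical rows: [temperature, precipitation, PM2.5 level]
sample = [[-5.0, 0.0, 40.0], [10.0, 12.0, 20.0], [25.0, 3.0, 10.0]]
scaled_sample = min_max_scale(sample)
```

Scaling each column separately keeps variables with large numeric ranges, such as PM10 levels, from dominating variables with small ranges before dimensionality reduction.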

The raw data preprocessed by the data preprocessing unit 110 may be input to the ultrafine dust information prediction model 120. Here, the ultrafine dust information prediction model 120 may refer to a model that has already been trained. Based on the preprocessed raw data, the ultrafine dust information prediction model 120 may predict the fine dust concentration at a time point later than the time point to which the weather data and fine dust data in the raw data correspond.

In an embodiment, the ultrafine dust information prediction model 120 may be a support vector machine. The model 120 may be trained by adjusting the gamma parameter of the support vector machine based on the training data and the output value produced by its prediction. Because the model 120 is trained on nonlinear data, it may be a radial basis function (RBF) kernel support vector machine, which is strong at nonlinear learning. In the RBF kernel support vector machine of the model 120, the gamma parameter may be fine-tuned to adjust the curvature of the decision boundary of the support vector machine.

FIG. 2 is a flowchart illustrating the operation of a learning method for an ultrafine dust information prediction model according to an embodiment.

Referring to FIG. 2, in step 210, the learning apparatus that trains the ultrafine dust information prediction model may collect raw data including weather data and fine dust data of a period earlier than a first time point. Here, the collected raw data may include data on the temperature, precipitation, wind volume, PM10 fine dust level, and PM2.5 fine dust level of a predetermined area for the period earlier than the first time point. Because the fine dust concentration may be related to the movement of primary fine dust (physical fine dust) and to the movement and formation conditions of secondary fine dust (chemical fine dust), the ultrafine dust information prediction model may be trained on weather data, fine dust data, and the like. In addition, because air pollutants may contribute to the formation of secondary fine dust, information on air pollutants such as CO, NO₂, O₃, and SO₂ may also be needed to train the model, depending on the embodiment.

Because of geopolitical characteristics, fine dust prediction may require, in addition to weather information, information on fine dust flowing in from surrounding regions. Since the fine dust concentration in Korea may be affected by the weather or fine dust concentration in China, data on the temperature, precipitation, wind volume, PM10 fine dust level, and PM2.5 fine dust level of Baengnyeong Island, which is close to China, for the period earlier than the first time point may also be included in the raw data.

In one example, the raw data used to train the ultrafine dust information prediction model may include weather data and fine dust data for the period from January 1, 2015 to December 31, 2017. Weather data and fine dust data for the period from January 1 to September 30, 2018 may be used to test the performance of the model.
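The training and test periods in this example amount to a chronological split, which might be sketched as below. The record layout (a `day` field and a `pm25` field) and the helper name `split_by_period` are hypothetical and serve only to illustrate the date ranges stated in the text.

```python
from datetime import date
from typing import Dict, List, Tuple

# Periods taken from the example in the text.
TRAIN_START, TRAIN_END = date(2015, 1, 1), date(2017, 12, 31)
TEST_START, TEST_END = date(2018, 1, 1), date(2018, 9, 30)


def split_by_period(records: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split daily records into training and test sets by their `day` field."""
    train = [r for r in records if TRAIN_START <= r["day"] <= TRAIN_END]
    test = [r for r in records if TEST_START <= r["day"] <= TEST_END]
    return train, test


# Hypothetical records; the field names are illustrative only.
records = [
    {"day": date(2014, 12, 31), "pm25": 30.0},
    {"day": date(2015, 1, 1), "pm25": 22.0},
    {"day": date(2017, 12, 31), "pm25": 41.0},
    {"day": date(2018, 1, 1), "pm25": 18.0},
    {"day": date(2018, 10, 1), "pm25": 12.0},
]
train_set, test_set = split_by_period(records)
```

Splitting by date rather than at random keeps every test day later than every training day, which matches the forecasting setting described here.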

Here, the raw data of the period earlier than the first time point may be data that can easily be collected from any source that provides weather-related information. The predetermined area may refer to an area, such as Seoul, for which the fine dust concentration is to be predicted.

In step 220, the learning apparatus may perform data preprocessing on the raw data. In an embodiment, the learning apparatus may perform principal component analysis on the raw data. Principal component analysis may refer to a technique that treats the direction in which the variance of the data is greatest as a principal component and reduces the dimensionality by using the principal components as new axes. In this specification, the principal components of the raw data may be extracted in three dimensions. The proportion of the variance of the entire data that is explained by the variance of each newly extracted dimension m may be defined by the following equation.

Here, the vector in the equation denotes the principal component vector for the data, and n and p denote the upper limits of the summation indices i and j, respectively.
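The symbols of the equation referred to above are not legible in this text. As a hedged reconstruction only, the standard proportion-of-variance-explained formula for the m-th principal component is consistent with the description (a principal component vector applied to the data, with summation indices i and j running to n and p). The symbols $\phi_{jm}$ (loading of variable j on component m) and $x_{ij}$ (centered value of variable j in observation i) are assumptions, not taken from the source:

```latex
\mathrm{PVE}_m \;=\;
\frac{\displaystyle\sum_{i=1}^{n}\Bigl(\sum_{j=1}^{p}\phi_{jm}\,x_{ij}\Bigr)^{2}}
     {\displaystyle\sum_{j=1}^{p}\sum_{i=1}^{n}x_{ij}^{2}}
```

Under this form, the values for the three extracted components sum to at most 1, and their sum indicates how much of the total variance the three-dimensional representation retains.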

In step 230, the learning apparatus may input the preprocessed raw data, as training data, into the ultrafine dust information prediction model to obtain an output value including a prediction of the ultrafine dust concentration.

In step 240, the learning apparatus may adjust a parameter of the ultrafine dust information prediction model based on the output value and the training data.

In an embodiment, to predict the ultrafine dust concentration accurately, the learning apparatus may use a variety of data to train the ultrafine dust information prediction model. However, as the number of data types grows, interactions between the data may arise and increase the nonlinearity of the data. To overcome this nonlinearity, the model may be a radial basis function kernel support vector machine with an adjusted gamma parameter. The learning apparatus may adjust the gamma parameter of the model, which is a support vector machine, to adjust the curvature of the decision boundary of the support vector machine.

The learning apparatus may adjust the parameter of the ultrafine dust information prediction model based on the similarity between the output value produced by the model, which includes a prediction of the ultrafine dust concentration, and the information in the training data on the actual ultrafine dust concentration at the time point corresponding to the output value.

In an embodiment, the learning apparatus may calculate the similarity between the output value and the information on the actual ultrafine dust concentration using the following equation.

Here, the terms of the equation denote, respectively, the training data, the set of support vectors, and the parameter corresponding to each data point. The radial basis function kernel K can be expressed as the following equation.
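The two equations referred to here are not legible in this text. As a hedged reconstruction, the common textbook forms of a kernel support vector machine output and of the radial basis function kernel are consistent with the surrounding description (training data, a set of support vectors, one parameter per data point, and a squared difference scaled by gamma). The symbols $x$, $\mathcal{S}$, $\alpha_i$, $\beta_0$, and $x_{ij}$ are assumptions, not taken from the source:

```latex
f(x) \;=\; \beta_0 + \sum_{i \in \mathcal{S}} \alpha_i \, K(x, x_i)
\qquad\text{and}\qquad
K(x_i, x_{i'}) \;=\; \exp\!\Bigl(-\gamma \sum_{j=1}^{p} \bigl(x_{ij} - x_{i'j}\bigr)^{2}\Bigr)
```

In this form, only the support vectors in $\mathcal{S}$ contribute to the output, and $\gamma$ controls how quickly the influence of each support vector decays with distance.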

Here, γ (gamma) adjusts the curvature of the kernel's decision boundary. A support vector machine based on the radial basis function kernel can optimize the curvature of the nonlinear decision boundary by using the square of x, as in Equation 3. The larger the value of γ, the greater the curvature of K, so that K is strongly influenced by nearby training data and the model may overfit. The value of γ may therefore need to be adjusted so as to maximize the similarity defined above for the given data. In this specification, γ is varied between 0.1 and 1 to determine the kernel curvature that best fits the data.
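The effect of gamma on the radial basis function kernel can be illustrated with a short sketch. It assumes the common form exp(-gamma * squared Euclidean distance); the function name `rbf_kernel` and the two sample vectors are hypothetical.

```python
import math
from typing import Sequence


def rbf_kernel(a: Sequence[float], b: Sequence[float], gamma: float) -> float:
    """Return exp(-gamma * squared Euclidean distance between a and b)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


# Two hypothetical preprocessed feature vectors (three principal components).
u = [0.2, 0.5, 0.1]
v = [0.6, 0.1, 0.4]
low_gamma = rbf_kernel(u, v, gamma=0.1)
high_gamma = rbf_kernel(u, v, gamma=1.0)
```

With the larger gamma the kernel value for the same pair of points is smaller, so each training point influences a narrower neighborhood. This is the mechanism behind the overfitting risk described above.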

For example, when the gamma parameter is set to 0.25, the similarity between the output value and the ultrafine dust concentration at the corresponding time point may be found to be the highest. In this case, the change in decision boundary curvature produced by a gamma value of 0.25 may be judged to represent the nonlinearity of the data best, and 0.25 may be judged to be the optimal gamma parameter.
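Selecting gamma by trying candidate values and keeping the one with the highest similarity can be sketched as a simple sweep. The helper `select_gamma`, the 0.05 step size, and the stand-in `toy_score` function are assumptions. In practice the score would come from training the support vector machine with each gamma and comparing its outputs with the actual concentrations.

```python
from typing import Callable, Iterable, Tuple


def select_gamma(candidates: Iterable[float],
                 score: Callable[[float], float]) -> Tuple[float, float]:
    """Return the (gamma, score) pair with the highest score."""
    best_gamma, best_score = None, float("-inf")
    for gamma in candidates:
        current = score(gamma)
        if current > best_score:
            best_gamma, best_score = gamma, current
    if best_gamma is None:
        raise ValueError("no candidate gamma values were given")
    return best_gamma, best_score


# Candidate grid from 0.1 to 1.0 in steps of 0.05 (the step size is an assumption).
grid = [round(0.1 + 0.05 * k, 2) for k in range(19)]


def toy_score(gamma: float) -> float:
    # Stand-in for "train the SVM with this gamma and measure similarity";
    # it simply peaks at 0.25 to mirror the example in the text.
    return 1.0 - abs(gamma - 0.25)


chosen_gamma, chosen_score = select_gamma(grid, toy_score)
```

Passing the scoring function as an argument keeps the sweep independent of how the model is trained or how similarity is measured.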

FIG. 3 is a flowchart illustrating the operation of an ultrafine dust information prediction method according to an embodiment.

Referring to FIG. 3, in step 310, the ultrafine dust information prediction apparatus may collect raw data including weather data and fine dust data of a first time point. Here, the raw data of the first time point may refer to data on the temperature, precipitation, wind volume, PM10 fine dust level, and PM2.5 fine dust level at the first time point for a predetermined area and for Baengnyeong Island. Because various factors can affect the ultrafine dust concentration, the raw data may include various kinds of data. For example, when predicting the ultrafine dust concentration in Seoul, if westerly winds are frequent around the time to be predicted, then because of Korea's topography the fine dust concentration or yellow dust level at Baengnyeong Island, carried in from China, may affect the ultrafine dust concentration in Seoul. Accordingly, depending on the embodiment, the raw data of the first time point may include weather-related data such as the weather data and fine dust data of Baengnyeong Island.
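One way to picture the raw data of the first time point is as a single vector that joins the measurements of the target area with those of Baengnyeong Island. The class `SiteObservation`, its field names, the helper `build_raw_vector`, and the sample values below are all hypothetical and are only meant to illustrate the five quantities listed for each site.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass(frozen=True)
class SiteObservation:
    """Measurements for one site at the first time point (field names are illustrative)."""
    temperature: float    # air temperature
    precipitation: float  # precipitation
    wind_volume: float    # wind volume
    pm10: float           # PM10 fine dust level
    pm25: float           # PM2.5 fine dust level


def build_raw_vector(target: SiteObservation,
                     baengnyeong: SiteObservation) -> List[float]:
    """Concatenate the target-area and Baengnyeong Island observations into one vector."""
    return list(astuple(target)) + list(astuple(baengnyeong))


# Hypothetical values for a single day.
seoul = SiteObservation(3.5, 0.0, 2.1, 55.0, 31.0)
island = SiteObservation(4.0, 0.0, 6.3, 70.0, 42.0)
raw_vector = build_raw_vector(seoul, island)
```

The resulting ten-element vector is the kind of input that would then be scaled and reduced by principal component analysis in the preprocessing step.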

In step 320, the ultrafine dust information prediction apparatus may perform data preprocessing on the raw data of the first time point. The apparatus may scale the raw data and perform principal component analysis on it.

In step 330, the ultrafine dust information prediction apparatus may input the preprocessed raw data of the first time point into the pre-trained ultrafine dust information prediction model to obtain prediction information on the ultrafine dust concentration at a second time point, that is, a time point later than the first time point.

Here, the ultrafine dust information prediction model may be a support vector machine. In particular, in an embodiment, the model may be a support vector machine based on a radial basis function kernel, which is well suited to learning the nonlinearity of data. The model may be trained on training data including weather data and fine dust data of a period earlier than the first time point. In an embodiment, the model may be trained by receiving preprocessed training data, outputting an output value including a prediction of the ultrafine dust concentration, and having the gamma parameter of the support vector machine adjusted based on the output value and the training data.

The prediction information on the ultrafine dust concentration at the second time point that the ultrafine dust information prediction model outputs may include information in which the daily average ultrafine dust concentration of the predetermined area at the second time point is determined to be one of good, normal, and bad. That is, in an embodiment, the model may predict the ultrafine dust concentration in Seoul at a predetermined time point and predict the daily average ultrafine dust concentration in Seoul as one of good, normal, and bad. Here, the criteria for dividing the daily average ultrafine dust concentration into good, normal, and bad may be as follows: a daily average PM2.5 value of 0 µg/m³ to 15 µg/m³ may be classified as good, 16 µg/m³ to 35 µg/m³ as normal, and 36 µg/m³ or more as bad. The model may predict a numerical value for the ultrafine dust concentration in Seoul at the predetermined time point and, based on the predicted value, determine good, normal, or bad according to the predetermined classification criteria.
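The classification criteria above can be expressed as a small function. This is a sketch only: the function name `classify_pm25` is hypothetical, and because the text gives whole-number bands, the treatment of fractional values between 15 and 16 or between 35 and 36 is an assumption.

```python
def classify_pm25(daily_mean: float) -> str:
    """Map a daily mean PM2.5 value (µg/m³) to 'good', 'normal', or 'bad'."""
    if daily_mean < 0:
        raise ValueError("concentration cannot be negative")
    # The text gives integer bands (0-15, 16-35, 36 or more). Assigning
    # fractional values by upper bound (<= 15, <= 35) is an assumption.
    if daily_mean <= 15:
        return "good"
    if daily_mean <= 35:
        return "normal"
    return "bad"


labels = [classify_pm25(v) for v in (0, 15, 16, 35, 36, 80)]
```

Applying the thresholds after a numerical prediction, as described above, keeps the numerical estimate available alongside the good, normal, or bad label.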

FIG. 4 is a diagram illustrating the configuration of a learning device that trains an ultrafine dust information prediction model according to an embodiment.

Referring to FIG. 4, in one embodiment, the learning device 400 that trains the ultrafine dust information prediction model 440 may include a data collection unit 410, a data preprocessing unit 420, and a learning unit 430. The learning device 400 may correspond to the learning device described in this specification.

In one embodiment, the data collection unit 410 may collect raw data that includes weather data and fine dust data for a period before a first time point, where the first time point is a predetermined point in time. The raw data collected by the data collection unit 410 may be general weather-related data for a predetermined region that anyone can readily obtain.

The data preprocessing unit 420 may preprocess the raw data collected by the data collection unit 410. Through the data preprocessing unit 420, the raw data may be scaled to values between 0 and 1. The data preprocessing unit 420 may also perform principal component analysis (PCA) on the raw data.
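The scaling step can be illustrated with per-feature min-max scaling, which is one common way to map values into the range 0 to 1. The description does not state which scaling formula is used, so min-max scaling is an assumption here, and the helper names `fit_min_max` and `scale_row` are hypothetical. The PCA step is not shown.

```python
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return the per-feature minimum and maximum over the given rows."""
    columns = list(zip(*rows))  # transpose rows into feature columns
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    return mins, maxs


def scale_row(row: Sequence[float], mins: Sequence[float], maxs: Sequence[float]) -> List[float]:
    """Scale one row to [0, 1] using previously fitted minima and maxima."""
    scaled = []
    for value, lo, hi in zip(row, mins, maxs):
        if hi == lo:
            # Constant feature: avoid division by zero (assumed convention).
            scaled.append(0.0)
        else:
            scaled.append((value - lo) / (hi - lo))
    return scaled
```

Fitting the minima and maxima on the training period and reusing them for the first-time-point data keeps the two on the same scale; values outside the fitted range would then fall outside 0 to 1.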

The learning unit 430 may use the preprocessed raw data as training data to train the ultrafine dust information prediction model 440.

In one embodiment, the learning unit 430 may input the training data into the ultrafine dust information prediction model 440 to obtain an output value that includes a prediction of the ultrafine dust concentration. The learning unit 430 may adjust a parameter of the ultrafine dust information prediction model 440 based on the output value and the training data. Specifically, based on the similarity between the output value and the data in the training data that corresponds to the output value, the learning unit 430 may adjust the parameter so that the ultrafine dust information prediction model 440 can perform optimal prediction.
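One simple way to realize this kind of parameter adjustment is to try several candidate values and keep the one whose outputs agree best with the known data. The sketch below assumes such a search; the description does not specify the search procedure, and both `select_parameter` and the `evaluate` callback (which would train the model with a candidate value and return an agreement score) are hypothetical.

```python
from typing import Callable, Iterable


def select_parameter(candidates: Iterable[float], evaluate: Callable[[float], float]) -> float:
    """Return the candidate value with the highest score.

    `evaluate` is a hypothetical callback: given a candidate parameter value,
    it trains the model and returns how well the model outputs match the
    corresponding known values (higher is better).
    """
    best_value = None
    best_score = float("-inf")
    for value in candidates:
        score = evaluate(value)
        if score > best_score:
            best_value, best_score = value, score
    if best_value is None:
        raise ValueError("no candidate values were given")
    return best_value
```

In practice the score would be computed on data held out from training, so that the selected value reflects prediction quality rather than memorization.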

The ultrafine dust information prediction model 440 may be a radial basis function (RBF) kernel support vector machine optimized for learning nonlinear data. While the learning unit 430 trains the ultrafine dust information prediction model 440 and tests its performance, the gamma parameter of the support vector machine may be fine-tuned so that the curvature of the decision boundary of the support vector machine is adjusted appropriately.
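To make the role of gamma concrete, the standard RBF kernel is K(x, y) = exp(-gamma * ||x - y||^2). The short sketch below evaluates it directly; the function name `rbf_kernel` is chosen here for illustration and is not taken from the description.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

With a larger gamma the kernel value drops off faster with distance, so each training sample influences a smaller neighborhood and the decision boundary can bend more sharply; with a smaller gamma the boundary becomes smoother. This is why tuning gamma changes the curvature of the decision boundary.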

FIG. 5 is a diagram illustrating the configuration of an ultrafine dust information prediction device according to an embodiment.

Referring to FIG. 5, in one embodiment, the ultrafine dust information prediction device 500 may include a data collection unit 510, a data preprocessing unit 520, and an ultrafine dust information prediction unit 530. The ultrafine dust information prediction device 500 may correspond to the ultrafine dust information prediction device described in this specification.

In one embodiment, the data collection unit 510 may collect raw data that includes weather data and fine dust data at a first time point. The raw data at the first time point may include not only the weather data and fine dust data of the region whose ultrafine dust concentration is to be predicted, but also the weather data and fine dust data of a region that, because of topographical characteristics, can affect the ultrafine dust concentration of the region to be predicted.
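The raw data for one time point can be pictured as one row that joins the measurements of the target region with those of an influencing region. The sketch below uses the five quantities listed in the claims (temperature, precipitation, air volume, PM10, PM2.5); the class name, field names, and feature ordering are hypothetical, and units are not specified in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RegionObservation:
    """Measurements for one region at one time point (hypothetical layout)."""
    temperature: float
    precipitation: float
    air_volume: float
    pm10: float
    pm25: float

    def as_list(self) -> List[float]:
        return [self.temperature, self.precipitation, self.air_volume,
                self.pm10, self.pm25]


def build_raw_row(target: RegionObservation, influencing: RegionObservation) -> List[float]:
    """Concatenate target-region and influencing-region features into one row."""
    return target.as_list() + influencing.as_list()
```

Per the claims, the target could be a predetermined region and the influencing region could be Baengnyeong Island, giving a ten-value row before preprocessing.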

The data preprocessing unit 520 may preprocess the raw data of the first time point collected by the data collection unit 510, converting it into a form that can be input to the ultrafine dust information prediction model 540.

The ultrafine dust information prediction unit 530 may obtain prediction information on the ultrafine dust concentration at a second time point, which is later than the first time point, based on the preprocessed raw data of the first time point.

In one embodiment, the ultrafine dust information prediction unit 530 may obtain the prediction information on the ultrafine dust concentration through the ultrafine dust information prediction model 540. The ultrafine dust information prediction unit 530 may input the preprocessed raw data of the first time point into the ultrafine dust information prediction model 540 to obtain the prediction information on the ultrafine dust concentration at the second time point.

In one embodiment, the ultrafine dust information prediction model 540 may output, as the prediction information, a numerical value for the ultrafine dust concentration of a predetermined region at the second time point. Based on this numerical value, the ultrafine dust information prediction unit 530 or the ultrafine dust information prediction model 540 may determine the prediction information for the predetermined region at the second time point as good, normal, or bad.
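The flow through the units of FIG. 5 (preprocess, predict a value, grade the value) can be sketched as a composition of three callables. Everything here is illustrative: `predict_grade` and the toy stand-ins are hypothetical, and the toy model is an arbitrary arithmetic placeholder, not a trained support vector machine.

```python
from typing import Callable, List, Sequence


def predict_grade(
    raw_row: Sequence[float],
    preprocess: Callable[[Sequence[float]], Sequence[float]],
    model: Callable[[Sequence[float]], float],
    to_grade: Callable[[float], str],
) -> str:
    """Preprocess first-time-point data, predict a concentration, then grade it."""
    features = preprocess(raw_row)      # role of the data preprocessing unit
    concentration = model(features)     # role of the prediction model
    return to_grade(concentration)      # good / normal / bad decision


# Toy stand-ins so the sketch runs end to end (assumptions, not the real parts).
def toy_preprocess(row: Sequence[float]) -> List[float]:
    return [value / 100.0 for value in row]


def toy_model(features: Sequence[float]) -> float:
    return 100.0 * sum(features) / len(features)


def toy_grade(concentration: float) -> str:
    if concentration <= 15:
        return "good"
    if concentration <= 35:
        return "normal"
    return "bad"
```

Passing the stages in as arguments mirrors the description's point that either the prediction unit or the model itself may perform the final grading.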

The device described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that execute on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiments may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the following claims.

400: learning device
500: ultrafine dust information prediction device
410, 510: data collection unit
110, 420, 520: data preprocessing unit
430: learning unit
530: ultrafine dust information prediction unit
120, 440, 540: ultrafine dust information prediction model

Claims

A method for predicting ultrafine dust information, the method comprising:
collecting raw data including weather data and fine dust data at a first time point;
performing data preprocessing on the raw data of the first time point; and
inputting the preprocessed raw data of the first time point into a pre-trained ultrafine dust information prediction model to obtain prediction information on the ultrafine dust concentration at a second time point that is later than the first time point,
wherein, in a training process, the ultrafine dust information prediction model receives preprocessed training data and outputs an output value including a prediction of the ultrafine dust concentration, and a gamma parameter of a support vector machine is adjusted based on the output value and the training data.

The method of claim 1,
wherein the ultrafine dust information prediction model is a support vector machine (SVM) and is trained on training data including weather data and fine dust data from a period before the first time point.

delete

The method of claim 1,
wherein collecting the raw data of the first time point comprises collecting data on the temperature, precipitation, air volume, PM (Particulate Matter) 10 value, and PM2.5 value at the first time point for a predetermined region and for Baengnyeong Island.

The method of claim 1,
wherein performing the data preprocessing comprises performing principal component analysis (PCA) on the raw data.

The method of claim 1,
wherein the prediction information on the ultrafine dust concentration at the second time point includes information determined as one of good, normal, and bad for the daily average ultrafine dust concentration of a predetermined region at the second time point.

A learning method for training an ultrafine dust information prediction model, the method comprising:
collecting raw data including weather data and fine dust data for a period before a first time point;
performing data preprocessing on the raw data;
inputting the preprocessed raw data, as training data, into the ultrafine dust information prediction model to obtain an output value including a prediction of the ultrafine dust concentration; and
adjusting a parameter of the ultrafine dust information prediction model based on the output value and the training data,
wherein adjusting the parameter of the ultrafine dust information prediction model comprises adjusting a gamma parameter of the ultrafine dust information prediction model, which is a support vector machine, to adjust the curvature of a decision boundary of the support vector machine.

delete

An ultrafine dust information prediction device for performing an ultrafine dust information prediction method, the device comprising:
a data collection unit that collects raw data including weather data and fine dust data at a first time point;
a data preprocessing unit that performs data preprocessing on the weather data of the first time point; and
an ultrafine dust information prediction unit that receives the preprocessed weather data of the first time point and predicts the ultrafine dust concentration at a second time point through an ultrafine dust information prediction model,
wherein, in a training process, the ultrafine dust information prediction model receives preprocessed training data and outputs an output value including a prediction of the ultrafine dust concentration, and a gamma parameter of a support vector machine is adjusted based on the output value and the training data.

A learning device for training an ultrafine dust information prediction model, the device comprising:
a data collection unit that collects raw data including weather data and fine dust data for a period before a first time point;
a data preprocessing unit that performs data preprocessing on the raw data; and
a learning unit that inputs the preprocessed raw data, as training data, into the ultrafine dust information prediction model to obtain an output value including a prediction of the ultrafine dust concentration, and adjusts a parameter of the ultrafine dust information prediction model based on the output value and the training data,
wherein the learning unit adjusts a gamma parameter of the ultrafine dust information prediction model, which is a support vector machine, to adjust the curvature of a decision boundary of the support vector machine.