KR102634529B1

KR102634529B1 - Agricultural Price Prediction Apparatus Using Multi-Step Time Series Forecasting and Method for Predicting Agricultural Product Price Using the Same

Info

Publication number: KR102634529B1
Application number: KR1020230067187A
Authority: KR
Inventors: 권성한; 최윤성
Original assignee: 주식회사 온투인
Priority date: 2023-05-24
Filing date: 2023-05-24
Publication date: 2024-02-07

Abstract

This specification relates to a method for predicting agricultural product prices performed by an agricultural product price prediction device. The method includes: selecting one time series prediction model from among a plurality of time series prediction models; reading price data for a plurality of agricultural products from a database and preprocessing the data for training the selected time series prediction model; clustering the price data with a clustering model that groups one or more agricultural products per cluster, thereby generating at least one price data cluster; tuning the hyper-parameters of the selected time series prediction model and training the tuned model on each price data cluster, thereby generating a plurality of agricultural product price prediction models that output a predicted price for each agricultural product at future time points; evaluating the performance of each agricultural product price prediction model for each agricultural product and determining the best-performing model as the final agricultural product price prediction model for that product; and calculating the predicted price of a target agricultural product on a target date using the final agricultural product price prediction model corresponding to the target agricultural product.

Description

Agricultural Price Prediction Apparatus Using Multi-Step Time Series Forecasting and Method for Predicting Agricultural Product Price Using the Same

This specification relates to an agricultural product price prediction device using a multi-step time series forecasting method, and to a method for predicting agricultural product prices using the same.

In recent years, rapidly changing climate conditions and international affairs have increased the price volatility of agricultural products, and prices can surge uncontrollably when a natural disaster occurs. Because sudden price changes strongly affect both producers and consumers, stabilizing agricultural product prices is an important issue handled at the national level. Producers must carefully decide the cultivation volume and cultivation area each year, and these decisions are closely tied to price. However, production is difficult to adjust around the harvest period, so prices repeatedly spike and collapse. Other causes of imbalance between supply and demand include weather, pests and diseases, poor storability, and inelastic consumption behavior. Unstable supply and demand affects prices, which in turn affects producers and consumers and hinders active trading of agricultural products.

To address this problem, the government has distributed an agricultural product supply-and-demand control manual organized into three stages (Caution, Alert, and Serious). As part of price monitoring, the 'Agricultural Observation Headquarters' established by the Korea Rural Economic Institute collects and analyzes various statistical data and publishes a monthly observation report. The Korea Agro-Fisheries and Food Trade Corporation (aT) has been holding competitions since 2020 to discover models that can predict agricultural product prices, and Gyeongsangnam-do provides price forecasts through its "Major Agricultural Product Price Prediction System." Social demand for agricultural product price prediction is thus increasing. However, these services provide predicted prices only up to about seven days ahead. Short-term price forecasts can help with managing the shipment of agricultural products, but farmers deciding which crop to sow and fintech operators who want to offer financial services based on agricultural product prices need mid- to long-term price forecasts.

Korean Registered Patent Publication No. 10-2137583 discloses a method for predicting the price and sales volume of agricultural products using LSTM.

The above is provided only to aid understanding of the background of the technical ideas of the present invention, and therefore it should not be understood as prior art known to those skilled in the art to which the present invention pertains.

Korean Registered Patent Publication No. 10-2137583, 2020.07.24.

This specification is intended to solve the problems described above, and an embodiment of this specification aims to provide a model that predicts mid- to long-term agricultural product prices using an agricultural product price prediction model based on a multi-step time series forecasting method.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

This specification presents a method for predicting agricultural product prices performed by an agricultural product price prediction device. The method may include: selecting one time series prediction model from among a plurality of time series prediction models; reading price data for a plurality of agricultural products from a database and preprocessing the data for training the selected time series prediction model; clustering the price data with a clustering model that groups one or more agricultural products per cluster, thereby generating at least one price data cluster; tuning the hyper-parameters of the selected time series prediction model and training the tuned model on each price data cluster, thereby generating a plurality of agricultural product price prediction models that output a predicted price for each agricultural product at future time points; evaluating the performance of each agricultural product price prediction model for each agricultural product and determining the best-performing model as the final agricultural product price prediction model for that product; and calculating the predicted price of a target agricultural product on a target date using the final agricultural product price prediction model corresponding to the target agricultural product.

The agricultural product price prediction method performed by the agricultural product price prediction device, and other embodiments, may include the following features.

According to an embodiment, the clustering model may cluster the price data so that agricultural product items that satisfy a predetermined tuning pattern for hyper-parameter tuning, when the clustered price data is applied to the selected time series prediction model, fall into the same cluster.

According to an embodiment, the clustering model may also cluster the price data into the number of clusters that yields the best performance evaluation result when the clustered price data is applied to the selected time series prediction model.

According to an embodiment, hyper-parameter tuning of the selected time series prediction model may be a process of training the selected time series prediction model on each of the at least one price data cluster while varying the value of each hyper-parameter within a predetermined range, and adopting for each hyper-parameter the value at which the selected time series prediction model achieves the best performance.
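
The tuning procedure described above amounts to a search over a predetermined range of values. The following is a minimal Python sketch, assuming an exhaustive grid search, which the specification does not name explicitly. The names grid_search, param_grid, evaluate, and toy_error are hypothetical; evaluate stands in for "train the selected model on one price data cluster and return its validation error".

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and keep the one with the lowest error."""
    names = sorted(param_grid)
    best_params, best_error = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        error = evaluate(params)  # e.g. validation error after training on one cluster
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Toy stand-in for "train on one cluster and return the validation error".
def toy_error(params):
    return abs(params["learning_rate"] - 0.01) * 100 + abs(params["units"] - 64)


grid = {"learning_rate": [0.1, 0.01, 0.001], "units": [32, 64, 128]}
best_params, best_error = grid_search(grid, toy_error)
```

In the setting of this specification the search would be repeated once per price data cluster, so each cluster may end up with its own hyper-parameter values.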

According to an embodiment, the final agricultural product price prediction model may be one selected from among: a first agricultural product price prediction model generated separately for each future time point based on the selected time series prediction model; a second agricultural product price prediction model in which the prediction result for the previous time point produced by the selected time series prediction model is fed into the time series prediction model for the next time point; a third agricultural product price prediction model generated by including the prediction result of the time series prediction model for the previous time point in the input data used to train the time series prediction model for the next time point; and a fourth agricultural product price prediction model that outputs the predicted prices of each agricultural product for all future time points with a single time series prediction model.
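
Choosing the final model for each product can be pictured as taking, per product, the candidate with the lowest evaluation error. The sketch below is only an illustration under that assumption; the function name select_final_models, the strategy labels, and the error values are hypothetical and are not results from the specification.

```python
def select_final_models(scores):
    """Map each product to the candidate model with the lowest error."""
    return {
        product: min(by_model, key=by_model.get)
        for product, by_model in scores.items()
    }


# Made-up error values for illustration only; they are not measured results.
example_scores = {
    "onion": {"direct": 12.0, "hybrid": 15.5, "multi_output": 14.1},
    "cabbage": {"direct": 21.3, "hybrid": 18.9, "multi_output": 25.0},
}
final_models = select_final_models(example_scores)
```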

According to an embodiment, the step of reading price data for the plurality of agricultural products from the database and preprocessing the data for training the selected time series prediction model may include: loading the price data from the database and filling in missing data using a predetermined estimation method; and arranging the price data, with the missing data filled in, into a format suitable for training the selected time series prediction model.

Meanwhile, this specification presents a computer program stored in a computer-readable recording medium; combined with hardware, the computer program can execute each step of the agricultural product price prediction method performed by the agricultural product price prediction device.

This specification also presents an agricultural product price prediction device that performs the agricultural product price prediction method. The device includes: a communication unit that communicates with an external agricultural product price data providing server to collect price data for a plurality of agricultural products; a storage unit that stores the collected price data as a database; and a control unit functionally connected to the communication unit and the storage unit. The control unit may select one time series prediction model from among a plurality of time series prediction models; read the price data for the plurality of agricultural products from the database and preprocess the data for training the selected time series prediction model; cluster the price data with a clustering model that groups one or more agricultural products per cluster, thereby generating at least one price data cluster; tune the hyper-parameters of the selected time series prediction model and train the tuned model on each price data cluster, thereby generating a plurality of agricultural product price prediction models that output a predicted price for each agricultural product at future time points; evaluate the performance of each agricultural product price prediction model for each agricultural product and determine the best-performing model as the final agricultural product price prediction model for that product; and calculate the predicted price of a target agricultural product on a target date using the final agricultural product price prediction model corresponding to the target agricultural product.

The agricultural product price prediction device and other embodiments may include the following features.

According to an embodiment, the control unit may read price data for a plurality of agricultural products from the database and preprocess the data for training the selected time series prediction model, load the price data from the database and fill in missing data using a predetermined estimation method, and arrange the price data, with the missing data filled in, into a format suitable for training the selected time series prediction model.

The embodiments disclosed in this specification have the effect of providing a model that predicts agricultural product prices using an agricultural product price prediction model based on a multi-step time series forecasting method.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

The following drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the detailed description, serve to further the understanding of the technical idea of the present invention; the present invention should therefore not be construed as limited to the matters shown in the drawings.
Figure 1 briefly illustrates the configuration of a framework for deriving an agricultural product price prediction method according to an embodiment.
Figure 2 shows the results of a descriptive statistical analysis of the prices of each agricultural product variety over a specific period.
Figures 3 to 5 show the prediction results of various price prediction models in the agricultural product price prediction method.
Figure 6 shows the best combination of strategy and prediction technique for each variety and period in the framework for the agricultural product price prediction method according to an embodiment.
Figure 7 is a flowchart explaining a price prediction method according to an embodiment.
Figure 8 illustrates a simplified configuration of an agricultural product price prediction device that performs the agricultural product price prediction method according to an embodiment.
Figure 9 is a block diagram of an AI device included as part of the agricultural product price prediction device according to an embodiment.

The technology disclosed in this specification can be applied to predicting agricultural product prices. However, it is not limited to this and can be applied to any device or method to which its technical idea is applicable.

The technical terms used in this specification are used only to describe specific embodiments and are not intended to limit the spirit of the technology disclosed herein. Unless specifically defined otherwise in this specification, technical terms should be interpreted as they are generally understood by those of ordinary skill in the field to which the disclosed technology belongs, and should be interpreted neither too broadly nor too narrowly. If a technical term used in this specification is an incorrect term that fails to express the idea of the disclosed technology accurately, it should be replaced by, and understood as, a technical term that those of ordinary skill in the field can correctly understand. General terms used in this specification should be interpreted as defined in dictionaries or according to context, and should not be interpreted too narrowly.

Terms containing ordinal numbers, such as first and second, may be used in this specification to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be called a second component, and likewise a second component may be called a first component.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the attached drawings. Identical or similar components are given the same reference numerals regardless of the figure number, and duplicate descriptions of them are omitted.

In describing the technology disclosed in this specification, detailed descriptions of related known technology are omitted where they could obscure the gist of the disclosed technology. The attached drawings are provided only to make the spirit of the disclosed technology easier to understand, and that spirit should not be construed as limited by the attached drawings.

Throughout the specification, a device or terminal includes a communication terminal or communication device capable of wired or wireless communication with a server or another device. The device or terminal may take various forms, such as a mobile phone, smartphone, smart pad, laptop computer, desktop computer, smart TV, wearable device, mirror-type display device, or smart mirror. Wearable devices may include watch-type terminals, glasses-type terminals, and HMDs. The terminal is not limited to these forms and may be implemented as various electronic devices.

Multi-step Time Series Forecasting Approach

This specification applies a multi-step time series forecasting approach to predict agricultural product prices at future time points over a range of horizons. The approaches commonly used for multi-step forecasting are direct multi-step time series forecasting and recursive multi-step time series forecasting, of which the recursive approach is the most widely used. However, the recursive approach uses the forecast for the next day to predict the value two days ahead, and then includes that value to predict the following day, so the prediction error inevitably grows as the target time point moves further into the mid to long term. To build a mid- to long-term agricultural product price prediction model, the embodiments of this specification therefore consider four strategies: (a) the Direct Multi-step Time Series Forecasting strategy, (b) the Recursive Multi-step Time Series Forecasting strategy, (c) the Direct-Recursive Hybrid Multi-step Time Series Forecasting strategy, and (d) the Multiple Outputs strategy. Because of the error problem of the recursive approach described above, the recursive strategy is excluded and the remaining three strategies are used for mid- to long-term agricultural product price prediction.

(a) Direct Multi-step Time Series Forecasting strategy

The direct multi-step time series forecasting strategy is defined as an approach that builds a separate prediction model for each time point independently, as expressed in Equation (1).

This approach performs reasonably well, but because many prediction models must be developed it consumes substantial computing resources, and because the forecast for each time point is generated independently the forecasts lack coherence across time points.
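
One common way to write the direct strategy is ŷ(t+h) = f_h(y(t), ..., y(t-n+1)), with a separate function f_h for each horizon h; this notation is an assumption and may differ from Equation (1) of the specification. The Python sketch below illustrates the idea with a deliberately simple learner (a one-feature least-squares line) standing in for the MLP or LGBM models an implementation might use. All function names and the synthetic price series are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def fit_direct_models(series, horizons):
    """Fit one independent model per horizon h, mapping y[t] to y[t + h]."""
    return {h: fit_line(series[:-h], series[h:]) for h in horizons}


def predict_direct(models, last_value):
    """Each horizon is predicted by its own model from the same last observation."""
    return {h: a * last_value + b for h, (a, b) in models.items()}


# Synthetic series with a constant daily increase of 2 (illustration only).
prices = [100.0 + 2.0 * t for t in range(50)]
models = fit_direct_models(prices, horizons=[1, 7, 30])
forecast = predict_direct(models, prices[-1])
```

The cost noted in the text is visible here: three horizons require three fitted models, and nothing ties the 7-day forecast to the 30-day forecast.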

(b) Recursive Multi-step Time Series Forecasting strategy

The recursive multi-step time series forecasting approach is a strategy that predicts the value at a given time point by recursively feeding the prediction result of the previous step into the prediction of the next step, as expressed in Equation (2).

This approach has the advantage that a single model can produce forecasts for multiple future time points. However, prediction errors accumulate as the forecast reaches further into the future, that is, as k increases, so performance degrades markedly. In the embodiments of this specification, a pilot experiment likewise confirmed a sharp increase in error, and the strategy was therefore excluded from the comparison.
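
The recursive strategy can be sketched as a single one-step model applied repeatedly to its own output. The following Python sketch assumes a one-feature least-squares line as the one-step model, which is a stand-in chosen for brevity and not the model used in the specification; the function names and the synthetic series are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def forecast_recursive(series, steps):
    """Fit one one-step-ahead model and feed each prediction back in as the next input."""
    a, b = fit_line(series[:-1], series[1:])
    predictions = []
    current = series[-1]
    for _ in range(steps):
        current = a * current + b  # the previous prediction becomes the next input
        predictions.append(current)
    return predictions


# Synthetic series with a constant daily increase of 2 (illustration only).
prices = [100.0 + 2.0 * t for t in range(50)]
forecast = forecast_recursive(prices, steps=3)
```

Because every step consumes the previous prediction rather than an observed value, any error in the one-step model is carried forward and compounded, which is the degradation the text describes for large k.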

(c) Direct-Recursive Hybrid Multi-step Time Series Forecasting strategy

The direct-recursive hybrid multi-step time series forecasting approach is a hybrid strategy that combines the advantages of the direct and recursive multi-step approaches. To obtain the predicted value at time t+2, a model that predicts the value at time t+1 is built first; the value predicted by that model is then added to the inputs used to train a second model that predicts the value at time t+2, from which the result is obtained. The formulation is given in Equation (3).
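
The two-stage procedure described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: a 1-nearest-neighbour regressor stands in for the real learner, in-sample predictions are used as the extra training column, and the names nn_predict and forecast_hybrid as well as the periodic toy series are hypothetical.

```python
def nn_predict(train_x, train_y, query):
    """1-nearest-neighbour regression: return the target of the closest training row."""
    def sq_dist(row):
        return sum((p - q) ** 2 for p, q in zip(row, query))
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i]))
    return train_y[best]


def forecast_hybrid(series, steps):
    """Train one model per step; the model for step k also sees predictions for steps 1..k-1."""
    n = len(series) - steps  # number of usable training origins
    features = [[series[t]] for t in range(n)]
    query = [series[-1]]
    forecasts = []
    for k in range(1, steps + 1):
        targets = [series[t + k] for t in range(n)]
        # In-sample predictions of the step-k model become an extra input column.
        in_sample = [nn_predict(features, targets, row) for row in features]
        forecast_k = nn_predict(features, targets, query)
        forecasts.append(forecast_k)
        features = [row + [p] for row, p in zip(features, in_sample)]
        query = query + [forecast_k]
    return forecasts


# Periodic toy series (illustration only): the pattern 1, 2, 3, 4 repeated five times.
toy_series = [1.0, 2.0, 3.0, 4.0] * 5
hybrid_forecast = forecast_hybrid(toy_series, steps=2)
```

Unlike the purely recursive sketch, each step has its own model (as in the direct strategy), while later models still receive the earlier predictions as inputs.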

(d) Multiple Outputs strategy

The Multiple Outputs strategy produces the entire set of future forecast values at once. Representative techniques applicable to this strategy include LSTM and GRU, and the formulation is given in Equation (4).

The multiple outputs method can obtain forecasts up to the desired time point with a single model. Its drawbacks are that the model becomes overly complex, training takes a long time, and a very large amount of training data is required to avoid overfitting.
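
The defining property of this strategy is that a single model maps an input window to a whole vector of future values. The Python sketch below shows only that input/output shape, assuming a 1-nearest-neighbour lookup in place of the LSTM or GRU networks named in the text; the function name forecast_multi_output, the window and horizon parameters, and the toy series are hypothetical.

```python
def forecast_multi_output(series, window, horizon):
    """Single model whose output is the whole vector y[t], ..., y[t + horizon - 1]."""
    rows, targets = [], []
    for t in range(window, len(series) - horizon + 1):
        rows.append(series[t - window:t])      # input: the last `window` observations
        targets.append(series[t:t + horizon])  # output: the next `horizon` values at once
    query = series[-window:]

    def sq_dist(row):
        return sum((p - q) ** 2 for p, q in zip(row, query))

    best = min(range(len(rows)), key=lambda i: sq_dist(rows[i]))
    return list(targets[best])


# Periodic toy series (illustration only): the pattern 1, 2, 3, 4 repeated five times.
toy_series = [1.0, 2.0, 3.0, 4.0] * 5
multi_forecast = forecast_multi_output(toy_series, window=2, horizon=3)
```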

In what follows, price prediction modeling is carried out with three of the four multi-step time series forecasting strategies described above, excluding the Recursive Multi-step Time Series Forecasting strategy, in which errors are amplified.

Figure 1 briefly illustrates the configuration of a framework for deriving an agricultural product price prediction method according to an embodiment.

Figure 2 shows the results of a descriptive statistical analysis of the prices of each agricultural product variety over a specific period.

도 1을 참조하면, 실시 예에 따른 인공신경망 학습 모델은 농산물의 품종(양파, 오이, 배추), 기간, 예측 전략(직접 전략, 하이브리드 전략, 다중 출력 전략), 예측 기법(MLP, LGBM, LSTM, GRU) 등 다양한 조건을 결합하여, 농산물 품종 및 시점 별 최적의 가격 예측 모델을 도출한다.Referring to Figure 1, the artificial neural network learning model according to the embodiment includes the agricultural product variety (onion, cucumber, cabbage), period, prediction strategy (direct strategy, hybrid strategy, multiple output strategy), and prediction technique (MLP, LGBM, LSTM). , GRU), etc. are combined to derive the optimal price prediction model for each agricultural product type and time point.

MAPE (Mean Absolute Percentage Error) is used as the performance metric for the price prediction models. MAPE measures the error between predicted and actual values as a percentage of the actual value, and is given by Equation (5) below. The lower the value, the better the model performs. Because unit prices vary widely across agricultural product varieties, MAPE was chosen as the evaluation metric to allow a comprehensive comparison.
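As an illustration of the metric, the following is a minimal sketch of MAPE using its standard definition (the mean of |actual - predicted| / |actual|, expressed in percent). It is not taken from the specification; the function name `mape` and the sample prices are assumptions chosen only for the example.

```python
import numpy as np


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (standard definition)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if np.any(actual == 0):
        # The ratio is undefined when an actual value is zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)


# Hypothetical prices in won/kg, for illustration only.
actual_prices = [1000.0, 1200.0, 800.0]
predicted_prices = [1100.0, 1140.0, 800.0]
example_mape = mape(actual_prices, predicted_prices)  # roughly 5 percent
```

Because the error is divided by the actual price, products with very different price levels can be compared on the same scale, which is the reason the text gives for choosing this metric.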

Below, the agricultural product price prediction method shown in FIG. 1 is described in detail.

Price prediction modeling according to the embodiment was carried out in Python 3.6 using libraries such as Google TensorFlow, matplotlib, and statsmodels. The modeling covered three agricultural product varieties (onion, radish, and cabbage). The price data needed for model training and prediction was collected from Nongnet. Using data from January 3, 2014 to December 31, 2021, future prices were predicted at five horizons: 30 days (one month), 90 days (one quarter), 180 days (half a year), 270 days, and 365 days (one year). Missing values in the collected daily data were filled in using the pandas module and its asfreq function in Python.

Price levels differ from variety to variety, so care is needed in comparative analysis. The descriptive statistics of prices for each variety are shown in Table 1 of FIG. 2. The training data set consists of the price data for each variety from January 3, 2014 to December 31, 2020. Because the agricultural harvest cycle is one year, the test data set was set to the price data from January 1, 2021 to December 31, 2021, so that the pattern of price changes over a full year from spring to winter could be observed.
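The date-based split described above can be sketched with pandas as follows. This is only an illustrative sketch: the price series is synthetic (a straight line), and the variable names are assumptions, not part of the embodiment.

```python
import numpy as np
import pandas as pd

# Synthetic daily price series covering the period named in the text.
dates = pd.date_range("2014-01-03", "2021-12-31", freq="D")
prices = pd.Series(np.linspace(500.0, 1500.0, len(dates)), index=dates)

# Training set: 2014-01-03 through 2020-12-31.
train = prices.loc["2014-01-03":"2020-12-31"]
# Test set: the full calendar year 2021.
test = prices.loc["2021-01-01":"2021-12-31"]
```

Splitting by date rather than at random keeps every test observation later than every training observation, so the evaluation does not leak future prices into training.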

The embodiment examined how effective the Direct and Hybrid prediction strategies are compared with LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit), the common time series analysis techniques used in the Multiple Outputs strategy. MLP (Multi-Layer Perceptron) and LGBM (Light Gradient Boosting Machine) were used for the Direct and Hybrid prediction strategies.

FIGS. 3 to 5 show the prediction results of various price prediction models in the agricultural product price prediction method.

Table 2 of FIG. 3 shows the prediction results of the price prediction model using the LGBM technique, and Table 3 of FIG. 4 shows those of the model using the MLP technique. Examining the predicted prices of the three varieties at the five horizons shows that, with LGBM, the Direct strategy is more suitable for onions and the Hybrid strategy for cucumbers and cabbage. With the MLP model, the Hybrid strategy yields a lower average MAPE for onions, and the Direct strategy does so for cucumbers and cabbage. For onions in particular, the MLP model performs well when combined with the Hybrid strategy.

Table 4 of FIG. 5 shows the prediction results of LSTM and GRU. The Multiple Outputs strategy, tested as a comparison group for the Direct and Hybrid strategies, shows a higher MAPE overall than the best LGBM and MLP models. In particular, among the detailed results for the five horizons, the forecasts for 90 days and beyond lack discriminating power.

FIG. 6 shows the best combination of strategy and prediction technique for each agricultural product variety and period in the framework for the agricultural product price prediction method according to an embodiment.

Referring to FIG. 6, Table 5 presents the best combination of strategy and prediction technique for each variety and period. For onions, the combination of MLP and the Hybrid strategy showed the best predictive power, and for cucumbers and cabbage, LGBM with the Hybrid strategy performed best. From a model perspective, models combining MLP or LGBM (depending on the variety) with the Hybrid strategy account for 12 of the 15 best combinations, which indicates that the Hybrid strategy is generally superior.

This embodiment set out to verify how effective the new strategies are for mid- to long-term agricultural price prediction compared with the common recursive time series forecasting strategy. To that end, prices were predicted for three varieties (onion, cucumber, and cabbage) using the various strategies described above for each price prediction model. Table 5, which summarizes the analysis, shows that although there are differences by variety and period, multi-step forecasting, and the Hybrid strategy in particular, generally yields better-performing combinations than the Multiple Outputs strategy.

The above embodiment shows that the best strategy and prediction technique differ by type of agricultural product and by prediction horizon. In other words, the agricultural product prediction model that yields the best price prediction may differ for each variety or item and for each future time point to be predicted.

The agricultural product prediction model is trained on each set of price data using various time series prediction models (time series prediction networks) so as to achieve the best performance for each variety, and it is tuned by varying the hyper-parameter values of the time series prediction network used so that it yields the best price prediction.

Among the time series prediction models used in the above embodiment, when an LSTM network is used to build an agricultural product price prediction model that predicts the price at a specific future time point, the parameter settings of the LSTM network are important for obtaining a well-performing model. Parameters such as the learning rate, the number of iterations of the training process, and the mini-batch size are called hyper-parameters to distinguish them from the parameters inside the network. The learning rate is the distance moved in a single step along the curve of the loss function (cost function), which represents the difference between predicted and actual values. If the step is too large, it may overshoot the value of the parameter W that minimizes the loss function; if it is too small, finding that value takes too long and training is inefficient. The learning rate is typically set to 0.001. In deep learning, unlike the full-batch method used in conventional machine learning, a mini-batch method is used to determine how much training data is used at once. The full-batch method builds the model using all of the data in a single training step, whereas the mini-batch method also uses all of the data but splits it into sub-datasets within one iteration and trains on them in turn. Because the training data can never be the entire population under analysis, randomly splitting it into sub-datasets and training multiple times has the advantage of representing the population better. There is no absolute mini-batch size, but it is usually set to about 1 to 2% of the total training data.
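To make the roles of the learning rate and the mini-batch size concrete, here is a minimal sketch of mini-batch gradient descent. It deliberately uses a one-parameter linear model on synthetic data rather than an LSTM, so it illustrates the update mechanics only; the data, the helper `iterate_minibatches`, and the epoch count are assumptions for the example. The learning rate of 0.001 and a batch size of 2% of the data follow the values mentioned in the text.

```python
import numpy as np


def iterate_minibatches(n_samples, batch_size, rng):
    """Yield index arrays that partition one random permutation of the samples."""
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]


rng = np.random.default_rng(0)
n_samples = 1000
x = rng.normal(size=n_samples)
y = 3.0 * x + rng.normal(scale=0.1, size=n_samples)  # synthetic target, true slope 3

learning_rate = 0.001        # step size along the loss curve
batch_size = 20              # 2% of the training data
w = 0.0                      # the single trainable parameter W

for epoch in range(100):
    for idx in iterate_minibatches(n_samples, batch_size, rng):
        error = w * x[idx] - y[idx]
        grad = 2.0 * np.mean(error * x[idx])  # gradient of mean squared error w.r.t. w
        w -= learning_rate * grad             # one update per mini-batch
```

Each pass over the data performs 50 small updates instead of one, and the batches are reshuffled every epoch, which corresponds to the random splitting into sub-datasets described above.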

However, because there are many time series prediction models (time series prediction networks) and price data differs by type of agricultural product, the hyper-parameters must be tuned for each type of product when building a price prediction model. In addition, because new price data is added every day, a price prediction model, once built, must be retrained and retuned over time to keep its prediction performance from degrading. This requires a large investment of time and resources, which raises costs. The embodiment of this specification addresses the problem by grouping varieties that share the same hyper-parameter tuning pattern (tuning values) into one cluster (which may also be called a group) and tuning the time series prediction models trained on the price data in that cluster with the same hyper-parameter values at once. That is, the price data of agricultural products in the same cluster exhibits the same performance characteristics under the same hyper-parameter values. Clustering the price data before tuning the hyper-parameters of the time series prediction network therefore saves the resources needed for training and tuning.

Below, an example is described of a method for predicting agricultural product prices in which the tuning of the time series prediction model is simplified by clustering the price data.

FIG. 7 is a flowchart illustrating an agricultural product price prediction method according to an embodiment.

First, the agricultural product price prediction method according to the embodiment collects price data of agricultural products for price prediction (S710). The price data is collected through an API from the server of an organization that provides agricultural price data (for example, the Korea Agro-Fisheries and Food Trade Corporation) and stored in a database. The stored price data is first processed so that the transaction volume (tons) and price (won/kg) for 10 items and 29 varieties are assembled into a single table and stored. The price data collected from the server of the Korea Agro-Fisheries and Food Trade Corporation consists of real-time auction prices, wholesale prices, and retail prices, and each price record includes the item name, the wholesale market name (for auction and wholesale prices), the market name (for retail prices), the transaction date and time, and the transaction price.

Next, the agricultural product price prediction method selects one time series prediction model from among a plurality of time series prediction models (S720).

Next, the method reads price data for a plurality of agricultural products from the database and performs preprocessing for training the selected time series prediction model (S730). In this preprocessing, the price data is loaded from the database, missing data is filled in using a predetermined estimation method, and the resulting price data is arranged in the format required for training the selected time series prediction model. That is, the price data stored in the database is retrieved and organized into the data format to be input to the time series prediction model. Each column of the price data is arranged in the required form. For a date with no transactions (a missing value), a row is created, the average of the values for the day before and the day after is calculated, and that average is assigned to the empty entry. If the price data is to be classified by item rather than by variety, the mean of the variety prices within each item is assigned as that item's value. The organized data is built into a data frame with items as row names and dates as column names. The data frame is then converted to a NumPy array to make computation easier.
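The preprocessing steps above can be sketched with pandas as follows. This is an assumption-laden illustration rather than the embodiment's actual code: the column names (`date`, `item`, `price`) and the sample records are hypothetical, and the neighbor-average fill shown here only handles gaps of a single day.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format records; 2021-01-06 has no transactions.
raw = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-07"] * 2),
    "item": ["onion"] * 3 + ["cabbage"] * 3,
    "price": [1000.0, 1100.0, 1300.0, 700.0, 720.0, 760.0],
})

# One column per item, one row per date.
wide = raw.pivot(index="date", columns="item", values="price")

# asfreq("D") creates a row for every calendar day; new rows hold NaN.
wide = wide.asfreq("D")

# Average of the previous and next day's values, used only where data is missing.
neighbor_mean = (wide.shift(1) + wide.shift(-1)) / 2
filled = wide.fillna(neighbor_mean)

# Items as rows and dates as columns, then convert to a NumPy array.
by_item = filled.T
price_array = by_item.to_numpy()
```

Gaps longer than one day would still contain NaN after this step, so a real pipeline would need an additional rule (for example, interpolation) for those cases.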

Next, the method clusters the price data with a clustering model that groups at least one agricultural product per cluster, product by product, to generate at least one price data cluster (S740). It then tunes the hyper-parameters of the selected time series prediction model and trains the tuned model on each price data cluster, generating a plurality of agricultural product price prediction models that calculate the predicted price of each product at future time points (S750). The clustering model clusters the price data so that agricultural items that satisfy a predetermined tuning pattern for hyper-parameter tuning, when the clustered price data is applied to the selected time series prediction model, fall into the same cluster. That is, the clustering model groups the 10 items or 29 varieties into several groups, and when a price prediction model is built by training the selected time series prediction model on a group, the same hyper-parameters are tuned to the same values for every product in that group. The price data of the products grouped in this way yields optimal performance when the same hyper-parameters are tuned to the same values. In addition, the clustering model clusters the price data into the number of clusters that yields the best performance evaluation result when the clustered data is applied to the selected time series prediction model. For example, 12 kinds of agricultural products can be grouped six at a time into two clusters, four at a time into three clusters, or three at a time into four clusters. Because the performance of the price prediction model varies with the number of clusters, the data is clustered into the number of clusters that yields the best performance, the hyper-parameters of the time series prediction model are tuned, and the model is trained to build the price prediction model. MAPE (Mean Absolute Percentage Error), RMSE (Root Mean Squared Error), or the R2 score may be used as the performance metric.

Meanwhile, hyper-parameter tuning is performed by varying the value of each hyper-parameter within a predetermined range, training the selected time series prediction model on each price data cluster, and adopting for each hyper-parameter the value with which the selected model achieves its best performance. For example, a range of applicable values is set for each hyper-parameter, the prediction model is tuned with every combination of values selectable within those ranges to obtain results, and the combination of hyper-parameter values that satisfies a predetermined criterion is found automatically.
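The exhaustive search over value combinations described above is commonly called a grid search. A minimal sketch follows; the function `grid_search`, the parameter names, and the candidate values are assumptions, and `toy_evaluate` is a hypothetical stand-in for "train the selected model on one price data cluster and return its validation error".

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid; return the one with the lowest score."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # lower is better, e.g. MAPE
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Hypothetical error surface with its minimum at lr=0.001, batch_size=32."""
    return (abs(params["learning_rate"] - 0.001) * 1000
            + abs(params["batch_size"] - 32) / 32)


param_grid = {
    "learning_rate": [0.01, 0.001, 0.0001],
    "batch_size": [16, 32, 64],
}
best_params, best_score = grid_search(param_grid, toy_evaluate)
```

Because one search is run per cluster rather than per product, the number of training runs scales with the number of clusters, which is the cost saving the specification aims for.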

Meanwhile, when a new agricultural item is added, the price data is clustered again; if the way the data is processed changes substantially as a result, the previously tuned agricultural product price prediction model must also be tuned again.

The processes described above may be performed one by one for all of the plurality of time series prediction models, so that an agricultural product price prediction model based on each time series prediction model is generated.

Here, the plurality of agricultural product price prediction models are derived by applying the multi-step time series forecasting approach described above and the prediction methods and strategies that show the best performance for each product, as described with reference to FIGS. 1 to 6. That is, for each agricultural product, the plurality of price prediction models may include: a model generated for each future time point based on the selected time series prediction model (direct model); a model in which the prediction result for the previous time point from the selected time series prediction model is fed into the time series prediction model for the next time point (recursive model); a model trained with the prediction result of the model for the previous time point included in the input data of the model for the next time point (hybrid model); and a model that calculates the predicted prices of each product for all future time points with a single time series prediction model (multiple outputs model).
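The difference between the direct and recursive models can be sketched as follows. This is a simplified illustration under stated assumptions: an ordinary least-squares linear model on lagged prices stands in for the MLP or LGBM learners named in the specification, the helper names are hypothetical, and the hybrid and multiple outputs models are not shown.

```python
import numpy as np


def make_windows(series, n_lags, horizon):
    """Pair each window of n_lags values with the value `horizon` steps after it."""
    X, y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        y.append(series[t + horizon - 1])
    return np.array(X), np.array(y)


def fit_linear(X, y):
    """Least-squares linear model with intercept (stand-in for MLP/LGBM)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, window):
    return float(np.dot(coef[:-1], window) + coef[-1])


def direct_forecast(series, n_lags, horizons):
    """Direct strategy: a separately trained model for each horizon."""
    last = series[-n_lags:]
    return {h: predict_linear(fit_linear(*make_windows(series, n_lags, h)), last)
            for h in horizons}


def recursive_forecast(series, n_lags, steps):
    """Recursive strategy: one one-step model fed its own predictions."""
    coef = fit_linear(*make_windows(series, n_lags, 1))
    window = list(series[-n_lags:])
    forecasts = []
    for _ in range(steps):
        nxt = predict_linear(coef, window)
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # the prediction becomes an input
    return forecasts


series = np.arange(50.0)  # synthetic, perfectly linear "prices"
direct = direct_forecast(series, n_lags=3, horizons=[1, 5])
recursive = recursive_forecast(series, n_lags=3, steps=5)
```

In the recursive loop each prediction is appended to the input window, so any error in an early step is carried into later steps, which is the error amplification the specification cites as its reason for setting that strategy aside in the experiments.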

Next, the method evaluates the performance of each of the agricultural product price prediction models generated by the above processes for each agricultural product, and determines the model with the best performance as the final agricultural product price prediction model for that product (S760). That is, for a given agricultural product, the model that performs best among the direct, recursive, hybrid, and multiple outputs models generated for that product is determined as the final price prediction model.
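Step S760 amounts to picking, for each product, the candidate model with the lowest error. A minimal sketch is shown below; the function name and the score values are placeholders invented for the example and do not correspond to any result in the specification.

```python
def select_final_models(scores):
    """Map each product to the model name with the lowest error (e.g. MAPE).

    scores: {product: {model_name: error}}
    """
    return {product: min(by_model, key=by_model.get)
            for product, by_model in scores.items()}


# Placeholder error values, chosen only to exercise the function.
placeholder_scores = {
    "product_a": {"direct": 2.0, "recursive": 4.0, "hybrid": 1.0, "multi_output": 3.0},
    "product_b": {"direct": 1.0, "recursive": 4.0, "hybrid": 2.0, "multi_output": 3.0},
}
final_models = select_final_models(placeholder_scores)
```

Keeping the selection per product allows different products to end up with different strategies, as the experimental results above suggest they should.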

Finally, the predicted price of the target agricultural product on the target date is calculated using the final agricultural product price prediction model for that product (S770).

Meanwhile, when new price data is added, predictions can continue to be produced simply by preprocessing the added data. However, if the accuracy of the predicted prices drops substantially, that is, if the performance of the price prediction model degrades significantly, the model is retuned, either by tuning alone or by starting again from clustering.

In the above description, the steps, processes, or operations (S710 to S770) may be further divided into additional steps, processes, or operations, or combined into fewer, depending on the implementation of the present invention. Some steps, processes, or operations may be omitted as needed, and the order of steps or operations may be changed. In addition, each step or operation of the agricultural product price prediction method described above may be implemented as a computer program and stored in a computer-readable recording medium, and each step, process, or operation may be executed by a computer device.

FIG. 8 illustrates a simplified configuration of an agricultural product price prediction apparatus that performs the agricultural product price prediction method according to an embodiment.

Referring to FIG. 8, the agricultural product price prediction apparatus 100 according to an embodiment includes a control unit 110, a communication unit 120, and a storage unit 130. The illustrated components are not essential, and an agricultural product price prediction apparatus with more or fewer components may be implemented. These components may be implemented in hardware, in software, or in a combination of hardware and software.

The control unit 110 is functionally connected to the communication unit 120 and the storage unit 130 and can control them.

The control unit 110 collects price data of agricultural products for price prediction. The control unit 110 controls the communication unit 120 to connect to an external agricultural product price data server 200, collects agricultural product price data through data crawling or an API, builds a database from it, and stores the database in the storage unit 130.

The control unit 110 selects one time series prediction model from among a plurality of time series prediction models, reads the price data for a plurality of agricultural products from the database, and performs preprocessing for training the selected time series prediction model. In this preprocessing, the control unit 110 loads the price data from the database, fills in missing data using a predetermined estimation method, and arranges the resulting price data in the format required for training the selected time series prediction model.

In addition, the control unit 110 clusters the price data with a clustering model that groups at least one agricultural product per cluster, product by product, to generate at least one price data cluster. The clustering model clusters the price data so that agricultural items that satisfy a predetermined tuning pattern for hyper-parameter tuning, when the clustered price data is applied to the selected time series prediction model, fall into the same cluster. The clustering model also clusters the price data into the number of clusters that yields the best performance evaluation result when the clustered price data is applied to the selected time series prediction model.

In addition, the control unit 110 tunes the hyper-parameters of the selected time series prediction model and trains the tuned model on each of the at least one price data cluster, generating a plurality of agricultural product price prediction models that calculate the predicted price of each product at future time points. The control unit 110 varies the value of each hyper-parameter within a predetermined range, trains the selected time series prediction model on each price data cluster, and adopts for each hyper-parameter the value with which the selected model achieves its best performance.

In addition, the control unit 110 evaluates the performance of each of the plurality of agricultural product price prediction models for each agricultural product, and determines the best-performing model as the final agricultural product price prediction model for that product. For each agricultural product, the final agricultural product price prediction model may be selected as the best performer among: a model generated separately for each future time point based on the selected time series prediction model (direct model); a model in which the prediction result for a previous time point is fed into the time series prediction model used to predict the next time point (recursive model); a model generated by training the time series prediction model that predicts the price at the next time point with input data that include the prediction result of the time series prediction model for the previous time point (hybrid model); and a model that calculates the predicted prices of each agricultural product for all future time points with a single time series prediction model (multi-output model).
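Two of the multi-step strategies named above, direct and recursive, can be contrasted in a short sketch, followed by a per-product selection of whichever scores better on held-out data. The underlying "average change" model, the mean-absolute-error metric, and the sample series are assumptions for illustration; the hybrid and multi-output variants are omitted for brevity.

```python
def fit_step_model(series, horizon):
    """Toy model (assumption): predict last value plus the average change over `horizon` steps."""
    diffs = [series[i + horizon] - series[i] for i in range(len(series) - horizon)]
    avg = sum(diffs) / len(diffs)
    return lambda last_value: last_value + avg

def forecast_direct(series, horizons):
    """Direct strategy: a separate model per future time point, each fed the last observation."""
    return [fit_step_model(series, h)(series[-1]) for h in range(1, horizons + 1)]

def forecast_recursive(series, horizons):
    """Recursive strategy: a single one-step model whose output becomes the next input."""
    model = fit_step_model(series, 1)
    preds, last = [], series[-1]
    for _ in range(horizons):
        last = model(last)
        preds.append(last)
    return preds

def mae(pred, actual):
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)

def pick_final_model(series, horizons, strategies):
    """Hold out the last `horizons` points and return the strategy with the lowest error."""
    train, test = series[:-horizons], series[-horizons:]
    scores = {name: mae(fn(train, horizons), test) for name, fn in strategies.items()}
    return min(scores, key=scores.get), scores

strategies = {"direct": forecast_direct, "recursive": forecast_recursive}
# Hypothetical price series for a single product.
best_name, scores = pick_final_model([10, 12, 11, 15, 14, 18, 17], horizons=2,
                                     strategies=strategies)
```

Running the selection once per product gives each product its own final model, so one product may end up with a direct model while another uses a recursive one.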

In addition, the control unit 110 calculates the predicted price of a target agricultural product on a target date using the final agricultural product price prediction model corresponding to that target agricultural product.

Meanwhile, the storage unit 130 may store various time series prediction models as well as price data. The storage unit 130 may store programs for the operation of the control unit 110 and may temporarily store input/output data. The storage unit 130 may include at least one type of storage medium among a flash memory type, a hard disk type, an SSD type (Solid State Disk type), an SDD type (Silicon Disk Drive type), a multimedia card micro type, a card-type memory (for example, SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk.

The network disclosed herein may be, for example, a wireless network, a wired network, a public network such as the Internet, a private network, a Global System for Mobile communication (GSM) network, a General Packet Radio Network (GPRN), a Local Area Network (LAN), a Wide Area Network (WAN), a Metropolitan Area Network (MAN), a cellular network, a Public Switched Telephone Network (PSTN), a Personal Area Network, Bluetooth, Wi-Fi Direct, Near Field Communication, Ultra-Wide Band, a combination thereof, or any other network, but is not limited to these.

FIG. 9 is a block diagram of an AI device included as a component of the agricultural product price prediction device according to an embodiment.

The AI device 20 may include an electronic device that includes an AI module capable of performing AI processing, or a server that includes the AI module. The AI device 20 may also be included as at least part of the agricultural product price prediction device 100 shown in FIG. 8 and be provided to jointly perform at least part of the AI processing, such as training a prediction model.

The AI processing of the AI device 20 may include all operations related to the control of the agricultural product price prediction device 100 shown in FIG. 8 and all operations for price prediction modeling through artificial intelligence learning. For example, the agricultural product price prediction device 100 may apply AI processing to the collected price data set to process, evaluate, and learn from it, and may then predict the future price of a specific agricultural product. The AI device 20 may be included as a component of the control unit 110 of FIG. 8 or may be replaced by the control unit 110.

The AI device 20 may include an AI processor 21, a memory 25, and/or a communication unit 27.

The AI device 20 is a computing device capable of training a neural network, and may be implemented as any of various electronic devices such as a server, a desktop PC, a notebook PC, or a tablet PC, or as a single chip. In the present invention, the AI device 20 may be an agricultural product price prediction device implemented in the form of any one of these electronic devices.

The AI processor 21 may train a neural network using a program stored in the memory 25. In particular, the AI processor 21 may train a neural network for recognizing device-related data. Here, the neural network for recognizing device-related data may be designed to simulate the structure of the human brain on a computer, and may include a plurality of weighted network nodes that simulate the neurons of a human neural network. The network nodes may exchange data according to their respective connection relationships so as to simulate the synaptic activity of neurons that send and receive signals through synapses. The neural network may include a deep learning model developed from a neural network model. In a deep learning model, the network nodes are located in different layers and may exchange data according to convolutional connection relationships. Examples of neural network models include various deep learning techniques such as deep neural networks (DNN), convolutional neural networks (CNN), recurrent neural networks (RNN), restricted Boltzmann machines (RBM), deep belief networks (DBN), and deep Q-networks, and these may be applied to fields such as computer vision (CV), speech recognition, natural language processing, and voice/signal processing.

Meanwhile, the processor that performs the functions described above may be a general-purpose processor (for example, a CPU), or may be an AI-dedicated processor for artificial intelligence learning (for example, a GPU).

The memory 25 may store various programs and data required for the operation of the AI device 20. The memory 25 may be implemented as non-volatile memory, volatile memory, flash memory, a hard disk drive (HDD), a solid state drive (SSD), or the like. The memory 25 is accessed by the AI processor 21, which may read, write, modify, delete, and update the data in it. The memory 25 may also store a neural network model (for example, a deep learning model 26) generated through a learning algorithm for data classification/recognition according to an embodiment of the present invention.

Meanwhile, the AI processor 21 may include a data learning unit 22 that trains a neural network for data classification/recognition. The data learning unit 22 may learn criteria for which training data to use in order to determine data classification/recognition, and for how to classify and recognize data using the training data. The data learning unit 22 may train a deep learning model by acquiring the training data to be used and applying the acquired training data to the deep learning model.

The data learning unit 22 may be manufactured in the form of at least one hardware chip and mounted on the AI device 20. For example, the data learning unit 22 may be manufactured as a dedicated hardware chip for artificial intelligence (AI), or as part of a general-purpose processor (CPU) or a graphics processor (GPU), and mounted on the AI device 20. The data learning unit 22 may also be implemented as a software module. When implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer-readable recording medium. In this case, at least one software module may be provided by an operating system (OS) or by an application.

The data learning unit 22 may include a training data acquisition unit 23 and a model learning unit 24.

The training data acquisition unit 23 may acquire the training data required by a neural network model for classifying and recognizing data. For example, the training data acquisition unit 23 may acquire, as training data, image data and/or sample data on transmitters in road infrastructure to be input to the neural network model.

Using the acquired training data, the model learning unit 24 may train the neural network model to have criteria for determining how to classify given data. The model learning unit 24 may train the neural network model through supervised learning, which uses at least part of the training data as the criterion. Alternatively, the model learning unit 24 may train the neural network model through unsupervised learning, which discovers the criteria by learning from the training data on its own without supervision. The model learning unit 24 may also train the neural network model through reinforcement learning, which uses feedback on whether the result of a situation assessment made through learning is correct. In addition, the model learning unit 24 may train the neural network model using a learning algorithm that includes error back-propagation or gradient descent.
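As a minimal illustration of supervised learning with gradient descent, the sketch below fits a single weight `w` so that `w * x` approximates `y` under a mean-squared-error loss. The one-parameter model, the learning rate, the epoch count, and the sample data are assumptions chosen for brevity; an actual neural network would propagate gradients through many layers by back-propagation.

```python
def fit_weight(xs, ys, lr=0.01, epochs=500):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # d/dw of mean((w*x - y)^2) is mean(2 * (w*x - y) * x)
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad   # step against the gradient
    return w

# Labeled training pairs generated by y = 2x, so w should approach 2.
w = fit_weight([1, 2, 3, 4], [2, 4, 6, 8])
```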

Once the neural network model has been trained, the model learning unit 24 may store the trained neural network model in memory. The model learning unit 24 may also store the trained neural network model in the memory of a server connected to the AI device 20 through a wired or wireless network.

The data learning unit 22 may further include a training data preprocessing unit (not shown) and a training data selection unit (not shown) in order to improve the analysis results of the recognition model or to save the resources or time required to generate the recognition model.

The training data preprocessing unit may preprocess the acquired data so that it can be used in learning for situation assessment. For example, the training data preprocessing unit may process the acquired data into a preset format so that the model learning unit 24 can use the acquired training data for learning to recognize image data on transmitters.

In addition, the training data selection unit may select the data required for learning from among the training data acquired by the training data acquisition unit 23 or the training data preprocessed by the preprocessing unit. The selected training data may be provided to the model learning unit 24. For example, the training data selection unit may recognize a specific field in a data set collected over the network and select only the data contained in that field as training data.

In addition, the data learning unit 22 may further include a model evaluation unit (not shown) to improve the analysis results of the neural network model.

The model evaluation unit may input evaluation data to the neural network model and, when the analysis result output for the evaluation data does not satisfy a predetermined criterion, cause the model learning unit 24 to train the model again. In this case, the evaluation data may be predefined data for evaluating the recognition model. As an example, the model evaluation unit may determine that the predetermined criterion is not satisfied when, among the analysis results of the trained recognition model for the evaluation data, the number or ratio of evaluation data items with inaccurate analysis results exceeds a preset threshold.
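The threshold rule described above can be sketched as follows. Because price prediction is a regression task, the sketch has to assume a definition of an "inaccurate" result; here a prediction counts as inaccurate when its relative error exceeds `tolerance`. That definition, the function name, and the sample values are illustrative assumptions, not part of the specification.

```python
def needs_retraining(predictions, actuals, tolerance, max_bad_ratio):
    """Return True when the share of inaccurate predictions exceeds max_bad_ratio.

    A prediction is treated as inaccurate (assumption) when its absolute error
    is larger than `tolerance` times the actual value.
    """
    bad = sum(1 for p, a in zip(predictions, actuals)
              if abs(p - a) > tolerance * abs(a))
    return bad / len(actuals) > max_bad_ratio

# Hypothetical evaluation data: one of four predictions is far off.
preds = [100, 105, 95, 150]
actual = [100, 100, 100, 100]
retrain_strict = needs_retraining(preds, actual, tolerance=0.1, max_bad_ratio=0.2)
retrain_lenient = needs_retraining(preds, actual, tolerance=0.1, max_bad_ratio=0.3)
```

With a quarter of the evaluation items inaccurate, the stricter threshold triggers retraining and the more lenient one does not.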

The communication unit 27 may transmit the result of AI processing by the AI processor 21 to an external electronic device. Here, the external electronic device may be defined as the agricultural product price prediction device 100 of FIG. 8. Meanwhile, the AI device 20 may also be implemented by being functionally embedded in the control unit 110 provided in the agricultural product price prediction device 100.

Meanwhile, although the AI device 20 shown in FIG. 9 has been described as functionally divided into the AI processor 21, the memory 25, the communication unit 27, and so on, it should be noted that these components may be integrated into a single module and referred to as an AI module.

The term "unit" as used herein (for example, control unit) may refer to a unit that includes one of, or a combination of two or more of, hardware, software, and firmware. "Unit" may be used interchangeably with terms such as logic, logical block, component, or circuit. A "unit" may be the minimum unit of an integrally formed component, or a part thereof. A "unit" may also be the minimum unit that performs one or more functions, or a part thereof. A "unit" may be implemented mechanically or electronically. For example, a "unit" may include at least one of an Application-Specific Integrated Circuit (ASIC) chip, Field-Programmable Gate Arrays (FPGAs), or a programmable-logic device, known or to be developed, that performs certain operations.

At least part of a device (for example, modules or their functions) or a method (for example, operations) according to various embodiments may be implemented as instructions stored in a computer-readable storage medium, for example in the form of a program module. When the instructions are executed by one or more processors, the one or more processors may perform the functions corresponding to the instructions. Computer-readable media include all types of recording devices that store data readable by a computer system. Computer-readable storage media may include hard disks, floppy disks, magnetic media (for example, magnetic tape), optical media (for example, CD-ROM (compact disc read only memory) and DVD (digital versatile disc)), magneto-optical media (for example, floptical disks), and hardware devices (for example, ROM (read only memory), RAM (random access memory), or flash memory). Program instructions may include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the various embodiments, and vice versa.

A module or program module according to various embodiments may include at least one of the components described above, may omit some of them, or may further include other components. Operations performed by a module, program module, or other component according to various embodiments may be executed sequentially, in parallel, iteratively, or heuristically. In addition, some operations may be executed in a different order or omitted, or other operations may be added.

As used herein, the term "a" or "an" is defined as one or more than one. In addition, the use of introductory phrases such as "at least one" and "one or more" in the claims should not be construed to imply that the introduction of another claim element by the indefinite article "a" or "an" limits any particular claim containing that claim element to inventions containing only one such element, even when the same claim includes the introductory phrases "at least one" or "one or more" together with an indefinite article such as "a" or "an".

In this document, expressions such as "A or B" or "at least one of A and/or B" may include all possible combinations of the items listed together.

Unless stated otherwise, terms such as "first" and "second" are used to arbitrarily distinguish between the elements that such terms describe. Accordingly, these terms are not necessarily intended to indicate temporal or other prioritization of such elements. The mere fact that certain measures are recited in mutually different claims does not indicate that a combination of these measures cannot be used to advantage.

Any arrangement of components that achieves the same functionality is effectively "associated" such that the desired functionality is achieved. Accordingly, any two components combined to achieve a particular functionality may be regarded as "associated with" each other such that the desired functionality is achieved, irrespective of architecture or intermediate components. Likewise, any two components so associated may be regarded as being "operably connected" or "operably coupled" to each other to achieve the desired functionality.

Furthermore, those skilled in the art will recognize that the boundaries between the functionality of the operations described above are merely illustrative. Multiple operations may be combined into a single operation, a single operation may be distributed among additional operations, and operations may be executed at least partially overlapping in time. Alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments. Other modifications, variations, and alternatives are also possible. Accordingly, the detailed description and drawings are to be regarded in an illustrative rather than a restrictive sense.

"X일 수 있다"는 문구는 조건 X가 충족될 수 있음을 나타낸다. 이 문구는 또한 조건 X가 충족되지 않을 수도 있음을 나타낸다. 예를 들어, 특정 구성 요소를 포함하는 시스템에 대한 참조는 시스템이 특정 구성 요소를 포함하지 않는 시나리오도 포함해야 한다. 예를 들어, 특정 동작을 포함하는 방법에 대한 참조는 해당 방법이 특정 구성 요소를 포함하지 않는 시나리오도 포함해야 한다. 그러나 또 다른 예를 들면, 특정 동작을 수행하도록 구성된 시스템에 대한 참조는 시스템이 특정 작업을 수행하도록 구성되지 않은 시나리오도 포함해야 한다.The phrase “may be X” indicates that condition X may be met. This phrase also indicates that condition X may not be met. For example, a reference to a system containing a specific component should also include scenarios in which the system does not contain the specific component. For example, a reference to a method that includes a specific behavior should also include scenarios in which the method does not include that specific component. However, as another example, a reference to a system configured to perform a specific action should also include scenarios in which the system is not configured to perform a specific task.

The terms "comprising", "having", "composed of", "consisting of", and "consisting essentially of" are used interchangeably. For example, any method may include at least the operations included in the drawings and/or the specification, or may include only the operations included in the drawings and/or the specification. In addition, the word "comprising" does not exclude the presence of elements or operations other than those listed in a claim.

Those skilled in the art will recognize that the boundaries between logic blocks are merely illustrative, and that alternative embodiments may merge logic blocks or circuit elements or impose an alternate decomposition of functionality upon various logic blocks or circuit elements. Accordingly, it is to be understood that the architectures depicted herein are merely exemplary, and that in fact many other architectures achieving the same functionality may be implemented.

Also, for example, in one embodiment, the illustrated examples may be implemented as circuitry located on a single integrated circuit or within the same device. Alternatively, the examples may be implemented as any number of separate integrated circuits or separate devices interconnected with each other in a suitable manner, and other changes, modifications, variations, and alternatives are also possible. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

Also, for example, the examples described above, or portions thereof, may be implemented as software or code representations of physical circuitry or of logical representations convertible into physical circuitry, such as in a hardware description language of any appropriate type.

Furthermore, the present invention is not limited to physical devices or units implemented in non-programmable hardware, but may also be applied to programmable devices or units able to perform the desired device functions by operating in accordance with suitable program code, such as mainframes, minicomputers, servers, workstations, personal computers, notepads, personal digital assistants (PDAs), electronic games, automotive and other embedded systems, cell phones, and various other wireless devices, generally denoted herein as "computer systems".

Any system, apparatus, or device referred to in this specification includes at least one hardware component.

The connections described herein may be any type of connection suitable for transferring signals from or to the respective nodes, units, or devices, for example via intermediate devices. Accordingly, unless implied or stated otherwise, a connection may be, for example, a direct connection or an indirect connection. Connections may be illustrated or described as a single connection, a plurality of connections, unidirectional connections, or bidirectional connections. However, different embodiments may vary the implementation of the connections. For example, separate unidirectional connections may be used rather than bidirectional connections, and vice versa. In addition, a plurality of connections may be replaced with a single connection that transfers multiple signals serially or in a time-multiplexed manner. Likewise, a single connection carrying multiple signals may be separated into various connections carrying subsets of those signals. Many options therefore exist for transferring signals.

Preferred embodiments of the technology of this specification have been described above with reference to the accompanying drawings. The terms and words used in this specification and the claims should not be interpreted as limited to their ordinary or dictionary meanings, but should be interpreted with the meanings and concepts consistent with the technical idea of the present invention. The scope of the present invention is not limited to the embodiments disclosed in this specification, and the present invention may be modified, changed, or improved in various forms within the spirit of the invention and the scope set forth in the claims.

Claims

An agricultural product price prediction method performed by an agricultural product price prediction device, the method comprising:
selecting one time series prediction model from among a plurality of time series prediction models;
reading price data for a plurality of agricultural products from a database and preprocessing the price data for training the selected time series prediction model;
generating at least one price data cluster by clustering the price data with a clustering model that groups one or more agricultural products on a per-product basis, wherein the clustering model clusters the price data such that, when the clustered price data is applied to the selected time series prediction model, agricultural products satisfying a predetermined tuning pattern for hyper-parameter tuning fall in the same cluster, and clusters the price data into the number of clusters that yields the best performance evaluation result;
tuning hyper-parameters of the selected time series prediction model for each of the at least one price data cluster, and training the tuned time series prediction model on each of the at least one price data cluster, thereby generating a plurality of agricultural product price prediction models that calculate a predicted price for each agricultural product at a future time point;
evaluating the performance of each of the plurality of agricultural product price prediction models for each agricultural product, and determining the agricultural product price prediction model with the best performance as the final agricultural product price prediction model for that agricultural product; and
calculating a predicted price of a target agricultural product on a target date using the final agricultural product price prediction model corresponding to the target agricultural product.
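
As an illustration only, the per-product selection of a final model recited above could be sketched as follows. The function name `select_final_models`, the model labels, and the error values are hypothetical and do not come from the specification; the sketch assumes that performance is expressed as a validation error where lower is better.

```python
def select_final_models(errors):
    """Pick, for every product, the model with the lowest validation error.

    errors maps model name -> {product name -> validation error}.
    Returns a dict mapping product name -> name of the best model.
    All names here are illustrative assumptions, not terms from the claims.
    """
    products = set().union(*(per_product.keys() for per_product in errors.values()))
    final = {}
    for product in sorted(products):
        # Only models that were actually evaluated on this product compete.
        candidates = {m: e[product] for m, e in errors.items() if product in e}
        final[product] = min(candidates, key=candidates.get)
    return final


# Hypothetical validation errors for two candidate models and two products.
example_errors = {
    "model_a": {"cabbage": 120.0, "onion": 80.0},
    "model_b": {"cabbage": 95.0, "onion": 110.0},
}
final_models = select_final_models(example_errors)
```

In this sketch each product may end up with a different final model, which mirrors the per-product evaluation in the claim.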

(Deleted)

The agricultural product price prediction method of claim 1,
wherein the hyper-parameter tuning of the selected time series prediction model is
a process of training the selected time series prediction model on each of the at least one price data cluster while varying the value of each hyper-parameter within a predetermined range,
and determining, as the value of each hyper-parameter, the value at which the selected time series prediction model yields the best performance.
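
A minimal sketch of such per-cluster tuning is given below, under stated assumptions: simple exponential smoothing stands in for the selected time series prediction model, its smoothing factor `alpha` is the only hyper-parameter, and performance is the mean absolute one-step-ahead error. The cluster names, price values, and function names are hypothetical.

```python
from itertools import product


def ses_forecast_errors(series, alpha):
    """Absolute one-step-ahead errors of simple exponential smoothing."""
    level = series[0]
    errors = []
    for actual in series[1:]:
        errors.append(abs(actual - level))  # the forecast is the current level
        level = alpha * actual + (1 - alpha) * level
    return errors


def tune_cluster(cluster_series, grid):
    """Try every hyper-parameter combination in the grid for one cluster.

    Returns (best_params, best_score), where the score is the mean absolute
    error over all series in the cluster (lower is better).
    """
    best_params, best_score = None, float("inf")
    keys = sorted(grid)
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        errors = [e for s in cluster_series for e in ses_forecast_errors(s, **params)]
        score = sum(errors) / len(errors)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical clusters: steadily rising prices versus oscillating prices.
clusters = {
    "smooth": [[100, 101, 102, 103, 104, 105], [50, 51, 52, 53, 54, 55]],
    "noisy": [[100, 120, 95, 118, 99, 121], [40, 60, 38, 61, 41, 59]],
}
grid = {"alpha": [0.1, 0.5, 0.9]}  # the predetermined range of values to try
tuned = {name: tune_cluster(series, grid) for name, series in clusters.items()}
```

Each cluster is tuned independently, so products with different price dynamics can receive different hyper-parameter values.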

The agricultural product price prediction method of claim 1,
wherein the final agricultural product price prediction model is
one agricultural product price prediction model selected from among: a first agricultural product price prediction model generated separately for each future time point based on the selected time series prediction model; a second agricultural product price prediction model in which the prediction result for a previous time point produced by the selected time series prediction model is fed as input to the time series prediction model for predicting the next time point; a third agricultural product price prediction model generated by training in which the prediction result of the time series prediction model that predicts the price at a previous time point is included in the input data of the time series prediction model that predicts the price at the next time point;
and a fourth agricultural product price prediction model in which a single time series prediction model calculates the predicted price of each agricultural product for all future time points.
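
For illustration, the first two of these strategies (a separate model per future time point, and feeding each prediction back in as the next input) can be sketched as below, together with a hold-out comparison between them. This is a sketch under assumptions: a one-variable least-squares line stands in for the time series prediction model, and all function names are hypothetical. The third and fourth strategies are not shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def direct_forecast(series, horizon):
    """First strategy: fit a separate model for each future step h."""
    preds = []
    for h in range(1, horizon + 1):
        a, b = fit_line(series[:-h], series[h:])  # maps price at t to price at t + h
        preds.append(a * series[-1] + b)
    return preds


def recursive_forecast(series, horizon):
    """Second strategy: one one-step model, each prediction becomes the next input."""
    a, b = fit_line(series[:-1], series[1:])
    preds, last = [], series[-1]
    for _ in range(horizon):
        last = a * last + b
        preds.append(last)
    return preds


def mean_abs_error(actual, predicted):
    return sum(abs(x - p) for x, p in zip(actual, predicted)) / len(actual)


def select_strategy(series, horizon, strategies):
    """Hold out the last `horizon` points and return the lowest-error strategy."""
    train, holdout = series[:-horizon], series[-horizon:]
    scores = {name: mean_abs_error(holdout, fn(train, horizon))
              for name, fn in strategies.items()}
    return min(scores, key=scores.get), scores


prices = [10, 12, 14, 16, 18, 20, 22, 24]  # hypothetical daily prices
strategies = {"direct": direct_forecast, "recursive": recursive_forecast}
best_name, scores = select_strategy(prices, 3, strategies)
```

On real price data the two strategies usually differ, because recursive forecasts accumulate error over the horizon while direct forecasts train each step on fewer samples.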

The agricultural product price prediction method of claim 1,
wherein the reading of the price data for the plurality of agricultural products from the database and the preprocessing for training the selected time series prediction model comprises:
loading the price data from the database and filling in missing data using a predetermined estimation method; and
arranging the price data, with the missing data filled in,
into a format for training the selected time series prediction model.
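
The two preprocessing steps can be sketched as follows. The sketch assumes linear interpolation as the estimation method and a sliding window of lagged prices as the training format; the claims name neither, and the function names `interpolate_missing` and `make_windows` are hypothetical.

```python
def interpolate_missing(prices):
    """Fill None gaps by linear interpolation between the nearest known values.

    Gaps at either end of the series copy the nearest known value.
    """
    known = [i for i, p in enumerate(prices) if p is not None]
    if not known:
        raise ValueError("series has no observed prices")
    filled = list(prices)
    for i, p in enumerate(prices):
        if p is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = prices[right]
        elif right is None:
            filled[i] = prices[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = prices[left] + weight * (prices[right] - prices[left])
    return filled


def make_windows(series, n_lags, horizon):
    """Turn a series into (inputs, targets) pairs for supervised training.

    Each sample uses `n_lags` consecutive prices as input and the following
    `horizon` prices as the multi-step target.
    """
    samples = []
    for start in range(len(series) - n_lags - horizon + 1):
        x = series[start:start + n_lags]
        y = series[start + n_lags:start + n_lags + horizon]
        samples.append((x, y))
    return samples


raw = [None, 100, None, None, 130, None]  # hypothetical prices with gaps
clean = interpolate_missing(raw)
samples = make_windows(clean, n_lags=3, horizon=2)
```

The window length and horizon are free choices in this sketch; in practice they would follow the input shape expected by the selected time series prediction model.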

A computer program stored in a computer-readable storage medium and combined with hardware
to execute each step of the agricultural product price prediction method using the agricultural product price prediction device according to any one of claims 1 and 4 to 6.

An agricultural product price prediction device comprising: a communication unit that communicates with an external agricultural product price data providing server to collect price data for a plurality of agricultural products;
a storage unit that stores the collected price data in a database; and
a control unit functionally connected to the communication unit and the storage unit, wherein the control unit:
selects one time series prediction model from among a plurality of time series prediction models,
reads the price data for the plurality of agricultural products from the database and preprocesses the price data for training the selected time series prediction model,
generates at least one price data cluster by clustering the price data with a clustering model that groups one or more agricultural products on a per-product basis,
tunes hyper-parameters of the selected time series prediction model for each of the at least one price data cluster and trains the tuned time series prediction model on each of the at least one price data cluster, thereby generating a plurality of agricultural product price prediction models that calculate a predicted price for each agricultural product at a future time point,
evaluates the performance of each of the plurality of agricultural product price prediction models for each agricultural product and determines the agricultural product price prediction model with the best performance as the final agricultural product price prediction model for that agricultural product, and
calculates a predicted price of a target agricultural product on a target date using the final agricultural product price prediction model corresponding to the target agricultural product,
wherein the clustering model clusters the price data such that, when the clustered price data is applied to the selected time series prediction model, agricultural products satisfying a predetermined tuning pattern for hyper-parameter tuning fall in the same cluster,
and clusters the price data into the number of clusters that yields the best performance evaluation result.

(Deleted)

The agricultural product price prediction device of claim 8,
wherein the hyper-parameter tuning of the selected time series prediction model is
a process of training the selected time series prediction model on each of the at least one price data cluster while varying the value of each hyper-parameter within a predetermined range,
and determining, as the value of each hyper-parameter, the value at which the selected time series prediction model yields the best performance.

The agricultural product price prediction device of claim 8,
wherein the final agricultural product price prediction model is
one agricultural product price prediction model selected from among: a first agricultural product price prediction model generated separately for each future time point based on the selected time series prediction model; a second agricultural product price prediction model in which the prediction result for a previous time point produced by the selected time series prediction model is fed as input to the time series prediction model for predicting the next time point; a third agricultural product price prediction model generated by training in which the prediction result of the time series prediction model that predicts the price at a previous time point is included in the input data of the time series prediction model that predicts the price at the next time point;
and a fourth agricultural product price prediction model in which a single time series prediction model calculates the predicted price of each agricultural product for all future time points.

The agricultural product price prediction device of claim 8,
wherein the control unit,
in reading the price data for the plurality of agricultural products from the database and preprocessing the price data for training the selected time series prediction model,
loads the price data from the database and fills in missing data using a predetermined estimation method, and
arranges the price data, with the missing data filled in,
into a format for training the selected time series prediction model.