KR102242577B1

KR102242577B1 - Data analysis method and apparatus for improving crop productivity

Info

Publication number: KR102242577B1
Application number: KR1020190055303A
Authority: KR
Inventors: 이혜림; 조용빈; 이상영; 황정환
Original assignee: 대한민국
Priority date: 2019-05-10
Filing date: 2019-05-10
Publication date: 2021-04-20
Also published as: WO2020230937A1; KR20200130023A

Abstract

The present invention relates to a data analysis method and apparatus for improving crop productivity. A data analysis method according to an embodiment of the present invention may include: constructing an integrated database of environmental data, growth data, and production data collected from farms; extracting, from the environmental data, first key variables that affect crop productivity and, from the growth data, second key variables that affect crop productivity; analyzing the correlations among the environmental data, growth data, and production data on the basis of the first and second key variables; and, on the basis of the analysis results, estimating the interval values of the first and second key variables that yield maximum production for each growth stage, taking the cultivation period of the crop into account.

Description

Data analysis method and device for improving the productivity of crops {DATA ANALYSIS METHOD AND APPARATUS FOR IMPROVING CROP PRODUCTIVITY}

The present invention relates to a data analysis method and apparatus for improving crop productivity and, more particularly, to a method and system that can derive the optimal environmental conditions for tomatoes by cultivation period and growth stage by analyzing big data collected from farms, in order to achieve optimal profitability through increased crop yield and the like.

As information and communications technology (ICT) has advanced and spread across industries, the government-led diffusion of IT convergence technology in agriculture has drawn attention to the smart farm in protected horticulture as a capital- and technology-intensive, high-value-added form of farming, and its share of Korean agriculture has grown.

The productivity of protected-cultivation farms varies widely with how well the environment (light, temperature, humidity, moisture, etc.) is managed, and environmental management technology is becoming especially important under abnormal weather (insufficient sunlight, low or high temperatures, etc.). Under the government-led ICT diffusion program, the area of smart farms has grown since 2014 from 60 ha (2014) to 364 ha (2015), 600 ha (2016), and 4,000 ha (2017). However, participants in the program report difficulties and express a strong demand for the dissemination of standardized integrated environmental management technology and for comprehensive consulting.

Whereas earlier smart farms focused on automating cultivation facilities and increasing convenience, current research aims to raise the productivity of protected-cultivation farms by analyzing optimal growth conditions from big data and applying them in the field. Protected horticulture in Korea lacks precise environmental control: a comparison of the per-unit yields of protected horticultural crops in the Netherlands and Korea as of 2017 shows large gaps of 8.1-fold for cucumbers, 7.2-fold for tomatoes, and 3.3-fold for strawberries. For Korea to raise protected-vegetable production to the level of advanced countries, the big data measured by smart farms, which is needed for precise data-driven growth management, must be managed and analyzed in an integrated way, and comprehensive field demonstrations are needed in which the analysis results are used to improve the productivity of the crops that farms grow.

Data-driven research on smart farm productivity can raise agricultural productivity by making production more scientific and distribution more intelligent. For example, bottle cultivation of mushrooms proceeds by preparing the substrate, filling the bottles, sterilizing them, inoculating them with mushroom spawn, and then inducing and growing the mushrooms. Because a farm produces several thousand bottles every day, even slight differences in yield per bottle accumulate and have a large effect on the farm's annual income. Factors that affect mushroom growth include the fruiting-initiation process at each growth stage and the temperature, humidity, CO₂, and illuminance at each growth stage. To improve the productivity of mushroom farms, it is necessary to identify accurately the agricultural materials that go into mushroom production and to manage the conditions in the growing house so that they are optimal for mushroom production.

Recently, in line with the government's policy of spreading ICT convergence, smart farms that measure and manage the growing environment with temperature, humidity, and carbon dioxide sensors installed in the growing house have been encouraged. However, cultivation environment data are not being collected and analyzed, and cultivation consulting based on environmental data is not being provided. Therefore, to improve cultivation productivity, farm consulting based on the collected growing-environment data is needed, together with research that surveys the production and management status of farms and investigates and analyzes the cultivation environment in order to establish an optimal production environment. In addition, to increase farm output and improve quality, comprehensive field-demonstration research is needed that manages and analyzes in an integrated way the smart farm measurement big data required for precise, big-data-based growth management, and that feeds the analysis results back to farms to improve productivity.

Korean Registered Patent Publication No. 10-1811640 (2017.12.26)

The present invention was devised in response to the needs described above. Its object is to provide a data analysis method and model that use big data collected from smart farms to analyze the correlations among the cultivation environment, growth, and yield of tomatoes, so that users can set and apply the optimal environmental conditions that maximize tomato yield for each farm.

A data analysis method for improving crop productivity according to an embodiment of the present invention may include: constructing an integrated database of environmental data, growth data, and production data collected from farms; extracting, from the environmental data, first key variables that affect crop productivity and, from the growth data, second key variables that affect crop productivity; analyzing the correlations among the environmental data, growth data, and production data on the basis of the first and second key variables; and, on the basis of the analysis results, estimating the interval values of the first and second key variables that yield maximum production for each growth stage, taking the cultivation period of the crop into account.

In the step of constructing the integrated database according to an embodiment of the present invention, the unit of time of the environmental data, growth data, and production data may be converted to weeks, and the data converted to weekly units may be classified and stored by farm.

According to an embodiment of the present invention, the first key variables may include the farm's cumulative insolation, temperature, humidity, carbon dioxide concentration, number of irrigations per day, water supply per irrigation, salt concentration, and pH, and the second key variables may include the crop's growth length, stem thickness, and flower cluster height.

The step of extracting the first and second key variables according to an embodiment of the present invention may include: identifying the unit of the cumulative insolation extracted from the environmental data; comparing the farm's nighttime cumulative insolation for each day with the cumulative insolation obtained using a lag variable expressed in units of time; and determining, according to the result of the comparison, whether to correct the cumulative insolation.

The step of analyzing the correlations according to an embodiment of the present invention may include: matching the production data with the first key variables cumulatively averaged over the period from flowering to harvest of the crop; and analyzing which first key variables affect each of the second key variables.

The step of estimating the interval values of the first and second key variables according to an embodiment of the present invention may include: extracting the top N (N being a natural number) integrated data records judged, according to the correlation analysis, to have high crop yield; analyzing, on the basis of the extracted integrated data, the interval values of the first and second key variables that correspond to maximum yield for each growth stage, taking the cultivation period into account; and matching the interval values of the first key variables with those of the second key variables according to the results of the correlation analysis.
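The top-N extraction and per-stage interval estimation can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (dictionaries with "stage" and "yield" keys), the function name estimate_intervals, and the choice of the observed minimum and maximum as the "interval value" are all hypothetical, since the text above does not specify how the interval is computed.

```python
from collections import defaultdict


def estimate_intervals(records, n, variables):
    """Return {stage: {variable: (low, high)}} from the n highest-yield records.

    records   : list of dicts with keys "stage", "yield", and one key per variable
                (hypothetical layout, not taken from the patent).
    n         : number of top-yield records to keep (the "top N").
    variables : names of the key variables to summarize.
    """
    # Keep the n records with the highest yield.
    top = sorted(records, key=lambda r: r["yield"], reverse=True)[:n]

    # Group the retained records by growth stage.
    by_stage = defaultdict(list)
    for rec in top:
        by_stage[rec["stage"]].append(rec)

    # Assumption: the interval is the observed min-max range within each stage.
    return {
        stage: {
            var: (min(r[var] for r in rows), max(r[var] for r in rows))
            for var in variables
        }
        for stage, rows in by_stage.items()
    }
```

Other interval definitions (for example, percentile ranges) would fit the same structure; only the min and max expressions would change.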

A data analysis apparatus for improving crop productivity according to an embodiment of the present invention may include: a database management unit that constructs an integrated database of environmental data, growth data, and production data collected from farms; a key variable extraction unit that extracts, from the environmental data, first key variables that affect crop productivity and, from the growth data, second key variables that affect crop productivity; a first data analysis unit that analyzes the correlations among the environmental data, growth data, and production data on the basis of the first and second key variables; and a second data analysis unit that, on the basis of the analysis results, estimates the interval values of the first and second key variables that yield maximum production for each growth stage, taking the cultivation period of the crop into account.

The database management unit according to an embodiment of the present invention may convert the unit of time of the environmental data, growth data, and production data to weeks, and may classify and store the data converted to weekly units by farm.

The key variable extraction unit according to an embodiment of the present invention may identify the unit of the cumulative insolation extracted from the environmental data, compare the farm's nighttime cumulative insolation for each day with the cumulative insolation obtained using a lag variable expressed in units of time, and determine, according to the result of the comparison, whether to correct the cumulative insolation.

The first data analysis unit according to an embodiment of the present invention may match the production data with the first key variables cumulatively averaged over the period from flowering to harvest of the crop, and may analyze which first key variables affect each of the second key variables.

The second data analysis unit according to an embodiment of the present invention may extract the top N (N being a natural number) integrated data records judged, according to the correlation analysis, to have high crop yield; analyze, on the basis of the extracted integrated data, the interval values of the first and second key variables that correspond to maximum yield for each growth stage, taking the cultivation period into account; and match the interval values of the first key variables with those of the second key variables according to the results of the correlation analysis.

Meanwhile, according to an embodiment of the present invention, a computer-readable recording medium may be provided on which a program for executing the above-described method on a computer is recorded.

According to the data analysis method and apparatus provided as an embodiment of the present invention, farm profitability can be increased by using the big data generated by tomato smart farms to propose environment settings that increase tomato yield.

In addition, the basic environment settings used by integrated environmental control systems in protected horticulture, by nutrient-solution equipment suppliers, and the like can be supplemented, which lays the groundwork for farm-specific management, data standardization, and upgrades of integrated environmental control systems.

FIG. 1 is a flowchart showing a data analysis method for improving crop productivity according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the process of extracting cumulative insolation, one of the first key variables, according to an embodiment of the present invention.
FIG. 3 is a conceptual diagram showing the process of extracting cumulative insolation, one of the first key variables, according to an embodiment of the present invention.
FIG. 4 is a flowchart showing the process of estimating the interval values of the first and second key variables according to an embodiment of the present invention.
FIG. 5 is a table showing the result of estimating the interval values of the first key variables for each growth stage, taking the cultivation period into account, according to an embodiment of the present invention.
FIG. 6A is a table showing the result of matching the interval values of the first key variables with those of the second key variables for each growth stage, taking the cultivation period into account, according to an embodiment of the present invention, and FIG. 6B is a table showing an example of FIG. 6A.
FIG. 7 shows graphs comparing the top 20% of the integrated data with the remaining data according to an embodiment of the present invention in terms of (a) temperature and humidity and (b) cumulative insolation and carbon dioxide concentration.
FIG. 8 shows graphs comparing the top 20% of the integrated data with the remaining data according to an embodiment of the present invention in terms of (a) water supply per irrigation and number of irrigations and (b) production per pyeong (about 3.3 m²).
FIG. 9 shows graphs comparing the top 20% of the integrated data with the remaining data according to an embodiment of the present invention in terms of (a) growth length, (b) stem thickness, and (c) flower cluster height.
FIG. 10 is a block diagram showing a data analysis apparatus for improving crop productivity according to an embodiment of the present invention.

The terms used in this specification are briefly explained first, and the present invention is then described in detail.

The terms used in the present invention are, as far as possible, general terms in wide current use, chosen with the functions of the present invention in mind; however, they may vary with the intent of those working in the field, with precedent, with the emergence of new technologies, and so on. In certain cases a term has been chosen arbitrarily by the applicant, and its meaning is then set out in detail in the corresponding part of the description. The terms used in the present invention should therefore be defined on the basis of their meaning and the overall content of the present invention, not simply by their names.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components, not that other components are excluded, unless specifically stated otherwise. Terms such as "... unit" used in the specification denote a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of the two. Furthermore, when a part is said to be "connected" to another part, this covers not only the case where it is "directly connected" but also the case where it is connected "with another component in between".

Embodiments of the present invention are described in detail below with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice it. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here. Parts unrelated to the description are omitted from the drawings for clarity, and similar parts are given similar reference numerals throughout the specification.

The present invention is described in detail below with reference to the accompanying drawings.

FIG. 1 is a flowchart showing a data analysis method for improving crop productivity according to an embodiment of the present invention.

Referring to FIG. 1, a data analysis method for improving crop productivity according to an embodiment of the present invention may include: constructing an integrated database of environmental data, growth data, and production data collected from farms (S100); extracting, from the environmental data, first key variables that affect crop productivity and, from the growth data, second key variables that affect crop productivity (S200); analyzing the correlations among the environmental data, growth data, and production data on the basis of the first and second key variables (S300); and, on the basis of the analysis results, estimating the interval values of the first and second key variables that yield maximum production for each growth stage, taking the cultivation period of the crop into account (S400).

In an embodiment of the present invention, the crop is preferably the tomato. Accordingly, the environmental data according to an embodiment of the present invention are data on the environmental conditions under which the tomatoes are grown, the growth data are data on the growth of the tomatoes, and the production data are data on the quantity of tomatoes produced by the farm.

The environmental data according to an embodiment of the present invention are data measured by sensors installed on the farm, that is, measurements of the environmental conditions of the farm where the crop is grown, taken on a fixed time basis. For example, information on 24 items describing environmental conditions that affect tomato cultivation, such as the farm's temperature, humidity, insolation, residual carbon dioxide concentration, and nutrient solution, may be measured every hour by the various sensors installed on the farm, and the measured information for each item may be collected by the database management unit (10) as environmental data.

The environmental data described above may also be collected separately for each of the time periods shown in [Table 1] below. Sunrise and sunset may be determined from the astronomical time of the region in which the farm where the environmental data are collected is located.

[Table 1]
- Daily average: average environment over the 24 hours of the day
- Daytime: environment from sunrise until sunset
- Nighttime: environment from sunset until sunrise the following day
- Before sunset: environment from 2 hours before sunset until sunset
- Early evening: environment from sunset until 2 hours after sunset
- Late night: environment from 2 hours after sunset until 2 hours before sunrise the following day
- Dawn: environment from 2 hours before sunrise until sunrise
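The period definitions in [Table 1] can be expressed as a small classifier. The following is a minimal sketch, assuming that the sunrise and sunset times are already known for the farm's location and that the boundaries are half-open (a reading taken exactly at sunset counts as nighttime); the function name periods_for and the label strings are hypothetical and do not come from the patent.

```python
from datetime import datetime, timedelta

TWO_HOURS = timedelta(hours=2)


def periods_for(ts, sunrise, sunset, next_sunrise):
    """Return the [Table 1] period labels that apply to timestamp ts.

    ts must satisfy sunrise <= ts < next_sunrise, i.e. it belongs to the
    day/night cycle that starts at `sunrise`.
    """
    if not (sunrise <= ts < next_sunrise):
        raise ValueError("ts must fall between sunrise and the next sunrise")

    labels = ["daily_average"]  # every reading contributes to the daily average
    if ts < sunset:
        labels.append("daytime")
        if ts >= sunset - TWO_HOURS:
            labels.append("before_sunset")
    else:
        labels.append("nighttime")
        # Assumption: if the night were shorter than 4 hours, early evening
        # takes precedence over dawn.
        if ts < sunset + TWO_HOURS:
            labels.append("early_evening")
        elif ts < next_sunrise - TWO_HOURS:
            labels.append("late_night")
        else:
            labels.append("dawn")
    return labels
```

Because the periods overlap (for example, "before sunset" lies inside "daytime"), the function returns a list of labels rather than a single one.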

The growth data according to an embodiment of the present invention are data measured on the farm by agricultural experts or the like, that is, measurements of indicators of how far the crop has grown, taken on a fixed time basis. For example, the growth data may include information measured every week on 12 items such as the tomato's growth length, number of leaves, leaf length, leaf width, stem thickness, flower cluster height, flowering cluster, fruit-setting cluster, harvest cluster, and number of fruits, and the measured information for each item may be collected by the database management unit (10) as growth data.

The production data according to an embodiment of the present invention are data on the quantity of the crop produced on the days on which the farm ships it. For example, the production data may be extracted from records such as the sales ledger and may include the tomato production (kg) per predetermined unit area (3.3 m²) on each shipping day. Such production data may be collected by the database management unit (10) on a daily basis.

In the step of constructing the integrated database (S100) according to an embodiment of the present invention, the unit of time of the environmental data, growth data, and production data may be converted to weeks, and the data converted to weekly units may be classified and stored by farm.

As described above, the environmental data, growth data, and production data are measured or collected on different time bases (hourly, weekly, and daily, respectively), so their units of time must be unified before the integrated database can be constructed and the correlations among the data, described later, can be analyzed. Accordingly, in the step of constructing the integrated database (S100), the database management unit (10) may convert all the data collected from the farms to weekly units so that they share a single unit of time. The database management unit (10) may also classify and manage the weekly data separately for each farm.
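As an illustration of this unification step, the sketch below aggregates hourly, weekly, and daily observations into records keyed by farm and ISO week. It is an assumption-laden example rather than the patented implementation: the tuple layout, the function names, the use of ISO weeks, and the aggregation rules (mean for environmental and growth values, sum for production) are all hypothetical, since the text does not say how values within a week are combined.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean


def week_key(farm_id, when):
    """Key a date or datetime by (farm, ISO year, ISO week)."""
    iso = when.isocalendar()
    return (farm_id, iso[0], iso[1])


def to_weekly(rows, aggregate):
    """Aggregate (farm_id, date_or_datetime, value) rows per farm and week."""
    buckets = defaultdict(list)
    for farm_id, when, value in rows:
        buckets[week_key(farm_id, when)].append(value)
    return {key: aggregate(values) for key, values in buckets.items()}


def build_integrated(env_hourly, growth_weekly, yield_daily):
    """Join the three sources on (farm, year, week).

    Assumption: environment and growth are averaged, production is summed.
    Only weeks present in all three sources are kept.
    """
    env = to_weekly(env_hourly, mean)
    growth = to_weekly(growth_weekly, mean)
    prod = to_weekly(yield_daily, sum)
    keys = env.keys() & growth.keys() & prod.keys()
    return {
        k: {"env": env[k], "growth": growth[k], "yield": prod[k]}
        for k in sorted(keys)
    }
```

Keying on the farm as well as the week keeps each farm's records separate, which corresponds to the per-farm classification described above.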

Constructing the integrated database (S100) by unifying the unit of time and classifying the data by farm in this way can greatly speed up the analysis of the correlations among the environmental data, growth data, and production data. It also makes it easy to turn the data collected from tomato farms in every region into big data and to manage them efficiently.

Once the integrated database has been built by the database management unit 10 according to an embodiment of the present invention, the key variable extraction unit 20 may perform a step (S200) of extracting key variables from the integrated data (i.e., environmental data, growth data, and production data) stored in the integrated database according to predetermined criteria (i.e., by week and by farm). Multiple regression analysis may be used in the key variable extraction step (S200). That is, the key variable extraction unit 20 may extract, through multiple regression analysis, some of the first key variables, which relate to environmental conditions affecting tomato productivity, and some of the second key variables, which relate to growth items affecting tomato productivity.

Regression analysis is a statistical technique for estimating the effect of one or more independent variables on a dependent variable. In other words, regression analysis fits a model between two variables for observed continuous variables and then measures the goodness of fit; the approach that identifies the relationship between one dependent variable and several independent variables is called multiple regression analysis. According to an embodiment of the present invention, a moving average method may be used in this multiple regression analysis to analyze the correlations between environment and production, and between growth and production, while reflecting time-lag effects.
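One way to write such a model down, as a sketch only: the description names multiple regression and a moving average but gives no formula, so the symbols and the window length $k$ below are assumptions.

```latex
y_t = \beta_0 + \sum_{j=1}^{p} \beta_j \,\bar{x}_{j,t} + \varepsilon_t,
\qquad
\bar{x}_{j,t} = \frac{1}{k} \sum_{i=0}^{k-1} x_{j,\,t-i}
```

Here $y_t$ would be the production in week $t$, $x_{j,t}$ the $j$-th environmental or growth variable in week $t$, $\bar{x}_{j,t}$ its moving average over the preceding $k$ weeks (which carries the time-lag effect), $\beta_j$ the regression coefficients, and $\varepsilon_t$ the error term.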

The key variable extraction unit 20 according to an embodiment of the present invention may extract the first and second key variables not only through the multiple regression analysis described above but also through analysis of the integrated data that reflects various research data. The first key variables extracted in this way may include the farm's cumulative insolation, temperature, humidity, carbon dioxide concentration, number of irrigations per day, water supply per irrigation, salt concentration, and pH. Here, temperature means the internal temperature of the facility in which the crop is grown, because protected horticultural crops such as tomatoes are, by the nature of their cultivation, affected by the internal temperature rather than the external temperature.

In addition, the second key variables may include the crop's growth length, stem thickness, and flower cluster height. Here, the flower cluster refers to the entire region of flowers growing above the flower stalk of the main stem. For example, a tomato flower cluster refers to the whole flower, including the flower stalk on which the tomato flowers bloom.

FIG. 2 is a flowchart showing the process of extracting cumulative insolation among the first key variables according to an embodiment of the present invention, and FIG. 3 is a conceptual diagram thereof.

Among the first key variables, variables other than cumulative insolation, such as temperature and humidity, follow an average pattern of change, so when a particular outlier occurs in the collected data, it can be treated as missing or corrected immediately by taking the preceding and following times into account. Cumulative insolation, however, is difficult to use for correlation analysis when a particular outlier or data loss occurs. Therefore, according to an embodiment of the present invention, cumulative insolation, unlike the other variables, may undergo pattern identification, outlier checking, and correction during extraction of the first key variables so that intact data values can be extracted.

That is, referring to FIG. 2, the step of extracting the first and second key variables (S200) according to an embodiment of the present invention may include identifying the unit of the cumulative insolation extracted from the environmental data (S210), comparing the farm's daily night-time cumulative insolation with the cumulative insolation given by a lag variable defined per unit of time (S220), and determining whether to correct the cumulative insolation according to the result of the comparison (S230).

In the step of identifying the unit of cumulative insolation (S210) according to an embodiment of the present invention, it may be determined whether the data values related to cumulative insolation extracted from the environmental data are in units of light intensity or in units of cumulative insolation.

For example, referring to FIG. 3, the key variable extraction unit 20 may determine whether, among the cumulative insolation values extracted from the environmental data, the hourly cumulative insolation value is at least 0 J/cm² and no more than a value in the hundreds (e.g., 110 J/cm²) while the daily cumulative insolation value is 3000 J/cm² or more (S10). If a cumulative insolation value extracted from the environmental data satisfies this condition, the value may be judged to be light intensity (S21). A value judged to be light intensity may be converted into a cumulative insolation value using an insolation conversion formula (S30). If the value does not satisfy this condition, it is judged to be a cumulative insolation value (S22), and the primary determination for missing-value treatment may be performed (S40).
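The S10 check can be illustrated with a short sketch. The function name and the parameter `hourly_cap` are hypothetical: the description only says "a value in the hundreds (e.g., 110 J/cm²)", so 110 is used here purely as a placeholder default. The conversion formula of S30 is not given in this passage and is therefore not shown.

```python
def looks_like_light_intensity(hourly_values, daily_value,
                               hourly_cap=110.0, daily_min=3000.0):
    """Return True when readings match the S10 pattern for light intensity.

    hourly_cap is an assumed stand-in for "a value in the hundreds";
    daily_min is the 3000 J/cm^2 threshold stated in the description.
    """
    hourly_in_range = all(0.0 <= v <= hourly_cap for v in hourly_values)
    return hourly_in_range and daily_value >= daily_min
```

A value for which this returns True would go on to the unit conversion (S30); otherwise it would be kept as cumulative insolation and passed to the night-time check (S40).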

In the step (S220) of comparing the farm's daily night-time cumulative insolation with the cumulative insolation given by the lag variable according to an embodiment of the present invention, it may be determined whether the daily night-time cumulative insolation matches the cumulative insolation given by the lag variable (S40).

Because there is no sunlight at night, cumulative insolation does not increase during the night-time period of each day. However, a system error may cause the cumulative insolation to keep increasing, or the values to fluctuate, even at night. Since cumulative insolation cannot be measured accurately in such cases, the present invention performs the comparison and determination process (S220), which relies on a lag variable and on the assumption that cumulative insolation does not increase at night, in order to estimate cumulative insolation accurately.

Here, the lag variable is a variable obtained by applying a time offset to the raw insolation data (i.e., the basic insolation data extracted from the integrated data), and it is newly created in step S220 in order to extract the cumulative insolation for a specific date and verify its true value. That is, the lag variable is a variable newly defined according to an embodiment of the present invention in order to perform a comparison based on the time difference of the insolation data, under the assumption that cumulative insolation does not increase at night.

For example, referring to FIG. 3, the key variable extraction unit 20 may extract, from the cumulative insolation values, the daily night-time cumulative insolation and the cumulative insolation given by the lag variable. If the two extracted values, compared as in [Table 2], do not match, the key variable extraction unit 20 may regard the cumulative insolation value as a system error and treat it as missing (S50). If the two values match, the cumulative insolation for the specific date is determined to be the true value for that date, and a secondary determination process for outlier checking and correction may additionally be performed (S60).

[Table 2]
Time | Night-time cumulative insolation | Lag variable of cumulative insolation (A)
2019-05-02 20:00 | 1555 | -
2019-05-02 21:00 | 1575 | 1555
2019-05-02 22:00 | 1580 | 1575
2019-05-02 23:00 | 1595 | 1580
2019-05-03 20:00 | 1600 | 1595
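The lag-variable comparison can be sketched as below, using the 2019-05-02 readings from [Table 2]. The helper names `lag` and `night_mismatches` and the one-step (one-hour) offset are assumptions for illustration; the description does not fix the offset.

```python
def lag(values, steps=1):
    """Shift a series later by `steps` positions, padding the start with None."""
    return [None] * steps + list(values[:len(values) - steps])


def night_mismatches(night_values, steps=1):
    """Indices where a night-time reading differs from its lagged value.

    Under the assumption that cumulative insolation does not increase at
    night, any mismatch suggests a system error (treated as missing, S50).
    """
    lagged = lag(night_values, steps)
    return [i for i, (cur, prev) in enumerate(zip(night_values, lagged))
            if prev is not None and cur != prev]


night = [1555, 1575, 1580, 1595]  # 20:00 to 23:00 on 2019-05-02
lagged = lag(night)
mismatched = night_mismatches(night)
```

In this example every lagged comparison differs, so the readings for that night would be flagged rather than accepted as the true value.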

In the step of determining whether to correct the cumulative insolation (S230) according to an embodiment of the present invention, it may be determined whether the cumulative insolation value satisfies certain conditions, and whether to correct the value may be decided according to the result.

For example, referring to FIG. 3, the key variable extraction unit 20 may determine whether the daily cumulative insolation value is between 0 J/cm² and 200 J/cm² inclusive, or is 3000 J/cm² or more (S60). If the value does not satisfy this condition, it may be used as is as a value of the first key variable (S72). If the value satisfies this condition, it may be corrected using the correction methods shown in [Table 3] below (S71).

[Table 3]
① The value is reset to a non-zero value at sunrise the next day
: (last cumulative insolation value - cumulative insolation value at sunrise)
② No measurement was taken partway through the daily accumulation because of a sensor error
: treated as missing
③ The value is reset partway through accumulation after sunrise and then accumulates again
: (cumulative value before the reset + cumulative value after the reset)
④ The value is not reset at sunrise the next day and keeps accumulating
: daily (maximum value - minimum value)
⑤ The value satisfies 0 J/cm² ≤ insolation ≤ 100 J/cm²
: treated as missing

That is, when the cumulative insolation value satisfies the above condition, the key variable extraction unit 20 may evaluate outliers arising over time against the five correction criteria in [Table 3] and, depending on the result, either perform a correction or judge the value to be a system error and treat it as missing. The reference times for the five correction methods in [Table 3] (e.g., sunrise time) may be determined according to the time criteria in [Table 1] described above.
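The S60 condition and the three arithmetic corrections (① , ③ and ④) can be sketched as follows. The function names are hypothetical, `readings` is assumed to be one day's cumulative values in time order starting at sunrise, and the reset case assumes a single reset per day; none of these details are fixed by the description. Cases ② and ⑤ are simply treated as missing and need no arithmetic.

```python
def needs_correction(daily_total):
    """S60: flag daily totals in [0, 200] J/cm^2 or at/above 3000 J/cm^2."""
    return 0.0 <= daily_total <= 200.0 or daily_total >= 3000.0


def correct_nonzero_start(readings):
    """Case 1: last cumulative value minus the cumulative value at sunrise."""
    return readings[-1] - readings[0]


def correct_midday_reset(readings):
    """Case 3: value just before the reset plus the value accumulated after it."""
    for i in range(1, len(readings)):
        if readings[i] < readings[i - 1]:  # a drop marks the reset
            return readings[i - 1] + readings[-1]
    return readings[-1]  # no reset found: the last value already is the total


def correct_no_reset(readings):
    """Case 4: daily maximum minus daily minimum."""
    return max(readings) - min(readings)
```

For instance, `correct_midday_reset([0, 400, 900, 0, 300, 700])` adds the 900 accumulated before the reset to the 700 accumulated afterwards.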

Once the first and second key variables have been extracted by the key variable extraction unit 20 according to an embodiment of the present invention, the first data analysis unit 30 may perform, on that basis, a step (S300) of analyzing the correlations within the integrated data. The correlation analysis step (S300) may include matching the production data with the first key variables cumulatively averaged over the period required from flowering to harvest, and analyzing which first key variables affect each of the second key variables.

For example, to reflect the growth characteristics of tomatoes, the first data analysis unit 30 may match the cumulative average of the first key variables collected over a required period of 7 to 12 weeks (i.e., the period from flowering to harvest) with the production data organized by week. This matching makes it possible to analyze the correlation between production and environmental conditions over the period from flowering to harvest.
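The matching of cumulatively averaged environmental values to weekly production can be sketched as follows. It is assumed here, for illustration, that the average is taken over the `period` weeks ending at the harvest week, and that both series are lists indexed by week; the function names are hypothetical.

```python
def trailing_mean(weekly_values, harvest_week, period):
    """Mean of the `period` weekly values ending at `harvest_week` (inclusive).

    Returns None when fewer than `period` weeks of history are available.
    """
    start = harvest_week - period + 1
    if start < 0:
        return None
    window = weekly_values[start:harvest_week + 1]
    return sum(window) / len(window)


def match_env_to_yield(weekly_env, weekly_yield, period=7):
    """Pair each week's production with the trailing mean of one variable."""
    pairs = []
    for week, produced in enumerate(weekly_yield):
        avg = trailing_mean(weekly_env, week, period)
        if avg is not None:
            pairs.append((avg, produced))
    return pairs


# Hypothetical weekly temperature and production series (8 weeks).
env = [20.0] * 7 + [27.0]
produced = [0.0] * 6 + [5.0, 6.0]
pairs = match_env_to_yield(env, produced, period=7)
```

The `period` argument can be varied between 7 and 12 to cover the range of flowering-to-harvest durations mentioned above.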

In addition, the first data analysis unit 30 may analyze which environmental conditions affect each of growth length, stem thickness, and flower cluster height. For example, the first data analysis unit 30 may derive results such as those in [Table 4] through correlation analysis between the first and second key variables. Referring to [Table 4], the first key variables affecting growth length may include temperature, supplied salt concentration, and number of water supplies (i.e., number of irrigations). Those affecting stem thickness may include cumulative insolation, residual carbon dioxide concentration, supplied salt concentration, supplied pH, water supply per irrigation, and number of water supplies. Those affecting flower cluster height may include cumulative insolation, humidity, residual carbon dioxide concentration, supplied salt concentration, supplied pH, and number of water supplies.

FIG. 4 is a flowchart showing the process (S400) of estimating the interval values of the first and second key variables according to an embodiment of the present invention.

FIG. 5 is a table showing the estimated interval values of the first key variables for each growth stage in consideration of the cultivation period according to an embodiment of the present invention; FIG. 6A is a table showing the result of matching the interval values of the first key variables with those of the second key variables for each growth stage in consideration of the cultivation period according to an embodiment of the present invention; and FIG. 6B is a table showing an example of FIG. 6A.

Once the correlation analysis by the first data analysis unit 30 according to an embodiment of the present invention is complete, the second data analysis unit 40 may perform, on that basis, a step (S400) of estimating the interval values of the key variables in order to provide optimal environment settings.

Referring to FIG. 4, the step of estimating the interval values of the first and second key variables (S400) according to an embodiment of the present invention may include: extracting the top N (N being a natural number) integrated data items judged to have high crop production according to the result of the correlation analysis (S410); analyzing, based on the extracted integrated data, the interval values of the first and second key variables corresponding to the maximum production for each growth stage in consideration of the cultivation period (S420); and matching the interval values of the first key variables with those of the second key variables according to the result of the correlation analysis (S430).

For example, the second data analysis unit 40 may extract the data in the top 20% of the integrated data managed by farm, based on the correlation analysis result from the first data analysis unit 30. That is, the second data analysis unit 40 may analyze the top 20% of the data to derive the value corresponding to the maximum production, and estimate the interval values of the first and second key variables that match that value.
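Selecting the top 20% and reading off an interval for a key variable can be sketched as below. The row layout, the field names `yield_kg` and `temperature`, and the use of the minimum and maximum as the interval are assumptions; the description does not say how the interval bounds are computed.

```python
import math


def top_fraction(rows, fraction=0.2, key="yield_kg"):
    """Return the highest-producing `fraction` of rows (at least one row)."""
    ranked = sorted(rows, key=lambda r: r[key], reverse=True)
    count = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:count]


def interval(rows, variable):
    """Min-max range of one key variable over the given rows (assumed rule)."""
    values = [r[variable] for r in rows]
    return min(values), max(values)


# Ten hypothetical weekly records with made-up values.
rows = [{"yield_kg": float(y), "temperature": 20.0 + 0.5 * y}
        for y in range(1, 11)]
top = top_fraction(rows)
temp_range = interval(top, "temperature")
```

In practice the rows would first be filtered to a single growth stage so that each stage gets its own interval.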

Here, the interval values of the first and second key variables may be estimated with respect to growth stages that take the cultivation period into account. Tomato growth is basically divided into three broad stages: early, middle, and late growth. When the cultivation period is taken into account by analyzing, season by season, the period from flowering to harvest, the middle growth stage can be divided into four sub-stages: September to October, November to December, January to February, and March to June. Thus, the growth stages of tomatoes in consideration of the cultivation period can be divided into six stages in total.

In the step (S420) of analyzing, based on the extracted integrated data, the interval values of the first and second key variables corresponding to the maximum production for each growth stage in consideration of the cultivation period, the second data analysis unit 40 may calculate, with the interval value of cumulative insolation as the reference, the combination of interval values of the remaining first key variables that can yield the maximum production. Because cumulative insolation is an exogenous variable that the user cannot set, the optimal combination of environmental conditions may be determined with reference to the interval value of cumulative insolation for each growth stage in consideration of the cultivation period.

For example, according to the result of analyzing the top 20% of the integrated data, the maximum production, average production, and average number of harvest weeks may be determined for each growth stage in consideration of the cultivation period, as in FIG. 5. The interval values of the first and second key variables may likewise be analyzed and determined for each growth stage, as in FIG. 5. Because the key variables are based on the analysis of the entire top 20% of the integrated data, each may be defined as a single interval value, as in FIG. 5.

The combinations of environmental conditions organized in this way may be provided to the user. That is, once the optimal combination of environmental conditions has been determined with reference to the interval value of cumulative insolation for each growth stage in consideration of the cultivation period, as in FIG. 5, the second data analysis unit 40 may provide environment setting values with which users can themselves configure an environment that increases each farm's overall tomato production.

In the step (S430) of matching the interval values of the first key variables with those of the second key variables according to the result of the correlation analysis, the second data analysis unit 40 may determine combinations of short-term environmental conditions with the second key variables as the reference. In other words, the second data analysis unit 40 may analyze not only how to set the combination of environmental conditions for achieving maximum production across all growth stages, but also how to set the combination of short-term environmental conditions for properly maintaining growth conditions such as growth length, stem thickness, and flower cluster height.

For example, as in FIG. 6B, the second data analysis unit 40 may estimate, based on the result of analyzing the top 20% of the integrated data, the values of the second key variables at which maximum production is obtained, and match the corresponding values of the first key variables to them. The second data analysis unit 40 may also organize these matching results by growth stage in consideration of the cultivation period, with reference to the interval value of cumulative insolation.

Based on the combinations of short-term environment settings derived in this way, the user can check the growth length, stem thickness, and flower cluster height of the tomatoes currently being grown to determine which environmental conditions to adjust and by how much. That is, through the data analysis method described above, the user can, in order to maximize tomato production, appropriately carry out not only short-term environment settings that consider the current growth state of the tomatoes but also overall environment settings that consider all growth stages.

The results of comparing the top 20% and the bottom 80% of the integrated data managed by farm, for each key variable, are examined in detail below.

FIG. 7 shows graphs comparing the top 20% of the integrated data with the remaining data according to an embodiment of the present invention, in terms of (a) temperature and humidity and (b) cumulative insolation and carbon dioxide concentration.

Referring to FIG. 7(a), the top 20% and the bottom 80% show a difference in temperature and humidity that is small but meaningful overall. That is, the temperature and humidity values of the top 20% and the bottom 80% do not coincide in any month, which indicates that temperature and humidity affect maximum production through small differences.

Referring to FIG. 7(b), for cumulative insolation, a considerable difference arises between the top 20% and the bottom 80% during July to September, when sunlight is strong. That is, cumulative insolation, as an exogenous factor, is considerably affected by season, region, and time of year. For carbon dioxide concentration, a considerable difference arises between the top 20% and the bottom 80% during January to March.

FIG. 8 shows graphs comparing the top 20% of the integrated data with the remaining data according to an embodiment of the present invention, in terms of (a) water supply per irrigation and number of water supplies and (b) production per pyeong.

Referring to FIG. 8(a), the top 20% and the bottom 80% show a large overall difference in water supply per irrigation and number of water supplies. That is, the amount and frequency of water supply, which are key variables determined entirely by the grower, have a considerable effect on maximum production.

Referring to FIG. 8(b), when the cultivation period is considered, the top 20% shows a difference in production of up to twofold or more compared with the bottom 80% (i.e., the difference in production in May). That is, FIG. 8(b) confirms that production can be improved overall if the overall environmental conditions are set for each growth stage in consideration of the cultivation period, based on the analysis of the top 20%.

FIG. 9 shows graphs comparing the top 20% of the integrated data with the remaining data according to an embodiment of the present invention, in terms of (a) growth length, (b) stem thickness, and (c) flower cluster height.

Referring to FIG. 9, as with the first key variables described above, the second key variables (growth length, stem thickness, and flower cluster height) show a meaningful difference between the top 20% and the bottom 80%. That is, FIGS. 8 and 9 confirm that production can be improved overall if short-term environmental conditions are set according to growth conditions, based on the analysis of the top 20%.

FIG. 10 is a block diagram of a data analysis apparatus 100 for improving crop productivity according to an embodiment of the present invention.

Referring to FIG. 10, the data analysis device 100 for improving the productivity of crops according to an embodiment of the present invention may include: a database management unit 10 that builds an integrated database of environmental data, growth data, and production data collected from farms; a key variable extraction unit 20 that extracts, from the environmental data, a first key variable affecting the productivity of the crop and extracts, from the growth data, a second key variable affecting the productivity of the crop; a first data analysis unit 30 that analyzes the correlation among the environmental data, growth data, and production data based on the first key variable and the second key variable; and a second data analysis unit 40 that estimates, based on the result of the analysis, interval values of the first key variable and the second key variable for yielding maximum production at each growth stage in consideration of the cultivation period of the crop.

The database management unit 10 according to an embodiment of the present invention may convert the unit of time of the environmental data, growth data, and production data into weeks, and may classify and store the week-converted data by farm.
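The weekly conversion and per-farm classification can be sketched as below. This is only an illustration under stated assumptions: the patent does not say how values within a week are combined, so averaging is assumed here, and the ISO week calendar, the function name `to_weekly`, and the sample records are all hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def to_weekly(records):
    """Group (farm_id, day, value) records by farm and ISO (year, week).

    Assumption: values that fall in the same week are averaged.
    """
    buckets = defaultdict(list)
    for farm_id, day, value in records:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        buckets[(farm_id, iso[0], iso[1])].append(value)

    weekly = defaultdict(dict)
    for (farm_id, year, week), values in buckets.items():
        weekly[farm_id][(year, week)] = mean(values)
    return dict(weekly)


# Hypothetical daily temperature readings from two farms.
records = [
    ("farm_a", date(2019, 5, 6), 20.0),   # ISO week 19
    ("farm_a", date(2019, 5, 8), 22.0),   # same week
    ("farm_a", date(2019, 5, 13), 25.0),  # ISO week 20
    ("farm_b", date(2019, 5, 7), 18.0),
]
weekly = to_weekly(records)
```

Keying the result by farm first keeps each farm's weekly series separate, which matches the per-farm storage described above.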

The key variable extraction unit 20 according to an embodiment of the present invention may identify the unit of the cumulative insolation extracted from the environmental data, compare the cumulative insolation in the night-time period for each day of the farm with the cumulative insolation according to a lag variable of the unit of time, and determine whether to correct the cumulative insolation according to the result of the comparison.
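One possible reading of this night-time check is sketched below. It rests on an assumption the patent does not spell out: that a correctly recorded cumulative insolation should stay flat while it is dark, so any change between a night-time reading and its one-hour lagged reading suggests the series needs correction. The function name `needs_correction`, the hourly resolution, the choice of night hours, and the `tolerance` parameter are all hypothetical.

```python
def needs_correction(hourly_cumulative, night_hours, tolerance=0.0):
    """Return True if cumulative insolation changes during the night.

    hourly_cumulative: dict mapping hour (0-23) to the cumulative
        insolation reading for one farm on one day.
    night_hours: hours assumed to be dark (an assumption of this sketch).
    Each night reading is compared with its one-hour lagged reading.
    """
    for hour in night_hours:
        lagged = hour - 1
        if hour in hourly_cumulative and lagged in hourly_cumulative:
            change = abs(hourly_cumulative[hour] - hourly_cumulative[lagged])
            if change > tolerance:
                return True
    return False


# Hypothetical readings for the first hours after midnight.
clean_day = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0}
drifting_day = {0: 0.0, 1: 0.0, 2: 5.0, 3: 5.0}

clean_flag = needs_correction(clean_day, night_hours=[1, 2, 3])
drift_flag = needs_correction(drifting_day, night_hours=[1, 2, 3])
```

The sketch only decides whether a correction is needed. How the correction itself is carried out is not described in this passage and is left out.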

The first data analysis unit 30 according to an embodiment of the present invention may match the production data with the first key variable cumulatively averaged over the period required from flowering to harvest of the crop, and may analyze which first key variables affect each of the second key variables.
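The matching step can be illustrated with a trailing average. The sketch assumes that "cumulatively averaged over the period from flowering to harvest" means averaging the environmental variable over the weeks leading up to each harvest week; that interpretation, the fixed `lead_weeks` window, the function name `match_trailing_average`, and the sample values are assumptions for illustration only.

```python
from statistics import mean


def match_trailing_average(weekly_env, weekly_yield, lead_weeks):
    """Pair each harvest week's yield with the mean of an environmental
    variable over the `lead_weeks` weeks ending in that week.

    Weeks without a complete window are skipped.
    """
    pairs = []
    for week in sorted(weekly_yield):
        window = [
            weekly_env[w]
            for w in range(week - lead_weeks + 1, week + 1)
            if w in weekly_env
        ]
        if len(window) == lead_weeks:
            pairs.append((week, mean(window), weekly_yield[week]))
    return pairs


# Hypothetical weekly mean temperature and weekly yield, keyed by week index.
weekly_env = {1: 10.0, 2: 12.0, 3: 14.0, 4: 16.0}
weekly_yield = {3: 100.0, 4: 120.0}
pairs = match_trailing_average(weekly_env, weekly_yield, lead_weeks=3)
```

Each resulting tuple holds the harvest week, the averaged environmental value, and the yield, which is the form a later correlation analysis would consume.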

The second data analysis unit 40 according to an embodiment of the present invention may extract, according to the result of the correlation analysis, the top N (N being a natural number) integrated data records determined to have high crop production; analyze, based on the extracted integrated data, the interval values of the first key variable and the second key variable that correspond to maximum production at each growth stage in consideration of the cultivation period; and match the interval value of the first key variable with the interval value of the second key variable according to the result of the correlation analysis.
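A minimal sketch of the top-N extraction and per-stage interval estimation follows. The patent does not define how an interval value is computed, so the sketch assumes it is the minimum-to-maximum range observed among the top-N records in each growth stage. The function name `top_n_intervals`, the record layout, and the sample rows are hypothetical.

```python
from collections import defaultdict


def top_n_intervals(rows, n, variable):
    """Select the n highest-yield rows, then return the (min, max) range
    of `variable` for each growth stage among those rows.

    rows: list of dicts with keys "stage", "yield", and `variable`.
    Assumption: an interval value is the observed min-max range.
    """
    top = sorted(rows, key=lambda r: r["yield"], reverse=True)[:n]
    by_stage = defaultdict(list)
    for row in top:
        by_stage[row["stage"]].append(row[variable])
    return {stage: (min(v), max(v)) for stage, v in by_stage.items()}


# Hypothetical integrated records (one per farm-week).
rows = [
    {"stage": "early", "yield": 9.0, "temperature": 22.0},
    {"stage": "early", "yield": 8.0, "temperature": 24.0},
    {"stage": "early", "yield": 3.0, "temperature": 30.0},
    {"stage": "late", "yield": 7.0, "temperature": 20.0},
    {"stage": "late", "yield": 2.0, "temperature": 15.0},
]
intervals = top_n_intervals(rows, n=3, variable="temperature")
```

Running the same function once for a first key variable (such as temperature) and once for a second key variable (such as stem thickness) gives two per-stage interval tables that can then be lined up stage by stage.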

The description of the method given above applies to the device 100 according to an embodiment of the present invention. Accordingly, for the device 100, descriptions that would repeat the description of the method are omitted.

Meanwhile, according to an embodiment of the present invention, a computer-readable recording medium may be provided on which a program for executing the above-described method on a computer is recorded. In other words, the above-described method can be written as a program executable on a computer and can be implemented on a general-purpose digital computer that runs the program using a computer-readable medium. In addition, the structure of the data used in the above-described method can be recorded on a computer-readable medium by various means. A recording medium that records an executable computer program or code for performing the various methods of the present invention should not be understood to include transitory objects such as carrier waves or signals. The computer-readable medium may include storage media such as magnetic storage media (e.g., ROM, floppy disk, hard disk) and optically readable media (e.g., CD-ROM, DVD).

The foregoing description of the present invention is illustrative, and a person of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing the technical idea or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.

The scope of the present invention is defined by the claims set out below rather than by the detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

10: database management unit
20: key variable extraction unit
30: first data analysis unit
40: second data analysis unit
100: data analysis device

Claims

A data analysis method for improving the productivity of crops, the method comprising:
establishing, by a database management unit, an integrated database of environmental data, growth data, and production data collected from farms;
extracting, by a key variable extraction unit, a first key variable that affects the productivity of the crop from the environmental data, and extracting a second key variable that affects the productivity of the crop from the growth data;
analyzing, by a first data analysis unit, a correlation among the environmental data, growth data, and production data based on the first key variable and the second key variable; and
estimating, by a second data analysis unit and based on the result of the analysis, interval values of the first key variable and the second key variable for yielding maximum production at each growth stage in consideration of the cultivation period of the crop,
wherein analyzing the correlation further comprises:
matching the production data with the first key variable cumulatively averaged over the period required from flowering to harvest of the crop; and
analyzing the first key variable that affects each of the second key variables, and
wherein estimating the interval values of the first key variable and the second key variable comprises:
extracting, according to the result of the correlation analysis, the integrated data corresponding to the top N (N being a natural number) determined to have high crop production, or data restricted to the top 20% of crop production;
analyzing, based on the extracted integrated data, the interval values of the first key variable and the second key variable corresponding to maximum production at each growth stage in consideration of the cultivation period; and
matching the interval value of the first key variable with the interval value of the second key variable according to the result of the correlation analysis.

The method of claim 1,
wherein establishing the integrated database comprises
converting the unit of time of the environmental data, growth data, and production data into weeks, and classifying and storing the week-converted data by farm.

The method of claim 1,
wherein the first key variable includes the farm's cumulative insolation, temperature, humidity, carbon dioxide concentration, number of irrigations per day, water supply amount per irrigation, salt concentration, and pH, and
wherein the second key variable includes the growth length, stem thickness, and flower cluster height of the crop.

The method of claim 3,
wherein extracting the first key variable and the second key variable comprises:
determining the unit of the cumulative insolation extracted from the environmental data;
comparing the cumulative insolation in the night-time period for each day of the farm with the cumulative insolation according to a lag variable of the unit of time in the night-time period, using previously extracted basic insolation data as a reference, in order to check the true value of the cumulative insolation; and
determining whether to correct the cumulative insolation according to the result of the comparison.

delete

A data analysis device for improving the productivity of crops, the device comprising:
a database management unit that builds an integrated database of environmental data, growth data, and production data collected from farms;
a key variable extraction unit that extracts a first key variable affecting the productivity of the crop from the environmental data, and extracts a second key variable affecting the productivity of the crop from the growth data;
a first data analysis unit that analyzes a correlation among the environmental data, growth data, and production data based on the first key variable and the second key variable; and
a second data analysis unit that estimates, based on the result of the analysis, interval values of the first key variable and the second key variable for yielding maximum production at each growth stage in consideration of the cultivation period of the crop,
wherein the first data analysis unit
matches the production data with the first key variable cumulatively averaged over the period required from flowering to harvest of the crop, and analyzes the first key variable that affects each of the second key variables, and
wherein the second data analysis unit
extracts, according to the result of the correlation analysis, the integrated data corresponding to the top N (N being a natural number) determined to have high crop production, or data restricted to the top 20% of crop production; analyzes, based on the extracted integrated data, the interval values of the first key variable and the second key variable corresponding to maximum production at each growth stage in consideration of the cultivation period; and matches the interval value of the first key variable with the interval value of the second key variable according to the result of the correlation analysis.

The device of claim 7,
wherein the database management unit
converts the unit of time of the environmental data, growth data, and production data into weeks, and classifies and stores the week-converted data by farm.

The device of claim 7,
wherein the first key variable includes the farm's cumulative insolation, temperature, humidity, carbon dioxide concentration, number of irrigations per day, water supply amount per irrigation, salt concentration, and pH, and
wherein the second key variable includes the growth length, stem thickness, and flower cluster height of the crop.

The device of claim 9,
wherein the key variable extraction unit
determines the unit of the cumulative insolation extracted from the environmental data; compares the cumulative insolation in the night-time period for each day of the farm with the cumulative insolation according to a lag variable of the unit of time in the night-time period, using previously extracted basic insolation data as a reference, in order to check the true value of the cumulative insolation; and determines whether to correct the cumulative insolation according to the result of the comparison.

delete

A computer-readable recording medium on which a program for implementing the method of any one of claims 1 to 4 is recorded.