KR20240030729A

KR20240030729A - Crop detecting system based on deep learning

Info

Publication number: KR20240030729A
Application number: KR1020220110131A
Authority: KR
Inventors: 이상희; 김영근; 최용; 장성혁
Original assignee: 대한민국(농촌진흥청장)
Priority date: 2022-08-31
Filing date: 2022-08-31
Publication date: 2024-03-07

Abstract

The present invention discloses a monitoring system. More specifically, the present invention relates to a deep learning-based crop object detection system that can automatically measure crop yield with high accuracy using a deep learning-based algorithm for detecting objects in images.
According to an embodiment of the present invention, real-time images are analyzed using a first algorithm that distinguishes potatoes from foreign matter and a second algorithm that tracks objects and counts them. This provides a monitoring system that is not constrained by installation space, has a simple structure and low construction cost, and can automatically measure potato yield with high accuracy.

Description

Deep learning-based crop object detection system {CROP DETECTING SYSTEM BASED ON DEEP LEARNING}

The present invention relates to a monitoring system, and in particular to a deep learning-based crop object detection system that can automatically measure crop yield with high accuracy using a deep learning-based algorithm for detecting objects in images.

Potatoes are one of the world's four major food resources, along with rice, wheat, and corn. In Korea they are a major field crop, with a cultivation area of 23,599 ha and production of 553,194 tons in 2020, and they contribute greatly to farm income.

However, labor costs continue to rise as the rural population declines and ages, driving up production costs, while cheap agricultural products are flowing in as market opening accelerates through FTAs and similar agreements, so domestic field crops need to become more competitive. As an alternative, precision agriculture, which can raise productivity while lowering production costs by reducing inputs, is gaining importance.

Precision agriculture measures soil characteristics, crop growth, and yield within a field and combines these measurements with position information from an on-board GPS (Global Positioning System) to identify spatial variation within the field. Inputs can then be reduced to match the characteristics of each location while productivity is increased.

Information on yield variation within a field is basic, essential information for assessing the current year's cultivation results and planning the next year. Methods using load cells, machine vision, and photosensors have been studied for yield measurement, and in the United States, Europe, and Japan, systems that measure yield with a load cell attached to a combine are in practical use.

Compared with the load cell method, which has been the most widely used, yield measurement by machine vision is not constrained by space and the system can be configured relatively simply, so it is known as a method with a wide range of application.

Larsson (1994) installed a CCD (charge-coupled device) camera at the end of a conveyor belt and measured the size of falling potatoes by analyzing the number of pixels in the projected area, and Hofstee and Molema (2002) installed a line scan camera and developed a mass estimation method based on 2D information. Yaowei Long (2018) measured volume by adding depth information to 2D potato images using stereo vision. Lee (2018) acquired images of potatoes dug up in the field, detected the potatoes in the images, and estimated their weight with a regression equation.

However, these machine vision methods must remove duplicates of the same potato that appear in overlapping consecutive images, and must exclude potatoes that are cut off at the image boundary.

The use of artificial intelligence, which has recently been widely adopted in industry, can be an alternative for solving these problems. Artificial neural networks used for object detection include R-CNN (Region with Convolutional Network), Fast R-CNN, and Faster R-CNN. These R-CNN family methods are two-stage detectors composed of two networks: one proposes candidate regions of the image that are likely to contain an object (region proposal), and the other then classifies the object (classification).

This approach has the advantage of high accuracy, but its processing speed is limited, which limits its ability to detect potatoes on a conveyor in real time during harvesting.

One-stage detectors, which remove the region proposal step of the two-stage detector and detect objects in a single pass, have also been proposed. Because localization and classification are performed simultaneously they are very fast, but their accuracy is somewhat lower than that of two-stage detectors.

Korean Laid-open Patent Publication No. 10-2021-0140895 (publication date: 2021.11.23.)

The present invention was devised to solve the above problems. Its object is to monitor crop yield from video of potatoes being conveyed on an actual potato harvester, by designing a system that, taking the conveyor speed into account, identifies and tracks objects using YOLOv5, one of the one-stage detectors, and finally counts the potatoes.

To solve the above problem, a deep learning-based crop object detection system according to an embodiment of the present invention may include: a classification unit that classifies a plurality of objects appearing in a real-time image captured by a camera device into tracking target objects or one or more kinds of non-tracking target objects; a tracking unit that calculates two distance values for the plurality of objects appearing in the preceding and following frames of the real-time image, identifies the tracking target objects based on the two distance values, and assigns a unique ID to each tracking target object; and a counting unit that calculates the number of tracking target objects to which unique IDs have been assigned.

The classification unit may include a segmentation module that divides the real-time image into an S × S grid (S being a natural number of 1 or more), a prediction module that predicts a bounding box and a confidence score for each grid cell, and a discrimination module that combines the grid cells and determines the type of each object while adjusting the positions of the bounding boxes through an NMS (non-maximum suppression) process.

The tracking unit may include: an extraction module that extracts first and second object information for the plurality of objects appearing in the preceding and following frames, respectively; a first distance calculation module that uses a filter to calculate estimated object information for the following frame as predicted from the first object information, and calculates a first distance value between the estimated object information and the second object information; a second distance calculation module that calculates a second distance value using feature values of the plurality of objects appearing in the preceding and following frames; and a tracking module that identifies and tracks, as the same object, an object for which the result of summing the first and second distance values is greater than or equal to a threshold.

The filter may be a Kalman filter.

The first and second distance values may be the Mahalanobis distance and the cosine distance, respectively.

The result value may be calculated by an equation in which c is the result value, d_m is the Mahalanobis distance, d_c is the cosine distance, and λ is a hyperparameter.
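The equation itself does not appear in this text. The published DeepSORT method, which the description applies in the tracking unit, combines the same two distances as a weighted sum. Assuming the patent follows that form (an assumption, not something this text confirms), the result value would read:

```latex
c = \lambda \, d_m + (1 - \lambda) \, d_c
```

Under this form, λ = 1 uses only the motion-based Mahalanobis distance and λ = 0 uses only the appearance-based cosine distance.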

A counting line is set in the real-time image according to the direction in which the objects in the image are conveyed out, and the counting unit may increment the object count when the center point of the bounding box corresponding to a tracking target object passes the counting line in the real-time image.

According to an embodiment of the present invention, real-time images are analyzed using a first algorithm that distinguishes potatoes from foreign matter and a second algorithm that tracks objects and counts them. This provides a monitoring system that is not constrained by installation space, has a simple structure and low construction cost, and can automatically measure potato yield with high accuracy.

Figure 1 is a diagram illustrating an image of the labeling area of the training data used in a deep learning-based crop object detection system according to an embodiment of the present invention.
Figure 2 is a diagram illustrating object classification images in a deep learning-based crop object detection system according to an embodiment of the present invention.
Figure 3 is a diagram showing the overall structure of a deep learning-based crop object detection system according to an embodiment of the present invention.
Figure 4 is a diagram showing the classification unit of a deep learning-based crop object detection system according to an embodiment of the present invention.
Figure 5 is a diagram showing the tracking unit of a deep learning-based crop object detection system according to an embodiment of the present invention.
Figure 6 is a diagram illustrating the counting line and end line set in a deep learning-based crop object detection system according to an embodiment of the present invention.
Figure 7 is a diagram showing images and data for the evaluation results of a deep learning-based crop object detection system according to an embodiment of the present invention.
Figure 8 is a diagram illustrating images of the classification results for each object in the region of interest during evaluation of the learning model of a deep learning-based crop object detection system according to an embodiment of the present invention.

The present invention will now be described in detail with reference to the accompanying drawings and embodiments.

It should be noted that the technical terms used in the present invention are used only to describe specific embodiments and are not intended to limit the present invention. Unless specifically defined otherwise in the present invention, the technical terms used herein should be interpreted as having the meanings generally understood by those of ordinary skill in the art to which the present invention pertains, and should not be interpreted in an overly broad or overly narrow sense. If a technical term used in the present invention is an incorrect term that does not accurately express the idea of the present invention, it should be replaced with, and understood as, a technical term that a person skilled in the art can correctly understand. General terms used in the present invention should be interpreted as defined in dictionaries or according to the surrounding context, and should not be interpreted in an overly narrow sense.

As used in the present invention, singular expressions include plural expressions unless the context clearly indicates otherwise. Terms such as "consists of" or "comprises" should not be construed as necessarily including all of the components or steps described in the invention; they should be construed to mean that some of those components or steps may not be included, or that additional components or steps may be further included.

Terms including ordinal numbers, such as first and second, may be used to describe components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure number, and redundant descriptions are omitted.

In describing the present invention, detailed descriptions of related known technologies are omitted where they could obscure the gist of the present invention. It should also be noted that the accompanying drawings are intended only to make the idea of the present invention easy to understand, and the idea of the present invention should not be construed as limited by them.

In an embodiment of the present invention, YOLOv5 (You Only Look Once) is used as the deep learning model for classification, and training data for the learning model were supplied to it. The training data (ldata) consist of video acquired on November 18, 2021 from a potato field located in Goryeong, Gyeongsangnam-do, during the fall potato harvest. The potato variety is Sumi, the images were acquired in a field about 100 days after planting, and the video was captured at 1090 × 1080 pixels and 24 FPS.

An image was extracted from each frame of the captured video, and 9,906 training images were assembled by selecting only those in which the potatoes were located entirely within the conveyor.

Figure 1 is a diagram illustrating an image of the labeling area of the training data used in a deep learning-based crop object detection system according to an embodiment of the present invention, and Figure 2 is a diagram illustrating object classification images in the same system.

Referring to Figures 1 and 2, the ton bag portion at the bottom of the image is excluded from detection, so in this embodiment labeling was performed within a labeling area cropped to the conveyor portion only.

In addition, in this embodiment, labeling was performed with the known tool Ybat (YOLO BBox Annotation Tool), and objects were classified into four classes: potato (a), soil debris (b), stem (c), and worker's hand (d).

To secure additional training data, this embodiment used the Python Albumentations library to apply the Equalization, Flip (Horizontal, Vertical), Rotate, Brightness Contrast, Gaussian Blur, CLAHE, Gaussian Noise, HSV, Downscale, Sharpen, and Random Gamma techniques with randomly chosen weights, augmenting the data by 10 sets each. Including a further 10 sets of data to which all techniques were applied at random, the full dataset was expanded to 227,838 images.
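The two dataset sizes quoted in the text can be checked against each other with simple arithmetic. The sketch below only verifies that the augmented total is a whole multiple of the 9,906 selected frames; how those copies divide between the original set, the per-technique sets, and the 10 all-techniques sets is not spelled out unambiguously in the text, so no breakdown is assumed.

```python
# Sanity check on the dataset sizes quoted in the text.
ORIGINAL_FRAMES = 9_906      # selected frames (from the text)
AUGMENTED_TOTAL = 227_838    # total after augmentation (from the text)

# Number of whole copies of the original set, and any leftover images.
copies, remainder = divmod(AUGMENTED_TOTAL, ORIGINAL_FRAMES)
print(copies, remainder)
```

The total divides evenly, which is consistent with every augmentation set containing one transformed copy of each selected frame.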

The PC used for training had an AMD Ryzen 9 5950X 16-core CPU, two NVIDIA GeForce RTX 3090 GPUs, 128 GB of RAM, and Windows 10 Pro 64-bit. Using Python and PyTorch, the full dataset was randomly split into training and validation sets at an 8:2 ratio, and training was run for 300 epochs with a batch size of 64.
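A random 8:2 split of this kind can be sketched in a few lines. This is a minimal illustration using only the standard library, not the authors' code; the function name `split_train_valid` and the fixed seed are hypothetical, and integer indices stand in for image files.

```python
import random

def split_train_valid(items, train_ratio=0.8, seed=0):
    """Shuffle a copy of items and split it into (train, valid) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    cut = int(len(shuffled) * train_ratio)  # index where validation starts
    return shuffled[:cut], shuffled[cut:]

# Indices stand in for the 227,838 augmented images mentioned in the text.
train, valid = split_train_valid(range(227_838))
print(len(train), len(valid))
```

Fixing the seed keeps the same images in the validation set across runs, which makes results from different training runs comparable.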

Hereinafter, a deep learning-based crop object detection system according to an embodiment of the present invention is described in detail with reference to the drawings.

Figure 3 is a diagram showing the overall structure of a deep learning-based crop object detection system according to an embodiment of the present invention. In the following description, the deep learning-based crop object detection system of the present invention and each of its components may be implemented as a computer program executable by a known microprocessor and may be recorded on a readable and writable recording medium.

In the following description, the crop object detection system according to an embodiment of the present invention is a system that uses AI technology to automatically calculate yield from real-time video of harvested crops being conveyed at the work site. The technical idea is explained using potatoes as the monitored crop, with an example in which a camera device is installed at the end of the secondary transfer conveyor above the collection section of a potato harvester and captures, in real time, images of the potatoes after the workers' primary sorting. The monitoring target is not, however, limited to a specific crop.

The potato harvester may be a machine that digs up potatoes, separates foreign matter, and collects the potatoes in 500 kg ton bags.

The crop object detection system of the present invention receives video from the camera device in real time, uses deep learning to classify potato objects and foreign matter, and assigns a unique ID to each classified potato object to track it. It then counts each object when the object reaches a specific position in the image.

To this end, referring to Figure 3, a deep learning-based crop object detection system 100 according to an embodiment of the present invention may include: a classification unit 110 that classifies a plurality of objects appearing in a real-time image captured by a camera device 10 into tracking target objects or one or more kinds of non-tracking target objects; a tracking unit 120 that calculates two distance values for the plurality of objects appearing in the preceding and following frames of the real-time image, identifies the tracking target objects based on the two distance values, and assigns a unique ID to each tracking target object; and a counting unit 130 that calculates the number of tracking target objects to which unique IDs have been assigned.

The classification unit 110 may receive potato images (img) captured in real time from the camera device 10 and classify the objects that appear by type. In particular, according to an embodiment of the present invention, the known YOLOv5 algorithm may be used to identify potatoes among the crop objects. Objects may be classified into four types as shown in Figure 2, and a bounding box is set for each object so that it can be tracked from subsequent frames onward. To perform these functions, the classification unit 110 may be composed of a plurality of modules, each handling a subdivided function.

The tracking unit 120 may identify the potato objects among the four types of objects classified by the classification unit 110 and track the movement of the potato objects within the image. In particular, according to an embodiment of the present invention, a unique, non-overlapping ID may be assigned to each potato object to minimize double counting of the same potato, and the known SORT algorithm may be used to track the potato objects accurately. Preferably, DeepSORT, which combines the SORT algorithm with a CNN (a neural network), may be used. To perform these functions, the tracking unit 120 may be composed of a plurality of modules, each handling a subdivided function.

The counting unit 130 may count potatoes at the moment a tracked potato object is conveyed to the ton bag portion. For this purpose, a counting line is defined in the image area, and the counting unit 130 may count, among the objects that cross the counting line, the potato objects to which a unique ID has been assigned.
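The counting rule can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the class name `LineCounter` is hypothetical, the counting line is taken to be horizontal, and objects are assumed to move toward larger y values (the ton bag is described as lying at the bottom of the image).

```python
class LineCounter:
    """Counts each track ID once when its box center crosses a horizontal line."""

    def __init__(self, line_y):
        self.line_y = line_y   # y coordinate of the counting line (pixels)
        self.last_y = {}       # track_id -> center y seen in the previous frame
        self.counted = set()   # track IDs that have already been counted

    def update(self, track_id, box):
        """box is (x1, y1, x2, y2). Returns True if this update counted the track."""
        _, y1, _, y2 = box
        cy = (y1 + y2) / 2
        prev = self.last_y.get(track_id)
        self.last_y[track_id] = cy
        if prev is None or track_id in self.counted:
            return False
        # Count only on the frame where the center moves from above to on/below the line.
        if prev < self.line_y <= cy:
            self.counted.add(track_id)
            return True
        return False

    @property
    def count(self):
        return len(self.counted)


counter = LineCounter(line_y=100)
for box in [(0, 80, 10, 90), (0, 96, 10, 106), (0, 110, 10, 120)]:
    counter.update(1, box)     # track 1 crosses the line on its second frame
for box in [(0, 10, 10, 20), (0, 20, 10, 30)]:
    counter.update(2, box)     # track 2 never reaches the line
print(counter.count)
```

Keying the count on the unique track ID is what prevents the same potato from being counted again in later frames.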

With the structure described above, the deep learning-based crop object detection system according to an embodiment of the present invention takes real-time images, captured by a camera device installed on the potato harvester, of harvested potatoes being conveyed, classifies potatoes and other objects in them with a deep learning algorithm, tracks the classified objects, and finally counts the potatoes. A simple structure can thus automatically calculate potato yields amounting to hundreds of tons.

Hereinafter, each component of the deep learning-based crop object detection system according to an embodiment of the present invention is described in detail with reference to the drawings.

Figure 4 is a diagram showing the classification unit of a deep learning-based crop object detection system according to an embodiment of the present invention.

Referring to Figure 4, the classification unit 110 of the deep learning-based crop object detection system according to an embodiment of the present invention may include a segmentation module 111 that divides the real-time image (img) into an S × S grid (S being a natural number of 1 or more), a prediction module 112 that predicts a bounding box and a confidence score for each grid cell, and a discrimination module 113 that combines the grid cells and determines the type of each object while adjusting the positions of the bounding boxes through an NMS (non-maximum suppression) process.

The segmentation module 111 is the module that carries out learning in the YOLOv5 learning model; it divides the real-time input image (img) into an S × S grid and tags the divided regions with multiple boxes.
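The grid division can be illustrated by mapping a point in the image, such as a bounding box center, to the cell that contains it. This is a minimal sketch of the S × S idea as the text describes it; the function name `grid_cell` and the value S = 7 are hypothetical, and the sketch does not reproduce YOLOv5's internal layout.

```python
def grid_cell(cx, cy, img_w, img_h, s):
    """Return (row, col) of the S x S grid cell containing the point (cx, cy)."""
    # Scale the coordinate to [0, s] and clamp so the far edge falls in the last cell.
    col = min(int(cx / img_w * s), s - 1)
    row = min(int(cy / img_h * s), s - 1)
    return row, col

# Center of a 1090 x 1080 frame (the size quoted in the text) on a 7 x 7 grid.
print(grid_cell(545, 540, 1090, 1080, 7))
```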

The prediction module 112 predicts a bounding box and a confidence score for each grid cell. Here, the confidence score is a value indicating how confident the algorithm is in a detection. For example, a confidence score of 0.99 can be read as the algorithm judging that the detected object is almost exactly the kind of object it is meant to detect.

The discrimination module 113 may adjust the positions of the bounding boxes, determine the type of each object as one of the four types described above, and pass the determination result (img') to the tracking unit described below. In combining the grid cells, the discrimination module 113 performs NMS (non-maximum suppression), thereby adjusting the bounding box positions and finally inferring the potato objects.
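In its common form, NMS keeps the highest-scoring box and discards lower-scoring boxes that overlap it beyond an IoU (intersection over union) threshold. The sketch below shows that greedy procedure in plain Python; the function names and the 0.5 threshold are illustrative assumptions, and the text does not say which threshold the system uses.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.5):
    """detections: list of (box, score). Returns kept detections, highest score first."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)            # highest-scoring box still in play
        kept.append(best)
        # Drop every box that overlaps the kept one too strongly.
        remaining = [d for d in remaining if iou(best[0], d[0]) < iou_threshold]
    return kept

dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((20, 20, 30, 30), 0.7)]
print(nms(dets))
```

In this example the second box overlaps the first heavily and is suppressed, while the third box, which sits elsewhere in the image, is kept.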

Hereinafter, the structure of the tracking unit of the deep learning-based crop object detection system according to an embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 5 is a diagram showing the tracking unit of the deep learning-based crop object detection system according to an embodiment of the present invention.

Referring to FIG. 5, the tracking unit 120 of the deep learning-based crop object detection system according to an embodiment of the present invention may assign a unique ID to each potato and track it, in order to count the potatoes conveyed along the conveyor.

To this end, the tracking unit 120 may assign a unique ID to each identified object. To prevent the assigned unique ID from changing while the potato is conveyed, that is, while the position of the potato object changes in the real-time image, the SORT algorithm, a tracking algorithm based on the Kalman filter, may be used. In particular, in an embodiment of the present invention, objects in the image may be tracked using the Kalman filter and the Hungarian algorithm, and the DeepSORT algorithm, which improves accuracy by adding a CNN to the existing SORT algorithm, may be applied.

As a configuration for this, the tracking unit 120 may include: an extraction module 121 that extracts first and second object information for a plurality of objects appearing in the preceding and following frames of the input image, respectively; a first distance calculation module 122 that uses a filter to calculate estimated object information in the following frame as predicted from the first object information, and calculates a first distance value between the estimated object information and the second object information; a second distance calculation module 123 that calculates a second distance value using feature values of the objects appearing in the preceding and following frames; and a tracking module 124 that determines objects for which the sum of the first and second distance values is greater than or equal to a threshold to be the same object and tracks them.

For the image (img') in which the objects have been classified, the extraction module 121 may compare the N-th frame (N being a natural number) with the (N+1)-th frame of the image (img') and extract object information for the objects appearing in each frame.

The first distance calculation module 122 may calculate the Mahalanobis distance as a criterion for judging whether objects appearing in the preceding and following frames are identical. By applying the Kalman filter to the N-th frame (N being a natural number), the current object information of one or more objects appearing in the frame can be estimated, and the Mahalanobis distance between that estimated object information and the object information of the one or more objects appearing in the N-th frame can be obtained.
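The Mahalanobis distance can be sketched for the simple case of a 2-D position with a 2 × 2 covariance matrix. In a Kalman-filter tracker the predicted position and its covariance would come from the filter's prediction step; here they are passed in as plain values, and the function name `mahalanobis_2d` is a hypothetical helper, not part of the patent.

```python
import math


def mahalanobis_2d(measured, predicted, cov):
    """Mahalanobis distance between two 2-D points.

    cov is a symmetric 2x2 covariance matrix given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    dx = measured[0] - predicted[0]
    dy = measured[1] - predicted[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    q = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(q)


# With an identity covariance the result equals the Euclidean distance.
print(mahalanobis_2d((3.0, 4.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))))
```

A larger variance along one axis shrinks the distance along that axis, so a detection that deviates in a direction where the prediction is uncertain is penalized less.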

The second distance calculation module 123 may calculate the cosine distance using the feature values of the objects appearing in the N-th and (N+1)-th frames.
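A minimal sketch of the cosine distance between two feature vectors follows. It assumes the feature values are plain numeric vectors, such as appearance embeddings produced by a CNN; the patent does not specify the feature extractor, and the function name is illustrative.

```python
import math


def cosine_distance(u, v):
    """Cosine distance 1 - cos(theta) between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Identical directions give 0, orthogonal directions give 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))
```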

When the sum of the Mahalanobis distance value and the cosine distance value is greater than or equal to the threshold, the tracking module 124 may determine that the object appearing in both frames is the same object.

The tracking module 124 may calculate the result of combining the first and second distance values using Equation 1 below.

Here, c denotes the result value, d_m the Mahalanobis distance, d_c the cosine distance, and λ a hyperparameter.
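Assuming Equation 1 takes the weighted-sum form used in the DeepSORT literature, with the hyperparameter λ balancing the two distances, it would read as follows. This form is an assumption inferred from the symbol definitions above, not a quotation of the original equation.

```latex
c = \lambda \, d_m + (1 - \lambda) \, d_c
```

Under this assumed form, λ = 1 uses only the Mahalanobis distance (motion), and λ = 0 uses only the cosine distance (appearance).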

Accordingly, the tracking module 124 may determine objects whose result value (c) is greater than or equal to the threshold to be the same object and assign a unique ID.

FIG. 6 illustrates the counting line and end line set in the deep learning-based crop object detection system. Using the bounding box of a potato object to which the tracking unit 120 has assigned a unique ID, the system may perform a count when the center coordinates of the object pass a counting line predefined in the image.
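The counting rule can be sketched as follows, assuming a horizontal counting line and objects that move in the direction of increasing y. The class `LineCounter` and its method names are hypothetical; the patent does not specify the line orientation or the data layout of the tracks.

```python
class LineCounter:
    """Counts each tracked ID once when its box center crosses a horizontal line."""

    def __init__(self, line_y: float):
        self.line_y = line_y
        self.last_center_y = {}   # track ID -> center y in the previous frame
        self.counted_ids = set()  # IDs that have already been counted
        self.count = 0

    def update(self, tracks):
        """Process one frame; tracks is an iterable of (track_id, (x1, y1, x2, y2))."""
        for track_id, (x1, y1, x2, y2) in tracks:
            cy = (y1 + y2) / 2.0  # vertical center of the bounding box
            prev = self.last_center_y.get(track_id)
            # A crossing happens when the center moves from above the line
            # to on or below it between two consecutive frames.
            crossed = prev is not None and prev < self.line_y <= cy
            if crossed and track_id not in self.counted_ids:
                self.counted_ids.add(track_id)
                self.count += 1
            self.last_center_y[track_id] = cy
        return self.count


counter = LineCounter(line_y=100)
counter.update([(1, (0, 80, 20, 100))])   # center y = 90, above the line
counter.update([(1, (0, 95, 20, 115))])   # center y = 105, crossed the line
print(counter.count)
```

Remembering the counted IDs keeps an object from being counted again if its center jitters around the line across frames.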

In addition, many potatoes have similar shapes, so a separate end line may be set to prevent the same unique ID from being assigned to different potato objects. The system may stop detecting unique IDs that have passed the end line.

Hereinafter, an actual implementation of the deep learning-based crop object detection system according to an embodiment of the present invention is described with reference to the drawings, and the technical idea of the present invention is explained in detail through its evaluation results.

FIG. 7 is a diagram showing images and data for the evaluation results of the deep learning-based crop object detection system according to an embodiment of the present invention.

The crop object detection system of the present invention may be implemented in Python, and the system may further include an imaging unit and a work recording unit.

FIG. 7 illustrates the imaging unit of the object detection system. The main window provided by the imaging unit consists of two tabs. It takes a connected camera device or a recorded video as input and displays the real-time video in the first tab area (a).

When a potato is detected in the displayed video by the trained YOLOv5 model, a bounding box is overlaid on the potato object, so the user can check in real time whether potatoes are being detected.

Afterwards, when a detected potato object reaches the counting line, which is shown in yellow, the detected objects are displayed in the sub-window area below the main window. The recording unit of the object detection system is displayed to the right of the main window. The recording unit can be broadly divided into three parts: the number of potatoes harvested per second is displayed at the upper right of the main window, and a graph of the number of potatoes harvested over time is displayed below it. The table in the second tab of the main window records the above information and the cumulative count in table form (b).

FIG. 8 is a diagram illustrating images of the classification result of each object in the region of interest during evaluation of the learning model of the deep learning-based crop object detection system according to an embodiment of the present invention.

Referring to FIG. 8, the classification result of each object in the region of interest is shown. The evaluation process is as follows.

First, the learning model of the deep learning-based crop object detection system according to an embodiment of the present invention was evaluated using precision, recall, mAP (mean Average Precision), and F1 score as evaluation metrics.

Here, precision is calculated by Equation 2 below.

Recall is calculated by Equation 3 below.

Here, a True Positive is a case in which the learning model correctly predicts the positive class, and a False Positive is a case in which the model incorrectly predicts the positive class. Likewise, a True Negative is a case in which the model correctly predicts the negative class, and a False Negative is a case in which the model incorrectly predicts the negative class.
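In their standard form, which Equations 2 and 3 are assumed to follow, precision and recall are defined from the counts of true positives (TP), false positives (FP), and false negatives (FN):

```latex
\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}
```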

The F1 score, a metric used when the imbalance between data classes is severe, is the harmonic mean of precision and recall, and is calculated by Equation 4 below.
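The three metrics can be sketched in a few lines, using the standard definitions and the harmonic-mean form F1 = 2PR / (P + R) stated above. The function names are illustrative, and the zero-denominator handling (returning 0.0) is a convention assumed here rather than something specified in the patent.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Precision and recall reported for epoch 1 in Table 1.
print(round(f1_score(0.8608, 0.7793), 4))
```

Applied to the epoch-1 precision and recall, this yields approximately 0.8180, matching the F1 score listed for that epoch.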

According to the evaluation results, as shown in Table 1 below, all evaluation metrics increase as the number of training epochs increases. After 300 epochs, precision, recall, mAP, and F1 score were 0.9997, 0.9994, 0.9872, and 0.9996, respectively.

Epoch   Precision   Recall   mAP      F1 Score
1       0.8608      0.7793   0.5770   0.8180
100     0.9959      0.9966   0.9692   0.9827
200     0.9968      0.9979   0.9796   0.9974
300     0.9997      0.9994   0.9872   0.9996

Here, mAP, a metric widely used in computer vision, is computed as the average of the values obtained while increasing the IoU (Intersection over Union) threshold from 0.5 to 0.95 in steps of 0.05. The video data used for the evaluation consisted of the original video from which the training data was extracted and real-time video recorded in the laboratory.
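The averaging over IoU thresholds can be sketched as follows. The sketch assumes that an average precision (AP) value has already been computed at each threshold and only shows how the ten thresholds are generated and averaged; the function names and the dictionary layout are hypothetical.

```python
def iou_thresholds(start=0.5, stop=0.95, step=0.05):
    """Return the IoU thresholds from start to stop inclusive, in increments of step."""
    n = int(round((stop - start) / step)) + 1
    # Rounding to two decimals avoids floating-point drift such as 0.6500000001.
    return [round(start + i * step, 2) for i in range(n)]


def mean_over_thresholds(ap_by_threshold):
    """Average precomputed AP values, keyed by IoU threshold, over all thresholds."""
    values = [ap_by_threshold[t] for t in iou_thresholds()]
    return sum(values) / len(values)


print(iou_thresholds())
```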

Although many details are specifically described above, they should be interpreted as examples of preferred embodiments rather than as limiting the scope of the invention. Therefore, the invention should be defined not by the described embodiments but by the claims and their equivalents.

10: camera device 100: object detection system
110: classification unit 111: segmentation module
112: prediction module 113: discrimination module
120: tracking unit 121: extraction module
122: first distance calculation module 123: second distance calculation module
124: tracking module 130: counting unit

Claims

A deep learning-based crop object detection system comprising:
a classification unit that classifies each of a plurality of objects appearing in a real-time image captured by a camera device as a tracking target object or as one of one or more non-tracking target objects;
a tracking unit that calculates two distance values for the plurality of objects appearing in preceding and following frames of the real-time image, determines the tracking target objects based on the two distance values, and assigns a unique ID to each tracking target object; and
a counting unit that counts the tracking target objects to which unique IDs have been assigned.

The deep learning-based crop object detection system according to claim 1,
wherein the classification unit comprises:
a segmentation module that divides the real-time image into an S × S grid (S being a natural number greater than or equal to 1);
a prediction module that predicts the bounding box and confidence score of each divided grid cell; and
a discrimination module that combines the divided grid cells and determines the type of each object by adjusting the position of the bounding box through non-maximum suppression (NMS).

The deep learning-based crop object detection system according to claim 1,
wherein the tracking unit comprises:
an extraction module that extracts first and second object information for the plurality of objects appearing in the preceding and following frames, respectively;
a first distance calculation module that uses a filter to calculate estimated object information in the following frame as predicted from the first object information, and calculates a first distance value between the estimated object information and the second object information;
a second distance calculation module that calculates a second distance value using feature values of the plurality of objects appearing in the preceding and following frames; and
a tracking module that determines objects for which the result of adding the first and second distance values is greater than or equal to a threshold to be the same object and tracks them.

The deep learning-based crop object detection system according to claim 3,
wherein the filter is a Kalman filter.

The deep learning-based crop object detection system according to claim 3,
wherein the first and second distance values are a Mahalanobis distance and a cosine distance, respectively.

The deep learning-based crop object detection system according to claim 5,
wherein the result value is calculated by the following equation,

where c is the result value, d_m is the Mahalanobis distance, d_c is the cosine distance, and λ is a hyperparameter.

The deep learning-based crop object detection system according to claim 1,
wherein a counting line is set in the real-time image according to the direction in which each object is conveyed out of the image, and
the counting unit increments the object count when the center point of the bounding box corresponding to a tracking target object in the real-time image passes the counting line.