KR102602400B1

KR102602400B1 - Method and Apparatus for Box Level Postprocessing For Accurate Object Detection

Info

Publication number: KR102602400B1
Application number: KR1020210034354A
Authority: KR
Inventors: 정용화; 박대희; 유승현; 손승욱; 주권일; 안한세
Original assignee: 고려대학교 세종산학협력단
Priority date: 2021-03-17
Filing date: 2021-03-17
Publication date: 2023-11-17
Also published as: KR20220129726A

Abstract

A box-level post-processing method and apparatus for accurate object detection are presented. The proposed method includes: acquiring an image by photographing a plurality of objects with a camera; obtaining, through an object detection unit, boxes for detecting the plurality of objects from the acquired image; correcting, through a correction unit and using foreground pixel information within each box, the confidence values of fake object boxes caused by overlap between objects and the confidence values of real object boxes; and readjusting, through the object detection unit, the positions of the boxes according to the corrected confidence values.

Description

Box level postprocessing method and apparatus for accurate object detection {Method and Apparatus for Box Level Postprocessing For Accurate Object Detection}

The present invention relates to an image processing method and apparatus that use convolutional neural network (CNN)-based object detection for accurate object detection.

Given the shortage of workers in pig houses (in Korea, one worker manages an average of 2,000 pigs) and the high mortality rate of pigs (in Korea, about 5 million pigs die each year), there is a growing need for pig-house monitoring that applies information technology to manage individual pigs closely [1]. However, even with steadily improving CNN-based object detection technology [2], accurately detecting pigs in a crowded pig house remains difficult for reasons such as occlusion between pigs.

Advances in convolutional neural network technology have made it possible to monitor pigs in a pig house through object detection. One prior-art object detection method for such monitoring is YOLOv4, which obtains bounding boxes corresponding to the objects detected in an image acquired from a camera. YOLOv4, a cost-effective detector that balances speed and accuracy, uses a confidence value that indicates how accurately each detected box captures the class and the object size in the object information output after Soft-NMS. In addition, to separate objects for the confidence readjustment, an adaptive threshold (adaptiveThreshold) technique is used to separate objects in the image, and the image is rotated. On MS COCO [4], a representative public object detection dataset, YOLOv4 achieves the highest accuracy relative to its processing speed.

Thanks to continued advances in CNN-based deep learning, YOLOv4 detects most objects accurately (with box confidence values of 80% or higher). However, when many objects overlap in a complex and crowded scene, a fake object box can receive a higher confidence value than a real object box.

To solve this problem, a correction method is needed that lowers the confidence value of fake objects using foreground pixel information within the box and raises the confidence value of overlapped objects using information from adjacent boxes.

The technical problem addressed by the present invention is to provide, for object detection using convolutional neural network technology, a box-level post-processing method and apparatus that adjust the confidence values of incorrect boxes among the boxes output by the detector in order to improve detection accuracy.

In one aspect, the box-level post-processing method for accurate object detection proposed in the present invention includes: acquiring an image by photographing a plurality of objects with a camera; obtaining, through an object detection unit, boxes for detecting the plurality of objects from the acquired image; correcting, through a correction unit and using foreground pixel information within each box, the confidence values of fake object boxes caused by overlap between objects and the confidence values of real object boxes; and readjusting, through the object detection unit, the positions of the boxes according to the corrected confidence values.

The correcting step includes: calculating the degree to which foreground pixels are contained in a box in order to determine whether it is a fake object box or a real object box; judging the box to be a fake object box if the foreground pixels it contains are below a predetermined criterion, and a real object box if they are at or above the criterion; and, for a real object box, separating the foreground pixels and background pixels within the box using an adaptive threshold method.

According to an embodiment of the present invention, based on the separated foreground and background pixels, the confidence values of the fake object box and the real object box are corrected using the ratio of the size of the foreground area within the box to the total size of the box.

According to an embodiment of the present invention, when the confidence value of a real object box is calculated to be low because of a region that overlaps other real object boxes, the confidence value is corrected using the ratio of the size of the non-overlapped area within the box to the total size of the box.

According to an embodiment of the present invention, to correct the confidence value for overlapping regions among real object boxes, the boxes of objects detected in the original image are matched with the boxes of objects detected in a rotated image obtained by rotating the original image. The foreground and background pixels within each box are then separated using the adaptive threshold method for both the original and the rotated image, and the confidence value is corrected for whichever box, from the original or the rotated image, contains more foreground pixels.

In another aspect, the box-level post-processing apparatus for accurate object detection proposed in the present invention includes: an image collection unit that acquires an image of a plurality of objects photographed by a camera; an object detection unit that obtains boxes for detecting the plurality of objects from the acquired image and readjusts the positions of the boxes according to the confidence values corrected by a correction unit; and the correction unit, which uses foreground pixel information within each box to correct the confidence values of fake object boxes caused by overlap between objects and the confidence values of real object boxes.

According to embodiments of the present invention, detection accuracy can be improved by a correction method that addresses cases where, because objects overlap in a complex and crowded scene, a fake object box receives a higher confidence value than a real object box. The method lowers the confidence value of fake objects using foreground pixel information within the box and raises the confidence value of overlapped objects using information from adjacent boxes.

FIG. 1 is a flowchart illustrating a box-level post-processing method for accurate object detection according to an embodiment of the present invention.
FIG. 2 is a diagram showing a confidence correction algorithm according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a background separation process for a plurality of objects according to an embodiment of the present invention.
FIG. 4 is a diagram showing a camera installed for image acquisition according to an embodiment of the present invention.
FIG. 5 is a diagram showing the process of obtaining the size of the foreground area of a box (S_{foreground_pixel}) in order to separate an object from the background according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a confidence correction process that lowers the confidence value of a fake object box according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a confidence correction process that raises the confidence value of a real object box according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating the process of applying the confidence correction algorithm to the original image and the rotated image separately and then matching the results according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating the process of separating objects in an image using the adaptive threshold (adaptiveThreshold) technique according to an embodiment of the present invention.
FIG. 10 is a diagram illustrating a box-level post-processing apparatus for accurate object detection according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a flowchart illustrating a box-level post-processing method for accurate object detection according to an embodiment of the present invention.

The proposed box-level post-processing method for accurate object detection includes: acquiring an image by photographing a plurality of objects with a camera (110); obtaining, through an object detection unit, boxes for detecting the plurality of objects from the acquired image (120); correcting, through a correction unit and using foreground pixel information within each box, the confidence values of fake object boxes caused by overlap between objects and the confidence values of real object boxes (130); and readjusting, through the object detection unit, the positions of the boxes according to the corrected confidence values (140).

In step 110, an image is acquired by photographing a plurality of objects with a camera. For example, the objects can be photographed with a top-view camera that looks down from above.

In step 120, boxes for detecting the plurality of objects are obtained from the acquired image through the object detection unit. The detector is run on the image acquired from the installed camera, and a confidence value is obtained for each box.

In step 130, the correction unit uses foreground pixel information within each box to correct the confidence values of fake object boxes caused by overlap between objects and the confidence values of real object boxes.

In step 130, the degree to which foreground pixels are contained in a box is first calculated to determine whether the box is a fake object box or a real object box. If the foreground pixels contained in the box are below a predetermined criterion, the box is judged to be a fake object box; if they are at or above the criterion, it is judged to be a real object box. The foreground and background pixels within a real object box are then separated using an adaptive threshold method.

Based on the separated foreground and background pixels, the confidence values of the fake object box and the real object box are corrected using the ratio of the size of the foreground area within the box to the total size of the box.

In addition, when the confidence value of a real object box is calculated to be low because of a region that overlaps other real object boxes, the confidence value is corrected using the ratio of the size of the non-overlapped area within the box to the total size of the box.

To correct the confidence value for overlapping regions among real object boxes, the boxes of objects detected in the original image are matched with the boxes of objects detected in a rotated image obtained by rotating the original image. The foreground and background pixels within each box are then separated using the adaptive threshold method for both images, and the confidence value is corrected for whichever box, from the original or the rotated image, contains more foreground pixels.

In step 140, the positions of the boxes for detecting the plurality of objects are readjusted through the object detection unit according to the corrected confidence values.

Hereinafter, the box-level post-processing method for accurate object detection according to an embodiment of the present invention is described in more detail with reference to FIGS. 2 to 9. To explain the proposed object detection method and apparatus in detail, they are described by way of example as applied to detecting objects (that is, pigs) in a pig house. Object detection in a pig house is only one embodiment and is not limiting; the proposed object detection method and apparatus can be applied to object detection in various fields.

FIG. 2 is a diagram showing a confidence correction algorithm according to an embodiment of the present invention.

To explain the proposed object detection method and apparatus in detail, they are described by way of example as applied to detecting objects (that is, pigs) in a pig house. First, an image is acquired by photographing a plurality of objects with a camera. The camera according to an embodiment of the present invention may be installed as a top-view camera that looks down from above to capture images of the pigs in the pig house.

From the image of pigs in the pig house acquired by the camera, boxes for detecting the pigs are obtained through the object detection unit. The detector is run on the image acquired from the installed camera, and a confidence value is obtained for each box. The confidence value of each box can then be corrected using the algorithm shown in FIG. 2. In this step, the correction unit uses foreground pixel information within each box to correct the confidence values of fake object boxes caused by overlap between pigs and the confidence values of real object boxes.

The object detection unit according to an embodiment of the present invention uses YOLOv4, a cost-effective object detector that balances speed and accuracy. From the object information that YOLOv4 outputs after Soft-NMS, the method uses the confidence value that indicates how accurately each detected box captures the class and the object size. In addition, to separate objects for the confidence readjustment, an adaptive threshold technique is used to separate objects in the image, and the image is rotated to correct the confidence values.
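The adaptive threshold step can be illustrated with a small sketch. The code below is not the patent's implementation: it is a simplified local-mean threshold written in plain Python, and the function name `adaptive_mean_threshold`, the parameters `block_size` and `c`, the border handling (neighbourhoods are clipped at the image edge), and the assumption that foreground pixels are brighter than their surroundings are all illustrative choices. In practice a library routine such as OpenCV's `cv2.adaptiveThreshold` would normally be used instead.

```python
def adaptive_mean_threshold(gray, block_size=3, c=0):
    """Return a 0/1 mask: 1 where a pixel exceeds its local mean minus c.

    gray is a list of rows of grayscale values. This is a hypothetical,
    simplified stand-in for an adaptive threshold routine.
    """
    if block_size < 1 or block_size % 2 == 0:
        raise ValueError("block_size must be a positive odd number")
    height, width = len(gray), len(gray[0])
    radius = block_size // 2
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Neighbourhood around (x, y), clipped at the image border.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            total = sum(gray[j][i] for j in ys for i in xs)
            local_mean = total / (len(ys) * len(xs))
            # Assumption: the object is brighter than the local background.
            mask[y][x] = 1 if gray[y][x] > local_mean - c else 0
    return mask


# A bright 2x2 blob on a dark background is marked as foreground.
image = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
mask = adaptive_mean_threshold(image)
```

Because the threshold is computed per neighbourhood rather than once for the whole image, uneven lighting across the scene affects the result less than a single global threshold would.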

According to an embodiment of the present invention, based on the separated foreground and background pixels, the confidence values of the fake object box and the real object box can be corrected using the ratio of the size of the foreground area within the box to the total size of the box.

For a fake object box, the object can be separated from the background using the Decrease_Confidence(b) function and OpenCV, and the size of the foreground area within the box (S_{foreground_pixel}) can be obtained.

For a real object box, on the other hand, the Increase_Confidence(b) function can be used to exclude the overlapping boxes and obtain the size of the non-overlapped area within the box (S_{not_overlapped}).
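As a rough illustration of the quantity S_{not_overlapped}, the sketch below counts the pixels of one box that are not covered by any other box and divides by the box area. It is an assumption-laden example, not the patent's Increase_Confidence(b) function: the box format `(x1, y1, x2, y2)` with integer pixel coordinates and exclusive right and bottom edges, and the helper names `not_overlapped_area` and `not_overlapped_ratio`, are hypothetical.

```python
def not_overlapped_area(box, others):
    """Count pixels of box that are not covered by any box in others."""
    x1, y1, x2, y2 = box
    free = 0
    for y in range(y1, y2):
        for x in range(x1, x2):
            covered = any(
                ox1 <= x < ox2 and oy1 <= y < oy2
                for ox1, oy1, ox2, oy2 in others
            )
            if not covered:
                free += 1
    return free


def not_overlapped_ratio(box, others):
    """Ratio of the non-overlapped area to the total area of box."""
    x1, y1, x2, y2 = box
    total = (x2 - x1) * (y2 - y1)
    if total <= 0:
        raise ValueError("box must have positive area")
    return not_overlapped_area(box, others) / total


# A 4x4 box whose lower-right 2x2 corner is covered by a neighbour.
ratio = not_overlapped_ratio((0, 0, 4, 4), [(2, 2, 6, 6)])
```

Counting pixel by pixel keeps the example correct when several neighbouring boxes overlap each other inside the box, at the cost of speed; a production version would more likely rasterise the boxes into an array once.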

The confidence correction algorithm is then applied to the original image and to the rotated image separately. The Euclidean distance between the results detected in the original image and those detected in the rotated image is calculated, and the boxes of objects detected in the original image are matched with the boxes of objects detected in the rotated image.
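The distance-based matching can be sketched as follows. This is only an illustration under several assumptions that the text does not state: boxes from the rotated image are taken to be already mapped back into the original image's coordinate frame, the distance is measured between box centres, matching is greedy and one-to-one, and pairs farther apart than a hypothetical `max_dist` are ignored. The names `center` and `match_boxes` are likewise illustrative.

```python
import math


def center(box):
    """Centre point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def match_boxes(original, rotated, max_dist=20.0):
    """Greedily pair boxes by Euclidean distance between their centres.

    Returns a list of (original_index, rotated_index, distance) tuples,
    closest pairs first. Each box is used in at most one pair.
    """
    candidates = []
    for i, a in enumerate(original):
        for j, b in enumerate(rotated):
            d = math.dist(center(a), center(b))
            if d <= max_dist:
                candidates.append((d, i, j))
    candidates.sort()

    used_original, used_rotated, matches = set(), set(), []
    for d, i, j in candidates:
        if i in used_original or j in used_rotated:
            continue
        used_original.add(i)
        used_rotated.add(j)
        matches.append((i, j, d))
    return matches


original_boxes = [(0, 0, 10, 10), (50, 50, 60, 60)]
rotated_boxes = [(51, 50, 61, 60), (1, 0, 11, 10)]
matches = match_boxes(original_boxes, rotated_boxes)
```

The greedy strategy is chosen here only for brevity; an assignment solver would give a globally optimal pairing when detections are dense.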

After the foreground and background pixels within each box are separated using the adaptive threshold method for the original and rotated images, the detection result that contains more background is identified among the detected results. The confidence value of the object that contains more background, from the original or the rotated image, is then corrected. After the confidence values are corrected, the box positions are readjusted according to the confidence values.

FIG. 3 is a flowchart illustrating a background separation process for a plurality of objects according to an embodiment of the present invention.

To explain the proposed object detection method and apparatus in detail, they are described by way of example as applied to detecting objects (that is, pigs) in a pig house.

The present invention proposes a box-level post-processing method that evaluates the confidence values of the boxes output by a CNN-based pig detector and corrects the confidence values of incorrect boxes. First, an image is acquired by photographing a plurality of objects with a camera.

From the image of pigs in the pig house acquired by the camera, boxes for detecting the pigs are obtained through the object detection unit. The detector is run on the image acquired from the installed camera, and a confidence value (confidence score) is obtained for each box.

The object detection unit according to an embodiment of the present invention uses YOLOv4, a cost-effective object detector that balances speed and accuracy. From the object information that YOLOv4 outputs after Soft-NMS, the method uses the confidence value that indicates how accurately each detected box captures the class and the object size.

A confidence threshold (T) is set using the confidence values that YOLOv4 outputs after Soft-NMS, and the confidence value of each box can be adjusted according to predetermined conditions as follows.

For example, if the calculated confidence value of a box is below T-10, the box can be deleted and the search moves on to the next object (that is, the next pig).

If the calculated confidence value of the box is above T+10, the box can be kept and the search moves on to the next object.

If the calculated confidence value of the box lies between T and T+10, the Decrease_Confidence() function can be called to decrease the confidence value of the box.

If the calculated confidence value of the box lies between T-10 and T, the Increase_Confidence() function can be called to increase the confidence value of the box.

If the calculated confidence value of the box is below T, the box can be deleted and the search moves on to the next object.

If the calculated confidence value of the box is above T, the box can be kept and the search moves on to the next object.

The above process can be repeated for every object box in the class. Once it has been completed for every object box in the class, the captured image can be rotated and the process repeated again for every object box in the class.
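The threshold rules above can be sketched as a small routing function. The sketch makes several assumptions that the text leaves open: confidence values are in percent, values exactly equal to T, T-10, or T+10 are folded into the ambiguous band, and the final delete-or-keep comparison against T is applied to the adjusted value. The adjustment functions are passed in as hypothetical callables because their formulas are not reproduced here; the names `route_box`, `postprocess_box`, and `margin` are illustrative.

```python
def route_box(conf, t, margin=10):
    """Pick the first-pass action for one box from its confidence (percent)."""
    if conf < t - margin:
        return "delete"
    if conf > t + margin:
        return "keep"
    # Ambiguous band around T; boundary values are folded in (assumption).
    return "decrease" if conf > t else "increase"


def postprocess_box(conf, t, decrease, increase, margin=10):
    """Apply the band rules and return (final_action, final_confidence).

    decrease and increase are caller-supplied functions standing in for
    Decrease_Confidence() and Increase_Confidence().
    """
    action = route_box(conf, t, margin)
    if action in ("delete", "keep"):
        return action, conf
    adjusted = decrease(conf) if action == "decrease" else increase(conf)
    # Assumed reading: the last comparison against T uses the adjusted value.
    return ("keep" if adjusted > t else "delete"), adjusted


# Placeholder adjustments of 8 points in each direction, for illustration only.
lower = lambda c: c - 8
raise_ = lambda c: c + 8
results = [postprocess_box(c, 50, lower, raise_) for c in (35, 45, 55, 70)]
```

Only boxes in the 20-point band around T trigger the more expensive pixel-level checks, so clearly good and clearly bad detections are resolved with a single comparison.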

After the process has also been completed for every object box in the class for the rotated image, the boxes of objects detected in the original image are matched n:m with the boxes of objects detected in the rotated image in order to correct the confidence value for overlapping regions among real object boxes. The foreground and background pixels within each box are then separated using the adaptive threshold method for both images, and the confidence value is corrected for whichever box, from the original or the rotated image, contains more foreground pixels.

For example, among the matched results, the detection result containing the most foreground pixels can be selected, and its confidence value can be readjusted by calling the Rotation_Confidence() function. Among the matched results, the match information containing the object with the highest confidence value can then be reflected in the final detection result.

FIG. 4 is a diagram showing a camera installed for image acquisition according to an embodiment of the present invention.

The camera 410 according to an embodiment of the present invention may be installed as a top-view camera that looks down from above to capture images of the pigs in the pig house.

The image acquired from the camera 410 in the pig house can be fed to the YOLOv4 object detection unit according to an embodiment of the present invention to obtain bounding boxes corresponding to the detected objects. On MS COCO [4], a representative public object detection dataset, YOLOv4 achieves high accuracy relative to its processing speed. That is, thanks to continued advances in CNN-based deep learning, YOLOv4 detects most pigs accurately, but for reasons such as the complex structure of the pen, the output still includes boxes that detect fake pigs belonging to the background (false positives, FP). If boxes with low confidence values are simply removed to eliminate these fake pigs, other boxes with low confidence values, such as those of occluded pigs, are removed as well, which causes a second problem (false negatives, FN). To solve these problems, when the current confidence value of a box is too ambiguous to decide whether it is a fake pig or a real pig, the present invention corrects the confidence value using the foreground pixel information within the box and the information of adjacent boxes.

FIG. 5 is a diagram showing the process of obtaining the size of the foreground area of a box (S_{foreground_pixel}) in order to separate the object from the background according to an embodiment of the present invention.

First, to determine whether a box whose confidence value lies in the ambiguous range is a fake pig box corresponding to the background, the number of foreground pixels contained in the box is calculated. Although the proportion of foreground pixels in a box varies with the position of the pig, a box corresponding to a fake pig is assumed to contain foreground pixels at or below a predetermined threshold. The adaptive threshold method [5] is applied to classify the pixels in the box as foreground or background. The confidence value of a box judged by this criterion is then corrected by calling the Decrease_Confidence() function, which uses the size of the foreground area of the box (S_{foreground_pixel}) (520) relative to the total size of the box (S_total) (510), according to the following formula:

FIG. 6 is a diagram illustrating the confidence value correction process that lowers the confidence value of a fake object box according to an embodiment of the present invention.

For a fake pig box produced by the YOLOv4 object detection unit according to an embodiment of the present invention, the confidence value is corrected by calling the Decrease_Confidence() function, which uses the size of the foreground area of the box (S_{foreground_pixel}) relative to the total size of the box (S_total).

As shown in FIG. 5, the confidence value of the fake pig box was 75%. After the Decrease_Confidence() function is called on this value, the corrected confidence value is 63%, as shown in FIG. 6.
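The quantity that drives this correction, the share of a box occupied by foreground pixels, can be sketched in a few lines of Python. This is a minimal illustration and not the patented implementation: the box convention `(x1, y1, x2, y2)` with exclusive right and bottom edges, the helper names `foreground_ratio` and `is_fake_box`, and the 0.5 default threshold are all assumptions introduced here. The update rule inside Decrease_Confidence() is deliberately left out.

```python
from typing import List, Tuple

# Hypothetical box convention: (x1, y1, x2, y2), right/bottom edges exclusive.
Box = Tuple[int, int, int, int]


def foreground_ratio(mask: List[List[int]], box: Box) -> float:
    """Return S_foreground_pixel / S_total for one box.

    `mask` is a binary image (1 = foreground, 0 = background), for
    example the output of an adaptive threshold step.
    """
    x1, y1, x2, y2 = box
    total = (x2 - x1) * (y2 - y1)
    if total <= 0:
        raise ValueError("box must have positive area")
    foreground = sum(sum(row[x1:x2]) for row in mask[y1:y2])
    return foreground / total


def is_fake_box(ratio: float, threshold: float = 0.5) -> bool:
    """Treat a box as a fake (background) box when its foreground ratio
    is below the threshold. The 0.5 default is an assumed value."""
    return ratio < threshold


mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
loose_box = (0, 0, 6, 4)   # mostly background
tight_box = (1, 1, 4, 3)   # entirely foreground
print(foreground_ratio(mask, loose_box), is_fake_box(foreground_ratio(mask, loose_box)))
print(foreground_ratio(mask, tight_box), is_fake_box(foreground_ratio(mask, tight_box)))
```

Keeping the ratio separate from the decision makes it easy to plug in whichever confidence update an implementation actually uses.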

FIG. 7 is a diagram illustrating the confidence value correction process that raises the confidence value of a real object box according to an embodiment of the present invention.

According to an embodiment of the present invention, the confidence value may also be corrected using adjacent box information. To determine whether a box whose confidence value lies in the ambiguous range is the box of an occluded pig belonging to the foreground, it is first determined whether another box overlaps it. In that case, the proportion of foreground pixels in the box will be at least one half, so the correction that lowers the confidence value described above is not applied. In other words, for an occluded pig box it can be assumed that the confidence value was computed low because of the adjacent pig box. Based on this assumption, the confidence value of the box is corrected by calling the Increase_Confidence() function, which uses the size of the non-overlapped area of the box (S_{not_overlapped}) (720) relative to its total size (S_total) (710), according to the following formula:

FIG. 7 shows the total size (S_total) (710), the size of the non-overlapped area of the box (S_{not_overlapped}) (720), and the adjacent box (730). The size of the non-overlapped area (S_{not_overlapped}) (720) can be calculated using the information of the adjacent box (730).
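As a rough illustration of how S_{not_overlapped} could be obtained from adjacent box information, the sketch below counts the pixels of a box that no neighbouring box covers. The function name `not_overlapped_ratio`, the integer pixel coordinates, and the exclusive right and bottom edges are assumptions made for this example. Only the ratio is computed; the update rule inside Increase_Confidence() is not reproduced.

```python
from typing import Iterable, Tuple

# Hypothetical box convention: (x1, y1, x2, y2), right/bottom edges exclusive.
Box = Tuple[int, int, int, int]


def not_overlapped_ratio(box: Box, neighbors: Iterable[Box]) -> float:
    """Return S_not_overlapped / S_total for `box`.

    A pixel counts as overlapped if at least one neighbouring box covers
    it, so regions shared by several neighbours are not counted twice.
    """
    x1, y1, x2, y2 = box
    total = (x2 - x1) * (y2 - y1)
    if total <= 0:
        raise ValueError("box must have positive area")
    neighbors = list(neighbors)
    free = 0
    for y in range(y1, y2):
        for x in range(x1, x2):
            covered = any(
                nx1 <= x < nx2 and ny1 <= y < ny2
                for nx1, ny1, nx2, ny2 in neighbors
            )
            if not covered:
                free += 1
    return free / total


box = (0, 0, 10, 10)
print(not_overlapped_ratio(box, [(5, 0, 15, 10)]))                    # right half covered
print(not_overlapped_ratio(box, [(5, 0, 15, 10), (0, 5, 10, 15)]))    # only top-left quarter free
```

Counting pixels is slower than closed-form rectangle arithmetic, but it stays correct when several adjacent boxes overlap each other inside the same box.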

In addition, the overall confidence correction process using foreground pixel information and adjacent box information can be carried out according to the algorithm of FIG. 2 described above.

FIG. 8 is a diagram illustrating the process of applying the confidence correction algorithm to the original image and to the rotated image separately and then matching the results, according to an embodiment of the present invention.

The confidence correction algorithm is applied separately to the original image and to the rotated image. Suppose the detections in the original image are A, B, C, and D, and the detections in the rotated image are a, b, c, and d. Using the Euclidean distance, the object detected as A in the original image is matched with the detections a and b in the rotated image, the object detected as B is matched with c, and the objects detected as C and D are matched with d. The detections can thus be matched n:m.
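One simple way to obtain such an n:m matching is to pair every original detection with every rotated detection whose box centre lies within a distance limit. The sketch below takes that approach purely as an assumption; the source only states that the Euclidean distance is used. The function names, the `max_dist` parameter, and the premise that rotated-image boxes have already been mapped back into the original image's coordinates are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def center(box: Box) -> Tuple[float, float]:
    """Return the centre point of a box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def match_boxes(
    original: Dict[str, Box],
    rotated: Dict[str, Box],
    max_dist: float,
) -> List[Tuple[str, str]]:
    """Pair detections whose centres are at most `max_dist` apart.

    A label may appear in several pairs, which yields n:m matches.
    `max_dist` is an assumed tuning parameter, not a value from the source.
    """
    pairs = []
    for name_o, box_o in original.items():
        for name_r, box_r in rotated.items():
            if math.dist(center(box_o), center(box_r)) <= max_dist:
                pairs.append((name_o, name_r))
    return pairs


original = {"A": (0, 0, 10, 10), "B": (30, 0, 40, 10)}
rotated = {"a": (0, 0, 5, 10), "b": (5, 0, 10, 10), "c": (30, 0, 40, 10)}
print(match_boxes(original, rotated, max_dist=5.0))
# A is paired with both a and b (1:2), B with c (1:1)
```

A threshold on centre distance is only one possible criterion; nearest-neighbour assignment or an overlap measure would produce different pairings.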

FIG. 9 is a diagram illustrating the process of separating objects in an image using the adaptive threshold (adaptiveThreshold) technique according to an embodiment of the present invention.

After the objects in the original image and the rotated image are separated using the adaptive threshold (adaptiveThreshold) technique, the detection result that contains more foreground is selected from the detected results.
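To make the separation step concrete, the following sketch binarizes a grayscale patch by comparing each pixel with the mean of its local neighbourhood, and then counts foreground pixels so that two candidate detections can be compared. It is a simple local-mean stand-in written for illustration, not the threshold-surface method of reference [5] and not a library routine. The names `local_mean_threshold` and `foreground_count`, the window size `block`, and the offset `c` are assumptions.

```python
from typing import List

Image = List[List[int]]


def local_mean_threshold(gray: Image, block: int = 3, c: float = 0.0) -> Image:
    """Mark a pixel as foreground (1) when it is brighter than the mean
    of its block x block neighbourhood minus `c`.

    Windows are truncated at the image border (an assumed border policy).
    """
    h, w = len(gray), len(gray[0])
    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                gray[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] > mean - c else 0
    return out


def foreground_count(mask: Image) -> int:
    """Number of foreground pixels in a binary mask."""
    return sum(sum(row) for row in mask)


patch_original = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
patch_rotated = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
mask_o = local_mean_threshold(patch_original)
mask_r = local_mean_threshold(patch_rotated)
# Keep whichever detection contains more foreground.
chosen = "original" if foreground_count(mask_o) >= foreground_count(mask_r) else "rotated"
print(mask_o, chosen)
```

The same comparison works with any binarization method, as long as both images are thresholded with identical settings.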

Of the original image and the rotated image, the confidence value of the box that contains the object pixels, excluding the background, is corrected using the following equation:

Here, P_{overlapped_Pixel} denotes the size of the area where the original image and the rotated image overlap (910), and P_{not_overlapped_Pixel} denotes the size of the area where they do not overlap (920).

With the background excluded, the objects in the rotated and non-rotated images can be divided into an overlapped portion (910) and a non-overlapped portion (920), as shown in FIG. 9. Because the overlapped portion (910) is where two pixels coincide, Reflectivity is set to 0.5 so that the pixel values of the non-overlapped portion are reflected at only 50% in the calculation.
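The two pixel counts defined above can be computed directly from a pair of aligned foreground masks. The sketch below does that and then combines them as a weighted sum in which non-overlapped pixels count at the Reflectivity rate. The weighted sum is only one possible reading of the Reflectivity description and is an assumption of this example, as are the function names and the premise that the rotated mask has been rotated back into the original frame. How the result enters the confidence update is not shown.

```python
from typing import List, Tuple

Mask = List[List[int]]


def overlap_counts(mask_original: Mask, mask_rotated: Mask) -> Tuple[int, int]:
    """Return (P_overlapped_Pixel, P_not_overlapped_Pixel).

    A pixel is overlapped when it is foreground in both masks and
    not overlapped when it is foreground in exactly one of them.
    """
    overlapped = 0
    not_overlapped = 0
    for row_o, row_r in zip(mask_original, mask_rotated):
        for a, b in zip(row_o, row_r):
            if a and b:
                overlapped += 1
            elif a or b:
                not_overlapped += 1
    return overlapped, not_overlapped


def weighted_foreground(overlapped: int, not_overlapped: int,
                        reflectivity: float = 0.5) -> float:
    """ASSUMED combination: overlapped pixels count fully, non-overlapped
    pixels count at the `reflectivity` rate."""
    return overlapped + reflectivity * not_overlapped


m_original = [[1, 1, 0], [1, 1, 0]]
m_rotated = [[0, 1, 1], [0, 1, 1]]
p_over, p_not = overlap_counts(m_original, m_rotated)
print(p_over, p_not, weighted_foreground(p_over, p_not))
```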

FIG. 10 is a diagram illustrating a box-level post-processing device for accurate object detection according to an embodiment of the present invention.

The proposed box-level post-processing device for accurate object detection includes an image collection unit 1010, an object detection unit 1020, and a correction unit 1030.

The image collection unit 1010 acquires images of a plurality of objects photographed by a camera. For example, the objects may be photographed with a top-view camera that shoots downward from above.

The object detection unit 1020 obtains boxes for detecting the plurality of objects from the acquired images. The detector is run on the images acquired from the installed camera to obtain a confidence value for each box. The object detection unit 1020 according to an embodiment of the present invention may use YOLOv4, an object detector that offers a good trade-off between processing time and accuracy.

Among the boxes for detecting the plurality of objects, the correction unit 1030 corrects the confidence value of a fake object box and the confidence value of a real object box affected by overlap between objects, using the foreground pixel information within the box.

To determine whether a box is a fake object box or a real object box, the correction unit 1030 first calculates how many foreground pixels the box contains. If the foreground pixels in the box fall below a predetermined threshold, the box is judged to be a fake object box; if they are at or above the threshold, it is judged to be a real object box. The adaptive threshold method is then applied to the real object box to separate the foreground pixels from the background pixels within the box.

Based on the separated foreground and background pixels, the correction unit 1030 corrects the confidence value of the fake object box and the confidence value of the real object box using the ratio of the size of the foreground area within the box to the total size of the box.

In addition, when the confidence value of a real object box was computed low because of an area that overlaps another real object box, the correction unit 1030 corrects the confidence value using the ratio of the size of the non-overlapped area within the box to the total size of the box.
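The decision flow of the correction unit, as described in this section, can be summarized in a small dispatcher. This is a sketch pieced together from the prose, not a transcription of the algorithm in FIG. 2. The bounds of the ambiguous range, the 0.5 foreground threshold, and the function name `correct_confidence` are assumed values, and the two update rules are passed in as callables because their formulas are not reproduced here.

```python
from typing import Callable, Tuple


def correct_confidence(
    confidence: float,
    fg_ratio: float,
    overlaps_neighbor: bool,
    decrease: Callable[[float], float],
    increase: Callable[[float], float],
    ambiguous: Tuple[float, float] = (0.4, 0.8),  # assumed range
    fg_threshold: float = 0.5,                    # assumed threshold
) -> float:
    """Apply at most one correction to a box confidence.

    1. Boxes outside the ambiguous range are left unchanged.
    2. Too little foreground: treat as a fake box and lower the value.
    3. Enough foreground and overlapped by a neighbour: treat as an
       occluded real object and raise the value.
    """
    low, high = ambiguous
    if not (low <= confidence <= high):
        return confidence
    if fg_ratio < fg_threshold:
        return decrease(confidence)
    if overlaps_neighbor:
        return increase(confidence)
    return confidence


# Placeholder update rules, used only to exercise the control flow.
lower = lambda c: c - 0.1
raise_ = lambda c: c + 0.1
print(correct_confidence(0.75, 0.2, False, lower, raise_))  # fake box branch
print(correct_confidence(0.60, 0.8, True, lower, raise_))   # occluded box branch
print(correct_confidence(0.95, 0.2, False, lower, raise_))  # outside range, unchanged
```

Injecting the update rules keeps the control flow testable on its own and avoids committing to a particular formula.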

The object detection unit 1020 readjusts the positions of the boxes for detecting the plurality of objects according to the confidence values corrected by the correction unit 1030.

The device described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that execute on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but those of ordinary skill in the art will recognize that a processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, a processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to it. Software may also be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may contain program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described with reference to a limited set of examples and drawings, those of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in a different order from the described method, and/or the components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the claims that follow.

<References>

[1] S. Matthews, et al., "Early Detection of Health and Welfare Compromises through Automated Detection of Behavioural Changes in Pigs," The Veterinary Journal, Vol. 217, pp. 43-51, 2016.

[2] L. Liu, et al., "Deep Learning for Generic Object Detection: A Survey," International Journal of Computer Vision, Vol. 128, pp. 261-318, 2020.

[3] A. Bochkovskiy, C. Wang, and H. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv preprint arXiv:2004.10934, 2020.

[4] T. Lin, et al., "Microsoft COCO: Common Objects in Context," In Proceedings of the European Conference on Computer Vision (ECCV), 2014.

[5] I. Blayvas, A. Bruckstein, and R. Kimmel, "Efficient Computation of Adaptive Threshold Surfaces for Image Binarization," Pattern Recognition, Vol. 39, No. 1, pp. 89-101, 2006.

Claims

1. An object detection method comprising:
acquiring images by photographing a plurality of objects with a camera;
obtaining, through an object detection unit, boxes for detecting the plurality of objects from the acquired images;
correcting, through a correction unit and using foreground pixel information within the box, the confidence value of a fake object box and the confidence value of a real object box caused by overlap between the plurality of objects, among the boxes for detecting the plurality of objects; and
readjusting, through the object detection unit, the positions of the boxes for detecting the plurality of objects according to the corrected confidence values,
wherein the correcting comprises calculating how many foreground pixels the box contains in order to determine whether the box is a fake object box or a real object box, judging the box to be a fake object box if the foreground pixels contained in the box are below a predetermined threshold, judging the box to be a real object box if the foreground pixels contained in the box are equal to or greater than the predetermined threshold, and separating the foreground pixels and background pixels within the real object box using an adaptive threshold method.

2. (Deleted)

3. The object detection method of claim 1,
wherein, based on the separated foreground pixels and background pixels, the confidence value of the fake object box and the confidence value of the real object box are corrected using the ratio of the size of the foreground area within the box to the total size of the box.

4. The object detection method of claim 3,
wherein, when the confidence value of a real object box was calculated to be low because of an area overlapping another real object box, the confidence value is corrected using the ratio of the size of the non-overlapped area within the box to the total size of the box.

5. The object detection method of claim 4,
wherein, to correct the confidence value for the overlapping area among the real object boxes, the box of an object detected in the original image is matched with the box of the object detected in a rotated image obtained by rotating the original image, and
after the foreground pixels and background pixels in the boxes are separated using the adaptive threshold method for the original image and the rotated image, the confidence value is corrected for whichever box, from the original image or the rotated image, contains more foreground pixels.

6. An object detection device comprising:
an image collection unit that acquires images of a plurality of objects photographed by a camera;
an object detection unit that obtains boxes for detecting the plurality of objects from the acquired images and readjusts the positions of the boxes according to confidence values corrected by a correction unit; and
the correction unit, which corrects, using foreground pixel information within the box, the confidence value of a fake object box and the confidence value of a real object box caused by overlap between the plurality of objects, among the boxes for detecting the plurality of objects,
wherein the correction unit calculates how many foreground pixels the box contains in order to determine whether the box is a fake object box or a real object box, judges the box to be a fake object box if the foreground pixels contained in the box are below a predetermined threshold, judges the box to be a real object box if the foreground pixels contained in the box are equal to or greater than the predetermined threshold, and separates the foreground pixels and background pixels within the real object box using an adaptive threshold method.

7. (Deleted)

8. The object detection device of claim 6,
wherein the correction unit, based on the separated foreground pixels and background pixels, corrects the confidence value of the fake object box and the confidence value of the real object box using the ratio of the size of the foreground area within the box to the total size of the box.

9. The object detection device of claim 8,
wherein the correction unit, when the confidence value of a real object box was calculated to be low because of an area overlapping another real object box, corrects the confidence value using the ratio of the size of the non-overlapped area within the box to the total size of the box.

10. The object detection device of claim 9,
wherein the correction unit, to correct the confidence value for the overlapping area among the real object boxes, matches the box of an object detected in the original image with the box of the object detected in a rotated image obtained by rotating the original image, and
after separating the foreground pixels and background pixels in the boxes using the adaptive threshold method for the original image and the rotated image, corrects the confidence value for whichever box, from the original image or the rotated image, contains more foreground pixels.