KR102581154B1

KR102581154B1 - Method and Apparatus for Object Detection Using Model Ensemble

Info

Publication number: KR102581154B1
Application number: KR1020210044505A
Authority: KR
Inventors: 정용화; 박대희; 손승욱; 유승현; 안한세
Original assignee: 고려대학교 세종산학협력단
Priority date: 2021-04-06
Filing date: 2021-04-06
Publication date: 2023-09-21
Also published as: KR20220138620A

Abstract

A method and apparatus for object detection using a model ensemble are presented. The proposed method includes: acquiring an image by photographing a plurality of objects with a camera, and obtaining, by a modeling unit, detection boxes for the objects in the image from each of a first model and a second model; comparing, by a detection box comparison unit, the number of detection boxes of the first model and the number of detection boxes of the second model each with the number of objects in the image; and, depending on the result of that comparison, determining matching boxes by a detection box adjustment unit using Intersection over Union (IOU) values between the detection boxes of the first model and those of the second model, and selecting the final detection boxes.

Description

Object detection method and apparatus using model ensemble {Method and Apparatus for Object Detection Using Model Ensemble}

The present invention relates to an object detection method and apparatus that use a plurality of models to improve the accuracy of image processing applications.

Intersection over Union (IOU) can be used to characterize the relationship between the detection boxes of a deep learning model. IOU is the area of the overlap between two boxes divided by the area of their union, and it expresses how similar the two boxes are as a value between 0 and 1. Distance-IOU (DIOU) [2] goes one step further. DIOU uses the distance between the center points of the two boxes divided by the diagonal length of the smallest box enclosing both; this ratio, which also lies between 0 and 1, captures the center-point distance that IOU cannot express. The relationship between detected boxes can therefore be checked as a value normalized between 0 and 1. The PP-YOLO technique [1] additionally provides an IOU-prediction value for each detected box. The IOU-prediction estimates what the IOU between the currently detected box and the ground-truth label will be, and it can influence the per-class prediction scores of the detected boxes.
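The two quantities described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: boxes are taken to be axis-aligned `(x1, y1, x2, y2)` corner tuples, and the function names `iou` and `center_distance_ratio` are hypothetical, not names used in the patent or in the cited papers.

```python
from math import hypot

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Area of the overlap divided by the area of the union (0 to 1)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center_distance_ratio(a: Box, b: Box) -> float:
    """Center-point distance divided by the enclosing box diagonal (0 to 1)."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    # Width and height of the smallest box that encloses both a and b.
    enc_w = max(a[2], b[2]) - min(a[0], b[0])
    enc_h = max(a[3], b[3]) - min(a[1], b[1])
    diag = hypot(enc_w, enc_h)
    return hypot(ax - bx, ay - by) / diag if diag > 0 else 0.0
```

Two boxes that overlap heavily give an `iou` close to 1, while two adjacent boxes that do not overlap give an `iou` of 0 but still a meaningful `center_distance_ratio`, which is the extra information the distance-based term contributes.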

However, when objects are close together, these prior techniques have difficulty telling which object each same-class box has detected. Boxes that actually detect different objects are judged to have detected the same object and are removed, which lowers detection accuracy.

Given the shortage of workers in pig houses (in Korea, one worker manages 2,000 pigs on average) and the high mortality of pigs (in Korea, about 5 million pigs die each year), there is a growing need for pig house monitoring that applies information technology to manage individual pigs closely. However, even with steadily improving convolutional neural network (CNN) based object detection, accurate detection of pigs in a crowded pig house remains limited, for reasons such as occlusion between pigs.

Advances in convolutional neural network technology have made it possible to monitor pigs in a pig house through object detection. One prior-art object detection method for this purpose is YOLOv4, which obtains bounding boxes corresponding to the objects detected in images acquired from a camera. YOLOv4, an object detector with a good balance between processing time and accuracy, uses a confidence value taken from the object information produced after Soft-NMS; this value indicates how accurately the currently detected box captures the class and size of the object. To separate objects for the confidence readjustment method, an adaptive threshold (adaptiveThreshold) technique is used to separate the objects in the image, and the image is rotated. On MS COCO, a representative public object detection dataset, YOLOv4 achieves the highest accuracy relative to its processing speed. In addition, to obtain complementary data, additional training data is produced with the CLAHE algorithm, which increases image contrast.

Thanks to continued progress in CNN-based deep learning, YOLOv4 detects most objects accurately (box confidence values of 80% or higher). In a complex, crowded scene containing many objects, however, overlap between objects can cause a false object box to receive a higher confidence value than a true object box.

Therefore, a method is needed that solves this problem by using a plurality of models to improve the accuracy of image processing applications.

[1] X. Long, et al. "PP-YOLO: An Effective and Efficient Implementation of Object Detector." arXiv preprint arXiv:2007.12099, 2020.
[2] D. Yuan, et al. "Accurate Bounding-box Regression with Distance-IoU Loss for Visual Tracking." arXiv preprint arXiv:2007.01864, 2020.

CNN-based research on improving object detection accuracy is ongoing, but the prior art focuses on training and testing a single model. With this approach, the accuracy of YOLOv4 and EfficientDet on the COCO dataset is measured at roughly 43% and 45%, respectively, which is less than half.

The technical problem addressed by the present invention is to provide a method and apparatus that improve on single-model accuracy by using a plurality of models, rather than a single model, to adjust the object detection boxes.

In one aspect, the object detection method using a model ensemble proposed in the present invention includes: acquiring an image by photographing a plurality of objects with a camera, and obtaining, by a modeling unit, detection boxes for the objects in the image from each of a first model and a second model; comparing, by a detection box comparison unit, the number of detection boxes of the first model and the number of detection boxes of the second model each with the number of objects in the image; and, depending on the result of that comparison, determining matching boxes by a detection box adjustment unit using Intersection over Union (IOU) values between the detection boxes of the first model and those of the second model, and selecting the final detection boxes.

In the step of determining matching boxes and selecting the final detection boxes, if the number of detection boxes of either the first model or the second model equals the number of objects in the image, that model is selected as the model for object detection, and its detection boxes are used to detect the objects in the image.

In the step of determining matching boxes and selecting the final detection boxes, if neither the number of detection boxes of the first model nor that of the second model equals the number of objects in the image, the detection boxes of each model are sorted in descending order of confidence value, and the detection boxes of the first model are compared with those of the second model using IOU values.

When a detection box of the first model and a detection box of the second model have an IOU value at or above a predetermined threshold, they are judged to have detected the same object and are determined to be a matching box, and that matching box is selected as a final detection box for detecting objects in the image.

After the matching boxes have been determined, the remaining detection boxes of the first and second models are added to the final detection boxes in descending order of confidence value until the number of final detection boxes equals the number of objects in the image.

In another aspect, the object detection apparatus using a model ensemble proposed in the present invention includes: a modeling unit that obtains, from an image of a plurality of objects captured by a camera, detection boxes for the objects in the image from each of a first model and a second model; a detection box comparison unit that compares the number of detection boxes of the first model and the number of detection boxes of the second model each with the number of objects in the image; and a detection box adjustment unit that, depending on the result of that comparison, determines matching boxes using Intersection over Union (IOU) values between the detection boxes of the first model and those of the second model, and selects the final detection boxes.

According to embodiments of the present invention, object detection accuracy can be improved by adjusting the object detection boxes using a plurality of models trained on complementary data. In addition, because the algorithm is not time-consuming, accuracy can be improved while searching for the best combination of models, and many objects in a single image can be detected accurately.

Figure 1 is a flowchart illustrating an object detection method using a model ensemble according to an embodiment of the present invention.
Figure 2 is a diagram illustrating an object detection process using a model ensemble according to an embodiment of the present invention.
Figure 3 is a diagram showing the configuration of an object detection apparatus using a model ensemble according to an embodiment of the present invention.
Figure 4 is a diagram showing a camera installed for image acquisition according to an embodiment of the present invention.
Figure 5 is a diagram showing detection boxes obtained through the modeling unit according to an embodiment of the present invention.
Figure 6 is a diagram showing the detection boxes obtained with the first model and with the second model according to an embodiment of the present invention.
Figure 7 is a diagram illustrating the process of comparing IOU values between the detection boxes of the first model and those of the second model according to an embodiment of the present invention.
Figure 8 is a diagram illustrating the process of selecting the final detection boxes according to an embodiment of the present invention.
Figure 9 is a diagram showing an algorithm for adjusting the confidence value of a detection box according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

Figure 1 is a flowchart illustrating an object detection method using a model ensemble according to an embodiment of the present invention.

The proposed object detection method using a model ensemble includes: a step (110) of acquiring an image by photographing a plurality of objects with a camera, and obtaining, by the modeling unit, detection boxes for the objects in the image from each of the first model and the second model; a step (120) of comparing, by the detection box comparison unit, the number of detection boxes of the first model and the number of detection boxes of the second model each with the number of objects in the image; and a step (130) of determining, depending on the result of that comparison, matching boxes by the detection box adjustment unit using Intersection over Union (IOU) values between the detection boxes of the first model and those of the second model, and selecting the final detection boxes.

In step 110, an image is acquired by photographing a plurality of objects with a camera, and the modeling unit obtains, from the image, detection boxes for the objects in the image from each of the first model and the second model. For example, the objects may be photographed with a top-view camera that shoots downward from above. Detection boxes are then obtained through the modeling unit for the images acquired from the installed camera.

Here, the modeling unit may obtain detection boxes using a plurality of detection models. For example, detection boxes for the objects in the image may be obtained from each of the first model and the second model. The plurality of detection models for the model ensemble according to an embodiment of the present invention includes the first model and the second model but is not limited to them; a larger number of detection models may be ensembled to adjust the object detection boxes and thereby improve object detection accuracy.

In step 120, the detection box comparison unit compares the number of detection boxes of the first model and the number of detection boxes of the second model each with the number of objects in the image.

In step 130, depending on the result of that comparison, the detection box adjustment unit determines matching boxes using Intersection over Union (IOU) values between the detection boxes of the first model and those of the second model, and selects the final detection boxes.

If the number of detection boxes of either the first model or the second model equals the number of objects in the image, that model is selected as the model for object detection, and its detection boxes are used to detect the objects in the image.

On the other hand, if neither the number of detection boxes of the first model nor that of the second model equals the number of objects in the image, the detection boxes of each model are sorted in descending order of confidence value, and the detection boxes of the first model are compared with those of the second model using IOU values.

After the matching boxes have been determined, the remaining detection boxes of the first and second models are added to the final detection boxes in descending order of confidence value until the number of final detection boxes equals the number of objects in the image. The object detection process using the model ensemble is described in more detail with reference to Figure 2.

Figure 2 is a diagram illustrating an object detection process using a model ensemble according to an embodiment of the present invention.

According to an embodiment of the present invention, an image is first acquired by photographing a plurality of objects with a camera. The modeling unit of the proposed object detection apparatus obtains, from the image, detection boxes for the objects in the image from each of the first model (Model A) and the second model (Model B) (210). In other words, the detection boxes (A_box) are obtained with the first model (Model A), and the detection boxes (B_box) are obtained with the second model (Model B).

The detection box comparison unit then compares the number of detection boxes of the first model and the number of detection boxes of the second model each with the number of objects in the image.

More specifically, the size of A_box, that is, the number of detection boxes obtained with the first model (Model A), is compared with the number of objects in the image (no_object) (211). If the number of detection boxes (A_box) of the first model equals no_object, the first model (Model A) is selected as the model for object detection, and its detection boxes (A_box) are returned to detect the objects in the image (212).

If the number of detection boxes (A_box) of the first model does not equal no_object, the size of B_box, that is, the number of detection boxes obtained with the second model (Model B), is compared with no_object (213). If the number of detection boxes (B_box) of the second model equals no_object, the second model (Model B) is selected as the model for object detection, and its detection boxes (B_box) are returned to detect the objects in the image (214).
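The count check in steps 211 to 214 can be sketched as follows. This is a minimal illustration, assuming that `no_object` is known in advance and that each model's output is a plain sequence of detections; the function name `select_by_count` is hypothetical.

```python
from typing import Optional, Sequence


def select_by_count(a_box: Sequence, b_box: Sequence, no_object: int) -> Optional[Sequence]:
    """Return one model's boxes if its box count equals no_object, else None."""
    if len(a_box) == no_object:  # steps 211-212: Model A is checked first
        return a_box
    if len(b_box) == no_object:  # steps 213-214: Model B is checked next
        return b_box
    return None  # neither count matches; IOU-based adjustment is needed
```

Because Model A is checked first, its boxes are returned when both models produce exactly `no_object` boxes, which mirrors the order of the steps in the text.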

If neither the number of detection boxes (A_box) of the first model (Model A) nor the number of detection boxes (B_box) of the second model (Model B) equals the number of objects in the image (no_object), A_box and B_box are each sorted in descending order of confidence value (121).
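The sorting step might look like the sketch below. The `Detection` tuple with `box` and `confidence` fields is an assumed representation for illustration; the patent does not specify a data structure.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical container for one detection box and its confidence value."""
    box: tuple[float, float, float, float]
    confidence: float


def sort_by_confidence(detections: list[Detection]) -> list[Detection]:
    """Return a new list ordered from highest to lowest confidence."""
    return sorted(detections, key=lambda d: d.confidence, reverse=True)
```

Sorting each model's list separately means that, in the later IOU comparison, the most confident boxes are considered first.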

Next, the parameters for repeatedly comparing the detection boxes of the first model (A_box) with those of the second model (B_box) using IOU values are initialized (122).

The number of matching boxes (matched_boxes), that is, detection boxes of the first model (Model A) and the second model (Model B) judged to have detected the same object, is set to 0, and the IOU comparison iteration counter (i) is set to 1. The counter i is compared with the number of detection boxes (A_box) of the first model (231); if i is smaller, j is set to 1 (241). Here j is the iteration counter for the IOU comparisons performed while i is smaller than the number of first-model detection boxes. The value of j is then compared with the number of detection boxes (B_box) of the second model (242); if j is greater, j is incremented by 1 (247).

If j is smaller, the IOU values between the detection boxes of the first model (A_box) and those of the second model (B_box) are compared, and the largest IOU value is stored as max_iou (243). max_iou is then compared with the preset iou_thresh value (244). Here iou_thresh is the threshold for deciding whether a detection box of the first model (A_box) and a detection box of the second model (B_box) have detected the same object. If max_iou is greater than iou_thresh, matched_boxes is incremented by 1, the box is determined to be a matching box, and it is selected as a final detection box (c_box) for detecting objects in the image (245). j is then incremented by 1, and the process repeats from step 242.
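The matching loop can be sketched in Python as below. Several points are assumptions made for illustration rather than details confirmed by the source: the counters i and j are read as indices over A_box and B_box, detections are `(box, confidence)` tuples with `(x1, y1, x2, y2)` boxes, the first model's box is kept as the matching box, a matched B box is removed so it cannot match twice, and the default `iou_thresh` of 0.5 is a placeholder value.

```python
Box = tuple[float, float, float, float]
Detection = tuple[Box, float]  # (box, confidence), an assumed representation


def iou(a: Box, b: Box) -> float:
    """Overlap area divided by union area for (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_boxes(a_box: list[Detection], b_box: list[Detection], iou_thresh: float = 0.5):
    """Return (c_box, unmatched A boxes, unmatched B boxes)."""
    c_box: list[Detection] = []
    rest_a: list[Detection] = []
    rest_b = list(b_box)
    for a in a_box:  # outer counter i over Model A boxes
        best_j, max_iou = -1, 0.0
        for j, b in enumerate(rest_b):  # inner counter j over Model B boxes
            value = iou(a[0], b[0])
            if value > max_iou:
                best_j, max_iou = j, value
        if best_j >= 0 and max_iou > iou_thresh:
            c_box.append(a)      # assumption: keep Model A's box for the match
            rest_b.pop(best_j)   # assumption: a B box can match only once
        else:
            rest_a.append(a)
    return c_box, rest_a, rest_b
```

Here `len(c_box)` plays the role of matched_boxes. The unmatched lists are returned so that a later step can draw on them when fewer matches than objects are found.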

Returning to step 231, if the iteration counter (i) is greater than the number of detection boxes (A_box) of the first model, the number of matching boxes (matched_boxes) is compared with the number of objects in the image (no_object) (232).

If matched_boxes is smaller than no_object, the detection boxes of the first model (A_box) and of the second model (B_box) that remain after the matching boxes have been determined are added to the final detection boxes (c_box) in descending order of confidence value, and matched_boxes is incremented by 1 (233).

It is then determined whether the number of remaining detection boxes of the first model (A_box) and the second model (B_box) is 0 (234). If it is 0, the final detection boxes (c_box) are returned and used to detect the objects in the image (235). If it is not 0, the process repeats from step 232.
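Steps 232 to 235 can be sketched as a loop that tops up the final list from the leftover boxes. The sketch assumes `(box, confidence)` tuples, assumes that the leftovers of both models are pooled into one list before sorting, and assumes that the loop also stops once the count reaches `no_object`; the function name `fill_remaining` is hypothetical.

```python
def fill_remaining(c_box: list, rest_a: list, rest_b: list, no_object: int) -> list:
    """Add leftover boxes by descending confidence until no_object is reached."""
    # Assumption: leftovers from both models compete in one pooled ranking.
    leftovers = sorted(rest_a + rest_b, key=lambda d: d[1], reverse=True)
    final = list(c_box)
    for det in leftovers:
        if len(final) >= no_object:  # step 232: enough boxes already
            break
        final.append(det)            # step 233: add the next most confident box
    return final                     # step 235: leftovers exhausted or count met
```

If the leftovers run out first, the returned list is shorter than `no_object`, which corresponds to the branch where the remaining count reaches 0 at step 234.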

Figure 3 is a diagram showing the configuration of an object detection apparatus using a model ensemble according to an embodiment of the present invention.

The proposed object detection apparatus 300 using a model ensemble includes a modeling unit 310, a detection box comparison unit 320, and a detection box adjustment unit 330.

모델링부(310)는 카메라를 통해 복수의 객체를 촬영한 영상으로부터 제1 모델 및 제2 모델을 이용하여 각각의 모델 별로 영상 내 객체들을 탐지하기 위한 탐지 박스들을 획득한다. 예를 들어, 카메라를 통해 복수의 객체를 촬영하기 위해 위에서 아래를 촬영하는 탑-뷰(top-view) 카메라를 통해 촬영할 수 있다. 설치된 카메라로부터 획득한 영상에 대해 모델링부를 통해 탐지 박스들을 획득할 수 있다. The modeling unit 310 uses a first model and a second model from images of a plurality of objects captured through a camera to obtain detection boxes for detecting objects in the image for each model. For example, in order to photograph multiple objects through a camera, photographing can be done through a top-view camera that photographs from above and below. Detection boxes can be obtained through the modeling unit for images acquired from installed cameras.

이때, 모델링부(310)는 복수의 탐지 모델을 이용하여 탐지 박스들을 획득할 수 있다. 예를 들어, 제1 모델 및 제2 모델을 이용하여 각각의 모델 별로 영상 내 객체들을 탐지하기 위한 탐지 박스들을 획득할 수 있다. 본 발명의 실시예에 따른 모델 앙상블을 위한 복수의 탐지 모델은 제1 모델 및 제2 모델을 포함하지만, 이에 한정되지 않고, 더 많을 수의 탐지 모델을 이용하여 앙상블을 통해 객체 탐지 박스를 조정함으로써 객체 탐지 정확도를 향상시킬 수 있다. At this time, the modeling unit 310 may obtain detection boxes using a plurality of detection models. For example, detection boxes for detecting objects in an image can be obtained for each model using the first model and the second model. A plurality of detection models for a model ensemble according to an embodiment of the present invention includes, but is not limited to, a first model and a second model, and adjusts the object detection box through the ensemble using a larger number of detection models. Object detection accuracy can be improved.

탐지 박스 비교부(320)는 획득된 제1 모델의 탐지 박스의 수 및 제2 모델의 탐지 박스의 수 각각을 영상 내 객체의 수와 비교한다. The detection box comparison unit 320 compares the number of detection boxes of the first model and the number of detection boxes of the second model with the number of objects in the image.

탐지 박스 조정부(330)는 제1 모델의 탐지 박스의 수 및 제2 모델의 탐지 박스의 수 각각을 영상 내 객체의 수와 비교한 결과에 따라 제1 모델의 탐지 박스들 및 제2 모델의 탐지 박스들 각각에 대한 IOU(intersection Over Union) 값을 이용하여 매칭 박스를 결정하고, 최종 탐지 박스를 선정한다. The detection box adjuster 330 detects the detection boxes of the first model and the second model according to the results of comparing each of the number of detection boxes of the first model and the number of detection boxes of the second model with the number of objects in the image. The matching box is determined using the IOU (intersection over union) value for each box, and the final detection box is selected.

탐지 박스 조정부(330)는 제1 모델의 탐지 박스의 수 또는 제2 모델의 탐지 박스의 수 중 어느 하나의 모델의 탐지 박스의 수가 영상 내 객체의 수와 같을 경우, 해당 모델을 객체 탐지를 위한 모델로 선택하고, 해당 모델의 탐지 박스를 이용하여 영상 내 객체를 탐지한다. When the number of detection boxes of either the number of detection boxes of the first model or the number of detection boxes of the second model is equal to the number of objects in the image, the detection box adjuster 330 uses the model for object detection. Select it as a model and detect objects in the image using the detection box of the model.
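The count-based selection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_by_count` and the tuple layout `(x1, y1, x2, y2, confidence)` are assumptions introduced here, and the first model is checked before the second, following the order in which the description later compares A_box and B_box.

```python
from typing import List, Optional, Tuple

# Assumed box layout for illustration: (x1, y1, x2, y2, confidence).
Box = Tuple[float, float, float, float, float]


def select_by_count(
    a_box: List[Box], b_box: List[Box], no_object: int
) -> Optional[List[Box]]:
    """Return the boxes of the first model whose box count equals no_object.

    Returns None when neither count matches, which is the case where the
    IOU-based adjustment is needed.
    """
    if len(a_box) == no_object:
        return a_box
    if len(b_box) == no_object:
        return b_box
    return None
```

A return value of `None` signals that neither model produced exactly `no_object` boxes, so the IOU comparison described next would apply.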

On the other hand, when neither the number of detection boxes of the first model nor the number of detection boxes of the second model is equal to the number of objects in the image, the detection box adjustment unit 330 sorts the detection boxes of the first model and the detection boxes of the second model in descending order of confidence value, and compares the detection boxes of the first model with those of the second model using IOU (Intersection over Union) values.

The detection box adjustment unit 330 judges detection boxes of the first model and the second model whose compared IOU value is greater than or equal to a predetermined threshold to be detection boxes of the same object, determines them to be a matching box, and selects that matching box as a final detection box for detecting the objects in the image.

The detection box adjustment unit 330 then adds the detection boxes of the first model and the second model that remain after the matching boxes have been determined to the final detection boxes, in descending order of confidence value, until the number of final detection boxes equals the number of objects in the image.
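The fill step in the preceding paragraph can be sketched as below. The helper name `fill_remaining` and the `(x1, y1, x2, y2, confidence)` tuple layout are assumptions made for illustration; the sketch also stops quietly if the leftover boxes run out before the object count is reached.

```python
from typing import List, Tuple

# Assumed box layout for illustration: (x1, y1, x2, y2, confidence).
Box = Tuple[float, float, float, float, float]


def fill_remaining(
    c_box: List[Box], leftovers: List[Box], no_object: int
) -> List[Box]:
    """Append leftover boxes, highest confidence first, until no_object is reached."""
    result = list(c_box)  # copy so the caller's list is not modified
    for box in sorted(leftovers, key=lambda b: b[4], reverse=True):
        if len(result) >= no_object:
            break
        result.append(box)
    return result
```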

Hereinafter, the object detection process using a model ensemble according to an embodiment of the present invention is described in more detail with reference to FIGS. 4 to 8. To explain the proposed object detection method and apparatus in detail, their application to detecting objects (that is, pigs) in a pig pen is described as an example. Object detection in a pig pen is only one embodiment; the proposed method and apparatus are not limited to it and can be applied to object detection in various fields.

FIG. 4 is a diagram showing a camera installed for image acquisition according to an embodiment of the present invention.

The camera according to an embodiment of the present invention may be installed as a top-view camera that shoots downward from above in order to capture images of the pigs in the pig pen. From the image acquired by the camera installed in the pig pen, the modeling unit according to an embodiment of the present invention obtains detection boxes for detecting the objects in the image for each of the first model and the second model.

According to an embodiment of the present invention, the detection boxes may be obtained from the image using YOLOv4. YOLOv4 achieves high accuracy relative to its processing speed on MS COCO, a representative public object detection dataset. Thanks to the continued development of deep learning based on convolutional neural networks, YOLOv4 detects most pigs accurately; however, because of factors such as the complex structure of the pig pen, its output still includes boxes that detect fake pigs belonging to the background (false positives, FP). If boxes with low confidence values are simply removed in order to eliminate these fake pigs, another problem arises: other boxes with low confidence values, such as those of occluded pigs, are removed as well (false negatives, FN).

To solve these problems, the present invention improves object detection accuracy by adjusting the object detection boxes using a plurality of models trained on mutually complementary data.

FIG. 5 is a diagram showing detection boxes obtained through the modeling unit according to an embodiment of the present invention.

An image is acquired by photographing a plurality of objects with the camera, and the modeling unit obtains, from the image, detection boxes for detecting the objects in the image for each of the first model and the second model. The modeling unit may obtain the detection boxes using a plurality of detection models, for example the first model and the second model.

Referring to FIG. 5, the number of objects in the image is 9 and the number of detection boxes is likewise 9. When the number of detection boxes of either the first model or the second model equals the number of objects in the image, that model may be selected as the model for object detection, and the objects in the image may be detected using its detection boxes.

FIG. 6 is a diagram showing the detection boxes obtained using the first model and the second model, respectively, according to an embodiment of the present invention.

Unlike FIG. 5, in FIG. 6A the number of objects in the image is 9 but the number of detection boxes is 8. FIG. 6B shows the detection boxes obtained using the first model, and FIG. 6C shows the detection boxes obtained using the second model.

The detection box 611 of FIG. 6B and the detection box 612 of FIG. 6C may be errors caused by the overlap of two objects.

In such a case, where neither the number of detection boxes of the first model nor the number of detection boxes of the second model equals the number of objects in the image, the detection boxes of the first model and of the second model may each be sorted in descending order of confidence value, and the detection boxes of the first model may be compared with those of the second model using IOU (Intersection over Union) values.

Detection boxes of the first model and the second model whose compared IOU value is greater than or equal to a predetermined threshold (iou_thresh) may be judged to be detection boxes of the same object and determined to be a matching box, and that matching box may be selected as a final detection box for detecting the objects in the image.

FIG. 7 is a diagram for explaining the process of comparing IOU values between the detection boxes of the first model and the detection boxes of the second model according to an embodiment of the present invention.

FIG. 7A shows the detection boxes obtained using the first model, and FIG. 7B shows the detection boxes obtained using the second model.

The IOU value of the detection box 711 of FIG. 7A and the detection box 712 of FIG. 7B is 0.99. This value is compared with the predetermined threshold (iou_thresh), which is the threshold for determining whether a detection box of the first model (A_box) and a detection box of the second model (B_box) detect the same object. In this embodiment of the present invention, iou_thresh was set to 0.9. Because the IOU value of the detection box 711 of FIG. 7A and the detection box 712 of FIG. 7B is 0.99, the two detection boxes can be judged to detect the same object. Accordingly, the detection box may be determined to be a matching box, and the matching box may be selected as a final detection box for detecting the objects in the image.
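For reference, the IOU comparison in this example can be sketched as follows. The corner format `(x1, y1, x2, y2)`, the function names `iou` and `same_object`, and the sample coordinates are assumptions chosen so that the result reproduces the value 0.99 from the text; they are not taken from the figures.

```python
from typing import Tuple

# Assumed axis-aligned box in corner format: (x1, y1, x2, y2).
Rect = Tuple[float, float, float, float]


def iou(a: Rect, b: Rect) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def same_object(a: Rect, b: Rect, iou_thresh: float = 0.9) -> bool:
    """Treat two boxes as the same object when their IOU reaches iou_thresh."""
    return iou(a, b) >= iou_thresh


# Hypothetical coordinates: intersection 9900, union 10000, so IOU = 0.99.
box_a = (0.0, 0.0, 100.0, 100.0)
box_b = (0.0, 0.0, 100.0, 99.0)
print(iou(box_a, box_b), same_object(box_a, box_b))
```

The sketch uses "greater than or equal to" for the threshold test, following the wording of the apparatus description.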

The IOU values are compared for the remaining detection boxes as well. Detection boxes whose compared IOU value is greater than or equal to the predetermined threshold are judged to be detection boxes of the same object and determined to be a matching box, and that matching box is selected as a final detection box for detecting the objects in the image.

In contrast, the IOU value of the detection box 721 of FIG. 7B and the detection box 722 of FIG. 7B is smaller than the predetermined threshold (iou_thresh). In this case, the detection boxes of the first model and the second model that remain after the matching boxes have been determined are added to the final detection boxes, in descending order of confidence value, until the number of final detection boxes equals the number of objects in the image.

FIG. 8 is a diagram for explaining the process of selecting the final detection boxes according to an embodiment of the present invention.

FIG. 8 shows the result of adding the detection box 811 of the first model and the detection box 812 of the second model, which remain after the matching boxes have been determined, to the final detection boxes (821, 822) in descending order of confidence value until the number of final detection boxes equals the number of objects in the image.

In this way, object detection accuracy can be improved by combining the detection box results of two models trained on mutually complementary data so that each compensates for the detection box errors of the other.

FIG. 9 is a diagram showing an algorithm for adjusting the confidence values of detection boxes according to an embodiment of the present invention.

According to an embodiment of the present invention, an image is first acquired by photographing a plurality of objects with a camera. The modeling unit of the proposed object detection apparatus using a model ensemble obtains, from the image, detection boxes for detecting the objects in the image for each of the first model (Model A) and the second model (Model B). In other words, the detection boxes (A_box) are obtained using the first model (Model A), and the detection boxes (B_box) are obtained using the second model (Model B).

More specifically, the size of A_box (that is, the number of detection boxes obtained using the first model) is compared with the number of objects in the image (no_object). If the number of detection boxes of the first model (A_box) is equal to no_object, the first model (Model A) is selected as the model for object detection, and its detection boxes (A_box) are returned to detect the objects in the image.

If the number of detection boxes of the first model (A_box) is not equal to no_object, the size of B_box (that is, the number of detection boxes obtained using the second model) is compared with no_object. If the number of detection boxes of the second model (B_box) is equal to no_object, the second model (Model B) is selected as the model for object detection, and its detection boxes (B_box) are returned to detect the objects in the image.

The number of matching boxes (matched_boxes), that is, the number of detection boxes of the first model (Model A) and the second model (Model B) that are judged to detect the same object, is set to 0, and the IOU comparison iteration count (i) is set to 1. The iteration count (i) is compared with the number of detection boxes of the first model (A_box), and if it is smaller, j is set to 1. Here, j denotes the inner IOU comparison iteration count used while i is smaller than the number of detection boxes of the first model (A_box). The value of j is then compared with the number of detection boxes of the second model (B_box), and if it is greater, j is incremented by 1.

If, on the other hand, it is smaller, the IOU values between the detection boxes of the first model (A_box) and the detection boxes of the second model (B_box) are compared, and the IOU value of the detection box having the largest value is set as max_iou. The max_iou value is compared with the preset iou_thresh value, which is the threshold for determining whether a detection box of the first model (A_box) and a detection box of the second model (B_box) detect the same object. If max_iou is greater than iou_thresh, the number of matching boxes (matched_boxes) is incremented by 1, the box is determined to be a matching box, and the matching box is selected as a final detection box (c_box) for detecting the objects in the image. The value of j is then incremented by 1, and the process is repeated from the corresponding step.

Returning to the step that compares the iteration count (i) with the number of detection boxes of the first model (A_box): if i is greater, the number of matching boxes (matched_boxes) is compared with the number of objects in the image (no_object).

If the number of matching boxes (matched_boxes) is smaller than the number of objects in the image (no_object), the detection boxes of the first model (A_box) and of the second model (B_box) that remain after the matching boxes have been determined are added to the final detection boxes (c_box) in descending order of confidence value, and the number of matching boxes (matched_boxes) is incremented by 1.

It is then determined whether the number of detection boxes of the first model (A_box) and of the second model (B_box) remaining after the matching boxes have been determined is 0. If it is 0, the final detection boxes (c_box) are returned and object detection in the image is performed using them. If it is not 0, the process is repeated from the corresponding step.
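The overall flow described for FIG. 9 can be summarized in a short sketch. It is written under several assumptions that the text does not fix: boxes are tuples `(x1, y1, x2, y2, confidence)`, the function name `ensemble_boxes` is hypothetical, each box of the second model is matched at most once, the higher-confidence box of a matched pair is the one kept, and the threshold test uses "greater than or equal to". Python iteration replaces the explicit counters i and j.

```python
from typing import List, Tuple

# Assumed box layout for illustration: (x1, y1, x2, y2, confidence).
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ensemble_boxes(
    a_box: List[Box], b_box: List[Box], no_object: int, iou_thresh: float = 0.9
) -> List[Box]:
    """Combine the boxes of two models given a known object count."""
    # Case 1: one model already has exactly no_object boxes.
    if len(a_box) == no_object:
        return list(a_box)
    if len(b_box) == no_object:
        return list(b_box)

    # Case 2: sort by confidence (descending) and match boxes by IOU.
    a_sorted = sorted(a_box, key=lambda b: b[4], reverse=True)
    b_left = sorted(b_box, key=lambda b: b[4], reverse=True)
    a_left: List[Box] = []
    c_box: List[Box] = []
    for a in a_sorted:
        best_j, max_iou = -1, 0.0
        for j, b in enumerate(b_left):
            value = iou(a, b)
            if value > max_iou:
                best_j, max_iou = j, value
        if best_j >= 0 and max_iou >= iou_thresh:
            b = b_left.pop(best_j)
            # Assumption: keep the higher-confidence box of the matched pair.
            c_box.append(a if a[4] >= b[4] else b)
        else:
            a_left.append(a)

    # Fill with unmatched boxes, highest confidence first, up to no_object.
    for box in sorted(a_left + b_left, key=lambda b: b[4], reverse=True):
        if len(c_box) >= no_object:
            break
        c_box.append(box)
    return c_box
```

Under these assumptions the returned list holds the matched boxes first, followed by unmatched boxes in descending order of confidence, and never grows beyond `no_object` during the fill step.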

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described as being used, but a person of ordinary skill in the art will recognize that a processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, a processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or if components of the described systems, structures, devices, circuits, and the like are combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

An object detection method comprising:
acquiring an image by photographing a plurality of objects with a camera, and obtaining, by a modeling unit, detection boxes for detecting objects in the image from the image for each of a first model and a second model;
comparing, through a detection box comparison unit, each of the number of obtained detection boxes of the first model and the number of obtained detection boxes of the second model with the number of objects in the image; and
determining, through a detection box adjustment unit and according to the result of the comparison, a matching box using the IOU (Intersection over Union) values between the detection boxes of the first model and the detection boxes of the second model, and selecting, using the matching box, a final detection box for detecting the objects in the image.

The object detection method of claim 1,
wherein the step of determining the matching box using the IOU values and selecting the final detection box for detecting the objects in the image using the matching box comprises:
when the number of detection boxes of either the first model or the second model is equal to the number of objects in the image,
selecting that model as the model for object detection and detecting the objects in the image using the detection boxes of that model.

The object detection method of claim 1,
wherein the step of determining the matching box using the IOU values and selecting the final detection box for detecting the objects in the image using the matching box comprises:
when neither the number of detection boxes of the first model nor the number of detection boxes of the second model is equal to the number of objects in the image,
sorting the detection boxes of the first model and the detection boxes of the second model in descending order of confidence value, and comparing the detection boxes of the first model with the detection boxes of the second model using IOU (Intersection over Union) values.

The object detection method of claim 3,
wherein detection boxes of the first model and the second model whose compared IOU value is greater than or equal to a predetermined threshold are judged to be detection boxes of the same object and determined to be a matching box, and the matching box is selected as a final detection box for detecting the objects in the image.

According to paragraph 4,
After determining the matching box, the remaining detection boxes of the first model and the detection boxes of the second model are ordered in order of reliability, and are used to detect objects in the image until the number of objects in the image is equal to the final detection box. to add
Object detection method.
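
The fill-up step in this claim can be sketched as follows, assuming the matched boxes and the unmatched boxes of both models are already available as lists. The function name `fill_final_boxes` and the `(x1, y1, x2, y2, confidence)` box layout are assumptions for the example.

```python
from typing import List, Tuple

# Assumed box layout: (x1, y1, x2, y2, confidence).
Box = Tuple[float, float, float, float, float]


def fill_final_boxes(
    final_boxes: List[Box], remaining_boxes: List[Box], num_objects: int
) -> List[Box]:
    """Add unmatched boxes in descending confidence order.

    Stops as soon as the number of final boxes equals num_objects, or
    when no unmatched boxes are left.
    """
    result = list(final_boxes)
    ranked = sorted(remaining_boxes, key=lambda box: box[4], reverse=True)
    for box in ranked:
        if len(result) >= num_objects:
            break
        result.append(box)
    return result
```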

An object detection device comprising:
a modeling unit that obtains, for each of a first model and a second model, detection boxes for detecting objects in an image captured of a plurality of objects through a camera;
a detection box comparison unit that compares each of the obtained number of detection boxes of the first model and the number of detection boxes of the second model with the number of objects in the image; and
a detection box adjustment unit that, according to the result of comparing each of the number of detection boxes of the first model and the number of detection boxes of the second model with the number of objects in the image, determines matching boxes using the intersection over union (IOU) values for the detection boxes of the first model and the detection boxes of the second model, and selects the final detection boxes for detecting the objects in the image using the matching boxes.
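
The three units of the device can be pictured as pluggable components wired in sequence. The sketch below is only a structural illustration: the class `ObjectDetectionDevice`, the callable signatures, and the helper `compare_counts` are hypothetical, and the adjustment logic itself is left to whatever callable is supplied.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Assumed box layout: (x1, y1, x2, y2, confidence).
Box = Tuple[float, float, float, float, float]
Comparison = Tuple[bool, bool]  # (first model matches, second model matches)


def compare_counts(boxes_a: List[Box], boxes_b: List[Box], n: int) -> Comparison:
    """Comparison unit: does each model's box count equal n?"""
    return len(boxes_a) == n, len(boxes_b) == n


@dataclass
class ObjectDetectionDevice:
    # Modeling unit: image -> (first model boxes, second model boxes).
    modeling_unit: Callable[[Any], Tuple[List[Box], List[Box]]]
    # Adjustment unit: picks the final boxes from the comparison result.
    adjustment_unit: Callable[[List[Box], List[Box], Comparison, int], List[Box]]
    # Comparison unit: defaults to the simple count check above.
    comparison_unit: Callable[[List[Box], List[Box], int], Comparison] = compare_counts

    def detect(self, image: Any, num_objects: int) -> List[Box]:
        boxes_a, boxes_b = self.modeling_unit(image)
        result = self.comparison_unit(boxes_a, boxes_b, num_objects)
        return self.adjustment_unit(boxes_a, boxes_b, result, num_objects)
```

Keeping the units as separate callables mirrors the claim structure, so any pair of detectors can be supplied as the modeling unit without changing the rest of the pipeline.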

According to claim 6,
The detection box adjustment unit,
If the number of detection boxes of either the first model or the second model is equal to the number of objects in the image,
Selects that model as the model for object detection and uses the detection boxes of that model to detect the objects in the image.
Object detection device.

According to claim 6,
The detection box adjustment unit,
If neither the number of detection boxes of the first model nor the number of detection boxes of the second model is equal to the number of objects in the image,
Sorts the detection boxes of the first model and the detection boxes of the second model each in descending order of confidence value, and compares the detection boxes of the first model with the detection boxes of the second model using intersection over union (IOU) values
Object detection device.

According to claim 8,
The detection box adjustment unit,
Judges a detection box of the first model and a detection box of the second model whose IOU value with each other is more than a predetermined threshold to be detection boxes that detected the same object, determines them as a matching box, and selects the matching box as a final detection box for detecting the objects in the image
Object detection device.

According to claim 9,
The detection box adjustment unit,
After determining the matching boxes, adds the remaining detection boxes of the first model and the second model, in descending order of confidence, to the final detection boxes for detecting the objects in the image until the number of final detection boxes is equal to the number of objects in the image
Object detection device.