KR20220079464A - Method for detecting object and device performing the same

Info

Publication number: KR20220079464A
Application number: KR1020210171153A
Authority: KR
Inventors: 하영국; 박호림; 황성연
Original assignee: 건국대학교 산학협력단
Priority date: 2020-12-04
Filing date: 2021-12-02
Publication date: 2022-06-13
Anticipated expiration: 2041-12-02
Also published as: KR102804923B1

Abstract

Disclosed are an object detection method and an apparatus therefor. A detection method according to various embodiments includes an operation of generating a first image in which one or more objects are detected from an object image, and an operation of generating, based on a penalty score, a second image having a lower false positive rate (FSR) than the first image, wherein the penalty score may be generated based on the false positive rate of each of one or more classes corresponding to the one or more objects.

Description

Object detection method and device therefor

Various embodiments of the present invention relate to an object detection method and an apparatus therefor.

In the field of computer vision, advances in deep learning, including convolutional neural networks and backpropagation-based training, have driven a great deal of research and progress. Among computer vision technologies, object classification and object detection are among those to which deep learning is most actively applied. Object classification is a technology for identifying the types of objects present in an image. Object detection is a technology that integrates object classification, which identifies the type of an object, with object localization, which outputs the location information of the object.

Object detection is used even in fields such as autonomous driving and medical imaging, where the decisions of a deep learning system are closely tied to human life, so improving the performance of object detection is very important.

The background art described above was possessed or acquired by the inventors in the course of deriving the present disclosure, and is not necessarily known art disclosed to the general public before the filing of the present application.

In a deep learning system, a false positive may be a case in which something the system predicts to be correct is in fact false. Because a deep learning system assigns high confidence scores to data it has not been trained on, it may output a positive result for such data (a false positive). In fields where the decisions of a deep learning system are closely tied to human life, such as autonomous driving and medical imaging, a false positive may lead to a serious accident. For example, if no vehicle is present in front of an autonomous vehicle but the object detection system in the autonomous vehicle falsely detects one, the autonomous vehicle may brake suddenly and endanger the life of the driver. Accordingly, a technique for detecting objects robustly against false positives may be required.

Various embodiments may provide a technique for detecting objects robustly against false positives based on a penalty score derived from false positive rates.

However, the technical problems are not limited to those described above, and other technical problems may exist.

A detection method according to various embodiments includes an operation of generating a first image in which one or more objects are detected from an object image, and an operation of generating, based on a penalty score, a second image having a lower false positive rate (FSR) than the first image, wherein the penalty score may be generated based on the false positive rate of each of one or more classes corresponding to the one or more objects.

The operation of generating the first image may include an operation of generating a feature map by inputting the object image to a classification model that classifies the one or more objects in the object image, and an operation of generating the first image and one or more first inference indices corresponding to the first image by inputting the feature map to a detection model that detects the one or more objects in the feature map.

The classification model may be trained based on first training data related to the objects to be detected by the detection model and second training data unrelated to the objects to be detected.

The second training data may be labeled based on one or more classes included in the first training data.

The operation of generating the second image may include an operation of generating the penalty score based on validation data, and an operation of generating, based on the penalty score, the second image and one or more second inference indices corresponding to the second image, wherein the validation data may be a combination of the first training data and the second training data.

The second inference index may be a value obtained by multiplying the first inference index by the difference between the false positive rate of a class and the average of the false positive rates.

The penalty score may be generated for a class whose false positive rate exceeds the average of the false positive rates.

An apparatus according to various embodiments includes a memory containing instructions and a processor electrically connected to the memory and configured to execute the instructions, wherein, when the instructions are executed by the processor, the processor generates a first image in which one or more objects are detected from an object image and generates, based on a penalty score, a second image having a lower false positive rate (FSR) than the first image, and the penalty score may be generated based on the false positive rate of each of one or more classes corresponding to the one or more objects.

The processor may generate a feature map by inputting the object image to a classification model that classifies the one or more objects in the object image, and may generate the first image and one or more first inference indices corresponding to the first image by inputting the feature map to a detection model that detects the one or more objects in the feature map.

The processor may generate the penalty score based on validation data and generate, based on the penalty score, the second image and one or more second inference indices corresponding to the second image, wherein the validation data may be a combination of the first training data and the second training data.

FIG. 1 is a diagram for describing a detection apparatus according to various embodiments.
FIG. 2 is a diagram for describing an operation in which a detection apparatus according to various embodiments generates a penalty score.
FIG. 3 is a diagram for describing an example of a false positive according to various embodiments.
FIG. 4 is a diagram for describing an operation in which a detection apparatus according to various embodiments detects objects robustly against false positives.
FIG. 5 is another example of a detection apparatus according to various embodiments.

Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only and may be modified and implemented in various forms. Accordingly, the actual implementation is not limited to the specific embodiments disclosed, and the scope of this specification includes modifications, equivalents, and substitutes that fall within the technical idea described through the embodiments.

Terms such as "first" and "second" may be used to describe various components, but these terms should be interpreted only as distinguishing one component from another. For example, a first component may be called a second component, and similarly a second component may be called a first component.

When a component is referred to as being "connected" to another component, it may be directly connected or coupled to the other component, but it should be understood that another component may exist in between.

A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" are intended to indicate that the described features, numbers, steps, operations, components, parts, or combinations thereof exist, and should be understood as not excluding in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless explicitly so defined in this specification.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. In the description with reference to the accompanying drawings, the same components are given the same reference numerals regardless of the drawing, and redundant descriptions thereof are omitted.

FIG. 1 is a diagram for describing a detection apparatus according to various embodiments.

Referring to FIG. 1, according to various embodiments, the detection apparatus 100 may include an object detection model 110 and a penalty model 150. For example, the object detection model 110 may be YOLOv3 and the classification model 111 may be Darknet53. The detection apparatus 100 may use each component (the object detection model 110 and the penalty model 150) to detect one or more objects in an object image robustly against false positives.

FIG. 1 illustrates a case in which the object detection model 110 and the penalty model 150 are implemented on the same hardware, but the embodiments are not limited thereto. Depending on the embodiment, the object detection model 110 and the penalty model 150 may also be implemented on separate hardware capable of communicating with each other.

According to various embodiments, the detection apparatus 100 may generate a first image in which one or more objects are detected from an object image, and may generate, based on a penalty score, a second image having a lower false positive rate than the first image. The detection apparatus 100 may generate the penalty score based on the false positive rate of each of one or more classes corresponding to the one or more objects.

According to various embodiments, the object detection model 110 may include a classification model 111 and a detection model 113. The object detection model 110 may perform a first-pass detection of one or more objects in the object image.

According to various embodiments, the classification model 111 may classify one or more objects in the object image and generate a feature map (e.g., by transforming the object image). The classification model 111 may be generated (e.g., trained) to receive first training data related to the objects to be detected by the detection model 113 (e.g., in-distribution data) and second training data unrelated to the objects to be detected (e.g., out-of-distribution data), and to output a first image in which one or more objects in the object image are classified. For example, the classification model 111 may be trained on the first training data and then additionally trained (fine-tuned) on the second training data. The second training data may be labeled based on one or more classes included in the first training data. For example, the second training data may be labeled such that the one or more classes included in the first training data are uniformly distributed over the one or more classes included in the second training data.
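The uniform labeling of the out-of-distribution data can be read in more than one way. The following is a minimal sketch of one possible reading, in which each out-of-distribution sample receives a soft label that is uniform over the in-distribution classes and the fine-tuning loss is the cross-entropy against that label. The function names and the choice of soft labels are assumptions made for illustration; the source does not specify them.

```python
import math


def uniform_soft_label(num_classes):
    """Soft label that spreads probability equally over all in-distribution classes.

    Assumption: this is one reading of "uniformly distributed" labeling.
    """
    if num_classes <= 0:
        raise ValueError("num_classes must be positive")
    return [1.0 / num_classes] * num_classes


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def uniform_label_loss(logits):
    """Cross-entropy between the model output and the uniform soft label.

    The loss is smallest when the model is equally unsure about every class,
    which discourages confident predictions on out-of-distribution inputs.
    """
    target = uniform_soft_label(len(logits))
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))
```

Under this reading, a model that outputs a confident class for an out-of-distribution image is penalized more than one that outputs a flat distribution.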

According to various embodiments, the detection model 113 may generate a first image in which one or more objects are detected in the feature map, together with one or more first inference indices. A first inference index is an indicator for determining the class corresponding to an object included in the feature map. When the first inference index exceeds a predetermined threshold, the detection model 113 may determine that an object of the corresponding class is included in the feature map, and thereby generate the first image. The first inference index may be the product of an object score and a class score. The object score may be the probability that an object exists at a specific location in the first image, and the class score may be the probability that the class to which the object belongs is a specific class.
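As a rough illustration of the first inference index, the sketch below multiplies an object score by each class score and keeps only the classes whose product exceeds a threshold. The `Candidate` type, the function names, and the threshold value of 0.5 are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """One candidate location produced by a detector (hypothetical structure)."""
    box: tuple                      # (x, y, width, height)
    object_score: float             # probability that an object is present
    class_scores: dict = field(default_factory=dict)  # class name -> probability


def first_inference_indices(candidate):
    """First inference index per class: object score times class score."""
    return {
        cls: candidate.object_score * score
        for cls, score in candidate.class_scores.items()
    }


def detect(candidates, threshold=0.5):
    """Keep (box, class, index) triples whose index exceeds the threshold."""
    detections = []
    for candidate in candidates:
        for cls, index in first_inference_indices(candidate).items():
            if index > threshold:
                detections.append((candidate.box, cls, index))
    return detections
```

For example, a candidate with an object score of 0.9 and a class score of 0.8 for "car" yields an index of about 0.72 and is kept at a threshold of 0.5.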

According to various embodiments, the penalty model 150 may generate, based on the penalty score, a second image having a lower false positive rate than the first image, together with a second inference index. The second inference index is an indicator for determining the class corresponding to an object included in the first image. When the second inference index exceeds a predetermined threshold, the penalty model 150 may determine that an object of the corresponding class is included in the second image, and thereby generate the second image. The second inference index may be a value obtained by multiplying the first inference index by the difference between the false positive rate of a class and the average of the false positive rates. The penalty model 150 may calculate the false positive rates, the average of the false positive rates, and the penalty score through the operations described below with reference to FIG. 2.
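The sketch below follows the sentence above literally: for a class whose false positive rate exceeds the average, the first inference index is multiplied by the difference between the two rates. Leaving the index unchanged for the remaining classes is an assumption based on the statement that the penalty score is generated only for classes above the average. The function name is hypothetical, and the source does not give a more detailed formula.

```python
def second_inference_index(first_index, class_fpr, mean_fpr):
    """Literal reading: (class FPR - mean FPR) * first inference index.

    Assumption: classes at or below the mean false positive rate receive
    no penalty, so their index is returned unchanged.
    """
    if class_fpr > mean_fpr:
        return (class_fpr - mean_fpr) * first_index
    return first_index
```

Because the difference between two rates is less than 1, a penalized class ends up with a smaller index and is therefore less likely to pass the threshold a second time.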

According to various embodiments, the penalty model 150 may generate a penalty score for a class whose false positive rate exceeds the average of the false positive rates. By applying the penalty score only to classes whose false positive rate exceeds the average, the penalty model 150 may efficiently reduce the false positive rate for the classes most likely to produce false positives.

FIG. 2 is a diagram for describing an operation in which a detection apparatus according to various embodiments generates a penalty score.

Referring to FIG. 2, according to various embodiments, the detection apparatus 100 may receive validation data and generate a penalty score for each class included in the validation data. The validation data may be a combination of the first training data and the second training data in a ratio of 1:5.
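A validation set with this composition could be assembled along the following lines. This is only a sketch: the function name, the random sampling, and the "in"/"out" tags are assumptions, and the source states only the 1:5 ratio.

```python
import random


def build_validation_set(in_dist, out_dist, ratio=(1, 5), seed=0):
    """Mix in-distribution and out-of-distribution samples at a fixed ratio.

    Uses as many samples as both pools allow while keeping the ratio exact.
    Each returned item is tagged with its origin ("in" or "out").
    """
    parts_in, parts_out = ratio
    units = min(len(in_dist) // parts_in, len(out_dist) // parts_out)
    rng = random.Random(seed)
    chosen_in = rng.sample(list(in_dist), units * parts_in)
    chosen_out = rng.sample(list(out_dist), units * parts_out)
    mixed = [(x, "in") for x in chosen_in] + [(x, "out") for x in chosen_out]
    rng.shuffle(mixed)
    return mixed
```

With 10 in-distribution samples and 30 out-of-distribution samples, this yields 6 of the former and 30 of the latter.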

According to various embodiments, the classification model 111 may classify one or more objects in the validation data and generate a feature map (e.g., by transforming the validation data). The classification model 111 may be generated (e.g., trained) to receive first training data related to the objects to be detected by the detection model 113 (e.g., in-distribution data) and second training data unrelated to the objects to be detected (e.g., out-of-distribution data), and to output a first image in which one or more objects in the object image are classified. For example, the classification model 111 may be trained on the first training data and then additionally trained (fine-tuned) on the second training data. The second training data may be labeled based on one or more classes included in the first training data. For example, the second training data may be labeled such that the one or more classes included in the first training data are uniformly distributed over the one or more classes included in the second training data.

According to various embodiments, the detection model 113 may generate result data in which one or more objects are detected in the feature map, together with one or more third inference indices. A third inference index is an indicator for determining the class corresponding to an object included in the feature map of the validation data. When the third inference index exceeds a predetermined threshold, the detection model 113 may determine that an object of the corresponding class is included in the feature map, and thereby generate the result data. The third inference index may be the product of an object score and a class score. The object score may be the probability that an object exists at a specific location in the result data, and the class score may be the probability that the class to which the object belongs is a specific class.

According to various embodiments, the penalty model 150 may generate a penalty score for each class based on the false positive rate of each of one or more classes included in the result data. The one or more classes may be the classes to which the one or more objects belong. The penalty model 150 may generate (e.g., calculate) the false positive rate of each class by calculating the proportion of false positives, that is, cases in which the object detection model 110 predicted as correct something that is not actually correct according to the ground truth. The penalty model 150 may calculate the average of the false positive rates by dividing the sum of the false positive rates of the classes by the total number of classes. The penalty model 150 may generate a penalty score for a class whose false positive rate exceeds the average of the false positive rates.
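The calculation described above might be sketched as follows. Two points are assumptions rather than statements from the source: the per-class false positive rate is taken as FP / (FP + TN), and the penalty score of a class is taken to be the amount by which its rate exceeds the average. The function names and the input format are hypothetical.

```python
def false_positive_rates(counts):
    """Per-class false positive rate from (false positives, true negatives).

    Assumption: FPR = FP / (FP + TN); the source does not spell out the
    denominator. A class with no negatives gets a rate of 0.0.
    """
    rates = {}
    for cls, (fp, tn) in counts.items():
        total = fp + tn
        rates[cls] = fp / total if total else 0.0
    return rates


def penalty_scores(rates):
    """Return the mean rate and a penalty score for each above-average class.

    The mean is the sum of the per-class rates divided by the number of
    classes. Assumption: penalty score = class rate minus the mean.
    """
    mean_rate = sum(rates.values()) / len(rates)
    penalties = {
        cls: rate - mean_rate for cls, rate in rates.items() if rate > mean_rate
    }
    return mean_rate, penalties
```

With rates of 0.30, 0.10, and 0.05 for three classes, the mean is 0.15 and only the first class receives a penalty score.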

FIG. 3 is a diagram for describing an example of a false positive.

Referring to FIG. 3, a deep learning model for object detection may produce false positives when detecting multiple objects. The red box 300 may be a case in which the deep learning model falsely detected an obstacle vehicle in front of the driving vehicle even though no obstacle vehicle was present. As a result of the false positive, the driving vehicle may brake suddenly during autonomous driving, which may lead to an accident and endanger the life of the driver.

The confusion matrix used for the performance metrics of a deep learning model may be summarized as in Table 1 below. Among the elements of Table 1, a false positive is a case in which something the deep learning model predicted to be correct is in fact false, and may correspond to the false positive example above.

[Table 1]
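For readers unfamiliar with the four confusion matrix entries, the following sketch counts them for binary predictions. It uses the standard definitions of true positive, false positive, false negative, and true negative; the function name and the input format are illustrative and are not taken from the source.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, and TN for paired binary labels and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter({"TP": 0, "FP": 0, "FN": 0, "TN": 0})
    for truth, pred in zip(y_true, y_pred):
        if pred and truth:
            counts["TP"] += 1   # predicted positive, actually positive
        elif pred and not truth:
            counts["FP"] += 1   # predicted positive, actually negative
        elif not pred and truth:
            counts["FN"] += 1   # predicted negative, actually positive
        else:
            counts["TN"] += 1   # predicted negative, actually negative
    return counts
```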

FIG. 4 is a diagram for describing an operation in which a detection apparatus according to various embodiments detects objects robustly against false positives.

Operations 410 and 420 describe how the detection apparatus 100 may detect objects robustly against false positives based on the penalty score of each class included in the object image.

In operation 410, the detection apparatus 100 may generate a first image in which one or more objects are detected from the object image. The detection apparatus 100 may classify one or more objects in the object image to generate a feature map, and may detect one or more objects in the feature map.

In operation 420, the detection apparatus 100 may generate, based on the penalty score, a second image having a lower false positive rate than the first image. The detection apparatus 100 may generate the penalty score based on the false positive rate of each of one or more classes corresponding to the one or more objects.

FIG. 5 is another example of a detection apparatus according to various embodiments.

Referring to FIG. 5, according to various embodiments, the detection apparatus 500 may include a memory 510 and a processor 530.

According to various embodiments, the memory 510 may store instructions (e.g., a program) executable by the processor 530. For example, the instructions may include instructions for executing operations of the processor 530 and/or operations of each component of the processor 530.

According to various embodiments, the memory 510 may be implemented as a volatile memory device or a nonvolatile memory device. The volatile memory device may be implemented as dynamic random access memory (DRAM), static random access memory (SRAM), thyristor RAM (T-RAM), zero capacitor RAM (Z-RAM), or twin transistor RAM (TTRAM). The nonvolatile memory device may be implemented as electrically erasable programmable read-only memory (EEPROM), flash memory, magnetic RAM (MRAM), spin-transfer torque MRAM (STT-MRAM), conductive bridging RAM (CBRAM), ferroelectric RAM (FeRAM), phase change RAM (PRAM), resistive RAM (RRAM), nanotube RRAM, polymer RAM (PoRAM), nano floating gate memory (NFGM), holographic memory, a molecular electronic memory device, and/or insulator resistance change memory.

According to various embodiments, the processor 530 may execute computer-readable code (e.g., software) stored in the memory 510 and instructions triggered by the processor 530. The processor 530 may be a hardware-implemented data processing device having circuitry with a physical structure for executing desired operations. The desired operations may include, for example, code or instructions included in a program. The hardware-implemented data processing device may include, for example, a microprocessor, a central processing unit, a processor core, a multi-core processor, a multiprocessor, an application-specific integrated circuit (ASIC), and a field programmable gate array (FPGA).

According to various embodiments, the operations performed by the processor 530 may be substantially the same as the operations of the detection apparatus 100 described with reference to FIGS. 1 to 4. Each component of the detection apparatus 100 described with reference to FIGS. 1 to 4 (e.g., the object detection model 110 and the penalty model 150) may be executed by the processor 530. Accordingly, a detailed description is omitted.

The embodiments described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices, methods, and components described in the embodiments may be implemented using a general-purpose or special-purpose computer, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device is sometimes described as a single device, but a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored in a computer-readable recording medium.

The method according to the embodiments may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may store program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like.

The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described with reference to a limited set of drawings, a person of ordinary skill in the art may apply various technical modifications and variations based on them. For example, appropriate results may be achieved even if the described techniques are performed in an order different from that described, and/or the described components of a system, structure, device, circuit, and the like are coupled or combined in a form different from that described, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the following claims.

Claims

1. A detection method comprising:
generating a first image in which one or more objects are detected from an object image; and
generating, based on a penalty score, a second image having a lower false positive rate (FSR) than the first image,
wherein the penalty score is generated based on a false positive rate of each of one or more classes corresponding to the one or more objects.

2. The method of claim 1,
wherein the generating of the first image comprises:
generating a feature map by inputting the object image into a classification model that classifies the one or more objects in the object image; and
generating the first image and one or more first inference indices corresponding to the first image by inputting the feature map into a detection model that detects the one or more objects in the feature map.

3. The method of claim 2,
wherein the classification model is trained based on first training data related to an object to be detected and second training data unrelated to the object to be detected.

4. The method of claim 3,
wherein the second training data is labeled based on one or more classes included in the first training data.

5. The method of claim 3,
wherein the generating of the second image comprises:
generating the penalty score based on validation data; and
generating, based on the penalty score, the second image and one or more second inference indices corresponding to the second image,
wherein the validation data is a combination of the first training data and the second training data.

6. The method of claim 5,
wherein the second inference index is a value obtained by multiplying the first inference index by a difference between the false positive rate of one class and an average of the false positive rates.

7. The method of claim 1,
wherein the penalty score is for a class having a false positive rate that exceeds the average of the false positive rates.

8. A computer program stored in a computer-readable recording medium and combined with hardware to execute the method of any one of claims 1 to 7.

9. A device comprising:
a memory containing instructions; and
a processor electrically connected to the memory and configured to execute the instructions,
wherein, when the instructions are executed by the processor, the processor:
generates a first image in which one or more objects are detected from an object image; and
generates, based on a penalty score, a second image having a lower false positive rate (FSR) than the first image,
wherein the penalty score is generated based on a false positive rate of each of one or more classes corresponding to the one or more objects.

10. The device of claim 9,
wherein the processor:
generates a feature map by inputting the object image into a classification model that classifies the one or more objects in the object image; and
generates the first image and one or more first inference indices corresponding to the first image by inputting the feature map into a detection model that detects the one or more objects in the feature map.

11. The device of claim 10,
wherein the classification model is trained based on first training data related to an object to be detected and second training data unrelated to the object to be detected.

12. The device of claim 11,
wherein the second training data is labeled based on one or more classes included in the first training data.

13. The device of claim 11,
wherein the processor:
generates the penalty score based on validation data; and
generates, based on the penalty score, the second image and one or more second inference indices corresponding to the second image,
wherein the validation data is a combination of the first training data and the second training data.

14. The device of claim 13,
wherein the second inference index is a value obtained by multiplying the first inference index by a difference between the false positive rate of one class and an average of the false positive rates.

15. The device of claim 9,
wherein the penalty score is for a class having a false positive rate that exceeds the average of the false positive rates.
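
By way of illustration only, the score adjustment recited in the claims above (multiplying a first inference index by the difference between a class's false positive rate and the average false positive rate, for classes whose rate exceeds the average) may be sketched as follows. This is a minimal sketch of a literal reading of the translated claim language, not the applicant's implementation. The function names, the class names, and the numeric rates are hypothetical, and leaving the scores of non-penalized classes unchanged is an assumption. How the per-class false positive rates are measured on the validation data is not reproduced here; they are taken as given inputs.

```python
from statistics import mean


def penalty_scores(fpr_by_class):
    """Return {class: FPR - average FPR} for classes whose FPR exceeds the average.

    fpr_by_class maps a class name to its false positive rate (assumed to be
    measured beforehand on validation data).
    """
    avg = mean(fpr_by_class.values())
    return {c: f - avg for c, f in fpr_by_class.items() if f > avg}


def apply_penalty(detections, penalties):
    """Compute second inference indices from first inference indices.

    detections is a list of (class name, first inference index) pairs.
    Penalized classes are multiplied by their penalty score; other classes
    are passed through unchanged (an assumption of this sketch).
    """
    return [
        (c, s * penalties[c]) if c in penalties else (c, s)
        for c, s in detections
    ]


# Hypothetical per-class false positive rates and first-stage detections.
fpr = {"car": 0.10, "person": 0.20, "dog": 0.60}
first = [("car", 0.9), ("dog", 0.8)]

penalties = penalty_scores(fpr)
second = apply_penalty(first, penalties)
```

In this sketch only the "dog" class exceeds the average rate, so only its score is lowered. A detection whose second inference index falls below a chosen confidence threshold could then be dropped when the second image is drawn; the threshold itself is not specified in this section.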