KR102283053B1

KR102283053B1 - Real-Time Multi-Class Multi-Object Tracking Method Using Image Based Object Detection Information

Info

Publication number: KR102283053B1
Application number: KR1020190176376A
Authority: KR
Inventors: 김동규; 정병우
Original assignee: (주)베라시스
Priority date: 2019-12-27
Filing date: 2019-12-27
Publication date: 2021-07-28
Also published as: KR20210083760A

Abstract

The present invention relates to a real-time multi-class multi-object tracking method using image-based object detection information. The method comprises a first step of reading detection data from the image-based object detection information; a second step of performing location-based filtering on the read detection data; a third step of performing matching and Kalman filter processing on the detection data filtered in the second step; a fourth step of performing initial object filtering after the third step; and a fifth step of performing data processing and management after the fourth step. The first to fifth steps are performed sequentially for every frame. In the third step, once five frames have been processed, an object that has been matched in four of those five frames is stored and updated as a candidate data set, and the candidate data set is used in the matching process of the matching and Kalman filter processing step.

Description

Real-Time Multi-Class Multi-Object Tracking Method Using Image Based Object Detection Information

The present invention relates to a real-time multi-class multi-object tracking method using image-based object detection information. More specifically, it uses an ordinary imaging device attached to a vehicle to obtain detection data from images captured by the camera while driving, and provides a tracker (hereinafter, the tracking algorithm) that corrects misrecognized or unrecognized detections step by step through several methods, so that major objects can be detected and tracked in real time.

Conventionally, object detection performance is imperfect, which leads to problems such as missed detections, false detections, and inaccurate positions.

In addition, if only detection information is used, intervals with no data and the use of misrecognized data make practical use in an application difficult.

In addition, algorithm development is constrained in an embedded environment because only limited computing power and resources are available.

In addition, a detector (hereinafter, the detection algorithm) generally requires more computing power than a tracker (the tracking algorithm), and it consumes a large amount of computing power because it detects multiple classes or multiple objects across the entire image.

[Prior Art Documents]

Korean Patent Publication No. 10-2016-0110773 (published September 22, 2016) (Title of the invention: Method and apparatus for tracking a moving object using a Kalman filter model)

The present invention has been proposed in view of the above problems. Its object is to provide a real-time multi-class multi-object tracking method using image-based object detection information, in which a tracking algorithm (tracker) operates in real time with minimal resources in a conventional environment and copes with missed and false detections, so that major objects can be detected and tracked in real time.

According to an embodiment of the present invention for achieving the above object, there is provided a real-time multi-class multi-object tracking method using image-based object detection information, the method comprising:

a first step of reading detection data from the image-based object detection information;

a second step of performing location-based filtering on the read detection data;

a third step of performing matching and Kalman filter processing on the detection data filtered in the second step;

a fourth step of performing initial object filtering after the third step; and

a fifth step of performing data processing and management after the fourth step,

wherein the first to fifth steps are performed sequentially for every frame,

wherein, in the third step, once five frames have been processed, an object that has been matched in four of those five frames is stored and updated as a candidate data set, and the candidate data set is used in the matching process of the matching and Kalman filter processing step,

wherein the third step comprises:

starting the Kalman filter prediction process;

matching the filtered data against the tracking data set;

performing the Kalman filter correction process when a match is found; and

updating the tracking data when the correction is complete,

wherein the fourth step comprises:

matching the remaining filtered data against the candidate data set;

updating the candidate data when a match is found;

checking the candidate data set and passing the result to the tracking data; and

adding the remaining filtered data to the candidate data set,

whereby the initial object filtering reduces false detections and the subsequent Kalman filtering reduces missed detections, improving the rate at which multiple objects are recognized in real time, and

whereby only the class, score, and detection result box, which are available from the detection data rather than from the image itself, are used, so that the detection data can be processed in real time with far less computation than image-processing-based data processing.
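The per-frame flow of the five steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `process_frame`, `in_roi`, `same_object`, and the dictionary layout are hypothetical names, the Kalman correction is replaced by a plain overwrite of the box, and candidates are evaluated after exactly five frames and dropped if they fail the 4-of-5 rule.

```python
def process_frame(detections, tracks, candidates, in_roi, same_object):
    """Run the five steps once for one frame. Mutates tracks and candidates.

    detections: list of (cx, cy, w, h) tuples for this frame.
    in_roi(box) and same_object(box_a, box_b) are caller-supplied predicates.
    """
    # Step 1: `detections` is the detection data already read for this frame.
    # Step 2: location-based filtering.
    remaining = [d for d in detections if in_roi(d)]

    # Step 3: match tracked objects to detections and update matched tracks.
    for track in tracks:
        hit = next((d for d in remaining if same_object(track["box"], d)), None)
        if hit is not None:
            track["box"] = hit  # stand-in for the Kalman correction (assumption)
            remaining.remove(hit)

    # Step 4: initial object filtering on the detections left over.
    for cand in candidates:
        hit = next((d for d in remaining if same_object(cand["box"], d)), None)
        cand["history"].append(hit is not None)
        if hit is not None:
            cand["box"] = hit
            remaining.remove(hit)
    for cand in list(candidates):
        if len(cand["history"]) == 5:          # five frames observed
            candidates.remove(cand)
            if sum(cand["history"]) >= 4:      # matched in 4 of 5 frames
                tracks.append({"box": cand["box"]})
    candidates.extend({"box": d, "history": [True]} for d in remaining)

    # Step 5: data processing and management for the application.
    return [t["box"] for t in tracks]
```

Calling `process_frame` once per frame with the same `tracks` and `candidates` lists keeps the state between frames; a detection that appears only once or twice never reaches `tracks`.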

Preferably, the step of matching the filtered data against the tracking data set sequentially performs nearest neighbor (NN) tracking, heuristic similarity matching, and multi-class score calculation.

As described above, the real-time multi-class multi-object tracking method using image-based object detection information according to the present invention reduces false detections through initial object filtering and then reduces missed detections through Kalman filtering, which improves the rate at which multiple objects are recognized in real time.

According to the present invention, only the class, score, and detection result box, which are available from the detection data rather than from the image itself, are used, so the detection data can be processed in real time with far less computation than image-processing-based data processing.

FIG. 1 is a flowchart illustrating the real-time multi-class multi-object tracking method using image-based object detection information according to the present invention.
FIG. 2 shows an example of a camera image screen for explaining the present invention.
FIG. 3 shows the equations expressing the Kalman algorithm used in the present invention.
FIG. 4 shows an example of a result screen of the real-time multi-class multi-object tracking method using image-based object detection information according to the present invention.
FIGS. 5 to 7 show examples of result screens in which the region of interest is changed based on location in the method according to the present invention.
FIG. 8 shows an example of a result screen in which the region of interest is changed according to object size in the method according to the present invention.
FIGS. 9 to 12 show examples of result screens for each frame of the detection image frames and corrected image frames in the method according to the present invention.

Since the present invention can be modified in various ways and can have several embodiments, specific embodiments are illustrated in the drawings and described in detail.

However, this is not intended to limit the present invention to specific embodiments, and it should be understood to include all modifications, equivalents, and substitutes included in the spirit and technical scope of the present invention.

The terms used in the present application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In the present application, terms such as "comprise" or "have" are intended to designate that a feature, number, step, operation, component, part, or combination thereof described in the specification exists, and should be understood not to preclude the existence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless explicitly so defined in the present application.

Hereinafter, preferred embodiments of the present invention are described in more detail with reference to the accompanying drawings. To facilitate overall understanding, the same reference numerals are used for the same components in the drawings, and duplicate descriptions of the same components are omitted.

Hereinafter, the real-time multi-class multi-object tracking method using image-based object detection information according to the present invention is described in detail with reference to the accompanying drawings.

The features of the present invention are briefly described first.

First, the information received from the detection algorithm (detector) consists of the top class, the top score (a confidence value between 0 and 1), the bounding box (cx, cy, w, h), and the scores for the other classes. Only this detection information is used; no image processing is performed.
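As an illustration of the detector output described above, one detection can be held in a small record type. This is a sketch under assumptions: the class name `Detection`, its field names, and the `corners` helper are hypothetical, and `cx`, `cy` are taken to be the box centre, which the notation suggests but the text does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Detection:
    """One detector result; only these values are used, never the image."""
    top_class: str          # label with the highest score
    top_score: float        # confidence in [0, 1]
    cx: float               # box centre x in pixels (assumed meaning)
    cy: float               # box centre y in pixels (assumed meaning)
    w: float                # box width in pixels
    h: float                # box height in pixels
    class_scores: Dict[str, float] = field(default_factory=dict)  # other classes

    def __post_init__(self) -> None:
        if not 0.0 <= self.top_score <= 1.0:
            raise ValueError("top_score must lie in [0, 1]")

    def corners(self) -> Tuple[float, float, float, float]:
        """Return (x_min, y_min, x_max, y_max) from the centre/size form."""
        return (self.cx - self.w / 2, self.cy - self.h / 2,
                self.cx + self.w / 2, self.cy + self.h / 2)
```

For example, `Detection("car", 0.9, 100, 50, 40, 20, {"truck": 0.05})` describes a car whose box spans x from 80 to 120 and y from 40 to 60.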

Table 1 above lists data obtained from the detection data. It is not included in the scope of the present invention, but it is the data the invention intends to use. The process of acquiring the detection data is not included in the scope of rights of the present invention.

Referring to Table 1, the object data in the detection data that can be classified was used.

In addition, the present invention is a non-image-processing method: it computes only with the bounding box values of each object, so the amount of computation is small (image processing methods generally require a large amount of computation). As a result, the relevant data can be processed by a real-time algorithm.

The features of the present invention are: multi-class, multi-object tracking; Kalman filter estimation (filtering-based tracking); matching by the nearest neighbor and heuristic methods; camera calibration and 2D-to-3D transformation; and extraction and processing of object information for an application.

Before the initial filtering and Kalman filtering processes of FIG. 1 are described, a brief summary is given. Assuming that one image frame is input approximately every 0.3 seconds, the process of FIG. 1 is performed for every successive frame. The Kalman filter process does not run from the very beginning; at first, the initial filtering is performed after the location filtering process. When 4 out of 5 frames are matched during the initial filtering, a candidate data set is created, and the Kalman filter process is performed from then on. Frame input continues afterward, and the detection data set and the Kalman filter results are processed frame by frame, so that the corrected data is updated and managed as useful data.

Meanwhile, the nearest neighbor, heuristic, and multi-class score calculation procedures described below make up step S5, the Kalman filter matching process. In this matching process, the NN, heuristic, and multi-class score calculation methods are applied in three successive stages together with the prediction process; when a match is found, correction is performed and the track is then updated. In the tracking data set (see FIGS. 11 and 12), when matching is completed for a frame set of the left and right screens, the result can be displayed as a correction; when matching is not completed, the track is not updated and remains as a prediction.

Meanwhile, since data cannot be stored indefinitely during the prediction and correction of each frame set, only a fixed amount is kept, with storing and deleting repeated continuously.
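One simple way to keep only a fixed amount of per-track history is a bounded queue that discards the oldest entry on each new append. The sketch below uses Python's `collections.deque`; the length of 10 is an assumed value, since the text only says that a fixed amount is kept.

```python
from collections import deque

HISTORY_LENGTH = 10  # assumed size; the source does not give a number

# Each entry is (frame_index, cx); older entries fall off the left end.
history = deque(maxlen=HISTORY_LENGTH)
for frame_index in range(25):
    history.append((frame_index, 100.0 + frame_index))

oldest_frame = history[0][0]   # frame index of the oldest entry still stored
newest_frame = history[-1][0]  # frame index of the most recent entry
```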

In addition, of the initial object filtering (steps S9 to S12) and Kalman filtering (steps S4 to S7) processes, the initial object filtering is performed first, and the Kalman filtering is performed once a candidate data set has been determined. Only index information is stored for the candidate data set and the tracking data set; during initial object filtering and Kalman filtering, the storage device (not shown) of the central processing unit (not shown) is accessed through the corresponding index to read each piece of data and perform the corresponding processing, and this is repeated.

The NN and heuristic methods are described below.

Description of the Nearest Neighbor and Heuristic Methods

[Nearest Neighbor] (hereinafter, NN)

NN is an algorithm that finds the item at the shortest distance from a given object. Here, rather than simply finding the closest object, the search space is adjusted according to the size and class of the given object, and filtering is based on a distance (coordinate distance within the image) that varies with size and object.

It means "finding, within a set, the other object that has the shortest distance to a given object." It is a general method used in the image processing field.

When the time interval between image frames is short, an object is likely to remain near its previous position, so a nearby object is more likely than a distant one to be the same object. For this reason, the Euclidean distance in the two-dimensional x, y image coordinate system, $\sqrt{(\Delta x)^2 + (\Delta y)^2}$, is used as a condition.

For objects of the same class, a larger size in the image generally means the object is closer: in the 3D real world, an object close to the camera also appears large in the image. And the closer the object, the more it moves in the image for the same physical displacement, so the amount of image motion is roughly proportional to its size. The search area is therefore limited in proportion to the size of the object, and the matching described below is applied only within this area. This saves computation because the entire image is not searched.

Also, although NN is an algorithm for finding the closest item, matching on closeness alone produces wrong results, so it is used here to filter out objects beyond a certain distance.
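The size-proportional gating described above might look like the following sketch. The function names, the dictionary keys, and the rule that the search radius equals `scale` times the larger box side are all assumptions for illustration; the source only says that the search area is proportional to object size and depends on class.

```python
import math


def gate_radius(w, h, scale=1.0):
    """Search radius proportional to object size (scale is an assumed constant)."""
    return scale * max(w, h)


def nearest_in_gate(track, detections, scale=1.0):
    """Return the closest same-class detection inside the gate, or None.

    track and each detection are dicts with keys cls, cx, cy, w, h.
    """
    best = None
    best_dist = gate_radius(track["w"], track["h"], scale)
    for det in detections:
        if det["cls"] != track["cls"]:
            continue  # only compare objects of the same class
        dist = math.hypot(det["cx"] - track["cx"], det["cy"] - track["cy"])
        if dist <= best_dist:  # inside the gate and at least as close as the best so far
            best, best_dist = det, dist
    return best
```

Because detections outside the gate are skipped outright, a distant detection can never be matched simply because it happens to be the only one available.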

[Heuristic Method]

A heuristic method finds good solutions empirically. Unlike a method designed from theory, it looks for a reasonably satisfactory solution rather than the optimal one. Any method based on human experience rather than on systematic, theoretical design is called a heuristic.

For example, the Kalman filter is not a heuristic method: it has been proven to be optimal for models with linear dynamics and Gaussian noise.

The matching through the heuristic method mentioned here refers to class matching and the heuristic similarity method. In FIG. 1 it corresponds to steps S5 and S9, each of which includes class matching, nearest neighbor (NN), and heuristic similarity matching.

In the present invention, the camera is attached to the front of the vehicle, and the information on the major multi-class objects obtained by running the detection algorithm on the current frame at time t is stored in memory.

Referring to FIG. 2, the objects found by the detection process are displayed on the screen.

The recognition process for each object then continues frame by frame, step by step, through several algorithms. The data used in this process is not image data but other kinds of data. For example, the following can be used:

the class, score, bounding box (cx, cy, w, h), and scores for the other classes of object 1;

the class, score, bounding box (cx, cy, w, h), and scores for the other classes of object 2;

the class, score, bounding box (cx, cy, w, h), and scores for the other classes of object 3.

As described later, classes are categories defined by the size and type of object, and the data obtained in the detection process (see Table 1) is used.

According to the present invention, assuming the camera is fixed to the vehicle, the positions where objects can appear in the image are limited. On this basis, objects outside the region are removed, and the remaining detections are stored in the data set.
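A location-based filter of this kind can be sketched as a simple region test on the box centre. The rectangular region and the name `filter_by_roi` are assumptions made for illustration; the source does not specify the shape of the region.

```python
def filter_by_roi(detections, roi):
    """Keep detections whose centre lies inside the region of interest.

    detections: list of (cx, cy, w, h) tuples.
    roi: (x_min, y_min, x_max, y_max) in image coordinates (assumed rectangular).
    """
    x_min, y_min, x_max, y_max = roi
    return [d for d in detections
            if x_min <= d[0] <= x_max and y_min <= d[1] <= y_max]
```

With a region such as `(0, 200, 1280, 720)`, a detection whose centre sits high in the sky at `cy = 50` is discarded before any matching is attempted.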

Using the position history of the objects currently being tracked (the subsequent processing of detection data is called tracking, and its data is also referred to below as tracking data), the variables needed for the Kalman filter computation (u, Q, R) are obtained. The objects being tracked are then matched against the detection values that entered the data set, using the nearest neighbor and heuristic methods. From the data matched to a tracked object, z (the observation) is obtained for the Kalman correction, and the correction is performed: the tracking set is updated with the corrected value, weighted by the ratio of the covariances of the prediction and of z.

Data in the data set that was not matched against the tracking set is used for candidate matching (the initial object filter). As in tracking set matching, candidate matching judges whether the same object keeps appearing within a few frames (3 to 5 frames) before it is handed over to tracking. Once those frames have been filled, a candidate check decides whether the candidate is newly added to the tracking set or deleted. This filters out intermittent false detections that do not persist.

Finally, the tracking set data updated by the correction is processed and extracted for use by an application, if one exists. For example, a time to collision based on position and speed can be computed so that the likelihood of a collision with the vehicle ahead is evaluated and the driver is alerted. Alternatively, the distance to surrounding objects and information about those objects can be used to convey meaningful information to the driver, such as route guidance or warnings about dangerous situations.
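The time-to-collision example can be illustrated with the usual distance-over-closing-speed ratio. This is a sketch under assumptions: the function names are hypothetical, distances are taken to be already available in metres (for instance from the 2D-to-3D transformation), and a non-closing object is given an infinite time to collision.

```python
import math


def closing_speed(prev_distance_m, curr_distance_m, dt_s):
    """Speed at which the gap to the object ahead shrinks, in m/s."""
    return (prev_distance_m - curr_distance_m) / dt_s


def time_to_collision(distance_m, closing_speed_mps):
    """Seconds until the gap closes at the current closing speed."""
    if closing_speed_mps <= 0.0:
        return math.inf  # the gap is constant or growing
    return distance_m / closing_speed_mps
```

An application could compare the result against a warning threshold of its own choosing and alert the driver when the value drops below it.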

Kalman Filter

If no object is matched because of a missed detection, an application that relies on detection values alone has no data to work with. The Kalman filter therefore produces a predicted value in its prediction step and sustains it through an appropriate motion model and computation. Because the application can use this value, the problem caused by missed detections is resolved.

When an object is matched, the currently predicted value and its variance are combined with the observed (matched) value and its variance to produce an updated estimate and variance.

The main purposes are to obtain an optimal state estimate from jittery detection data and to cope with missed detections.
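Both behaviours, coasting on the prediction when a detection is missed and blending prediction with observation when one is matched, can be seen in a one-dimensional Kalman step. This is a textbook scalar sketch rather than the patent's four-component filter; the function names and the identity motion model with an additive displacement `u` are assumptions.

```python
def predict(x, p, u, q):
    """Prediction: shift the estimate by u and grow its variance by q."""
    return x + u, p + q


def correct(x_pred, p_pred, z, r):
    """Correction: blend prediction and observation by their variances."""
    k = p_pred / (p_pred + r)            # Kalman gain, between 0 and 1
    x_new = x_pred + k * (z - x_pred)    # move toward the observation z
    p_new = (1.0 - k) * p_pred           # variance shrinks after the update
    return x_new, p_new


def step(x, p, u, q, z=None, r=1.0):
    """One frame: predict, then correct only if an observation was matched."""
    x_pred, p_pred = predict(x, p, u, q)
    if z is None:                        # missed detection: keep the prediction
        return x_pred, p_pred
    return correct(x_pred, p_pred, z, r)
```

When the prediction and the observation have equal variance the gain is 0.5 and the new estimate lies halfway between them; a noisier observation (larger `r`) pulls the estimate less.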

The Kalman filter is a recursive filter that estimates the state of a linear system from measurements (observations) that contain error (noise).

The Kalman filter is expressed in the form shown in FIG. 3, which is described below.

In FIG. 3, the left side is the prediction process and the right side is the correction process.

The current position is predicted from the accumulated past tracking data. If the newly arrived detection data contains a matching observation, the final state is determined by the ratio of the variances of the predicted data and the observed data.

<Prediction>

(1) State prediction at time k

In standard Kalman filter notation, the state prediction is

$$\hat{x}_k^- = A\,\hat{x}_{k-1} + B\,u_k$$

$\hat{x}_k$ is the state variable; its state is the bounding box of the object at time k, $\hat{x}_k = [c_x,\ c_y,\ w,\ h]^T$.

A is the state transition matrix. Because the components cx, cy, w, and h of the state variable $\hat{x}_k$ are independent of one another, A is the 4x4 identity matrix.

$u_k$ normally denotes a user input, but there is no input that says where the objects appearing in the image will move. The value of $u_k$ must therefore be computed from the previous tracking data: it is the amount the object moved in the image during the unit time t (the time interval between frames). B is the state transition matrix for the user input; because $\hat{x}_k$ and $u_k$ have the same elements, B is the 4x4 identity matrix.

(2) Covariance prediction at time k

$$P_k^- = AP_{k-1}A^T + Q_k$$

$P_k$ is the covariance matrix at time k. Its initial value is the 4x4 identity matrix multiplied by a constant n; after that it is never accessed or modified directly but is updated automatically through the prediction and correction steps. n was set to 100.

Q_k is the error covariance matrix due to prediction. When the detection data of an object in the image is unstable and is matched while its values jitter, the elements of u_k vary widely. For this reason Q is defined so that an object with a lot of motion has a large covariance matrix. In addition, because a larger object shows larger motion in the image, the Q values corresponding to the elements of the state variable are made proportional to the size of the object. The constants multiplying the elements, such as 0.1 and 0.2, are arbitrary values that can be changed according to the situation. If the value under the square root in the expression for Q is negative, the term is calculated by a separate expression instead.

From the above, the predicted state and the predicted covariance simplify to Equations (6) and (7):

$$\hat{x}_k^- = \hat{x}_{k-1} + u_k \quad (6)$$

$$P_k^- = P_{k-1} + Q_k \quad (7)$$
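The prediction step of Equations (6) and (7) can be sketched as follows. This is a minimal sketch, assuming that Q_k is diagonal so that cx, cy, w, and h can each be treated as an independent scalar; the function name `kalman_predict`, the list representation, and the numeric values of `u` and `q` are illustrative assumptions and do not come from the specification.

```python
from typing import List, Tuple


def kalman_predict(x: List[float], p: List[float],
                   u: List[float], q: List[float]) -> Tuple[List[float], List[float]]:
    """Prediction with A = B = I: x_pred = x + u and P_pred = P + Q.

    p and q hold only the diagonals of P and Q (assumption: both are diagonal).
    """
    x_pred = [xi + ui for xi, ui in zip(x, u)]
    p_pred = [pi + qi for pi, qi in zip(p, q)]
    return x_pred, p_pred


# State layout is [cx, cy, w, h]; P starts as n * I with n = 100.
x0 = [320.0, 240.0, 50.0, 40.0]
p0 = [100.0] * 4
u = [4.0, 1.0, 0.5, 0.5]   # hypothetical per-frame motion
q = [5.0, 5.0, 2.0, 2.0]   # hypothetical diagonal of Q
x_pred, p_pred = kalman_predict(x0, p0, u, q)
```

Each prediction without a correction adds Q to P, which is why an unmatched track becomes less certain from frame to frame.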

<Correction>

K_k denotes the Kalman gain. It expresses, as a ratio of variances, which of the predicted state and the observation should be trusted more.

z_k denotes the observation at time k. The observation has the same form as the state $\hat{x}_k$. The matrix H converts the state into the form of the observation z_k; that is, it derives the observation from the state. Because the state and the observation have the same form, H is the 4x4 identity matrix.

(1) Kalman gain calculation

$$K_k = P_k^- H^T (H P_k^- H^T + R)^{-1}$$

The Kalman gain is calculated by the equation above, which, from the foregoing, simplifies to Equation 10:

$$K_k = P_k^- (P_k^- + R)^{-1} \quad (10)$$

R is the measurement noise covariance matrix of the observation. In practice this value depends on the performance of the detector: the better the detector, the smaller R becomes, and if the detector were perfectly accurate, neither R nor the Kalman filter would be needed. Because real detection results contain measurement noise, however, this value must be supplied. Since the error grows with the size of the object, R is defined using this property.

(2) State estimate update using the observation

The state is corrected by weighting the difference between the observation z_k and the predicted state $\hat{x}_k^-$ at time k by the Kalman gain K_k:

$$\hat{x}_k = \hat{x}_k^- + K_k (z_k - H\hat{x}_k^-)$$

From the above, the equation simplifies to:

$$\hat{x}_k = \hat{x}_k^- + K_k (z_k - \hat{x}_k^-)$$

(3) Covariance update

After the state is updated, the state covariance matrix is updated using the Kalman gain. I is the 4x4 identity matrix.

$$P_k = (I - K_k H) P_k^-$$

From the above, the equation simplifies to:

$$P_k = (I - K_k) P_k^-$$
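The three correction steps can be sketched together as follows. This is a minimal sketch, assuming H = I and diagonal P and R, so that the gain reduces to a per-component ratio P / (P + R); the name `kalman_correct` and the list representation are illustrative assumptions, not part of the specification.

```python
from typing import List, Tuple


def kalman_correct(x_pred: List[float], p_pred: List[float],
                   z: List[float], r: List[float]) -> Tuple[List[float], List[float]]:
    """Correction with H = I, using only the diagonals of P and R (assumption)."""
    x_new: List[float] = []
    p_new: List[float] = []
    for xp, pp, zi, ri in zip(x_pred, p_pred, z, r):
        k = pp / (pp + ri)                 # Kalman gain, between 0 and 1
        x_new.append(xp + k * (zi - xp))   # prediction + K * (observation - prediction)
        p_new.append((1.0 - k) * pp)       # P = (I - K) * P_pred
    return x_new, p_new


# Equal prediction and measurement variance: the estimate lands halfway.
x_half, p_half = kalman_correct([10.0], [100.0], [20.0], [100.0])
```

When R is small relative to P the gain approaches 1 and the observation dominates; when R is large the prediction dominates.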

The flowchart of FIG. 1 is now described step by step.

When the tracker runs after the detector has finished detection, it reads the detection data from a predetermined location. From the viewpoint of a front camera mounted on a moving vehicle, the positions at which objects appear on the screen are constrained. On this basis, detections suspected of being false are discarded.

In S4 of S3, the state (position and size) of each object is predicted by the Kalman filter from the previously accumulated tracking data. In S5, to match the predicted data against the detected data, candidates are first filtered using class information and then filtered a second time using the distance between object center points in the image. The search area is limited in proportion to the size of the object. An evaluation function built from object size and aspect ratio then selects the matching partner with the highest score. For tracking data that is matched, the state x and the state covariance P are updated through the Kalman filter correction of S6, so the position becomes more accurate. Tracking data that fails to match is updated with only the predicted state x and the predicted state covariance P, so its position becomes less accurate than before.

The initial object filtering of S8, like location-based filtering, is a mechanism for filtering out false detections; it prevents intermittently appearing false detections from being tracked. Candidate data that has accumulated beyond a set count is inspected, and if n of its m frames were actually matched, it is handed over as new data for use in S3. Detection data left over after matching with the tracking data is matched against the candidate data accumulated in the initial object filtering of S8 (S9). S10 uses the same matching method as S5, but without the Kalman filter. Candidate data matched to a detection is updated by appending that detection. Candidate data not matched to any detection is extended with only the state prediction $\hat{x}_k^-$, as in the prediction of S4. Detection data still remaining after matching is added as new candidates (S12).

S13 covers overall data management including addition and deletion (memory management), multi-object score calculation, and extraction of information for use by the application. Data management deletes old entries from the tracking-interval data and, for memory management, deletes the data of an object whose score falls below a set value during multi-object score calculation (because it is no longer being matched). It also includes processing data for use by the application.

Description of the algorithms used

Location-based filtering of S2

Objects that can appear while driving occupy roughly fixed positions in the image; this property is used to reduce false detections.

Kalman filter modeling and use in S4 and S6 of step S3

An optimal state estimate is obtained from detection results that contain error.

The prediction of the Kalman filter copes with missed detections.

Matching method for S5 and S10

The search area is restricted in consideration of the characteristics of objects appearing in the image, which reduces the amount of computation.

Matching uses the class, position, size, and aspect ratio of the object (nearest neighbor for position; heuristic methods for class, size, and aspect ratio).

Initial object filtering of S8

This is a simple yet efficient way to cope with false detections.

Using the y coordinate of the center point of the object, objects outside the range defined for each class are filtered out.

In S8, candidate data that has accumulated beyond a set amount is inspected and added to the tracking dataset as new data.

The features of the present invention are summarized again.

It is a general-purpose multi-object tracker that can be used in real time even in an embedded environment.

It uses only the class, score, and detection bounding box, which most image-based object detectors provide.

It is general enough to work in most cases when attached behind a variety of other detectors.

Because it uses only class, score, and bounding box information, it requires far less computation than image-processing-based algorithms.

Location-based filtering in step S2 takes into account the positional characteristics of objects seen by an ordinary vehicle-mounted camera.

A heuristic method matches objects between frames for real-time use in the tracker (see steps S5 and S9).

The process performed in the present invention is described once more.

The data obtained from detection is the bounding box value. The position value is taken as the state, and u is taken as the amount of change per frame (velocity).

A and B are 4x4 identity matrices. In the image, cx, cy, w, and h are independent and do not affect one another.

x = [ cx ; cy ; w ; h ] (state)

z = [ x' ; y' ; w' ; h' ] (amount of change)

P is the covariance matrix; it expresses the Gaussian error that arises during prediction.

H is the matrix that expresses the state x in the form of z; because x and z have the same form, it is the 4x4 identity matrix.

K is the Kalman gain, determined by the ratio between the current covariance P resulting from prediction and the covariance R determined at observation. It takes a value between 0 and 1.

In correction, the updated estimate of the current state = prediction + K (observation - prediction).

The value of P is then updated using K.

This process is repeated.

The basic characteristics of objects that can appear while driving are as follows.

For objects in the form of ground markers (crosswalks, road markings, and the like), the y value changes rapidly.

For objects on the ground such as cars, people, and bicycles, motion along the x and y axes increases as the object gets closer and decreases as it gets farther away. Because a real-world object is projected onto the 2D image, the y-axis data is compressed, so the displacement along the y axis appears smaller than the actual movement. Motion along the x axis appears in proportion to the actual movement.

The farther away an object is, the smaller its motion appears in the image.

During prediction, u generally becomes smaller as the object becomes smaller, and larger objects have larger u. The increase in covariance is proportional to this as well.

The values of u, Q, and R were modeled on the basis of the above, with the following characteristics.

For objects on the ground such as cars, people, bicycles, and motorcycles, the x component of u is kept at the frame-to-frame velocity, while the y component, whose displacement is small, is smoothed using an average. w and h are treated in the same way as y.

For objects such as ground markers and traffic lights that appear overhead, the change in y is large, so x and y are kept at the frame-to-frame velocity and smoothing is applied to w and h.
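The modeling of u for the two kinds of objects can be sketched as follows. This is a minimal sketch under stated assumptions: the averaging window is simply the whole history passed in, and the function names and the `y_moves_fast` flag are hypothetical. The specification does not state how many frames are averaged.

```python
from typing import List, Sequence


def frame_deltas(history: Sequence[Sequence[float]]) -> List[List[float]]:
    """Per-frame change of [cx, cy, w, h] between consecutive states."""
    return [[b[i] - a[i] for i in range(4)] for a, b in zip(history, history[1:])]


def motion_input(history: Sequence[Sequence[float]], y_moves_fast: bool) -> List[float]:
    """Build u from past states.

    y_moves_fast=False: ground objects (cars, people); only x keeps the latest delta.
    y_moves_fast=True: ground markers and traffic lights; x and y keep the latest delta.
    The remaining components are smoothed with the mean delta (assumed window).
    """
    deltas = frame_deltas(history)
    if not deltas:
        return [0.0] * 4
    last = deltas[-1]
    mean = [sum(d[i] for d in deltas) / len(deltas) for i in range(4)]
    keep = {0, 1} if y_moves_fast else {0}
    return [last[i] if i in keep else mean[i] for i in range(4)]


history = [[0.0, 0.0, 10.0, 10.0], [2.0, 1.0, 11.0, 10.0], [6.0, 1.0, 12.0, 12.0]]
u_ground = motion_input(history, y_moves_fast=False)
u_marker = motion_input(history, y_moves_fast=True)
```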

Q is the accumulated error covariance due to prediction. Because it grows as the object moves across the image, it is set to (object size)/n + |u/m|, and the square root of (w * h) is added to Q(1), Q(4), Q(11), and Q(14).

R is set in proportion to the size of the object.

Prediction limit

The aspect ratios an object can have in the image plane are constrained to some extent. (Traffic lights are wide and pedestrians are tall. A car may be wider than it is tall or taller than it is wide, but it cannot become too thin. Road traffic signs have an aspect ratio close to square, and so on.)

Because object prediction uses past information, a quantity that is shrinking generally keeps shrinking and one that is growing keeps growing. Take a walking person: when the legs are apart the bounding box is close to square, but when the person stands side-on the box becomes narrow. Prediction therefore tends to make the box narrower and narrower until matching fails. To prevent this, minimum and maximum aspect ratios are defined for each object class, so that matching does not fail because the aspect ratio has drifted during prediction.
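The per-class aspect ratio limit can be sketched as follows. This is a minimal sketch: the class names, the limit values in `ASPECT_LIMITS`, and the choice to keep the width and adjust the height are all assumptions made for illustration. The aspect ratio is taken as height/width.

```python
from typing import Dict, Tuple

# Hypothetical (min, max) height/width limits per class.
ASPECT_LIMITS: Dict[str, Tuple[float, float]] = {
    "pedestrian": (1.5, 4.0),
    "traffic_light": (0.3, 0.6),
}


def clamp_aspect(w: float, h: float, cls: str) -> Tuple[float, float]:
    """Clamp a predicted box to the aspect ratio range of its class.

    The width is kept and the height is adjusted (an assumed policy).
    Assumes w > 0.
    """
    lo, hi = ASPECT_LIMITS[cls]
    ratio = h / w
    clamped = min(max(ratio, lo), hi)
    return w, w * clamped


too_thin = clamp_aspect(10.0, 60.0, "pedestrian")   # ratio 6.0 is above the max of 4.0
unchanged = clamp_aspect(10.0, 30.0, "pedestrian")  # ratio 3.0 is within range
```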

Matching Algorithm

Images arrive frame by frame. Except in special situations, the displacement of an object in the image between frames is small. In other words, "an object in the previous frame will be located nearby in the next frame", "the size of the object will also be similar", and "even if it is misclassified, the misclassification is likely to stay within the same major category". That is, the nearest object within a certain distance that has a similar size and shape and the same class, or a class in the same major category, is likely to be the same object.
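The nearest neighbor step with a size-proportional search area can be sketched as follows. This is a minimal sketch: the specification states only that the search area is proportional to object size, so the radius rule `scale * max(w, h)` and the function names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h)


def center_distance(a: Box, b: Box) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_candidates(track: Box, detections: Sequence[Box], scale: float = 1.0) -> List[Box]:
    """Keep detections inside the search radius, nearest first.

    The radius is scale * max(w, h) of the tracked box (assumed rule).
    """
    radius = scale * max(track[2], track[3])
    gated = [d for d in detections if center_distance(track, d) <= radius]
    return sorted(gated, key=lambda d: center_distance(track, d))


track = (100.0, 100.0, 20.0, 40.0)
dets = [(110.0, 100.0, 20.0, 40.0), (200.0, 100.0, 20.0, 40.0), (103.0, 104.0, 22.0, 38.0)]
near = nearest_candidates(track, dets)
```

Gating before scoring keeps the number of similarity evaluations small, which is the source of the reduced computation mentioned above.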

Class Matching

The main classes, grouped by characteristics, are as follows.

Group A: cars, buses, trucks, vans

Group B: two-wheeled vehicles (bicycles, motorcycles), people

Group C: types of signs

Group D: types of ground markers, crosswalks

Other: the rest

The groups above are called major categories. Classes whose position in the image and dynamic characteristics are similar are grouped together. When the detection stage recognizes a car or truck in group A, the scores of the classes can come out close to each other even for the same region. Matching is therefore allowed between classes within the same major category, and during matching, candidates are filtered by whether they fall within the same major category.
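The major category check can be sketched as follows. This is a minimal sketch: the class name strings are hypothetical labels for the groups listed above, and the rule that classes in "other" match only an identical class is an assumption, since the specification does not say how the remaining classes are handled.

```python
from typing import Dict

# Hypothetical class labels mapped to the major categories A to D.
MAJOR_CATEGORY: Dict[str, str] = {
    "car": "A", "bus": "A", "truck": "A", "van": "A",
    "bicycle": "B", "motorcycle": "B", "person": "B",
    "sign": "C",
    "ground_marker": "D", "crosswalk": "D",
}


def major_category(cls: str) -> str:
    return MAJOR_CATEGORY.get(cls, "other")


def classes_compatible(a: str, b: str) -> bool:
    """True if two classes may be matched to each other."""
    if a == b:
        return True
    group_a, group_b = major_category(a), major_category(b)
    # Assumption: classes in "other" are only compatible with themselves.
    return group_a == group_b and group_a != "other"
```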

Heuristic Similarity Method

Detection provides the class, score, bounding box points, and the scores of the other classes. To match on this basis, the size and aspect ratio of the objects are compared in addition to the major category and nearest neighbor (NN) steps described above. This comparison is called the similarity. A cost function is defined that computes the similarity of area and of aspect ratio as ratios, and the object with the highest value is matched.

In the heuristic similarity method, matching is performed against detection data that has already been filtered by class and NN, and the size and aspect ratio are compared.

First, the size comparison is as follows.

The area of the bounding box (bbox) of the detection data and the area of the bounding box of the tracking dataset or candidate dataset are computed.

Let D_area be the area of one object in the detection data and M_area the area of one object in the data to be matched. Then

sim_area = D_area / M_area (area similarity).

If sim_area is greater than 1, its reciprocal is stored. The value is 1 for equal areas and approaches 0 as the areas differ more.

Next, for the aspect ratio comparison, the aspect ratio is height/width, denoted D_ratio and M_ratio, and sim_ratio = D_ratio / M_ratio (aspect ratio similarity). If sim_ratio is greater than 1, its reciprocal is stored. As above, the value approaches 1 as the ratios become more similar and 0 as they differ more.

To combine the two into a single cost function, two weights gain_area and gain_ratio are introduced; both are positive and they sum to 1.

By default both are 0.5, but for an object whose aspect ratio appears consistently, the weight of the aspect ratio score is raised.

Therefore, result = gain_area x sim_area + gain_ratio x sim_ratio,

where result takes a value between 0 and 1. A minimum threshold is set, and the candidate with the highest result above this threshold is stored as the match.
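The similarity computation described above can be sketched as follows. The names sim_area, sim_ratio, gain_area, and gain_ratio follow the text; the function names and the threshold value 0.6 are assumptions, since the specification does not give a numeric threshold.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h)


def ratio_similarity(a: float, b: float) -> float:
    """a / b folded into (0, 1]: the reciprocal is taken when the ratio exceeds 1."""
    s = a / b
    return 1.0 / s if s > 1.0 else s


def similarity(det: Box, cand: Box, gain_area: float = 0.5, gain_ratio: float = 0.5) -> float:
    _, _, dw, dh = det
    _, _, mw, mh = cand
    sim_area = ratio_similarity(dw * dh, mw * mh)
    sim_ratio = ratio_similarity(dh / dw, mh / mw)   # aspect ratio = height / width
    return gain_area * sim_area + gain_ratio * sim_ratio


def best_match(det: Box, candidates: Sequence[Box], threshold: float = 0.6) -> Optional[Box]:
    """Return the candidate with the highest result above the threshold, or None."""
    best, best_score = None, threshold
    for cand in candidates:
        score = similarity(det, cand)
        if score > best_score:
            best, best_score = cand, score
    return best


det = (0.0, 0.0, 10.0, 20.0)
same = (0.0, 0.0, 10.0, 20.0)
wider = (0.0, 0.0, 20.0, 20.0)   # double the area, half the aspect ratio
```

Here `similarity(det, wider)` evaluates to 0.5 because both the area ratio and the aspect ratio differ by a factor of two.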

Multi-Class Score Calculation

This section describes how the score of an object in the tracking set is updated once it has been matched by the method described above.

$$S_t = \eta(S_{t-1} \odot O_t)$$

In the formula above, $S_{t-1}$ holds the scores for all classes stored in the tracking set, and $O_t$ holds the scores for all classes of the matched object. The value passed to η is very small: the scores of one object sum to 1 and each class score lies between 0 and 1, so the element-wise product of real numbers no greater than 1 is very small. η denotes the normalization function that makes the sum of $S_t$ equal to 1.

The problem with this method is that as tracking continues, the top score converges to 1, so the top class is retained even when a detection of another class in the same major category is matched. To solve this, an opportunity score is given to the class scores belonging to the same major category. This leaves room for switching to another class of the major category even as tracking continues. An opportunity score is also given to "don't care" (the case where the object belongs to none of the classes, that is, background), which is added as an element to the class score calculation above. This makes the score of the top class decay naturally at a constant rate. (A later step removes tracked objects whose score has fallen below a set value.)
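The score update can be sketched as follows. The element-wise product followed by normalization comes from the description above; the way the opportunity score enters the calculation is not specified, so adding a per-class `bonus` before normalization is only one possible reading, and the names `update_scores` and `bonus` are hypothetical.

```python
from typing import List, Optional, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Scale the values so that they sum to 1 (the role of the eta function)."""
    total = sum(values)
    return [v / total for v in values]


def update_scores(prev: Sequence[float], observed: Sequence[float],
                  bonus: Optional[Sequence[float]] = None) -> List[float]:
    """Element-wise product of previous and observed class scores, then normalization.

    bonus: optional per-class opportunity score added before normalization
    (an assumed mechanism, not taken from the specification).
    """
    product = [p * o for p, o in zip(prev, observed)]
    if bonus is not None:
        product = [v + b for v, b in zip(product, bonus)]
    return normalize(product)


plain = update_scores([0.5, 0.5], [0.8, 0.2])
# Without a bonus, a class whose score reached 0 can never recover.
stuck = update_scores([1.0, 0.0], [0.5, 0.5])
revived = update_scores([1.0, 0.0], [0.5, 0.5], bonus=[0.0, 0.1])
```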

Object Filtering

Of the two detection problems, missed detections and false detections, the Kalman filter was used to handle missed detections and to obtain an optimal state estimate from imperfect detection information (detection information containing error). The methods described next deal with false detections.

Location Filtering

If the camera is mounted and fixed on the vehicle and camera calibration has been performed, objects appearing while driving can be filtered by position. For example, a person cannot fly through the sky and a traffic light cannot appear on the ground. With a handheld camera, this location filter could not be used.
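A location filter of this kind, based on the y coordinate of the object center and a range defined per class, can be sketched as follows. The class names and the normalized ranges in `Y_RANGE` are hypothetical values chosen for illustration; in practice they would depend on the camera mounting and calibration. The image y axis is assumed to grow downward.

```python
from typing import Dict, Tuple

# Hypothetical allowed range of cy / image_height per class (0 = top, 1 = bottom).
Y_RANGE: Dict[str, Tuple[float, float]] = {
    "person": (0.35, 1.0),
    "traffic_light": (0.0, 0.5),
}


def passes_location_filter(cls: str, cy: float, image_height: float) -> bool:
    """Reject detections whose center lies outside the range of their class.

    Classes without a configured range are passed through (assumed policy).
    """
    rng = Y_RANGE.get(cls)
    if rng is None:
        return True
    lo, hi = rng
    return lo <= cy / image_height <= hi
```

For a 720-pixel-high image, a "person" centered at y = 100 would be rejected, while a "traffic_light" at the same position would be kept.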

Initial Object Filtering

Whereas the step above filters out data that is wrong for its position, this step removes false detections using limited, simple detection information. Detections are first checked over 3 to 5 frames, stored in the candidate set, and matched; data that was matched with actual detections during the specified reference frames is newly added to the tracking set.

This method is good at filtering out false detections that flicker in and out, but it cannot filter out false detections that keep arriving at a similar position with a similar size, a similar ratio, and a class in the same major category.

It uses the same method as tracking set matching, but without applying the Kalman filter.

In brief, candidate filtering passes the data of an object to the tracking set if, within 5 frames, the object actually passed the above conditions and was matched 4 times.
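The 4-of-5 rule can be sketched as follows. This is a minimal sketch: the class name `Candidate` and its methods are hypothetical, and it assumes that promotion is decided only once a full 5-frame window has been observed.

```python
from collections import deque
from typing import Deque


class Candidate:
    """Tracks whether a candidate was matched in each of the last `window` frames."""

    def __init__(self, window: int = 5, required: int = 4) -> None:
        self.window = window
        self.required = required
        self.hits: Deque[bool] = deque(maxlen=window)

    def record(self, matched: bool) -> None:
        self.hits.append(matched)

    def ready_for_tracking(self) -> bool:
        # Promote only after a full window, with enough real matches in it.
        return len(self.hits) == self.window and sum(self.hits) >= self.required


steady = Candidate()
for matched in [True, True, False, True, True]:
    steady.record(matched)

flicker = Candidate()
for matched in [True, False, True, False, True]:
    flicker.record(matched)
```

The steady candidate is promoted to the tracking set, while the flickering one, matched in only 3 of 5 frames, is not.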

Results

Referring to FIG. 4, the colored boxes show the final tracking results. The green ellipses represent the prediction variance (the prediction step used for Kalman filter matching), the white ellipses represent the observation variance (the class data bounding box obtained from the detection data), and the red ellipses represent the correction variance (the corrected state after matching with the Kalman filter applied).

Referring to FIG. 5, which shows the view ahead of the vehicle, a region of interest (ROI) is drawn as a red box on the forward image seen through the camera.

FIG. 6 is similar to FIG. 5 and shows the region of interest (ROI) shifted upward; FIG. 7 is similar to FIG. 5 and shows the region of interest (ROI) shifted downward.

FIGS. 5 to 7 illustrate the location-based filtering process S1 of FIG. 1.

FIG. 8 shows the view ahead of the vehicle. A vehicle appears at the position called the vanishing point with a bounding box (bbox) drawn on it, and a region of interest is drawn around it as a rectangle, illustrating the state in which NN processing is performed.

FIGS. 9 and 10 show, frame by frame, the detection data image on the left and the Kalman filter matching image on the right.

Referring to FIGS. 9 and 10, objects classified into various classes are detected with bounding boxes (bbox) in the left-hand frames. FIGS. 9 and 10 illustrate initial object filtering for removing false detections. FIG. 9 shows five frames from top to bottom. In each frame, the detection data is shown on the left and the Kalman filter prediction and correction view on the right. The system was configured so that an object appearing over frames 1 to 5 is handed over as tracking data if it is matched in 4 frames. Among the detection data in FIGS. 9 and 10, the crosswalk appears continuously, whereas the red signboard at center right and the Homeplus billboard at center left are intermittent false detections.

Referring to FIGS. 11 and 12, the left side shows the detection data and the right side shows the final tracking result.

In addition to correcting positions, when the detection algorithm fails to recognize an object, the prediction part of the Kalman filter, which serves as the tracking algorithm, predicts the current expected position from the previous data and fills the gap, so that the application stage can operate without problems even when a missed detection occurs.

Objects that are too far away are hard to detect, but as the vehicle drives on, the distant traffic light comes to be detected continuously, passes initial object filtering, and appears in the final tracking result through the Kalman filter.

Referring to FIGS. 11 and 12, ten frames are shown in total. In the tracking result data on the right, the correction and prediction steps appear, marked with red boxes.

이상 설명한 바와 같이 본 발명에 따른 영상 기반 객체 검출 정보를 이용한 실시간 다중 클래스 다중 객체 추적 방법에 의하면, 초기 물체 필터링을 통하여 오인식을 줄이며, 후에 칼만 필터링과정을 거쳐 미인식을 줄여서 실시간으로 다중 객체를 인식할 수 있는 인식률을 향상시키게 된다.As described above, according to the real-time multi-class multi-object tracking method using image-based object detection information according to the present invention, misrecognition is reduced through initial object filtering, and then multiple objects are recognized in real time by reducing unrecognized through Kalman filtering process. It will improve the recognition rate that can be done.

In addition, the present invention uses only the class, score, and detection result box, which are obtainable from the detection data rather than from the image itself. Detection data can therefore be processed in real time with far less computation than image-processing-based data processing.
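To illustrate how matching can run on the class, score, and box alone, the following sketch performs a greedy nearest-neighbor association on box centers. The `Detection` record, the strict same-class gate, and the `max_dist` threshold are assumptions made for illustration; they are not the matching procedure disclosed in the specification.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record: no pixel data is carried."""
    cls: str
    score: float
    box: tuple  # (x1, y1, x2, y2) in pixels


def center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def nearest_neighbor_match(tracks, detections, max_dist=50.0):
    """Greedy nearest-neighbor matching; returns {track_index: detection_index}."""
    pairs = []
    for ti, t in enumerate(tracks):
        tcx, tcy = center(t.box)
        for di, d in enumerate(detections):
            if d.cls != t.cls:  # same-class gate (assumption)
                continue
            dcx, dcy = center(d.box)
            dist = hypot(tcx - dcx, tcy - dcy)
            if dist <= max_dist:
                pairs.append((dist, ti, di))
    pairs.sort()  # closest pairs are assigned first
    matches, used_tracks, used_dets = {}, set(), set()
    for _, ti, di in pairs:
        if ti in used_tracks or di in used_dets:
            continue
        matches[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


tracks = [
    Detection("traffic_light", 0.90, (100, 50, 120, 90)),
    Detection("crosswalk", 0.80, (200, 300, 400, 360)),
]
detections = [
    Detection("crosswalk", 0.70, (205, 302, 405, 362)),
    Detection("traffic_light", 0.85, (103, 52, 123, 92)),
    Detection("signboard", 0.40, (500, 100, 560, 140)),
]
matches = nearest_neighbor_match(tracks, detections)
print(matches)  # {0: 1, 1: 0}
```

The unmatched signboard detection is left over; in the described method, such remaining data would be compared against the candidate data set rather than the tracking data.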

As described above, the best embodiments have been disclosed in the drawings and the specification. Although specific terms are used herein, they are used only to describe the present invention and not to limit their meaning or the scope of the present invention set forth in the claims. Those of ordinary skill in the art will therefore understand that various modifications and other equivalent embodiments are possible. Accordingly, the true scope of technical protection of the present invention should be determined by the technical spirit of the appended claims.

Claims

A real-time multi-class multi-object tracking method using image-based object detection information, the method comprising:
a first step of reading detection data from the image-based object detection information;
a second step of performing location-based filtering on the read detection data;
a third step of performing matching and Kalman filter processing on the detection data filtered in the second step;
a fourth step of performing initial object filtering after the third step; and
a fifth step of performing data processing and management after the fourth step,
wherein the first to fifth steps are performed sequentially for each frame,
wherein, in the third step, once 5 frames have been processed, an object matched as the same object in 4 of the 5 frames is stored and updated as a candidate data set, and the candidate data set is used in the matching process of the matching and Kalman filter processing step,
wherein the third step comprises:
starting a Kalman filter prediction process;
matching the filtered data against the tracking data set;
performing a Kalman filter correction process if a match is found; and
updating the tracking data when the correction is complete,
wherein the fourth step comprises:
matching the remaining filtered data against the candidate data set;
updating the candidate data if a match is found;
checking the candidate data set and then passing it to the tracking data; and
adding the remaining filtered data to the candidate data set,
wherein misrecognition is reduced through the initial object filtering and missed detections are then reduced through the Kalman filtering process, improving the recognition rate for multiple objects in real time, and
wherein only the class, score, and detection result box, which are information obtainable from the detection data rather than from the image, are used, so that the detection data can be processed in real time with far less computation than image-processing-based data processing.

delete

The method of claim 1,
wherein the step of matching the filtered data against the tracking data set
sequentially performs Nearest Neighbor (NN) matching, heuristic similarity matching, and multi-class score calculation.