KR102437755B1

KR102437755B1 - System for Managing Adaptive Frame for Real Time Object Detection and Method for Controlling Adaptive Frame Using the Same

Info

Publication number: KR102437755B1
Application number: KR1020210025338A
Authority: KR
Inventors: 황광일; 이정훈; 이승수
Original assignee: 인천대학교 산학협력단
Priority date: 2021-02-25
Filing date: 2021-02-25
Publication date: 2022-08-29

Abstract

An adaptive frame management system for real-time object recognition, and an adaptive frame control method using the same, compare the input frame rate of an input source with the object recognition service rate of an object recognition device and adaptively control frames to implement a real-time object recognition service. Compared with existing YOLO, the present invention enables a real-time object detection service while retaining YOLO's high recognition rate and convenience.

Description

System for Managing Adaptive Frame for Real Time Object Detection and Method for Controlling Adaptive Frame Using the Same

The present invention relates to an adaptive frame management system for real-time object recognition and, more particularly, to an adaptive frame management system that implements a real-time object recognition service by comparing the input frame rate from an input source with the object recognition service rate of an object recognition device and adaptively controlling frames, and to an adaptive frame control method using the same.

YOLO (You Only Look Once) has recently been widely used as the object recognition software of intelligent video applications, because continuous version updates make it easy to use and it guarantees a high object recognition rate.

In addition, various artificial-intelligence-based solutions using high-performance embedded boards have recently been developed. YOLO requires high-end hardware (such as a GPU) for successful real-time object recognition.

Recently, various applications using YOLO-based object recognition have been developed.

FIG. 1 is a diagram illustrating the structure of a YOLO object recognition service according to the prior art, and FIG. 2 is a diagram illustrating YOLO real-time processing according to the prior art.

Most network cameras use RTSP (Real Time Streaming Protocol), which can guarantee the frame rate of real-time video transmitted over a network. RTSP has an RTSP queue for buffering the frames transmitted over the network.

In practice, the frame processing time for real-time object recognition is on the order of several hundred milliseconds, and applications require a real-time object recognition speed of about 3 to 5 fps or about 10 fps, depending on the application.

However, if the object recognition service of existing YOLO takes longer than the interval at which frames arrive from RTSP, YOLO can cause serious problems for real-time processing.

YOLO stores the image frames that arrive during object recognition in a queue and, each time object recognition finishes, takes the next frame from the queue in order and processes it.

When the object recognition processing time (Ts) is longer than the arrival time interval of the input image frames, subsequent incoming image frames wait progressively longer in the queue. In this case, as shown in FIG. 2, the total service delay time D(n) of each input image frame is given by Equation 1.

Here, the terms of Equation 1 are the waiting time of the n-th image frame in the queue, the arrival time interval of the input image frames, and the object recognition service time (Ts) that YOLO requires for object recognition.
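Equation 1 is not legible in this text, so the following is only a rough sketch of how the three quantities named above could combine, not the patent's own formula. It assumes that frames arrive at a constant interval, that the service time is constant, and that the first frame finds an empty queue. The symbols $W_q(n)$ (queue waiting time) and $T_a$ (arrival time interval) are hypothetical names introduced here; $T_s$ is the service time Ts from the text.

```latex
\begin{align}
D(n)   &= W_q(n) + T_s, \\
W_q(n) &= \max\bigl(0,\ (n-1)\,(T_s - T_a)\bigr), \qquad n = 1, 2, \dots
\end{align}
```

Under these assumptions the waiting term grows linearly with $n$ whenever $T_s > T_a$, which matches the accumulating delay described in the text, and it vanishes when $T_s \le T_a$.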

The object recognition completion time of each image frame accumulates and grows over time until it exceeds the deadline required by the application.

As a result, the cumulative delay of the object recognition application grows over time, and the application ends up processing image frames from a much earlier time rather than in real time. As FIG. 2 shows, when object recognition processing is delayed, the delay accumulates over subsequent image frames and real-time processing becomes impossible.
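The accumulation described above can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions (constant arrival interval, constant service time, an unbounded FIFO queue); the function name `fifo_delays` and its parameters are hypothetical and do not come from the patent.

```python
def fifo_delays(num_frames, arrival_interval, service_time):
    """Return the total delay (queue wait + service) of each frame in a FIFO queue."""
    delays = []
    recognizer_free_at = 0.0  # time at which the recognizer finishes its current frame
    for n in range(num_frames):
        arrival = n * arrival_interval
        start = max(arrival, recognizer_free_at)  # wait while the recognizer is busy
        finish = start + service_time
        delays.append(finish - arrival)
        recognizer_free_at = finish
    return delays
```

Calling `fifo_delays(4, 1.0, 3.0)` gives delays that grow by the same amount for every frame, whereas `fifo_delays(3, 2.0, 1.0)` gives a constant delay because the recognizer keeps up with the source.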

Korean Patent No. 10-1921709

In order to solve this problem, an object of the present invention is to provide an adaptive frame management system for real-time object recognition, which implements a real-time object recognition service by comparing the input frame rate from an input source with the object recognition service rate of an object recognition device and adaptively controlling frames, and an adaptive frame control method using the same.

An adaptive frame management system for real-time object recognition according to a feature of the present invention for achieving the above object includes:

a multi-source prefetcher that receives image frames from a plurality of input sources, delivers them to a queue, and measures the arrival time interval of the image frames coming from the input video;

an object recognition device that estimates the presence of an object in the image frame and outputs an object recognition result;

an object recognition service time measuring unit that is connected to the object recognition device and measures the object recognition service time required to perform object recognition on the image frame; and

an admission control unit that is connected to the output end of the queue, receives each image frame from the queue, and determines whether to deliver the image frame fetched from the queue to the object recognition device or to discard it, according to the result of comparing the arrival time interval received from the multi-source prefetcher with the object recognition service time received from the object recognition service time measuring unit.

When the arrival time interval is smaller than the object recognition service time, the admission control unit drops some of the image frames fetched from the queue and delivers the remaining image frames to the object recognition device.

When the arrival time interval is smaller than the object recognition service time, the admission control unit drops the remaining packets in the queue except the image frame at the tail of the queue, so that only the latest image frame is delivered.

When the arrival time interval is greater than or equal to the object recognition service time, the admission control unit delivers the image frame fetched from the queue to the object recognition device as is.

The admission control unit includes:

an admission controller that receives the arrival time interval from the multi-source prefetcher and the object recognition service time from the object recognition service time measuring unit; and

a demultiplexer that is connected to the output end of the queue and receives the image frames from the queue, and that, when the admission controller reports that the arrival time interval is smaller than the object recognition service time, drops the remaining packets in the queue except the image frame at the tail of the queue so that only the latest image frame is delivered, and, when the admission controller reports that the arrival time interval is greater than or equal to the object recognition service time, delivers the image frame fetched from the queue to the object recognition device as is.

An adaptive frame control method for real-time object recognition according to a feature of the present invention includes:

receiving, in a multi-source prefetcher, image frames from a plurality of input sources, delivering them to a queue, and measuring the arrival time interval of the image frames coming from the input video;

recognizing, in an object recognition device, the presence of an object in the image frame;

measuring, by an object recognition service time measuring unit connected to the object recognition device, the object recognition service time required to perform object recognition on the image frame; and

receiving the image frame from the queue and determining whether to deliver the image frame fetched from the queue to the object recognition device or to discard it, according to the result of comparing the arrival time interval received from the multi-source prefetcher with the object recognition service time.

With the above configuration, the present invention enables a real-time object detection service, unlike existing YOLO, while providing YOLO's high recognition rate and convenience.

The present invention can process real-time video transmitted over a network without changing the hardware inside YOLO, using the object recognition performance of existing YOLO as is.

FIG. 1 is a diagram illustrating the structure of a YOLO object recognition service according to the prior art.
FIG. 2 is a diagram illustrating YOLO real-time processing according to the prior art.
FIG. 3 is a diagram illustrating the configuration of an adaptive frame management system for real-time object recognition according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the internal logic of a multi-source prefetcher according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the internal logic of an admission control unit according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating real-time processing using an adaptive frame management system according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating the result of comparing the object recognition service times of an object recognition device with an adaptive frame control device and an existing object recognition device.
FIG. 8 is a diagram illustrating the object detection time of the object recognition device for each hardware platform.
FIGS. 9 to 13 are diagrams illustrating the total processing delay of each image frame for RTSP transmission, with the object recognition device running on different hardware.
FIG. 14 is a diagram illustrating the image frames on which object recognition was being performed at each point in time in the actual experiment.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

With conventional YOLO, improving real-time object recognition processing requires modifying the entire internal model, which undermines YOLO's user convenience. In addition, applying a new technique in an application requires additional verification of the object recognition performance itself.

However, to process real-time video transmitted over a network, the present invention does not change the hardware inside YOLO and does not depend on the capability of the system hardware on which YOLO runs, and it can use the object recognition performance of existing YOLO as is.

The present invention provides an Adaptive Frame Control (AFC) method that enables real-time object recognition processing while maintaining the object recognition performance of existing YOLO.

The present invention enables a consistent object recognition service for the various input sources of YOLO and, in particular, can solve the real-time processing problems that can occur in YOLO object recognition applications based on RTSP (Real Time Streaming Protocol), which is widely used to connect IP cameras. AFC works together with existing YOLO, using YOLO's recognition performance as is while optimizing the frame input from various input sources, thereby enabling the real-time object recognition that various applications require.

FIG. 3 is a diagram illustrating the configuration of an adaptive frame management system for real-time object recognition according to an embodiment of the present invention, FIG. 4 is a diagram illustrating the internal logic of a multi-source prefetcher according to an embodiment of the present invention, and FIG. 5 is a diagram illustrating the internal logic of an admission control unit according to an embodiment of the present invention.

The adaptive frame management system 100 for real-time object recognition according to an embodiment of the present invention includes an adaptive frame control device 110 and an object recognition device 120.

The adaptive frame control device 110 includes a multi-source prefetcher 111, a queue 112, and an admission control unit 113.

The multi-source prefetcher 111 includes a multi-source mux 111a, a synchronizer 111b, and a frame processing unit 111c.

The object recognition device 120 includes a fetch thread 121, an object detection unit 122, and a display unit 123.

The queue 112 has a first-in, first-out (FIFO) structure in which the image frame that entered first is output first, and it receives and stores image frames from the input sources.

The queue 112 is connected to the multi-source prefetcher 111 at its input end and to the admission control unit 113 at its output end.

The multi-source mux 111a is a video channel multiplexer whose input ports are respectively connected to the output signal lines of a plurality of CCTV (Closed Circuit Television) cameras, and the input signal lines of a digital video recorder (DVR) or a network video recorder (NVR) may be respectively connected to the output signal lines of the CCTV cameras.

The multi-source mux 111a may also be connected to various other input sources.

The multi-source mux 111a may receive image frames from a plurality of input sources when the object recognition device 120 runs. Here, the input sources may include an RTSP-based network camera, a video file, a USB camera, an NVR/DVR, CCTV, and the like.

Because the image frames from each input source are processed at whatever speed the object recognition device 120 can manage, a consistent processing speed cannot be achieved. For example, when object recognition is performed on a stored CCTV video file and the hardware running the object recognition device 120 is high-end (an object detection rate of 30 fps or more), object recognition runs faster than 30 fps, which makes it difficult to determine the exact time at which an object was recognized in the video.

The adaptive frame management system 100 does not change the internal hardware of the object recognition device 120. Instead, it performs adaptive frame control of real-time input and processing before the object recognition device 120 fetches frames.

The adaptive frame control device 110 places the queue 112 between the multi-source prefetcher 111 and the admission control unit 113 to enable consistent real-time processing of various inputs.

The multi-source mux 111a makes it possible to take frame input from various input sources consistently at the target FPS (frames per second) required by the application.

The multi-source mux 111a runs as a separate thread and quickly reads the frames of the various inputs the user wants. Here, the thread refers to the sequence of operations by which the multi-source prefetcher 111 accepts an input source and delivers it to the queue 112.

In the thread, the synchronizer 111b compares the rate at which image frames are read from the multi-source mux 111a with the target FPS and, when the input frame rate is higher than the target FPS, synchronizes the input to the target FPS.

When image frames are fetched faster than the target FPS, the synchronizer 111b can maintain a consistent interarrival time matched to the target FPS.

In this way, the multi-source prefetcher 111 uses the synchronizer 111b to maintain a consistent interarrival time matched to the target FPS (30 fps) for all input sources, regardless of the platform on which the object recognition device 120 runs.
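The pacing behavior of the synchronizer can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `pace_to_target` is hypothetical, and it works on a list of read timestamps instead of a live thread so that the logic stays easy to follow.

```python
def pace_to_target(read_times, target_fps):
    """Return release times so consecutive frames are at least 1/target_fps apart.

    Frames read faster than the target FPS are held back; frames that are
    already slower than the target pass through with their original timing.
    """
    period = 1.0 / target_fps
    release_times = []
    next_allowed = float("-inf")  # earliest time the next frame may be released
    for t in read_times:
        release = max(t, next_allowed)  # hold back frames that were read too early
        release_times.append(release)
        next_allowed = release + period
    return release_times
```

In a live prefetcher thread, the same rule would typically be applied by sleeping until the computed release time before pushing the frame into the queue.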

Each input image frame that has passed through the synchronizer 111b enters the input of the queue 112 via the frame processing unit 111c.

The frame processing unit 111c has its input end connected to the synchronizer 111b and its output end connected to the input end of the queue 112, and one side of it is connected to the admission control unit 113.

The frame processing unit 111c continuously measures the interarrival time of the image frames coming from each input video and transmits the measured arrival time interval to the admission control unit 113.

The arrival time interval of the input image frames is a parameter that the admission control unit 113 uses for frame admission control.
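One way to measure the arrival time interval continuously is sketched below. The class name `InterarrivalMeter` and the exponential smoothing (with its `smoothing` weight) are assumptions made for illustration; the text only states that the interval is measured continuously and does not specify how samples are combined.

```python
class InterarrivalMeter:
    """Tracks the time between consecutive frame arrivals (hypothetical helper)."""

    def __init__(self, smoothing=0.2):
        self.smoothing = smoothing  # weight of the newest sample (assumed value)
        self.last_arrival = None
        self.samples = 0
        self.interval = 0.0         # stays 0 until two frames have been seen

    def on_frame(self, timestamp):
        """Record a frame arrival and return the current interval estimate."""
        if self.last_arrival is not None:
            sample = timestamp - self.last_arrival
            if self.samples == 0:
                self.interval = sample  # first measurement is taken as is
            else:
                self.interval += self.smoothing * (sample - self.interval)
            self.samples += 1
        self.last_arrival = timestamp
        return self.interval
```

The estimate starts at 0, which corresponds to the initial state in which no interval has been measured yet.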

The admission control unit 113 is connected between the output end of the queue 112 and the fetch thread 121 of the object recognition device 120, and enables a real-time object recognition service for the object recognition device 120 through simple yet powerful frame control.

The admission control unit 113 extracts image frames from the queue 112 and delivers them to the fetch thread 121 of the object recognition device 120.

The admission control unit 113 includes an admission controller 113a and a demultiplexer 113b.

The object recognition service time measuring unit 114 is connected to the object detection unit 122 of the object recognition device 120 and measures the object recognition service time required to perform object recognition on an image frame.

The object recognition service time is the time the object recognition device 120 takes to perform the object recognition service for one frame.

The object recognition service time measuring unit 114 is connected to the admission control unit 113 and delivers to it the object recognition service time measured at the object detection unit 122.
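Measuring the per-frame service time amounts to timing one call to the detector. The sketch below is illustrative only: `ServiceTimeMeter`, its `measure` method, and the `detect` callable are hypothetical names, and the clock is injectable so the behavior can be checked without a real detector.

```python
import time


class ServiceTimeMeter:
    """Times a single object recognition call (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock        # monotonic clock; replaceable for testing
        self.service_time = 0.0   # duration of the most recent recognition call

    def measure(self, detect, frame):
        """Run detect(frame), record how long it took, and return its result."""
        start = self.clock()
        result = detect(frame)
        self.service_time = self.clock() - start
        return result
```

The recorded `service_time` is the value that would be handed to the admission control logic after each frame.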

The admission controller 113a receives the arrival time interval between input image frames measured by the frame processing unit 111c and the object recognition service time measured at the object detection unit 122.

The admission control unit 113 is triggered at the moment the object recognition device 120 finishes processing one frame and fetches the next new frame.

The admission controller 113a compares the arrival time interval of the input image frames measured by the frame processing unit 111c with the object recognition service time measured at the object detection unit 122.

According to the result from the admission controller 113a, the demultiplexer 113b determines whether to deliver the image frame fetched from the queue 112 to the object recognition device 120 or to discard it.

When the arrival time interval is greater than or equal to the object recognition service time, either the system is running on high-performance hardware and the object recognition service rate is higher than the input frame rate, or the system is in its initial state in which both the arrival time interval and the object recognition service time are 0. In either case, the queue holds only the image frame that arrived while the object recognition device 120 was processing.

When the arrival time interval is greater than or equal to the object recognition service time, the demultiplexer 113b takes no special action and delivers the image frame fetched from the queue 112 as is to the fetch thread 121 of the object recognition device 120.

When the arrival time interval is smaller than the object recognition service time, either the object recognition device 120 is running on relatively low-performance hardware, or its object recognition service time has increased because another process has left it short of GPU or memory, momentarily or during a particular period. In such cases, because of the fast input frame rate, several subsequent frames have already been stored in the queue 112 by the time the object recognition device 120 finishes the object recognition service for one frame.

The frames stored in the queue 112 in this way cause the real-time frame processing problem.

When the arrival time interval is smaller than the object recognition service time, the demultiplexer 113b drops all remaining packets currently in the queue 112 except the image frame at the tail of the queue 112, so that only the most recent image frame is delivered to the fetch thread 121 of the object recognition device 120. Here, the tail is the pointer to the last packet in the queue 112 and indicates the most recent frame that entered the queue 112, and the head is the pointer to the first packet in the queue 112.
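The two cases handled by the admission controller and demultiplexer can be sketched in a few lines. This is a minimal illustration under the assumption that the queue is a `collections.deque` whose right end is the tail; the function name `admit` and its return shape are hypothetical.

```python
from collections import deque


def admit(queue, arrival_interval, service_time):
    """Pick the next frame for the recognizer; return (frame, number_dropped)."""
    if not queue:
        return None, 0
    if arrival_interval >= service_time:
        # Recognizer keeps up (or nothing has been measured yet): plain FIFO hand-over.
        return queue.popleft(), 0
    # Recognizer is slower than the source: keep only the newest frame at the tail.
    latest = queue.pop()
    dropped = len(queue)
    queue.clear()
    return latest, dropped
```

For example, with four queued frames and an arrival interval shorter than the service time, `admit` returns the fourth frame and reports three dropped frames, leaving the queue empty for the next fetch.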

The number of discarded image frames varies with the current difference between the arrival time interval and the object recognition service time. For example, when the input frame rate is 30 fps and the service frame rate of the object recognition device 120 is 15 fps, only the 15 most recent frames per second actually undergo object recognition by the object recognition device 120, and the remaining 15 frames are discarded.
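The arithmetic in this example can be written out as a steady-state approximation. The function name `frames_per_second_split` is hypothetical, and the calculation assumes constant input and service rates, which the text does not guarantee in practice.

```python
def frames_per_second_split(input_fps, service_fps):
    """Return (processed, dropped) frames per second in steady state."""
    processed = min(input_fps, service_fps)  # the recognizer cannot exceed its own rate
    dropped = input_fps - processed          # everything else is discarded
    return processed, dropped
```

With the figures from the text, `frames_per_second_split(30, 15)` yields 15 processed and 15 dropped frames per second; when the recognizer is faster than the source, nothing is dropped.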

From the application's point of view, real-time frame processing is possible at a rate that converges to the object recognition service rate of the object recognition device 120.

From the application's point of view, the overall object recognition rate is not significantly affected even if object recognition is not performed on every frame.

Therefore, from the application's point of view, discarding image frames guarantees the real-time behavior of the object recognition device 120, and hardware suited to the characteristics of the application can be selected, which enables cost-effective system design.
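The effect of discarding frames on delay can be illustrated with a small simulation of the keep-latest policy. This is a sketch under simplifying assumptions (constant arrival interval and service time, integer milliseconds to avoid rounding issues); the function name `latest_frame_delays` is hypothetical and the model is not taken from the patent's experiments.

```python
def latest_frame_delays(num_frames, arrival_ms, service_ms):
    """Map frame index -> total delay in ms for frames served under keep-latest."""
    delays = {}
    free_at = 0  # time at which the recognizer becomes free
    head = 0     # oldest frame that has not yet been served or dropped
    while head < num_frames:
        free_at = max(free_at, head * arrival_ms)            # idle until a frame exists
        newest = min(num_frames - 1, free_at // arrival_ms)  # newest frame already arrived
        finish = free_at + service_ms
        delays[newest] = finish - newest * arrival_ms
        head = newest + 1                                    # older frames are dropped
        free_at = finish
    return delays
```

In this model, a served frame waits at most one service time before recognition starts, so its total delay stays within two service times instead of growing with the frame index as it does in a plain FIFO queue.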

객체 인식 장치(120)는 심층 신경망(Deep Neural Networks, DNN), 컨볼루션 신경망 (Convolutional deep Neural Networks, CNN), 순환 신경망(Reccurent Neural Network, RNN) 및 심층 신뢰 신경 망(Deep Belief Networks, DBN) 중 어느 하나의 신경망을 이용하여 입력 영상으로부터 특징맵을 추출한다.The object recognition device 120 includes a deep neural network (DNN), a convolutional neural network (CNN), a recurrent neural network (RNN), and a deep trust neural network (DBN). A feature map is extracted from the input image using any one of the neural networks.

객체 인식 장치(120)는 딥러닝(Deep learning)을 기반으로 학습부에 의하여 이미 학습이 완료된 모델을 이용하여서 특징맵을 생성할 수 있다. 딥러닝은 여러 비선형 변환기법의 조합을 통해 높은 수준의 추상화(Abstractions, 다량의 데이터나 복잡한 자료들 속에서 핵심적인 내용 또는 기능을 요약하는 작업)를 시도하는 기계학습(Machine Learning) 알고리즘의 집합으로 정의된다.The object recognition apparatus 120 may generate a feature map using a model that has already been learned by the learning unit based on deep learning. Deep learning is a set of machine learning algorithms that attempt high-level abstractions (summarizing core contents or functions in large amounts of data or complex data) through a combination of several nonlinear transformation methods. Defined.

객체 인식 장치(120)의 객체 탐지부(122)는 영상 프레임에서 객체가 존재할 것으로 추정되는 영역을 추출하고, 추출된 영역으로부터 특징을 나타내는 특징맵을 추출한다.The object detection unit 122 of the object recognition apparatus 120 extracts a region in which an object is estimated to exist from the image frame, and extracts a feature map indicating a characteristic from the extracted region.

객체 탐지부(122)는 추출한 특징맵을 기초로 영상에서 객체의 존재가 추정되는 적어도 하나의 영역을 추출한다. 영역을 추출하는 방법은 예를 들어 faster RCNN, SSD(Single Shot MultiBox Detector), YOLO(You Only Look Once) 등이 있을 수 있으며, 본 발명의 객체 인식 장치(120)는 YOLO 객체 인식모듈을 일례로 하고 있다.The object detector 122 extracts at least one region in which the existence of an object is estimated from the image based on the extracted feature map. A method of extracting the region may include, for example, faster RCNN, Single Shot MultiBox Detector (SSD), You Only Look Once (YOLO), etc., and the object recognition apparatus 120 of the present invention uses the YOLO object recognition module as an example. are doing

The object detection unit 122 may select, from among the feature maps, a feature map that contains class coordinates for each region of the image, identify from the selected feature map the coordinates that delimit a region, and extract the identified coordinates as a region in which an object is estimated to exist.

The object detection unit 122 may be configured with one, or two or more, of various object types such as articles, people, and animals.

In addition, the object detection unit 122 may display each of the extracted regions as a bounding box that encloses the outermost extent of the corresponding object.

Each bounding box indicates that an object may be present at the position of that bounding box in the image.

The display unit 123 receives a frame representing image information as input and outputs result information in which the position coordinates ((X1, Y1), (X2, Y2)) of an object within the frame are drawn as a bounding box.
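As a minimal sketch of the result information described above, a detection can be held as two corner coordinates plus a class label. The `Detection` type, its field names, and the `clamp` helper are illustrative assumptions and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One bounding box: top-left (x1, y1) and bottom-right (x2, y2). Hypothetical type."""
    label: str
    x1: int
    y1: int
    x2: int
    y2: int

    def clamp(self, width: int, height: int) -> "Detection":
        """Return a copy whose corners lie inside a width x height frame."""
        return Detection(
            self.label,
            max(0, min(self.x1, width - 1)),
            max(0, min(self.y1, height - 1)),
            max(0, min(self.x2, width - 1)),
            max(0, min(self.y2, height - 1)),
        )

    def area(self) -> int:
        """Box area in pixels; zero for degenerate boxes."""
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)
```

Clamping before drawing keeps a box that extends past the frame edge from producing out-of-range pixel coordinates.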

When the object recognition device 120 receives and plays image frames, it processes 15 to 30 frames per second and outputs image information on which the bounding boxes are displayed.

FIG. 6 is a diagram illustrating real-time processing using an adaptive frame management system according to an embodiment of the present invention, and FIG. 7 is a diagram comparing the object recognition service time of an object recognition device equipped with the adaptive frame control device against that of a conventional object recognition device.

FIG. 6 shows an example of real-time frame processing in the adaptive frame control device 110. As shown in FIG. 6, the adaptive frame control device 110 does not perform the object recognition service on every input image frame; it processes only the most recent image frame.
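The latest-frame-only behavior described above can be sketched as a buffer whose consumer always takes the newest entry and discards everything older. The class name `LatestFrameQueue` and its methods are hypothetical, and the sketch is single-threaded; a real pipeline would need locking around the buffer.

```python
from collections import deque
from typing import Deque, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class LatestFrameQueue(Generic[T]):
    """Buffer whose consumer takes only the newest frame (illustrative helper)."""

    def __init__(self) -> None:
        self._frames: Deque[T] = deque()

    def put(self, frame: T) -> None:
        """Producer side: append a newly arrived frame at the tail."""
        self._frames.append(frame)

    def fetch_latest(self) -> Tuple[Optional[T], int]:
        """Return (newest frame, number of older frames dropped)."""
        if not self._frames:
            return None, 0
        latest = self._frames.pop()   # tail of the queue is the most recent frame
        dropped = len(self._frames)   # every older frame is discarded
        self._frames.clear()
        return latest, dropped
```

If five frames have been queued while the detector was busy, one call to `fetch_latest()` returns the fifth frame and reports four dropped frames, so the detector never works on stale input.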

Accordingly, the total service time for the image frames that are actually processed is given by Equation 2 below.

Because the adaptive frame control device 110 processes the most recent image frame introduced into the queue 112, [...] holds, and since this is a relatively small value, the total service time converges approximately to [...].

FIG. 7 shows the case in which the input image frames arrive at 30 fps (an interarrival time of about 33 ms between frames), the object recognition device 120 performs object recognition at 15 fps (an object recognition service time of about 67 ms per frame), and the application deadline is 0.3 seconds.

With conventional YOLO, the processing time for the image frames stored in the queue shows delay accumulating as time elapses.

Because the size of an actual queue is finite, when a finite queue is considered, a delay that has converged to the level set by the maximum queue size continues to occur.

In contrast, because the adaptive frame control device 110 of the present invention uses only the most recent frame for the object recognition service, no cumulative delay occurs, and each image frame can be processed within the application deadline.
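The contrast between the two policies can be illustrated with a small simulation of the FIG. 7 scenario. This is a sketch under stated assumptions, not the patent's experiment: the "15 fps" service figure is read as a fixed service time of 1/15 s per frame, arrivals are perfectly periodic, queue-fetch overhead is ignored, and the function names are hypothetical. Exact fractions are used to avoid floating-point boundary effects.

```python
from fractions import Fraction
from typing import List


def fifo_delays(t_a: Fraction, t_s: Fraction, n_frames: int) -> List[Fraction]:
    """Delay (arrival to end of detection) when every frame is served in order."""
    delays = []
    server_free = Fraction(0)
    for i in range(n_frames):
        arrival = i * t_a
        start = max(arrival, server_free)   # wait until the detector is idle
        server_free = start + t_s
        delays.append(server_free - arrival)
    return delays


def latest_only_delays(t_a: Fraction, t_s: Fraction, horizon: Fraction) -> List[Fraction]:
    """Delay of each processed frame when the detector always takes the newest frame."""
    delays = []
    now = Fraction(0)
    last_served = -1
    while now < horizon:
        newest = now // t_a                 # index of the newest frame that has arrived
        if newest <= last_served:           # nothing new yet: wait for the next arrival
            newest = last_served + 1
            now = newest * t_a
        delays.append(now + t_s - newest * t_a)
        now += t_s
        last_served = newest
    return delays


if __name__ == "__main__":
    t_a, t_s = Fraction(1, 30), Fraction(1, 15)
    fifo = fifo_delays(t_a, t_s, n_frames=90)            # 3 seconds of input
    latest = latest_only_delays(t_a, t_s, Fraction(3))
    print("FIFO delay of last frame (s):", float(fifo[-1]))
    print("Latest-only worst delay (s):", float(max(latest)))
```

Under these assumptions the in-order delay of frame i is 1/15 + i/30 seconds, so it passes the 0.3 s deadline from the ninth frame onward and keeps growing, while the latest-only delay stays at one service time (1/15 s).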

The development environment of the adaptive frame control device 110 of the present invention and the basic environment configuration for the comparative experiment with YOLO are shown in Table 1 below.

FIG. 8 is a diagram showing the object detection time for each hardware platform of the object recognition device, and FIGS. 9 to 13 are diagrams showing the total processing delay of each image frame for RTSP transmission, executed on different hardware platforms of the object recognition device.

In FIG. 8, the GTX 1060 system shows the best performance among the hardware platforms.

YOLO416 and YOLO608 can be seen to incur delays of 1.4 seconds and 2.8 seconds, respectively. This result shows that an event or object currently seen by the actual camera is confirmed by the object recognition device only 1.5 to 2.8 seconds later. In other words, it demonstrates that when the object detection speed is lower than 30 FPS, real-time processing of events from an RTSP input source cannot be guaranteed.

AFC 416 and AFC 128 can be seen to perform the object recognition service with a delay of less than 0.3 seconds, guaranteeing real-time processing even with the same system, the same input conditions, and the same FPS throughput as YOLO. Here, AFC denotes a system to which the Adaptive Frame Control method of the present invention is applied.

This object recognition service delay problem can be seen to become more severe on the lower-specification hardware of FIGS. 10 to 13.

If an application with an object recognition deadline of 0.5 seconds needs to perform object recognition at an image size of 415×415, YOLO cannot meet the requirement even on a computer equipped with a GTX 1060, and a higher-specification system must be considered.

With the adaptive frame control (AFC) method of the present invention, however, the application requirement can be met using a Jetson NX under the conditions shown in FIG. 10.

Comparing the actual prices of a Jetson NX and a system of GTX 1060 class or higher, the NX allows the application to be built at one-sixth of the price or less.

If the image size requirement is lower still (128×128, 256×256), a Jetson Nano may be considered as the hardware, as shown in FIG. 13.

With conventional YOLO, by contrast, the object recognition service delay can be seen to be quite severe.

This performance difference arises because conventional YOLO overlooks the mismatch between the input frame rate and the service rate required for object recognition, whereas the adaptive frame control (AFC) method of the present invention adjusts the frame rate at the input stage through the multi-source selector, uses a separate queue 112 so as not to depend on RTSP buffering, and, through the access permission control unit 113, flexibly controls frames according to the input frame rate and the object recognition service rate of the object recognition device 120.
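One way to picture the rate comparison performed by the access permission control unit 113 is the sketch below. It is an illustration under explicit assumptions: the text available here does not give the exact comparison rule or drop count, so the policy (drop only when frames arrive faster than they can be recognized, and skip the frames that arrive during one detection) and the name `frames_to_drop` are hypothetical.

```python
import math


def frames_to_drop(t_a: float, t_s: float) -> int:
    """ASSUMED policy: how many queued frames to skip per processed frame.

    t_a: measured interarrival time between input frames, in seconds.
    t_s: measured object recognition service time per frame, in seconds.
    Returns 0 when frames arrive no faster than they are processed.
    """
    if t_a <= 0 or t_s <= 0:
        raise ValueError("intervals must be positive")
    if t_a >= t_s:
        return 0                       # detector keeps up: pass frames through
    return math.ceil(t_s / t_a) - 1    # frames that arrive while one is processed
```

For example, with frames arriving every 1.0 time unit and a service time of 2.5 units, the sketch skips two frames per processed frame; when the service time is shorter than the interarrival time it skips none.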

The experimental results of the present invention thus demonstrate that, on hardware-constrained systems, the real-time object detection capability of AFC for RTSP input sources is considerably superior to that of conventional YOLO.

FIG. 14 shows the image frames undergoing object recognition at each point in time in the actual experiment.

The embodiments of the present invention described above are not implemented only through an apparatus and/or a method; they may also be implemented through a program that realizes functions corresponding to the configuration of the embodiments, a recording medium on which the program is recorded, and the like, and such implementations can readily be carried out by those skilled in the art to which the present invention pertains from the description of the embodiments above.

Although the embodiments of the present invention have been described in detail above, the scope of the present invention is not limited thereto; various modifications and improvements made by those skilled in the art using the basic concept of the present invention defined in the following claims also fall within the scope of the present invention.

100: adaptive frame management system
110: adaptive frame control device
111: multi-source selector
111a: multi-source mux
111b: synchronizer
111c: frame processing unit
112: queue
113: access permission control unit
113a: admission controller
113b: demultiplexer
114: object recognition service time measurement unit
120: object recognition device
121: fetch thread
122: object detection unit
123: display unit

Claims

An adaptive frame management system for real-time object recognition, comprising:
a multi-source selector that receives image frames from a plurality of input sources, delivers them to a queue, and measures the arrival time interval [...];
an object recognition device that estimates the existence of an object in the image frame and outputs an object recognition result;
an object recognition service time measuring unit that measures the object recognition service time [...]; and
an access permission control unit that is connected to the output terminal of the queue to receive each of the image frames from the queue, and [...] the arrival time interval received from the multi-source selector and the object recognition service time [...],
wherein the access permission control unit, when the arrival time interval [...] the object recognition service time, drops some of the image frames fetched from the queue and delivers the remaining image frames to the object recognition device.

delete

[...] and the object recognition service time [...]. The access permission control unit, when the arrival time interval [...] the object recognition service time [...].

The adaptive frame management system for real-time object recognition according to claim 1 or 3,
wherein the access permission control unit, when the arrival time interval [...] the object recognition service time, delivers the image frame fetched from the queue to the object recognition device as it is.

The adaptive frame management system for real-time object recognition of claim 3,
wherein the number of discarded image frames, when the arrival time interval [...] the object recognition service time, varies according to the difference in [...].

[...] and the object recognition service time [...],
wherein the multi-source selector comprises: a multi-source mux that receives image frames from a plurality of input sources;
a synchronizer that compares the time taken to read the image frames input from the multi-source mux with a target FPS (frames per second) and, when the input frame rate is higher than the target FPS, synchronizes it to the target FPS; and
a frame processing unit that is connected to the synchronizer and to the input end of the queue, measures the arrival time interval between the input image frames, and transmits the measurement to the access permission control unit.

[...] and the object recognition service time [...],
wherein the access permission control unit comprises:
an admission controller that receives the arrival time interval from the multi-source selector and the object recognition service time [...]; and
[...] connected to the output terminal of the queue to receive the image frame from the queue, and, when the result value of the admission controller is that the arrival time interval [...] the object recognition service time [...].

[...] and the object recognition service time [...],
wherein the access permission control unit is triggered when the object recognition device completes processing of one frame and fetches the next new frame.

[...] in the multi-source selector;
recognizing, in an object recognition device, the existence of an object in the image frame;
measuring the object recognition service time [...] in the object recognition service time measuring unit connected to the object recognition device; and
comparing the arrival time interval with the object recognition service time and, according to the result of the comparison, determining whether to transmit the image frame retrieved from the queue to the object recognition device or to discard it,
wherein, in the adaptive frame control method for real-time object recognition, the step of determining whether to transmit or discard comprises, when the arrival time interval [...] the object recognition service time, dropping the remaining packets of the queue except for the image frame at the tail of the queue so that only the latest image frame is delivered.

delete

[...] measuring [...]; and
[...] the arrival time interval and the object recognition service time [...], the adaptive frame control method for real-time object recognition comprising the step of, when [...] the object recognition service time, transferring the image frame fetched from the queue to the object recognition device as it is.