KR102201241B1

KR102201241B1 - Adaptive Object Recognizing Apparatus and Method for Processing Data Real Time In Multi Channel Video

Info

Publication number: KR102201241B1
Application number: KR1020190137470A
Authority: KR
Inventors: 황광일; 이정훈; 정영빈
Original assignee: 인천대학교 산학협력단
Priority date: 2019-10-31
Filing date: 2019-10-31
Publication date: 2021-01-12

Abstract

An adaptive object recognition apparatus and method for processing a multi-channel video stream in real time. The apparatus receives multi-channel video frames and, according to the object recognition result, adaptively changes a channel selection list that determines the order in which input channels supply image frames. Multi-channel CCTV video frames are thereby processed without an increase in computing resources, so that real-time object recognition on CCTV video is performed efficiently.

Description

Adaptive Object Recognizing Apparatus and Method for Processing Data Real Time In Multi Channel Video

The present invention relates to an object recognition apparatus, and more particularly to an adaptive object recognition apparatus and method for processing a multi-channel video stream in real time. The apparatus receives multi-channel video frames and, according to the object recognition result, adaptively changes a channel selection list that selects the order of the input channels from which image frames are received. Multi-channel CCTV image frames are thereby processed without an increase in computing resources, and real-time object recognition on CCTV video is performed efficiently.

In general, a CCTV (Closed Circuit Television) system installs cameras at multiple indoor and outdoor locations that require security and displays the video signals received from those cameras on several TV monitors at a single location, so that many places can be monitored simultaneously by a small surveillance staff. In addition, the video signals received from the cameras are recorded on a storage medium such as a VCR, which helps in apprehending the offender when a theft occurs and can also serve as important evidence.

However, because CCTV video is recorded in analog form, it is difficult to apply image processing techniques that aid detailed investigation, such as motion detection or object recognition. Thanks to recent advances in digital technology, techniques that digitize and process video signals have been introduced.

A digital video recorder (DVR) or network video recorder (NVR) system converts the analog video signal input through a camera into a digital signal, compresses and decompresses the video with MPEG, an international standard for moving picture compression, and stores and plays it back.

Such a DVR or NVR plays back stored video information to verify an incident, and is connected to a plurality of CCTV cameras so that it stores multi-channel video information.

Accordingly, a conventional unmanned surveillance CCTV system uses image processing technology to monitor intrusion, loitering, and the like in a specific area in real time, and automatically raises an alarm for a suspicious situation.

Such CCTV systems use machine-learning-based object recognition for real-time unmanned surveillance. Recently, deep-learning-based object recognition algorithms have been developed for image processing; representative examples include SSD (Single Shot MultiBox Detector) and YOLO (You Only Look Once).

Such a deep-learning-based algorithm performs deep learning analysis on one video file and outputs one detection result. To apply it to a plurality of CCTV channels, object recognition modules must run in parallel, which requires large amounts of hardware resources such as memory, GPUs, and MCUs, and building additional servers incurs high cost.

However, most DVRs and NVRs today use an ordinary PC or are built as systems with PC-class performance. Consequently, most DVRs and NVRs either cannot apply a deep-learning-based object recognition algorithm for real-time object recognition on multi-channel CCTV, or cannot run it properly.

Korean Patent Registration No. 10-1930940

To solve this problem, an object of the present invention is to provide an adaptive object recognition apparatus and method for processing a multi-channel video stream in real time, which receive multi-channel video frames and, according to the object recognition result, adaptively change a channel selection list that selects the order of the input channels from which image frames are received, thereby processing multi-channel CCTV image frames without an increase in computing resources and efficiently performing real-time object recognition on CCTV video.

To achieve the above object, an adaptive object recognition apparatus for processing a multi-channel video stream in real time according to a feature of the present invention includes:

a mux, which is a video channel multiplexer whose input ports are respectively connected to the output signal lines of a plurality of CCTV (Closed Circuit Television) cameras;

an object recognition module that receives from the mux an image frame of one CCTV camera among the plurality of CCTV cameras, estimates the presence of an object in the received image frame, and outputs an object recognition result;

a detection result unit that, based on the output object recognition result, adaptively changes a channel selection list that selects the order of the input channels from which image frames are received; and

an adaptive channel selection unit that generates a channel selection signal according to the changed channel selection list and transmits it to the mux,

wherein the mux selectively receives an image frame through the input channel selected according to the channel selection signal of the adaptive channel selection unit.

In an adaptive object recognition apparatus for processing a multi-channel video stream in real time according to a feature of the present invention,

the object recognition module receives from the mux an image frame of one CCTV camera among the plurality of CCTV cameras, estimates the presence of an object in the received image frame, outputs an object recognition result, and, by connecting to one input channel and receiving its video information, measures a processing capability that indicates the maximum frame recognition rate for that one input channel; and

the channel selection list dynamically assigns, within the processing capability, priorities that select the order of the input channels.

An adaptive object recognition method for processing a multi-channel video stream in real time according to a feature of the present invention includes:

sequentially receiving image frames from each CCTV camera through a mux whose input ports are respectively connected to the output signal lines of a plurality of CCTV (Closed Circuit Television) cameras;

estimating, using an object recognition module, the presence of an object in the image frame received from each CCTV camera, and outputting a respective object recognition result;

adaptively changing, based on the respective object recognition results, a channel selection list that selects the order of the input channels from which image frames are received; and

generating a channel selection signal according to the changed channel selection list and transmitting it to the mux, the mux selectively receiving an image frame through the input channel selected according to the channel selection signal.

With the configuration described above, the present invention processes multi-channel CCTV image frames without an increase in computing resources and can therefore perform real-time object recognition on CCTV video efficiently.

Because the adaptive object recognition apparatus has a modular software configuration, the present invention is compatible with a variety of deep-learning-based object recognition modules.

The present invention avoids both a sharp drop in performance and the need for higher hardware specifications as the number of CCTV cameras increases.

FIG. 1 is a diagram showing the configuration of an adaptive object recognition apparatus for processing a multi-channel video stream in real time according to an embodiment of the present invention.
FIG. 2 is a diagram showing an expanded channel selection list according to an embodiment of the present invention.
FIG. 3 is a diagram showing how image frames are selected and received according to the channel selection list according to an embodiment of the present invention.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

FIG. 1 shows the configuration of an adaptive object recognition apparatus for processing a multi-channel video stream in real time according to an embodiment of the present invention, FIG. 2 shows an expanded channel selection list according to an embodiment of the present invention, and FIG. 3 shows how image frames are selected and received according to the channel selection list according to an embodiment of the present invention.

The adaptive object recognition apparatus 100 for processing a multi-channel video stream in real time according to an embodiment of the present invention includes a mux 110, an object recognition module 120, a detection result unit 130, an event log unit 140, and an adaptive channel selection unit 150.

The mux 110 is a video channel multiplexer whose input ports are respectively connected to the output signal lines of a plurality of CCTV (Closed Circuit Television) cameras 160. The input signal lines of a digital video recorder (DVR) or network video recorder (NVR) 170 are also respectively connected to the output signal lines of the CCTV cameras 160.

As shown in FIG. 3, the mux 110 receives only the frames of the selected input channel and discards the frames of the remaining input channels.

The DVR/NVR 170 is connected to the plurality of CCTV cameras 160 and stores multi-channel image frames.

When a selection signal for one input channel is generated, the mux 110 receives an image frame from the CCTV camera 160 electrically connected to that input channel and outputs it.

In other words, the mux 110 selects one CCTV camera 160, corresponding to one input channel, from the plurality of CCTV cameras 160 and transmits its image frame to the object recognition module 120.
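The selection behavior of the mux can be sketched in a few lines of Python. This is an illustrative model only, not the circuit of the embodiment: the function name `mux_select` and the dictionary of per-channel frames are assumptions introduced here.

```python
from typing import Dict, Optional


def mux_select(frames_by_channel: Dict[int, bytes], selected: int) -> Optional[bytes]:
    """Model of the mux 110: pass through only the selected channel's frame.

    frames_by_channel is a hypothetical snapshot of the frames currently
    available on each input port. Frames on all other channels are simply
    not forwarded, which corresponds to discarding them.
    """
    return frames_by_channel.get(selected)
```

A call such as `mux_select({1: b"f1", 2: b"f2"}, 2)` forwards only the frame of channel 2.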

The object recognition module 120 extracts a feature map from the input image using any one of a deep neural network (DNN), a convolutional neural network (CNN), a recurrent neural network (RNN), and a deep belief network (DBN).

The object recognition module 120 may generate the feature map using a model that has already been trained by a learning unit based on deep learning. Deep learning is defined as a set of machine learning algorithms that attempt high-level abstraction (summarizing the key content or functions within large amounts of data or complex material) through a combination of several nonlinear transformation techniques.

The object recognition module 120 extracts, from the image frame, a region in which an object is presumed to exist, and extracts from that region a feature map representing its features.

Based on the extracted feature map, the object recognition module 120 extracts at least one region of the image in which an object is presumed to exist. Methods for extracting the region include, for example, Faster R-CNN, SSD (Single Shot MultiBox Detector), and YOLO (You Only Look Once).

The object recognition module 120 selects, from among the feature maps, a feature map that contains the class coordinates for each region of the image, identifies from the selected feature map the coordinates that delimit a region, and may extract the identified coordinates as a region in which an object is presumed to exist.

The object recognition module 120 may be configured to recognize one or more kinds of objects, such as things, people, and animals.

In addition, the object recognition module 120 may mark each of the extracted regions with a bounding box that encloses the outermost boundary of the corresponding object.

Each bounding box indicates that an object may be present at the location of that bounding box in the image.

The object recognition module 120 takes a frame of video information as input and outputs result information in which the position coordinates ((X1, Y1), (X2, Y2)) of each object within the frame are given as a bounding box.

When it receives and plays image frames, the object recognition module 120 processes 15 to 30 frames per second and outputs video information with the bounding boxes drawn.

The object recognition module 120 receives the image frame of the input channel selected on the mux 110 by the adaptive channel selection unit 150, generates an object recognition result using an object recognition algorithm, and transmits the result to the detection result unit 130.

In other words, the object recognition module 120 receives from the mux 110 an image frame of one CCTV camera 160 among the plurality of CCTV cameras 160, estimates the presence of an object in the received image frame, and outputs an object recognition result.

Assuming that there are N input channels to be monitored, one from each CCTV camera 160, the object recognition module 120 connects to one input channel, receives and plays its image frames, and thereby measures its processing capability.

Here, the processing capability is the system's ability to process video at C frames/sec.

For example, the object recognition module 120 may measure a processing capability of 25 frames per second, 30 frames per second, and so on.

This processing capability may vary with the processing specifications of each system (PC, NVR, DVR).

The detection result unit 130 may receive the processing capability (C frames/sec) from the object recognition module 120 and set a deadline time.

Here, the deadline time (Td) is the maximum allowable time with respect to the movement of an object within the image frames, and is set to, for example, 2 seconds or 3 seconds.

The detection result unit 130 calculates the maximum size M of the channel selection list 152 by multiplying the processing capability by the deadline time (M = C × Td).

The detection result unit 130 sets the maximum number of supported channels to be less than or equal to the maximum size (M), and sets the event tick to 1/C sec. Here, the event tick is the interval between events at which the adaptive channel selection unit 150 selects a channel on the mux 110.
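The sizing rules above (M = C × Td, channel count at most M, event tick of 1/C sec) can be written out as a short Python sketch. The function names are hypothetical, and rounding M down to a whole number of slots is an assumption made here for non-integer products.

```python
import math


def max_list_size(c_fps: float, td_sec: float) -> int:
    """M = C x Td, rounded down to a whole number of list slots (assumption)."""
    return math.floor(c_fps * td_sec)


def event_tick(c_fps: float) -> float:
    """Interval in seconds between channel selections: 1/C."""
    return 1.0 / c_fps


def check_channel_count(n_channels: int, m: int) -> None:
    """The number of supported channels must not exceed the maximum size M."""
    if n_channels > m:
        raise ValueError(f"{n_channels} channels exceed the maximum list size {m}")
```

With C = 25 frames/sec and Td = 2 sec, this gives M = 50 slots and an event tick of 0.04 sec.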

For example, the adaptive channel selection unit 150 selects a channel on the mux 110, such as channel 1, channel 2, channel 4, or channel 5, every 1/C sec (the event interval). The event tick therefore follows the frame rate.

The adaptive channel selection unit 150 initially sets the size of the channel selection list 152 to N so that every channel is selected once, in order.

The adaptive channel selection unit 150 generates a channel selection signal that selects every channel once, in order, and transmits it to the mux 110.
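The initial list is a plain round-robin over the N channels. A minimal sketch, assuming channels are numbered 1 to N and using a hypothetical function name:

```python
from typing import List


def initial_channel_list(n_channels: int) -> List[int]:
    """Initial channel selection list: every channel 1..N once, in order."""
    return list(range(1, n_channels + 1))
```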

When an object is recognized, the event log unit 140 stores the corresponding information in a logger. The event log unit 140 records the input channel information, the time, the number of detected objects, and the bounding box information.
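One possible shape for such a log record is sketched below. The text specifies only which items are recorded; the JSON encoding, the field names, and the function name `make_log_entry` are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def make_log_entry(channel: int, boxes: List[Box], when: Optional[datetime] = None) -> str:
    """Serialize one recognition event: channel, time, object count, boxes."""
    when = when or datetime.now(timezone.utc)
    record = {
        "channel": channel,
        "time": when.isoformat(),
        "count": len(boxes),
        "boxes": [list(b) for b in boxes],
    }
    return json.dumps(record)
```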

When object recognition has been performed on all input channels selected by the adaptive channel selection unit 150, from the first input channel to the n-th input channel, the object recognition module 120 transmits the object recognition results generated by the object recognition algorithm to the detection result unit 130.

The completion of object recognition for all channels, from the first channel to the n-th channel, is referred to as the end of one round.

The detection result unit 130 receives and stores the object recognition result for each individual image frame of a round. The object recognition result may include the bounding boxes ((X1, Y1), (X2, Y2)), the number of recognized objects, and the object recognition probability.
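The per-frame result described above could be held in a small data structure like the following sketch. The class and field names are hypothetical; only the three kinds of information (bounding box, object count, recognition probability) come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) corners of the bounding box
    probability: float              # object recognition probability


@dataclass
class FrameResult:
    channel: int                    # input channel the frame came from
    detections: List[Detection]

    @property
    def count(self) -> int:
        """Number of recognized objects in this frame."""
        return len(self.detections)
```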

The detection result unit 130 may analyze the received object recognition results and adaptively change the channel selection list 152 for the next round.

Based on the output object recognition results, the detection result unit 130 adaptively changes the channel selection list 152, which selects the order of the input channels from which image frames are received, and transmits it to the adaptive channel selection unit 150.

The adaptive channel selection unit 150 generates a channel selection signal according to the changed channel selection list 152 and transmits it to the mux 110. The mux 110 selectively receives image frames through the input channel selected according to the channel selection signal of the adaptive channel selection unit 150.

The detection result unit 130 stores the object recognition results generated during one round and analyzes them. If no object is recognized in the image frames of any input channel, it again generates a channel selection list 152 of N channels and transmits it to the adaptive channel selection unit 150.

If the analysis of the object recognition results detects one or more objects, the detection result unit 130 expands the channel selection list 152 to its maximum size M (= C × Td) and transmits the expanded channel selection list 152 to the adaptive channel selection unit 150.

C is the maximum frame recognition rate of the object recognition module 120 for one input channel, and Td is the preset maximum allowable time with respect to the movement of an object within the image frames.

Here, the expanded channel selection list 152 is divided into two areas, a PA (Privileged Area) 152a and an NPA (Non-Privileged Area) 152b. The PA 152a is the selection order list area for channels in which an object was recognized in the current round, and the NPA 152b is the area for channels in which no object was recognized in the current round.

The PA 152a is a first list area giving the selection order for input channels in which an object was recognized, and the NPA 152b is a second list area giving the selection order for input channels in which no object was recognized.

When a round is completed, the detection result unit 130 stores, for each input channel, the recognized objects, the object recognition probabilities, and the bounding box information of each object.

From the object recognition results, the detection result unit 130 extracts, for each input channel, the maximum object recognition probability among the objects recognized on that channel.
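Extracting the per-channel maximum probability is a one-line reduction. In this sketch the input format (a mapping from channel number to the list of probabilities observed on that channel during the round) and the function name are assumptions; channels with no recognized object are left out of the result.

```python
from typing import Dict, List


def max_probability_per_channel(probs_by_channel: Dict[int, List[float]]) -> Dict[int, float]:
    """Keep, for each channel with at least one detection, its highest probability."""
    return {ch: max(probs) for ch, probs in probs_by_channel.items() if probs}
```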

When an object is recognized in at least one input channel, the detection result unit 130 expands the size of the channel selection list 152 to the maximum size (M = C × Td) and calculates the size of the PA section.

The detection result unit 130 calculates the size of the PA section 152a by subtracting the size of the NPA section from the maximum size: PA = M - NPA, where NPA = N - D. Here, D is the number of input channels in which an object was recognized.
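The PA size follows directly from M, N, and D. A minimal sketch with a hypothetical function name; the argument checks are additions made here, not requirements stated in the text.

```python
def pa_size(m: int, n: int, d: int) -> int:
    """Size of the PA section: PA = M - (N - D).

    m: maximum list size, n: number of input channels,
    d: number of channels in which an object was recognized.
    """
    if not 0 <= d <= n:
        raise ValueError("d must be between 0 and n")
    if n > m:
        raise ValueError("n must not exceed the maximum list size m")
    return m - (n - d)
```

For instance, the assumed values M = 20, N = 10, and D = 3 give a PA section of 13 slots and an NPA section of 7 slots.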

The size of the PA section 152a increases as objects are recognized in more input channels and decreases as objects are recognized in fewer.

The purpose of the PA section 152a is to feed the image frames of input channels in which an object was recognized into the object recognition module at least once more often than those of channels in which no object was recognized, so that object recognition is performed on them more frequently.

The detection result unit 130 calculates the integer quotient obtained by dividing the size of the PA section 152a by the number of input channels in which an object was recognized. For example, if the PA section 152a has size 14 and there are three such input channels (D = 3), the quotient is 14 / 3 = 4, with the remainder discarded.

In the PA section 152a, the input channels in which an object was recognized are assigned in even alternation.

For example, as shown in FIG. 2, assuming that the PA section 152a has size 13 and that there are three input channels (D) in which an object was recognized, the integer quotient is 13 / 3 (PA / D) = 4; that is, the three channels in which an object was recognized are assigned within the PA section 152a in even alternation.

Within the PA section, the channels in which an object was recognized (ch[1], ch[2], ch[3]) are each assigned as many times as the quotient (4 times). That is, ch[1], ch[2], ch[3], ch[1], ch[2], ch[3], ch[1], ch[2], ch[3], ch[1], ch[2], ch[3] are assigned.

The detection result unit 130 calculates the remainder obtained by dividing the size of the PA section 152a by the number of input channels in which an object was recognized. For example, if the PA section 152a has size 13 and there are three such input channels (D = 3), the remainder is 13 % 3 = 1.

The remaining part of the PA section 152a (one slot in this example) is what is left after the recognized input channels have been assigned in full rotations; it is filled with input channels in which an object was recognized, in descending order of object recognition probability.

In other words, for the remaining slots, the maximum object recognition probabilities of the input channels in which an object was recognized are compared, and input channels are assigned in descending order of that probability.

For example, as shown in FIG. 2, ch[1], which has the highest object recognition probability, is assigned to the remaining slot in the PA (13 % 3 = 1).
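The quotient and remainder steps for filling the PA section can be sketched as follows. The function name `fill_pa` and the input mapping from channel number to maximum recognition probability are assumptions; ordering the full rotations by channel number is also an assumption, chosen to match the ch[1], ch[2], ch[3] sequence in the example.

```python
from typing import Dict, List


def fill_pa(pa: int, max_prob: Dict[int, float]) -> List[int]:
    """Fill a PA section of `pa` slots with the channels that had detections.

    max_prob maps each detected channel to its maximum recognition probability.
    """
    channels = sorted(max_prob)                    # detected channels, by number
    quotient, remainder = divmod(pa, len(channels))
    slots = channels * quotient                    # full, evenly alternating rotations
    # Leftover slots go to the channels with the highest probability first.
    by_prob = sorted(channels, key=lambda ch: max_prob[ch], reverse=True)
    return slots + by_prob[:remainder]
```

With 13 slots and three detected channels, of which channel 1 has the highest probability, the result is four rotations of 1, 2, 3 followed by one extra slot for channel 1.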

The detection result unit 130 assigns ch[4] through ch[N] to the NPA section 152b sequentially, once each. When channel assignment for the PA section 152a and the NPA section 152b is complete, the detection result unit 130 transmits the changed channel selection list 152 to the adaptive channel selection unit 150.
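Putting the pieces together, the list for the next round could be built as in the sketch below. It is a hedged reading of the procedure, not the reference implementation: the function name is hypothetical, channels are assumed to be numbered 1 to N, and the NPA section is generalized from "ch[4] through ch[N]" to "all channels without a detection, in numeric order".

```python
from typing import Dict, List


def build_channel_list(n: int, m: int, max_prob: Dict[int, float]) -> List[int]:
    """Build the channel selection list for the next round.

    n: number of input channels, m: maximum list size (C x Td),
    max_prob: maximum recognition probability per channel with a detection.
    """
    all_channels = list(range(1, n + 1))
    detected = [ch for ch in all_channels if ch in max_prob]
    if not detected:
        return all_channels                     # no detection: plain N-slot list
    if m < n:
        raise ValueError("maximum list size must be at least the channel count")
    pa = m - (n - len(detected))                # PA = M - (N - D)
    quotient, remainder = divmod(pa, len(detected))
    pa_slots = detected * quotient              # even alternation
    by_prob = sorted(detected, key=lambda ch: max_prob[ch], reverse=True)
    pa_slots += by_prob[:remainder]             # leftovers by probability
    npa_slots = [ch for ch in all_channels if ch not in max_prob]
    return pa_slots + npa_slots                 # PA section, then NPA section
```

With the assumed values N = 10 and M = 20 and detections on channels 1 to 3, the list has 20 entries: 13 PA slots followed by channels 4 to 10 once each.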

The adaptive channel selection unit 150 receives the changed channel selection list 152 from the detection result unit 130 and controls the mux 110 based on the changed channel selection list 152 to perform the next round.
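A round driven by the channel selection list might be sketched as follows. This is a simplified illustration under stated assumptions, not the apparatus itself: `run_round`, `select_and_grab` (standing in for the mux 110 selecting a channel and delivering a frame), `recognize` (standing in for the entity recognition module 120), and the `threshold` parameter are all hypothetical names.

```python
from typing import Any, Callable, Dict, Iterable


def run_round(
    selection_list: Iterable[int],
    select_and_grab: Callable[[int], Any],
    recognize: Callable[[Any], float],
    threshold: float = 0.5,
) -> Dict[int, float]:
    """Visit channels in list order and collect recognition results.

    Returns a mapping from each channel in which an entity was recognized
    to the maximum recognition probability observed on that channel.
    The threshold that decides "recognized" is an assumption.
    """
    best: Dict[int, float] = {}
    for channel in selection_list:
        frame = select_and_grab(channel)  # mux switches to this channel
        probability = recognize(frame)    # recognition module scores the frame
        if probability >= threshold:
            best[channel] = max(best.get(channel, 0.0), probability)
    return best
```

The returned mapping is the kind of per-channel result from which the detection result unit could rebuild the channel selection list for the following round.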

Through the channel selection list 152, the detection result unit 130 can dynamically allocate, within the processing capability, the priority that determines the order in which input channels are selected. That is, the maximum number of frames is divided among the multiple input channels.

In another embodiment, the DVR/NVR 170 may internally include the mux 110, the entity recognition module 120, the detection result unit 130, the event log unit 140, and the adaptive channel selection unit 150, and these components may be implemented as hardware or software modules.

Because the adaptive entity recognition apparatus has a modular software configuration, the present invention is compatible with various deep-learning-based entity recognition modules, and it avoids both a sharp drop in performance and the need for higher hardware specifications as the number of CCTV cameras increases.

The embodiments of the present invention described above are not implemented only through an apparatus and/or a method. They may also be implemented through a program that realizes functions corresponding to the configuration of the embodiments, or through a recording medium on which the program is recorded. Such implementations can be readily carried out by those skilled in the art to which the present invention belongs, based on the description of the embodiments above.

Although the embodiments of the present invention have been described in detail above, the scope of the present invention is not limited thereto. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

100: adaptive entity recognition apparatus 110: mux
120: entity recognition module 130: detection result unit
140: event log unit 150: adaptive channel selection unit
160: CCTV 170: DVR/NVR

Claims

A mux serving as a video channel multiplexer, having input ports to which output signal lines of a plurality of CCTV (Closed Circuit Television) cameras are connected;
An entity recognition module that receives, from the mux, an image frame of one of the plurality of CCTV cameras, estimates whether an entity exists in the received image frame, and outputs an entity recognition result;
A detection result unit that adaptively changes, based on the output entity recognition result, a channel selection list for selecting the order of input channels from which image frames are received; and
An adaptive channel selection unit that generates a channel selection signal according to the changed channel selection list and transmits it to the mux,
Wherein the detection result unit expands or reduces the size of the channel selection list according to the entity recognition result, and
Wherein the mux selectively receives an image frame through an input channel selected according to the channel selection signal of the adaptive channel selection unit.

A mux serving as a video channel multiplexer, having input ports to which output signal lines of a plurality of CCTV (Closed Circuit Television) cameras are connected;
An entity recognition module that receives, from the mux, an image frame of one of the plurality of CCTV cameras, estimates whether an entity exists in the received image frame, outputs an entity recognition result, and, when accessing one input channel to receive image information, measures a processing capability indicating the maximum frame recognition rate of the one input channel;
A detection result unit that adaptively changes, based on the output entity recognition result, a channel selection list for selecting the order of input channels from which image frames are received; and
An adaptive channel selection unit that generates a channel selection signal according to the changed channel selection list and transmits it to the mux,
Wherein the channel selection list dynamically allocates, within the processing capability, a priority for selecting the order of input channels.

The apparatus of claim 2,
Wherein the detection result unit expands or reduces the size of the channel selection list according to the entity recognition result.

The apparatus according to claim 1 or 2,
Wherein the detection result unit expands the channel selection list to a maximum size when at least one entity is detected in the entity recognition result, the maximum size being the maximum frame recognition rate of one input channel in the entity recognition module multiplied by a predetermined maximum allowable time according to the movement of an entity within an image frame.

The apparatus of claim 1 or 3,
Wherein the channel selection list includes a first list area indicating a selection order for input channels in which an entity is recognized, and a second list area indicating a selection order for input channels in which no entity is recognized, and
Wherein the detection result unit calculates a quotient by dividing the size of the first list area by the number of input channels in which an entity is recognized, and allocates the recognized input channels to the first list area equally and in alternation, sequentially allocating each recognized input channel as many times as the calculated quotient.

The apparatus of claim 5,
Wherein the detection result unit calculates the remainder obtained by dividing the size of the first list area by the number of input channels in which an entity is recognized, and allocates, to the remaining section of the first list area given by the remainder, the recognized input channels in descending order of entity recognition probability.

Sequentially receiving an image frame from each of a plurality of CCTV (Closed Circuit Television) cameras through a mux whose input ports are respectively connected to output signal lines of the CCTV cameras;
Estimating, using an entity recognition module, whether an entity exists in the image frame received from each of the CCTV cameras, and outputting each entity recognition result;
Adaptively changing, based on the respective entity recognition results, a channel selection list for selecting the order of input channels from which image frames are received; and
Generating a channel selection signal according to the changed channel selection list and transmitting it to the mux, the mux selectively receiving an image frame through an input channel selected according to the channel selection signal,
Wherein outputting each of the entity recognition results comprises:
Measuring a processing capability indicating the maximum frame recognition rate of one input channel when the entity recognition module accesses the one input channel to receive and play image information, and setting a maximum allowable time according to the movement of an entity within an image frame.

delete

The method of claim 7, further comprising:
Generating an entity recognition result by sequentially performing entity recognition for all input channels from the first input channel to the nth input channel;
Extracting, for each of the input channels, the maximum value of the entity recognition probability of the recognized entities; and
Expanding the channel selection list to a maximum size, obtained by multiplying the processing capability by the maximum allowable time, when at least one entity is detected in the generated entity recognition result.

The method of claim 9,
Wherein the expanded channel selection list includes a first list area indicating a selection order for input channels in which an entity is recognized, and a second list area indicating a selection order for input channels in which no entity is recognized, and
Wherein a quotient is calculated by dividing the size of the first list area by the number of input channels in which an entity is recognized, and the recognized input channels are allocated to the first list area equally and in alternation, each recognized input channel being sequentially allocated as many times as the calculated quotient.

The method of claim 10,
Wherein the remainder obtained by dividing the size of the first list area by the number of input channels in which an entity is recognized is calculated, and the remaining section of the first list area given by the remainder is allocated by comparing the maximum entity recognition probabilities of the recognized input channels and allocating those channels in descending order of entity recognition probability.