KR102421043B1

KR102421043B1 - Apparatus for Processing Images and Driving Method Thereof

Info

Publication number: KR102421043B1
Application number: KR1020200017666A
Authority: KR
Inventors: 최준호
Original assignee: 주식회사 인텔리빅스
Priority date: 2020-02-13
Filing date: 2020-02-13
Publication date: 2022-07-14
Also published as: KR20210103210A

Abstract

The present invention relates to an image processing apparatus and a driving method thereof. An image processing apparatus according to an embodiment of the present invention may include a communication interface unit that receives captured images from a plurality of photographing devices, and a control unit that compares the analysis results of an object classifier and an object detector for a tracked object included in the received captured images and determines an event according to the comparison result.

Description

Image processing apparatus and driving method of the same

The present invention relates to an image processing apparatus and a driving method thereof, and more particularly to an image processing apparatus and driving method that, for example for intelligent video surveillance, secure the number of channels and minimize false detections by comprehensively considering a detector, motion events, and a classifier.

Recently, the number of CCTV cameras installed by government offices, companies, and the like for security and safety has been increasing explosively. However, the number of personnel monitoring the CCTV camera video is far too small compared to the number of cameras installed. To solve this problem, intelligent CCTV video surveillance systems are being actively introduced.

The CCTV video analysis device at the core of an intelligent CCTV video surveillance system receives video from CCTV cameras, detects and tracks moving objects, and on that basis automatically detects abnormal situations such as "intrusion into a prohibited area" and raises an alarm. Monitoring personnel can then monitor many CCTV camera feeds effectively by checking only the feeds on which an alarm has occurred, without constantly watching many (uneventful) feeds.

However, because most existing CCTV video analysis devices use motion-based object detection algorithms, false object detections occur frequently due to various causes (e.g., branches swaying in the wind, rippling water, moving shadows, sudden lighting changes, flashing lights, snow or rain) in addition to detections of actual objects of interest (typically people and vehicles). As a result, false alarms also occur frequently, which makes efficient monitoring impossible.

In addition to motion-based object detection, computer vision researchers have been developing object detection based on learning object shape since the mid-2000s. In this approach, object shape features are extracted and learned from a variety of training images of a specific type of object (for example, pedestrians), and object detection is performed by finding regions in an image whose shape features are similar to the learned ones. Representative techniques include Viola-Jones, HOG, ICF, ACF, and DPM. However, limited detection performance and processing load made these techniques difficult to apply to commercial CCTV video analysis devices.

Meanwhile, in 2012 the team of Professor G. Hinton at the University of Toronto, Canada, won the ILSVRC (ImageNet Large Scale Visual Recognition Challenge) with a DCNN (deep convolutional neural network) called AlexNet, by an overwhelming performance margin over existing image recognition algorithms. Deep learning then began to attract attention in the field of computer vision, and attempts to solve various computer vision problems with deep learning have continued since.

DCNN-based object detection techniques began to be published in 2014. They provide detection performance that far exceeds that of earlier object detection techniques. Representative examples include Fast/Faster R-CNN, RFCN, SSD, and YOLO.

However, several constraints still stand in the way of applying DCNN-based object detection to commercial CCTV video analysis devices. The main one is that the hardware cost of processing video in real time with a DCNN-based object detector is very high. A typical DCNN-based object detector requires a large amount of computation to detect objects in even a single video frame, so it is very difficult to process video in real time on a general-purpose CPU (real time here typically means running object detection on 7 or more frames per second). Real-time processing of video with a DCNN-based object detector therefore requires a GPU capable of massively parallel computation. Even with a GPU, it is difficult for a single video analysis device to process several video streams simultaneously in real time unless an expensive, high-performance GPU is used.

In addition, conventional image analysis servers equipped with an object detector still suffer from a small number of channels and frequent false detections.

Korean Registered Patent Publication No. 10-1040049 (2011.06.02.)
Korean Registered Patent Publication No. 10-1173853 (2012.08.08.)
Korean Registered Patent Publication No. 10-1178539 (2012.08.24.)
Korean Registered Patent Publication No. 10-1748121 (2017.06.12.)
Korean Registered Patent Publication No. 10-1789690 (2017.10.18.)
Korean Registered Patent Publication No. 10-1808587 (2017.12.07.)
Korean Laid-open Patent Publication No. 10-2018-0072561 (2018.06.29.)
Korean Registered Patent Publication No. 10-1850286 (2018.04.13.)
Korean Laid-open Patent Publication No. 10-2018-0107930 (2018.10.04.)

An object of embodiments of the present invention is to provide an image processing apparatus and a driving method thereof that, for example for intelligent video surveillance, secure the number of channels and minimize false detections by comprehensively considering a detector, motion events, and a classifier.

An image processing apparatus according to an embodiment of the present invention includes a communication interface unit that receives captured images from a plurality of photographing devices, and a control unit that compares the analysis results of an object classifier and an object detector for a tracked object included in the received captured images and determines an event according to the comparison result.

When the detection region of the object classifier and the detection region of the object detector for the tracked object have an intersection region, the control unit may determine the event based on an analysis result of the intersection region.
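
The intersection region of two detection regions can be illustrated with a minimal Python sketch. It assumes that both regions are axis-aligned boxes in pixel coordinates; the names `Box`, `intersection`, and `overlap_ratio` are hypothetical, and intersection-over-union is only one possible way to score the overlap, not something the specification prescribes.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned region: (x1, y1) is the top-left corner, (x2, y2) the bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def intersection(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping region of two boxes, or None if they do not overlap."""
    x1, y1 = max(a.x1, b.x1), max(a.y1, b.y1)
    x2, y2 = min(a.x2, b.x2), min(a.y2, b.y2)
    if x2 <= x1 or y2 <= y1:
        return None
    return Box(x1, y1, x2, y2)


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection-over-union, used here as an assumed overlap score."""
    inter = intersection(a, b)
    if inter is None:
        return 0.0
    union = a.area() + b.area() - inter.area()
    return inter.area() / union
```

For example, `intersection(Box(0, 0, 10, 10), Box(5, 5, 15, 15))` yields the region from (5, 5) to (10, 10), which would then be the part of the image whose analysis results are compared.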

The control unit may determine the event when the analysis of the intersection region indicates the same object.

The control unit may obtain an analysis result by providing an image of a partial region of the tracked object to the object classifier, and may obtain an analysis result by providing a snapshot of the entire region of the same tracked object to the object detector.

The control unit may determine the event by checking, based on the analysis results, whether an object is present in the event area and what type it is.

The control unit may determine the event by checking that the comparison result for the intersection region is valid at or above a specified value and whether the event-generating object in the intersection region is a person, a vehicle, or unidentified.
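
The checks described above (a sufficiently valid intersection region, agreement between the two analyses, and an object type of person, vehicle, or unidentified) might be combined as in the following sketch. The overlap metric, the default threshold of 0.5, the treatment of "unidentified" as an accepted class, and the names `Analysis`, `overlap_score`, and `decide_event` are all assumptions made for illustration; the specification does not fix any of them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Region = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
EVENT_CLASSES = {"person", "vehicle", "unidentified"}


@dataclass
class Analysis:
    label: str      # class reported by the classifier or the detector
    region: Region  # detection region for the tracked object


def overlap_score(a: Region, b: Region) -> float:
    """Intersection area divided by the smaller region's area (assumed metric)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return (w * h) / min(area_a, area_b)


def decide_event(classifier: Analysis, detector: Analysis,
                 min_overlap: float = 0.5) -> Optional[str]:
    """Return the confirmed object class, or None if the event is rejected."""
    if overlap_score(classifier.region, detector.region) < min_overlap:
        return None  # intersection region missing or below the specified value
    if classifier.label != detector.label:
        return None  # the two analyses do not indicate the same object
    if classifier.label not in EVENT_CLASSES:
        return None
    return classifier.label
```

Rejecting the event whenever the two analyses disagree is the conservative choice in this sketch: it favors fewer false alarms over catching every possible event.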

The object classifier and the object detector may be provided in an image analysis server and an object detection server, respectively, which are separate from and independent of each other.

The object classifier may use a motion-based classifier or a DNN (Deep Neural Network) classifier, and the object detector may use a DNN detector.

Further, a method of driving an image processing apparatus according to an embodiment of the present invention includes: receiving, by a communication interface unit, captured images from a plurality of photographing devices; and comparing, by a control unit, the analysis results of an object classifier and an object detector for a tracked object included in the received captured images and determining an event according to the comparison result.

In the determining of the event, when the detection region of the object classifier and the detection region of the object detector for the tracked object have an intersection region, the event may be determined based on an analysis result of the intersection region.

In the determining of the event, the event may be determined when the analysis of the intersection region indicates the same object.

The driving method may further include obtaining, by the control unit, an analysis result by providing an image of a partial region of the tracked object to the object classifier, and obtaining an analysis result by providing a snapshot of the entire region of the same tracked object to the object detector.

In the determining of the event, the event may be determined by checking, based on the analysis results, whether an object is present in the event area and what type it is.

In the determining of the event, the event may be determined by checking that the comparison result for the intersection region is valid at or above a specified value and whether the event-generating object in the intersection region is a person, a vehicle, or unidentified.

The object classifier may use a motion-based classifier or a DNN classifier, and the object detector may use a DNN detector.

According to embodiments of the present invention, the number of channels can be secured while false detections during image analysis are minimized. For example, by applying deep learning object detection based on low-resolution images, an embodiment can secure the number of channels served by a deep learning object detector and improve the accuracy obtained with the deep learning object detector.

FIG. 1 is a view showing an image processing system according to an embodiment of the present invention;
FIG. 2 is a view briefly showing the overall configuration of the image analysis apparatus of FIG. 1;
FIG. 3 is a view for explaining the SW area of the image analysis apparatus of FIG. 1;
FIG. 4 is a diagram illustrating a detailed structure of the manager of FIG. 3;
FIG. 5 is a diagram illustrating a detailed structure of the processor of FIG. 3;
FIG. 6 is a view briefly showing the overall configuration of the object detection apparatus of FIG. 1;
FIG. 7 is a view for explaining an event generation flow according to an embodiment of the present invention;
FIG. 8 is a view for explaining an operation flow of the object detection apparatus of FIG. 1;
FIG. 9 is a flowchart illustrating an image processing process according to an embodiment of the present invention;
FIGS. 10 to 12 are views for explaining an image processing process and its results according to an embodiment of the present invention; and
FIG. 13 is a flowchart illustrating an image processing process according to another embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a view showing an image processing system according to an embodiment of the present invention.

As shown in FIG. 1, an image processing system 90 according to an embodiment of the present invention includes some or all of a photographing device 100, a communication network 110, a control device 120, an image analysis apparatus 130, and an object detection apparatus 140.

Here, "includes some or all of" means, for example, that the image processing system 90 may be configured with some components such as the control device 120 omitted, or that some components such as the image analysis apparatus 130 or the object detection apparatus 140 may be integrated into a network device (e.g., a wireless switching device) constituting the communication network 110. To aid a full understanding of the invention, the system is described as including all of them.

The photographing device 100 may be installed in various places, such as local government sites or high-crime areas, to monitor incidents and accidents, and the captured images may be provided via the communication network 110 to at least one of the control device 120, the image analysis apparatus 130, and the object detection apparatus 140. The photographing device 100 may include at least one camera such as an IP camera 101, a face recognition camera 102, or an analog camera 103, and these cameras may be fixed cameras or PTZ (Pan-Tilt-Zoom) cameras. At the request of the control device 120, the photographing device 100 may, for example, perform pan, tilt, and zoom operations to capture its surroundings and then provide the captured images.

Captured images taken by the various types of photographing devices 100 may be provided to the control device 120 and stored in the DB 120a. Alternatively, a separate storage device (e.g., a DVR) serving as an image storage device may be built into the photographing device, linked to it nearby, or placed between it and the communication network 110, and the captured images may be provided through that device.

The communication network 110 includes both wired and wireless communication networks. For example, a wired or wireless Internet network may be used as, or linked with, the communication network 110. Here, the wired network includes an Internet network such as a cable network or the public switched telephone network (PSTN), and the wireless communication network includes CDMA, WCDMA, GSM, EPC (Evolved Packet Core), LTE (Long Term Evolution), WiBro networks, and the like. The communication network 110 according to embodiments of the present invention is of course not limited to these and may also be, for example, a cloud computing network in a cloud computing environment or a 5G network. When the communication network 110 is a wired network, an access point in the communication network 110 may connect to a telephone company's exchange or the like; when it is a wireless network, the access point may process data by connecting to an SGSN or GGSN (Gateway GPRS Support Node) operated by a carrier, or by connecting to various relay stations such as a BTS (Base Transceiver Station), NodeB, or e-NodeB.

The communication network 110 may include an access point. The access point here includes small base stations, such as femto or pico base stations, that are commonly installed inside buildings. Femto and pico base stations are distinguished, within the classification of small base stations, by the maximum number of photographing devices 100 and similar devices that can connect to them. The access point may of course include a short-range communication module for short-range communication, such as Zigbee or Wi-Fi, with the photographing device 100 and the like. The access point may use TCP/IP or RTSP (Real-Time Streaming Protocol) for wireless communication. Besides Wi-Fi, short-range communication may be carried out under various standards such as Bluetooth, Zigbee, infrared, RF (Radio Frequency) such as UHF (Ultra High Frequency) and VHF (Very High Frequency), and ultra-wideband (UWB) communication. The access point accordingly extracts the location of a data packet, designates the best communication path for the extracted location, and forwards the data packet along the designated path to the next device, for example the control device 120, the image analysis apparatus 130, and/or the object detection apparatus 140. The access point may share several lines in a general network environment and includes, for example, a router, a repeater, and a relay.

The control device 120 may be provided in a control center of a local government or the like, and includes a server and/or computer connected to the communication network 110 operated by telecommunication carriers or to an access point operated in a home, public institution, or the like. In other words, the control device 120 according to an embodiment of the present invention may be a computer, may be a server, or may be understood to include a server and a computer connected to that server. Given the nature of the technology, in embodiments of the present invention it may also mean a computer or a mobile phone such as a smartphone carried by an administrator. Computers here include laptops, desktops, tablet PCs, and the like.

For convenience of description, the control device 120 is described as a computer provided in a public institution or the like. A program according to an embodiment of the present invention may be installed on the computer. For example, once the program is installed, the control device 120 may show a pop-up on the screen, sound an alarm, or the like when an important event is notified, so that the administrator can notice it easily.

Given the nature of the image processing system 90, a control center may use many dedicated monitoring computers as control devices 120. The control device 120 therefore constantly displays a screen for monitoring the area under management, and on that premise embodiments of the present invention may provide the pop-up and alarm functions on the screen.

Even in other cases, for example when the image processing system 90 according to an embodiment of the present invention is set up as a home surveillance system, the control device 120 may be a smartphone or the like used by a particular user. In that case, the detection of an important event may be notified to that user's mobile phone.

The control device 120 may further include, and work with, the DB 120a. For example, past images may be stored in the DB 120a at the request of an image storage device such as a DVR. Because a server is expensive and correspondingly has high-performance hardware (H/W) and software (S/W) resources for performing image analysis, deriving events, and filtering out the events of high importance among those derived, it can carry out these operations efficiently in conjunction with the photographing device or the image storage device. As in the embodiments of the present invention, image analysis may of course instead be performed in a separate image analysis apparatus 130. Such a configuration can vary freely according to the intent of the system designer, so embodiments of the present invention are not limited to any one form.

Furthermore, the control device 120 may include a search and playback engine, that is, a program for searching the images stored in the image storage device (e.g., picture search (I-search), object search) or playing them back. A RAS solution may be used for image search and playback. The control device 120 may also perform a VMS (Voice Message Service) function that, in conjunction with a carrier, sends notifications and the like as messages.

In addition, the control device 120 may include devices operated or owned by related organizations such as in-house security guards, private security companies, and the police. The control device 120 may detect the occurrence of an important event through a notification from the image analysis apparatus 130, but when the control device 120 itself is designed to announce important events, it operates accordingly. For example, when a security device or the like operates in conjunction with the control device 120, it will detect the occurrence of an important event at the request of the control device 120 and take appropriate action.

The image analysis apparatus 130 may analyze the captured images received from the plurality of photographing devices 100 and provide the analysis results to the object detection apparatus 140 in the form of metadata. When an event is finally determined in cooperation with the object detection apparatus 140, it may notify the control device 120 that the event has occurred. More specifically, the image analysis apparatus 130 may perform: foreground (motion) detection (Foreground Detection), which detects foreground (motion) pixels in the input image from the photographing device 100; object detection and tracking (Object Detection and Tracking), which uses the input image and the foreground image; real object size estimation (Real Object Size Estimation), an optional step that uses camera calibration information to estimate the actual size, position, and speed of a tracked object; object classification (DNN Object Classification), which classifies a tracked object as a person, a vehicle, or unidentified; and event detection (Event Detection), which detects object behavior satisfying a specified event rule. In addition, to reduce false detection of events (or improve detection accuracy), the image analysis apparatus 130 according to an embodiment of the present invention may further perform snapshot processing in an object module (Object Module) and an event determination operation.

First, the image analysis apparatus 130 receives the captured images from the photographing device 100 as a time sequence of video frames. For foreground detection, it periodically learns a background model, detects initial foreground (motion) pixels by comparing the input image with the background model, removes noise foreground (motion) pixels caused by lighting changes, shadows, dynamic backgrounds, and the like, and obtains the final foreground image by morphological filtering. During this process the image analysis apparatus 130 may also convert high-resolution images to a lower resolution, which reduces the computational load on inexpensive or limited CPU resources.
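
The foreground detection steps above can be sketched in plain Python on small grayscale images stored as nested lists. This is only an illustration under assumptions: the specification does not name a particular background model, so a running average is swapped in here, morphological filtering is shown as a 3x3 opening, and the dedicated removal of shadow and lighting noise is not implemented. The function names, learning rate, and threshold are hypothetical.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[int]]


def update_background(bg: Image, frame: Image, rate: float = 0.05) -> Image:
    """Running-average background model: bg <- (1 - rate) * bg + rate * frame."""
    return [[(1 - rate) * b + rate * f for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]


def initial_foreground(bg: Image, frame: Image, thresh: float = 25.0) -> Mask:
    """Mark pixels whose difference from the background exceeds the threshold."""
    return [[1 if abs(f - b) > thresh else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]


def _morph(mask: Mask, keep_if_all: bool) -> Mask:
    """3x3 erosion (keep_if_all=True) or dilation (False); outside pixels count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[j][i] if 0 <= j < h and 0 <= i < w else 0
                      for j in (y - 1, y, y + 1) for i in (x - 1, x, x + 1)]
            out[y][x] = int(all(window) if keep_if_all else any(window))
    return out


def opening(mask: Mask) -> Mask:
    """Erosion followed by dilation, which removes isolated noise pixels."""
    return _morph(_morph(mask, True), False)
```

In this sketch a single noisy pixel disappears after `opening`, while a solid 3x3 block of foreground pixels survives unchanged. A production system would more likely rely on an optimized image processing library than on nested loops.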

In addition, for object detection and tracking, the image analysis apparatus 130 detects newly appearing objects and starts tracking them, updates the bounding box coordinates and appearance model (Appearance Model) of the objects being tracked, validates objects using initial tracking results, and handles occlusion between tracked objects.
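
One simple way to picture the bounding-box update and the start of tracking for newly appearing objects is greedy matching by box overlap, sketched below. The specification does not state which association method is used, so this is an assumed stand-in; appearance models, object validation, and occlusion handling are left out, and tracks without a matching detection are simply dropped. The names `iou` and `update_tracks` and the 0.3 threshold are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def update_tracks(tracks: Dict[int, Box], detections: List[Box],
                  next_id: int, min_iou: float = 0.3) -> Tuple[Dict[int, Box], int]:
    """Greedily match detections to tracks by IoU; unmatched detections open new tracks."""
    pairs = sorted(((iou(box, det), tid, di)
                    for tid, box in tracks.items()
                    for di, det in enumerate(detections)), reverse=True)
    updated: Dict[int, Box] = {}
    used = set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if tid in updated or di in used:
            continue
        updated[tid] = detections[di]  # refresh the bounding-box coordinates
        used.add(di)
    for di, det in enumerate(detections):
        if di not in used:
            updated[next_id] = det     # newly appearing object
            next_id += 1
    return updated, next_id
```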

Furthermore, to estimate the actual size of an object, the image analysis apparatus 130 uses a camera projection model to convert the object's bounding box coordinates in the image into 3D spatial coordinates and then calculates the object's actual size, position, and speed. The camera parameter values used in the camera projection model (e.g., focal length, installation height, and tilt angle) may be obtained in advance through camera calibration.

For object classification, the image analysis apparatus 130 may include a motion-based classifier or a D(C)NN classifier. It may classify objects (e.g., object classes: person, non-person) on a rule basis using the size, shape, and motion information of the tracked object's bounding box, accumulate the per-frame classification results, and fix the class with the highest accumulated score as the class of that object. The image analysis apparatus 130 may also classify an object by the shape of its bounding box.
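The accumulation of per-frame classification results can be sketched as follows. The class name `TrackClassVotes`, the score dictionaries, and the "unidentified" fallback are illustrative assumptions; the description states only that per-frame results are accumulated and the class with the highest accumulated score is chosen.

```python
from collections import defaultdict
from typing import Dict


class TrackClassVotes:
    """Accumulates per-frame class scores for one tracked object (hypothetical helper)."""

    def __init__(self) -> None:
        self.scores: Dict[str, float] = defaultdict(float)

    def add_frame(self, frame_scores: Dict[str, float]) -> None:
        """Add the classifier scores obtained for this object in one frame."""
        for cls, score in frame_scores.items():
            self.scores[cls] += score

    def decided_class(self, default: str = "unidentified") -> str:
        """Return the class with the highest accumulated score so far."""
        if not self.scores:
            return default
        return max(self.scores, key=self.scores.get)
```

A single frame in which a person looks more like a vehicle does not flip the decision, because the decision depends on the sum over all frames seen so far.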

In addition, for event detection, the image analysis apparatus 130 may set event rules and detect object behavior. Here, an event rule setting includes a detection area, object behavior, object type, object attributes (filters), a detection schedule, and the like, and object behavior includes intrusion, loitering, line crossing, directional movement, sudden stopping, falling down, crowding, violence, and the like.

To reduce false detections, the image analysis apparatus 130 according to an embodiment of the present invention may further determine whether an event raised in the event detection step above was caused by an error. To this end, for an object that caused an event (for example, one of a number of objects tracked over time in the video frames, or more precisely, an object for which an event was generated by an event rule set, that is, the configured rules), it may generate a snapshot and provide it to the object detection apparatus 140 to request a detection result (or analysis result). Here, a snapshot, also called a snap photograph, means a picture of a changing scene captured quickly without staging it. In an embodiment of the present invention, because object images over time can be formed for each tracked object, a plurality of such images of the object may be overlapped in time order and provided to the object detection apparatus 140. The image analysis apparatus 130 may then receive metadata (meta data) as the analysis result from the object detection apparatus 140, which includes an object detector.

In addition, for the event determination operation, the image analysis apparatus 130 analyzes the metadata provided by the object detection apparatus 140 to check whether a detected object is present in the event area and what type it is. For example, it compares the metadata with the area data of the motion-based rule set that uses the classifier, and determines (or confirms) the event by checking that the area is at least 70% valid and whether the type of the object that caused the event is a person, a vehicle, or unidentified. For example, the image analysis apparatus 130 may set a bounding box for the motion region of a tracked object and may determine, through the DNN classifier, that a patch in that region (that is, a part of the object image) is a person. The image analysis apparatus 130 may also provide object images of the entire motion region (or tracked object) to the object detection apparatus 140, and these may be a plurality of object images over time. The image analysis apparatus 130 then receives, in the form of metadata, the result of the analysis performed by the object detection apparatus 140 over the entire motion region it provided, and compares the analysis results of the classifier and the detector. The classifier's analysis result is the type of the object; if its degree of agreement with the analysis result of the object detection apparatus 140 falls within the specified range, that is, if there is an intersecting region, the event is forwarded. In that case, the event caused by the object is determined not to be a false detection.
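One way to read the comparison above is sketched below. The description does not define precisely how the "70% valid" area is measured, so this sketch assumes it is the fraction of the motion bounding box covered by a detector box, and it assumes agreement means the detector reports the same label as the classifier. The names `Box`, `Detection`, `overlap_ratio`, and `confirm_event` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Box:
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


@dataclass(frozen=True)
class Detection:
    label: str  # object type reported by the object detector metadata
    box: Box    # location reported by the object detector metadata


def overlap_ratio(motion: Box, detected: Box) -> float:
    """Fraction of the motion box covered by the detector box (assumed measure)."""
    iw = min(motion.x2, detected.x2) - max(motion.x1, detected.x1)
    ih = min(motion.y2, detected.y2) - max(motion.y1, detected.y1)
    if iw <= 0 or ih <= 0 or motion.area == 0:
        return 0.0
    return (iw * ih) / motion.area


def confirm_event(classifier_label: str, motion_box: Box,
                  detections: Iterable[Detection],
                  min_overlap: float = 0.7) -> bool:
    """True if some detection agrees with the classifier in both type and area."""
    return any(det.label == classifier_label
               and overlap_ratio(motion_box, det.box) >= min_overlap
               for det in detections)
```

An event is forwarded only when `confirm_event` returns `True`; a classifier result with no sufficiently overlapping detector result of the same type is treated as a false detection.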

As described, the object detection apparatus 140 is, for example, an object detection server (e.g., a GPU server) that works with the image analysis apparatus 130, generates the result of analysis by its built-in object detector in the form of, for example, metadata, and provides it to the image analysis apparatus 130. The object detection apparatus 140 may receive from the image analysis apparatus 130 a plurality of snapshots of an object in an area that satisfies an event rule, find, for example, the positions of persons and vehicles, and provide them to the image analysis apparatus 130 in the form of metadata.

So far, the embodiments of the present invention have described the image analysis apparatus 130 and the object detection apparatus 140 as separate devices that operate independently. Either apparatus alone, or the two merged, may constitute the image processing apparatus according to an embodiment of the present invention. As long as an object classifier and an object detector are used, that is, different components that each judge the authenticity of an object in the captured image (more precisely, of an object in the area where an event occurred), they may be integrated into a single device; for example, the control apparatus 120 may perform the operations according to an embodiment of the present invention. Accordingly, the object classifier and the object detector according to an embodiment of the present invention may also be called a first object processor and a second object processor, which perform image analysis (more precisely, object analysis) in different ways. The embodiments of the present invention are therefore not particularly limited as to the form in which the components are configured.

FIG. 2 is a diagram schematically showing the overall configuration of the image analysis apparatus of FIG. 1, FIG. 3 is a diagram showing the SW area of the image analysis apparatus of FIG. 1, FIG. 4 is a diagram illustrating the detailed structure of the process of FIG. 3, and FIG. 5 is a diagram illustrating the detailed structure of the manager of FIG. 3.

The image analysis apparatus 130 of FIG. 1 may in fact be configured in various forms, more precisely as a combination of hardware and software, but it is described briefly here to aid understanding. Typically, the image analysis apparatus 130 includes a communication interface unit that communicates with the object detection apparatus 140 of FIG. 1 and a control unit, and may further include an event false-detection prevention unit. The control unit may be implemented as a single chip that includes a CPU and memory. To increase processing speed, the CPU may copy the program of the event false-detection prevention unit into memory at initial operation of the image analysis apparatus 130, or when needed, and then execute it. The operation of the event false-detection prevention unit may therefore be understood as being executed by the control unit. Accordingly, the SW configuration of FIGS. 3 to 5 may represent the configuration of the event false-detection prevention unit, but may also be understood as the configuration of the control unit.

For example, the image processing system 90' of FIG. 2 may show a part of the image processing system 90 of FIG. 1. The image processing system 90' may include a VPN module (unit) 200 of the photographing apparatus 100, an A(I)M module (unit) 210, a VA interface unit 220, and the like. Here, the VA interface unit 220 may serve as the connection network for the VA server 222 and the object detection server 223. The VPN module 200 provides tunneling between the camera and the VA server 222. The AM module 210 corresponds to the interface server of the VSaaS. The VA server 222 is a server that performs image analysis, and the object detection server 223 is a server equipped with a deep-learning-based object detector module.

Here, the VA server 222 of FIG. 2 may include a VA process unit 300 and a VA manager unit 310, as shown in FIG. 3. These may all be SW modules. More specifically, as can be seen in FIGS. 3 and 4, the VA process unit 300 may include some or all of an external/internal interface (External/internal I/F), an RTSP receiver (RTSP Receiver), an RTSP event sender (RTSP Event Sender), an RTSP server (RTSP Server), a VA module (VA Module), an object module (Object Module), and an event decision module (Event decision). The external/internal interface handles internal and external communication using HTTP REST; the RTSP receiver receives H.264 video using the RTSP communication standard in order to interwork with the camera; and the RTSP event sender delivers image analysis information (Event Meta, VA Object Meta) to the streamer (Streamer). The RTSP server relays the video of an image analysis channel and its analysis results, the VA module analyzes the received video, and the object module communicates with the object detection server 223 and parses the metadata. The event decision module (unit) may determine an event using the results obtained through the DNN detector.

To summarize, the VA process unit 300 may be divided into the categories of video channel, RTSP server, RTSP client, VA module, object module, and event. The video channel category provides the ability to accept streams from various products and may include a video channel module that connects to RTSP and receives the stream, a video decoder module that converts the received video stream into images suitable for image analysis, and a metadata converter module that converts the metadata generated by the VA into a form for external delivery. The RTSP server category includes an RTSP server module that transmits the video of an image analysis channel and the image analysis results to the outside (usable for verification). The RTSP client category includes a video stream receiver module that connects to a specified URL and receives video, and an event sender module that transmits the metadata generated by the image analysis module to the streamer. The VA module category includes a foreground detector module that detects the foreground, that is, moving areas, and provides general, overhead-view pedestrian detection, and face detection modes; an object tracking module that detects objects to track from the detected foreground areas; an object classification module that determines the type of the object being tracked and, in the DNN method, creates one detector per process to share resources; and an event detection module that generates an event when the object being tracked satisfies a preset event rule. The object module category includes an image proxy module (image proc) that transmits the event-occurrence image of the motion-based VA engine to, for example, the object detection apparatus 140 and receives the corresponding metadata, and a meta parsing module that parses the metadata received from the detector. The event category includes an event decision module that determines, based on the parsed data, whether an event has occurred.

Meanwhile, the VA manager 310 of FIG. 3 may include an HTTP module, a server manage (Server Manage) module, and a channel manage (Channel Manage) module. The HTTP module handles the interface with the VSaaS and monitoring clients. The server manage module handles failover (Failover), failure recovery, CRUD of servers, monitoring of the current state of the servers, and distribution of image analysis channels. The channel manage module may perform operations such as image analysis channel management, channel allocation, channel deletion, and channel status monitoring.

More specifically, as can be seen in FIG. 5, the VA manager 310 of FIG. 3 may include a server manager, a channel manager, an interface module, and a console (Consul). The server manager 500 includes a server manage module, which handles failover, failure recovery, CRUD of servers, monitoring of the current state of the servers, and distribution of image analysis channels, and a server monitoring module, which collects current state information from the servers. The channel manager 510 may include a channel manage module, which performs image analysis channel management, channel allocation, and channel deletion, and a channel monitoring module, which performs channel status monitoring and channel failure management. The interface module includes an HTTP module that performs the internal and external interface functions; an external interface (module) that handles the interface with the VSaaS, including image analysis requests, modification, and deletion, and delivery of server and channel status information; and an internal interface (module) that assigns image analysis channels to servers and delivers VA setting information and image analysis setting information. The console performs load balancing and exchange of VA manager status information.

FIG. 6 is a diagram schematically showing the overall configuration of the object detection apparatus of FIG. 1. The object detection server 223 of FIG. 2 can be regarded as showing a part of this configuration.

Like the image analysis apparatus 130, the object detection apparatus 140 of FIG. 1 may be configured as a server in various forms, in hardware, in software, or in a combination of the two; FIG. 6 shows only the SW configuration.

As shown in FIG. 6, the object detection apparatus 140 includes an object detection process (unit) 600 and an object detection manager 610, and may further include a console and an OS unit.

The object detection process 600 may include an external/internal interface module that performs internal and external communication using HTTP REST; a liveuv module that handles communication with the process, receives image data, transmits metadata, and manages image analysis process sessions; and an object detection module that detects object data from images and generates metadata.

In addition, the object detection manager 610 may include an HTTP module that communicates with the VA processor and exchanges console data information; a server manage module that provides failover and exchanges the current channel count and status information of the object detection apparatus 140 (e.g., a server); and an object detection manage module that manages object detection processes and monitors their status.

FIG. 7 is a diagram illustrating an event generation flow according to an embodiment of the present invention.

Referring to FIG. 7, the image analysis apparatus (e.g., VA server) 130 of FIG. 1 may perform an operation for process allocation between its built-in manager and process module (S700).

The image analysis apparatus 130 then requests allocation from the object detection apparatus (e.g., object detection server) 140 (S701). Here, the allocation may be a channel allocation.

The object detection apparatus 140 may compare the number of channels allocated to each server and select the object detection server with the fewest channels (S702, S703). Although the embodiment of the present invention is described as interworking with a plurality of object detection servers, it is not particularly limited thereto.
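The server selection in S702 and S703 amounts to a least-loaded choice, sketched below. The server names, the dictionary of channel counts, and both function names are hypothetical; the description specifies only that the server with the fewest allocated channels is selected.

```python
from typing import Dict


def select_detection_server(channel_counts: Dict[str, int]) -> str:
    """Return the server with the fewest allocated channels (first one on ties)."""
    if not channel_counts:
        raise ValueError("no object detection server is registered")
    return min(channel_counts, key=channel_counts.get)


def allocate_channel(channel_counts: Dict[str, int]) -> str:
    """Pick the least-loaded server and record the new channel against it."""
    server = select_detection_server(channel_counts)
    channel_counts[server] += 1
    return server
```

Repeated calls to `allocate_channel` spread new channels across servers, since each allocation raises the chosen server's count before the next comparison.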

Next, the object detection apparatus 140 transmits the object detection server connection information to the image analysis apparatus 130 (S704). Operations S702 to S704 above may be performed by the detector manager of the object detection apparatus 140.

In addition, the image analysis apparatus 130 communicates with the object detection apparatus 140, more precisely with its DNN process module, to establish a TCP connection to the object detection server (S705).

Through steps S700 to S705 above, for example, the image analysis apparatus 130 and the object detection apparatus 140 may establish a channel, that is, perform a tunneling operation.

Next, the image analysis apparatus 130 analyzes the captured images provided by the photographing apparatus 100 of FIG. 1 and determines whether an event has occurred (S706, S707). Applicable events may include, for example, intrusion/loitering, virtual line crossing, path crossing, directional movement, stopping, abandoned object, removed object, violence, crowding, and falling down.

When an event occurs, the image analysis apparatus 130 may check whether the event was raised by, for example, a false detection. To this end, the image analysis apparatus 130 transmits a snapshot to the object detection apparatus 140. The snapshot may include the detector type, image byte data, image size, image width, and image height.
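The snapshot fields listed above can be modeled as a simple record. The field names, the integer encoding of the detector type, and the 13-byte little-endian header used by `pack` and `unpack` are assumptions made for illustration; the description does not specify the wire format.

```python
import struct
from dataclasses import dataclass

# Assumed header layout: detector type (1 byte), image size, width, height (4 bytes each).
_HEADER = struct.Struct("<BIII")


@dataclass(frozen=True)
class Snapshot:
    detector_type: int  # which detector should process the image (assumed numeric code)
    image: bytes        # image byte data
    width: int          # image width in pixels
    height: int         # image height in pixels

    @property
    def size(self) -> int:
        """Image size in bytes."""
        return len(self.image)

    def pack(self) -> bytes:
        """Serialize as header followed by the raw image bytes."""
        return _HEADER.pack(self.detector_type, self.size, self.width, self.height) + self.image


def unpack(message: bytes) -> Snapshot:
    """Parse a message produced by Snapshot.pack()."""
    detector_type, size, width, height = _HEADER.unpack_from(message)
    image = message[_HEADER.size:_HEADER.size + size]
    if len(image) != size:
        raise ValueError("truncated snapshot message")
    return Snapshot(detector_type, image, width, height)
```

Carrying the image size in the header lets the receiver detect a truncated message before handing the bytes to the detector.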

The image analysis apparatus 130 receives detection metadata for the snapshot it transmitted to the object detection apparatus 140 (S709).

The image analysis apparatus 130 then determines the event by, for example, applying an object filter (S710). For example, as mentioned above, when a bounding box has been formed for a moving object in the event area, part of the moving object in the box is extracted as a patch, that is, an object image, and the internal classifier determines the object type from it. The object type is also determined from the analysis result of the snapshot covering the entire bounding box or the entire tracked object within it. The analysis result of the classifier and that of the detector are then compared, and when they agree at or above a reference value (e.g., 70% or more), the image analysis apparatus 130 judges that the event was not caused by a false detection and may notify, for example, the control apparatus 120 of FIG. 1 of the event result.

FIG. 8 is a diagram illustrating an operation flow of the object detection apparatus of FIG. 1.

For example, FIG. 8 may be understood as the operating scheme when one object detection apparatus 140 performs object detection for a plurality of image analysis apparatuses 131 and 132.

The object detection apparatus 140 according to an embodiment of the present invention may receive snapshots from a plurality of image analysis apparatuses 131 and 132, that is, a first image analysis apparatus 131 and a second image analysis apparatus 132 (S800).

In this process, when the capacity of its internal memory, that is, its buffer, would be exceeded, the object detection apparatus 140 may request, for example, the second image analysis apparatus 132 to stop transmitting snapshots (S801, S802).

In this way, the object detection apparatus 140 may store the snapshots received from the second image analysis apparatus 132 to the extent that the storage capacity of the buffer is not exceeded and perform DNN analysis on the stored snapshots. If capacity becomes available after the analysis results are processed, it may request the second image analysis apparatus 132, which it had asked to stop, to resume transmitting snapshots (S803 to S805).
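The stop and resume requests of S801 to S805 describe a form of backpressure on a bounded buffer. The sketch below is one possible arrangement; the class `SnapshotBuffer`, its method names, and the policy of resuming all paused senders as soon as one slot frees up are assumptions, not details taken from the description.

```python
from collections import deque
from typing import Deque, List, Tuple


class SnapshotBuffer:
    """Bounded snapshot queue that tracks which senders were asked to stop."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.queue: Deque[Tuple[str, bytes]] = deque()
        self.paused: List[str] = []  # senders that received a stop request

    def offer(self, sender: str, snapshot: bytes) -> bool:
        """Store the snapshot, or return False if the sender must be asked to stop."""
        if len(self.queue) >= self.capacity:
            if sender not in self.paused:
                self.paused.append(sender)
            return False
        self.queue.append((sender, snapshot))
        return True

    def take(self) -> Tuple[Tuple[str, bytes], List[str]]:
        """Remove the oldest snapshot for analysis and list the senders to resume."""
        item = self.queue.popleft()  # raises IndexError if the buffer is empty
        resume, self.paused = self.paused, []
        return item, resume
```

A caller would send a stop request whenever `offer` returns `False` and a resume request to each sender returned by `take`.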

Meanwhile, in FIG. 8, the first image analysis apparatus 131 determines whether an event has occurred (S810) and, if so, may generate a snapshot of the designated object and transmit it to the object detection apparatus 140 (S810 to S813).

In this process, the first image analysis apparatus 131 determines whether the event is genuine based on the analysis result of the DNN classifier and the analysis result of the snapshot provided by the object detection apparatus 140 (S812). When it judges that the event is not due to a false detection, it may notify, for example, the control apparatus 120 of the event, that is, of an event raised by a correct detection. For example, the first image analysis apparatus 131 may compare the two analysis results and confirm that the event is not a false detection when the comparison result satisfies a preset condition.

FIG. 9 is a flowchart illustrating an image processing process according to an embodiment of the present invention, and FIGS. 10 to 12 are diagrams for explaining the image processing process according to an embodiment of the present invention and its results.

For convenience of description, referring to FIG. 9 together with FIG. 1, the image analysis apparatus 130 of FIG. 1, as one image processing apparatus according to an embodiment of the present invention, may perform operations such as background learning, foreground detection, object detection and tracking, and actual size estimation (S900). In this step, for example, the image analysis apparatus 130 may set a bounding box for a tracked object.

In addition, the image analysis apparatus 130 may classify the type of the tracked object through the motion-based classifier or the DNN classifier (S910). For example, it may classify the object as a person, a vehicle, or unidentified, and may base the classification on the size or shape of the bounding box. Because the classification is learning-based, its accuracy is high.

Furthermore, the image analysis apparatus 130 detects object behavior that satisfies a specified event rule (S910). For example, the image analysis apparatus 130 may set event rules covering the detection area, object behavior, object type, object attributes, detection schedule, and the like, and may detect object behavior such as intrusion, loitering, line crossing, directional movement, sudden stopping, falling down, crowding, and violence.

In addition, when an event occurs according to an event setting rule, the image analysis apparatus 130 uses an object module to transmit a snapshot to an object detector, for example the object detection apparatus 140 of FIG. 1, and receives the analysis result from the object detector (S920, S930). The analysis result may be received in the form of metadata. Here, a snapshot may mean a plurality of object images generated at the bounding-box size for a designated tracked object related to the occurrence of the event.
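The snapshot and metadata exchange described above can be sketched roughly as follows. This is only an illustration: the function names, the nested-list frame representation, and the JSON layout of the metadata (an `objects` list with `type` and `box` fields) are all assumptions, since the specification does not define a concrete format.

```python
import json


def make_snapshot(frames, boxes):
    """Crop the tracked object out of each frame at its bounding-box size.

    frames: list of 2D pixel grids (lists of rows); hypothetical representation.
    boxes:  list of (x1, y1, x2, y2) bounding boxes, one per frame.
    """
    crops = []
    for frame, (x1, y1, x2, y2) in zip(frames, boxes):
        crops.append([row[x1:x2] for row in frame[y1:y2]])
    return crops


def parse_metadata(payload):
    """Parse a detector reply in an assumed JSON metadata layout."""
    doc = json.loads(payload)
    return [(obj["type"], tuple(obj["box"])) for obj in doc["objects"]]
```

A snapshot built this way contains one cropped object image per frame, which matches the description of a snapshot as a plurality of bounding-box-sized object images.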

Next, the image analysis apparatus 130 analyzes the metadata to check whether a detected object exists in the event area and what type it is (S940). Then, when comparison with the area data of the existing motion-based rule set (that is, the setting rule) shows that the area is at least 70% valid and the type of the event-generating object is confirmed as a person, a vehicle, or unidentified, the event is determined (or confirmed). In an embodiment of the present invention, this process may be called the event determination step.
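A minimal sketch of this event determination step is given below. The text does not define how the 70% validity is measured, so this sketch assumes it is the fraction of the detected box that lies inside the rule's event area; the `Box` type, the dictionary keys, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def overlap_area(a: Box, b: Box) -> float:
    """Area of the rectangle shared by a and b (0 if they do not overlap)."""
    w = min(a.x2, b.x2) - max(a.x1, b.x1)
    h = min(a.y2, b.y2) - max(a.y1, b.y1)
    return max(0.0, w) * max(0.0, h)


# Object types that may confirm an event, as listed in the description.
ALLOWED_TYPES = {"person", "vehicle", "unidentified"}


def confirm_event(rule_area: Box, detections, min_valid_ratio: float = 0.7) -> bool:
    """Confirm the event if any detection is sufficiently inside the rule area.

    Assumption: "valid" means the share of the detected box covered by rule_area.
    """
    for det in detections:
        box = det["box"]
        if box.area() == 0:
            continue
        ratio = overlap_area(box, rule_area) / box.area()
        if ratio >= min_valid_ratio and det["type"] in ALLOWED_TYPES:
            return True
    return False
```

For example, a person box with five sixths of its area inside the event area would confirm the event, while a box with only one tenth inside would not.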

Referring to FIG. 10, FIG. 10(a) shows detection by the classifier, and FIG. 10(b) shows the object detector data. For example, the image analysis apparatus 130 of FIG. 1 detects a motion region 1000 caused by movement and checks the class of the motion region 1000 (for example, checks the object type using a DNN classifier). It then checks whether the center point of the moving object lies within the event area. If the object center point lies within the event area, the image analysis apparatus 130 transmits a snapshot to an object detector, for example that of the object detection apparatus 140 of FIG. 1.
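The center-point check can be illustrated with a standard ray-casting point-in-polygon test. The specification does not say how the event area is represented, so this sketch assumes a polygon given as a list of (x, y) vertices; all function names are hypothetical.

```python
def bbox_center(x1, y1, x2, y2):
    """Center point of an axis-aligned bounding box."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def point_in_polygon(point, polygon):
    """Ray-casting test: count polygon edges crossed by a ray going right."""
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        xa, ya = polygon[i]
        xb, yb = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (ya > py) != (yb > py):
            x_cross = xa + (py - ya) * (xb - xa) / (yb - ya)
            if px < x_cross:
                inside = not inside
    return inside


def should_send_snapshot(bbox, event_area):
    """True when the moving object's center lies inside the event area."""
    return point_in_polygon(bbox_center(*bbox), event_area)
```

Gating on the center point keeps the detector from being called for objects that merely graze the edge of the event area.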

In addition, the object detection apparatus 140 uses the object detector to find the positions of people and vehicles in the image and transmits them. In other words, the image analysis apparatus 130 receives the analysis result for the snapshot it provided.

Next, the image analysis apparatus 130 compares the region detected by the classifier with the region detected by the object detector, that is, compares the two analysis results, and determines that an event has occurred when the regions intersect. The finally determined event may be notified to, for example, the control apparatus 120 of FIG. 1.

As a specific example, as in FIG. 10(a), the image analysis apparatus 130 passes a patch 1010 from the motion region 1000, that is, a part of the object image, to the DNN classifier, which judges it to be a person. The image analysis apparatus 130 also passes the entire region 1020, rather than the patch, to the detector, and receives the person region from the object detector. The two analysis results, from the classifier and the detector, are compared, and because an intersecting region exists, the event is delivered.
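The cross-check between the classifier's patch and the detector's regions might look like the sketch below. Boxes are assumed to be (x1, y1, x2, y2) tuples in the same coordinate system, and requiring matching labels is taken from the "same object" condition in the claims; the names `intersection` and `cross_check` and the dictionary layout are hypothetical.

```python
def intersection(a, b):
    """Return the overlapping rectangle of boxes a and b, or None."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x2 <= x1 or y2 <= y1:
        return None
    return (x1, y1, x2, y2)


def cross_check(classifier_result, detector_results):
    """Return the intersecting region that confirms the event, or None.

    classifier_result: {"label": str, "box": tuple} for the classified patch.
    detector_results:  list of the same shape, one entry per detected object.
    """
    for det in detector_results:
        region = intersection(classifier_result["box"], det["box"])
        if region is not None and det["label"] == classifier_result["label"]:
            return region
    return None
```

In the shadow case of FIG. 11, the classifier may still report a person for the patch, but the detector returns no person region, so `cross_check` yields `None` and no event is delivered.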

FIGS. 11 and 12 show that false detections are reduced when the method according to an embodiment of the present invention is applied. Whereas a false detection caused by a shadow previously occurred as in FIG. 11(a), according to an embodiment of the present invention that false detection disappears as in FIG. 11(a'). Likewise, whereas a false detection caused by a non-person object previously occurred as in FIG. 11(b), after the improvement it disappears as in FIG. 11(b').

When the method according to an embodiment of the present invention was actually applied to a parking lot at night, as in FIG. 12, a total of 24 events were detected when only the existing classifier (e.g., a DNN classifier) was applied, whereas a total of 20 events were detected when the classifier (e.g., a DNN classifier) and the detector (e.g., a DNN detector) were used together as in the embodiment. The experiment thus confirmed that the classifier-only result actually contained four more false detections.

In addition to the above, the image analysis apparatus 130 according to an embodiment of the present invention may perform various other operations. Since the details have been sufficiently described above, that description applies here as well.

FIG. 13 is a flowchart of an image processing process according to another embodiment of the present invention.

An image processing apparatus according to an embodiment of the present invention may include at least one of the control apparatus 120, the image analysis apparatus 130, and the object detection apparatus 140 shown in FIG. 1. The image processing apparatus receives captured images from a plurality of photographing apparatuses 100 (S1300).

In addition, the image processing apparatus compares the analysis results of the object classifier and the object detector for a tracked object included in the received captured images, and determines (or confirms) an event according to the comparison result (S1310). For example, the image processing apparatus may be regarded as first judging whether an event has occurred using its built-in classifier, then judging a second time whether that event is the result of a false detection, and only then finally confirming the occurrence of the event.

In this process, the image processing apparatus may of course provide the object detector with a snapshot of the tracked object for which the object classifier judged an event to have occurred, and receive the analysis result. When the analysis results of the object classifier and the object detector agree, for example on the object type or position of a moving tracked object, and the degree of agreement is equal to or greater than a specified value, the event can finally be judged to be free of false detection. The occurrence of the event may then be notified to, for example, the control apparatus 120 of FIG. 1.
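One way to express "degree of agreement at or above a specified value" is intersection over union (IoU) between the two reported boxes, combined with a label match. IoU is an assumption made here for illustration; the specification does not name a particular agreement measure, and the function names and the default threshold of 0.5 are hypothetical.

```python
def box_area(b):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as 0."""
    return max(0, b[2] - b[0]) * max(0, b[3] - b[1])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0, iw) * max(0, ih)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def is_confirmed(classifier, detector, min_match=0.5):
    """Two-stage decision: labels must agree and boxes must overlap enough."""
    return (
        classifier["label"] == detector["label"]
        and iou(classifier["box"], detector["box"]) >= min_match
    )
```

The classifier result plays the role of the primary judgment and the detector result the secondary one; only events passing both would be forwarded to the control apparatus.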

In addition to the above, the image processing apparatus according to an embodiment of the present invention may perform various other operations. Since the details have been sufficiently described above, that description applies here as well.

Meanwhile, although all the components constituting an embodiment of the present invention have been described as being combined into one or operating in combination, the present invention is not necessarily limited to such an embodiment. That is, within the scope of the object of the present invention, the components may operate by being selectively combined into one or more units. In addition, although each of the components may be implemented as a single independent piece of hardware, some or all of the components may be selectively combined and implemented as a computer program having program modules that perform some or all of the combined functions on one or more pieces of hardware. The codes and code segments constituting such a computer program can be readily inferred by those skilled in the art of the present invention. Such a computer program is stored in non-transitory computer-readable media and is read and executed by a computer, thereby implementing an embodiment of the present invention.

Here, a non-transitory readable recording medium means a medium that stores data semi-permanently and can be read by a device, rather than a medium that stores data only for a short moment, such as a register, cache, or memory. Specifically, the programs described above may be stored and provided in a non-transitory readable recording medium such as a CD, DVD, hard disk, Blu-ray disc, USB, memory card, or ROM.

Although preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Various modifications may of course be made by those of ordinary skill in the art to which the invention pertains without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or prospect of the present invention.

100: photographing apparatus 110: communication network
120: control apparatus 130: image analysis apparatus
140: object detection apparatus 300: VA process (unit)
310: VA manager 600: object detection server
610: object detection manager

Claims

1. An image processing apparatus comprising:
a communication interface unit for receiving captured images from a plurality of photographing devices; and
a control unit that compares the analysis results of an object classifier and an object detector for a tracking object included in the received captured images and determines an event according to the comparison result,
wherein the object classifier and the object detector perform image analysis or object analysis on the tracking object in different ways,
wherein the control unit provides an image of a partial area of the tracking object to the object classifier to obtain an analysis result, and provides a snapshot of the entire area of the same tracking object to the object detector to obtain an analysis result, and
wherein the object detector generates the analysis result for the provided snapshot in the form of metadata and provides it to the object classifier.

2. The image processing apparatus of claim 1,
wherein, when there is an intersection region between the detection region of the object classifier and the detection region of the object detector for the tracking object, the control unit determines the event based on an analysis result of the intersection region.

3. The image processing apparatus of claim 2,
wherein the control unit determines the event when the analysis of the intersection region finds the same object.

4. (Deleted)

5. The image processing apparatus of claim 1,
wherein the control unit determines the event by checking the presence and type of an object in the event area based on the analysis result.

6. The image processing apparatus of claim 2,
wherein the control unit determines the event by determining whether the comparison result for the intersection region is valid at or above a specified value and whether the event-generating object in the intersection region is a person, a vehicle, or unidentified.

7. The image processing apparatus of claim 1,
wherein the object classifier and the object detector are configured in an image analysis server and an object detection server, respectively, which are configured independently of each other.

8. The image processing apparatus of claim 7,
wherein the object classifier uses a motion-based classifier or a deep neural network (DNN) classifier, and the object detector uses a DNN detector.

9. A method of driving an image processing apparatus, the method comprising:
receiving, by a communication interface unit, captured images from a plurality of photographing devices; and
comparing, by a control unit, the analysis results of an object classifier and an object detector for a tracking object included in the received captured images, and determining an event according to the comparison result,
wherein the object classifier and the object detector perform image analysis or object analysis on the tracking object in different ways,
wherein the method further comprises providing, by the control unit, an image of a partial area of the tracking object to the object classifier to obtain an analysis result, and providing a snapshot of the entire area of the same tracking object to the object detector to obtain an analysis result, and
wherein the object detector generates the analysis result for the provided snapshot in the form of metadata and provides it to the object classifier.

10. The method of claim 9,
wherein, in the step of determining the event, when there is an intersection region between the detection region of the object classifier and the detection region of the object detector for the tracking object, the event is determined based on an analysis result of the intersection region.

11. The method of claim 10,
wherein, in the step of determining the event, the event is determined when the analysis of the intersection region finds the same object.

12. (Deleted)

13. The method of claim 9,
wherein, in the step of determining the event, the event is determined by checking the presence and type of an object in an event area based on the analysis result.

14. The method of claim 10,
wherein, in the step of determining the event, the event is determined by determining whether the comparison result for the intersection region is valid at or above a specified value and whether the event-generating object in the intersection region is a person, a vehicle, or unidentified.

15. The method of claim 9,
wherein the object classifier and the object detector are configured in an image analysis server and an object detection server, respectively, which are configured independently of each other.

16. The method of claim 15,
wherein the object classifier uses a motion-based classifier or a DNN classifier, and the object detector uses a DNN detector.