KR20200136736A

KR20200136736A - Multi object detection system using deep running based on closed circuit television image

Info

Publication number: KR20200136736A
Application number: KR1020190062686A
Authority: KR
Inventors: 박영석
Original assignee: 주식회사 엠제이비전테크
Priority date: 2019-05-28
Filing date: 2019-05-28
Publication date: 2020-12-08
Also published as: KR102479516B1

Abstract

The present invention relates to a multi-object detection system using deep learning based on a CCTV image. An objective to be achieved by the present invention is to learn and detect data by using a convolutional neural network (CNN), from among deep learning algorithms, mainly used in the image processing field, in order to resolve the drawbacks of an existing image processing algorithm for detecting various objects such as vehicles and pedestrian information, from an image obtained from a CCTV. According to one embodiment of the present invention, the multi-object detection system using deep learning based on the CCTV image comprises: a detection unit configured to detect at least one preset object with respect to a CCTV image, by using a predefined detection deep learning network; a recognition unit configured to recognize information on the object detected by the detection unit, based on a predefined recognition deep learning network and a detection result for the detected object; and an image monitoring unit configured to provide the information recognized by the recognition unit, together with the CCTV image, wherein the detection unit detects the corresponding object by using a CNN layer in which a 3x3 convolutional layer and a 1x1 convolutional layer are combined.

Description

Multi-object detection system using deep learning based on CCTV image {MULTI OBJECT DETECTION SYSTEM USING DEEP RUNNING BASED ON CLOSED CIRCUIT TELEVISION IMAGE}

An embodiment of the present invention relates to a multi-object detection system using deep learning based on CCTV images.

In Korea, much research on autonomous driving service models concerns V2X (Vehicle to Everything), which analyzes not only information acquired from the various sensors installed in a vehicle but also surrounding environment information needed for driving, such as vehicles, pedestrians, and animals, obtained from infrastructure built along the road, and then provides it to the driver. This can improve the reliability of the "perception, decision, and control" technologies at the core of autonomous driving and thereby reduce the accident rate.

Recently, the number of CCTV cameras installed for security and safety by government offices, companies, and similar organizations has grown explosively.

However, the number of personnel monitoring CCTV camera video is far too small relative to the number of cameras installed. To address this problem, intelligent CCTV video surveillance systems are being actively introduced.

The CCTV video analysis device at the core of an intelligent CCTV video surveillance system receives video from CCTV cameras, detects and tracks moving objects, and on that basis automatically detects abnormal situations such as "intrusion into a prohibited area" and raises an alarm. Monitoring personnel can therefore monitor many CCTV camera feeds effectively by checking only the feeds that raised an alarm, without having to watch a large number of (uneventful) feeds at all times.

However, because most existing CCTV video analysis devices use motion-based object detection algorithms, they frequently produce false detections from causes other than actual objects of interest (typically people and vehicles), for example branches swaying in the wind, rippling water, moving shadows, sudden lighting changes, flickering lights, and snow or rain. These false detections in turn cause frequent false alarms, which makes efficient monitoring impossible.

An embodiment of the present invention provides a multi-object detection system using deep learning based on CCTV images. To overcome the shortcomings of existing image processing algorithms for detecting various objects, such as vehicles and pedestrians, in images acquired from CCTV, the system learns and detects data using a convolutional neural network (CNN), the deep learning algorithm mainly used in the image processing field.

A multi-object detection system using deep learning based on CCTV images according to an embodiment of the present invention includes: a detection unit that detects at least one preset object in a CCTV image using a predefined detection deep learning network; a recognition unit that recognizes information on the object based on a predefined recognition deep learning network and the detection result for the object detected by the detection unit; and an image monitoring unit that provides the information recognized by the recognition unit together with the CCTV image. The detection unit detects the object using a convolutional neural network layer in which a 3x3 convolution layer and a 1x1 convolution layer are combined.

In addition, for the CCTV input image, the detection unit may detect and learn objects using a weight file trained at 416x416 pixels in an environment operated at a first angle of view, and detect and learn objects using a weight file trained at 320x320 pixels in an environment operated at a second angle of view, where the first angle of view may be wider than the second angle of view.

According to the present invention, it is possible to provide a multi-object detection system using deep learning based on CCTV images that learns and detects data using a convolutional neural network (CNN), the deep learning algorithm mainly used in the image processing field, in order to overcome the shortcomings of existing image processing algorithms for detecting various objects, such as vehicles and pedestrians, in images acquired from CCTV.

FIG. 1 is a block diagram showing an autonomous driving infrastructure using V2X (Vehicle to Everything).
FIG. 2 is a diagram showing the structure of a deep learning network.
FIG. 3 is a schematic diagram showing the overall configuration of a multi-object detection system using deep learning based on CCTV images according to an embodiment of the present invention.
FIG. 4 is a block diagram showing the overall configuration of a multi-object detection system using deep learning based on CCTV images according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an object detection method using a weight file, among the artificial intelligence image analysis methods of the recognition unit according to an embodiment of the present invention.

The terms used in this specification are briefly described first, and the present invention is then described in detail.

The terms used in the present invention were selected, as far as possible, from general terms in wide current use while taking their function in the present invention into account, but they may vary with the intent of those working in the field, precedent, the emergence of new technologies, and the like. In certain cases a term has been chosen arbitrarily by the applicant, in which case its meaning is set out in detail in the corresponding part of the description. The terms used in the present invention should therefore be defined by their meaning and by the overall content of the present invention, not simply by their names.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "... unit" and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention belongs can readily practice them. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here. In the drawings, parts unrelated to the description are omitted so that the present invention is described clearly, and similar parts are given similar reference numerals throughout the specification.

FIG. 1 is a block diagram showing an autonomous driving infrastructure using V2X (Vehicle to Everything communication). More specifically, it shows a service model in which, among the various devices that acquire information about the surrounding road environment, CCTV is used to collect and analyze real-time road conditions, and the results are transmitted to moving vehicles in conjunction with a V2X control server.

Referring to FIG. 1, the present embodiment relates to a multi-object detection technique that is robust to varying illumination and degraded images, so that information on vehicles, pedestrians, and two-wheeled vehicles (bicycles, motorcycles, electric kick scooters, and the like) that can be analyzed from CCTV can be delivered to drivers using V2X (Vehicle to Everything communication). It relates to a system that learns and detects data using a convolutional neural network (CNN) (see FIG. 2), the deep learning algorithm mainly used in the image processing field, in order to overcome the shortcomings of the various image processing algorithms used for object detection.

FIG. 3 is a schematic diagram showing the overall configuration of a multi-object detection system using deep learning based on CCTV images according to an embodiment of the present invention, FIG. 4 is a block diagram showing the overall configuration of the same system, and FIG. 5 is a diagram illustrating an object detection method using a weight file, among the artificial intelligence image analysis methods of the recognition unit according to an embodiment of the present invention.

Referring to FIG. 3, a multi-object detection system 1000 using deep learning based on CCTV images according to an embodiment of the present invention includes a detection unit 100, a recognition unit 200, and an image monitoring unit 300.

The detection unit 100 may detect at least one preset object in a CCTV image using a predefined detection deep learning network.

The detection unit 100 is the component that detects various objects in CCTV images based on deep learning, and it may use a deep network to perform person detection, face detection, vehicle detection, license plate detection, and the like. Here, the detection unit 100 performs automatic object detection on the input video at regular time intervals and applies a tracking technique to the detected objects using a deep learning network, which allows real-time processing. The deep network may be designed and trained by applying a method that minimizes the region in which objects are searched.
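The interval-based scheme described above can be sketched as a simple scheduling loop. This is a minimal illustration only: the function name `detect_then_track`, the `detect_every` parameter, and the stand-in `fake_detect` and `fake_track` callables are assumptions introduced here, and the specification does not state the detection interval or the tracking network used.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def detect_then_track(
    frames: Iterable[object],
    detect: Callable[[object], List[Box]],
    track: Callable[[object, List[Box]], List[Box]],
    detect_every: int = 10,  # assumed interval; not given in the specification
) -> Iterator[Tuple[str, List[Box]]]:
    """Yield ("detect" or "track", boxes) for every frame.

    The full detector runs on every `detect_every`-th frame; on the frames in
    between, the tracker updates the most recent boxes instead.
    """
    if detect_every < 1:
        raise ValueError("detect_every must be at least 1")
    boxes: List[Box] = []
    for index, frame in enumerate(frames):
        if index % detect_every == 0:
            boxes = detect(frame)
            yield "detect", boxes
        else:
            boxes = track(frame, boxes)
            yield "track", boxes


# Hypothetical stand-ins so the sketch runs without a real network.
def fake_detect(frame: int) -> List[Box]:
    return [(frame, frame, 20, 40)]


def fake_track(frame: int, boxes: List[Box]) -> List[Box]:
    # Pretend every tracked object moves one pixel to the right per frame.
    return [(x + 1, y, w, h) for (x, y, w, h) in boxes]


results = list(detect_then_track(range(7), fake_detect, fake_track, detect_every=3))
modes = [mode for mode, _ in results]
print(modes)
```

Running the detector only on selected frames keeps the per-frame cost low, at the price of relying on the tracker between detections.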

In addition, the detection unit 100 uses the detection deep network to automatically detect objects in surveillance camera video that contains identifiable objects. The automatically detected objects may include people, faces, vehicles (license plates), two-wheeled vehicles, and the like. Specifically, the detection unit 100 detects a person region using a deep learning based network and then detects a face region within the detected person region using a face detection deep network. Likewise, the detection unit may detect the vehicles present in the CCTV image using a vehicle detection deep network and detect the license plate region within each detected vehicle image. The detection unit 100 may output the position, size, and detection confidence value of each detected object. Because the detection unit 100 does not know the size of an object in advance, it may check for the presence of objects by searching for object regions at each of a set of preset scales in the image.
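As a rough illustration of the output described above (position, size, and confidence per object) and of the two-stage search (a face inside a person region, a license plate inside a vehicle region), the sketch below uses a hypothetical `Detection` record. The field names and the coordinate convention (a nested detection is stored relative to its parent's crop) are assumptions made for this example, not details from the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Detection:
    label: str
    x: int  # left edge, in pixels
    y: int  # top edge, in pixels
    width: int
    height: int
    confidence: float  # detection confidence in [0, 1]
    parent: Optional["Detection"] = None  # region this box was found inside

    def to_frame_coords(self) -> "Detection":
        """Return the same box expressed in full-frame coordinates."""
        if self.parent is None:
            return self
        parent = self.parent.to_frame_coords()
        return Detection(
            self.label,
            parent.x + self.x,
            parent.y + self.y,
            self.width,
            self.height,
            self.confidence,
        )


# Illustrative values only: a person box, then a face found inside its crop.
person = Detection("person", 100, 50, 80, 200, 0.92)
face_in_crop = Detection("face", 20, 10, 30, 30, 0.88, parent=person)
face_in_frame = face_in_crop.to_frame_coords()
print(face_in_frame)
```

Restricting the second-stage search to the parent region is one way to keep the searched area small, which matches the stated aim of minimizing the region in which objects are searched.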

In addition, the detection unit 100 may use a region-of-interest based high-speed person detection network to automatically detect people in the input image and quickly extract person-related information, including the person's position, posture, and direction of movement.

The detection unit 100 according to the present embodiment may learn and detect data using a convolutional neural network in order to overcome the shortcomings of existing image processing algorithms and detect multiple objects robustly under varying illumination and in degraded images.

More specifically, the detection unit 100 is designed with 24 convolutional neural network (CNN) layers so that it can detect objects with high precision, and it combines 3x3 convolution layers with 1x1 convolution layers to reduce the network's computation and thereby run quickly.
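The computational saving from mixing 1x1 and 3x3 convolutions can be checked with simple arithmetic. The sketch below counts multiply-accumulate operations (MACs) for two 2-layer stacks: two plain 3x3 layers, and a 1x1 layer that halves the channel count followed by a 3x3 layer that restores it. The feature map size, the channel widths, and the layer ordering are assumptions chosen for illustration; the specification gives only the layer count and the kernel sizes.

```python
def conv_macs(in_ch: int, out_ch: int, kernel: int, out_h: int, out_w: int) -> int:
    """Multiply-accumulate count of one convolution layer (bias ignored)."""
    return in_ch * out_ch * kernel * kernel * out_h * out_w


# Assumed feature map and channel sizes, for illustration only.
H = W = 52
C = 256

# Baseline: two stacked 3x3 layers, 256 -> 256 -> 256 channels.
plain = 2 * conv_macs(C, C, 3, H, W)

# Mixed: 1x1 layer reduces 256 -> 128 channels, 3x3 layer expands 128 -> 256.
mixed = conv_macs(C, C // 2, 1, H, W) + conv_macs(C // 2, C, 3, H, W)

ratio = mixed / plain
print(f"plain: {plain} MACs, mixed: {mixed} MACs, ratio: {ratio:.3f}")
```

Under these assumed sizes the mixed stack needs 5/18 of the baseline's operations, because the 1x1 layer is cheap and also shrinks the input that the 3x3 layer has to process.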

In addition, the detection unit 100 may be trained on CCTV input images at two sizes, 416x416 pixels and 320x320 pixels. More specifically, for the CCTV input image, detection performance can be improved by detecting and learning objects using a weight file trained at 416x416 pixels in an environment operated at a first angle of view, and using a weight file trained at 320x320 pixels in an environment operated at a second angle of view. Here, the first angle of view may refer to an environment operated with a wider angle of view than the second.
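One way to picture the two-weight-file arrangement is a small selector that picks a profile from the camera's angle of view. Everything specific in this sketch is hypothetical: the file names, the 60-degree threshold, and the aspect-preserving resize are assumptions, since the specification states only that the wider angle of view uses the 416x416 weights and the narrower one uses the 320x320 weights.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WeightProfile:
    weights_path: str  # hypothetical file name
    input_size: int    # square network input, in pixels


WIDE_FOV = WeightProfile("weights_416.bin", 416)
NARROW_FOV = WeightProfile("weights_320.bin", 320)


def select_profile(fov_degrees: float, wide_threshold_degrees: float = 60.0) -> WeightProfile:
    """Pick the 416x416 weights for wide views and the 320x320 weights otherwise.

    The threshold is an assumed value; the specification does not define one.
    """
    return WIDE_FOV if fov_degrees >= wide_threshold_degrees else NARROW_FOV


def fitted_size(frame_w: int, frame_h: int, input_size: int) -> Tuple[int, int]:
    """Size of the frame after an aspect-preserving resize into the square input."""
    scale = min(input_size / frame_w, input_size / frame_h)
    return round(frame_w * scale), round(frame_h * scale)


wide = select_profile(90.0)
narrow = select_profile(30.0)
print(wide.input_size, fitted_size(1920, 1080, wide.input_size))
print(narrow.input_size, fitted_size(1920, 1080, narrow.input_size))
```

A plausible reading of the design is that objects in a wide view occupy fewer pixels, so the larger input preserves more detail, while a narrow view can use the smaller and faster input.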

The recognition unit 200 may recognize information on an object based on a predefined recognition deep learning network and the detection result for the object detected by the detection unit 100.

In addition, the artificial intelligence image analysis of the recognition unit 200 can detect vehicle, pedestrian, and bicycle/motorcycle objects in the input CCTV image using trained weight files, and there is no limit on the types or number of objects detected in a single frame.

For example, as shown in FIG. 5, a probabilistic recognition result may be provided for each recognized object. For the object in the leftmost box, the probability of a vehicle is 0.8, of a bicycle 0.1, of a motorcycle 0.07, and of a person 0.03. For the object in the upper middle box, the probability of a pedestrian is 0.6, of a bicycle 0.2, of a motorcycle 0.12, and of a vehicle 0.05. For the object in the lower middle box, the probability of a bicycle is 0.5, of a motorcycle 0.35, of a person 0.12, and of a vehicle 0.03.
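The per-box probabilities in this example can be turned into a single label by taking the most probable class. The sketch below uses the values quoted for FIG. 5; the `top_label` helper and its optional `min_confidence` cutoff are assumptions added for illustration, and the specification does not say how a final label is chosen.

```python
from typing import Dict, Optional, Tuple

# Class probabilities as stated in the text for FIG. 5.
# The upper middle box values sum to 0.97, as given in the source.
FIG5_SCORES: Dict[str, Dict[str, float]] = {
    "leftmost box": {"vehicle": 0.8, "bicycle": 0.1, "motorcycle": 0.07, "person": 0.03},
    "upper middle box": {"pedestrian": 0.6, "bicycle": 0.2, "motorcycle": 0.12, "vehicle": 0.05},
    "lower middle box": {"bicycle": 0.5, "motorcycle": 0.35, "person": 0.12, "vehicle": 0.03},
}


def top_label(scores: Dict[str, float], min_confidence: float = 0.0) -> Optional[Tuple[str, float]]:
    """Return the most probable (label, probability), or None below the cutoff."""
    label, probability = max(scores.items(), key=lambda item: item[1])
    return (label, probability) if probability >= min_confidence else None


labels = {box: top_label(scores) for box, scores in FIG5_SCORES.items()}
for box, result in labels.items():
    print(box, result)

# With an assumed cutoff of 0.55, the ambiguous bicycle/motorcycle box is rejected.
strict = top_label(FIG5_SCORES["lower middle box"], min_confidence=0.55)
print(strict)
```

A cutoff like this is one possible way to suppress uncertain results such as the lower middle box, where bicycle and motorcycle are close.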

The image monitoring unit 300 provides the information recognized by the recognition unit 200 together with the CCTV image so that the image can be monitored.

The artificial intelligence image analysis of the image monitoring unit 300 can analyze video from a variety of CCTV cameras that support the ONVIF or RTSP standard protocols. Along with the analysis, it can record CCTV camera data to a storage and distribution server and supports real-time video viewing.

In addition, as shown in FIG. 5, when the image analysis recognizes a license plate number or detects a vehicle, pedestrian, bicycle, or motorcycle, the image monitoring unit 300 may display a bounding box enclosing the object.

According to the present invention, data can be learned and detected using a convolutional neural network (CNN), the deep learning algorithm mainly used in the image processing field, in order to overcome the shortcomings of existing image processing algorithms for detecting various objects, such as vehicles and pedestrians, in images acquired from CCTV.

What has been described above is only one embodiment for implementing the multi-object detection system using deep learning based on CCTV images according to the present invention. The present invention is not limited to this embodiment; as claimed in the following claims, its technical spirit extends to the range within which anyone of ordinary skill in the art to which the invention belongs can make various modifications without departing from the gist of the invention.

1000: multi-object detection system using deep learning based on CCTV images
100: detection unit
200: recognition unit
300: image monitoring unit

Claims

A multi-object detection system using deep learning based on CCTV images, comprising:
a detection unit configured to detect at least one preset object in a CCTV image using a predefined detection deep learning network;
a recognition unit configured to recognize information on the object based on a predefined recognition deep learning network and the detection result for the object detected by the detection unit; and
an image monitoring unit configured to provide the information recognized by the recognition unit together with the CCTV image,
wherein the detection unit detects the object using a convolutional neural network layer in which a 3x3 convolution layer and a 1x1 convolution layer are combined.

The system of claim 1,
wherein the detection unit,
for the input image of the CCTV,
detects and learns objects using a weight file trained at 416x416 pixels in an environment operated at a first angle of view, and
detects and learns objects using a weight file trained at 320x320 pixels in an environment operated at a second angle of view,
the first angle of view being wider than the second angle of view.