KR102411209B1 - System and Method for Image Classification Based on Object Detection Events by Edge Device

Info

Publication number: KR102411209B1
Application number: KR1020210186805A
Authority: KR
Inventors: 조문석; 홍성견; 서인석
Original assignee: 쿨사인 주식회사
Priority date: 2021-12-24
Filing date: 2021-12-24
Publication date: 2022-06-22

Abstract

The present invention relates to an image classification system and method based on object detection events generated by an edge device. The system comprises an image capturing device and a main server. The image capturing device includes: a capture module that captures images and generates image data; a first analysis module that, when a subject is detected in the image input through the capture module, analyzes the shape and behavior of the subject, determines the type of the subject as one of a person, an animal, a plant, or an object, and generates metadata comprising a major classification code corresponding to that type and the position of the subject; an image processing module that merges the metadata with the image data to generate post-processed data; and a first communication module that transmits the post-processed data generated by the image processing module. The main server includes: a second communication module that receives the post-processed data from the image capturing device; an extraction module that extracts the major classification code and position information from the metadata contained in the post-processed data received by the second communication module; and a second analysis module that, for each major classification code extracted by the extraction module, performs sub-classification according to the classification criteria stored in a storage module, adds the corresponding sub-classification code to the metadata, and merges the metadata with the image data to regenerate the post-processed data. According to the present invention, inference is performed on both the edge device (the image capturing device in the present invention) and the main server: the edge device detects objects and the main server classifies them, so that images are tagged and then stored or captured, and a user can later search for images under various conditions using tags and tag combinations.

Description

Image classification system based on object detection event generated by edge device {System and Method for Image Classification Based on Object Detection Events by Edge Device}

The present invention relates to an image classification system and method based on object detection events generated by an edge device. More specifically, it relates to a system and method comprising: an image capturing device including a capture module that captures images and generates image data, a first analysis module that, when a subject is detected in the image input through the capture module, analyzes the shape and behavior of the subject, determines the type of the subject as one of a person, an animal, a plant, or an object, and generates metadata comprising a major classification code corresponding to that type and the position of the subject, an image processing module that merges the metadata with the image data to generate post-processed data, and a first communication module that transmits the post-processed data generated by the image processing module; and a main server including a second communication module that receives the post-processed data from the image capturing device, an extraction module that extracts the major classification code and position information from the metadata contained in the post-processed data received by the second communication module, and a second analysis module that, for each major classification code extracted by the extraction module, performs sub-classification according to the classification criteria stored in a storage module, adds the corresponding sub-classification code to the metadata, and merges the metadata with the image data to regenerate the post-processed data.

The present invention relates to a cloud-edge system for transmitting and receiving image data.

Existing video-based CCTV systems transmit the video information acquired by the video collection device (camera) at the system endpoint over a medium such as a wired or wireless network or a cable.

Depending on the interface between the camera and the control device, the video can be encoded at the endpoint or in separate equipment at a remote location.

Artificial intelligence inference systems with a similar structure likewise either run the inference process at the endpoint using the video information acquired by the camera, or transmit the acquired video information to a remote location and run inference there.

Existing artificial intelligence systems (in various applicable fields such as object detection, recognition, tracking, and classification) are divided into server-based systems, which use learning models based on deep and wide neural networks, and edge-based systems, which use shallower, lightweight learning models.

However, each edge device (video collection device) is limited in performance by constraints such as its installation location and enclosure form, so it is impossible to run various AI-based inferences such as object detection, identification, classification, and tracking on a single device.

When a central server receives video signals that have not gone through an inference process at each node and executes all inference operations itself, the number of nodes that each server can control drops significantly, which reduces scalability and efficiency.

The background technology of the present invention is disclosed in Korean Registered Patent Publication No. 10-2152237 and elsewhere, but no fundamental solution to the problems described above has been presented.

Korean Registered Patent Publication No. 10-2152237

An object of the present invention, devised to solve the problems described above, is to provide an image classification system and method based on object detection events generated by an edge device, in which inference is performed on both the edge device (the image capturing device in the present invention) and the main server: the edge device detects objects and the main server classifies them, so that images are tagged and then stored or captured, and a user can later search for images under various conditions using tags and tag combinations.

Another object of the present invention is to provide an image classification system and method based on object detection events generated by an edge device that is easier to search because, when one or more subjects are detected and the subjects move in the same direction, the code for a plant or object is attributed to the code for a person or animal, so that one or more codes are generated.

According to a feature of the present invention for achieving the above objects, the image classification system based on object detection events generated by an edge device comprises: an image capturing device including a capture module that captures images and generates image data, a first analysis module that, when a subject is detected in the image input through the capture module, analyzes the shape and behavior of the subject, determines the type of the subject as one of a person, an animal, a plant, or an object, and generates metadata comprising a major classification code corresponding to that type and the position of the subject, an image processing module that merges the metadata with the image data to generate post-processed data that can be communicated over the ONVIF (Open Network Video Interface Forum) protocol, and a first communication module that, when the first analysis module generates metadata, generates a PTP (Precision Time Protocol) synchronization signal and transmits it to one or more nearby image capturing devices, and that periodically synchronizes the clocks of the one or more image capturing devices through PTP so that the video streaming signals of the connected image capturing devices are matched when metadata is generated; a main server including a second communication module that receives the post-processed data from the image capturing device, an extraction module that extracts the major classification code and position information from the metadata contained in the post-processed data received by the second communication module, a second analysis module that, for each major classification code extracted by the extraction module, performs sub-classification according to the classification criteria stored in a storage module, adds the corresponding sub-classification code to the metadata, and merges the metadata with the image data to regenerate the post-processed data, a control module that monitors the load caused by data transmission, reception, and processing and, when the load exceeds a preset load, receives only the image data from the image capturing device, generates the metadata through the second analysis module, and merges that metadata with the image data to generate the post-processed data, and a storage module that parses the metadata of the one or more post-processed data generated by the second analysis module and classifies and stores the post-processed data according to the major classification code and sub-classification code contained in that metadata; and a user terminal that generates search information containing at least one of a major classification code or a sub-classification code together with time range information and transmits it to the main server.

Here, when a subject determined to be an object or a plant moves in the same direction and in the same time period as a subject determined to be a person or an animal, the first analysis module determines that the former is attributed to the latter and generates metadata comprising the major classification code of the subject determined to be a person or an animal, the major classification code attributed to that code, and the position. The second analysis module performs one or more preset additional processes, among masking, mosaic processing, and personal identity identification, according to at least one of the major classification code or the minor classification code and the position information, and adds the result to the metadata; it sub-classifies one or more major classification codes, adds each corresponding sub-classification code to the metadata, merges the metadata with the image data to regenerate the post-processed data, and changes the image quality by regenerating the post-processed data.

The second analysis module further includes a facial analysis module that analyzes the image data received from the image capturing device and, using statistical data on facial shape, analyzes the race, gender, age group, or emotion of each subject. When search information containing at least one of a major classification code and a sub-classification code together with time range information is received from the user terminal, the main server retrieves the previously stored post-processed data within the time range corresponding to the search information and streams the one or more retrieved post-processed data to the user terminal, either sequentially along the movement path of the subject corresponding to the search information or simultaneously.
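The search step described above, in which the user terminal sends classification codes plus a time range and the main server returns the matching stored data in order, might be sketched as follows. The record layout (`camera`, `start`, `codes`) and the function name `search` are assumptions made for illustration; the source does not specify how stored post-processed data is indexed, and ordering by start time is only a simple stand-in for "sequentially along the movement path".

```python
from datetime import datetime


def search(records, codes, start, end):
    """Return records tagged with any requested code whose start time lies in [start, end]."""
    wanted = set(codes)
    hits = [
        r for r in records
        if start <= r["start"] <= end and wanted & set(r["codes"])
    ]
    # Ordering by start time approximates streaming along the subject's path
    # from camera to camera (an assumption, not the source's stated method).
    return sorted(hits, key=lambda r: r["start"])


if __name__ == "__main__":
    stored = [
        {"camera": "cam_b", "start": datetime(2021, 12, 24, 9, 5), "codes": ["person"]},
        {"camera": "cam_a", "start": datetime(2021, 12, 24, 9, 0), "codes": ["person"]},
        {"camera": "cam_c", "start": datetime(2021, 12, 24, 9, 2), "codes": ["object"]},
    ]
    window = (datetime(2021, 12, 24, 8, 0), datetime(2021, 12, 24, 10, 0))
    for hit in search(stored, ["person"], *window):
        print(hit["camera"], hit["start"].isoformat())
```

Matching on any of the requested codes follows the "at least one of" wording; a stricter tag-combination search would require all requested codes to be present instead.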

(Deleted)

According to the present invention as described above, it is possible to provide an image classification system and method based on object detection events generated by an edge device, in which inference is performed on both the edge device (the image capturing device in the present invention) and the main server: the edge device detects objects and the main server classifies them, so that images are tagged and then stored or captured, and a user can later search for images under various conditions using tags and tag combinations.

In addition, according to the present invention, it is possible to provide an image classification system and method based on object detection events generated by an edge device that is easier to search because, when one or more subjects are detected and the subjects move in the same direction, the code for a plant or object is attributed to the code for a person or animal, so that one or more codes are generated.

FIG. 1 is an exemplary diagram showing the configuration of an image classification system based on object detection events generated by an edge device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of an image classification system based on object detection events generated by an edge device according to an embodiment of the present invention.
FIG. 3 is a flowchart showing the sequence of an image classification method based on object detection events generated by an edge device according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings.

However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only to make the disclosure of the present invention complete and to fully inform those of ordinary skill in the art to which the present invention pertains of the scope of the invention, and the present invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

Hereinafter, the present invention is described with reference to the drawings that illustrate an image classification system and method based on object detection events generated by an edge device according to embodiments of the present invention.

The image classification system of the present invention, based on object detection events generated by an edge device, is intended for efficient channel management and performance improvement of systems that monitor, control, and record multi-channel video (CCTV control systems or integrated video management systems such as a VMS).

By analyzing and running inference on the acquired video through the image processing module (130) installed inside or outside the image capturing device (100), and by combining the image data with the metadata obtained from the inference process into integrated post-processed data (an image data packet) for transmission, transmission efficiency and the number of channels that can be processed simultaneously can be increased.

Here, the post-processed data is preferably encoded in a form that satisfies the ONVIF (Open Network Video Interface Forum; hereinafter ONVIF) protocol, but is not limited thereto; it may be encoded in various ways as long as the image data can be decoded and output by the user terminal (300).

The image capturing device (100) described throughout the present invention is a device that acquires images. It is generally preferably a camera, but anything capable of receiving an image signal, such as an optical sensor, a thermal imaging camera, a 3D depth camera, or a LiDAR, can be used as the image capturing device (100).

The user terminal (300) described throughout the present invention is generally preferably a personal computer, a notebook computer, or the like that outputs the post-processed data received from the image capturing device (100) and the main server (200), but is not limited thereto. That is, any device capable of transmitting and receiving data to and from the image capturing device (100) and the main server (200) by wire or wirelessly, such as a mobile phone terminal (300), a smartphone terminal (300), or a PDA terminal (300), can be used as the user terminal (300).

FIG. 1 is an exemplary diagram showing the configuration of an image classification system based on object detection events generated by an edge device according to an embodiment of the present invention, and FIG. 2 is a block diagram showing the configuration of the same system.

Referring to FIGS. 1 and 2, the image classification system and method of the present invention, based on object detection events generated by an edge device, includes an image capturing device (100), a main server (200), and a user terminal (300).

Here, the image capturing device (100) includes a capture module (110), a first analysis module (120), an image processing module (130), and a first communication module (140).

The capture module (110) captures images and generates image data.

Here, the capture module (110) further includes a voice recognition module (112).

The voice recognition module (112) recognizes voice.

When a subject is detected in the image input through the capture module (110), the first analysis module (120) analyzes the shape and behavior of the subject to determine its type, and then generates metadata on the type and position of the subject.

That is, when a subject is detected, the first analysis module (120) analyzes the shape and behavior of the subject, determines the type of the subject as one of a person, an animal, a plant, or an object, and generates metadata comprising the major classification code corresponding to that type and the position of the subject. Here, the first analysis module (120) preferably performs analysis and inference operations, such as object detection, identification, tracking, and classification, on the image data acquired through the capture module (110).
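As an illustration of the metadata the first analysis module (120) is described as producing, the following minimal sketch builds one record per frame. The code values (`"P"`, `"A"`, `"L"`, `"O"`), the field names, and the bounding-box representation of the position are assumptions for the example; the source names only the four subject types and says that a major classification code and a position are generated.

```python
from dataclasses import dataclass
from enum import Enum


class MajorCode(Enum):
    # Hypothetical code values; the source does not define the encoding.
    PERSON = "P"
    ANIMAL = "A"
    PLANT = "L"
    OBJECT = "O"


@dataclass
class Detection:
    object_id: int
    major_code: MajorCode
    bbox: tuple  # (left, top, right, bottom) in pixels


def build_metadata(timestamp, detections):
    """Assemble the metadata record that would be merged with the image data."""
    return {
        "timestamp": timestamp,
        "object_count": len(detections),
        "objects": [
            {
                "object_id": d.object_id,
                "major_code": d.major_code.value,
                "bbox": list(d.bbox),
            }
            for d in detections
        ],
    }


if __name__ == "__main__":
    frame_detections = [
        Detection(1, MajorCode.PERSON, (10, 20, 110, 220)),
        Detection(2, MajorCode.OBJECT, (60, 120, 90, 160)),
    ]
    print(build_metadata(90000, frame_detections))
```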

In addition, when a subject determined to be an object or a plant moves in the same direction and in the same time period as a subject determined to be a person or an animal, the first analysis module (120) determines that the former is attributed to the latter and generates metadata comprising the major classification code of the subject determined to be a person or an animal, the major classification code attributed to that code, and the position.

That is, when a person or animal moves together with an object or plant in some way, for example by holding it, it is preferable to determine that the object or plant is attributed to that person or animal and to generate metadata for each major classification code.

If an object or plant has moved by itself without any action by a person or animal, it is preferable to generate an abnormality signal and transmit it to the user terminal.
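The attribution rule and the abnormality case can be sketched as follows. This is a simplified illustration under stated assumptions: each track is a list of (x, y) positions sampled over the same time window, "same direction" is taken to mean headings within a hypothetical 20 degree tolerance, and the names `heading`, `same_direction`, and `attribute_objects` are invented for the example. The source does not say how direction is measured.

```python
import math


def heading(path):
    """Direction of travel in radians from first to last point, or None if stationary."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    dx, dy = x1 - x0, y1 - y0
    if math.hypot(dx, dy) < 1e-6:
        return None
    return math.atan2(dy, dx)


def same_direction(path_a, path_b, max_angle_deg=20.0):
    """True if both paths move and their headings differ by at most max_angle_deg."""
    ha, hb = heading(path_a), heading(path_b)
    if ha is None or hb is None:
        return False
    diff = abs(ha - hb) % (2 * math.pi)
    diff = min(diff, 2 * math.pi - diff)
    return math.degrees(diff) <= max_angle_deg


def attribute_objects(tracks):
    """Return (attribution, anomalies) for tracks observed in the same time window.

    tracks maps a track id to {"major": ..., "path": [(x, y), ...]}.
    attribution maps a plant/object id to the person/animal id it moves with;
    anomalies lists plant/object ids that moved with no person or animal.
    """
    carriers = {i: t for i, t in tracks.items() if t["major"] in ("person", "animal")}
    attribution, anomalies = {}, []
    for i, t in tracks.items():
        if t["major"] not in ("plant", "object"):
            continue
        if heading(t["path"]) is None:
            continue  # a stationary item needs neither attribution nor an alert
        owner = next(
            (c for c, ct in carriers.items() if same_direction(t["path"], ct["path"])),
            None,
        )
        if owner is None:
            anomalies.append(i)
        else:
            attribution[i] = owner
    return attribution, anomalies


if __name__ == "__main__":
    tracks = {
        1: {"major": "person", "path": [(0, 0), (10, 0)]},
        2: {"major": "object", "path": [(1, 1), (11, 1)]},
        3: {"major": "object", "path": [(5, 5), (5, -5)]},
    }
    print(attribute_objects(tracks))
```

A practical system would probably also require the two subjects to be close to each other; that check is left out because the source mentions only direction and time.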

The image processing module (130) merges the metadata with the image data to generate post-processed data. Here, the metadata is preferably the result produced by analysis and inference on the image data. The metadata is preferably carried in RTCP (Real-time Transport Control Protocol; hereinafter RTCP) packets transmitted in real time. The packet type of the RTCP packet is preferably set to 204 (application-specific RTCP, RFC 1889).

The RTCP packet preferably contains RTP-TIMESTAMP, which is the reference value for synchronizing video frames with inference results, a PAYLOAD-TYPE such as detection data, and OBJECT-COUNT (OBJECT-ID, OBJECT-BBOX-LEFT-TOP coordinates, OBJECT-BBOX-RIGHT-BOTTOM coordinates, and so on). In addition, when generating the post-processed data, the image processing module (130) includes in it the voice data recognized and generated by the voice recognition module.
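To make the packet description concrete, the sketch below packs and unpacks an application-defined RTCP packet (packet type 204) that carries the listed fields. The outer header follows the standard APP packet layout (version, subtype, packet type, length in 32-bit words minus one, SSRC, four-character name). Everything inside the application data is an assumption for illustration: the field widths, the `DETC` name, and the value 1 for "detection data" are hypothetical, since the source lists the fields but not their binary layout.

```python
import struct

RTCP_VERSION = 2
RTCP_PT_APP = 204          # application-defined RTCP packet type
PAYLOAD_DETECTION = 1      # hypothetical value meaning "detection data"


def pack_detection_app(ssrc, rtp_timestamp, objects, name=b"DETC", subtype=0):
    """Pack detections into an RTCP APP packet.

    objects is a list of (object_id, left, top, right, bottom) tuples.
    """
    # Application data: timestamp, payload type, object count, 2 reserved bytes.
    body = struct.pack("!IBBH", rtp_timestamp, PAYLOAD_DETECTION, len(objects), 0)
    for obj_id, left, top, right, bottom in objects:
        body += struct.pack("!I4H", obj_id, left, top, right, bottom)  # 12 bytes each
    payload = struct.pack("!I4s", ssrc, name) + body
    # RTCP length field: packet length in 32-bit words, minus one.
    length_words = (4 + len(payload)) // 4 - 1
    header = struct.pack(
        "!BBH", (RTCP_VERSION << 6) | (subtype & 0x1F), RTCP_PT_APP, length_words
    )
    return header + payload


def unpack_detection_app(packet):
    """Inverse of pack_detection_app; raises ValueError on a non-APP packet."""
    first, packet_type, _length = struct.unpack_from("!BBH", packet, 0)
    if first >> 6 != RTCP_VERSION or packet_type != RTCP_PT_APP:
        raise ValueError("not an RTCP APP packet")
    ssrc, name = struct.unpack_from("!I4s", packet, 4)
    rtp_timestamp, payload_type, count, _reserved = struct.unpack_from("!IBBH", packet, 12)
    objects = [struct.unpack_from("!I4H", packet, 20 + 12 * i) for i in range(count)]
    return {
        "ssrc": ssrc,
        "name": name,
        "rtp_timestamp": rtp_timestamp,
        "payload_type": payload_type,
        "objects": objects,
    }


if __name__ == "__main__":
    pkt = pack_detection_app(0x1234, 90000, [(7, 10, 20, 110, 220)])
    print(len(pkt), unpack_detection_app(pkt))
```

Keeping the application data a multiple of four bytes matters because RTCP packet lengths are counted in 32-bit words; the reserved field and the 12-byte object records are chosen for that reason.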

The first communication module (140) transmits the post-processed data generated by the image processing module (130). Here, the first communication module (140) is characterized in that, when metadata is generated by the first analysis module (120), it generates a PTP (Precision Time Protocol; hereinafter PTP) synchronization signal and transmits it to one or more nearby image capturing devices (100).

That is, the first communication module (140) preferably synchronizes the clocks of one or more image capturing devices (100) periodically through PTP, so that the video streaming signals of the connected image capturing devices (100) are matched when metadata is generated.

When N image capturing devices (100) are installed, it is preferable to synchronize the clocks of the N image capturing devices (100) so that the main server (200) or the user terminal (300) receives the synchronized image data from them and assembles the overall picture.
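Once the device clocks agree, matching streams at the moment metadata is generated reduces to finding, for each camera, the frame closest in time to the event. The sketch below shows only that matching step and does not implement PTP itself; it assumes frame timestamps are already on a common clock, and the names `nearest`, `match_frames`, and the 20 ms tolerance are hypothetical.

```python
from bisect import bisect_left


def nearest(sorted_ts, t):
    """Return the timestamp in a non-empty sorted list that is closest to t."""
    i = bisect_left(sorted_ts, t)
    candidates = sorted_ts[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda c: abs(c - t))


def match_frames(event_ts, streams, tolerance=0.020):
    """Map each camera to its frame timestamp nearest the event, within tolerance.

    streams maps a camera id to a sorted list of frame timestamps in seconds.
    Cameras with no frame close enough to the event are left out.
    """
    matched = {}
    for camera, timestamps in streams.items():
        if not timestamps:
            continue
        best = nearest(timestamps, event_ts)
        if abs(best - event_ts) <= tolerance:
            matched[camera] = best
    return matched


if __name__ == "__main__":
    streams = {
        "cam_a": [0.000, 0.033, 0.066],
        "cam_b": [0.010, 0.043, 0.076],
        "cam_c": [0.500, 0.600],
    }
    print(match_frames(0.040, streams))
```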

The main server (200) includes a second communication module (210), an extraction module (220), a second analysis module (230), a control module (240), and a storage module (250).

The second communication module (210) receives the post-processed data from the image capturing device (100). Because the second communication module (210) receives post-processed data in which the metadata is encoded into the image data, there is no need to run a separate data parser, and the step of sorting by time order also becomes unnecessary. In addition, because the inference results of post-processed data carried in-band with the media stream arrive already ordered by execution time, video containing the metadata can be provided directly to the user terminal (300) without any separate encoding or sorting step.

The extraction module (220) extracts the type of the subject, that is, the major classification code, and the position information from the metadata contained in the post-processed data received by the second communication module (210).

For each major classification code extracted by the extraction module (220), the second analysis module (230) performs sub-classification according to the classification criteria stored in the storage module, adds the corresponding sub-classification code to the metadata, and merges the metadata with the image data to regenerate the post-processed data.

The criteria used for sub-classification in the second analysis module (230) are as follows.

Person - race, age, gender, etc.

Animal - classification by species, classification by breed, etc.

Plant - classification by species, etc.

Object - car, fire, smoke, road surface, weather (rainfall, snowfall, etc.), etc.

After the minor classification above, it is preferable to classify further by adding finer classification criteria for each minor-classification item. For example, a car can be sub-classified by vehicle type, color, model (for the same vehicle type), license plate, and so on.
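The two-level scheme above, with criteria looked up per major classification code and finer criteria added for specific items such as cars, can be sketched as a table-driven function. The table contents mirror the examples in the text; the key names, the `kind` attribute, and the idea that a classifier supplies an attribute dictionary per object are assumptions for illustration, not the stored classification criteria of the source.

```python
# First-level criteria per major classification code (from the list in the text).
CRITERIA = {
    "person": ["race", "age", "gender"],
    "animal": ["species", "breed"],
    "plant": ["species"],
    "object": ["kind"],  # e.g. car, fire, smoke, road surface, weather
}

# Finer criteria for specific items; only the car example appears in the text.
FINER_CRITERIA = {
    ("object", "car"): ["vehicle_type", "color", "model", "license_plate"],
}


def sub_classify(major, attributes):
    """Pick the attributes that apply to this major code, plus any finer ones."""
    codes = {k: attributes[k] for k in CRITERIA.get(major, []) if k in attributes}
    finer = FINER_CRITERIA.get((major, codes.get("kind")), [])
    codes.update({k: attributes[k] for k in finer if k in attributes})
    return codes


def regenerate(metadata, attributes_by_id):
    """Return a copy of the metadata with sub-classification codes added per object."""
    out = dict(metadata)
    out["objects"] = [
        {
            **obj,
            "sub_codes": sub_classify(
                obj["major_code"], attributes_by_id.get(obj["object_id"], {})
            ),
        }
        for obj in metadata["objects"]
    ]
    return out


if __name__ == "__main__":
    meta = {"objects": [{"object_id": 1, "major_code": "object"}]}
    attrs = {1: {"kind": "car", "color": "red", "model": "sedan", "age": 3}}
    print(regenerate(meta, attrs))
```

Attributes that do not belong to the object's major code (here `age` on a car) are dropped, which keeps each object's codes limited to the criteria defined for its class.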

A person can be further classified by identity (confirmed through the facial analysis module), body part, posture, clothing, and so on. In other words, the major classification distinguishes among people, animals, plants, and objects, and the sub-classification preferably distinguishes the characteristics of a subject within its major classification. In addition, the second analysis module 230 performs one or more kinds of additional processing, chosen from masking, mosaic processing, and personal identification, as preset according to the position information and at least one of the major classification code and the sub-classification code, and adds the result to the metadata.
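The way preset additional processing could be looked up from the classification codes and position information can be sketched as follows. This is only an illustrative sketch: the code values, the `RULES` table, the `Metadata` layout, and the function name are all assumptions, since the description does not define a concrete encoding.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical code values; the description does not define a concrete encoding.
PERSON, ANIMAL, PLANT, OBJECT = "PERSON", "ANIMAL", "PLANT", "OBJECT"

# Hypothetical preset table: (major code, sub code or None for "any") -> actions.
RULES = {
    (PERSON, None): ["mosaic"],
    (PERSON, "registered_resident"): ["identify"],
    (OBJECT, "license_plate"): ["mask"],
}


@dataclass
class Metadata:
    major_code: str
    position: tuple  # assumed layout: (x, y, width, height) in pixels
    sub_code: Optional[str] = None
    additional_processing: list = field(default_factory=list)


def apply_additional_processing(meta, rules=RULES):
    """Record the preset additional processing for a subject in its metadata."""
    # A rule for the exact (major, sub) pair wins over the major-code default.
    specific = rules.get((meta.major_code, meta.sub_code))
    general = rules.get((meta.major_code, None), [])
    actions = specific if specific is not None else general
    # Each action is tied to the subject's position so it can be applied there.
    meta.additional_processing = [
        {"action": action, "region": meta.position} for action in actions
    ]
    return meta
```

Keeping the presets in a table keyed by code makes it straightforward for different cameras, or the user, to configure different processing per code without changing the lookup logic.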

Consequently, when N different image capturing devices A-Z each transmit detection information for different types of objects, different additional processing can be applied according to the major classification code or the sub-classification code. In addition, if the accuracy value included in the metadata received by the second communication module 210 is less than or equal to a preset accuracy value, the second analysis module 230 reanalyzes the image data included in the post-processing data and merges the reanalyzed metadata with the image data to regenerate the post-processing data.
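The accuracy check described above can be sketched as a small function. The dictionary layout of the post-processing data, the `accuracy` key, and the `reanalyze` callback are assumptions made for illustration; the description does not specify how the data is structured.

```python
def regenerate_if_low_accuracy(post, preset_accuracy, reanalyze):
    """Re-run analysis on the server when the edge result is not accurate enough.

    post            -- assumed layout: {"image": ..., "metadata": {"accuracy": float, ...}}
    preset_accuracy -- the preset accuracy value
    reanalyze       -- hypothetical callable: image -> new metadata dict
    """
    if post["metadata"]["accuracy"] <= preset_accuracy:
        # Accuracy is at or below the preset value: reanalyze the image data
        # and merge the new metadata with the same image data.
        return {"image": post["image"], "metadata": reanalyze(post["image"])}
    # Accuracy is above the preset value: keep the edge result as is.
    return post
```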

Here, the second analysis module 230 can reprocess the received image data to change its image quality. The second analysis module 230 further includes a facial analysis module 232. The facial analysis module 232 analyzes the post-processing data to determine the shape of the subject's face, compares the analyzed facial shape with one or more pre-stored items of subject facial data, and, if the degree of match is less than or equal to a preset degree of match, generates an intrusion signal and transmits the intrusion signal and the corresponding post-processing data to the user terminal 300. In other words, when a person who is not already registered enters, a signal to that effect is preferably sent to the user terminal 300 so that the user can confirm that a new person has entered.

When a sub-classification code is generated, the second analysis module 230 retrieves the voice data matched in advance to that sub-classification code and compares it with the voice data stored in the post-processing data. If the degree of match is less than or equal to a preset degree of match, it generates an abnormality signal and transmits it to the user terminal 300; if the degree of match is greater than or equal to the preset degree of match, it proceeds with personal identification. That is, when the subject is a person, personal identification is performed, and the voice data stored in advance for the resulting sub-classification code is retrieved and compared with the voice data contained in the post-processing data to determine whether the two belong to the same person. If they do, personal identification is completed. If they do not, it is judged that someone has infiltrated by impersonating the individual corresponding to the sub-classification code, and an abnormality signal is generated and sent to the user terminal 300 so that the user can assess the abnormality.
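The face and voice checks described above amount to a two-stage decision, which might be sketched as follows. The score range, the default thresholds, and all names are assumptions. Note also that the text uses "or below" and "or above" for both branches, so the equal case is ambiguous; this sketch assigns it to the matching branch, which is a choice made here and not stated in the description.

```python
from enum import Enum


class Outcome(Enum):
    IDENTIFIED = "identified"            # personal identification completed
    INTRUSION = "intrusion_signal"       # face does not match any registered person
    ABNORMAL = "abnormality_signal"      # face matches, but the voice does not


def verify_person(face_score, voice_score, face_threshold=0.8, voice_threshold=0.8):
    """Decide which signal, if any, to send to the user terminal.

    face_score / voice_score are assumed to be similarity values in [0, 1]
    produced by hypothetical face and voice matchers.
    """
    if face_score < face_threshold:
        # No registered face matches: a new person has entered.
        return Outcome.INTRUSION
    if voice_score < voice_threshold:
        # The face matched a registered person but the voice did not:
        # treated as possible impersonation.
        return Outcome.ABNORMAL
    return Outcome.IDENTIFIED
```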

Here, the facial analysis module 232 preferably analyzes the image data received from the image capturing device 100 and uses pre-stored statistical data on facial shapes to estimate each subject's race, gender, age group, emotion, and the like. In addition, when the major classification code included in the metadata is "person", the facial analysis module 232 analyzes the shape of the subject's face, compares the analyzed facial shape with one or more pre-stored items of subject facial data, and, if the degree of match is greater than or equal to a preset degree of match, proceeds with personal identification and performs the additional processing set in advance for the matched subject facial data. In other words, personal identification is performed through the facial analysis module 232; when a pre-stored individual is detected, the additional processing matched to that individual is performed, and when an individual who is not pre-stored is detected, a signal indicating that a new person has entered is preferably sent to the user terminal 300.

Here, the additional processing is set by being matched in advance to a major classification code or a sub-classification code.

The user can also configure the additional processing for each major classification code or sub-classification code.

The control module 240 monitors the load generated by data transmission and reception and by data processing. When the load is higher than a preset load, it causes the main server to receive only the image data from the image capturing device 100, generate the metadata through the second analysis module 230, and merge that metadata with the image data to generate the post-processing data.
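The switching behavior of the control module can be sketched as a mode selection followed by a per-frame handler. The load metric, the frame layout, and the names `AnalysisMode`, `select_mode`, and `handle_frame` are assumptions for illustration only.

```python
from enum import Enum


class AnalysisMode(Enum):
    EDGE = "edge"      # device sends post-processing data (image + metadata)
    SERVER = "server"  # device sends image data only; server generates metadata


def select_mode(load, preset_load):
    """Switch to server-side analysis only when the load exceeds the preset load."""
    return AnalysisMode.SERVER if load > preset_load else AnalysisMode.EDGE


def handle_frame(frame, mode, server_analyze):
    """Build post-processing data for one frame under the selected mode.

    frame          -- assumed layout: {"image": ..., "metadata": ... or None}
    server_analyze -- hypothetical callable standing in for the second analysis module
    """
    if mode is AnalysisMode.SERVER:
        metadata = server_analyze(frame["image"])
    else:
        metadata = frame["metadata"]
    return {"image": frame["image"], "metadata": metadata}
```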

Here, the control module 240 preferably monitors and manages the resources, processes, and so on of the main server 200. It also preferably monitors and manages the communication-related data (network speed, network connection status, and logs) between the image capturing device 100 and the main server 200. In other words, the control module 240 preferably monitors the state of the image capturing device 100 and the main server 200, as well as the load and network connection status of each edge (image capturing device), and switches the analysis method accordingly.

The storage module 250 parses the metadata of the one or more items of post-processing data generated by the second analysis module 230, and classifies and stores the post-processing data according to the major classification code and sub-classification code in that metadata.
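One possible shape for such code-based storage is an index keyed by the pair of classification codes. The class name, the dictionary layout of the post-processing data, and the in-memory index are assumptions; an actual implementation would likely use a database.

```python
from collections import defaultdict


class PostProcessingStore:
    """In-memory sketch of storage classified by (major code, sub code)."""

    def __init__(self):
        self._by_code = defaultdict(list)

    def save(self, post):
        # Parse the metadata and file the item under its classification codes.
        meta = post["metadata"]
        key = (meta["major_code"], meta.get("sub_code"))
        self._by_code[key].append(post)

    def find(self, major_code=None, sub_code=None):
        """Return stored items matching the given codes (None means "any")."""
        results = []
        for (major, sub), posts in self._by_code.items():
            if major_code is not None and major != major_code:
                continue
            if sub_code is not None and sub != sub_code:
                continue
            results.extend(posts)
        return results
```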

Here, when the main server 200 receives from the user terminal 300 search information that includes time range information and at least one of a major classification code and a sub-classification code, it retrieves the stored post-processing data that falls within the time range specified by the search information and streams the one or more retrieved items to the user terminal 300. It either streams the post-processing data sequentially, following the movement path of the subject specified by the search information, or streams the one or more items of post-processing data simultaneously.

That is, when the user terminal 300 searches for a specific subject (sub-classification code) within a specific time window, for example between 10:00 and 12:00 on January 1, the main server 200 checks whether that subject was detected by one or more image capturing devices 100 between 10:00 and 12:00 on January 1. If it was, the main server 200 can stream the one or more items of post-processing data to the user terminal 300 sequentially, in the order in which the subject moved along its path, or it can output all post-processing data in which the subject was observed to the user terminal 300 simultaneously.

If the subject was not detected, the search range can be gradually widened to the periods before and after 10:00 to 12:00 on January 1. Here, the main server 200 allows a full-screen layout matching the search information to be configured, and the screen can be recorded and stored while it is being output to the user terminal 300.
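The time-range search with gradual widening can be sketched as follows. The record layout, the widening step of one hour, the limit on the number of widenings, and the use of detection time as a stand-in for the subject's movement order are all assumptions made for this sketch.

```python
from datetime import datetime, timedelta


def search(records, sub_code, start, end, step=timedelta(hours=1), max_widen=3):
    """Find detections of a subject, widening the time window if nothing is found.

    records -- assumed layout: list of {"camera": str, "sub_code": str, "time": datetime}
    Returns matching records ordered by time, which approximates the order in
    which the subject passed the cameras.
    """
    for i in range(max_widen + 1):
        # i == 0 is the requested window; each later pass widens it on both sides.
        lo, hi = start - i * step, end + i * step
        hits = [
            r for r in records
            if r["sub_code"] == sub_code and lo <= r["time"] <= hi
        ]
        if hits:
            return sorted(hits, key=lambda r: r["time"])
    return []
```

The ordered result can be streamed one item after another to follow the subject's path, or all items can be sent at once for simultaneous display.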

When the user terminal 300 receives post-processing data from one or more image capturing devices 100 or from the main server 200, it synchronizes the one or more received items of post-processing data using the PTP synchronization signal and outputs them. The user terminal 300 also generates search information including at least one of specific subject information and behavior information and transmits it to the main server 200. In addition, when an intrusion signal is received from the main server 200, the user terminal 300 outputs the post-processing data corresponding to the intrusion signal on its screen.

FIG. 3 is a flowchart showing the sequence of the image classification method based on object detection events generated by an edge device according to an embodiment of the present invention.

Referring to FIG. 3, the image capturing device first captures video to generate image data (S110).

Next, when a subject is detected in the video input through step (S110), the shape and behavior of the subject are analyzed to determine the type of the subject as any one of a person, an animal, a plant, and an object, and metadata containing the major classification code corresponding to the type of the subject and the position of the subject is generated (S120). Here, in step (S120), if a subject determined to be an object or a plant moves in the same direction, during the same time period, as a subject determined to be a person or an animal, the object or plant is determined to belong to that person or animal, and metadata is generated containing the major classification code of the person or animal, the major classification code of the subject belonging to it, and the position.

Next, the metadata is merged with the image data to generate post-processing data (S130). The image capturing device then transmits the post-processing data generated in step (S130) to the main server (S140), and the main server receives the post-processing data from the image capturing device (S150). Next, the major classification code and the position information are extracted from the metadata included in the post-processing data received by the second communication module (S160).

Next, for each major classification code extracted in step (S160), the main server sub-classifies the subject according to the classification criteria stored in the storage module, adds the sub-classification code corresponding to that sub-classification to the metadata, and merges the metadata with the corresponding image data to regenerate the post-processing data (S170). Here, step (S170) performs one or more kinds of additional processing, chosen from masking, mosaic processing, and personal identification, as preset according to the position information and at least one of the major classification code and the sub-classification code, and adds the result to the metadata.
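The belonging determination in step (S120), where an object or plant moving in the same direction as a person or animal during the same time period is attached to that person or animal, might be sketched as below. The track representation, the angular tolerance of 20 degrees, and all names are assumptions; the description does not say how "the same direction" is measured.

```python
import math


def direction(track):
    """Net displacement of a track given as [(x, y), ...] over one time window."""
    (x0, y0), (x1, y1) = track[0], track[-1]
    return (x1 - x0, y1 - y0)


def moves_together(track_a, track_b, max_angle_deg=20.0):
    """True if both tracks move and their directions differ by a small angle."""
    ax, ay = direction(track_a)
    bx, by = direction(track_b)
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if na == 0 or nb == 0:
        return False  # a stationary subject has no movement direction
    cos = max(-1.0, min(1.0, (ax * bx + ay * by) / (na * nb)))
    return math.degrees(math.acos(cos)) <= max_angle_deg


def attach_belongings(subjects):
    """Build metadata for each person/animal, listing objects/plants that belong to it.

    subjects -- assumed layout: list of {"id": str, "major_code": str, "track": [(x, y), ...]},
                with all tracks sampled over the same time window.
    """
    result = []
    for owner in subjects:
        if owner["major_code"] not in ("PERSON", "ANIMAL"):
            continue
        belongings = [
            {"id": s["id"], "major_code": s["major_code"]}
            for s in subjects
            if s["major_code"] in ("OBJECT", "PLANT")
            and moves_together(owner["track"], s["track"])
        ]
        result.append({
            "major_code": owner["major_code"],
            "belongings": belongings,
            "position": owner["track"][-1],
        })
    return result
```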

In addition, the main server parses the metadata of the one or more items of post-processing data generated in step (S170), and classifies and stores the post-processing data according to the major classification code and sub-classification code in that metadata.

In addition, in step (S170), when the major classification code included in the metadata is "person", the shape of the subject's face is analyzed and compared with one or more pre-stored items of subject facial data, and, if the degree of match is less than or equal to a preset degree of match, an intrusion signal is generated and the intrusion signal and the corresponding post-processing data are transmitted to the user terminal.

At this time, when the intrusion signal is received from the main server, the user terminal outputs the post-processing data corresponding to the intrusion signal on its screen.

Also, when the major classification code included in the metadata is "person", the shape of the subject's face is analyzed and compared with one or more pre-stored items of subject facial data, and, if the degree of match is greater than or equal to a preset degree of match, personal identification is performed, followed by the additional processing set in advance for the matched subject facial data.

Here, the additional processing is set by being matched in advance to a major classification code or a sub-classification code.

Finally, when the main server receives from the user terminal search information that includes time range information and at least one of a major classification code and a sub-classification code, it retrieves the stored post-processing data that falls within the time range specified by the search information and streams the one or more retrieved items to the user terminal, either sequentially, following the movement path of the subject specified by the search information, or simultaneously (S180). At this time, the user terminal generates the search information, including the time range information and at least one of a major classification code and a sub-classification code, and transmits it to the main server.

Those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be embodied in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of the present invention is indicated by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: image capturing device 110: photographing module
112: voice recognition module 120: first analysis module
130: image processing module 140: first communication module
200: main server 210: second communication module
220: extraction module 230: second analysis module
232: facial analysis module 240: control module
250: storage module 300: user terminal

Claims

An image classification system based on object detection events generated by an edge device, the system comprising:
an image capturing device including: a photographing module that captures video to generate image data; a first analysis module that, when a subject is detected in the video input through the photographing module, analyzes the shape and behavior of the subject, determines the type of the subject as any one of a person, an animal, a plant, and an object, and generates metadata containing a major classification code corresponding to the type of the subject and the position of the subject; an image processing module that merges the metadata with the image data to generate post-processing data that can be communicated over the ONVIF (Open Network Video Interface Forum) protocol; and a first communication module that, when the metadata is generated by the first analysis module, generates a PTP (Precision Time Protocol) synchronization signal and transmits it to one or more adjacent image capturing devices, and that periodically synchronizes time with the one or more image capturing devices through PTP so that the video streaming signals of the connected image capturing devices match when metadata is generated;
a main server including: a second communication module that receives the post-processing data from the image capturing device; an extraction module that extracts the major classification code and the position information from the metadata included in the post-processing data received by the second communication module; a second analysis module that, for each major classification code extracted by the extraction module, sub-classifies the subject according to classification criteria stored in a storage module, adds the sub-classification code corresponding to the sub-classification to the metadata, and merges the metadata with the corresponding image data to regenerate the post-processing data;
a control module that monitors the load generated by data transmission and reception and by data processing and, when the load is higher than a preset load, causes only the image data to be received from the image capturing device, the metadata to be generated through the second analysis module, and the metadata to be merged with the image data to generate the post-processing data; and
the storage module, which parses the metadata of the one or more items of post-processing data generated by the second analysis module, and classifies and stores the post-processing data according to the major classification code and the sub-classification code in the metadata; and
a user terminal that generates search information including time range information and at least one of a major classification code and a sub-classification code, and transmits the search information to the main server,
wherein the first analysis module,
when a subject determined to be an object or a plant moves in the same direction, during the same time period, as a subject determined to be a person or an animal, determines that the object or plant belongs to the subject determined to be a person or an animal, and generates metadata containing the major classification code of the subject determined to be a person or an animal, the major classification code of the subject belonging to it, and the position,
wherein the second analysis module
performs one or more kinds of additional processing, chosen from masking, mosaic processing, and personal identification, as preset according to the position information and at least one of the major classification code and the sub-classification code, and adds the result to the metadata; and
sub-classifies each of the one or more major classification codes, adds each corresponding sub-classification code to the metadata, merges the metadata with the corresponding image data to regenerate the post-processing data, and changes the image quality when regenerating the post-processing data,
wherein the second analysis module
further comprises a facial analysis module that analyzes the image data received from the image capturing device and uses statistical data on facial shapes to analyze the race, gender, age group, or emotion of each subject, and
wherein the main server,
when search information including time range information and at least one of a major classification code and a sub-classification code is received from the user terminal, retrieves the stored post-processing data within the time range specified by the search information and streams the one or more retrieved items of post-processing data to the user terminal, either sequentially, following the movement path of the subject specified by the search information, or simultaneously.

delete