KR20230095365A

KR20230095365A - Image analysis apparatus for detecting abnormal event, and method thereof

Info

Publication number: KR20230095365A
Application number: KR1020210184765A
Authority: KR
Inventors: 권택순; 정의정; 고경왕
Original assignee: (주)이스트소프트
Priority date: 2021-12-22
Filing date: 2021-12-22
Publication date: 2023-06-29

Abstract

An image analysis apparatus for detecting abnormal behavior according to the present invention may include: a preprocessing module that generates image data for analysis by degrading original data, which is image data captured at high resolution, to a low resolution; a detection neural network module that selects, from the image data for analysis, a local region in which abnormal behavior exists as an abnormal behavior detection region; a local data extraction module that extracts, from the original data, local data for the region corresponding to the abnormal behavior detection region selected by the detection neural network module; and a classification neural network module that takes the local data as input and determines whether abnormal behavior is present.

Description

Image analysis apparatus for detecting abnormal event, and method thereof

The present invention relates to a technique for detecting abnormal behavior in images collected from a camera.

Surveillance cameras are used for a variety of purposes, including industrial, educational, medical, traffic control, and disaster prevention applications. Their effects have been empirically demonstrated: preventing and deterring crime, making it easier to find and arrest offenders, reducing citizens' fear of crime, and supplementing limited police manpower through installation in key areas.

Recently, image data collected from CCTVs, drones, smartphones, and the like has been fed to image analysis algorithms in order to analyze abnormal events involving the objects in that data. Technology that detects abnormal behavior from image data and thereby prevents accidents before they occur is attracting attention, but processing a vast amount of image data requires high computing power. In particular, as the quality and quantity of image data that can be collected from CCTVs, drones, smartphones, and the like increase rapidly, small embedded devices lack the performance to process the vast amount of image data captured at high resolution. As a result, the images must either be transmitted to a server with high-performance computing resources and processed after the fact, or be analyzed directly by a person.

Meanwhile, image analysis based on deep neural network learning has advanced steadily since it surpassed human accuracy in 2015, and it is now used to solve real industrial and social problems. However, these advances in deep neural network technology demand ever greater computing power, which makes the technology difficult to use on low-end embedded devices.

The present invention is intended to solve the above problems of the prior art, and its object is to provide an image analysis apparatus and method capable of detecting abnormal behavior in high-resolution image data even in a low-end embedded environment.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood from the description below.

The present invention may further include a detection learning module that trains the detection neural network module. When the detection neural network module outputs a latent vector indicating a region of the input image data in which abnormal behavior exists, the detection learning module trains the detection neural network module so that the difference between the latent vector and a preset expected value is minimized.

In addition, the present invention may further include a classification learning module that trains the classification neural network module. When the classification neural network module calculates an output value indicating whether the input image data contains abnormal behavior, the classification learning module trains the classification neural network module so that the difference between the output value and a preset expected value is minimized.

In another aspect, an image analysis method for detecting abnormal behavior according to the present invention may include: a step in which a preprocessing module generates image data for analysis by degrading original data, which is image data captured at high resolution, to a low resolution; a step in which a detection neural network module selects, from the image data for analysis, a local region in which abnormal behavior exists as an abnormal behavior detection region; a step in which a local data extraction module extracts, from the original data, local data for the region corresponding to the abnormal behavior detection region selected by the detection neural network module; and a step in which a classification neural network module takes the local data as input and determines whether abnormal behavior is present.

The method according to the present invention may further include a detection learning step of training the detection neural network module. The detection learning step may include: a step in which the preprocessing module generates degraded data by degrading input high-resolution image data to a low resolution; a step in which the detection neural network module outputs a latent vector indicating a region of the degraded data in which abnormal behavior exists; and a step in which the detection learning module trains the detection neural network module so that the difference between the latent vector and a preset expected value is minimized.

In addition, the method according to the present invention may further include a step in which the preprocessing module converts the degraded data into a plurality of degraded data through data augmentation.

The method according to the present invention may also further include a classification learning step of training the classification neural network module. The classification learning step may include: a step in which the classification neural network module calculates an output value indicating whether the input image data contains abnormal behavior; and a step in which the classification learning module trains the classification neural network module so that the difference between the output value produced by the classification neural network module and a preset expected value is minimized.

In yet another aspect, the present invention may provide a computer-readable recording medium on which a program for causing a computer to execute the above image analysis method for detecting abnormal behavior is recorded.

The image analysis apparatus and method for detecting abnormal behavior according to the present invention implement a deep neural network that detects abnormal behavior on a low-end embedded device and resolve the performance degradation that arises when analyzing high-resolution image data. Abnormal behavior can therefore be detected in real time without high-performance computing equipment or human operators, which can contribute to building a social safety net.

FIG. 1 is a schematic diagram illustrating the configuration of an image analysis apparatus for deep neural network based abnormal behavior detection, and an image analysis method for detecting abnormal behavior, according to an embodiment of the present invention.
FIG. 2 is a schematic diagram illustrating the preprocessing of original data according to an embodiment of the present invention.
FIG. 3 is a schematic diagram illustrating the training process of the detection neural network module according to an embodiment of the present invention.
FIG. 4 is a schematic diagram illustrating the training process of the classification neural network module according to an embodiment of the present invention.

Prior to the detailed description of the present invention, the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way. Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas, so it should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing.

In addition, the terms used in this specification are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. Terms such as "include" or "have" used in this specification are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In addition, terms such as "... unit", "... device", and "module" used in the specification mean a unit that processes at least one function or operation, and such a unit may be implemented in hardware, in software, or in a combination of hardware and software.

In addition, "a", "an", "one", "the", and similar terms may be used, in the context of describing the present invention (particularly in the context of the claims below), to cover both the singular and the plural unless otherwise indicated in this specification or clearly contradicted by the context.

Embodiments within the scope of the present invention also include computer-readable media that carry or store computer-executable instructions or data structures. Such computer-readable media may be any available media accessible by a general-purpose or special-purpose computer system. By way of example and not limitation, such computer-readable media may include physical storage media such as RAM, ROM, EPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store or carry desired program code means in the form of computer-executable instructions, computer-readable instructions, or data structures and that can be accessed by a general-purpose or special-purpose computer system. In the following description and claims, computer-readable instructions include, for example, instructions and data that cause a general-purpose or special-purpose computer system to perform a particular function or group of functions. Computer-executable instructions may be, for example, binaries, intermediate-format instructions such as assembly language, or even source code.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Note that the same components are denoted by the same reference numerals in the drawings wherever possible. Detailed descriptions of well-known functions and configurations that could obscure the gist of the present invention are omitted. For the same reason, some components in the accompanying drawings are exaggerated, omitted, or shown schematically, and the size of each component does not entirely reflect its actual size.

First, an image analysis apparatus and method for deep neural network based abnormal behavior detection according to an embodiment of the present invention will be described. FIG. 1 is a schematic diagram illustrating the configuration of the image analysis apparatus and the image analysis method for detecting abnormal behavior according to an embodiment of the present invention, FIG. 2 is a schematic diagram illustrating the preprocessing of original data, FIG. 3 is a schematic diagram illustrating the training process of the detection neural network module, and FIG. 4 is a schematic diagram illustrating the training process of the classification neural network module.

As shown in FIG. 1, the image analysis apparatus for detecting abnormal behavior according to the present invention may include: a preprocessing module (10) that generates image data for analysis (D2) by degrading original data (D1), which is image data captured at high resolution, to a low resolution; a detection neural network module (20) that selects, from the image data for analysis (D2), a local region in which abnormal behavior exists as an abnormal behavior detection region (D3); a local data extraction module (40) that extracts, from the original data (D1), local data (D4) for the region corresponding to the abnormal behavior detection region selected by the detection neural network module (20); and a classification neural network module (30) that takes the local data (D4) as input and determines whether abnormal behavior is present.

Here, image data means a moving image or still image captured at high resolution by a camera such as a CCTV or smartphone camera. "Abnormal behavior" may be defined as behavior that is not normal among behaviors composed of several motions. Depending on the type of object (person or thing), the number of objects (one person, or two or more), and the detailed motion (swinging, kicking, pushing, pulling, throwing, falling, and so on), it may be classified into various types such as assault, fighting, theft, and vandalism. The present invention trains and uses deep neural networks to detect whether abnormal behavior exists in high-resolution images collected by a camera, even in a low-end embedded environment (e.g., a drone, CCTV, or smartphone) that includes a camera and a graphics processing unit (GPU).

When detecting abnormal behavior in image data, the computing performance that a deep neural network module requires is generally proportional to the resolution (size) of the input image. The preprocessing module (10) therefore degrades the high-resolution image delivered by the camera to a low-resolution image that the low-performance processor of a low-end embedded device can handle. The detection neural network module (20) is trained on these degraded low-resolution images and is then used to detect abnormal behavior.

FIG. 2 shows the process of generating degraded data (L2) for training the detection neural network module (20) by degrading original data (L1), which is image data captured at high resolution, to a low resolution. As shown in FIG. 2, when original data (L1) with a resolution of, for example, 2K or higher is input, the preprocessing module (10) generates degraded data (L2) reduced to a low resolution (for example, 640×480). In addition, when the detection neural network module (20) is being trained, the preprocessing module (10) may apply data augmentation to the degraded image data, for example by adding noise, flipping, rotating, or cropping part of the image, to generate a plurality of degraded data (L2) for use in training. Increasing the amount of training data in this way can compensate for the loss of detection performance caused by degrading the data.
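The degradation and augmentation steps described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names (`downscale`, `augment`, and the helpers), the nearest-neighbour resampling, the noise amplitude, and the small stand-in frame size are all assumptions made for the example, and images are modeled as plain lists of grayscale rows.

```python
import random


def downscale(image, out_w, out_h):
    """Nearest-neighbour downscale of an image stored as a list of pixel rows."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def hflip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image clockwise by 90 degrees."""
    return [list(col) for col in zip(*image[::-1])]


def add_noise(image, rng, amplitude=8):
    """Add uniform integer noise to every pixel, clamped to the range 0..255."""
    return [
        [min(255, max(0, p + rng.randint(-amplitude, amplitude))) for p in row]
        for row in image
    ]


def crop(image, x, y, w, h):
    """Cut out the w x h rectangle whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in image[y:y + h]]


def augment(degraded, seed=0):
    """Return the degraded frame plus noisy, flipped, rotated and cropped variants."""
    rng = random.Random(seed)
    h, w = len(degraded), len(degraded[0])
    return [
        degraded,
        add_noise(degraded, rng),
        hflip(degraded),
        rotate90(degraded),
        crop(degraded, w // 4, h // 4, w // 2, h // 2),
    ]


# Small stand-in for a high-resolution frame (hypothetical sizes, kept tiny for speed).
original = [[(x + y) % 256 for x in range(256)] for y in range(192)]
degraded = downscale(original, 64, 48)
samples = augment(degraded)
```

In a real training setup the ground-truth regions would have to be flipped, rotated, or cropped together with the image; that bookkeeping is left out of this sketch.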

Next, as shown in FIG. 3, the detection neural network module (20) detects the region of the input degraded data in which abnormal behavior exists (that is, the abnormal behavior detection region). The detection neural network module (20) may include a plurality of layers divided into an input layer, a plurality of hidden layers, and an output layer. Each of the layers includes a plurality of operations to which weights (W) are applied. The input layer and the hidden layers are preferably connected in a non-linear structure. Depending on the characteristics of the input data, a hidden layer may take the form of a convolutional layer, a residual layer, a fully connected layer, or the like. For the combination of hidden layers, various known neural network models for object detection in images (RCNN, YOLO, etc.) may be used. The output layer takes the output of the preceding hidden layer as input and outputs a latent vector that expresses, in vector form, the local region of the image data in which abnormal behavior exists (the abnormal behavior detection region).

For example, the detection neural network module (20) divides the input image into S×S grid cells (S being a natural number) and generates a latent vector representing the regions where objects exist (bounding boxes, B) and the probability that abnormal behavior exists within each bounding box (class probability, C). A bounding box B includes X and Y coordinates together with width and height information. In other words, the latent vector contains position information for each region of the input image data in which abnormal behavior exists and the probability that abnormal behavior exists within that region. For example, when an image of size 448×448 is input and a 7×7 grid is applied, the latent vector has the shape 7×7×30, and each of the 7×7 cells holds a vector of length 30 carrying that cell's bounding box B and class probability C information.
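As a rough illustration of how such a grid-shaped latent vector could be turned back into pixel regions, the sketch below decodes a simplified S×S grid. The per-cell layout `(x, y, w, h, p)` is an assumption chosen for readability and is shorter than the length-30 vector in the example above, whose exact layout the text does not spell out. The function name `decode_latent`, the convention that `x, y` are offsets within the cell while `w, h` are fractions of the whole image, and the 0.5 threshold are likewise hypothetical.

```python
def decode_latent(latent, img_w, img_h, threshold=0.5):
    """Convert an S x S grid of (x, y, w, h, p) cells into pixel-space boxes.

    x, y are the box centre as offsets (0..1) inside the grid cell,
    w, h are the box size as fractions (0..1) of the whole image,
    p is the probability that abnormal behavior exists in the box.
    Returns a list of (left, top, width, height, p) tuples.
    """
    s = len(latent)
    cell_w, cell_h = img_w / s, img_h / s
    boxes = []
    for row in range(s):
        for col in range(s):
            x, y, w, h, p = latent[row][col]
            if p < threshold:
                continue  # cell does not report abnormal behavior
            centre_x = (col + x) * cell_w
            centre_y = (row + y) * cell_h
            box_w, box_h = w * img_w, h * img_h
            boxes.append((centre_x - box_w / 2, centre_y - box_h / 2, box_w, box_h, p))
    return boxes


# 7 x 7 grid over a 448 x 448 image with a single confident cell at row 3, column 2.
grid = [[(0.0, 0.0, 0.0, 0.0, 0.0) for _ in range(7)] for _ in range(7)]
grid[3][2] = (0.5, 0.5, 0.25, 0.5, 0.9)
regions = decode_latent(grid, 448, 448)
```

With these inputs each cell covers 64×64 pixels, so the single confident cell decodes to a 112×224 box whose top-left corner is at (104, 112).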

During training, the detection neural network module (20) improves the accuracy of its abnormal behavior detection regions. The detection learning module (20a) trains the detection neural network module (20) so that the difference between the output latent vector and a preset expected value (that is, ground-truth data holding the position and probability of the abnormal behavior) is minimized. Specifically, for degraded data (L2) input as training data, the weights (W) of the deep neural network are adjusted in the direction that lowers the output of the loss function, so that the latent vector output by the detection neural network module (20) indicates the local region (L3) containing the abnormal behavior. This process of adjusting the weights is what is called "training". Ultimately, the detection neural network module (20) is trained to output a latent vector indicating the local region of the input degraded data in which abnormal behavior exists.
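The idea of adjusting weights so that a loss between the output latent vector and its expected value decreases can be shown with a toy example. The sketch below is only an assumption-laden stand-in: it replaces the deep network with a single linear layer, uses mean squared error as the loss (the text does not name a specific loss function), and trains on one hypothetical feature vector with plain gradient descent.

```python
def predict(weights, features):
    """Single linear layer: one output per weight row."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def mse(pred, target):
    """Mean squared difference between the output and the expected value."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def train_step(weights, features, target, lr=0.1):
    """One gradient-descent update that lowers the MSE loss."""
    pred = predict(weights, features)
    n = len(pred)
    updated = []
    for row, p, t in zip(weights, pred, target):
        grad_scale = 2 * (p - t) / n  # d(loss)/d(output) for this output unit
        updated.append([w - lr * grad_scale * f for w, f in zip(row, features)])
    return updated


# Hypothetical training sample: 3 input features, expected latent vector (x, y, w, h, p).
features = [1.0, 0.5, -0.5]
expected = [0.5, 0.5, 0.25, 0.5, 1.0]
weights = [[0.0] * len(features) for _ in expected]

initial_loss = mse(predict(weights, features), expected)
for _ in range(200):
    weights = train_step(weights, features, expected)
final_loss = mse(predict(weights, features), expected)
```

A real detector would use a multi-layer network, many training samples, and typically a composite loss over position and probability terms; the loop above only shows the direction of the weight update.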

Meanwhile, as shown in FIG. 4, the classification neural network module (30) is trained to calculate an output value indicating whether high-resolution image data (L4) contains abnormal behavior. Like the detection neural network module (20) described above, the classification neural network module (30) may include an input layer, a plurality of hidden layers, and an output layer. Because the size of the image data given as input to the classification neural network module (30) may vary, the module may further include convolutional layers, pooling layers, and the like so that inputs of different sizes can be processed. The classification neural network module (30) may thus be composed of a plurality of layers, each of which may perform a plurality of operations to which weights (w) are applied. The output layer of the classification neural network module (30) takes the output of the preceding hidden layer as input and calculates an output value indicating whether the image data (L4) contains abnormal behavior. For example, the output may be a real value expressing whether abnormal behavior is present, such as "1" when abnormal behavior is present and "0" when it is not. The classification learning module (30a) trains the classification neural network module (30) by adjusting the weights (w) of its deep neural network in the direction that lowers the output of a loss function whose inputs are the output value of the classification neural network module (30) and a preset expected value, that is, so that the difference between the output value and the expected value is minimized.
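A toy version of this classification training can be written as follows. It is a sketch under stated assumptions rather than the described network: the classifier is reduced to logistic regression on a hypothetical two-element feature vector, binary cross-entropy is chosen as the loss (the text only requires that the difference from the expected value be minimized), and the sample values, learning rate, and epoch count are invented for the illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def classify(weights, bias, features):
    """Return a real value in (0, 1); values near 1 indicate abnormal behavior."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


def bce(output, expected, eps=1e-12):
    """Binary cross-entropy between the output value and the expected value."""
    return -(expected * math.log(output + eps)
             + (1 - expected) * math.log(1 - output + eps))


def train(samples, epochs=500, lr=0.5):
    """Stochastic gradient descent that lowers the cross-entropy loss."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, expected in samples:
            # For sigmoid + cross-entropy, d(loss)/d(pre-activation) = output - expected.
            error = classify(weights, bias, features) - expected
            weights = [w - lr * error * f for w, f in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Hypothetical features extracted from local data; label 1 = abnormal, 0 = normal.
samples = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.1, 0.2], 0), ([0.2, 0.1], 0)]
weights, bias = train(samples)
abnormal_score = classify(weights, bias, [0.85, 0.85])
normal_score = classify(weights, bias, [0.15, 0.15])
```

Thresholding the score at 0.5 is one simple way to turn the real-valued output into the "1" or "0" decision mentioned above; the threshold itself is an assumption.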

Returning to FIG. 1, an image analysis apparatus that uses the well-trained detection neural network module (20) and classification neural network module (30) can detect abnormal behavior in high-resolution image data on a low-end embedded device without loss of performance.

That is, as shown in FIG. 1, when original data (D1), which is image data captured at high resolution, is input, the preprocessing module (10) degrades it to a low resolution to generate image data for analysis (D2). The generated image data for analysis (D2) is input to the detection neural network module (20), which selects, from the degraded low-resolution image data for analysis (D2), the local region in which abnormal behavior exists as the abnormal behavior detection region (D3). Specifically, the detection neural network module (20) outputs a latent vector indicating the region in which abnormal behavior exists. For example, the latent vector represents the coordinates of the local region of the degraded image data for analysis (D2) in which abnormal behavior exists, and based on these coordinates the local data extraction module (40) extracts, from the high-resolution original data (D1), local data (D4) for the region corresponding to the abnormal behavior detection region (D3). The local data (D4) thus consists of the part of the high-resolution original data that corresponds to the region selected by the detection neural network module (20), from the degraded low-resolution data (D2), as containing abnormal behavior. The classification neural network module (30) takes the extracted high-resolution local data (D4) as input, calculates an output value indicating whether abnormal behavior is present, and determines from that output value whether the behavior is abnormal.
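The inference flow described above, in particular the mapping of a region found in the low-resolution data back onto the high-resolution original, can be sketched as follows. All names here (`scale_box`, `extract_local`, `analyze`) and the callable parameters `preprocess`, `detect`, and `classify` are hypothetical placeholders for the modules (10), (20), and (30); boxes are assumed to be `(x, y, width, height)` in pixels with non-negative coordinates, and images are lists of pixel rows.

```python
def scale_box(box, low_size, high_size):
    """Map an (x, y, w, h) box from low-resolution to high-resolution pixels.

    low_size and high_size are (width, height) pairs.
    """
    x, y, w, h = box
    sx = high_size[0] / low_size[0]
    sy = high_size[1] / low_size[1]
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))


def extract_local(original, box):
    """Crop the region (x, y, w, h) out of the high-resolution original."""
    x, y, w, h = box
    return [row[x:x + w] for row in original[y:y + h]]


def analyze(original, low_size, preprocess, detect, classify, threshold=0.5):
    """Run degrade -> detect -> crop from original -> classify for one frame.

    Returns a list of (low_resolution_box, is_abnormal) pairs.
    """
    high_size = (len(original[0]), len(original))
    degraded = preprocess(original, low_size)           # D1 -> D2
    results = []
    for box in detect(degraded):                        # D2 -> D3
        local = extract_local(original, scale_box(box, low_size, high_size))  # D4
        results.append((box, classify(local) >= threshold))
    return results


# Example of the coordinate mapping alone: 640x480 analysis frame, 2560x1440 original.
mapped = scale_box((160, 120, 64, 48), (640, 480), (2560, 1440))
```

The design point this illustrates is that only the small cropped region is ever processed at full resolution, which is how the expensive classification step stays within the budget of a low-end device.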

Conventionally, analyzing high-resolution image data to determine whether behavior is abnormal required transmitting the data to a server with high-end computing resources and processing it after the fact. According to the present invention, however, abnormal behavior in high-resolution image data can be detected without loss of performance even in a low-end embedded environment. Abnormal behavior can therefore be detected in real time using low-end computing resources such as those in a CCTV or drone, which can contribute to building a social safety net.

Meanwhile, the methods according to the embodiments of the present invention described above may be implemented as a program readable by various computer means and recorded on a computer-readable recording medium. Here, the recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. For example, the recording medium includes magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

As described above, this specification contains details of many specific implementations, but these should not be construed as limiting the scope of any invention or of what may be claimed. Rather, they should be understood as descriptions of features that may be specific to particular embodiments of a particular invention. Certain features that are described in this specification in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments individually or in any suitable subcombination. Further, although features may be described as operating in particular combinations and may even be initially claimed as such, one or more features from a claimed combination may in some cases be excluded from that combination, and the claimed combination may be changed to a subcombination or a variation of a subcombination.

The embodiments of the present invention present the best mode of the invention and provide examples to describe the invention and to enable those skilled in the art to make and use it. The specification written in this way does not limit the invention to the specific terms presented. Therefore, although the present invention has been described in detail with reference to the above examples, those skilled in the art may make adaptations, changes, and modifications to these examples without departing from the scope of the invention. Accordingly, the scope of the present invention should be determined not by the described embodiments but by the claims.

Claims

An image analysis apparatus for detecting abnormal behavior, comprising:
a pre-processing module that generates image data for analysis by degrading original data, which is image data captured at high resolution, to low resolution;
a detection neural network module that selects, from the image data for analysis, a local area in which abnormal behavior exists as an abnormal behavior detection area;
a local data extraction module that extracts, from the original data, local data for an area corresponding to the abnormal behavior detection area selected by the detection neural network module; and
a classification neural network module that determines whether abnormal behavior is present by taking the local data as input.

The image analysis apparatus for detecting abnormal behavior of claim 1,
further comprising a detection learning module for training the detection neural network module,
wherein, when the detection neural network module outputs a latent vector representing an area in which abnormal behavior exists in input image data, the detection learning module trains the detection neural network module so that a difference between the latent vector and a preset expected value is minimized.

The image analysis apparatus for detecting abnormal behavior of claim 2,
wherein the latent vector includes position information on the area in which abnormal behavior exists in the input image data and probability information that abnormal behavior exists in that area.

The image analysis apparatus for detecting abnormal behavior of claim 1,
further comprising a classification learning module for training the classification neural network module,
wherein, when the classification neural network module calculates an output value indicating whether abnormal behavior is included in input image data, the classification learning module trains the classification neural network module so that a difference between the output value and a preset expected value is minimized.

An image analysis method for detecting abnormal behavior, comprising:
generating, by a pre-processing module, image data for analysis by degrading original data, which is image data captured at high resolution, to low resolution;
selecting, by a detection neural network module, a local area in which abnormal behavior exists from the image data for analysis as an abnormal behavior detection area;
extracting, by a local data extraction module, from the original data, local data for an area corresponding to the abnormal behavior detection area selected by the detection neural network module; and
determining, by a classification neural network module, whether abnormal behavior is present by taking the local data as input.

The image analysis method for detecting abnormal behavior of claim 5,
further comprising a detection learning step of training the detection neural network module,
wherein the detection learning step comprises:
generating, by the pre-processing module, degraded data by degrading input high-resolution image data to low resolution;
outputting, by the detection neural network module, a latent vector indicating an area in which abnormal behavior exists in the degraded data; and
training, by a detection learning module, the detection neural network module so that a difference between the latent vector and a preset expected value is minimized.

The image analysis method for detecting abnormal behavior of claim 6,
wherein the latent vector includes position information on the area in which abnormal behavior exists in the input degraded data and probability information that abnormal behavior exists in that area.

The image analysis method for detecting abnormal behavior of claim 5,
further comprising converting, by the pre-processing module, the degraded data into a plurality of degraded data through data augmentation.

The image analysis method for detecting abnormal behavior of claim 5,
further comprising a classification learning step of training the classification neural network module,
wherein the classification learning step comprises:
calculating, by the classification neural network module, an output value indicating whether abnormal behavior is included in input image data; and
training, by a classification learning module, the classification neural network module so that a difference between the output value calculated by the classification neural network module and a preset expected value is minimized.

A computer-readable recording medium on which is recorded a program for executing, on a computer, the image analysis method for detecting abnormal behavior according to any one of claims 5 to 9.