KR20180028198A

KR20180028198A - Image processing method, apparatus for predicting dangerous situation and method, server for predicting dangerous situation using thereof

Info

Publication number: KR20180028198A
Application number: KR1020160115584A
Authority: KR
Inventors: 이상훈; 강지우
Original assignee: 연세대학교 산학협력단
Priority date: 2016-09-08
Filing date: 2016-09-08
Publication date: 2018-03-16
Also published as: KR101891887B1

Abstract

The present invention provides an image processing method and an image processing apparatus using a real-time image to predict a dangerous situation, capable of reducing a burden with respect to transmission and storage of data, and a dangerous situation prediction method and a server using the same. According to one embodiment of the present invention, the server comprises: a behavior similarity analysis unit comparing a motion in accordance with a temporal flow of three-dimensional (3D) posture information with behavior patterns previously stored in a database when 3D posture information of a person appearing in an image, feature information of an entity held by the person, and context information of the image, which includes either or both of a photographing time and place, are received from an image processing device, to extract the behavior pattern with the highest similarity; an entity similarity analysis unit comparing similarities between the received feature information of the entity and feature information of entity types previously stored in the database to extract an entity type with a similarity equal to or greater than a preset threshold; and a dangerous situation prediction unit analyzing correlation of a dangerous situation among the extracted behavior pattern, entity type, and context information to extract dangerous situation prediction information.

Description

The present invention relates to an image processing method and apparatus for predicting a dangerous situation using a real-time image, and to a dangerous situation prediction method and server using the same.

The present invention relates to a technique for predicting a dangerous situation using real-time video captured by CCTV.

CCTV cameras are being installed in a wide range of places to prevent and respond to the various crimes that have become a social problem, and the number of places where CCTV is installed is expected to keep growing.

At present, CCTV is generally used in crime prevention and response only after a dangerous situation has occurred: the CCTV footage of the scene (stored on a server or in a designated storage space) is searched and the segment in which the dangerous situation occurred is identified.

Consequently, the current approach cannot identify the occurrence of a dangerous situation in real time from CCTV video.

In addition, because the CCTV video itself is transmitted to a server so that it can be retained for a certain period, it occupies relatively high bandwidth, and the server also requires a large amount of computation.

Furthermore, because the CCTV video is stored on the server as-is, personal identifying information (such as faces) is exposed when the video is reviewed, which can infringe on privacy.

Accordingly, a new approach is needed that identifies the occurrence of a dangerous situation in real time from real-time CCTV video while both protecting personal information and placing little burden on data transmission and storage.

The present invention is intended to solve the above problems of the prior art, and aims to provide a way to identify the occurrence of a dangerous situation in real time using real-time CCTV video.

It also aims to provide a way to identify dangerous situations in real time without exposing personal information, and to manage the transmission and storage of data efficiently.

To achieve the above objects, a server for predicting a dangerous situation using a real-time image according to an embodiment of the present invention comprises: a behavior similarity analysis unit that, when three-dimensional posture information of a person appearing in an image, feature information of an object carried by the person, and context information of the image (including at least one of a capture time and a capture location) are received from an image processing apparatus, compares the movement of the received three-dimensional posture information over time with behavior types previously stored in a DB and extracts the behavior type with the highest similarity; an object similarity analysis unit that compares the received object feature information with the feature information of object types previously stored in the DB and extracts an object type whose similarity is equal to or greater than a predetermined threshold; and a dangerous situation prediction unit that analyzes how the extracted behavior type, the extracted object type, and the context information correlate with dangerous situations and extracts dangerous situation prediction information.

To achieve the above objects, an image processing apparatus according to an embodiment of the present invention comprises: a preprocessing unit that separates, from an image captured in real time, a region of a person and a region of an object carried by the person; a feature information extraction unit that extracts three-dimensional posture information of the person from the separated person region and extracts feature information of the object from the separated object region; and a transmission unit that transmits the extracted three-dimensional posture information and object feature information to a server. The three-dimensional posture information is expressed as the three-dimensional coordinates of the joints that make up the skeleton of the posture taken by the person, so that the person's personal information is excluded. The object feature information is extracted as a descriptor, a form that reflects the characteristics of the object, so that information that directly depicts the object is excluded. At the server, the three-dimensional posture information and the object feature information are resolved into a behavior type of the person and a type of the object carried by the person, respectively, and are used as information for dangerous situation prediction.

To achieve the above objects, a method by which a server predicts a dangerous situation using a real-time image according to an embodiment of the present invention comprises: (a) receiving, from an image processing apparatus, three-dimensional posture information of a person appearing in an image, feature information of an object carried by the person, and context information of the image (including at least one of a capture time and a capture location); (b) comparing the movement of the received three-dimensional posture information over time with behavior types previously stored in a DB to extract the behavior type with the highest similarity, and comparing the received object feature information with the feature information of object types previously stored in the DB to extract an object type whose similarity is equal to or greater than a predetermined threshold; and (c) analyzing how the extracted behavior type, the extracted object type, and the context information correlate with dangerous situations to extract dangerous situation prediction information.

To achieve the above objects, a method by which an image processing apparatus processes an image in order to predict a dangerous situation according to an embodiment of the present invention comprises: (a) separating, from an image captured in real time, a region of a person and a region of an object carried by the person; (b) extracting three-dimensional posture information of the person from the separated person region and extracting feature information of the object from the separated object region; and (c) transmitting the extracted three-dimensional posture information and object feature information to a server. The three-dimensional posture information is expressed as the three-dimensional coordinates of the joints that make up the skeleton of the posture taken by the person, so that the person's personal information is excluded. The object feature information is extracted as a descriptor, a form that reflects the characteristics of the object, so that information that directly depicts the object is excluded. At the server, the three-dimensional posture information and the object feature information are resolved into a behavior type of the person and a type of the object carried by the person, respectively, and are used as information for dangerous situation prediction.

According to an embodiment of the present invention, the occurrence of a dangerous situation can be identified in real time using real-time CCTV video.

In addition, because what is transmitted to and stored on the server is the three-dimensional posture information of a person appearing in the CCTV video and the feature information of the object the person is carrying, personal information is not exposed when a dangerous situation is identified in real time, and privacy infringement concerns are avoided.

Moreover, because the three-dimensional posture information and the object feature information transmitted to the server are small in data size, the burden of data transmission and storage can be reduced.

The effects of the present invention are not limited to those described above and should be understood to include all effects that can be inferred from the configuration of the invention described in the detailed description or the claims.

FIG. 1 is a diagram showing the configuration of a system for predicting a dangerous situation using a real-time image according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of an image processing apparatus according to an embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of a cloud server according to an embodiment of the present invention.
FIG. 4 is a flowchart showing an image processing procedure according to an embodiment of the present invention.
FIG. 5 is a flowchart showing a dangerous situation prediction procedure according to an embodiment of the present invention.
FIG. 6 is a diagram showing a preprocessing procedure according to an embodiment of the present invention.
FIG. 7 is a diagram showing the extraction of three-dimensional posture information and similarity matching according to an embodiment of the present invention.
FIG. 8 is a diagram showing the extraction of object feature information according to an embodiment of the present invention.
FIG. 9 is a diagram showing a dangerous situation prediction result according to an embodiment of the present invention.
FIG. 10 is a diagram showing a dangerous situation prediction process according to an embodiment of the present invention.

The present invention is described below with reference to the accompanying drawings. The invention may, however, be implemented in many different forms and is therefore not limited to the embodiments described here.

In the drawings, parts unrelated to the description are omitted in order to describe the invention clearly, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "indirectly connected" with another member in between.

Also, when a part is said to "include" a component, this means that it may further include other components, not that it excludes them, unless specifically stated otherwise.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a diagram showing the configuration of a system for predicting a dangerous situation using a real-time image according to an embodiment of the present invention.

A system for predicting a dangerous situation using a real-time image according to an embodiment of the present invention (hereinafter the "real-time dangerous situation prediction system") 100 may include an image processing apparatus 110, a cloud server 120, and a user terminal 130.

For reference, the present invention relates to a real-time dangerous situation prediction technique that identifies the occurrence of a dangerous situation in real time from real-time CCTV video while both protecting personal information and placing little burden on data transmission and storage.

To this end, the present invention extracts a person's behavior type and object type from video acquired in real time, using feature information about the person's behavior and feature information about the object the person is carrying, and predicts a dangerous situation by analyzing how the extracted behavior type, the object type, and context information (capture time and location) correlate with dangerous situations.

Briefly describing each component of the real-time dangerous situation prediction system 100: the image processing apparatus 110 may be connected to or included in a CCTV camera, and can extract 3D posture information of a person from the real-time video captured by the CCTV and detect the object the person is carrying.

Here, the real-time video captured by the CCTV (hereinafter the "captured image") is an unprocessed 2D RGB image.

The image processing apparatus 110 can segment the captured image into regions of like meaning and extract feature points from each segmented region.

For example, when a captured image of a person holding a knife is input, the regions of interest (the knife and the person) are separated, and feature points are then extracted for the knife and for the person.

At this point, the person appearing in the captured image is extracted in the form of a pose from which personal identifying information such as the face and clothing is excluded, and the information on the object (the knife) the person is carrying may be extracted in the form of feature information (a descriptor) that reflects the characteristics of the object.

Here, the "pose form" is three-dimensional posture information and can be expressed as the three-dimensional coordinates of the joints that make up the skeleton of the posture taken by the person.

That is, the three-dimensional posture information is a simplified representation of the person's body that contains information about what action or movement the person is performing.

The object (item) a person is carrying reflects that person's behavioral intent. In particular, in a situation such as a crime, potential criminal intent can be inferred from a weapon the person is holding.

The image processing apparatus 110 can transmit the three-dimensional posture information of the person appearing in the captured image and the feature information of the object the person is carrying to the cloud server 120. For reference, when the feature information is transmitted, context information including at least one of the location where the image was captured and the capture time may be transmitted with it.

Thus, only the feature information extracted from the captured image, not the original captured image, is transmitted to the cloud server 120. This reduces the amount of data transmitted and keeps personal identifying information out of the transmission. As a result, when the cloud server 120 uses the feature information to judge danger in real time, it does not use the personal identifying information of the person appearing in the captured image, and privacy infringement concerns are avoided.
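To make the data-reduction argument concrete, the following minimal sketch compares the size of one uncompressed RGB frame with the size of a feature-only message. The `FeaturePayload` class, the JSON wire format, the 17-joint skeleton, the 128-dimensional descriptor, and the 1920x1080 resolution are all illustrative assumptions; the source does not specify any of them.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FeaturePayload:
    """Hypothetical message sent to the server instead of the raw frame."""
    joints_3d: list    # [[x, y, z], ...], one entry per skeleton joint
    descriptor: list   # object feature vector (descriptor)
    captured_at: str   # context information: capture time
    location: str      # context information: capture location


def raw_frame_bytes(width, height, channels=3):
    """Size in bytes of one uncompressed 8-bit RGB frame."""
    return width * height * channels


def payload_bytes(payload):
    """Size in bytes of the payload serialized as UTF-8 JSON (assumed format)."""
    return len(json.dumps(asdict(payload)).encode("utf-8"))


# Assumed sizes: 17 joints, 128-dimensional descriptor, 1920x1080 frame.
payload = FeaturePayload(
    joints_3d=[[0.12, 1.05, 2.5] for _ in range(17)],
    descriptor=[0.5] * 128,
    captured_at="2016-09-08T12:00:00",
    location="camera-01",
)
print(raw_frame_bytes(1920, 1080))  # 6220800
print(payload_bytes(payload))
```

Real CCTV streams are compressed, so the gap in practice would be smaller than this uncompressed comparison suggests, but the feature message also carries no face or clothing pixels at all.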

Meanwhile, the cloud server 120 can extract real-time danger prediction information using the feature information of the captured image received from the image processing apparatus 110 (the three-dimensional posture information of the person and the feature information of the object the person is carrying).

Using the three-dimensional posture information and the object feature information for dangerous situation prediction requires a series of refinement steps.

First, the person's three-dimensional posture information is received as the three-dimensional coordinates of the joints that make up the skeleton of the posture. The cloud server 120 can parameterize each posture by defining, from the three-dimensional posture information, quantities that represent the person's posture efficiently, such as the midpoints between joints and the angles formed by particular bones.
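The parameterization described above can be sketched as follows. This is only an illustration: the joint names and the particular choice of features (shoulder midpoint plus two elbow angles) are assumptions, since the source names midpoints and bone angles as examples without fixing a feature set.

```python
import math


def midpoint(a, b):
    """Midpoint of two 3D joint positions."""
    return tuple((p + q) / 2 for p, q in zip(a, b))


def bone_angle(joint, end_a, end_b):
    """Angle in radians at `joint` between the bones joint->end_a and joint->end_b."""
    u = [p - q for p, q in zip(end_a, joint)]
    v = [p - q for p, q in zip(end_b, joint)]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # degenerate bone: treat the angle as zero
    cos = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


def posture_features(joints):
    """Build a posture feature vector from a dict of joint name -> (x, y, z).

    The joint names and selected features are hypothetical.
    """
    return [
        *midpoint(joints["l_shoulder"], joints["r_shoulder"]),
        bone_angle(joints["r_elbow"], joints["r_shoulder"], joints["r_wrist"]),
        bone_angle(joints["l_elbow"], joints["l_shoulder"], joints["l_wrist"]),
    ]


pose = {
    "l_shoulder": (-0.2, 1.4, 0.0), "r_shoulder": (0.2, 1.4, 0.0),
    "l_elbow": (-0.3, 1.1, 0.0), "r_elbow": (0.3, 1.1, 0.0),
    "l_wrist": (-0.3, 0.8, 0.1), "r_wrist": (0.5, 1.3, 0.2),
}
print(posture_features(pose))
```

Angles and midpoints depend far less on who the person is than raw pixels do, which is consistent with the privacy goal stated in the text.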

The posture feature vectors made up of these quantities can then be arranged in time order, so that a person's behavior type is quantified as a matrix or tensor and stored in the DB.

The cloud server 120 can then compare subsequently queried three-dimensional posture information against the various behavior types stored in the DB and extract the most similar behavior type.
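A minimal sketch of this lookup is shown below. A behavior is stored as a time-ordered list of posture feature vectors (the rows of the matrix), and the query is matched against each stored behavior. The similarity measure used here, 1 / (1 + mean Euclidean distance) over equal-length sequences, is an assumption chosen for simplicity; the source does not say which measure is used. The behavior names and values in `behavior_db` are hypothetical.

```python
import math


def sequence_similarity(seq_a, seq_b):
    """Similarity in (0, 1] between two equal-length sequences of feature vectors."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_dist = sum(math.dist(a, b) for a, b in zip(seq_a, seq_b)) / len(seq_a)
    return 1.0 / (1.0 + mean_dist)


def most_similar_behavior(query, behavior_db):
    """Return (behavior type, similarity) for the best match in the DB."""
    scores = {
        name: sequence_similarity(query, template)
        for name, template in behavior_db.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical DB: each behavior type is a time-ordered matrix of feature vectors.
behavior_db = {
    "walking": [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
    "arm_swing": [[0.0, 0.0], [0.0, 5.0], [0.0, 0.0]],
}
query = [[0.0, 0.1], [1.0, 0.1], [2.1, 0.0]]
print(most_similar_behavior(query, behavior_db))
```

A production system would also need to handle sequences of different lengths and speeds (for example by resampling or time warping), which this sketch deliberately leaves out.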

The extracted behavior type can be expressed in numerical form, for example as a danger degree of 30 (on a scale from 1 to 100) or a dangerous situation level of 3 (on a scale from 1 to 10).

In addition, the cloud server 120 can quantify the feature information of the object the person is carrying through a similarity comparison against the data stored in the DB.

Here, the object feature information may take the form of a descriptor of the characteristic parts that make up the object.

That is, based on the DB, the object is simply quantified, without directly recognizing what the object carried by the person is.

Not directly recognizing the object the person is carrying in the captured image is itself a mechanism for protecting personal information in the captured image. Moreover, because the information of any object is simply quantified and object classes are not fixed in advance, the approach has the advantage of being applicable to the wide variety of objects that may appear in captured images.
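The threshold-based descriptor matching described here and in the summary can be sketched as follows. Cosine similarity, the 0.8 threshold, and the opaque type identifiers are assumptions for illustration; the source only requires that object types with similarity at or above a predetermined threshold be extracted.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two descriptor vectors; 0.0 if either is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def matching_object_types(descriptor, object_db, threshold=0.8):
    """Return (type id, similarity) pairs at or above the threshold, best first."""
    matches = []
    for type_id, reference in object_db.items():
        score = cosine_similarity(descriptor, reference)
        if score >= threshold:
            matches.append((type_id, score))
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Hypothetical DB keyed by opaque type ids rather than human-readable labels,
# mirroring the idea that the object is quantified without being named.
object_db = {
    "type_017": [1.0, 0.0, 0.0],
    "type_042": [0.0, 1.0, 0.0],
}
print(matching_object_types([0.9, 0.1, 0.0], object_db))
```

Because matching works on descriptor vectors alone, a new kind of object can be supported by adding a reference descriptor to the DB, without retraining a fixed-class recognizer.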

The cloud server 120 can also provide dangerous situation prediction information based on an analysis of the correlation among the quantified behavior type of the person, the type of object the person is carrying, and the context information.

That is, it analyzes how the person's posture, the object the person is carrying, and the place and time at which the situation occurred relate to dangerous situations.

To this end, the cloud server 120 can learn these correlations by deep learning from existing captured images and records of dangerous situations. When information on an arbitrary behavior type, information on an object type, and the corresponding context information are later input to the trained platform, it determines whether the input is associated with a dangerous situation and, if it judges the situation to be dangerous, generates prediction information on the dangerous situation and provides it to the user terminal 130.
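The source learns this correlation with deep learning and gives no model details. The sketch below swaps in a much simpler technique, logistic regression trained by gradient descent, purely to illustrate the idea of learning from past records how behavior, object, and context jointly relate to danger. The three input features, the `history` records, and the hyperparameters are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, features):
    """Estimated probability that the inputs correspond to a dangerous situation."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(records, epochs=2000, lr=0.5):
    """Fit logistic regression by full-batch gradient descent."""
    n_features = len(records[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in records:
            error = predict(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - lr * g / len(records) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(records)
    return weights, bias


# Hypothetical history: (behavior score, object score, night flag) -> dangerous?
history = [
    ([0.9, 0.9, 1.0], 1),
    ([0.8, 0.7, 1.0], 1),
    ([0.7, 0.9, 0.0], 1),
    ([0.2, 0.1, 0.0], 0),
    ([0.1, 0.3, 0.0], 0),
    ([0.3, 0.2, 1.0], 0),
]
weights, bias = train(history)
print(round(predict(weights, bias, [0.9, 0.9, 1.0]), 2))
print(round(predict(weights, bias, [0.1, 0.1, 0.0]), 2))
```

The learned weights play the role of per-factor weightings: retraining on newly collected records updates them, which matches the continual-learning behavior the text describes.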

At this time, the cloud server 120 may additionally provide information on the behavior type with the highest similarity in the DB (for example, an image of a person in a posture representing that behavior type).

Accordingly, the system can reasonably satisfy the needs of users who want to know about dangerous situations in an area while protecting the personal information of individuals in real-time CCTV footage.

Meanwhile, the user terminal 130 may include any terminal that can connect to the cloud server 120 over a network, including mobile communication terminals such as smartphones, mobile phones, PDAs, PMPs, and tablet computers, as well as notebook computers, desktop computers, and digital TVs connected to set-top boxes. It can receive the dangerous situation prediction information provided by the cloud server 120 and display it on a screen.

As a result, the dangerous situation prediction scheme according to the embodiment of the present invention is trained continuously on the constantly acquired CCTV data (captured images) and on data about their association with dangerous situations, so its prediction performance improves over time.

In addition, through correlation analysis of the person's three-dimensional posture information, the feature information of the object the person is carrying, and context information such as place and time, weights are assigned according to the person's behavior type, the object carried, and the time and place, which can increase the accuracy of dangerous situation prediction.

FIG. 2 is a block diagram showing the configuration of an image processing apparatus according to an embodiment of the present invention.

The image processing apparatus 110 according to an embodiment of the present invention may include an image acquisition unit 111, a preprocessing unit 112, a feature information extraction unit 113, a transmission unit 114, a control unit 115, and a memory 116. The feature information extraction unit 113 may in turn include a three-dimensional posture information extraction unit 113a and an object feature information extraction unit 113b.

Describing each component: the image acquisition unit 111 can acquire the real-time captured image. The captured image acquired in real time is a two-dimensional RGB image.

The preprocessing unit 112 can perform preprocessing that separates the region of a person and the region of the object the person is carrying in the captured image acquired in real time through the image acquisition unit 111.

This step distinguishes the person and the object more clearly so that the feature information extraction unit 113 extracts feature information for the person and the object more accurately.

To this end, the preprocessing unit 112 can use deep learning (for example, Convolutional Neural Networks; CNN) to segment the person region and the object region from the two-dimensional RGB image, and the segmented person region can be used as the input to the deep learning model trained by the three-dimensional posture information extraction unit 113a, described later.

Here, when training with deep learning, the preprocessing unit 112 can segment the person region and the object region using semantic segmentation and semantic segmentation for a region of interest, respectively.

이하, 딥 러닝을 이용하여 2차원의 RGB 영상에서 인물 영역 및 개체 영역을 분할하는 학습을 ‘공간적 딥 러닝’이라 칭하도록 한다.Hereinafter, learning to divide a person area and an object area in a two-dimensional RGB image using deep learning will be referred to as " spatial deep learning ".

한편, 특징 정보 추출부(113)는 전처리부(112)에서 분할된 인물 영역에서 인물의 3차원 자세 정보를 추출하고, 분할된 개체 영역에서 개체의 특징 정보를 추출할 수 있다.Meanwhile, the feature information extracting unit 113 may extract the three-dimensional attitude information of the person in the person area divided by the preprocessing unit 112 and extract the feature information of the object in the divided person area.

이를 위해 특징 정보 추출부(113)는 3차원 자세 정보 추출부(113a) 및 개체 특징 정보 추출부(113b)를 포함할 수 있다.For this, the feature information extracting unit 113 may include a three-dimensional posture information extracting unit 113a and an entity feature information extracting unit 113b.

먼저, 3차원 자세 정보 추출부(113a)는 2차원의 RGB 영상에서 분할된 인물 영역에서, 인물이 취하는 자세의 뼈대(skeleton)를 구성하는 접합 부위(joint)들의 3차원 좌표 형태로 표현되는 3차원 자세 정보를 추출할 수 있다.First, the three-dimensional posture information extracting unit 113a extracts a three-dimensional posture information extracting unit 113a, which is represented by a three-dimensional coordinate form of joints constituting a skeleton of a posture taken by a person, Dimensional attitude information can be extracted.

To this end, the 3D posture information extraction unit 113a may use deep learning to learn to extract 3D posture information from a person region.

Training uses the input person regions together with the change over time in the posture information (the three-dimensional coordinates of the joints making up the skeleton) of those person regions. Later, when information on an arbitrary person region is input to the trained platform, the 3D posture information corresponding to the input person region can be extracted.

Hereinafter, the learning that uses deep learning to extract 3D posture information from a person region is referred to as 'temporal deep learning'.

In addition, the object feature information extraction unit 113b may extract the feature information of the object from the object region segmented by the preprocessing unit 112.

Here, the feature information of the object may be extracted in a form that reflects the characteristics of the object, for example as a descriptor.

Meanwhile, the transmission unit 114 may transmit to the cloud server 120 the feature information of the person extracted from the captured image (the 3D posture information) and the feature information of the object held by the person (the descriptor).

In addition to this feature information, context information including the capture time and capture location may also be transmitted.
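The data sent by the transmission unit 114 could be organized as in the following minimal sketch. The class names, field names, and the choice of JSON are assumptions made for illustration; the specification only states which kinds of information are transmitted. Note that the payload carries joint coordinates and a descriptor rather than image pixels.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) of one skeleton joint


@dataclass
class ContextInfo:
    # Hypothetical fields; either one may be absent.
    capture_time: Optional[str] = None
    capture_location: Optional[str] = None


@dataclass
class FeaturePayload:
    pose_frames: List[List[Joint]]   # one list of joints per video frame
    object_descriptor: List[float]   # descriptor of the object held by the person
    context: ContextInfo = field(default_factory=ContextInfo)

    def to_json(self) -> str:
        """Serialize the payload; tuples become JSON arrays."""
        return json.dumps(asdict(self))


payload = FeaturePayload(
    pose_frames=[[(0.0, 1.0, 2.0), (0.5, 1.5, 2.0)]],
    object_descriptor=[0.12, 0.80, 0.33],
    context=ContextInfo(capture_time="2016-09-08T01:30:00", capture_location="parking lot"),
)
decoded = json.loads(payload.to_json())
```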

Meanwhile, the control unit 115 may include a processor and may control the components of the image processing apparatus 110, namely the image acquisition unit 111, the preprocessing unit 112, the feature information extraction unit 113, and the transmission unit 114, so that they perform the operations described above. It may also control the memory 116.

Meanwhile, the memory 116 may store the algorithm by which the control unit 115 controls each component of the image processing apparatus 110 to perform the operations described above, as well as various data produced in the process.

FIG. 3 is a block diagram illustrating the configuration of a cloud server according to an embodiment of the present invention.

The cloud server 120 according to an embodiment of the present invention may include a behavior similarity analysis unit 121, an object similarity analysis unit 122, a dangerous situation prediction unit 123, a control unit 124, and a storage unit 125.

The cloud server 120 may receive from the image processing apparatus 110 the 3D posture information of a person appearing in the image, the feature information of an object held by the person, and the context information of the image (which includes at least one of the capture time and the capture location).

The behavior similarity analysis unit 121 may compare the movement over time of the received 3D posture information of the person with the behavior types previously stored in the DB, and extract the behavior type with the highest similarity.

In doing so, the behavior similarity analysis unit 121 may detect from the DB, through image-level sequence matching, partial sequences similar to the query image (the 3D posture information received from the image processing apparatus 110).

To this end, the behavior similarity analysis unit 121 may build the DB as follows.

First, once the 3D posture information of a person has been extracted from previously acquired CCTV images, the behavior similarity analysis unit 121 may extract a posture feature vector from that 3D posture information.

Here, the 'posture feature vector' may include features useful for expressing the posture and situation of the person, such as the distance between the center coordinates and each body part in the 3D posture information, the angles of the three-dimensional joints, and the displacement and acceleration of each body part.

The posture feature vector is used because it is difficult to distinguish high-level actions using only the 3D posture information, that is, the three-dimensional coordinates of the joints making up the skeleton.
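The following is a minimal sketch of how such a posture feature vector might be computed from three consecutive frames of joint coordinates. The function names, the use of the joint centroid as the center coordinates, the finite-difference acceleration, and the `angle_triples` parameter (which joints form each measured angle) are assumptions made for illustration, not the method defined by this specification.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def _sub(a: Point, b: Point) -> Point:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _norm(v: Point) -> float:
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in radians at joint b formed by the segments b-a and b-c."""
    u, v = _sub(a, b), _sub(c, b)
    nu, nv = _norm(u), _norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = sum(x * y for x, y in zip(u, v)) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


def posture_feature_vector(
    prev2: Sequence[Point],
    prev: Sequence[Point],
    curr: Sequence[Point],
    angle_triples: Sequence[Tuple[int, int, int]],
    dt: float = 1.0,
) -> List[float]:
    """Concatenate center distances, joint angles, displacements, accelerations."""
    n = len(curr)
    center = tuple(sum(p[i] for p in curr) / n for i in range(3))
    dists = [_norm(_sub(p, center)) for p in curr]
    angles = [joint_angle(curr[i], curr[j], curr[k]) for i, j, k in angle_triples]
    moves = [_norm(_sub(c, p)) for c, p in zip(curr, prev)]
    # Second-order finite difference: (x_t - 2 x_{t-1} + x_{t-2}) / dt^2.
    accels = [
        _norm(tuple((c[i] - 2 * p[i] + q[i]) / dt ** 2 for i in range(3)))
        for c, p, q in zip(curr, prev, prev2)
    ]
    return dists + angles + moves + accels


# Three joints forming a right angle at the middle joint, not moving.
still = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
features = posture_feature_vector(still, still, still, angle_triples=[(0, 1, 2)])
```

Stacking these vectors in temporal order yields the matrix or tensor form in which a behavior type is stored.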

The behavior similarity analysis unit 121 may store the 3D posture information and the corresponding posture feature vector in the DB in association with each other.

The extracted posture feature vectors are then arranged in temporal order to perform sequence-based modeling, and the result may be quantified as a set (matrix) of posture feature vectors or as a tensor and stored in the DB.

Through this process, various behavior types can be stored in the DB by means of the 3D posture information, the posture feature vectors, and the sequence-based modeling that uses them.

Note that when each behavior type is stored in the DB, it is not stored as information that expresses the behavior type intuitively (for example, a 'word') such as walking, running, swinging, falling, or kicking. Instead, it is quantified and stored as a set of posture feature vectors or as a tensor.

Accordingly, when the 3D posture information received from the image processing apparatus 110 is input (queried) sequentially in temporal order, the behavior similarity analysis unit 121 may compare, through sequence matching, the similarity between the input 3D posture information and the 3D posture information stored in the DB, and extract from the DB the 3D posture information with the greatest similarity.

Here, each piece of extracted 3D posture information is associated with its own posture feature vector, and each posture feature vector is a partial sequence of a specific behavior type modeled on a sequence basis.

As a result, the behavior similarity analysis unit 121 may extract, as the behavior type of the queried 3D posture information, the behavior type that contains the most partial sequences similar to the queried 3D posture information.
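The selection of the behavior type containing the most similar partial sequences can be sketched as a voting scheme over sliding windows. This is only an illustration under stated assumptions: the Euclidean window distance, the `window` length, the `max_dist` cutoff, and the function names are hypothetical, and behavior types are represented by numeric ids rather than words, as described above.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def window_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Euclidean distance between two equally long windows of feature vectors."""
    return sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb)) ** 0.5


def match_behavior(
    query: Sequence[Vector],
    db: Dict[int, Sequence[Vector]],
    window: int = 2,
    max_dist: float = 1.0,
) -> Optional[int]:
    """Return the id of the behavior type that wins the most query windows."""
    votes: Counter = Counter()
    for q in range(len(query) - window + 1):
        q_win = query[q:q + window]
        best_id, best_d = None, max_dist
        for behavior_id, seq in db.items():
            for s in range(len(seq) - window + 1):
                d = window_distance(q_win, seq[s:s + window])
                if d <= best_d:
                    best_id, best_d = behavior_id, d
        if best_id is not None:
            votes[best_id] += 1  # this query window resembles a stored subsequence
    return votes.most_common(1)[0][0] if votes else None


# Toy DB: behavior 0 is static, behavior 1 increases steadily.
db = {
    0: [[0.0], [0.0], [0.0], [0.0]],
    1: [[0.0], [1.0], [2.0], [3.0]],
}
matched = match_behavior([[1.0], [2.1], [3.0]], db)
unmatched = match_behavior([[10.0], [10.0]], db)
```

Returning `None` when no window is close enough is a design choice of this sketch; the specification itself always extracts the most similar behavior type.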

The extracted behavior type is in quantified form, not a word denoting the behavior type.

Meanwhile, the object similarity analysis unit 122 may extract an object type by converting the received feature information of the object into an object feature vector and comparing its similarity with the object feature vectors of the object types previously stored in the DB.

To this end, the object similarity analysis unit 122 compares object feature vector similarity using a DB that stores the feature information of various objects and their object feature vectors, and may extract either the single object type with the greatest similarity or a plurality of object types whose similarity is equal to or greater than a predetermined threshold.
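Both extraction modes (the single most similar type, or every type at or above a threshold) can be sketched as follows. The specification does not name a similarity measure; cosine similarity is an assumption made here, as are the function names and the numeric type ids.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank_object_types(
    query: Sequence[float], db: Dict[int, Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (type id, similarity) pairs sorted from most to least similar."""
    scored = [(type_id, cosine_similarity(query, vec)) for type_id, vec in db.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def best_object_type(query: Sequence[float], db: Dict[int, Sequence[float]]) -> int:
    """Mode 1: the single object type with the greatest similarity."""
    return rank_object_types(query, db)[0][0]


def object_types_above(
    query: Sequence[float], db: Dict[int, Sequence[float]], threshold: float
) -> List[int]:
    """Mode 2: every object type whose similarity is at or above the threshold."""
    return [t for t, s in rank_object_types(query, db) if s >= threshold]


object_db = {7: [1.0, 0.0], 8: [0.0, 1.0], 9: [1.0, 1.0]}
query_vector = [1.0, 0.1]
top_type = best_object_type(query_vector, object_db)
candidate_types = object_types_above(query_vector, object_db, threshold=0.7)
```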

Image matching and topic modeling may additionally be applied to the result of the similarity comparison.

Like the behavior type described above, the object type extracted here may be in quantified form rather than a word denoting the object.

Both the behavior type of the person and the type of object held by the person are extracted in quantified form rather than as words in order to improve discrimination accuracy even for behavior types and objects that have not been registered and managed in the DB in advance.

Meanwhile, the dangerous situation prediction unit 123 may provide dangerous situation prediction information based on an analysis of the correlation among the 'behavior type' extracted by the behavior similarity analysis unit 121, the 'object type' extracted by the object similarity analysis unit 122, and the 'context information' including at least one of the capture time and capture location of the captured image.

To this end, the dangerous situation prediction unit 123 may learn the correlation with deep learning using existing captured images and records of dangerous situations. When the extracted behavior type, object type, and context information are input (queried), it may determine how strongly the input information is associated with a dangerous situation, generate prediction information on the dangerous situation, and provide it to the user terminal 130.

Note that the object type input during this deep learning may be the single object type with the highest similarity among the extracted object types, or object types whose similarity is equal to or greater than a predetermined threshold.

In addition, the deep learning may be performed after weights have been assigned to the behavior type, the object type, and the context information using the Shapley value.
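For three inputs, the Shapley value can be computed exactly by averaging each input's marginal contribution over all orderings. The sketch below shows only that computation. The specification does not say how the value of a coalition of inputs is defined, so the table `EXAMPLE_VALUES` is purely hypothetical and chosen so that the full coalition is worth 100.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    players: Sequence[str], value: Callable[[FrozenSet[str]], float]
) -> Dict[str, float]:
    """Average marginal contribution of each player over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Hypothetical worth of each subset of inputs (not taken from the specification).
EXAMPLE_VALUES = {
    frozenset(): 0.0,
    frozenset({"behavior"}): 40.0,
    frozenset({"object"}): 20.0,
    frozenset({"context"}): 10.0,
    frozenset({"behavior", "object"}): 80.0,
    frozenset({"behavior", "context"}): 60.0,
    frozenset({"object", "context"}): 30.0,
    frozenset({"behavior", "object", "context"}): 100.0,
}

weights = shapley_values(["behavior", "object", "context"], EXAMPLE_VALUES.__getitem__)
```

Because Shapley values always sum to the worth of the full coalition, the three weights here add up to 100.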

Meanwhile, the control unit 124 may include a plurality of processors and may control the components of the cloud server 120, namely the behavior similarity analysis unit 121, the object similarity analysis unit 122, and the dangerous situation prediction unit 123, so that they perform the operations described above. It may also control the storage unit 125.

Meanwhile, the storage unit 125 may store the algorithm by which the control unit 124 controls each component of the cloud server 120 to perform the operations described above, as well as various data produced in the process.

FIG. 4 is a flowchart illustrating an image processing procedure according to an embodiment of the present invention.

The flowchart shown in FIG. 4 may be performed by the image processing apparatus 110.

Using the result of training through spatial deep learning, the image processing apparatus 110 preprocesses the captured image acquired in real time to separate the region of a person from the region of the object held by that person (S401).

This can help increase the accuracy of the feature information extracted for the person and the object.

After S401, the image processing apparatus 110 extracts 3D posture information from the person region using the result of training through temporal deep learning (S402).

Here, the 3D posture information is expressed as the three-dimensional coordinates of the joints making up the skeleton of the posture taken by the person, and the image processing apparatus 110 may extract it from the person region using the result of training with deep learning.

After S402, the image processing apparatus 110 extracts the feature information of the object from the object region (S403).

Here, the feature information of the object may be extracted in a form that reflects the characteristics of the object (a descriptor).

After S403, the image processing apparatus 110 transmits the extracted 3D posture information and object feature information to the cloud server 120 (S404).

At this time, the image processing apparatus 110 may additionally transmit context information of the image, including at least one of the capture time and capture location of the captured image.

Note that although the embodiment of FIG. 4 is described as extracting the 3D posture information first and the feature information of the object afterwards, depending on the embodiment the feature information of the object may be extracted first, or the 3D posture information and the feature information of the object may be extracted simultaneously.

FIG. 5 is a flowchart illustrating a dangerous situation prediction procedure according to an embodiment of the present invention.

The flowchart shown in FIG. 5 may be performed by the cloud server 120, and the DB for extracting behavior types and object types is assumed to have been built in advance.

Since the construction of the DB has been described in detail with reference to FIG. 3, a description of the DB construction process is omitted here.

The cloud server 120 receives from the image processing apparatus 110 the 3D posture information of a person appearing in the image, the feature information of an object held by the person, and the context information of the image (which includes at least one of the capture time and the capture location) (S501).

After S501, the cloud server 120 compares the movement over time of the received 3D posture information of the person with the behavior types previously stored in the DB, and extracts the behavior type with the highest similarity (S502).

After S502, the cloud server 120 extracts an object type by converting the received feature information of the object into an object feature vector and comparing its similarity with the object feature vectors of the object types previously stored in the DB (S503).

After S503, the cloud server 120 generates dangerous situation prediction information based on an analysis of the correlation among the extracted behavior type, the extracted object type, and the context information, and provides the dangerous situation prediction information to the user terminal 130 (S504).
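The order of steps S501 to S504 can be summarized in the following minimal sketch, in which each stage is passed in as a callable. The function name, the dictionary keys, and the shape of the prediction are assumptions for illustration; in particular, `risk_model` merely stands in for the trained deep learning model and is not an implementation of it.

```python
from typing import Any, Callable, Dict, List


def run_prediction(
    payload: Dict[str, Any],                                  # S501: received data
    behavior_matcher: Callable[[Any], int],
    object_matcher: Callable[[Any], List[int]],
    risk_model: Callable[[int, List[int], Any], Dict[str, Any]],
    notify: Callable[[Dict[str, Any]], None],
) -> Dict[str, Any]:
    """Run S502 to S504 on one received payload and return the prediction."""
    behavior_type = behavior_matcher(payload["pose_frames"])      # S502
    object_types = object_matcher(payload["object_descriptor"])   # S503
    prediction = risk_model(behavior_type, object_types, payload["context"])  # S504
    notify(prediction)  # provide the prediction to the user terminal
    return prediction


# Stub stages, used only to show the call order.
sent: List[Dict[str, Any]] = []
result = run_prediction(
    {
        "pose_frames": [[(0.0, 0.0, 0.0)]],
        "object_descriptor": [1.0],
        "context": {"capture_time": "night"},
    },
    behavior_matcher=lambda frames: 3,
    object_matcher=lambda descriptor: [7],
    risk_model=lambda b, o, c: {"behavior": b, "objects": o, "context": c, "level": 3},
    notify=sent.append,
)
```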

Note that the cloud server 120 may learn the correlation with deep learning using existing captured images and records of dangerous situations.

FIG. 6 is a diagram illustrating a preprocessing procedure according to an embodiment of the present invention.

FIG. 6 shows the process of separating a person and the object associated with the person from a captured image acquired by the image processing apparatus 110.

The captured image is a two-dimensional RGB image, and the image processing apparatus 110 may separate the person region and the region of the object associated with the person through pixel-level semantic analysis of the captured image.

This preprocessing can increase the recognition accuracy of the 3D posture information in the person region described above.

FIGS. 7A and 7B are diagrams illustrating the extraction of 3D posture information and similarity matching according to an embodiment of the present invention.

When the person region has been separated from the captured image acquired in real time, the image processing apparatus 110 may extract 3D posture information from the person region.

Here, as shown in FIG. 7A, the 3D posture information may be expressed as the three-dimensional coordinates of the joints making up the skeleton of the posture taken by the person, and the extraction of the 3D posture information may be performed using the result of training with deep learning.

FIG. 7B shows, in the process of extracting a behavior type using 3D posture information, the similarity matching between the queried 3D posture information and the 3D posture information of the behavior types previously stored in the DB.

Each piece of 3D posture information stored in the DB is associated with its own posture feature vector, and each posture feature vector is a partial sequence of a specific behavior type modeled on a sequence basis. The behavior type that contains the most partial sequences similar to the queried 3D posture information can therefore be extracted as the behavior type of the queried 3D posture information.

FIG. 8 is a diagram illustrating the extraction of object feature information according to an embodiment of the present invention.

When the region of the object held by the person has been separated from the captured image acquired in real time, the image processing apparatus 110 may extract the feature information of the object from the object region, and the feature information may be extracted in a form that reflects the characteristics of the object (a descriptor).

FIG. 9 is a diagram illustrating dangerous situation prediction results according to an embodiment of the present invention.

The cloud server 120 may learn the correlation with deep learning using existing captured images and records of dangerous situations.

When information on the behavior type of a person, information on the type of object held by the person, and the associated context information are input, the cloud server 120 may use the result of this correlation learning to determine whether the input information is associated with a dangerous situation. If it determines that the situation is dangerous, it may generate prediction information on the dangerous situation and provide it to the user terminal 130.

[Situation 1] of FIG. 9 shows the behavior type of the person, the type of object held by the person, and the context information, each with a weight assigned through the Shapley value. Here, the weights are assumed to sum to 100.

Note that 'swinging', given as the behavior type of the person, and 'baseball bat', given as the object type, are used to aid understanding. As described above, the actual results may be expressed in quantified form rather than as words denoting the behavior type or object type.

In [Situation 1] of FIG. 9, considering the association with a dangerous situation among the behavior type of the person (swinging), the type of object held by the person (baseball bat), and the context information (night), the cloud server 120 may determine that this is a violent situation in which a baseball bat is being swung, corresponding to danger level 3, and provide this information to the user terminal 130.

In contrast, in [Situation 2] of FIG. 9, considering the association with a dangerous situation among the behavior type of the person (carrying), the type of object held by the person (baseball bat), and the context information (daytime), the cloud server 120 may determine that a baseball bat is merely being carried and that the situation is safe, and provide this information to the user terminal 130.

FIG. 10 is a diagram illustrating a dangerous situation prediction procedure according to an embodiment of the present invention.

The image processing apparatus 110 extracts the 3D posture information of a person and the feature information of the object held by the person from the captured image acquired in real time, and transmits them to the cloud server 120.

At this time, context information on the time and place at which the image was captured may be transmitted together.

The cloud server 120 may then extract the behavior type of the person using the 3D posture information, and extract the object type using the feature information of the object.

The cloud server 120 may then analyze the association with a dangerous situation among the extracted behavior type, the extracted object type, and the context information (parking lot, 1:30 a.m.), determine that this is a violent situation, generate the corresponding dangerous situation prediction information (violent situation, danger level 5), and provide it to the user terminal 130.

Note that the dangerous situation prediction information displayed on the user terminal 130 may include a phrase indicating the dangerous situation (violent situation), the level of the dangerous situation (level 5), and the like.

The foregoing description of the present invention is for illustrative purposes, and those of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing the technical idea or essential features of the present invention.

The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.

The scope of the present invention is defined by the claims below, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: Real-time dangerous situation prediction system
110: Image processing apparatus
111: Image acquisition unit
112: Preprocessing unit
113: Feature information extraction unit
113a: 3D posture information extraction unit, 113b: Object feature information extraction unit
114: Transmission unit
120: Cloud server
121: Behavior similarity analysis unit
122: Object similarity analysis unit
123: Dangerous situation prediction unit
130: User terminal

Claims

A server for predicting a dangerous situation using a real-time image, the server comprising:
a behavior similarity analysis unit that, when three-dimensional posture information of a person appearing in an image, feature information of an object held by the person, and context information of the image, which includes at least one of a capture time and a capture location, are received from an image processing apparatus, compares the movement over time of the received three-dimensional posture information with behavior types previously stored in a DB and extracts the behavior type with the highest similarity;
an object similarity analysis unit that compares the similarity between the received feature information of the object and the feature information of object types previously stored in the DB and extracts an object type whose similarity is equal to or greater than a predetermined threshold; and
a dangerous situation prediction unit that extracts dangerous situation prediction information by analyzing the correlation of a dangerous situation among the extracted behavior type, the extracted object type, and the context information.

The server of claim 1, wherein
the three-dimensional posture information is expressed as three-dimensional coordinates of joints making up a skeleton of a posture taken by the person,
the feature information of the object is extracted as a descriptor, which is a form reflecting characteristics of the object, and
the previously stored behavior types and object types are stored in quantified form.

The server of claim 1, wherein
the behavior similarity analysis unit
compares, through sequence matching, the similarity between the received three-dimensional posture information and the three-dimensional posture information of the behavior types previously stored in the DB,
the three-dimensional posture information of the behavior types previously stored in the DB is associated with posture feature vectors extracted from that three-dimensional posture information, and
the behavior types are obtained by arranging the extracted posture feature vectors in temporal order to perform sequence-based modeling and quantifying the result as a set of the posture feature vectors or as a tensor.

The server of claim 1, wherein
the object similarity analysis unit
calculates the similarity by converting the received feature information of the object into an object feature vector and comparing it with the object feature vectors of the previously stored object types.

The server of claim 4, wherein
the object similarity analysis unit
further applies image matching and topic modeling to the result of the similarity calculation to extract the object type.

The server of claim 1, wherein
the dangerous situation prediction unit
performs deep learning so that, when information on the extracted behavior type and object type and the context information are input as query information, the dangerous situation prediction information is extracted based on the analysis of the correlation.

The server of claim 6, wherein
the object type input during the deep learning is
the specific object type with the highest similarity among the extracted object types, or a plurality of object types whose similarity is equal to or greater than a predetermined threshold.

The server of claim 6, wherein
the dangerous situation prediction unit
performs the deep learning after assigning weights to the extracted behavior type, the extracted object type, and the context information using the Shapley value.

An image processing apparatus comprising:
a preprocessing unit that separates, in an image captured in real time, a region of a person from a region of an object held by the person;
a feature information extraction unit that extracts three-dimensional posture information of the person from the separated person region and extracts feature information of the object from the separated object region; and
a transmission unit that transmits the extracted three-dimensional posture information and the feature information of the object to a server,
wherein the three-dimensional posture information is expressed as three-dimensional coordinates of joints making up a skeleton of the posture taken by the person, so that personal information of the person is excluded,
the feature information of the object is extracted as a descriptor, which is a form reflecting characteristics of the object, so that information expressing the object intuitively is excluded, and
the three-dimensional posture information and the feature information of the object are used in the server to extract the behavior type of the person and the type of object held by the person, respectively, and thereby serve as information for predicting a dangerous situation.

10. The image processing apparatus of claim 9,
wherein the distinction between the person area and the entity area and the extraction of the three-dimensional posture information of the person are learned by deep learning, and
the three-dimensional posture information for the person area, whose shape changes according to the posture of the person, is learned based on the change of the three-dimensional posture information according to temporal flow.

11. The image processing apparatus of claim 9,
wherein the transmitter
further transmits context information including at least one of a time at which the image was photographed and a photographing location, and
the context information is used in the server as information for predicting the dangerous situation.
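
The data sent by the transmitter can be pictured as a small record that carries only joint coordinates, an entity descriptor, and optionally the time and place of capture, with no image pixels. The sketch below is one possible encoding, assuming JSON as the wire format; the class `FramePayload`, its field names, and `encode_payload` are hypothetical and are not defined in the claims.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional, Tuple

@dataclass
class FramePayload:
    # 3D coordinates of skeleton joints; no image or identity data is kept.
    joints_3d: List[Tuple[float, float, float]]
    # Numeric descriptor of the held entity, not a picture of it.
    entity_descriptor: List[float]
    # Optional capture time and place (assumed string formats).
    captured_at: Optional[str] = None
    location: Optional[str] = None

def encode_payload(payload: FramePayload) -> str:
    """Serialize the payload to JSON, omitting fields that were not set."""
    data = {key: val for key, val in asdict(payload).items() if val is not None}
    return json.dumps(data, sort_keys=True)

# Hypothetical frame: two joints, a three-element descriptor, time only.
EXAMPLE = FramePayload(
    joints_3d=[(0.0, 1.5, 0.2), (0.1, 1.2, 0.2)],
    entity_descriptor=[0.9, 0.1, 0.0],
    captured_at="2016-09-08T21:30:00",
)
ENCODED = encode_payload(EXAMPLE)
```

Because only coordinates and descriptors are serialized, the message stays small compared with sending video frames, which matches the stated aim of reducing the transmission and storage burden.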

12. A method for predicting a dangerous situation using a real-time image by a server, the method comprising:
(a) receiving, from an image processing apparatus, at least one of three-dimensional posture information of a person appearing in the image, feature information of an entity held by the person, and context information of the image;
(b) comparing the motion of the received three-dimensional posture information according to temporal flow with behavior types previously stored in a DB to extract the behavior type having the highest similarity, and comparing the received feature information of the entity with feature information of entity types previously stored in the DB to extract an entity type having a degree of similarity equal to or greater than a predetermined threshold; and
(c) extracting dangerous situation prediction information by analyzing the correlation of the extracted behavior type, the entity type, and the context information with a dangerous situation.

13. The method of claim 12,
wherein the step (b)
compares the received three-dimensional posture information with the three-dimensional posture information of the behavior types previously stored in the DB in terms of sequence similarity,
the three-dimensional posture information of the behavior types previously stored in the DB is associated with a posture feature vector extracted from the corresponding three-dimensional posture information, and
the behavior types are obtained by arranging the extracted posture feature vectors according to temporal flow and performing sequence-based modeling, the result being expressed numerically as a set of the posture feature vectors or in tensor form.
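
One way to compare posture sequences of different lengths is dynamic time warping (DTW). The claim only speaks of sequence similarity and does not name an algorithm, so DTW is an assumed stand-in here, and the stored sequences and the names `dtw_distance` and `most_similar_behavior` are hypothetical.

```python
import math

def frame_distance(u, v):
    """Euclidean distance between two posture feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: skip in a, skip in b, or match both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def most_similar_behavior(query, stored):
    """Name of the stored behavior type with the lowest DTW cost to query."""
    return min(stored, key=lambda name: dtw_distance(query, stored[name]))

# Hypothetical stored behavior types as short 2D feature sequences.
STORED_BEHAVIORS = {
    "raise_arm": [[0.0, 0.0], [0.0, 1.0], [0.0, 2.0]],
    "crouch": [[0.0, 0.0], [0.0, -1.0], [0.0, -2.0]],
}
QUERY = [[0.0, 0.0], [0.0, 0.9], [0.0, 1.1], [0.0, 2.1]]
BEST = most_similar_behavior(QUERY, STORED_BEHAVIORS)
```

A lower DTW cost means a higher similarity, so picking the minimum cost corresponds to extracting the behavior type with the highest similarity even when the query has more frames than the stored pattern.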

14. The method of claim 12,
wherein the step (c),
when information on the extracted behavior type and entity type and the context information are input as query information, performs learning using deep learning to extract the dangerous situation prediction information based on the analysis of the correlation.

15. The method of claim 12,
wherein the three-dimensional posture information is expressed in the form of three-dimensional coordinates of the joints constituting a skeleton of the posture taken by the person,
the feature information of the entity is extracted as a descriptor, which is a form reflecting the characteristics of the entity, and
the previously stored behavior types and entity types are parameterized and stored.

16. A method of processing an image for predicting a dangerous situation in an image processing apparatus, the method comprising:
(a) distinguishing, in an image captured in real time, an area of a person and an area of an entity held by the person;
(b) extracting three-dimensional posture information of the person from the distinguished person area and extracting feature information of the entity from the distinguished entity area; and
(c) transmitting the extracted three-dimensional posture information and the feature information of the entity to a server,
wherein the three-dimensional posture information is expressed in the form of three-dimensional coordinates of the joints constituting a skeleton of the posture taken by the person, so that personal information of the person is excluded,
the feature information of the entity is extracted as a descriptor, which is a form reflecting the characteristics of the entity, so that information that intuitively represents the entity is excluded, and
the three-dimensional posture information and the feature information of the entity are used in the server to extract the behavior type of the person and the type of the entity held by the person, respectively, and thereby serve as information for predicting the dangerous situation.

17. The method of claim 16,
wherein the step (c)
further transmits context information including at least one of a time at which the image was photographed and a photographing location, and
the context information is used in the server as information for predicting the dangerous situation.

18. A computer program stored in a recording medium, comprising a series of instructions for performing the method according to any one of claims 12 to 17.