KR102261669B1 - Artificial Neural Network Based Object Region Detection Method, Device and Computer Program Thereof

Info

Publication number: KR102261669B1
Application number: KR1020190032952A
Authority: KR
Inventors: 김동기
Original assignee: 주식회사 핀텔
Priority date: 2019-03-22
Filing date: 2019-03-22
Publication date: 2021-06-07
Also published as: KR20200119372A

Abstract

The present invention relates to an artificial neural network-based object region detection method, an apparatus, and a computer program therefor. More specifically, an object region that is preliminarily detected from parameter values derived during image decoding is reviewed using an artificial neural network, and the result is referenced in image analysis, so that analyses such as object recognition and tracking can be performed faster.

Description

Artificial Neural Network Based Object Region Detection Method, Device and Computer Program Thereof

Recently, the volume of image data collected from smartphones, CCTV cameras, dashboard cameras (black boxes), high-definition cameras, and similar devices has been increasing rapidly. Accordingly, there is growing demand for recognizing people and objects in unstructured image data, extracting meaningful information, and visually analyzing and using the content.

Image data analysis technology refers to the range of techniques that learn from and analyze such images in order to search for a desired image or to recognize situations such as the occurrence of an event.

However, techniques for recognizing, analyzing, and tracking image data rely on algorithms that require a considerable amount of computation. Because of this high complexity, the load on the computing device grows as the image data becomes larger, and analyzing larger image data takes progressively longer. There is therefore a continuing demand for methods that reduce image analysis time.

Meanwhile, as security awareness has grown in recent years because of terrorism and similar threats, the video security market has continued to grow, and the demand for intelligent image processing is increasing accordingly.

Recently, technology for efficiently compressing, transmitting, and viewing high-resolution video using block-based video codecs conforming to standards such as H.264 and H.265 has become widespread. Such high-resolution video is also used for monitoring footage such as CCTV. However, in the analysis and tracking of such video, conventional object detection methods require more computation as the resolution increases, so real-time video cannot be analyzed smoothly.

Meanwhile, Prior Document 1 (Korean Patent Registration No. 10-1409826, registered on June 13, 2014) discloses a technique that calculates a motion vector of a reference frame based on a histogram of the motion vectors of the blocks in the reference frame, and determines the region type of a reference block based on the global motion vector.

However, Prior Document 1 must calculate motion vectors and histogram data for the entire region, so it is difficult to reach a speed that allows real-time processing of today's high-resolution video. In addition, because the motion vectors of all blocks must be considered, computation has to be performed even on unnecessary blocks.

Furthermore, when only motion vectors are considered, as in Prior Document 1, accurate detection of the object region may be difficult. To determine the object region accurately, Prior Document 1 therefore has to compute feature quantities within the image again, which makes fast and accurate analysis of high-resolution video difficult in practice.

Prior Document 1: Korean Patent Registration No. 10-1409826, 'Motion prediction method using an adaptive search range', registered on June 13, 2014

An object of the present invention is to provide an object region detection method, an apparatus, and a computer program therefor that receive the bitstream of a compressed image from a camera, extract data that can be used for object region detection, object recognition, and tracking, review the object region preliminarily detected from the extracted parameters using an artificial neural network, and reference the result in image analysis, so that analyses such as object recognition and tracking can be performed faster.

To solve the above problems, the present invention provides an object region detection method performed in a computing system including one or more processors and a main memory storing instructions executable by the processors, the method comprising: an image decoding step of decoding an image by performing a variable-length decoding step, an inverse quantization step, an inverse transform step, and an addition step on image data; a first object region detection step of deriving first object region information of the image based on size information of the data for macroblocks included in the image data and on one or more image decoding parameters extracted in the image decoding step; and a second object region detection step of deriving second object region information by reviewing, through an artificial neural network, the first object region information derived in the first object region detection step.
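
As a rough illustration of the first object region detection step, the following Python sketch filters macroblocks first by their encoded data size and then by decoding parameters. The `MacroblockInfo` fields, the function name, the thresholds, and the specific decision rule are all hypothetical; the text above only states that size information and one or more decoding parameters are used, not how they are combined.

```python
from dataclasses import dataclass


@dataclass
class MacroblockInfo:
    """Hypothetical per-macroblock record gathered during decoding."""
    row: int
    col: int
    size_bits: int            # encoded size of the macroblock in the bitstream
    motion_magnitude: float   # magnitude of the decoded motion vector
    prediction_error: float   # aggregate of the decoded residual


def first_stage_candidates(blocks, size_thresh, motion_thresh, error_thresh):
    """Return (row, col) of macroblocks kept as first object region candidates.

    Assumed rule: drop blocks with little encoded data, then keep a block
    if either its motion or its prediction error is large enough.
    """
    candidates = []
    for mb in blocks:
        if mb.size_bits < size_thresh:
            continue  # cheap early rejection on data size alone
        if mb.motion_magnitude >= motion_thresh or mb.prediction_error >= error_thresh:
            candidates.append((mb.row, mb.col))
    return candidates
```

The early rejection on `size_bits` reflects the idea that blocks carrying little encoded data can be discarded before any further parameter is inspected.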

In the present invention, the second object region detection step may derive the second object region information by determining, based on a trained recurrent neural network (RNN) model, whether an object exists in the first object region information derived in the first object region detection step.

In the present invention, the second object region detection step may include: a group object region generation step of deriving group object region information for group object regions generated by grouping macroblocks, or subblocks constituting macroblocks, that adjoin one another in the first object region information detected in the first object region detection step; a vector sequence generation step of generating time-ordered sequence data of the image by expressing the group object region information as real-valued vectors; a group object region review step of deriving the probability that an object exists in a group object region by using the sequence data as the input of the recurrent neural network (RNN); and a third discrimination step of detecting the second object region based on whether the derived probability that an object exists in the group object region satisfies a preset criterion.
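
The grouping and final discrimination steps can be sketched in Python as follows. This is a minimal sketch under assumptions: blocks are identified by hypothetical `(row, col)` grid coordinates, "adjoining" is taken to mean sharing an edge (4-connectivity), and the preset criterion is taken to be a simple probability threshold. The RNN itself is not modeled; its output probabilities are passed in as a plain dictionary.

```python
from collections import deque


def group_adjacent_blocks(block_coords):
    """Group (row, col) blocks that touch edge-to-edge into connected sets."""
    remaining = set(block_coords)
    groups = []
    while remaining:
        start = remaining.pop()
        group = {start}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            # Visit the four edge-sharing neighbours (assumed adjacency rule).
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    group.add(nb)
                    queue.append(nb)
        groups.append(group)
    return groups


def accept_groups(probabilities, threshold=0.5):
    """Keep the group ids whose object probability meets the threshold."""
    return [gid for gid, p in probabilities.items() if p >= threshold]
```

For example, `group_adjacent_blocks([(0, 0), (0, 1), (2, 2)])` yields two groups, one containing the two touching blocks and one containing the isolated block.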

In the present invention, the vector sequence generation step may include: an identifier assignment step of assigning an identifier to each group object region; a parameter derivation step of deriving the image decoding parameters of the macroblocks, or of the subblocks constituting macroblocks, that make up the group object region; a group clustering step of clustering, based on the image decoding parameters, the identifiers of group object regions judged to contain the same object in consecutive frames of the image; and a group vector information generation step of expressing the information of the clustered group object regions as vectors.

In the present invention, the image decoding parameters may include at least one of: motion vector information of a macroblock or of a subblock constituting a macroblock; and prediction error information of a macroblock or of a subblock constituting a macroblock.

In the present invention, the group clustering step may identify group object regions containing the same object based on at least one of the position information, the motion vector information, and the prediction error information of the macroblocks, or of the subblocks constituting macroblocks, that make up the group object region.
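
One way to picture the group clustering step is as a matching problem between the group object regions of two consecutive frames. The Python sketch below is only an illustration: the descriptor layout `(cx, cy, mvx, mvy)`, the cost function, and the parameters `max_cost` and `mv_weight` are assumptions, and prediction error information is left out for brevity.

```python
import math


def match_groups(prev, curr, max_cost, mv_weight=1.0):
    """Link each current-frame group to the most similar previous-frame group.

    prev, curr: dict mapping a group id to (cx, cy, mvx, mvy), i.e. the
    group's centroid and mean motion vector (hypothetical descriptor).
    Returns a dict mapping each current id to a previous id, or to None
    when no previous group is similar enough (cost above max_cost).
    """
    links = {}
    for cid, (cx, cy, cmx, cmy) in curr.items():
        best_id, best_cost = None, max_cost
        for pid, (px, py, pmx, pmy) in prev.items():
            # Assumed cost: centroid distance plus weighted motion difference.
            cost = math.hypot(cx - px, cy - py) + mv_weight * math.hypot(cmx - pmx, cmy - pmy)
            if cost <= best_cost:
                best_id, best_cost = pid, cost
        links[cid] = best_id
    return links
```

Groups linked across frames in this way would share one cluster, so that their vectors can be arranged into a single time-ordered sequence per object.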

In the present invention, the vector expressing the information of the group object region may include: position information of the group object region; and parameter information of the group object region.

In the present invention, the parameter information of the group object region may include: a direction histogram of the motion vector information of the macroblocks, or of the subblocks constituting macroblocks, that make up the group object region; and a magnitude histogram of the motion vector information of the macroblocks, or of the subblocks constituting macroblocks, that make up the group object region.
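
The direction and magnitude histograms, and the group vector built from them, can be sketched in Python as follows. The number of direction bins, the magnitude bin edges, the bounding-box form of the position information, and the function names are assumptions made for illustration; the text does not specify them.

```python
import math


def motion_histograms(motion_vectors, n_dir_bins=8, mag_edges=(1.0, 4.0, 16.0)):
    """Build direction and magnitude histograms from (dx, dy) motion vectors."""
    dir_hist = [0] * n_dir_bins
    mag_hist = [0] * (len(mag_edges) + 1)
    for dx, dy in motion_vectors:
        mag = math.hypot(dx, dy)
        # Magnitude bin index = number of edges the magnitude reaches.
        mag_hist[sum(1 for edge in mag_edges if mag >= edge)] += 1
        if mag > 0:  # a zero vector has no direction
            angle = math.atan2(dy, dx) % (2 * math.pi)
            dir_bin = int(angle / (2 * math.pi) * n_dir_bins) % n_dir_bins
            dir_hist[dir_bin] += 1
    return dir_hist, mag_hist


def group_vector(bbox, motion_vectors):
    """Concatenate position (bounding box) and histogram features into one vector."""
    dir_hist, mag_hist = motion_histograms(motion_vectors)
    return [float(v) for v in bbox] + [float(v) for v in dir_hist + mag_hist]
```

With the default settings, `group_vector` returns 16 values per group object region: 4 for the bounding box, 8 direction bins, and 4 magnitude bins.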

To solve the above problems, the present invention also provides an object region detection apparatus implemented in a computing system including one or more processors and a main memory storing instructions executable by the processors, the apparatus comprising: an image decoding unit that decodes an image by performing a variable-length decoding step, an inverse quantization step, an inverse transform step, and an addition step on image data; a first object region detection unit that derives first object region information of the image based on size information of the data for macroblocks included in the image data and on one or more image decoding parameters extracted in the image decoding step; and a second object region detection unit that derives second object region information by reviewing, through an artificial neural network, the first object region information derived in the first object region detection step.

To solve the above problems, the present invention also provides a computer program stored in a computer-readable medium and including a plurality of instructions executed by one or more processors, the computer program comprising: an image decoding step of decoding an image by performing a variable-length decoding step, an inverse quantization step, an inverse transform step, and an addition step on image data; a first object region detection step of deriving first object region information of the image based on size information of the data for macroblocks included in the image data and on one or more image decoding parameters extracted in the image decoding step; and a second object region detection step of deriving second object region information by reviewing, through an artificial neural network, the first object region information derived in the first object region detection step.

According to an embodiment of the present invention, when analyses such as object recognition, tracking, and identification are performed on encoded image data, the object region is detected in advance and image analysis is performed only on the detected object region, so the image can be processed faster.

According to an embodiment of the present invention, unnecessary macroblocks are selectively removed first when the object region is detected before image analysis, which reduces the computation required for object region detection and increases the processing speed of the system as a whole.

According to an embodiment of the present invention, an object region in which an object is judged likely to exist can be detected with a fairly high level of accuracy despite the minimal amount of computation used for object region detection.

According to an embodiment of the present invention, the parameters generated during decoding are used without changing the configuration of the decoder unit that decodes the image data. The invention can therefore be applied easily even if the codec changes, as long as the codec is block-based and conforms to a standard such as H.264 or H.265.

According to an embodiment of the present invention, the object region is first detected and the reliability of the detected object region is then reviewed based on an artificial neural network, so the object region can be detected with a high level of accuracy.

FIG. 1 is a diagram schematically illustrating the overall structure of an object region detection system according to an embodiment of the present invention.
FIG. 2 is a diagram schematically illustrating the detailed configuration of an object region detection system according to an embodiment of the present invention.
FIG. 3 is a diagram schematically illustrating the structure, before decoding, of a data stream of image data according to an embodiment of a video codec that uses blocks conforming to standards such as H.264 and H.265.
FIG. 4 is a diagram schematically illustrating the data field structure of a macroblock of image data according to an embodiment of a video codec that uses variable blocks.
FIG. 5 is a diagram schematically illustrating the detailed configuration of a first object region detection unit according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating several examples of macroblocks.
FIG. 7 is a diagram illustrating an example of a macroblock that includes subblocks.
FIG. 8 is a diagram illustrating, on a block basis, the process of a first object region detection step according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating block division information for an image frame according to an example of the present invention.
FIG. 10 is a diagram illustrating motion vector information for an image frame according to an example of the present invention.
FIG. 11 is a diagram illustrating prediction error information for an image frame according to an example of the present invention.
FIG. 12 schematically illustrates an encoder system that generates an encoded image according to an embodiment of the present invention.
FIG. 13 is a diagram schematically illustrating examples of frames of image data.
FIG. 14 is a diagram schematically illustrating the detailed configuration of a second object region detection unit according to an embodiment of the present invention.
FIG. 15 is a diagram illustrating the result of a group object region generation step according to an embodiment of the present invention.
FIG. 16 is a diagram schematically illustrating the detailed configuration of a vector sequence generation unit according to an embodiment of the present invention.
FIG. 17 is a diagram illustrating the process of a group clustering step according to an embodiment of the present invention.
FIG. 18 is a diagram illustrating the configuration of group vector information according to an embodiment of the present invention.
FIG. 19 is a diagram schematically illustrating the operation of a group object region review unit according to an embodiment of the present invention.
FIG. 20 is a diagram schematically illustrating the operation of a group object region review unit according to an embodiment of the present invention.
FIG. 21 is a diagram schematically illustrating the operation of a group object region review unit according to an embodiment of the present invention.
FIG. 22 illustrates, by way of example, the internal configuration of a computing device according to an embodiment of the present invention.

Various embodiments are now described with reference to the drawings, in which like reference numerals are used to refer to like elements throughout. In this specification, various explanations are presented to provide an understanding of the present invention. It is apparent, however, that these embodiments may be practiced without these specific explanations. In other instances, well-known structures and devices are shown in block diagram form to facilitate the description of the embodiments.

As used herein, the terms "component", "module", "system", "unit", and the like refer to a computer-related entity, hardware, firmware, software, a combination of software and hardware, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a processor and/or a thread of execution, and a component may be localized within one computer or distributed between two or more computers. In addition, these components can execute from various computer-readable media having various data structures stored therein. Components may communicate through local and/or remote processes, for example according to a signal having one or more data packets (for example, data from one component interacting with another component in a local system or a distributed system, and/or data transmitted to other systems by means of the signal across a network such as the Internet).

In addition, the terms "comprises" and/or "comprising" mean that the stated feature and/or element is present, but should be understood as not excluding the presence or addition of one or more other features, elements, and/or groups thereof.

In addition, terms including ordinal numbers, such as first and second, may be used to describe various elements, but the elements are not limited by these terms. The terms are used only to distinguish one element from another. For example, without departing from the scope of the present invention, a first element may be referred to as a second element, and similarly a second element may be referred to as a first element. The term "and/or" includes any combination of a plurality of related listed items, or any one of the plurality of related listed items.

In addition, in the embodiments of the present invention, unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in the embodiments of the present invention.

FIG. 1 is a diagram schematically illustrating the overall structure of an object region detection system according to an embodiment of the present invention.

The object region detection system in FIG. 1 broadly includes a first object region detection unit 2000, a second object region detection unit 4000, an image decoding unit 1000, and an image analysis unit 3000, which process received image data. The image data is image data encoded by a standardized codec, preferably data in which the image is encoded using variable-size blocks, and preferably image data encoded according to a standard such as the H.264 or H.265 codec. More preferably, it is image data encoded according to a standard that uses variable-size blocks.

The image data may be an image stored in the object region detection system, or image data received in real time from another image collection device (for example, a CCTV or monitoring device).

The image decoding unit 1000 is a device for decoding an image encoded according to a standard such as the H.264 or H.265 codec, and is configured according to the decoding scheme of that codec.

The image analysis unit 3000 performs analyses such as object recognition, tracking, and identification on the decoded image it receives. Depending on the purpose of the image analysis, the image analysis unit 3000 may include various components such as a preprocessing unit, a feature information extraction unit, and a feature information analysis unit.

In the present invention, a first object region detection unit 2000 and a second object region detection unit 4000 are introduced to further improve the image processing speed of the image analysis unit 3000. The first object region detection unit 2000 detects a first object region not from the decoded image itself, but from information extracted from the undecoded image data and from parameter information, such as image decoding parameters, extracted during the decoding process of the image decoding unit 1000, and passes the resulting first object region information to the second object region detection unit 4000. The second object region detection unit 4000 reviews the first object region information through an artificial neural network and passes second object region information to the image analysis unit 3000. Here, the second object region detection unit 4000 may derive the second object region information by determining, based on a trained recurrent neural network (RNN) model, whether an object exists in the first object region information derived in the first object region detection step.

Accordingly, the image analysis unit 3000 does not perform object recognition, feature point extraction, preprocessing, and the like on the entire image, but can perform image analysis only on the regions indicated by the second object region information received from the second object region detection unit 4000.
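
The saving described here comes from running the analysis on cropped regions rather than on the full frame. A minimal Python sketch of that idea follows; the frame representation (a list of pixel rows), the `(top, left, bottom, right)` region format, and the `analyze` callback are hypothetical stand-ins for the image analysis unit and the second object region information.

```python
def analyze_regions(frame, regions, analyze):
    """Run `analyze` only on the sub-images given by the detected regions.

    frame:   decoded image as a list of rows (each row a list of pixels)
    regions: iterable of (top, left, bottom, right), bottom/right exclusive
    analyze: callable applied to each cropped sub-image
    """
    results = []
    for top, left, bottom, right in regions:
        # Crop the region so the analysis never touches the rest of the frame.
        crop = [row[left:right] for row in frame[top:bottom]]
        results.append(((top, left, bottom, right), analyze(crop)))
    return results
```

With a 4x4 frame and a single 2x2 region, `analyze` sees 4 pixels instead of 16, so the cost of the analysis scales with the detected area rather than with the frame size.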

In one embodiment of the present invention, the image analysis unit 3000 sets the object region to be analyzed based on the second object region information received from the second object region detection unit 4000, and performs object recognition, feature point extraction, preprocessing, and the like accordingly. However, the present invention is not limited thereto. In another embodiment of the present invention, the image analysis unit 3000 sets a preliminary object region to be analyzed based on the second object region information received from the second object region detection unit 4000, sets the object region through a process such as additional object region detection on the preliminary object region or object region detection by verifying the preliminary object region, and then performs object recognition, feature point extraction, preprocessing, and the like accordingly.

That is, in embodiments of the present invention, the object region need not be finalized by the second object region detection unit; the image analysis unit may instead use the results of the first object region detection unit and/or the second object region detection unit to derive or verify additional object regions.

The first object region detection unit 2000, the second object region detection unit 4000, the image decoding unit 1000, and the image analysis unit 3000 may be implemented on a single computing device, or may be implemented across two or more computing devices.

That is, the object region detection system according to the present invention may be implemented by a computing system that includes one or more processors and a main memory storing instructions executable by the processors.

FIG. 2 is a diagram schematically illustrating the detailed configuration of an object region detection system according to an embodiment of the present invention.

The object region detection apparatus according to an embodiment of the present invention is implemented by a computing system that includes one or more processors and a main memory storing instructions executable by the processors.

As shown in FIG. 2, the image decoding unit 1000, which includes a variable length decoding unit 1100, an inverse quantization unit 1200, an inverse transform unit 1300, an adder 1400, and a prediction unit 1500, decodes image data that has been encoded by an image encoding scheme.

The configuration of the image decoding unit 1000 depends on the configuration of the encoding unit that encoded the image data. For example, the image decoding unit 1000 shown in FIG. 2 is configured to decode an image encoded according to FIG. 12, and in one embodiment of the present invention FIGS. 2 and 12 are drawn with reference to the H.264 or H.265 codec.

The variable length decoding unit 1100 performs variable length decoding on the input image data. Through this, the variable length decoding unit 1100 can separate the image data into motion vectors, quantization values, and DCT coefficients, or extract the motion vectors, quantization values, and DCT coefficients from the image data.

The inverse quantization unit 1200 inversely quantizes the DCT coefficients output from the variable length decoding unit 1100 according to the extracted quantization values.

The inverse transform unit 1300 applies an inverse transform (IDCT) to the DCT coefficients inversely quantized by the inverse quantization unit 1200 to obtain a difference image.

The prediction unit 1500 performs prediction according to whether the frame is in intra mode or inter mode. The motion compensation unit 1530 of the prediction unit 1500 compensates for motion in the current image data using the motion vectors and the previous image data, and thereby generates a predicted image.

The configuration of the variable length decoding unit 1100, the inverse quantization unit 1200, the inverse transform unit 1300, the adder 1400, and the prediction unit 1500 of the image decoding unit 1000 may vary with the encoding scheme or codec of the image data, and a person skilled in the art can implement it according to the codec of that image data. An advantage of the present invention is that it can be configured by adding the first object region detection unit 2000 to an existing decoding unit that decodes images conforming to standards such as H.264 and H.265.

The first object region detection unit 2000 derives first object region information of the image based on size information of the data for each macroblock included in the image data and on one or more image decoding parameters extracted in the image decoding step. The first object region information thus derived is reviewed by the second object region detection unit 4000, which derives second object region information based on the result of the review and passes it to the image analysis unit 3000. Because the image analysis unit 3000 performs image processing only on the regions indicated by the second object region information, the amount of computation required for image analysis can be greatly reduced.

FIG. 3 is a diagram schematically illustrating the pre-decoding structure of a data stream of image data according to an embodiment of a video codec conforming to standards such as H.264 and H.265.

The data stream shown in FIG. 3 corresponds to the image data input to the variable length decoding unit 1100 in FIG. 2, that is, image data as stored or transmitted, on which no decoding has yet been performed. Such image data or data stream is composed of NAL (Network Abstraction Layer) units, and each NAL unit consists of a NAL unit header and a Raw Byte Sequence Payload (RBSP) as its payload. A NAL unit may carry parameter information such as an SPS or PPS, or slice data belonging to the VCL (Video Coding Layer).
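
The NAL structure described above can be illustrated with a minimal sketch, assuming an H.264 byte stream in Annex B format (units separated by 0x000001 start codes). The function names are hypothetical and are not part of the disclosed apparatus. The sketch covers only the one-byte H.264 NAL unit header (H.265 uses a two-byte header) and does not remove emulation prevention bytes from the payload.

```python
from typing import Dict, List

# A few H.264 nal_unit_type values relevant to the description above.
H264_NAL_TYPES = {1: "non-IDR slice (VCL)", 5: "IDR slice (VCL)", 7: "SPS", 8: "PPS"}


def split_annexb_nal_units(stream: bytes) -> List[bytes]:
    """Split an Annex B byte stream into NAL units, dropping the start codes."""
    units = []
    start = None  # index of the first byte of the NAL unit being collected
    i = 0
    while i + 3 <= len(stream):
        if stream[i:i + 3] == b"\x00\x00\x01":
            if start is not None:
                # Zero bytes right before a start code belong to the separator
                # (4-byte start code or trailing zeros), not to the NAL unit.
                units.append(stream[start:i].rstrip(b"\x00"))
            start = i + 3
            i += 3
        else:
            i += 1
    if start is not None:
        units.append(stream[start:].rstrip(b"\x00"))
    return units


def h264_nal_header(nal: bytes) -> Dict[str, int]:
    """Decode the one-byte H.264 NAL unit header."""
    first = nal[0]
    return {
        "forbidden_zero_bit": first >> 7,
        "nal_ref_idc": (first >> 5) & 0x03,
        "nal_unit_type": first & 0x1F,
    }
```

Applied to a stream holding an SPS, a PPS, and an IDR slice, the `nal_unit_type` values would be 7, 8, and 5; only the last of these is a VCL unit containing macroblock data.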

A slice NAL unit belonging to the VCL consists of a header and data, where the data consists of a plurality of macroblock fields and delimiter fields. In the present invention, the image data subject to object region detection is encoded by a scheme in which the data, in its NAL form, is encoded as macroblocks of a fixed block size. Each data field marked MB in FIG. 3 holds the encoded data for a block of that fixed size.

The first determination unit 2100 of the first object region detection unit 2000, described below, uses the data size of each macroblock, that is, of each portion marked MB in FIG. 3, taken from image data on which decoding has not been performed. In this way, candidate object regions can be derived preliminarily from the original data, before even a first-stage decoding procedure such as that of the variable length decoding unit 1100.

For image data encoded with a typical H.264 or H.265 codec, each MB data field in FIG. 3 stores the encoded data for a macroblock of 16 x 16 pixels, and the detailed block information of the macroblock can be read, in partially encoded form, by means of the variable length decoding unit 1100.

FIG. 4 is a diagram schematically illustrating the data field structure of a macroblock of image data according to an embodiment of a video codec that uses variable-size blocks.

The macroblock data field shown in FIG. 4 is in the form obtained after decoding by the variable length decoding unit 1100. Basically, the data field of a macroblock includes: a Type field containing information such as the block size; a Prediction Type field containing information on whether the block was encoded in intra mode or inter mode and, in the case of inter mode, reference frame information and motion vector information; a CPB (Coded Picture Buffer) field containing information for retaining the bit string of the previous picture input during decoding; a QP (Quantization Parameter) field containing information on the quantization parameter; and a DATA field containing information on the DCT coefficients for the colors of the block.

When a macroblock includes a plurality of subblocks, a plurality of the Type, Prediction Type, CPB, QP, and DATA data units shown in the second column of FIG. 4 are concatenated.

The second determination unit 2200, described below, detects the object region using the block size, known from the Type field, and the motion vector information, known from the Prediction Type field, of each whole macroblock (when it has no subblocks) or of each subblock constituting a macroblock, as decoded by the variable length decoding unit 1100.

Meanwhile, the color information in the DATA field contains, in encoded form, information on colors in a plurality of channels (FIG. 4 shows color information in the YCbCr system). This color information is decoded by the inverse quantization unit 1200 and the inverse transform unit 1300, and the adder 1400 can yield prediction error information corresponding to the difference between the color values in the original image data and the color values predicted by the prediction unit 1500 of the image decoding unit 1000.

The prediction error information obtainable at the adder 1400 in this way can be used by the second determination unit 2200 to detect the object region.

FIG. 5 is a diagram schematically illustrating the detailed configuration of the first object region detection unit 2000 according to an embodiment of the present invention.

The first object region detection unit 2000 includes: a first determination unit 2100 that performs a step of extracting size information of the data for each macroblock from the image data before the variable length decoding step is performed, and a step of determining whether that size information meets a preset criterion; a second determination unit 2200 that detects the first object region, for macroblocks whose data size information was found by the first determination unit 2100 to meet the preset criterion, based on one or more image decoding parameters extracted in the image decoding step; and an object region output unit 2300 that outputs information on the first object region detected by the second determination unit 2200 to the second object region detection unit 4000.

In the first determination step, as shown in FIG. 2, size information of the data for each macroblock is derived from the NAL-format data stream of image data that has not been decoded by the variable length decoding unit 1100. That is, whether each macroblock is to be examined by the second determination unit 2200, described below, is decided based on the data size of each data field marked MB in FIG. 3.

In an image encoding method that uses variable-size blocks, a macroblock (16x16) covering a complex part of the image is divided into a plurality of subblocks (8x8, 4x4, and so on) or carries additional information, so the data for that macroblock becomes larger. In addition, in image encoding methods such as H.264 and H.265, when some values in the macroblock data occur frequently and others do not, the total amount of data is reduced by assigning short codes to the frequently occurring values and long codes to the others.
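
The principle of assigning short codes to frequent values can be illustrated with the unsigned Exp-Golomb code, which H.264 uses for many syntax elements. This is a sketch of one variable-length code only; it is not a claim about which entropy coder a particular encoder applies to macroblock data, and the function name is hypothetical.

```python
def exp_golomb_ue(code_num: int) -> str:
    """Return the unsigned Exp-Golomb code word for code_num as a bit string.

    The code word is the binary form of (code_num + 1), preceded by as many
    zero bits as there are bits after its leading one. Small (frequent)
    values therefore get short code words and large values get long ones.
    """
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    binary = bin(code_num + 1)[2:]
    return "0" * (len(binary) - 1) + binary
```

Under this code the value 0 costs one bit, the values 1 and 2 cost three bits each, and the value 7 costs seven bits, so a macroblock whose syntax elements take unusual values occupies more bits in the stream.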

Using this property of encoding methods based on variable-size blocks, the first determination unit 2100 derives the data size of each macroblock from the image data before any decoding procedure, such as that of the variable length decoding unit 1100, is performed. When the data size meets the preset criterion, that is, when it is greater than or equal to a preset value, the macroblock is classified as a target for determination by the second determination unit 2200.
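
The size test of the first determination unit 2100 can be sketched as a simple threshold filter. The sketch assumes that the per-macroblock data sizes (here in bits) have already been obtained from the undecoded stream; the function and parameter names are hypothetical.

```python
from typing import List, Sequence


def first_determination(mb_sizes_bits: Sequence[int], threshold_bits: int) -> List[int]:
    """Return indices of macroblocks whose coded size meets the preset criterion.

    A macroblock passes when its data size is greater than or equal to the
    preset value; only these macroblocks go on to the second determination.
    """
    return [i for i, size in enumerate(mb_sizes_bits) if size >= threshold_bits]
```

For example, with sizes of 12, 340, 25, 410, and 96 bits and a preset value of 96 bits, the macroblocks at indices 1, 3, and 4 are passed on.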

In this way, without modifying the image decoding unit 1000, the present invention extracts from the image data input to the image decoding unit 1000 data that is useful for detecting object regions, and determines with a simple operation whether each macroblock is an object region.

In another embodiment of the present invention, whether each macroblock or subblock is an object region may be determined by jointly considering the determination result of the first determination unit 2100 and that of the second determination unit 2200.

Meanwhile, when a macroblock does not include subblocks, the second determination unit 2200 detects the object region based on one or more image decoding parameters for the macroblock as a whole; when the macroblock includes subblocks, it detects the object region based on one or more image decoding parameters for each of the subblocks.

Here, the second determination unit 2200 derives the first object region information based on one or more of the block size information, motion vector information, and prediction error information of the target block; preferably on two or more of them; and most preferably on all three.

Specifically, the second determination unit 2200 derives the first object region information based on the evaluation result for one or more of the block size information, motion vector information, and prediction error information of the target block; preferably by combining the evaluation results for two or more of them; and most preferably by combining the evaluation results for all three. In one embodiment of the present invention, each of the block size information, motion vector information, and prediction error information is scored according to a preset criterion, an overall score is derived by applying weights or the like to the score of each element, and the first object region information is then derived according to whether the overall score satisfies a preset condition or value.
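
The weighted scoring described above can be sketched as follows. The normalized weighted average, the score range of 0 to 1, and all names are assumptions made for illustration; the description does not fix a particular weighting formula.

```python
from typing import Dict


def combined_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-feature scores (each assumed to lie in 0..1)."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def is_object_block(scores: Dict[str, float], weights: Dict[str, float],
                    threshold: float) -> bool:
    """A block is treated as an object region when the overall score meets the threshold."""
    return combined_score(scores, weights) >= threshold
```

Because the average is taken only over the features present in `scores`, the same function serves embodiments that evaluate one, two, or all three of the parameters.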

In one embodiment of the present invention, the one or more image decoding parameters include block size information of a macroblock or of a subblock constituting a macroblock. The block size information indicates, for example, whether the block is 16x16, 8x8, or 4x4 in size.

A region in which an object is present is likely to have a more complex shape than a background region. In an image encoding scheme that uses variable-size blocks, a macroblock with complex content is divided into a plurality of subblocks for encoding, and in one embodiment of the present invention the second determination unit 2200 uses this encoding property to detect the object region. That is, the second determination unit 2200 determines that the smaller the macroblock or subblock under examination, the more likely it is to correspond to an object region, or, when a score is used to decide whether a block is an object region, assigns a smaller block a higher score than a larger block.
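
One way to turn block size into a score that rises as the block gets smaller is sketched below. The linear mapping on block area and the function name are assumptions; any monotonically decreasing mapping would reflect the same idea.

```python
def block_size_score(width: int, height: int, mb_side: int = 16) -> float:
    """Score in [0, 1) that grows as the block area shrinks.

    A full macroblock (mb_side x mb_side) scores 0.0, and smaller subblocks
    score closer to 1.0.
    """
    mb_area = mb_side * mb_side
    area = width * height
    if not 0 < area <= mb_area:
        raise ValueError("block must be non-empty and fit inside the macroblock")
    return 1.0 - area / mb_area
```

With this mapping a 16x16 block scores 0.0, an 8x8 block 0.75, and a 4x4 block 0.9375.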

Here, as shown in FIG. 5, the block size information is derived from the information decoded in the variable length decoding step. In this way, parameters for detecting the object region can be obtained while keeping the configuration of the image decoding unit 1000 unchanged and without generating separate feature information for deriving the object region.

In one embodiment of the present invention, the one or more image decoding parameters include motion vector information of a macroblock or of a subblock constituting a macroblock. Each macroblock or subblock of an inter-mode frame contains motion vector information (direction and magnitude), and the second determination unit 2200 uses the magnitude of the motion vector to determine whether the macroblock or subblock can correspond to an object region, or to compute a value used in that determination.

A region in which an object is present is more likely to contain motion than a background region. In an image encoding scheme that uses variable-size blocks, each image block of a frame that refers to another frame (for example, a P frame) contains the magnitude of its motion vector relative to the reference frame. In one embodiment of the present invention, the second determination unit 2200 uses this encoding property to detect the object region. That is, the second determination unit 2200 determines that the larger the motion vector magnitude of the macroblock or subblock under examination, the more likely it is to correspond to an object region, or, when a score is used to decide whether a block is an object region, assigns it a higher score than a block with a smaller motion vector.
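
The use of motion vector magnitude can be sketched as follows. The Euclidean magnitude is standard; the saturation value used to map the magnitude onto a 0 to 1 score, and the function names, are assumptions for illustration.

```python
import math


def motion_vector_magnitude(mv_x: float, mv_y: float) -> float:
    """Euclidean length of a motion vector given its two components."""
    return math.hypot(mv_x, mv_y)


def motion_score(mv_x: float, mv_y: float, saturation: float = 16.0) -> float:
    """Score in [0, 1] that grows with motion vector magnitude.

    Magnitudes at or above `saturation` all map to 1.0.
    """
    if saturation <= 0:
        raise ValueError("saturation must be positive")
    return min(motion_vector_magnitude(mv_x, mv_y) / saturation, 1.0)
```

Only the magnitude enters the score here, matching the description above in which the direction of the motion vector is not used.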

Here, as shown in FIG. 5, the motion vector information is derived from the information decoded by the variable length decoding unit 1100. In this way, parameters for detecting the object region can be obtained while keeping the configuration of the image decoding unit 1000 unchanged and without generating separate feature information for deriving the object region.

In one embodiment of the present invention, the one or more image decoding parameters include prediction error information of a macroblock or of a subblock constituting a macroblock.

The prediction error information is derived in the adding step. It is derived from the difference between the color information based on the decoded image data for the macroblock or subblock and the color information predicted for that macroblock or subblock in the prediction step performed for the adding step.

A region in which an object is present is likely to show changes in color or shape in the image, and its prediction error is therefore likely to be large.

Preferably, in the present invention, the second determination unit 2200 detects the object region based on the prediction error information for the Y color information, which corresponds to the luminance (luma) component in the YCbCr color space.

That is, the second determination unit 2200 determines that the larger the prediction error for the Y color information of the macroblock or subblock under examination, the more likely it is to correspond to an object region, or, when a score is used to decide whether a block is an object region, assigns it a higher score than a block with a smaller prediction error.
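
A minimal sketch of a luma prediction error measure follows. It takes the decoded and the predicted Y samples of one block as nested lists and returns their mean absolute difference; the choice of mean absolute difference as the summary statistic, and the function name, are assumptions rather than something the description specifies.

```python
from typing import Sequence


def luma_prediction_error(decoded_y: Sequence[Sequence[int]],
                          predicted_y: Sequence[Sequence[int]]) -> float:
    """Mean absolute difference between decoded and predicted luma samples."""
    total = 0
    count = 0
    for decoded_row, predicted_row in zip(decoded_y, predicted_y):
        for decoded, predicted in zip(decoded_row, predicted_row):
            total += abs(decoded - predicted)
            count += 1
    if count == 0:
        raise ValueError("block must contain at least one sample")
    return total / count
```

A block whose prediction matches the decoded samples exactly yields 0.0, and the value grows as the residual grows, so it can be compared against a reference value or mapped to a score.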

As described above, the second determination unit 2200 determines whether a macroblock or subblock corresponds to an object region based on one or more of the block size information, motion vector information, and prediction error information. The final determination by the second determination unit 2200 may be made in any of the following ways: the block is judged to be an object region when every item of information exceeds its preset reference value; the block is judged to be an object region when the number of items exceeding their preset reference values is at least a preset number; or a score is assigned to each item according to a preset rule and the block is judged to be a final object region when the overall score exceeds a preset reference value.
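
The three alternative decision rules can be sketched side by side. The sketch assumes that every value has been oriented so that a larger number means "more object-like" (for block size this implies using a measure of smallness rather than the raw size); the function names and dictionary keys are hypothetical.

```python
from typing import Dict


def rule_all_exceed(values: Dict[str, float], refs: Dict[str, float]) -> bool:
    """Object region only if every item exceeds its reference value."""
    return all(values[name] > refs[name] for name in refs)


def rule_count_exceed(values: Dict[str, float], refs: Dict[str, float],
                      min_count: int) -> bool:
    """Object region if at least min_count items exceed their reference values."""
    exceeded = sum(1 for name in refs if values[name] > refs[name])
    return exceeded >= min_count


def rule_weighted_score(scores: Dict[str, float], weights: Dict[str, float],
                        cutoff: float) -> bool:
    """Object region if the weighted sum of per-item scores exceeds the cutoff."""
    return sum(scores[name] * weights[name] for name in weights) > cutoff
```

The first rule is the strictest and the second relaxes it to a vote, while the third lets a strong response in one item compensate for a weak response in another.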

Meanwhile, the determination criteria or reference values of the first determination unit 2100 and the second determination unit 2200 may be set from statistics derived from the image itself, from statistics accumulated by the system through its image processing to date, or from values entered as input.
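
As one illustration of a statistically derived reference value, the sketch below sets the macroblock size threshold to the mean plus a multiple of the standard deviation of the sizes observed in an image. This particular formula, the factor `k`, and the function name are assumptions; the description states only that the reference values may come from statistics or from input values.

```python
import statistics
from typing import Sequence


def size_threshold_from_stats(mb_sizes: Sequence[float], k: float = 1.0) -> float:
    """Reference value = mean + k * population standard deviation of the sizes."""
    if not mb_sizes:
        raise ValueError("at least one macroblock size is required")
    return statistics.mean(mb_sizes) + k * statistics.pstdev(mb_sizes)
```

A threshold of this kind adapts to the overall bit rate of the stream: in a heavily compressed video all macroblocks are small, and the reference value drops with them.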

In a preferred embodiment of the present invention, the first determination unit 2100 first filters the candidates by the size of the macroblock data, and the second determination unit 2200 then determines which blocks correspond to an object region by jointly considering the block size information, motion vector information, and prediction error information. The object region can therefore be detected with minimal computational load and more accurately than when only motion vectors are used.

In another embodiment of the present invention, the first object region detection unit 2000 may derive the object region by combining the determination results of the first determination unit 2100 and the second determination unit 2200. Specifically, the object region may be detected, for a whole macroblock or for the subblocks constituting a macroblock, based on the score for the macroblock data size from the first determination unit 2100 and the determination score for one or more of the block size information, motion vector information, and prediction error information from the second determination unit 2200.

In another embodiment of the present invention, the first object region detection unit 2000 may derive the object region based on the determination result of the first determination unit 2100.

In another embodiment of the present invention, the first object region detection unit 2000 may derive the object region based on the determination result of the second determination unit 2200.

FIG. 6 is a diagram illustrating several examples of macroblocks.

FIG. 6(A) shows a macroblock consisting of a single block, FIG. 6(B) a macroblock consisting of 4 blocks, FIG. 6(C) a macroblock consisting of 7 blocks, and FIG. 6(D) a macroblock consisting of 16 blocks.

As described above, the first determination unit 2100 evaluates the data size information of each macroblock. A macroblock composed of a plurality of subblocks, as in FIG. 6(D), is generally likely to have a larger data size than the macroblock of FIG. 6(A). The first determination unit 2100 therefore makes its determination such that a macroblock like that of FIG. 6(D) is more likely than one like that of FIG. 6(A) to be treated as an object region.

Meanwhile, the block size of a subblock in FIG. 6(D) is smaller than that of a subblock in FIG. 6(B) or of the whole macroblock in FIG. 6(A). The second determination unit 2200 therefore makes its determination such that a subblock in FIG. 6(D) is more likely to be treated as an object region than a subblock in FIG. 6(B) or the whole macroblock of FIG. 6(A).

FIG. 7 is a diagram illustrating an example of a macroblock that includes subblocks.

In encoding image data with variable-size blocks, the subblocks within the same macroblock can each have a different block size and motion vector, as shown in FIG. 7.

Blocks #4, #5, #6, and #7 of FIG. 7 have smaller block sizes and larger motion vectors than blocks #1, #2, and #3. The second determination unit 2200 can therefore assign blocks #4, #5, #6, and #7 a higher score than blocks #1, #2, and #3 when scoring whether a block corresponds to an object region.

FIG. 8 is a diagram illustrating, on a block basis, the process of the first object region detection step according to an embodiment of the present invention.

FIG. 8(A) shows 16 macroblocks.

FIG. 8(B) shows the state after the first determination step has been performed, the first determination step including: extracting data size information of each macroblock from the image data before the variable length decoding step is performed; and determining whether the data size information of the macroblock satisfies a preset criterion. In FIG. 8(B), the portions indicated by dotted lines correspond to macroblocks whose data size is less than a preset value, and the portions indicated by solid lines correspond to macroblocks whose data size is greater than or equal to the preset value.

FIG. 8(C) shows the state after the second determination step has been performed, in which an object region is detected, based on one or more image decoding parameters extracted in the image decoding step, for the macroblocks whose data size information was determined in the first determination step to satisfy the preset criterion (the solid-line areas in FIG. 8(B)).

In FIG. 8(C), the portions indicated by dotted lines correspond to blocks that did not satisfy the preset criterion in the second determination step, and the portions indicated by solid lines correspond to subblocks or macroblocks that satisfied the preset criterion in the second determination step.

In a preferred embodiment of the present invention, the first determining unit 2100 excludes some of the macroblocks from the determination targets of the second determining unit 2200, and the second determining unit 2200 performs its determination only on the macroblocks that satisfy the criterion of the first determining unit 2100. This further reduces the amount of computation in the object region detection step and, as a result, improves the overall image processing speed.
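The two-stage filtering described above could be sketched roughly as follows. The function names, the use of motion vector magnitude as the only second-stage parameter, and all threshold and sample values are assumptions made for illustration.

```python
def first_determination(mb_sizes, size_threshold):
    """Keep macroblocks whose coded data size is at or above the preset value."""
    return [mb for mb, size in mb_sizes.items() if size >= size_threshold]


def second_determination(candidates, mv_magnitudes, mv_threshold):
    """Examine only the surviving macroblocks (assumed criterion: MV magnitude)."""
    return [mb for mb in candidates if mv_magnitudes[mb] >= mv_threshold]


# Hypothetical per-macroblock data sizes (bits) and motion vector magnitudes.
mb_sizes = {0: 12, 1: 96, 2: 140, 3: 8}
mv_magnitudes = {0: 0.0, 1: 0.5, 2: 6.0, 3: 0.0}

candidates = first_determination(mb_sizes, size_threshold=64)
first_object_region = second_determination(candidates, mv_magnitudes, mv_threshold=2.0)
```

Only macroblocks 1 and 2 reach the second stage here, so the more expensive check runs on half of the blocks.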

FIG. 9 is a diagram illustrating block partition information for an image screen according to an example of the present invention.

As shown in FIG. 9, in a region where an object is present, each macroblock is divided into finer subblocks. Reflecting this characteristic, the second determining unit 2200 determines whether the corresponding subblock or macroblock is an object region.

FIG. 10 is a diagram illustrating motion vector information for an image screen according to an example of the present invention.

As shown in FIG. 10, in a region where an object is present, the magnitude of the motion vector of each macroblock or subblock becomes larger. Reflecting this characteristic, the second determining unit 2200 determines whether the corresponding subblock or macroblock is an object region.

FIG. 11 is a diagram illustrating prediction error information for an image screen according to an example of the present invention.

As shown in FIG. 11, in a region where an object is present, the value of the prediction error information of each macroblock or subblock becomes larger. (In FIG. 11, among the prediction error information, the prediction error for the Y value, which corresponds to brightness in the YCrCb color space, is rendered bright where the value is large and dark where the value is small.) Reflecting this characteristic, the second determining unit 2200 determines whether the corresponding subblock or macroblock is an object region.

FIG. 12 schematically illustrates an encoder system that generates the encoded image according to an embodiment of the present invention.

The first object region detection unit and the image decoding unit according to the embodiment of the present invention described above may be applied to image data encoded using variable-size blocks as in FIG. 6. As a representative example, they may be applied to image data encoded by the H.264 or H.265 codec.

To generate the data shown as image data in FIGS. 1 and 2, the encoder 10 shown in FIG. 12 may include a discrete cosine transform (DCT) unit 11, a quantization unit 12, an inverse quantization (IQ) unit 13, an inverse discrete cosine transform (IDCT) unit 14, a frame memory 15, a motion estimation and compensation (ME/MC) unit 16, and a variable length coding (VLC) unit 17. Likewise, the image decoding unit is preferably configured to correspond to the configuration of the encoder.

Briefly, the DCT unit 11 performs a DCT operation on the input image data in units of pixel blocks of a preset size (for example, 4×4) in order to remove spatial correlation.

The quantization unit 12 then quantizes the DCT coefficients obtained by the DCT unit 11, representing them with a small number of representative values, and thereby performs high-efficiency lossy compression.

In addition, the inverse quantization unit 13 inversely quantizes the image data quantized by the quantization unit 12.
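As a rough illustration of why quantization is lossy, the quantization and inverse quantization steps can be sketched with a uniform quantizer. The single fixed step size and the coefficient values are assumptions for this sketch; real codecs such as H.264 use per-coefficient scaling controlled by a quantization parameter.

```python
import math


def quantize(coeffs, step):
    """Map each coefficient to the index of its nearest representative value."""
    return [math.floor(c / step + 0.5) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from the quantization indices."""
    return [level * step for level in levels]


coeffs = [52.0, -13.0, 4.0, 1.0]   # hypothetical DCT coefficients
levels = quantize(coeffs, step=10)
restored = dequantize(levels, step=10)
```

The restored coefficients differ from the originals, and small coefficients collapse to zero, which is where most of the compression comes from.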

The inverse transform unit 14 performs an IDCT on the image data inversely quantized by the inverse quantization unit 13.

The frame memory 15 stores, in units of frames, the image data IDCT-transformed by the inverse transform unit 14.

Meanwhile, the motion estimation and compensation unit 16 estimates a motion vector (MV) for each macroblock using the image data of the input current frame and the image data of the previous frame stored in the frame memory 15, and calculates the sum of absolute differences (SAD), which corresponds to the block matching error.
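Block matching with SAD could be sketched as an exhaustive search over a small window. The frame contents, the block size, the search range, and the convention that the motion vector points from the current block to its best match in the previous frame are assumptions of this sketch; practical encoders use faster search strategies.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of the current frame
    at (bx, by) and the block of the reference frame displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_motion(cur, ref, bx, by, n, search):
    """Full search: return the displacement with the smallest SAD and that SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + n <= width
                      and 0 <= by + dy and by + dy + n <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]


def make_frame(px, py, size=8):
    """Hypothetical 8 x 8 frame with a bright 2 x 2 patch at (px, py)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(py, py + 2):
        for x in range(px, px + 2):
            frame[y][x] = 200
    return frame


ref = make_frame(2, 2)   # previous frame
cur = make_frame(3, 2)   # current frame: the patch moved one pixel to the right
mv, cost = estimate_motion(cur, ref, bx=2, by=2, n=4, search=1)
```

Here the best match for the current block lies one pixel to the left in the previous frame, with a SAD of zero.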

The variable length coding unit 17 removes statistical redundancy from the DCT-transformed and quantized data in accordance with the motion vector estimated by the motion estimation and compensation unit 16.

FIG. 13 is a diagram schematically illustrating examples of frames of image data.

The video portion of a typical moving picture consists of I frames (frames labeled "I" in FIG. 13), P frames (frames labeled "P" in FIG. 13), and B frames (frames labeled "B" in FIG. 13).

An I frame is a key frame that contains the entire image, can function as an access point in a video file, is an independently encoded frame, and has a low compression ratio.

A P frame, by contrast, is generated by forward prediction with reference to a previous I frame or P frame and is not an independently encoded frame. Such a P frame has a higher compression ratio than an I frame. Here, a "previous" frame means not only the immediately preceding frame but any one of the plurality of frames that precede the frame in question, and a "subsequent" frame means not only the immediately following frame but any one of the plurality of frames that follow it.

A B frame is generated by forward and backward prediction with reference to previous and subsequent frames and is not an independently encoded frame. Such a B frame has a higher compression ratio than I and P frames. Accordingly, the independently encoded frame corresponds to an I frame, and the non-independently encoded frames may correspond to the remaining B or P frames.

The B and P frames correspond to reference frames, and preferably the first object region detection unit performs object region detection on such reference frames.
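Restricting detection to B and P frames could be expressed as a simple filter over the frame types of a sequence. The function name and the frame-type sequence below are hypothetical.

```python
def select_detection_frames(frame_types):
    """Return the indices of frames that are not independently encoded (P or B)."""
    return [i for i, t in enumerate(frame_types) if t in ("P", "B")]


frame_types = ["I", "B", "B", "P", "B", "B", "P", "I"]  # assumed sequence
targets = select_detection_frames(frame_types)
```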

FIG. 14 is a diagram schematically illustrating the detailed configuration of the second object region detection unit 4000 according to an embodiment of the present invention.

The second object region detection unit 4000 includes: a group object region generation unit 4100 that derives group object region information on a group object region generated by grouping, from the first object region information detected by the first object region detection unit 2000, macroblocks or subblocks constituting macroblocks that touch one another; a vector sequence generation unit 4200 that expresses the group object region information as real-valued vectors to generate time-ordered sequence data of the image; a group object region review unit 4300 that takes the sequence data as an input to the recurrent neural network (RNN) and derives the probability that an object exists in the group object region; and a third determining unit 4400 that detects a second object region based on whether the derived probability that an object exists in the group object region satisfies a preset criterion.

The group object region generation unit 4100 receives the first object region information detected by the first object region detection unit 2000 and creates groups by bundling together adjacent macroblocks, or adjacent subblocks constituting macroblocks, in the first object region information.

When an object is present in the image data, the object may lie within a single macroblock of the image data, but it may also span a plurality of macroblocks or subblocks constituting a plurality of macroblocks. When an object occupies a plurality of macroblocks or subblocks in this way, the macroblocks or subblocks occupied by the object may be bundled into one group and analyzed. When macroblocks or subblocks are analyzed as one group, the presence of the object can be judged for the plurality of blocks at once, and the presence of the object can be tracked coherently across consecutive frames, which enables an accurate determination.

The vector sequence generation unit 4200 expresses the feature information of the group object region as vectors to generate time-ordered sequence data. In the present invention, the second object region detection unit 4000 may determine, based on a recurrent neural network (RNN) model, whether an object exists in the first object region information. To this end, the vector sequence generation unit 4200 converts the information on the group object region into time-ordered vector sequence data that can be input to the RNN, so that it can be analyzed through the RNN.

The time-ordered sequence data may include information on the group object region. When a plurality of group object regions exist, the vector sequence generation unit 4200, in order to allow each group object region to be analyzed, may assign an identification code to each group object region, derive the parameters of each group object region, classify the group object regions judged to contain the same object in consecutive frames, and express them as vectors to generate the sequence data.

In order to derive the parameters of the group object region, the vector sequence generation unit 4200 may retrieve motion vector information from the variable length decoding unit 1100 or prediction error information from the adder 1400. The motion vector information and prediction error information may be used as information for generating the sequence data, or as information for judging whether group object regions contain the same object when generating the sequence data.

The group object region review unit 4300 analyzes the group object region based on a recurrent neural network (RNN). An RNN is an artificial neural network suited to analyzing time-series data; in the present invention, the sequence data generated by the vector sequence generation unit 4200 is input to the RNN to analyze the information of the group object region.

In an embodiment of the present invention, the group object region review unit 4300 may train the RNN in advance and analyze the group object region based on the trained RNN.

In an embodiment of the present invention, the group object region review unit 4300 may receive the sequence data containing the information on the group object region and derive the probability that an object exists in the group object region. Based on this probability, the third determining unit 4400 can analyze the group object region.
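How an RNN turns a vector sequence into a single probability could be sketched with a plain Elman-style cell followed by a sigmoid output. The architecture, the function name, and every weight below are assumptions for illustration; the weights are arbitrary and untrained, and the source does not specify the RNN variant.

```python
import math


def rnn_object_probability(sequence, w_in, w_rec, b_h, w_out, b_out):
    """Run a simple recurrent cell over the sequence and map the final
    hidden state to a probability with a sigmoid."""
    hidden = [0.0] * len(b_h)
    for x in sequence:
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * hidden[k] for k in range(len(hidden)))
                + b_h[i]
            )
            for i in range(len(b_h))
        ]
    logit = sum(w_out[i] * hidden[i] for i in range(len(hidden))) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical sequence: one 3-dimensional feature vector per frame.
sequence = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4]]
w_in = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]   # arbitrary, untrained weights
w_rec = [[0.1, 0.0], [0.0, 0.1]]
b_h = [0.0, 0.0]
w_out = [1.0, -1.0]
b_out = 0.0

probability = rnn_object_probability(sequence, w_in, w_rec, b_h, w_out, b_out)
```

In practice the weights would come from the prior training step, and a gated cell such as an LSTM or GRU could be substituted without changing the interface.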

The third determining unit 4400 analyzes the result derived through the RNN in the group object region review unit 4300 and determines whether an object exists in the group object region. In an embodiment of the present invention, the group object region review unit 4300 derives the probability that an object exists in the group object region, and the third determining unit 4400 judges the reliability of the group object region based on whether the probability satisfies a preset criterion. In an embodiment of the present invention, when the probability that an object exists in a group object region is less than a preset reference value, the group object region review unit 4300 deletes that group object region from the first object region information derived by the first object region detection unit 2000, leaving only the group object regions whose probability is greater than or equal to the preset reference value. The second object region information derived in this way is transmitted to the image analysis unit 3000. By analyzing the image based on the second object region information reviewed in this way, the image analysis unit 3000 can analyze the image faster and with higher accuracy.
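The thresholding step described above amounts to a filter over the per-group probabilities. The function name, the group identifiers, the probabilities, and the reference value of 0.5 are hypothetical.

```python
def derive_second_object_regions(group_probabilities, reference_value=0.5):
    """Drop group object regions whose object probability is below the
    reference value and keep the rest."""
    return {
        group_id: p
        for group_id, p in group_probabilities.items()
        if p >= reference_value
    }


group_probabilities = {"G1": 0.91, "G2": 0.12, "G3": 0.50}  # assumed RNN outputs
second_object_regions = derive_second_object_regions(group_probabilities)
```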

In an embodiment of the present invention, the second object region information may include the information of the group object region. Because the second object region information includes the information of the group object region generated by the group object region generation unit 4100, the image analysis unit 3000, when analyzing the image, can detect objects based on the grouped macroblocks or subblocks and can therefore detect objects more easily.

FIG. 15 is a diagram illustrating the result of the group object region generation step according to an embodiment of the present invention.

FIG. 15 shows a state in which some of a plurality of macroblocks have been determined to be the first object region by the first object region detection unit 2000. Macroblocks determined to be the first object region by the first object region detection unit 2000 are indicated by solid lines, and the other macroblocks are indicated by dotted lines. In embodiments of the present invention, an object region may be set not only in units of macroblocks but also in units of subblocks constituting macroblocks; for convenience of explanation, however, FIG. 15 shows an embodiment of image information consisting only of macroblocks, without subblocks.

In FIG. 15, blocks #33, #34, #3a, #3b, #43, #44, #4a, #4b, #4c, #53, #5a, #5b, #5c, #6a, and #6b are detected as the first object region.

Among these, blocks #33 and #34 touch each other, and the group object region generation unit 4100 groups them into group G1 to create a group object region. Group G1, consisting of blocks #33 and #34, touches blocks #43 and #44, so the group object region generation unit 4100 adds blocks #43 and #44 to group G1. Group G1, now consisting of blocks #33, #34, #43, and #44, touches block #53, so the group object region generation unit 4100 adds block #53 to group G1.

In the same manner, the group object region generation unit 4100 may group blocks #3a, #3b, #4a, #4b, #4c, #5a, #5b, #5c, #6a, and #6b into group G2 to create a group object region.

In another embodiment of the present invention, the group object region generation unit 4100 may create a group object region by grouping not only blocks that touch one another but also macroblocks, or subblocks constituting macroblocks, located within a preset distance.

For example, if the preset distance is twice the width of a macroblock and block #64 is detected as the first object region in the embodiment of FIG. 15, block #64 lies within the preset distance of block #53 of group G1 and may therefore be added to group G1.
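The grouping of FIG. 15 could be sketched as a breadth-first search over block positions. This sketch assumes that a label such as `#4c` encodes a row digit followed by a hexadecimal column digit, that "touching" means sharing an edge, and that distance is measured between block centers in units of one macroblock width; none of these conventions is stated in the source.

```python
import math
from collections import deque


def parse_label(label):
    """Assumed labeling: '#4c' -> (row 4, column 0xc)."""
    return int(label[1], 16), int(label[2], 16)


def group_blocks(labels, max_dist=1.0):
    """Group blocks whose centers lie within max_dist macroblock widths of a
    block already in the group; max_dist=1.0 means edge-adjacent blocks only."""
    cells = {parse_label(label): label for label in labels}
    unvisited = set(cells)
    groups = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        queue = deque([start])
        group = [start]
        while queue:
            r, c = queue.popleft()
            near = [p for p in unvisited
                    if math.hypot(p[0] - r, p[1] - c) <= max_dist]
            for p in near:
                unvisited.remove(p)
                queue.append(p)
                group.append(p)
        groups.append(sorted(cells[p] for p in group))
    return groups


labels = ["#33", "#34", "#3a", "#3b", "#43", "#44", "#4a", "#4b", "#4c",
          "#53", "#5a", "#5b", "#5c", "#6a", "#6b"]
touching_groups = group_blocks(labels)                     # G1 and G2
distance_groups = group_blocks(labels + ["#64"], max_dist=2.0)
```

Under these assumptions the first call yields the two groups G1 and G2 described above, and the second call places block #64 in G1 because it lies within two macroblock widths of block #53.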

FIG. 16 is a diagram schematically illustrating the detailed configuration of the vector sequence generation unit 4200 according to an embodiment of the present invention.

The vector sequence generation unit 4200 includes: an identification code designation unit 4210 that assigns an identification code to the group object region; a parameter derivation unit 4220 that derives image decoding parameters of the macroblocks constituting the group object region or the subblocks constituting the macroblocks; a group clustering unit 4230 that, based on the image decoding parameters, clusters the identification codes of group object regions judged to contain the same object in consecutive frames of the image; and a group vector information generation unit 4240 that expresses the information of the clustered group object regions as vectors.

The image decoding parameters may include one or more of: motion vector information of a macroblock or of a subblock constituting a macroblock; and prediction error information of a macroblock or of a subblock constituting a macroblock.

The identification code designation unit 4210 assigns an identification code to the group object region. The identification code is information for distinguishing the group object regions generated by the group object region generation unit 4100; preferably, for every frame of the image data, an identification code is assigned to each of the group object regions present in that frame.

The parameter derivation unit 4220 derives the image decoding parameters of the macroblocks constituting the group object region or the subblocks constituting the macroblocks. The image decoding parameters of each macroblock or subblock constituting the group object region may be obtained from the variable length decoding unit 1100 or the adder 1400 of the image decoding unit 1000. The parameter derivation unit 4220 may be connected directly to the variable length decoding unit 1100 and the adder 1400 to obtain image decoding parameters such as motion vector information or prediction error information, or it may obtain them through the first object region detection unit 2000, which has already obtained the image decoding parameters.

The group clustering unit 4230 may identify group object regions containing the same object based on one or more of the position information, motion vector information, and prediction error information of the macroblocks constituting the group object region or the subblocks constituting the macroblocks. The group clustering unit 4230 sorts the plurality of group object regions present in each frame of the image data into sets of group object regions judged to contain the same object, so that the group object regions can be analyzed over time. For example, the group clustering unit 4230 groups together group object regions whose position information is similar in adjacent frames. When an object contained in a particular group object region moves in a particular frame, that movement appears as motion vectors in the macroblocks of that group object region, and the object is likely to be located, in the immediately following frame, at the position displaced according to the motion vector. Therefore, if a group object region exists at the position displaced according to the motion vector, it may be judged to contain the same object.

The group vector information generation unit 4240 expresses information capable of representing the features of the group object region as a vector so that it can be input to the RNN of the group object region review unit 4300. To this end, the group vector information generation unit 4240 may obtain, through the parameter derivation unit 4220 or the like, information on the macroblocks constituting the group object region or the subblocks constituting the macroblocks.

In an embodiment of the present invention, the vector expressing the information of the group object region may include the position information of the group object region and the parameter information of the group object region. By representing the features of the group object region through position information and parameter information and reviewing them through the RNN, it can be checked whether an object exists in the group object region.
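One possible way to build such a vector is to average the positions and decoding parameters of the blocks in a group. The five-element layout (centroid, mean motion vector, mean prediction error), the dictionary keys, and the sample values are assumptions; the source only states that position and parameter information are included.

```python
def group_vector(blocks):
    """Hypothetical feature vector for one group object region:
    [centroid x, centroid y, mean MV x, mean MV y, mean prediction error]."""
    n = len(blocks)
    cx = sum(b["x"] for b in blocks) / n
    cy = sum(b["y"] for b in blocks) / n
    mvx = sum(b["mv"][0] for b in blocks) / n
    mvy = sum(b["mv"][1] for b in blocks) / n
    err = sum(b["pred_err"] for b in blocks) / n
    return [cx, cy, mvx, mvy, err]


# Assumed blocks of one group, in macroblock grid coordinates.
blocks = [
    {"x": 3, "y": 3, "mv": (2.0, -2.0), "pred_err": 10.0},
    {"x": 4, "y": 3, "mv": (2.0, -2.0), "pred_err": 14.0},
]
vector = group_vector(blocks)
```

Computing one such vector per frame for the same clustered group yields the time-ordered sequence that is fed to the RNN.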

FIG. 17 is a diagram illustrating the process of the group clustering step according to an embodiment of the present invention.

Frame 1 and frame 2 shown in FIG. 17 are consecutive frames contained in the image data. Specifically, frame 2 is the frame immediately after frame 1.

In frame 1 there are two group object regions to which identification codes have been assigned by the identification code designation unit 4210. In an embodiment of the present invention, the identification code designation unit 4210 assigns these two group object regions the identification codes G1-1, meaning group object region 1 of frame 1, and G1-2, meaning group object region 2 of frame 1. Likewise, the identification code designation unit 4210 assigns the identification codes G2-1 and G2-2 to the two group object regions present in frame 2.

In the embodiment of FIG. 17, the parameter derivation unit 4220 may retrieve the motion vector information of the macroblocks constituting the group object region, or of the subblocks constituting the macroblocks, from the variable length decoding unit 1100 either directly or indirectly (for example, through the first object region detection unit). FIG. 17 shows the motion vector of each macroblock constituting the group object regions.

Each macroblock of group object region G1-1 has a motion vector pointing toward the upper right in FIG. 17, and each macroblock of group object region G1-2 has a motion vector pointing toward the lower left.

The motion vector information of frame 2 is omitted.

The group clustering unit 4230 clusters the group object regions that are determined to contain the same object. To this end, the group clustering unit 4230 may identify group object regions containing the same object based on at least one of the position information, motion vector information, and prediction error information of the macroblocks constituting each group object region or of the subblocks constituting those macroblocks.

In FIG. 17, the position of the macroblocks constituting group object region G2-1 is close to the position of group object region G1-1 plus the motion vector of group object region G1-1. The group clustering unit 4230 can therefore determine that the object in group object region G2-1 of frame 2, the frame immediately after frame 1, is the object in group object region G1-1 of frame 1 after it has moved along that motion vector. When two regions are determined to contain the same object in this way, the group clustering unit 4230 clusters their identification codes G1-1 and G2-1 into the same class.

In the same way, group object regions G1-2 and G2-2 can be determined to contain the same object, and their identification codes G1-2 and G2-2 are clustered into the same class.
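The matching rule described for FIG. 17 can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `GroupRegion` type, the `match_regions` function, the `max_dist` tolerance, and the sample coordinates are all hypothetical, and each motion vector is assumed to be the region's displacement per frame in macroblock units, with y increasing downward.

```python
from dataclasses import dataclass


@dataclass
class GroupRegion:
    code: str   # identification code such as "G1-1"
    x: float    # region position in macroblock units (assumed)
    y: float
    mv: tuple   # representative (dx, dy) displacement per frame (assumed)


def match_regions(prev, curr, max_dist=1.5):
    """Pair each previous-frame region with the current-frame region that
    lies closest to its position plus its motion vector."""
    pairs = []
    for p in prev:
        px, py = p.x + p.mv[0], p.y + p.mv[1]   # predicted position
        best, best_d = None, max_dist
        for c in curr:
            d = ((c.x - px) ** 2 + (c.y - py) ** 2) ** 0.5
            if d <= best_d:
                best, best_d = c, d
        if best is not None:
            pairs.append((p.code, best.code))
    return pairs


# Hypothetical positions: G1-1 moves up-right, G1-2 moves down-left.
frame1 = [GroupRegion("G1-1", 2, 6, (2, -2)), GroupRegion("G1-2", 10, 2, (-2, 2))]
frame2 = [GroupRegion("G2-1", 4, 4, (2, -2)), GroupRegion("G2-2", 8, 4, (-2, 2))]
clusters = match_regions(frame1, frame2)
print(clusters)
```

A distance tolerance is used here because the description only requires the positions to be similar, not identical; prediction error information, which the description also allows as a criterion, is left out of this sketch.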

The identification codes clustered into the same class are then used to build the vector sequence of the group object region information.

FIG. 18 is a diagram illustrating the structure of group vector information according to an embodiment of the present invention.

FIG. 18(A) shows a group object region grouped by the group object region generation unit 4100. The group object region shown in FIG. 18(A) is part of a frame of the image data; the other macroblocks, which were not determined to be object regions, are omitted.

The group object region shown in FIG. 18(A) is 5 wide and 3 high, and the coordinates of its top-right macroblock are (24, 425). Each macroblock is marked with the motion vector information derived from the variable length coding unit 1100.

FIG. 18(B) shows the motion vector information of the group object region shown in FIG. 18(A). Most of the macroblock motion vectors point toward the top of FIG. 18, but their exact directions and magnitudes differ. To characterize the group object region through these motion vectors, the direction of each motion vector may be classified according to a preset direction criterion. In FIG. 18(B), the full range of directions is divided into eight angular sectors, labeled d1 through d8. Under this criterion, the motion vectors of the group object region fall into the sectors d1, d2, d7, and d8. With this classification, the present invention can generate a direction histogram and a magnitude histogram of the motion vectors of the group object region, and these histograms can be used as parameters that serve as characteristic information of the group object region.
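A minimal sketch of the two histograms follows, under stated assumptions. The sector numbering (0-based, counter-clockwise from the positive x axis, y pointing up), the magnitude bin edges, and the function names are hypothetical; the description fixes only that directions are divided into eight sectors.

```python
import math


def direction_bin(dx, dy, bins=8):
    """Return the 0-based index of the angular sector containing (dx, dy)."""
    angle = math.atan2(dy, dx) % (2 * math.pi)
    # The final modulo guards against an angle that rounds up to exactly 2*pi.
    return int(angle // (2 * math.pi / bins)) % bins


def direction_histogram(vectors, bins=8):
    hist = [0] * bins
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # a zero vector has no direction
        hist[direction_bin(dx, dy, bins)] += 1
    return hist


def magnitude_histogram(vectors, edges=(1.0, 2.0, 4.0)):
    """Count vectors per magnitude bin; the bin edges are assumed values."""
    hist = [0] * (len(edges) + 1)
    for dx, dy in vectors:
        magnitude = math.hypot(dx, dy)
        hist[sum(magnitude >= e for e in edges)] += 1
    return hist


# Hypothetical motion vectors of four macroblocks.
vectors = [(1, 3), (-1, 3), (1, 0.5), (3, -1)]
print(direction_histogram(vectors))
print(magnitude_histogram(vectors))
```

Both histograms have a fixed length regardless of how many macroblocks the region contains, which is what lets regions of different sizes be described by vectors of the same dimension.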

FIG. 18(C) shows the group object region information expressed as a vector. The group object region information may include an identification code that distinguishes the group object region, location information indicating the position of the group object region, and parameter information indicating the characteristics of the group object region.

In an embodiment of the present invention, the parameter information of the group object region may include a direction histogram of the motion vector information of the macroblocks constituting the group object region or of the subblocks constituting those macroblocks, and a magnitude histogram of the same motion vector information.

As shown in FIG. 18(C), the group object region information may be formed by listing, in order, the X coordinate, Y coordinate, width, and height that locate the group object region of FIG. 18(A), followed by the direction histogram and magnitude histogram of the macroblock motion vectors illustrated in FIG. 18(B).
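The layout described for FIG. 18(C) can be sketched as a simple concatenation. The function name `region_vector` and the histogram values below are hypothetical; only the position (24, 425) and the 5 by 3 size come from the description of FIG. 18(A).

```python
def region_vector(x, y, width, height, dir_hist, mag_hist):
    """Concatenate location and histogram parameters into one real vector."""
    location = [float(x), float(y), float(width), float(height)]
    return location + [float(v) for v in dir_hist] + [float(v) for v in mag_hist]


# Made-up histograms for the 15 macroblocks of a 5 x 3 region.
dir_hist = [4, 3, 0, 0, 0, 0, 3, 5]   # eight direction sectors d1..d8
mag_hist = [2, 9, 4, 0]               # four assumed magnitude bins
vec = region_vector(24, 425, 5, 3, dir_hist, mag_hist)
print(len(vec), vec[:4])
```

The identification code is kept outside the vector in this sketch and would serve as the key that ties the per-frame vectors of one cluster into a sequence.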

FIGS. 19 to 21 schematically illustrate the operation of the group object region review unit according to an embodiment of the present invention.

FIGS. 19 and 20 show three consecutive frames of image data according to an embodiment of the present invention. In each frame, the object region derived by the first object region detection unit 2000 is drawn with a solid line; the macroblocks of the object region have been grouped by the group object region generation unit 4100 of the second object region detection unit 4000 and clustered by the group clustering unit 4230 of the vector sequence generation unit 4200. That is, group object region G1 of frame 1, group object region G1 of frame 2, and group object region G1 of frame 3, each generated by the group object region generation unit 4100, have been determined by the group clustering unit 4230 to contain the same object and have been clustered together.

The information of the group object regions clustered in this way is expressed as vectors, and sequence data is generated in the temporal order of the frames. In the embodiment of FIG. 19, the information of the group object regions is expressed in the sequence data as one vector per frame and is input to a trained recurrent neural network (RNN).

Because the time-ordered sequence data is fed to the RNN sequentially, it is possible to determine whether an object is contained in the group object region.

In an embodiment of the present invention, the RNN receives the group object region information and derives the probability that an object is contained in the group object region, and the third discrimination unit 4400 determines whether that probability satisfies a preset criterion. The second object region information can then be derived by deleting the group object region from the object region or retaining it, according to the result.
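The review and discrimination steps can be sketched with a small Elman-style recurrent network written in plain Python. This is an assumption-laden illustration: the description does not specify the RNN architecture, so the tanh recurrence, the sigmoid output, the hidden size, the 0.5 threshold, and the random (untrained) weights below are all hypothetical stand-ins for a trained network.

```python
import math
import random


def rnn_probability(sequence, w_in, w_rec, w_out, b_out=0.0):
    """Run an Elman-style RNN over the sequence and return a probability."""
    hidden = [0.0] * len(w_rec)
    for x in sequence:  # one region vector per frame, in time order
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hk for w, hk in zip(w_rec[j], hidden))
            )
            for j in range(len(hidden))
        ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


def keep_region(probability, threshold=0.5):
    """Stand-in for the third discrimination step: retain or delete."""
    return probability >= threshold


# Untrained placeholder weights; a real system would load trained ones.
rng = random.Random(0)
dim, size = 16, 4
w_in = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(size)]
w_rec = [[rng.uniform(-0.1, 0.1) for _ in range(size)] for _ in range(size)]
w_out = [rng.uniform(-1.0, 1.0) for _ in range(size)]
sequence = [[rng.random() for _ in range(dim)] for _ in range(3)]  # 3 frames

p = rnn_probability(sequence, w_in, w_rec, w_out)
retained = keep_region(p)
print(p, retained)
```

With untrained weights the probability carries no meaning; the sketch shows only the data flow from a per-frame vector sequence to a single retain-or-delete decision.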

In this case, because the information of a plurality of group object regions is input to the RNN at the same time, the RNN can distinguish the regions and derive a probability for each of them.

Meanwhile, in another embodiment of the present invention, shown in FIG. 20, the information of each group object region is expressed as its own vector, and these vectors may be input to a plurality of trained recurrent neural networks (RNN1, RNN2), respectively.

In this embodiment, the information of each group object region is input to a separate recurrent neural network, which derives the probability that an object is contained in that region, and the third discrimination unit 4400 may determine whether each derived probability satisfies the preset criterion.

FIG. 21 shows yet another embodiment of the present invention.

The embodiment of FIG. 21 shows three frame groups, each formed by grouping a plurality of consecutive frames. In this embodiment each frame group contains n frames: the first frame group contains frames 1 to n, the second contains frames 2 to n+1, and the third contains frames 3 to n+2. Although every frame group in FIG. 21 contains the same number n of frames, in other embodiments of the present invention the frame groups may contain different numbers of frames. Also, in the embodiment of FIG. 21 a single frame belongs to several frame groups (when n is 4, frame 2 belongs to the first and second frame groups, and frame 3 belongs to the first, second, and third frame groups), but in other embodiments of the present invention each frame may belong to only one frame group.
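The overlapping frame groups of FIG. 21 amount to a sliding window over the frame indices. The sketch below assumes a hypothetical helper `frame_groups` with a `stride` parameter; a stride of 1 gives the overlapping groups of FIG. 21, and a stride equal to n gives the alternative in which each frame belongs to only one group.

```python
def frame_groups(frames, n, stride=1):
    """Split a list of frames into windows of n frames, advancing by stride."""
    return [frames[i:i + n] for i in range(0, len(frames) - n + 1, stride)]


frames = list(range(1, 7))                 # frame numbers 1..6
overlapping = frame_groups(frames, n=4)    # groups 1..4, 2..5, 3..6
disjoint = frame_groups(list(range(1, 9)), n=4, stride=4)
print(overlapping)
print(disjoint)
```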

Each of the three frame groups shown in FIG. 21 is drawn with an accumulated object region, which is derived from the object regions that the first object region detection unit 2000 derived for the frames in that frame group. In one embodiment of the present invention, the accumulated object region may be the union of the object regions of those frames; in another embodiment, it may be the area that was derived as an object region at least a preset number of times; and in yet another embodiment, it may be the area obtained by averaging the object regions of the frames. The macroblocks of the accumulated object region have been grouped by the group object region generation unit 4100 of the second object region detection unit 4000 and clustered by the group clustering unit 4230 of the vector sequence generation unit 4200. That is, the group object regions G1 of the three frame groups, generated by the group object region generation unit 4100, have been determined by the group clustering unit 4230 to contain the same object and have been clustered together.
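The three ways of forming an accumulated object region can be sketched on sets of macroblock coordinates. The set representation, the function names, and in particular the reading of "averaging" as a minimum occupancy ratio are assumptions made for illustration; the description does not define the averaging operation further.

```python
from collections import Counter


def accumulate_union(regions):
    """Every macroblock that is an object region in at least one frame."""
    out = set()
    for region in regions:
        out |= region
    return out


def accumulate_min_count(regions, min_count):
    """Macroblocks detected as object region at least min_count times."""
    counts = Counter(mb for region in regions for mb in region)
    return {mb for mb, c in counts.items() if c >= min_count}


def accumulate_average(regions, ratio=0.5):
    """One possible reading of averaging: mean occupancy of at least ratio."""
    counts = Counter(mb for region in regions for mb in region)
    return {mb for mb, c in counts.items() if c / len(regions) >= ratio}


# Hypothetical object regions of three frames, as (column, row) macroblocks.
regions = [
    {(0, 0), (1, 0)},
    {(1, 0), (2, 0)},
    {(1, 0), (2, 0), (3, 0)},
]
print(accumulate_union(regions))
print(accumulate_min_count(regions, 2))
print(accumulate_average(regions, ratio=1.0))
```

The union is the most permissive choice and the count or ratio variants suppress macroblocks that appear in only a few frames, which is the practical difference between the embodiments.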

The information of the group object regions clustered in this way is expressed as vectors, and sequence data is generated in the temporal order of the frame groups. In the embodiment of FIG. 21, the information of the group object regions is expressed in the sequence data as one vector per frame group and is input to a trained recurrent neural network (RNN).

FIG. 22 illustrates an example of the internal configuration of a computing device according to an embodiment of the present invention. All or some of the components shown in FIG. 22 may include components of the computing device described below.

As shown in FIG. 22, the computing device 11000 may include at least one processor 11100, a memory 11200, a peripheral interface 11300, an input/output (I/O) subsystem 11400, a power circuit 11500, and a communication circuit 11600.

The memory 11200 may include, for example, high-speed random access memory, a magnetic disk, SRAM, DRAM, ROM, flash memory, or non-volatile memory. The memory 11200 may hold the software modules, instruction sets, and other data required for the operation of the computing device 11000.

Access to the memory 11200 by other components, such as the processor 11100 or the peripheral interface 11300, may be controlled by the processor 11100. The processor 11100 may be a single processor or a plurality of processors, and may include GPU-type and TPU-type processors to improve processing speed.

The peripheral interface 11300 may couple the input and/or output peripherals of the computing device 11000 to the processor 11100 and the memory 11200. The processor 11100 may execute the software modules or instruction sets stored in the memory 11200 to perform various functions for the computing device 11000 and to process data.

The input/output subsystem 11400 may couple various input/output peripherals to the peripheral interface 11300. For example, the input/output subsystem 11400 may include a controller for coupling a peripheral such as a monitor, keyboard, mouse, or printer, or, where needed, a touch screen or sensor, to the peripheral interface 11300. According to another aspect, input/output peripherals may be coupled to the peripheral interface 11300 without passing through the input/output subsystem 11400.

The power circuit 11500 may supply power to all or some of the components of the terminal. For example, the power circuit 11500 may include a power management system, one or more power sources such as a battery or alternating current (AC), a charging system, a power failure detection circuit, a power converter or inverter, a power status indicator, or any other component for generating, managing, and distributing power.

The communication circuit 11600 may enable communication with other computing devices through at least one external port.

Alternatively, as described above, the communication circuit 11600 may, where needed, include an RF circuit and enable communication with other computing devices by transmitting and receiving RF signals, also known as electromagnetic signals.

The embodiment of FIG. 22 is only one example of the computing device 11000. The computing device 11000 may omit some of the components shown in FIG. 22, include additional components not shown in FIG. 22, or have a configuration or arrangement that combines two or more components. For example, a computing device for a communication terminal in a mobile environment may include a touch screen, a sensor, and the like in addition to the components shown in FIG. 22, and the communication circuit 11600 may include circuits for RF communication over various schemes (WiFi, 3G, LTE, Bluetooth, NFC, Zigbee, etc.). The components that may be included in the computing device 11000 may be implemented in hardware, including one or more signal-processing or application-specific integrated circuits, in software, or in a combination of hardware and software.

The methods according to embodiments of the present invention may be implemented as program instructions executable by various computing devices and recorded on a computer-readable medium. In particular, the program according to this embodiment may be a PC-based program or an application dedicated to a mobile terminal. An application to which the present invention is applied may be installed on a user terminal through a file provided by a file distribution system. For example, the file distribution system may include a file transmission unit (not shown) that transmits the file in response to a request from the user terminal.

According to an embodiment of the present invention, when analysis such as object recognition, tracking, or identification is performed on encoded image data, the object region is detected in advance and image analysis is performed only on the detected object region, so the image can be processed faster.

According to an embodiment of the present invention, the configuration of the decoder unit that decodes the image data is left unchanged and the parameters generated during decoding in that decoder unit are used, so the invention can be readily applied even when the codec changes, provided the codec uses variable-size blocks.

The apparatus described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications running on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computing devices and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented as program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described with reference to a limited set of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the described components of the system, structure, apparatus, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

delete

An object region detection method performed in a computing system including one or more processors and a main memory storing instructions executable by the processors, the method comprising:
an image decoding step of decoding an image by performing a variable length decoding step, an inverse quantization step, an inverse transform step, and an addition step on image data;
a first object region detection step of deriving first object region information of the image based on size information of data for a macroblock included in the image data and on one or more image decoding parameters extracted in the image decoding step; and
a second object region detection step of deriving second object region information by reviewing, through an artificial neural network, the first object region information derived in the first object region detection step,
wherein the second object region detection step includes:
a group object region generation step of deriving group object region information for a group object region generated by grouping mutually adjacent macroblocks, or subblocks constituting macroblocks, in the first object region information detected in the first object region detection step;
a vector sequence generation step of generating time-ordered sequence data of the image by expressing the group object region information as a real-valued vector;
a group object region review step of deriving a probability that an object exists in the group object region by using the sequence data as an input to a recurrent neural network (RNN); and
a third discrimination step of detecting a second object region based on whether the derived probability that an object exists in the group object region satisfies a preset criterion.

4. The method according to claim 3,
wherein the vector sequence generation step includes:
an identification code designation step of designating an identification code for the group object region;
a parameter derivation step of deriving an image decoding parameter of a macroblock constituting the group object region or of a subblock constituting the macroblock;
a group clustering step of clustering, based on the image decoding parameter, the identification codes of group object regions determined to contain the same object in successive frames of the image; and
a group vector information generation step of expressing the information of the clustered group object regions as a vector.

5. The method according to claim 4,
wherein the image decoding parameter includes at least one of:
motion vector information of a macroblock or of subblocks constituting the macroblock; and
prediction error information of a macroblock or of subblocks constituting the macroblock.

5. The method according to claim 4,
wherein the group clustering step
identifies group object regions containing the same object based on at least one of position information, motion vector information, and prediction error information of a macroblock constituting the group object region or of a subblock constituting the macroblock.

4. The method according to claim 3,
wherein the sequence data includes:
location information of the group object region; and parameter information of the group object region.

8. The method of claim 7,
wherein the parameter information of the group object region includes:
a direction histogram of motion vector information of a macroblock constituting the group object region or of a subblock constituting the macroblock; and
a magnitude histogram of motion vector information of a macroblock constituting the group object region or of a subblock constituting the macroblock.

An object region detection apparatus comprising one or more processors and a main memory storing instructions executable by the processors, the apparatus comprising:
an image decoding unit for decoding an image by performing a variable length decoding step, an inverse quantization step, an inverse transform step, and an addition step on image data;
a first object region detection unit for deriving first object region information of the image based on size information of data for macroblocks included in the image data and on one or more image decoding parameters extracted by the image decoding unit; and
a second object region detection unit for deriving second object region information by reviewing, through an artificial neural network, the first object region information derived by the first object region detection unit,
wherein the second object region detection unit comprises:
a group object region generation unit for deriving group object region information on group object regions generated by grouping macroblocks, or subblocks constituting macroblocks, that are in contact with each other in the first object region information detected by the first object region detection unit;
a vector sequence generation unit for generating time-ordered sequence data of the image by expressing the group object region information as real-valued vectors;
a group object region review unit for deriving a probability that an object exists in the group object region by using the sequence data as an input to a recurrent neural network (RNN); and
a third determination unit for detecting a second object region based on whether the derived probability that an object exists in the group object region satisfies a preset criterion.
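
As an illustrative, non-limiting sketch, grouping macroblocks that are in contact with each other can be viewed as connected-component labeling over block grid positions. The function name `group_adjacent_blocks`, the `(row, col)` representation of a block, and the use of 4-connectivity (blocks sharing an edge) as the meaning of "in contact" are assumptions made for this example; the claims do not fix any of them.

```python
from collections import deque


def group_adjacent_blocks(block_positions):
    """Group (row, col) block positions into 4-connected components.

    Returns a sorted list of groups, each a sorted list of positions.
    """
    remaining = set(block_positions)
    groups = []

    while remaining:
        seed = remaining.pop()
        queue = deque([seed])
        group = [seed]

        # Breadth-first search over blocks that share an edge.
        while queue:
            row, col = queue.popleft()
            neighbors = ((row - 1, col), (row + 1, col),
                         (row, col - 1), (row, col + 1))
            for neighbor in neighbors:
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    queue.append(neighbor)
                    group.append(neighbor)

        groups.append(sorted(group))

    return sorted(groups)


if __name__ == "__main__":
    detected = [(0, 0), (0, 1), (1, 1), (3, 3), (3, 4), (5, 0)]
    for group in group_adjacent_blocks(detected):
        print(group)
```

Each returned group corresponds to one candidate group object region; diagonal contact could be included by adding the four diagonal offsets to `neighbors`.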

A computer program stored on a computer-readable medium and comprising a plurality of instructions executed by one or more processors, the computer program comprising:
an image decoding step of decoding an image by performing a variable length decoding step, an inverse quantization step, an inverse transform step, and an addition step on image data;
a first object region detection step of deriving first object region information of the image based on size information of data for macroblocks included in the image data and on one or more image decoding parameters extracted in the image decoding step; and
a second object region detection step of deriving second object region information by reviewing, through an artificial neural network, the first object region information derived in the first object region detection step,
wherein the second object region detection step comprises:
a group object region generation step of deriving group object region information on group object regions generated by grouping macroblocks, or subblocks constituting macroblocks, that are in contact with each other in the first object region information detected in the first object region detection step;
a vector sequence generation step of generating time-ordered sequence data of the image by expressing the group object region information as real-valued vectors;
a group object region review step of deriving a probability that an object exists in the group object region by using the sequence data as an input to a recurrent neural network (RNN); and
a third determination step of detecting a second object region based on whether the derived probability that an object exists in the group object region satisfies a preset criterion.
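
As an illustrative, non-limiting sketch, the group object region review step and the third determination step could be approximated by a simple Elman-style recurrent network followed by a threshold test. The function names, the single tanh hidden layer, the sigmoid output, and the threshold of 0.5 are assumptions made for this example; the claims specify only that an RNN derives a probability and that the probability is compared against a preset criterion. The weights would come from training, which is not shown here.

```python
import math


def rnn_object_probability(sequence, w_in, w_rec, w_out, b_out=0.0):
    """Run a minimal Elman RNN over a sequence of real-valued vectors.

    sequence: list of input vectors, one per frame
    w_in:     hidden_size x input_size input weights (hypothetical, pre-trained)
    w_rec:    hidden_size x hidden_size recurrent weights
    w_out:    hidden_size output weights
    Returns a probability in (0, 1) from the final hidden state.
    """
    hidden = [0.0] * len(w_rec)
    for x in sequence:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(hidden))
        ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


def is_object_region(probability, threshold=0.5):
    """Preset criterion assumed here: probability at or above a threshold."""
    return probability >= threshold


if __name__ == "__main__":
    frames = [[0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]  # toy sequence data
    w_in = [[0.5, -0.2], [0.1, 0.4]]
    w_rec = [[0.3, 0.0], [0.0, 0.3]]
    w_out = [1.0, 1.0]
    p = rnn_object_probability(frames, w_in, w_rec, w_out)
    print(p, is_object_region(p))
```

Group object regions whose probability fails the criterion would be discarded, and the remaining regions would form the second object region information.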