KR20230040265A

KR20230040265A - Object Region Detection Method, Device and Computer Program For Traffic Information and Pedestrian Information Thereof

Info

Publication number: KR20230040265A
Application number: KR1020220103947A
Authority: KR
Inventors: 김동기
Original assignee: 주식회사 핀텔
Priority date: 2021-09-15
Filing date: 2022-08-19
Publication date: 2023-03-22
Also published as: KR102530635B1

Abstract

The present invention relates to a method, device, and computer program for analyzing traffic information and pedestrian information and, more specifically, to a method, device, and computer program that improve analysis accuracy while using relatively few computing resources, by using object region information detected based on parameter values derived in an image decoding process. The method includes a motion frame determination step; a first object derivation step; a comparison object region derivation step; a second object derivation step; and an analysis step.

Description

Method, device, and computer program for analyzing traffic information and pedestrian information {Object Region Detection Method, Device and Computer Program For Traffic Information and Pedestrian Information Thereof}

The present invention relates to a method, device, and computer program for analyzing traffic information and pedestrian information, and more particularly, to a method, device, and computer program for analyzing traffic information and pedestrian information that can improve analysis accuracy while using relatively few computing resources, by using object region information detected based on parameter values derived in the image decoding process.

Recently, the amount of image data collected from smartphones, CCTVs, dashboard cameras, high-definition cameras, and the like has been increasing rapidly. Accordingly, there is a growing demand for recognizing people and objects in unstructured image data, extracting meaningful information, and visually analyzing and using the content.

Image data analysis technology refers to the range of technologies that learn from and analyze such varied images in order to search for a desired image or to recognize situations such as the occurrence of an event.

However, techniques for recognizing, analyzing, and tracking image data are algorithms that require a considerable amount of computation; that is, because their complexity is high, they place a considerable load on the computing device as the size of the image data grows. As a result, analyzing larger image data takes increasingly long. There is therefore a steady demand for methods that can reduce image analysis time.

Meanwhile, as security awareness has grown in recent years because of terrorism and similar threats, the video security market has continued to grow, and the demand for intelligent image processing is increasing accordingly.

Recently, technology that efficiently compresses, transmits, and displays high-resolution video using block-based video codecs conforming to standards such as H.264 has become widespread. Such high-resolution video is also used for monitoring video such as CCTV. In analysis and tracking of such video, however, conventional object detection methods require more computation as the resolution increases, so real-time video cannot be analyzed smoothly.

Meanwhile, Prior Document 1 (Korean Patent Registration No. 10-1409826, registered on June 13, 2014) discloses a technique that calculates a motion vector of a reference frame based on a histogram of the motion vectors of the blocks in the reference frame, and determines the region type of a reference block based on the global motion vector.

However, in Prior Document 1, motion vectors must be calculated for the entire region and histogram data must be computed for the entire region, so it is difficult to reach a speed that allows real-time processing of current high-resolution video. In addition, because motion vectors must be considered for all blocks, computation has to be performed even for unnecessary blocks.

In addition, when only motion vectors are considered, as in Prior Document 1, it may be difficult to detect the object region accurately. To determine the object region accurately, Prior Document 1 must therefore recompute feature quantities inside the image, which makes fast and accurate analysis of high-resolution video difficult in practice.

Prior Document 1: Korean Patent Registration No. 10-1409826, 'Motion prediction method using adaptive search range', registered on June 13, 2014

To solve the above problems, one embodiment of the present invention provides a method for analyzing traffic or pedestrian information performed in a computing system including one or more processors and a main memory storing instructions executable by the processors, the method comprising: a motion frame determination step of determining, without decoding a first analysis frame that includes a target frame of image data, whether the target frame is a frame containing motion, based on size information of the data for blocks or on image decoding parameters; a first object derivation step of decoding the target frame to extract a decoded image and performing object detection on the decoded image with a deep learning-based first machine learning model to derive one or more first objects; a comparison object region derivation step of detecting one or more comparison object regions inside the target frame, based on size information of the data for blocks or on image decoding parameters, for a second analysis frame that includes the target frame of the image data; a second object derivation step of performing object detection with a deep learning-based second machine learning model on the decoded image of each comparison object region, among the one or more comparison object regions, in which no first object exists, to derive one or more second objects; and an analysis step of determining the first objects and the second objects to be vehicle or pedestrian objects, tracking them, and analyzing traffic or pedestrian information.
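The selection made before the second object derivation step can be illustrated with a minimal sketch. The source does not say how "no first object exists" in a region is tested; the axis-aligned overlap test, the `Box` layout, and the function names below are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[int, int, int, int]


def overlaps(a: Box, b: Box) -> bool:
    """Return True when two axis-aligned boxes share a non-empty area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def regions_for_second_model(comparison_regions: List[Box],
                             first_objects: List[Box]) -> List[Box]:
    """Keep only comparison regions that no first object overlaps.

    Assumption: "no first object exists" is read as "no overlap at all";
    a real system might use an IoU threshold instead.
    """
    return [
        region for region in comparison_regions
        if not any(overlaps(region, obj) for obj in first_objects)
    ]
```

Only the regions returned by `regions_for_second_model` would then be passed to the second, heavier model, which is what keeps the extra computation limited to places the first model may have missed.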

In some embodiments of the present invention, the comparison object region detection step may detect the comparison object region based on at least one of size information and motion vector information of the bitstream data for blocks in the target frame and in one or more frames preceding the target frame.

In some embodiments of the present invention, the comparison object region detection step may include: deriving a comprehensive block data size judgment value for each block by summing the size information of the bitstream data of that block over the target frame and one or more frames preceding the target frame; and deriving the comparison object region based on information on the blocks whose comprehensive block data size judgment value meets a preset criterion.
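The summation over frames described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: per-block sizes are assumed to be available as a 2-D grid per frame, and the "preset criterion" is assumed to be a simple lower threshold. Both function names are hypothetical.

```python
def comprehensive_size_values(frames):
    """Sum per-block bitstream sizes over several frames.

    frames: list of 2-D lists; frames[k][r][c] is the bitstream size of
    block (r, c) in frame k (target frame plus preceding frames).
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(frame[r][c] for frame in frames) for c in range(cols)]
            for r in range(rows)]


def blocks_meeting_criterion(totals, threshold):
    """Return the (row, col) positions whose summed size reaches the threshold.

    Assumption: the preset criterion is "value >= threshold".
    """
    return {(r, c)
            for r, row in enumerate(totals)
            for c, value in enumerate(row)
            if value >= threshold}
```

Blocks that consistently cost many bits across consecutive frames tend to be the ones the encoder could not predict well, which is why they are plausible candidates for a moving object.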

In some embodiments of the present invention, the size information of the bitstream data of a block may be a value normalized based on the overall information on data size within each frame.
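One possible reading of this normalization is sketched below. The source does not specify the formula; dividing each block size by the frame total is an assumption, as is the function name `normalize_frame`.

```python
def normalize_frame(block_sizes):
    """Scale per-block sizes by the total size of the frame.

    block_sizes: 2-D list of bitstream sizes for one frame.
    Assumption: normalization means dividing by the frame total, so the
    values of one frame sum to 1.0; an all-zero frame is returned as zeros.
    """
    total = sum(sum(row) for row in block_sizes)
    if total == 0:
        return [[0.0 for _ in row] for row in block_sizes]
    return [[value / total for value in row] for row in block_sizes]
```

Normalizing per frame keeps frames of very different overall size, such as intra-coded and predicted frames, from dominating a sum taken across frames.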

In some embodiments of the present invention, the comparison object region detection step may include: for each of the target frame and one or more frames preceding the target frame, assigning a first value as the motion vector judgment value of a block when the magnitude of the motion vector of that block meets a preset criterion, and assigning a second value as the motion vector judgment value of the block when the magnitude does not meet the preset criterion; deriving a comprehensive motion vector judgment value for each block by accumulating the motion vector judgment values of that block over the target frame and the one or more frames preceding the target frame; and deriving, as the comparison object region, the blocks whose comprehensive motion vector judgment value meets a preset criterion.
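The accumulation of motion vector judgment values can be sketched as below. The concrete numbers are assumptions: the first value is taken as 1, the second value as 0, and both preset criteria are taken as lower thresholds. The function names are hypothetical.

```python
import math


def mv_judgment(frame_mvs, min_magnitude, first_value=1, second_value=0):
    """Assign a judgment value to every block of one frame.

    frame_mvs: 2-D list of (dx, dy) motion vectors, one per block.
    Assumption: the criterion is "magnitude >= min_magnitude".
    """
    return [[first_value if math.hypot(dx, dy) >= min_magnitude else second_value
             for (dx, dy) in row]
            for row in frame_mvs]


def comprehensive_mv_values(frames_mvs, min_magnitude):
    """Accumulate the per-frame judgment values block by block."""
    judged = [mv_judgment(frame, min_magnitude) for frame in frames_mvs]
    rows, cols = len(judged[0]), len(judged[0][0])
    return [[sum(j[r][c] for j in judged) for c in range(cols)]
            for r in range(rows)]


def moving_blocks(totals, min_count):
    """Return blocks whose accumulated value reaches min_count (assumed criterion)."""
    return {(r, c)
            for r, row in enumerate(totals)
            for c, value in enumerate(row)
            if value >= min_count}
```

With values of 1 and 0, the accumulated value is simply the number of frames in which the block moved noticeably, so a block that moves in only one frame is less likely to be selected.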

In some embodiments of the present invention, the comparison object region detection step may derive second object region information based on whether the comprehensive motion vector judgment value of each block meets a preset criterion and on grouping information derived according to the direction of the motion vector of each block.
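Grouping by motion vector direction could look like the following sketch. The source does not define how directions are compared or how blocks are grouped; quantizing directions into eight sectors and merging 4-connected neighbors that fall in the same sector are assumptions for illustration, and all names are hypothetical.

```python
import math
from collections import deque


def direction_bin(dx, dy, bins=8):
    """Quantize a motion vector direction into one of `bins` sectors.

    Sector 0 is centered on the positive x axis (assumed convention).
    """
    angle = math.atan2(dy, dx) % (2 * math.pi)
    width = 2 * math.pi / bins
    return int((angle + width / 2) // width) % bins


def group_blocks(selected, mvs):
    """Group adjacent selected blocks that move in the same direction sector.

    selected: set of (row, col) blocks that passed the judgment criterion.
    mvs: dict mapping (row, col) to its (dx, dy) motion vector.
    """
    groups, seen = [], set()
    for start in sorted(selected):
        if start in seen:
            continue
        label = direction_bin(*mvs[start])
        group, queue = [], deque([start])
        seen.add(start)
        while queue:
            r, c = queue.popleft()
            group.append((r, c))
            for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if n in selected and n not in seen and direction_bin(*mvs[n]) == label:
                    seen.add(n)
                    queue.append(n)
        groups.append(sorted(group))
    return groups
```

Separating neighbors that move in different directions helps keep two objects passing each other, for example vehicles in opposite lanes, from being merged into one region.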

In some embodiments of the present invention, in the first object derivation step, the decoded image input to the first machine learning model may include a compressed version of the original image, and in the second object derivation step, the decoded image of the comparison object region input to the second machine learning model may include the original image, or a compressed image of relatively higher quality than the compressed image input in the first object derivation step.

In some embodiments of the present invention, the second machine learning model may require a relatively higher computational load than the first machine learning model and may perform more accurate detection.

To solve the above problems, one embodiment of the present invention provides a system for analyzing traffic or pedestrian information implemented as a computing system including one or more processors and a main memory storing instructions executable by the processors, the system comprising: a motion frame determination unit that determines, without decoding a first analysis frame that includes a target frame of image data, whether the target frame is a frame containing motion, based on size information of the data for blocks or on image decoding parameters; a first object region derivation unit that decodes the target frame to extract a decoded image and performs object detection on the decoded image with a deep learning-based first machine learning model to derive one or more first objects; a comparison object region derivation unit that detects one or more comparison object regions inside the target frame, based on size information of the data for blocks or on image decoding parameters, for a second analysis frame that includes the target frame of the image data; a second object region derivation unit that performs object detection with a deep learning-based second machine learning model on the decoded image of each comparison object region, among the one or more comparison object regions, in which no first object exists, to derive one or more second objects; and an analysis unit that determines the first objects and the second objects to be vehicle or pedestrian objects, tracks them, and analyzes traffic or pedestrian information.

To solve the above problems, one embodiment of the present invention provides a computer program stored in a computer-readable medium and including a plurality of instructions executed by one or more processors, the computer program comprising: a motion frame determination step of determining, without decoding a first analysis frame that includes a target frame of image data, whether the target frame is a frame containing motion, based on size information of the data for blocks or on image decoding parameters; a first object derivation step of decoding the target frame to extract a decoded image and performing object detection on the decoded image with a deep learning-based first machine learning model to derive one or more first objects; a comparison object region derivation step of detecting one or more comparison object regions inside the target frame, based on size information of the data for blocks or on image decoding parameters, for a second analysis frame that includes the target frame of the image data; a second object derivation step of performing object detection with a deep learning-based second machine learning model on the decoded image of each comparison object region, among the one or more comparison object regions, in which no first object exists, to derive one or more second objects; and an analysis step of determining the first objects and the second objects to be vehicle or pedestrian objects, tracking them, and analyzing traffic or pedestrian information.

According to an embodiment of the present invention, it is possible to provide a method, device, and computer program for analyzing traffic information and pedestrian information that can improve analysis accuracy while using relatively few computing resources, by using object region information detected based on parameter values derived in the image decoding process.

According to an embodiment of the present invention, frame-by-frame analysis can be performed to increase the accuracy of traffic or pedestrian information analysis while, at the same time, object detection is verified based on decoding parameters.

According to an embodiment of the present invention, errors of a high-speed deep learning-based object detection model that operates on decoded image information can be compensated for, improving accuracy, based on decoding parameters.

FIG. 1 is a diagram schematically showing the overall structure of an object region detection system according to an embodiment of the present invention.
FIG. 2 is a diagram schematically showing a detailed configuration of an object region detection system according to an embodiment of the present invention.
FIG. 3 is a diagram schematically showing the structure, before decoding, of a data stream of image data according to an embodiment of a video codec that uses blocks conforming to a standard such as H.264.
FIG. 4 is a diagram schematically showing the data field structure of a macroblock of image data according to an embodiment of a video codec that uses variable blocks.
FIG. 5 is a diagram schematically showing a detailed configuration of an object region detection unit according to an embodiment of the present invention.
FIG. 6 is a diagram showing several examples of macroblocks.
FIG. 7 is a diagram showing an example of a macroblock that includes subblocks.
FIG. 8 is a diagram showing, on a block basis, the process of an object region detection step according to an embodiment of the present invention.
FIG. 9 is a diagram showing, by way of example and on a block basis, the operation of a first object region information derivation unit according to an embodiment of the present invention.
FIG. 10 is a diagram showing, by way of example and on a block basis, the operation of a first object region information derivation unit according to an embodiment of the present invention.
FIG. 11 is a diagram showing, by way of example and on a block basis, the operation of a second object region information derivation unit according to an embodiment of the present invention.
FIG. 12 is a diagram showing an example of a video screen and a detailed data processing map resulting from the operation of a first object region information derivation unit according to an embodiment of the present invention.
FIG. 13 is a diagram showing an example of a video screen and a detailed data processing map resulting from the operation of a second object region information derivation unit according to an embodiment of the present invention.
FIG. 14 is a diagram showing an example of object region detection by an object region detection method according to an embodiment of the present invention.
FIG. 15 schematically shows the overall steps of a method for analyzing traffic information or pedestrian information according to an embodiment of the present invention.
FIG. 16 shows frames used for detecting a comparison object region according to an embodiment of the present invention.
FIG. 17 shows, by way of example, an object detection process according to an embodiment of the present invention.
FIG. 18 shows, by way of example, an object detection process according to an embodiment of the present invention.
FIG. 19 shows, by way of example, an object detection process according to an embodiment of the present invention.
FIG. 20 schematically shows the operation of a first machine learning model and a second machine learning model according to an embodiment of the present invention.
FIG. 21 schematically shows the operation of a first machine learning model and a second machine learning model according to an embodiment of the present invention.
FIG. 22 schematically shows the internal configuration of a system for analyzing traffic information or pedestrian information according to an embodiment of the present invention.
FIG. 23 schematically shows an encoder system that generates an encoded image according to an embodiment of the present invention.
FIG. 24 is a diagram schematically showing examples of frames of image data.
FIG. 25 shows, by way of example, the internal configuration of a computing device according to an embodiment of the present invention.

Various embodiments are now described with reference to the drawings, in which like reference numerals denote like elements throughout. In this specification, various descriptions are presented to provide an understanding of the present invention. It is apparent, however, that these embodiments may be practiced without these specific descriptions. In other instances, well-known structures and devices are shown in block diagram form to facilitate describing the embodiments.

As used herein, the terms "component", "module", "system", "unit", and the like refer to a computer-related entity: hardware, firmware, software, a combination of software and hardware, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a processor and/or thread of execution, and a component may be localized on one computer or distributed between two or more computers. These components may also execute from various computer-readable media having various data structures stored thereon. Components may communicate via local and/or remote processes, for example according to a signal having one or more data packets (for example, data from one component interacting with another component in a local system or distributed system, and/or data transmitted by signal to other systems over a network such as the Internet).

In addition, the terms "comprises" and/or "comprising" mean that the stated feature and/or element is present, but should be understood not to exclude the presence or addition of one or more other features, elements, and/or groups thereof.

In addition, terms that include ordinal numbers, such as first and second, may be used to describe various elements, but the elements are not limited by these terms. The terms are used only to distinguish one element from another. For example, a first element may be termed a second element, and similarly a second element may be termed a first element, without departing from the scope of the present invention. The term "and/or" includes any combination of a plurality of related listed items or any one of a plurality of related listed items.

In addition, in the embodiments of the present invention, unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in the embodiments of the present invention.

1. Object region detection algorithm based on decoding parameters

FIG. 1 is a diagram schematically showing the overall structure of an object region detection system according to an embodiment of the present invention.

The object region detection system in FIG. 1 broadly includes an object region detection unit 2000, an image decoding unit 1000, and an image analysis unit 3000, which process received image data. The image data is image data encoded by a standardized codec, preferably data in which the image is encoded using variable-size blocks, and preferably image data encoded according to a standard such as the H.264 or H.265 codec. More preferably, it is image data encoded according to a standard that uses variable-size blocks.

The image data may be video stored in the object region detection system, or image data received in real time from another image collection device (for example, a CCTV or monitoring device).

The image decoding unit 1000 is a device for decoding video encoded according to a standard such as the H.264 or H.265 codec, and is configured according to the decoding scheme of the corresponding codec.

The image analysis unit 3000 performs analysis such as object recognition, tracking, and identification on the decoded video it receives, and may include various components, such as a preprocessing unit, a feature information extraction unit, and a feature information analysis unit, depending on the purpose of the image analysis.

In the present invention, the object region detection unit 2000 is introduced to further improve the image processing speed of the image analysis unit 3000. The object region detection unit 2000 detects object region information not from the decoded image itself but from information extracted from the undecoded bitstream of the image data and from parameter information extracted during decoding, such as the image decoding parameters extracted in the decoding process of the image decoding unit 1000, and passes this object region information to the image analysis unit 3000. The object region information may be coordinate information of a rectangular region or information on an irregular region made up of a plurality of blocks.
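The two forms of object region information mentioned here, a rectangle or a set of blocks, can be related by a small sketch. The block indexing scheme, the 16-pixel block size (the H.264 macroblock size), and the function name `blocks_to_rect` are assumptions for illustration; the source does not specify a conversion.

```python
def blocks_to_rect(blocks, block_size=16):
    """Return the pixel rectangle (x1, y1, x2, y2) enclosing a set of blocks.

    blocks: non-empty iterable of (row, col) block indices.
    Assumption: blocks form a uniform grid of block_size x block_size pixels.
    """
    rows = [r for r, _ in blocks]
    cols = [c for _, c in blocks]
    return (min(cols) * block_size,
            min(rows) * block_size,
            (max(cols) + 1) * block_size,
            (max(rows) + 1) * block_size)
```

A rectangle of this kind is convenient for cropping the decoded image before analysis, whereas the block set preserves the irregular shape of the region.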

Accordingly, the image analysis unit 3000 does not perform object recognition, feature point extraction, preprocessing, and the like on the entire image, but can perform image analysis only on the regions indicated by the object region information received from the object region detection unit 2000.

The object region detection unit 2000, the image decoding unit 1000, and the image analysis unit 3000 may be implemented as a single computing device or as two or more computing devices.

That is, the object region detection system according to the present invention may be implemented by a computing system including one or more processors and a main memory storing instructions executable by the processors.

FIG. 2 is a diagram schematically showing a detailed configuration of an object region detection system according to an embodiment of the present invention.

An object region detection apparatus according to an embodiment of the present invention is implemented by a computing system including one or more processors and a main memory storing instructions executable by the processors.

As shown in FIG. 2, the image decoding unit 1000, which includes a variable length decoding unit 1100, an inverse quantization unit 1200, an inverse transform unit 1300, an adder 1400, and a prediction unit 1500, decodes image data that has been encoded according to a video encoding scheme.

The configuration of the image decoding unit 1000 depends on the configuration of the encoder that encoded the image data. For example, the image decoding unit 1000 shown in FIG. 2 is configured to decode the H.264 codec in this embodiment, but the present invention is not limited thereto and can be applied to any codec that performs block-based encoding and decoding. For example, the present invention can also be applied to the H.265 codec.

The variable length decoding unit 1100 performs variable length decoding on the input image data. In doing so, the variable length decoding unit 1100 can separate the image data into motion vectors, quantization values, and DCT coefficients, or extract motion vectors, quantization values, and DCT coefficients from the image data.

The inverse quantization unit 1200 inversely quantizes the DCT coefficients output from the variable length decoding unit 1100 according to the extracted quantization values.

The inverse transform unit 1300 applies an inverse transform (IDCT) to the DCT coefficients inversely quantized by the inverse quantization unit 1200 to obtain a difference image.

The prediction unit 1500 performs prediction according to whether the frame is in intra mode or inter mode. The motion compensation unit 1530 of the prediction unit 1500 compensates for motion in the current image data using the motion vectors and the previous image data, thereby generating a predicted image.

The configurations of the variable length decoding unit 1100, the inverse quantization unit 1200, the inverse transform unit 1300, the adder 1400, and the prediction unit 1500 of the image decoding unit 1000 may vary according to the encoding scheme or codec of the image data, and a person skilled in the art can implement them according to the codec in question. An advantage of the present invention is that the object region detection unit 2000 can be added to an existing decoder that decodes video conforming to standards such as H.264 and H.265.

The object region detection unit 2000 performs an object region detection step of deriving object region information of the image based on size information of the data for blocks included in the image data and on one or more image decoding parameters extracted in the image decoding step.

Here, a block may be a variable-size block or, when a macroblock scheme is used, a macroblock or a subblock included in a macroblock.

The object region information derived in this way is transmitted to the image analysis unit 3000, which processes only the regions indicated by the object region information and can thereby greatly reduce the amount of computation required for image analysis.
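As a rough illustration of restricting analysis to the reported region, the following Python sketch crops a frame to a rectangle before it would be handed to an analysis routine. The frame layout (a list of pixel rows), the `(x0, y0, x1, y1)` rectangle convention, and the name `crop_to_region` are assumptions made for this example; the description does not specify them.

```python
def crop_to_region(frame, rect):
    """Return the sub-image of `frame` that lies inside `rect`.

    frame: list of rows, each row a list of pixel values (hypothetical layout).
    rect:  (x0, y0, x1, y1); top-left corner inclusive, bottom-right exclusive.
    """
    x0, y0, x1, y1 = rect
    return [row[x0:x1] for row in frame[y0:y1]]


# A 4-row by 6-column toy frame whose pixel value encodes its position (10*y + x).
frame = [[10 * y + x for x in range(6)] for y in range(4)]

# Only the 2 x 3 region is passed on; the other 18 pixels are never analysed.
roi = crop_to_region(frame, (2, 1, 5, 3))
```

In this toy case the analysis stage would see 6 of the 24 pixels, which is the kind of saving the paragraph above describes.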

FIG. 3 is a diagram schematically showing the pre-decoding structure of a data stream of image data according to an embodiment of a video codec conforming to a standard such as H.264.

The data stream shown in FIG. 3 corresponds to the image data input to the variable length decoding unit 1100 in FIG. 2, that is, image data that is stored or transmitted without any decoding having been performed. Such image data or data stream consists of NAL (Network Abstraction Layer) units, and each NAL unit consists of a NAL unit header and an RBSP (Raw Byte Sequence Payload) as its payload. A NAL unit may carry parameter information such as an SPS or PPS, or slice data belonging to the VCL (Video Coding Layer).
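The NAL structure described above can be inspected without decoding any picture data. The sketch below is a minimal illustration, assuming the stream uses the H.264 Annex B byte-stream format, in which NAL units are separated by `00 00 01` start codes and the first byte of each unit carries `nal_ref_idc` and `nal_unit_type`. The function name `split_nal_units` and the sample bytes are hypothetical.

```python
def split_nal_units(stream: bytes):
    """Split an H.264 Annex B byte stream into NAL units.

    Returns a list of (nal_unit_type, nal_ref_idc, payload_size) tuples,
    where payload_size is the number of bytes after the 1-byte header.
    """
    start_code = b"\x00\x00\x01"
    units = []
    pos = stream.find(start_code)
    while pos != -1:
        begin = pos + len(start_code)
        nxt = stream.find(start_code, begin)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes directly before the next start code belong to a 4-byte
        # start code or to trailing padding, not to this NAL unit.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            header = unit[0]
            nal_ref_idc = (header >> 5) & 0x03
            nal_unit_type = header & 0x1F
            units.append((nal_unit_type, nal_ref_idc, len(unit) - 1))
        pos = nxt
    return units


# A few nal_unit_type values defined by H.264.
NAL_NAMES = {1: "non-IDR slice", 5: "IDR slice", 7: "SPS", 8: "PPS"}

# Hand-made sample: an SPS, a PPS and an IDR slice with dummy payload bytes.
stream = (b"\x00\x00\x00\x01\x67\xAA"
          b"\x00\x00\x01\x68\xBB"
          b"\x00\x00\x00\x01\x65\xCC\xDD")
units = split_nal_units(stream)
names = [NAL_NAMES.get(t, "other") for t, _, _ in units]
```

The payload size counted here still includes any emulation prevention bytes, which is acceptable for a size-based heuristic but not for exact RBSP parsing.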

A slice NAL unit belonging to the VCL consists of a header and data, where the data consists of a plurality of macroblock fields and delimiter fields. In the present invention, the image data on which object region detection is performed is encoded, at the NAL level, into macroblocks having a fixed block size. In FIG. 3, each data field labeled MB contains the encoded data for a block of fixed size.

The first object region information deriving unit 2100 of the object region detection unit 2000, described later, can use the data size of each macroblock, that is, of each portion labeled MB in FIG. 3, taken from image data on which decoding has not been performed.

For image data encoded with a typical H.264 codec, each MB data field in FIG. 3 stores the encoded data for a macroblock of 16 x 16 pixels, and the variable length decoding unit 1100 makes the detailed block information of the macroblock available in partially encoded form.

FIG. 4 is a diagram schematically showing the data field structure of a macroblock of image data according to an embodiment of a video codec using variable-size blocks.

The macroblock data field shown in FIG. 4 is in the form decoded by the variable length decoding unit 1100. Basically, the data field of a macroblock includes: a Type field containing information such as the block size; a Prediction Type field containing information on whether the block was encoded in intra mode or inter mode and, in the case of inter mode, reference frame information and motion vector information; a CPB (Coded Picture Buffer) field containing information for retaining the bit string of the previous picture input during decoding; a QP (Quantization Parameter) field containing information on the quantization parameter; and a DATA field containing information on the DCT coefficients for the colors of the block.

When a macroblock includes a plurality of subblocks, a plurality of Type / Prediction Type / CPB / QP / DATA data units, as shown in the second column of FIG. 4, are concatenated. The first object region information deriving unit 2100, described later, can derive first object region information based on the bitstream data size of each subblock. The bitstream data size of a subblock may be extracted on the basis of identifiers before variable length decoding, or may be extracted during or after variable length decoding.

The second object region information deriving unit 2200, described later, detects object regions using the block size given by the Type field and the motion vector information given by the Prediction Type field of each whole macroblock (when it has no subblocks) or of each subblock constituting a macroblock, as decoded by the variable length decoding unit 1100.

Meanwhile, the DATA field contains, in encoded form, color information for a plurality of color channels (the YCbCr color system in FIG. 4). This color information is decoded by the inverse quantization unit 1200 and the inverse transform unit 1300, and prediction error information, corresponding to the difference between the color values of the original image data and the color values predicted by the prediction unit 1500 of the image decoding unit 1000, can be derived at the adder 1400.

FIG. 5 is a diagram schematically showing a detailed configuration of the object region detection unit 2000 according to an embodiment of the present invention.

The object region detection unit 2000 performs an object region detection step of deriving object region information of the image based on size information of the data for blocks included in the image data and on one or more image decoding parameters extracted in the image decoding step.

The object region detection unit 2000 includes: a first object region information deriving unit 2100 that derives first object region information by extracting size information of block data from the bitstream of the image data before the variable length decoding step is performed, and determining whether the size information of the data for each block satisfies a preset criterion; a second object region information deriving unit 2200 that derives second object region information based on one or more image decoding parameters extracted in the image decoding step; and an object region output unit 2300 that determines the object region information based on the first object region information and the second object region information and transmits it to the image analysis unit 3000.

As shown in FIG. 2, the first object region information deriving unit 2100 can derive size information of the data for each macroblock from the NAL-format data stream of image data that has not been decoded by the variable length decoding unit 1100. Whether each macroblock corresponds to an object region is determined based on the data size of each data field labeled MB in FIG. 3.

In video encoding methods that use variable-size blocks, a macroblock (16x16) covering a complex part of the image is divided into a plurality of subblocks (8x8, 4x4, etc.) or carries additional information, so the encoded size of that macroblock grows. In addition, in video encoding methods such as H.264, when some values in the macroblock data occur frequently and others do not, short codes are assigned to the frequent values and long codes to the infrequent ones, which reduces the total amount of data.

Using these characteristics of encoding with variable-size blocks, the first object region information deriving unit 2100 can derive the data size of each macroblock from the image data before decoding procedures such as that of the variable length decoding unit 1100 are completed, and can derive first object region information on that basis.
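The size-based test can be sketched as a simple threshold over a grid of per-macroblock encoded sizes. The sketch assumes the sizes have already been obtained from the bitstream; the function name `first_object_mask`, the sample sizes, and the threshold of 64 are illustrative values, not figures from the description.

```python
def first_object_mask(mb_sizes, threshold):
    """Mark macroblocks whose encoded size reaches `threshold`.

    mb_sizes: 2-D list (rows of macroblocks) of encoded sizes, e.g. in bits.
    Returns a 2-D list of booleans with the same shape.
    """
    return [[size >= threshold for size in row] for row in mb_sizes]


# Hypothetical sizes: flat background costs few bits, a detailed object many.
mb_sizes = [
    [12, 15, 14, 11],
    [13, 96, 120, 12],
    [10, 88, 140, 16],
]
mask1 = first_object_mask(mb_sizes, threshold=64)
```

A fixed threshold is only one possible "preset criterion"; a relative criterion computed per frame would fit the same interface.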

In this way, the present invention leaves the image decoding unit 1000 unchanged, extracts from the image data input to the image decoding unit 1000 the data that is useful for detecting object regions, and determines with a simple operation whether each macroblock belongs to an object region.

In one embodiment of the present invention, even when the data is not encoded in a distinct macroblock/subblock form, the first object region information deriving unit 2100 can derive first object region information based on the bitstream data size of each block.

When the image data is encoded into a plurality of macroblocks, some of which include subblocks, the first object region information deriving unit 2100 may operate on the bitstream data size of the macroblocks alone, as described above. In one embodiment of the present invention, however, the first object region information deriving unit 2100 may instead derive first object region information by: distinguishing macroblocks or subblocks using the delimiter information in the bitstream of the image data, and extracting from it the size of the bitstream data of each block, whether macroblock or subblock; and determining whether the bitstream data size of each macroblock without subblocks, or of each subblock, satisfies a preset criterion.

In one embodiment of the present invention, when a macroblock does not include subblocks, the second object region information deriving unit 2200 detects object regions based on one or more image decoding parameters for the macroblock as a whole; when the macroblock includes subblocks, it detects object regions based on one or more image decoding parameters for each of the subblocks.

Through various simulations, the inventors implemented the second object region information deriving unit 2200 using various kinds of information, such as block size information, motion vector magnitude information, grouping information based on motion vector direction angle, and prediction error information. They found that, when object regions are detected in conjunction with the bitstream data size of blocks, using motion vector magnitude information alone, or motion vector magnitude information together with grouping information based on motion vector direction angle (or motion vector direction angle information), satisfies both accuracy and computation speed.

In a preferred embodiment of the present invention, the one or more image decoding parameters include motion vector information of a macroblock or of a subblock constituting a macroblock. Each macroblock or subblock of an inter-mode frame contains motion vector information (direction and magnitude), and preferably the second object region information deriving unit 2200 uses the magnitude of the motion vector to determine whether the macroblock or subblock may correspond to an object region, or to determine a value used in that determination.

A region containing an object is more likely to show motion than a background region. In a video encoding scheme using variable-size blocks, each image block of an inter-predicted frame (for example, a P-frame) contains the magnitude of its motion vector relative to the reference frame. In one embodiment of the present invention, the second object region information deriving unit 2200 uses this encoding characteristic to detect object regions. That is, the larger the motion vector of the macroblock or subblock under examination, the more likely the second object region information deriving unit 2200 judges it to belong to an object region; or, when a score is used to decide whether a block is an object region, it assigns a higher score to that block than to blocks with small motion vectors.
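A minimal sketch of this scoring, assuming each block's motion vector is available as a `(dx, dy)` pair after variable length decoding. The block numbering, vector values, and threshold are hypothetical, and using the Euclidean magnitude directly as the score is one possible choice rather than the method fixed by the description.

```python
import math


def motion_scores(motion_vectors):
    """Map each block id to the magnitude of its motion vector."""
    return {block: math.hypot(dx, dy)
            for block, (dx, dy) in motion_vectors.items()}


def second_object_blocks(motion_vectors, min_magnitude):
    """Return the ids of blocks whose motion vector magnitude reaches the threshold."""
    scores = motion_scores(motion_vectors)
    return sorted(b for b, s in scores.items() if s >= min_magnitude)


# Hypothetical vectors for seven subblocks: blocks 1-3 barely move, 4-7 move a lot.
mvs = {1: (0, 0), 2: (1, 0), 3: (0, -1),
       4: (3, 4), 5: (-6, 8), 6: (5, 12), 7: (-8, -6)}
scores = motion_scores(mvs)
moving = second_object_blocks(mvs, min_magnitude=4)
```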

Here, as shown in FIG. 5, the motion vector information is derived from the information decoded by the variable length decoding unit 1100. In this way, parameters for detecting object regions can be obtained without generating separate feature information for that purpose and while keeping the configuration of the image decoding unit 1000 unchanged.

That is, in a preferred embodiment of the present invention, the object region detection unit 2000 derives first object region information based on the size of the bitstream data for blocks included in the image data, and derives second object region information based on motion vector information obtained from the information decoded by the variable length decoding unit in the image decoding step. The object region information includes the regions corresponding to the first object region information or to the second object region information. Here, the object region information is derived by an OR operation on the region corresponding to the first object region information and the region corresponding to the second object region information.
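The OR combination can be sketched as follows, assuming both results are represented as boolean grids over the same block layout. The grid representation and the name `combine_or` are assumptions made for illustration.

```python
def combine_or(mask1, mask2):
    """Block-wise OR of two boolean masks of identical shape."""
    return [[a or b for a, b in zip(row1, row2)]
            for row1, row2 in zip(mask1, mask2)]


# Hypothetical 3 x 4 block grids.
size_mask = [  # blocks flagged by bitstream data size
    [False, False, False, False],
    [False, True, True, False],
    [False, False, False, False],
]
motion_mask = [  # blocks flagged by motion vector information
    [False, False, False, False],
    [False, False, True, True],
    [False, False, True, False],
]
object_mask = combine_or(size_mask, motion_mask)
```

A block is kept if either criterion flags it, so the combined region is never smaller than either input region.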

FIG. 6 is a diagram showing several examples of macroblocks.

FIG. 6(A) shows a macroblock consisting of one block, FIG. 6(B) a macroblock consisting of four blocks, FIG. 6(C) a macroblock consisting of seven blocks, and FIG. 6(D) a macroblock consisting of sixteen blocks.

As described above, in one embodiment of the present invention, the first object region information deriving unit 2100 evaluates the size information of the data for each macroblock. In another embodiment of the present invention, the first object region information deriving unit 2100 evaluates the size information of the data for each macroblock or for each subblock included in a macroblock.

In the embodiment in which the first object region information deriving unit 2100 evaluates the size information of the data for each macroblock, a macroblock consisting of a plurality of subblocks, as in FIG. 6(D), is generally likely to have a larger bitstream data size than the macroblock of FIG. 6(A). In this case, the first object region information deriving unit 2100 makes its determination such that a macroblock like that of FIG. 6(D) is more likely to be classified as an object region than a macroblock like that of FIG. 6(A).

FIG. 7 is a diagram showing an example of a macroblock including subblocks.

In the encoding of image data using variable-size blocks, the subblocks within the same macroblock may each have a different block size and motion vector, as shown in FIG. 7.

Blocks #4, #5, #6, and #7 in FIG. 7 have larger motion vectors than blocks #1, #2, and #3, so the second object region information deriving unit 2200 can classify blocks #4, #5, #6, and #7 as object regions with relatively high likelihood.

FIG. 8 is a diagram illustrating, on a block basis, the object region detection step according to an embodiment of the present invention.

As described above, in one embodiment of the present invention, the object region detection step performed by the object region detection unit 2000 derives first object region information based on the size of the bitstream data for blocks included in the image data, and derives second object region information based on motion vector information obtained from the information decoded by the variable length decoding unit in the image decoding step. The object region information includes the regions corresponding to the first object region information or to the second object region information.

In one embodiment of the present invention, the first object region information may be information on the blocks whose bitstream data size meets a preset criterion (for example, being at least a preset size), or it may be implemented as information on the top-left and bottom-right vertices of the rectangle formed along the boundary of those blocks.
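The rectangular form of the region information can be sketched as a bounding box over the flagged blocks. The sketch assumes a boolean block grid and 16 x 16 pixel macroblocks, and reports the bottom-right vertex as the pixel just past the last flagged block; these conventions and the name `bounding_rectangle` are assumptions made for this example.

```python
def bounding_rectangle(mask, block_size=16):
    """Return ((left, top), (right, bottom)) in pixels around all flagged blocks.

    mask: 2-D list of booleans, one entry per block.
    Returns None when no block is flagged.
    """
    cells = [(r, c)
             for r, row in enumerate(mask)
             for c, hit in enumerate(row) if hit]
    if not cells:
        return None
    top = min(r for r, _ in cells)
    left = min(c for _, c in cells)
    bottom = max(r for r, _ in cells) + 1  # exclusive block index
    right = max(c for _, c in cells) + 1   # exclusive block index
    return ((left * block_size, top * block_size),
            (right * block_size, bottom * block_size))


mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, False, True, False],
]
rect = bounding_rectangle(mask)
```

The rectangle may cover blocks that did not meet the criterion themselves (here the block at row 2, column 1), which is the trade-off of the rectangular form against the per-block form.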

Similarly, in one embodiment of the present invention, the second object region information may be information on the block or blocks whose motion vector information meets a preset criterion (for example, blocks whose motion vector magnitude is at least a preset size, or blocks whose motion vector magnitude is at least a preset size and which have been grouped by motion vector direction), or it may be implemented as information on the top-left and bottom-right vertices of the rectangle formed along the boundary of those blocks.
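One way to picture "grouping by motion vector direction" is to quantize each vector's angle into a small number of direction bins and collect the blocks that both move enough and share a bin. The description does not define the grouping rule, so the eight 45-degree bins centred on the compass directions, the threshold, and the function names below are assumptions chosen purely for illustration.

```python
import math
from collections import defaultdict


def direction_bin(dx, dy, bins=8):
    """Quantize the direction of (dx, dy) into one of `bins` sectors.

    Sector 0 is centred on the positive x axis; sectors advance with
    increasing angle as computed by atan2.
    """
    width = 360.0 / bins
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return int(((angle + width / 2) % 360.0) // width)


def group_moving_blocks(motion_vectors, min_magnitude, bins=8):
    """Group blocks with sufficiently large motion vectors by direction sector."""
    groups = defaultdict(list)
    for block, (dx, dy) in sorted(motion_vectors.items()):
        if math.hypot(dx, dy) >= min_magnitude:
            groups[direction_bin(dx, dy, bins)].append(block)
    return dict(groups)


# Hypothetical vectors: blocks 3-5 move roughly the same way, block 6 differs.
mvs = {1: (0, 0), 2: (1, 0), 3: (6, 1), 4: (7, 0), 5: (6, -1), 6: (0, 5)}
groups = group_moving_blocks(mvs, min_magnitude=4)
```

Blocks 3, 4, and 5 land in the same sector and could be treated as one moving object, while block 6 forms its own group.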

For example, FIG. 8(A) shows an example of the region corresponding to the first object region information, and FIG. 8(B) shows an example of the region corresponding to the second object region information. In this case, in a preferred embodiment of the present invention, the object region detection unit 2000 can derive as the object region information the region of FIG. 8(C), which covers whatever belongs to the first object region information or to the second object region information (an OR operation).

When, as in the present invention, object regions are detected by considering both the bitstream data size of blocks and the magnitude (and direction) of their motion vectors, merging the separately derived regions with an OR operation, as in FIG. 8(C), was found to match the actual object regions and also to give the highest computation speed.

FIG. 9 is a diagram illustrating, on a block basis, an example of the operation of the first object region information deriving unit 2100 according to an embodiment of the present invention.

In a preferred embodiment of the present invention, the object region detection step derives the object region information based on the size information of the data for blocks in a preset number of consecutive frames and on the motion vector information in a preset number of frames. For example, the object region is derived by jointly considering the bitstream data sizes and motion vector information of blocks over 5, 6, or 7 frames.

In one embodiment of the present invention, as shown in FIG. 9, one frame (preferably an I-frame) is selected from a preset number of frames (five in the case of FIG. 9), and the bitstream data size (absolute value) of each block is normalized based on the overall data size information within that frame. That is, when the second frame is selected in FIG. 9, the bitstream data sizes (absolute values) of its seven blocks may be S1, S2, S3, S4, S5, S6, and S7. From S1 to S7, the normalized data sizes NS1, NS2, NS3, NS4, NS5, NS6, and NS7 can be derived. The important point here is that the normalization is based on the bitstream data size information of that same frame. This process enables more accurate object region detection.

In one embodiment of the present invention, before or after the process of normalizing the bitstream data size (absolute value) of each block based on the overall data size information within the frame, a process of extracting only the blocks or macroblocks whose size meets a preset criterion (for example, the top 70%) may be performed. This is done to speed up the computation.
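The "top 70%" pre-filter can be read in more than one way; the sketch below assumes it means keeping the 70% of blocks with the largest bitstream data size, rounding the count up and breaking ties by block index. That interpretation, the function name `top_percent_blocks`, and the sample sizes are assumptions.

```python
def top_percent_blocks(sizes, percent=70):
    """Return the indices of the largest `percent` % of blocks, in index order.

    sizes:   list of per-block bitstream data sizes.
    percent: integer percentage of blocks to keep (count is rounded up).
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    count = -(-len(sizes) * percent // 100)
    ranked = sorted(range(len(sizes)), key=lambda i: (-sizes[i], i))
    return sorted(ranked[:count])


sizes = [20, 40, 40, 20, 50, 100, 200]
kept = top_percent_blocks(sizes, percent=70)
```

With seven blocks, 70% rounds up to five, so the two smallest blocks are dropped before any further processing.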

For example, the values S1 to S7 may be normalized as follows. Various known normalization techniques may be used in the present invention.

                                                          S1    S2    S3    S4    S5    S6    S7
Block bitstream data size in the frame (absolute value)   20    40    40    20    50    100   200
Normalized bitstream data size [0 to 10]                  1     2     2     1     2.5   5     10

S1S1 S2S2 S3S3 S4S4 S5S5 S6S6 S7S7 해당 프레임의 블록 비트스트림 데이터크기(절대값)Block bitstream data size of the frame (absolute value) 22 44 44 22 55 1010 2020 정규환된 비트스트림 데이터의 크기정보[0 ~ 10] Size information of normalized bitstream data [0 ~ 10] 1One 22 22 1One 2.52.5 55 1010
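The example values above are consistent with scaling each block size by the largest block size in the same frame. The following is a minimal sketch under that assumption; the function name `normalize_block_sizes` and the `scale` parameter are illustrative and do not come from the specification, which allows any known normalization technique.

```python
def normalize_block_sizes(sizes, scale=10.0):
    """Scale the per-block bitstream sizes of ONE frame into [0, scale].

    Assumption: normalization divides by the largest block size in the
    same frame, which reproduces the two example tables.
    """
    peak = max(sizes)
    if peak == 0:
        # A frame with no block data normalizes to all zeros.
        return [0.0 for _ in sizes]
    return [scale * s / peak for s in sizes]


frame_a = [20, 40, 40, 20, 50, 100, 200]
frame_b = [2, 4, 4, 2, 5, 10, 20]

print(normalize_block_sizes(frame_a))  # [1.0, 2.0, 2.0, 1.0, 2.5, 5.0, 10.0]
print(normalize_block_sizes(frame_b))  # [1.0, 2.0, 2.0, 1.0, 2.5, 5.0, 10.0]
```

Both frames yield the same normalized values even though their absolute sizes differ by a factor of ten, which is the reason for normalizing within each frame.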

FIG. 10 is a diagram exemplarily illustrating, on a block basis, the operation of the first object region information derivation unit according to an embodiment of the present invention.

In one embodiment of the present invention, as shown in FIG. 10, the size (absolute value) of the bitstream data for each block in a preset number of frames (five in the case of FIG. 9) is normalized based on the overall data size information within the corresponding frame.

For example, the sizes (absolute values) of the bitstream data for the seven blocks of the first frame may be S11, S12, S13, S14, S15, S16, and S17. From S11 to S17, the normalized sizes NS11, NS12, NS13, NS14, NS15, NS16, and NS17 can be derived. The important point is that the normalization reference is the bitstream data size information of that same frame. This process enables more accurate object region detection. The normalization itself may use the approach described with reference to FIG. 9.

In this way, five sets of information can be derived, as in the second column of FIG. 10, each containing the normalized bitstream data size of every block in one of the five frames.

Then, as in the third column of FIG. 10, the final size information of each block can be derived by summing that block's normalized size information across the five sets.

For example, for the frame shown in the third column of FIG. 10, AS1 may take the form AS1 = f(NS11, NS21, NS31, NS41, NS51, NS61, NS71), or simply NS11+NS21+NS31+NS41+NS51+NS61+NS71.

Thereafter, the first object region information can be derived based on the blocks whose comprehensive block data size determination value (AS1 to AS7) meets a preset criterion (for example, at or above, or exceeding, a specific value).
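The aggregation and thresholding described above can be sketched as follows. This is an illustration only: it assumes max-based normalization within each frame, a plain sum as the function f, and a "greater than or equal" criterion; the function names and the threshold value are hypothetical.

```python
def normalize(sizes, scale=10.0):
    """Normalize one frame's block sizes by that frame's largest block (assumed)."""
    peak = max(sizes)
    return [scale * s / peak if peak else 0.0 for s in sizes]


def aggregate_block_sizes(frames):
    """Sum each block's normalized size across all frames (the AS values)."""
    normalized = [normalize(frame) for frame in frames]
    # zip(*...) regroups the data so that each tuple holds one block's
    # normalized sizes over every frame.
    return [sum(per_block) for per_block in zip(*normalized)]


def first_object_blocks(frames, threshold):
    """Return indices of blocks whose aggregated value meets the threshold."""
    totals = aggregate_block_sizes(frames)
    return [i for i, total in enumerate(totals) if total >= threshold]


# Five frames with identical, made-up block sizes (seven blocks each).
frames = [[20, 40, 40, 20, 50, 100, 200] for _ in range(5)]
print(aggregate_block_sizes(frames))      # [5.0, 10.0, 10.0, 5.0, 12.5, 25.0, 50.0]
print(first_object_blocks(frames, 12.0))  # [4, 5, 6]
```

Block indices here are zero-based, so index 4 corresponds to the fifth block.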

That is, the object region detection step includes: deriving a comprehensive block data size determination value for each block, based on the size information of the bitstream data of each block in each of a preset number of frames; and deriving first object region information based on information on the blocks whose comprehensive block data size determination value meets a preset criterion. The object region information is determined based on information that includes the first object region information.

Preferably, the size information of the bitstream data of a block is a value normalized based on the overall data size information within the respective frame.

FIG. 11 is a diagram exemplarily illustrating, on a block basis, the operation of the second object region information derivation unit according to an embodiment of the present invention.

In a preferred embodiment of the present invention, the object region detection step performed by the second object region information derivation unit 2200 includes, for each of a preset number of frames, assigning a first value as the motion vector determination value of each block whose motion vector magnitude meets a preset criterion, and assigning a second value as the motion vector determination value of each block whose motion vector magnitude does not meet the criterion.

For example, suppose that, for each block of five frames, a value of 2 is assigned when the block's motion vector magnitude meets a preset criterion (for example, at or above a specific value) and 0 is assigned otherwise. Then five motion vector determination value maps, one per frame, can be derived, as in the first column of FIG. 11.

Thereafter, a step of deriving a comprehensive motion vector determination value for each block, based on that block's motion vector determination values across the preset number of frames, is performed.

For example, when the motion vector determination values of each block are summed across the frames, the block at position (2, 2) may have a comprehensive motion vector determination value of 10 (2+2+2+2+2), whereas the block at position (3, 3) may have a value of 8 (2+2+2+2).

Thereafter, a step of deriving second object region information based on information on the blocks whose comprehensive motion vector determination value meets a preset criterion is performed. For example, if the preset criterion is a value of 8 or more, the blocks (2,2), (2,3), (3,2), and (3,3) in FIG. 11 may correspond to the second object region information, and the final object region information may be determined based on information that includes the second object region information.
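The three steps above (per-frame determination maps, per-block summation, thresholding) can be sketched as follows. The values 2 and 0 and the threshold 8 follow the example in the text; the function names, the 4x4 grid, and the motion vectors are made up for illustration, and positions in the code are zero-based.

```python
import math


def decision_map(mv_frame, mag_threshold, hit=2, miss=0):
    """Map each block's motion vector (dx, dy) to `hit` or `miss` by magnitude."""
    return [[hit if math.hypot(dx, dy) >= mag_threshold else miss
             for (dx, dy) in row]
            for row in mv_frame]


def aggregate_maps(maps):
    """Sum the determination values of each block over all frames."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) for c in range(cols)]
            for r in range(rows)]


def second_object_blocks(mv_frames, mag_threshold, total_threshold):
    """Return (row, col) of blocks whose summed value meets the threshold."""
    totals = aggregate_maps([decision_map(f, mag_threshold) for f in mv_frames])
    return [(r, c)
            for r, row in enumerate(totals)
            for c, value in enumerate(row)
            if value >= total_threshold]


def make_frame(moving_cells, size=4):
    """Build a size x size grid of motion vectors; listed cells move by (3, 4)."""
    return [[(3, 4) if (r, c) in moving_cells else (0, 0) for c in range(size)]
            for r in range(size)]


core = {(1, 1), (1, 2), (2, 1), (2, 2)}
# Block (2, 2) moves in only four of the five frames, so it sums to 8.
frames = [make_frame(core)] * 4 + [make_frame(core - {(2, 2)})]
print(second_object_blocks(frames, mag_threshold=1.0, total_threshold=8))
# [(1, 1), (1, 2), (2, 1), (2, 2)]
```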

In another embodiment of the present invention, the step of deriving the second object region information may derive it based on whether each block's comprehensive motion vector determination value meets a preset criterion and on grouping information derived from the direction of each block's motion vector. As one example, among the blocks whose motion vectors are at or above a preset magnitude, identified through a process such as that of FIG. 11, the second object region information may be derived from information on blocks that are grouped because the difference between their motion vector directions is within a preset range.
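One way to picture direction-based grouping is sketched below. The specification does not define the grouping procedure, so this sketch makes its own assumptions: blocks are grouped only when they are 4-neighbors on the block grid and their motion vector directions differ by at most `max_diff` radians. All names are hypothetical.

```python
import math


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def group_by_direction(vectors, min_mag, max_diff):
    """Group neighboring blocks that move in similar directions.

    vectors: dict mapping (row, col) -> (dx, dy).
    Returns a list of sets of block positions.
    """
    # Keep only blocks whose motion vector is large enough.
    active = {pos: math.atan2(dy, dx)
              for pos, (dx, dy) in vectors.items()
              if math.hypot(dx, dy) >= min_mag}
    groups, seen = [], set()
    for start in sorted(active):
        if start in seen:
            continue
        group, stack = set(), [start]
        seen.add(start)
        while stack:  # depth-first flood fill over similar neighbors
            r, c = stack.pop()
            group.add((r, c))
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (nb in active and nb not in seen
                        and angle_diff(active[(r, c)], active[nb]) <= max_diff):
                    seen.add(nb)
                    stack.append(nb)
        groups.append(group)
    return groups


vectors = {(0, 0): (5, 0), (0, 1): (5, 1), (1, 1): (-5, 0), (3, 3): (0, 0)}
print(group_by_direction(vectors, min_mag=1.0, max_diff=math.radians(30)))
```

In this example the two blocks moving roughly rightward form one group, the block moving leftward forms its own group, and the stationary block is discarded.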

That is, the object region detection step derives the object region information based on the data size information of blocks in a preset number of frames and on the magnitudes and directions of motion vectors in a preset number of frames.

In addition, in one embodiment of the present invention, a known filter or the like may additionally be applied to the region derived in the object region detection step (more specifically, the step performed by the second object region information derivation unit 2200) from the motion vector magnitude, or from the motion vector magnitude and direction, in order to remove noise from the object region.

FIG. 12 is a diagram showing an example of a video frame and detailed data processing maps resulting from the operation of the first object region information derivation unit according to an embodiment of the present invention.

The upper-left picture of FIG. 12 shows an example frame image.

The upper-right image of FIG. 12 shows an image map of the blocks whose macroblock size falls within the top 70%.

The lower-left image of FIG. 12 shows an image map of the summed normalized bitstream data sizes of the macroblocks over seven frames.

The lower-right image of FIG. 12 shows an image map of the blocks whose final size information, as described with reference to FIG. 10, is at or above a preset criterion.

FIG. 13 is a diagram showing an example of a video frame and detailed data processing maps resulting from the operation of the second object region information derivation unit according to an embodiment of the present invention, and FIG. 14 is a diagram showing an example of object region detection by the object region detection method according to an embodiment of the present invention.

2. Traffic Information or Pedestrian Information Analysis Algorithm Using Decoding Parameters

FIG. 15 schematically illustrates the overall steps of a method for analyzing traffic information or pedestrian information according to an embodiment of the present invention.

The traffic or pedestrian information analysis method of the present invention is performed in a computing system that includes one or more processors and a main memory storing instructions executable by the processors.

The computing system may include some or all of the modules for performing the object region detection method described with reference to FIGS. 1 to 14.

The traffic or pedestrian information analysis method according to embodiments of the present invention includes: a motion frame determination step (S100) of determining, without decoding a first analysis frame that includes a target frame of the video data, whether the target frame is a frame with motion, based on block data size information or video decoding parameters; a first object derivation step (S200) of decoding the target frame to extract a decoded image and performing object detection on the decoded image with a deep-learning-based first machine learning model to derive one or more first objects; a comparison object region derivation step (S300) of detecting one or more comparison object regions inside the target frame, based on block data size information or video decoding parameters for a second analysis frame that includes the target frame of the video data; a second object derivation step (S400) of performing object detection with a deep-learning-based second machine learning model on the decoded image of each comparison object region, among the one or more comparison object regions, in which no first object exists, to derive one or more second objects; and an analysis step (S500) of treating the first objects and the second objects as vehicle or pedestrian objects, tracking them, and analyzing traffic or pedestrian information.

In step S100, without decoding the first analysis frame that includes the target frame of the video data, whether the target frame is a frame with motion is determined based on block data size information or video decoding parameters.

In step S100, for a single frame (the target frame), whether the frame contains motion may be determined based on macroblock data sizes, using the first object region information derivation unit 2100 of FIG. 5, or based on block motion vectors, using the second object region information derivation unit 2200 of FIG. 5.

Alternatively, in another embodiment of the present invention, for a plurality of frames that include the single target frame (the first analysis frame), whether the target frame contains motion may be determined based on macroblock data sizes, using the first object region information derivation unit 2100 of FIG. 5, or based on block motion vectors, using the second object region information derivation unit 2200 of FIG. 5.

Step S100 may use the method described above with reference to FIGS. 1 to 14, which relies on decoding parameters and block data size information.

For example, when the determination is based on a single target frame, whether the target frame contains motion can be determined from the sum of the data sizes of the macroblocks that make up the frame, or from the sum of the motion vector magnitudes of its blocks.

Alternatively, when the determination is based on a plurality of frames, whether the target frame contains motion can be determined for the target frame and a plurality of frames preceding it, from the sum of macroblock data sizes or the sum of block motion vector magnitudes described with reference to FIGS. 9 to 11.
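The single-frame and multi-frame variants of the motion check can be sketched as below, using the sum of motion vector magnitudes. The thresholds, function names, and sample vectors are hypothetical; an analogous check could sum macroblock data sizes instead.

```python
import math


def has_motion_single(motion_vectors, threshold):
    """Single-frame check: sum of motion vector magnitudes in the target frame."""
    total = sum(math.hypot(dx, dy) for dx, dy in motion_vectors)
    return total >= threshold


def has_motion_windowed(frames_motion_vectors, threshold):
    """Multi-frame check: sum over the target frame and its preceding frames."""
    total = sum(math.hypot(dx, dy)
                for frame in frames_motion_vectors
                for dx, dy in frame)
    return total >= threshold


# Made-up motion vectors, one list per frame (last entry = target frame).
window = [[(3, 4)], [(0, 0)], [(6, 8)]]
print(has_motion_single(window[-1], threshold=5.0))  # True
print(has_motion_windowed(window, threshold=20.0))   # False
```

Only frames that pass this check would proceed to decoding and model inference, which is where the computational saving comes from.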

In other embodiments of the present invention, whether the target frame contains motion may be determined in various ways other than those described above. Preferably, to increase computation speed, the determination is made before the target frame is decoded, based on information contained in the NAL units. However, in another embodiment of the present invention, the target frame may be decoded first and the presence of motion determined from the decoded image.

In step S200, the target frame is decoded to extract a decoded image, and object detection is performed on the decoded image with a deep-learning-based first machine learning model to derive one or more first objects.

The first machine learning model is a model of the kind generally used to analyze traffic or pedestrian information. It can be implemented with a known artificial neural network such as a CNN, and a detailed description is omitted.

In step S200, objects are detected in the conventional manner in frames determined to contain motion, and the detected objects are designated as first objects.

However, a first machine learning model trained on human-labeled data may fail to detect every object as a first object as accurately as a human would, for example when an object is occluded or when the scene is affected by environmental factors such as ambient lighting.

The present invention compensates for this using decoding parameters: in step S300, one or more comparison object regions inside the target frame are detected based on block data size information or video decoding parameters for the second analysis frame that includes the target frame of the video data.

In one embodiment of the present invention, the comparison object region detection step may detect the comparison object regions based on at least one of bitstream data size information and motion vector information for the blocks in the target frame and in one or more frames preceding the target frame.

In one embodiment of the present invention, the comparison object region detection step may include: deriving a comprehensive block data size determination value for each block by summing the size information of that block's bitstream data over the target frame and one or more frames preceding the target frame; and deriving the comparison object regions based on information on the blocks whose comprehensive block data size determination value meets a preset criterion.

Preferably, the size information of the bitstream data of a block may be a value normalized based on the overall data size information within the respective frame.

This process may be implemented by the method described with reference to FIG. 10.

For example, if the frame currently under analysis is the 100th frame, the size information of each macroblock is summed over the 96th, 97th, 98th, 99th, and 100th frames to derive the comparison object regions for the 100th frame. When noise temporarily inflates the data size of a macroblock, comparison object region detection based on a single frame can be erroneous; summing over several frames suppresses such noise-induced errors.
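The five-frame window described above (frames 96 to 100 for target frame 100) can be maintained incrementally as frames arrive. The sketch below uses a bounded `collections.deque` for this; the class name, the threshold, and the sample values are hypothetical, and the per-block inputs are assumed to be already normalized.

```python
from collections import deque


class BlockSizeWindow:
    """Rolling window of per-block normalized sizes over the last N frames."""

    def __init__(self, window=5):
        # deque(maxlen=...) drops the oldest frame automatically.
        self.frames = deque(maxlen=window)

    def push(self, normalized_sizes):
        """Add the newest frame's per-block normalized sizes."""
        self.frames.append(list(normalized_sizes))

    def totals(self):
        """Per-block sums over the frames currently in the window."""
        return [sum(per_block) for per_block in zip(*self.frames)]

    def region_blocks(self, threshold):
        """Indices of blocks whose windowed sum meets the threshold."""
        return [i for i, t in enumerate(self.totals()) if t >= threshold]


win = BlockSizeWindow(window=5)
# Block 0 is consistently active; block 1 spikes once (noise); block 2 is idle.
for sizes in ([5, 0, 0], [5, 0, 0], [5, 10, 0], [5, 0, 0], [5, 0, 0]):
    win.push(sizes)

print(win.totals())           # [25, 10, 0]
print(win.region_blocks(20))  # [0]
```

The one-frame spike in block 1 would pass a single-frame threshold of 10 but does not reach the windowed threshold, which illustrates the noise suppression the text describes.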

In one embodiment of the present invention, the comparison object region detection step may include: for each of the target frame and one or more frames preceding the target frame, assigning a first value as the motion vector determination value of each block whose motion vector magnitude meets a preset criterion, and assigning a second value as the motion vector determination value of each block whose motion vector magnitude does not meet the criterion; deriving a comprehensive motion vector determination value for each block by accumulating that block's motion vector determination values over the target frame and the one or more preceding frames; and deriving, as the comparison object regions, the blocks whose comprehensive motion vector determination value meets a preset criterion.

In another embodiment of the present invention, instead of using the first and second values, the motion vector determination values of the blocks themselves may be cumulatively summed to derive the comprehensive motion vector determination value, which is then compared with a preset criterion to derive the blocks that form the comparison object regions.

Preferably, the comparison object region detection step may derive the second object region information based on whether each block's comprehensive motion vector determination value meets a preset criterion and on grouping information derived from the direction of each block's motion vector.

This process may be implemented by the method described with reference to FIG. 11.

For example, if the frame currently under analysis is the 100th frame, the motion magnitude information of each block is summed over the 96th, 97th, 98th, 99th, and 100th frames to derive the comparison object regions for the 100th frame. When noise temporarily inflates the motion vector magnitude of some blocks, comparison object region detection based on a single frame can be erroneous; summing over several frames suppresses such noise-induced errors.

In another embodiment of the present invention, the comparison object regions detected from macroblock size information as in FIG. 10 and the comparison object regions detected from motion vector information as in FIG. 11 may be combined to form the final comparison object regions.

In step S400, the second object derivation step is performed: object detection with a deep-learning-based second machine learning model is performed on the decoded image of each comparison object region, among the one or more comparison object regions, in which no first object exists, to derive one or more second objects.

Specifically, step S400 determines whether first objects exist in the one or more comparison object regions detected in the target frame. If some comparison object region contains no first object, that region is regarded as possibly containing an object, such as a pedestrian or a vehicle, that the first machine learning model failed to detect, and object derivation by the second machine learning model is performed on it.
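The region selection in step S400 can be sketched as a filter over the comparison object regions. This sketch assumes that regions and detected objects are axis-aligned boxes `(x0, y0, x1, y1)` and that "a first object exists in the region" means the boxes overlap; the specification does not fix either choice, and the names are hypothetical.

```python
def boxes_overlap(a, b):
    """True if two boxes (x0, y0, x1, y1) share any interior area."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def regions_needing_second_pass(comparison_regions, first_objects):
    """Comparison regions that contain no first object (assumed: no overlap)."""
    return [region for region in comparison_regions
            if not any(boxes_overlap(region, obj) for obj in first_objects)]


comparison_regions = [(0, 0, 10, 10), (20, 20, 30, 30)]
first_objects = [(2, 2, 8, 8)]  # the first model found one object

print(regions_needing_second_pass(comparison_regions, first_objects))
# [(20, 20, 30, 30)]
```

Only the returned regions would be passed to the second machine learning model, so the heavier second pass runs on a small part of the frame.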

In step S500, the analysis step is performed: the first objects and the second objects are treated as vehicle or pedestrian objects and tracked, and traffic or pedestrian information is analyzed.

Using the recognized objects and their tracking information, the analysis step can perform analyses in various fields, such as the following.

(1) Smart intersection system

Detailed analysis items: through-traffic volume counts by direction, occupancy measurement, space occupancy information for actuated traffic movements, U-turn vehicle detection, vehicle type classification, number of vehicles in the initial queue, queue length, jaywalking detection, stopped vehicle detection, intersection-blocking detection

(2) Traffic information collection system (VDS)

Detailed analysis items: traffic volume counts, vehicle speed measurement, occupancy measurement, stopped vehicle detection, pedestrian detection, crashed vehicle detection

(3) Incident detection system

Detailed analysis items: stopped vehicle detection, pedestrian detection, wrong-way vehicle detection

(4) Crosswalk pedestrian safety system

Detailed analysis items: detection of pedestrians waiting to cross, detection of pedestrians on the sidewalk

(5) Alley pedestrian safety system

Detailed analysis items: collision prediction situation detection, collision prediction alert system

FIG. 16 illustrates the frames used to detect a comparison object region according to an embodiment of the present invention.

In one embodiment of the present invention, a comparison object region can be detected using the macroblock data sizes and block motion vectors of a single frame.

However, with a single frame, a comparison object region may be detected incorrectly when the motion vector or macroblock size in that frame spikes temporarily because of noise or the video compression process.

In a preferred embodiment of the present invention, to solve this problem, when the n-th frame is the frame under analysis as in FIG. 16, the macroblock data sizes or motion vector information of the (n-1)-th, (n-2)-th, ..., (n-k)-th frames are summed to detect the comparison object region of the frame under analysis, which enables more accurate comparison object region detection.

FIG. 17 exemplarily illustrates an object detection process according to an embodiment of the present invention.

FIG. 17 shows a case in which there are actually three objects, the first objects include all three, and the comparison object regions also contain the three first objects. This corresponds to the case where both the first objects from the first machine learning model and the comparison object regions are detected accurately.

FIG. 18 exemplarily illustrates an object detection process according to an embodiment of the present invention.

In FIG. 18, the first machine learning model accurately detected three first objects, but the comparison object regions derived from decoding parameters failed to capture the person object. In this case, all three first objects are selected as final objects, so that a failure in comparison object region detection does not lead to a failure in actual object detection.

FIG. 19 exemplarily illustrates an object detection process according to an embodiment of the present invention.

In FIG. 19, the first machine learning model missed the person object, but the decoding-parameter-based detection identified the blocks containing the person object as a comparison object region.

In this case, additional object region detection is performed by the second machine learning model on the comparison object region containing the person. Preferably, this additional detection is performed on the decoded image corresponding to the comparison object region.

FIG. 20 schematically illustrates the operation of the first machine learning model and the second machine learning model according to an embodiment of the present invention.

In one embodiment of the present invention, the decoded image input to the first machine learning model in the first object derivation step includes a compressed version of the original image, and the decoded image of the comparison object region input to the second machine learning model in the second object derivation step includes either the original image or a compressed image of relatively higher quality than the compressed image input in the first object derivation step.

즉, 영상 영역전체에 대하여 메인으로 객체를 검지하는 제1객체도출단계에서는 압축된 영상을 이용함으로써, 연산속도를 고속화한다. 이 경우, 객체영역의 오탐지가 발생할 수 있다. 이와 같이 오탐지가 된 부분을 디코딩파라미터를 이용하여 고속으로 탐지할 수 있는 비교객체영역을 이용하여 찾아낸다. That is, in the first object derivation step for detecting objects in the entire image area, the operation speed is increased by using the compressed image. In this case, false detection of the object area may occur. In this way, the falsely detected part is found using the comparison object area that can be detected at high speed using the decoding parameter.

제1객체가 내부에 포함되지 않은 비교객체영역의 경우, 압축된 영상에 의한 제1기계학습모델의 탐지에 의하여 누락될 수 있기 때문에, 제2기계학습모델에는 압축을 하지 않은 원본영상 혹은 제1기계학습모델에 입력되는 영상의 압축률보다 더 낮은 압축률(즉, 더 좋은 화질 혹은 높은 해상도)의 비교객체영역의 영상을 제2기계학습모델에 입력함으로써, 부분적으로 객체검출 정확도를 높일 수 있다.In the case of the comparison object area where the first object is not included, it may be missed by the detection of the first machine learning model by the compressed image, so the second machine learning model has an uncompressed original image or the first Object detection accuracy may be partially increased by inputting an image of the comparison object region having a compression rate lower than that of the image input to the machine learning model (ie, better quality or higher resolution) to the second machine learning model.
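As a minimal sketch of how a comparison object region found on the lower-quality image might be cut out of a higher-resolution image for the second machine learning model, the following assumes that the two images differ only in resolution and that images are stored as row-major lists of pixel rows. The names `map_region` and `crop` are hypothetical, and the embodiment does not prescribe this mapping.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x1, y1, x2, y2) pixel box
Size = Tuple[int, int]           # (width, height)


def map_region(region: Box, low_size: Size, high_size: Size) -> Box:
    """Scale a box from low-resolution coordinates to high-resolution ones.

    The box is rounded outward and clamped so the crop never loses pixels
    of the region and never leaves the high-resolution image.
    """
    sx = high_size[0] / low_size[0]
    sy = high_size[1] / low_size[1]
    x1, y1, x2, y2 = region
    return (math.floor(x1 * sx), math.floor(y1 * sy),
            min(high_size[0], math.ceil(x2 * sx)),
            min(high_size[1], math.ceil(y2 * sy)))


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Return the pixels of image that fall inside box."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in image[y1:y2]]


# A region found on a 640x360 image, mapped onto a 1920x1080 image.
high_box = map_region((100, 50, 200, 150), (640, 360), (1920, 1080))
```

Only the cropped area is passed to the second model, so the cost of processing high-quality pixels is limited to the regions that the first pass may have missed.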

FIG. 21 schematically illustrates the operation of the first machine learning model and the second machine learning model according to an embodiment of the present invention.

In one embodiment of the present invention, the second machine learning model requires a relatively higher computational load than the first machine learning model and can perform more accurate detection.

The embodiment of FIG. 21 can be used together with the method of FIG. 20, in which the images input to the first machine learning model and the second machine learning model differ in compression quality.

That is, the second machine learning model may receive a decoded image of better quality than the first machine learning model, and may also have more filters and layers, thereby achieving higher detection accuracy.

FIG. 22 schematically illustrates the internal configuration of a traffic or walking information analysis system according to an embodiment of the present invention.

A traffic or walking information analysis system according to an embodiment of the present invention may be implemented as a computing system including one or more processors and a main memory storing instructions executable by the processors.

The traffic or walking information analysis system 4000 may include: a motion frame determination unit 4100 that, without decoding a first analysis frame including a target frame of the image data, determines whether the target frame is a frame with motion based on size information of block data or image decoding parameters; a first object region derivation unit 4200 that decodes the target frame to extract a decoded image and derives one or more first objects by performing object detection on the decoded image using a deep learning-based first machine learning model; a comparison object region derivation unit 4300 that detects one or more comparison object regions inside the target frame based on size information of block data or image decoding parameters with respect to a second analysis frame including the target frame of the image data; a second object region derivation unit 4400 that derives one or more second objects by performing object detection, using a deep learning-based second machine learning model, on the decoded image of a comparison object region, among the one or more comparison object regions, in which no first object exists; and an analysis unit 4500 that determines the first objects and the second objects to be vehicle or pedestrian objects, performs tracking, and analyzes traffic or walking information.

The description of these components corresponds to the steps described with reference to FIGS. 15 to 20.

FIG. 23 schematically illustrates an encoder system that generates encoded images according to an embodiment of the present invention.

The object region detection unit and the image decoding unit according to an embodiment of the present invention described above can be applied to image data encoded using variable-size blocks, as shown in FIG. 6. A representative example is image data encoded with the H.264 codec.

To generate the data shown as image data in FIGS. 1 and 2, the encoder 10 shown in FIG. 23 may include a discrete cosine transform (DCT) unit 11, a quantization unit 12, an inverse quantization (IQ) unit 13, an inverse discrete cosine transform (IDCT) unit 14, a frame memory 15, a motion estimation and compensation (ME/MC) unit 16, and a variable length coding (VLC) unit 17. Likewise, the image decoding unit is preferably configured to correspond to the configuration of the encoder.

Briefly, the DCT unit 11 performs a DCT operation on the input image data in units of pixel blocks of a predetermined size (for example, 4×4) in order to remove spatial correlation.
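For illustration only, a textbook floating-point two-dimensional DCT-II on a 4×4 block can be sketched as below. This is not the transform of any particular codec (H.264, for instance, uses an integer approximation of the DCT); the function name `dct_2d` and the orthonormal scaling are choices made for this sketch.

```python
import math
from typing import List


def dct_2d(block: List[List[float]]) -> List[List[float]]:
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) form)."""
    n = len(block)

    def alpha(k: int) -> float:
        # Scaling that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


# A perfectly flat block: all of its energy ends up in the single DC term.
flat = [[10.0] * 4 for _ in range(4)]
coeffs = dct_2d(flat)
```

For the flat block, the only non-zero coefficient is `coeffs[0][0]`, which shows how the transform concentrates spatially correlated pixel values into a few coefficients.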

The quantization unit 12 then quantizes the DCT coefficients obtained by the DCT unit 11 and represents them with a small number of representative values, thereby performing highly efficient lossy compression.

The inverse quantization unit 13 inversely quantizes the image data quantized by the quantization unit 12.

The inverse transform unit 14 performs an IDCT on the image data inversely quantized by the inverse quantization unit 13.
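The lossy nature of the quantization and inverse quantization described above can be illustrated with a uniform scalar quantizer. The single step size used here is an assumption for the sketch; actual codecs derive the scaling from a quantization parameter and per-coefficient tables.

```python
from typing import List


def quantize(coeffs: List[List[float]], step: float) -> List[List[int]]:
    """Map each coefficient to the index of its nearest multiple of step."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels: List[List[int]], step: float) -> List[List[float]]:
    """Reconstruct representative values from the quantization indices."""
    return [[level * step for level in row] for row in levels]


# Hypothetical coefficient values, chosen only for illustration.
coeffs = [[40.0, 3.2], [-7.9, 0.4]]
step = 4.0
levels = quantize(coeffs, step)
reconstructed = dequantize(levels, step)
```

The reconstructed values differ from the input by at most half a step, and small coefficients collapse to zero, which is where most of the compression gain comes from.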

The frame memory 15 stores the image data IDCT-transformed by the inverse transform unit 14 in units of frames.

Meanwhile, the motion estimation and compensation unit 16 estimates a motion vector (MV) for each macroblock using the image data of the input current frame and the image data of the previous frame stored in the frame memory 15, and calculates the sum of absolute differences (SAD), which corresponds to the block matching error.
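As one possible illustration of block matching with the SAD criterion, the sketch below performs an exhaustive search within a small window. The full-search strategy, the function names, and the `(dx, dy)` vector convention are assumptions for this example; practical encoders typically use faster search patterns.

```python
from typing import List, Tuple

Frame = List[List[int]]  # row-major luma samples


def get_block(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def estimate_motion(cur: Frame, prev: Frame, top: int, left: int,
                    size: int, search: int) -> Tuple[Tuple[int, int], int]:
    """Full search: return ((dx, dy), sad) of the best match in prev."""
    target = get_block(cur, top, left, size)
    height, width = len(prev), len(prev[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block would fall outside the frame
            cost = sad(target, get_block(prev, t, l, size))
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost


# A bright 2x2 patch moves from (row 1, col 1) to (row 2, col 3).
prev = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
for r, c in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    prev[r][c] = 200
for r, c in [(2, 3), (2, 4), (3, 3), (3, 4)]:
    cur[r][c] = 200
mv, cost = estimate_motion(cur, prev, top=2, left=3, size=2, search=2)
```

Here the vector points from the block in the current frame back to its best match in the previous frame. Blocks covering moving objects tend to have non-zero vectors, which is the kind of decoding parameter the comparison object region derivation relies on.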

The variable length coding unit 17 removes statistical redundancy from the DCT-transformed and quantized data according to the motion vector estimated by the motion estimation and compensation unit 16.
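The embodiment does not state which variable length code is used. As one well-known example, many H.264 syntax elements are coded with unsigned Exp-Golomb codes, which assign shorter codewords to smaller (more frequent) values. The helper name below is hypothetical.

```python
def exp_golomb_ue(value: int) -> str:
    """Unsigned Exp-Golomb codeword for a non-negative integer, as a bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]                   # binary digits of value + 1
    return "0" * (len(binary) - 1) + binary       # prefix of leading zeros


codewords = [exp_golomb_ue(v) for v in range(5)]
```

The values 0 through 4 map to codewords of 1, 3, 3, 5, and 5 bits respectively, so data dominated by small values is represented with fewer bits than a fixed-length code would need.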

FIG. 24 schematically illustrates examples of frames of image data.

The video portion of a typical video consists of I frames (shown as “I” in FIG. 24), P frames (shown as “P” in FIG. 24), and B frames (shown as “B” in FIG. 24).

An I frame is a key frame that contains the entire image, can serve as an access point in a video file, is an independently encoded frame, and has a low compression ratio.

A P frame, in contrast, is created by forward prediction with reference to a previous I frame or P frame and is not an independently encoded frame. Such a P frame has a higher compression ratio than an I frame. Here, a “previous” frame means not only the immediately preceding frame but any one of the frames that precede the frame in question, and a “subsequent” frame means not only the immediately following frame but any one of the frames that follow it.

A B frame is created by forward and backward prediction with reference to previous and subsequent frames and is not an independently encoded frame. Such a B frame has a higher compression ratio than I and P frames. Accordingly, the independently encoded frames correspond to I frames, and the non-independently encoded frames correspond to the remaining B frames or P frames.

The B and P frames correspond to reference frames, that is, frames encoded with reference to other frames, and preferably the object region detection unit performs object region detection on these frames.
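A minimal sketch of selecting the frames on which object region detection would preferably run is given below. Representing the frame sequence as a list of single-letter type labels is an assumption for this example; in practice the frame type is read from the bitstream.

```python
from typing import List


def frames_for_region_detection(frame_types: List[str]) -> List[int]:
    """Return the indices of the non-independently encoded (P or B) frames."""
    return [i for i, t in enumerate(frame_types) if t in ("P", "B")]


# Hypothetical frame sequence in display order.
sequence = list("IBBPBBPI")
selected = frames_for_region_detection(sequence)
```

The I frames at both ends are skipped because they carry no motion vectors relative to other frames, while every P and B frame is selected.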

FIG. 25 illustrates, by way of example, the internal configuration of a computing device according to an embodiment of the present invention. All or some of the components shown in FIG. 2 may include components of the computing device described below.

As shown in FIG. 25, the computing device 11000 may include at least one processor 11100, a memory 11200, a peripheral interface 11300, an input/output (I/O) subsystem 11400, a power circuit 11500, and a communication circuit 11600.

The memory 11200 may include, for example, high-speed random access memory, a magnetic disk, SRAM, DRAM, ROM, flash memory, or non-volatile memory. The memory 11200 may hold software modules, instruction sets, or various other data required for the operation of the computing device 11000.

Access to the memory 11200 by other components, such as the processor 11100 or the peripheral interface 11300, may be controlled by the processor 11100. The processor 11100 may consist of a single processor or multiple processors, and may include GPU-type and TPU-type processors to improve processing speed.

The peripheral interface 11300 may couple input and/or output peripherals of the computing device 11000 to the processor 11100 and the memory 11200. The processor 11100 may perform various functions for the computing device 11000 and process data by executing the software modules or instruction sets stored in the memory 11200.

The input/output subsystem 11400 may couple various input/output peripherals to the peripheral interface 11300. For example, the input/output subsystem 11400 may include a controller for coupling peripherals such as a monitor, keyboard, mouse, or printer, or, as needed, a touch screen or sensor, to the peripheral interface 11300. According to another aspect, input/output peripherals may be coupled to the peripheral interface 11300 without passing through the input/output subsystem 11400.

The power circuit 11500 may supply power to all or some of the components of the terminal. For example, the power circuit 11500 may include a power management system, one or more power sources such as a battery or alternating current (AC), a charging system, a power failure detection circuit, a power converter or inverter, a power status indicator, or any other components for generating, managing, and distributing power.

The communication circuit 11600 may enable communication with other computing devices using at least one external port.

Alternatively, as described above, the communication circuit 11600 may, as needed, include an RF circuit and enable communication with other computing devices by transmitting and receiving RF signals, also known as electromagnetic signals.

The embodiment of FIG. 25 is only one example of the computing device 11000. The computing device 11000 may omit some of the components shown in FIG. 25, may further include additional components not shown in FIG. 25, or may have a configuration or arrangement that combines two or more components. For example, a computing device for a communication terminal in a mobile environment may further include a touch screen, a sensor, or the like in addition to the components shown in FIG. 25, and the communication circuit 11600 may include circuits for RF communication using various communication schemes (WiFi, 3G, LTE, Bluetooth, NFC, Zigbee, etc.). The components that may be included in the computing device 11000 may be implemented as hardware including one or more signal processing or application-specific integrated circuits, as software, or as a combination of hardware and software.

Methods according to embodiments of the present invention may be implemented in the form of program instructions executable by various computing devices and recorded on a computer-readable medium. In particular, the program according to the present embodiment may be a PC-based program or an application dedicated to mobile terminals. An application to which the present invention is applied may be installed on a user terminal through a file provided by a file distribution system. For example, the file distribution system may include a file transmission unit (not shown) that transmits the file at the request of the user terminal.

According to an embodiment of the present invention, when analysis such as object recognition, tracking, or identification is performed on encoded image data, object regions are detected in advance and image analysis is performed only on the detected object regions, so that images can be processed at a higher speed.

According to an embodiment of the present invention, the amount of computation required to detect object regions before image analysis is also reduced, so that the computation speed of the system as a whole can be increased.

According to an embodiment of the present invention, object regions in which an object is judged likely to exist can be detected with a fairly high level of accuracy despite the minimal amount of computation used for object region detection.

According to an embodiment of the present invention, the parameters generated during decoding are used without changing the configuration of the decoder unit that decodes the image data, so the invention can be readily applied even if the codec changes, provided that the codec is a block-based scheme conforming to a standard such as H.264, H.265 (HEVC), or H.266 (VVC).

According to an embodiment of the present invention, object regions are detected quickly and with high accuracy regardless of environmental factors such as the characteristics of the recording device or the recording location, so the invention can be used for real-time image analysis, reading, and detection, such as detection in high-resolution CCTV video.

The device described above may be implemented as hardware components, software components, and/or a combination of hardware components and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the processing device is sometimes described as a single device, but those skilled in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computing devices and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, those skilled in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims set out below.

Claims

A traffic or walking information analysis method performed in a computing system including one or more processors and a main memory storing instructions executable by the processors, the method comprising:
a motion frame determination step of determining, with respect to a first analysis frame including a target frame of image data, whether the target frame is a frame with motion;
a first object derivation step of decoding the target frame, extracting a decoded image, and deriving one or more first objects by performing object detection using a deep learning-based first machine learning model on the decoded image;
a comparison object region derivation step of detecting one or more comparison object regions inside the target frame based on size information of block data or image decoding parameters with respect to a second analysis frame including the target frame of the image data;
a second object derivation step of deriving one or more second objects by performing object detection, using a deep learning-based second machine learning model, on the decoded image of a comparison object region, among the one or more comparison object regions, in which no first object exists; and
an analysis step of determining the first objects and the second objects to be vehicle or pedestrian objects, performing tracking, and analyzing traffic or walking information.

The method of claim 1,
wherein, in the comparison object region derivation step,
a comparison object region is detected based on at least one of size information and motion vector information of bitstream data for blocks in the target frame and in one or more frames preceding the target frame.

The method of claim 1,
wherein the comparison object region derivation step comprises:
deriving a comprehensive block data size determination value for each block by summing the size information of the bitstream data of each block over the target frame and one or more frames preceding the target frame; and
deriving a comparison object region based on information of the blocks whose comprehensive block data size determination value meets a predetermined criterion.

The method of claim 3,
wherein the size information of the bitstream data of each block is a value normalized on the basis of the overall data size information of the corresponding frame.

The method of claim 1,
wherein the comparison object region derivation step comprises:
assigning, for each of the target frame and one or more frames preceding the target frame, a motion vector judgment value of a first numerical value to each block whose motion vector magnitude meets a predetermined criterion, and a motion vector judgment value of a second numerical value to each block whose motion vector magnitude does not meet the predetermined criterion;
deriving a comprehensive motion vector judgment value for each block by accumulating the motion vector judgment values of each block over the target frame and the one or more frames preceding the target frame; and
deriving the blocks whose comprehensive motion vector judgment value meets a predetermined criterion as a comparison object region.

The method of claim 5,
wherein the comparison object region derivation step derives second object region information based on grouping information derived according to whether the comprehensive motion vector judgment value of each block meets a preset criterion and according to the direction of the motion vector of each block.

The method of claim 1,
wherein, in the first object derivation step, the decoded image input to the first machine learning model includes an image obtained by compressing the original image, and
in the second object derivation step, the decoded image of the comparison object region input to the second machine learning model includes the original image, or a compressed image of relatively higher quality than the compressed image input in the first object derivation step.

The method of claim 1,
wherein the second machine learning model requires a relatively higher computational load than the first machine learning model and can perform more accurate detection.

A traffic or walking information analysis system implemented as a computing system including one or more processors and a main memory storing instructions executable by the processors, the system comprising:
a motion frame determination unit that, without decoding a first analysis frame including a target frame of image data, determines whether the target frame is a frame with motion based on size information of block data or image decoding parameters;
a first object region derivation unit that decodes the target frame, extracts a decoded image, and derives one or more first objects by performing object detection on the decoded image using a deep learning-based first machine learning model;
a comparison object region derivation unit that detects one or more comparison object regions inside the target frame based on size information of block data or image decoding parameters with respect to a second analysis frame including the target frame of the image data;
a second object region derivation unit that derives one or more second objects by performing object detection, using a deep learning-based second machine learning model, on the decoded image of a comparison object region, among the one or more comparison object regions, in which no first object exists; and
an analysis unit that determines the first objects and the second objects to be vehicle or pedestrian objects, performs tracking, and analyzes traffic or walking information.

A computer program stored on a computer-readable medium and comprising a plurality of instructions executed by one or more processors,
the computer program comprising:
a motion frame determination step of determining, without decoding a first analysis frame including a target frame of image data, whether the target frame is a frame with motion based on size information of block data or image decoding parameters;
a first object derivation step of decoding the target frame, extracting a decoded image, and deriving one or more first objects by performing object detection using a deep learning-based first machine learning model on the decoded image;
a comparison object region derivation step of detecting one or more comparison object regions inside the target frame based on size information of block data or image decoding parameters with respect to a second analysis frame including the target frame of the image data;
a second object derivation step of deriving one or more second objects by performing object detection, using a deep learning-based second machine learning model, on the decoded image of a comparison object region, among the one or more comparison object regions, in which no first object exists; and
an analysis step of determining the first objects and the second objects to be vehicle or pedestrian objects, performing tracking, and analyzing traffic or walking information.