KR20200113477A

KR20200113477A - Method and Apparatus for Processing Image in Compressed Domain

Info

Publication number: KR20200113477A
Application number: KR1020190033708A
Authority: KR
Inventors: 김영근
Original assignee: 에스케이텔레콤 주식회사
Priority date: 2019-03-25
Filing date: 2019-03-25
Publication date: 2020-10-07

Abstract

The present embodiment relates to a method and an apparatus for processing an image in a compressed domain. A video stream transmitted from a CCTV camera is analyzed in the compressed domain to detect the motion of objects, and thumbnail images for a video summary are generated based on the detected motion. Meaningful pictures are thereby stored at intervals suitable for human perception, which improves the quality of the video summary while reducing storage usage. The image processing apparatus comprises an input buffer unit, an extraction unit, an analysis unit, and a motion detection unit.

Description

Method and Apparatus for Processing Image in Compressed Domain

The present embodiment relates to a method and an apparatus for processing an image in a compressed domain. More specifically, it relates to an image processing method and apparatus for more efficiently generating video summary information for very long video such as CCTV footage.

The content described in this section merely provides background information on the present embodiment and does not constitute prior art.

FIG. 1 is a diagram illustrating the structure of a typical video stream. Referring to FIG. 1, a video stream contains IDR pictures at a fixed period, and the unit of pictures from one IDR picture up to the next IDR picture is called a GOP (Group of Pictures). An IDR picture is larger than a B/P picture, but it has the advantage of being easy to store because a single picture constitutes one complete frame. On the other hand, because of its size, it consumes a large amount of bandwidth during network transmission and occupies a large amount of space when stored.

FIG. 2 is a diagram illustrating a conventional procedure for generating thumbnails for a video summary from a video stream. Conventionally, thumbnails for a video summary were generated by storing the IDR pictures periodically included in the video stream. However, this method is not suitable for storing very long video streams such as CCTV footage. In particular, when a large number of sessions are stored in the cloud, it wastes a very large amount of space. Moreover, because the IDR pictures in a CCTV video stream are widely spaced, the conventional method generates thumbnails at time intervals that are too long for human perception. In addition, periodically generated pictures include many meaningless pictures, so a large amount of storage space is wasted when a very long recording such as CCTV footage is stored.

Accordingly, a new technique is needed that, when a video stream transmitted from a CCTV camera is stored together with thumbnails for a video summary, stores meaningful thumbnail images frequently, improving the quality of the video summary while reducing storage usage.

An object of the present embodiment is to provide a method and an apparatus for processing an image in a compressed domain that analyze a video stream transmitted from a CCTV camera in the compressed domain to detect the motion of objects and generate thumbnail images for a video summary based on the detected motion, so that meaningful pictures are stored at intervals suitable for human perception, improving the quality of the video summary while reducing storage usage.

The present embodiment provides an image processing apparatus comprising: an input buffer unit that receives an encoded video stream of a video; an extraction unit that analyzes the encoded video stream in a compressed domain and extracts motion vector information and DCT coefficients from the encoded video stream; an analysis unit that provides analysis results obtained by analyzing the motion vector information and the DCT coefficients; and a motion detection unit that detects the motion of an object in the video based on the analysis result of at least one of the motion vector information and the DCT coefficients.

According to another aspect of the present embodiment, there is provided an image processing method comprising: receiving an encoded video stream of a video; analyzing the encoded video stream in a compressed domain and extracting motion vector information and DCT coefficients from the encoded video stream; providing analysis results obtained by analyzing the motion vector information and the DCT coefficients; and detecting the motion of an object in the video based on the analysis result of at least one of the motion vector information and the DCT coefficients.

According to another aspect of the present embodiment, there is provided a computer program stored in a computer-readable recording medium to execute each step of the image processing method according to claim 12.

According to the present embodiment, a video stream transmitted from a CCTV camera is analyzed in the compressed domain to detect the motion of objects, and thumbnail images for a video summary are generated based on the detected motion. Meaningful pictures are thereby stored at intervals suitable for human perception, which improves the quality of the video summary while reducing storage usage.

FIG. 1 is a diagram illustrating the structure of a typical video stream.
FIG. 2 is a diagram illustrating a conventional procedure for generating thumbnails for a video summary from a video stream.
FIG. 3 is a block diagram schematically illustrating an image processing apparatus according to the present embodiment.
FIG. 4 is a block diagram schematically illustrating the execution unit of the image processing apparatus according to the present embodiment.
FIG. 5 is a flowchart illustrating an image processing method according to the present embodiment.

Hereinafter, the present embodiment is described in detail with reference to the accompanying drawings.

Recently, the number of applications that summarize video and present it as thumbnails has been increasing. The length of the video to be summarized varies, and for very long video such as CCTV footage, usability and storage space depend heavily on the storage method. A CCTV video stream has characteristics that differ from those of a typical video stream. That is, to reduce network bandwidth and storage space, a long GOP is used and the frame rate is lowered, which results in a large interval between IDR pictures. For example, with a GOP of 60 pictures at 15 fps, an IDR picture appears once every 4 seconds. When the fps is low and the GOP is long, as in a CCTV video stream, IDR pictures appear only once every several seconds, so thumbnails taken from them are likely to miss important scenes.
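The 4-second figure follows directly from dividing the GOP length by the frame rate. The short sketch below only restates that arithmetic; the function name is illustrative and it assumes each GOP begins with exactly one IDR picture.

```python
def idr_interval_seconds(gop_length: int, fps: float) -> float:
    """Seconds between consecutive IDR pictures.

    Assumes every GOP starts with exactly one IDR picture, so the
    interval is simply the number of pictures per GOP divided by fps.
    """
    if gop_length <= 0 or fps <= 0:
        raise ValueError("gop_length and fps must be positive")
    return gop_length / fps


# The CCTV example from the text: 60-picture GOP at 15 fps.
cctv_interval = idr_interval_seconds(60, 15)
# A shorter GOP at a higher frame rate, for comparison.
short_gop_interval = idr_interval_seconds(30, 30)
```

With these inputs the CCTV stream yields one IDR picture every 4 seconds, while the 30-picture GOP at 30 fps yields one every second.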

In addition, when a video stream and thumbnail images for a video summary are stored at the same time and images are extracted from the video stream periodically, every picture of the video must be decoded, which uses far more CPU resources. Over long recording periods, meaningless images also continue to be stored, which consumes a very large amount of storage space.

Meanwhile, in a CCTV video stream the background is stationary, so the motion vector information carries the motion of moving objects; if a motion vector is detected, it is therefore likely to be meaningful information. Based on this observation, the present embodiment analyzes the video stream transmitted from a CCTV camera in the compressed domain, detects the motion of objects through motion vectors, and generates thumbnail images for a video summary based on the detected motion. Meaningful pictures are thereby stored at intervals suitable for human perception, which improves the quality of the video summary while reducing storage usage.

FIG. 3 is a block diagram schematically illustrating an image processing apparatus according to the present embodiment.

As shown in FIG. 3, the image processing apparatus 300 according to the present embodiment includes an input buffer unit 310, an execution unit 320, and an image processing unit 330. The components included in the image processing apparatus 300 are not necessarily limited to these. That is, FIG. 3 illustrates by way of example only the components essential for detecting the motion of an object through motion vector extraction in the compressed domain and for generating thumbnail images for a video summary based on that detection. It should be understood that the image processing apparatus 300 may have more or fewer components than shown, or a different configuration of components, in order to implement other functions.

The input buffer unit 310 buffers the video stream of video generated by an image capturing device. In the present embodiment, the image capturing device may preferably be an intelligent surveillance device such as a CCTV device. Consequently, the video stream buffered by the input buffer unit 310 uses a long GOP and has a low frame rate, so the interval between IDR pictures may be large.

The input buffer unit 310 according to the present embodiment may receive, from the image capturing device, a video stream encoded according to a preset compression standard. For example, the input buffer unit 310 may receive a video stream encoded according to a compression standard based on block-based motion compensation, such as H.264 or H.265.

Meanwhile, in the present embodiment, a B/P picture in the encoded video stream, unlike an IDR picture, must be decoded starting from the leading IDR picture to obtain a raw image, which can then be compressed into a JPEG/PNG image. Accordingly, the input buffer unit 310 may receive the encoded video stream from the image capturing device buffered in units of GOPs (Groups of Pictures), each defined by an IDR (Instantaneous Decoder Refresh) picture.

The execution unit 320 detects meaningful scenes from the encoded video stream. In the present embodiment, video generated by the image capturing device monitors a designated area, so the background is stationary and the video changes little. Therefore, when an object in the video moves, the scene can be classified as meaningful.

Based on this observation, the execution unit 320 according to the present embodiment analyzes the encoded video stream to detect the motion of objects and provides the result, so that the subsequent processing by the image processing unit 330 can summarize the video around important events.

The execution unit 320 may detect motion vector information in the encoded video stream and detect the motion of an object based on it. That is, in the present embodiment the background is stationary, so the motion vector information in the encoded video stream carries the motion of moving objects; if a motion vector is detected, it is likely to be meaningful information.

The execution unit 320 according to the present embodiment analyzes the encoded video stream in the compressed domain. That is, the execution unit 320 does not decompress the encoded video stream but analyzes the motion vector regions in the encoded state to detect whether motion has occurred. Decompressing a media stream generally consumes a large amount of CPU, and the present embodiment reduces this cost by analyzing the stream in its compressed state.

In addition, in the present embodiment, the execution unit 320 may additionally detect and use DCT (Discrete Cosine Transform) coefficients in the encoded video stream to improve the accuracy of object motion detection based on motion vector information. DCT stands for Discrete Cosine Transform, a method of reducing the amount of data by expressing n data values as a sum of n cosine functions. The method derives from the Fourier transform, that is, from Fourier's observation that n values in the time domain can be represented by n trigonometric functions in the frequency domain. MPEG, an international standard video compression scheme, encodes pictures in units of 8×8 blocks, that is, 64 values per block. Expressing these 64 values with 64 cosine functions is called the DCT, and the coefficients of the resulting cosine functions are called DCT coefficients. Using these DCT coefficients, the difference between the preceding and following frames of a video can be extracted.
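For illustration, the transform described above can be written out as the textbook orthonormal two-dimensional DCT-II on an 8×8 block. This is a sketch of the mathematical definition only, not the transform path of the embodiment or of any particular codec; H.264, for instance, uses an integer approximation of the DCT. The function name and the test block are assumptions made for the example.

```python
import math

N = 8  # block size: an 8x8 block holds 64 values


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of lists."""
    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


# A perfectly flat block has no spatial variation, so all of its energy
# ends up in the DC coefficient and every AC coefficient is (numerically) zero.
flat_block = [[10.0] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
```

The 64 input values map to 64 coefficients; for a residual block with no change relative to its prediction, all coefficients would be zero, which is why coefficient magnitude can serve as an indicator of change.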

In the present embodiment, the execution unit 320 uses the DCT coefficients as a verification parameter for checking the validity of the detected motion vector information.

Hereinafter, the specific operation of the execution unit 320 according to the present embodiment is described with reference to FIG. 4. FIG. 4(a) illustrates a procedure for detecting the motion of an object in the pixel domain. FIG. 4(b) is a block diagram schematically showing the execution unit 320 according to the present embodiment and illustrates a procedure for detecting the motion of an object using motion vector information in the compressed domain.

First, referring to FIG. 4(a), detecting object motion in the pixel domain requires resource-intensive operations such as reordering, IDCT, and motion compensation. Moreover, motion detection in the pixel domain must operate on every pixel, which also consumes a large amount of resources.

In contrast, referring to FIG. 4(b), the present embodiment detects object motion in the compressed domain and can therefore skip the computation-heavy stages (the parts marked with dotted lines in FIG. 4(a)), which reduces wasted resources.

As shown in FIG. 4(b), the execution unit 320 according to the present embodiment includes an extraction unit 400, an analysis unit 410, and a motion detection unit 420. The components included in the execution unit 320 are not necessarily limited to these.

The extraction unit 400 analyzes the encoded video stream in the compressed domain and extracts the motion vector information and DCT coefficients from it. This operation corresponds to decoding only part of the encoded video stream, not all of it, in order to extract just the information needed for motion detection.

To this end, the extraction unit 400 according to the present embodiment may be implemented as an entropy decoder configured to entropy-decode the encoded video stream. That is, the extraction unit 400 distinguishes between a video stream encoded with CAVLC (context-adaptive variable-length coding) and one encoded with CABAC (context-adaptive binary arithmetic coding), and applies the corresponding decoding scheme in each case to extract the motion vector information and the DCT coefficients.

The analysis unit 410 receives and analyzes the motion vector information and DCT coefficients extracted by the extraction unit 400 and provides an analysis result for each. That is, in the present embodiment, the analysis unit 410 may include a first analysis unit 412 (motion vector analysis) and a second analysis unit 414 (DCT coefficient analysis).

The first analysis unit 412 analyzes the motion vector information and provides the analysis result. In the present embodiment, the first analysis unit 412 may correct the motion vector information based on that analysis result. Motion vector information consists of vectors that identify, for each macroblock, the most similar block in the previous picture. The block size may be 16×16, 8×8, 4×4, and so on. The presence of motion vector information may indicate that motion has occurred, but for several reasons the information must be corrected. First, motion vectors are found through motion estimation, which searches for the block that is most favorable for compression, so a motion vector may not match the actual motion. Second, a block that is coded by intra prediction within the picture, rather than by generating a motion vector, carries no motion vector information. Third, motion vectors are often found incorrectly in regions with a uniform pattern or few distinguishing features.

For this reason, the first analysis unit 412 may, through its analysis of the motion vector information, adaptively assign motion vector information to a specific block that is coded by intra prediction and is adjacent to blocks that have motion vector information. This is a correction step that improves the accuracy of object motion detection. For example, the first analysis unit 412 may assign motion vector information to the specific block based on the motion vector information of its neighboring blocks.
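One way to picture this correction is the following sketch, which fills each intra-coded block with the component-wise median of the motion vectors of its neighbors. The source does not specify how the neighboring vectors are combined; the median, the 8-neighborhood, the `min_neighbors` parameter, and the grid representation (a 2-D list with `None` for intra-coded blocks) are all assumptions chosen for illustration.

```python
from statistics import median


def fill_intra_motion_vectors(mv_grid, min_neighbors=2):
    """Return a copy of mv_grid with intra-coded blocks filled in.

    mv_grid is a 2-D list in which each cell holds a (dx, dy) motion
    vector, or None for an intra-coded block that carries no vector.
    A None cell receives the component-wise median of its neighbors'
    vectors when at least min_neighbors of them have a vector;
    otherwise it stays None. Neighbors are read from the original
    grid, so the result does not depend on scan order.
    """
    rows, cols = len(mv_grid), len(mv_grid[0])
    filled = [row[:] for row in mv_grid]
    for r in range(rows):
        for c in range(cols):
            if mv_grid[r][c] is not None:
                continue
            neighbors = [
                mv_grid[nr][nc]
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
                if (nr, nc) != (r, c) and mv_grid[nr][nc] is not None
            ]
            if len(neighbors) >= min_neighbors:
                filled[r][c] = (median(v[0] for v in neighbors),
                                median(v[1] for v in neighbors))
    return filled


# The center block is intra-coded; most of its neighbors move right by 4.
grid = [
    [(4, 0), (4, 0), (0, 0)],
    [(4, 0), None,   (0, 0)],
    [(4, 2), (4, 0), (0, 0)],
]
corrected = fill_intra_motion_vectors(grid)
```

The median is used here because it is less sensitive than the mean to a single spurious vector among the neighbors, which matters given that estimated motion vectors do not always reflect true motion.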

The first analysis unit 412 transmits the corrected motion vector information and the analysis result to the motion detection unit 420, which then performs the motion detection procedure.

The second analysis unit 414 analyzes the DCT coefficients and provides the analysis result. A DCT coefficient is produced by applying the discrete cosine transform to the pixel values of a macroblock's difference from the previous picture, and it can be used for two purposes.

First, the DCT coefficients may be used, independently of the motion vector information, as a variable for detecting object motion. That is, if motion has occurred, the DCT coefficients will change, and when the amount of change exceeds a specific threshold, motion can be considered detected. To this end, the second analysis unit 414 may provide the amount of change in the DCT coefficients as its analysis result.

Second, the DCT coefficients may be used to verify the validity of the motion vector information analyzed and corrected by the first analysis unit 412. That is, because the DCT coefficients reveal the approximate actual pattern of the picture, they can be used to judge whether the motion vector information is valid. The second analysis unit 414 transmits the analysis result for the DCT coefficients to the motion detection unit 420, which uses it to determine the validity of the motion vector information.

The motion detection unit 420 detects the motion of an object in the video based on the analysis result of at least one of the motion vector information and the DCT coefficients provided by the analysis unit 410.

In the present embodiment, the motion detection unit 420 may determine whether object motion has been detected by combining the analysis results for the motion vector information and the DCT coefficients. That is, the motion detection unit 420 corrects the analysis result for the motion vector information by reflecting the analysis result for the DCT coefficients.

In this case, the motion detection unit 420 may use the analysis result for the DCT coefficients to determine the validity of the motion vector information and detect object motion according to that determination. For example, the motion detection unit 420 applies predefined weights to the analysis result for the motion vector information and to the analysis result for the corresponding DCT coefficients; when the resulting weighted value exceeds a preset threshold, it determines that the motion vector information is valid and that object motion has been detected. Conversely, when the weighted value is below the preset threshold, the motion detection unit 420 determines that the motion vector information is not valid and that object motion has not been detected.
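The weighted decision described above might be sketched as follows. The source does not give the form of the analysis results, the weights, or the threshold, so the sketch assumes both results are scores normalized to the range 0 to 1 and combines them as a weighted sum; the names and default values are hypothetical.

```python
def detect_motion(mv_score, dct_score,
                  mv_weight=0.7, dct_weight=0.3, threshold=0.5):
    """Decide whether object motion is detected.

    mv_score and dct_score are assumed to be analysis results
    normalized to [0, 1]. The weighted sum is compared against the
    threshold: above it, the motion vector information is treated as
    valid and motion is reported; otherwise no motion is reported.
    """
    weighted = mv_weight * mv_score + dct_weight * dct_score
    return weighted > threshold


# Strong motion vectors backed by a clear DCT change: motion detected.
confirmed = detect_motion(mv_score=0.9, dct_score=0.8)
# Moderate motion vectors with no DCT support: treated as invalid.
rejected = detect_motion(mv_score=0.6, dct_score=0.0)
```

In the second case the motion vectors alone do not carry the weighted value past the threshold, which mirrors the role of the DCT coefficients as a verification parameter for the motion vector information.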

In another embodiment, the motion detection unit 420 may calculate the amount of change in the DCT coefficients based on their analysis result and detect object motion based on the calculated amount. For example, the motion detection unit 420 determines that object motion has been detected when the calculated amount of change exceeds a preset threshold.
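A minimal sketch of this DCT-only variant is given below. The source does not define how the amount of change is computed; the sketch assumes it is the sum of absolute residual DCT coefficients per block, and that motion is reported when any block exceeds the threshold. Both choices, along with the function names and sample values, are illustrative assumptions.

```python
def dct_change_amount(residual_coeffs):
    """Assumed change measure: sum of absolute residual DCT coefficients."""
    return sum(abs(c) for c in residual_coeffs)


def motion_detected_by_dct(blocks, threshold):
    """Report motion when any block's change amount exceeds the threshold."""
    return any(dct_change_amount(block) > threshold for block in blocks)


# Two nearly empty residual blocks, as expected for a static background.
static_picture = [[0] * 16, [1, 0, -1] + [0] * 13]
# The same picture plus one block with a large residual from a moving object.
moving_picture = static_picture + [[40, -12, 7, 3] + [0] * 12]

static_result = motion_detected_by_dct(static_picture, threshold=20)
moving_result = motion_detected_by_dct(moving_picture, threshold=20)
```

Small residuals caused by sensor noise stay below the threshold, while the block covering the moving object exceeds it.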

In another embodiment, the motion detection unit 420 may receive a face or person recognition result from the image capturing device and detect object motion based on it.

The image processing unit 330 generates video summary information based on the object motion detection result of the motion detection unit 420. In the present embodiment, the video summary information is preferably a thumbnail image, but is not limited thereto. The image processing unit 330 may include a video decoder, an image resize unit, an image decoder, and first and second processing units (Black Hole). The operation of each component of the image processing unit 330 is the same as in the related art, so a detailed description is omitted.

Based on the object motion detection result, the image processing unit 330 stores in a table the timestamps of the scenes in which object motion was detected.

The image processing unit 330 generates the video summary information by storing the images that correspond to specific points in time given by the stored timestamps. For example, the image processing unit 330 decodes the encoded video stream, re-encodes the portion of the decoded video stream that covers the time corresponding to each timestamp, and stores the result as an image.

Meanwhile, when the scene in which motion of the object is detected is a b/p picture rather than an IDR picture, the image processing unit 330 performs the re-encoding by additionally combining information on a reference image (e.g., an IDR picture) in the decoded video stream. As a result, the image summary information can later be decoded on retrieval without decoding the reference image.

The image processing unit 330 may change the generation period of the image summary information as necessary. If the image processing unit 330 generates images too frequently, it consumes a large amount of CPU time and storage space, so images are stored at an appropriate interval. For example, when the video stream is 15 fps, the interval between pictures is about 67 ms; storing about one image per second saves storage space while still appearing natural to a human viewer.
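
The arithmetic above (1000 ms / 15 ≈ 67 ms per picture, thinned to about one stored image per second) can be sketched as follows. The function names and the `min_gap_s` parameter are hypothetical, and the greedy "keep a picture once enough time has passed" rule is one possible way to apply the interval, not necessarily the one used in the embodiment.

```python
from typing import Iterable, List


def picture_interval_ms(fps: float) -> float:
    """Interval between consecutive pictures in milliseconds."""
    return 1000.0 / fps


def select_for_summary(timestamps_s: Iterable[float],
                       min_gap_s: float = 1.0) -> List[float]:
    """Keep a timestamp only if at least min_gap_s has passed since the last kept one."""
    kept: List[float] = []
    for t in sorted(timestamps_s):
        if not kept or t - kept[-1] >= min_gap_s:
            kept.append(t)
    return kept
```

For three seconds of 15 fps video (45 pictures), this keeps three images instead of 45.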

FIG. 5 is a flowchart illustrating an image processing method according to the present embodiment.

The image processing apparatus 300 receives an encoded video stream of an image generated by an image capturing device (S502). In step S502, the image processing apparatus 300 may receive, from the image capturing device, the encoded video stream buffered in units of a GOP defined by an IDR picture.
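
GOP-unit buffering in step S502 can be illustrated with the sketch below, which splits a sequence of picture types into groups that each start at an IDR picture. Representing pictures by the strings "IDR", "P", and "B" and the name `split_into_gops` are assumptions for illustration; a real input buffer would hold the encoded picture data itself.

```python
from typing import Iterable, List


def split_into_gops(picture_types: Iterable[str]) -> List[List[str]]:
    """Group pictures so that every IDR picture starts a new GOP.

    Pictures that arrive before the first IDR picture are collected
    into a leading group of their own.
    """
    gops: List[List[str]] = []
    for ptype in picture_types:
        if ptype == "IDR" or not gops:
            gops.append([])
        gops[-1].append(ptype)
    return gops
```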

The image processing apparatus 300 analyzes the encoded video stream received in step S502 and extracts motion vector information and DCT coefficients from the encoded video stream (S504). In step S504, the image processing apparatus 300 extracts the motion vector regions and the DCT coefficients in the encoded state, without decompressing the encoded video stream. That is, the image processing apparatus 300 performs entropy decoding on the encoded video stream and thereby decodes only a part of the encoded video stream, rather than all of it, to extract the motion vector information and the DCT coefficients.

The image processing apparatus 300 analyzes the motion vector information and the DCT coefficients extracted in step S504 to generate an analysis result (S506). In step S506, the image processing apparatus 300 may correct the motion vector information based on the analysis result for the motion vector information. For example, through analysis of the motion vector information, the image processing apparatus 300 may adaptively assign motion vector information to a specific block that is processed based on intra prediction among the neighboring blocks of a block matching the motion vector information.
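
The correction in step S506 can be sketched as filling in intra-predicted blocks from their neighbors. The embodiment does not state the assignment rule, so everything here is an assumption: intra blocks are marked `None`, the assigned vector is the mean of the neighboring blocks that carry a motion vector, and a block is only filled when at least `min_neighbors` such neighbors exist. The name `fill_intra_blocks` is hypothetical.

```python
from typing import List, Optional, Tuple

MV = Tuple[float, float]  # (dx, dy) motion vector of one block


def fill_intra_blocks(grid: List[List[Optional[MV]]],
                      min_neighbors: int = 2) -> List[List[Optional[MV]]]:
    """Assign a motion vector to intra blocks (None) from their 8 neighbors."""
    if not grid:
        return []
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # read from grid, write to the copy
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is not None:
                continue
            neighbors = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c) and grid[rr][cc] is not None
            ]
            if len(neighbors) >= min_neighbors:
                n = len(neighbors)
                out[r][c] = (sum(v[0] for v in neighbors) / n,
                             sum(v[1] for v in neighbors) / n)
    return out
```

Reading from the original grid while writing to a copy keeps already-filled intra blocks from influencing other intra blocks within the same pass.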

The image processing apparatus 300 detects motion of an object in the image based on at least one of the analysis result for the motion vector information and the analysis result for the DCT coefficients generated in step S506 (S508). In step S508, the image processing apparatus 300 may determine the validity of the motion vector information by using the analysis result for the DCT coefficients, and detect motion of the object according to the determination result.
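
A validity check of this kind can be sketched as a weighted combination compared against a threshold, in line with the weighted value recited in the claims. The specific form is an assumption: the motion vector is reduced to its magnitude, the DCT analysis result is reduced to a single number `dct_energy`, and the weights `w_mv` and `w_dct` and the function names are hypothetical.

```python
import math
from typing import Tuple


def weighted_value(mv: Tuple[float, float], dct_energy: float,
                   w_mv: float = 0.5, w_dct: float = 0.5) -> float:
    """Weighted sum of motion vector magnitude and a DCT-derived score."""
    return w_mv * math.hypot(mv[0], mv[1]) + w_dct * dct_energy


def is_valid_motion_vector(mv: Tuple[float, float], dct_energy: float,
                           threshold: float,
                           w_mv: float = 0.5, w_dct: float = 0.5) -> bool:
    """The motion vector is treated as valid when the weighted value exceeds the threshold."""
    return weighted_value(mv, dct_energy, w_mv, w_dct) > threshold
```

Under these assumptions, a large vector over a block whose DCT coefficients barely change scores low, which is one way to reject vectors caused by noise rather than by a moving object.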

In addition, the image processing apparatus 300 may calculate the amount of change in the DCT coefficients based on the analysis result for the DCT coefficients, and detect motion of the object based on the calculated amount of change.

The image processing apparatus 300 generates image summary information based on the motion detection result of step S508 (S510). Based on that result, the image processing apparatus 300 stores, in a table, a timestamp for each scene in which motion of an object in the image is detected, and generates the image summary information by storing an image corresponding to a specific point in time that includes the time of the stored timestamp.

Since steps S502 to S510 correspond to the operations of the components of the image processing apparatus 300 described above, further detailed description is omitted.

Although FIG. 5 describes the steps as being executed sequentially, the embodiment is not necessarily limited thereto. In other words, the steps described in FIG. 5 may be executed in a different order, or one or more of the steps may be executed in parallel, so FIG. 5 is not limited to a time-series order.

As described above, the image processing method illustrated in FIG. 5 may be implemented as a program and recorded on a recording medium that can be read using computer software (CD-ROM, RAM, ROM, memory card, hard disk, magneto-optical disk, storage device, etc.).

The above description merely illustrates the technical idea of the present embodiment, and those of ordinary skill in the technical field to which the present embodiment belongs will be able to make various modifications and variations without departing from its essential characteristics. Accordingly, the embodiments are intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present embodiment should be interpreted according to the following claims, and all technical ideas within a scope equivalent thereto should be construed as being included in the scope of rights of the present embodiment.

300: image processing device 310: input buffer unit
320: execution unit 330: image processing unit
400: extraction unit 410: analysis unit
420: motion detection unit

Claims

An image processing apparatus comprising:
an input buffer unit configured to receive an encoded video stream of an image;
an extraction unit configured to analyze the encoded video stream in a compressed domain to extract motion vector information and DCT coefficients from the encoded video stream;
an analysis unit configured to provide an analysis result obtained by analyzing the motion vector information and the DCT coefficients; and
a motion detection unit configured to detect motion of an object in the image based on the analysis result of at least one of the motion vector information and the DCT coefficients.

The image processing apparatus of claim 1,
wherein the input buffer unit
receives an encoded video stream encoded according to a compression standard based on block-wise motion compensation.

The image processing apparatus of claim 1,
wherein the input buffer unit
receives the encoded video stream buffered in units of a GOP (Group of Pictures) defined by an Instantaneous Decoder Refresh (IDR) picture.

The image processing apparatus of claim 1,
wherein the analysis unit includes a first analysis unit that analyzes the motion vector information and a second analysis unit that analyzes the DCT coefficients, and
the first analysis unit, through analysis of the motion vector information, adaptively assigns motion vector information to a specific block that is processed based on intra prediction among the neighboring blocks of a block matching the motion vector information.

The image processing apparatus of claim 1,
wherein the motion detection unit
determines the validity of the motion vector information by using the analysis result of the DCT coefficients, and detects the motion of the object according to the determination result.

The image processing apparatus of claim 5,
wherein the motion detection unit
determines that the motion vector information is valid when a weighted value, obtained by applying predefined weights to the motion vector information and to the analysis result of the DCT coefficients corresponding to the motion vector information, exceeds a preset threshold.

The image processing apparatus of claim 1,
wherein the motion detection unit
calculates an amount of change in the DCT coefficients based on the analysis result of the DCT coefficients, and detects the motion of the object based on the amount of change.

The image processing apparatus of claim 1,
further comprising an image processing unit that generates image summary information based on the detection result of the motion detection unit.

The image processing apparatus of claim 8,
wherein the image processing unit
generates the image summary information by storing a timestamp for a scene in which the motion of the object in the image is detected, and storing an image corresponding to a specific point in time based on the timestamp.

The image processing apparatus of claim 9,
wherein the image processing unit
decodes the encoded video stream, re-encodes a portion of the decoded video stream covering a time span that includes the time corresponding to the timestamp, and stores the result as the image.

The image processing apparatus of claim 10,
wherein the image processing unit
performs the re-encoding by additionally combining information on a reference image in the decoded video stream.

An image processing method comprising:
receiving an encoded video stream of an image;
analyzing the encoded video stream in a compressed domain and extracting motion vector information and DCT coefficients from the encoded video stream;
providing an analysis result obtained by analyzing the motion vector information and the DCT coefficients; and
detecting motion of an object in the image based on the analysis result of at least one of the motion vector information and the DCT coefficients.

A computer program stored in a computer-readable recording medium to execute each step of the image processing method according to claim 12.