KR100841181B1 - Motion activity description method and apparatus for video

Info

Publication number: KR100841181B1
Application number: KR1020070095214A
Authority: KR
Inventors: 심동규; 김해광; 박철수; 정재원; 오대일; 문주희
Original assignee: 주식회사 팬택앤큐리텔
Priority date: 2007-09-19
Filing date: 2007-09-19
Publication date: 2008-06-24
Also published as: KR20070104307A

Abstract

The present invention relates to a method and apparatus for describing the motion activity of a video, and in particular to a method and apparatus for describing motion activity features of a video in which the motion activity feature is described by statistical characteristics of the magnitude and direction of motion parameters extracted from the video. With the method and apparatus of the present invention, it is possible to describe signal characteristics of an entire video, of the span between representative images, and of a specific time interval, as well as perceptual features such as spatio-temporal distribution and the degree and pattern of change, all of which are difficult to express with conventional video motion indexing techniques. The invention can therefore be used effectively in digital video service applications in which the degree of motion is an important feature, such as video retrieval, surveillance, multimedia databases, and broadcast filtering.

Video, motion activity descriptor, motion parameter, magnitude, direction, video retrieval, surveillance, multimedia database, broadcast filtering

Description

Method and apparatus for describing motion activity features of video (MOTION ACTIVITY DESCRIPTION METHOD AND APPARATUS FOR VIDEO)

FIG. 1 is a flowchart of the motion activity feature description method of the present invention.

FIG. 2 is a block diagram of one embodiment of a motion activity feature description apparatus for implementing the method of the present invention.

FIG. 3 is a block diagram of a specific example of the motion activity feature description apparatus of FIG. 2.

FIGS. 4A to 4J are block diagrams of apparatuses for generating a motion descriptor using a cumulative motion histogram according to embodiments of the present invention.

FIG. 5 is a detailed block diagram of the motion estimate divergence processing unit in FIGS. 4B, 4D, 4F, 4G, and 4J.

FIG. 6 is a detailed block diagram of the motion filter in FIGS. 4C and 4D.

FIG. 7 is a detailed block diagram of the motion descriptor generation unit in FIGS. 4A to 4J.

FIG. 8 is a diagram for explaining the inter-image motion estimation technique using the BMA (Block Matching Algorithm) in FIG. 5.

FIGS. 9A to 9C are diagrams for explaining the divergence of motion estimation.

FIG. 10 is a diagram showing the spatial correlation between the current region and its neighboring regions, for explaining the divergence processing performed by the motion estimate divergence processing unit in FIGS. 4B, 4D, 4F, 4G, and 4J.

FIG. 11 is a diagram for explaining the motion histogram of motion direction data generated by the motion histogram generation unit in FIGS. 4A to 4J.

FIG. 12 is a diagram for explaining the cumulative motion histogram of motion direction data accumulated by the cumulative motion histogram generation unit in FIGS. 4A to 4J.

FIG. 13 is an example of a cumulative motion histogram, accumulated by the cumulative motion histogram generation unit in FIGS. 4A to 4J, for a video consisting of 35 images.

FIG. 14 is an example diagram for explaining the indexing (clipping) of the cumulative motion histogram accumulated by the cumulative motion histogram generation unit in FIGS. 4A to 4J.

FIGS. 15A to 15J are flowcharts for explaining methods of generating a motion descriptor using a cumulative motion histogram according to embodiments of the present invention.

FIG. 16 is a detailed flowchart of the motion estimate divergence processing step in FIGS. 15B, 15D, 15G, 15H, and 15J.

FIG. 17 is a flowchart for explaining the motion magnitude and direction filtering process, which takes visual characteristics into account, in FIGS. 15C and 15D.

FIGS. 18A and 18B are detailed flowcharts of the motion descriptor generation step in FIGS. 15A to 15J.

FIG. 19 is an example of a video retrieval system using a motion descriptor according to an embodiment of the present invention.

FIG. 20A is a diagram for explaining the structuring of a still image.

FIG. 20B is a diagram for explaining the structuring of a video.

FIG. 21 is a diagram showing video structuring in units of scenes.

* Explanation of reference numerals for the main parts of the drawings *

1100: motion vector extraction means

2100-1 to 2100-6: means for extracting magnitude characteristics of motion vectors

3100-1 to 3100-6: means for extracting direction characteristics of motion vectors

4100: representative direction vector calculation means

5100: combiner

2: motion estimate divergence processing unit

3: motion filter

4: motion histogram generation unit

5: cumulative motion histogram generation unit

6: motion descriptor generation unit

7: motion magnitude calculation unit

8: motion magnitude histogram generation unit

9: cumulative motion magnitude histogram generation unit

10: motion direction calculation unit

11: motion direction histogram generation unit

12: previous image storage unit

13: motion magnitude calculation unit

14: cumulative motion direction histogram generation unit

15: motion magnitude estimate divergence processing unit

17: motion direction estimate divergence processing unit

22: current image storage unit

23: motion vector value comparison and conversion unit

26: motion clip descriptor generation unit

32: average calculation unit

33: motion direction calculation unit

36: motion descriptor generator

42: absolute value calculation unit

43: motion direction quantization/inverse quantization unit

52: motion vector value conversion unit

The present invention relates to a method and apparatus for describing the motion activity of a video (in this specification, the terms "moving picture" and "video" are used interchangeably). More particularly, it relates to a method and apparatus for describing motion activity features of a video, characterized in that the motion activity feature is described by statistical characteristics of the magnitude and direction of motion parameters extracted from the video.

At present, with the continuing development of representation media (the means of representing information, including text, graphics, voice, sound, and, more broadly, images), delivery media (communication networks, broadcast networks, storage media), and the performance of the systems that operate them, there is a growing demand for free creation, fast retrieval, and convenient use and reuse of large-volume multimedia data composed of multiple monomedia, rather than of small-volume monomedia consisting of a single medium (for example, data only or voice only). As the digitization of representation media has satisfied the demand for free creation of multimedia data, vast amounts of monomedia and multimedia data have come to be scattered across personal and public systems.

However, as the amount of multimedia data has grown, the time and cost required to search for data in order to use and reuse it have grown in proportion. Accordingly, research and development is actively under way on retrieval technologies that include the text-based retrieval technology now in wide use and that are suited to effective retrieval of multimedia data with composite information attributes, so as to achieve faster and more efficient retrieval.

Effective indexing and retrieval of multimedia data requires that the information attributes expressing the features of each media item be small in size, that preprocessing be simple and capable of real-time operation, that the information attributes be valid and polymorphic, and that retrieval be flexible.

In addition, the objective similarity and the subjective similarity of retrieval results are important factors in evaluating retrieval performance. Subjective similarity matters because of the limits on describing and expressing the information attributes that represent media data features: even when objective similarity is high, the validity and practicality of retrieval suffer if the user cannot obtain the intended results. For this reason, techniques that can reflect subjective similarity in the description of information attributes representing media data features have recently been studied actively.

The main differences between the already widely used text-based retrieval and multimedia-data-based retrieval lie in the difficulty of extracting the information attributes used for retrieval and in the polymorphism of information attribute representation.

Regarding the difficulty of extracting information attributes, text data can be searched by indexing a few key words and sentences in a document. Multimedia data, by contrast, is large and mixes several media, so appropriate preprocessing is needed to obtain valid new information attributes in which the attributes of the individual media are organically combined.

The key goals of preprocessing are to extract feature information that is valid for retrieval and to make the process itself practical. That is, even a technique capable of detecting valid feature information cannot be put to practical use if its hardware (H/W) and software (S/W) costs are high, whether in applications that must process and use large amounts of multimedia data quickly or in fields that use terminal systems with poor performance.

As an example of the polymorphism of information attribute representation, consider video retrieval. A video mixes information from several media, such as images, voice, and audio. When expressing the features of a video, valid information attributes that can characterize the data are extracted by appropriately preprocessing either the information attributes of each monomedia alone or those of multimedia data in which two or more media are mixed; the extracted attributes then allow more varied forms of data retrieval. For example, video retrieval can use the information attributes of the images alone, or it can use image and voice attributes organically combined. Retrieval using the attributes of multiple media is therefore more effective than retrieval using the attributes of a single medium.

The most actively studied area of multimedia data indexing and retrieval today is still images, for which data is easy to acquire. Still images are widely used in storage systems such as digital electronic still cameras and image databases, in transmission systems such as still-image transmission devices, audio-graphic conferencing, and video conferencing, and in printing systems such as color printers. Still-image indexing and retrieval is content-based image retrieval. Its main concerns are extracting and indexing feature information, for the color, texture, and shape of an image, that remains consistent under changes such as rotation, scaling, and translation, and retrieval methods that use such information.

Video retrieval has had limited application because video data is harder to acquire than still images and large volumes of data must be stored and processed. However, with the rapid development of storage media such as disks, tapes, and CD-ROMs and of delivery media such as communication networks, the cost of acquiring data has fallen and the required devices have become smaller, so research in this field has recently become active. A video is generally a series of images that are acquired over continuous time and have an order. Because the image data is acquired over continuous time, a video has very high redundancy between neighboring images in addition to spatial redundancy within each image, which makes inter-image prediction possible to some extent. This inter-image property, which arises from redundancy, sharply distinguishes video from still images. In video retrieval, this inter-image similarity plays an important role in extracting feature information.

Inter-image redundancy in a video can be measured by the degree of motion between images. For example, high redundancy between two images means that the region they share is large, which can be interpreted as small inter-image motion. Conversely, low redundancy means that the shared region is small and the inter-image motion is large. To improve compression efficiency, the video compression standards completed to date (H.261, H.263, MPEG-1, MPEG-2, MPEG-4) adopt compression techniques based on inter-image motion estimation (BMA, Block Matching Algorithm), which can minimize this inter-image redundancy.
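As a rough illustration of the inter-image motion estimation mentioned above, the following minimal sketch performs exhaustive block matching. The text only names BMA; the function name `block_match`, the sum-of-absolute-differences (SAD) cost, the block size, and the search range are all assumptions made for illustration and are not taken from the specification.

```python
def block_match(prev, curr, top, left, block=4, search=2):
    """Return ((dy, dx), cost) for the displacement that minimises the SAD
    between the block of `curr` at (top, left) and a displaced block in `prev`.

    `prev` and `curr` are equally sized 2-D lists of grey levels.
    The SAD cost, block size and search range are illustrative assumptions.
    """
    height, width = len(prev), len(prev[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = top + dy, left + dx
            # Skip candidates that fall outside the previous image.
            if y0 < 0 or x0 < 0 or y0 + block > height or x0 + block > width:
                continue
            cost = sum(
                abs(curr[top + i][left + j] - prev[y0 + i][x0 + j])
                for i in range(block)
                for j in range(block)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost
```

A small matching cost for a small displacement corresponds to high redundancy between the two images; a large displacement or a large residual cost corresponds to low redundancy.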

Conventional video retrieval techniques structure a video (video structuring) into temporal intervals of arbitrary length (hereinafter, "clips") based on changes in color, texture, shape, and motion between images, select several representative images (key frames) that can represent the semantic and signal characteristics of the images in each interval, and extract feature information from the information attributes of the selected representative images for use in indexing or retrieval.

In video structuring, a typical structure is hierarchical. The basic structural unit is the "shot", a temporally uninterrupted sequence of still images. A "scene" is a temporal sequence of shots that has spatio-temporal continuity in content, and a "story" is a unit of narrative development, at the level of introduction, development, turn, and conclusion, composed of scenes.

This is illustrated in FIG. 20B. A video can be structured in the form of an Event Tree based on signal characteristics. In video structuring, both signal and semantic structural information may be present, connected by mutual association information (links).

That is, the Segment Tree shown on the left of FIG. 20B and the Event Tree shown on the right are linked to each other in the directions of the arrows. For example, searching for the "Clinton Case" structured in the Event Tree leads, through the links, to the video of shot 2 of Sub-segment 1 in Segment 1 and the video of shot 3 of Segment 3 on the Segment Tree.

A single still image can also be structured. A photograph of a person in a forest, for example, consists of the objects "person" and "forest"; the person in turn consists of a face and a body, and the face of objects such as eyes, nose, and ears. This is illustrated in FIG. 20A, which explains the structuring of a still image: signal structuring in the form of a Region Tree, based on signal characteristics within the image, and semantic structuring in the form of an Object Tree, based on objects that carry perceptual meaning within the image, are both possible. In general, signal structuring is performed with semi-automatic or automatic still-image structuring techniques, whereas semantic structuring, being a conceptual method, may rely on manual structuring by the user.

In the still-image structuring of FIG. 20A, as in the video structuring of FIG. 20B, the Region Tree shown on the left and the Object Tree shown on the right are linked to each other in the directions of the arrows, so that both signal and semantic structural information may be present.

An audio signal has a structure composed of ambient background noise, the voices of people in conversation, background music, and the like.

Such data structuring has the advantage that, the more images a video contains, the better it supports accurate indexing by diverse features and faster retrieval.

FIG. 21 shows video structuring in units of scenes.

Indexing and retrieval based on such representative images do not describe features that the feature information of representative images alone cannot express: signal characteristics of the entire video, of the span between representative images, or of a specific time interval, the spatio-temporal distribution, and the dominant patterns of change. They are therefore unsuitable for applications that require such features.

Meanwhile, conventional techniques for describing a video have used text, representative still images (key frames), features of representative still images, and the like, but these cannot effectively describe the degree of motion activity, which is a feature unique to video. Conventional techniques for describing motion activity in a video include methods that use features such as camera motion or the trajectory of a video object, but these cannot describe the overall motion activity that appears across the whole video frame. Furthermore, although the magnitude of motion in an image has itself been used to express the nature of that magnitude, no feature has been proposed that expresses motion activity in the sense of the amount of change in the motion of the images.

Accordingly, one object of the present invention is to provide a method and apparatus for describing motion activity features of a video that can describe and use the motion activity of the video efficiently and effectively through statistical characteristics of the magnitude and direction of motion parameters in the video.

Another object of the present invention is to provide a method and apparatus for describing motion activity features of a video that can effectively describe motion activity according not only to how large the motion of objects in the image is but also to the degree to which that motion changes.

Still another object of the present invention is to provide an apparatus and method for generating a motion descriptor using a cumulative motion histogram, which reflect the user's perceptual characteristics in motion indexing and retrieval of the content of a video and can support indexing and retrieval at various levels.

A first means for achieving these objects of the present invention is a method for describing motion activity features of a video, comprising the steps of: extracting motion parameters from the video; extracting a statistical characteristic of the magnitude of the motion parameters extracted in the preceding step; and extracting a statistical characteristic of the direction of the motion parameters.

A second means for achieving these objects of the present invention is an apparatus for describing motion activity features of a video, comprising: motion parameter extraction means for extracting motion parameters from the video; means for extracting a statistical characteristic of the magnitude of the motion parameters input from the motion parameter extraction means; means for extracting a statistical characteristic of the direction of the motion parameters; and a combiner for collecting the extracted statistical characteristics to define a motion activity descriptor.
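The first and second means can be pictured with the following minimal sketch, which takes already extracted motion vectors and gathers magnitude and direction statistics into one descriptor. The specific statistics (mean and standard deviation of magnitude, a normalised direction histogram), the eight direction bins, and the name `motion_activity_descriptor` are assumptions chosen for illustration; the passage above does not fix them.

```python
import math

def motion_activity_descriptor(vectors, direction_bins=8):
    """Describe a set of motion vectors (dx, dy) by simple statistics.

    Returns the mean and standard deviation of the magnitudes and a
    normalised histogram of directions. The choice of statistics and the
    number of bins are illustrative assumptions.
    """
    if not vectors:
        raise ValueError("at least one motion vector is required")
    magnitudes = [math.hypot(dx, dy) for dx, dy in vectors]
    mean = sum(magnitudes) / len(magnitudes)
    variance = sum((m - mean) ** 2 for m in magnitudes) / len(magnitudes)

    histogram = [0] * direction_bins
    moving = 0
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # a zero vector has no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)  # mapped to [0, 2*pi]
        index = int(angle / (2 * math.pi) * direction_bins) % direction_bins
        histogram[index] += 1
        moving += 1
    if moving:
        histogram = [count / moving for count in histogram]

    # The "combiner" step: gather the statistics into one descriptor.
    return {
        "magnitude_mean": mean,
        "magnitude_std": math.sqrt(variance),
        "direction_histogram": histogram,
    }
```

Keeping the magnitude and direction statistics as separate fields mirrors the separate extraction means in the text, with the returned dictionary playing the role of the combiner output.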

A third means for achieving these objects of the present invention comprises: a motion histogram generation unit that generates a motion histogram for each of the input motion magnitude data and motion direction data; a cumulative motion histogram generation unit that generates a cumulative motion histogram, in a predetermined order, from the motion histograms generated by the motion histogram generation unit; and a motion descriptor generation unit that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion histogram generated by the cumulative motion histogram generation unit and generates a motion descriptor describing the motion characteristics of each structured unit.

A fourth means for achieving these objects of the present invention comprises: a motion histogram generation step of generating a motion histogram for the input motion magnitude and direction data; a cumulative motion histogram generation step of generating a cumulative motion histogram, in a predetermined order, from the motion histograms generated in the motion histogram generation step; and a motion descriptor generation step of structuring (hierarchizing) the video into units of arbitrary size according to the amount of change over time in the cumulative motion histogram generated in the cumulative motion histogram generation step and generating a motion descriptor describing the motion characteristics of each structured unit.
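One possible reading of the third and fourth means is sketched below: per-frame histograms are accumulated in frame order, and a new structural unit (clip) begins when the accumulated histogram would change by more than a threshold. Treating accumulation as a running sum, measuring change as the L1 distance between normalised histograms, and the threshold itself are all assumptions made for illustration; the helper names are hypothetical and the passage above does not specify any of these details.

```python
def histogram(values, bins, upper):
    """Count `values` (each in [0, upper)) into `bins` equal-width bins."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v / upper * bins), bins - 1)] += 1
    return counts

def normalise(counts):
    """Scale counts so they sum to 1 (all zeros stay all zeros)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)

def segment_by_cumulative_change(frame_histograms, threshold):
    """Split a frame sequence into clips.

    Histograms are accumulated frame by frame; a new clip starts when a
    frame would change the normalised cumulative histogram by more than
    `threshold` (L1 distance, an assumed measure). Returns a list of
    (first_frame, last_frame, cumulative_histogram) tuples.
    """
    clips = []
    start, cumulative = 0, None
    for index, hist in enumerate(frame_histograms):
        if cumulative is None:
            cumulative = list(hist)
            continue
        candidate = [a + b for a, b in zip(cumulative, hist)]
        change = sum(abs(p - q) for p, q in
                     zip(normalise(cumulative), normalise(candidate)))
        if change > threshold:
            clips.append((start, index - 1, cumulative))
            start, cumulative = index, list(hist)
        else:
            cumulative = candidate
    if cumulative is not None:
        clips.append((start, len(frame_histograms) - 1, cumulative))
    return clips
```

Each returned tuple could then serve as the input from which a per-clip motion descriptor is built, for example from the normalised cumulative histogram of that clip.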

A fifth means for achieving these objects of the present invention comprises: motion magnitude calculation means for calculating the magnitude (degree) of input motion magnitude information; motion magnitude histogram generation means for generating a motion magnitude histogram from the motion magnitude information calculated by the motion magnitude calculation means; cumulative motion magnitude histogram generation means for generating a cumulative motion magnitude histogram, in a predetermined order, from the motion magnitude histograms generated by the motion magnitude histogram generation means; and motion descriptor generation means for structuring (hierarchizing) the video into units of arbitrary size according to the amount of change in the cumulative motion magnitude histogram generated by the cumulative motion magnitude histogram generation means and generating a motion descriptor describing the motion characteristics of each structured unit.

A sixth means for achieving these objects of the present invention comprises: motion direction calculation means for calculating the direction of input motion direction information; motion direction histogram generation means for generating a motion direction histogram from the motion direction information calculated by the motion direction calculation means; cumulative motion direction histogram generation means for generating a cumulative motion direction histogram, in a predetermined order, from the motion direction histograms generated by the motion direction histogram generation means; and motion descriptor generation means for structuring (hierarchizing) the video into units of arbitrary size according to the amount of change in the cumulative motion direction histogram generated by the cumulative motion direction histogram generation means and generating a motion descriptor describing the motion characteristics of each structured unit.

A seventh means for achieving the object of the present invention comprises: motion magnitude calculation means for calculating the magnitude (degree) of input motion magnitude information; motion magnitude histogram generation means for generating a motion magnitude histogram from the motion magnitude information calculated by the motion magnitude calculation means; cumulative motion magnitude histogram generation means for generating, in a predetermined order, a cumulative motion magnitude histogram from the motion magnitude histogram generated by the motion magnitude histogram generation means; motion direction calculation means for calculating the direction of input motion direction information; motion direction histogram generation means for generating a motion direction histogram from the motion direction information calculated by the motion direction calculation means; cumulative motion direction histogram generation means for generating, in a predetermined order, a cumulative motion direction histogram from the motion direction histogram generated by the motion direction histogram generation means; and motion descriptor generation means for structuring (hierarchizing) the video into units of arbitrary size according to the amount of change in the cumulative motion magnitude and direction histograms generated by the cumulative motion magnitude histogram generation means and the cumulative motion direction histogram generation means, and for generating a motion descriptor that describes the motion characteristics of each structured unit.

An eighth means for achieving the object of the present invention comprises: a motion magnitude calculation step of calculating the magnitude (degree) of input motion magnitude information; a motion magnitude histogram generation step of generating a motion magnitude histogram from the motion magnitude information calculated in the motion magnitude calculation step; a cumulative motion magnitude histogram generation step of generating, in a predetermined order, a cumulative motion magnitude histogram from the motion magnitude histogram generated in the motion magnitude histogram generation step; and a motion descriptor generation step of structuring (hierarchizing) the video into units of arbitrary size according to the amount of change in the cumulative motion magnitude histogram generated in the cumulative motion magnitude histogram generation step, and generating a motion descriptor that describes the motion characteristics of each structured unit.

A ninth means for achieving the object of the present invention comprises: a motion direction calculation step of calculating the direction of input motion direction information; a motion direction histogram generation step of generating a motion direction histogram from the motion direction information calculated in the motion direction calculation step; a cumulative motion direction histogram generation step of generating, in a predetermined order, a cumulative motion direction histogram from the motion direction histogram generated in the motion direction histogram generation step; and a motion descriptor generation step of structuring (hierarchizing) the video into units of arbitrary size according to the amount of change in the cumulative motion direction histogram generated in the cumulative motion direction histogram generation step, and generating a motion descriptor that describes the motion characteristics of each structured unit.

A tenth means for achieving the object of the present invention comprises: a motion magnitude calculation step of calculating the magnitude (degree) of input motion magnitude information; a motion magnitude histogram generation step of generating a motion magnitude histogram from the motion magnitude information calculated in the motion magnitude calculation step; a cumulative motion magnitude histogram generation step of generating, in a predetermined order, a cumulative motion magnitude histogram from the motion magnitude histogram generated in the motion magnitude histogram generation step; a motion direction calculation step of calculating the direction of input motion direction information; a motion direction histogram generation step of generating a motion direction histogram from the motion direction information calculated in the motion direction calculation step; a cumulative motion direction histogram generation step of generating, in a predetermined order, a cumulative motion direction histogram from the motion direction histogram generated in the motion direction histogram generation step; and a motion descriptor generation step of structuring (hierarchizing) the video into units of arbitrary size according to the amount of change in the cumulative motion magnitude and direction histograms generated in the cumulative motion magnitude histogram generation step and the cumulative motion direction histogram generation step, and generating a motion descriptor that describes the motion characteristics of each structured unit.
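The histogram-based means and steps above can be illustrated with a minimal Python sketch. It assumes that the cumulative histogram is built by summing per-frame histograms in temporal order, and that the "amount of change" is the L1 distance between the normalized cumulative histogram before and after a frame is added. The text defines neither, and the function names and the threshold are hypothetical.

```python
def normalize(hist):
    """Scale a histogram so its bins sum to 1 (all zeros if it is empty)."""
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(hist)


def segment_by_cumulative_change(frame_histograms, threshold):
    """Return the frame indices at which a new video unit starts.

    Assumption: histograms are accumulated frame by frame, and a new unit
    begins when adding a frame shifts the normalized cumulative histogram
    by more than `threshold` (L1 distance).
    """
    boundaries = [0] if frame_histograms else []
    cumulative = None
    for index, hist in enumerate(frame_histograms):
        if cumulative is None:
            cumulative = list(hist)
            continue
        updated = [c + h for c, h in zip(cumulative, hist)]
        change = sum(
            abs(a - b) for a, b in zip(normalize(cumulative), normalize(updated))
        )
        if change > threshold:
            boundaries.append(index)  # large change: a new unit starts here
            cumulative = list(hist)   # restart accumulation for the new unit
        else:
            cumulative = updated
    return boundaries
```

The same function could be applied to direction histograms, or to magnitude and direction histograms concatenated per frame, which would correspond loosely to the seventh and tenth means.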

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

In the present invention, the motion activity feature of a video is described by the statistical characteristics of the motion parameters appearing in the video. That is, the invention extracts N-th order statistics along the spatial axis, then extracts M-th order statistics of those values along the time axis, and describes the result as the feature value of one video unit. In particular, one embodiment of the invention extracts the magnitude and direction of the motion parameters as their statistical features, and extracts the first-order and second-order statistics of these as the motion activity descriptor. In the present invention the first-order statistic is the mean, and the second-order statistic is a moment of arbitrary order, preferably the standard deviation or the variance. For convenience, the description below uses the standard deviation as the second-order statistic. A video consists of individual frames. The first-order and second-order statistics of the magnitude and direction of the motion parameters in each frame can serve as features, and the first-order and second-order statistics of these per-frame statistics, taken over one or more frames, can be used as feature values.

Videos can generally be divided into those stored in conventional analog form and those based on more recently proposed digital formats. Media based on digital video compression, such as MPEG or H.263, store motion parameters in the bitstream, so motion activity can easily be described from them. For analog video as well, motion parameters can be estimated by some motion estimation method, motion activity can be described from them, and videos can be retrieved according to the resulting descriptor values. In particular, video compression coding schemes such as MPEG or H.263 divide a frame into several blocks or objects for coding. When coding a block, to reduce the temporal redundancy of the image information, the coder finds in a temporally neighboring frame the reference block most similar to the block being coded, codes the relative on-screen position of that reference block as a motion parameter, and codes the difference between the block being coded and the reference block, which raises compression efficiency. The motion parameter of the k-th block of a frame can be expressed, for example, as the two-dimensional vector MV_k = (MV_xk, MV_yk), where MV_xk is the horizontal motion component and MV_yk is the vertical motion component. The motion parameter can also be expressed by its magnitude component I_k, given by Equation 1 below, and its direction component φ_k, given by Equation 2 below.
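The conversion from a motion vector to its magnitude and direction can be sketched in Python. Equations 1 and 2 are not reproduced in this text, so the sketch assumes the usual Euclidean norm for I_k and the arctangent of the vertical over the horizontal component for φ_k, expressed in degrees in the range [0, 360). The function name is hypothetical.

```python
import math


def magnitude_and_direction(mv_x, mv_y):
    """Return (I_k, phi_k) for one block's motion vector (MV_xk, MV_yk).

    Assumed forms: I_k = sqrt(MV_xk^2 + MV_yk^2) and
    phi_k = atan2(MV_yk, MV_xk), converted to degrees in [0, 360).
    """
    magnitude = math.hypot(mv_x, mv_y)
    # atan2 also handles a zero horizontal component, where a plain
    # arctangent of the ratio would divide by zero.
    direction = math.degrees(math.atan2(mv_y, mv_x)) % 360.0
    return magnitude, direction
```

Using `atan2` rather than the arctangent of a ratio is a design choice of this sketch: it keeps the quadrant information, so leftward and rightward motion are not confused.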

In the present invention, the motion activity feature is described as a statistical characteristic of the magnitude and direction of the motion parameters. When each frame of the video consists of M blocks or objects, the first-order and second-order statistics of the magnitude and direction of the motion parameters are extracted from each frame. The first-order statistic of motion parameter magnitude in a frame (I_av) is obtained by Equation 3 below, the second-order statistic of motion parameter magnitude in a frame (e.g., I_dev) by Equation 4, the first-order statistic of motion parameter direction (φ_av) by Equation 5, and the second-order statistic of motion parameter direction in a frame (φ_dev) by Equation 6.
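The per-frame statistics can be sketched as follows. Equations 3 to 6 are not reproduced in this text, so the sketch assumes the arithmetic mean as the first-order statistic and the population standard deviation as the second-order statistic, taken over the M blocks of one frame. It also averages angles arithmetically, which ignores wrap-around at 360 degrees; that simplification is an assumption of this sketch, and the function name is hypothetical.

```python
import math
import statistics


def frame_statistics(motion_vectors):
    """Return (I_av, I_dev, phi_av, phi_dev) for one frame.

    motion_vectors: list of (mv_x, mv_y) pairs, one per block or object.
    """
    magnitudes = [math.hypot(x, y) for x, y in motion_vectors]
    directions = [
        math.degrees(math.atan2(y, x)) % 360.0 for x, y in motion_vectors
    ]
    i_av = statistics.fmean(magnitudes)      # assumed form of Equation 3
    i_dev = statistics.pstdev(magnitudes)    # assumed form of Equation 4
    phi_av = statistics.fmean(directions)    # assumed form of Equation 5
    phi_dev = statistics.pstdev(directions)  # assumed form of Equation 6
    return i_av, i_dev, phi_av, phi_dev
```

For example, a frame with two blocks moving right by 1 and 3 pixels has a mean magnitude of 2 and a magnitude deviation of 1, while its direction deviation is 0 because both blocks move the same way.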

The motion activity descriptor for a video composed of several consecutive frames is obtained by Equations 7 to 15 below from the per-frame first-order and second-order statistics of motion parameter magnitude and direction, which are computed for each frame of the video by Equations 3 to 6. Thus, when indexing the degree of motion activity of a video in the present invention, some or all of the motion parameter statistics given by Equations 7 to 15 can be used as the motion activity descriptor.

Let T be the number of frames from which motion parameters are extracted. For the i-th of these frames, let I_{av,i} be the mean motion parameter magnitude, I_{dev,i} the second-order statistic of motion parameter magnitude, φ_{av,i} the mean motion parameter direction, and φ_{dev,i} the second-order statistic of motion parameter direction.

Equation 7 below is the mean of the per-frame mean motion parameter magnitudes obtained from each of the T frames, and is used as a statistical descriptor representing the video composed of T frames.

The feature value given by Equation 7 is the mean motion parameter magnitude over both the time and space axes. It is a statistic representing the average motion of the whole video, and can be used as a representative feature for video retrieval and representation.

Equation 8 below is the standard deviation of the per-frame mean motion parameter magnitudes obtained from each of the T frames, and is used as a statistical descriptor representing the video composed of T frames.

The feature value given by Equation 8 is the temporal activity of motion parameter magnitude: a statistical feature expressing how motion activity changes over time, which is useful for video retrieval. Because it is the standard deviation of motion parameter magnitude over time, it can also express the reliability of the first-order statistic of Equation 7. For example, a large standard deviation in Equation 8 means that the motion parameter magnitudes are widely spread around the first-order statistic of Equation 7. In other words, a large value means that the first-order statistic is less reliable. This property can be exploited to apply a weight by dividing by this value.
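One possible reading of "applying a weight by dividing by this value" is a distance between two videos in which the difference of the Equation 7 means is scaled by the Equation 8 standard deviation. The sketch below is only that reading: the function name, the choice of which video's deviation to divide by, and the small `eps` guard against division by zero are all assumptions, not part of the source.

```python
def reliability_weighted_distance(query_mean, candidate_mean, candidate_dev, eps=1e-6):
    """Difference of mean magnitudes, down-weighted when the mean is unreliable.

    A larger temporal standard deviation (candidate_dev) means the mean is
    less reliable, so the same raw difference counts for less.
    """
    return abs(query_mean - candidate_mean) / (candidate_dev + eps)
```

With this form, a raw difference of 6 counts as 3 when the deviation is 2, but as 1 when the deviation is 6.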

Equation 9 below is the mean of the per-frame standard deviations of motion parameter magnitude obtained from each of the T frames, and is used as a statistical descriptor representing the video composed of T frames.

The feature value given by Equation 9 is the spatial motion activity averaged over one or more frames, and is a statistical feature expressing the degree of motion activity in space. With this feature value, videos can be retrieved according to their degree of spatial motion activity.

Equation 10 below is the standard deviation of the per-frame standard deviations of motion parameter magnitude obtained from each of the T frames, and is used as a statistical descriptor representing the video composed of T frames.

The feature value given by Equation 10 expresses the degree of spatial motion activity and the degree of temporal motion activity together.

Equation 11 below is the mean of the per-frame mean motion parameter directions obtained from each of the T frames, and is used as a statistical descriptor representing the video composed of T frames.

The feature value given by Equation 11 is the spatial and temporal first-order statistic of motion parameter direction, and represents the motion direction of one or more videos.

Equation 12 below is the standard deviation of the per-frame mean motion parameter directions obtained from each of the T frames, and is used as a statistical descriptor representing the video composed of T frames.

The feature value given by Equation 12 expresses the degree of temporal change, that is, the degree of activity, in the direction of the motion parameters. With it, videos can be retrieved according to the amount of change in direction.

Equation 13 below is the mean of the per-frame standard deviations of motion parameter direction obtained from each of the T frames, and is used as a statistical descriptor representing the video composed of T frames.

The feature value given by Equation 13 is the mean of the spatial second-order statistics of motion parameter direction, and represents the spatial activity of the motion parameters. With this feature value, videos can be retrieved according to the rate of spatial change in direction.

Equation 14 below is the standard deviation of the per-frame standard deviations of motion parameter direction obtained from each of the T frames, and is used as a statistical descriptor representing the video composed of T frames.

The feature value given by Equation 14 expresses the degree of spatial and temporal activity of the motion direction.

The statistical feature values for the magnitude and direction of the motion parameters described above need not be obtained only by Equations 3 to 14. They may also be calculated by other methods apparent to those skilled in the art to which the present invention pertains.
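The eight temporal statistics described for Equations 7 to 14 can be sketched together in Python. The equations themselves are not reproduced in this text, so the sketch follows their prose descriptions (mean or standard deviation, over T frames, of the per-frame mean or standard deviation) and assumes the population standard deviation. The function name and the dictionary keys are hypothetical and chosen to be self-describing.

```python
import statistics


def video_statistics(per_frame):
    """Combine per-frame statistics into video-level feature values.

    per_frame: one (I_av, I_dev, phi_av, phi_dev) tuple per frame; at
    least one frame is required.
    """
    i_av, i_dev, phi_av, phi_dev = (list(column) for column in zip(*per_frame))
    return {
        "magnitude_mean_of_means": statistics.fmean(i_av),       # Equation 7
        "magnitude_std_of_means": statistics.pstdev(i_av),       # Equation 8
        "magnitude_mean_of_stds": statistics.fmean(i_dev),       # Equation 9
        "magnitude_std_of_stds": statistics.pstdev(i_dev),       # Equation 10
        "direction_mean_of_means": statistics.fmean(phi_av),     # Equation 11
        "direction_std_of_means": statistics.pstdev(phi_av),     # Equation 12
        "direction_mean_of_stds": statistics.fmean(phi_dev),     # Equation 13
        "direction_std_of_stds": statistics.pstdev(phi_dev),     # Equation 14
    }
```

Any subset of the returned values could then be kept, in line with the statement that some or all of the statistics may be used as the descriptor.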

Another element that can make up the motion activity descriptor of the present invention is the frequency of the motion parameter directions, quantized into M directions, over the entire video frame. These frequencies are ranked, and N directions are extracted in order starting from the most frequent direction vector; these are called the representative direction vectors. The quantized motion parameter directions, in descending order of frequency, can be written φ_1, φ_2, ..., φ_N.

φ_max = ⟨φ_1, φ_2, ..., φ_N⟩, N ≤ M

Using this feature provides a descriptor with which a user can retrieve videos that have motion in a desired direction. Each video, or each partial video composed of several frames, can be represented by a motion activity descriptor made up of the statistical elements of motion parameter magnitude and direction obtained above, which express its motion activity features. Such motion activity descriptors can be used to compare videos by motion activity, and can therefore be used in various applications such as video retrieval.
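The extraction of representative direction vectors can be sketched as a quantize, count, and rank procedure. The sketch assumes uniform angular bins starting at 0 degrees, skips zero-length vectors because they have no direction, and returns bin indices rather than angles; none of these choices is specified in the text, and the function and parameter names are hypothetical.

```python
import math
from collections import Counter


def representative_directions(motion_vectors, num_bins=8, top_n=3):
    """Return up to top_n direction-bin indices, most frequent first.

    num_bins plays the role of M (quantized directions) and top_n the
    role of N, with N <= M.
    """
    bin_width = 360.0 / num_bins
    counts = Counter()
    for mv_x, mv_y in motion_vectors:
        if mv_x == 0 and mv_y == 0:
            continue  # assumption: a zero vector contributes no direction
        angle = math.degrees(math.atan2(mv_y, mv_x)) % 360.0
        # The extra modulo guards against an angle that rounds to 360.0.
        counts[int(angle // bin_width) % num_bins] += 1
    return [bin_index for bin_index, _ in counts.most_common(top_n)]
```

A query for "motion toward the upper right" could then be matched against videos whose first representative bin covers 45 degrees.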

FIG. 1 is a flowchart of the motion activity feature description method of the present invention. To describe the motion activity feature of a video according to one aspect of the invention, motion parameters are first extracted from the given video (S1100). Then, with each frame of the video consisting of M blocks or objects, the statistical characteristics of motion parameter magnitude are extracted for each frame (S2100), and the statistical characteristics of motion parameter direction are extracted (S3100). The typical way to extract motion parameters in the present invention is to obtain motion vectors, but the invention is not necessarily limited to this method.

In another embodiment of the method of the present invention, the video can be hierarchized using the motion activity descriptors extracted by the above method, so that the motion activity features of the video are described hierarchically.

In the present invention, the motion parameters are preferably those coded by digital video coding. In extracting the statistical features of motion parameter magnitude and direction, either the spatial statistics of each frame can be computed first and the temporal statistics afterward, or the temporal statistics over several frames can be computed first and the spatial statistics afterward.

In the step of extracting the statistical characteristics of motion parameter magnitude (S2100), the first-order statistic of motion parameter magnitude is computed for each frame. Then either the first-order statistic of these values (e.g., I_{av,av}) or their second-order statistic (e.g., I_{dev,av}) can be used as a statistical feature value representing the whole video. In this case, the second-order statistic (I_{dev,av}) extracted as a feature value representing the whole video can serve as the reliability of the first-order statistic (I_{av,av}).

Alternatively, in the step of extracting the statistical characteristics of motion parameter magnitude (S2100), the second-order statistic of motion parameter magnitude is computed for each frame. Then either the first-order statistic of these values (e.g., I_{av,dev}) or their second-order statistic (e.g., I_{dev,dev}) can be used as a statistical feature value representing the whole video. In this case too, the second-order statistic (I_{dev,dev}) extracted as a feature value representing the whole video can serve as the reliability of the first-order statistic (I_{av,dev}).

In the step of extracting the statistical characteristics of motion parameter direction (S3100), the first-order statistic of motion parameter direction is computed for each frame. Then either the first-order statistic of these values (e.g., φ_{av,av}) or their second-order statistic (e.g., φ_{dev,av}) can be used as a statistical feature value representing the whole video. In this case, the second-order statistic (φ_{dev,av}) extracted as a feature value representing the whole video can serve as the reliability of the first-order statistic (φ_{av,av}).

Alternatively, in the step of extracting the statistical characteristics of motion parameter direction (S3100), the second-order statistic of motion parameter direction is computed for each frame. Then either the first-order statistic of these values (e.g., φ_{av,dev}) or their second-order statistic (e.g., φ_{dev,dev}) can be used as a statistical feature value representing the whole video. In this case, the second-order statistic (φ_{dev,dev}) extracted as a feature value representing the whole video can serve as the reliability of the first-order statistic (φ_{av,dev}).

In addition, the method of the present invention can use the frequency of the motion parameters in the video as the overall average direction of the video.

A concrete example of the motion activity feature description method of the present invention is as follows. When a video is input, the motion parameters are extracted, and the first-order and second-order statistics of the magnitude and direction components of the motion parameters are computed for each frame [e.g., I_av (Equation 3), I_dev (Equation 4), φ_av (Equation 5), φ_dev (Equation 6)]. The motion activity descriptor elements I_{av,av} (Equation 7), I_{av,dev} (Equation 8), I_{dev,av} (Equation 9), I_{dev,dev} (Equation 10), φ_{av,av} (Equation 11), φ_{av,dev} (Equation 12), φ_{dev,av} (Equation 13), φ_{dev,dev} (Equation 14), and φ_max (Equation 15) are then computed, and these are recorded as the motion activity descriptor together with the position of the video and its temporal length. The descriptor elements can be used selectively, depending on the application and the required precision of the motion representation.
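Recording the selected elements together with the position and temporal length of the video could look like the following Python sketch. The record layout, field names, and function name are hypothetical; the text specifies only that position, temporal length, and a selectable set of elements are stored together.

```python
from dataclasses import dataclass, field


@dataclass
class MotionActivityDescriptor:
    """Hypothetical record layout; field names are not from the source."""
    start_frame: int   # position of the video unit
    num_frames: int    # temporal length of the video unit
    elements: dict = field(default_factory=dict)  # selected statistics


def build_descriptor(start_frame, num_frames, all_elements, selected=None):
    """Keep only the selected descriptor elements (all of them if None)."""
    if selected is None:
        chosen = dict(all_elements)
    else:
        chosen = {name: all_elements[name] for name in selected}
    return MotionActivityDescriptor(start_frame, num_frames, chosen)
```

An application that only needs a coarse measure of activity might select a single magnitude element, while a retrieval system that distinguishes motion direction would also keep the direction elements.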

FIG. 2 is a block diagram of one embodiment of the motion activity feature description apparatus of the present invention. The apparatus comprises: motion parameter extraction means (1100) for extracting motion parameters from a video; means (2100) for extracting the statistical characteristics of the magnitude of the motion parameters input from the motion parameter extraction means; means (3100) for extracting the statistical characteristics of the direction of the motion parameters; and a combiner (4100) that gathers the extracted statistical characteristics to define the motion activity descriptor.

FIG. 3 is a detailed block diagram of a specific example of the motion activity feature description apparatus of FIG. 2. The motion parameters that the motion parameter extracting means (1100) extracts from the input video are input to devices 2100-1, 2100-2, 3100-1, and 3100-2. Device 2100-1 obtains I_av (Equation 3) from the motion parameter values input from the motion parameter extracting means (1100) and outputs it to devices 2100-2, 2100-3, and 2100-5. Device 3100-1 calculates φ_av (Equation 5) and outputs it to devices 3100-2, 3100-3, and 3100-5. Device 2100-2 calculates I_dev (Equation 4) from the motion parameter values input from the motion parameter extracting means (1100) and from I_av (Equation 3) input from device 2100-1, and outputs it to devices 2100-4 and 2100-6. Device 3100-2 calculates φ_dev (Equation 6) from φ_av (Equation 5) input from device 3100-1 and from the motion parameters input from the motion parameter extracting means (1100), and outputs it to devices 3100-4 and 3100-6.

Device 2100-3 calculates I_{av,av} (Equation 7) from I_av (Equation 3) input from device 2100-1 and outputs it to device 2100-5 and the combiner (4100). Device 3100-3 calculates φ_{av,av} (Equation 11) from φ_av (Equation 5) input from device 3100-1 and outputs it to device 3100-5 and the combiner (4100). Device 2100-4 calculates I_{av,dev} (Equation 8) from I_dev (Equation 4) input from device 2100-2 and outputs it to device 2100-6 and the combiner (4100). Device 3100-4 calculates φ_{av,dev} (Equation 12) from φ_dev (Equation 6) input from device 3100-2 and outputs it to device 3100-6 and the combiner (4100).

Device 2100-5 receives I_av (Equation 3) from device 2100-1 and I_{av,av} (Equation 7) from device 2100-3, calculates I_{dev,av} (Equation 9), and outputs it to the combiner (4100). Device 2100-6 receives I_dev (Equation 4) from device 2100-2 and I_{av,dev} (Equation 8) from device 2100-4, calculates I_{dev,dev} (Equation 10), and outputs it to the combiner (4100). Device 3100-5 receives φ_av (Equation 5) from device 3100-1 and φ_{av,av} (Equation 11) from device 3100-3, calculates φ_{dev,av} (Equation 13), and outputs it to the combiner (4100). Device 3100-6 receives φ_dev (Equation 6) from device 3100-2 and φ_{av,dev} (Equation 12) from device 3100-4, calculates φ_{dev,dev} (Equation 14), and outputs it to the combiner (4100).

The representative direction vector extracting means (5100) receives the motion parameters from the motion parameter extracting means (1100), calculates the representative direction vector φ_max (Equation 15), and outputs it to the combiner (4100). The combiner (4100) aggregates the outputs of the motion vector magnitude characteristic extracting means (2100) and the motion vector direction characteristic extracting means (3100) with the position and temporal length information of the video data to form the video motion activity descriptor.
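The data flow of FIG. 3 can be sketched as follows. Equations 3 to 15 are not reproduced in this part of the description, so this is only an illustration under stated assumptions: the "av" quantities are taken to be arithmetic means and the "dev" quantities population standard deviations, the second subscript is read as the per-frame statistic and the first as the statistic over frames, and directions are treated as plain angles without wrap-around handling. The function names and dictionary keys are hypothetical, and the representative direction φ_max is omitted.

```python
import math
from statistics import mean, pstdev

def frame_stats(vectors):
    """Per-frame statistics (roles of devices 2100-1, 2100-2, 3100-1, 3100-2).

    `vectors` is a list of (x, y) motion vectors for one frame.
    Assumption: mean / population standard deviation stand in for Eq. 3-6.
    """
    mags = [math.hypot(x, y) for x, y in vectors]
    dirs = [math.atan2(y, x) for x, y in vectors]
    return mean(mags), pstdev(mags), mean(dirs), pstdev(dirs)

def activity_descriptor(frames):
    """Statistics over frames (roles of devices 2100-3..6 and 3100-3..6)."""
    i_av, i_dev, phi_av, phi_dev = zip(*(frame_stats(f) for f in frames))
    return {
        "I_av,av": mean(i_av),        # assumed reading of Eq. 7
        "I_av,dev": mean(i_dev),      # assumed reading of Eq. 8
        "I_dev,av": pstdev(i_av),     # assumed reading of Eq. 9
        "I_dev,dev": pstdev(i_dev),   # assumed reading of Eq. 10
        "phi_av,av": mean(phi_av),    # assumed reading of Eq. 11
        "phi_av,dev": mean(phi_dev),  # assumed reading of Eq. 12
        "phi_dev,av": pstdev(phi_av), # assumed reading of Eq. 13
        "phi_dev,dev": pstdev(phi_dev),  # assumed reading of Eq. 14
    }
```

Each dictionary entry corresponds to one value that the combiner (4100) would receive; the combiner would additionally attach the position and temporal length of the video segment.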

FIGS. 4A through 4J are block diagrams of apparatuses for generating a motion descriptor using a cumulative motion histogram according to other embodiments of the present invention. As described below, the motion activity feature descriptor of the present invention can be obtained using a cumulative motion histogram.

As shown in FIG. 4A, the motion descriptor generation apparatus using a cumulative motion histogram according to another embodiment of the present invention comprises: a motion histogram generator (4) that generates a motion histogram for each of the input motion magnitude information and motion direction information; a cumulative motion histogram generator (5) that accumulates the motion histograms generated by the motion histogram generator (4) in a predetermined order to generate a cumulative motion histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion histogram generated by the cumulative motion histogram generator (5) and generates a motion descriptor describing the motion characteristics of each structured unit.
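The histogram and accumulation stages of FIG. 4A can be sketched as below. This is a minimal illustration, assuming that directions are quantized into equal-width bins and that accumulating "in a predetermined order" means a running bin-wise sum over frames in temporal order; the bin count and function names are hypothetical and not taken from the description.

```python
import math

def direction_histogram(vectors, bins=8):
    """Histogram of motion directions for one frame (role of generator 4).

    Assumption: equal-width bins over [0, 2*pi).
    """
    hist = [0] * bins
    width = 2 * math.pi / bins
    for x, y in vectors:
        angle = math.atan2(y, x) % (2 * math.pi)
        # min() guards against rounding that could land exactly on 2*pi.
        hist[min(int(angle / width), bins - 1)] += 1
    return hist

def cumulative_histograms(histograms):
    """Running bin-wise sum of per-frame histograms (role of generator 5)."""
    running, out = None, []
    for h in histograms:
        running = list(h) if running is None else [a + b for a, b in zip(running, h)]
        out.append(list(running))
    return out
```

A magnitude histogram could be built the same way by binning `math.hypot(x, y)` instead of the angle.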

As shown in FIG. 4B, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion estimate divergence processing unit (2) that handles divergence of the motion magnitude and direction estimates according to an external frame selection mode; a motion histogram generator (4) that generates a motion histogram for each of the motion magnitude information and motion direction information processed by the motion estimate divergence processing unit (2); a cumulative motion histogram generator (5) that accumulates the motion histograms generated by the motion histogram generator (4) in a predetermined order to generate a cumulative motion histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion histogram generated by the cumulative motion histogram generator (5) and generates a motion descriptor describing the motion characteristics of each structured unit.

As shown in FIG. 4C, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion filter (3) that performs visual filtering on the input motion magnitude information and motion direction information; a motion histogram generator (4) that generates a motion histogram for each of the motion magnitude information and motion direction information filtered by the motion filter (3); a cumulative motion histogram generator (5) that accumulates the motion histograms generated by the motion histogram generator (4) in a predetermined order to generate a cumulative motion histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion histogram generated by the cumulative motion histogram generator (5) and generates a motion descriptor describing the motion characteristics of each structured unit.

As shown in FIG. 4D, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion estimate divergence processing unit (2) that handles divergence of the estimates of the motion magnitude information and motion direction information according to an external frame selection mode; a motion filter (3) that performs visual filtering on the motion magnitude information and motion direction information processed by the motion estimate divergence processing unit (2); a motion histogram generator (4) that generates a motion histogram for each of the motion magnitude information and motion direction information filtered by the motion filter (3); a cumulative motion histogram generator (5) that accumulates the motion histograms generated by the motion histogram generator (4) in a predetermined order to generate a cumulative motion histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion histogram generated by the cumulative motion histogram generator (5) and generates a motion descriptor describing the motion characteristics of each structured unit.

As shown in FIG. 4E, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion magnitude calculator (7) that calculates the magnitude (degree) of the input motion magnitude information; a motion magnitude histogram generator (8) that generates a motion magnitude histogram from the motion magnitude information calculated by the motion magnitude calculator (7); a cumulative motion magnitude histogram generator (9) that accumulates the motion magnitude histograms generated by the motion magnitude histogram generator (8) in a predetermined order to generate a cumulative motion magnitude histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion magnitude histogram generated by the cumulative motion magnitude histogram generator (9) and generates a motion descriptor describing the motion characteristics of each structured unit.

As shown in FIG. 4F, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion direction calculator (10) that calculates the direction of the input motion direction information; a motion direction histogram generator (11) that generates a motion direction histogram from the motion direction information calculated by the motion direction calculator (10); a cumulative motion direction histogram generator (14) that accumulates the motion direction histograms generated by the motion direction histogram generator (11) in a predetermined order to generate a cumulative motion direction histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion direction histogram generated by the cumulative motion direction histogram generator (14) and generates a motion descriptor describing the motion characteristics of each structured unit.

As shown in FIG. 4G, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion magnitude estimate divergence processing unit (15) that handles divergence of the estimate of the motion magnitude information according to an external frame selection mode; a motion magnitude calculator (7) that calculates the magnitude (degree) of the motion magnitude information processed by the motion magnitude estimate divergence processing unit (15); a motion magnitude histogram generator (8) that generates a motion magnitude histogram from the motion magnitude information calculated by the motion magnitude calculator (7); a cumulative motion magnitude histogram generator (9) that accumulates the motion magnitude histograms generated by the motion magnitude histogram generator (8) in a predetermined order to generate a cumulative motion magnitude histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion magnitude histogram generated by the cumulative motion magnitude histogram generator (9) and generates a motion descriptor describing the motion characteristics of each structured unit.

As shown in FIG. 4H, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion direction estimate divergence processing unit (17) that handles divergence of the estimate of the motion direction information according to an external frame selection mode; a motion direction calculator (10) that calculates the direction of the motion direction information processed by the motion direction estimate divergence processing unit (17); a motion direction histogram generator (11) that generates a motion direction histogram from the motion direction information calculated by the motion direction calculator (10); a cumulative motion direction histogram generator (14) that accumulates the motion direction histograms generated by the motion direction histogram generator (11) in a predetermined order to generate a cumulative motion direction histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amount of change in the cumulative motion direction histogram generated by the cumulative motion direction histogram generator (14) and generates a motion descriptor describing the motion characteristics of each structured unit.

As shown in FIG. 4I, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion magnitude calculator (7) that calculates the magnitude (degree) of the input motion magnitude information; a motion magnitude histogram generator (8) that generates a motion magnitude histogram from the motion magnitude information calculated by the motion magnitude calculator (7); a cumulative motion magnitude histogram generator (9) that accumulates the motion magnitude histograms generated by the motion magnitude histogram generator (8) in a predetermined order to generate a cumulative motion magnitude histogram; a motion direction calculator (10) that calculates the direction of the input motion direction information; a motion direction histogram generator (11) that generates a motion direction histogram from the motion direction information calculated by the motion direction calculator (10); a cumulative motion direction histogram generator (14) that accumulates the motion direction histograms generated by the motion direction histogram generator (11) in a predetermined order to generate a cumulative motion direction histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amounts of change in the cumulative motion magnitude and direction histograms generated by the cumulative motion magnitude histogram generator (9) and the cumulative motion direction histogram generator (14) and generates a motion descriptor describing the motion characteristics of each structured unit.

As shown in FIG. 4J, the motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention comprises: a motion magnitude estimate divergence processing unit (15) that handles divergence of the estimate of the motion magnitude information according to an external frame selection mode; a motion magnitude calculator (7) that calculates the magnitude (degree) of the motion magnitude information processed by the motion magnitude estimate divergence processing unit (15); a motion magnitude histogram generator (8) that generates a motion magnitude histogram from the motion magnitude information calculated by the motion magnitude calculator (7); a cumulative motion magnitude histogram generator (9) that accumulates the motion magnitude histograms generated by the motion magnitude histogram generator (8) in a predetermined order to generate a cumulative motion magnitude histogram; a motion direction estimate divergence processing unit (17) that handles divergence of the estimate of the motion direction information according to the external frame selection mode; a motion direction calculator (10) that calculates the direction of the motion direction information processed by the motion direction estimate divergence processing unit (17); a motion direction histogram generator (11) that generates a motion direction histogram from the motion direction information calculated by the motion direction calculator (10); a cumulative motion direction histogram generator (14) that accumulates the motion direction histograms generated by the motion direction histogram generator (11) in a predetermined order to generate a cumulative motion direction histogram; and a motion descriptor generator (6) that structures (hierarchizes) the video into units of arbitrary size according to the amounts of change in the cumulative motion magnitude and direction histograms generated respectively by the cumulative motion magnitude histogram generator (9) and the cumulative motion direction histogram generator (14) and generates a motion descriptor describing the motion characteristics of each structured unit.

FIG. 5 is a detailed block diagram of the motion estimate divergence processing unit (2) of FIGS. 4B, 4D, 4G, 4H, and 4J.

As shown in FIG. 5, the motion estimate divergence processing unit (2) comprises: a previous image storage unit (12) in which the previous image is stored; a current image storage unit (22) in which the current image is stored; an average calculator (32) that, according to the external frame selection mode (frame_select_mode), calculates the respective averages of either the first region (MBc) of the current input image, which has a motion estimate, and the second to fourth current regions neighboring the first region and stored in the current image storage unit (22), or the first region (MBc), the previous region corresponding to the first region (MBc) stored in the previous image storage unit (12), and the second to fourth previous regions neighboring that region in the previous image; an absolute value calculator (42) that calculates the differences between the averages of the first region and the second to fourth regions, or between the averages of the first region, its previous region, and the second to fourth previous regions, as calculated by the average calculator (32), and then calculates the absolute value of each difference; and a motion vector value comparison converter (52) that compares each absolute value calculated by the absolute value calculator (42) with a preset threshold and, according to the comparison result, converts the X and Y motion vector values (MVx, MVy) input from the motion estimator (1) and outputs them as motion vector values (MVox, MVoy).

FIG. 6 is a detailed block diagram of the motion filter (3) of FIGS. 4C and 4D.

As shown in FIG. 6, the motion filter (3) comprises: a motion magnitude calculator (13) that calculates the motion magnitude from the X and Y motion vector values (MVox, MVoy) processed by the motion estimate divergence processing unit (2); a motion vector value comparison converter (23) that compares the motion magnitude calculated by the motion magnitude calculator (13) with a preset threshold and converts the motion vector values according to the comparison result; a motion direction calculator (33) that calculates the motion direction from the motion vector values converted by the motion vector value comparison converter (23); and a motion direction quantizer/dequantizer (43) that quantizes and dequantizes the motion direction calculated by the motion direction calculator (33) and outputs the motion direction value (θxy).
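The four stages of the motion filter can be sketched as follows. The description does not state here how the comparison converter (23) changes the vector or how many quantization levels are used, so this sketch assumes that vectors whose magnitude falls below the threshold are set to zero and that the direction is uniformly quantized; `threshold`, `levels`, and the function name are hypothetical.

```python
import math

def motion_filter(mv_x, mv_y, threshold=1.0, levels=8):
    """Return (magnitude, quantized direction in radians) for one motion vector."""
    # Stage 13: motion magnitude.
    magnitude = math.hypot(mv_x, mv_y)
    # Stage 23: compare with a threshold; zeroing small vectors is an assumption.
    if magnitude < threshold:
        mv_x, mv_y, magnitude = 0.0, 0.0, 0.0
    # Stage 33: motion direction in [0, 2*pi).
    angle = math.atan2(mv_y, mv_x) % (2 * math.pi)
    # Stage 43: uniform quantization followed by dequantization (assumed scheme).
    step = 2 * math.pi / levels
    index = int(round(angle / step)) % levels
    theta = index * step
    return magnitude, theta
```

Quantizing and then dequantizing snaps each direction to the nearest of `levels` representative angles, which suppresses small directional jitter before the histogram is built.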

FIG. 7 is a detailed block diagram of the motion descriptor generator (6) of FIGS. 4A to 4J.

As shown in FIG. 7, the motion descriptor generator (6) comprises: a motion histogram change calculator (161) that calculates the amount of change in the cumulative motion histogram accumulated by the cumulative motion histogram generator (5); a clip time indexer (162) that indexes the change times and the number of clips obtained from the motion histogram change calculator (161) to generate a motion clip descriptor; a comparator (163) that compares the amount of change calculated by the motion histogram change calculator (161) with a preset threshold and, according to the comparison result, enables either the motion histogram change calculator (161) or the clip time indexer (162); and a motion descriptor generating unit (36) that generates the motion descriptor using the information described by the motion clip descriptor generated by the clip time indexer (162).

The clip time indexer (162) and the comparator (163) operate only when the cumulative motion histogram generated by the cumulative motion histogram generator (5) has not been structured. When the cumulative motion histogram has already been structured, the amount of change calculated by the motion histogram change calculator (161) is provided directly to the motion descriptor generating unit (36).
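One way to picture the interplay of the change calculator (161), comparator (163), and clip time indexer (162) is the segmentation loop below. The description does not define the change measure in this section, so the sketch assumes the L1 distance between the normalized cumulative histogram before and after a frame is added, and assumes that a new clip starts whenever that distance exceeds the threshold; all names and the threshold value are hypothetical.

```python
def normalize(hist):
    """Scale a histogram so its bins sum to 1 (all zeros stay zeros)."""
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(hist)

def index_clips(histograms, threshold=0.5):
    """Split a sequence of per-frame histograms into clips.

    Returns a list of (start, end) frame indices, with `end` exclusive.
    """
    clips, start, cumulative = [], 0, None
    for t, hist in enumerate(histograms):
        if cumulative is None:
            cumulative = list(hist)
            continue
        updated = [a + b for a, b in zip(cumulative, hist)]
        # Assumed change measure: L1 distance between normalized histograms.
        change = sum(abs(a - b) for a, b in zip(normalize(cumulative), normalize(updated)))
        if change > threshold:
            clips.append((start, t))            # close the current clip
            start, cumulative = t, list(hist)   # restart accumulation
        else:
            cumulative = updated
    if cumulative is not None:
        clips.append((start, len(histograms)))
    return clips
```

The number of returned pairs plays the role of the clip count, and the pair boundaries play the role of the change times that the clip time indexer records.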

The motion descriptor generation apparatus using a cumulative motion histogram according to an embodiment of the present invention, configured as described above, and the corresponding method are described in detail below.

First, the motion estimation process is described, focusing on the block matching algorithm (BMA) shown in FIG. 8, which is the technique mainly used in current video compression standards.

To estimate motion between images, BMA requires two images: the previous image and the current image. The unit of motion estimation is a region of 16x16 pixels called a macroblock (MB). The current image is therefore divided into MBs. For the MB to be estimated in the current image (hereinafter MBc), a predefined motion estimation region (Sr) in the previous image, centered on the position corresponding to MBc, is searched for the MB-sized block of data most similar to the image data of the current MB. The motion is then expressed as the difference vector (MVx, MVy) between the position found and the position of the MB in the current image.
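The search described above can be sketched as an exhaustive block match. The description only says "most similar", so the sum of absolute differences (SAD) used here is an assumed similarity measure, and the function names, the `search` radius, and the tie-breaking rule (first minimum in scan order) are hypothetical. Images are plain lists of rows of pixel values.

```python
def sad(cur, prev, cx, cy, px, py, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(cur[cy + j][cx + i] - prev[py + j][px + i])
               for j in range(n) for i in range(n))

def block_match(cur, prev, cx, cy, n=16, search=7):
    """Full-search BMA for the block whose top-left corner is (cx, cy).

    Returns (MVx, MVy): the displacement from the block position in the
    current image to the best-matching block in the previous image.
    """
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = cx + dx, cy + dy
            # Skip candidates that fall outside the previous image.
            if 0 <= px <= width - n and 0 <= py <= height - n:
                cost = sad(cur, prev, cx, cy, px, py, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]
```

Every candidate position inside the search region is evaluated, which is why this form is slow compared with the fast search techniques mentioned next.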

In principle, the motion vector for the current MB should be estimated by comparing the data at every position in the motion estimation region. Because this takes a long time, practical applications use fast motion estimation techniques such as 3-Step Search and Spiral Search. These techniques shorten the time needed for motion estimation but do not always guarantee the optimal estimate.

With general motion estimation techniques, when all or part of the image data in the motion estimation region is identical, several positions in the region can be selected as the match, and accurate motion estimation is hindered unless this is handled appropriately. This is called divergence in motion estimation.

Next, the operation of the motion estimate divergence processing unit (2) shown in FIGS. 4B and 4D, the motion magnitude estimate divergence processing unit (15) shown in FIG. 4G, the motion direction estimate divergence processing unit (17) shown in FIG. 4H, and the motion magnitude estimate divergence processing unit (15) and motion direction estimate divergence processing unit (17) shown in FIG. 4J, together with the motion estimate divergence processing steps (S12) and (S32) shown in FIGS. 12B and 15D, the motion magnitude estimate divergence processing step (S62) shown in FIG. 15G, the motion direction estimate divergence processing step (S74) shown in FIG. 15H, and the motion magnitude estimate divergence processing step (S91) and motion direction estimate divergence processing step (S92) shown in FIG. 15J, is described in detail with reference to FIGS. 5, 9A to 9C, and 16.

Examples of divergence in motion estimation are shown in FIGS. 9A to 9C. In FIG. 9A, the entire motion estimation region has the same pixel value. In FIG. 9B, the region is divided into areas that each have the same pixel value along the vertical direction. In FIG. 9C, the region is divided into areas that each have the same pixel value along the horizontal direction.

In these cases, every position in the motion estimation region of FIG. 9A has the same similarity, while positions along the vertical direction in FIG. 9B and along the horizontal direction in FIG. 9C have the same similarity. When many positions have the same similarity in this way, accurate motion estimation is not possible. This is what is meant by divergence of motion estimation.
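The ambiguity can be made concrete by counting how many candidate positions tie for the best match. This sketch assumes the sum of absolute differences as the similarity measure and a search window that lies entirely inside the image; the function name and the small test images are hypothetical.

```python
def count_best_matches(cur, prev, cx, cy, n, search):
    """Count candidate positions that share the minimum matching cost."""
    costs = []
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            costs.append(sum(abs(cur[cy + j][cx + i] - prev[cy + dy + j][cx + dx + i])
                             for j in range(n) for i in range(n)))
    return costs.count(min(costs))

# FIG. 9A-like case: a completely flat image.
flat = [[10] * 8 for _ in range(8)]
# FIG. 9B-like case: pixel values constant along the vertical direction.
columns = [[x * 10 for x in range(8)] for _ in range(8)]

ties_flat = count_best_matches(flat, flat, 3, 3, n=2, search=2)
ties_columns = count_best_matches(columns, columns, 3, 3, n=2, search=2)
```

With a search radius of 2 there are 25 candidates. In the flat image all of them tie, and in the column-striped image every vertical displacement ties, so the estimator has no basis for choosing one motion vector over another.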

The present invention proposes the following two methods for handling divergence of the motion estimate.

The first method uses the DC (direct current) values of the regions (MB1, MB2, MB3) neighboring the region (MBc) of the current image that carries the input motion estimate. The second method uses the DC values of the region in the previous image at the same position as the region carrying the current motion estimate, and of its surrounding regions. Their spatial relationship is shown in FIG. 9.

The first method cannot determine exactly whether an estimated motion value has diverged. However, because adjacent images are acquired a short time apart and are therefore highly redundant, the characteristics of the previous image can be analyzed to some extent by using the current image in place of the previous image that was actually used for motion estimation. This method also suits applications that cannot store the whole previous image because of limited storage capacity. The second method uses information from the previous image and can therefore determine divergence of the motion estimate much more accurately than the first.

FIG. 10 shows the spatial relationship between the current region and its surrounding regions, for explaining the divergence processing performed by the motion estimate divergence processing units of FIGS. 4B, 4D, 4F, 4G, and 4J. The DC of each region carrying a motion estimate (MB1, MB2, MB3, MBc, and MBp) is used for divergence processing because the DC value is the mean of the region: it represents the region as a whole while being less sensitive to local differences (noise) within it. The DC is calculated using Equation 16 as follows.

Here, Pi is the value of the i-th pixel in the region carrying the motion estimate, N is the number of pixels in the region, S is the sum of all pixel values in the region, and DC is the mean of the pixel values in the region.
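The definitions of S and DC given above can be illustrated with a minimal sketch. The function name `region_dc` and the use of a flat list of pixel values are assumptions made for illustration only and do not appear in the original description.

```python
def region_dc(pixels):
    """Return (S, DC) for one region.

    pixels: flat sequence of pixel values P_i of the region (hypothetical layout).
    S is the sum of all pixel values and DC = S / N is their mean.
    """
    n = len(pixels)  # N: number of pixels in the region
    if n == 0:
        raise ValueError("region must contain at least one pixel")
    s = sum(pixels)  # S: sum of all pixel values
    return s, s / n  # DC: mean pixel value


s, dc = region_dc([10, 20, 30, 40])
```

Because DC is a mean over the whole region, a single noisy pixel shifts it by only 1/N of the pixel's error, which matches the stated reason for using it.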

Here, MBp denotes a region of arbitrary size in the previous image that has the same spatial position as MBc. Depending on the frame selection mode (frame_select_mode) of FIG. 5, MB1, MB2, and MB3 denote the surrounding regions that spatially neighbor MBc in either the current image or the previous image. With the first divergence processing method (frame_select_mode set to current-image selection mode), MB1, MB2, and MB3 are the regions surrounding MBc in the current image; with the second method (frame_select_mode set to previous-image selection mode), they are the regions in the previous image that neighbor the spatial position of MBc. The present invention places no particular restriction on the size (Sr) of the motion estimation region or on the estimation method, since both depend on the application. The region size need not be fixed, but using a predefined size is computationally advantageous.

FIG. 5 is a detailed block diagram of the motion estimate divergence processing unit (2) that performs the divergence processing method described so far.

Here, the frame selection mode (frame_select_mode) is an external input that selects the image to be used, that is, the first or the second method, and MVx and MVy are the horizontal and vertical components of the motion vector estimated for MBc by the motion estimation unit (1). Let the DC values of MB1, MB2, MB3, MBc, and MBp calculated by the average value calculation unit (32) be MB1_DC, MB2_DC, MB3_DC, MBc_DC, and MBp_DC, respectively. The operation of the motion estimate divergence processing unit (2) of FIG. 5 and the motion estimate divergence processing steps (S13 to S93) of FIG. 16 on the estimated motion vector are then as follows.

First, when the frame selection mode (frame_select_mode) indicates the current-image selection mode (S33), that is, the first divergence processing method, MB1_DC, MB2_DC, MB3_DC, and MBc_DC calculated by the average value calculation unit (32) are the means of the respective regions in the current image (S43).

First, if the absolute differences between MBc_DC and MB2_DC and between MBc_DC and MB3_DC, calculated by the absolute value calculation unit (42) (S53), are both smaller than the defined threshold THO, the motion vector value comparison/conversion unit (52) regards the region as having no motion and sets both MVox and MVoy to 0 (S63).

Second, if the first case does not apply and the absolute difference between MBc_DC and MB3_DC calculated by the absolute value calculation unit (42) (S53) is smaller than the defined threshold THO, the motion vector value comparison/conversion unit (52) regards the region as having no motion in the horizontal direction, sets MVox to 0, and outputs MVy unchanged as MVoy (S63).

Third, if neither the first nor the second case applies and the absolute difference between MBc_DC and MB2_DC calculated by the absolute value calculation unit (42) (S53) is smaller than the defined threshold THO, the motion vector value comparison/conversion unit (52) regards the region as having no motion in the vertical direction, sets MVoy to 0, and outputs MVx unchanged as MVox (S63).

Fourth, if none of the first to third cases applies, the motion vector value comparison/conversion unit (52) regards the motion estimate as not having diverged and outputs MVx and MVy unchanged as MVox and MVoy (S63).
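The four cases of the first (current-image) method can be sketched as a single decision function. This is only an illustrative sketch: the function name `diverge_current_frame` and its argument order are hypothetical, and the DC values are assumed to have been computed beforehand.

```python
def diverge_current_frame(mvx, mvy, mbc_dc, mb2_dc, mb3_dc, th0):
    """Divergence handling in current-image mode (illustrative sketch).

    Returns (MVox, MVoy). th0 stands for the threshold called THO in the text.
    """
    d2 = abs(mbc_dc - mb2_dc)  # |MBc_DC - MB2_DC|
    d3 = abs(mbc_dc - mb3_dc)  # |MBc_DC - MB3_DC|
    if d2 < th0 and d3 < th0:
        return 0, 0            # case 1: no motion at all
    if d3 < th0:
        return 0, mvy          # case 2: no horizontal motion
    if d2 < th0:
        return mvx, 0          # case 3: no vertical motion
    return mvx, mvy            # case 4: estimate has not diverged


mvox, mvoy = diverge_current_frame(3, 4, 100, 101, 130, 5)
```

The cases are tested in order, so each later branch is reached only when all earlier conditions fail, as the text requires.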

Meanwhile, when frame_select_mode indicates the previous-image selection mode, that is, the second divergence processing method, MBc_DC calculated by the average value calculation unit (32) is the mean of the region in the current image, and MBp_DC, MB1_DC, MB2_DC, and MB3_DC are the means of the respective regions in the previous image (S73).

First, if the absolute difference between MBc_DC and MBp_DC calculated by the absolute value calculation unit (42) (S83) is smaller than the defined threshold THO, the motion vector value comparison/conversion unit (52) regards the region as having no motion and sets both MVox and MVoy to 0 (S93).

Second, if the first case does not apply and the absolute differences between MBc_DC and MB2_DC and between MBc_DC and MB3_DC, calculated by the absolute value calculation unit (42) (S83), are both smaller than the defined threshold THO, the motion vector value comparison/conversion unit (52) regards the region as having no motion and sets both MVox and MVoy to 0 (S93).

Third, if neither the first nor the second case applies and the absolute difference between MBc_DC and MB3_DC calculated by the absolute value calculation unit (42) (S83) is smaller than the defined threshold THO, the motion vector value comparison/conversion unit (52) regards the region as having no motion in the horizontal direction, sets MVox to 0, and outputs MVy unchanged as MVoy (S93).

Fourth, if none of the first to third cases applies and the absolute difference between MBc_DC and MB2_DC calculated by the absolute value calculation unit (42) (S83) is smaller than the defined threshold THO, the motion vector value comparison/conversion unit (52) regards the region as having no motion in the vertical direction, sets MVoy to 0, and outputs MVx unchanged as MVox (S93).

Fifth, if none of the first to fourth cases applies, the motion vector value comparison/conversion unit (52) regards the motion estimate as not having diverged and outputs MVx and MVy unchanged as MVox and MVoy (S93).
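The five cases of the second (previous-image) method differ from the first method only by the leading comparison against MBp_DC. A minimal sketch follows; the function name `diverge_previous_frame` and its signature are hypothetical, and MB2_DC and MB3_DC are assumed here to have been measured in the previous image.

```python
def diverge_previous_frame(mvx, mvy, mbc_dc, mbp_dc, mb2_dc, mb3_dc, th0):
    """Divergence handling in previous-image mode (illustrative sketch).

    Returns (MVox, MVoy). th0 stands for the threshold called THO in the text.
    """
    if abs(mbc_dc - mbp_dc) < th0:
        return 0, 0            # case 1: co-located region unchanged
    d2 = abs(mbc_dc - mb2_dc)  # |MBc_DC - MB2_DC|
    d3 = abs(mbc_dc - mb3_dc)  # |MBc_DC - MB3_DC|
    if d2 < th0 and d3 < th0:
        return 0, 0            # case 2: no motion at all
    if d3 < th0:
        return 0, mvy          # case 3: no horizontal motion
    if d2 < th0:
        return mvx, 0          # case 4: no vertical motion
    return mvx, mvy            # case 5: estimate has not diverged


mvox, mvoy = diverge_previous_frame(3, 4, 100, 140, 120, 102, 5)
```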

In the divergence processing methods above, the differences between MBc_DC and MBp_DC, MB1_DC, MB2_DC, and MB3_DC are used for the following reason. When the absolute DC difference is below the defined threshold, the image in the motion estimation region for MVx and MVy is very likely identical or nearly identical to MBc overall, horizontally, or vertically (that is, no motion has occurred). A motion vector that exists in that situation is regarded as an erroneous motion estimate, and divergence processing is performed using whichever of the two methods is appropriate.

Next, the operation of the motion filter (3) shown in FIGS. 4C and 4D and the motion filtering steps (S22), (S33) shown in FIGS. 15C and 16D will be described in detail with reference to FIGS. 6 and 17.

Motion filtering reflects human visual perception in order to improve subjective similarity with respect to motion, and extracts the dominant motion in the image. Its purpose is to reduce the data needed to represent motion by making appropriate use of the limits of visual perception. Motion filtering consists of a motion magnitude filtering step and a motion direction filtering step.

The motion magnitude filtering step (S14 to S54), shown in FIG. 17, is described in detail as follows.

When the motion vector values (MVox, MVoy) processed by the motion estimate divergence processing unit (2) are input (S14), the motion magnitude calculation unit (13) calculates the motion magnitude (Lmv) using Equation 17 below (S11).

The motion vector value comparison/conversion unit (23) determines whether the calculated motion magnitude (Lmv) is smaller than the defined threshold (THL). If it is (S34), then even if motion actually occurred in the image, the unit regards it either as too small for a human to perceive visually or as random noise introduced during image acquisition or processing, and sets the estimated motion values MVfx and MVfy to 0 (S44).

If, on the other hand, the calculated motion magnitude (Lmv) is greater than the defined threshold (THL) (S34), the motion vector value comparison/conversion unit (23) sets the estimated motion values MVfx and MVfy to the values of the motion vector (MVx, MVy) (S54).
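The magnitude filtering step can be sketched as follows. Equation 17 is not reproduced in this text, so the sketch assumes, purely for illustration, that Lmv is the Euclidean length of the input vector. The function name `filter_magnitude` is hypothetical, and passing a vector through when Lmv equals THL is also an assumption, since the text specifies only the "smaller" and "greater" cases.

```python
import math


def filter_magnitude(mv_x, mv_y, thl):
    """Magnitude filtering (illustrative sketch).

    Returns (MVfx, MVfy). Lmv is ASSUMED to be sqrt(mv_x**2 + mv_y**2).
    """
    lmv = math.hypot(mv_x, mv_y)  # assumed form of the motion magnitude Lmv
    if lmv < thl:
        return 0, 0               # imperceptible motion or noise: suppress
    return mv_x, mv_y             # perceptible motion: keep the vector


mvfx, mvfy = filter_magnitude(1, 1, 2.0)
```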

As noted in the description of the prior art, this is because subjective similarity is an important factor affecting retrieval performance in multimedia data retrieval. The threshold (THL) for magnitude filtering can be derived experimentally or statistically according to the characteristics of human vision or of the application.

Next, the motion direction filtering step (S64 to S74) is described in detail. As with motion magnitude filtering, its purpose is to reflect the limits of human visual perception of motion and to exploit those limits to reduce the amount of data needed to represent the motion in the image.

To reflect the visual limits on motion, the motion direction calculation unit (33) calculates the motion direction data (θxy) from the motion vector values MVfx and MVfy converted by the motion vector value comparison/conversion unit (23), as in Equation 18 below (S64).

Before the histogram is generated, the motion direction quantization/dequantization unit (43) quantizes and dequantizes the direction data (θxy) calculated by the motion direction calculation unit (33), as in Equation 19 below (S74). The quantization may be linear, or it may be a nonlinear method that reflects visual characteristics.

Here, q is the quantization factor for direction, p is the quantization factor for magnitude, θxy is the direction of (MVfx, MVfy), and R(x) is an integer not greater than x.
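A linear direction quantizer of the kind described can be sketched as follows. Equations 18 and 19 are not reproduced in this text, so every formula in the sketch is an assumption: θxy is taken as the angle of (MVfx, MVfy) in degrees in [0, 360), R(x) is taken as the floor function, and quantization followed by dequantization is taken as R(θxy / q + 0.5) · q. The function name `quantize_direction` is hypothetical.

```python
import math


def quantize_direction(mvfx, mvfy, q):
    """Quantize then dequantize a motion direction (illustrative sketch).

    ASSUMED formulas: theta = atan2(mvfy, mvfx) in degrees within [0, 360),
    result = floor(theta / q + 0.5) * q, wrapped back into [0, 360).
    A zero vector has no direction; atan2(0, 0) simply yields 0 here.
    """
    theta = math.degrees(math.atan2(mvfy, mvfx)) % 360.0
    index = math.floor(theta / q + 0.5)  # R(x): largest integer <= x (assumed)
    return (index * q) % 360             # dequantized direction


direction = quantize_direction(10, 1, 45)  # about 5.7 degrees, snaps to 0
```

With q = 45° this maps every vector onto one of eight directions, which is the setting used for the example of FIG. 13.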

Next, the operation of the motion histogram generation unit (4) shown in FIGS. 4A to 4D, the motion magnitude histogram generation unit (8) shown in FIGS. 4E and 4G, the motion direction histogram generation unit (11) shown in FIGS. 4F and 4H, and the motion magnitude histogram generation unit (8) and motion direction histogram generation unit (11) shown in FIGS. 4I and 4J, together with the motion histogram generation steps (S2), (S13), (S23), (S342) shown in FIGS. 15A to 12D, the motion magnitude histogram generation steps (S43), (S64) shown in FIGS. 15E and 15G, the motion direction histogram generation steps (S53), (S74) shown in FIGS. 15F and 15H, and the motion magnitude histogram generation steps (S83), (S95) and motion direction histogram generation steps (S84), (S96) shown in FIGS. 15I and 15J, will be described in detail with reference to FIG. 11.

A histogram can represent the overall statistical characteristics of the data under analysis hierarchically, in 2-D, 3-D, and so on, and is therefore widely used in fields such as image signal processing and pattern recognition. The present invention uses a motion histogram to describe the statistical characteristics of the motion in a video.

The motion histogram proposed by the present invention takes the motion information of the content in a video (motion direction (θxy), motion magnitude (Lmv), and other parameters representing motion), divides the range of that information into a number of groups (bins) using either the quantization/dequantization method of the motion filtering technique above or a general method, and records how frequently information falling in each group occurs. The motion histogram can be obtained as expressed in Equation 20.

H(MVi) = SMVi / M

Here, SMVi is the total number of occurrences of the i-th group of the motion information to be represented by the histogram, M is the total number of occurrences of the motion data to be represented by the histogram, and H(MVi) is the probability that motion information of the i-th group occurs. SMV is a generic term for motion direction, motion magnitude, or any other parameter representing motion.
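The relation H(MVi) = SMVi / M can be illustrated with a short sketch. The function name `motion_histogram` and the representation of the input as a list of bin indices (one per region) are assumptions made for illustration.

```python
from collections import Counter


def motion_histogram(bin_indices, num_bins):
    """Normalized motion histogram H(MVi) = SMVi / M (illustrative sketch).

    bin_indices: one quantized bin index per region (hypothetical layout).
    Returns a list of num_bins relative frequencies.
    """
    m = len(bin_indices)           # M: total number of occurrences
    if m == 0:
        return [0.0] * num_bins    # no motion data: empty histogram
    counts = Counter(bin_indices)  # SMVi: occurrences of each group
    return [counts.get(i, 0) / m for i in range(num_bins)]


h = motion_histogram([0, 0, 1, 3], 4)
```

Because each value is a relative frequency, the bins of a non-empty histogram sum to 1, consistent with H(MVi) being described as a probability.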

The motion histogram makes it possible to analyze and represent the overall, characteristic flow and pattern of motion in an image. FIG. 11 is a motion histogram of the motion directions within one image; the thicker a line, the greater the number of regions in the image that move in the corresponding direction. The 2-D motion histogram of FIG. 11 is therefore useful for describing the overall statistical characteristics of each representative image in a structured video, or of a video clip of arbitrary length. A 2-D motion histogram, however, cannot represent the detailed flow of motion information within a video clip, so the present invention describes motion features using the 3-D cumulative motion histogram shown in FIG. 12.

Next, the operation of the cumulative motion histogram generation unit (5) shown in FIGS. 4A to 4D, the cumulative motion magnitude histogram generation unit (9) shown in FIGS. 4E and 4G, the cumulative motion direction histogram generation unit (14) shown in FIGS. 4F and 4H, and the cumulative motion magnitude histogram generation unit (9) and cumulative motion direction histogram generation unit (14) shown in FIGS. 4I and 4J, together with the cumulative motion histogram generation steps (S3), (S14), (S24), (S35) shown in FIGS. 15A to 15D, the cumulative motion magnitude histogram generation steps (S44), (S65) shown in FIGS. 15E and 15G, the cumulative motion direction histogram generation steps (S54), (S75) shown in FIGS. 15F and 15H, and the cumulative motion magnitude histogram generation steps (S85), (S97) and cumulative motion direction histogram generation steps (S86), (S98) shown in FIGS. 15I and 15J, will be described in detail with reference to FIGS. 12 and 13.

The cumulative motion histogram generation unit (5), the cumulative motion magnitude histogram generation unit (9), and the cumulative motion direction histogram generation unit (14) accumulate the motion histograms described above in a predetermined order to generate a 3-D motion histogram, which is used to represent the motion features of the video. The present invention calls this a cumulative motion histogram. Its form can be represented as shown in FIG. 12. In FIG. 12, fmv is the motion information of each image, such as motion magnitude or direction, and F = {fmv(1), ..., fmv(n)} denotes a set of images of arbitrary size that carry motion information. H(x) is the motion histogram value for each item of motion information, that is, the value of a bin of the cumulative histogram.
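The accumulation described above can be sketched as stacking one normalized histogram per image in temporal order. This is an illustrative sketch only: the function names are hypothetical, temporal order is assumed as the "predetermined order", and each image is assumed to be given as a list of bin indices.

```python
def frame_histogram(bin_indices, num_bins):
    """Normalized histogram of one image (relative frequency per bin)."""
    counts = [0] * num_bins
    for b in bin_indices:
        counts[b] += 1
    total = len(bin_indices)
    return [c / total if total else 0.0 for c in counts]


def cumulative_motion_histogram(frames, num_bins):
    """Stack per-image histograms in temporal order (illustrative sketch).

    frames: list of per-image bin-index lists, i.e. F = {fmv(1), ..., fmv(n)}.
    Returns rows indexed by time, columns indexed by motion bin; the stored
    value H(x) supplies the third dimension.
    """
    return [frame_histogram(f, num_bins) for f in frames]


cum = cumulative_motion_histogram([[0, 0], [1], []], 2)
```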

In generating the cumulative motion histogram, which motion information is used and what is used as the reference for the y-axis in the accumulation are choices that depend on the application.

By using the motion filtering technique proposed in the present invention, the cumulative motion histogram can properly reflect human visual characteristics, and it can represent the flow and pattern of motion over an entire video or over a specific time span using a small amount of motion feature information. As an example, FIG. 13 visualizes the cumulative motion histogram generated for a video of 35 images showing "a scene in which people and cars pass along a quiet street, a series of explosions then occurs for some reason on the ground and in the sky, people make local movements to escape them, and another explosion follows," with motion filtering applied to the motion direction information using q = 45° in Equation 19.

Here, the x-axis represents the temporal position of each image in the video, and the y-axis represents the motion direction information of each image. As FIG. 13 shows, the cumulative motion histogram can clearly represent the flow and pattern of motion over a whole scene, even one with motion as complex as in this example.

As described above, a video usually consists of many images, so the amount of data in the motion histograms and cumulative motion histograms, which represent the motion features of each image in 2-D or 3-D, grows in proportion to the number of images. The time required for retrieval grows as well. To address these problems, the present invention proposes a method of generating a motion descriptor that incorporates the histogram-based motion description method described above and describes video motion features in a way that supports retrieval at multiple levels (search/retrieval from coarse to fine level). The proposed motion descriptor is organized hierarchically in multiple levels, and the retrieval level can be selected according to the user or the application.

Next, the operation of the motion descriptor generation unit (6) shown in FIGS. 4A to 4J and the motion descriptor generation steps (S4), (S15), (S25), (S36), (S45), (S55), (S66), (S76), (S87), (S99) shown in FIGS. 15A to 15J will be described in detail with reference to FIGS. 7, 14, 18A, and 18B.

In general, a video may consist of far more images than the scene in the example above, and the amount of data in the cumulative motion histogram representing its motion features increases accordingly. To cope with the resulting increase in retrieval time, the present invention proposes a motion descriptor for efficient indexing of the cumulative motion histogram. The proposed motion descriptor analyzes the features of the cumulative motion histogram, divides it into clips, each covering an interval with similar features, and reflects the motion features of the cumulative motion histogram data contained in each clip effectively and flexibly, thereby enabling fast retrieval. The overall flow of the motion descriptor generation process is shown in FIG. 18A.

In detail, when the cumulative motion histogram accumulated by the motion histogram accumulation unit (5) is input (S17), the motion histogram change calculation unit (161) calculates the amount of change in the cumulative motion histogram, and the clip time indexing unit (162) indexes the motion histogram change time and the number of clips calculated by the motion histogram change calculation unit (161) (S27) to generate motion clip descriptors (S37).

The motion descriptor generator (36) then generates the motion descriptor using the information described by the motion clip descriptors generated by the clip time indexing unit (162) (S47).

The motion descriptor is described in more detail as follows. The motion descriptor divides the cumulative motion histogram representing the entire video into clips according to its amount of change over time, and describes the motion features of each clip. Here, a clip is represented by feature information about the histogram data lying between temporal positions at which the amount of change (ΔHt) exceeds a defined threshold (THΔH). The threshold (THΔH) on the amount of change (ΔHt) used in clipping may differ depending on the application of the present invention, and it can be determined experimentally or statistically. The clipping process is shown in FIG. 18B.

In detail, when the cumulative motion histogram accumulated by the motion histogram accumulation unit (5) is input in the form of Equation 21 below (S17), the motion histogram change calculation unit (161) calculates the amount of change (ΔHt) in the motion histogram as in Equation 22 below (S271).

The comparison unit (163) compares the amount of change (ΔHt) calculated by the motion histogram change calculation unit (161) with the preset threshold (THΔH) (S272). If the threshold (THΔH) is greater than the amount of change (ΔHt), it enables the motion histogram change calculation unit (161) so that step S271 is repeated; if the threshold (THΔH) is equal to or smaller than the amount of change (ΔHt), it enables the clip time indexing unit (162).

Once enabled by the comparison unit (163), the clip time indexing unit (162) indexes the clip time by sequentially incrementing the change time (t) of the amount of change calculated by the motion histogram change calculation unit (161) and the clip count (c) (S273).
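The clipping loop of steps S271 to S273 can be sketched as follows. Equations 21 and 22 are not reproduced in this text, so the sketch assumes, for illustration only, that ΔHt is the sum of absolute bin differences between the histograms at times t and t-1. The function name `segment_clips` and the half-open (start, end) representation of a clip are hypothetical.

```python
def segment_clips(cum_hist, th):
    """Split a cumulative motion histogram into clips (illustrative sketch).

    cum_hist: list of per-image histograms in temporal order.
    th: threshold on the change amount (TH-delta-H in the text).
    ASSUMED change measure: sum of absolute bin differences between
    consecutive histograms. Returns half-open (start, end) index pairs.
    """
    if not cum_hist:
        return []
    boundaries = [0]
    for t in range(1, len(cum_hist)):
        delta = sum(abs(a - b) for a, b in zip(cum_hist[t], cum_hist[t - 1]))
        if delta >= th:           # threshold <= change: index a new clip time
            boundaries.append(t)
    boundaries.append(len(cum_hist))
    return list(zip(boundaries[:-1], boundaries[1:]))


clips = segment_clips([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]], 1.0)
```

In this sketch every histogram falls into exactly one clip; the overlapping and hierarchical clips that the description also permits are not modeled.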

FIG. 14 shows an example in which the data of FIG. 13 is clipped by the method described above. In FIG. 14, the cumulative motion histogram is divided into eight clips. Each clip holds feature information on the frequency of each of the eight directions and on the average motion magnitude for each direction, and indicates the duration (Duration) on the time axis of the portion of the cumulative motion histogram that the clip represents. No restriction is placed on the size of a clip or on the degree to which the histogram intervals represented by different clips overlap. A clip may also be subdivided and represented hierarchically. However, for retrieval accuracy and validity of representation, every motion histogram that represents the video must belong to at least one clip.

The motion descriptor described so far is generated through the process shown in FIG. 18A and may be expressed as follows. In the syntax, xbits, which denotes the length of each item of information described by the motion descriptor (MotionDescriptor), is an arbitrary number of bits needed to represent that item. Because the present invention does not concern an efficient compression method for motion feature information, and because the number of bits needed to represent each item may differ by application, xbits is neither limited nor defined here.

The information described by the motion clip descriptor (MotionClipDescriptor) generated by the motion descriptor generator (36) shown in FIG. 7 is as follows.

The motion descriptor (MotionDescriptor) is the top-level descriptor among the motion descriptors proposed by the present invention and is expressed as follows.

The MotionDescriptor consists of the following: video_id, a VideoID identifying the video; time_des, a TimeDescriptor indicating the temporal position of the video described by the MotionDescriptor; level, a MotionDescriptionLevel indicating the level at which the motion feature information is described; direction_des, a MotionDirectionDescriptor describing 2-D or 3-D feature information on motion direction; intensity_des, a MotionIntensityDescriptor describing 2-D or 3-D feature information on motion magnitude; flag_used_des, selection flags for direction and magnitude that are useful in video retrieval; mot_sub_des, a MotionDescriptor describing the motion features of the next level; flag_exist_sub_des, which indicates whether mot_sub_des is present; and n, a NumberOfSubDescriptor giving the number of mot_sub_des entries.

In the present invention, the motion features of structuring units such as the video, story, scene, shot, segment, and subsegment mentioned in the video structuring above, or of the representative images of those units, are described statistically. Indexing uses the mean of the motion direction and magnitude expressed by the MotionDescriptor, the central moments (Central Moment) of the mean and of the second-order statistic, and 2-D/3-D cumulative motion histograms of the motion data. The retrieval descriptor built on these is shown in Table 1.

[Table 1] Motion feature descriptor (MotionDescriptor)
Video identifier (VideoID): video_id
Description range identifier (TimeDescriptor): time_des
Direction descriptor (DirectionDescriptor): direction_des
Intensity descriptor (IntensityDescriptor): intensity_des
Additional descriptor (MotionDescriptor): sub_des[1] … sub_des[n]

The additional descriptor in Table 1 is a recursive syntax element. It is used when the motion features of structured video need to be expressed hierarchically, as in the conceptual representation of the MotionDescriptor (MD) according to video data structuring in Table 2, or when more detailed features need to be indicated.

VideoID, the identifier of the video whose motion features are described by the MotionDescriptor, is described as in Table 3 so that, for example, different versions with the same name can be distinguished.

[Table 3] Video identifier (VideoID)
Video name (VideoName): title
Credits (VideoCredits): credit
Date (VideoDate): date
Version (VideoVersions): version

The video identifier (VideoID) in Table 3 consists of title, which describes the video name (VideoName); credit, a VideoCredits field that describes the producer of the video, the source of the material, the provider, and so on; date, which describes the production date (VideoDate); and version, which describes the version (VideoVersion).

In the present invention, the fields of Table 3 may be used selectively as the situation requires, and an existing video identifier other than the one in Table 3 may also be used. Because the appropriate identifier can vary with the application field of the proposed motion feature descriptor, no restriction is placed on how the video identifier is described within the motion descriptor (MotionDescriptor). Table 4 gives the syntax of the video identifier of Table 3.

Next, the description range identifier (TimeDescriptor) is classified as a sequential range description (Sequential TimeDescription) or a random range description (Random TimeDescription), depending on whether the motion descriptor represents all of the motion data within the description range.

The sequential range descriptor (Sequential Time Descriptor) means that the proposed motion descriptor (MotionDescriptor) represents all of the motion data and features within the description range. It consists of the start time (Start_Time), end time (End_Time), and duration (Duration) of the represented motion data. Duration is the interval between the start and end times and gives the time span of the entire cumulative histogram data. Because any one of these follows from the other two, the end time or the duration may be used optionally.

The random range descriptor (Random Time Descriptor) means that only part of the motion data and features within the description range is represented, and it is defined as in Table 5. For efficient representation of the description range identifier, the sequential and random range descriptors may be used together.

[Table 5] Description range identifier (TimeDescriptor)
Sequential range description (Sequential TimeDescription)
    Start time (Start_Time): start
    Duration (Duration): duration
    End time (End_Time): end
Random range description (Random TimeDescriptor)
    Temporal position (TemporalPosition): s[1] … s[n]

In the description range identifier syntax of Table 6, the usedSequential flag (Flag) field indicates whether the sequential range description is used, and the usedRandom flag (Flag) field indicates whether the random range description is used. NumberOfPosition is the total number of positions represented in the random range description.

In the present invention, the fields of Table 5 may be used selectively as the situation requires when the description range identifier is used within the motion descriptor (MotionDescriptor), and an existing efficient video description range identifier other than the one in Table 5 may also be used.

Next, the direction descriptor (DirectionDescriptor) expresses the statistical characteristics of motion direction for the motion data of each image, or of all images, within the description range intended by the description range identifier. It consists of the mean direction, the central moments (Central Moment) of the mean and of the second-order statistic, the dominant directions, the cumulative motion histogram, and data on direction. As noted above, two-dimensional motion information basically consists of direction and magnitude and can be expressed as a motion vector (MV), MV = (MVx, MVy), where MVx is the horizontal motion component and MVy is the vertical motion component.

The direction of motion (ψ) of a motion vector can be obtained by Equation 2 below.

[Equation 2]

Many methods of computing direction from a motion vector may exist, and no restriction is placed on which is used. In Equation 2, (MVxk, MVyk) is the motion information of the k-th region when one image is divided into M regions of arbitrary size.
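As a hedged illustration of computing a direction per region, the sketch below uses the two-argument arctangent. Equation 2 is not reproduced in this text, so the choice of `math.atan2`, the use of degrees, and the mapping into [0, 360) are assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def motion_direction(mvx: float, mvy: float) -> float:
    """Direction of one motion vector in degrees, mapped into [0, 360)."""
    # atan2 handles all four quadrants; a zero vector yields 0.0.
    return math.degrees(math.atan2(mvy, mvx)) % 360.0


def region_directions(vectors: List[Tuple[float, float]]) -> List[float]:
    """Directions for the M regions of one image: (MVx_k, MVy_k) -> psi_k."""
    return [motion_direction(mvx, mvy) for mvx, mvy in vectors]


print(region_directions([(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]))
```

Whether a positive MVy points up or down depends on the image coordinate convention, which this sketch leaves to the caller.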

ρi is the mean motion direction of the i-th image in the video and can be obtained by Equation 23. Many methods other than Equation 23 exist for computing the mean motion direction.

θ1 (MeanOfClipMotionHistogram) is the mean of the per-image mean directions (ρ) over the T images within the range intended by the description range descriptor in the motion descriptor, and can be obtained by Equation 24. Many methods other than Equation 24 exist for computing this mean. MeanOfClipMotionHistogram is thus the mean of the entire cumulative motion histogram with respect to motion direction.

Here, T need not equal the total number of images in the description range.

The central moments (Central Moment) of motion direction characterize the temporal distribution and degree of distortion of the per-image mean directions (ρ) about the mean direction (θ1), and may include a p-th order moment (θp). The p-th order moment (θp) can be obtained by Equation 25. Many methods other than Equation 25 exist for computing it.
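The relationship between the per-image means (ρ), their mean (θ1), and the p-th order moment (θp) can be sketched as below. Equations 24 and 25 are not reproduced in this text, so this sketch assumes the plain arithmetic mean and the textbook central moment, that is, the mean of (ρi − θ1)^p; the equations of the invention may normalize differently. It also treats direction as an ordinary number and ignores angular wrap-around. The function names are hypothetical.

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean, standing in for theta_1 over the per-image means."""
    return sum(values) / len(values)


def central_moment(values: Sequence[float], order: int) -> float:
    """Textbook p-th central moment: mean of (x - mean)^p."""
    m = mean(values)
    return sum((v - m) ** order for v in values) / len(values)


rho = [10.0, 20.0, 30.0, 40.0]  # hypothetical per-image mean directions
theta_1 = mean(rho)
theta_2 = central_moment(rho, 2)  # spread over time
theta_3 = central_moment(rho, 3)  # asymmetry over time
print(theta_1, theta_2, theta_3)
```

The same two functions apply unchanged to the magnitude means (λ) when the corresponding magnitude statistics are needed.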

The direction standard deviation (στi), which expresses the spatial distribution of motion direction in the i-th of the T images within the range intended by the description range descriptor, can be obtained by Equation 26. Many methods other than Equation 26 exist for computing the direction standard deviation (στi).

The mean of the second-order statistic (σθ1), which represents the mean spatial distribution of direction over all T images in the range, can be obtained by Equation 27. Many methods other than Equation 27 exist for computing the mean of the second-order statistic (σθ1).

The central moments (Central Moment) of the second-order statistic of direction characterize the spatial distribution and degree of distortion of the per-image second-order statistics (στi) about their mean (σθ1), and may include a p-th order moment (σθp). The p-th order moment (σθp) can be obtained by Equation 28. Many methods other than Equation 28 exist for computing it.

β denotes the dominant direction of motion. A motion histogram of direction is generated using Equation 20, and the direction with the largest histogram value is β. When several values of β are required, the histogram values are sorted and the directions are taken in descending order of value. DataOfVideoMotionLength is the mean motion magnitude for each bin of the cumulative motion histogram of motion direction.
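Selecting one or more dominant directions from a direction histogram can be sketched as follows. Equation 20 is not reproduced in this text, so the uniform binning of angles into `num_bins` sectors is an assumption, as are the function names and the rule that ties go to the lower bin index.

```python
from typing import List, Sequence


def direction_histogram(directions_deg: Sequence[float], num_bins: int = 8) -> List[int]:
    """Count directions (degrees) in num_bins equal-width sectors."""
    width = 360.0 / num_bins
    hist = [0] * num_bins
    for d in directions_deg:
        # The trailing modulo guards against a float landing exactly on 360.
        hist[int((d % 360.0) // width) % num_bins] += 1
    return hist


def dominant_directions(hist: Sequence[int], k: int = 1) -> List[int]:
    """Indices of the k bins with the largest counts, largest first."""
    order = sorted(range(len(hist)), key=lambda b: (-hist[b], b))
    return order[:k]


hist = direction_histogram([0, 10, 50, 95, 100, 110, 200])
print(hist, dominant_directions(hist, k=2))
```

Each returned index identifies a sector; its representative angle could be taken as the sector centre, which is again a choice this sketch does not fix.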

The DirectionDescriptor, which describes the features of the motion direction of the content in the video as explained above, is summarized in Table 7.

[Table 7] Motion direction descriptor (DirectionDescriptor)
Mean of motion direction: θ1
Temporal distribution of motion direction: θ2 … θp
Mean of motion direction deviation: σθ1
Spatial distribution of motion direction: σθ1 … σθp
Dominant motion direction: β1 … βk
Cumulative histogram of motion direction: Hθ
DataOfVideoMotionLength: n1 … nM

Table 8 shows the syntax of the motion direction descriptor (DirectionDescriptor) and conceptually expresses the MotionDescriptor (MD) according to video data structuring. The flag_exist_sub_des flag is a recursive syntax element. It is used when the motion features of structured video need to be expressed hierarchically, as in Table 8, or when more detailed direction features need to be indicated.

Next, the intensity descriptor (IntensityDescriptor) expresses the statistical characteristics of motion magnitude for the motion data of each image, or of all images, within the description range intended by the description range identifier. It consists of the mean magnitude, the central moments (Central Moment) of the mean and of the second-order statistic, the cumulative magnitude histogram, and data on magnitude.

As noted above, two-dimensional motion information basically consists of direction and magnitude and can be expressed as a motion vector (MV), MV = (MVx, MVy), where MVx is the horizontal motion component and MVy is the vertical motion component.

The magnitude of motion (I) of a motion vector can be obtained by Equation 1 below. It may also be computed by methods other than Equation 1, and no restriction is placed on how the motion magnitude is computed.

[Equation 1]

Here, (MVxk, MVyk) is the motion information of the k-th region when one image is divided into M regions of arbitrary size.

λi is the mean motion magnitude of the i-th image in the video and can be obtained by Equation 29. It may also be computed by methods other than Equation 29, and no restriction is placed on how the mean motion magnitude is computed.
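The per-region magnitude and the per-image mean can be sketched as below. Equations 1 and 29 are not reproduced in this text, so the Euclidean norm for the magnitude (I) and the arithmetic mean over the M regions for λi are assumptions, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def motion_magnitude(mvx: float, mvy: float) -> float:
    """Assumed magnitude of one motion vector: Euclidean norm."""
    return math.hypot(mvx, mvy)


def mean_magnitude(vectors: Sequence[Tuple[float, float]]) -> float:
    """Assumed lambda_i: arithmetic mean of the magnitudes of the M regions."""
    return sum(motion_magnitude(x, y) for x, y in vectors) / len(vectors)


frame = [(3.0, 4.0), (0.0, 0.0), (6.0, 8.0)]  # hypothetical (MVx_k, MVy_k) values
print(mean_magnitude(frame))
```

A cheaper measure such as |MVx| + |MVy| would fit the same interface, consistent with the text leaving the computation method open.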

ω1 (MeanOfClipMotionLength) is the mean of the per-image mean magnitudes (λ) over the T images within the range intended by the description range descriptor in the motion descriptor, and can be obtained by Equation 30. It may also be computed by methods other than Equation 30, and no restriction is placed on how this mean is computed. MeanOfClipMotionLength is thus the mean motion magnitude of the entire cumulative motion histogram.

The central moments (Central Moment) of motion magnitude characterize the temporal distribution and degree of distortion of the per-image mean magnitudes (λ) about the mean magnitude (ω1), and may include a q-th order moment (ωq). The q-th order moment (ωq) can be obtained by Equation 31. It may also be computed by methods other than Equation 31, and no restriction is placed on how the q-th order moment (ωq) is computed.

The second-order statistic of magnitude (σλi), which expresses the spatial distribution of motion magnitude in the i-th of the T images within the range intended by the description range descriptor, can be obtained by Equation 32. It may also be computed by methods other than Equation 32, and no restriction is placed on how the second-order statistic of magnitude (σλi) is computed.

The mean of the second-order statistic (σω1), which represents the mean spatial distribution of magnitude over all T images in the range, can be obtained by Equation 33. It may also be computed by methods other than Equation 33, and no restriction is placed on how the mean of the second-order statistic (σω1) is computed.
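The two-stage computation, a spatial statistic per image followed by its mean over the T images, can be sketched as follows. Equations 32 and 33 are not reproduced in this text, so the population standard deviation for σλi and the arithmetic mean for σω1 are assumptions, and the function names are hypothetical.

```python
import statistics
from typing import List, Sequence


def spatial_std_per_image(frames: Sequence[Sequence[float]]) -> List[float]:
    """Assumed sigma_lambda_i: population std of region magnitudes in each image."""
    return [statistics.pstdev(magnitudes) for magnitudes in frames]


def mean_spatial_std(frames: Sequence[Sequence[float]]) -> float:
    """Assumed sigma_omega_1: mean of the per-image standard deviations."""
    stds = spatial_std_per_image(frames)
    return sum(stds) / len(stds)


# Two hypothetical images, each with four region magnitudes.
frames = [[1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]]
print(spatial_std_per_image(frames), mean_spatial_std(frames))
```

A low value indicates motion that is spatially uniform within each image, while a high value indicates motion concentrated in some regions.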

The central moments (Central Moment) of the second-order statistic of magnitude characterize the spatial distribution and degree of distortion of the per-image second-order statistics (σλi) about their mean (σω1), and may include a q-th order moment (σωq). The q-th order moment (σωq) can be obtained by Equation 34. It may also be computed by methods other than Equation 34, and no restriction is placed on how the q-th order moment (σωq) is computed.

DataOfVideoMotionDirection is the mean motion direction for each bin of the cumulative motion histogram of motion magnitude.

The IntensityDescriptor, which describes the features of the motion magnitude of the content in the video as explained above, is summarized in Table 9.

[Table 9] Motion intensity descriptor (IntensityDescriptor)
Mean of motion magnitude: ω1
Temporal distribution of motion magnitude: ω2 … ωq
Mean of motion magnitude deviation: σω1
Spatial distribution of motion magnitude: σω2 … σωq
Cumulative histogram of motion magnitude: Hω
DataOfVideoMotionDirection: n1 … nm

Table 10 shows the syntax of the motion intensity descriptor and conceptually expresses the MotionDescriptor (MD) according to video data structuring. In Table 10, the flag_exist_sub_des flag is a recursive syntax element. It is used when the motion features of structured video need to be expressed hierarchically, or when more detailed magnitude features need to be indicated.

Next, the motion histogram descriptor (MotionHistogramDescriptor) describes the statistical characteristics of motion information such as direction or magnitude. It comprises the number of bins in the histogram (NumberOfBins or NumberOfHistogramBin), the frequency value of each bin (BinValueOfHistogram or DataOfVideoMotionHistogram), and the representative value in the actual data that each bin stands for (RepresentativeValueOfBin).

NumberOfMotionHistogramBin is the number of bins of the cumulative histogram of motion magnitude and direction, that is, the number of groups of motion data that the histogram represents. DataOfVideoMotionHistogram is the data of the cumulative motion histogram of motion magnitude and direction.

The motion histogram descriptor (MotionHistogramDescriptor) is summarized in Table 11.

In general, the user may choose the number of histogram bins (n) in view of the application field, the precision required of the feature values represented by the histogram, and the overall data size.

[Table 11] Motion histogram descriptor (MotionHistogramDescriptor)
Number of histogram bins (NumberOfBins): n
Histogram values (BinValueOfHistogram): bin_value[1] … bin_value[n]
Representative value of each bin (RepresentativeValueOfBin): rvalue_of_each_bin[1] … rvalue_of_each_bin[n]
Motion sub cumulative histograms: H1 … Hm

Table 12 shows the syntax of the motion histogram descriptor.

In Table 12, NumberOfSubHistogram and MotionHistogramDescriptor describe more detailed or corrective information about the motion feature information as auxiliary histograms in 2-D or 3-D form. For example, when the cumulative histogram of motion direction (Hθ) is used, the MotionHistogramDescriptor can carry information on the motion magnitude for each bin, or it can be a cumulative motion histogram generated by a method different from that used for the cumulative histogram of motion direction (Hθ).

When the MotionDescriptor proposed in the present invention is used to describe motion feature information hierarchically according to a general video structure, the description takes the form shown in Tables 13 and 14. In Table 13, MD is an abbreviation of MotionDescriptor.

Table 14 conceptually expresses the MotionDescriptor (MD) according to video data structuring. As Table 14 shows, a MotionDescriptor can be written for the entire video or for a structuring unit of any size, such as a Story, Scene, Shot, or Segment, and describes the characteristics of the video motion data within that unit. This allows a hierarchical representation, from a general description of motion features at the upper levels to a more specific description at the lower levels.

In retrieval based on the video motion features described by the MotionDescriptor, usable flags (Flag used_direction_des, used_intensity_des) are assigned to the fields of the descriptor, as in Table 15, so that the user can search using only the desired feature information. In addition, the exist_sub_des flag allows a search over the entire video or over a specific unit interval.

MotionDescriptor {
    VideoID video_id
    TimeDescriptor time_des
    MotionDescriptionLevel level
    Flag used_direction_des
    Flag used_intensity_des
    if (used_direction_des)
        MotionDirectionDescriptor direction_des
    if (used_intensity_des)
        MotionIntensityDescriptor intensity_des
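The flag-guarded, recursive structure of the syntax above can be mirrored in a small data model. This is a sketch under simplifying assumptions, not the bitstream format: VideoID is reduced to a string, TimeDescriptor to a (start, end) pair, and the direction and intensity descriptors to dictionaries. The sub-descriptor list follows the prose description of mot_sub_des and exist_sub_des rather than the listing, and `count_descriptors` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MotionDescriptor:
    video_id: str                      # simplified stand-in for VideoID
    time_des: Tuple[float, float]      # simplified (start, end) TimeDescriptor
    level: int                         # MotionDescriptionLevel
    direction_des: Optional[Dict[str, float]] = None
    intensity_des: Optional[Dict[str, float]] = None
    sub_des: List["MotionDescriptor"] = field(default_factory=list)

    @property
    def used_direction_des(self) -> bool:
        """Flag derived from whether direction features are present."""
        return self.direction_des is not None

    @property
    def used_intensity_des(self) -> bool:
        """Flag derived from whether magnitude features are present."""
        return self.intensity_des is not None

    @property
    def exist_sub_des(self) -> bool:
        """Flag derived from whether lower-level descriptors are present."""
        return bool(self.sub_des)


def count_descriptors(md: MotionDescriptor) -> int:
    """Count this descriptor and all descriptors nested beneath it."""
    return 1 + sum(count_descriptors(sub) for sub in md.sub_des)


shot_a = MotionDescriptor("demo", (0.0, 4.0), level=1, direction_des={"theta_1": 90.0})
shot_b = MotionDescriptor("demo", (4.0, 9.0), level=1, intensity_des={"omega_1": 2.5})
video = MotionDescriptor("demo", (0.0, 9.0), level=0, sub_des=[shot_a, shot_b])
print(count_descriptors(video), video.exist_sub_des, shot_a.used_direction_des)
```

Deriving the flags from the presence of the optional fields keeps the flags and the data from disagreeing, which is one possible design choice among several.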

Likewise, for the direction descriptor (DirectionDescriptor) and the intensity descriptor (IntensityDescriptor), setting usable flags for the mean, the central moments (Central Moment), and the histogram enables staged retrieval as in the example above.

A video retrieval system to which the motion descriptor proposed in the present invention can be applied is shown in FIG. 19. The operation of the video retrieval system of FIG. 19, which uses the motion descriptor, is as follows.

First, the motion descriptor extraction apparatus (102) extracts a motion descriptor from a query video (101) that is supplied through the graphical user interface (100) or retrieved from the database (DB). The extracted motion descriptor is encoded by the motion descriptor encoding apparatus (103), multiplexed by the multiplexing apparatus (104), and transmitted over the network to the server side.

On the server side, the demultiplexing apparatus (105) demultiplexes the multiplexed motion descriptor, and the motion descriptor decoding apparatus (106) decodes it. The motion descriptor similarity comparator (107) in the search engine (106) compares the motion descriptor decoded by the motion descriptor decoding apparatus (106) with the motion descriptors stored in the multimedia database (114) of the database engine (112), and then ranks a preset number of motion descriptors in descending order of similarity.

Meanwhile, the query video (101) on the client side is decoded by the multimedia decoder (109), after which the above process is repeated.

한편, 서버측에서의 비디오(110)는 움직임 기술자 추출장치(111)에 의해 움직임 기술자가 추출된 후, 데이터 베이스 구축장치(113)를 거쳐 멀티미디어 데이터 베이스(114)에 저장된다.On the other hand, the video 110 on the server side is stored in the multimedia database 114 via the database building device 113 after the motion descriptor is extracted by the motion descriptor extraction device 111.

As described above, the technique proposed in the present invention can be applied to a motion descriptor extraction device; a brief example is shown in FIG. 19.

In video retrieval, the present invention uses the cumulative motion histogram data described above and the MotionDescriptor descriptor, which describes that data more effectively. For similarity measurement, conventional similarity measures such as SAD (Sum of Absolute Differences) and MSE (Mean Square Error) can be applied to the flow and pattern of motion over the whole video or over a specific interval, using the motion feature information in MotionDescriptor, the motion feature information in MotionClipDescriptor, and the hierarchical structure of the MotionClipDescriptors that express a MotionDescriptor. When more accurate similarity measurement is required, DataOfVideoMotionHistogram or DataOfVideoMotionLength in MotionDescriptor can be used. In addition, Start_Time and End_Time make time-based video retrieval possible.
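The similarity measurement described above can be sketched as follows. This is a minimal illustration only: it assumes each descriptor has already been flattened into a fixed-length list of numbers, and the function names `sad`, `mse`, and `rank_by_similarity` are hypothetical and do not appear in the specification.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def sad(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean square error between two equal-length, non-empty feature vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("feature vectors must be non-empty and equal in length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rank_by_similarity(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_k: int,
    distance: Callable[[Sequence[float], Sequence[float]], float] = sad,
) -> List[Tuple[str, float]]:
    """Return the top_k database entries with the smallest distance to the query."""
    scored = [(video_id, distance(query, feature)) for video_id, feature in database.items()]
    scored.sort(key=lambda item: item[1])  # smaller distance means higher similarity
    return scored[:top_k]
```

A smaller distance is treated as higher similarity, so the ranking step simply keeps the `top_k` entries with the smallest distances, which corresponds to ranking a preset number of the most similar descriptors.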

Digital video application services using the present invention

The motion descriptor of the present invention can be used in a variety of digital video service fields in which the degree of motion matters for making efficient use of limited bandwidth and system resources, such as the Internet, mobile communication environments that support digital video services such as IMT-2000, video conferencing, digital broadcasting, and VOD. Video browsing, remote surveillance, retrieval, and repurposing are described below as representative examples.

1) Browsing

Browsing refers to an information-seeking activity in which a user moves through a multimedia database by following the various kinds of links that exist among the data. One of the essential functions needed for fast and efficient browsing of large media data such as video is to present the structure, summary information, representative images, and feature information of the video in use to the user in visual form through the user interface; any means that the user can see or hear may be used for this purpose. In addition, a function is needed that classifies video segments into several categories according to the degree of motion, so that segments in the same category or in different categories can be browsed quickly.
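Classifying segments into a few categories by degree of motion can be sketched as follows. The threshold values and the names `motion_category` and `group_segments` are assumptions chosen for illustration; the specification does not fix the number of categories or their boundaries.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Sequence

# Hypothetical category boundaries on the motion-intensity scale (assumption).
DEFAULT_THRESHOLDS = (1.0, 4.0, 9.0)


def motion_category(intensity: float, thresholds: Sequence[float] = DEFAULT_THRESHOLDS) -> int:
    """Map a motion intensity to a category index 0..len(thresholds)."""
    return bisect_right(thresholds, intensity)


def group_segments(
    segments: Dict[str, float], thresholds: Sequence[float] = DEFAULT_THRESHOLDS
) -> Dict[int, List[str]]:
    """Group segment ids by motion category so one category can be browsed at a time."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for segment_id, intensity in segments.items():
        groups[motion_category(intensity, thresholds)].append(segment_id)
    return dict(groups)
```

With such a grouping, a browser can jump between segments of the same category or deliberately switch to a different one.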

2) Remote surveillance

Remote surveillance is an application that must record and monitor events occurring in a confined space, and in which there is an expected range of motion characteristics that normally occur. Video segments whose motion characteristics fall into a category different from this normal expectation therefore carry important information about an event. Using the degree of motion, the amount of data to be recorded can be controlled according to that degree, and efficient retrieval of important events can also be supported.
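One way to act on this idea is to record only segments whose motion intensity departs from the expected level. The sketch below assumes the expected behavior is summarized by a mean and a standard deviation, and that "different from the expectation" means lying more than `k` standard deviations away; both the rule and the names are hypothetical.

```python
from typing import List, Sequence


def is_unusual(intensity: float, expected_mean: float, expected_std: float, k: float = 3.0) -> bool:
    """True when the intensity lies more than k standard deviations from the expected mean."""
    return abs(intensity - expected_mean) > k * expected_std


def segments_to_record(
    intensities: Sequence[float], expected_mean: float, expected_std: float, k: float = 3.0
) -> List[int]:
    """Return the indices of the segments whose motion departs from the expectation."""
    return [
        index
        for index, intensity in enumerate(intensities)
        if is_unusual(intensity, expected_mean, expected_std, k)
    ]
```

The same indices can serve as an index of important events for later retrieval.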

3) Video retrieval

Video retrieval, like browsing, is a form of search activity for finding needed information. It usually proceeds by expressing a query in natural language or a structured computer language, or by presenting a similar example as additional information. Video retrieval operates on video segments, each consisting of a plurality of frames in temporal order, and a segment usually contains multiple objects of various kinds, that is, regions or objects with varied colors, shapes, and textures. Therefore, rather than searching by a specific color, shape, or texture, using general information such as the degree of motion at the initial stage of the search can greatly narrow the search range and allow faster retrieval. Likewise, when features of a specific color, shape, or texture cannot be used, motion information based on changes in the content allows more efficient retrieval.

For example, when searching a news program database for scenes in which an anchor appears, the program is usually presented by different people, in different clothes and different studios, every day. It is therefore preferable to first "search for scenes with almost no motion" and then apply more specific feature information, rather than to use color, shape, and texture. In addition, when a video contains an object moving in a consistent direction, direction information can be very useful.
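The two-stage search in the anchor example can be sketched as a coarse motion filter followed by a finer comparison. The `Segment` structure, the `max_motion` cut-off, and the use of a sum of absolute differences in the second stage are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    segment_id: str
    motion: float             # overall degree of motion of the segment
    feature: Sequence[float]  # a more specific feature, e.g. a color histogram


def coarse_to_fine_search(
    segments: Sequence[Segment],
    max_motion: float,
    query_feature: Sequence[float],
    top_k: int,
) -> List[str]:
    """Keep low-motion segments first, then rank the survivors by feature distance."""
    # Stage 1: a cheap filter on the degree of motion narrows the search range.
    candidates = [s for s in segments if s.motion <= max_motion]

    # Stage 2: the more expensive comparison runs only on the remaining candidates.
    def distance(segment: Segment) -> float:
        return sum(abs(x - y) for x, y in zip(segment.feature, query_feature))

    candidates.sort(key=distance)
    return [s.segment_id for s in candidates[:top_k]]
```

Because the first stage compares a single number per segment, most of the database is discarded before any detailed feature comparison takes place.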

4) Video repurposing

Video repurposing refers to an application in which an encoded stream on the server is reprocessed appropriately, according to the capabilities of the user's system and the user's requirements, before being served. In particular, when a digital video service is provided to a system with limited bandwidth and performance, such as a mobile communication terminal, the degree of motion can be used to control the number of transmitted frames, that is, the bit rate, so that bandwidth is used more efficiently. For example, when the degree of motion of a GOP (Group of Pictures) to be transmitted is small, it is assumed that no significant change in the content has occurred, and the data is not transmitted or the number of transmitted frames is reduced. Conversely, when the degree of motion is large, it is assumed that meaningful events have occurred, and a larger number of frames is transmitted to give the user more accurate information.
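A frame-count rule of this kind can be sketched as follows. The thresholds `low` and `high` and the linear interpolation between them are assumptions; the specification only states that fewer frames (or none) are sent for small motion and more frames for large motion.

```python
def frames_to_send(motion_degree: float, gop_size: int, low: float = 0.5, high: float = 4.0) -> int:
    """Choose how many frames of a GOP to transmit from its degree of motion."""
    if gop_size <= 0:
        raise ValueError("gop_size must be positive")
    if low >= high:
        raise ValueError("low must be smaller than high")
    if motion_degree <= low:
        return 0          # no significant change: skip the GOP entirely
    if motion_degree >= high:
        return gop_size   # meaningful events: send every frame
    # In between, scale the frame count linearly with the degree of motion.
    fraction = (motion_degree - low) / (high - low)
    return max(1, round(fraction * gop_size))
```

For a 12-frame GOP with the default thresholds, a motion degree of 0.1 sends nothing, 2.25 sends half of the frames, and 5.0 sends all of them.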

Because the present invention proposes a descriptor for the motion feature information of video to be used in multimedia data retrieval, applying the proposed method to video retrieval yields an economic benefit.

In addition, the motion activity descriptor of the present invention can describe the signal characteristics, the spatio-temporal distribution, and perceptual features such as the degree and pattern of change for an entire video, for the span between representative images, and for a specific time interval. These are difficult to express with conventional video motion indexing techniques, which use only the feature information of a few representative images to search and identify the content of a video or partial video. The descriptor can therefore be used in digital video service applications in which the degree of motion is an important feature. Furthermore, the feature information of the motion activity descriptor can be used selectively, taking into account the application field and the required precision of motion representation. For video streams compressed with existing video compression techniques such as MPEG-1, MPEG-2, MPEG-4, and H.263, the motion information in the stream can be used directly without special additional processing, so the complexity of feature extraction is very low and the method is applicable to applications that require real-time processing.
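The low extraction cost can be illustrated with a short sketch that derives magnitude statistics directly from decoded motion vectors. It assumes each frame is given as a non-empty list of `(dx, dy)` motion vectors and uses the population standard deviation; the function names and dictionary keys are hypothetical, and the exact normalization used in the specification's equations is not reproduced here.

```python
import math
from statistics import fmean, pstdev
from typing import Dict, List, Sequence, Tuple

MotionVector = Tuple[float, float]


def frame_magnitude_stats(motion_vectors: Sequence[MotionVector]) -> Tuple[float, float]:
    """Return (mean, standard deviation) of the motion vector magnitudes in one frame."""
    magnitudes = [math.hypot(dx, dy) for dx, dy in motion_vectors]
    return fmean(magnitudes), pstdev(magnitudes)


def video_magnitude_stats(frames: Sequence[Sequence[MotionVector]]) -> Dict[str, float]:
    """Summarize the per-frame statistics over all T frames of a video."""
    per_frame: List[Tuple[float, float]] = [frame_magnitude_stats(f) for f in frames]
    means = [mean for mean, _ in per_frame]
    devs = [dev for _, dev in per_frame]
    return {
        "I_av_av": fmean(means),    # mean of the per-frame means
        "I_dev_av": pstdev(means),  # spread of the per-frame means
        "I_av_dev": fmean(devs),    # mean of the per-frame standard deviations
        "I_dev_dev": pstdev(devs),  # spread of the per-frame standard deviations
    }
```

Direction statistics could be gathered in the same way from `math.atan2(dy, dx)`, although angles wrap around at ±π and a plain mean or standard deviation ignores that.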

Claims

A method of describing a motion activity feature of a video, comprising:

extracting motion parameters from a video;

extracting statistical characteristics of the magnitudes of the motion parameters extracted in the previous step; and

extracting statistical characteristics of the directions of the motion parameters,

wherein extracting the statistical characteristics of the magnitudes of the motion parameters comprises extracting spatial statistical feature values of the magnitudes of the motion parameters of each image, and extracting the statistical characteristics of the directions of the motion parameters comprises extracting spatial statistical feature values of the directions of the motion parameters of each image.

The method of claim 1, wherein a motion vector encoded by digital video encoding is used as the motion parameter.

The method of claim 1, wherein extracting the spatial statistical feature values of the magnitudes of the motion parameters comprises obtaining, for each frame of a video composed of T frames from which motion parameters are extracted, the standard deviation I_{dev} of the magnitudes of the motion parameters, and using the mean I_{av,dev} of the standard deviations obtained from the T frames as a statistical feature value representing the entire video.

The method of claim 3, wherein the standard deviation I_{dev} of the magnitudes of the motion parameters is obtained by Equation 4 below, and the mean I_{av,dev} of the standard deviations is obtained by Equation 9 below.

[Equation 4]

where I_k is the magnitude component of the motion parameter,

I_{av} is the mean of the magnitudes of the motion parameters, and

M is the number of blocks or objects constituting one frame,

[Equation 9]

where I_{dev,i} is the standard deviation of the motion parameter magnitudes of the i-th frame, and

T is the number of frames of the video from which motion parameters are extracted.

The method of claim 4, further comprising using the standard deviation I_{dev,av} of the per-frame means of the motion parameter magnitudes, given by Equation 7 below, as the reliability of the mean I_{av,av} of the per-frame means of the motion parameter magnitudes, given by Equation 8 below.

[Equation 7]

where I_{av,i} is the mean of the motion parameter magnitudes of the i-th frame,

[Equation 8]

where I_{av,av} is the first-order statistic (mean) of the first-order statistics of the motion parameter magnitudes.

The method of claim 1, wherein extracting the statistical characteristics of the magnitudes of the motion parameters comprises obtaining, for each image, the standard deviation I_{dev} of the magnitudes of the motion parameters by Equation 4 below, and using the standard deviation I_{dev,dev} of these values as a statistical feature value representing the entire video.

[Equation 4]

where I_k is the magnitude component of the motion parameter,

I_{av} is the first-order statistic (mean) of the magnitudes of the motion parameters, and

M is the number of blocks or objects constituting one frame,

[Equation 10]

where I_{dev,i} is the standard deviation of the motion parameter magnitudes of the i-th frame, and

I_{av,dev} is the mean, given by Equation 9 below, of the standard deviations of the motion parameter magnitudes obtained from each of the T frames of the video,

[Equation 9]

where I_{dev,i} is the standard deviation of the motion parameter magnitudes of the i-th frame.

The method of claim 6, further comprising using the standard deviation I_{dev,dev} of the standard deviations of the motion parameter magnitudes, given by Equation 10 below, as the reliability of the mean I_{av,dev} of the standard deviations of the motion parameter magnitudes, given by Equation 9 below.

[Equation 10]

where I_{dev,i} is the second-order statistic (standard deviation) of the motion parameter magnitudes of the i-th frame, and

I_{av,dev} is the mean, given by Equation 9 below, of the standard deviations of the motion parameter magnitudes obtained from each of the T frames of the video,

[Equation 9]

where I_{dev,i} is the second-order statistic (standard deviation) of the motion parameter magnitudes of the i-th frame.

The method of claim 1, wherein extracting the temporal statistical feature values of the directions of the motion parameters comprises obtaining, for each image, the standard deviation φ_{dev} of the directions of the motion parameters, and using the mean φ_{av,dev} of these standard deviations as a statistical feature value representing the entire video.

The method of claim 8, wherein the standard deviation φ_{dev} of the directions of the motion parameters is obtained by Equation 6 below, and the mean φ_{av,dev} of the standard deviations of the directions of the motion parameters is obtained by Equation 13 below.

[Equation 6]

where φ_k is the direction component of the motion parameter,

φ_{av} is the first-order statistic (mean) of the directions of the motion parameters, and

M is the number of blocks or objects constituting one frame,

[Equation 13]

where φ_{dev,i} is the second-order statistic (standard deviation) of the motion parameter directions of the i-th frame.

The method of claim 1, wherein extracting the statistical characteristics of the directions of the motion parameters comprises obtaining, for each image, the standard deviation φ_{dev} of the directions of the motion parameters by Equation 6 below, and using the standard deviation φ_{dev,dev} of these values as a statistical feature value representing the entire video.

[Equation 6]

where φ_k is the direction component of the motion parameter,

φ_{av} is the first-order statistic (mean) of the directions of the motion parameters, and

M is the number of blocks or objects constituting one frame,

[Equation 14]

where φ_{dev,i} is the standard deviation of the motion parameter directions of the i-th frame, and

φ_{av,dev} is the mean of the standard deviations of the directions of the motion parameters.

The method of claim 10, further comprising using the standard deviation φ_{dev,dev} of the standard deviations of the motion parameter directions, given by Equation 14 below, as the reliability of the mean φ_{av,dev} of the standard deviations of the motion parameter directions, given by Equation 13 below.

[Equation 14]

where φ_{av,dev} is the mean of the standard deviations of the directions of the motion parameters,

[Equation 13]

The method of claim 1, further comprising obtaining the frequencies of the motion parameter directions, quantized into M directions, over all frames of the video, and extracting, as a motion activity descriptor, a vector consisting of several motion parameter directions (φ_1, φ_2, ..., φ_N) and their number N (M ≥ N).

[Equation 15]

φ_max = <N, φ_1, φ_2, ..., φ_N>

An apparatus for describing a motion activity feature of a video, comprising:

means for extracting motion parameters from a video;

means for extracting statistical characteristics of the magnitudes of the extracted motion parameters;

means for extracting statistical characteristics of the directions of the motion parameters; and

a combiner for gathering the extracted statistical characteristics to define a motion activity descriptor,

wherein the means for extracting the statistical characteristics of the magnitudes of the motion parameters is means for extracting spatial statistical feature values of the magnitudes of the motion parameters of each image, and the means for extracting the statistical characteristics of the directions of the motion parameters is means for extracting spatial statistical feature values of the directions of the motion parameters of each image.

The apparatus of claim 13, wherein the apparatus uses the motion parameter encoded by digital video encoding as the motion parameter.

The apparatus of claim 13, wherein the means for extracting the spatial statistical feature values of the magnitudes of the motion parameters comprises means for obtaining the standard deviation of the magnitudes of the motion parameters for each image and using the mean of these values as a statistical feature value representing the entire video.

The apparatus of claim 13, wherein the means for extracting the statistical characteristics of the magnitudes of the motion parameters comprises means for obtaining the standard deviation of the magnitudes of the motion parameters for each image and using the standard deviation of these values as a statistical feature value representing the entire video.

The apparatus of claim 13, comprising means for using the standard deviation of the standard deviations of the magnitudes of the motion parameters as the reliability of the mean of the standard deviations of the magnitudes of the motion parameters.

The apparatus of claim 13, wherein the means for extracting the spatial statistical feature values of the directions of the motion parameters comprises means for obtaining the standard deviation of the directions of the motion parameters for each image and using the mean of these standard deviations as a statistical feature value representing the entire video.

The apparatus of claim 13, wherein the means for extracting the statistical characteristics of the directions of the motion parameters comprises means for obtaining the standard deviation of the directions of the motion parameters for each image and using the standard deviation of these standard deviations as a statistical feature value representing the entire video.

The apparatus of claim 13, comprising means for using the standard deviation of the standard deviations of the directions of the motion parameters as the reliability of the mean of the standard deviations of the directions of the motion parameters.

The apparatus of claim 19, further comprising means for using the frequency of the motion parameters in the video that lie in the overall average direction of the video.

The apparatus of claim 13, further comprising means for obtaining the frequencies of the motion parameter directions, quantized into M directions, over all frames of the video, and extracting, as a motion activity descriptor, a vector consisting of several motion parameter directions (φ_1, φ_2, ..., φ_N) and their number N (M ≥ N).

[Equation 15]

φ_max = <N, φ_1, φ_2, ..., φ_N>
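For illustration only, a vector of the form <N, φ_1, ..., φ_N> could be built as sketched below. The sketch assumes that the reported directions are the N most frequent of the M quantized directions, that directions are represented by bin indices 0..M-1, and that zero-length motion vectors are ignored; none of these choices, nor the function names, is stated in the claims.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def quantize_direction(dx: float, dy: float, m: int) -> int:
    """Map a motion vector to one of m equal angular bins over [0, 2*pi)."""
    angle = math.atan2(dy, dx) % (2 * math.pi)
    return int(angle / (2 * math.pi) * m) % m


def dominant_directions(
    motion_vectors: Iterable[Tuple[float, float]], m: int = 8, n: int = 2
) -> Tuple[int, ...]:
    """Return (N, phi_1, ..., phi_N): the count and the most frequent direction bins."""
    if not 0 < n <= m:
        raise ValueError("n must satisfy 0 < n <= m")
    counts = Counter(
        quantize_direction(dx, dy, m)
        for dx, dy in motion_vectors
        if (dx, dy) != (0, 0)  # a zero vector has no direction
    )
    top = [bin_index for bin_index, _ in counts.most_common(n)]
    return (len(top), *top)
```

When fewer than `n` distinct directions occur, the leading count reports how many direction entries actually follow.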