KR101068288B1

KR101068288B1 - Content-Based Video Copy Detection Method Using Segment Feature Extraction

Info

Publication number: KR101068288B1
Application number: KR1020090013959A
Authority: KR
Inventors: 김보경; 장재형; 김재광; 이지형; 정제희
Original assignee: 성균관대학교산학협력단
Priority date: 2009-02-19
Filing date: 2009-02-19
Publication date: 2011-09-28
Also published as: WO2010095796A1; KR20100094803A

Abstract

The content-based video detection method using segment features according to the present invention comprises: a segment extraction step of extracting segments to be used for detecting duplicate videos; a segment comparison step of comparing segments extracted from an input video with segments extracted from original videos to find, for each segment of the input video, a similar segment in the original videos; and a video detection step of determining, from the segment comparison results, which video the input video was copied from.

Description

Content-Based Video Copy Detection Method Using Segment Feature Extraction

The present invention relates to a method for detecting duplicate videos, and more particularly to a content-based video detection method using segment features, which extracts segments formed by merging frames with similar feature values and detects duplicate videos by comparing videos segment by segment using the segment information.

As the internet environment develops rapidly, the means of conveying knowledge is shifting quickly from text and images to video. Along with this shift, the need for copyright protection against illegal copying of video media is growing.

Content-based copy detection is one method of protecting copyright against illegal copying of videos. It detects duplicate videos using features that can be extracted from the video itself, and therefore has the advantage that features can be extracted even from videos that have already been distributed.

Because videos are much larger than other media, it is important for video detection to extract features that identify a video efficiently. However, existing content-based copy detection methods use the features of every frame, which incurs unnecessary data and computation cost. In other words, existing duplicate video detection approaches compare all frames and therefore require a long processing time.

More specifically, content-based video copy detection determines whether an input video is an illegal copy of any original video collected in advance. To do so, these approaches extract feature values from both videos and decide whether the input is an illegal copy by comparing the extracted feature values.

Reference [1] divides content-based video copy detection techniques into two broad classes according to the range over which features are extracted, and compares the performance of the methods in each class. Local descriptor methods extract and use features of partial regions of a video frame and of partial regions across consecutive frames, whereas global descriptor methods use the image information of the entire frame as the feature value.

Local descriptor methods determine whether a video is a copy based on interest points (Harris interest point detector) and a differential description of the region around each point.

In reference [2], the features of the region around an interest point are expressed as the second derivative of the gray-level 2D signal. The ViCopT method of reference [3] builds trajectories from local features and assigns them motion or background labels.

The STIP method of reference [4] uses feature points that capture both temporal and spatial changes in order to capture spatio-temporal events. However, methods based on the Harris interest point detector may fail to detect interest points in videos to which editing effects such as contrast reduction have been applied, which can degrade their performance.

Among global descriptor methods, a motion histogram feature has been proposed that represents a clip by quantizing the relative motion of blocks, obtained by dividing consecutive frames into a fixed number of blocks, and expressing each motion statistically over the time domain (see reference [5]).

Reference [6] proposes the ordinal measurement method: the image is divided into N windows, the average intensity of each window is computed, and the window averages are sorted in ascending order so that frame t is represented by discrete ranks. The method is less sensitive to pixel-level changes because it relies on the relative order of the intensity values.

Reference [7] proposes the temporal ordinal measurement method, which, for an image divided into N windows, ranks the windows at the same position across the frames within a given time interval.

Reference [8] compares the performance of motion matching (see reference [5]), ordinal measurement (see reference [6]), and color histogram intersection as video copy detection methods, and shows that ordinal measurement gives the best retrieval efficiency.

Because content-based copy detection with global descriptors searches videos by frame-by-frame comparison, it performs the unnecessary work of comparing every adjacent frame even when the feature values of those frames are similar.

References

[1] J. Law-To, L. Chen, A. Joly, I. Laptev, O. Buisson, V. Gouet-Brunet, N. Boujemaa, and F. Stentiford, "Video copy detection: a comparative study," In ACM International Conference on Image and Video Retrieval, Amsterdam, The Netherlands, July 2007.

[2] A. Joly, O. Buisson, and C. Frelicot, "Content-based copy detection using distortion-based probabilistic similarity search," IEEE Transactions on Multimedia, 2007.

[3] J. Law-To, O. Buisson, V. Gouet-Brunet, and N. Boujemaa, "Robust voting algorithm based on labels of behavior for video copy detection," In ACM Multimedia, MM'06, 2006.

[4] I. Laptev and T. Lindeberg, "Space-time interest points," In International Conference on Computer Vision, 2003.

[5] D. N. Bhat and S. K. Nayar, "Ordinal measures for image correspondence," IEEE Trans. on PAMI, Vol. 20, No. 4, pp. 415-423, 1998.

[6] L. Chen and F. W. M. Stentiford, "Video sequence matching based on temporal ordinal measurement," Technical report no. 1, UCL Adastral, 2006.

[7] Ki-Ho Hyun (현기호) and Jae-Cheol Lee (이재철), "Content-based Video Copy Detection Using Directional Histogram of Motion," Journal of KIISE: Software and Applications, Vol. 30, No. 5·6, pp. 497-502, Jun. 2003.

[8] A. Hampapur, K. Hyun, and R. Bolle, "Comparison of Sequence Matching Techniques for Video Copy Detection," In SPIE Storage and Retrieval for Media Databases 2002, vol. 4676, pp. 194-201, San Jose, CA, USA, Jan. 2002.

[9] M. J. Swain and D. H. Ballard, "Color indexing," International Journal of Computer Vision, vol. 7, no. 1, pp. 11-32, Nov. 1991.

The present invention has been made in view of the above. Its object is to provide a content-based video detection method using segment features, in which each frame of a video is divided into a plurality of windows, a feature value is extracted for each window, frames with similar feature values are merged into a single segment, and duplicate videos are detected by searching segment by segment.

To achieve the above object, the content-based video detection method using segment features according to the present invention comprises:

a segment extraction step of extracting segments to be used for detecting duplicate videos;

a segment comparison step of comparing segments extracted from an input video with segments extracted from original videos to find, for each segment of the input video, a similar segment in the original videos, in order to detect duplication of the input video; and

a video detection step of determining, from the segment comparison results of the input video, which video it was copied from.

According to the present invention, comparing videos segment by segment shortens the search time while achieving detection accuracy equivalent to that of existing frame-based methods.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of the video detection method according to the present invention. The method merges runs of consecutive similar frames in a video into segments and determines whether a video is a copy by comparing the features of the segments of the input video with those of the original videos.

As shown in FIG. 1, in the preprocessing stage, segments are extracted from the original videos (reference video set) through representative frame extraction and segment feature value extraction, and a segment database is built. In the search stage, to detect whether the input video (query video clip) is a copy, segments are extracted from the input video through the same representative frame extraction and segment feature value extraction as in preprocessing. The segments extracted from the input video are then compared with the segments extracted from the original videos, the values generated by the segment comparison are registered in the result set, and the original video is detected from the difference or similarity values of all compared segments.

The method of detecting videos in units of segments is described below.

The process of detecting a video segment by segment consists of a segment extraction step, a segment comparison step, and a video detection step.

1. Segment Extraction Step

The segment extraction step extracts the segments to be used for duplicate video detection. As shown in FIG. 2, which illustrates the segment extraction process, it consists of two sub-steps.

In the present invention, a segment is the unit in which videos are compared. A segment consists of a representative frame, which is the first frame of the segment and is determined using the intensity information of the frames, and the remaining frames, which contain similar motion information. Accordingly, the segment extraction step consists of a representative frame extraction step, which extracts the representative frame, and a segment feature value extraction step, which extracts motion information to generate the segment.

1.1 Representative Frame Extraction Step

First, the representative frame extraction step is described.

A segment is a set of similar frames. Consequently, the intensity feature values of a segment's first frame and of the frame preceding it differ by more than a certain amount. The representative frame extraction step uses the intensity information of the pixels in each frame to extract representative frames. However, a single feature value per frame is not distinctive enough to characterize the frame. The step therefore divides each frame into regions of a fixed size and determines representative frames from the differences in the average intensity of the pixels in each region.

First, each frame of the video is divided into N_Frame = N_x × N_y windows. The average intensity of each window is then computed from the intensity of the pixels it contains. The average intensity L_i of window i is calculated by Equation (1).

$$L_i = \frac{1}{W \times H} \sum_{x=1}^{W} \sum_{y=1}^{H} I(x, y) \qquad (1)$$

Here, W and H are the width and height of the window, and x and y are the pixel coordinates inside window i. I(x, y) is the intensity value of the pixel at coordinates (x, y).
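The window averaging described for Equation (1) can be sketched in Python as follows. This is a minimal illustration, assuming a grayscale frame given as a list of pixel rows. The function name `window_mean_intensities` and the choice to ignore leftover pixels when the frame size is not a multiple of the window size are assumptions made here, not details stated in the text.

```python
def window_mean_intensities(frame, n_x, n_y):
    """Return an n_y-by-n_x grid of mean window intensities L_i.

    frame: grayscale image as a list of rows of pixel intensities.
    n_x, n_y: number of windows horizontally and vertically.
    """
    height, width = len(frame), len(frame[0])
    win_h, win_w = height // n_y, width // n_x  # H and W of one window
    grid = []
    for row in range(n_y):
        grid_row = []
        for col in range(n_x):
            # Sum I(x, y) over every pixel inside this window.
            total = sum(
                frame[y][x]
                for y in range(row * win_h, (row + 1) * win_h)
                for x in range(col * win_w, (col + 1) * win_w)
            )
            grid_row.append(total / (win_w * win_h))
        grid.append(grid_row)
    return grid
```

For the experiments described later, the frame would be split with `n_x=31` and `n_y=15`, which gives 465 window averages per frame.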

Every frame has, as its intensity feature, the average intensity values of its windows obtained by Equation (1). The sum, over all windows, of the difference in average intensity between windows at the same position in the previous frame and the current frame is used to determine representative frames. If this sum exceeds an experimentally determined value, the current frame is selected as a representative frame. Once a representative frame is selected, the frames from the immediately preceding representative frame up to the current representative frame form one segment.

In general, a copied video and its original are divided into the same segments. However, if artificial editing causes the first representative frame of the copied video to be chosen differently, the representative frames of all subsequent segments are also chosen differently. Therefore, in the present invention, the video is divided into segments of as many frames as correspond to one second; when a scene change occurs, the current segment ends at the last frame before the scene change and a new segment starts after it.
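The two segmentation rules above (a new segment at a scene change, and a cap of one second per segment) can be sketched as follows. This is an illustrative sketch, not the patented implementation: the names `split_into_segments`, `threshold`, and `frames_per_second` are hypothetical, each frame is assumed to be already reduced to a flat list of window averages, and the absolute value of each window difference is used as one reasonable reading of "difference".

```python
def split_into_segments(frame_features, threshold, frames_per_second):
    """Split a video into segments of frame indices.

    frame_features: one flat list of window mean intensities per frame.
    threshold: experimentally chosen limit on the summed window difference.
    Returns (start, end) index pairs with end exclusive; each start index
    is the representative frame of its segment.
    """
    if not frame_features:
        return []
    segments = []
    start = 0
    for t in range(1, len(frame_features)):
        # Summed difference between windows at the same position.
        diff = sum(
            abs(cur - prev)
            for cur, prev in zip(frame_features[t], frame_features[t - 1])
        )
        scene_change = diff > threshold
        one_second_reached = (t - start) >= frames_per_second
        if scene_change or one_second_reached:
            segments.append((start, t))
            start = t  # frame t becomes the next representative frame
    segments.append((start, len(frame_features)))
    return segments
```

Capping the segment length means that a misplaced boundary caused by editing cannot shift more than about one second of material before the next scene change realigns the segmentation.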

1.2 Segment Feature Value Extraction Step

The segment feature value extraction step extracts the motion features of the representative frame and the frames that follow it, in order to find the frame at which the segment starting at the representative frame ends.

To this end, each window of the previous frame is searched for in the current frame by comparing it with the window at the corresponding position and with the surrounding windows, and the direction of change is quantized. Let the current frame be t, and let the windows at the same position in frame t and frame t-1 be B_t and B_{t-1}, respectively.

FIG. 3 illustrates the extraction of segment feature values. As shown in FIG. 3, the ordinal measure (see reference [2]) of a window is computed from the window and its eight neighbouring windows. A tracking region is defined around window B_t of the current frame, and the ordinal measure of every window in the tracking region is compared with the ordinal measure of window B_{t-1} of the previous frame.

The direction from window B_{t-1} to the window of frame t with the smallest distance is then quantized. The direction values computed for all windows of frame t-1 are accumulated into a histogram with one bin per direction, and one such histogram is produced along the frame axis. A segment thus consists of a representative frame, which holds N_x × N_y window intensity averages, and the direction vector histograms.
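The direction quantization can be sketched as below. Several details are assumptions made for illustration because the text does not fix them: the tracking region is taken to be the 3×3 block of windows around B_t, the distance between ordinal measures is the sum of absolute rank differences, edges are handled by clamping, ties favour "no movement", and the names `OFFSETS`, `ordinal_signature`, and `direction_histogram` are hypothetical.

```python
# Direction 0 means "no movement"; 1..8 are the eight neighbouring windows.
OFFSETS = [(0, 0), (-1, -1), (-1, 0), (-1, 1), (0, -1),
           (0, 1), (1, -1), (1, 0), (1, 1)]


def ordinal_signature(grid, row, col):
    """Ranks of the mean intensities in the 3x3 block centred on a window."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            r = min(max(row + d_row, 0), n_rows - 1)  # clamp at the border
            c = min(max(col + d_col, 0), n_cols - 1)
            values.append(grid[r][c])
    order = sorted(range(9), key=lambda i: values[i])
    ranks = [0] * 9
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


def direction_histogram(prev_grid, cur_grid):
    """Count, per quantized direction, how many windows moved that way."""
    n_rows, n_cols = len(prev_grid), len(prev_grid[0])
    histogram = [0] * len(OFFSETS)
    for row in range(n_rows):
        for col in range(n_cols):
            reference = ordinal_signature(prev_grid, row, col)
            best_direction, best_distance = 0, None
            for direction, (d_row, d_col) in enumerate(OFFSETS):
                r, c = row + d_row, col + d_col
                if not (0 <= r < n_rows and 0 <= c < n_cols):
                    continue  # candidate window lies outside the frame
                candidate = ordinal_signature(cur_grid, r, c)
                distance = sum(abs(a - b) for a, b in zip(reference, candidate))
                if best_distance is None or distance < best_distance:
                    best_direction, best_distance = direction, distance
            histogram[best_direction] += 1
    return histogram
```

Both arguments are grids of window averages for two consecutive frames; calling the function for each consecutive pair in a segment yields one nine-bin histogram per frame transition.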

2. Segment Comparison Step

This step finds, for each segment of the input video, a similar segment in the original videos by comparing the segments extracted from the input video with those extracted from the original videos, in order to detect duplication of the input video.

To find similar segments, the segment comparison step uses a representative frame comparison step and a direction vector histogram comparison step.

Because a segment consists of frames with similar intensity and motion information, the representative frame of a segment has intensity values similar to those of the other frames in the same segment. The representative frame comparison step therefore computes the similarity between segments from the difference in intensity of their representative frames. However, because artificial editing can cause a different representative frame to be chosen, the direction vector histogram comparison step compares the motion features of the intensity to compensate for the error that arises when the input segment matches only partially.

Segment comparison in principle requires comparing the input segment with every original segment. However, the segments of the input video appear in an order similar to that of the segments of the original video. Therefore, to reduce search time, the input segment currently being compared is compared first with the segment that follows the original segment most similar to the previous input segment.
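One way to realise this prioritized comparison is sketched below. The text only says that the successor of the previous best match is compared first; the early-accept rule controlled by `accept_below`, the fallback to a full scan, and the names `find_match` and `distance` are assumptions introduced for illustration.

```python
def find_match(query, references, previous_match, distance, accept_below):
    """Return the index of the reference segment matched to `query`.

    previous_match: index matched to the previous input segment, or None.
    distance: callable returning a dissimilarity (smaller is more similar).
    accept_below: hypothetical cut-off for accepting the successor early.
    """
    if previous_match is not None:
        successor = previous_match + 1
        if successor < len(references):
            # Try the segment that follows the previous best match first.
            if distance(query, references[successor]) <= accept_below:
                return successor
    # Otherwise fall back to comparing against every reference segment.
    return min(range(len(references)),
               key=lambda j: distance(query, references[j]))
```

When the input is an unedited copy, almost every lookup ends at the successor check, so the cost per input segment drops from one comparison per reference segment to a single comparison.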

2.1 Representative Frame Comparison Step

Because a segment is continuous and similar in its spatial features, if a segment of the input video and a segment of an original video are similar, the intensity components of their representative frames are also similar. Therefore, to compare segments, the intensities of the representative frames are compared using Equation (2), which expresses the difference in intensity features between the input segment and the original segment. The original segment with the smallest Dist_frame can thus be regarded as the original segment that was copied.

---- (2)

Here, Q is the input video segment, T is the original video segment, and N is the number of windows in the representative frame. I(Q_i) and I(T_i) are the intensity values of the i-th window of the representative frame of each segment. The smaller Dist_frame is, the more similar the representative frames of the two segments are.
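Equation (2) is not reproduced in this text, so the following sketch only illustrates one form that is consistent with the symbol definitions above: the mean absolute difference of the N window intensities. The normalization by N and the function name `dist_frame` are assumptions, not the patent's confirmed formula.

```python
def dist_frame(query_windows, reference_windows):
    """Mean absolute intensity difference between two representative frames.

    Each argument is the flat list of N window intensities I(Q_i) or I(T_i).
    Smaller values mean more similar representative frames.
    """
    if len(query_windows) != len(reference_windows):
        raise ValueError("both frames must have the same number of windows")
    n = len(query_windows)
    return sum(abs(q - t) for q, t in zip(query_windows, reference_windows)) / n
```

Dropping the division by N would not change which original segment has the smallest distance, because N is the same for every comparison.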

The representative frame is the first frame of a segment, and a segment consists of frames with similar brightness feature values, so the representative frame can serve as the representative feature of the segment. However, artificial editing can change the representative frame, and because a segment is always created at fixed time intervals, the representative frames of successive segments can then be misaligned. To compensate, segments are additionally compared using motion feature values.

2.2 Direction Vector Histogram Comparison Step

Comparing direction vector histograms compensates for the error caused by differing representative frames, mentioned in the representative frame comparison step, when the frames of the original segment and the input segment overlap only partially.

The similarity of direction vector histograms is measured with the histogram intersection proposed by Swain and Ballard (see reference [9]). The higher Dist_motion is, the more similar the motion features of the two segments are.

---- (3)

---- (4)

Here, Q is a frame of the input video, T is a frame of the original video, and i is a quantized direction component. l is the number of quantized direction components, where component 0 means that the window did not move. h_k(Q_i) and h_k(T_i) are the numbers of windows that moved in direction i in the k-th motion of the segment, and n is the number of frames merged into the segment.
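Equations (3) and (4) are not reproduced in this text. The sketch below shows the Swain and Ballard histogram intersection in its usual form (sum of bin-wise minima divided by the total of the reference histogram) and averages it over the motions of a segment. Treating that as the content of Equations (3) and (4), pairing motions by position, and the function names are all assumptions made for illustration.

```python
def histogram_intersection(query_hist, reference_hist):
    """Swain-Ballard intersection, normalized by the reference histogram."""
    total = sum(reference_hist)
    if total == 0:
        return 0.0
    overlap = sum(min(q, t) for q, t in zip(query_hist, reference_hist))
    return overlap / total


def dist_motion(query_hists, reference_hists):
    """Average intersection over the paired motion histograms of two segments.

    Each argument is a list of direction histograms, one per motion k.
    Larger values mean more similar motion.
    """
    pairs = list(zip(query_hists, reference_hists))
    if not pairs:
        return 0.0
    return sum(histogram_intersection(q, t) for q, t in pairs) / len(pairs)
```

With this normalization the result lies between 0 and 1 whenever both histograms count the same number of windows, and it equals 1 only when every paired histogram is identical.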

Dist_frame, produced in the representative frame comparison step, is smaller for more similar videos, whereas Dist_motion is larger for more similar videos. Equation (5) is therefore used to determine the final similarity between segments, and the original segment with the largest similarity value is selected as the original segment that was copied.

---- (5)

3. Video Detection Step

The video detection step determines, from the segment comparison results of the input video, which video it was copied from. This step converts the per-segment results, obtained for the n input segments into which the input video was divided, into results at the level of whole copied videos. When the search result for some segment is wrong, the error can be corrected using the results of neighbouring segments.

An input video produced by artificial editing may consist not just of a single video but of an arbitrary number of different videos. This step therefore uses a detection procedure that can also find the multiple source videos of such an edited input.

Let Q be the serial number of a segment of the input video and R(Q) the serial number of the original video segment judged similar to Q. The per-segment search results are then turned into detected videos as follows.

1) If R(Q)+1, the segment following R(Q), equals R(Q+1), the original segment judged similar to the input segment Q+1 that follows Q, Tol_TRUE is incremented.

2) If Tol_TRUE is greater than L, the criterion for identifying one video, input segments Q to Q+Tol_TRUE are detected as similar to original segments R(Q) to R(Q)+Tol_TRUE. L was determined experimentally.

3) If R(Q+1) ≠ R(Q)+1, Tol_FALSE is incremented. Tol_FALSE is the number of consecutive noise results.

4) If Tol_FALSE is smaller than R, the criterion for identifying a different video, and R(Q)+Tol_FALSE+1 = R(Q+Tol_FALSE+1), segments Q+1 to Q+Tol_FALSE are judged to be correctable noise and are adjusted to the consecutive segments between R(Q) and R(Q+Tol_FALSE+1).

5) If Tol_FALSE is greater than R, the segments from Q+1 onward no longer belong to the same video as those up to Q. Video detection is therefore decided for the segments up to Q by step 2), and the procedure is repeated from step 1) starting at segment Q+1.

In summary, the detection steps are: 1) checking the continuity of the per-segment results, 2) detecting a similar video through continuity, 3) allowing an opportunity for error correction, 4) identifying noise, and 5) identifying the start of a different video.
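The five steps can be sketched in Python as a single pass over the per-segment matches. This is an illustrative reading of the procedure, not a definitive implementation: `matches[q]` stands for R(Q), `min_run` for the threshold L, and `max_noise` for the threshold R. The function name, the return format, and the decision to treat a noise run equal to `max_noise` as the start of a different video are assumptions.

```python
def detect_copied_runs(matches, min_run, max_noise):
    """Group per-segment matches into runs copied from one original video.

    Returns (runs, corrected). Each run is a tuple
    (first_input_segment, last_input_segment, first_original_segment);
    corrected is a copy of matches with bridged noise repaired.
    """
    corrected = list(matches)
    runs = []
    n = len(corrected)
    start = 0  # first input segment of the current run
    q = 0      # last input segment confirmed in the current run
    while start < n:
        # Step 1: the next match continues the sequence.
        if q + 1 < n and corrected[q + 1] == corrected[q] + 1:
            q += 1
            continue
        # Steps 3 and 4: allow fewer than max_noise mismatches if the
        # sequence resumes exactly where it should.
        bridged = False
        for tol_false in range(1, max_noise):
            resume = q + tol_false + 1
            if resume < n and corrected[resume] == corrected[q] + tol_false + 1:
                for k in range(1, tol_false + 1):
                    corrected[q + k] = corrected[q] + k  # repair the noise
                q = resume
                bridged = True
                break
        if bridged:
            continue
        # Steps 2 and 5: close the run and start again at the next segment.
        tol_true = q - start
        if tol_true > min_run:
            runs.append((start, q, corrected[start]))
        start = q + 1
        q = start
    return runs, corrected
```

For example, with `min_run=2` and `max_noise=2`, the matches `[100, 101, 555, 103, 104, 105]` are reported as one run covering input segments 0 to 5, and the isolated wrong match 555 is corrected to 102.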

4. Experiments and Results

In the method according to the present invention, the representative frame of each segment is divided into a total of 465 windows (31×15), and motion features are quantized into nine directions: stationary plus eight directions. A segment carrying the information of 24 frames occupies approximately 1 kb, excluding header information.
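The nine-direction quantization can be illustrated with a short Python sketch. It assumes that a window's motion is expressed as a displacement (dx, dy) between frames and that only the sign of each component matters; the direction numbering, the `DIRECTIONS` table, and the function names are hypothetical and are not taken from the patent.

```python
from collections import Counter

# Assumed direction codes: 0 = stationary, 1..8 = the eight neighbours,
# listed clockwise starting from "up" (image y grows downward).
# The numbering is illustrative only.
DIRECTIONS = {
    (0, 0): 0,
    (0, -1): 1, (1, -1): 2, (1, 0): 3, (1, 1): 4,
    (0, 1): 5, (-1, 1): 6, (-1, 0): 7, (-1, -1): 8,
}


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def quantize(dx, dy):
    """Map a window displacement to one of nine direction codes."""
    return DIRECTIONS[(sign(dx), sign(dy))]


def direction_histogram(displacements):
    """Count how many windows move in each of the nine directions."""
    counts = Counter(quantize(dx, dy) for dx, dy in displacements)
    return [counts.get(code, 0) for code in range(9)]
```

With 465 windows per frame, such a histogram has nine bins whose counts sum to 465 for each pair of consecutive frames.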

To compare search performance, 25 partial videos, each a portion of a single movie, and 25 edited videos, each made by combining portions of several movies, were prepared as input videos. Each of the 50 input videos is 5 minutes long.

For the experiment, 100 hours of movies with a resolution of about 800×336 were prepared as the original videos. The segments of the original videos are assumed to have been extracted and stored in advance.

4.1 Data and Processing Time Analysis

Search time was measured by sequentially searching original videos of 3 hours and 100 hours with input videos of 5 minutes and 15 minutes, respectively. The method used for comparison, Ordinal Measurement (see Reference [5]), performs a sequential search with a 3×3 window size. Each method was run 20 times, and the mean search time is used to compare performance.

Table 1. Feature size comparison

Table 1 shows the size of the comparison features generated by the preprocessing used in the Ordinal Measurement method and in the method of the present invention. With the method of the present invention, one hour of video contains 5,117 segments on average, each segment spans 17 frames on average, and each segment occupies 5,016 bits. The feature size is therefore about 25.7 Mbit (5,117 × 5,016). In the existing Ordinal Measurement method, by contrast, the features for 17 frames occupy 612 bits, so the feature size is about 3.1 Mbit (5,117 × 612). Likewise, for three hours of video, the present invention has a feature size of about 9.11 Mbytes, whereas the Ordinal Measurement method has a feature size of about 1.11 Mbytes. The present invention therefore has a feature size about 8 times larger than that of the existing Ordinal Measurement method. Compared with the original video, however, the feature size in the present invention is reduced to about 1/200.
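The one-hour figures quoted above can be checked with a few lines of Python. The constants are the values stated in the text; the variable and function names are introduced here only for illustration.

```python
SEGMENTS_PER_HOUR = 5_117           # average segments in one hour of video
BITS_PER_SEGMENT = 5_016            # proposed method, per segment
ORDINAL_BITS_PER_17_FRAMES = 612    # Ordinal Measurement, per 17 frames


def feature_bits(segments, bits_per_segment):
    """Total feature size in bits for a given number of segments."""
    return segments * bits_per_segment


proposed_bits = feature_bits(SEGMENTS_PER_HOUR, BITS_PER_SEGMENT)
ordinal_bits = feature_bits(SEGMENTS_PER_HOUR, ORDINAL_BITS_PER_17_FRAMES)
ratio = proposed_bits / ordinal_bits

print(f"proposed: {proposed_bits / 1e6:.1f} Mbit")
print(f"ordinal:  {ordinal_bits / 1e6:.1f} Mbit")
print(f"ratio:    {ratio:.1f}x")
```

The products are 25,666,872 bits and 3,131,604 bits, which round to the 25.7 Mbit and 3.1 Mbit given in the text, and their ratio of roughly 8.2 matches the stated factor of about 8.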

Table 2. Search time comparison

Table 2 compares the experimental results with those of the existing Ordinal Measurement method. The search time of the present invention was reduced to about 1/100 of that of the existing Ordinal Measurement method. This is because segment-level comparison involves less comparison information than frame-level computation over the same amount of video, and because segments belonging to the same video are searched first in view of their sequential order. Search time is proportional to the number of segments in the input video and the number of segments in the original video, and it decreased as the input video became more continuous and contained fewer changes.

4.2 Accuracy Analysis

The 100 hours of original video were searched using the 50 input videos. To measure accuracy at the segment level as well as at the frame level, the Average Precision method was used. Average Precision gives more weight to relevant results that are retrieved at higher ranks. The average precision of the search results generated for each segment of each input video was obtained by Equation (6), and search performance was computed as the arithmetic mean of these values.

$$AP = \frac{1}{R} \sum_{i} x_i \, p_i$$ ---- (6)

Here, R is the number of original segments that are actually similar to one input segment, and i is the rank of a detection result. x_i is a binary function that is 1 if the result at rank i is relevant and 0 otherwise, and p_i is the precision of the results cut off at rank i.
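A minimal Python sketch of this measure follows. It assumes that Equation (6) is the standard average precision, the sum of x_i times p_i over all ranks divided by R; the function and parameter names are hypothetical.

```python
def average_precision(relevant_flags, num_relevant):
    """Average precision for one ranked result list.

    relevant_flags[i - 1] is x_i: 1 if the result at rank i is relevant,
    0 otherwise. num_relevant is R, the number of original segments that
    are actually similar to the input segment.
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, x in enumerate(relevant_flags, start=1):
        if x:
            hits += 1
            total += hits / rank  # p_i: precision of the top-i results
    return total / num_relevant


def mean_average_precision(per_query):
    """Arithmetic mean of average precision over (flags, R) pairs."""
    scores = [average_precision(flags, r) for flags, r in per_query]
    return sum(scores) / len(scores)
```

For a ranked list with relevance flags `[1, 0, 1]` and R = 2, the precisions at the relevant ranks are 1 and 2/3, so the average precision is 5/6.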

The method according to the present invention detects videos that were newly created by collecting and combining video clips without transforming the video content. Copies were detected for both the partial videos and the edited videos, and the segment-level results showed an average detection performance of 99.7%. This shows that the method is also highly accurate at the segment level.

The search performance for video segments degrades in the following cases.

First, at a boundary where the input video has been edited, the segments of the original video, which are divided on a fixed time unit, and the segments of the input video can be offset from each other by a constant interval.

Second, in a scene with no motion for a long time, segments of the fixed time unit have nearly identical values, so a neighboring segment may be detected instead of the exact original segment.

Third, when a segment of the input video is shorter than the segmentation reference time, it carries little information for comparison with other segments, so the precision of the result drops.

As described above, the present invention provides a method of detecting illegally copied videos by merging similar frames into a single segment. It can detect not only illegal copies extracted partially from an original but also copies assembled from portions of different videos, and it detects them faster than frame-by-frame comparison methods.

FIG. 1 is a schematic diagram of the video detection method according to the present invention.

FIG. 2 is a diagram showing the segment extraction process according to the present invention.

FIG. 3 is a diagram for explaining the process of extracting segment feature values according to the present invention.

Claims

1. A representative frame extracting step of extracting a representative frame by using the intensity information of the pixels included in each frame, in which the video frame is divided into N_Frame = N_x × N_y window regions and the representative frame is determined by using the difference between the average intensity values of the pixels included in each window region; and a segment feature extracting step in which, to find the frame that is the end point of the segment starting from the representative frame, each window of the previous frame is compared and searched against the corresponding window position and the surrounding windows in the current frame, and the direction of change is quantized to extract motion feature values of the representative frame and the successive frames; a segment extraction step, comprising the above steps, of extracting a segment to be used for detecting a duplicate video;

A segment comparison step of comparing a segment extracted from the input video to segments extracted from the original video to find a segment included in the original video similar to the segment included in the input video to detect duplication of the input video;

And a video detecting step of determining which image is copied from the segment comparison result of the input video.

2. (Deleted)

3. The method of claim 1, wherein the average intensity of each divided window is calculated using the intensity information of the pixels included in that window, and the average intensity L_i of window i is calculated by

$$L_i = \frac{1}{W \times H} \sum_{(x, y) \in \text{window } i} I(x, y)$$

(where W and H are the width and height of the window, x and y are the pixel coordinates inside window i, and I(x, y) is the intensity value of the pixel at coordinates (x, y)).

4. The method of claim 1, wherein the segment comparison step of finding similar segments comprises:

To compare the similarity between the input segment and the original segment,

(where Q is the input video segment, T is the original video segment, N is the number of windows in the representative frame, and I(Q_i) and I(T_i) are the intensity values of the i-th window of each segment's representative frame)

a representative frame comparison step of comparing the intensity of the representative frames of the input segment and the original segment to express the difference in their intensity features; and

to compensate for errors in the representative frame difference that arise in the representative frame comparison step when the frames included in the original segment and the input segment coincide only partially,

(where Q is a frame of the input video, T is a frame of the original video, i is each quantized direction component, l is the number of quantized direction components, h_k(Q_i) and h_k(T_i) are the numbers of windows having motion in direction component i at the k-th motion in the segment, and n is the number of frames merged into the segment)

measuring similarity by the above,

a direction vector histogram comparison step of making the final determination of similarity between segments.

5. The method of claim 4, wherein, in the representative frame comparison step, the original segment having the smallest Dist_frame is determined to be the copied original segment.

6. The method of claim 4, wherein, in the final determination of similarity between segments, the original segment having the largest similarity value is selected as the copied original segment.

7. The method of claim 1, wherein the video detection step comprises:

where Q is the serial number of a segment of the input video and R(Q) is the serial number of the segment of the original video judged similar to Q,

a segment-level continuity check step of incrementing Tol_TRUE if R(Q)+1, the segment following R(Q), is the same as R(Q+1), the original segment judged similar to Q+1, the segment following input segment Q;

a continuity-based similar video detection step of detecting, if Tol_TRUE is greater than L, the threshold for identifying a single video (where L is a value determined experimentally), that the input segments Q through Q+Tol_TRUE are similar to the original segments R(Q) through R(Q)+Tol_TRUE;

an error correction opportunity step of incrementing Tol_FALSE (where Tol_FALSE is the number of consecutive noise occurrences) if R(Q+1) ≠ R(Q)+1;

a noise determination step of judging, if Tol_FALSE is smaller than R, the threshold for identifying a different video, and R(Q)+Tol_FALSE+1 = R(Q+Tol_FALSE+1), that the segments Q+1 through Q+Tol_FALSE are correctable noise, and adjusting them to the consecutive segments between R(Q) and R(Q+Tol_FALSE+1); and

a step of determining the start of a different video in which, if Tol_FALSE is greater than R, the threshold for identifying a different video, the segments from Q+1 onward are determined to no longer belong to the same video as the segments up to Q, video detection is decided for the segments up to Q by the continuity-based similar video detection step, and the segment-level continuity check step is repeated from segment Q+1, whereby the segment-level search results are detected as videos.