KR101050255B1

KR101050255B1 - Video scene segmentation system and method

Info

Publication number: KR101050255B1
Application number: KR1020090078545A
Authority: KR
Inventors: 이인서; 전종환; 조치원; 이경준
Original assignee: 주식회사 노매드커넥션
Priority date: 2009-08-25
Filing date: 2009-08-25
Publication date: 2011-07-19
Also published as: KR20110021014A

Abstract

The disclosed video scene segmentation system includes a frame extraction unit that extracts the frames constituting a video to be segmented; a coefficient filtering unit that determines candidate segmentation positions according to the energy difference between each extracted frame and its neighboring frame; a motion analysis unit that determines the continuity between the frames before and after each candidate segmentation position determined by the coefficient filtering unit; and a segmentation position determination unit that determines the scene segmentation positions of the video according to the continuity determined by the motion analysis unit.

Video, scene, segmentation

Description

System and Method of Scene Partitioning for Moving Picture

The present invention relates to an image processing system and, more specifically, to a video scene segmentation system and method.

In recent years, demand for multimedia devices such as DVD, HDTV, satellite TV, set-top boxes, and digital cameras has continued to grow. More recently, Internet Protocol Television (IPTV), an interactive television service delivered over high-speed Internet, has been introduced, allowing viewers to select and watch the programs they want at a time convenient to them.

Along with the development of such multimedia devices, research on managing video data effectively is under way; one such area is research on segmenting and structuring video data. Segmentation of video data not only makes it easier to store, index, and search the data, but is also used to insert advertisements in the middle of a video.

For example, cable TV and IPTV services let viewers watch movies and the like free of charge in exchange for watching advertisements during the movie. To pause a movie that is being broadcast and show an advertisement, it is important to stop the movie at an appropriate position, such as where the content of the movie shifts or where the location of the performers changes.

Currently, methods such as color histogram comparison and pixel-wise comparison are used for scene segmentation of videos such as movies.

The pixel-wise comparison method builds on the observation that pixel values change little within the same scene: it compares corresponding pixel values in a pair of consecutive frames and measures how much change has occurred. This method is simple to implement but sensitive to camera movement, so it cannot segment scenes accurately when applied to video with a lot of motion.

The color histogram comparison method, by contrast, exploits the property that frames within the same scene have similar color distributions, and segments scenes by comparing the histogram difference between adjacent frames with a threshold. This method is less sensitive to camera movement but is sensitive to lighting. That is, a sudden change in illumination may cause the same scene to be recognized as a different one, and a genuine scene change goes undetected if the color distributions of the two scenes are similar.

As described above, existing methods use RGB values such as the pixel values or color distribution of the video as their basic information, so they cannot accurately determine camera movement or temporal/spatial continuity, which makes it difficult to structure video effectively. As a result, advertisements may be shown at unintended positions, and preventing this requires manually correcting scene segmentation errors, which is cumbersome and time-consuming.

The present invention has been devised to solve the above disadvantages and problems, and its technical object is to provide a video scene segmentation system and method capable of accurately detecting whether a scene change has occurred.

Another technical object of the present invention is to provide a video scene segmentation system and method capable of dividing one video into a plurality of independent scenes.

To achieve the above technical objects, a video scene segmentation system according to an embodiment of the present invention includes: a frame extraction unit that extracts the frames constituting a video to be segmented; a coefficient filtering unit that determines candidate segmentation positions according to the energy difference between each frame extracted by the frame extraction unit and its neighboring frame; a motion analysis unit that determines the continuity between the frames before and after each candidate segmentation position determined by the coefficient filtering unit; and a segmentation position determination unit that determines the scene segmentation positions of the video according to the continuity determined by the motion analysis unit.

Meanwhile, a video scene segmentation method according to an embodiment of the present invention is a method performed in a video scene segmentation system that includes a frame extraction unit, a coefficient filtering unit, a motion analysis unit, and a segmentation position determination unit. The method includes: extracting, by the frame extraction unit, the frames constituting a video to be segmented; determining, by the coefficient filtering unit, candidate segmentation positions by comparing, for each extracted frame, the energy difference between neighboring frames with a threshold; determining, by the motion analysis unit and with reference to the candidate segmentation positions, the temporal/spatial continuity between the frames before and after each candidate segmentation position; and determining, by the segmentation position determination unit, a frame with low temporal/spatial continuity as a segmentation position.

According to the present invention, frames that are meaningful for scene segmentation are selected by comparing the energy of the frames constituting a video. In addition, the coefficients of intra frames and predictive frames are filtered together, so temporal and spatial continuity can be determined at the same time, which shortens the time required for scene segmentation.

Furthermore, motion of the background and foreground of the video is estimated to determine whether the scene has actually changed in terms of content. This has the advantage that scene change points in the video can be extracted accurately and the scenes separated accurately.

Hereinafter, preferred embodiments of the present invention are described in more detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating the connection relationships of a video scene segmentation system according to the present invention.

As shown, the video scene segmentation system 10 may be connected to a plurality of content provider (CP) servers 30-1 to 30-n through a communication network 20, or may include a database 40 in which videos are stored. It segments, scene by scene, a video transmitted from the CP servers 30-1 to 30-n or a video stored in the database 40.

Scene segmentation uses frame energy together with camera motion analysis and estimation, as described below with reference to FIGS. 2 to 4.

First, FIG. 2 is a block diagram of a video scene segmentation system according to an embodiment of the present invention.

As shown in FIG. 2, the video scene segmentation system 10 includes a controller 110 that controls overall operation, a frame extraction unit 120, a coefficient filtering unit 130, a motion analysis unit 140, a segmentation position determination unit 150, and a memory 160.

The frames constituting a video can be divided into intra frames (I-frames), predictive frames (P-frames), and bidirectional frames (B-frames). An I-frame is an independent frame that has no or very little correlation with the preceding or following pictures. A P-frame depends on a preceding picture, and a B-frame depends on both preceding and following pictures. In other words, I-frames reveal spatial correlation, while P-frames and B-frames reveal temporal correlation. However, because a B-frame is predicted from both past and future frames relative to the playback time of the video, camera movement over time cannot be derived from it, so B-frames are not considered in the present invention.

Accordingly, the frame extraction unit 120 is used to separate the I-frames and P-frames from the video to be segmented.
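The separation of I-frames and P-frames described above can be sketched as follows. This is only an illustration: the `Frame` record and its `index` and `kind` fields are hypothetical stand-ins for whatever picture-type information a real decoder would expose, and the patent does not specify any data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int  # position in the stream (hypothetical field)
    kind: str   # picture type: "I", "P", or "B" (hypothetical field)


def extract_i_and_p(frames):
    """Return (I-frames, P-frames); B-frames are dropped, as in the text."""
    i_frames = [f for f in frames if f.kind == "I"]
    p_frames = [f for f in frames if f.kind == "P"]
    return i_frames, p_frames


# A toy group of pictures: I B B P B B P B B I
stream = [Frame(n, k) for n, k in enumerate("IBBPBBPBBI")]
i_frames, p_frames = extract_i_and_p(stream)
```

Keeping the two lists separate mirrors the separate I-frame and P-frame extraction modules, so that later comparisons can be made between neighboring frames of the same type.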

In general, a P-frame depends heavily on an I-frame, but in some cases there may be P-frames with low dependence on the I-frame. Such a P-frame can be treated as an independent scene, so the coefficient filtering unit 130 selects, as meaningful scenes, those P-frames whose energy difference from the neighboring frame, that is, whose AC (alternating current) coefficient difference or DC (direct current) coefficient difference, is higher than the corresponding preset threshold. Likewise, among the I-frames there may be I-frames that are highly correlated with a neighboring I-frame; the coefficient filtering unit 130 selects, as meaningful scenes, those I-frames whose AC coefficient difference or DC coefficient difference from the neighboring frame is higher than the corresponding preset threshold. In other words, I-frames that are highly correlated with their neighboring I-frames are treated as a single scene.

The DC coefficient denotes the average brightness or chrominance value of each block in a frame, and the AC coefficient denotes an average value for edges or noise. The larger the difference in DC coefficients between frames, the lower the similarity between the frames; a large difference in AC coefficients between frames means that the background of the frame has very likely changed.
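To make the DC/AC distinction concrete, the sketch below computes an orthonormal 2-D DCT-II of a single 8x8 block and reports its DC coefficient together with the summed squared AC coefficients. The choice of DCT normalization, the block size, and the use of "sum of squared AC coefficients" as an energy measure are assumptions for illustration; the patent does not state how the coefficients are aggregated per frame.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def dc_and_ac_energy(block):
    """Return (DC coefficient, sum of squared AC coefficients)."""
    coeffs = dct2(block)
    n = len(coeffs)
    dc = coeffs[0][0]  # proportional to the block mean
    ac_energy = sum(coeffs[u][v] ** 2
                    for u in range(n) for v in range(n) if (u, v) != (0, 0))
    return dc, ac_energy


flat = [[100] * 8 for _ in range(8)]             # uniform block, no detail
edge = [[0] * 4 + [200] * 4 for _ in range(8)]   # same mean, sharp vertical edge
flat_dc, flat_ac = dc_and_ac_energy(flat)
edge_dc, edge_ac = dc_and_ac_energy(edge)
```

Both blocks have the same mean and therefore the same DC coefficient, but only the block containing an edge carries AC energy, which is why the two quantities respond to different kinds of change.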

Therefore, in the present invention, the I-frames and P-frames of the video to be segmented are extracted, and frames with a large AC coefficient difference between adjacent I-frames and frames with a large DC coefficient difference between adjacent P-frames are selected as candidate segmentation positions.

To this end, the frame extraction unit 120 includes an I-frame extraction module 122 and a P-frame extraction module 124, as shown in FIG. 3. The coefficient filtering unit 130 includes a comparison module 132 and a selection module 134.

From the frames subject to segmentation, the I-frame extraction module 122 and the P-frame extraction module 124 of the frame extraction unit 120 extract the I-frames and P-frames, respectively. The comparison module 132 compares the AC coefficient difference and the DC coefficient difference between neighboring I-frames with their respective preset thresholds, and likewise compares the AC coefficient difference and the DC coefficient difference between P-frames with their respective preset thresholds. The selection module 134 then selects, as candidate segmentation positions, the frames estimated from the comparison result of the comparison module 132 to be meaningful scene change positions, that is, the frames whose DC coefficient difference or AC coefficient difference is higher than the threshold.
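The comparison and selection steps can be sketched as below. The function name, the `(frame_index, dc, ac)` tuple layout, and the threshold values are hypothetical; the sketch also assumes that each frame has already been reduced to one DC value and one AC value, which the patent leaves unspecified. It follows the "DC difference or AC difference exceeds its threshold" rule stated in this paragraph.

```python
def select_candidates(energies, dc_threshold, ac_threshold):
    """Pick candidate segmentation positions among frames of one type.

    energies: list of (frame_index, dc, ac) tuples in playback order,
              all taken from frames of the same type (all I or all P).
    """
    candidates = []
    for prev, cur in zip(energies, energies[1:]):
        dc_diff = abs(cur[1] - prev[1])
        ac_diff = abs(cur[2] - prev[2])
        # Either difference exceeding its threshold marks a candidate.
        if dc_diff > dc_threshold or ac_diff > ac_threshold:
            candidates.append(cur[0])
    return candidates


# Hypothetical per-frame values for four neighboring I-frames.
i_frame_energy = [
    (0, 100.0, 20.0),
    (12, 102.0, 21.0),   # small change: same scene
    (24, 160.0, 22.0),   # DC jump
    (36, 161.0, 60.0),   # AC jump
]
candidates = select_candidates(i_frame_energy, dc_threshold=30.0, ac_threshold=25.0)
```

Running the same function once on the I-frame list and once on the P-frame list keeps every comparison between neighboring frames of the same type.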

However, such an energy comparison alone cannot accurately determine whether a scene change has occurred, so the motion analysis unit 140 checks, for the frames determined as candidate segmentation positions, whether the background or foreground of the frame has changed from the previous frame.

That is, the motion analysis unit 140 refers to the frames selected as candidate segmentation positions by the selection module 134 of the coefficient filtering unit 130 and determines temporal/spatial continuity by comparing neighboring frames. Temporal/spatial continuity is determined by global motion analysis or local motion analysis of the frames; for this purpose, the motion analysis unit 140 includes a global motion analysis module 142 and a local motion analysis module 144, as shown in FIG. 4.

The global motion analysis module 142 uses camera motion analysis to check whether the background image of a frame has changed compared with the previous frame. That is, it determines whether the motion is two-dimensional motion caused by a shooting technique such as panning, yawing, or zooming of the camera (temporal continuity) or three-dimensional motion caused by spatial movement (spatial continuity). To do so, the global motion analysis module 142 analyzes the distribution pattern of the residual coefficients, that is, the AC coefficients, of each frame at a candidate segmentation position; from this it can determine whether a large inter-frame energy difference results merely from a shooting technique or from an actual scene change.

The local motion analysis module 144 checks whether the foreground image, that is, the video object, of a frame has changed compared with the previous frame. This amounts to checking whether a person or the person's motion has changed: the module separates the video object from each frame at a candidate segmentation position and then analyzes the distribution pattern of the residual coefficients, that is, the AC coefficients, between neighboring frames to determine temporal and spatial continuity.

Various feature extraction algorithms can be used for global and local motion analysis, for example principal component analysis (PCA) of motion vectors, independent component analysis (ICA) of residual coefficients, and matching pursuit analysis. PCA of the motion vectors can be used to determine two-dimensional motion of the camera (background) or video object (foreground), and ICA of the residual coefficients can be used to determine three-dimensional motion of the camera or video object. Matching pursuit analysis has the advantage that temporal and spatial features can be extracted simultaneously.
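As a rough illustration of principal component analysis applied to motion vectors, the sketch below finds the dominant direction of a set of 2-D motion vectors and the share of variance that direction explains, using the closed-form eigen-decomposition of a 2x2 covariance matrix. How the patent turns such features into a continuity decision is not described, so the function name, its inputs, and the idea of reading a high variance share as "coherent" motion are assumptions.

```python
import math


def principal_component(vectors):
    """Return (unit direction, explained-variance ratio) of 2-D motion vectors."""
    n = len(vectors)
    mean_x = sum(v[0] for v in vectors) / n
    mean_y = sum(v[1] for v in vectors) / n
    # Entries of the 2x2 covariance matrix.
    sxx = sum((v[0] - mean_x) ** 2 for v in vectors) / n
    syy = sum((v[1] - mean_y) ** 2 for v in vectors) / n
    sxy = sum((v[0] - mean_x) * (v[1] - mean_y) for v in vectors) / n
    trace = sxx + syy
    det = sxx * syy - sxy * sxy
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    root = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    largest = trace / 2.0 + root
    # Orientation of the principal axis.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    ratio = largest / trace if trace > 0 else 1.0
    return (math.cos(angle), math.sin(angle)), ratio


# Hypothetical block motion vectors that all lie along one diagonal.
diagonal_motion = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
direction, ratio = principal_component(diagonal_motion)
```

A ratio near 1 means the vectors vary along a single axis, as they do in this toy input; scattered, unrelated vectors would give a ratio closer to 0.5.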

Meanwhile, the segmentation position determination unit 150 determines frames with low temporal/spatial continuity as segmentation positions according to the analysis result of the motion analysis unit 140, segments the video there, and stores the segmented video in the memory 160.

Thus, in the present invention, frames whose energy differs greatly from that of a neighboring frame are extracted from the frames constituting the video to be segmented, and candidate segmentation positions are determined from them. Considering not only I-frames, which are independent of their neighbors, but also P-frames improves the accuracy of scene segmentation.

Temporal/spatial continuity is then determined through motion analysis of the frames before and after each candidate segmentation position. By extracting frames with low temporal/spatial continuity and determining the segmentation positions from them, the frames at which a scene change occurred can be distinguished accurately, improving scene segmentation efficiency.

Accordingly, a 'scene' finally segmented from a video in the present invention can be defined as a scene divided, along the playback time of the video, on the basis of the temporal/spatial continuity of the camera or video object.

After the scenes of the video have been finally segmented in this way, the segmentation position determination unit 150 may include playback time information, that is, the playback order, in the metadata of each segmented scene. This playback time information can also be used to provide application services, such as generating a snapshot for each scene to show the content of the entire video at a glance.
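One way to picture the per-scene metadata is the sketch below, which turns a list of cut positions into ordered scene records. The field names (`order`, `start_frame`, `start_sec`, and so on), the frame-index representation of cut positions, and the constant frame rate are all assumptions for illustration; the patent only says that playback time information is included.

```python
def build_scene_metadata(cut_frames, total_frames, fps):
    """Build ordered scene records from cut positions (frame indices)."""
    # Ignore cuts that would create an empty scene at either end.
    cuts = sorted(c for c in set(cut_frames) if 0 < c < total_frames)
    boundaries = [0] + cuts + [total_frames]
    scenes = []
    for order, (start, end) in enumerate(zip(boundaries, boundaries[1:]), start=1):
        scenes.append({
            "order": order,                      # playback order of the scene
            "start_frame": start,
            "end_frame": end - 1,                # inclusive last frame
            "start_sec": start / fps,
            "duration_sec": (end - start) / fps,
        })
    return scenes


scenes = build_scene_metadata([240, 600], total_frames=900, fps=24.0)
```

With records like these, a snapshot service could pick, for example, the frame at `start_frame` of each scene to build an overview of the whole video.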

The scene segmentation method in the video scene segmentation system 10 is described below with reference to FIGS. 5 to 7.

FIG. 5 is a flowchart illustrating a video scene segmentation method according to an embodiment of the present invention.

The video scene segmentation system 10 receives the video to be segmented from a CP server or from its own database (S10).

Then the frame extraction unit 120 extracts the I-frames and P-frames constituting the video, and the coefficient filtering unit 130 determines candidate segmentation positions by comparing the energy difference between frames with a threshold (S20). In a preferred embodiment of the present invention, the coefficient filtering unit 130 calculates the AC and DC coefficients of the I-frames and of the P-frames, and may select as candidate segmentation positions the I-frames whose AC and DC coefficient differences from the neighboring frame exceed the respective thresholds and the P-frames whose AC and DC coefficient differences from the neighboring frame exceed the respective thresholds.

Next, the motion analysis unit 140 refers to the candidate segmentation positions determined in step S20 and determines temporal/spatial continuity by comparing the frames before and after each candidate segmentation position (S30). That is, it extracts features from the frames before and after the candidate segmentation position and compares them to check whether a two-dimensional or three-dimensional change exists.

Then, according to the result of step S30, the segmentation position determination unit 150 determines frames with low temporal/spatial continuity to be the actual scene change positions of the video (S40) and segments the video at those positions (S50).
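The overall flow of steps S20 to S50 can be sketched as a two-stage filter followed by a split. Everything here is a simplification under stated assumptions: frames are reduced to plain numbers, and `energy_diff` and `is_continuous` are hypothetical callables standing in for the coefficient filtering unit and the motion analysis unit.

```python
def segment_video(frames, energy_diff, threshold, is_continuous):
    """Two-stage scene segmentation over a list of frames."""
    # S20: candidate positions where the energy difference exceeds the threshold.
    candidates = [i for i in range(1, len(frames))
                  if energy_diff(frames[i - 1], frames[i]) > threshold]
    # S30/S40: keep only candidates whose neighbors are not continuous.
    cuts = [i for i in candidates
            if not is_continuous(frames[i - 1], frames[i])]
    # S50: split the frame list at the confirmed cut positions.
    scenes, start = [], 0
    for cut in cuts:
        scenes.append(frames[start:cut])
        start = cut
    scenes.append(frames[start:])
    return scenes


# Toy input: each number stands for one frame's energy.
frames = [10, 11, 50, 51, 90, 91]
scenes = segment_video(
    frames,
    energy_diff=lambda a, b: abs(b - a),
    threshold=20,
    # Pretend the jump into 50 is only a camera pan within the same scene.
    is_continuous=lambda prev, cur: cur == 50,
)
```

The first stage flags both jumps, but only the second survives the continuity check, so the toy video is split into two scenes rather than three.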

The process of determining candidate segmentation positions shown in FIG. 5 is described with reference to FIG. 6. First, the I-frame extraction module 122 and the P-frame extraction module 124 of the frame extraction unit 120 extract the I-frames and P-frames contained in the video, respectively (S201).

Then the comparison module 132 of the coefficient filtering unit 130 calculates the energy of each frame, that is, its AC and DC coefficients (S203), and compares the energy difference between frames with a preset threshold (S205).

The selection module 134 then determines, as candidate segmentation positions, the frames for which the comparison in step S205 shows an inter-frame energy difference greater than the threshold (S207).

After the candidate segmentation positions have been determined in this way, the motion analysis process is performed as shown in FIG. 7.

That is, the global motion analysis module 142 of the motion analysis unit 140 analyzes camera movement by comparing the frames before and after a candidate segmentation position (S301) and determines whether the scene transition was produced by camera movement, that is, whether the motion is global motion (S303).

If the motion is determined to be global motion, features of the background of the frame are extracted, and temporal/spatial continuity is determined by comparing these features with those of the neighboring frame (S309). That is, it is determined whether the motion is two-dimensional motion caused by a shooting technique such as panning, yawing, or zooming of the camera (temporal continuity) or three-dimensional motion caused by spatial movement (spatial continuity). To do so, the global motion analysis module 142 analyzes the distribution pattern of the residual coefficients, that is, the AC coefficients, of each frame at a candidate segmentation position; from this it determines whether a large inter-frame energy difference results merely from a shooting technique or from an actual scene change.

Then, a frame with low temporal/spatial continuity with respect to its neighboring frame is determined as a segmentation position (S40).

If, on the other hand, step S303 determines that the motion is not global motion, that is, it is local motion, the local motion analysis module 144 separates the video object from the frame and then analyzes the distribution pattern of the residual coefficients, that is, the AC coefficients, between neighboring frames to determine temporal/spatial continuity (S309). Likewise, if temporal/spatial continuity is low for the foreground features, the frame is determined as a segmentation position (S40).

As described above, the present invention first selects candidate segmentation positions on the basis of the energy difference between neighboring frames. It then compares the frames before and after each candidate segmentation position to determine temporal/spatial continuity, and determines frames with low continuity as segmentation positions.

Accordingly, the parts of a video where the scene changes in terms of content can be detected quickly and easily, improving both the accuracy and the speed of scene segmentation. Moreover, when this scene segmentation technique is applied to advertisement insertion, broadcasting of the video can be paused and an advertisement shown at an appropriate position, and an advertisement whose content matches the preceding story can be inserted, maximizing the advertising effect.

Those skilled in the art to which the present invention pertains will understand that the invention may be embodied in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

FIG. 1 is a diagram illustrating the connection relationships of a video scene segmentation system according to the present invention;

FIG. 2 is a block diagram of a video scene segmentation system according to an embodiment of the present invention;

FIG. 3 is a block diagram of the frame extraction unit and the coefficient filtering unit shown in FIG. 2;

FIG. 4 is a block diagram of the motion analysis unit shown in FIG. 2;

FIG. 5 is a flowchart illustrating a video scene segmentation method according to an embodiment of the present invention;

FIG. 6 is a flowchart illustrating the process of determining candidate segmentation positions shown in FIG. 5; and

FIG. 7 is a flowchart illustrating the motion analysis process shown in FIG. 5.

<도면의 주요 부분에 대한 부호 설명>Description of the Related Art [0002]

10 : 동영상 장면 분할 시스템 110 : 제어부10: video scene segmentation system 110: control unit

120 : 프레임 추출부 130 : 계수 필터링부120: frame extraction unit 130: coefficient filtering unit

140 : 움직임 분석부 150 : 분할 위치 결정부140: motion analysis unit 150: division position determination unit

160 : 메모리160: memory

Claims

A video scene segmentation system comprising:

a frame extraction unit configured to extract intra frames (I-frames) and predictive frames (P-frames) constituting a video to be segmented;

a coefficient filtering unit including a comparison module that calculates the energy of each of the I-frames and P-frames and compares the energy difference from a neighboring frame of the same type with a predetermined threshold, and a selection module that, based on the comparison result of the comparison module, determines a frame whose energy difference is greater than the threshold as a scheduled segmentation position;

a motion analysis unit configured to determine the continuity between the frames before and after the scheduled segmentation position determined by the coefficient filtering unit; and

a segmentation position determination unit configured to determine a scene segmentation position of the video based on the continuity determined by the motion analysis unit.

(Deleted)

The video scene segmentation system of claim 1,

wherein the comparison module calculates at least one of an alternating current (AC) coefficient and a direct current (DC) coefficient from the I-frame, and compares the AC coefficient difference and the DC coefficient difference from an adjacent I-frame with respective thresholds.

The video scene segmentation system of claim 1,

wherein the comparison module calculates at least one of an alternating current (AC) coefficient and a direct current (DC) coefficient from the P-frame, and compares the AC coefficient difference and the DC coefficient difference from an adjacent P-frame with respective thresholds.
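As an illustrative sketch only, and not the implementation disclosed in this patent, the comparison and selection modules recited above might be expressed as follows for frames of a single type. The sketch assumes that each frame has already been reduced to two summary values, `dc` and `ac`, standing for its DC and AC coefficient energy. The names `FrameEnergy` and `select_candidates` and the threshold values are hypothetical. The claims recite "at least one of" the AC and DC coefficients; the sketch requires both differences to exceed their thresholds, which is one possible reading.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameEnergy:
    index: int   # position of the frame in the video
    kind: str    # "I" or "P"
    dc: float    # hypothetical DC coefficient summary (assumption)
    ac: float    # hypothetical AC coefficient summary (assumption)


def select_candidates(frames: List[FrameEnergy], kind: str,
                      dc_threshold: float, ac_threshold: float) -> List[int]:
    """Return indices of frames of the given kind whose DC and AC differences
    from the previous frame of the same kind both exceed their thresholds."""
    same_kind = [f for f in frames if f.kind == kind]
    candidates = []
    # Compare each frame only with its neighbor of the same type.
    for prev, cur in zip(same_kind, same_kind[1:]):
        dc_diff = abs(cur.dc - prev.dc)
        ac_diff = abs(cur.ac - prev.ac)
        if dc_diff > dc_threshold and ac_diff > ac_threshold:
            candidates.append(cur.index)
    return candidates


# Made-up values: the content changes sharply between frames 3 and 4.
frames = [
    FrameEnergy(0, "I", 100.0, 40.0),
    FrameEnergy(1, "P", 98.0, 41.0),
    FrameEnergy(2, "P", 97.0, 42.0),
    FrameEnergy(3, "I", 101.0, 39.0),
    FrameEnergy(4, "P", 30.0, 90.0),
    FrameEnergy(5, "I", 35.0, 95.0),
]
i_candidates = select_candidates(frames, "I", dc_threshold=20.0, ac_threshold=20.0)
p_candidates = select_candidates(frames, "P", dc_threshold=20.0, ac_threshold=20.0)
```

Running the I-frames and P-frames through the same function with separate thresholds mirrors the way the claims treat the two frame types in parallel.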

The video scene segmentation system of claim 1,

wherein the motion analysis unit comprises a global motion analysis module that analyzes camera motion for each of the frames before and after the scheduled segmentation position determined by the coefficient filtering unit.

The video scene segmentation system of claim 6,

wherein the global motion analysis module determines whether the background image has changed temporally or spatially by analyzing camera motion according to the difference in alternating current (AC) coefficients for each of the frames before and after the scheduled segmentation position.

The video scene segmentation system of claim 6,

wherein the motion analysis unit further comprises a local motion analysis module that analyzes changes in video objects for each of the frames before and after the scheduled segmentation position determined by the coefficient filtering unit.

The video scene segmentation system of claim 8,

wherein the local motion analysis module determines whether the foreground image has changed temporally or spatially according to the difference in alternating current (AC) coefficients for each of the frames before and after the scheduled segmentation position.

The video scene segmentation system of claim 1,

wherein the segmentation position determination unit divides the video into scenes according to the scene segmentation position of the video, and stores playback order information in the metadata of each divided scene.

A video scene segmentation method performed in a video scene segmentation system including a frame extraction unit, a coefficient filtering unit, a motion analysis unit, and a segmentation position determination unit, the method comprising:

extracting, by the frame extraction unit, intra frames (I-frames) and predictive frames (P-frames) from a video to be segmented;

determining, by the coefficient filtering unit, a scheduled segmentation position by comparing, for each of the extracted frames, the energy difference from a neighboring frame with a threshold;

determining, by the motion analysis unit, the temporal/spatial continuity between the frames before and after the scheduled segmentation position; and

determining, by the segmentation position determination unit, a frame having low temporal/spatial continuity as a segmentation position,

wherein determining the scheduled segmentation position comprises:

calculating at least one of an alternating current (AC) coefficient and a direct current (DC) coefficient from the I-frames;

comparing the AC coefficient difference and the DC coefficient difference between neighboring I-frames with respective thresholds;

determining an I-frame for which the AC coefficient difference and the DC coefficient difference are larger than the respective thresholds as the scheduled segmentation position;

calculating at least one of an AC coefficient and a DC coefficient from the P-frames;

comparing the AC coefficient difference and the DC coefficient difference between neighboring P-frames with respective thresholds; and

determining a P-frame for which the AC coefficient difference and the DC coefficient difference are larger than the respective thresholds as the scheduled segmentation position.

(Deleted)

The video scene segmentation method of claim 11,

wherein determining the temporal/spatial continuity comprises:

determining whether the background has changed by analyzing camera motion; and

determining the temporal/spatial continuity by comparing features with neighboring frames when the determination indicates that the background has changed.

The video scene segmentation method of claim 14, further comprising:

determining whether the foreground has changed when the determination indicates that the background has not changed; and

determining the temporal/spatial continuity by comparing features with neighboring frames when the foreground has changed.
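The order of checks recited in the two claims above (background change first, foreground change only if the background is unchanged, feature comparison in either case) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the feature vectors, the distance measure `feature_distance`, the threshold `max_distance`, and the rule that a position with neither change is treated as continuous are all assumptions introduced here.

```python
from typing import Sequence


def feature_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length feature vectors
    (an assumed stand-in for the feature comparison in the claims)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def is_continuous(before: Sequence[float], after: Sequence[float],
                  background_changed: bool, foreground_changed: bool,
                  max_distance: float) -> bool:
    """Decide continuity across a scheduled segmentation position."""
    if background_changed:
        # Background changed: compare features of the neighboring frames.
        return feature_distance(before, after) <= max_distance
    if foreground_changed:
        # Background unchanged but foreground changed: compare features.
        return feature_distance(before, after) <= max_distance
    # Assumption: with neither change detected, the frames are continuous.
    return True


# Hypothetical feature vectors for the frames around a candidate position.
before = [0.2, 0.4, 0.4]
after_similar = [0.25, 0.35, 0.4]
after_different = [0.9, 0.05, 0.05]

pan_within_scene = is_continuous(before, after_similar, True, False, 0.1)
object_cut = is_continuous(before, after_different, False, True, 0.1)
```

A position for which `is_continuous` returns `False` corresponds to a frame with low temporal/spatial continuity, which the method then takes as a segmentation position.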

The video scene segmentation method of claim 11, further comprising, after determining the segmentation position:

dividing, by the segmentation position determination unit, the video into scenes according to the segmentation position; and

storing, by the segmentation position determination unit, segmentation order information in the metadata of the divided scenes.
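The final step, dividing the video at the determined positions and recording order information per scene, might look like the sketch below. The function name `split_scenes`, the metadata keys `order`, `start_frame`, and `end_frame`, and the convention that each scene begins at a segmentation position are hypothetical choices made for illustration; the patent does not specify a metadata layout.

```python
from typing import Dict, List


def split_scenes(frame_count: int, split_positions: List[int]) -> List[Dict[str, int]]:
    """Cut the frame range [0, frame_count) at each split position and
    attach a 1-based order number to the metadata of every scene."""
    # Ignore duplicates and positions outside the video, then sort.
    cuts = sorted(p for p in set(split_positions) if 0 < p < frame_count)
    boundaries = [0] + cuts + [frame_count]
    scenes = []
    for order, (start, end) in enumerate(zip(boundaries, boundaries[1:]), start=1):
        scenes.append({"order": order, "start_frame": start, "end_frame": end - 1})
    return scenes


# A 300-frame video with two segmentation positions, given out of order.
scenes = split_scenes(frame_count=300, split_positions=[120, 45])
```

Keeping the order number in each scene's metadata lets the scenes be stored or transmitted independently and still be played back in their original sequence.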