KR20090037288A

KR20090037288A - Method for real-time scene-change detection for rate control of video encoder, method for enhancing quality of video telecommunication using the same, and system for the video telecommunication

Info

Publication number: KR20090037288A
Application number: KR1020080075307A
Authority: KR
Inventors: 이창현; 송관웅; 박영오; 김용석; 주영훈; 박태성; 권재훈; 정도영; 박재성; 김성기; 김용규; 오윤제
Original assignee: 삼성전자주식회사
Priority date: 2007-10-10
Filing date: 2008-07-31
Publication date: 2009-04-15
Also published as: KR101490521B1

Abstract

Provided are a real-time scene change detection method for rate control in video encoding, a method for improving video call quality using the same, and a video call system, in which the bit resources of out-of-focus images are assigned to subsequently input images to improve image quality. A frequency converter (104) performs a discrete cosine transform. A quantizer (106) quantizes the blocks of spectral data coefficients output by the frequency converter, applying uniform scalar quantization to the spectral data with a given step size. The quantizer receives the per-frame variation of the quantization parameter from a QP adjusting unit (34). An entropy coder (108) compresses the output of the quantizer together with certain side information of the corresponding macroblock. The video information compressed by the entropy coder is buffered in an encoder buffer (110), and a buffer level indicator of the encoder buffer is provided to an encoder QP controller (30).

Description

METHOD FOR REAL-TIME SCENE-CHANGE DETECTION FOR RATE CONTROL OF VIDEO ENCODER, METHOD FOR ENHANCING QUALITY OF VIDEO TELECOMMUNICATION USING THE SAME, AND SYSTEM FOR THE VIDEO TELECOMMUNICATION

The present invention relates to video encoding, and more particularly to a real-time scene change detection method that is performed ahead of data rate control during video encoding.

Various digital video compression techniques have been proposed to achieve a low data rate or a small storage footprint while maintaining high image quality when a video signal is transmitted or stored. These techniques are defined in international standards such as H.261, H.263, H.264, MPEG-2, and MPEG-4. They achieve relatively high compression ratios through techniques such as the Discrete Cosine Transform (DCT) and Motion Compensation (MC). Such compression is applied so that video data streams can be delivered efficiently over various digital networks, for example cellular networks, computer networks, cable networks, and satellite networks. It is also applied so that video can be stored efficiently on storage media such as hard disks, optical disks, and digital video disks (DVD).

High image quality requires a large amount of data during video encoding. However, the communication network that carries the video data may limit the data rate available for encoding. For example, the data channels of satellite broadcasting systems and digital cable television networks generally transmit data at a constant bit rate (CBR). The capacity of storage media such as disks is also limited.

The video encoding process therefore trades off image quality against the number of bits needed to compress the image. In addition, because video encoding requires relatively complex processing, an implementation in software, for example, needs a relatively large number of CPU cycles. Moreover, when encoding must run in real time, the time constraint limits the precision with which encoding can be performed, and this in turn limits the achievable image quality.

Data rate control for video encoding is thus an important issue in real use environments, and rate control schemes have been proposed that aim to obtain high image quality while keeping processing complexity and the transmitted data rate as low as possible.

The JVT (Joint Video Team of the ITU-T Video Coding Experts Group and the ISO/IEC 14496-10 AVC Moving Picture Experts Group) document by Z. G. Li, F. Pan, K. P. Lim, G. Feng, X. Lin, and S. Rahardja, "Adaptive basic unit layer rate control for JVT", JVT-G012-r1, 7th Meeting, Pattaya II, Thailand, Mar. 2003, discloses a basic technique for controlling the data rate by adjusting the quantization parameter (QP) when video frames are encoded according to an MPEG video compression algorithm.

When the available resources (such as the transmission rate) are limited and a scene change occurs in an inter frame within a group of pictures (GOP) during encoding, the flow of data rate control breaks down. This is because rate control is designed on the assumption that each frame is similar to the preceding one. Real-time scene change detection is needed to prevent this situation in advance.

For scene change detection, techniques such as correlation, statistical sequential analysis, and histograms are used to measure the similarity between neighboring frames. In H.264/AVC-compressed video, intra-coded macroblocks may also appear within an inter frame as a result of rate distortion optimization (RDO), and when the number of intra-coded macroblocks in an inter frame reaches a certain level, the frame can be regarded as a scene change.

However, although judging a scene change from the number of intra-coded macroblocks in an inter frame of H.264/AVC-compressed video is simple, it cannot be done in real time. Because of the "chicken and egg dilemma" that arises in the H.264/AVC RDO process, the number of intra-coded macroblocks in an inter frame cannot be known without the quantization parameter (QP).

To solve this problem, methods that judge a scene change by measuring the dissimilarity between frames have been studied. Inter-frame dissimilarity can be measured with a dissimilarity metric (DM) computed on the compressed image or with a dissimilarity metric computed on the uncompressed image.

Because scene change detection is performed for the sake of video bit rate control, it must be completed before bit rate control is carried out. Bit rate control in turn must produce the quantization parameter before the image is compressed. Scene change detection therefore has to run before image compression, so computing the dissimilarity metric (DM) on the compressed image is not suitable.

On uncompressed images, the dissimilarity metric (DM) can be measured using the mean square error (MSE) of the frame. Computing the DM from the MSE operates on the pixels of the frame and does not require much computation, but its scene change detection performance is poor for video with a lot of motion. To overcome this drawback, methods that compute the DM from both the pixels and the histogram of the frame are used. One such existing method uses all four of the following dissimilarity metrics (4DMs): mean absolute frame difference (MAFD), MAFD after histogram equalization with normalization (HEN), signed difference MAFD (SDMAFD) after HEN, and absolute difference frame variance (ADFV) after HEN. The 4DMs method detects scene changes well but requires a large amount of computation, so it is not easy to detect scene changes in real time with it.

The present invention has been made in view of the foregoing, and an object thereof is to provide a real-time scene change detection method for data rate control in video encoding that has low hardware complexity and detects scene changes in real time more efficiently.

Another object of the present invention is to provide a method that detects images produced as a result of errors and uses this detection to improve image quality.

To achieve the above objects, a real-time scene change detection method for data rate control in video encoding according to the present invention includes: dividing the current frame into a plurality of regions and computing a dissimilarity metric (DM) for each divided region; determining whether the dissimilarity metric of each region falls outside a preset reference value; counting, within the current frame, the regions whose dissimilarity metric falls outside the preset reference value; and, when the number of such regions is greater than or equal to a preset number, regarding a scene change as having occurred in the current frame.

Computing the dissimilarity metric (DM) of each divided region may include predicting the PSNR (Peak Signal to Noise Ratio) of the current frame using the sample-wise error between the current frame and the reconstructed previous frame (the reference frame).

Computing the dissimilarity metric (DM) of each divided region may consist of computing, for each region, the ratio between the PSNR predicted for the current frame (Predicted Peak Signal to Noise Ratio, hereinafter PPSNR) and the average PPSNR of the frames since the last scene change.

Furthermore, the scene change detection method according to one aspect of the present invention may further include computing the derivative of the predicted PSNR of the frames input after the frame in which the scene change occurred, and, when the computed value is negative, setting that frame as the frame at which the scene change ended.
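The end-of-transition test described above can be sketched as follows. This is a minimal illustration that assumes the derivative is approximated by the first difference between consecutive PPSNR values; the function name `end_of_transition` and the list-based input are hypothetical and do not come from the patent.

```python
def end_of_transition(ppsnr_values):
    """Return the index of the first frame whose PPSNR first difference is negative.

    ppsnr_values holds the predicted PSNR of the frames that follow a detected
    scene change, in input order. Returns None if no negative difference occurs.
    Assumption: the derivative is approximated by a first difference.
    """
    for idx in range(1, len(ppsnr_values)):
        if ppsnr_values[idx] - ppsnr_values[idx - 1] < 0:
            return idx
    return None


if __name__ == "__main__":
    # PPSNR rises while the picture settles, then drops once.
    print(end_of_transition([10.0, 14.0, 20.0, 19.0, 25.0]))
```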

A method for improving video call quality according to another aspect of the present invention is a video call method for a wireless terminal that includes: detecting the start frame and the end frame of an interval of abrupt change in the terminal's input video signal; skipping, at the transmitter, the encoding of the detected images; and, at the receiver, copying the previously received preceding frame and playing it back in place of the skipped frames. The end frame is detected through the derivative of the predicted PSNR obtained between the input image and the reconstructed previous image.

A video call system according to yet another aspect of the present invention is a video call system using wireless terminals that includes: a transmitter that has a detector for detecting the start frame and the end frame of an interval of abrupt change in the terminal's input video signal and that skips frame encoding for all detected images; and a receiver that copies the previously received preceding frame and plays it back in place of the skipped frames. The detector operates on the derivative of the predicted PSNR obtained between the input image and the reconstructed previous image.

As described above, the real-time scene change detection scheme for data rate control in video encoding according to the present invention has low hardware complexity and can detect scene changes in real time more efficiently.

In addition, the video coding method and system according to the present invention can improve image quality by accumulating the bit resources of out-of-focus images and allocating them to subsequently input images.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. The following description includes specific details such as particular components. These are provided only to aid overall understanding of the invention, and it will be apparent to those of ordinary skill in the art that such details may be modified or changed within the scope of the invention.

FIG. 1 is a block diagram of a video encoder apparatus to which a scene change detection method according to an embodiment of the present invention is applied. Referring to FIG. 1, the apparatus may include a general H.264/AVC (Advanced Video Coding) encoder (10) that receives a sequence of video frames and outputs compressed video data. It also includes a frame storage memory (20) that stores frames and an encoder QP controller (30) that performs quantization parameter (QP) control for the data rate control of the encoder (10).

The configuration and operation of the encoder (10) are described first. The encoder (10) includes a frequency converter (104), a quantizer (106), an entropy coder (108), an encoder buffer (110), an inverse quantizer (116), an inverse frequency converter (114), a motion estimator/compensator (120), and a filter (112).

If the current frame is an inter frame (for example, a P frame), the motion estimator/compensator (120) estimates and compensates the motion of the macroblocks in the current frame relative to the reference frame, which is the reconstructed previous frame buffered in the frame storage memory (20). A frame is processed in units of macroblocks, each corresponding to, for example, 16x16 pixels of the original image. Each macroblock is encoded in intra or inter mode. Motion estimation outputs motion information such as motion vectors as side information, and motion compensation applies the motion information to the reconstructed previous frame to generate a motion-compensated current frame. The difference between a macroblock of the motion-compensated current frame (the prediction macroblock) and the corresponding macroblock of the original current frame is provided to the frequency converter (104).

The frequency converter (104) converts spatial-domain video information into frequency-domain (for example, spectral) data. It typically performs a Discrete Cosine Transform (DCT) to generate blocks of DCT coefficients on a macroblock basis.

The quantizer (106) quantizes the blocks of spectral data coefficients output by the frequency converter (104). It typically applies uniform scalar quantization to the spectral data with a step size that varies from frame to frame. For data rate control, the quantizer (106) receives the per-frame variation of the quantization parameter (QP) from the QP adjuster (34) of the encoder QP controller (30).
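To make the role of the step size concrete, the following is a minimal sketch of uniform scalar quantization. The QP-to-step-size mapping uses the commonly cited H.264 values (the step doubles for every increase of 6 in QP); the function names are illustrative assumptions, and real encoders use integer arithmetic with a dead zone rather than this floating-point rounding.

```python
def qstep_from_qp(qp: int) -> float:
    """Approximate H.264 quantizer step size for a given QP (illustrative)."""
    base = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]  # steps for QP 0..5
    return base[qp % 6] * (2 ** (qp // 6))


def quantize(coeffs, step):
    """Map each coefficient to an integer level, rounding half away from zero."""
    levels = []
    for c in coeffs:
        level = int(abs(c) / step + 0.5)
        levels.append(level if c >= 0 else -level)
    return levels


def dequantize(levels, step):
    """Reconstruct coefficient values from integer levels."""
    return [level * step for level in levels]


if __name__ == "__main__":
    step = qstep_from_qp(10)
    levels = quantize([10.0, -3.2, 0.4], step)
    print(step, levels, dequantize(levels, step))
```

A larger QP gives a larger step, so more coefficients collapse to zero and fewer bits are spent, at the cost of a larger reconstruction error.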

The entropy coder (108) compresses the output of the quantizer (106) together with certain side information of the corresponding macroblock (for example, motion information, spatial extrapolation mode, and quantization parameter). Commonly used entropy coding techniques include arithmetic coding, Huffman coding, run-length coding, and LZ (Lempel-Ziv) coding. The entropy coder (108) typically applies different coding techniques to different kinds of information.

The video information compressed by the entropy coder (108) is buffered in the encoder buffer (110). A buffer level indicator of the encoder buffer (110) is provided to the encoder QP controller (30) for data rate adjustment. The video information stored in the encoder buffer (110) is output from the buffer, and removed from it, at a fixed transmission rate, for example.
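The buffer behaviour described above can be illustrated with a small simulation. This is a sketch under simple assumptions: one fixed drain amount per frame interval, a hard capacity limit, and a hypothetical function name `buffer_levels`; the patent does not specify how the buffer level indicator is computed.

```python
def buffer_levels(frame_bits, drain_per_frame, capacity):
    """Track encoder buffer fullness after each frame.

    frame_bits: bits produced by the entropy coder for each frame.
    drain_per_frame: bits removed per frame interval (constant bit rate).
    capacity: buffer size in bits; the level is clamped to [0, capacity].
    """
    level = 0
    levels = []
    for bits in frame_bits:
        level = min(capacity, max(0, level + bits - drain_per_frame))
        levels.append(level)
    return levels


if __name__ == "__main__":
    # A large frame (for example after a scene change) raises the level sharply.
    print(buffer_levels([1000, 3000, 500], drain_per_frame=1200, capacity=5000))
```

A rate controller would typically raise the QP as the level approaches the capacity and lower it as the buffer empties.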

When the reconstructed current frame is needed for subsequent motion estimation/compensation, the inverse quantizer (116) performs inverse quantization on the quantized spectral coefficients. The inverse frequency converter (114) reverses the operation of the frequency converter (104), producing an inverse difference macroblock from the output of the inverse quantizer (116), for example through an inverse DCT. Because of signal loss and similar effects, this macroblock is not identical to the original difference macroblock.

When the current frame is an inter frame, the reconstructed inverse difference macroblock is combined with the prediction macroblock from the motion estimator/compensator (120) to produce a reconstructed macroblock. Reconstructed macroblocks are stored in the frame storage memory (20) as the reference frame used to predict the next frame. Because a reconstructed macroblock is a distorted version of the original macroblock, some embodiments apply a deblocking filter (112) to the reconstructed frame to smooth discontinuities between macroblocks.

In accordance with a feature of the present invention, the encoder QP controller (30), which controls the QP of the encoder (10), includes a scene change detector (32) that detects scene changes in real time from the current frame and the reference frame stored in the frame storage memory (20). When the scene change detector (32) detects a scene change, the information is provided to the QP adjuster (34), which then adjusts the quantization parameter of the quantizer (106) so that the encoder responds appropriately to the scene change in the current frame.

To this end, and to keep the computational load of the scene change decision from growing, the scene change detector (32) uses only the predicted value of the PSNR (Peak Signal to Noise Ratio) of the current frame (PPSNR; Predicted PSNR). Specifically, the scene change detector (32) divides the current frame into a plurality of regions as shown in FIG. 2 and predicts the PSNR of each region. It then computes the dissimilarity metric (DM) of each region, checks whether each DM falls outside a predetermined reference value, and counts how many regions in the frame do so. If the count is greater than or equal to a predetermined value, a scene change is regarded as having occurred in the current frame.

In the present invention, the dissimilarity metric (DM) of each region is the ratio of the PPSNR of the current frame to the average PPSNR of the previous frames, so that local changes within the frame can be identified. It may be calculated as in Equation 1.
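The image of Equation 1 is not reproduced in this text. A form consistent with the symbol definitions given in the description would be the following; the summation bounds and the normalizing count are assumptions and are not taken from the source.

$$
DM^{x}_{proposed,i} = \frac{PPSNR^{x}_{i,i-1}}{\dfrac{1}{i-1-S_j}\displaystyle\sum_{k=S_j+1}^{i-1} PPSNR^{x}_{k,k-1}}
$$

Here the numerator is the predicted PSNR of region $x$ in the current frame $i$, and the denominator is the mean predicted PSNR of the same region over the frames that follow the most recent scene change frame $S_j$.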

In Equation 1, $x$ in $DM^{x}_{proposed,i}$ is the identification number of the divided region, $i$ is the frame number of the current frame, and $S_j$ is the frame number at which the $j$-th abrupt scene change occurred. $DM^{x}_{proposed,i}$ is the ratio of the PPSNR of each region of the current frame to the average PPSNR of that region since the scene change. $PPSNR^{x}_{k,k-1}$ and $PPSNR^{x}_{i,i-1}$ are obtained from Equations 2 and 3, respectively.

In Equations 2 and 3, $n$ is the number of bits per sample (that is, per pixel), and is generally set to 8. PMSE is the predicted MSE (Mean Square Error) of the current frame and is obtained from Equations 4 and 5.
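The images of Equations 2 to 5 are not reproduced in this text. Assuming the standard PSNR definition and a mean square error taken between original samples $O$ of a frame and reconstructed reference samples $R$ of the preceding frame, a consistent reconstruction would be:

$$
PPSNR^{x}_{k,k-1} = 10\log_{10}\frac{(2^{n}-1)^{2}}{PMSE^{x}_{k,k-1}}, \qquad
PPSNR^{x}_{i,i-1} = 10\log_{10}\frac{(2^{n}-1)^{2}}{PMSE^{x}_{i,i-1}}
$$

$$
PMSE^{x}_{k,k-1} = \frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N}\left(O^{k}_{mn}-R^{k-1}_{mn}\right)^{2}, \qquad
PMSE^{x}_{i,i-1} = \frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N}\left(O^{i}_{mn}-R^{i-1}_{mn}\right)^{2}
$$

The summation index $n$ inside the sums is the sample row, distinct from the bit depth $n$ in the PSNR expressions. For a region $x$, the sums would presumably run over the samples of that region only, with $MN$ replaced by the region's sample count; this restriction is an assumption.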

In Equations 4 and 5, $O^{k}_{mn}$ and $O^{i}_{mn}$ are the original samples at column $m$, row $n$ of the $k$-th frame and the $i$-th frame (the current frame), respectively, and $R^{j}_{mn}$ is the reconstructed reference sample at column $m$, row $n$ of the $j$-th frame (the previous frame). One frame consists of $M \times N$ pixels.
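The PMSE and PPSNR computation can be sketched in a few lines. This is an illustrative sketch that assumes the standard PSNR definition; the function names and the cap applied when the two blocks are identical (where the PSNR is otherwise undefined) are assumptions, not part of the patent.

```python
import math


def pmse(original, reference):
    """Mean square error between two equally sized 2-D sample arrays."""
    total = 0
    count = 0
    for row_o, row_r in zip(original, reference):
        for o, r in zip(row_o, row_r):
            total += (o - r) ** 2
            count += 1
    return total / count


def ppsnr(original, reference, bits_per_sample=8, cap_db=100.0):
    """Predicted PSNR in dB between current samples and reconstructed reference samples.

    cap_db is returned when the error is zero (assumption, to avoid log of infinity).
    """
    err = pmse(original, reference)
    if err == 0:
        return cap_db
    peak = (2 ** bits_per_sample - 1) ** 2
    return 10.0 * math.log10(peak / err)


if __name__ == "__main__":
    current = [[101, 99], [100, 102]]
    previous_reconstructed = [[100, 100], [100, 100]]
    print(pmse(current, previous_reconstructed), ppsnr(current, previous_reconstructed))
```

Because the measure needs only the current input frame and the already reconstructed previous frame, it can be evaluated before the current frame is encoded.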

In the present invention, whether the current frame is a scene change is determined by counting how many of the regions that make up the frame have a dissimilarity metric $DM^{x}_{proposed,i}$ within a predetermined threshold.

The regions whose dissimilarity metric $DM^{x}_{proposed,i}$ is within the predetermined threshold are identified by Equation 6, and whether a scene change has occurred is determined by Equation 7.

In Equation 6, $\beta$ is the threshold applied to the dissimilarity metric $DM^{x}_{proposed,i}$ of each region. In Equation 7, $\alpha$ is the threshold that defines the proportion of regions required to declare a scene change, and $N_f$ is the number of regions into which the frame is divided.
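The images of Equations 6 and 7 are not reproduced in this text. A reconstruction consistent with the threshold definitions and with the flowchart description (a region is flagged when its metric is below $\beta$, and a scene change is declared when the flagged count reaches $\alpha N_f$) would be:

$$
C^{x} = \begin{cases} 1, & DM^{x}_{proposed,i} < \beta \\ 0, & \text{otherwise} \end{cases}
\qquad\qquad
\sum_{x=0}^{N_f-1} C^{x} \ge \alpha N_f \;\Rightarrow\; \text{scene change in frame } i
$$

The indexing of regions from $0$ to $N_f-1$ follows the flowchart description; the exact typeset form of the original equations is an assumption.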

For example, let $N_f$ be 12, $\alpha$ be 0.75, and $\beta$ be 0.7. If the number of regions whose dissimilarity metric $DM^{x}_{proposed,i}$ is less than 0.7 is 9 or more (0.75 × 12 = 9), the current frame is judged to be an abrupt scene change frame. The values of $\alpha$ and $\beta$ may be determined experimentally.
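The numerical example above can be checked with a short sketch. The function names are illustrative; the thresholds are the example values from the text, and the per-region metric values in the usage example are made up for illustration.

```python
def region_flags(dm_values, beta=0.7):
    """Flag each region: 1 if its dissimilarity metric is below beta, else 0."""
    return [1 if dm < beta else 0 for dm in dm_values]


def is_scene_change(dm_values, alpha=0.75, beta=0.7):
    """Declare a scene change when the flagged count reaches alpha * N_f."""
    n_regions = len(dm_values)
    return sum(region_flags(dm_values, beta)) >= alpha * n_regions


if __name__ == "__main__":
    # Hypothetical metrics for 12 regions: 9 fall below 0.7, 3 do not.
    dms = [0.2, 0.3, 0.25, 0.4, 0.5, 0.6, 0.1, 0.65, 0.3, 0.9, 1.0, 0.95]
    print(sum(region_flags(dms)), is_scene_change(dms))
```

Requiring a proportion of regions, rather than a single frame-wide value, keeps localized motion in a few regions from being mistaken for a scene change.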

FIG. 3 is a flowchart of a real-time scene detection operation according to an embodiment of the present invention, which may be performed by the scene change detection unit 32 shown in FIG. 1. Referring to FIG. 3, an image frame is first received in step 302. The received frame is then divided into N_f regions in step 304. Next, Equations 1 to 5 are evaluated to obtain the dissimilarity measure DM^x_{proposed,i} of each region (step 306). In step 308, the dissimilarity measure DM^x_{proposed,i} is compared, using Equation 6, with the predetermined threshold β to obtain C^x for each region. For example, when β is set to 0.7, C^x is set to 1 if the computed dissimilarity measure is less than 0.7, and to 0 if it is greater than or equal to 0.7.

After step 308, the values of x and N_f are checked to confirm whether C^x has been computed for every region of the frame (step 310). For example, assuming the frame is divided into 12 regions as in FIG. 2 (that is, N_f = 12), x may be set to 0 for the first region whose dissimilarity measure is computed and to 11, which is N_f − 1, for the last. Accordingly, step 310 may check whether x equals N_f − 1. If C^x has not yet been computed for every region (310-No), step 311 updates x, and steps 306, 308, 310, and 311 are repeated until the dissimilarity measure DM^x_{proposed,i} and C^x of every region in the frame have been computed. If C^x has been computed for every region (310-Yes), step 312 is performed.

Step 312 sums, using Equation 7, the C^x values of all regions computed in step 308. Step 314 then compares the sum with a predetermined threshold to determine whether an abrupt scene change has occurred in the current frame. For example, when the frame was divided into 12 regions in step 304 and α is set to 0.75, the current frame is regarded as an abrupt scene change if the sum of the C^x values is 9 or more. If step 314 determines that an abrupt scene change has occurred (314-Yes), a scene change detection signal is generated in step 316, and the S_j value used in computing the dissimilarity measure DM^x_{proposed,i} is updated in step 318. The scene change detection signal may then be provided to the QP adjuster 34, which accordingly adjusts the quantization parameter of the quantizer 106 when a scene change is detected. If step 314 determines that no abrupt scene change has occurred (314-No), step 320 is performed.

Step 320 checks whether the input frame is the last frame of the video. If it is the last frame (320-Yes), scene change checking ends. If frames continue to be input (320-No), steps 302 to 318 are repeated until frame input ends.
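The per-frame decision in steps 308 to 314 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the per-region dissimilarity measures have already been computed (Equations 1 to 5 are not reproduced here), and the function names `region_flags` and `is_abrupt_scene_change` are hypothetical.

```python
from typing import List, Sequence


def region_flags(dm_values: Sequence[float], beta: float) -> List[int]:
    """Steps 308-311: C^x is 1 when region x's DM is below beta, else 0.

    The direction of the comparison (DM < beta counts as changed) follows
    the description of step 308.
    """
    return [1 if dm < beta else 0 for dm in dm_values]


def is_abrupt_scene_change(dm_values: Sequence[float],
                           alpha: float = 0.75,
                           beta: float = 0.7) -> bool:
    """Steps 312-314: sum the C^x flags and compare with alpha * N_f."""
    n_f = len(dm_values)                          # number of regions, N_f
    changed = sum(region_flags(dm_values, beta))  # step 312
    return changed >= alpha * n_f                 # step 314


# Hypothetical DM values for a frame split into 12 regions:
# 9 regions fall below beta = 0.7, so the threshold 0.75 * 12 = 9 is met.
example = [0.5] * 9 + [0.9] * 3
print(is_abrupt_scene_change(example))
```

With the example values from the description (N_f = 12, α = 0.75, β = 0.7), nine flagged regions are enough to report an abrupt scene change, while eight are not.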

The effectiveness of the scene change detection method according to the present invention is examined below through experiments. For the experiments, two test videos were selected whose fast motion and lighting changes make abrupt scene change detection difficult. The selected videos are listed in Table 1 below. The test sequences are named 'Worldcup' and 'FF-X2'. They consist of 6,843 and 7,138 frames and contain 13 and 159 abrupt scene change frames, respectively.

| Sequence | Sequence comment | Number of frames | Number of abrupt scene changes |
|---|---|---|---|
| Worldcup | Sports highlight | 6,843 | 13 |
| FF-X2 | Animation highlight | 7,138 | 159 |

Using the two test sequences described above, scene changes were detected with each of the following: the prior-art MSE DM and 4DMs methods, the method described in Korean Patent Application No. 10-2006-0075856 previously filed by the present applicant (hereinafter, the method described in the '856 patent), and the method proposed in the present invention. The errors that occurred during detection are listed in Table 2 below. In Table 2, 'Number of FALSE' is the number of frames detected as scene changes although no scene change actually occurred, and 'Number of MISS' is the number of actual scene changes that were not detected. DP_FalseMiss (%) was also computed for each method. DP_FalseMiss (%) is the ratio of the sum of FALSE and MISS to the number of scene changes contained in the video, and is the result of evaluating Equation 8 below.

| Sequence | ASC detection algorithm | Number of FALSE | Number of MISS | DP_FalseMiss (%) |
|---|---|---|---|---|
| Worldcup | MSE DM | 2 | 2 | 30.8 |
| Worldcup | 4DMs | 0 | 1 | 7.7 |
| Worldcup | Method described in the '856 patent | 3 | 3 | 46.2 |
| Worldcup | Method described in the present invention | 0 | 1 | 7.7 |
| FF-X2 | MSE DM | 97 | 60 | 98.7 |
| FF-X2 | 4DMs | 11 | 50 | 38.4 |
| FF-X2 | Method described in the '856 patent | 52 | 59 | 69.8 |
| FF-X2 | Method described in the present invention | 31 | 48 | 49.7 |
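The DP_FalseMiss column can be recomputed from the FALSE and MISS counts. The sketch below assumes that Equation 8 is the percentage ratio stated in the text, (FALSE + MISS) divided by the number of scene changes; the function name `dp_false_miss` and the dictionary layout are illustrative only.

```python
def dp_false_miss(n_false: int, n_miss: int, n_scene_changes: int) -> float:
    """Percentage of (FALSE + MISS) relative to the number of scene changes."""
    return 100.0 * (n_false + n_miss) / n_scene_changes


# (FALSE, MISS) counts copied from Table 2; scene change totals from Table 1.
SCENE_CHANGES = {"Worldcup": 13, "FF-X2": 159}
COUNTS = {
    ("Worldcup", "MSE DM"): (2, 2),
    ("Worldcup", "4DMs"): (0, 1),
    ("Worldcup", "'856 patent"): (3, 3),
    ("Worldcup", "present invention"): (0, 1),
    ("FF-X2", "MSE DM"): (97, 60),
    ("FF-X2", "4DMs"): (11, 50),
    ("FF-X2", "'856 patent"): (52, 59),
    ("FF-X2", "present invention"): (31, 48),
}

for (sequence, method), (n_false, n_miss) in COUNTS.items():
    value = dp_false_miss(n_false, n_miss, SCENE_CHANGES[sequence])
    print(f"{sequence:8s} {method:18s} {value:5.1f}")
```

Rounded to one decimal place, the results agree with the DP_FalseMiss values listed in Table 2.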

Referring to Table 2, when scene changes are detected using the method according to the present invention, detection performance is about 36.1% better than that of MSE DM and about 5.7% lower than that of 4DMs. These figures correspond to the differences in DP_FalseMiss averaged over the two test sequences.

In addition, the computational load required to perform the above methods was measured on personal computers, and the results are shown in the graph of FIG. 4.

The personal computers used in the experiment ran Microsoft® Windows® XP as the operating system (OS) and included a storage medium containing the Intel® VTune™ Performance Analyzer 8.0 program. The experiment was run in time-based mode, which uses the operating system timer of the personal computer, and the sampling interval was set to 1 ms. To increase the reliability of the experiment, measurements were taken on three different computers meeting the above conditions, and the values measured on the three computers were averaged. FIG. 4 combines the timer samples obtained from the three computers and shows the measured computational load of each algorithm. In terms of computational load, the algorithm proposed in the present invention improves on the 'MSE DM' algorithm by 34.8% and on the '4 DMs' algorithm by as much as 93.1%. As FIG. 4 shows, this is significant given the computational load of H.264 frame layer rate control. The present invention may therefore be an abrupt scene change detection algorithm applicable to appropriate bit rate control for abrupt scene changes in H.264 video encoding. In addition, the present invention may be the most suitable algorithm when both detection performance and computational load are considered relative to existing algorithms.

The scene change detection method according to an embodiment of the present invention may be applied to a video call method of a wireless terminal. That is, for a frame in which a scene change occurs among the images generated for a video call, the scene change detection method according to the embodiment may be applied so that the quantization parameter is adjusted appropriately before the frame is encoded and transmitted.

Meanwhile, together with the physical limitations of a mobile device's lens, sudden movement of the mobile device produces out-of-focus images. Because such out-of-focus images have little spatio-temporal correlation, they consume many bit resources when encoded. This temporary heavy consumption of bit resources also affects the properly captured images that follow the sudden movement, degrading overall image quality. In other words, allocating more bits than necessary to out-of-focus images that are hard to view harms even the proper images that follow, so an improvement is needed.

Accordingly, the scene change detection method according to another embodiment of the present invention proposes a way to detect the frames that belong to an out-of-focus image section. Specifically, the method detects the frame at which a scene change begins because of an out-of-focus image (hereinafter, the start frame) and the frame at which the out-of-focus image ends (hereinafter, the end frame).

FIG. 5 is a block diagram of a video encoder device to which a scene change detection method according to another embodiment of the present invention is applied. Referring to FIG. 5, this video encoder device is the same as the video encoder device to which the scene change detection method according to the first embodiment is applied, except that the detailed configuration of the scene change detection unit 32 differs. Accordingly, components that are the same as those of the video encoder device of the first embodiment are given the same reference numerals, and their detailed description is omitted because it is given in the first embodiment.

In the video encoder device to which this other embodiment is applied, the encoder controller 40 that controls the encoder 10 includes a scene change detector 42 which, according to a feature of the present invention, detects very fast scenes in real time using the current frame, the reference frame, and the like stored in the frame storage memory 20. The scene change detector 42 includes a start frame detector 422 that detects the start frame of such a scene, an end frame detector 424 that detects its end frame, and a frame skip determination unit 426 that determines which frames to skip.

When the scene change detector 42 detects a scene change, the corresponding information is provided to the QP adjuster 44. The QP adjuster 44 then adjusts the quantization parameter of the quantizer 106 appropriately so that the encoder can respond to the scene change in the current frame.

The start frame detector 422 determines a scene change by predicting the peak signal-to-noise ratio (PSNR) between the previously stored reference frame and the input current frame. That is, when the predicted PSNR deviates from a preset reference value, a scene change is regarded as having occurred in the current frame.

Furthermore, the start frame detector 422 may also detect the start frame through the same operation as the scene change detection unit 32 of the first embodiment.

The end frame detector 424 detects the end frame using the derivative of a parameter that predicts the PSNR between the previously stored reference frame and the input current frame. For example, the end frame detector 424 detects the end frame by evaluating Equation 9 below.

That is, in terms of image motion, a negative Diff_{AvgPartialPPSNR} means that the motion in the current frame has decreased relative to the motion in the previous frame. On this basis, after the start frame detector 422 has detected a very fast image (that is, an out-of-focus image), the end frame detector 424 selects as the end frame the first frame for which Diff_{AvgPartialPPSNR} becomes negative.

In Equation 9, PPSNR is a parameter that predicts the PSNR between the stored reference frame and the input current frame, and may be computed as in Equation 3 of the first embodiment. N_f denotes the number of blocks into which the frame is divided.
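The end frame rule described above can be sketched as a simple scan. Equation 9 itself is not reproduced in this text, so the sketch assumes that the Diff_{AvgPartialPPSNR} value of each frame has already been computed elsewhere and is supplied as a list indexed by frame number. The function name `find_end_frame` is hypothetical.

```python
from typing import Optional, Sequence


def find_end_frame(diff_values: Sequence[float],
                   start_frame: int) -> Optional[int]:
    """Return the first frame after start_frame whose Diff value is negative.

    diff_values[i] is assumed to hold Diff_{AvgPartialPPSNR} for frame i.
    Returns None when no such frame exists in the supplied sequence.
    """
    for i in range(start_frame + 1, len(diff_values)):
        if diff_values[i] < 0:
            return i
    return None


# Hypothetical Diff values; the start frame detector fired at frame 2.
diffs = [0.1, -0.2, 0.5, 0.3, -0.1, 0.2]
print(find_end_frame(diffs, start_frame=2))  # frame 4 is the first negative one
```

Negative values before the start frame are ignored, since the rule only applies once a very fast image has been detected.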

Meanwhile, the skip determination unit 426 determines the frames to be skipped from the information provided by the start frame detector 422 and the end frame detector 424. When a frame is determined to be skipped, its compressed data is not transmitted, and only information indicating that the frame was skipped is passed to the entropy coder 108.

The QP adjuster 44 receives, from the end frame detector 424, information on the end frame of the very fast image (that is, the out-of-focus image). For frames input after the very fast image has ended, the QP adjuster 44 performs data rate control that applies a quantization parameter according to the complexity of the image.

Furthermore, the scene change detection method according to this other embodiment may be applied to a video call method of a wireless terminal. That is, using this method, a very fast image (for example, an out-of-focus image) can be detected and the corresponding frames can be skipped in transmission. The receiving end can then restore the previous frame in place of each skipped frame.

Specifically, the video call method of the wireless terminal may be performed by a wireless terminal equipped with a video transmitting apparatus and a video receiving apparatus.

While a video call is in progress, the video transmitting apparatus of the wireless terminal encodes the images captured by the camera provided for video calls and transmits them to the video receiving apparatus. In doing so, the video transmitting apparatus detects, among the captured images, the frame at which a very fast image (for example, an out-of-focus image) begins and the frame at which it ends. In particular, the video transmitting apparatus may detect the start frame by the scene change detection method of the first embodiment and may detect the end frame through a derivative operation on the PPSNR. Furthermore, for the frames between the start frame and the end frame of the very fast image, the video transmitting apparatus inserts a signal indicating that the frame is skipped, performs encoding, and transmits the result to the receiving end.

Meanwhile, the video receiving apparatus decodes the encoded video data and plays the restored video on a display device. The video receiving apparatus can identify the frame skip signal inserted during encoding. It then copies the frame that immediately precedes the skipped frame in time and substitutes it for the skipped frame when restoring the video.
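The sender-side skipping and receiver-side substitution described above can be sketched as follows. This is an illustrative model only: frames are represented by plain strings, `None` stands in for the frame skip signal, and the choice to skip frames with index from `start` up to but not including `end` is an assumption, since the text only says that frames between the start and end frames are skipped. The function names are hypothetical.

```python
from typing import List, Optional


def sender_apply_skip(frames: List[str], start: int,
                      end: int) -> List[Optional[str]]:
    """Replace frames in [start, end) with a skip marker (None).

    The half-open range is an assumption made for this sketch.
    """
    return [None if start <= i < end else f for i, f in enumerate(frames)]


def receiver_conceal(received: List[Optional[str]]) -> List[Optional[str]]:
    """Substitute each skipped frame with the most recent decoded frame."""
    output: List[Optional[str]] = []
    last: Optional[str] = None
    for frame in received:
        if frame is not None:
            last = frame
        output.append(last)  # stays None if no frame has been decoded yet
    return output


frames = ["f0", "f1", "f2", "f3", "f4"]
sent = sender_apply_skip(frames, start=1, end=3)
print(sent)                    # ['f0', None, None, 'f3', 'f4']
print(receiver_conceal(sent))  # ['f0', 'f0', 'f0', 'f3', 'f4']
```

Repeating the last decoded frame keeps playback continuous while the bits that would have gone to the out-of-focus frames remain available for the frames that follow.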

The method according to the present invention can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical disks, as well as implementations in the form of carrier waves (for example, transmission over the Internet). The computer-readable recording medium can also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

As described above, a real-time scene change detection operation for video encoding data rate control according to an embodiment of the present invention can be carried out. Although specific embodiments have been described, various modifications may be made without departing from the scope of the present invention.

FIG. 1 is a block diagram of a video encoder device to which a scene change detection method according to an embodiment of the present invention is applied;

FIG. 2 is an example of a frame divided into a plurality of regions according to an embodiment of the present invention;

FIG. 3 is a flowchart of a real-time scene detection operation according to an embodiment of the present invention;

FIG. 4 is a graph showing test results of a real-time scene detection operation according to an embodiment of the present invention;

FIG. 5 is a block diagram of a video encoder device to which a scene change detection method according to another embodiment of the present invention is applied.

Claims

1. A real-time scene change detection method for video encoding data rate control, the method comprising:

dividing a current frame into a plurality of regions and calculating a dissimilarity measure (DM) of each divided region;

determining whether the dissimilarity measure of each region deviates from a preset reference value;

counting the number of regions in the current frame whose dissimilarity measure deviates from the preset reference value; and

determining that a scene change has occurred in the current frame when the number of regions deviating from the reference value is greater than or equal to a preset number.

2. The method of claim 1, wherein calculating the dissimilarity measure (DM) of each divided region comprises:

predicting the peak signal-to-noise ratio (PSNR) of the current frame before encoding by using sample-wise error information between the current frame and a reconstructed previous frame (reference frame).

3. The method of claim 2, wherein calculating the dissimilarity measure (DM) of each divided region comprises:

calculating the dissimilarity measure of each divided region using the PSNR predicted for the current frame (hereinafter, PPSNR) and the average PPSNR of the frames following the occurrence of a scene change.

4. The method of claim 2, wherein the dissimilarity measure of each divided region is calculated according to Equation 1 below.

In Equation 1, x denotes the identification number of the divided region, i denotes the frame number of the current frame, and S_j denotes the frame number of the image corresponding to the j-th abrupt scene change.

5. The method of claim 4, wherein the PPSNR is calculated according to Equation 2 and Equation 3 below.

In Equation 2 and Equation 3, PMSE is the predicted mean square error (MSE) of the current frame, and in Equation 10, n is the number of bits per sample. PMSE_{i,i-1} and PMSE_{k,k-1} are calculated according to Equations 4 and 5, respectively.

In Equations 4 and 5, O^i_{mn} denotes the original sample at the m-th row and n-th column of the i-th frame, and R^j_{mn} denotes the reconstructed sample at the m-th row and n-th column of the j-th frame (one frame consists of M × N pixels).

6. The method of claim 1, wherein whether the number of regions deviating from the reference value is greater than or equal to the preset number

is determined by an operation according to Equation 6 below.

In Equation 6, α is a threshold defining the ratio used to determine whether a scene change has occurred in the frame, and N_f is the number of regions into which the frame is divided. In Equation 6, C^x is determined by an operation according to Equation 7 below.

In Equation 7, β is a threshold defined for the dissimilarity measure of each region.

7. The method of any one of claims 1 to 6, further comprising:

calculating a derivative value of the predicted PSNR of a frame input after the frame in which the scene change occurred; and

checking the calculated result and, when the result is negative, setting the corresponding frame as the frame at which the scene change is completed.

8. The method of claim 7, wherein the derivative value of the predicted PSNR is calculated using Equation 8 below.

In Equation 8, the predicted PSNR is a parameter that predicts the PSNR between the stored reference frame and the input current frame, and N_f denotes the number of blocks into which the frame is divided.

9. A video call method of a wireless terminal, the method comprising:

detecting a start frame and an end frame of an abrupt change section of an input video signal of the terminal;

skipping, by a transmitter, encoding of the detected frames; and

upon receiving the skipped frames, copying, by a receiver, a previously received frame and playing it in place of each skipped frame,

wherein the end frame is detected by calculating the derivative of the predicted PSNR obtained between the input image and the reconstructed previous image.

10. A video call system using a wireless terminal, the system comprising:

a detector for detecting a start frame and an end frame of an abrupt change section of an input video signal of the terminal;

a transmitter for skipping frame encoding for all detected frames; and

a receiver configured to copy a previously received frame and play it in place of each skipped frame,

wherein the detector calculates the derivative of the predicted PSNR obtained between the input image and the reconstructed previous image.