JP2007134986A

JP2007134986A - Shot boundary detection device

Info

Publication number: JP2007134986A
Application number: JP2005326355A
Authority: JP
Inventors: Masaru Sugano (菅野 勝); Kazunori Matsumoto (松本 一則); Yasuyuki Nakajima (中島 康之)
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2005-11-10
Filing date: 2005-11-10
Publication date: 2007-05-31

Abstract

PROBLEM TO BE SOLVED: To provide a shot boundary detection device capable of accurately extracting shot boundaries even from a moving image that contains low-luminance images, monochrome images, or images with an unchanging part in a region near the edge of the screen.

SOLUTION: An image partially decoded by a variable-length decoding unit 1 is trimmed by a center region extraction unit 2, which divides the image in the horizontal and/or vertical direction and discards the upper and lower regions or the left and right regions so as to extract the center region of the image. The image of the center region is input to an interframe luminance difference value operation unit 3, a color difference histogram correlation value operation unit 4, and a darkness judging unit 5. The darkness judging unit 5 acquires the average luminance value of each frame. When the average luminance value of the frame is smaller than a prescribed threshold, the image of the center region is judged to be a low-luminance image or a monochrome image, and a shot boundary judging unit 6 is informed of the judgement result. When the image is judged to be a low-luminance image or a monochrome image, the shot boundary judging unit 6 performs shot boundary judgement processing after multiplying the interframe luminance difference value D_n by γ (γ > 1).

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a shot boundary detection device, and more particularly to a shot boundary detection device that can accurately detect shot boundaries in a moving image that includes dark images, images with subtitles, and the like.

An example of a conventional moving image cut point detection device is disclosed in Patent Document 1 below. Patent Document 1 outlines the moving image cut point detection processing as shown in FIG. 5.

When an image is input in step S21, encoding parameters are extracted from the image in step S22, and in step S23 the extracted encoding parameters are used to determine whether the frame is an instantaneous cut point. If this determination is negative, the process proceeds to step S24, where detection of a dissolve (a special cut) is performed, and then to step S26, where wipe detection is performed. If, on the other hand, step S23 determines that the frame is an instantaneous cut point, step S29 determines whether a flash has been detected; if not, the frame is decided to be an instantaneous cut, which is registered in step S30. Dissolves and wipes are registered in steps S25 and S27, respectively.

Patent Document 2 below discloses, as one method of instantaneous cut point detection, examining the instantaneous change across frames n−1, n, and n+1 using inter-frame luminance difference values, histogram correlation values, and color difference histogram correlation values in order to identify whether a frame is a cut or a non-cut.

Patent Document 1: JP 11-252509 A
Patent Document 2: JP 7-236153 A

However, the instantaneous cut point detection method of Patent Document 2 has the problem that shot boundary detection accuracy is low for dark images, black-and-white images, and images in which an invariant part such as subtitles or market information appears at the top or bottom of the screen.

An object of the present invention is to solve the above problem of the prior art and to provide a shot boundary detection device that can accurately extract shot boundaries even from a moving image that includes low-luminance images, black-and-white images, or images having an invariant part in a region near an edge of the screen.

To achieve this object, a first feature of the present invention is that it comprises means for partially decoding an encoded image, means for removing from the partially decoded image a region in which an invariant part of the image, such as a caption, appears, and means for detecting a shot boundary using the image of the region that was not removed.

A second feature is that it comprises means for partially decoding an encoded image, means for dividing the partially decoded image into m parts (m ≥ 3) in the horizontal or vertical direction and removing at least one of the peripheral divided regions at the top and bottom or at the left and right, and means for detecting a shot boundary using the image of the region that was not removed.

A third feature is that the means for detecting the shot boundary comprises means for obtaining an inter-frame luminance difference value D_n, means for detecting the average luminance value of the image and multiplying D_n by γ (γ > 1) when the average luminance value is smaller than a predetermined threshold, and means for treating the frame as a shot boundary candidate if γD_n is greater than a predetermined threshold Th.

According to the present invention, shot boundaries can be detected using an image region that contains no invariant part such as a caption, so shot boundary detection accuracy can be improved.

Also, according to the present invention, shot boundaries are detected using an image obtained by dividing the image into m parts (m ≥ 3) in the horizontal or vertical direction and removing at least one of the peripheral divided regions at the top and bottom or at the left and right. Consequently, even if invariant information such as subtitles or market information is overlaid on the peripheral regions at the top, bottom, left, or right of the image, it is likely to be cut away. Shot boundaries are therefore detected from the picture content alone, which improves detection accuracy.

Furthermore, for low-luminance images, black-and-white images, and the like, the inter-frame luminance difference value D_n is multiplied by a value γ greater than 1, which effectively enlarges it. This improves shot boundary detection accuracy for such images.

The present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram showing the schematic configuration of the shot boundary detection device of the present invention. This shot boundary detection corresponds to the processing of step S23 in FIG. 5.

In the present invention, shot boundaries are detected from partially decoded image data. That is, an encoded image stream a, encoded for example with MPEG, is input to the variable length decoding unit (VLD) 1 and partially decoded. The partially decoded image data b is then input to the center region extraction unit 2, which removes the regions in which an invariant part of the image, such as a caption, appears. As one specific example, the center region extraction unit 2 extracts the center region of an image divided into m parts (m ≥ 3) in the horizontal direction. For example, as shown in FIG. 2, the image is divided into four parts, the two central divided regions (the hatched regions in the figure) are extracted, and the one region at the top and the one region at the bottom are removed. The reason is that invariant information such as subtitles or market information may be displayed in the top and bottom regions of a frame of image data b. It is also possible to remove only one of the top and bottom regions.
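The trimming performed by the center region extraction unit 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the frame is assumed to be a list of rows of block values, and the function name `extract_center_rows` and its parameters are hypothetical.

```python
from typing import List, Sequence


def extract_center_rows(frame: Sequence[Sequence[float]], m: int = 4,
                        drop_top: int = 1, drop_bottom: int = 1) -> List[List[float]]:
    """Split the rows of `frame` into m bands and drop bands at the top/bottom.

    `frame` is assumed to be a list of rows (hypothetical representation).
    """
    if m < 3:
        raise ValueError("m must be at least 3")
    if drop_top < 0 or drop_bottom < 0 or drop_top + drop_bottom >= m:
        raise ValueError("at least one band must remain")
    rows = len(frame)
    # Band k covers rows [k * rows // m, (k + 1) * rows // m).
    start = drop_top * rows // m
    stop = (m - drop_bottom) * rows // m
    return [list(row) for row in frame[start:stop]]


frame = [[r, r] for r in range(8)]      # 8 rows, 2 columns
center = extract_center_rows(frame)     # m = 4: keep the two middle bands
top_only = extract_center_rows(frame, drop_bottom=0)
```

With m = 4 and one band dropped on each side, the 8-row example keeps rows 2 to 5, which mirrors the FIG. 2 case. Setting `drop_bottom=0` corresponds to the variant in which only one of the two regions is removed.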

The inter-frame luminance difference value calculation unit 3 obtains the inter-frame luminance difference value D_n by equation (1) below.

Here, M and N are the total numbers of 8 × 8 pixel blocks per frame in the vertical and horizontal directions, respectively, and Y_n(i, j) is the block-average luminance of block (i, j) in the n-th frame.
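Equation (1) is not reproduced in this text, so the following is only a sketch of one plausible reading. It assumes that D_n is the sum, over all M × N blocks, of the absolute difference between the block-average luminance values of frames n and n−1, with no normalization. The function name and the list-of-rows frame representation are illustrative assumptions.

```python
from typing import Sequence


def interframe_luminance_difference(prev: Sequence[Sequence[float]],
                                    curr: Sequence[Sequence[float]]) -> float:
    """Assumed form of D_n: sum over blocks of |Y_n(i, j) - Y_{n-1}(i, j)|."""
    if len(prev) != len(curr) or any(len(p) != len(c) for p, c in zip(prev, curr)):
        raise ValueError("frames must have the same block layout")
    return float(sum(abs(c - p)
                     for prev_row, curr_row in zip(prev, curr)
                     for p, c in zip(prev_row, curr_row)))


y_prev = [[10, 20], [30, 40]]   # block-average luminance of frame n-1
y_curr = [[12, 20], [25, 40]]   # block-average luminance of frame n
d_n = interframe_luminance_difference(y_prev, y_curr)
```

If the actual equation divides by M·N, the thresholds used later would simply scale by the same factor.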

The color difference histogram correlation value calculation unit 4 obtains the color difference histogram correlation value ρ_n by equation (2) below.

Here, H_{n,k,l} (k, l = 0, 1, 2, ..., hc−1) is the histogram obtained by classifying the DC color difference signal data Cb and Cr of one frame into hc classes; details are given, for example, in paragraphs [0029] to [0032] of Patent Document 2.
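Equation (2) is likewise not reproduced here, so the sketch below rests on explicit assumptions. It builds a two-dimensional histogram over (Cb, Cr) classes and then computes one minus the normalized correlation between the histograms of consecutive frames. The "one minus" form is an assumption, chosen only because steps S8 and S9 treat ρ_n as a value that peaks at a shot boundary; the 0 to 255 value range, the function names, and the default hc are also hypothetical.

```python
import math
from typing import List, Sequence


def chroma_histogram(cb: Sequence[int], cr: Sequence[int],
                     hc: int = 8, max_value: int = 256) -> List[List[int]]:
    """Classify DC (Cb, Cr) pairs into an hc x hc histogram H[k][l]."""
    hist = [[0] * hc for _ in range(hc)]
    for b, r in zip(cb, cr):
        k = min(int(b) * hc // max_value, hc - 1)
        l = min(int(r) * hc // max_value, hc - 1)
        hist[k][l] += 1
    return hist


def histogram_change(h_prev: Sequence[Sequence[int]],
                     h_curr: Sequence[Sequence[int]]) -> float:
    """Assumed stand-in for rho_n: 1 - normalized correlation of two histograms."""
    dot = sum(a * b for ra, rb in zip(h_prev, h_curr) for a, b in zip(ra, rb))
    norm_prev = math.sqrt(sum(a * a for row in h_prev for a in row))
    norm_curr = math.sqrt(sum(b * b for row in h_curr for b in row))
    if norm_prev == 0 or norm_curr == 0:
        return 0.0  # empty histogram: treat as "no change" (assumption)
    return 1.0 - dot / (norm_prev * norm_curr)


h = chroma_histogram([0, 255], [0, 255], hc=2)
same = histogram_change(h, h)
different = histogram_change([[2, 0], [0, 0]], [[0, 0], [0, 2]])
```

Under this reading, identical chroma distributions give a value near 0 and disjoint distributions give 1.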

The darkness determination unit 5 obtains the average luminance value aveY_n of one frame from equation (3) below. If the average luminance value is smaller than a predetermined threshold Th_d, the unit determines that the image is a low-luminance image or a black-and-white image and notifies the shot boundary determination unit 6 accordingly.
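The darkness determination can be sketched as below, assuming that aveY_n is the arithmetic mean of the block-average luminance values Y_n(i, j) of the frame (equation (3) is not reproduced in this text). The function names and the example threshold values are hypothetical; the source gives no value for Th_d.

```python
from typing import Sequence


def average_luminance(frame: Sequence[Sequence[float]]) -> float:
    """Assumed form of aveY_n: mean of all block-average luminance values."""
    values = [v for row in frame for v in row]
    if not values:
        raise ValueError("frame has no blocks")
    return sum(values) / len(values)


def is_dark(frame: Sequence[Sequence[float]], th_d: float) -> bool:
    """True when aveY_n < Th_d, i.e. the frame counts as low-luminance."""
    return average_luminance(frame) < th_d


blocks = [[10, 20], [30, 40]]
ave_y = average_luminance(blocks)
```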

Next, the shot boundary determination unit 6 determines shot boundaries using the inter-frame luminance difference value D_n and the color difference histogram correlation value ρ_n obtained by the inter-frame luminance difference value calculation unit 3 and the color difference histogram correlation value calculation unit 4.

Next, the operation of this embodiment is described with reference to the flowchart of FIG. 3. In step S1, the encoded image stream a is input to the variable length decoding unit 1. In step S2, the center region extraction unit 2 divides the image into a plurality of parts in the horizontal direction and selects the center region. In step S3, the inter-frame luminance difference value calculation unit 3 obtains the inter-frame luminance difference value D_n by equation (1). In step S4, the darkness determination unit 5 obtains the average luminance value aveY_n per frame by equation (3). In step S5, it is determined whether aveY_n is smaller than the threshold Th_d, or whether the image is a black-and-white image. If this determination is affirmative, the image is judged to be dark (a low-luminance image) or black-and-white, and the process proceeds to step S6, where D_n is multiplied by a constant γ (γ > 1.0). This correction is applied because D_n is generally small for low-luminance and black-and-white images, which makes shot boundaries harder to detect. The process then proceeds to step S7, where it is determined whether D_n > Th_pre holds. If the determination in step S5 is negative, that is, if the image is bright, step S6 is skipped. The darkness determination in step S5 may also grade the darkness in multiple levels based on aveY_n, with the constant γ in step S6 set to a different value for each level.
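Steps S5 to S7 can be sketched as follows. The default value of γ and the example thresholds are assumptions for illustration only; the source states only that γ > 1.0. The function names are hypothetical.

```python
def corrected_difference(d_n: float, ave_y: float, th_d: float,
                         gamma: float = 1.5) -> float:
    """Steps S5/S6: multiply D_n by gamma when the frame is dark (aveY_n < Th_d)."""
    if gamma <= 1.0:
        raise ValueError("gamma must be greater than 1")
    return gamma * d_n if ave_y < th_d else d_n


def is_candidate(d_n: float, ave_y: float, th_d: float, th_pre: float,
                 gamma: float = 1.5) -> bool:
    """Step S7: the (possibly corrected) D_n must exceed Th_pre."""
    return corrected_difference(d_n, ave_y, th_d, gamma) > th_pre


dark = corrected_difference(10.0, 20.0, 50.0, gamma=2.0)     # dark frame
bright = corrected_difference(10.0, 80.0, 50.0, gamma=2.0)   # bright frame
```

In this example the same raw difference of 10 passes a Th_pre of 15 only when the frame is dark, which is the effect the correction is meant to have. The multi-level variant mentioned in the text would replace the single `gamma` with a value looked up from the darkness level.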

If the determination in step S7 is affirmative, the determination of step S8 is made, namely whether αD_n > D_{n−1}, D_{n+1} and ρ_n > ρ_{n−1}, ρ_{n+1} both hold. Here, α is a weighting coefficient greater than 1. If the determination in step S8 is affirmative, the frame is judged to be a shot boundary; if it is negative, the determination of step S9 is made.

When a shot boundary lies within a scene with large motion, the inter-frame differences become very large and the condition of step S8 cannot identify the boundary. The determination of step S9 is therefore made, namely whether ρ_n > Th_ac holds. Here, Th_ac is a threshold for identifying a peak in ρ_n. If the determination in step S9 is affirmative, the frame is judged to be a shot boundary; if it is negative, the process proceeds to step S10.

When two consecutive shots differ only in camera angle, their color difference histograms are similar, so it is difficult to detect the shot boundary with the determinations of steps S8 and S9. The pixel differences at such shot boundaries are very large, however, so peak detection on the luminance difference value is effective. Accordingly, in step S10 the frame is judged to be a shot boundary when βD_n > D_{n−1}, D_{n+1} or D_n − Th_ad > D_{n−1}, D_{n+1} holds. Here, β and Th_ad are, respectively, a weighting factor and a threshold for detecting a peak in D_n.
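The decision cascade of steps S7 to S10 can be sketched as a single function. Two points are assumptions rather than statements from the source: a comparison written as "X > D_{n−1}, D_{n+1}" is read as "X exceeds both neighbors", and a frame that fails step S7 is treated as a non-boundary. The parameter values in the example are arbitrary, and the D_n passed in is taken to be the value after any γ correction.

```python
def is_shot_boundary(d_prev: float, d_curr: float, d_next: float,
                     rho_prev: float, rho_curr: float, rho_next: float,
                     *, th_pre: float, th_ac: float, th_ad: float,
                     alpha: float, beta: float) -> bool:
    """Apply the tests of steps S7 to S10 to frame n (assumed reading)."""
    d_neighbors = max(d_prev, d_next)
    # S7: pre-screening on the (gamma-corrected) luminance difference.
    if not d_curr > th_pre:
        return False
    # S8: joint peak in D_n (weighted by alpha) and in rho_n.
    if alpha * d_curr > d_neighbors and rho_curr > max(rho_prev, rho_next):
        return True
    # S9: large-motion scenes, rho_n alone exceeds Th_ac.
    if rho_curr > th_ac:
        return True
    # S10: similar chroma histograms, rely on a peak in D_n.
    return beta * d_curr > d_neighbors or d_curr - th_ad > d_neighbors


params = dict(th_pre=10.0, th_ac=0.5, th_ad=20.0, alpha=1.2, beta=1.0)
clear_cut = is_shot_boundary(5, 30, 6, 0.1, 0.4, 0.1, **params)      # passes S8
too_small = is_shot_boundary(5, 8, 6, 0.1, 0.4, 0.1, **params)       # fails S7
large_motion = is_shot_boundary(40, 30, 45, 0.2, 0.7, 0.9, **params) # passes S9
angle_change = is_shot_boundary(5, 30, 6, 0.3, 0.1, 0.2, **params)   # passes S10
no_boundary = is_shot_boundary(40, 30, 45, 0.3, 0.1, 0.2, **params)  # fails all
```

The five calls exercise one path each, in the order the text introduces them: an ordinary cut, a frame rejected at pre-screening, a cut inside large motion, a change of camera angle, and a frame that is not a boundary.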

The inventors detected shot boundaries in the "TRECVID2005" test data under four conditions: without the processing of the present invention, that is, without steps S2 and S6 (the conventional method); with step S6 only; with step S2 only; and with both steps S2 and S6. The resulting detection rates are shown in FIG. 4. These results confirm that introducing the processing of the present invention improves shot boundary detection accuracy.

In the embodiment described above, the center region extraction unit 2 divides the screen into m parts in the horizontal direction and extracts the center region. Alternatively, the screen may be divided into m parts in the vertical direction and the regions containing the left and right edges removed to extract the center region. The peripheral regions in both the horizontal and vertical directions of the screen may also be removed so that only the image portion of the center region is extracted.
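The last variant, removing the periphery in both directions, can be sketched as follows. The same number of divisions m and the same number of dropped bands per side are used in both directions purely for brevity; that symmetry, the function name, and the list-of-rows frame representation are assumptions.

```python
from typing import List, Sequence


def extract_center_2d(frame: Sequence[Sequence[float]], m: int = 4,
                      drop: int = 1) -> List[List[float]]:
    """Drop `drop` of m bands on every side (top, bottom, left, right)."""
    if m < 3 or drop < 0 or 2 * drop >= m:
        raise ValueError("need m >= 3 and at least one band left")
    rows = len(frame)
    cols = len(frame[0]) if rows else 0
    r0, r1 = drop * rows // m, (m - drop) * rows // m
    c0, c1 = drop * cols // m, (m - drop) * cols // m
    return [list(row[c0:c1]) for row in frame[r0:r1]]


grid = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4 x 4 test frame
core = extract_center_2d(grid)
```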

FIG. 1 is a block diagram showing the schematic configuration of an embodiment of the present invention.
FIG. 2 is a diagram explaining the operation of the center region extraction unit of FIG. 1.
FIG. 3 is a flowchart explaining the operation of the embodiment.
FIG. 4 shows experimental shot boundary detection rates for the embodiment.
FIG. 5 is a flowchart outlining moving image cut point detection processing.

Explanation of symbols

1: variable length decoding unit, 2: center region extraction unit, 3: inter-frame luminance difference value calculation unit, 4: color difference histogram correlation value calculation unit, 5: darkness determination unit, 6: shot boundary determination unit.

Claims

1. A shot boundary detection device comprising:
means for partially decoding an encoded image;
means for removing, from the partially decoded image, a peripheral region in which an invariant part of the image, such as a caption, appears; and
means for detecting a shot boundary using the image of the region that was not removed.

2. A shot boundary detection device comprising:
means for partially decoding an encoded image;
means for dividing the partially decoded image into m parts (m ≥ 3) in the horizontal or vertical direction and removing at least one of the peripheral divided regions at the top and bottom or at the left and right; and
means for detecting a shot boundary using the image of the region that was not removed.

3. The shot boundary detection device according to claim 1 or 2, wherein the means for detecting the shot boundary comprises:
means for obtaining an inter-frame luminance difference value D_n;
means for detecting the average luminance value of the image and, when the average luminance value is smaller than a predetermined threshold, multiplying the inter-frame luminance difference value D_n by γ (γ > 1); and
means for treating the frame as a shot boundary candidate if γD_n is larger than a predetermined threshold Th.