JP2016178641A

JP2016178641A - Method and apparatus for measurement of quality of video based on frame loss pattern

Info

Publication number: JP2016178641A
Application number: JP2016067071A
Authority: JP
Inventors: Xiaodong Gu; Debing Liu; Zhibo Chen
Original assignee: Thomson Licensing SAS
Current assignee: Thomson Licensing SAS
Priority date: 2016-03-30
Filing date: 2016-03-30
Publication date: 2016-10-06

Abstract

PROBLEM TO BE SOLVED: To provide a method and apparatus for measuring the quality of video on the basis of a frame loss pattern. SOLUTION: The method comprises: generating a frame loss pattern of a video by indicating whether each frame in the video is lost or successfully transmitted; and evaluating the quality of the video as a function of the generated frame loss pattern. SELECTED DRAWING: Figure 2

Description

The present invention relates to a method and apparatus for measuring the quality of video based on a frame loss pattern.

This section is intended to introduce the reader to various aspects of the art that may be related to various aspects of the present invention described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, these statements are to be read in this light, and not as admissions of prior art.

In the transmission of digitally compressed video, a very significant source of impairment is the delivery of the video stream over an error-prone channel. Partial loss or partial corruption of information can have a dramatic impact on the quality perceived by the user, because a localized distortion within a frame can propagate spatially and temporally over subsequent frames. The visual impact of such frame loss varies with the ability of the video decoder to deal with corrupted streams. In some cases, a decoder may decide to drop some frames on its own initiative. For example, a decoder can drop or discard every frame that has corrupted or missing information and repeat the previous video frame instead, until the next valid decoded frame is available. An encoder can also drop frames during a sudden increase of motion in the content when the target encoding bit rate is too low. In all of the above cases, a frame loss is said to occur in the video.

In many existing video quality monitoring products, the overall video quality of a medium is analyzed based on three main coding artifacts: jerkiness, blockiness and blurring. Blockiness and blurring are the two main kinds of spatial coding artifacts, which appear as discontinuities at block boundaries and as loss of high frequencies, respectively. Jerkiness, on the other hand, is the most important temporal artifact.

The temporal degradation of video quality caused by a set of group frame losses is called jerkiness, where a group frame loss means that one or more consecutive frames of a video sequence are lost together.

There have been several studies on evaluating the perceptual impact of (periodic or non-periodic) video frame losses on perceived video quality.

Non-Patent Document 1 points out that humans are usually quite tolerant of consistent frame loss and that the negative impact is strongly related to the consistency of the frame loss, which is therefore used as a measure of jerkiness.

Non-Patent Document 2 discusses the relationship between the perceptual impact of jerkiness and the length and frequency of occurrence of group frame losses.

Non-Patent Document 1: K. C. Yang, C. C. Guest, K. El-Maleh and P. K. Das, "Perceptual Temporal Quality Metric for Compressed Video", IEEE Transactions on Multimedia, vol. 9, no. 7, Nov. 2007, pp. 1528-1535.
Non-Patent Document 2: R. R. Pastrana-Vidal and J. C. Gicquel, "Automatic Quality Assessment of Video Fluidity Impairments Using a No-Reference Metric", the 2nd International Workshop on Video Processing and Quality Metrics for Consumer Electronics, Scottsdale, USA, 22-24 Jan. 2006.
Non-Patent Document 3: L. Y. Duan, J. Q. Wang et al., "Shot-Level Camera Motion Estimation Based on a Parametric Model".

The inventors of the present invention have found that the frame loss pattern of a video has a great influence on the perceptual impact of jerkiness, which in turn affects the overall video quality. Here, a "frame loss pattern" means a pattern generated by recording in sequence, using different representations, the status of each frame of a video sequence as to whether it was successfully transmitted or lost during transmission.

The present invention therefore makes use of this finding by providing a method for measuring the quality of a video, and a corresponding apparatus.

In one embodiment, a method for measuring the quality of a video is provided. The method comprises: generating a frame loss pattern of the video by indicating whether each frame of the video is lost or successfully transmitted; and evaluating the quality of the video as a function of the generated frame loss pattern.

In one embodiment, an apparatus for measuring the quality of a video is provided. The apparatus comprises: means for receiving an input video and generating a frame loss pattern of the received video; and means for evaluating the quality of the video as a function of the generated frame loss pattern.

Advantageous embodiments of the invention are disclosed in the dependent claims, the following description and the drawings.

Exemplary embodiments of the invention will be described with reference to the accompanying drawings.
FIG. 1 is an exemplary diagram showing several examples of frame loss patterns of a video.
FIG. 2 is a flowchart showing a method for measuring the quality of a video based on a frame loss pattern according to an embodiment of the present invention.
FIG. 3 is an exemplary diagram showing a frame loss pattern divided into two subsections.
FIG. 4 is an exemplary diagram showing the framework of the method for measuring the quality of a video.
FIG. 5 is a block diagram showing an apparatus for measuring the quality of a video based on a frame loss pattern according to an embodiment of the present invention.
FIG. 6 is a diagram showing the interface of a software tool designed to conduct a subjective test of video quality.

In the following description, various aspects of embodiments of the present invention will be described. For purposes of explanation, specific configurations and details are set forth in order to provide a thorough understanding. However, it will be apparent to those skilled in the art that the present invention may be practiced without the specific details presented herein.

FIG. 1 is an exemplary diagram showing several examples of frame loss patterns of a video. As described above, a frame loss pattern is generated by recording in sequence the status of each frame of the video, using different representations. For example, a frame loss pattern can be represented by a 0-1 series such as "1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1", where "0" indicates that the corresponding frame was lost during transmission and "1" indicates that the frame was transmitted successfully. A frame loss pattern can also take the form of a diagram with high and low levels, which represent successful frame transmission and frame loss, respectively. FIG. 1(a) shows such a frame loss pattern of a video, in which each low level (valley) of the diagram denotes a group frame loss containing one or more consecutive lost frames, while successfully transmitted frames are represented by the high level (top) of the diagram. FIG. 1(b) shows two different frame loss patterns, "Pattern 1" and "Pattern 2", both composed of 10 group frame losses, with every group frame loss having the same length. According to the results of Non-Patent Documents 1 and 2, which do not regard the frame loss pattern as a factor affecting the perceptual jerkiness of a video, the video quality degradation caused by "Pattern 1" and "Pattern 2" should be very similar. However, the study by the inventors of the present invention reaches a completely different conclusion.

According to the inventors' finding, the perceptual impact of jerkiness is greatly influenced by the frame loss pattern. Take the following cases (1) to (3), which have the same frame loss rate, as examples.
Case (1): One frame is lost in every two frames.
Case (2): Four consecutive frames are lost in every eight frames.
Case (3): The second half of the video sequence is completely lost.

The overall frame loss rate in all three cases above is 50 percent. However, their perceptual impacts are quite different. In Case (1), the viewer perceives clear dithering and may even feel unwell after viewing for a long time. In Case (3), the viewer does not perceive that kind of phenomenon but faces a long freeze. That is, different frame loss patterns with the same frame loss rate produce completely different perceptions.
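The three cases can be reproduced with a short sketch. The helper names and the 64-frame sequence length are assumptions made for illustration; only the 0/1 representation (0 = lost, 1 = transmitted) and the three loss layouts come from the text. The sketch shows that the loss rate is identical while the longest freeze differs widely.

```python
def periodic_pattern(received, lost, total):
    """Build a 0/1 pattern: `received` good frames, then `lost` lost frames, repeated."""
    period = [1] * received + [0] * lost
    return [period[i % len(period)] for i in range(total)]


def loss_rate(pattern):
    """Fraction of frames marked lost (0) in the pattern."""
    return pattern.count(0) / len(pattern)


def longest_freeze(pattern):
    """Length of the longest run of consecutive lost frames."""
    longest = current = 0
    for flag in pattern:
        current = current + 1 if flag == 0 else 0
        longest = max(longest, current)
    return longest


TOTAL_FRAMES = 64  # hypothetical sequence length, chosen only for illustration

case1 = periodic_pattern(1, 1, TOTAL_FRAMES)  # one frame lost in every two
case2 = periodic_pattern(4, 4, TOTAL_FRAMES)  # four frames lost in every eight
case3 = [1] * (TOTAL_FRAMES // 2) + [0] * (TOTAL_FRAMES // 2)  # second half lost

for name, pattern in [("case 1", case1), ("case 2", case2), ("case 3", case3)]:
    print(name, loss_rate(pattern), longest_freeze(pattern))
```

A metric that looks only at the loss rate cannot separate these three sequences, which is the motivation for evaluating the pattern itself.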

According to an embodiment of the present invention, a method for measuring the quality of a video based on the above finding is provided.

FIG. 2 is a flowchart showing a method for measuring the quality of a video based on a frame loss pattern according to an embodiment of the present invention.

As shown in FIG. 2, the method comprises the following steps.

Step 201: A frame loss pattern is generated by indicating the status (lost or successfully transmitted) of each frame of the video sequence.

Step 202: Starting from the first lost frame, one or more consecutive lost frames of the video sequence are grouped into a group frame loss.

Step 203: Starting from the first group frame loss, the frame loss pattern is divided into a plurality of sections, each having one or more consecutive group frame losses, such that every group frame loss in a section has the same number of successfully transmitted frames between it and the preceding group frame loss, and the same number of lost frames.

Step 204: The value of the quality degradation generated by each section of group frame losses is calculated.

Step 205: The quality of the video sequence is evaluated by combining the values of all sections.

Next, a detailed description will be given with reference to the accompanying drawings.

In the method according to an embodiment of the invention, a frame loss pattern of the video sequence is generated first. This can be done by indicating the status of every frame of the video sequence in an appropriate manner. Those skilled in the art will recognize that frame loss can be detected by well-known methods, so no further details are given on this point.

Next, starting from the first of all lost frames in the video sequence, one or more consecutive lost frames are grouped into one group, which is called a group frame loss.

Consider a frame loss pattern FLP consisting of the group frame losses gd_1, gd_2, ..., gd_n.

This pattern is given by time stamps, where gd_i denotes the i-th group frame loss. The frame loss pattern is then divided (or segmented) into a plurality of subsections, starting from the first group frame loss. Each subsection comprises one or more consecutive group frame losses, each of which has a similar perceptual impact on the quality degradation of the video sequence.

For the purpose of dividing consecutive group frame losses according to similar perceptual impact, each group frame loss of the video sequence can be characterized in this method by two parameters, gd = (gap_gd, len_gd). The first parameter, gap_gd, is the number of successfully transmitted frames between the current group frame loss and the previous group frame loss. The second parameter, len_gd, is the number of lost frames in the current group frame loss. The values of both gap_gd and len_gd are limited to integers from 1 to 10. If all group frame losses in a segmented subsection have the same parameters gap_gd and len_gd, they have a similar perceptual impact on the quality degradation of the video sequence.
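The grouping step and the two parameters can be sketched as follows. The function name is hypothetical, and the sketch makes two assumptions that the text does not settle: the gap of the first group is counted from the start of the sequence, and values are not restricted to the range 1 to 10.

```python
def group_frame_losses(pattern):
    """Return one (gap, length) pair per run of consecutive lost frames.

    pattern: sequence of 0/1 flags, 0 = lost, 1 = successfully transmitted.
    gap:     received frames between this group and the previous one
             (assumption: counted from the start of the sequence for the first group).
    length:  number of consecutive lost frames in this group.
    """
    groups = []
    gap = 0
    length = 0
    for flag in pattern:
        if flag == 0:
            length += 1
        else:
            if length:  # a run of lost frames has just ended
                groups.append((gap, length))
                gap, length = 0, 0
            gap += 1
    if length:  # the sequence ends inside a run of lost frames
        groups.append((gap, length))
    return groups


# The example series used in the description of FIG. 1
example = [1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1]
print(group_frame_losses(example))
```

For the example series this yields two group frame losses: two lost frames after a gap of four, and one lost frame after a gap of three.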

A distance function d(gd_1, gd_2), defined in terms of the function f(x, y), can be used as a measure of the difference in perceptual impact between two group frame losses in a subsection. The function f(x, y) is used for the perceptual quality evaluation of the video and is described later.

The frame loss pattern is then segmented into subsections based on the definition of the distance function.

The following is pseudocode for the segmentation procedure, which uses the distance function.

Procedure Segmentation
1. nCount = 0, pool = {};  // the number of subsections / the working set of group frame losses
2. for (i = 0, i <= n, i++) {
   2.1 if (d(gd, gd_i) < c) for each gd ∈ pool
           then insert gd_i into pool;
       else { SubSection_nCount = pool; pool = {}; nCount++ };
   }

In the above pseudocode, c is a constant. This procedure divides the frame loss pattern FLP into a set of subsections.
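A runnable sketch of the segmentation procedure follows. The distance function is passed in as a parameter because its formula is not reproduced in this text; `toy_distance` is a hypothetical stand-in, not the patent's d(). Two further assumptions are marked in the comments: the group that fails the test opens the next pool, and the last pool is flushed at the end. The pseudocode leaves both implicit.

```python
def segment(groups, distance, c):
    """Split group frame losses into subsections of similar perceptual impact.

    A group joins the current pool only if its distance to every group
    already in the pool is below the threshold c.
    """
    subsections = []
    pool = []
    for gd in groups:
        if all(distance(other, gd) < c for other in pool):
            pool.append(gd)
        else:
            subsections.append(pool)
            pool = [gd]  # assumption: the rejected group starts the next pool
    if pool:
        subsections.append(pool)  # assumption: flush the final pool
    return subsections


def toy_distance(a, b):
    """Hypothetical stand-in for d(): city-block distance on (gap, len)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


groups = [(4, 2), (4, 2), (4, 2), (1, 8), (1, 8)]
sections = segment(groups, toy_distance, c=1.5)  # c = 1.5 as in the parameter settings
print(sections)
```

With these inputs the first three groups form one subsection and the last two form another, which mirrors the two-subsection split that FIG. 3 illustrates.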

FIG. 3 shows an example in which the above frame loss pattern is divided into two subsections. As shown in FIG. 3, the group frame losses within one subsection are considered to have similar perceptual impact, and two adjacent subsections are connected by several successfully transmitted frames.

Next, a perceptual evaluation of each subsection is performed.

Since the group frame losses within the same subsection are considered to have similar perceptual impact, the subsection can typically be treated as a periodic frame loss.

Accordingly, each subsection is also characterized by two parameters, SubSection = (gap_SS, len_SS). The first parameter, gap_SS, is the average number of successfully transmitted frames between two adjacent group frame losses, and the second parameter, len_SS, is the average number of lost frames over all group frame losses in the subsection.

Put simply, the feature values gap_SS and len_SS of a subsection are just the averages of gap_gd and len_gd over all group frame losses in that subsection.
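As a minimal sketch of this averaging, assuming each group frame loss is stored as a (gap_gd, len_gd) pair and using a hypothetical function name:

```python
from statistics import mean


def subsection_features(subsection):
    """Return (gap_SS, len_SS): the averages of gap_gd and len_gd in a subsection."""
    gap_ss = mean(gap for gap, _ in subsection)
    len_ss = mean(length for _, length in subsection)
    return gap_ss, len_ss


print(subsection_features([(4, 2), (4, 2), (4, 2)]))  # identical groups
print(subsection_features([(3, 1), (4, 2)]))          # slightly different groups
```

When all groups in a subsection are identical the features equal the shared parameters; otherwise they can be non-integer, which is why the evaluation function later has to be generalized beyond integer arguments.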

The perceptual quality degradation of a subsection is therefore considered to be determined by its feature values gap_SS and len_SS. It is defined as follows.

J_p(SubSection) = f_p(gap_SS, len_SS)    (1)

Through subjective tests, the perceptual quality rating can be marked manually for some discrete values of (gap_SS, len_SS). For this purpose, a discrete function f(x, y) is defined for x, y ∈ {1, 2, ..., 10}.

As an example, the discrete function f(x, y) can be defined to take the value f_1(x, y) or f_2(x, y) according to how CameraMotion compares with c_still.

Here, c_still is a constant used as a threshold, and CameraMotion is a measure of the level of camera motion in the subsection. f_1(x, y) and f_2(x, y) are given in Table 1 and Table 2.

Since camera motion is another important factor that affects perceived quality, the level of camera motion also needs to be estimated.

Camera motion can be estimated by well-known methods. One of the most important global motion estimation models is the eight-parameter perspective motion model described in Non-Patent Document 3.

Non-Patent Document 3 gives the equations of this model.

Here, (a_0, ..., a_7) are the global motion parameters, (x_i, y_i) denotes the spatial coordinates of the i-th pixel in the current frame, and (x_i', y_i') denotes the coordinates of the corresponding pixel in the previous frame. A relationship between the motion model parameters and their symbol-level interpretation is also established.
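For orientation, the eight-parameter perspective model is commonly written in the form below. This is the usual textbook form, not a transcription of the equations of Non-Patent Document 3; in particular, which index a_0, ..., a_7 belongs to which term is an assumption here and may differ from that document.

```latex
x_i' = \frac{a_0 + a_2 x_i + a_3 y_i}{1 + a_6 x_i + a_7 y_i},
\qquad
y_i' = \frac{a_1 + a_4 x_i + a_5 y_i}{1 + a_6 x_i + a_7 y_i}
```

Under this ordering, a_0 and a_1 act as translation terms, a_2 to a_5 as the affine part, and a_6 and a_7 as the perspective terms.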

The algorithm introduced in Non-Patent Document 3 is applied to extract the eight-parameter GME (global motion estimation) model in the method of this embodiment. The level of camera motion is finally defined from the extracted model parameters.

Then f_p(x, y) = f(x, y) is defined for x, y ∈ {1, 2, ..., 10}. How to generalize the function f_p(x, y) to non-integer variables remains an issue, and it is a typical training problem. Therefore a learning machine (for example, an artificial neural network (ANN), which is well known to those skilled in the art) can be used to assign the value J_p for such variables, with the machine trained on f(x, y).

So far, the value J_p of the perceptual quality degradation generated by each subsection of the frame loss pattern has been obtained.

Finally, the quality of the video sequence is evaluated by combining the values of all sections of the frame loss pattern.

In this method, a pooling strategy can be used to combine these values into a quality evaluation of the entire video sequence. It should be pointed out that a pooling strategy for temporal quality of this kind is quite different from the pooling strategy used for spatial artifacts such as blockiness and blurring. Because of the characteristics of the human visual system (HVS), people are "easy to hate, difficult to forgive". The successfully transmitted frames between two subsections, which have higher temporal quality, are usually ignored when the overall temporal quality is considered.

In the segmentation step described above, the video sequence is segmented into a set FLP of subsections of periodic frame loss, and each two adjacent subsections are separated by some successfully transmitted frames. NoLoss_i denotes the successfully transmitted frames between SubSection_i and SubSection_(i+1). For simplicity, NoLoss_i is treated as a special kind of periodic frame loss with the smallest quality degradation value, 1. That is, J_p(NoLoss_i) = 1 is set. All these NoLoss_i are then inserted into the set FLP.

The overall quality degradation J is therefore defined as a weighted combination of the values J_p(flp_i) over all elements flp_i of FLP.

Here, w(flp_i) is the weight function for an element of FLP.

In the weight function, length(flp_i) is the number of frames in flp_i, dist(flp_i) is the distance from the center of flp_i to the last frame, and J_p(flp_i) is the perceptual temporal degradation introduced by flp_i, as defined above.

f_T is a function describing the human "memory and forgetting" property. The viewer is assumed to give his or her overall rating after viewing the last frame. Subsections far from the last frame may be forgotten by the viewer; the farther away they are, the more likely they are to be forgotten.

f_D is a function describing the "easy to hate, difficult to forgive" visual property. Humans are strongly affected by heavily distorted subsections, while they largely ignore subsections without distortion.
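The pooling step can be sketched as below. The exact formulas for J and w are not reproduced in this text, so the sketch rests on loudly stated assumptions: J is taken to be a normalized weighted average of J_p, and the weight is taken to be the product f_T(dist) * f_D(J_p) * length. The function name and the element tuples are hypothetical. The demo weight functions follow the parameter settings given later (f_T = 1 inside a 300-frame window, f_D(d) = 6 - d); treating frames outside the window as weight 0 is a further assumption.

```python
def pool_quality(elements, total_frames, f_t, f_d):
    """Combine per-element degradation values into one score.

    elements: list of (start_frame, length, jp) tuples, one per flp_i,
              covering both SubSection and NoLoss elements.
    Assumption: J = sum(w * jp) / sum(w), with w = f_t(dist) * f_d(jp) * length.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for start, length, jp in elements:
        center = start + length / 2
        dist = total_frames - center  # distance from the element's center to the last frame
        w = f_t(dist) * f_d(jp) * length
        weighted_sum += w * jp
        weight_total += w
    return weighted_sum / weight_total


def f_t(dist):
    """Memory weight: 1 inside a 300-frame window (assumed 0 outside it)."""
    return 1.0 if dist <= 300 else 0.0


def f_d(d):
    """Degradation weight taken from the parameter settings: 6 - d."""
    return 6 - d


# Hypothetical 200-frame sequence: SubSection, NoLoss (J_p = 1), SubSection
elements = [(0, 100, 3.0), (100, 50, 1.0), (150, 50, 4.0)]
overall = pool_quality(elements, total_frames=200, f_t=f_t, f_d=f_d)
print(round(overall, 4))
```

Passing f_T and f_D as parameters keeps the sketch usable with other memory or emphasis functions, since the text fixes them only "for simplicity".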

FIG. 4 is an exemplary diagram showing the framework of the method for measuring the quality of a video. The input of the framework is the received video sequence, together with its frame loss pattern (or time stamps) indicating the loss or successful reception of each frame. The output of the framework is a value J indicating the level of video quality (jerkiness) for the input video sequence.

As shown in FIG. 4, the main part of the framework consists of three operations: (1) segmentation of the frame loss pattern of the input video, (2) perceptual evaluation of each section, which is regarded as a periodic frame loss, and (3) pooling.

In the segmentation step, the frame loss pattern of the input video sequence is divided into a set of sections as described above. These sections can be classified into two kinds. One kind (SubSection_i) is composed of similar group frame losses and is regarded as a periodic frame loss within the section. The other kind (NoLoss_i) contains no frame loss at all.

In the perceptual evaluation step, the perceptual evaluation of NoLoss_i is set to the constant 1. The perceptual evaluation of SubSection_i is estimated based on Equation (1), as described above.

In the pooling step, the overall jerkiness evaluation is estimated from the perceptual evaluations of all sections according to Equation (3), as described above.

Another embodiment of the present invention provides an apparatus for measuring the quality of a video based on a frame loss pattern.

FIG. 5 is a block diagram showing an apparatus for measuring the quality of a video based on a frame loss pattern according to an embodiment of the present invention.

As shown in FIG. 5, the apparatus 500 comprises: a frame loss pattern generating unit 501 for receiving an input video sequence and generating a frame loss pattern of the received video sequence; a grouping unit 502 for receiving the frame loss pattern from the frame loss pattern generating unit 501 and grouping, starting from the first lost frame, one or more consecutive lost frames of the frame loss pattern into group frame losses; a dividing unit 503 for receiving the grouped frame loss pattern from the grouping unit 502 and dividing the frame loss pattern, starting from the first group frame loss, into a plurality of sections each having one or more consecutive group frame losses, such that each group frame loss in a section has the same number of successfully transmitted frames between it and the preceding group frame loss, and the same number of lost frames; a calculating unit 504 for calculating the value of the quality degradation generated by each section of group frame losses output by the dividing unit 503; and an evaluating unit 505 for evaluating the quality of the video sequence by combining the values of all sections output by the calculating unit 504, and for outputting that value.

Experiments were carried out to estimate the evaluation accuracy of the present invention in comparison with Non-Patent Documents 1 and 2. For this purpose, a software tool was designed to conduct a subjective test of video quality. FIG. 6 is a diagram showing the interface of the software tool designed to conduct the subjective test. The input video sequence can be selected in the "YUV Sequence" group box on the right, and the frame loss pattern can be selected in the "Frame Loss Pattern" group box at the right center. The video sequence affected by the frame loss pattern is then displayed in the small window on the left.

The viewer is then asked to mark his or her perception of jerkiness as follows:
1 - No quality degradation is perceived at all.
2 - Something unnatural on the time axis can be noticed on careful inspection, but it does not affect enjoyment of the video.
3 - Clear quality degradation, usually felt to be disturbing.
4 - The quality degradation is quite annoying.
5 - The degradation is severe and the video is completely intolerable.

In the subjective test, 10 CIF (video resolution of 352×288 pixels) sequences were selected and 20 frame loss patterns were chosen. Three viewers were asked to give scores, and the average of their scores is taken as the subjective score, denoted J_S. All sequences with marked scores compose the data set DS.

Parameter setting
In the implementation, the constants were determined empirically: β_1 = β_2 = 1, β_3 = 2, c = 1.5, and c_still = 0.23.

For simplicity, a window of 300 frames is taken as the memory span, on the assumption that viewers will have forgotten the quality of the frames preceding this window. Within the window, fT is set to 1 and fD(d) = 6 - d. f1(x, y) and f2(x, y) are determined by Tables 1 and 2 above.

Experimental results
The evaluation accuracy of the present invention is estimated by comparing the evaluation result J obtained according to the present invention with the subjective score JS. The Pearson correlation is used as the measure of prediction accuracy.
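The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool used in the experiment: the function names `subjective_score` and `pearson` are hypothetical, and the inputs are assumed to be one predicted score J and one averaged viewer score JS per test sequence.

```python
from math import sqrt
from typing import Sequence


def subjective_score(viewer_scores: Sequence[float]) -> float:
    """Mean of the viewers' ratings for one sequence (the score called JS)."""
    if not viewer_scores:
        raise ValueError("at least one viewer score is required")
    return sum(viewer_scores) / len(viewer_scores)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

A value close to 1 would indicate that the predicted scores track the viewers' ratings closely across the data set.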

The following table shows the Pearson correlation (prediction accuracy) of the present invention and of the algorithms proposed in Non-Patent Documents 1 and 2.

While the fundamental novel features of the present invention as applied to preferred embodiments have been shown, described, and pointed out, it will be understood that various omissions, substitutions, and changes in the apparatus and methods described, in the form and details of the devices disclosed, and in their operation may be made by those skilled in the art without departing from the spirit of the invention. It is expressly intended that all combinations of elements that perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Substitutions of elements from one described embodiment to another are also fully intended and contemplated.

It will be understood that the present invention has been described purely by way of example, and that modifications of detail can be made without departing from the scope of the invention. Each feature disclosed in the description and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination. Features may, where appropriate, be implemented in hardware, software, or a combination of the two.

Claims

A method for measuring the quality of a video, the method comprising:
generating a frame loss pattern of the video by indicating whether each frame of the video was lost or successfully transmitted; and
evaluating the quality of the video as a function of the generated frame loss pattern.
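As an illustration of the pattern-generation step, the sketch below builds a frame loss pattern as a list with one entry per frame. The encoding (1 for a successfully transmitted frame, 0 for a lost frame) and the function name `frame_loss_pattern` are assumptions made for this example; the claim only requires that each frame be marked as lost or transmitted.

```python
from typing import Iterable, List


def frame_loss_pattern(num_frames: int, received: Iterable[int]) -> List[int]:
    """Return one flag per frame: 1 if the frame arrived, 0 if it was lost.

    `received` holds the zero-based indices of the frames that were
    successfully transmitted (an assumed input format).
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    arrived = set(received)
    return [1 if index in arrived else 0 for index in range(num_frames)]
```

For a six-frame video in which frames 0, 1, and 4 arrive, this yields `[1, 1, 0, 0, 1, 0]`.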

The method of claim 1, wherein the evaluating comprises:
grouping one or more consecutive lost frames of the video, starting from the first lost frame, into a group frame loss;
dividing the frame loss pattern, starting from the first group frame loss, into a plurality of sections each having one or more consecutive group frame losses, such that every group frame loss in a section has the same number of successfully transmitted frames between itself and the previous group frame loss and the same number of lost frames;
calculating a value of the quality degradation generated by each section of group frame losses; and
evaluating the quality of the video by combining the values of all sections.
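The grouping and division steps can be sketched as follows. This is one possible reading, not the patented implementation: the pattern is assumed to be a list of 1 (transmitted) and 0 (lost), each group frame loss is represented by a hypothetical `(gap, length)` tuple, and the gap of the very first group is assumed to be the number of transmitted frames that precede it in the pattern.

```python
from itertools import groupby
from typing import List, Sequence, Tuple

GroupLoss = Tuple[int, int]  # (transmitted frames before the group, lost frames in it)


def group_frame_losses(pattern: Sequence[int]) -> List[GroupLoss]:
    """Collapse each run of consecutive lost frames into one (gap, length) tuple."""
    groups: List[GroupLoss] = []
    gap = 0
    for flag, run in groupby(pattern):
        run_length = len(list(run))
        if flag:                      # run of transmitted frames
            gap = run_length
        else:                         # run of lost frames: one group frame loss
            groups.append((gap, run_length))
            gap = 0
    return groups


def split_into_sections(groups: Sequence[GroupLoss]) -> List[List[GroupLoss]]:
    """Split the groups into sections of consecutive, identical (gap, length) tuples."""
    return [list(run) for _, run in groupby(groups)]
```

For the pattern `[1, 1, 0, 0, 1, 1, 0, 0, 1, 0]` the groups are `[(2, 2), (2, 2), (1, 1)]`, which split into two sections: `[(2, 2), (2, 2)]` and `[(1, 1)]`.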

The method of claim 2, wherein the value of the quality degradation generated by a section of group frame losses is calculated as a function of the average number of frames successfully transmitted between two adjacent group frame losses of the section and the average number of lost frames over all group frame losses of the section.
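The two averages named in this claim can be computed as in the sketch below. It assumes a section is given as a list of hypothetical `(gap, length)` tuples, where `gap` is the number of transmitted frames before a group frame loss and `length` is its number of lost frames. The degradation function itself is not reproduced here, so it is passed in by the caller.

```python
from typing import Callable, Sequence, Tuple

GroupLoss = Tuple[int, int]  # (transmitted frames before the group, lost frames in it)


def section_averages(section: Sequence[GroupLoss]) -> Tuple[float, float]:
    """Return (average gap, average loss length) over the groups of one section."""
    if not section:
        raise ValueError("a section holds at least one group frame loss")
    count = len(section)
    average_gap = sum(gap for gap, _ in section) / count
    average_length = sum(length for _, length in section) / count
    return average_gap, average_length


def section_degradation(
    section: Sequence[GroupLoss],
    degradation: Callable[[float, float], float],
) -> float:
    """Apply a caller-supplied degradation function to the two section averages."""
    average_gap, average_length = section_averages(section)
    return degradation(average_gap, average_length)
```

When every group in a section shares the same gap and length, as the division step requires, the averages simply equal those shared values.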

The method of claim 2 or 3, wherein the evaluating comprises combining the values of all sections by weighted pooling.

The weighting factor of one section of the weighted pooling depends on the number of frames in the section, the distance from the middle of the section to the last frame, and the perceptual time degradation induced by the section. the method of.
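A minimal sketch of weighted pooling is given below. It assumes the pooled score is a normalized weighted average of the section values, which is one common form of pooling and is not confirmed by the claim text. The `SectionScore` record and the `weight` callback are hypothetical; how the three named factors combine into a weight is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class SectionScore:
    value: float     # quality-degradation value of the section
    length: int      # number of frames in the section
    distance: float  # distance from the middle of the section to the last frame


def pool(
    sections: Sequence[SectionScore],
    weight: Callable[[SectionScore], float],
) -> float:
    """Combine the section values into one score using normalized weights."""
    weights = [weight(section) for section in sections]
    total = sum(weights)
    if total <= 0:
        raise ValueError("the weights must sum to a positive number")
    return sum(w * s.value for w, s in zip(weights, sections)) / total
```

For example, weighting two sections by their length alone, `pool([SectionScore(2.0, 10, 50.0), SectionScore(4.0, 30, 5.0)], lambda s: s.length)`, gives 3.5, because the longer section dominates.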

An apparatus for measuring the quality of a video, the apparatus comprising:
means for receiving an input video and generating a frame loss pattern of the received video; and
means for evaluating the quality of the video as a function of the generated frame loss pattern.

The apparatus of claim 6, wherein the means for evaluating the quality of the video comprises:
a grouping unit for receiving the frame loss pattern from a frame loss pattern generation unit and grouping one or more consecutive lost frames of the frame loss pattern, starting from the first lost frame, into a group frame loss;
a division unit for receiving the grouped frame loss pattern from the grouping unit and dividing the frame loss pattern, starting from the first group frame loss, into a plurality of sections each having one or more consecutive group frame losses, such that every group frame loss in a section has the same number of successfully transmitted frames between itself and the previous group frame loss and the same number of lost frames;
a calculation unit for calculating a value of the quality degradation generated by each section of group frame losses output by the division unit; and
an evaluation unit for combining the values of all sections output by the calculation unit to evaluate the quality of the video, and for outputting the result.

The apparatus of claim 7, wherein the calculation unit calculates the value of the quality degradation generated by a section of group frame losses as a function of the average number of frames successfully transmitted between two adjacent group frame losses of the section and the average number of lost frames over all group frame losses of the section.

The apparatus of claim 7 or 8, wherein the evaluation unit combines the values of all sections by weighted pooling.

The weighting factor for one section of the weighted pooling depends on the number of frames in the section, the distance from the middle of the section to the last frame, and the perceptual time degradation induced by the section. The device described.