WO2010143870A2

WO2010143870A2 - Method and apparatus for monitoring video quality

Info

Publication number: WO2010143870A2
Application number: PCT/KR2010/003673
Authority: WO
Inventors: 권재철; 박희철; 이주용; 진영민; 장성환; 서창렬
Original assignee: 주식회사 케이티
Priority date: 2009-06-08
Filing date: 2010-06-08
Publication date: 2010-12-16
Also published as: WO2010143870A3; WO2010143870A9

Abstract

Provided are a method and apparatus for monitoring video quality. The method for monitoring video quality comprises: (a) receiving first visual rhythm information of an encoded video from a first server; (b) extracting second visual rhythm information of the video obtained by receiving the encoded video through a network, and decoding and reproducing the received video; (c) calculating a visual rhythm difference on the basis of the first visual rhythm information and the second visual rhythm information; and (d) transmitting the calculated visual rhythm difference to a second server, wherein the second server measures the degradation of video quality of the reproduced video using the visual rhythm difference.

Description

Image quality monitoring method and device

The present invention relates to a method and apparatus for monitoring image quality, and more particularly, to a method and apparatus for monitoring image quality deterioration for an image reproduced in a set-top box.

Recently, with the commercialization of IPTV, interest in image quality has sharply increased. IPTV operators are therefore making various efforts to monitor the level of image quality experienced by viewers and to reflect the results in content generation or transmission methods in order to increase customer satisfaction. How to evaluate image quality is accordingly an important question.

Currently, ITU-T SG9 and VQEG are standardizing methods for objectively measuring subjective perceived quality, which are broadly classified into Full Reference (FR), Reduced Reference (RR), and No Reference (NR) methods.

Among these, the full-reference method measures the similarity or distortion of the reproduced image compared with the original image. The reduced-reference method measures relative image quality by comparing feature information extracted from the original image with feature information extracted from the playback image. The no-reference method measures quality using only the playback image visible to the viewer, in the absence of the original image itself or its feature information.

Of these, the full-reference method is known to provide the most accurate results and the no-reference method the least accurate.

However, in the actual IPTV business, many factors affect the quality of the playback video. The relative comparison between the original and the playback video suggested by the above standards is therefore hardly sufficient to cope with problems such as quality deterioration in the IPTV business.

For example, distortion of the encoded image, that is, distortion at the front-end stage, is mainly due to defects in the original image or distortion introduced during the encoding process. Distortion due to a defect in the network or the set-top box, that is, distortion at the back-end stage, is mainly caused by packet loss or malfunction of the set-top box. The distortion at the front end and the distortion at the back end therefore have different causes.

Since the playback image of the set-top box shown on the viewer's TV reflects the sum of all these distortions, it is difficult to find the cause when distortion occurs.

To address this, J.240 (Framework for remote monitoring of transmitted picture signal-to-noise ratio using spread-spectrum and orthogonal transform), standardized by ITU-T in June 2004, proposes a method in which the input image is divided into blocks of an appropriate size, the image data is randomized in the frequency domain or the spatial domain using spread spectrum and an orthogonal transform, and one sample is taken as feature information. It is claimed that the average Peak Signal-to-Noise Ratio (hereinafter referred to as PSNR) estimated in this way is very accurate.

However, to extract the feature information, this method requires multiplying the image data by a pseudo-noise (PN) sequence and performing a Walsh-Hadamard transform. Although each operation is not very expensive, it must be performed on all pixel information, which is very burdensome for a set-top box, and the PSNR value estimated at the individual frame level may differ from what is claimed.

For example, for a 720x480 @ 30 fps video, the feature information alone requires 1,296 kbps of bandwidth, which is a heavy burden for a set-top box.
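The 1,296 kbps figure can be reproduced with simple arithmetic under one plausible reading of the method. The sketch below assumes one 8-bit feature sample per 8x8 pixel block; both the block size and the sample width are illustrative assumptions that the text above does not state, and the function name is hypothetical.

```python
def feature_bitrate_bps(width, height, fps, block=8, bits_per_sample=8):
    """Bit rate of feature information when one sample is kept per block.

    block and bits_per_sample are illustrative assumptions, not values
    taken from the J.240 description quoted in the text.
    """
    blocks_per_frame = (width // block) * (height // block)
    return blocks_per_frame * bits_per_sample * fps


rate = feature_bitrate_bps(720, 480, 30)
print(rate / 1000, "kbps")
```

Under these assumptions a 720x480 frame holds 5,400 blocks, so each frame carries 43,200 bits of feature information, and 30 frames per second gives the quoted rate.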

In addition, the method proposed by Yamada et al. (“Reduced-reference based video quality metrics using representative-luminance values,” in Proc. International Workshop on Video Processing and Quality Metrics for Consumer Electronics, Scottsdale, Ariz., USA, Jan. 2007) estimates the PSNR by transmitting, as feature information, the positions and values of the pixels having a "representative luminance value" in the input image. The receiving side then extracts the pixel values at these positions and compares them with the transmitted values.

Although this method is simple, the position information of the "representative luminance value" involves a large amount of data, and in order to make it practical, a separate binary compression codec for compressing the position information is required.

Accordingly, there is a demand for a method for quantitatively measuring the quality of an image reproduced in the set-top box while minimizing the computational burden on the set-top box and making it easy to check various information on the deterioration of the image quality at the set-top box.

Meanwhile, the objective image quality is an amount evaluated based on the difference in pixel values between the reference image and the reproduced image, and the subjective image quality is an amount evaluated by the human eye. The objective picture quality measure is meaningful as a reference measure for reproduction picture quality because the relative difference between the comparison targets is determined by a clear standard, and the subjective picture quality measure has a meaning as a measure for evaluating the sensory picture quality experienced by the viewers.

Since the objective image quality and the subjective image quality do not necessarily coincide, there have been many attempts to quantify subjective image quality using human visual characteristics.

Conventional methods for quantifying subjective image quality include those of ITU-T J.144 and the full-reference (FR), reduced-reference (RR), and no-reference (NR) methods currently being standardized. These model a subjective quality scale by quantifying the deterioration factors (e.g., blocking, blurring, jerkiness, color error, edge busyness, etc.) present in the image frames.

Of these, FR cannot be applied to a general viewing environment because the original image must be available, and the quality measures proposed by such modeling for the RR and NR methods are computationally expensive and not accurate.

In addition, most subjective measures focus mainly on evaluating distortion that occurs during the compression process, such as blocking, blurring, and jerkiness. Attempts and demands to quantitatively evaluate, in terms of the playback image, the image degradation caused by network transmission errors are therefore constantly being made.

In order to solve the above-mentioned problems of the prior art, the present invention provides a new quality measure for objectively measuring the degree of image quality deterioration and an objective image quality deterioration measure based thereon.

In another aspect, the present invention provides a method and apparatus for measuring the objective image quality degradation of the image reproduced in the set-top box, while reducing the computational burden of the set-top box.

In addition, the present invention provides a method and apparatus for quantitatively measuring the quality of a playback image deteriorated due to packet loss and allowing a quality control operator to easily identify the location and amount of image quality deterioration.

The present invention also provides an apparatus capable of quantitatively measuring subjective picture quality degradation due to network transmission error using visual rhythm information.

The objects of the present invention are not limited to the above-mentioned objects, and other objects that are not mentioned will be clearly understood from the following description.

In order to achieve the above object, a method for monitoring image quality according to an aspect of the present invention comprises the steps of: (a) receiving, from a first server, first visual rhythm information of an encoded image; (b) extracting second visual rhythm information from a playback image obtained by receiving the encoded image through a network and decoding it; (c) calculating a visual rhythm difference based on the first visual rhythm information and the second visual rhythm information; and (d) transmitting the calculated visual rhythm difference to a second server, wherein the second server measures the deterioration of image quality of the playback image by using the visual rhythm difference.

In order to achieve the above object, a method of monitoring image quality according to another aspect of the present invention comprises the steps of: (a) receiving, from a media server, first visual rhythm information for an encoded image; (b) receiving, from a set-top box, second visual rhythm information for the decoded playback image; (c) calculating a visual rhythm difference based on the first visual rhythm information and the second visual rhythm information; and (d) measuring the deterioration of the image quality of the playback image using the visual rhythm difference.

In order to achieve the above object, an apparatus for monitoring image quality according to an aspect of the present invention comprises: a receiving unit for receiving, from a set-top box, a visual rhythm difference image based on the difference between first visual rhythm information, which is visual rhythm information for an encoded image, and second visual rhythm information, which is visual rhythm information for the image reproduced in the set-top box; an image quality estimation value calculator for calculating an image quality estimation value of the image reproduced in the set-top box using the visual rhythm difference image; an objective image quality deterioration measure calculation unit for calculating an objective image quality deterioration measure of the reproduced image, including at least one of a number of deteriorations, a number of critical deteriorations, a critical deterioration rate, an average critical deterioration amount, and an average deterioration amount; and a graphical user interface for providing, upon request, image quality monitoring information based on the image quality estimation value and the objective image quality deterioration measure.

In order to achieve the above object, a set-top box for monitoring video quality according to an aspect of the present invention comprises: a video decoder for decoding an encoded video received from a first server via a network to generate a playback video; a visual rhythm information extraction unit for extracting second visual rhythm information from the playback video; a visual rhythm difference calculator for calculating a visual rhythm difference based on first visual rhythm information for the encoded video, received from the first server, and the second visual rhythm information; and a visual rhythm difference transmitter for transmitting the visual rhythm difference to a second server that measures image quality degradation of the playback video by using the visual rhythm difference.

In order to achieve the above object, an apparatus for monitoring image quality according to another aspect of the present invention comprises: a reproduction quality estimation value calculator for calculating an image quality estimation value of a playback image using a visual rhythm difference calculated as the difference between first visual rhythm information for a reference image and second visual rhythm information for the playback image; an objective deterioration section detection unit for detecting, as an objective deterioration section, each of at least one section in which the image quality estimation value of the playback image is below a predetermined reference value; and a subjective deterioration detection unit for calculating, for each of the at least one objective deterioration section, a subjective deterioration amount by using the time duration of the objective deterioration section and the complexity of the image.

Specific details for achieving the above object will be apparent with reference to the embodiments described below in detail with the accompanying drawings.

However, the present invention is not limited to the embodiments disclosed below and may be configured in different forms. The present embodiments are provided to complete the disclosure of the present invention and to fully inform those of ordinary skill in the technical field to which the present invention belongs of the scope of the invention.

According to one of the problem solving means of the method and apparatus for monitoring the image quality of the present invention described above, it is possible to measure the objective image quality degradation of the image reproduced in the set-top box while reducing the computational burden of the set-top box.

In addition, since visual rhythm information is used, objective quality measurement such as position, duration, deterioration frequency, average deterioration amount of image quality deterioration with respect to the playback image is possible, and playback quality measurement by frame is also possible.

In addition, visual rhythm information can visually confirm the deterioration of image quality of the playback image, thereby enabling intuitive image quality monitoring.

In addition, by transmitting the compressed visual rhythm information and the compressed visual rhythm information difference image, it is possible to reduce the uplink transmission bandwidth, and contribute to securing the storage space.

In addition, the subjective quality perceived by the viewer can be quantitatively measured using the visual rhythm information.

In addition, since the subjective perceived image quality can be calculated using a simple formula, the computational complexity of calculating the image quality deterioration measure can be reduced.

FIG. 1 is a diagram illustrating a configuration of a system for monitoring video quality of an IPTV according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating a configuration of a system for monitoring video quality of an IPTV in an offline manner.

FIG. 3 is a diagram for VR information according to an embodiment of the present invention.

FIG. 4 is a diagram illustrating an objective deterioration interval and a critical deterioration interval according to an embodiment of the present invention.

FIG. 5 is a block diagram showing the configuration of a quality management server according to an embodiment of the present invention.

FIG. 6 is a diagram illustrating viewing quality information of content watched by a specific subscriber provided by a quality management server according to an exemplary embodiment of the present invention.

FIG. 7 is a diagram illustrating a method of expressing VR information for VR synchronization according to an embodiment of the present invention.

FIG. 8 is a diagram illustrating a form of VR information for VR synchronization according to an embodiment of the present invention.

FIG. 9 illustrates a method of synchronizing VR information according to an embodiment of the present invention.

FIG. 10 is a diagram illustrating VR information extracted from an hour-long video content according to an embodiment of the present invention.

FIG. 11 is a diagram illustrating a reduction method by spatial subsampling according to an embodiment of the present invention.

FIG. 12 is a diagram illustrating a reduction method by temporal subsampling according to an embodiment of the present invention.

FIG. 13 is a view illustrating a reduction method by lossless compression according to an embodiment of the present invention.

FIG. 14 is a diagram illustrating a method of reducing information by extracting VR information by RTP packet loss detection according to an embodiment of the present invention.

FIG. 15 illustrates a method of reducing information by extracting VR information by video bitstream error detection according to an embodiment of the present invention.

FIG. 16 is a flowchart illustrating an operation of a media server in an online manner according to an embodiment of the present invention.

FIG. 17 is a flowchart illustrating the operation of the set-top box in the online manner according to an embodiment of the present invention.

FIG. 18 is a flowchart illustrating an operation of a quality management server in an online manner according to an embodiment of the present invention.

FIG. 19 is a flowchart illustrating an operation of a media server in an offline manner according to an embodiment of the present invention.

FIG. 20 is a flowchart illustrating the operation of the set-top box in an offline manner according to an embodiment of the present invention.

FIG. 21 is a flowchart illustrating an operation of a quality management server in an online manner according to an embodiment of the present invention.

FIGS. 22 to 24 are diagrams illustrating NPSNR estimation performance of experimental images of Table 1 according to an embodiment of the present invention.

FIGS. 25 to 28 illustrate VR information and a VR_Diff image for an experimental video of Table 2 according to an embodiment of the present invention.

FIG. 29 is a block diagram showing a configuration of a quality management server according to another embodiment of the present invention.

FIG. 30 is a flowchart illustrating a subjective quality deterioration measurement process according to an embodiment of the present invention.

FIG. 31 is a diagram illustrating a correlation between MOS values recorded by viewers and NMOS values for all degradation periods according to an embodiment of the present invention.

DETAILED DESCRIPTION

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art may easily implement the present invention.

As those skilled in the art would realize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the present invention.

In the drawings, parts irrelevant to the description are omitted in order to clearly describe the present invention, and like reference numerals designate like parts throughout the specification.

For reference, throughout the specification, when a part is said to be "connected" to another part, this includes not only being "directly connected" but also being "electrically connected" with another element in between.

In addition, when a part is said to "include" a certain component, this means that it may further include other components, without excluding other components unless otherwise stated.

Hereinafter, with reference to the accompanying configuration diagram or processing flow chart, it will be described in detail for the practice of the present invention.

For reference, a system for monitoring video quality of an IPTV according to an embodiment of the present invention may be implemented in an online method or an offline method. Hereinafter, the operation of each component of the system is described separately for the online method and the offline method.

The system for monitoring video quality of an IPTV according to an embodiment of the present invention includes a media server 110, a set top box 120, and a quality management server 130.

First, referring to FIG. 1, each component in the online method will be described. The media server 110 includes a video encoder 111 and a visual rhythm information extractor 112.

Here, the video encoder 111 compresses the video signal. The media server 110 stores the video bit stream compressed by the video encoder 111 and transmits it to the set-top box 120 through a network.

In addition, the visual rhythm information extractor 112 extracts visual rhythm information (hereinafter referred to as VR information) from the video bit stream stored in the media server 110, that is, from the encoded image. The VR information extracted here is hereinafter referred to as first VR information.

For reference, the VR information extracting unit 112 may include a PTS (Presentation Time Stamp) value, which is time information, and a program ID, which is an identifier for identifying content, in the first VR information.

Here, the VR information is obtained by partially sampling the pixels of each two-dimensional image frame to form one-dimensional information, and by sampling the pixels at the same positions in successive image frames along the time axis, so that the three-dimensional image information is projected into two-dimensional information. The VR information is described in detail later with reference to FIG. 3.

Thereafter, the media server 110 stores the generated first VR information and transmits the generated first VR information to the set-top box 120 in a separate transmission channel different from the video bit stream.

In this case, the first VR information to be transmitted may be divided into a predetermined size and transmitted in segment units. When the storage such as a hard disk exists in the set-top box 120, the entire first VR information may be transmitted at once.

For reference, in the configuration illustrated in FIG. 1, the video encoder 111 and the VR information extractor 112 are shown as being included in the media server 110. However, the video encoder 111 and the VR information extractor 112 may also exist and operate separately from the media server 110.

The set top box 120 may include a video decoder 121, a VR information extractor 122, a VR difference calculator 123, and a VR difference image transmitter 124.

Here, the video decoder 121 decodes the video bit stream transmitted from the media server 110 to generate a reproduced image.

In addition, the VR information extraction unit 122 extracts VR information (hereinafter referred to as second VR information) from the playback image generated by the video decoder 121.

For reference, the VR information extracting unit 122 may include a PTS (Presentation Time Stamp) value, which is time information, and a program ID, which is an identifier for identifying content, in the second VR information.

In addition, the VR difference calculator 123 calculates a VR difference as the difference between the first VR information transmitted from the media server 110 and the second VR information generated by the VR information extractor 122, and generates a VR difference image (hereinafter referred to as a VR_Diff image) based on the VR difference.

In this case, the VR difference calculator 123 synchronizes the first VR information and the second VR information based on the PTS value and the program ID to generate the VR_Diff image.

Here, the VR_Diff image is information related to the image quality viewed by the viewer, and can be easily generated by calculating the difference between the first VR information and the second VR information, and the amount of computation required is very small. Therefore, the computational burden for calculating the VR_Diff image in the set-top box 120 is also very small.
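The synchronization and difference steps described above can be sketched as follows. This is a minimal illustration, assuming each piece of VR information is stored as a mapping from a (program ID, PTS) pair to one sampled pixel line; that data layout, the function name, and the sample values are all hypothetical and not taken from the specification.

```python
def vr_difference(first_vr, second_vr):
    """Difference between reference and playback VR information.

    Each argument maps (program_id, pts) -> list of sampled pixel values.
    Lines are matched on the (program_id, pts) key, so only frames present
    on both sides are compared. The per-pixel cost is one subtraction.
    """
    diff = {}
    for key in sorted(first_vr.keys() & second_vr.keys()):
        reference, played = first_vr[key], second_vr[key]
        diff[key] = [r - p for r, p in zip(reference, played)]
    return diff


first = {("prog-1", 0): [10, 20, 30], ("prog-1", 3003): [11, 21, 31]}
second = {
    ("prog-1", 0): [10, 20, 30],
    ("prog-1", 3003): [11, 90, 31],   # one sample damaged in playback
    ("prog-1", 6006): [1, 2, 3],      # no matching reference line yet
}
vr_diff = vr_difference(first, second)
print(vr_diff)
```

Frames without a counterpart are simply skipped here; a real implementation would need a policy for late or missing reference lines.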

In addition, the VR_Diff image has a nonzero difference value only in sections deteriorated by packet loss and has a value of '0' in the remaining sections without deterioration, so the amount of data is not large. If the VR_Diff image is further compressed using any of various methods, such as a lossless compression method, the amount of data can be reduced even more.
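The effect of lossless compression on a mostly-zero VR_Diff image can be illustrated with a short sketch. It uses Python's standard zlib module purely as a stand-in codec; the specification does not name a particular compressor, and the buffer dimensions and difference values below are made up for the example.

```python
import zlib


def compress_vr_diff(diff_bytes):
    """Losslessly compress a VR_Diff buffer (zlib is an assumed codec)."""
    return zlib.compress(diff_bytes, 9)


# Hypothetical VR_Diff: 1000 frames x 480 samples, one byte per sample,
# all zero except three consecutive frames hit by packet loss.
raw = bytearray(480 * 1000)
raw[480 * 200:480 * 203] = bytes([37]) * (480 * 3)

packed = compress_vr_diff(bytes(raw))
print(len(raw), "bytes before,", len(packed), "bytes after")
```

Because the decompressed buffer is bit-identical to the input, the quality management server sees exactly the difference values computed in the set-top box.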

A detailed description of the compression method of the VR information (VR_Diff image) will be described later with reference to FIGS. 10 to 15.

In addition, the VR difference image transmitter 124, that is, the VR_Diff image transmitter 124 compresses the VR_Diff image generated by the VR difference calculator 123 and transmits the compressed image to the quality management server 130.

In this case, the VR_Diff image may be transmitted by using a reliable channel such as TCP or may be redundantly transmitted several times to prevent data loss during transmission.

Meanwhile, the quality management server 130 receives the VR_Diff image from the set-top box 120 and measures the image quality degradation of the image reproduced in the set-top box 120 using the VR_Diff image.

In this case, image quality deterioration is measured by estimating, in terms of PSNR, the amount of distortion caused by packet loss occurring during transmission of data from the media server 110 to the set-top box 120. For this purpose, a new quality measure called Networked PSNR (NPSNR), which improves on the existing PSNR, is used.

NPSNR represents the quality of the image played in the set-top box 120 relative to the encoded image, not the original image, and is a value for the 2D image; its estimate obtained from VR information is written $\widehat{NPSNR}$.

Using $\widehat{NPSNR}$ and deterioration measures such as the number of deteriorations, the number of critical deteriorations, the critical deterioration rate, the average critical deterioration amount, and the average deterioration amount, the quality management server 130 described above measures the image quality deterioration of the image reproduced by the set-top box 120.

A detailed description of the image quality degradation degree measurement of the quality management server 130 will be described later.

Hereinafter, a configuration of a system for monitoring video quality of an IPTV according to another embodiment of the present invention will be described with reference to FIG. 2.

FIG. 2 is a diagram illustrating a configuration of a system for monitoring video quality of an IPTV in an offline manner, and includes a media server 210, a set-top box 220, and a quality management server 230.

First, the media server 210 has the same configuration as the media server 110 shown in FIG. 1. The difference is that the media server 210 shown in FIG. 2 transmits the generated first VR information to the quality management server 230 rather than to the set-top box 220.

The set-top box 220 may include a video decoder 221, a VR information extractor 222, and a VR information transmitter 223. Among them, the video decoder 221 and the VR information extractor 222 have the same functions as the video decoder 121 and the VR information extractor 122 of the set-top box 120 shown in FIG. 1. The VR information transmitter 223 compresses the second VR information extracted by the VR information extractor 222 and transmits it to the quality management server 230.

That is, the difference is that in the online method the set-top box 120 compresses the VR_Diff image and transmits it to the quality management server 130, whereas in the offline method the set-top box 220 compresses the second VR information and transmits it to the quality management server 230.

Therefore, the offline method does not need to generate the VR_Diff image in the set-top box 220, which reduces the consumption of computational resources of the set-top box 220. This has the advantage that image quality can be monitored even in a set-top box with lower specifications than the online method requires.

Meanwhile, the quality management server 230 receives the first VR information transmitted from the media server 210 and the second VR information transmitted from the set-top box 220. Thereafter, the quality management server 230 synchronizes the first VR information and the second VR information based on the PTS value and the program ID included in each, and generates a VR_Diff image using the difference between the two.

The quality management server 230 then uses the VR_Diff image to measure image quality deterioration of the image reproduced in the set-top box 220 in the same way as the quality management server 130 shown in FIG. 1, so a detailed description is omitted.

For reference, the offline quality management server 230 does not need to store both the first VR information and the second VR information.

That is, the offline quality management server 230 generates the VR_Diff image from the first VR information and the second VR information. Consequently, when the first VR information and the VR_Diff image exist, the second VR information can be restored from them; conversely, when the second VR information and the VR_Diff image exist, the first VR information can be restored from them.
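The restoration property can be checked with a short sketch. It assumes the VR_Diff image stores the signed difference (first minus second) without loss; the sign convention and the function names are assumptions made for illustration.

```python
def make_diff(first_vr, second_vr):
    """Signed per-sample difference: first minus second (assumed convention)."""
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(first_vr, second_vr)]


def restore_second(first_vr, vr_diff):
    """Recover playback VR information from the reference and the difference."""
    return [[a - d for a, d in zip(r1, rd)] for r1, rd in zip(first_vr, vr_diff)]


def restore_first(second_vr, vr_diff):
    """Recover reference VR information from the playback and the difference."""
    return [[b + d for b, d in zip(r2, rd)] for r2, rd in zip(second_vr, vr_diff)]


first = [[10, 20, 30], [11, 21, 31]]
second = [[10, 20, 30], [11, 90, 31]]
diff = make_diff(first, second)
print(restore_second(first, diff) == second, restore_first(second, diff) == first)
```

If the difference were stored as an absolute value or compressed lossily, this round trip would no longer hold, so the storage saving depends on keeping the signed values intact.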

Therefore, there is no need to store both the first VR information and the second VR information, and there is an advantage that can contribute to securing the storage space.

Hereinafter, the VR information of the present invention will be described in detail with reference to FIG. 3.

As shown in FIG. 3, VR information according to an embodiment of the present invention is obtained by sampling the pixels located along a vertical, diagonal, horizontal, or other direction in each two-dimensional video frame to form one-dimensional information, and then sampling the pixels at the same positions in successive video frames along the time axis, so that the 3D video information is projected into 2D information.

Although the spatial information included in the VR information is only a small part of the original image, experiments confirmed that it reflects a large part of the 2D frame information. In other words, the experiments confirmed that the PSNR value computed over the two-dimensional frames of the original video and the set-top box playback image is highly correlated with the PSNR value calculated only from their VR information, so the former can be estimated from the latter.
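The sampling described above can be sketched in a few lines. The example assumes frames are plain nested lists of luminance values and that the sampling line passes through the frame center (vertical, horizontal) or runs from the top-left to the bottom-right corner (diagonal); the exact sampling positions and the function names are illustrative assumptions.

```python
def extract_vr_line(frame, direction="vertical"):
    """Sample one line of pixels from a 2D frame given as a list of rows."""
    height, width = len(frame), len(frame[0])
    if direction == "vertical":
        x = width // 2
        return [frame[y][x] for y in range(height)]
    if direction == "horizontal":
        return list(frame[height // 2])
    if direction == "diagonal":
        # One sample per row, walking from top-left to bottom-right.
        scale = max(height - 1, 1)
        return [frame[y][(y * (width - 1)) // scale] for y in range(height)]
    raise ValueError("unknown direction: " + direction)


def extract_visual_rhythm(frames, direction="vertical"):
    """Stack one sampled line per frame; result[t] is the line of frame t."""
    return [extract_vr_line(frame, direction) for frame in frames]


# Three synthetic 4x4 frames whose pixel value encodes (t, y, x).
frames = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)]
          for t in range(3)]
vr = extract_visual_rhythm(frames, "diagonal")
print(vr)
```

The result has one row per frame, so an hour of video becomes a single 2D image whose width (or height) is the frame count.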

The theoretical background of the present invention is as follows.

Is the pixel value of the original image,

Is the pixel value of the set-top box playback image,

Is a random variable that represents a pixel value in a playback video frame after encoding. Total distortion of video played back on set-top box

To be defined as in <Equation 1>. Where E [X] is the expected value (ie mean) of the random variable X.

here

, I, j th pixel value of the n th frame.

double

Is the amount of distortion that occurs during encoding.

Is the amount of distortion caused by packet loss generated during the transmission process.

Suppose the two distortion amounts are independent of each other

Is equal to the sum of the two distortion amounts. By the way

Is the amount known during encoding,

Is an unknown amount and the original video is not available in the set-top box.

Is an unknown amount in the set-top box.

In the present invention

Is to propose a measure of image quality deterioration by estimating

In order to know the quantity,

Is required, but since the encoded video cannot be expected in the set-top box, the above-described VR information is used to estimate the amount.

Let $VR_{\hat{X}}$ be the VR information of the bit stream image obtained in the encoding step, and $VR_Y$ the VR information of the video played on the set-top box. The estimate $\hat{D}_t$ of the transmission distortion can then be approximated as in <Equation 2>.

<Equation 2>

$\hat{D}_t \approx E[VR_{Diff}^2]$, where $VR_{Diff} = VR_{\hat{X}} - VR_Y$.

That is, the transmission distortion can be estimated from the VR information of the encoded image, $VR_{\hat{X}}$, and the VR information of the video played on the set-top box, $VR_Y$.

If there is no degradation due to errors during transmission, the total distortion equals the distortion generated during encoding. <Equation 2> therefore indicates whether image quality degradation has occurred because of a transmission error and, if so, how large the degradation is.
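As an illustration of the estimate discussed above, the following sketch computes a transmission distortion estimate from two VR images. It assumes that the estimate is the mean squared difference between co-located VR samples, which is an assumption made for this example; the function name `estimate_transmission_distortion` is hypothetical.

```python
from typing import Sequence


def estimate_transmission_distortion(
    vr_encoded: Sequence[Sequence[int]],
    vr_played: Sequence[Sequence[int]],
) -> float:
    """Mean squared difference between two VR images indexed [frame][sample]."""
    squared_sum = 0
    count = 0
    for line_enc, line_play in zip(vr_encoded, vr_played):
        for a, b in zip(line_enc, line_play):
            squared_sum += (a - b) ** 2
            count += 1
    return squared_sum / count


vr_encoded = [[10, 20, 30], [40, 50, 60]]   # first VR information (encoded video)
vr_played = [[10, 20, 30], [40, 54, 57]]    # second VR information (set-top box)
d_hat = estimate_transmission_distortion(vr_encoded, vr_played)
```

When the two VR images are identical the estimate is zero, which corresponds to the case where no transmission error has occurred.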

For reference, the amount of distortion generated during the encoding process is outside the scope of the present invention.

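To define a measure of the image quality degradation that occurs in the transmission stage, and to provide more intuitive insight, the estimated transmission distortion is converted into a PSNR value so that the image quality of the image reproduced in the set-top box can be evaluated objectively.

To this end, the quality measure proposed by the present invention for objectively measuring image quality degradation is the NPSNR (Networked PSNR), defined by <Equation 3>, where $D_t$ is the distortion caused by transmission errors.

<Equation 3>

$NPSNR = 10 \log_{10} \frac{255^2}{D_t + c}$

Here, the NPSNR represents the quality of the image played in the set-top box relative to the encoded image rather than the original image, and is a value computed on the 2D image. $\widehat{NPSNR}$, defined by <Equation 4>, is an estimate of the NPSNR obtained using VR information, where $\hat{D}_t$ is the estimate of $D_t$ obtained from the VR information.

<Equation 4>

$\widehat{NPSNR} = 10 \log_{10} \frac{255^2}{\hat{D}_t + c}$

Here, $c = 0.65025$ is a constant normalized so that the NPSNR is 50 dB when there is no packet loss, since $10 \log_{10}(255^2 / 0.65025) = 50$.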
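The normalization by c can be checked numerically. The sketch below assumes a PSNR-style form, 10 log10(255^2 / (D + c)), for 8-bit video; this form is an assumption chosen because it yields exactly 50 dB at zero distortion with c = 0.65025. The function name `npsnr_db` is hypothetical.

```python
import math

PEAK = 255      # peak sample value for 8-bit video (assumed)
C = 0.65025     # normalization constant given in the text


def npsnr_db(distortion: float) -> float:
    """Convert a mean squared distortion into an NPSNR-style value in dB."""
    return 10.0 * math.log10(PEAK ** 2 / (distortion + C))


no_loss = npsnr_db(0.0)        # zero distortion: the 50 dB ceiling
degraded = npsnr_db(25.0)      # any positive distortion falls below the ceiling
```

Because c is added to the denominator, the value is bounded at 50 dB instead of diverging when the two VR images are identical.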

As can be seen from the above equations, simply comparing the VR information of the encoded image with the VR information of the set-top box playback image yields information such as the location of an error (degradation), its duration, and the number of times it occurs.

This makes it possible to estimate objectively how badly the displayed image is degraded.

Hereinafter, the objective image quality degradation measurement (evaluation) method in the quality management servers 130 and 230 will be described in detail.

For objective image quality degradation measurement in the quality management servers 130 and 230 according to an embodiment of the present invention, the following quantities are first defined.

Objective degradation interval: defined by <Equation 5>.

Here, $t_s^k$ is the frame time at which the k-th degradation started, and $t_e^k$ is the frame time at which the k-th degradation ended. <Equation 5> therefore expresses how long the k-th degradation lasted.

Note that the estimated NPSNR has a maximum value of 50 dB: it is 50 dB when there is no degradation and less than 50 dB when degradation occurs. The beginning and end of a degradation are therefore found by locating the interval in which the estimated NPSNR is less than 50 dB.
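Locating the degradation intervals amounts to finding the runs of consecutive frames whose estimated NPSNR lies below 50 dB. A minimal sketch, assuming a per-frame list of NPSNR estimates and a small tolerance for floating-point noise (the tolerance and the name `find_degradation_intervals` are assumptions):

```python
from typing import List, Tuple


def find_degradation_intervals(
    npsnr_per_frame: List[float],
    no_loss_db: float = 50.0,
    eps: float = 1e-6,
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) pairs, inclusive, of runs below 50 dB."""
    intervals = []
    start = None
    for n, value in enumerate(npsnr_per_frame):
        degraded = value < no_loss_db - eps
        if degraded and start is None:
            start = n                         # k-th degradation begins
        elif not degraded and start is not None:
            intervals.append((start, n - 1))  # k-th degradation ended
            start = None
    if start is not None:                     # degradation runs to the last frame
        intervals.append((start, len(npsnr_per_frame) - 1))
    return intervals


npsnr = [50.0, 50.0, 42.0, 38.0, 50.0, 47.0, 50.0, 30.0, 31.0]
intervals = find_degradation_intervals(npsnr)
```

For the sample sequence this yields three intervals, the last of which extends to the final frame.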

Objective critical degradation interval:

The objective critical degradation interval is a degradation interval, as described above, in which at least one frame has an image quality at or below the critical NPSNR value.

The critical NPSNR value quantifies the image quality degradation that can inconvenience viewers: degradation that may be uncomfortable to the eye is considered to occur when the estimated NPSNR is less than this preset threshold.

The first and third degradation intervals are counted as critical degradation intervals because they fall below the threshold, whereas the second degradation interval is not, because its degradation is small.
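The selection of critical degradation intervals can be sketched as a filter over the degradation intervals. The example below uses 45 dB, the example threshold given later in the text, and sample values chosen so that the first and third intervals are critical while the second is not; the function name `select_critical_intervals` is hypothetical.

```python
from typing import List, Tuple


def select_critical_intervals(
    npsnr_per_frame: List[float],
    intervals: List[Tuple[int, int]],
    threshold_db: float = 45.0,
) -> List[Tuple[int, int]]:
    """Keep intervals in which at least one frame is below the threshold."""
    return [
        (start, end)
        for start, end in intervals
        if min(npsnr_per_frame[start:end + 1]) < threshold_db
    ]


npsnr = [50.0, 50.0, 42.0, 38.0, 50.0, 47.0, 50.0, 30.0, 31.0, 50.0]
intervals = [(2, 3), (5, 5), (7, 8)]   # inclusive frame ranges below 50 dB
critical = select_critical_intervals(npsnr, intervals)
```

The second interval drops only to 47 dB, so it remains a degradation interval without being counted as critical.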

In order to quantify the degradation of the image quality reproduced in the set-top box using the above-described deterioration interval and the critical degradation interval, the present invention proposes the following objective deterioration measures.

1. Degradation count $K_d$: the largest k among the degradation intervals, that is, the number of degradation intervals.

2. Critical degradation count $K_{cd}$: the largest k among the critical degradation intervals, that is, the number of critical degradation intervals.

3. Critical degradation ratio $\rho_{cd}$: the ratio of the time occupied by all critical degradation intervals to the total time.

4. Average critical degradation NPSNR $Q_{avg}$: the average of the estimated NPSNR over the frames constituting all critical degradation intervals.

5. Average degradation amount: the average distortion between the two sets of VR information, converted into an NPSNR value.

6. Objective playback quality measure: the estimated NPSNR value for the n-th frame, that is, the NPSNR in units of frames.
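The counting measures above can be computed directly once the intervals are known. The following sketch assumes inclusive frame-index intervals and measures time in frames; the dictionary keys and the function name `degradation_measures` are hypothetical labels for this example.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # inclusive (start_frame, end_frame)


def degradation_measures(
    npsnr_per_frame: List[float],
    intervals: List[Interval],
    critical: List[Interval],
) -> Dict[str, Optional[float]]:
    """Compute count, ratio, and average measures from per-frame NPSNR."""
    critical_values = [
        npsnr_per_frame[n] for start, end in critical for n in range(start, end + 1)
    ]
    q_avg = sum(critical_values) / len(critical_values) if critical_values else None
    return {
        "K_d": len(intervals),                  # degradation count
        "K_cd": len(critical),                  # critical degradation count
        "rho_cd_percent": 100.0 * len(critical_values) / len(npsnr_per_frame),
        "Q_avg_db": q_avg,                      # average critical degradation NPSNR
    }


npsnr = [50.0, 50.0, 42.0, 38.0, 50.0, 47.0, 50.0, 30.0, 31.0, 50.0]
measures = degradation_measures(npsnr, [(2, 3), (5, 5), (7, 8)], [(2, 3), (7, 8)])
```

With a constant frame rate, the ratio of frames equals the ratio of time, so no explicit timestamps are needed in this sketch.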

For reference, when the set-top box performs error concealment, a small drop in the estimated NPSNR may not be noticeable. Although it depends on the type of image, experiments show that degradation that is uncomfortable to the eye usually tends to be perceived when the estimated NPSNR is less than 45 dB.

In the present invention, 45 dB is used as an example setting for the critical NPSNR value. However, the value may be set to another value depending on the embodiment.

Hereinafter, the configuration of the quality management servers 130 and 230 of the present invention will be described with reference to FIG. 5; for convenience of description, the quality management server 130 will be described as an example.

FIG. 5 is a block diagram showing the configuration of the quality management server 130 according to an embodiment of the present invention.

According to an embodiment of the present invention, the quality management server 130 includes a receiver 131, an objective image quality estimation value calculator 132, an objective image quality degradation scale calculator 133, a subjective perceived image quality calculator 134, a storage 135, and a graphical user interface providing unit 136.

For reference, the components 131 to 136 of the quality management server illustrated in FIG. 5 are common to the online scheme and the offline scheme. In the offline scheme, the server may further include, in addition to the components illustrated in FIG. 5, a VR_Diff generation unit (not shown) for generating a VR_Diff image.

In the online scheme, the receiving unit 131 of the quality management server 130 receives the VR_Diff image from the set-top box. In the offline scheme, it receives the first VR information and the second VR information from the media server and the set-top box, respectively, and a VR_Diff image is generated from them.

Thereafter, the quality management server 130 measures the degradation of the objective playback quality of one video content using the VR_Diff image.

In this case, the objective image quality estimation value calculation unit 132 calculates the image quality estimate per frame (the estimated NPSNR), and the objective image quality degradation scale calculation unit 133 calculates the objective quality degradation measures proposed by the present invention, such as the degradation count, the critical degradation count, the critical degradation ratio, the average degradation amount, and the average critical degradation amount.

Here, the degradation count is the number of times degradation occurs over the entire video sequence, that is, the number of degradation intervals, and the critical degradation count is the number of critical degradation intervals.

In addition, the critical degradation ratio is the ratio of the time occupied by all critical degradation intervals to the total time, and the average degradation amount is the average distortion between the two sets of VR information converted into an NPSNR value.

The average critical degradation amount is the average of the estimated NPSNR values of the frames constituting all critical degradation intervals.

For reference, in the present invention the objective quality degradation scale calculation unit 133 calculates every objective quality degradation measure, but a separate component may instead be provided for each measure to calculate it. For example, the degradation count may be calculated by a degradation count calculation unit.

Meanwhile, the subjective perceived image quality calculator 134 may calculate the subjective image quality from the VR information by taking human visual characteristics into account.

Meanwhile, the storage 135 stores the objective image quality degradation evaluation values calculated by the units 132 and 133 described above. In addition, the storage 135 stores the VR_Diff image received from the set-top box in the online scheme, and in the offline scheme stores the VR_Diff image generated from the first VR information and the second VR information received from the media server and the set-top box.

Meanwhile, the graphic user interface providing unit 136 provides various information through a graphic user interface (hereinafter referred to as a GUI) so that the quality management operator can view the quality information when necessary.

The information provided by the GUI providing unit 136 may include an NPSNR graph, the values of the various objective image quality degradation measures, the VR information of the encoded video (first VR information), the VR information of the video played in the set-top box (second VR information), and the VR_Diff image between the two sets of VR information.

Quality degradation data is recorded in the storage 135 in the form of a data structure as shown in FIG. 6.

When the viewing quality information shown in FIG. 6 is recorded in the storage 135, the quality management operator can access this information at any time and view the quality information on the content viewed by the subscriber in the form of a text or a graph.

Hereinafter, a method of representing and synchronizing VR information will be described with reference to FIGS. 7 to 9. It is assumed that the first VR information and the second VR information are synchronized in order to generate the VR_Diff image in the offline quality management server 230 illustrated in FIG. 2.

For reference, the reason for synchronizing the first VR information and the second VR information to generate the VR_Diff image is briefly as follows. In the offline scheme, the media server 210 and the set-top box 220 each transmit the VR information they generate (the first VR information and the second VR information, respectively) to the quality management server 230, and for various reasons it is highly likely that the two do not match exactly one-to-one.

Because VR information is transmitted to the quality management server 230 through a reliable and secure channel, the VR information itself is rarely lost. However, if frames are missing from the playback image of the set-top box 220 or the same frame is repeated, because of transmission errors or for other reasons, the corresponding VR information is also missing or repeated.

Therefore, to evaluate the objective image quality degradation of the playback image of the set-top box 220 (that is, to generate a VR_Diff image), the first VR information and the second VR information must be synchronized. For this purpose, time information and a program ID (identifier) that uniquely identifies the content may be added to each VR information.

The video presentation time stamp (PTS) value used in the MPEG-2 System is attached as time information to the VR value extracted from each frame, and "time information + VR value" is used as one VR information presentation unit.

Since the video PTS value is time information indicating when a video frame is played, the media server 210 and the set-top box 220 have the same value for a given frame. A specific frame can therefore be uniquely identified in both the media server 210 and the set-top box 220.

For VR synchronization, the VR information takes the form of a VR chunk, which is formed by adding a program ID, an identifier that uniquely identifies the content, to the VR information presentation units collected over a unit time, as shown in FIG. 7.

The VR information for one content may have a form in which VR chunks are continuously connected as shown in FIG. 8.

As described above, since the first VR information of the media server 210 and the second VR information of the set-top box 220 are not information available at exactly the same time, a delay may occur between the two information.

To compare VR information that is offset by such a delay, the two sets of information must be synchronized. The offline quality management server 230 uses the video PTS information of the MPEG-2 PES packet, which is included in the VR information, for this synchronization.

If a frame is missing in the set-top box 220, the PTS value and VR information of the previous frame may be substituted for it, or the quality degradation measure may be calculated while ignoring the missing frame.

As illustrated in FIG. 9, the quality management server 230 may easily synchronize the two information by comparing the VR information having the same PTS value in VR information units.
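PTS-based synchronization can be sketched as a lookup by timestamp. The example below assumes both lists already belong to the same program ID, and it implements both options described for missing frames (substituting the previous frame's VR values, or ignoring the frame); `VRUnit` and `synchronize` are hypothetical names, and the PTS values are illustrative.

```python
from typing import List, NamedTuple, Optional, Tuple


class VRUnit(NamedTuple):
    pts: int             # video presentation time stamp of the frame
    samples: List[int]   # VR samples extracted from that frame


def synchronize(
    first: List[VRUnit],
    second: List[VRUnit],
    substitute_previous: bool = True,
) -> List[Tuple[VRUnit, VRUnit]]:
    """Pair units of the first and second VR information that share a PTS."""
    second_by_pts = {unit.pts: unit for unit in second}
    pairs: List[Tuple[VRUnit, VRUnit]] = []
    previous: Optional[VRUnit] = None
    for unit in first:
        match = second_by_pts.get(unit.pts)
        if match is None:
            if not substitute_previous or previous is None:
                continue  # ignore the missing frame
            match = VRUnit(unit.pts, previous.samples)  # reuse previous frame
        pairs.append((unit, match))
        previous = match
    return pairs


first = [VRUnit(0, [1, 1]), VRUnit(3003, [2, 2]), VRUnit(6006, [3, 3])]
second = [VRUnit(0, [1, 1]), VRUnit(6006, [3, 9])]   # frame at PTS 3003 is missing
pairs = synchronize(first, second)
pairs_ignoring = synchronize(first, second, substitute_previous=False)
```

Substituting the previous frame mirrors what a decoder typically displays when a frame is lost, whereas ignoring the frame removes it from the measure entirely.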

The method of representing and synchronizing VR information in the offline quality management server 230 has been described above with reference to FIGS. 7 to 9; the set-top box 120 of the online scheme can represent and synchronize VR information in the same manner.

Hereinafter, the storage method of the VR information will be briefly described.

The VR information generated in the set-top box is information to be preserved for evaluating image quality deterioration. The place for storing the VR information may be a quality control server, or in some cases, may be a storage such as a hard disk in the set-top box.

If the amount of storage required is a burden on the quality management server, all VR information for a viewer can be stored on the hard disk in the set-top box, and the quality management server can access the hard disk in the set-top box when necessary.

If the VR information is stored in the quality management server, the storage capacity can be minimized by storing a losslessly compressed VR_Diff image instead of the generated VR information itself. Because the VR information of the media server can be provided by the media server when needed, the VR information of the set-top box can be recovered from the stored VR_Diff image and the VR information of the media server.
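The storage and recovery path can be illustrated as follows. This sketch assumes that VR_Diff is kept as a signed sample-wise difference (so that recovery is exact) and uses zlib purely as a stand-in for the unspecified lossless compression method; all function names are hypothetical.

```python
import struct
import zlib
from typing import List


def make_vr_diff(vr_media: List[int], vr_stb: List[int]) -> List[int]:
    """Signed difference between media-server and set-top-box VR samples."""
    return [m - s for m, s in zip(vr_media, vr_stb)]


def pack_diff(diff: List[int]) -> bytes:
    """Serialize as little-endian 16-bit integers and compress losslessly."""
    return zlib.compress(struct.pack("<%dh" % len(diff), *diff))


def unpack_diff(blob: bytes) -> List[int]:
    raw = zlib.decompress(blob)
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))


def recover_stb_vr(vr_media: List[int], diff: List[int]) -> List[int]:
    """Rebuild the set-top-box VR samples from the media-server VR and VR_Diff."""
    return [m - d for m, d in zip(vr_media, diff)]


vr_media = [16, 128, 235, 90]
vr_stb = [16, 120, 255, 90]
stored = pack_diff(make_vr_diff(vr_media, vr_stb))
recovered = recover_stb_vr(vr_media, unpack_diff(stored))
```

Since most VR_Diff samples are zero when few errors occur, the difference image tends to compress far better than the raw VR information.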

In addition, when VR information is stored on the hard disk of the set-top box, the online set-top box 120 losslessly compresses the VR_Diff image and stores it on the hard disk, whereas the offline set-top box 220 compresses the generated VR information and stores it on the hard disk.

Then, at the request of the quality management server, the set-top box 220 transmits the VR information or the VR_Diff image compressed and stored in the hard disk to the quality management server using a reliable channel.

Hereinafter, a compression method of VR information will be described in detail with reference to FIGS. 10 to 15.

The amount of VR information extracted from one hour of SD video content with 720x480 resolution is 480 B/frame x 30 frames/sec x 3600 sec/hour = 51,840,000 B/hr, which corresponds to 115,200 bps.

That is, a transmission bandwidth of 115.2 kbps is required to transmit the VR information of this content to the quality management server, and 51.84 MB of storage space is required to store it.

One hour of viewing quality information for one viewer is a very large amount of data. Therefore, there is a need for a method of reducing VR information. Hereinafter, various methods for reducing the amount of transmission data by reducing VR information will be described.

When the VR information is extracted from the content, it may be reduced by subsampling at a ratio of about 2:1 to about 8:1 in the spatial (vertical) direction, as shown in FIG. 8. Experiments confirmed that NPSNR estimation performance does not differ significantly even when subsampling is performed at 8:1.

This means that, as long as enough samples remain to be statistically significant, the reduction has little effect on estimation performance. Spatial subsampling at 8:1 reduces the original VR data amount of 51.84 MB to 6.48 MB, which can be transmitted with a bandwidth of 14.4 kbps.

If the VR information reduced by the spatial subsampling shown in FIG. 11 is additionally subsampled temporally at 2:1 to 3:1, the data amount can be reduced further, as shown in FIG. 8.

If temporal subsampling is additionally applied to the VR information reduced by spatial subsampling, the VR information can be reduced to about 2.16MB and 4.8kbps.

Information sub-sampled temporally, spatially or spatiotemporally by the reduction method as shown in FIGS. 11 and 12 may further reduce the data amount by the lossless compression method as shown in FIG. 13.

Applying a lossless compression method can further reduce the amount of data to about 1/2 to 1/3; in the case of the VR information shown in FIG. 9, the data amount can be reduced to about 1 MB and 2 kbps.
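The data amounts quoted above follow from simple arithmetic, reproduced here as a worked example. It uses only the figures stated in the text (480 B/frame, 30 frames/sec, one hour, 8:1 spatial and 3:1 temporal subsampling, lossless gain of 1/2 to 1/3) and assumes 1 MB = 1,000,000 bytes and 1 kbps = 1,000 bps; the helper names are hypothetical.

```python
SECONDS_PER_HOUR = 3600


def megabytes(size_bytes: float) -> float:
    return size_bytes / 1_000_000


def kbps(size_bytes: float) -> float:
    """Average bit rate needed to send this many bytes within one hour."""
    return size_bytes * 8 / SECONDS_PER_HOUR / 1000


raw_bytes = 480 * 30 * SECONDS_PER_HOUR        # one hour of 720x480 SD video
spatial_bytes = raw_bytes // 8                 # 8:1 spatial subsampling
spatiotemporal_bytes = spatial_bytes // 3      # additional 3:1 temporal subsampling
lossless_best = spatiotemporal_bytes / 3       # lossless compression to 1/3
lossless_worst = spatiotemporal_bytes / 2      # lossless compression to 1/2

for label, size in [
    ("raw", raw_bytes),
    ("spatial 8:1", spatial_bytes),
    ("spatial 8:1 + temporal 3:1", spatiotemporal_bytes),
    ("with lossless 1/2", lossless_worst),
    ("with lossless 1/3", lossless_best),
]:
    print(f"{label}: {megabytes(size):.2f} MB, {kbps(size):.1f} kbps")
```

The last two rows bracket the "about 1 MB and 2 kbps" figure: 1.08 MB at 2.4 kbps for a 1/2 gain and 0.72 MB at 1.6 kbps for a 1/3 gain.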

This is far smaller than the data amount of the existing methods (J.240, YAMADA), which require several hundred kbps to several Mbps for their feature information.

The method shown in FIG. 14 does not extract VR information for every frame. Instead, it uses the packet loss information provided by the RTP packet layer: when a loss is detected, VR information is extracted from the image played in the set-top box for a predetermined time from that point, and no VR information is extracted in the remaining time intervals, which reduces the data amount of the VR information.

It takes a certain amount of time (hundreds of ms to several seconds) before a lost packet affects the viewer's TV screen. Therefore, when a packet loss event occurs in the packet layer, VR information is extracted only for a certain period of time (N seconds) after the event.

If no packet loss occurs, no VR information is extracted. The amount of VR information extracted is therefore proportional to the amount of packet loss, and is almost zero when no packet loss occurs.
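The loss-triggered extraction can be sketched as selecting a window of frames after each loss event. The example assumes that loss events have already been mapped to frame indices and that the N-second window is expressed in frames; both are simplifications, and `frames_to_extract` is a hypothetical name.

```python
from typing import Iterable, List


def frames_to_extract(
    total_frames: int,
    loss_frames: Iterable[int],
    window_frames: int,
) -> List[int]:
    """Frames whose VR information is extracted: a window after each loss."""
    selected = set()
    for loss in loss_frames:
        selected.update(range(loss, min(loss + window_frames, total_frames)))
    return sorted(selected)


# Losses at frames 10 and 12 produce overlapping windows; frame 90 is separate.
selected = frames_to_extract(total_frames=100, loss_frames=[10, 12, 90], window_frames=5)
nothing = frames_to_extract(total_frames=100, loss_frames=[], window_frames=5)
```

Overlapping windows are merged, so closely spaced losses do not cause the same frame to be extracted twice.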

The method illustrated in FIG. 15 is a method of reducing the data amount of VR information by starting to extract VR information from the moment when the video decoder of the set-top box detects an error of a bit stream syntax structure.

Since the video bitstream is independently encoded in GoP (group of picture) units, even if an error occurs in any frame, the error does not affect the next GoP.

That is, since the error only affects the remaining frames in the same GoP, it is sufficient to extract VR information only during that GoP period.

In general, in a broadcast application, the GoP size is 0.5 seconds, that is, 15 frames. Therefore, when an error of a bit stream syntax structure is detected once, the VR information is extracted up to 15 frames, thereby greatly reducing the data amount of the VR information.
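The GoP-bounded extraction can be sketched as follows. It assumes fixed-size GoPs aligned to frame 0 and a frame index supplied by the decoder when it detects a syntax error; both are assumptions for illustration, and `gop_extraction_range` is a hypothetical name.

```python
def gop_extraction_range(error_frame: int, gop_size: int = 15) -> range:
    """Frames from the detected error to the end of the enclosing GoP."""
    gop_start = (error_frame // gop_size) * gop_size
    return range(error_frame, gop_start + gop_size)


mid_gop = list(gop_extraction_range(37))    # error in the GoP covering frames 30..44
gop_head = list(gop_extraction_range(30))   # error in the first frame of that GoP
```

An error in the first frame of a GoP gives the worst case of 15 extracted frames; an error later in the GoP requires fewer.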

Hereinafter, the operation of each component of the system for monitoring the video quality of the IPTV in an online manner will be described with reference to FIGS. 16 to 18.

FIG. 16 is a flowchart illustrating the operation of the media server 110 in an online manner according to an embodiment of the present invention.

The media server 110 compresses the video signal and stores the compressed video stream (S1601).

After step S1601, the media server 110 extracts first VR information from the compressed video stream, that is, the encoded image (S1602).

After step S1602, the media server 110 includes the PTS (Presentation Time Stamp) value, which is time information, and the program ID, which identifies the content, in the extracted first VR information (S1603).

After step S1603, the media server 110 stores the first VR information and, when the viewer later requests the content, transmits the stored first VR information to the set-top box 120 (S1604).

FIG. 17 is a flowchart illustrating the operation of the set-top box 120 in an online manner according to an embodiment of the present invention.

The set top box 120 receives the video bit stream and the first VR information from the media server 110 (S1701).

For reference, step S1701 describes the reception of the video bit stream and the first VR information as a single step for convenience of description; however, the first VR information may be sent over a separate transmission channel different from that of the video bit stream.

In addition, the set-top box 120 may receive the first VR information in segments of a predetermined size, or, when the set-top box 120 has storage such as a hard disk, it may receive the entire first VR information at once.

After step S1701, the set-top box 120 decodes the video bit stream to generate a playback image (S1702).

After step S1702, the set-top box 120 extracts second VR information from the decoded playback image (S1703).

After step S1703, the set-top box 120 includes a presentation time stamp (PTS) value, which is time information, and a program ID, which is an identifier for identifying content, in the second VR information (S1704).

After the step S1704, the set-top box 120 synchronizes the first VR information and the second VR information, and generates a VR_Diff image (S1705).

At this time, the set-top box 120 synchronizes the first VR information and the second VR information based on the PTS value and the program ID included in each VR information.

After step S1705, the set-top box 120 compresses and stores the VR_Diff image and transmits it to the quality management server (S1706).

FIG. 18 is a flowchart illustrating the operation of the quality management server 130 in an online manner according to an embodiment of the present invention.

The quality management server 130 receives and stores the VR_Diff image from the set-top box 120 (S1801).

After step S1801, the quality management server 130 calculates an NPSNR estimate value using the VR_Diff image (S1802).

After step S1802, the quality management server 130 calculates an objective quality deterioration measure value (S1803).

In this case, the objective image quality deterioration measure for objectively measuring the quality deterioration may include a deterioration number, a threshold deterioration number, a threshold deterioration rate, an average deterioration amount, and an average threshold deterioration amount.

After step S1803, the quality management server 130 stores the objective image quality deterioration measurement value in the storage, and provides various information using the same to the quality management operator later (S1804).

Hereinafter, the operation of each component of the system for monitoring the video quality of the IPTV in an offline manner will be described with reference to FIGS. 19 to 22.

FIG. 19 is a flowchart illustrating an operation of the media server 210 in an offline manner according to an embodiment of the present invention.

The media server 210 compresses the video signal and stores the compressed video stream (S1901).

After step S1901, the media server 210 extracts first VR information from the compressed video stream, that is, the encoded image (S1902).

After step S1902, the media server 210 includes a PTS (Presentation Time Stamp) value, which is time information, and a program ID, which is an identifier for identifying content, in the extracted first VR information (S1903).

After step S1903, the media server 210 stores the first VR information, and transmits the stored first VR information to the quality management server 230 at the request of the quality management server 230 (S1904).

FIG. 20 is a flowchart illustrating the operation of the set-top box 220 in an offline manner according to an embodiment of the present invention.

The set top box 220 receives the video bit stream from the media server 210 (S2001).

After step S2001, the set-top box 220 decodes the video bit stream to generate a playback image (S2002).

After step S2002, the set-top box 220 extracts the second VR information from the decoded playback image (S2003).

After step S2003, the set-top box 220 includes a PTS (Presentation Time Stamp) value, which is time information, and a program ID, which is an identifier for identifying content, in the second VR information (S2004).

After step S2004, the set-top box 220 compresses and stores the second VR information and transmits it to the quality management server (S2005).

For reference, in the offline scheme the set-top box 220 does not need to generate a VR_Diff image. This reduces the consumption of computational resources in the set-top box 220, so the scheme has the advantage that it can run on a set-top box with lower specifications than the online scheme requires.

FIG. 21 is a flowchart illustrating the operation of the quality management server 230 in an offline manner according to an embodiment of the present invention.

The quality management server 230 receives the first VR information transmitted from the media server 210 and the second VR information transmitted from the set top box 220 (S2101).

After step S2101, the quality management server 230 synchronizes the two VR information based on the PTS value and the program ID included in the first VR information and the second VR information (S2102).

After step S2102, the quality management server 230 generates a VR_Diff image by using the difference of each VR information (S2103).

After step S2103, the quality management server 230 calculates an NPSNR estimate using the VR_Diff image (S2104).

After step S2104, the quality management server 230 calculates an objective quality deterioration measure value (S2105).

After step S2105, the quality management server 230 stores the objective quality deterioration measurement value in the storage, and provides various information using the same to the quality management operator later (S2106).

Hereinafter, a simulation result performed to verify the validity and performance of the objective image quality degradation measurement method proposed in the present invention will be described.

The video used in the experiment is an H.264 stream provided by the real IPTV business. The characteristics are as shown in Table 1. VR information is generated by extracting the pixels on the vertical line at the center of the frame.

In addition, the VR information of the encoded image was extracted from the error-free decoded playback video of the H.264 stream. The H.264 stream was then divided into packets of 1,500 bytes, packets were dropped at random at a packet loss rate of 0.1%, and the VR information of the resulting decoded video was taken as the VR information of the video played on the set-top box.

Table 1

Experimental video

	KBS Special	SEXY BACK example	Winter bird
Video type	documentary	Music Video	drama
Total frame count	1,600	4,600	6,000
Resolution (W x H)	720 x 480	720 x 480	720 x 480
Frame rate (frame / sec)	30	30	30
Encoding Rate (Mbps)	6	6	6

Performance was verified by examining how well the objective degradation measures calculated from the NPSNR of the 2D images are estimated when the estimated NPSNR obtained from VR information is used instead.

The result of estimating the NPSNR with respect to the experimental image by the technique of the present invention is shown in FIG. 22.

In FIG. 22 to FIG. 24, a section showing an NPSNR value smaller than 50 dB is a section in which degradation occurs due to packet loss.

This shows that VR information detects the points of degradation as well as the two-dimensional image frames do, and that, although the result differs slightly depending on the image, the estimated NPSNR approximates the NPSNR well overall.

As a quantitative (objective) measure of image quality, the estimated NPSNR is useful for detecting serious image quality degradation and approximating its amount, rather than for slight degradation. Slight degradation in image quality is difficult for viewers to see.

Based on this, the objective quality deterioration scale is calculated as shown in <Table 2>.

For reference, in Table 2, '2D' is a value calculated for 2D playback video, and 'VR' is an estimated value based on VR information only.

Table 2

Objective Quality Degradation Scale Estimation Performance

Measure	KBS Special		SEXY BACK example		Winter bird	
	2D	VR	2D	VR	2D	VR
Degradation count K_d (times)	11	13	29	29	34	32
Critical degradation count K_cd (times)	10	8	29	27	32	28
Degradation rate (%)	7.2	6.2	4.5	4.1	5.6	5.6
Critical degradation rate ρ_cd (%)	6.0	5.0	4.5	3.8	5.6	5.0
Average critical degradation Q_avg (dB)	36.5	38.3	28.4	27.1	33.1	33.6

On every measure, the value estimated from VR information alone is very close to the value calculated from the 2D images. From these values it can be estimated, for example, that in the 'Winter bird' video degradation occurred 32 times, of which 28 were critical, and that the viewer experienced critical degradation for 5% of the viewing time with an average critical degradation amount of 33.6 dB.

For reference, the degradation shown in <Table 2> was set deliberately for the simulation in order to evaluate the performance of the proposed method; commercial service would be impossible at such a level in practice. IPTV operators can use this kind of information as field data for network or service improvement.

Through the VR_Diff image of FIGS. 26 to 28, it is easy to visually check how much deterioration has occurred at which position among the entire frames.

Hereinafter, the subjective image quality deterioration measurement (evaluation) method in the quality management server (130, 230) will be described in detail.

For the subjective quality degradation measurement in the quality management servers 130 and 230 according to the embodiment of the present invention, the following quantities are first defined.

Subjective critical degradation interval:

The subjective critical degradation interval is defined as an interval, among the objective degradation intervals of <Equation 5>, in which degradation that is unpleasant to a human viewer occurs.

Even if the same amount of deterioration occurs, the degree to which a person is perceived differs depending on the position and characteristics of the deteriorated frame and the deterioration duration. For example, if degradation occurs in only one frame and no degradation occurs in front and rear frames, the viewer hardly recognizes the degradation.

In addition, when the motion is very intense, when the image is spatially very complex, or when only a small amount of degradation occurs, viewers tend not to perceive the degradation well.

Even if the deterioration is recognized, the subjective deterioration amount may be significantly different from the objective deterioration amount.

The subjective critical degradation interval is defined as an interval in which the NMOS (Networked Mean Opinion Score) value defined in Equation 9 below, or the MOS, which is the viewers' subjective quality evaluation value, is 3.5 or less.

Subjective deterioration amount by deterioration interval:

Here, the first term in parentheses is the mean value over the k-th objective critical deterioration interval.

In the second term in parentheses, a is the attenuation constant, and the accompanying quantity denotes the frame time for which the k-th degradation lasted, that is, the duration of the objective deterioration interval. This term reflects the phenomenon that subjective image quality deteriorates further as the degradation persists.

The third term in parentheses denotes the image complexity of the k-th deterioration interval and reflects the phenomenon that perceived image quality differs according to the temporal and spatial complexity of the image.

Note that the image complexity can be expressed as a function of the variance of the VR information and the mean squared difference, as follows.

Here, the variance is that of the VR information of the n-th frame and reflects the complexity of the n-th frame in the spatial direction. The mean squared difference is taken between the VR information of the n-th frame and that of the (n-1)-th frame and reflects the complexity in the time direction.
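The two complexity quantities described above can be sketched as follows. This is a minimal illustration, assuming the VR information is held as a list of columns with one column of sampled pixel values per frame. The function names, the data layout, the use of the population variance, and the value 0 assigned to the first frame are assumptions made for the example and are not taken from the specification.

```python
from statistics import pvariance

def spatial_complexity(vr_columns):
    """Variance of the VR samples of each frame (spatial-direction complexity)."""
    return [pvariance(col) for col in vr_columns]

def temporal_complexity(vr_columns):
    """Mean squared difference between the VR column of frame n and frame n-1."""
    if not vr_columns:
        return []
    msd = [0.0]  # assumption: frame 0 has no predecessor, so use 0
    for prev, cur in zip(vr_columns, vr_columns[1:]):
        msd.append(sum((c - p) ** 2 for p, c in zip(prev, cur)) / len(cur))
    return msd

# Three frames, four sampled pixels per frame (toy values).
vr_columns = [
    [10, 10, 10, 10],
    [10, 30, 10, 30],
    [20, 20, 20, 20],
]
print(spatial_complexity(vr_columns))
print(temporal_complexity(vr_columns))
```

In this toy input, only the second frame has spatial variation, while both frame transitions contribute to the temporal measure.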

Through repeated experiments, the attenuation constant a of <Equation 10> was set to 0.05, and the value of the image complexity variable can be set based on the criteria shown in Equation 9 below.

<Equation 11> applies a different weight according to the magnitude of the per-frame average over the deterioration interval, so that the characteristics of human visual perception according to the complexity of the image are taken into account.

Here, one quantity is the average value over all frames, and the other is the average of the VR variance values of the frames belonging to the k-th deterioration interval.

The same relationship holds for the remaining pair of quantities, except that they are computed from the mean squared difference values.

In order to quantify the deterioration of the STB reproduction quality, the following subjective deterioration measures are defined using the subjective critical deterioration interval above.

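① Critical deterioration count K_scd: the largest value of the index k of the subjective critical deterioration intervals, that is, the number of such intervals.

② Critical deterioration rate ρ_scd: the ratio of the time occupied by all subjective critical deterioration intervals to the total video sequence time.

③ Average subjective critical deterioration amount NMOS_avg: the average NMOS value, where k ranges over the deterioration intervals whose NMOS is less than or equal to 3.5.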

For reference, in the related art, objective measures that view an image as a signal waveform, such as NPSNR or PSNR, are taken to largely determine the subjective image quality of ordinary natural images. The NMOS of Equation 9 of the present invention, by contrast, is defined from the observation that the same NPSNR amount can be perceived differently depending on the characteristics or conditions of the image.

In fact, it is widely known as a characteristic of human vision that, among deterioration intervals having the same average NPSNR (or PSNR) value, deterioration is less noticeable when the screen is complicated or moves quickly.

Equation 9 takes a different approach from existing subjective quality measures, which exclude the signal waveform aspect of the image. It is based on the idea that a subjective quality measure can be modeled by appropriately weighting the objective quality amount with elements that reflect visual characteristics.

Hereinafter, the configuration of the quality management servers 130 and 230 of the present invention will be described with reference to FIG. 29. For convenience of description, the quality management server 230 will be described as an example.

FIG. 29 is a block diagram showing the configuration of the quality management server 230 according to another embodiment of the present invention.

According to another embodiment of the present invention, the quality management server 230 includes a receiver 231, a reproduction quality estimation value calculator 232, an objective degradation interval detector 233, a subjective degradation detector 234, a subjective degradation scale calculator 235, a storage 236, and a graphical user interface provider 237.

For reference, the components 231 to 237 of the quality management server illustrated in FIG. 29 are common to both the online scheme and the offline scheme. In the offline scheme, the server may further include, in addition to the components illustrated in FIG. 29, a VR_Diff generation unit (not shown) for generating a VR_Diff image.

When the quality management server 230 operates in the online scheme, the receiver 231 receives the VR_Diff image from the set-top box. In the offline scheme, the receiver 231 receives the first VR information from the media server and the second VR information from the set-top box, and the VR_Diff image is generated from them.

Thereafter, the quality management server 230 measures the degradation of the playback quality of one video content by using the VR_Diff image.

At this time, the reproduction quality estimation value calculator 232 calculates an image quality estimation value for each frame, and the objective deterioration interval detector 233 detects objective deterioration intervals by using the image quality estimation values. Specifically, an interval that starts and ends where the image quality estimation value departs from its maximum value (50 dB), that is, an interval in which the value is less than 50 dB, may be detected as an objective deterioration interval.

Since the objective deterioration interval detection has been described above with reference to <Equation 5>, a detailed description thereof will be omitted.
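The interval detection described above can be sketched as a single pass over the per-frame image quality estimation values. This is a minimal illustration, assuming the values are already available as a list in dB and that an interval is reported as an inclusive pair of frame indices. The function name and the return format are hypothetical, and the details of <Equation 5> are not reproduced here.

```python
def detect_objective_intervals(quality_db, max_db=50.0):
    """Return (start, end) frame index pairs, end inclusive, where quality < max_db."""
    intervals = []
    start = None
    for n, q in enumerate(quality_db):
        if q < max_db and start is None:
            start = n                      # deterioration begins
        elif q >= max_db and start is not None:
            intervals.append((start, n - 1))  # deterioration ended on the previous frame
            start = None
    if start is not None:                  # sequence ends while still degraded
        intervals.append((start, len(quality_db) - 1))
    return intervals

quality = [50, 50, 42, 38, 50, 50, 47, 50]
print(detect_objective_intervals(quality))  # [(2, 3), (6, 6)]
```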

Meanwhile, the subjective deterioration detection unit 234 calculates a subjective deterioration amount for each of the objective deterioration sections detected by the objective deterioration section detection unit 233, and detects a section in which the subjective deterioration amount is less than or equal to a predetermined value as the subjective critical deterioration section.

To this end, the subjective deterioration detection unit 234 calculates the subjective deterioration amount for each objective deterioration section by using the time for which the objective deterioration section is continued and the temporal and spatial complexity of the image.

Since the subjective deterioration amount has been described above with reference to <Equation 9> and the subjective critical deterioration section has been described above, a detailed description thereof will be omitted.

For reference, the subjective critical degradation section may be defined as a section in which NMOS (Networked Mean Opinion Score) value defined by Equation 9 is 3.5 or less, or MOS, which is a subjective quality evaluation value of viewers, is 3.5 or less.

Meanwhile, the subjective deterioration scale calculator 235 calculates a subjective deterioration measure including at least one of a critical deterioration count, a critical deterioration rate, and an average subjective critical deterioration amount for the subjective critical deterioration intervals detected by the subjective deterioration detection unit 234.

Here, the critical deterioration count is the number of subjective critical deterioration intervals occurring within the total video sequence time.

In addition, the threshold degradation rate is the ratio of time occupied by all subjective critical degradation intervals in the total video sequence time.

Also, the average subjective critical deterioration amount is the average NMOS value of the frames constituting the subjective critical deterioration interval having an NMOS value of 3.5 or less.
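The three measures can be illustrated with a short sketch. It assumes that the NMOS value of each objective deterioration interval has already been computed (the formula of <Equation 9> is not reproduced here), that intervals are given as inclusive frame index pairs, and that the frame rate is constant so that a ratio of frame counts equals a ratio of times. The function name, the tuple layout, and the unweighted per-interval averaging are assumptions made for the example.

```python
def subjective_measures(intervals, total_frames, threshold=3.5):
    """intervals: list of (start_frame, end_frame_inclusive, nmos) tuples."""
    # Keep only intervals whose NMOS is at or below the critical threshold.
    critical = [iv for iv in intervals if iv[2] <= threshold]
    count = len(critical)
    critical_frames = sum(end - start + 1 for start, end, _ in critical)
    rate_percent = 100.0 * critical_frames / total_frames
    # Unweighted mean over intervals (assumption); None if nothing is critical.
    avg_nmos = sum(nmos for _, _, nmos in critical) / count if count else None
    return count, rate_percent, avg_nmos

# Toy input: three objective intervals in a 3,000-frame sequence.
intervals = [(100, 129, 2.5), (400, 402, 4.2), (900, 959, 3.0)]
count, rate, avg = subjective_measures(intervals, total_frames=3000)
print(count, rate, avg)
```

With this toy input the second interval is excluded because its NMOS exceeds 3.5, so only two intervals contribute to the count, the rate, and the average.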

For reference, in the present invention the subjective deterioration scale calculator 235 is described as calculating all of the subjective deterioration measures, but a separate component may instead be provided for each measure so that each measure is calculated by its own component (for example, the critical deterioration count is calculated by a deterioration count calculation unit).

Meanwhile, the storage 236 stores the values for subjective quality deterioration evaluation calculated by each of the above components. In addition, the storage 236 stores the VR_Diff image received from the set-top box in the online scheme, and in the offline scheme stores the VR_Diff image generated from the first VR information and the second VR information received from the media server and the set-top box.

Meanwhile, the graphic user interface providing unit 237 provides various information through a graphic user interface (hereinafter referred to as a GUI) so that the quality management operator can view the quality information when necessary.

The information provided by the GUI providing unit 237 includes an NMOS graph and various subjective image quality degradation measure values, VR information (first VR information) of an encoded image, VR information (second VR information) of an image reproduced in a set-top box, and VR_Diff image between VR information.

For reference, the flowchart illustrated in FIG. 30 is a flowchart of an operation mainly performed by the quality control server 230 in an offline manner.

The quality management server 230 calculates a VR difference based on the first VR information on the encoded image received from the media server 210 and the second VR information on the decoded playback image received from the set-top box 220 (S3001).
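A VR difference computation of this kind can be sketched as follows. The sketch assumes that each VR data set is a mapping from a PTS value to the list of pixel samples for that frame, that frames are matched on equal PTS values, and that the difference is the per-sample absolute difference. The function name, the data layout, and the choice to skip frames missing from either side are assumptions for illustration only.

```python
def vr_difference(first_vr, second_vr):
    """first_vr, second_vr: dicts mapping a PTS value to a list of pixel samples."""
    diff = {}
    # Only frames present in both data sets (same PTS) are compared.
    for pts in sorted(first_vr.keys() & second_vr.keys()):
        ref, rec = first_vr[pts], second_vr[pts]
        diff[pts] = [abs(a - b) for a, b in zip(ref, rec)]
    return diff

first = {0: [16, 32, 64], 3003: [16, 40, 64], 6006: [20, 40, 60]}
second = {0: [16, 32, 64], 3003: [16, 90, 64]}
print(vr_difference(first, second))  # {0: [0, 0, 0], 3003: [0, 50, 0]}
```

A frame that never reaches the playback side, such as PTS 6006 in this toy input, is simply omitted here; a real system would need a policy for such frames.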

After step S3001, the quality management server 230 calculates an image quality estimation value of the image reproduced in the set-top box 220 using the VR difference (S3002).

After step S3002, the quality management server 230 detects an objective deterioration interval by using the image quality estimation value (S3003).

In this case, the quality management server 230 may detect, as an objective deterioration interval, an interval that starts and ends where the image quality estimation value is no longer at its maximum value.

After step S3003, the quality control server 230 calculates a subjective deterioration amount for each objective deterioration section detected in step S3003, and detects a section in which the subjective deterioration amount is less than or equal to a predetermined value as a subjective critical deterioration section (S3004).

At this time, the quality management server 230 calculates the subjective deterioration amount of each deterioration interval by applying, as weights, the phenomenon that subjective image quality deteriorates as the deterioration persists and the phenomenon that perceived image quality differs according to the temporal and spatial complexity of the image.

After step S3004, the quality management server 230 calculates a subjective deterioration measure including at least one of the critical deterioration count, the critical deterioration rate, and the average subjective critical deterioration amount for the subjective critical deterioration intervals (S3005).

After step S3005, the quality management server 230 stores a value for subjective quality deterioration evaluation in the storage, and then provides various information using the same to the quality control operator terminal (S3006).

Hereinafter, a simulation result performed to verify the validity and performance of the subjective image quality degradation measurement method proposed in the present invention will be described.

In order to verify the feasibility and performance of the proposed method, a simulation was performed on the five H.264 streams listed in Table 3.

TABLE 3

Test video

Test video	Fight science	Great africa	Super riding	KBS	Sexy back
Video type	Culture	Documentary	Hobbies	TV shows, documentaries	Music video
Video content	Science in martial arts	African grassland ecology	Supercar introduction and test scene	KBS Special, Burning Sungnyemun	Park Jin Young's Sexy Back
Video features	Graphics and photorealistic video coexist, fast movement	Few scene transitions, slow global motion	Intense movement, global movement due to camera shake	Complex scenes at a fire site, camera shake	Fast transitions, fast movements

Each stream is SD (720x480) video with a GOP (group of pictures) size of 15 frames, a frame rate of 30 frames per second, and a constant bit rate (CBR) of 2.5 Mbps. In addition, each video consists of several scenes, and the experiment was conducted on the first 3,000 frames of each video.

The open source software ffmpeg was used as the H.264 decoder. The VR information is generated by extracting the pixels on the vertical line at the center of each frame. The VR information of the encoded image is extracted by decoding the H.264 stream without errors.
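The center-line sampling described above can be sketched as follows. This is a minimal illustration, assuming each decoded frame is available as a list of rows of luma values. The function names and the frame representation are hypothetical, and obtaining the decoded frames from the decoder is outside the scope of the sketch.

```python
def extract_vr_column(frame):
    """Return the pixels on the central vertical line of one frame.

    frame: list of rows, each row a list of luma values (assumed layout).
    """
    center = len(frame[0]) // 2  # column index of the vertical center line
    return [row[center] for row in frame]

def build_vr(frames):
    """Stack one sampled column per frame; the result is the VR image, frame by frame."""
    return [extract_vr_column(f) for f in frames]

# Two toy frames of 2 rows x 3 columns.
frames = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
print(build_vr(frames))  # [[2, 5], [8, 11]]
```

For a 720x480 frame this yields 480 samples per frame taken from column 360, so 3,000 frames reduce to a 480 by 3,000 image.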

After that, the H.264 stream is divided into 1,500-byte packets, and packets are randomly dropped at a packet loss rate of 0.1% in order to emulate network transmission errors. The VR information extracted from the video played back by the decoder is then used as the VR information of the STB playback video.
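The loss emulation step can be sketched as follows. This is a minimal illustration, assuming the stream is available as a byte string and that each packet is dropped independently with the stated probability. The function names and the seeded random generator are assumptions for the example, and how the decoder handles the resulting gaps is not modeled.

```python
import random

def packetize(stream, packet_size=1500):
    """Split a byte string into fixed-size packets (the last one may be shorter)."""
    return [stream[i:i + packet_size] for i in range(0, len(stream), packet_size)]

def drop_packets(packets, loss_rate=0.001, seed=0):
    """Drop each packet independently with probability loss_rate."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    return [p for p in packets if rng.random() >= loss_rate]

stream = bytes(3_000_000)          # placeholder payload, not real H.264 data
packets = packetize(stream)
received = drop_packets(packets)
print(len(packets), len(packets) - len(received))  # total packets, packets lost
```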

Then, we examined how well the NMOS value calculated by <Equation 6> using VR information reflects the MOS value of actual viewers for performance verification.

In order to obtain the MOS values of the viewers, which are subjectively perceived quality values, 10 test viewers were shown the decoded original H.264 streams and the five videos whose screens were degraded by network transmission errors.

In each case, the MOS value between 1 and 5 was recorded based on the criteria as shown in <Table 4>.

Table 4

MOS value calculation standard

MOS value	Degree of perceived degradation
5	No deterioration even when playback is stopped
4	Almost no deterioration in normal playback
3	Deterioration slightly visible in normal playback
2	Deterioration somewhat noticeable in normal playback
1	Serious deterioration in normal playback

FIG. 31 illustrates the correlation between the MOS values recorded by viewers for all degradation periods and the NMOS values calculated using Equations 9, 11, and 13 according to an embodiment of the present invention.

Referring to FIG. 31, it can be seen that a relatively high correlation exists between two values, that is, MOS and NMOS values. This result means that the MOS value, which is the subjective image quality of viewers, can be approximately calculated using VR information.

On the other hand, Table 5 shows the performance estimation results for the subjective deterioration measure according to an embodiment of the present invention.

The critical deterioration count is the number of deteriorations perceived by the viewers when each test video is played at normal speed. It can be seen that the number of times the viewers felt that the screen had deteriorated due to network transmission errors agrees well with the count obtained from the NMOS values of <Equation 9>.

In the experiment, critical deterioration is assumed to occur when the average NMOS value or MOS value of a deterioration interval is 3.5 or less. The critical deterioration rates obtained from the two values are very similar, and the average critical deterioration amounts also show that the NMOS closely estimates the MOS.

Table 5

Subjective Quality Degradation Estimation Performance

Test video	Fight science		Great africa		Super riding		KBS		Sexy back
Category	MOS	NMOS	MOS	NMOS	MOS	NMOS	MOS	NMOS	MOS	NMOS
Critical degradation count K_scd (times)	17	16	12	13	16	16	17	17	13	11
Critical degradation rate ρ_scd (%)	4.83	4.43	4.0	4.2	5.1	5.1	5.3	4.93	3.77	3.8
Average critical degradation amount (NMOS_avg)	2.49	2.46	2.97	3.0	2.8	2.6	2.24	2.45	2.38	2.38

In the category of the above table, MOS means subjective observations of the viewers, and NMOS means the result calculated by <Equation 9>.
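The agreement between the MOS and NMOS columns of Table 5 can be summarized with a short calculation. The sketch below copies the values from Table 5 and computes the mean absolute difference for each measure. The dictionary layout, the key names, and the choice of mean absolute difference as a summary are assumptions for illustration and are not part of the described method.

```python
# (MOS, NMOS) pairs copied from Table 5.
table5 = {
    "Fight science": {"count": (17, 16), "rate": (4.83, 4.43), "avg": (2.49, 2.46)},
    "Great africa":  {"count": (12, 13), "rate": (4.0, 4.2),   "avg": (2.97, 3.0)},
    "Super riding":  {"count": (16, 16), "rate": (5.1, 5.1),   "avg": (2.8, 2.6)},
    "KBS":           {"count": (17, 17), "rate": (5.3, 4.93),  "avg": (2.24, 2.45)},
    "Sexy back":     {"count": (13, 11), "rate": (3.77, 3.8),  "avg": (2.38, 2.38)},
}

def mean_abs_difference(measure):
    """Mean of |MOS - NMOS| over the five test videos for one measure."""
    diffs = [abs(mos - nmos) for mos, nmos in (row[measure] for row in table5.values())]
    return sum(diffs) / len(diffs)

for measure in ("count", "rate", "avg"):
    print(measure, round(mean_abs_difference(measure), 3))
```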

Combined with the objective image quality measures and direct visual inspection of the VR image information, the subjective image quality measures above can approximate sufficiently well the amount of image quality deterioration of the STB playback image caused by errors occurring during network transmission. They are therefore effective for monitoring the quality perceived by individual viewers in an STB environment where computing resources are scarce.

The foregoing description of the present invention is intended for illustration, and it will be understood by those skilled in the art that the present invention may be easily modified into other specific forms without changing the technical spirit or essential features of the present invention.

Therefore, it should be understood that the embodiments described above are exemplary in all respects and not restrictive.

For example, each component described as a single type may be implemented in a distributed manner, and similarly, components described as distributed may be implemented in a combined form.

The scope of the present invention is defined by the following claims rather than by the above description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as being included in the scope of the present invention.

Claims

A method for monitoring image quality, the method comprising:

(a) receiving first visual rhythm information of an encoded image from a first server;

(b) extracting second visual rhythm information from a reproduced image obtained by receiving the encoded image over a network and decoding it;

(c) calculating a visual rhythm difference based on the first visual rhythm information and the second visual rhythm information; and

(d) transmitting the calculated visual rhythm difference to a second server,

wherein the second server measures image quality deterioration of the reproduced image using the visual rhythm difference.
The method of claim 1,

wherein the first visual rhythm information and the second visual rhythm information are formed by partially sampling the pixels of each two-dimensional image frame into one-dimensional information and by sampling pixels at the same positions in successive image frames along the time axis, thereby projecting three-dimensional image information onto two-dimensional information.
The method of claim 2,

wherein the partially sampled pixels are located along at least one of the vertical, diagonal, and horizontal directions of the two-dimensional image frame.
The method of claim 1,

wherein the second server calculates a reproduction quality estimation value of the reproduced image by using the visual rhythm difference, and calculates an objective quality degradation measure including at least one of the number of degradations, the number of threshold degradations, a threshold degradation rate, an average threshold degradation amount, and an average degradation amount of the reproduced image.
The method of claim 1,

wherein the first visual rhythm information received in step (a) and the second visual rhythm information each include a presentation time stamp (PTS) value, which is time information, and a program ID, which is an identifier for identifying content.
The method of claim 5,

wherein step (c) comprises:

(c-1) synchronizing the first visual rhythm information and the second visual rhythm information based on the PTS value and the program ID; and

(c-2) calculating the visual rhythm difference by using the difference between the first visual rhythm information and the second visual rhythm information.
The method of claim 1,

wherein the first visual rhythm information and the second visual rhythm information are compressed using at least one of reduction by temporal subsampling, reduction by spatial subsampling, reduction by lossless compression, extraction based on RTP packet loss, and extraction based on bitstream error detection.
A method for monitoring image quality, the method comprising:

(a) receiving, from a media server, first visual rhythm information about an encoded image;

(b) receiving, from a set-top box, second visual rhythm information about a decoded playback image;

(c) calculating a visual rhythm difference based on the first visual rhythm information and the second visual rhythm information; and

(d) measuring image quality deterioration of the playback image using the visual rhythm difference.
The method of claim 8,

wherein step (d) comprises:

(d-1) calculating an image quality estimation value of the playback image,

wherein the image quality estimation value expresses, by means of visual rhythm information, the image quality of the playback image relative to the encoded image.
The method of claim 9,

wherein step (d) further comprises:

(d-2) calculating at least one of the number of degradations, the number of threshold degradations, a threshold degradation rate, an average threshold degradation amount, and an average degradation amount of the playback image.
A device for monitoring image quality, the device comprising:

a receiver configured to receive, from a set-top box, a visual rhythm difference obtained from the difference between first visual rhythm information, which is visual rhythm information on an encoded image, and second visual rhythm information, which is visual rhythm information on the image reproduced in the set-top box;

an image quality estimation value calculator configured to calculate an image quality estimation value of the image reproduced in the set-top box using the visual rhythm difference;

an objective image quality degradation scale calculator configured to calculate an objective image quality degradation measure including at least one of the number of degradations, the number of threshold degradations, a threshold degradation rate, an average threshold degradation amount, and an average degradation amount of the reproduced image; and

a graphical user interface configured to provide, at a user's request, image quality monitoring information based on the image quality estimation value and the objective image quality degradation measure.
The device of claim 11,

wherein the image quality monitoring information includes at least one of a graph of the image quality estimation value, the image quality measurement value, the first visual rhythm information, the second visual rhythm information, and the visual rhythm difference.
The device of claim 11,

wherein, depending on the monitoring scheme, the receiver receives the first visual rhythm information from a media server and receives the second visual rhythm information from the set-top box.
The device of claim 13,

further comprising a visual rhythm difference calculator configured to generate, depending on the monitoring scheme, the visual rhythm difference based on the first visual rhythm information and the second visual rhythm information.
The device of claim 14,

wherein the visual rhythm difference calculator synchronizes the first visual rhythm information and the second visual rhythm information by using a presentation time stamp (PTS) value, which is time information included in the first visual rhythm information and the second visual rhythm information, and a program ID that identifies content.
The device of claim 15,

wherein the visual rhythm difference calculator calculates the visual rhythm difference using the difference between the synchronized first visual rhythm information and second visual rhythm information.
A set-top box for monitoring image quality, the set-top box comprising:

a video decoder configured to decode an encoded image received from a first server through a network to generate a reproduced image;

a visual rhythm information extraction unit configured to extract second visual rhythm information from the reproduced image;

a visual rhythm difference calculator configured to calculate a visual rhythm difference based on the second visual rhythm information and first visual rhythm information on the encoded image received from the first server; and

a visual rhythm difference transmitter configured to transmit the visual rhythm difference to a second server that measures image quality deterioration of the reproduced image by using the visual rhythm difference.
The set-top box of claim 17,

wherein the visual rhythm difference calculator synchronizes the first visual rhythm information and the second visual rhythm information based on a presentation time stamp (PTS) value, which is time information included in the first visual rhythm information and the second visual rhythm information, and a program ID that identifies content, and calculates the visual rhythm difference by using the difference between the first visual rhythm information and the second visual rhythm information.
The set-top box of claim 17,

wherein the second server calculates a reproduction quality estimation value of the reproduced image by using the visual rhythm difference, and calculates an objective quality degradation measure including at least one of the number of degradations, the number of threshold degradations, a threshold degradation rate, an average threshold degradation amount, and an average degradation amount of the reproduced image.
The set-top box of claim 17,

wherein the first visual rhythm information and the second visual rhythm information are compressed using at least one of reduction by temporal subsampling, reduction by spatial subsampling, reduction by lossless compression, extraction based on RTP packet loss, and extraction based on bitstream error detection.
A device for monitoring image quality, the device comprising:

a playback image quality estimation value calculator configured to calculate an image quality estimation value of a playback image using a visual rhythm difference calculated from the difference between first visual rhythm information on a reference image and second visual rhythm information on the playback image;

an objective deterioration interval detection unit configured to detect, as objective deterioration intervals, one or more intervals in which the image quality estimation value of the playback image is less than a predetermined reference value; and

a subjective deterioration detection unit configured to calculate a subjective deterioration amount for each of the detected one or more objective deterioration intervals by using the duration of the objective deterioration interval and the complexity of the image.
The device of claim 21,

wherein the reference image is an encoded image received from a media server, and the playback image is a decoded image received from a set-top box.
The device of claim 21,

wherein the complexity of the image includes temporal complexity and spatial complexity.
The device of claim 21,

wherein the subjective deterioration detection unit detects, as a subjective critical deterioration interval, an interval in which the subjective deterioration amount is equal to or less than a predetermined reference value.
The device of claim 24,

further comprising a subjective deterioration scale calculator configured to calculate a subjective deterioration measure including at least one of a critical deterioration count, a critical deterioration rate, and an average subjective critical deterioration amount for the subjective critical deterioration interval.
The device of claim 25,

wherein the subjective deterioration scale calculator calculates the number of subjective critical deterioration intervals as the critical deterioration count, and the number of subjective critical deterioration intervals is less than or equal to the number of objective deterioration intervals.
The device of claim 25,

wherein the subjective deterioration scale calculator calculates the critical deterioration rate as the ratio of the time occupied by all subjective critical deterioration intervals to the entire video sequence time.