WO2018153161A1 - Video quality evaluation method, apparatus and device, and storage medium - Google Patents

Video quality evaluation method, apparatus and device, and storage medium

Info

Publication number
WO2018153161A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block
pixel
video
gradient
distortion metric
Prior art date
Application number
PCT/CN2017/119261
Other languages
French (fr)
Chinese (zh)
Inventor
刘祥凯
Original Assignee
深圳市中兴微电子技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市中兴微电子技术有限公司
Publication of WO2018153161A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N17/00Diagnosis, testing or measuring for television systems or their details
    • H04N17/02Diagnosis, testing or measuring for television systems or their details for colour television signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/137Motion inside a coding unit, e.g. average field, frame or block difference
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion

Definitions

  • The present invention relates to the field of multimedia information processing, and in particular to a video quality evaluation method, apparatus, device, and storage medium.
  • Video quality evaluation can be divided into two categories, subjective video quality evaluation and objective video quality evaluation.
  • Subjective video quality evaluation refers to organizing testers to view a set of distorted videos according to a prescribed experimental procedure and to subjectively score the quality of each video.
  • The scores for each test video may be averaged to obtain a Mean Opinion Score (MOS), or the difference between the score of each test video and the score of its corresponding original reference video may be used to compute a Difference Mean Opinion Score (DMOS).
  • Subjective video quality evaluation can get the result closest to the human eye's true visual perception quality, but the experiment is time consuming and laborious and cannot be applied to real-time video compression and processing systems.
  • The objective video quality evaluation algorithm can automatically predict the quality of a video and is therefore more practical. In order to evaluate whether an objective video quality evaluation algorithm can accurately predict the true visual perception quality of the human eye, a relatively comprehensive and complete data set is needed for test verification.
  • the contribution of subjective video quality evaluation is to establish a public test video data set and provide corresponding MOS or DMOS data for testing the performance of different objective video quality evaluation algorithms.
  • Objective video quality evaluation algorithms can be roughly divided into three categories according to whether the original reference video is needed during computation: Full Reference (FR), Reduced Reference (RR), and No Reference (NR) video quality evaluation algorithms.
  • The most straightforward way to calculate the distortion of a video image is to compare the original image with the distorted image pixel by pixel, for example with the most basic image distortion metrics, Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR). In practice, however, direct pixel-by-pixel comparison does not reflect how the human eye perceives distortion in video images, so algorithms that extract and compare certain pixel statistical features of the original and distorted images have appeared, such as the Structural Similarity Index Measurement (SSIM).
  • In essence, the basic process of a full reference video quality evaluation algorithm is to extract multiple visual statistical features of the original video image and the distorted video image to form feature vectors, and to estimate the degree of distortion of the video image by comparing the distance between the feature vectors.
  • Current full reference video quality evaluation algorithms also include the Video Quality Model (VQM) algorithm and the MOtion-based Video Integrity Evaluation index (MOVIE) algorithm.
  • A disadvantage of existing full reference video quality algorithms is that it is difficult to satisfy both quality prediction accuracy and low computational complexity. Because its calculation is simple, PSNR is widely used in video image processing systems with high real-time requirements, such as video coding; however, experimental results show that the correlation between the video quality score calculated by PSNR and the real subjective score is poor.
  • Advanced video quality evaluation algorithms such as VQM and MOVIE can effectively predict video quality and approximate the subjective perceived quality score of the human eye.
  • the calculation of the VQM and MOVIE algorithms is extremely complicated and can only be applied to offline calculation of video quality. Since the video encoder needs to calculate the distortion of the reconstructed image in real time during the encoding process and select the encoding parameters based on this, the video quality evaluation algorithms such as VQM and MOVIE cannot be applied to the video encoder.
  • The embodiments of the present invention provide a video quality evaluation method, apparatus, device, and storage medium, which reduce the computational complexity while ensuring the accuracy of video quality estimation, and can therefore be applied to video image processing systems with high real-time requirements, such as video coding.
  • An embodiment of the present invention provides a video quality evaluation method, where the method includes:
  • dividing each frame of image in the video on which quality evaluation needs to be performed into image blocks of a preset size;
  • determining a distortion metric value of each image block according to the standard deviation of the space-time domain gradient of the image block and the mean square error of its pixel values;
  • determining a distortion metric value of each frame of image according to the distortion metric values of the image blocks included in that frame; and
  • determining a distortion metric value of the video according to the distortion metric values of the frames.
  • An embodiment of the present invention provides a video quality evaluation apparatus, where the apparatus includes:
  • a dividing module, configured to divide each frame of image in the video that needs quality evaluation into image blocks of a preset size;
  • a first determining module configured to determine a distortion metric value of each image block according to a standard deviation of a space time domain gradient of each image block and a mean square error of the pixel value
  • a second determining module configured to determine a distortion metric value of each frame image according to a distortion metric value of the image block included in each frame image
  • a third determining module configured to determine a distortion metric value of the video according to the distortion metric of each frame image.
  • An embodiment of the present invention provides a video quality evaluation device, including a memory, a processor, and a computer program stored on the memory and operable on the processor, where the processor, when executing the program, implements the video quality evaluation method described above.
  • Embodiments of the present invention provide a computer readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the video quality evaluation method described above.
  • An embodiment of the present invention provides a video quality evaluation method, apparatus, device, and storage medium. The method includes: first, dividing each frame of image in the video that needs quality evaluation into image blocks of a preset size; then determining a distortion metric value of each image block according to the standard deviation of its space-time domain gradient and the mean square error of its pixel values; then determining a distortion metric value of each frame of image according to the distortion metric values of the image blocks it contains; and finally determining a distortion metric value of the video according to the distortion metric values of the frames.
  • FIG. 1 is a schematic flowchart of an implementation process of a video quality evaluation method according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of an implementation process of a video quality evaluation method according to Embodiment 2 of the present invention
  • FIG. 3 is a first template for calculating the horizontal gradient of a pixel according to Embodiment 2 of the present invention;
  • FIG. 4 is a second template for calculating the vertical gradient of a pixel according to Embodiment 2 of the present invention;
  • FIG. 5 is a third template for calculating a time domain gradient of a pixel according to Embodiment 2 of the present invention.
  • FIG. 6 is a scatter plot of the correlation between the video distortion scores calculated by the video quality evaluation method provided in Embodiment 3 of the present invention and the subjective experimental DMOS scores of the LIVE data set;
  • FIG. 7 is a scatter plot of the correlation between the video distortion scores calculated by the VQM method and the subjective experimental DMOS scores of the LIVE data set;
  • FIG. 8 is a scatter plot of the correlation between the video distortion scores calculated by the PSNR method and the subjective experimental DMOS scores of the LIVE data set;
  • FIG. 9 is a scatter plot of the correlation between the video distortion scores calculated by the SSIM method and the subjective experimental DMOS scores of the LIVE data set;
  • FIG. 10 is a schematic structural diagram of a video quality evaluation apparatus according to Embodiment 4 of the present invention.
  • FIG. 11 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention.
  • the embodiment of the present invention provides a video quality evaluation method, which is applied to a video quality evaluation device, and the video quality evaluation device includes, but is not limited to, a terminal such as a computer, a tablet computer, or a smart phone.
  • FIG. 1 is a schematic flowchart of an implementation process of a video quality evaluation method according to Embodiment 1 of the present invention. As shown in FIG. 1, the method includes the following steps:
  • Step S101 dividing each image frame in the video that needs to be subjected to quality evaluation into image blocks according to a preset size
  • the size of the image block can be set according to actual needs, and is generally set to an image block with the same number of rows and columns, such as an image block set to 8*8 or 16*16.
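  • As an illustration of this step, the sketch below splits a single luma frame (a 2-D array) into non-overlapping square blocks of a preset size. NumPy, the function name, and the handling of border pixels are illustrative choices, not part of the patent.

```python
import numpy as np

def split_into_blocks(frame, block_size=16):
    """Split a 2-D frame into non-overlapping block_size x block_size blocks.

    Rows and columns that do not fill a whole block are discarded here
    for simplicity; a real implementation could pad the frame instead.
    """
    h, w = frame.shape
    h_crop = (h // block_size) * block_size
    w_crop = (w // block_size) * block_size
    blocks = (frame[:h_crop, :w_crop]
              .reshape(h_crop // block_size, block_size,
                       w_crop // block_size, block_size)
              .swapaxes(1, 2))                 # shape: (rows, cols, bs, bs)
    return blocks

frame = np.random.randint(0, 256, size=(720, 1280), dtype=np.uint8)
print(split_into_blocks(frame, 16).shape)      # (45, 80, 16, 16)
```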
  • Step S102 determining a distortion metric value of each image block according to the standard deviation of the space-time domain gradient of the image block and the mean square error of its pixel values;
  • step S102 further includes:
  • Step S102a determining the space-time domain gradient of each pixel in the image block;
  • the image block includes at least one pixel.
  • In step S102a, the horizontal gradient, vertical gradient, and time-domain gradient of each pixel in the image block are calculated first, and the space-time domain gradient of each pixel is then calculated from these three components according to formula (1-1).
  • Formula (1-1) combines the horizontal gradient, the vertical gradient, and the time-domain gradient of a pixel into its space-time domain gradient.
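  • As an illustration, the sketch below combines the three per-pixel gradient components. The exact expression of formula (1-1) is not reproduced in this text, so the Euclidean magnitude used here is an assumption; NumPy is an illustrative choice.

```python
import numpy as np

def spatio_temporal_gradient(g_h, g_v, g_t):
    """Combine per-pixel horizontal, vertical and time-domain gradients.

    Assumes formula (1-1) is the Euclidean magnitude of the three
    components; the patent defines the quantities involved but the
    exact expression is not reproduced in the extracted text.
    """
    g_h = g_h.astype(np.float64)
    g_v = g_v.astype(np.float64)
    g_t = g_t.astype(np.float64)
    return np.sqrt(g_h ** 2 + g_v ** 2 + g_t ** 2)
```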
  • Step S102b determining the standard deviation of the space-time domain gradient of the image block;
  • Step S102c determining a mean square error of a pixel value of the image block
  • Step S102d Determine a distortion metric value of the image block according to a standard deviation of a space time domain gradient of the image block and a mean square error of the pixel value.
  • the distortion metric value of the image block is determined according to the formula (1-2).
  • In formula (1-2), D is the distortion metric value of the image block, MSE is the mean square error of the pixel values of the image block, and σ is the standard deviation of the space-time domain gradient of the image block.
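  • Formula (1-2) is not reproduced in the extracted text; the description of Embodiment 3 below states that the MSE is divided by the space-time domain gradient standard deviation and a logarithm is taken. The sketch below follows that description; the epsilon guard and the base-10 logarithm are assumptions added only to keep the expression well defined.

```python
import numpy as np

def block_distortion(mse, sigma, eps=1e-6):
    """Distortion metric value of one image block.

    Follows the textual description of formula (1-2): divide the block
    MSE by the standard deviation sigma of its space-time domain
    gradient and take a logarithm.  eps and the log base are
    illustrative assumptions, not taken from the patent.
    """
    return float(np.log10(mse / (sigma + eps) + eps))
```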
  • The human eye is not sensitive to distortion in edge or texture regions of an image, but is sensitive to distortion in flat regions. The human eye is also not sensitive to distortion in fast-moving video images. Therefore, to match these visual characteristics of the human eye, the distortion of an image block can be divided by the standard deviation of its space-time domain gradient values on the basis of the original distortion measure, thereby reflecting the fact that human visual perception of distortion is less sensitive for content with high spatio-temporal complexity. In this way, not only can the accuracy of the video quality estimation be guaranteed, but the computational complexity is also reduced.
  • Step S103 determining a distortion metric value of each frame image according to a distortion metric value of an image block included in each frame image in the video;
  • Here, the average value of the distortion metric values of all the image blocks included in a frame is determined as the distortion metric value of that frame image.
  • Step S104 Determine a distortion metric value of the video according to a distortion metric value of each frame image included in the video.
  • the average value of the distortion metric values of the images of all the frames included in the video is determined as the distortion metric value of the video.
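  • A minimal sketch of the aggregation in steps S103 and S104: the frame-level distortion is the mean of the block distortions of that frame, and the video-level distortion is the mean of the frame distortions. The nested-list input format is an illustrative choice.

```python
import numpy as np

def frame_distortion(block_distortions):
    """Mean distortion of all blocks in one frame (step S103)."""
    return float(np.mean(block_distortions))

def video_distortion(per_frame_block_distortions):
    """Mean of the per-frame distortions (step S104).

    per_frame_block_distortions: list of 1-D arrays, one per frame,
    each holding the distortion metric values of that frame's blocks.
    """
    frame_scores = [frame_distortion(d) for d in per_frame_block_distortions]
    return float(np.mean(frame_scores))
```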
  • In this embodiment, the method includes: first, dividing each frame of image in the video that needs quality evaluation into image blocks of a preset size; then determining the distortion metric value of each image block according to the standard deviation of its space-time domain gradient and the mean square error of its pixel values; further determining the distortion metric value of each frame of image according to the distortion metric values of the image blocks it contains; and finally determining the distortion metric value of the video according to the distortion metric values of the frames.
  • the embodiment of the present invention further provides a video quality evaluation method, which is applied to a video quality evaluation apparatus, and the video quality evaluation apparatus includes, but is not limited to, a terminal such as a computer, a tablet computer, or a smart phone.
  • FIG. 2 is a schematic flowchart of an implementation process of a video quality evaluation method according to Embodiment 2 of the present invention. As shown in FIG. 2, the method includes the following steps:
  • Step S201 acquiring a first video and a second video.
  • The first video is the video that requires quality evaluation, that is, the video in which distortion has occurred; the second video is the original video corresponding to it, that is, the video in which no distortion has occurred.
  • Acquiring the first video includes acquiring the pixel values of all pixels in each frame of image of the first video, and acquiring the second video includes acquiring the pixel values of all pixels in each frame of image of the second video.
  • Step S202 dividing each frame image of the first video and the second video into image blocks according to a preset size
  • When the first video and the second video are divided, the same block size is used for both; for example, both videos are divided into image blocks of size 4*4.
  • Step S203 determining the standard deviation of the space-time domain gradient of each image block in the first video;
  • step S203 further includes:
  • Step S203a determining, according to the preset first template, a horizontal gradient of each pixel in each image block in the first video;
  • the first template is a template for calculating a horizontal gradient of a pixel
  • the first template is as shown in FIG. 3
  • The first column of the first template holds the weights of the column to the left of the pixel to be calculated, the second column holds the weights of the column containing the pixel, and the third column holds the weights of the column to the right of the pixel.
  • f(k, i, j) is the pixel value at position (i, j) in the kth frame of the first video.
  • Step S203b determining, according to the preset second template, a vertical gradient of each pixel in each image block in the first video
  • the second template is a template for calculating a vertical gradient of a pixel
  • the second template is as shown in FIG. 4
  • The first row of the second template holds the weights of the row above the pixel to be calculated, the second row holds the weights of the row containing the pixel, and the third row holds the weights of the row below the pixel.
  • f(k, i, j) is the pixel value at position (i, j) in the kth frame of the first video.
  • Step S203c Calculate a time domain gradient of each pixel in each image block in the first video according to a preset third template.
  • The third template is a template for calculating the time-domain gradient of a pixel, and is shown in FIG. 5. It consists of three 3×3 matrices: the left matrix holds the weights applied to the frame preceding the frame of the pixel to be calculated, the middle matrix holds the weights applied to the frame containing the pixel, and the right matrix holds the weights applied to the frame following it.
  • It should be noted that the first template shown in FIG. 3, the second template shown in FIG. 4, and the third template shown in FIG. 5 are merely exemplary; in practical applications the first, second, and third templates can be set according to actual needs.
  • the size of the first template, the second template, and the third template may be N ⁇ N, where N is an odd number greater than 1, such as 3 ⁇ 3, 5 ⁇ 5, and 7 ⁇ 7.
  • For the first template, the weights of the columns to the left and right of the pixel to be calculated can be set according to actual needs, but the setting should follow two principles: the weights at symmetric positions on the left and right have the same absolute value, with one side positive and the other side negative; and the closer a pixel is to the pixel being calculated, the larger the absolute value of its weight.
  • In FIG. 3, the second column is the column of the pixel to be calculated, so its weights are 0; the weights at symmetric positions in the first and third columns have the same absolute value, with the first column negative and the third column positive; and the weight (6) of the pixel adjacent to the pixel being calculated is larger than the weight (3) of the pixels farther away.
  • The weights of the second template must likewise ensure that the weights of the row containing the pixel to be calculated are 0, that the weights at symmetric positions above and below the pixel have the same absolute value with one side positive and the other negative, and that pixels closer to the pixel being calculated have weights with larger absolute values.
  • In FIG. 4, the second row is the row of the pixel to be calculated, so its weights are 0; the weights at symmetric positions in the first and third rows have the same absolute value, with the first row negative and the third row positive; and the weight (6) of the pixel adjacent to the pixel being calculated is larger than the weight (3) of the pixels farther away.
  • The weights of the third template must ensure that the weights of the frame containing the pixel to be calculated are 0, that the weights at symmetric positions in the preceding and following frames have the same absolute value with one side positive and the other negative, and that pixels closer to the pixel being calculated have weights with larger absolute values.
  • In FIG. 5, the middle 3×3 matrix corresponds to the frame of the pixel to be calculated and is all 0; the left 3×3 matrix holds the weights for the preceding frame and the right 3×3 matrix holds the weights for the following frame; the weights at symmetric positions in the left and right matrices have the same absolute value, with one side negative and the other positive; and the weight (6) of the pixel nearest to the pixel being calculated is larger than the weight (3) of the pixels farther away.
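  • FIGs. 3-5 are not reproduced in this text, but the weight rules above are consistent with Sobel-like 3×3 templates. The sketch below is an illustrative reconstruction under that assumption, together with how the three per-pixel gradients could be computed; the concrete weight matrices and the SciPy-based filtering are assumptions, not the patent's figures.

```python
import numpy as np
from scipy.ndimage import correlate

# Illustrative 3x3 weight matrices reconstructed from the rules above
# (zero weights on the pixel's own column/row/frame, symmetric weights of
# opposite sign, 6 next to the pixel being calculated and 3 farther away).
# The actual FIGs. 3-5 may differ.
H_TEMPLATE = np.array([[-3, 0, 3],
                       [-6, 0, 6],
                       [-3, 0, 3]], dtype=np.float64)   # first template (horizontal)
V_TEMPLATE = H_TEMPLATE.T                               # second template (vertical)
T_PREV = np.full((3, 3), -3.0)                          # weights on the previous frame
T_PREV[1, 1] = -6.0
T_NEXT = -T_PREV                                        # weights on the following frame

def pixel_gradients(prev_frame, cur_frame, next_frame):
    """Per-pixel horizontal, vertical and time-domain gradients of cur_frame."""
    cur = cur_frame.astype(np.float64)
    g_h = correlate(cur, H_TEMPLATE, mode='nearest')
    g_v = correlate(cur, V_TEMPLATE, mode='nearest')
    g_t = (correlate(prev_frame.astype(np.float64), T_PREV, mode='nearest') +
           correlate(next_frame.astype(np.float64), T_NEXT, mode='nearest'))
    return g_h, g_v, g_t
```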
  • Step S203d determining the space-time domain gradient of each pixel of each image block in the first video according to formula (1-1);
  • Step S203e Determine a standard deviation of the spatial time domain gradient of each image block according to a spatial time domain gradient of each pixel of each image block.
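  • A sketch of step S203e: given the per-pixel space-time domain gradient map of a frame, compute the standard deviation within each block. The block size and the cropping of incomplete border blocks are illustrative choices.

```python
import numpy as np

def block_gradient_std(gradient_map, block_size=16):
    """Standard deviation of the space-time domain gradient in each block."""
    h, w = gradient_map.shape
    h_crop = (h // block_size) * block_size
    w_crop = (w // block_size) * block_size
    blocks = (gradient_map[:h_crop, :w_crop]
              .reshape(h_crop // block_size, block_size,
                       w_crop // block_size, block_size)
              .swapaxes(1, 2))
    return blocks.std(axis=(2, 3))   # one standard deviation per block
```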
  • Step S204 determining a mean square error of a pixel value of each image block in the first video
  • Here, the mean square error MSE of the pixel values of an image block in the kth frame of image in the first video is calculated according to formula (2-4):
  • In formula (2-4), f(k, i, j) is the pixel value at position (i, j) in the kth frame of the first video, and g(k, i, j) is the pixel value at position (i, j) in the kth frame of the second video.
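  • A minimal sketch of the per-block MSE of formula (2-4), comparing co-located blocks of the distorted (first) and reference (second) videos; the function and argument names are illustrative.

```python
import numpy as np

def block_mse(distorted_block, reference_block):
    """Mean square error between co-located blocks f(k, i, j) and g(k, i, j)."""
    diff = (distorted_block.astype(np.float64) -
            reference_block.astype(np.float64))
    return float(np.mean(diff ** 2))
```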
  • Step S205 determining a distortion metric value of each image block in the first video according to a standard deviation of a space time domain gradient and a mean square error of a pixel value of each image block in the first video;
  • the distortion metric value of each image block in the first video is determined according to formula (1-2).
  • Step S206 determining a distortion metric value of each frame image according to a distortion metric value of an image block included in each frame image in the first video;
  • Step S207 determining a distortion metric value of the video according to a distortion metric value of each frame image in the first video.
  • In this embodiment, the method includes: first acquiring a first video and a second video and dividing each frame of image of both videos into image blocks of a preset size; then determining the distortion metric value of each image block in the first video according to the standard deviation of its space-time domain gradient and the mean square error of its pixel values; then determining the distortion metric value of each frame of image according to the distortion metric values of the image blocks it contains; and finally determining the distortion metric value of the video according to the distortion metric values of the frames of the first video. In this way, the accuracy of the video quality estimation is guaranteed while the computational complexity is reduced, so the method can be applied to video image processing systems with high real-time requirements, such as video coding.
  • This embodiment of the present invention provides a video quality evaluation method to overcome the problem that existing video quality evaluation methods cannot simultaneously achieve accurate video quality prediction and low computational complexity.
  • the method includes the following steps:
  • In the first step, the horizontal gradient, vertical gradient, and time-domain gradient of each pixel of the video image are calculated, and the space-time domain gradient of the pixel is calculated on that basis.
  • In the second step, the standard deviation of the pixel space-time domain gradients within each image block is computed. The standard deviation σ may be computed in units of 8×8 or 16×16 blocks, and σ serves as a measure of the spatio-temporal complexity of the image block content.
  • In the third step, the mean square error (MSE) of each image block is computed, the MSE is divided by the space-time domain gradient standard deviation of the image block, and the logarithm is taken as the final distortion metric value of the image block.
  • That is, the traditional mean square error is used as the objective distortion criterion of the video, and, based on formula (1-2), the final distortion metric D of the video is adjusted according to the space-time domain content complexity σ of the video.
  • the video quality evaluation method provided in the embodiment of the present invention is compared with the PSNR algorithm, the SSIM algorithm, and the VQM algorithm in the prior art.
  • The PSNR algorithm compares each frame of the original video and the distorted video pixel by pixel. It is an algorithm based on independent pixel differences that ignores the influence of sequence content and viewing conditions on the visibility of distortion, so it tends to be poorly consistent with subjectively perceived video quality.
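  • For reference, PSNR as described above reduces to a per-frame MSE followed by a logarithm. A minimal sketch, assuming 8-bit pixel values with a peak of 255:

```python
import numpy as np

def psnr(reference, distorted, max_value=255.0):
    """Peak Signal to Noise Ratio between two frames (independent pixel differences)."""
    mse = np.mean((reference.astype(np.float64) -
                   distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')   # identical frames
    return 10.0 * np.log10(max_value ** 2 / mse)
```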
  • The SSIM algorithm extracts and compares certain pixel statistical features of the original video image and the distorted video image.
  • The VQM algorithm decomposes the original video and the distorted video into different channels (such as edge, luminance, chrominance, and frame difference) through different filters (such as edge detection), and then extracts pixel-level features and spatio-temporal image-block-level statistical features.
  • The pixel-level features include, for each pixel, the gradient magnitude, the gradient direction, the color difference, the contrast, and the frame difference; the block-level statistical features are computed within 8*8 image blocks (for example, the mean and standard deviation of the pixel-level features), thereby aggregating pixel-level features into spatio-temporal block-level features.
  • The distortion of the video sequence is then obtained through spatio-temporal integration and weighted fusion of the distortion of each feature.
  • the evaluation criteria used include Pearson Linear Correlation Coefficient (PLCC) and Spearman Rank Order Correlation Coefficient (SROCC).
  • PLCC mainly evaluates the prediction accuracy of the algorithm, that is, evaluates the linear fit between the predicted distortion and the real MOS.
  • SROCC mainly evaluates the monotonicity of the algorithm's predictions, that is, whether the ordering of the predicted distortion values is consistent with the ordering of the real MOS values.
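  • Both criteria can be computed with standard statistics routines; the sketch below uses SciPy's Pearson and Spearman correlation functions on hypothetical arrays of predicted distortion values and DMOS scores (the numbers are made up for illustration).

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

predicted = np.array([12.3, 30.1, 25.4, 8.7, 40.2])   # hypothetical predicted distortions
dmos = np.array([35.0, 62.5, 55.1, 30.2, 70.4])       # hypothetical subjective DMOS scores

plcc, _ = pearsonr(predicted, dmos)    # prediction accuracy (linear fit)
srocc, _ = spearmanr(predicted, dmos)  # prediction monotonicity (rank order)
print(f"PLCC={plcc:.3f}, SROCC={srocc:.3f}")
```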
  • The video subjective quality evaluation data set used in the experiment is the LIVE data set of the Laboratory for Image and Video Engineering (LIVE) of the University of Texas, which includes four types of video distortion: wireless network transmission distortion, wired network transmission distortion, H.264 compression distortion, and MPEG-2 compression distortion.
  • Experimental comparison methods include the classic objective video quality assessment methods PSNR and SSIM, and the video quality model VQM established by the National Telecommunications and Information Administration (NTIA).
  • Table 1 lists the PLCC correlation coefficients between the subjective scoring data and the results obtained by the video quality evaluation method provided by the embodiment of the present invention and by the PSNR, SSIM, and VQM algorithms.
  • The correlation between the video distortion predicted by the video quality evaluation method of the present invention and the actual subjective DMOS scores of the LIVE data set is shown in FIG. 6 (the abscissa is the video distortion value predicted by the method of the present invention, and the ordinate is the actual DMOS value of the video).
  • the correlation between the video quality score predicted by the VQM, PSNR, and SSIM comparison methods and the actual subjective DMOS score of the LIVE data set is shown in Figures 7-9, respectively.
  • The values calculated by the method provided by the embodiment of the present invention and by the VQM method are both distortion estimates of the video (a larger value indicates greater video distortion), so the abscissa and ordinate data in FIG. 6 and FIG. 7 are positively correlated.
  • The PSNR and SSIM methods calculate quality estimates of the video (a smaller value indicates greater distortion), so the abscissa and ordinate data in FIG. 8 and FIG. 9 are negatively correlated.
  • The simulation results show that, compared with the prior art, the video quality evaluation method provided by the embodiment of the present invention achieves video quality predictions that are more consistent with the subjective perceived quality of the human eye, while requiring only low computational complexity.
  • FIG. 10 is a schematic structural diagram of a video quality evaluation apparatus according to Embodiment 4 of the present invention.
  • The apparatus 1000 includes: a dividing module 1001, a first determining module 1002, a second determining module 1003, and a third determining module 1004, wherein:
  • the dividing module 1001 is configured to divide each frame image in the video that needs to be quality-evaluated into image blocks according to a preset size.
  • the first determining module 1002 is configured to determine a distortion metric value of each image block according to a standard deviation of a spatial time domain gradient of each image block and a mean square error of the pixel value.
  • the first determining module 1002 further includes:
  • a first determining unit configured to determine a spatial time domain gradient of each pixel in the image block; wherein the image block includes at least one pixel;
  • The first determining unit further includes: a first determining subunit, configured to determine the horizontal gradient, vertical gradient, and time-domain gradient of each pixel in the image block; and a second determining subunit, configured to determine the space-time domain gradient of each pixel in the image block according to its horizontal gradient, vertical gradient, and time-domain gradient.
  • The second determining subunit further includes a determining sub-subunit, configured to determine the space-time domain gradient of each pixel in the image block according to formula (1-1), which combines the horizontal gradient, the vertical gradient, and the time-domain gradient of the pixel.
  • a second determining unit configured to determine the standard deviation of the space-time domain gradient of the image block;
  • a third determining unit configured to determine a mean square error of a pixel value of the image block
  • a fourth determining unit configured to determine a distortion metric value of the image block according to a standard deviation of a spatial time domain gradient of the image block and a mean square error of the pixel value.
  • the fourth determining unit further includes:
  • a third determining subunit configured to determine the distortion metric value of each image block according to formula (1-2), where D is the distortion metric value of the image block, MSE is the mean square error of the pixel values of the image block, and σ is the standard deviation of the space-time domain gradient of the image block.
  • the second determining module 1003 is configured to determine a distortion metric value of each frame image according to a distortion metric value of the image block included in each frame image.
  • The second determining module includes: a fifth determining unit configured to determine the average value of the distortion metric values of all image blocks included in each frame as the distortion metric value of that frame image;
  • the third determining module 1004 is configured to determine a distortion metric value of the video according to the distortion metric value of each frame image.
  • the third determining module includes: a sixth determining unit configured to determine an average value of distortion metric values of images of all frames included in the video as a distortion metric value of the video.
  • If the video quality evaluation method described above is implemented in the form of a software function module and sold or used as a standalone product, it may also be stored in a computer readable storage medium.
  • Based on this understanding, the technical solution of the embodiments of the present invention may, in essence, be embodied in the form of a software product stored in a storage medium and including a plurality of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a mobile hard disk, a read only memory (ROM), a magnetic disk, or an optical disk.
  • An embodiment of the present invention provides a video quality evaluation device, such as a terminal, including a memory, a processor, and a computer program stored on the memory and operable on the processor, where the processor, when executing the program, implements the video quality evaluation method described above.
  • Embodiments of the present invention provide a computer readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the video quality evaluation method described above.
  • FIG. 11 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention.
  • the hardware entity of the terminal 1100 includes: a processor 1101, a communication interface 1102, and a memory 1103, where:
  • the processor 1101 typically controls the overall operation of the terminal 1100.
  • Communication interface 1102 can cause terminal 1100 to communicate with other terminals or servers over a network.
  • The memory 1103 is configured to store instructions and applications executable by the processor 1101, and may also cache data to be processed or already processed by the processor 1101 and by each module in the terminal 1100 (e.g., image data, audio data, voice communication data, and video communication data); it can be implemented by flash memory (FLASH) or random access memory (RAM).
  • A computer program (also referred to as a program, software, software application, script, or code) can be written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages) and can be deployed in any form (including as a stand-alone program, or as a module, component, subroutine, object, or other unit suitable for use in a computing environment).
  • a computer program can, but does not necessarily, correspond to a file in a file system.
  • The program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
  • the computer program can be deployed to be executed on one or more computers located at one site or distributed across multiple sites and interconnected by a communication network.
  • the processes and logic flows described in the specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating input data and generating output.
  • the above described processes and logic flows can also be performed by dedicated logic circuitry, and the apparatus can also be implemented as dedicated logic circuitry, such as an FPGA or ASIC.
  • processors suitable for the execution of a computer program include, for example, a general purpose microprocessor and a special purpose microprocessor, and any one or more processors of any type of digital computer.
  • a processor will receive instructions and data from a read only memory or a random access memory or both.
  • The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data.
  • A computer also includes one or more mass storage devices (e.g., magnetic disks, magneto-optical disks, or optical disks) for storing data, or is operatively coupled to receive data from or transfer data to such devices, or both; however, a computer need not have such devices.
  • the computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio player or mobile video player, a game console, a global positioning system (GPS) receiver, or a mobile storage device.
  • Devices suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and storage devices, including, for example, semiconductor storage devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable hard disks), magneto-optical disks, and CD-ROM and DVD-ROM discs.
  • the processor and memory can be supplemented by or included in dedicated logic circuitry.
  • Embodiments of the subject matter described in the specification can be implemented in a computing system.
  • The computing system can include a backend component (e.g., a data server), or a middleware component (e.g., an application server), or a front end component (e.g., a client computer with a graphical user interface or a web browser through which the user can interact with an embodiment of the subject matter described herein), or any combination of one or more of the above backend components, middleware components, or front end components.
  • The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs) and wide area networks (WANs), inter-networks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).
  • In some implementations, the features described in this application can be implemented on a smart television module (or connected television module, hybrid television module, etc.).
  • the smart TV module can include processing circuitry configured to integrate more traditional television program sources (eg, program sources received via cable, satellite, air, or other signals) with Internet connectivity.
  • the smart TV module can be physically integrated into a television set or can include stand-alone devices such as set top boxes, Blu-ray or other digital media players, game consoles, hotel television systems, and other ancillary equipment.
  • the smart TV module can be configured to enable viewers to search for and find videos, movies, pictures or other content on the network, on local cable channels, on satellite television channels, or on local hard drives.
  • a set top box (STB) or set top box unit (STU) may include an information-applicable device that includes a tuner and is coupled to the television set and an external source to tune the signal to be displayed on a television screen or other playback device.
  • the smart TV module can be configured to provide a home screen or a top screen including icons for a variety of different applications (eg, web browsers and multiple streaming services, connecting cable or satellite media sources, other network "channels", etc.).
  • the smart TV module can also be configured to provide electronic programming to the user.
  • the companion application of the smart TV module can be run on the mobile terminal to provide the user with additional information related to the available programs, thereby enabling the user to control the smart TV module and the like.
  • this feature can be implemented on a portable computer or other personal computer (PC), smart phone, other mobile phone, handheld computer, tablet PC, or other computing device.
  • In the technical solution provided by the embodiments of the present invention, the first video and the second video are first acquired, and each frame of image of the first video and the second video is divided into image blocks of a preset size; the distortion metric value of each image block in the first video is then determined according to the standard deviation of its space-time domain gradient and the mean square error of its pixel values; the distortion metric value of each frame of image is further determined according to the distortion metric values of the image blocks it contains; and finally the distortion metric value of the video is determined according to the distortion metric values of the frames of the first video. In this way, not only is the accuracy of the video quality estimation guaranteed, but the computational complexity is also reduced, so the method can be applied to video image processing systems with high real-time requirements, such as video coding.

Abstract

Disclosed are a video quality evaluation method, apparatus and device, and a storage medium. The method comprises: dividing each frame of image in a video, on which a quality evaluation needs to be performed, into image blocks of a pre-set size; determining a distortion metric value of each of the image blocks according to a standard deviation of space-time domain gradients of each of the image blocks and a mean squared error of pixel values; determining a distortion metric value of each frame of image according to the distortion metric value of the image block included in each frame of image; and determining a distortion metric value of the video according to the distortion metric value of each frame of image.

Description

Video quality evaluation method, apparatus and device, and storage medium
Cross-reference to related applications
This application is based on and claims priority to Chinese Patent Application No. 201710102413.4, filed on February 24, 2017, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of multimedia information processing, and in particular to a video quality evaluation method, apparatus, device, and storage medium.
Background art
Video quality evaluation can be divided into two categories: subjective video quality evaluation and objective video quality evaluation. Subjective video quality evaluation refers to organizing testers to view a set of distorted videos according to a prescribed experimental procedure and to subjectively score the quality of each video. The scores for each test video may be averaged to obtain a Mean Opinion Score (MOS), or the difference between the score of each test video and the score of its corresponding original reference video may be used to compute a Difference Mean Opinion Score (DMOS). Subjective video quality evaluation yields results closest to the true visual perception quality of the human eye, but the experiments are time-consuming and laborious and cannot be applied to real-time video compression and processing systems.
The objective video quality evaluation algorithm can automatically predict the quality of a video and is therefore more practical. In order to evaluate whether an objective video quality evaluation algorithm can accurately predict the true visual perception quality of the human eye, a relatively comprehensive and complete data set is needed for test verification. The contribution of subjective video quality evaluation is to establish public test video data sets and provide the corresponding MOS or DMOS data for testing the performance of different objective video quality evaluation algorithms.
Objective video quality evaluation algorithms can be roughly divided into three categories according to whether the original reference video is needed during computation: Full Reference (FR), Reduced Reference (RR), and No Reference (NR) video quality evaluation algorithms. The most straightforward way to calculate the distortion of a video image is to compare the original image with the distorted image pixel by pixel, for example with the most basic image distortion metrics, Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR). In practice, however, direct pixel-by-pixel comparison does not reflect how the human eye perceives distortion in video images, so algorithms that extract and compare certain pixel statistical features of the original and distorted images have appeared, such as the Structural Similarity Index Measurement (SSIM). In essence, the basic process of a full reference video quality evaluation algorithm is to extract multiple visual statistical features of the original and distorted video images to form feature vectors, and to estimate the degree of distortion of the video image by comparing the distance between the feature vectors.
Current full reference video quality evaluation algorithms also include the Video Quality Model (VQM) algorithm and the MOtion-based Video Integrity Evaluation index (MOVIE) algorithm.
A disadvantage of existing full reference video quality algorithms is that it is difficult to satisfy both quality prediction accuracy and low computational complexity. Because its calculation is simple, PSNR is widely used in video image processing systems with high real-time requirements, such as video coding; however, experimental results show that the correlation between the video quality score calculated by PSNR and the real subjective score is poor. Advanced video quality evaluation algorithms such as VQM and MOVIE can effectively predict video quality and approach the subjective perceived quality score of the human eye, but their calculations are extremely complex and can only be applied to offline calculation of video quality. Since a video encoder needs to calculate the distortion of the reconstructed image in real time during encoding and select encoding parameters accordingly, video quality evaluation algorithms such as VQM and MOVIE cannot be applied to video encoders.
Summary of the invention
In order to solve the existing technical problems, embodiments of the present invention provide a video quality evaluation method, apparatus, device, and storage medium, which reduce the computational complexity while ensuring the accuracy of video quality estimation, and can therefore be applied to video image processing systems with high real-time requirements, such as video coding.
The technical solution of the embodiments of the present invention is implemented as follows:
An embodiment of the present invention provides a video quality evaluation method, where the method includes:
dividing each frame of image in the video on which quality evaluation needs to be performed into image blocks of a preset size;
determining a distortion metric value of each image block according to the standard deviation of the space-time domain gradient of the image block and the mean square error of its pixel values;
determining a distortion metric value of each frame of image according to the distortion metric values of the image blocks included in that frame; and
determining a distortion metric value of the video according to the distortion metric values of the frames.
An embodiment of the present invention provides a video quality evaluation apparatus, where the apparatus includes:
a dividing module, configured to divide each frame of image in the video that needs quality evaluation into image blocks of a preset size;
a first determining module, configured to determine a distortion metric value of each image block according to the standard deviation of the space-time domain gradient of the image block and the mean square error of its pixel values;
a second determining module, configured to determine a distortion metric value of each frame of image according to the distortion metric values of the image blocks included in that frame; and
a third determining module, configured to determine a distortion metric value of the video according to the distortion metric values of the frames.
An embodiment of the present invention provides a video quality evaluation device, including a memory, a processor, and a computer program stored on the memory and operable on the processor, where the processor, when executing the program, implements the video quality evaluation method described above.
An embodiment of the present invention provides a computer readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the video quality evaluation method described above.
An embodiment of the present invention provides a video quality evaluation method, apparatus, device, and storage medium. The method includes: first, dividing each frame of image in the video that needs quality evaluation into image blocks of a preset size; then determining a distortion metric value of each image block according to the standard deviation of its space-time domain gradient and the mean square error of its pixel values; then determining a distortion metric value of each frame of image according to the distortion metric values of the image blocks it contains; and finally determining a distortion metric value of the video according to the distortion metric values of the frames. In this way, not only is the accuracy of the video quality estimation guaranteed, but the computational complexity is also reduced, so the method can be applied to video image processing systems with high real-time requirements, such as video coding.
Description of the drawings
In the drawings, which are not necessarily drawn to scale, like reference numerals may describe similar components in different views. Like reference numerals with different letter suffixes may represent different examples of similar components. The drawings generally illustrate, by way of example and not limitation, the various embodiments discussed herein.
FIG. 1 is a schematic flowchart of an implementation process of a video quality evaluation method according to Embodiment 1 of the present invention;
FIG. 2 is a schematic flowchart of an implementation process of a video quality evaluation method according to Embodiment 2 of the present invention;
FIG. 3 is a first template for calculating the horizontal gradient of a pixel according to Embodiment 2 of the present invention;
FIG. 4 is a second template for calculating the vertical gradient of a pixel according to Embodiment 2 of the present invention;
FIG. 5 is a third template for calculating the time-domain gradient of a pixel according to Embodiment 2 of the present invention;
FIG. 6 is a scatter plot of the correlation between the video distortion scores calculated by the video quality evaluation method provided in Embodiment 3 of the present invention and the subjective experimental DMOS scores of the LIVE data set;
FIG. 7 is a scatter plot of the correlation between the video distortion scores calculated by the VQM method and the subjective experimental DMOS scores of the LIVE data set;
FIG. 8 is a scatter plot of the correlation between the video distortion scores calculated by the PSNR method and the subjective experimental DMOS scores of the LIVE data set;
FIG. 9 is a scatter plot of the correlation between the video distortion scores calculated by the SSIM method and the subjective experimental DMOS scores of the LIVE data set;
FIG. 10 is a schematic structural diagram of a video quality evaluation apparatus according to Embodiment 4 of the present invention;
FIG. 11 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention.
具体实施方式detailed description
为使本发明实施例的目的、技术方案和优点更加清楚,下面将结合本发明实施例中的附图,对发明的技术方案做进一步详细描述。以下实施例用于说明本发明,但不用来限制本发明的范围。In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the invention will be further described in detail below with reference to the accompanying drawings in the embodiments. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
实施例一 Embodiment 1
本发明实施例提供一种视频质量评价方法,应用于视频质量评价装置,该视频质量评价装置在实际应用中包括但不限于是计算机、平板电脑、智能手机等终端。图1为本发明实施例一视频质量评价方法的实现流程示意图,如图1所示,所述方法包括以下步骤:The embodiment of the present invention provides a video quality evaluation method, which is applied to a video quality evaluation device, and the video quality evaluation device includes, but is not limited to, a terminal such as a computer, a tablet computer, or a smart phone. 1 is a schematic flowchart of an implementation process of a video quality evaluation method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
步骤S101,将需要进行质量评价的视频中每一帧图像按照预设大小划分图像块;Step S101, dividing each image frame in the video that needs to be subjected to quality evaluation into image blocks according to a preset size;
这里,所述图像块的大小可以根据实际需求进行设定,一般设置为行数和列数相同的图像块,比如设置为8*8或者16*16的图像块。Here, the size of the image block can be set according to actual needs, and is generally set to an image block with the same number of rows and columns, such as an image block set to 8*8 or 16*16.
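As an informal illustration of this step, the following Python sketch (the function name is ours, not part of the embodiment) splits one greyscale frame, given as a 2-D numpy array, into non-overlapping blocks of a preset size; rows and columns that do not fill a complete block are assumed to be discarded.

    import numpy as np

    def split_into_blocks(frame: np.ndarray, bs: int = 8):
        """Return the non-overlapping bs x bs blocks of a greyscale frame."""
        h, w = frame.shape
        return [frame[r:r + bs, c:c + bs]
                for r in range(0, h - bs + 1, bs)
                for c in range(0, w - bs + 1, bs)]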
步骤S102,根据每一图像块的空时域梯度的标准差和像素的均方误差确定所述每一图像块的失真度量值;Step S102, determining a distortion metric value of each image block according to a standard deviation of a space time domain gradient of each image block and a mean square error of the pixel;
这里,所述步骤S102进一步包括:Here, the step S102 further includes:
步骤S102a,确定所述图像块中每一像素点的空时域梯度;Step S102a, determining an empty time domain gradient of each pixel in the image block;
这里,所述图像块中至少包括一个像素点。Here, the image block includes at least one pixel.
In step S102a, the horizontal gradient, the vertical gradient, and the temporal gradient of each pixel in the image block are first calculated, and the spatiotemporal gradient of each pixel is then calculated from its horizontal, vertical, and temporal gradients according to formula (1-1).
In formula (1-1), G_st(k,i,j) is the spatiotemporal gradient of the pixel, G_h(k,i,j) is the horizontal gradient of the pixel, G_v(k,i,j) is the vertical gradient of the pixel, and G_t(k,i,j) is the temporal gradient of the pixel (formula (1-1) itself appears only as an image in the original publication).
步骤S102b,确定所述图像块的空时域梯度的标准差;Step S102b, determining a standard deviation of an empty time domain gradient of the image block;
步骤S102c,确定所述图像块的像素值的均方误差;Step S102c, determining a mean square error of a pixel value of the image block;
步骤S102d,根据所述图像块的空时域梯度的标准差和像素值的均方误差,确定所述图像块的失真度量值。Step S102d: Determine a distortion metric value of the image block according to a standard deviation of a space time domain gradient of the image block and a mean square error of the pixel value.
Here, the distortion metric value of the image block is determined according to formula (1-2):

D = log(MSE / σ)    (1-2)

In formula (1-2), D is the distortion metric value of the image block, MSE is the mean square error of the pixel values of the image block, and σ is the standard deviation of the spatiotemporal gradients of the image block.
According to research on human visual perception, the human eye is insensitive to distortion in edge or texture regions of an image and more sensitive to distortion in flat regions; it is likewise insensitive to distortion in fast-moving video content. To reflect this visual characteristic, the distortion of an image block can be divided by the standard deviation of the spatiotemporal gradient values of that block on top of the original video distortion measure, so that the insensitivity of the human eye to distortion in video with complex spatiotemporal content is taken into account. In this way, the accuracy of the video quality estimate is preserved while the computational complexity is reduced.
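For illustration only, the block distortion of formula (1-2) can be sketched in Python as follows; the natural logarithm and the small eps guards against zero values are assumptions of this sketch, not part of the embodiment.

    import numpy as np

    def block_distortion(mse: float, sigma: float, eps: float = 1e-6) -> float:
        # Formula (1-2): divide the block MSE by the standard deviation of the
        # block's spatiotemporal gradients and take the logarithm. The natural
        # logarithm and the eps guards are assumptions of this sketch.
        return float(np.log(mse / (sigma + eps) + eps))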
步骤S103,根据所述视频中每一帧图像所包含的图像块的失真度量值确定所述每一帧图像的失真度量值;Step S103, determining a distortion metric value of each frame image according to a distortion metric value of an image block included in each frame image in the video;
Here, the average of the distortion metric values of all image blocks contained in each frame is determined as the distortion metric value of that frame image.
步骤S104,根据所述视频所包含的每一帧图像的失真度量值确定所述 视频的失真度量值。Step S104: Determine a distortion metric value of the video according to a distortion metric value of each frame image included in the video.
这里,将所述视频中所包含的所有帧的图像的失真度量值的平均值确定为所述视频的失真度量值。Here, the average value of the distortion metric values of the images of all the frames included in the video is determined as the distortion metric value of the video.
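The aggregation of steps S103 and S104 reduces to simple averaging; a minimal Python sketch follows (the function names are ours).

    import numpy as np

    def frame_distortion(block_scores) -> float:
        # Step S103: the frame-level distortion is the average of the
        # distortion metric values of the image blocks the frame contains.
        return float(np.mean(block_scores))

    def video_score(frame_scores) -> float:
        # Step S104: the video-level distortion is the average of the
        # per-frame distortion metric values.
        return float(np.mean(frame_scores))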
In the embodiment of the present invention, the method includes: first dividing each frame image of the video to be quality-evaluated into image blocks of a preset size; then determining a distortion metric value for each image block according to the standard deviation of the spatiotemporal gradients of the block and the mean square error of its pixel values; then determining a distortion metric value for each frame image according to the distortion metric values of the image blocks it contains; and finally determining the distortion metric value of the video according to the distortion metric values of its frame images. In this way, the accuracy of the video quality estimate is preserved while the computational complexity is reduced, so the method can be applied in video image processing systems with strict real-time requirements, such as video encoders.
实施例二 Embodiment 2
Based on the foregoing embodiment, an embodiment of the present invention further provides a video quality evaluation method, applied to a video quality evaluation apparatus; in practical applications the video quality evaluation apparatus includes, but is not limited to, a terminal such as a computer, a tablet computer, or a smart phone. FIG. 2 is a schematic flowchart of the implementation of the video quality evaluation method according to Embodiment 2 of the present invention. As shown in FIG. 2, the method includes the following steps:
步骤S201,获取第一视频和第二视频。Step S201, acquiring a first video and a second video.
Here, the first video is the video whose quality needs to be evaluated, that is, the video in which distortion has occurred. The second video is the original video corresponding to it, that is, the video in which no distortion has occurred.
Here, acquiring the first video includes acquiring the pixel values of all pixels in each frame image of the first video. Correspondingly, acquiring the second video includes acquiring the pixel values of all pixels in each frame image of the second video.
步骤S202,将所述第一视频和所述第二视频中每一帧图像按照预设大小划分图像块;Step S202, dividing each frame image of the first video and the second video into image blocks according to a preset size;
这里,对所述第一视频和所述第二视频进行划分时,是按照同样的大 小进行划分的。Here, when the first video and the second video are divided, they are divided according to the same size.
比如在本实施例中,将所述第一视频和第二视频按照4*4的大小划分图像块。For example, in this embodiment, the first video and the second video are divided into image blocks according to a size of 4*4.
步骤S203,确定所述第一视频中每一图像块的空时域梯度的标准差;Step S203, determining a standard deviation of an empty time domain gradient of each image block in the first video;
这里,所述步骤S203进一步包括:Here, the step S203 further includes:
步骤S203a,根据预设的第一模板,确定所述第一视频中每一图像块中每一像素点的水平梯度;Step S203a, determining, according to the preset first template, a horizontal gradient of each pixel in each image block in the first video;
Here, the first template is a template for calculating the horizontal gradient of a pixel, as shown in FIG. 3. The first column of the first template holds the weights for the column to the left of the pixel to be calculated, the second column holds the weights for the column containing that pixel, and the third column holds the weights for the column to its right.
The horizontal gradient G_h(k,i,j) of the pixel at position (i,j) in the k-th frame image of the first video, where (i,j) denotes the pixel in row i and column j of the image, is calculated according to formula (2-1) by applying the first template of FIG. 3:

G_h(k,i,j) = 3·f(k,i-1,j+1) + 6·f(k,i,j+1) + 3·f(k,i+1,j+1) - 3·f(k,i-1,j-1) - 6·f(k,i,j-1) - 3·f(k,i+1,j-1)    (2-1)

In formula (2-1), f(k,i,j) is the pixel value at position (i,j) in the k-th frame of the first video.
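Using the weights described for FIG. 3 (magnitude 6 for the row-adjacent neighbours, 3 for the diagonal ones), the horizontal gradient can be sketched in Python as a 3×3 correlation over the whole frame; the border handling chosen here is an assumption of this sketch.

    import numpy as np
    from scipy.ndimage import correlate

    # First template (FIG. 3) as described: zero centre column, negative
    # weights on the left, positive on the right, magnitude 6 for the
    # row-adjacent neighbours and 3 for the diagonal ones.
    KERNEL_H = np.array([[-3, 0, 3],
                         [-6, 0, 6],
                         [-3, 0, 3]], dtype=np.float64)

    def horizontal_gradient(frame: np.ndarray) -> np.ndarray:
        # correlate() applies the template as written (no kernel flip);
        # the 'nearest' border handling is an assumption of this sketch.
        return correlate(frame.astype(np.float64), KERNEL_H, mode='nearest')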
步骤S203b,根据预设的第二模板,确定所述第一视频中每一图像块中每一像素点的垂直梯度;Step S203b, determining, according to the preset second template, a vertical gradient of each pixel in each image block in the first video;
这里,所述第二模板为用于计算像素垂直梯度的模板,所述第二模板如图4所示,所述第二模板的第一行是需要计算的像素点上面一行的权值,第二行是需要计算的像素点所在行的权值,第三行也即需要计算的像素点下面一行的权值。Here, the second template is a template for calculating a vertical gradient of a pixel, the second template is as shown in FIG. 4, and the first row of the second template is a weight of a row above the pixel to be calculated, The second line is the weight of the row of pixels to be calculated, and the third row is the weight of the row below the pixel to be calculated.
The vertical gradient G_v(k,i,j) of the pixel at position (i,j) in the k-th frame of the first video is calculated according to formula (2-2) by applying the second template of FIG. 4:

G_v(k,i,j) = 3·f(k,i+1,j-1) + 6·f(k,i+1,j) + 3·f(k,i+1,j+1) - 3·f(k,i-1,j-1) - 6·f(k,i-1,j) - 3·f(k,i-1,j+1)    (2-2)

In formula (2-2), f(k,i,j) is the pixel value at position (i,j) in the k-th frame of the first video.
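The vertical gradient of formula (2-2) can be sketched analogously with the second template of FIG. 4; again the border handling is an assumption of this sketch.

    import numpy as np
    from scipy.ndimage import correlate

    # Second template (FIG. 4) as described: zero centre row, negative
    # weights above the pixel, positive weights below, magnitude 6 for the
    # column-adjacent neighbours and 3 for the diagonal ones.
    KERNEL_V = np.array([[-3, -6, -3],
                         [ 0,  0,  0],
                         [ 3,  6,  3]], dtype=np.float64)

    def vertical_gradient(frame: np.ndarray) -> np.ndarray:
        return correlate(frame.astype(np.float64), KERNEL_V, mode='nearest')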
步骤S203c,根据预设的第三模板,计算所述第一视频中每一图像块中每一像素点的时域梯度;Step S203c: Calculate a time domain gradient of each pixel in each image block in the first video according to a preset third template.
Here, the third template is a template for calculating the temporal gradient of a pixel, as shown in FIG. 5. The third template consists of three 3×3 matrices: the left matrix holds the weights applied to the frame preceding the frame of the pixel to be calculated, the middle matrix holds the weights applied to the frame containing that pixel, and the right matrix holds the weights applied to the following frame.
The temporal gradient G_t(k,i,j) of the pixel at position (i,j) in the k-th frame image of the first video is calculated according to formula (2-3) by applying the third template of FIG. 5, that is, as a weighted difference between the co-located 3×3 neighbourhoods of the pixel in frame k+1 and in frame k-1, the frame k itself carrying zero weight (formula (2-3) appears only as an image in the original publication).
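Because the exact distribution of the weights inside the 3×3 matrices of FIG. 5 is not spelled out in the text, the following Python sketch adopts one plausible reading: magnitude 6 at the co-located pixel, 3 at its eight neighbours, with the previous frame weighted negatively and the next frame positively. These weights are assumptions of this sketch, not the published template.

    import numpy as np
    from scipy.ndimage import correlate

    W_NEIGHBOURHOOD = np.array([[3, 3, 3],
                                [3, 6, 3],
                                [3, 3, 3]], dtype=np.float64)

    def temporal_gradient(prev_frame, cur_frame, next_frame):
        nxt = correlate(next_frame.astype(np.float64), W_NEIGHBOURHOOD, mode='nearest')
        prv = correlate(prev_frame.astype(np.float64), W_NEIGHBOURHOOD, mode='nearest')
        # cur_frame contributes nothing: the middle matrix of FIG. 5 is all zeros.
        return nxt - prv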
It should be noted that the first template shown in FIG. 3, the second template shown in FIG. 4, and the third template shown in FIG. 5 are merely illustrative; in practical applications the first, second, and third templates can be set according to actual needs. For example, the size of the first, second, and third templates may be N×N, where N is an odd number greater than 1, such as 3×3, 5×5, or 7×7.
Taking the first template as an example, apart from the weights of the column containing the pixel to be calculated, which are 0, the weights of the columns to the left and right of that pixel can be set according to actual needs. The setting must, however, follow the principle that the weights at symmetric positions on the left and right have the same absolute value, with one side positive and the other negative, and the principle that the closer a pixel is to the pixel being calculated, the larger the absolute value of its weight.
For example, taking FIG. 3 as an example, the second column corresponds to the column containing the pixel to be calculated, so the weights of the second column are 0; the weights at symmetric positions in the first and third columns have the same absolute value, with the first column negative and the third column positive; and the weight (6) of the pixels close to the pixel being calculated is larger than the weight (3) of the pixels farther from it.
Similarly, the weights of the second template must be set so that the weights of the row containing the pixel to be calculated are 0, the weights at symmetric positions above and below that pixel have the same absolute value with one side positive and the other negative, and the closer a pixel is to the pixel being calculated, the larger the absolute value of its weight.
For example, taking FIG. 4 as an example, the second row corresponds to the row containing the pixel to be calculated, so the weights of the second row are 0; the weights at symmetric positions in the first and third rows have the same absolute value, with the first row negative and the third row positive; and the weight (6) of the pixels close to the pixel being calculated is larger than the weight (3) of the pixels farther from it.
The weights of the third template must be set so that the weights of the frame containing the pixel to be calculated are 0, the weights at symmetric positions in the preceding and following frames have the same absolute value with one side positive and the other negative, and the closer a pixel is to the pixel being calculated, the larger the absolute value of its weight.
For example, taking FIG. 5 as an example, the middle 3×3 matrix holds the weights of the frame containing the pixel to be calculated and is all zeros; the left 3×3 matrix holds the weights of the preceding frame and the right 3×3 matrix holds the weights of the following frame; the weights at symmetric positions in the left and right matrices have the same absolute value, with one side negative and the other positive; and the weight (6) of the pixels close to the pixel being calculated is larger than the weight (3) of the pixels farther from it.
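As an illustration of these design rules, the following sketch builds an N×N horizontal template with a zero centre column, antisymmetric left/right weights, and magnitudes that decay with distance from the target pixel. The particular 1/(1 + distance) decay is ours and is only one of many admissible choices.

    import numpy as np

    def horizontal_template(n: int) -> np.ndarray:
        # n must be an odd number greater than 1. The centre column is zero,
        # left/right columns are antisymmetric, and weights shrink with the
        # distance from the target pixel. The decay law is illustrative only.
        assert n % 2 == 1 and n > 1
        c = n // 2
        t = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                if j != c:
                    dist = abs(i - c) + abs(j - c)
                    t[i, j] = np.sign(j - c) / (1.0 + dist)
        return t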
步骤S203d,按照公式(1-1)确定所述第一视频中每一图像块的每一像素点的空时域梯度;Step S203d, determining an empty time domain gradient of each pixel of each image block in the first video according to formula (1-1);
步骤S203e,根据所述每一图像块的每一像素点的空时域梯度确定所述每一图像块的空时域梯度的标准差。Step S203e: Determine a standard deviation of the spatial time domain gradient of each image block according to a spatial time domain gradient of each pixel of each image block.
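Steps S203d and S203e can be sketched as follows; the root-sum-of-squares combination used for formula (1-1) is an assumption, since the exact combination appears only as an image in the original publication, and trailing rows or columns that do not fill a block are assumed to be dropped.

    import numpy as np

    def st_gradient(gh: np.ndarray, gv: np.ndarray, gt: np.ndarray) -> np.ndarray:
        # Formula (1-1) combines the three directional gradients; the
        # Euclidean combination used here is an assumption of this sketch.
        return np.sqrt(gh ** 2 + gv ** 2 + gt ** 2)

    def block_gradient_std(grad: np.ndarray, bs: int = 4) -> np.ndarray:
        # Standard deviation of the spatiotemporal gradient inside each
        # bs x bs block of the frame.
        h, w = grad.shape
        g = grad[:h - h % bs, :w - w % bs].reshape(h // bs, bs, w // bs, bs)
        return g.std(axis=(1, 3))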
步骤S204,确定所述第一视频中每一图像块的像素值的均方误差;Step S204, determining a mean square error of a pixel value of each image block in the first video;
Here, for example, the mean square error MSE_1 of the pixel values of the first image block in the k-th frame image of the first video is calculated according to formula (2-4):

MSE_1 = (1/|B_1|) · Σ_{(i,j)∈B_1} [f(k,i,j) - g(k,i,j)]^2    (2-4)

where B_1 is the set of pixel positions of the first image block, |B_1| is the number of pixels it contains, f(k,i,j) is the pixel value at position (i,j) in the k-th frame of the first video, and g(k,i,j) is the pixel value at position (i,j) in the k-th frame of the second video.
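A direct Python sketch of the block MSE of formula (2-4); the helper name is ours.

    import numpy as np

    def block_mse(f_block: np.ndarray, g_block: np.ndarray) -> float:
        # Formula (2-4): mean of the squared differences between the pixels
        # of the distorted block f and the co-located reference block g.
        diff = f_block.astype(np.float64) - g_block.astype(np.float64)
        return float(np.mean(diff ** 2))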
步骤S205,根据所述第一视频中每一图像块的空时域梯度的标准差和像素值的均方误差确定所述第一视频中每一图像块的失真度量值;Step S205, determining a distortion metric value of each image block in the first video according to a standard deviation of a space time domain gradient and a mean square error of a pixel value of each image block in the first video;
这里,按照公式(1-2)确定所述第一视频中每一图像块的失真度量值。Here, the distortion metric value of each image block in the first video is determined according to formula (1-2).
步骤S206,根据所述第一视频中每一帧图像所包含的图像块的失真度量值确定所述每一帧图像的失真度量值;Step S206, determining a distortion metric value of each frame image according to a distortion metric value of an image block included in each frame image in the first video;
步骤S207,根据所述第一视频中每一帧图像的失真度量值确定所述视频的失真度量值。Step S207, determining a distortion metric value of the video according to a distortion metric value of each frame image in the first video.
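Putting the steps of this embodiment together, a self-contained and deliberately unoptimised Python sketch of steps S201 to S207 might look as follows. The 4×4 block size follows this embodiment, while the template weights, the gradient combination, the logarithm, and the eps guards carry over the assumptions noted in the earlier sketches.

    import numpy as np
    from scipy.ndimage import correlate

    KH = np.array([[-3, 0, 3], [-6, 0, 6], [-3, 0, 3]], dtype=np.float64)
    KV = KH.T
    WT = np.array([[3, 3, 3], [3, 6, 3], [3, 3, 3]], dtype=np.float64)

    def video_distortion_score(dist: np.ndarray, ref: np.ndarray,
                               bs: int = 4, eps: float = 1e-6) -> float:
        """dist, ref: aligned greyscale videos of shape (frames, height, width)."""
        n, h, w = dist.shape
        frame_scores = []
        for k in range(1, n - 1):        # a previous and a next frame are needed
            f = dist[k].astype(np.float64)
            gh = correlate(f, KH, mode='nearest')
            gv = correlate(f, KV, mode='nearest')
            gt = (correlate(dist[k + 1].astype(np.float64), WT, mode='nearest')
                  - correlate(dist[k - 1].astype(np.float64), WT, mode='nearest'))
            grad = np.sqrt(gh ** 2 + gv ** 2 + gt ** 2)   # spatiotemporal gradient
            err2 = (f - ref[k].astype(np.float64)) ** 2
            block_scores = []
            for r in range(0, h - bs + 1, bs):
                for c in range(0, w - bs + 1, bs):
                    mse = err2[r:r + bs, c:c + bs].mean()
                    sigma = grad[r:r + bs, c:c + bs].std()
                    block_scores.append(np.log(mse / (sigma + eps) + eps))
            frame_scores.append(np.mean(block_scores))
        return float(np.mean(frame_scores))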
需要说明的是,本实施例中与其它实施例中相同步骤或概念的解释可以参考其它实施例中的描述,此处不再赘述。It should be noted that the description of the same steps or concepts in the other embodiments may refer to the description in other embodiments, and details are not described herein again.
In the embodiment of the present invention, the method includes: first acquiring a first video and a second video and dividing each frame image of the first video and of the second video into image blocks of a preset size; then determining a distortion metric value for each image block in the first video according to the standard deviation of the spatiotemporal gradients of the block and the mean square error of its pixel values; then determining a distortion metric value for each frame image in the first video according to the distortion metric values of the image blocks it contains; and finally determining the distortion metric value of the video according to the distortion metric values of the frame images of the first video. In this way, the accuracy of the video quality estimate is preserved while the computational complexity is reduced, so the method can be applied in video image processing systems with strict real-time requirements, such as video encoders.
实施例三 Embodiment 3
本发明实施例先提供一种视频质量评估方法,以克服现有视频质量评价方法中存在的无法同时实现视频质量预测准确性和保持较低计算复杂度的问题。所述方法包括以下步骤:The embodiment of the present invention first provides a video quality evaluation method to overcome the problem that the existing video quality evaluation method cannot simultaneously achieve video quality prediction accuracy and maintain low computational complexity. The method includes the following steps:
In the first step, for each pixel of the video image, the horizontal gradient, the vertical gradient, and the temporal gradient are calculated, and the spatiotemporal gradient of the pixel is calculated on this basis.
Here, the horizontal gradient G_h of a pixel is calculated using the template shown in FIG. 3, the vertical gradient G_v using the template shown in FIG. 4, and the temporal gradient G_t using the template shown in FIG. 5.
需要注意的是,图3-图5中的梯度计算模板只是一种可选方案,实际中可以使用多种梯度计算模板作为替代方案。此后,按照公式(1-1)计算该像素的空时域梯度。It should be noted that the gradient calculation template in Figures 3 - 5 is only an alternative, and various gradient calculation templates can be used as an alternative in practice. Thereafter, the spatial time domain gradient of the pixel is calculated according to the formula (1-1).
In the second step, for each image block, the standard deviation of the spatiotemporal gradients of the pixels within the block is computed.
Here, in an implementation, the standard deviation σ of the pixel spatiotemporal gradient values within each image block can be computed in units of 8×8 or 16×16 blocks, and σ is taken as the basis for characterizing the spatiotemporal complexity of the content of the image block.
第三步,统计每个图像块的均方误差(Mean Square Error,MSE),将MSE除以该图像块的空时域梯度标准差并取对数,作为该图像块的最终失真度量值。In the third step, the mean square error (MSE) of each image block is counted, and the MSE is divided by the space-time gradient standard deviation of the image block and the logarithm is taken as the final distortion metric of the image block.
这里,采用传统的均方误差作为视频的客观失真计算准则,在此基础 之上按照公式(1-2)根据视频空时域内容复杂度σ对视频的最终失真判定D进行调整。Here, the traditional mean square error is used as the objective distortion calculation criterion of the video, and based on the formula (1-2), the final distortion determination D of the video is adjusted according to the video space-time domain content complexity σ.
下面对本发明实施例中提供的视频质量评价方法与现有技术中的PSNR算法、SSIM算法以及VQM算法进行对比。The video quality evaluation method provided in the embodiment of the present invention is compared with the PSNR algorithm, the SSIM algorithm, and the VQM algorithm in the prior art.
The PSNR algorithm compares the original video and the distorted video frame by frame, pixel by pixel. It is based on independent pixel differences and ignores the influence of sequence content and viewing conditions on the visibility of distortion, so its consistency with subjectively perceived video quality is often poor.
SSIM算法是一种提取原始视频和失真视频的图像的某一种像素统计特征并进行计算的算法。The SSIM algorithm is an algorithm for extracting and calculating a certain pixel statistical feature of an image of an original video and a distorted video.
The VQM algorithm decomposes the original video and the distorted video into different channels (such as edge, luminance, chrominance, and frame difference) through different filters (such as edge detection), and then extracts pixel-level features and statistical features of spatiotemporal image blocks. The pixel-level features include, for each pixel, the gradient magnitude, the gradient direction, the colour difference, the contrast, and the frame difference. The statistical features of the spatiotemporal image blocks are computed within 8*8 image blocks (the mean and standard deviation of the pixel-level features), so that the pixel-level features are aggregated into block-level features. Finally, the distortion of the video sequence is obtained by spatiotemporal integration and weighted fusion of the distortions of the individual features.
The prediction accuracy of an objective video quality evaluation algorithm is tested by measuring the correlation and the error between the video distortion predicted by the algorithm and the actual MOS values of the videos. The evaluation criteria used include the Pearson Linear Correlation Coefficient (PLCC) and the Spearman Rank Order Correlation Coefficient (SROCC). PLCC mainly evaluates the prediction accuracy of the algorithm, i.e. the linear fit between the predicted distortion and the real MOS; SROCC mainly evaluates the monotonicity of the prediction, i.e. whether the ordering of the predicted distortions is consistent with the ordering of the real MOS values.
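For reference, both coefficients are available in scipy; the score values below are purely hypothetical placeholders, not results from the LIVE experiments.

    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    # Hypothetical example values: predicted distortion scores for a handful
    # of test videos and the corresponding subjective DMOS scores.
    predicted = np.array([0.8, 1.4, 2.1, 2.9, 3.6])
    dmos = np.array([38.0, 45.0, 52.0, 61.0, 68.0])

    plcc, _ = pearsonr(predicted, dmos)    # prediction accuracy (linear fit)
    srocc, _ = spearmanr(predicted, dmos)  # prediction monotonicity (rank order)
    print(f"PLCC = {plcc:.3f}, SROCC = {srocc:.3f}")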
采用本发明所提出的视频质量评价算法的仿真实验结果如下:The simulation results of the video quality evaluation algorithm proposed by the present invention are as follows:
The video subjective quality evaluation dataset used in the experiments is the LIVE dataset of the Laboratory for Image and Video Engineering (LIVE) at the University of Texas, which contains four types of video distortion: wireless network transmission distortion, wired network transmission distortion, H.264 compression distortion, and MPEG-2 compression distortion. The comparison methods in the experiments include the classic objective video quality evaluation methods PSNR and SSIM, as well as the video quality model VQM established by the National Telecommunications and Information Administration (NTIA).
本发明实施例提供的视频质量评价方法以及现有的各种主流视频质量评价算法与主观打分数据的PLCC相关系数如表1所示:The video quality evaluation method provided by the embodiment of the present invention and the PLCC correlation coefficient of the existing mainstream video quality evaluation algorithms and subjective scoring data are as shown in Table 1:
Table 1: PLCC correlation coefficients between the subjective scoring data and the results obtained by the video quality evaluation method provided by the embodiment of the present invention and by the PSNR, SSIM, and VQM algorithms.
(Table 1 is reproduced only as an image in the original publication.)
The SROCC correlation coefficients between the subjective scoring data and the video quality evaluation method provided by the embodiment of the present invention as well as the existing mainstream video quality evaluation algorithms are shown in Table 2:
Table 2: SROCC correlation coefficients between the subjective scoring data and the results obtained by the video quality evaluation method provided by the embodiment of the present invention and by the PSNR, SSIM, and VQM algorithms.
(Table 2 is reproduced only as an image in the original publication.)
The correlation between the video distortion predicted by the video quality evaluation algorithm of the present invention and the actual subjective DMOS scores of the LIVE dataset is shown in FIG. 6 (the abscissa is the video distortion predicted by the method of the present invention, and the ordinate is the actual subjective DMOS score of the video). The correlations between the video quality scores predicted by the three comparison methods VQM, PSNR, and SSIM and the actual subjective DMOS scores of the LIVE dataset are shown in FIG. 7 to FIG. 9, respectively.
It should be noted that, since the values calculated by the method provided by the embodiment of the present invention and by the VQM method are both distortion estimates of the video (a larger value indicates greater distortion), the abscissa and ordinate data in FIG. 6 and FIG. 7 are positively correlated. The values calculated by the PSNR and SSIM methods are quality estimates of the video (a smaller value indicates greater distortion), so the abscissa and ordinate data in FIG. 8 and FIG. 9 are negatively correlated.
The simulation results show that, compared with the prior art, the video quality evaluation method provided by the embodiment of the present invention obtains video quality predictions that are more consistent with the subjective perceptual quality of the human eye, while requiring only low computational complexity.
实施例四 Embodiment 4
本发明实施例提供一种视频质量评价装置,图10为本发明实施例四视频质量评价装置的组成结构示意图,如图10所示,所述装置1000包括:划分模块1001、第一确定模块1002、第二确定模块1003、第三确定模块1004,其中:The embodiment of the present invention provides a video quality evaluation apparatus. FIG. 10 is a schematic structural diagram of a video quality evaluation apparatus according to Embodiment 4 of the present invention. As shown in FIG. 10, the apparatus 1000 includes: a partitioning module 1001, and a first determining module 1002. a second determining module 1003, a third determining module 1004, wherein:
所述划分模块1001,配置为将需要进行质量评价的视频中每一帧图像按照预设大小划分图像块。The dividing module 1001 is configured to divide each frame image in the video that needs to be quality-evaluated into image blocks according to a preset size.
所述第一确定模块1002,配置为根据每一图像块的空时域梯度的标准 差和像素值的均方误差确定所述每一图像块的失真度量值。The first determining module 1002 is configured to determine a distortion metric value of each image block according to a standard deviation of a spatial time domain gradient of each image block and a mean square error of the pixel value.
这里,所述第一确定模块1002进一步包括:Here, the first determining module 1002 further includes:
第一确定单元,配置为确定所述图像块中每一像素点的空时域梯度;其中,所述图像块中至少包括一个像素点;a first determining unit, configured to determine a spatial time domain gradient of each pixel in the image block; wherein the image block includes at least one pixel;
Here, the first determining unit further includes: a first determining subunit, configured to determine the horizontal gradient, the vertical gradient, and the temporal gradient of each pixel in the image block; and a second determining subunit, configured to determine the spatiotemporal gradient of each pixel in the image block according to the horizontal gradient, the vertical gradient, and the temporal gradient of each pixel in the image block.
The second determining subunit further includes a determining sub-subunit, configured to determine the spatiotemporal gradient of each pixel in the image block according to formula (1-1), where G_st(k,i,j) is the spatiotemporal gradient of the pixel, G_h(k,i,j) is the horizontal gradient of the pixel, G_v(k,i,j) is the vertical gradient of the pixel, and G_t(k,i,j) is the temporal gradient of the pixel.
第二确定单元,配置为确定所述图像块的空时域梯度的标准差;a second determining unit, configured to determine a standard deviation of an empty time domain gradient of the image block;
第三确定单元,配置为确定所述图像块的像素值的均方误差;a third determining unit configured to determine a mean square error of a pixel value of the image block;
第四确定单元,配置为根据所述图像块的空时域梯度的标准差和像素值的均方误差,确定所述图像块的失真度量值。And a fourth determining unit configured to determine a distortion metric value of the image block according to a standard deviation of a spatial time domain gradient of the image block and a mean square error of the pixel value.
这里,所述第四确定单元进一步包括:Here, the fourth determining unit further includes:
第三确定子单元,配置为按照公式(1-2)确定所述每一图像块的失真度量值,其中,D为所述图像块的失真度量值,MSE为所述图像块的像素值的均方误差,σ为所述图像块的空时域梯度的标准差。a third determining subunit, configured to determine a distortion metric value of each image block according to formula (1-2), where D is a distortion metric value of the image block, and MSE is a pixel value of the image block The mean square error, σ is the standard deviation of the spatial time domain gradient of the image block.
所述第二确定模块1003,配置为根据所述每一帧图像所包含的图像块的失真度量值确定所述每一帧图像的失真度量值。The second determining module 1003 is configured to determine a distortion metric value of each frame image according to a distortion metric value of the image block included in each frame image.
Here, the second determining module includes: a fifth determining unit, configured to determine the average of the distortion metric values of all image blocks contained in each frame as the distortion metric value of that frame image;
所述第三确定模块1004,配置为根据所述每一帧图像的失真度量值确定所述视频的失真度量值。The third determining module 1004 is configured to determine a distortion metric value of the video according to the distortion metric value of each frame image.
这里,所述第三确定模块包括:第六确定单元,配置为将所述视频中所包含的所有帧的图像的失真度量值的平均值确定为所述视频的失真度量值。Here, the third determining module includes: a sixth determining unit configured to determine an average value of distortion metric values of images of all frames included in the video as a distortion metric value of the video.
这里需要指出的是:以上视频质量评价装置实施例的描述,与上述方法实施例的描述是类似的,具有同方法实施例相似的有益效果,因此不做赘述。对于本发明视频质量评价装置实施例中未披露的技术细节,请参照本发明方法实施例的描述而理解,为节约篇幅,因此不再赘述。It should be noted here that the description of the above embodiment of the video quality evaluation device is similar to the description of the above method embodiment, and has similar advantageous effects as the method embodiment, and therefore will not be described again. For the technical details that are not disclosed in the embodiment of the video quality evaluation device of the present invention, please refer to the description of the method embodiment of the present invention, and the details are not described herein.
需要说明的是,本发明实施例中,如果以软件功能模块的形式实现上述的视频质量评价方法,并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本发明各个实施例所述方法的全部或部分。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read Only Memory)、磁碟或者光盘等各种可以存储程序代码的介质。这样,本发明实施例不限制于任何特定的硬件和软件结合。It should be noted that, in the embodiment of the present invention, if the video quality evaluation method described above is implemented in the form of a software function module and sold or used as a standalone product, it may also be stored in a computer readable storage medium. Based on such understanding, the technical solution of the embodiments of the present invention may be embodied in the form of a software product in essence or in the form of a software product stored in a storage medium, including a plurality of instructions. A computer device (which may be a personal computer, server, or network device, etc.) is caused to perform all or part of the methods described in various embodiments of the present invention. The foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a mobile hard disk, a read only memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
对应地,本发明实施例提供一种视频质量评价设备,例如终端,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现所述的视频质量评价方法。Correspondingly, an embodiment of the present invention provides a video quality evaluation apparatus, such as a terminal, including a memory, a processor, and a computer program stored on the memory and operable on the processor, where the processor implements the program The video quality evaluation method described.
本发明实施例提供一种计算机可读存储介质,其上存储有计算机程序,其特征在于,该计算机程序被处理器执行时实现上述的视频质量评价方法。Embodiments of the present invention provide a computer readable storage medium having stored thereon a computer program, wherein the computer program is implemented by a processor to implement the video quality evaluation method described above.
需要说明的是,图11为本发明实施例中终端的一种硬件实体示意图, 如图11所示,该终端1100的硬件实体包括:处理器1101、通信接口1102和存储器1103,其中:It is to be noted that FIG. 11 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention. As shown in FIG. 11, the hardware entity of the terminal 1100 includes: a processor 1101, a communication interface 1102, and a memory 1103, where:
处理器1101通常控制终端1100的总体操作。The processor 1101 typically controls the overall operation of the terminal 1100.
通信接口1102可以使终端1100通过网络与其他终端或服务器通信。 Communication interface 1102 can cause terminal 1100 to communicate with other terminals or servers over a network.
The memory 1103 is configured to store instructions and applications executable by the processor 1101, and can also buffer data to be processed or already processed by the processor 1101 and by the modules of the terminal 1100 (for example, image data, audio data, voice communication data, and video communication data); it can be implemented by flash memory (FLASH) or random access memory (RAM).
需要说明的是,计算机程序(也被称为程序、软件、软件应用、脚本或代码)能够以任何编程语言形式(包括汇编语言或解释语言、说明性语言或程序语言)书写,并且能够以任何形式(包括作为独立程序,或者作为模块、组件、子程序、对象或其它适用于计算环境中的单元)部署。计算机程序可以但非必要地对应于文件系统中的文件。程序能够被存储在文件的保存其它程序或数据(例如,存储在标记语言文档中的一个或多个脚本)的部分中,在专用于所关注程序的单个文件中,或者在多个协同文件(例如,存储一个或多个模块、子模块或代码部分的文件)中。计算机程序能够被部署为在一个或多个计算机上执行,该一个或多个计算机位于一个站点处,或者分布在多个站点中且通过通信网络互连。It should be noted that a computer program (also referred to as a program, software, software application, script or code) can be written in any programming language (including assembly or interpreted language, descriptive language or programming language) and can be any Form (including as a stand-alone program, or as a module, component, subroutine, object, or other unit suitable for use in a computing environment). A computer program can, but does not necessarily, correspond to a file in a file system. The program can be stored in a portion of the file that holds other programs or data (eg, one or more scripts stored in the markup language document), in a single file dedicated to the program of interest, or in multiple collaborative files ( For example, storing one or more modules, submodules, or files in a code section). The computer program can be deployed to be executed on one or more computers located at one site or distributed across multiple sites and interconnected by a communication network.
说明书中描述的过程和逻辑流能够由一个或多个可编程处理器执行,该一个或多个可编程处理器执行一个或多个计算机程序以通过操作输入数据和生成输出来执行动作。上述过程和逻辑流还能够由专用逻辑电路执行,并且装置还能够被实现为专用逻辑电路,例如,FPGA或ASIC。The processes and logic flows described in the specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating input data and generating output. The above described processes and logic flows can also be performed by dedicated logic circuitry, and the apparatus can also be implemented as dedicated logic circuitry, such as an FPGA or ASIC.
适用于执行计算机程序的处理器例如包括通用微处理器和专用微处理器,以及任何数字计算机类型的任何一个或多个处理器。通常来说,处理器会从只读存储器或随机访问存储器或以上两者接收指令和数据。计算的 的元件是用于按照指令执行动作的处理器以及一个或多个用于存储指令和数据的存储器。通常来说,计算机还会包括一个或多个用于存储数据的大容量存储设备(例如,磁盘、磁光盘、或光盘),或者操作地耦接以从其接收数据或向其发送数据,或者两者均是。然而,计算机不需要具有这样的设备。而且,计算机能够被嵌入在另一设备中,例如,移动电话、个人数字助手(PDA)、移动音频播放器或移动视频播放器、游戏控制台、全球定位系统(GPS)接收机或移动存储设备(例如,通用串行总线(USB)闪盘),以上仅为举例。适用于存储计算机程序指令和数据的设备包括所有形式的非易失性存储器、媒体和存储设备,例如包括半导体存储设备(例如,EPROM、EEPROM和闪存设备)、磁盘(例如,内部硬盘或移动硬盘)、磁光盘、以及CD-ROM和DVD-ROM盘。处理器和存储器能够由专用逻辑电路补充或者包含到专用逻辑电路中。Processors suitable for the execution of a computer program include, for example, a general purpose microprocessor and a special purpose microprocessor, and any one or more processors of any type of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The calculated elements are a processor for performing actions in accordance with instructions and one or more memories for storing instructions and data. Generally, a computer also includes one or more mass storage devices (eg, magnetic disks, magneto-optical disks, or optical disks) for storing data, or is operatively coupled to receive data from or send data thereto, or Both are. However, the computer does not need to have such a device. Moreover, the computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio player or mobile video player, a game console, a global positioning system (GPS) receiver, or a mobile storage device. (For example, Universal Serial Bus (USB) flash drive), the above is just an example. Suitable devices for storing computer program instructions and data include all forms of non-volatile memory, media and storage devices, including, for example, semiconductor storage devices (eg, EPROM, EEPROM, and flash memory devices), magnetic disks (eg, internal hard drives or removable hard drives). ), magneto-optical disks, and CD-ROM and DVD-ROM discs. The processor and memory can be supplemented by or included in dedicated logic circuitry.
说明书中描述的主题的实施方式能够以计算系统来实现。该计算系统包括后端组件(例如,数据服务器),或者包括中间件组件(例如,应用服务器),或者包括前端组件(例如,具有图形用户接口或网页浏览器的客户端计算机,用户通过该客户端计算机能够与本申请描述的主题的实施方式交互),或者包括上述后端组件、中间件组件或前端组件中的一个或多个的任何结合。系统的组件能够通过任何数字数据通信形式或介质(例如,通信网络)来互连。通信网络的示例包括局域网(LAN)和广域网(WAN)、互连网络(例如,互联网)以及端对端网络(例如,自组织端对端网络)。Embodiments of the subject matter described in the specification can be implemented in a computing system. The computing system includes a backend component (eg, a data server), or includes a middleware component (eg, an application server), or includes a front end component (eg, a client computer with a graphical user interface or web browser through which the user passes) The end computer can interact with an embodiment of the subject matter described herein, or any combination of one or more of the above described backend components, middleware components, or front end components. The components of the system can be interconnected by any form of digital data communication or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs) and wide area networks (WANs), interconnected networks (e.g., the Internet), and end-to-end networks (e.g., ad hoc end-to-end networks).
本申请中描述的特征在智能电视模块上(或连接电视模块、混合电视模块等)实现。智能电视模块可以包括被配置成为互联网连接性集成更多传统电视节目源(例如,经由线缆、卫星、空中或其它信号接收的节目源)的处理电路。智能电视模块可以被物理地集成到电视机中或者可以包括独立的设备,例如,机顶盒、蓝光或其它数字媒体播放器、游戏控制台、酒 店电视系统以及其它配套设备。智能电视模块可以被配置为使得观看者能够搜索并找到在网络上、当地有线电视频道上、卫星电视频道上或存储在本地硬盘上的视频、电影、图片或其它内容。机顶盒(STB)或机顶盒单元(STU)可以包括信息适用设备,信息适用设备包括调谐器并且连接到电视机和外部信号源上,从而将信号调谐成之后将被显示在电视屏幕或其它播放设备上的内容。智能电视模块可以被配置成为多种不同的应用(例如,网页浏览器和多个流媒体服务、连接线缆或卫星媒体源、其它网络“频道”等)提供家用屏幕或包括图标的顶级屏幕。智能电视模块还可以被配置成为用户提供电子节目。智能电视模块的配套应用可以在移动终端上运行以向用户提供与可用节目有关的附加信息,从而使得用户能够控制智能电视模块等。在替选实施例中,该特征可以被实现在便携式计算机或其它个人计算机(PC)、智能手机、其它移动电话、手持计算机、平板PC或其它计算设备上。The features described in this application are implemented on a smart television module (or connected to a television module, hybrid television module, etc.). The smart TV module can include processing circuitry configured to integrate more traditional television program sources (eg, program sources received via cable, satellite, air, or other signals) with Internet connectivity. The smart TV module can be physically integrated into a television set or can include stand-alone devices such as set top boxes, Blu-ray or other digital media players, game consoles, hotel television systems, and other ancillary equipment. The smart TV module can be configured to enable viewers to search for and find videos, movies, pictures or other content on the network, on local cable channels, on satellite television channels, or on local hard drives. A set top box (STB) or set top box unit (STU) may include an information-applicable device that includes a tuner and is coupled to the television set and an external source to tune the signal to be displayed on a television screen or other playback device. Content. The smart TV module can be configured to provide a home screen or a top screen including icons for a variety of different applications (eg, web browsers and multiple streaming services, connecting cable or satellite media sources, other network "channels", etc.). The smart TV module can also be configured to provide electronic programming to the user. The companion application of the smart TV module can be run on the mobile terminal to provide the user with additional information related to the available programs, thereby enabling the user to control the smart TV module and the like. In alternative embodiments, this feature can be implemented on a portable computer or other personal computer (PC), smart phone, other mobile phone, handheld computer, tablet PC, or other computing device.
虽然说明书包含许多实施细节,但是这些实施细节不应当被解释为对任何权利要求的范围的限定,而是对专用于特定实施方式的特征的描述。说明书中在独立实施方式前后文中描述的特定的特征同样能够以单个实施方式的结合中实现。相反地,单个实施方式的上下文中描述的各个特征同样能够在多个实施方式中单独实现或者以任何合适的子结合中实现。而且,尽管特征可以在上文中描述为在特定结合中甚至如最初所要求的作用,但是在一些情况下所要求的结合中的一个或多个特征能够从该结合中去除,并且所要求的结合可以为子结合或者子结合的变型。The description contains a number of implementation details, which are not to be construed as limiting the scope of any claims, but rather to the description of the features of the particular embodiments. Particular features described in the specification before and after the independent embodiments can also be implemented in a combination of a single embodiment. Conversely, various features that are described in the context of a single embodiment can be implemented in the various embodiments individually or in any suitable sub-combination. Moreover, although features may be described above as even in the particular combination, even as originally claimed, in some cases one or more of the required combinations can be removed from the combination and the required combination It can be a sub-binding or a sub-combination variant.
类似地,虽然在附图中以特定次序描绘操作,但是这不应当被理解为要求该操作以所示的特定次序或者以相继次序来执行,或者所示的全部操作都被执行以达到期望的结果。在特定环境下,多任务处理和并行处理可以是有利的。此外,上述实施方式中各个系统组件的分离不应当被理解为 要求在全部实施方式中实现该分离,并且应当理解的是所描述的程序组件和系统通常能够被共同集成在单个软件产品中或被封装为多个软件产品。Similarly, although the operations are depicted in a particular order in the figures, this should not be construed as requiring that the operations are performed in the particular order shown, or in a sequential order, or all of the operations illustrated are performed to achieve the desired. result. Multitasking and parallel processing can be advantageous in certain circumstances. Furthermore, the separation of various system components in the above-described embodiments should not be understood as requiring that the separation be implemented in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or Packaged into multiple software products.
因此,已经对主题的特定实施方式进行了描述。其它实施方式在以下权利要求的范围内。在一些情况下,权利要求中所限定的动作能够以不同的次序执行并且仍能够达到期望的结果。此外,附图中描绘的过程并不必须采用所示出的特定次序、或相继次序来达到期望的结果。在特定实施方式中,可以使用多任务处理或并行处理。Thus, specific embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions defined in the claims can be performed in a different order and still achieve the desired results. Moreover, the processes depicted in the figures are not necessarily in the particular order shown, or in a sequential order to achieve the desired results. In particular embodiments, multitasking or parallel processing can be used.
工业实用性Industrial applicability
In the embodiment of the present invention, a first video and a second video are first acquired, and each frame image of the first video and of the second video is divided into image blocks of a preset size; a distortion metric value is then determined for each image block in the first video according to the standard deviation of the spatiotemporal gradients of the block and the mean square error of its pixel values; a distortion metric value is then determined for each frame image in the first video according to the distortion metric values of the image blocks it contains; and finally the distortion metric value of the video is determined according to the distortion metric values of the frame images of the first video. In this way, the accuracy of the video quality estimate is preserved while the computational complexity is reduced, so the method can be applied in video image processing systems with strict real-time requirements, such as video encoders.

Claims (12)

  1. 一种视频质量评价方法,所述方法包括:A video quality evaluation method, the method comprising:
    将需要进行质量评价的视频中每一帧图像按照预设大小划分图像块;Each frame of the video in which the quality evaluation is required is divided into image blocks according to a preset size;
    根据每一图像块的空时域梯度的标准差和像素值的均方误差确定所述每一图像块的失真度量值;Determining a distortion metric value of each image block according to a standard deviation of a spatial time domain gradient of each image block and a mean square error of the pixel value;
    根据所述每一帧图像所包含的图像块的失真度量值确定所述每一帧图像的失真度量值;Determining a distortion metric value of each frame image according to a distortion metric value of the image block included in each frame image;
    根据所述每一帧图像的失真度量值确定所述视频的失真度量值。A distortion metric value of the video is determined based on a distortion metric of the image of each frame.
  2. 根据权利要求1中所述的方法,所述根据每一图像块的空时域梯度的标准差和像素值的均方误差确定所述每一图像块的失真度量值,包括:The method according to claim 1, wherein determining the distortion metric value of each image block according to a standard deviation of a spatial time domain gradient of each image block and a mean square error of the pixel value comprises:
    确定所述图像块中每一像素点的空时域梯度;其中,所述图像块中至少包括一个像素点;Determining a spatial time domain gradient of each pixel in the image block; wherein the image block includes at least one pixel;
    确定所述图像块的空时域梯度的标准差;Determining a standard deviation of a spatial time domain gradient of the image block;
    确定所述图像块的像素值的均方误差;Determining a mean square error of pixel values of the image block;
    根据所述图像块的空时域梯度的标准差和像素值的均方误差,确定所述图像块的失真度量值。A distortion metric value of the image block is determined according to a standard deviation of a spatial time domain gradient of the image block and a mean square error of the pixel value.
  3. 根据权利要求2中所述的方法,所述确定图像块中每一像素点的空时域梯度,包括:The method of claim 2, wherein determining the spatial time domain gradient for each pixel in the image block comprises:
    确定所述图像块中每一像素点的水平梯度、垂直梯度和时域梯度;Determining a horizontal gradient, a vertical gradient, and a time domain gradient for each pixel in the image block;
    根据所述图像块中每一像素点的水平梯度、垂直梯度和时域梯度确定所述图像块中每一像素点的空时域梯度。A spatial time domain gradient of each pixel in the image block is determined according to a horizontal gradient, a vertical gradient, and a time domain gradient of each pixel in the image block.
  4. 根据权利要求3中所述的方法,所述根据所述图像块中每一像素点的水平梯度、垂直梯度和时域梯度确定所述图像块中每一像素点的空 时域梯度,包括:The method according to claim 3, wherein determining a spatial time gradient of each pixel in the image block according to a horizontal gradient, a vertical gradient, and a time domain gradient of each pixel in the image block comprises:
    According to formula (1-1), determining the spatiotemporal gradient of each pixel in the image block, wherein G_st(k,i,j) is the spatiotemporal gradient of the pixel, G_h(k,i,j) is the horizontal gradient of the pixel, G_v(k,i,j) is the vertical gradient of the pixel, and G_t(k,i,j) is the temporal gradient of the pixel.
  5. 根据权利要求1中所述的方法,所述根据所述图像块的空时域梯度标准差和像素值的均方误差,确定所述图像块的失真度量值,包括:The method according to claim 1, wherein determining the distortion metric value of the image block according to a space-time gradient standard deviation of the image block and a mean square error of the pixel value comprises:
    Determining the distortion metric value of each image block according to formula (1-2), D = log(MSE / σ), wherein D is the distortion metric value of the image block, MSE is the mean square error of the pixel values of the image block, and σ is the standard deviation of the spatiotemporal gradients of the image block.
  6. The method according to claim 1, wherein determining the distortion metric value of the image of each frame according to the distortion metric values of the image blocks contained in the image of each frame in the video comprises: determining the average of the distortion metric values of all image blocks contained in each frame as the distortion metric value of that frame image;
    correspondingly, determining the distortion metric value of the video according to the distortion metric value of the image of each frame comprises: determining the average of the distortion metric values of the images of all frames contained in the video as the distortion metric value of the video.
  7. A video quality evaluation apparatus, the apparatus comprising:
    a dividing module, configured to divide each frame of image of a video whose quality is to be evaluated into image blocks of a preset size;
    a first determining module, configured to determine a distortion metric value of each image block according to the standard deviation of the spatio-temporal gradient of the image block and the mean square error of its pixel values;
    a second determining module, configured to determine a distortion metric value of each frame of image according to the distortion metric values of the image blocks contained in that frame;
    a third determining module, configured to determine a distortion metric value of the video according to the distortion metric value of each frame of image.
  8. The apparatus according to claim 7, wherein the first determining module comprises:
    a first determining unit, configured to determine the spatio-temporal gradient of each pixel in the image block, wherein the image block contains at least one pixel;
    a second determining unit, configured to determine the standard deviation of the spatio-temporal gradient of the image block;
    a third determining unit, configured to determine the mean square error of the pixel values of the image block;
    a fourth determining unit, configured to determine the distortion metric value of the image block according to the standard deviation of the spatio-temporal gradient of the image block and the mean square error of the pixel values.
  9. The apparatus according to claim 8, wherein the first determining unit comprises:
    a first determining subunit, configured to determine a horizontal gradient, a vertical gradient, and a temporal gradient for each pixel in the image block;
    a second determining subunit, configured to determine the spatio-temporal gradient of each pixel in the image block according to the horizontal gradient, the vertical gradient, and the temporal gradient of that pixel.
  10. The apparatus according to claim 7, wherein the second determining module comprises: a fifth determining unit, configured to determine the average of the distortion metric values of all the image blocks contained in each frame as the distortion metric value of that frame;
    Correspondingly, the third determining module comprises: a sixth determining unit, configured to determine the average of the distortion metric values of the images of all the frames contained in the video as the distortion metric value of the video.
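To show how the modules of claims 7 to 10 fit together, the sketch below composes the earlier editor's functions (directional_gradients, spatio_temporal_gradient, block_distortion, frame_distortion, video_distortion) in that module structure. The 16x16 block size, the use of the distorted frames for the gradients, and the skipping of the first frame (which has no previous frame for the temporal gradient) are all assumptions, not requirements of the claims.

    class VideoQualityEvaluator:
        """Composes the sketches above in the module structure of claims 7-10."""

        def __init__(self, block_size=16):
            # preset block size; 16x16 is a hypothetical choice
            self.block_size = block_size

        def evaluate(self, ref_frames, dist_frames):
            # ref_frames / dist_frames: lists of 2-D numpy luma arrays of equal shape
            b = self.block_size
            frame_scores = []
            for i in range(1, len(dist_frames)):
                gx, gy, gt = directional_gradients(dist_frames[i], dist_frames[i - 1])
                st = spatio_temporal_gradient(gx, gy, gt)
                h, w = dist_frames[i].shape
                block_scores = []
                # dividing module: split the frame into blocks of the preset size
                for y in range(0, h - h % b, b):
                    for x in range(0, w - w % b, b):
                        block_scores.append(block_distortion(          # first determining module
                            ref_frames[i][y:y + b, x:x + b],
                            dist_frames[i][y:y + b, x:x + b],
                            st[y:y + b, x:x + b]))
                frame_scores.append(frame_distortion(block_scores))    # second determining module
            return video_distortion(frame_scores)                      # third determining module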
  11. A video quality evaluation device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the video quality evaluation method according to any one of claims 1 to 6.
  12. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the video quality evaluation method according to any one of claims 1 to 6.
PCT/CN2017/119261 2017-02-24 2017-12-28 Video quality evaluation method, apparatus and device, and storage medium WO2018153161A1 (en)

Applications Claiming Priority (2)
Application Number Priority Date Filing Date Title
CN201710102413.4A CN108513132B (en) 2017-02-24 2017-02-24 Video quality evaluation method and device
CN201710102413.4 2017-02-24

Publications (1)
Publication Number Publication Date
WO2018153161A1 (en)

Family
ID=63253122

Family Applications (1)
Application Number Title Priority Date Filing Date
PCT/CN2017/119261 WO2018153161A1 (en) 2017-02-24 2017-12-28 Video quality evaluation method, apparatus and device, and storage medium

Country Status (2)
Country Link
CN (1) CN108513132B (en)
WO (1) WO2018153161A1 (en)

Families Citing this family (3)
* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110401832B * 2019-07-19 2020-11-03 南京航空航天大学 Panoramic video objective quality assessment method based on space-time pipeline modeling
CN111083468B * 2019-12-23 2021-08-20 杭州小影创新科技股份有限公司 Short video quality evaluation method and system based on image gradient
CN114332088B * 2022-03-11 2022-06-03 电子科技大学 Motion estimation-based full-reference video quality evaluation method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742355A (en) * 2009-12-24 2010-06-16 厦门大学 Method for partial reference evaluation of wireless videos based on space-time domain feature extraction
CN102984540A (en) * 2012-12-07 2013-03-20 浙江大学 Video quality assessment method estimated on basis of macroblock domain distortion degree
CN103458265A (en) * 2013-02-01 2013-12-18 深圳信息职业技术学院 Method and device for evaluating video quality
US20140015923A1 (en) * 2012-07-16 2014-01-16 Cisco Technology, Inc. Stereo Matching for 3D Encoding and Quality Assessment
CN106028026A (en) * 2016-05-27 2016-10-12 宁波大学 Effective objective video quality evaluation method based on temporal-spatial structure

Also Published As

Publication number Publication date
CN108513132B (en) 2020-11-10
CN108513132A (en) 2018-09-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 17897421
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: pct application non-entry in european phase
Ref document number: 17897421
Country of ref document: EP
Kind code of ref document: A1