WO2020118588A1 - Method, apparatus, device and storage medium for predicting an image-level JND threshold - Google Patents

Method, apparatus, device and storage medium for predicting an image-level JND threshold Download PDF

Info

Publication number
WO2020118588A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
compressed
tested
compressed image
perceptual distortion
Prior art date
Application number
PCT/CN2018/120749
Other languages
English (en)
French (fr)
Inventor
张云
刘焕华
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Priority to EP18943329.5A priority Critical patent/EP3896965A4/en
Priority to US17/312,736 priority patent/US20220051385A1/en
Priority to PCT/CN2018/120749 priority patent/WO2020118588A1/zh
Publication of WO2020118588A1 publication Critical patent/WO2020118588A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/94Hardware or software architectures specially adapted for image or video understanding
    • G06V10/95Hardware or software architectures specially adapted for image or video understanding structured as a network, e.g. client-server architectures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection

Definitions

  • the invention belongs to the technical field of image/video compression, and particularly relates to a method, apparatus, device and storage medium for predicting an image-level JND threshold.
  • the human visual system's perception of visual information is a non-uniform and non-linear information processing process.
  • the human eye exhibits a certain visual psychological redundancy when observing an image, so that features or content in the image are selectively ignored or masked.
  • owing to these masking characteristics, the human eye cannot perceive subtle changes in image pixels that fall below a certain threshold, i.e. changes the human eye cannot perceive; this threshold is the human eye's just noticeable distortion (JND) threshold, and it represents the visual redundancy in the image.
  • JND Just Noticeable Distortion
  • the JND threshold describes the minimum distortion of the image that the human eye can perceive and reflects the perception ability and sensitivity of the human visual system. Therefore, the JND threshold is widely used in image/video processing, such as image/video encoding, streaming media applications and Watermark technology, etc.
  • JND models have been proposed. These JND models can be roughly divided into two categories: JND models based on the pixel domain and JND models based on the frequency domain.
  • the pixel domain JND model mainly considers the effect of the brightness adaptive effect and the spatial masking effect on the JND threshold.
  • Wu et al. adopted spatial structural regularity to measure the spatial masking effect and, combined with the luminance adaptation effect, proposed a new JND model in 2012.
  • Wu et al. argued that a disorderly concealment effect causes the JND threshold of disordered regions to be higher than that of orderly regions, and therefore proposed a JND model based on the free-energy principle in 2013; in the same year, Wu et al. also combined luminance adaptation with structural uncertainty to propose a pattern-masking function and, further, a JND model based on the pattern-masking effect.
  • the frequency-domain JND model mainly considers the contrast sensitivity function (CSF), the contrast masking effect, the luminance adaptation effect and the foveal masking effect; for example, Z. Wei et al. proposed a CSF-based spatio-temporal JND model in 2009.
  • CSF Contrast Sensitivity Function
  • Bae et al. considered the influence of different frequencies on the adaptive lighting, and then proposed a new lighting adaptive JND model
  • H. Ko et al. computed the contrast masking effect from texture complexity and proposed in 2014 a JND model that adapts to discrete cosine transform (DCT) kernels of arbitrary size; Ki et al. considered the energy loss caused by quantization during compression and proposed a learning-based JND prediction method in 2018.
  • DCT Discrete Cosine Transform
  • the pixel domain JND model calculates a JND threshold for each pixel of the image, while the frequency domain JND model converts the image from the pixel domain to the frequency domain, and then calculates a JND threshold for each sub-frequency.
  • the pixel-domain JND model and the frequency-domain JND model are both local JND threshold estimation models that only estimate the JND threshold of a single pixel/frequency; however, the quality of the entire image is determined by certain key regions and by the regions of poorest quality, so these models have difficulty estimating the human eye's JND threshold for the whole image.
  • the traditional JND model mainly considers the estimation of the JND threshold of the original image, and does not consider the estimation of the JND threshold of images of arbitrary quality.
  • the traditional JND model is therefore limited in practical applications; for this reason, predicting the JND threshold of images of arbitrary quality is particularly important.
  • the purpose of the present invention is to provide an image-level JND threshold prediction method, apparatus, device and storage medium, aiming to solve the problem that the prior art cannot provide an effective image-level JND threshold prediction method, which results in a large deviation in JND threshold prediction.
  • the present invention provides an image-level JND threshold prediction method, which includes the following steps:
  • the image to be tested and the compressed image in the compressed image set corresponding to the image to be tested are subjected to perceptual distortion discrimination to obtain a set of perceptual distortion discrimination results.
  • the judgment result of perceptual distortion includes true value and false value;
  • the present invention provides an image-level JND threshold prediction device, the device comprising:
  • a perceptual distortion discriminating unit which is used to perform perceptual distortion discrimination on the image to be tested and the compressed image in the compressed image set corresponding to the image to be tested through a trained multi-class perceptual distortion discriminator to obtain a set of perceptual distortion discrimination results,
  • the perceptual distortion discrimination results in the perceptual distortion discrimination result set include true value and false value;
  • the JND threshold prediction unit is configured to perform error tolerance processing on the set of perceptual distortion determination results through a preset image-level JND search strategy to predict the image-level JND threshold of the image to be measured.
  • the present invention also provides a computing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the image-level JND threshold prediction method described above.
  • the present invention also provides a computer-readable storage medium that stores a computer program which, when executed by a processor, implements the steps of the image-level JND threshold prediction method described above.
  • a trained multi-class perceptual distortion discriminator performs perceptual distortion discrimination on the image to be tested and the compressed images in the corresponding compressed image set to obtain a set of perceptual distortion discrimination results; an image-level JND search strategy then performs fault-tolerant processing on this set to predict the image-level JND threshold of the image to be tested, thereby reducing the prediction deviation of the image-level JND threshold, improving the accuracy of image-level JND threshold prediction, and making the predicted JND threshold closer to the human visual system's perception of the quality of the entire image.
  • FIG. 1 is an implementation flowchart of an image-level JND threshold prediction method provided in Embodiment 1 of the present invention
  • FIG. 2 is a flow chart of implementation of discriminating distortion of an image to be tested and a compressed image according to Embodiment 2 of the present invention
  • FIG. 3 is a flowchart of the fault-tolerant processing of the set of perceptual distortion discrimination results provided by Embodiment 3 of the present invention;
  • FIG. 4 is a schematic diagram of a sliding window provided by Embodiment 3 of the present invention;
  • FIG. 5 is a schematic structural diagram of an apparatus for predicting an image-level JND threshold provided by Embodiment 4 of the present invention;
  • FIG. 6 is a schematic structural diagram of an apparatus for predicting an image-level JND threshold according to Embodiment 5 of the present invention.
  • FIG. 7 is a schematic structural diagram of a computing device according to Embodiment 6 of the present invention.
  • FIG. 1 shows the implementation flow of the image-level JND threshold prediction method provided in Embodiment 1 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown. The details are as follows:
  • a trained multi-class perceptual distortion discriminator performs perceptual distortion discrimination on the image to be tested and the compressed image in the compressed image set corresponding to the image to be tested, to obtain a set of perceptual distortion discrimination results.
  • the embodiments of the present invention are applicable to image/video processing platforms, systems, or devices, such as personal computers, servers, and so on.
  • the images to be tested are compressed by different compression methods to obtain compressed images of different qualities, and all the compressed images of different qualities form a compressed image set.
  • the image to be tested x and the i-th compressed image x_i are input into the trained multi-class perceptual distortion discriminator, which performs perceptual distortion discrimination on them to obtain a perceptual distortion discrimination result.
  • the perceptual distortion discrimination results corresponding to all the compressed images constitute the set of perceptual distortion discrimination results, where a perceptual distortion discrimination result is either a true value (for example: 1) or a false value (for example: 0).
  • before discriminating perceptual distortion of the image to be tested and of the compressed images in the corresponding compressed image set through the trained multi-class perceptual distortion discriminator, it is preferable to construct a multi-class perceptual distortion discriminator and train it with supervised, semi-supervised or unsupervised image training samples, so that it can discriminate whether two images with the same content but different quality exhibit perceptual distortion.
  • when training the multi-class perceptual distortion discriminator, it is preferable to construct a two-class perceptual quality discriminator from a convolutional neural network, a linear regression function and a logistic regression function, so that the two-class perceptual quality discriminator constitutes the multi-class perceptual distortion discriminator; the two-class perceptual quality discriminator is learned on pre-generated training image samples, and the first parameter set of the convolutional neural network, the second parameter set of the linear regression function and the third parameter set of the logistic regression function are adjusted according to the sample labels of the training image samples, so that the learned two-class perceptual quality discriminator realizes the perceptual distortion discrimination of the image to be tested and the compressed images in the corresponding compressed image set.
  • the training of the multi-class perceptual distortion discriminator is thus decomposed into the training of the two-class perceptual quality discriminator, which improves the training speed and efficiency of the discriminator model.
  • the following steps are used to learn the two-class perceptual quality discriminator:
  • the training image samples contain positive and negative image samples, denoted as ⁇ x t , y t ⁇ , x t is the sample image data, and x t Including the original image sample and the compressed image sample set corresponding to the original image sample, y t is the sample label of the sample image data;
  • CNN Convolutional Neural Network
  • when the perceptual distortion discrimination result is inconsistent with the corresponding sample label, the first parameter set of the convolutional neural network, the second parameter set of the linear regression function and the third parameter set of the logistic regression function are adjusted, and the procedure jumps back to step 4) to continue learning the two-class perceptual quality discriminator, until the perceptual distortion discrimination result is consistent with the corresponding sample label or the number of learning iterations reaches the preset iteration threshold.
  • the training problem of the multi-class perceptual distortion discriminator is converted into the training problem of the two-class perceptual quality discriminator, thereby improving the training of the multi-class perceptual distortion discriminator Speed and efficiency, and reduces the difficulty of subsequent prediction of image-level JND threshold.
  • the learning rate is initialized to 1×10^-4 and the Adam algorithm is used as the gradient descent method, while the mini-batch size is set to 4; after each mini-batch is processed, the first parameter set, the second parameter set and the third parameter set are updated, which improves the training speed and efficiency.
  • step S102 an error-tolerant process is performed on the set of perceptual distortion judgment results through a preset image-level JND search strategy to predict the image-level JND threshold of the image to be measured.
  • the multi-category perceptual distortion discriminator has a misjudgment in the perceptual distortion discrimination between the image to be measured and the compressed image, which results in an inaccurate perceptual distortion discrimination result. Therefore, through a preset image-level JND search strategy Error-tolerant processing is performed on the set of perceptual distortion discrimination results, and finally the image-level JND threshold of the image to be tested is predicted, thereby improving the accuracy of predicting the image-level JND threshold.
  • a trained multi-class perceptual distortion discriminator performs perceptual distortion discrimination on the image to be tested and the compressed images in the corresponding compressed image set to obtain a set of perceptual distortion discrimination results; the image-level JND search strategy then performs fault-tolerant processing on this set to predict the image-level JND threshold of the image to be tested, thereby reducing the prediction deviation of the image-level JND threshold, improving the accuracy of image-level JND threshold prediction and making the predicted JND threshold closer to the human visual system's perception of the quality of the entire image.
  • FIG. 2 shows the implementation process of perceptual distortion discrimination of the image to be measured and the compressed image in step S101 provided in the second embodiment of the present invention. For ease of explanation, only the relevant parts of the embodiment of the present invention are shown. Described as follows:
  • step S201 the image to be tested and the compressed image are divided into image blocks according to the preset image block size, respectively, to obtain a corresponding set of image blocks to be tested and a set of compressed image blocks.
  • the image to be tested x and the i-th compressed image x_i corresponding to the image to be tested are divided into image blocks, to obtain the corresponding set of image blocks to be tested and set of compressed image blocks.
  • the image blocks to be tested and the compressed image blocks are arranged in the same order.
  • if the j-th image block obtained by dividing the image to be tested is P_{x,j}, then the block of the compressed image x_i at the same image position as P_{x,j} in the image to be tested x is the j-th compressed image block.
  • the size of the image block is 32 ⁇ 32, so as to avoid that the image block is too large or too small and reduce the effect of subsequent feature extraction on the image block.
  • step S202 according to the position of the image block, a preset number of corresponding image blocks to be tested and compressed image blocks are selected from the set of image blocks to be tested and the set of compressed image blocks, respectively.
  • a preset number of corresponding image blocks to be tested and compressed image blocks are randomly selected from the set of image blocks to be tested and the set of compressed image blocks, respectively, and the selected image blocks to be tested and compressed image blocks satisfy the
  • the position of the measured image block in the image to be measured is the same as the position of the compressed image block in the compressed image.
  • the number of the selected image block to be tested and the compressed image block are 32, respectively, so as to avoid too many or too few image blocks for feature extraction and reduce the effect of subsequent feature extraction on the image blocks.
  • step S203 feature extraction is performed on the selected image block to be tested and the compressed image block through a preset convolutional neural network to obtain the corresponding feature set of the image block to be tested and the compressed image block feature set.
  • the network structure of the convolutional neural network is that each convolutional layer is followed by an activation layer and every two convolutional layers are followed by a pooling layer, thereby improving the saliency of the features extracted from the image blocks to be tested and the compressed image blocks.
  • the convolutional neural network has 10 convolutional layers, a convolution kernel size of 3 and a convolution stride of 2, thereby further improving the saliency of the features extracted from the image blocks to be tested and the compressed image blocks.
  • the activation function of the convolutional neural network uses the rectified linear unit (ReLU), and the pooling method uses max pooling, thereby improving the calculation speed and convergence speed of the convolutional neural network.
  • ReLU Rectified Linear Unit
  • step S204, feature fusion is performed on the image block features to be tested in the feature set of image blocks to be tested and the compressed image block features in the feature set of compressed image blocks according to a preset feature fusion scheme, to obtain a fusion feature set.
  • one of the preset feature fusion schemes is used to fuse the l-th image block feature to be tested F_{x,l} in the feature set {F_{x,1}, F_{x,2}, ..., F_{x,N}} with the corresponding l-th compressed image block feature in the compressed image block feature set, yielding a fusion feature set {F'_1, F'_2, ..., F'_N}, where N is the number of selected image blocks to be tested and compressed image blocks.
  • the feature fusion scheme combines the features of the image blocks to be tested with the corresponding compressed image block features, thereby improving feature saliency.
  • step S205 the compressed image block is evaluated for quality through a preset linear regression function according to the fusion feature set, and a corresponding quality score set is obtained.
  • the quality of each compressed image block in the compressed image block set is evaluated by an arbitrary linear regression function (for example: a Support Vector Machine (SVM)) to obtain a corresponding quality score; for example, the quality score of the j-th compressed image block is recorded as S_j.
  • SVM Support Vector Machine
  • the quality scores of all compressed image blocks form a quality score set, recorded as {S_1, S_2, ..., S_N}.
  • a multi-layer perceptron (Multilayer Perceptron, MLP) is used as a linear regression function, and the number of intermediate layers of the multi-layer perceptron is set to 1, thereby improving the accuracy of the quality score.
  • MLP Multilayer Perceptron
  • step S206 according to the quality score set, a preset logistic regression function is used to determine whether there is perceptual distortion between the image to be tested and the compressed image, and a perceptual distortion discrimination result is obtained.
  • given the quality score set {S_1, S_2, ..., S_N} of the compressed image blocks, a logistic regression function is used to map {S_1, S_2, ..., S_N} to a value between 0 and 1, recorded as r; when r ≥ 0.5, the compressed image x_i is considered to be perceptually distorted relative to the image to be tested x and the true value (1) is output, otherwise the false value (0) is output.
  • N is the number of selected image blocks to be tested and compressed image blocks
  • Ψ(.) is the sigmoid function
  • w_i is the weight of the i-th compressed image block
  • the weights of all compressed image blocks constitute the third parameter set of the logistic regression function
  • b is the offset parameter of the logistic regression function.
  • the image to be tested and the compressed image are first divided into image blocks; feature extraction and feature fusion are then performed on the divided image blocks to be tested and compressed image blocks; finally, the compressed image blocks are scored for quality according to the fused features, and the perceptual distortion discrimination result of the compressed image and the image to be tested is obtained, thereby improving the accuracy of the perceptual distortion discrimination results.
  • FIG. 3 shows an implementation process of performing fault tolerance processing on the set of perceptual distortion judgment results in step S102 in Embodiment 1 provided by Embodiment 3 of the present invention.
  • step S301, according to the compressed image sequence corresponding to the set of perceptual distortion discrimination results, a sliding window of a preset window size is slid in a preset sliding direction, and the number of compressed images within the sliding window whose perceptual distortion discrimination result is a true value is counted; the sliding direction is from right to left or from left to right.
  • each perceptual distortion discrimination result in the set corresponds to one compressed image.
  • the compressed image sequence x_1, x_2, ..., x_N corresponding to the set of perceptual distortion discrimination results and the perceptual distortion discrimination results form an XY coordinate system: the compressed image sequence x_1, x_2, ..., x_N constitutes the X-axis coordinate points, and the true value (1) and false value (0) of the perceptual distortion discrimination results constitute the Y-axis coordinate points.
  • the sliding window slides along the X axis from the last compressed image at the right end of the coordinate system (that is, the N-th compressed image x_N) toward the origin of the XY coordinate system at the left.
  • the window size of the sliding window is set to 6, thereby improving the success rate of correcting and restoring the misjudgment results in the set of perceptual distortion discrimination results.
  • step S302 when the sliding direction is from right to left, and the number of compressed images is not less than the preset window threshold, the compressed image corresponding to the rightmost end of the window in the sliding window is determined as a JND compressed image, and when the sliding direction is from left To the right, and the number of compressed images is not greater than the window threshold, the compressed image corresponding to the leftmost end of the window in the sliding window is determined as the JND compressed image.
  • when the sliding direction is from right to left, it is determined whether the number of compressed images in the sliding window whose perceptual distortion discrimination result is true is greater than or equal to a preset window threshold; if so, the sliding window stops sliding and the compressed image corresponding to the rightmost end of the sliding window is determined as the JND compressed image (as shown in FIG. 4, the k-th compressed image x_k at point A); otherwise, the sliding window continues to slide until the number of compressed images in the sliding window whose perceptual distortion discrimination result is true is greater than or equal to the preset window threshold.
  • when the sliding direction is from left to right, it is judged whether the number of compressed images in the sliding window whose perceptual distortion discrimination result is true is less than or equal to the window threshold; if so, the sliding window stops sliding and the compressed image corresponding to the leftmost end of the sliding window is determined as the JND compressed image; otherwise, the sliding window continues to slide until the number of compressed images in the sliding window whose perceptual distortion discrimination result is true is less than or equal to the window threshold.
  • the preset window threshold is 5, so as to improve the success rate of correcting and recovering the misjudgment results in the set of perceptual distortion judgment results.
  • step S303 the image compression index used for the JND compressed image is set as the image-level JND threshold of the image to be measured.
  • the JND compressed image (that is, the k-th compressed image x_k) is obtained by compressing the original image to be tested with a corresponding image compression index; the compression factor, bit rate or other image quality indicator (for example: Peak Signal to Noise Ratio (PSNR)) used when compressing x_k is taken as the image-level JND threshold of the image to be tested.
  • PSNR Peak Signal to Noise Ratio
  • an image-level JND search strategy based on a sliding window is used for error tolerance processing, and finally the image-level JND threshold of the image to be tested is predicted, thereby improving the accuracy of image-level JND threshold prediction.
  • FIG. 5 shows the structure of the image-level JND threshold prediction apparatus provided by Embodiment 4 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, including:
  • the perceptual distortion discriminating unit 51 is configured to perform perceptual distortion discrimination on the image to be tested and the compressed image set corresponding to the image to be tested through a trained multi-class perceptual distortion discriminator to obtain a set of perceptual distortion discrimination results;
  • the JND threshold prediction unit 52 is configured to perform error tolerance processing on the set of perceptual distortion determination results through a preset image-level JND search strategy to predict the image-level JND threshold of the image to be measured.
  • each unit of the image-level JND threshold prediction device may be implemented by a corresponding hardware or software unit, and each unit may be an independent software and hardware unit, or may be integrated into one software and hardware unit, which is not used here To limit the invention.
  • FIG. 6 shows a structure of an image-level JND threshold prediction device provided by Embodiment 5 of the present invention. For ease of description, only parts related to the embodiment of the present invention are shown, including:
  • the two-classifier construction unit 61 is used to construct a two-class perceptual quality discriminator through a convolutional neural network, a linear regression function, and a logistic regression function, to form a multi-class perceptual distortion discriminator through the two-class perceptual quality discriminator;
  • the discriminator learning unit 62 is used to learn the binary perceptual quality discriminator based on pre-generated training image samples, and to adjust the first parameter set of the convolutional neural network, the second parameter set of the linear regression function and the third parameter set of the logistic regression function according to the sample labels of the training image samples, so as to perform perceptual distortion discrimination on the image to be tested and the compressed image set through the learned two-class perceptual quality discriminator;
  • the perceptual distortion discriminating unit 63 is configured to perform perceptual distortion discrimination on the image to be tested and the compressed image set corresponding to the image to be tested through the trained multi-class perceptual distortion discriminator to obtain a set of perceptual distortion discrimination results;
  • the JND threshold prediction unit 64 is configured to perform error tolerance processing on the set of perceptual distortion discrimination results through a preset image-level JND search strategy to predict the image-level JND threshold of the image to be measured.
  • the perceptual distortion determination unit 63 includes:
  • the image block dividing unit 631 is configured to divide the image to be tested and the compressed image into image blocks according to the preset image block size to obtain the corresponding set of image blocks to be tested and the set of compressed image blocks;
  • the image block selection unit 632 is configured to select a preset number of corresponding image blocks to be tested and compressed image blocks from the set of image blocks to be tested and the set of compressed image blocks according to the positions of the image blocks;
  • the feature extraction unit 633 is configured to perform feature extraction on the selected image block to be tested and the compressed image block through a preset convolutional neural network to obtain the corresponding feature set of the image block to be tested and the compressed image block feature set;
  • the feature fusion unit 634 is configured to perform feature fusion on the feature block of the image block to be tested and the compressed block feature of the compressed image block feature set according to a preset feature fusion method to obtain a fusion feature set;
  • the quality evaluation unit 635 is configured to perform quality evaluation on the compressed image block through a preset linear regression function according to the fusion feature set to obtain a corresponding quality score set;
  • the distortion discriminating sub-unit 636 is configured to discriminate whether there is a perceptual distortion between the image to be tested and the compressed image according to a set of quality scores through a preset logistic regression function, to obtain a perceptual distortion discrimination result.
  • the JND threshold prediction unit 64 includes:
  • the image quantity counting unit 641 is configured to slide a sliding window of a preset window size in a preset sliding direction according to the compressed image sequence corresponding to the set of perceptual distortion discrimination results, and to count the number of compressed images within the sliding window whose perceptual distortion discrimination result is a true value; the sliding direction is from right to left or from left to right;
  • the JND image determination unit 642 is used to determine the compressed image corresponding to the rightmost end of the window in the sliding window as the JND compressed image when the sliding direction is from right to left and the number of compressed images is not less than the preset window threshold. From left to right, and when the number of compressed images is not greater than the window threshold, determine the compressed image corresponding to the leftmost end of the window in the sliding window as the JND compressed image; and
  • the JND threshold setting unit 643 is used to set the image compression index used by the JND compressed image as the image-level JND threshold of the image to be measured.
  • each unit of the image-level JND threshold prediction device may be implemented by a corresponding hardware or software unit, and each unit may be an independent software and hardware unit, or may be integrated into one software and hardware unit, which is not used here To limit the invention.
  • FIG. 7 shows the structure of the computing device provided in Embodiment 6 of the present invention. For convenience of description, only parts related to the embodiment of the present invention are shown.
  • the computing device 7 of the embodiment of the present invention includes a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70.
  • the processor 70 executes the computer program 72, the steps in the embodiment of the image-level JND threshold prediction method described above are implemented, for example, steps S101 to S102 shown in FIG. 1.
  • the processor 70 executes the computer program 72, the functions of the units in the foregoing device embodiments are realized, for example, the functions of the units 51 to 52 shown in FIG. 5.
  • perceptual distortion discrimination is performed on the image to be tested and the compressed images in the corresponding compressed image set to obtain a set of perceptual distortion discrimination results, and the image-level JND search strategy then performs fault-tolerant processing on this set to predict the image-level JND threshold of the image to be tested, thereby reducing the prediction deviation of the image-level JND threshold, improving the accuracy of image-level JND threshold prediction, and making the predicted JND threshold closer to the human visual system's perception of the quality of the entire image.
  • the computing device in this embodiment of the present invention may be a personal computer or a server.
  • for the steps implemented when the processor 70 in the computing device 7 executes the computer program 72 to realize the image-level JND threshold prediction method, reference may be made to the description of the foregoing method embodiments, which will not be repeated here.
  • a computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps in the above image-level JND threshold prediction method embodiment, for example steps S101 to S102 shown in FIG. 1.
  • when the computer program is executed by a processor, the functions of the units in the foregoing device embodiments are realized, for example the functions of the units 51 to 52 shown in FIG. 5.
  • perceptual distortion discrimination is performed on the image to be tested and the compressed images in the corresponding compressed image set to obtain a set of perceptual distortion discrimination results, and the image-level JND search strategy then performs fault-tolerant processing on this set to predict the image-level JND threshold of the image to be tested, thereby reducing the prediction deviation of the image-level JND threshold, improving the accuracy of image-level JND threshold prediction, and making the predicted JND threshold closer to the human visual system's perception of the quality of the entire image.
  • the computer-readable storage medium in the embodiments of the present invention may include any entity or device capable of carrying computer program code, and a recording medium, such as ROM/RAM, magnetic disk, optical disk, flash memory, and other memories.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Quality & Reliability (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

A method, apparatus, device and storage medium for predicting an image-level JND threshold. The method comprises: performing perceptual distortion discrimination on an image to be tested and on the compressed images in the compressed image set corresponding to the image to be tested by means of a trained multi-class perceptual distortion discriminator, to obtain a set of perceptual distortion discrimination results (S101); and performing fault-tolerant processing on the set of perceptual distortion discrimination results by means of an image-level JND search strategy, so as to predict the image-level JND threshold of the image to be tested (S102). The prediction deviation of the image-level JND threshold is thereby reduced and the accuracy of image-level JND threshold prediction is improved, so that the predicted JND threshold is closer to the human visual system's perception of the quality of the whole image.

Description

Method, apparatus, device and storage medium for predicting an image-level JND threshold
Technical Field
The present invention belongs to the technical field of image/video compression, and in particular relates to a method, apparatus, device and storage medium for predicting an image-level JND threshold.
Background
Research on the human visual system has shown that its perception of visual information is a non-uniform, non-linear information processing process. The human eye exhibits a certain visual psychological redundancy when observing an image, so that features or content in the image are selectively ignored or masked. Owing to these masking characteristics of the human visual system, the human eye cannot perceive subtle changes in image pixels that fall below a certain threshold, i.e. changes the human eye cannot perceive. This threshold is the just noticeable distortion (JND) threshold of the human eye and represents the visual redundancy in the image. The JND threshold describes the minimum image distortion that the human eye can perceive and reflects the perceptual capability and sensitivity of the human visual system; JND thresholds are therefore widely used in image/video processing, for example in image/video coding, streaming media applications and watermarking.
A number of JND models have been proposed; they can be roughly divided into two categories: pixel-domain JND models and frequency-domain JND models. Pixel-domain JND models mainly consider the influence of the luminance adaptation effect and the spatial masking effect on the JND threshold. For example, Wu et al. used spatial structural regularity to measure the spatial masking effect and, combined with the luminance adaptation effect, proposed a new JND model in 2012 to improve the accuracy of JND threshold estimation in irregular texture regions. Wu et al. argued that a disorderly concealment effect causes the JND threshold of disordered regions to be higher than that of orderly regions, and therefore proposed a JND model based on the free-energy principle in 2013; in the same year, Wu et al. combined the luminance adaptation effect with structural uncertainty to propose a pattern-masking function and, further, a JND model based on the pattern-masking effect. Wang et al. proposed a JND model for screen content images based on edge-contour reconstruction in 2016, in which the computation of the edge-contour JND threshold is decomposed into separate estimates of luminance adaptation, the masking effect and the structural masking effect. Hadizadeh et al. took the visual attention mechanism into account and proposed a JND model incorporating visual attention. Frequency-domain JND models mainly consider the contrast sensitivity function (CSF), the contrast masking effect, the luminance adaptation effect and the foveal masking effect. For example, Z. Wei et al. proposed a CSF-based spatio-temporal JND model in 2009, in which a gamma coefficient was introduced to compensate for the luminance effect; Bae et al. considered the influence of different frequencies on luminance adaptation and proposed a new luminance-adaptive JND model; H. Ko et al. computed the contrast masking effect from texture complexity and proposed in 2014 a JND model that adapts to discrete cosine transform (DCT) kernels of arbitrary size; Ki et al. considered the effect on the JND threshold of the energy loss caused by quantization during compression and proposed a learning-based JND prediction method in 2018.
At present, pixel-domain JND models compute a JND threshold for every pixel of an image, whereas frequency-domain JND models first transform the image from the pixel domain to the frequency domain and then compute a JND threshold for each sub-band. Both pixel-domain and frequency-domain JND models are therefore local JND threshold estimation models that only estimate the JND threshold of a single pixel/frequency. However, the quality of an entire image is determined by certain key regions and by the regions of poorest quality, so the two kinds of JND model above have difficulty in accurately estimating the human eye's JND threshold for the whole image. In addition, traditional JND models mainly consider the estimation of the JND threshold of the original image and do not consider the estimation of the JND threshold of images of arbitrary quality, yet most of the images/videos received by real image/video processing systems are already distorted. Traditional JND models are therefore limited in practical applications. For this reason, research on predicting the JND threshold of images of arbitrary quality is particularly important.
Summary of the Invention
The purpose of the present invention is to provide a method, apparatus, device and storage medium for predicting an image-level JND threshold, aiming to solve the problem that the prior art cannot provide an effective image-level JND threshold prediction method, which leads to a large deviation in the prediction of the JND threshold of a whole image.
In one aspect, the present invention provides a method for predicting an image-level JND threshold, the method comprising the following steps:
performing perceptual distortion discrimination on an image to be tested and on the compressed images in the compressed image set corresponding to the image to be tested by means of a trained multi-class perceptual distortion discriminator, to obtain a set of perceptual distortion discrimination results, the perceptual distortion discrimination results in the set including true values and false values;
performing fault-tolerant processing on the set of perceptual distortion discrimination results by means of a preset image-level JND search strategy, so as to predict the image-level JND threshold of the image to be tested.
In another aspect, the present invention provides an apparatus for predicting an image-level JND threshold, the apparatus comprising:
a perceptual distortion discrimination unit, configured to perform perceptual distortion discrimination on an image to be tested and on the compressed images in the compressed image set corresponding to the image to be tested by means of a trained multi-class perceptual distortion discriminator, to obtain a set of perceptual distortion discrimination results, the perceptual distortion discrimination results in the set including true values and false values; and
a JND threshold prediction unit, configured to perform fault-tolerant processing on the set of perceptual distortion discrimination results by means of a preset image-level JND search strategy, so as to predict the image-level JND threshold of the image to be tested.
In another aspect, the present invention further provides a computing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the image-level JND threshold prediction method described above.
In another aspect, the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the image-level JND threshold prediction method described above.
In the present invention, perceptual distortion discrimination is performed on an image to be tested and on the compressed images in the compressed image set corresponding to the image to be tested by means of a trained multi-class perceptual distortion discriminator, to obtain a set of perceptual distortion discrimination results; fault-tolerant processing is then performed on this set by means of an image-level JND search strategy so as to predict the image-level JND threshold of the image to be tested. The prediction deviation of the image-level JND threshold is thereby reduced and the accuracy of image-level JND threshold prediction is improved, so that the predicted JND threshold is closer to the human visual system's perception of the quality of the whole image.
Brief Description of the Drawings
FIG. 1 is an implementation flowchart of the image-level JND threshold prediction method provided by Embodiment 1 of the present invention;
FIG. 2 is an implementation flowchart of the perceptual distortion discrimination of an image to be tested and a compressed image provided by Embodiment 2 of the present invention;
FIG. 3 is an implementation flowchart of the fault-tolerant processing of the set of perceptual distortion discrimination results provided by Embodiment 3 of the present invention;
FIG. 4 is a schematic diagram of the sliding window provided by Embodiment 3 of the present invention;
FIG. 5 is a schematic structural diagram of the apparatus for predicting an image-level JND threshold provided by Embodiment 4 of the present invention;
FIG. 6 is a schematic structural diagram of the apparatus for predicting an image-level JND threshold provided by Embodiment 5 of the present invention; and
FIG. 7 is a schematic structural diagram of the computing device provided by Embodiment 6 of the present invention.
Detailed Description
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.
The specific implementation of the present invention is described in detail below with reference to specific embodiments:
Embodiment 1:
FIG. 1 shows the implementation flow of the image-level JND threshold prediction method provided by Embodiment 1 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:
In step S101, perceptual distortion discrimination is performed on an image to be tested and on the compressed images in the compressed image set corresponding to the image to be tested by means of a trained multi-class perceptual distortion discriminator, to obtain a set of perceptual distortion discrimination results.
The embodiment of the present invention is applicable to image/video processing platforms, systems or devices, such as personal computers and servers. In the embodiment of the present invention, the image to be tested is compressed with different compression settings to obtain compressed images of different qualities, and all of these compressed images of different qualities form a compressed image set. The image to be tested x and the i-th compressed image x_i in the compressed image set corresponding to the image to be tested are fed into the trained multi-class perceptual distortion discriminator, and the discriminator performs perceptual distortion discrimination on x and x_i to obtain a perceptual distortion discrimination result. The perceptual distortion discrimination results corresponding to all of the compressed images constitute the set of perceptual distortion discrimination results, where a perceptual distortion discrimination result is either a true value (for example: 1) or a false value (for example: 0).
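As an illustration only (this sketch is not part of the original application; JPEG compression via the Pillow library and the quality-factor range 1-100 are assumptions chosen for the example), a compressed image set of decreasing quality could be generated from the image to be tested as follows:
    # Sketch: build a compressed image set by re-encoding the image to be
    # tested at a series of JPEG quality factors (assumption: Pillow/JPEG).
    import io
    from PIL import Image

    def build_compressed_set(image_path, quality_factors=range(100, 0, -1)):
        original = Image.open(image_path).convert("RGB")
        compressed_set = []
        for qf in quality_factors:
            buf = io.BytesIO()
            original.save(buf, format="JPEG", quality=qf)  # qf acts as the image compression index
            buf.seek(0)
            compressed_set.append((qf, Image.open(buf).convert("RGB")))
        return original, compressed_set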
Before the perceptual distortion discrimination of the image to be tested and of the compressed images in the corresponding compressed image set is performed by the trained multi-class perceptual distortion discriminator, preferably, a multi-class perceptual distortion discriminator is constructed and trained with supervised, semi-supervised or unsupervised training image samples, so that the multi-class perceptual distortion discriminator can discriminate whether perceptual distortion exists between two images that have the same content but different quality.
When the multi-class perceptual distortion discriminator is trained, preferably, a binary (two-class) perceptual quality discriminator is constructed from a convolutional neural network, a linear regression function and a logistic regression function, so that the binary perceptual quality discriminator constitutes the multi-class perceptual distortion discriminator; the binary perceptual quality discriminator is learned on pre-generated training image samples, and the first parameter set of the convolutional neural network, the second parameter set of the linear regression function and the third parameter set of the logistic regression function are adjusted according to the sample labels of the training image samples, so that the learned binary perceptual quality discriminator is used to perform perceptual distortion discrimination on the image to be tested and the compressed images in the corresponding compressed image set. By decomposing the training of the multi-class perceptual distortion discriminator into the training of the binary perceptual quality discriminator, the training speed and efficiency of the discriminator model are improved.
When the binary perceptual quality discriminator is learned on the pre-generated training image samples, preferably, the learning is carried out through the following steps:
1) A preset number (for example: 50) of training image samples are generated from the MCL_JCI data set; a training image sample contains positive and negative image samples and is denoted {x_t, y_t}, where x_t is the sample image data, x_t includes an original image sample and the compressed image sample set corresponding to that original image sample, and y_t is the sample label of the sample image data;
2) The original image sample x and the i-th compressed image sample x_i in its corresponding compressed image sample set are each divided into image blocks of size M×M; the j-th image blocks of x and x_i are denoted P_{x,j} and P_{x_i,j} respectively, where j ∈ [1, 2, ..., S/M], S is the size of the original image sample x, and the image blocks of the original image sample and of the compressed image sample are arranged in the same order;
3) N image blocks at the same positions are selected from the blocks of x and of x_i respectively, denoted as the sample image block set {P_{x,1}, P_{x,2}, ..., P_{x,N}} and the compressed sample image block set {P_{x_i,1}, P_{x_i,2}, ..., P_{x_i,N}};
4) A convolutional neural network (CNN) is used to perform feature extraction on the sample image blocks and the compressed sample image blocks in {P_{x,1}, P_{x,2}, ..., P_{x,N}} and {P_{x_i,1}, P_{x_i,2}, ..., P_{x_i,N}}, yielding the corresponding sample image block feature set and compressed sample image block feature set, denoted {F_{x,1}, F_{x,2}, ..., F_{x,N}} and {F_{x_i,1}, F_{x_i,2}, ..., F_{x_i,N}};
5) Feature fusion is performed, by one of two preset feature fusion formulas, on the l-th sample image block feature F_{x,l} in {F_{x,1}, F_{x,2}, ..., F_{x,N}} and the corresponding compressed sample image block feature F_{x_i,l} in {F_{x_i,1}, F_{x_i,2}, ..., F_{x_i,N}}, yielding the sample fusion feature set {F'_1, F'_2, ..., F'_N};
6) According to the sample fusion feature set {F'_1, F'_2, ..., F'_N}, the quality of each compressed sample image block in {P_{x_i,1}, P_{x_i,2}, ..., P_{x_i,N}} is scored by a linear regression function, yielding the corresponding sample quality score set {S_1, S_2, ..., S_N};
7) A logistic regression function maps {S_1, S_2, ..., S_N} to a value between 0 and 1, denoted r; when r ≥ 0.5, the compressed image sample x_i is considered to be perceptually distorted with respect to the original image sample x, and a perceptual distortion discrimination result is obtained. It is then judged whether this perceptual distortion discrimination result is consistent with the corresponding sample label; if not, the first parameter set of the convolutional neural network, the second parameter set of the linear regression function and the third parameter set of the logistic regression function are adjusted, and the procedure jumps back to step 4) to continue learning the binary perceptual quality discriminator, until the perceptual distortion discrimination result is consistent with the corresponding sample label or the number of learning iterations reaches a preset iteration threshold.
In the embodiment of the present invention, steps 1) to 7) above convert the training problem of the multi-class perceptual distortion discriminator into the training problem of a binary perceptual quality discriminator, which improves the training speed and efficiency of the multi-class perceptual distortion discriminator and reduces the difficulty of the subsequent image-level JND threshold prediction.
Before the binary perceptual quality discriminator is learned on the pre-generated training image samples, preferably, the learning rate is initialized to 1×10^-4, the Adam algorithm is used as the gradient descent method, and the mini-batch size is set to 4; after each mini-batch has been processed, the first parameter set, the second parameter set and the third parameter set are updated, which improves the training speed and efficiency of the multi-class perceptual distortion discriminator.
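As an illustrative sketch only (not part of the original application; PyTorch, the binary cross-entropy loss and the data-loading details are assumptions), the learning procedure with the settings stated above, i.e. Adam, a learning rate initialised to 1e-4 and a mini-batch size of 4, with the three parameter sets updated after every mini-batch, could look like this:
    # Sketch: learning the binary perceptual quality discriminator.
    # `discriminator` is assumed to be a torch.nn.Module that maps a pair of
    # co-located block sets to the scalar r in [0, 1].
    import torch

    def train(discriminator, loader, epochs=10):
        optimizer = torch.optim.Adam(discriminator.parameters(), lr=1e-4)  # learning rate 1e-4
        bce = torch.nn.BCELoss()
        for _ in range(epochs):
            for ref_blocks, cmp_blocks, label in loader:   # DataLoader built with batch_size=4
                r = discriminator(ref_blocks, cmp_blocks)  # predicted probability of perceptual distortion
                loss = bce(r, label.float())
                optimizer.zero_grad()
                loss.backward()   # gradients flow to the CNN, linear-regression and logistic-regression parameters
                optimizer.step()  # all three parameter sets updated after each mini-batch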
In step S102, fault-tolerant processing is performed on the set of perceptual distortion discrimination results by means of a preset image-level JND search strategy, so as to predict the image-level JND threshold of the image to be tested.
In the embodiment of the present invention, the multi-class perceptual distortion discriminator may misjudge the perceptual distortion between the image to be tested and a compressed image, making the resulting perceptual distortion discrimination results inaccurate. Therefore, fault-tolerant processing is performed on the set of perceptual distortion discrimination results by means of the preset image-level JND search strategy, and the image-level JND threshold of the image to be tested is finally predicted, which improves the accuracy of image-level JND threshold prediction.
In the embodiment of the present invention, perceptual distortion discrimination is performed on the image to be tested and on the compressed images in the corresponding compressed image set by means of the trained multi-class perceptual distortion discriminator to obtain a set of perceptual distortion discrimination results, and fault-tolerant processing is then performed on this set by means of the image-level JND search strategy so as to predict the image-level JND threshold of the image to be tested, thereby reducing the prediction deviation of the image-level JND threshold, improving the accuracy of image-level JND threshold prediction and making the predicted JND threshold closer to the human visual system's perception of the quality of the whole image.
Embodiment 2:
FIG. 2 shows the implementation flow, provided by Embodiment 2 of the present invention, of the perceptual distortion discrimination of the image to be tested and a compressed image in step S101 of Embodiment 1. For ease of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:
In step S201, the image to be tested and the compressed image are each divided into image blocks according to a preset image block size, to obtain a corresponding set of image blocks to be tested and a corresponding set of compressed image blocks.
In the embodiment of the present invention, according to the preset image block size, the image to be tested x and the i-th compressed image x_i corresponding to the image to be tested are each divided into image blocks, to obtain the corresponding set of image blocks to be tested and set of compressed image blocks. In the two sets, the image blocks to be tested and the compressed image blocks are arranged in the same order; for example, if the j-th image block obtained by dividing the image to be tested x is P_{x,j}, then the block of the compressed image x_i at the same image position as P_{x,j} in the image to be tested x is the j-th compressed image block, denoted P_{x_i,j}.
Preferably, the image block size is 32×32, which avoids image blocks that are too large or too small and would degrade the subsequent feature extraction on the image blocks.
In step S202, according to the image block positions, a preset number of corresponding image blocks to be tested and compressed image blocks are selected from the set of image blocks to be tested and the set of compressed image blocks, respectively.
In the embodiment of the present invention, a preset number of corresponding image blocks to be tested and compressed image blocks are randomly selected from the set of image blocks to be tested and the set of compressed image blocks, respectively, such that the position of each selected image block to be tested in the image to be tested is the same as the position of the corresponding compressed image block in the compressed image.
Preferably, the numbers of selected image blocks to be tested and compressed image blocks are each 32, which avoids using too many or too few image blocks for feature extraction and degrading the subsequent feature extraction on the image blocks.
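As an illustrative sketch only (not part of the original application; NumPy and the function name are assumptions), the 32×32 block division of steps S201-S202 and the random selection of 32 co-located block pairs could be written as:
    # Sketch: divide two aligned images into 32x32 blocks and pick 32
    # co-located block pairs at random (assumption: NumPy arrays of equal shape).
    import numpy as np

    def co_located_blocks(test_img, comp_img, block=32, n_blocks=32, seed=0):
        h, w = test_img.shape[:2]
        positions = [(y, x) for y in range(0, h - block + 1, block)
                            for x in range(0, w - block + 1, block)]
        rng = np.random.default_rng(seed)
        chosen = rng.choice(len(positions), size=n_blocks, replace=False)
        pairs = []
        for idx in chosen:
            y, x = positions[idx]
            pairs.append((test_img[y:y + block, x:x + block],   # block of the image to be tested
                          comp_img[y:y + block, x:x + block]))  # block at the same position in the compressed image
        return pairs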
In step S203, feature extraction is performed on the selected image blocks to be tested and compressed image blocks respectively by a preset convolutional neural network, to obtain a corresponding feature set of the image blocks to be tested and a corresponding feature set of the compressed image blocks.
In the embodiment of the present invention, preferably, the network structure of the convolutional neural network is such that each convolutional layer is followed by an activation layer and every two convolutional layers are followed by a pooling layer, which improves the saliency of the features extracted from the image blocks to be tested and the compressed image blocks.
Further preferably, the convolutional neural network has 10 convolutional layers, a convolution kernel size of 3 and a convolution stride of 2, which further improves the saliency of the features extracted from the image blocks to be tested and the compressed image blocks.
Also preferably, the activation function of the convolutional neural network is the rectified linear unit (ReLU) and the pooling method is max pooling, which improves the computation speed and convergence speed of the convolutional neural network.
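As an illustrative sketch only (not part of the original application), the stated pattern of 10 convolutional layers with kernel size 3, a ReLU after every convolution and max pooling after every two convolutions could be expressed in PyTorch as below; the channel width is an assumption, and the stride is kept at 1 with padding here so that the ten-layer stack remains valid on a 32×32 block, whereas the application states a convolution stride of 2:
    # Sketch: CNN feature extractor for 32x32 image blocks
    # (10 conv layers, kernel 3, ReLU after each conv, max pooling after every two convs).
    import torch
    import torch.nn as nn

    def make_block_cnn(in_ch=3, width=32):
        layers, ch = [], in_ch
        for i in range(10):                   # 10 convolutional layers, kernel size 3
            layers += [nn.Conv2d(ch, width, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
            ch = width
            if i % 2 == 1:                    # a pooling layer after every two conv layers
                layers.append(nn.MaxPool2d(2))  # max pooling
        layers.append(nn.Flatten())           # one feature vector per block
        return nn.Sequential(*layers)

    block_features = make_block_cnn()(torch.randn(32, 3, 32, 32))  # 32 blocks -> 32 feature vectors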
In step S204, feature fusion is performed on the image block features to be tested in the feature set of the image blocks to be tested and the compressed image block features in the feature set of the compressed image blocks according to a preset feature fusion scheme, to obtain a fusion feature set.
In the embodiment of the present invention, one of two preset feature fusion formulas is used to fuse the l-th image block feature to be tested F_{x,l} in the feature set {F_{x,1}, F_{x,2}, ..., F_{x,N}} with the corresponding l-th compressed image block feature F_{x_i,l} in the compressed image block feature set {F_{x_i,1}, F_{x_i,2}, ..., F_{x_i,N}}, yielding the fusion feature set {F'_1, F'_2, ..., F'_N}, where N is the number of selected image blocks to be tested and compressed image blocks.
Preferably, one of the two fusion formulas is adopted to fuse the image block features to be tested with the corresponding compressed image block features, thereby improving feature saliency.
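The fusion formulas themselves are not reproduced in this text; as an illustration only (an assumption, not the application's own formulas), two commonly used ways of fusing a test-block feature with the co-located compressed-block feature are the element-wise difference and the concatenation:
    # Sketch: two plausible fusion schemes for a test-block feature f_x and the
    # co-located compressed-block feature f_xi (assumed, not taken from the application).
    import torch

    def fuse_difference(f_x, f_xi):
        return f_x - f_xi                       # element-wise difference of the two features

    def fuse_concat(f_x, f_xi):
        return torch.cat([f_x, f_xi], dim=-1)   # concatenation of the two features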
In step S205, quality evaluation is performed on the compressed image blocks by a preset linear regression function according to the fusion feature set, to obtain a corresponding quality score set.
In the embodiment of the present invention, according to the fusion feature set, the quality of each compressed image block in the set of compressed image blocks is evaluated by an arbitrary linear regression function (for example: a support vector machine (SVM)), to obtain a corresponding quality score; for example, the quality score of the j-th compressed image block P_{x_i,j} is denoted S_j, and the quality scores of all compressed image blocks form the quality score set, denoted {S_1, S_2, ..., S_N}.
In the embodiment of the present invention, preferably, a multilayer perceptron (MLP) is used as the linear regression function, and the number of intermediate layers of the multilayer perceptron is set to 1, which improves the accuracy of the quality scores.
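As an illustrative sketch only (not part of the original application; the feature dimension and hidden width are assumptions), the preferred regressor, a multilayer perceptron with a single intermediate layer that maps each fused block feature to a quality score S_j, could be:
    # Sketch: single-hidden-layer MLP used as the linear regression function.
    import torch.nn as nn

    def make_quality_mlp(feature_dim=32, hidden=64):
        return nn.Sequential(
            nn.Linear(feature_dim, hidden),   # the single intermediate layer
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),             # one quality score S_j per fused block feature
        )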
In step S206, according to the quality score set, a preset logistic regression function is used to discriminate whether perceptual distortion exists between the image to be tested and the compressed image, and a perceptual distortion discrimination result is obtained.
In the embodiment of the present invention, given the quality score set {S_1, S_2, ..., S_N} of the compressed image blocks, a logistic regression function r = Ψ(Σ_{i=1}^{N} w_i·S_i + b) is used to map {S_1, S_2, ..., S_N} to a value between 0 and 1, denoted r. When r ≥ 0.5, the compressed image x_i is considered to be perceptually distorted with respect to the image to be tested x and the true value (1) is output; otherwise, x_i is considered not to be perceptually distorted with respect to x and the false value (0) is output. Here N is the number of selected image blocks to be tested and compressed image blocks, Ψ(·) is the sigmoid function, w_i is the weight of the i-th compressed image block, the weights of all compressed image blocks constitute the third parameter set of the logistic regression function, and b is the bias parameter of the logistic regression function.
In the embodiment of the present invention, the image to be tested and the compressed image are first divided into image blocks; feature extraction and feature fusion are then performed on the divided image blocks to be tested and compressed image blocks; finally, the compressed image blocks are scored for quality according to the fused features, and the perceptual distortion discrimination result of the compressed image and the image to be tested is obtained, thereby improving the accuracy of the perceptual distortion discrimination result.
Embodiment 3:
FIG. 3 shows the implementation flow, provided by Embodiment 3 of the present invention, of the fault-tolerant processing of the set of perceptual distortion discrimination results in step S102 of Embodiment 1. For ease of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:
In step S301, according to the compressed image sequence corresponding to the set of perceptual distortion discrimination results, a sliding window of a preset window size is slid in a preset sliding direction, and the number of compressed images within the sliding window whose perceptual distortion discrimination result is a true value is counted; the sliding direction is from right to left or from left to right.
In the embodiment of the present invention, each perceptual distortion discrimination result in the set corresponds to one compressed image. The compressed image sequence x_1, x_2, ..., x_N corresponding to the set of perceptual distortion discrimination results and the perceptual distortion discrimination results form an XY coordinate system: the compressed image sequence x_1, x_2, ..., x_N constitutes the X-axis coordinate points, and the true value (1) and false value (0) of the perceptual distortion discrimination results constitute the Y-axis coordinate points. The sliding window of the preset window size either slides from the last compressed image at the right end of the X axis of this coordinate system (i.e. the N-th compressed image x_N) toward the origin at the left (i.e. a right-to-left sliding direction along the X axis), or slides rightward along the X axis from the first compressed image near the origin (i.e. the first compressed image x_1) (i.e. a left-to-right sliding direction along the X axis). While the sliding window slides, the number of compressed images within the window whose perceptual distortion discrimination result is a true value is counted, that is, how many compressed images in the window have a perceptual distortion discrimination result of true.
As an example, FIG. 4 is a schematic diagram of the sliding window sliding from right to left along the X axis. In the XY coordinate system shown in FIG. 4, the compressed image sequence x_1, x_2, ..., x_N corresponding to the set of perceptual distortion discrimination results constitutes the X-axis coordinate points, the true value (1) and false value (0) of the perceptual distortion discrimination results constitute the Y-axis coordinate points, and the sliding window slides from the last compressed image at the right end of the X axis (i.e. the N-th compressed image x_N) toward the origin at the left.
Before the sliding window of the preset window size is slid from right to left, preferably, the window size of the sliding window is set to 6, which improves the success rate of correcting and recovering misjudged results in the set of perceptual distortion discrimination results.
In step S302, when the sliding direction is from right to left and the number of compressed images is not less than a preset window threshold, the compressed image corresponding to the rightmost end of the sliding window is determined as the JND compressed image; when the sliding direction is from left to right and the number of compressed images is not greater than the window threshold, the compressed image corresponding to the leftmost end of the sliding window is determined as the JND compressed image.
In the embodiment of the present invention, when the sliding direction is from right to left, it is judged whether the number of compressed images within the sliding window whose perceptual distortion discrimination result is true is greater than or equal to the preset window threshold; if so, the sliding window stops sliding and the compressed image corresponding to the rightmost end of the sliding window is determined as the JND compressed image, i.e. the k-th compressed image x_k at point A as shown in FIG. 4; otherwise, the sliding window continues to slide until the number of compressed images within the sliding window whose perceptual distortion discrimination result is true is greater than or equal to the preset window threshold. When the sliding direction is from left to right, it is judged whether the number of compressed images within the sliding window whose perceptual distortion discrimination result is true is less than or equal to the window threshold; if so, the sliding window stops sliding and the compressed image corresponding to the leftmost end of the sliding window is determined as the JND compressed image; otherwise, the sliding window continues to slide until the number of compressed images within the sliding window whose perceptual distortion discrimination result is true is less than or equal to the window threshold.
Preferably, the preset window threshold is 5, which improves the success rate of correcting and recovering misjudged results in the set of perceptual distortion discrimination results.
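As an illustrative sketch only (the function and variable names are assumptions), the right-to-left sliding-window search of steps S301-S302, with the preferred window size of 6 and window threshold of 5, could be written as:
    # Sketch: fault-tolerant image-level JND search over the per-image
    # discrimination results (1 = perceptual distortion, 0 = none).  A window of
    # size 6 slides from right to left and stops when at least 5 results inside
    # it are true; the rightmost image of that window is the JND compressed image.
    def find_jnd_index(results, window_size=6, window_threshold=5):
        n = len(results)
        for right in range(n - 1, window_size - 2, -1):  # right edge of the window, moving leftwards
            window = results[right - window_size + 1: right + 1]
            if sum(window) >= window_threshold:
                return right                             # index k of the JND compressed image x_k
        return None                                      # no window satisfied the threshold
The compression index used to produce the returned compressed image (for example its quality factor or bit rate) is then taken as the image-level JND threshold of the image to be tested, as described in step S303 below.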
In step S303, the image compression index used for the JND compressed image is set as the image-level JND threshold of the image to be tested.
In the embodiment of the present invention, the JND compressed image (i.e. the k-th compressed image x_k) is obtained by compressing the original image to be tested with a corresponding image compression index; the compression factor, bit rate or other image quality indicator (for example: peak signal-to-noise ratio (PSNR)) used when compressing the compressed image x_k is taken as the image-level just noticeable distortion (JND) threshold of the image to be tested.
In the embodiment of the present invention, a sliding-window-based image-level JND search strategy is used for the fault-tolerant processing, and the image-level JND threshold of the image to be tested is finally predicted, thereby improving the accuracy of image-level JND threshold prediction.
Embodiment 4:
FIG. 5 shows the structure of the apparatus for predicting an image-level JND threshold provided by Embodiment 4 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, including:
a perceptual distortion discrimination unit 51, configured to perform perceptual distortion discrimination on an image to be tested and on the compressed images in the compressed image set corresponding to the image to be tested by means of a trained multi-class perceptual distortion discriminator, to obtain a set of perceptual distortion discrimination results; and
a JND threshold prediction unit 52, configured to perform fault-tolerant processing on the set of perceptual distortion discrimination results by means of a preset image-level JND search strategy, so as to predict the image-level JND threshold of the image to be tested.
In the embodiment of the present invention, each unit of the apparatus for predicting an image-level JND threshold may be implemented by a corresponding hardware or software unit; each unit may be an independent software or hardware unit, or the units may be integrated into one software or hardware unit, which is not intended to limit the present invention. For the specific implementation of each unit, reference may be made to the description of the foregoing method embodiments, which will not be repeated here.
Embodiment 5:
FIG. 6 shows the structure of the apparatus for predicting an image-level JND threshold provided by Embodiment 5 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, including:
a binary classifier construction unit 61, configured to construct a binary perceptual quality discriminator from a convolutional neural network, a linear regression function and a logistic regression function, so that the binary perceptual quality discriminator constitutes a multi-class perceptual distortion discriminator;
a discriminator learning unit 62, configured to learn the binary perceptual quality discriminator on pre-generated training image samples, and to adjust the first parameter set of the convolutional neural network, the second parameter set of the linear regression function and the third parameter set of the logistic regression function according to the sample labels of the training image samples, so as to perform perceptual distortion discrimination on the image to be tested and the compressed images in the compressed image set by the learned binary perceptual quality discriminator;
a perceptual distortion discrimination unit 63, configured to perform perceptual distortion discrimination on the image to be tested and on the compressed images in the compressed image set corresponding to the image to be tested by means of the trained multi-class perceptual distortion discriminator, to obtain a set of perceptual distortion discrimination results; and
a JND threshold prediction unit 64, configured to perform fault-tolerant processing on the set of perceptual distortion discrimination results by means of a preset image-level JND search strategy, so as to predict the image-level JND threshold of the image to be tested.
Preferably, the perceptual distortion discrimination unit 63 includes:
an image block division unit 631, configured to divide the image to be tested and the compressed image into image blocks respectively according to a preset image block size, to obtain a corresponding set of image blocks to be tested and a corresponding set of compressed image blocks;
an image block selection unit 632, configured to select, according to image block positions, a preset number of corresponding image blocks to be tested and compressed image blocks from the set of image blocks to be tested and the set of compressed image blocks, respectively;
a feature extraction unit 633, configured to perform feature extraction on the selected image blocks to be tested and compressed image blocks respectively by a preset convolutional neural network, to obtain a corresponding feature set of the image blocks to be tested and a corresponding feature set of the compressed image blocks;
a feature fusion unit 634, configured to perform feature fusion on the image block features to be tested in the feature set of the image blocks to be tested and the compressed image block features in the feature set of the compressed image blocks according to a preset feature fusion scheme, to obtain a fusion feature set;
a quality evaluation unit 635, configured to perform quality evaluation on the compressed image blocks by a preset linear regression function according to the fusion feature set, to obtain a corresponding quality score set; and
a distortion discrimination sub-unit 636, configured to discriminate, according to the quality score set and by a preset logistic regression function, whether perceptual distortion exists between the image to be tested and the compressed image, to obtain the perceptual distortion discrimination result.
The JND threshold prediction unit 64 includes:
an image quantity counting unit 641, configured to slide a sliding window of a preset window size in a preset sliding direction according to the compressed image sequence corresponding to the set of perceptual distortion discrimination results, and to count the number of compressed images within the sliding window whose perceptual distortion discrimination result is a true value; the sliding direction is from right to left or from left to right;
a JND image determination unit 642, configured to determine, when the sliding direction is from right to left and the number of compressed images is not less than a preset window threshold, the compressed image corresponding to the rightmost end of the sliding window as the JND compressed image, and to determine, when the sliding direction is from left to right and the number of compressed images is not greater than the window threshold, the compressed image corresponding to the leftmost end of the sliding window as the JND compressed image; and
a JND threshold setting unit 643, configured to set the image compression index used for the JND compressed image as the image-level JND threshold of the image to be tested.
In the embodiment of the present invention, each unit of the apparatus for predicting an image-level JND threshold may be implemented by a corresponding hardware or software unit; each unit may be an independent software or hardware unit, or the units may be integrated into one software or hardware unit, which is not intended to limit the present invention. For the specific implementation of each unit, reference may be made to the description of the foregoing method embodiments, which will not be repeated here.
Embodiment 6:
FIG. 7 shows the structure of the computing device provided by Embodiment 6 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown.
The computing device 7 of the embodiment of the present invention includes a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70. When the processor 70 executes the computer program 72, the steps in the above embodiment of the image-level JND threshold prediction method are implemented, for example steps S101 to S102 shown in FIG. 1. Alternatively, when the processor 70 executes the computer program 72, the functions of the units in the above apparatus embodiments are realized, for example the functions of the units 51 to 52 shown in FIG. 5.
In the embodiment of the present invention, perceptual distortion discrimination is performed on the image to be tested and on the compressed images in the corresponding compressed image set by means of the trained multi-class perceptual distortion discriminator to obtain a set of perceptual distortion discrimination results, and fault-tolerant processing is then performed on this set by means of the image-level JND search strategy so as to predict the image-level JND threshold of the image to be tested, thereby reducing the prediction deviation of the image-level JND threshold, improving the accuracy of image-level JND threshold prediction and making the predicted JND threshold closer to the human visual system's perception of the quality of the whole image.
The computing device of the embodiment of the present invention may be a personal computer or a server. For the steps implemented when the processor 70 in the computing device 7 executes the computer program 72 to realize the image-level JND threshold prediction method, reference may be made to the description of the foregoing method embodiments, which will not be repeated here.
Embodiment 7:
In the embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program; when the computer program is executed by a processor, the steps in the above embodiment of the image-level JND threshold prediction method are implemented, for example steps S101 to S102 shown in FIG. 1. Alternatively, when the computer program is executed by a processor, the functions of the units in the above apparatus embodiments are realized, for example the functions of the units 51 to 52 shown in FIG. 5.
In the embodiment of the present invention, perceptual distortion discrimination is performed on the image to be tested and on the compressed images in the corresponding compressed image set by means of the trained multi-class perceptual distortion discriminator to obtain a set of perceptual distortion discrimination results, and fault-tolerant processing is then performed on this set by means of the image-level JND search strategy so as to predict the image-level JND threshold of the image to be tested, thereby reducing the prediction deviation of the image-level JND threshold, improving the accuracy of image-level JND threshold prediction and making the predicted JND threshold closer to the human visual system's perception of the quality of the whole image.
The computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example a ROM/RAM, magnetic disk, optical disk, flash memory or other memory.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

  1. 一种图像级JND阈值的预测方法,其特征在于,所述方法包括下述步骤:
    通过训练好的多分类感知失真判别器对待测图像和与所述待测图像对应的压缩图像集合中的压缩图像进行感知失真判别,得到感知失真判别结果集合,所述感知失真判别结果集合中的感知失真判别结果包括真值和假值;
    通过预设的图像级JND搜索策略对所述感知失真判别结果集合进行容错处理,以预测得到所述待测图像的图像级JND阈值。
  2. 如权利要求1所述的方法,其特征在于,通过训练好的多分类感知失真判别器对待测图像和与所述待测图像对应的压缩图像集合中的压缩图像进行感知失真判别的步骤,包括:
    根据预设的图像块大小,分别将所述待测图像和所述压缩图像进行图像块划分,得到对应的待测图像块集合和压缩图像块集合;
    根据图像块位置,分别在所述待测图像块集合和所述压缩图像块集合中选取预设数量个对应的待测图像块和压缩图像块;
    通过预设的卷积神经网络分别对选取出的所述待测图像块和所述压缩图像块进行特征提取,得到对应的待测图像块特征集合和压缩图像块特征集合;
    根据预设的特征融合方式将所述待测图像块特征集合中的待测图像块特征和所述压缩图像块特征集合中的压缩图像块特征进行特征融合,得到融合特征集合;
    根据所述融合特征集合,通过预设的线性回归函数对所述压缩图像块进行质量评价,得到对应的质量评分集合;
    根据所述质量评分集合,通过预设的逻辑回归函数判别所述待测图像和所述压缩图像之间是否存在感知失真,得到所述感知失真判别结果。
  3. 如权利要求2所述的方法,其特征在于,通过训练好的多分类感知失真判别器对待测图像和与所述待测图像对应的压缩图像集合中的压缩图像进行感 知失真判别的步骤之前,所述方法还包括:
    通过所述卷积神经网络、所述线性回归函数以及所述逻辑回归函数构建二分类感知质量判别器,以通过所述二分类感知质量判别器构成所述多分类感知失真判别器;
    根据预先生成的训练图像样本对所述二分类感知质量判别器进行学习,并根据所述训练图像样本的样本标签对所述卷积神经网络的第一参数集、所述线性回归函数的第二参数集以及所述逻辑回归函数的第三参数集进行调整,以通过学习好的二分类感知质量判别器对所述待测图像和所述压缩图像集合中的压缩图像进行感知失真判别。
  4. The method according to claim 1, wherein the step of performing fault-tolerance processing on the set of perceptual distortion discrimination results through the preset image-level JND search strategy comprises:
    sliding a sliding window of a preset window size in a preset sliding direction over the compressed image sequence corresponding to the set of perceptual distortion discrimination results, and counting the number of compressed images within the sliding window whose perceptual distortion discrimination result is a true value, the sliding direction being from right to left or from left to right;
    when the sliding direction is from right to left and the number of compressed images is not less than a preset window threshold, determining the compressed image at the rightmost end of the sliding window as a JND compressed image, and when the sliding direction is from left to right and the number of compressed images is not greater than the window threshold, determining the compressed image at the leftmost end of the sliding window as the JND compressed image; and
    setting the image compression index used by the JND compressed image as the image-level JND threshold of the image under test.
  5. An apparatus for predicting an image-level JND threshold, wherein the apparatus comprises:
    a perceptual distortion discrimination unit, configured to perform, through a trained multi-class perceptual distortion discriminator, perceptual distortion discrimination on an image under test and compressed images in a compressed image set corresponding to the image under test, to obtain a set of perceptual distortion discrimination results, wherein the perceptual distortion discrimination results in the set comprise true values and false values; and
    a JND threshold prediction unit, configured to perform fault-tolerance processing on the set of perceptual distortion discrimination results through a preset image-level JND search strategy, to predict the image-level JND threshold of the image under test.
  6. The apparatus according to claim 5, wherein the perceptual distortion discrimination unit comprises:
    an image block division unit, configured to divide the image under test and the compressed image into image blocks respectively according to a preset image block size, to obtain a corresponding set of test image blocks and a corresponding set of compressed image blocks;
    an image block selection unit, configured to select, according to image block positions, a preset number of corresponding test image blocks and compressed image blocks from the set of test image blocks and the set of compressed image blocks, respectively;
    a feature extraction unit, configured to perform feature extraction on the selected test image blocks and compressed image blocks through a preset convolutional neural network, to obtain a corresponding set of test image block features and a corresponding set of compressed image block features;
    a feature fusion unit, configured to fuse the test image block features in the set of test image block features with the compressed image block features in the set of compressed image block features according to a preset feature fusion mode, to obtain a set of fused features;
    a quality evaluation unit, configured to evaluate the quality of the compressed image blocks through a preset linear regression function according to the set of fused features, to obtain a corresponding set of quality scores; and
    a distortion discrimination sub-unit, configured to determine, according to the set of quality scores and through a preset logistic regression function, whether perceptual distortion exists between the image under test and the compressed image, to obtain the perceptual distortion discrimination result.
  7. The apparatus according to claim 6, wherein the apparatus further comprises:
    a binary classifier construction unit, configured to construct a binary perceptual quality discriminator from the convolutional neural network, the linear regression function and the logistic regression function, so that the multi-class perceptual distortion discriminator is composed of such binary perceptual quality discriminators; and
    a discriminator learning unit, configured to train the binary perceptual quality discriminator on pre-generated training image samples, and to adjust a first parameter set of the convolutional neural network, a second parameter set of the linear regression function and a third parameter set of the logistic regression function according to sample labels of the training image samples, so that the trained binary perceptual quality discriminator performs perceptual distortion discrimination on the image under test and the compressed images in the compressed image set.
  8. The apparatus according to claim 5, wherein the JND threshold prediction unit comprises:
    an image counting unit, configured to slide a sliding window of a preset window size in a preset sliding direction over the compressed image sequence corresponding to the set of perceptual distortion discrimination results, and to count the number of compressed images within the sliding window whose perceptual distortion discrimination result is a true value, the sliding direction being from right to left or from left to right;
    a JND image determination unit, configured to: when the sliding direction is from right to left and the number of compressed images is not less than a preset window threshold, determine the compressed image at the rightmost end of the sliding window as a JND compressed image; and when the sliding direction is from left to right and the number of compressed images is not greater than the window threshold, determine the compressed image at the leftmost end of the sliding window as the JND compressed image; and
    a JND threshold setting unit, configured to set the image compression index used by the JND compressed image as the image-level JND threshold of the image under test.
  9. A computing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 4.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 4.
PCT/CN2018/120749 2018-12-12 2018-12-12 图像级jnd阈值的预测方法、装置、设备及存储介质 WO2020118588A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP18943329.5A EP3896965A4 (en) 2018-12-12 2018-12-12 METHOD, DEVICE AND APPARATUS FOR PREDICTING A JND THRESHOLD PER IMAGES, AND STORAGE MEDIA
US17/312,736 US20220051385A1 (en) 2018-12-12 2018-12-12 Method, device and apparatus for predicting picture-wise jnd threshold, and storage medium
PCT/CN2018/120749 WO2020118588A1 (zh) 2018-12-12 2018-12-12 图像级jnd阈值的预测方法、装置、设备及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/120749 WO2020118588A1 (zh) 2018-12-12 2018-12-12 图像级jnd阈值的预测方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2020118588A1 (zh)

Family

ID=71075829

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/120749 WO2020118588A1 (zh) 2018-12-12 2018-12-12 图像级jnd阈值的预测方法、装置、设备及存储介质

Country Status (3)

Country Link
US (1) US20220051385A1 (zh)
EP (1) EP3896965A4 (zh)
WO (1) WO2020118588A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437302A (zh) * 2020-11-12 2021-03-02 深圳大学 屏幕内容图像的jnd预测方法、装置、计算机设备及存储介质
CN112637597A (zh) * 2020-12-24 2021-04-09 深圳大学 Jpeg图像压缩方法、装置、计算机设备及存储介质
CN115187519A (zh) * 2022-06-21 2022-10-14 上海市计量测试技术研究院 图像质量评价方法、系统及计算机可读介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240121402A1 (en) * 2022-09-30 2024-04-11 Netflix, Inc. Techniques for predicting video quality across different viewing parameters

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7596242B2 (en) * 1995-06-07 2009-09-29 Automotive Technologies International, Inc. Image processing for vehicular applications
US6075884A (en) * 1996-03-29 2000-06-13 Sarnoff Corporation Method and apparatus for training a neural network to learn and use fidelity metric as a control mechanism
US7475048B2 (en) * 1998-05-01 2009-01-06 Health Discovery Corporation Pre-processed feature ranking for a support vector machine
US6996549B2 (en) * 1998-05-01 2006-02-07 Health Discovery Corporation Computer-aided image analysis
CN104063894B (zh) * 2014-06-13 2017-02-22 中国科学院深圳先进技术研究院 点云三维模型重建方法及系统
CN105550701B (zh) * 2015-12-09 2018-11-06 福州华鹰重工机械有限公司 实时图像提取识别方法及装置

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130256671A1 (en) * 2012-04-01 2013-10-03 Au Optronics Corporation Display apparatus
CN103002280A (zh) * 2012-10-08 2013-03-27 中国矿业大学 基于hvs&roi的分布式编解码方法及系统
CN103096079A (zh) * 2013-01-08 2013-05-08 宁波大学 一种基于恰可察觉失真的多视点视频码率控制方法
CN103501441A (zh) * 2013-09-11 2014-01-08 北京交通大学长三角研究院 一种基于人类视觉系统的多描述视频编码方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3896965A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437302A (zh) * 2020-11-12 2021-03-02 深圳大学 屏幕内容图像的jnd预测方法、装置、计算机设备及存储介质
CN112637597A (zh) * 2020-12-24 2021-04-09 深圳大学 Jpeg图像压缩方法、装置、计算机设备及存储介质
CN112637597B (zh) * 2020-12-24 2022-10-18 深圳大学 Jpeg图像压缩方法、装置、计算机设备及存储介质
CN115187519A (zh) * 2022-06-21 2022-10-14 上海市计量测试技术研究院 图像质量评价方法、系统及计算机可读介质

Also Published As

Publication number Publication date
EP3896965A4 (en) 2021-12-15
EP3896965A1 (en) 2021-10-20
US20220051385A1 (en) 2022-02-17


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18943329

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018943329

Country of ref document: EP

Effective date: 20210712