WO2022218072A1 - Method for evaluating image and video quality based on an approximation value, and related apparatus - Google Patents

Method for evaluating image and video quality based on an approximation value, and related apparatus Download PDF

Info

Publication number
WO2022218072A1
WO2022218072A1 PCT/CN2022/080254 CN2022080254W WO2022218072A1 WO 2022218072 A1 WO2022218072 A1 WO 2022218072A1 CN 2022080254 W CN2022080254 W CN 2022080254W WO 2022218072 A1 WO2022218072 A1 WO 2022218072A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
evaluated
training samples
subjective
value
Prior art date
Application number
PCT/CN2022/080254
Other languages
English (en)
French (fr)
Inventor
金飞剑
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Priority to JP2023558311A (publication JP2024511103A)
Publication of WO2022218072A1
Priority to US17/986,817 (publication US20230072918A1)

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/16 Indexing scheme for image data processing or generation, in general involving adaptation to the client's capabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30168 Image quality inspection

Definitions

  • the embodiments of the present application relate to technical fields such as artificial intelligence, computer vision (images), or machine learning, and more particularly to evaluating image and video quality based on approximation values.
  • the quality of an image or video can generally be calculated by using an algorithm model to compute a quality index for the video or image.
  • subjective video scoring models and such calculated values can feed back the subjective and objective quality of a video, but neither is ideal in terms of computational complexity and accuracy.
  • QP: quantization parameter
  • PSNR: Peak Signal-to-Noise Ratio
  • SSIM: Structural SIMilarity
  • VMAF: Video Multimethod Assessment Fusion
  • the present application provides a method and a related apparatus for evaluating image and video quality based on an approximation value, which can evaluate image and video quality from a real-time-feedback approximation value that approximates the subjective true value, without increasing the server-side hardware cost and while ensuring evaluation accuracy.
  • the present application provides a method for evaluating image and video quality based on an approximation value, the method being executed by an electronic device with data processing capability, the method comprising:
  • the sample to be evaluated includes a video to be evaluated or an image to be evaluated;
  • the first model is a model obtained based on an offline second model
  • the second model is a model obtained by using k training samples and the subjective truth values of the k training samples as a training set
  • the subjective truth values of the k training samples are obtained by subjective scoring;
  • the first model is a model obtained by fitting the parameters of the k training samples, using as a reference the approximation values of the k training samples obtained with the second model, and k is a positive integer;
  • the quality of the video to be evaluated or the image to be evaluated is evaluated based on the first approximation value.
  • the present application provides an apparatus for evaluating image and video quality based on an approximation value, the apparatus is deployed on an electronic device with data processing capability, and the apparatus includes:
  • an obtaining unit configured to obtain a sample to be evaluated, the sample to be evaluated includes a video to be evaluated or an image to be evaluated;
  • a computing unit configured to calculate, based on the parameters of the sample to be evaluated, a first approximation value that approximates the subjective true value of the sample to be evaluated by using the online first model;
  • the first model is a model obtained based on an offline second model
  • the second model is a model obtained by using k training samples and the subjective truth values of the k training samples as a training set
  • the subjective truth values of the k training samples are obtained by subjective scoring;
  • the first model is a model obtained by fitting the parameters of the k training samples, using as a reference the approximation values of the k training samples obtained with the second model, and k is a positive integer;
  • An evaluation unit configured to evaluate the quality of the video to be evaluated or the image to be evaluated based on the first approximation value.
  • the present application provides a training method for a first model, the method is executed by an electronic device with data processing capability, and the method includes:
  • the parameters of the k training samples are fitted to obtain the first model.
  • the present application provides a training device for a first model, the device is deployed on an electronic device with data processing capability, and the device includes:
  • the first acquisition unit is used to acquire k training samples, and the subjective truth values of the k training samples are acquired by subjective scoring;
  • the first training unit is used to obtain the second model by using the k training samples and the subjective truth values of the k training samples as a training set;
  • the second obtaining unit is used to take the k training samples as input and use the second model to obtain approximation values that approximate the subjective truth values of the k training samples;
  • the second training unit is configured to use the approximation value of the k training samples as a reference, and fit the parameters of the k training samples to obtain the first model.
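As a minimal sketch of the two-stage procedure carried out by these units (the model families, parameter values, and the use of plain least-squares fits are illustrative assumptions; the disclosure does not fix any concrete model form):

```python
import numpy as np

rng = np.random.default_rng(0)

# k training samples, each described by n parameters (e.g. QP, frame rate).
k, n = 200, 3
X = rng.uniform(0.0, 1.0, size=(k, n))
# Toy subjective truth values (MOS) with a little annotator noise.
mos = 5.0 - 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, k)

# Stage 1: train the (offline) second model on the samples + subjective truth.
# A quadratic feature regression stands in here for any trained model.
def features(X):
    return np.hstack([np.ones((len(X), 1)), X, X ** 2])

w2, *_ = np.linalg.lstsq(features(X), mos, rcond=None)
approx_ref = features(X) @ w2  # approximation values output by model 2

# Stage 2: fit the lightweight (online) first model against model 2's
# outputs, not against the raw subjective scores.
A = np.hstack([np.ones((k, 1)), X])  # simple linear fit over the parameters
w1, *_ = np.linalg.lstsq(A, approx_ref, rcond=None)

def first_model(params):
    """Online first model: maps a parameter vector to an approximation value."""
    return np.hstack([1.0, params]) @ w1
```

The key point is that the lightweight first model is fitted to the second model's outputs rather than directly to the subjective scores, so it can run online without the second model's computational cost.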
  • an electronic device comprising:
  • a processor adapted to implement computer instructions
  • a computer-readable storage medium where computer instructions are stored in the computer-readable storage medium, and the computer instructions are adapted to be loaded by the processor and execute the above-mentioned method for evaluating image and video quality based on an approximation value or the above-mentioned method for training the first model.
  • an embodiment of the present application provides a computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium, and when the computer instructions are read and executed by a processor of a computer device, the computer device can perform the above-mentioned method for evaluating image and video quality based on an approximation value.
  • an embodiment of the present application provides a computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the above-mentioned method for evaluating image and video quality based on approximation values or the above-mentioned method for training the first model.
  • the online first model is used to calculate a first approximation value that approximates the subjective true value of the sample to be evaluated; the quality of the video to be evaluated or the image to be evaluated can then be evaluated based on the first approximation value.
  • the first approximation value can be calculated and fed back in real time, and further, image and video quality can be evaluated based on the real-time feedback approximation value that approximates the subjective true value.
  • the first model is constructed as a model obtained based on an offline second model
  • the second model is constructed as a model obtained by using k training samples and the subjective truth values of the k training samples as a training set; the subjective truth values of the k training samples are obtained by subjective scoring
  • the first model is a model obtained by fitting the parameters of the k training samples, using as a reference the approximation values of the k training samples obtained with the second model, and k is a positive integer; equivalently, the trained second model is used to obtain the first model by fitting, which can ensure the evaluation accuracy of the first model without increasing the hardware cost of the server.
  • the method provided by the present application can therefore evaluate image and video quality from a real-time-feedback approximation value that approximates the subjective true value, without increasing the server-side hardware cost and while ensuring evaluation accuracy.
  • FIG. 1 is a schematic diagram of an interface of a subjective scoring platform provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a method for training a quality evaluation model based on randomly selected samples provided by an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of a method for evaluating image and video quality based on an approximation value provided by an embodiment of the present application
  • FIG. 4 is a schematic block diagram of the working principle of the first model provided by the embodiment of the present application.
  • FIG. 5 is a schematic block diagram of a training principle and an evaluation principle of the first model provided by an embodiment of the present application;
  • FIG. 6 is a schematic block diagram of an optimization principle of a first model provided by an embodiment of the present application.
  • FIG. 7 is a schematic block diagram of a service system including a first model provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a training method for a first model provided by an embodiment of the present application.
  • FIG. 9 is a schematic block diagram of an apparatus for evaluating image and video quality based on an approximation value provided by an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of a training apparatus for a first model provided by an embodiment of the present application.
  • FIG. 11 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the solutions provided in this application may involve artificial intelligence technology.
  • AI (artificial intelligence): a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.
  • artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can respond in a similar way to human intelligence.
  • Artificial intelligence is to study the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • artificial intelligence technology is a comprehensive discipline involving a wide range of fields, including both hardware-level technologies and software-level technologies.
  • the basic technologies of artificial intelligence generally include technologies such as sensors, special artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • artificial intelligence technology has been researched and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, and drones. It is believed that with the development of technology, artificial intelligence technology will be applied in more fields and play an increasingly important role.
  • the embodiments of the present application may relate to computer vision (Computer Vision, CV) technology in artificial intelligence technology.
  • Computer vision is a science that studies how to make machines "see"; more specifically, it refers to using cameras and computers in place of human eyes to perform machine vision tasks such as target recognition, tracking and measurement, and to carry out further graphics processing so that the result becomes an image more suitable for human observation or for transmission to instruments for detection.
  • Computer vision studies related theories and technologies, trying to build artificial intelligence systems that can obtain information from images or multidimensional data.
  • Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, 3D object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping. It also includes common biometric identification technologies such as face recognition and fingerprint recognition.
  • the embodiments of the present application may also involve machine learning (ML) in artificial intelligence technology.
  • ML is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in how computers simulate or realize human learning behaviors to acquire new knowledge or skills, and to reorganize existing knowledge structures to continuously improve their performance.
  • Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and its applications are in all fields of artificial intelligence.
  • Machine learning and deep learning usually include artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, teaching learning and other technologies.
  • Image and video quality evaluation refers to the perception, measurement and evaluation of the distortion of images or video frames through subjective and objective methods.
  • the subjective scoring method is generally expressed by the mean opinion score (MOS) or the differential mean opinion score (DMOS); the subjective scoring method may also be referred to as the subjective evaluation method.
  • the objective scoring method generally uses an algorithm model to calculate the quality index of the video/image, and the objective scoring method may also be referred to as a method of using a quality evaluation model to output a quality score.
  • Subjective scoring platform: an annotation platform for subjective scoring of images and videos. Subjective scoring refers to the evaluation by reviewers/annotators of the quality or aesthetics of a given picture or video.
  • FIG. 1 is a schematic diagram of an interface of a subjective scoring platform provided by an embodiment of the present application.
  • the interface of the subjective scoring platform may include videos to be scored, and scoring options.
  • For example, a five-point system can be used for scoring, with the scoring options corresponding to five grades: very good, good, average, poor, and very poor.
  • a video to be scored is generally scored by multiple reviewers, the reviewers select a certain grade, and the final quality score of the video to be scored is obtained according to the average of the scores of all reviewers.
  • the subjective scoring platform needs to select a part of the training samples from the massive image and video library and provide them to the reviewers for subjective scoring.
  • the subjective scoring is obtained by the reviewers subjectively scoring the training samples.
  • the subjective scores of the training samples are obtained by subjective scoring, and the samples marked with subjective scores can be called labeled samples.
  • Through a specific selection strategy, active learning can actively select the training samples that the current model considers hardest to distinguish or most informative, and submit them to the evaluators for scoring. In this way, the sample size that needs to be labeled can be reduced while effectively ensuring model performance.
  • FIG. 2 is a schematic flowchart of a method for training a quality evaluation model based on randomly selected samples provided by an embodiment of the present application.
  • the training process of the quality evaluation model is a "waterfall" algorithm development process.
  • Several (n) training samples are randomly selected from the massive database and placed on the subjective scoring platform for the evaluators (reviewer 1 to reviewer t) to score; after scoring is completed, the model is obtained by training.
  • the method of randomly selecting samples can easily select many worthless training samples; in a massive image and video database in particular, there will be a lot of similar and redundant data.
  • the number of selected training samples needs to be set in advance, which is not easy to control.
  • the subjective scoring and model training are completely isolated, so that the "waterfall" development process will lead to the need to re-score if the subjectively scored dataset is found to be of low quality, which is time-consuming and labor-intensive with low fault tolerance.
  • Mean opinion score (MOS): the final quality score of a training sample as mentioned above.
  • the specific value of this value can be obtained according to the average of the scores of all evaluators.
  • the subjective truth value involved in this application may be MOS.
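For example, with the five-grade scale of FIG. 1 mapped to scores 1 to 5, the MOS of a sample is simply the average of the reviewers' scores (the reviewer count and grades below are invented for illustration):

```python
# Grades from t = 5 reviewers on the 5-point scale (5 = very good ... 1 = very poor).
reviewer_scores = [5, 4, 4, 3, 4]

# The MOS (subjective truth value) is the mean over all reviewers.
mos = sum(reviewer_scores) / len(reviewer_scores)
print(mos)  # 4.0
```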
  • Fitting connects a series of points on a plane with a smooth curve; because there are infinitely many possible connecting curves, there are various fitting methods.
  • the fitted curve can generally be represented by a function, and the fit is named differently depending on the function. Commonly used fitting methods include least squares curve fitting. If the undetermined function is linear, the fit is called linear fitting or linear regression (mainly in statistics); otherwise it is called nonlinear fitting or nonlinear regression.
  • the expression can also be a piecewise function, in which case it is called a spline fit.
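As an illustration of the least-squares linear fitting mentioned above (the data points are invented):

```python
import numpy as np

# A few points lying roughly on the line y = 2x + 1, slightly perturbed.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Degree-1 polynomial fit by least squares (linear fitting / linear regression).
slope, intercept = np.polyfit(x, y, deg=1)

# A spline fit would instead use a piecewise polynomial expression,
# e.g. via scipy.interpolate, rather than a single global function.
```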
  • the first model involved in the present application may be a model obtained by fitting.
  • the prediction result of the first model may be an approximation value that approximates the subjective truth value.
  • a model can be obtained through training.
  • the second model involved in this application may be a model obtained by training.
  • the prediction result of the second model may be an approximation value that approximates the subjective truth value.
  • Image quality assessment: one of the basic technologies in image processing. It mainly analyzes and studies the characteristics of an image and then evaluates the quality of the image (the degree of image distortion). Image quality evaluation plays an important role in algorithm analysis and comparison and in system performance evaluation in image processing systems. In recent years, with extensive research in the field of digital images, image quality evaluation has attracted more and more attention from researchers, and many indicators and methods of image quality evaluation have been proposed and improved.
  • Video quality assessment: one of the basic technologies in video processing. It mainly analyzes and studies the characteristics of a video and then evaluates the quality of the video (objective video quality).
  • subjective video scoring models and calculated metric values can feed back the subjective and objective quality of a video, but their computational complexity and accuracy are not ideal. For example, one can simply use the encoder's quantization parameter (QP), Peak Signal-to-Noise Ratio (PSNR), and Structural SIMilarity (SSIM) to feed back an objective quality score for game video.
  • although the complexity of these metrics is not high, they have a low correlation with subjective quality evaluation, have certain limitations, and introduce certain delay effects.
  • VMAF (Video Multimethod Assessment Fusion): this method is more accurate than PSNR, but its computational complexity is much higher, and it is impossible to calculate it in real time for high-frame-rate, high-resolution video.
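For reference on this complexity comparison, PSNR reduces to a single mean-squared-error computation per frame; a minimal sketch with synthetic 8-bit frames (the frame contents are invented):

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Peak Signal-to-Noise Ratio between two 8-bit frames, in dB."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy frames: a horizontal gradient and a copy with small uniform noise.
ref = np.tile(np.arange(256, dtype=np.uint8), (64, 1))
noise = np.random.default_rng(0).integers(-5, 6, ref.shape)
noisy = np.clip(ref.astype(np.int16) + noise, 0, 255).astype(np.uint8)
```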
  • the present application provides a method for evaluating image and video quality based on an approximation value, which can evaluate image and video quality from a real-time-feedback approximation value that approximates the subjective true value, without increasing the server-side hardware cost and while ensuring evaluation accuracy.
  • FIG. 3 is a schematic flowchart of a method 100 for evaluating image and video quality based on an approximation value provided by an embodiment of the present application.
  • the solutions provided in the embodiments of the present application may be executed by any electronic device with data processing capability.
  • the electronic device may be implemented as a server.
  • the server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, big data, and artificial intelligence platforms. The servers can be directly or indirectly connected through wired or wireless communication, which is not limited in this application.
  • for convenience of description, the following takes a service system as an example.
  • the method 100 may include some or all of the following:
  • S110: Obtain a sample to be evaluated, where the sample to be evaluated includes a video to be evaluated or an image to be evaluated.
  • S120: Based on the parameters of the sample to be evaluated, use the online first model to calculate a first approximation value that approximates the subjective true value of the sample to be evaluated.
  • the first model is a model obtained based on an offline second model
  • the second model is a model obtained by using k training samples and the subjective truth values of the k training samples as a training set
  • the subjective truth values of the k training samples are obtained by subjective scoring;
  • the first model is a model obtained by fitting the parameters of the k training samples, using as a reference the approximation values of the k training samples obtained with the second model, and k is a positive integer.
  • in a cloud gaming scenario, for example, the game runs on the server side; the server captures the rendered game video image by means of screen capture, compresses it with a video encoder, and transmits it to the user client through the network.
  • the user client can then evaluate image and video quality using the above method for evaluating image and video quality based on an approximation value, that is, based on a real-time-feedback approximation value that approximates the subjective true value.
  • the online first model is used to calculate a first approximation value that approximates the subjective true value of the sample to be evaluated; the quality of the video to be evaluated or the image to be evaluated can then be evaluated based on the first approximation value.
  • the first approximation value can be calculated and fed back in real time, and further, image and video quality can be evaluated based on the real-time feedback approximation value that approximates the subjective true value.
  • the first model is constructed as a model obtained based on an offline second model
  • the second model is constructed as a model obtained by using k training samples and the subjective truth values of the k training samples as a training set; the subjective truth values of the k training samples are obtained by subjective scoring
  • the first model is a model obtained by fitting the parameters of the k training samples, using as a reference the approximation values of the k training samples obtained with the second model, and k is a positive integer; equivalently, the trained second model is used to obtain the first model by fitting, which can ensure the evaluation accuracy of the first model without increasing the hardware cost of the server.
  • the method provided by the present application can therefore evaluate image and video quality from a real-time-feedback approximation value that approximates the subjective true value, without increasing the server-side hardware cost and while ensuring evaluation accuracy.
  • an approximation value that approximates the subjective true value of the sample to be evaluated can be calculated online through the first model, and the cloud game video quality is then evaluated through the approximation value calculated by the first model. Equivalently, this can solve the problem of evaluating the subjective quality of video in real time during online game play; for example, it can be used to monitor the overall quality of cloud games online.
  • a set of cloud game video quality evaluation samples close to the application scenario on the user client can be subjectively scored to obtain subjective truth values, which serve as training data for the second model.
  • the interface of the subjective scoring platform shown in FIG. 1 may be used.
  • the process shown in FIG. 2 may be used to train the second model based on randomly selected samples, but the present application is not limited thereto.
  • the parameters of the sample to be evaluated include at least one of the following parameters: feedback parameters of the network module, setting parameters of the cloud game module, and calculation parameters of the codec module.
  • the type of the parameter of the sample to be evaluated may be defined based on the classification of a software development kit (Software Development Kit, SDK).
  • SDKs include but are not limited to network SDKs, cloud game SDKs, and codec SDKs.
  • FIG. 4 is a schematic block diagram of the working principle of the first model provided by the embodiment of the present application.
  • the first model utilizes parameters 1 to n of the input sample to be evaluated. For example, the parameters are selected according to the scenario; they include but are not limited to encoding parameters related to the sample to be evaluated, such as parameters related to the encoded frame and code stream, for example the quantization parameter (QP), motion vector (MV), frame rate, frame length, frame complexity parameter, frame type, etc.
  • the frame complexity parameter can be, for example, the sum of absolute transform differences (Sum of Absolute Transformed Difference, SATD).
  • Frame rate: the number of pictures played per second. For example, 24 frames means 24 pictures per second, 60 frames means 60 pictures per second, and so on.
  • the frame length can refer to the length of the data frame.
  • a first approximation value that approximates the subjective true value of the sample to be evaluated can be obtained by calculating the input parameters 1 to n.
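A hedged sketch of this online computation (the specific parameter set, coefficient values, and linear form are illustrative assumptions; the disclosure only requires some fitted function of parameters 1 to n):

```python
# Coefficients assumed to come from the offline fitting stage; values invented.
FITTED_WEIGHTS = {"qp": -0.08, "frame_rate": 0.01, "satd": -0.002}
FITTED_BIAS = 6.5

def first_approximation(params):
    """Map per-frame encoding parameters to an approximation of the MOS."""
    score = FITTED_BIAS + sum(FITTED_WEIGHTS[name] * value
                              for name, value in params.items())
    return min(5.0, max(1.0, score))  # clamp to the 5-point scale

# Example: one frame's parameters (QP, frame rate, SATD complexity).
print(first_approximation({"qp": 32, "frame_rate": 60, "satd": 400}))
```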
  • the method 100 may further include:
  • Based on the parameters of the k training samples, the first model is used to calculate approximation values for the k training samples, and the first model is evaluated against the approximation values of the k training samples obtained with the second model. For each training sample in the k training samples, if the difference between the approximation value calculated with the first model and the approximation value obtained with the second model is less than or equal to a first preset threshold, the evaluation result of the first model is determined to be positive; otherwise, the evaluation result of the first model is determined to be negative.
  • for any one of the k training samples, if the difference between the approximation value calculated with the first model and the approximation value obtained with the second model is greater than the first preset threshold, the evaluation result of the first model is determined to be negative.
  • the performance or accuracy of the first model can be evaluated by using the approximation values obtained by the second model.
  • the method 100 may further include:
  • the first model is evaluated against the subjective ground truth: for each of the k training samples, if the difference between the approximation value calculated by the first model and the subjective true value is less than or equal to a second preset threshold, the evaluation result of the first model is determined to be positive; otherwise, it is determined to be negative. For example, for any one of the k training samples, if the difference between the approximation value calculated by the first model and the subjective true value is greater than the second preset threshold, the evaluation result of the first model is determined to be negative.
  • the performance or accuracy of the first model can be evaluated using the subjective ground truth.
  • FIG. 5 is a schematic block diagram of a training principle and an evaluation principle of the first model provided by the embodiment of the present application.
  • the fitting process for the first model can be as shown by the dotted single arrow in FIG. 5 .
  • on the one hand, the subjective truth value A of the image frames in the decoded sequence set needs to be obtained by subjective scoring, so that the second model can be trained with the acquired subjective truth value A as the training set; on the other hand, the first model needs to be obtained based on the second model and the parameter set to be input.
  • the optimization process for the first model can be represented by the solid double arrows.
  • the first model can be evaluated based on the approximation value B of the subjective true value output by the second model and the approximation value C of the subjective true value output by the first model.
  • the first model can also be evaluated based on the subjective truth value A and the approximation value C of the subjective truth value output by the first model.
  • the present application can classify cloud game videos by scene, so that a second sub-model in the second model and a first sub-model in the first model can be trained for a scene.
  • a second sub-model in the second model and a first sub-model in the first model can be trained for a game situation in one scene.
  • the embodiments of the present application do not limit the specific classification of scenarios and game occasions.
  • the scene may be the type of device used to play the video to be rated or the image to be rated.
  • the game scene may be a scene of a game screen, such as a battle scene or a non-combat scene.
  • after classifying the cloud game videos, a given category of video (such as the collected source sequence set) can be encoded and compressed using the encoding configuration scheme for that category (with the frame rate, bit rate, resolution, and so on adapted to the characteristics of cloud games) to obtain an encoded and compressed sequence set.
  • relative to the collected source sequence set, the encoded and compressed sequence set has undergone lossy compression, which causes loss of video image quality and detail.
  • the encoded and compressed sequence set is decoded to obtain a decoded sequence set.
  • the decoded sequence set may be the above k training samples.
  • the subjective truth value A of the image frames in the decoded sequence set can be obtained by subjective scoring through a set of standard systems for subjective evaluation of cloud game videos.
  • the encoded and compressed sequence set is decoded and played by a third-party decoder, and human reviewers are then organized to obtain the subjective truth value A of the image frames in the decoded sequence set by subjective scoring according to the standard system for subjective evaluation of cloud game videos.
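A common way to aggregate per-reviewer subjective scores into a single truth value is a mean opinion score (MOS); the sketch below assumes a simple arithmetic mean and a hypothetical 1-to-5 scale, since the patent does not specify the aggregation rule of its standard scoring system:

```python
# Illustrative sketch: aggregating reviewers' subjective scores for one
# decoded frame into a single subjective truth value A (a simple mean
# opinion score). The 1..5 scale and scores below are assumptions.

def subjective_truth(scores):
    """Aggregate reviewer scores for one frame into one truth value."""
    if not scores:
        raise ValueError("at least one reviewer score is required")
    return sum(scores) / len(scores)

# Scores from five hypothetical reviewers for one decoded frame:
truth_a = subjective_truth([4, 5, 4, 3, 4])
```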
  • the second model is trained based on the subjective truth value A of the image frames in the decoded sequence set.
  • the decoded sequence set is produced by third-party decoding of the encoded and compressed sequence set; it then serves as the training set of the second model, and the trained second model can be used to obtain, for the image frames in the decoded sequence set, an approximation value B that closely approaches the subjective truth value A.
  • the present application places essentially no constraint on the computational complexity of the second model and only requires relatively high accuracy; that is, the second model may be a very complex model with very high accuracy.
  • the first model can be obtained by fitting the parameters of the image frames in the decoded sequence set, using the approximation value B of the subjective truth value of those image frames as the reference. Fitting connects a series of points on a plane with a smooth curve; because there are infinitely many possible connecting curves, there are various fitting methods.
  • the fitted curve can generally be represented by a function, and the fitting has different names depending on the function. Commonly used fitting methods include least-squares curve fitting.
  • if the undetermined function is linear, the fitting is called linear fitting or linear regression (a term used mainly in statistics); otherwise, it is called nonlinear fitting or nonlinear regression.
  • the expression can also be a piecewise function, in which case the fitting is called a spline fit.
  • the first model involved in the present application may be a model obtained by fitting.
  • the prediction result of the first model may be an approximation value that approximates the subjective truth value.
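The fitting step above can be sketched with a plain linear least-squares fit, one of the methods named in the text. The two parameters (QP and frame rate), the synthetic sample values, and the second model's approximation values B are all illustrative, not real training data:

```python
# Illustrative least-squares fit: the parameters of the k training
# samples are fitted so that the first model's output tracks the
# approximation values B produced by the second model. All data here
# is synthetic.
import numpy as np

# k = 4 samples, each described by two parameters (e.g. QP, frame rate):
params = np.array([[22.0, 60.0], [30.0, 60.0], [38.0, 30.0], [45.0, 30.0]])
approx_b = np.array([4.6, 4.0, 3.0, 2.2])  # second model's outputs (reference)

# Fit approx_b ~= params @ w + c via linear least squares:
design = np.hstack([params, np.ones((params.shape[0], 1))])
coeffs, *_ = np.linalg.lstsq(design, approx_b, rcond=None)

def first_model(qp, frame_rate):
    """The fitted first model: a cheap closed-form approximation that
    can be evaluated online, unlike the complex second model."""
    return float(np.dot([qp, frame_rate, 1.0], coeffs))

pred = first_model(30.0, 60.0)
```

The design choice matters here: the expensive second model runs offline once to label the training set, while the fitted first model is cheap enough to run per frame online.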
  • the method 100 may further include:
  • if the evaluation result of the first model is positive, the first model is integrated into the service system; if the evaluation result of the first model is negative, the first model is re-fitted until its evaluation result is positive.
  • in other words, when the first model achieves the expected effect, it can be integrated into the service system; if the first model fails to achieve the expected effect, it needs to be re-fitted until it does.
  • the method 100 may further include:
  • the first approximation value is reported to the second model through the statistics module; a second approximation value of the sample to be evaluated is obtained by using the second model; based on the first approximation value and the second approximation value, it is determined whether to optimize the first model. If the difference between the first approximation value and the second approximation value is greater than a third preset threshold, the first model is optimized by using the parameters of the sample to be evaluated and the second approximation value; if the difference between the first approximation value and the second approximation value is less than or equal to the third preset threshold, it is determined that the first model does not need to be optimized.
  • the first approximation value can be used to optimize the first model to improve the accuracy of the first model.
  • before determining whether to optimize the first model based on the first approximation value and the second approximation value, the subjective truth value of the sample to be evaluated may be obtained; if the difference between the second approximation value and the subjective truth value of the sample to be evaluated is greater than a fourth preset threshold, the second model is optimized by using the sample to be evaluated and its subjective truth value.
  • the second model can be optimized by using the first approximation value to improve the accuracy of the second model.
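The two threshold checks above can be sketched as simple decision functions; the threshold values and scores are made-up placeholders for the patent's "third" and "fourth" preset thresholds:

```python
# Illustrative sketch of the optimization-decision logic: the first
# model is re-optimized when its online output drifts from the second
# model's output; the second model is re-optimized when its output
# drifts from the subjective truth value. Thresholds are hypothetical.

def needs_first_model_optimization(approx_1, approx_2, third_threshold):
    """Optimize the first model when |first - second| exceeds the
    third preset threshold."""
    return abs(approx_1 - approx_2) > third_threshold

def needs_second_model_optimization(approx_2, subjective_truth, fourth_threshold):
    """Optimize the second model when |second - truth| exceeds the
    fourth preset threshold."""
    return abs(approx_2 - subjective_truth) > fourth_threshold

opt_first = needs_first_model_optimization(3.2, 3.9, third_threshold=0.5)
opt_second = needs_second_model_optimization(3.9, 4.0, fourth_threshold=0.3)
```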
  • FIG. 6 is a schematic block diagram of the optimization principle of the first model provided by the embodiment of the present application.
  • first, a first model for calculating an approximation value approximating the subjective truth value is obtained, and then the first model is placed in the cloud game server system. On this basis, the acquisition frame (that is, the sample to be evaluated) is first acquired and then encoded to obtain the encoded frame; the parameters and code-stream information carried by the encoded frame are then used to obtain the parameters of the sample to be evaluated that need to be input into the first model.
  • the first model calculates based on the parameters of the sample to be evaluated, and obtains the first approximation value of the sample to be evaluated;
  • the first approximation value is fed back to the platform statistics module, so that the platform statistics module determines whether to optimize the first model or the second model based on the first approximation value.
  • FIG. 7 is a schematic block diagram of a service system including a first model provided by an embodiment of the present application.
  • the fitted first model is integrated into the codec module in the service system, and input parameters, such as parameter P1, parameter P2, and parameter P3, can be obtained from the service system.
  • the parameter P1 represents the feedback parameter of the network module
  • the parameter P2 represents the setting parameter of the cloud game module
  • the parameter P3 represents the calculation parameter of the codec module. Then, the first model calculates the first approximation value of the sample to be evaluated based on the parameter P1, the parameter P2 and the parameter P3.
  • the first approximation value can be reported to the platform statistics module of the service system through the data statistics module, so that the platform statistics module can determine, based on the first approximation value, whether the second model needs to be adjusted and whether the first model needs to be optimized.
  • the platform statistics module can perform statistical classification and determine whether to optimize the first sub-model or the second sub-model corresponding to the classification.
  • the first model may also be optimized based on the optimized second model, which is not specifically limited in this embodiment of the present application. By optimizing the first model, the accuracy of the approximation value can be improved.
  • the first model includes first sub-models corresponding to multiple scenarios
  • the second model includes second sub-models corresponding to the multiple scenarios
  • each first sub-model is a model obtained based on the corresponding second sub-model, and the multiple scenes include the first scene in which the sample to be evaluated is located; based on this, S120 may include: determining the first sub-model corresponding to the first scene, and calculating the first approximation value by using that first sub-model based on the parameters of the sample to be evaluated.
  • parameters based on samples in different scenarios can be fitted into different first sub-models, and different second sub-models can be obtained by training based on samples in different scenarios.
  • the accuracy of the first sub-model and the second sub-model can be improved.
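The per-scene dispatch described above can be sketched as a lookup from scene to sub-model; the scene names and per-scene coefficients are hypothetical:

```python
# Illustrative sketch: selecting the first sub-model that corresponds
# to the scene of the sample to be evaluated. Scene names (e.g. device
# type or game scene) and the per-scene coefficients are assumptions.

def make_sub_model(qp_weight, base):
    """Build one first sub-model with scene-specific coefficients."""
    def sub_model(params):
        return max(1.0, min(5.0, base - qp_weight * params["qp"]))
    return sub_model

# One first sub-model per scene, fitted per scene for better accuracy:
first_sub_models = {
    "mobile_battle": make_sub_model(qp_weight=0.06, base=5.2),
    "pc_non_combat": make_sub_model(qp_weight=0.04, base=5.0),
}

def compute_first_approximation(scene, params):
    """Dispatch to the first sub-model matching the sample's scene."""
    return first_sub_models[scene](params)

value = compute_first_approximation("pc_non_combat", {"qp": 28})
```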
  • the plurality of scenes includes a type of device used to play the video to be rated or the image to be rated.
  • the multiple scenarios may also include an application program or a playback method to which the video to be evaluated or the image to be evaluated belongs, which is not specifically limited in this application.
  • the first model is a semi-reference model or a no-reference model
  • the second model is a full-reference model or a no-reference model
  • the semi-reference model refers to a model obtained by referring to some parameters of the image frame before compression and some parameters of the image frame after compression.
  • the reference-free model refers to the model obtained by referring to the compressed image frame only
  • the full reference model refers to the model obtained by referring to the image frame before compression and the image frame after compression.
  • the semi-reference model may also refer to a model obtained by referring to some parameters in the image frame before encoding and the image frame after encoding
  • the non-reference model may also refer to a model obtained by referring only to the image frame after encoding.
  • the full-reference model may also refer to a model obtained by referring to the image frame before encoding and the image frame after encoding, which is not specifically limited in this application.
  • the present application provides a method for evaluating image video quality based on an approximation value, which can evaluate the subjective video quality in real time.
  • the subjective truth value A of k training samples may be obtained.
  • an approximation value B (very close to A) that approximates the subjective ground-truth value A of the k training samples is obtained through the offline second model.
  • an approximation value approximating the subjective true value of the real-time video of the cloud game can be obtained online through the first model.
  • by updating the second model, the online first model can be optimized based on the updated second model.
  • FIG. 8 is a schematic flowchart of a training method 200 of a first model provided by an embodiment of the present application.
  • the solutions provided in the embodiments of the present application may be executed by any electronic device with data processing capability.
  • the electronic device may be implemented as a server.
  • the server may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, big data, and artificial intelligence platforms; the servers may be directly or indirectly connected through wired or wireless communication, which is not limited in this application.
  • the method 200 may include:
  • the method 200 may further include solutions related to the evaluation and optimization of the first model in the method 100.
  • for the solutions related to the evaluation and optimization of the first model in the method 100, refer to the corresponding descriptions of the method 100; to avoid repetition, they are not repeated here.
  • the size of the sequence numbers of the above-mentioned processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and does not constitute any limitation on the implementation of the embodiments of the present application.
  • the term "and/or" is only an association relationship for describing associated objects, indicating that there may be three kinds of relationships. Specifically, A and/or B can represent three situations: A exists alone, A and B exist at the same time, and B exists alone.
  • the character "/" in this document generally indicates that the related objects are an "or" relationship.
  • FIG. 9 is a schematic block diagram of an apparatus 300 for evaluating image and video quality based on an approximation value provided by an embodiment of the present application.
  • an obtaining unit 310 configured to obtain a sample to be evaluated, the sample to be evaluated includes a video to be evaluated or an image to be evaluated;
  • a computing unit 320 configured to calculate, based on the parameters of the sample to be evaluated and by using the online first model, a first approximation value approximating the subjective true value of the sample to be evaluated;
  • the first model is a model obtained based on an offline second model
  • the second model is a model obtained by using k training samples and the subjective truth values of the k training samples as a training set, where the subjective truth values of the k training samples are obtained by subjective scoring;
  • the first model is a model obtained by fitting the parameters of the k training samples using the approximation values of the k training samples obtained by the second model as a reference, and k is a positive integer;
  • the evaluating unit 330 is configured to evaluate the quality of the video to be evaluated or the image to be evaluated based on the first approximation value.
  • the parameters of the sample to be evaluated include at least one of the following parameters: feedback parameters of the network module, setting parameters of the cloud game module, and calculation parameters of the codec module.
  • the evaluation unit 330 is further configured to:
  • the first model is used to calculate the approximation value of the k training samples
  • if, for each of the k training samples, the difference between the approximation value calculated by the first model and the approximation value obtained by the second model is less than or equal to the first preset threshold, the evaluation result of the first model is determined to be positive; otherwise, the evaluation result of the first model is determined to be negative.
  • the evaluation unit 330 is further configured to:
  • if, for each of the k training samples, the difference between the approximation value calculated by the first model and the subjective truth value is less than or equal to the second preset threshold, the evaluation result of the first model is determined to be positive; otherwise, the evaluation result of the first model is determined to be negative.
  • the evaluation unit 330 is also used to:
  • if the evaluation result of the first model is positive, the first model is integrated into the service system; if the evaluation result of the first model is negative, the first model is re-fitted until its evaluation result is positive.
  • the evaluation unit 330 is also used to:
  • the first approximation value is reported to the second model through the statistical module
  • a second approximation value of the sample to be evaluated is obtained by using the second model; if the difference between the first approximation value and the second approximation value is greater than the third preset threshold, the first model is optimized by using the parameters of the sample to be evaluated and the second approximation value; if the difference between the first approximation value and the second approximation value is less than or equal to the third preset threshold, it is determined that the first model does not need to be optimized.
  • before determining whether to optimize the first model based on the first approximation value and the second approximation value, the evaluation unit 330 is further configured to:
  • obtain the subjective truth value of the sample to be evaluated; if the difference between the second approximation value and the subjective truth value of the sample to be evaluated is greater than the fourth preset threshold, optimize the second model by using the sample to be evaluated and the subjective truth value of the sample to be evaluated.
  • the first model includes first sub-models corresponding to multiple scenarios, and the multiple scenarios include the first scenario where the sample to be evaluated is located; wherein the computing unit 320 is specifically configured to:
  • the first approximation value is calculated using the first sub-model corresponding to the first scene.
  • the plurality of scenes includes a type of device used to play the video to be rated or the image to be rated.
  • the first model is a semi-reference model or a no-reference model
  • the second model is a full-reference model or a no-reference model
  • the semi-reference model refers to a model obtained by referring to some parameters of the image frame before compression and some parameters of the image frame after compression.
  • the reference-free model refers to the model obtained by referring to the compressed image frame only
  • the full reference model refers to the model obtained by referring to the image frame before compression and the image frame after compression.
  • FIG. 10 is a schematic block diagram of an apparatus 400 for training a first model provided by an embodiment of the present application.
  • the apparatus 400 may include:
  • the first obtaining unit 410 is used to obtain k training samples, and the subjective truth values of the k training samples are obtained by subjective scoring;
  • the first training unit 420 is used to obtain the second model by using the k training samples and the subjective truth values of the k training samples as a training set;
  • the second obtaining unit 430 is used for taking the k training samples as input, and using the second model to obtain an approximation value that approximates the subjective truth value of the k training samples;
  • the second training unit 440 is configured to use the approximation value of the k training samples as a reference, and fit the parameters of the k training samples to obtain the first model.
  • the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here.
  • the apparatus 300 may correspond to executing the corresponding subject in the method 100 of the embodiments of the present application, and each unit in the apparatus 300 is to implement the corresponding process in the method 100.
  • the apparatus 400 may correspond to the corresponding execution subject in the method 200 of the embodiments of the present application, and the units in the apparatus 400 are used to implement the corresponding processes in the method 200; for the sake of brevity, details are not repeated here.
  • each unit in the video processing apparatus involved in the embodiments of the present application may be separately or wholly merged into one or several other units, or some unit(s) may be further divided into multiple functionally smaller units, which can achieve the same operations without affecting the technical effects of the embodiments of the present application.
  • the above-mentioned units are divided based on logical functions. In practical applications, the function of one unit may also be implemented by multiple units, or the functions of multiple units may be implemented by one unit. In other embodiments of the present application, the apparatus 300 or the apparatus 400 may also include other units. In practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by cooperation of multiple units.
  • the apparatus 300 or the apparatus 400 involved in the embodiments of the present application may be constructed by running a computer program (including program code) capable of executing the steps of the corresponding method on a general-purpose computing device, such as a general-purpose computer including a central processing unit (CPU), a random access memory (RAM), a read-only memory (ROM), and the like, so as to implement the method for evaluating image and video quality based on an approximation value or the training method of the first model provided by the embodiments of the present application.
  • the computer program may be recorded on, for example, a computer-readable storage medium, loaded into an electronic device through the computer-readable storage medium, and executed in the electronic device, so as to implement the corresponding methods of the embodiments of the present application.
  • the units mentioned above can be implemented in the form of hardware, can also be implemented by instructions in the form of software, and can also be implemented in the form of a combination of software and hardware.
  • the steps of the method embodiments in the embodiments of the present application may be completed by hardware integrated logic circuits in the processor and/or by instructions in the form of software; the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as being completed by a hardware decoding processor, or by a combination of hardware and software in the decoding processor.
  • the software may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • FIG. 11 is a schematic structural diagram of an electronic device 500 provided by an embodiment of the present application.
  • the electronic device 500 includes at least a processor 510 and a computer-readable storage medium 520 .
  • the processor 510 and the computer-readable storage medium 520 may be connected through a bus or other means.
  • the computer-readable storage medium 520 is used for storing the computer program 521
  • the computer program 521 includes computer instructions
  • the processor 510 is used for executing the computer instructions stored in the computer-readable storage medium 520 .
  • the processor 510 is the computing core and the control core of the electronic device 500, and is suitable for implementing one or more computer instructions, specifically for loading and executing one or more computer instructions to implement corresponding method processes or corresponding functions.
  • the processor 510 may also be referred to as a central processing unit (Central Processing Unit, CPU).
  • the processor 510 may include, but is not limited to: a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) Or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
  • the computer-readable storage medium 520 may be a high-speed RAM memory, or a non-volatile memory (Non-Volatile Memory), such as at least one disk memory.
  • the computer-readable storage medium 520 includes, but is not limited to, volatile memory and/or non-volatile memory.
  • the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory.
  • Volatile memory may be Random Access Memory (RAM), which acts as an external cache.
  • by way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
  • the electronic device 500 may be the apparatus 300 for evaluating image and video quality based on approximation values shown in FIG. 9 ;
  • the computer-readable storage medium 520 stores computer instructions;
  • the processor 510 loads and executes the computer instructions stored in the computer-readable storage medium 520 to implement the corresponding steps of the method embodiment shown in FIG. 3; in a specific implementation, the computer instructions in the computer-readable storage medium 520 are loaded by the processor 510 to execute the corresponding steps, which, to avoid repetition, are not repeated here.
  • as another example, the electronic device 500 may be the training apparatus 400 of the first model shown in FIG. 10; the computer-readable storage medium 520 stores computer instructions; the processor 510 loads and executes the computer instructions stored in the computer-readable storage medium 520 to implement the corresponding steps of the method embodiment shown in FIG. 8; in a specific implementation, the computer instructions in the computer-readable storage medium 520 are loaded by the processor 510 to execute the corresponding steps, which, to avoid repetition, are not repeated here.
  • an embodiment of the present application further provides a computer-readable storage medium (Memory), where the computer-readable storage medium is a memory device in the electronic device 500 for storing programs and data.
  • computer readable storage medium 520 may include both a built-in storage medium in the electronic device 500 , and certainly also an extended storage medium supported by the electronic device 500 .
  • the computer-readable storage medium provides storage space in which the operating system of the electronic device 500 is stored.
  • one or more computer instructions suitable for being loaded and executed by the processor 510 are also stored in the storage space, and these computer instructions may be one or more computer programs 521 (including program codes).
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • for example, the electronic device 500 may be a computer; the processor 510 reads the computer instructions from the computer-readable storage medium 520 and executes them, so that the computer performs the method for evaluating image and video quality based on an approximation value, or the training method of the first model, provided in the various optional manners above.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner.


Abstract

The present application provides a method and related apparatus for evaluating image and video quality based on an approximation value, relating to technical fields of artificial intelligence such as computer vision (images) and machine learning. The method may, based on acquired parameters of a sample to be evaluated, calculate a first approximation value approximating the subjective true value of the sample to be evaluated by using an online first model; the first model is a model obtained based on an offline second model; the second model is a model obtained by using k training samples and the subjective true values of the k training samples as a training set; the first model is a model obtained by fitting the parameters of the k training samples using the approximation values of the k training samples obtained by the second model as a reference, where k is a positive integer; and image and video quality is evaluated based on the first approximation value. Image and video quality can thus be evaluated based on a real-time feedback approximation value approaching the subjective true value, without increasing server-side hardware cost and while ensuring evaluation accuracy.

Description

Method and related apparatus for evaluating image and video quality based on an approximation value
This application claims priority to the Chinese patent application with application No. 202110395015.2, filed with the China National Intellectual Property Administration on April 13, 2021 and entitled "Method and related apparatus for evaluating image and video quality based on an approximation value", the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to technical fields of artificial intelligence such as computer vision (images) and machine learning, and more specifically, to evaluating image and video quality based on an approximation value.
Background
For the quality of an image/video, an algorithmic model can generally be used to calculate a quality index of the video/image.
Usually, video subjective scoring models and computed metrics can feed back the subjective and objective quality of a video, but they are not ideal in terms of either computational complexity or accuracy. For example, the encoder's quantization parameter (QP), peak signal-to-noise ratio (Peak Signal-to-Noise Ratio, PSNR), or structural similarity (Structural SIMilarity, SSIM) alone can be used to feed back an objective quality score of a game video; although the complexity of this approach is not high, its correlation with subjective quality evaluation is low, it has certain limitations, and it introduces a certain delay. As another example, an objective quality score of a game video can be obtained through models such as Video Multimethod Assessment Fusion (VMAF); compared with PSNR, this approach is more accurate, but its computational complexity is much higher, so it cannot achieve real-time computation for high-frame-rate, high-resolution video.
In addition, since cloud games come in many varieties and massive amounts of video data require subjective video scores to be obtained in real time, the related art has no model or related solution for feeding back subjective quality in real time.
Therefore, there is an urgent need in the art for a method capable of evaluating image and video quality based on real-time feedback of subjective quality.
发明内容
本申请提供了一种基于逼近值评估图像视频质量的方法和相关装置,能够在不增加服务器端硬件成本和保证评估准确度的情况下,基于实时反馈的逼近主观真值的逼近值评估图像视频质量。
一方面,本申请提供了一种基于逼近值评估图像视频质量的方法,该方法由具有数据处理能力的电子设备执行,该方法包括:
获取待评价样本,该待评价样本包括待评价视频或待评价图像;
基于该待评价样本的参数,利用线上的第一模型计算逼近该待评价样本的主观真值的第一逼近值;
其中,该第一模型为基于离线的第二模型得到的模型,该第二模型为以k个训练样本和该k个训练样本的主观真值作为训练集得到的模型,该k个训练样本的主观真值以主观打分的方式获取,该第一模型为利用该第二模型获取的该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合得到的模型,k为正整数;
基于该第一逼近值评估该待评价视频或该待评价图像的质量。
另一方面,本申请提供了一种基于逼近值评估图像视频质量的装置,该装置部署在具有数据处理能力的电子设备上,该装置包括:
获取单元,用于获取待评价样本,该待评价样本包括待评价视频或待评价图像;
计算单元,用于基于该待评价样本的参数,利用线上的第一模型计算逼近该待评价样本的主观真值的第一逼近值;
其中,该第一模型为基于离线的第二模型得到的模型,该第二模型为以k个训练样本和该k个训练样本的主观真值作为训练集得到的模型,该k个训练样本的主观真值以主观打分的方式获取,该第一模型为利用该第二模型获取的该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合得到的模型,k为正整数;
评估单元,用于基于该第一逼近值评估该待评价视频或该待评价图像的质量。
另一方面,本申请提供了一种第一模型的训练方法,该方法由具有数据处理能力的电子设备执行,该方法包括:
获取k个训练样本,该k个训练样本的主观真值以主观打分的方式获取;
以该k个训练样本和该k个训练样本的主观真值作为训练集,得到第二模型;
以该k个训练样本作为输入,利用该第二模型获取逼近该k个训练样本的主观真值的逼近值;
利用该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合以得到该第一模型。
另一方面,本申请提供了一种第一模型的训练装置,该装置部署在具有数据处理能力的电子设备上,该装置包括:
第一获取单元,用于获取k个训练样本,该k个训练样本的主观真值以主观打分的方式获取;
第一训练单元,用于以该k个训练样本和该k个训练样本的主观真值作为训练集,得到第二模型;
第二获取单元,用于以该k个训练样本作为输入,利用该第二模型获取逼近该k个训练样本的主观真值的逼近值;
第二训练单元,用于利用该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合以得到该第一模型。
另一方面,本申请提供了一种电子设备,包括:
处理器,适于实现计算机指令;以及,
计算机可读存储介质,计算机可读存储介质存储有计算机指令,计算机指令适于由处理器加载并执行上述基于逼近值评估图像视频质量的方法或上述第一模型的训练方法。
另一方面,本申请实施例提供一种计算机可读存储介质,该计算机可读存储介质存储有计算机指令,该计算机指令被计算机设备的处理器读取并执行时,使得计算机设备执行上述基于逼近值评估图像视频质量的方法或上述第一模型的训练方法。
另一方面,本申请实施例提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行上述基于逼近值评估图像视频质量的方法或上述第一模型的训练方法。
本申请实施例中,基于待评价样本的参数,利用线上的第一模型计算逼近该待评价样本的主观真值的第一逼近值;进而基于该第一逼近值评估该待评价视频或该待评价图像的质量。一方面,通过线上的第一模型,能够实时计算并反馈该第一逼近值,进而,可以基于实时反馈的逼近主观真值的逼近值评估图像视频质量。另一方面,将该第一模型构造为基于离线的第二模型得到的模型,且将该第二模型构造为以k个训练样本和该k个训练样本的主观真值作为训练集得到的模型,该k个训练样本的主观真值以主观打分的方式获取,该第一模型为利用该第二模型获取的该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合得到的模型,k为正整数;相当于,利用训练得到的第二模型通过拟合的方式得到第一模型,能够在不增加服务器端硬件成本下,保证第一模型的评估准确度。
综上,本申请提供的方法,能够在不增加服务器端硬件成本和保证评估准确度的情况下,基于实时反馈的逼近主观真值的逼近值评估图像视频质量。
附图说明
为了更清楚地说明本申请实施例的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本申请实施例提供的主观打分平台的界面的示意图;
图2是本申请实施例提供的基于随机选择的样本训练质量评价模型的方法的示意性流程图;
图3是本申请实施例提供的基于逼近值评估图像视频质量的方法的示意性流程图;
图4是本申请实施例提供的第一模型的工作原理的示意性框图;
图5是本申请实施例提供的第一模型的训练原理和评估原理的示意性框图;
图6是本申请实施例提供的第一模型的优化原理的示意性框图;
图7是本申请实施例提供的包括第一模型的服务系统的示意性框图;
图8是本申请实施例提供的第一模型的训练方法的示意性流程图;
图9是本申请实施例提供的基于逼近值评估图像视频质量的装置的示意性框图;
图10是本申请实施例提供的第一模型的训练装置的示意性框图;
图11是本申请实施例提供的电子设备的示意性框图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请提供的方案可涉及人工智能技术。
其中,人工智能(Artificial Intelligence,AI)是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能,感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用系统。换句话说,人工智能是计算机科学的一个综合技术,它企图了解智能的实质,并生产出一种新的能以人类智能相似的方式做出反应的智能机器。人工智能也就是研究各种智能机器的设计原理与实现方法,使机器具有感知、推理与决策的功能。
应理解,人工智能技术是一门综合学科,涉及领域广泛,既有硬件层面的技术也有软件层面的技术。人工智能基础技术一般包括如传感器、专用人工智能芯片、云计算、分布式存储、大数据处理技术、操作/交互系统、机电一体化等技术。人工智能软件技术主要包括计算机视觉技术、语音处理技术、自然语言处理技术以及机器学习/深度学习等几大方向。
随着人工智能技术研究和进步,人工智能技术在多个领域展开研究和应用,例如常见的智能家居、智能穿戴设备、虚拟助理、智能音箱、智能营销、无人驾驶、自动驾驶、无人机、机器人、智能医疗、智能客服等,相信随着技术的发展,人工智能技术将在更多的领域得到应用,并发挥越来越重要的价值。
本申请实施例可涉及人工智能技术中的计算机视觉(Computer Vision,CV)技术,计算机视觉是一门研究如何使机器“看”的科学,更进一步的说,就是指用摄影机和电脑代替人眼对目标进行识别、跟踪和测量等机器视觉,并进一步做图形处理,使电脑处理成为更适合人眼观察或传送给仪器检测的图像。作为一个科学学科,计算机视觉研究相关的理论和技术,试图建立能够从图像或者多维数据中获取信息的人工智能系统。计算机视觉技术通常包括图像处理、图像识别、图像语义理解、图像检索、OCR、视频处理、视频语义理解、视频内容/行为识别、三维物体重建、3D技术、虚拟现实、增强现实、同步定位与地图构建等技术,还包括常见的人脸识别、指纹识别等生物特征识别技术。
本申请实施例也可以涉及人工智能技术中的机器学习(Machine Learning,ML),ML是一门多领域交叉学科,涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。专门研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心,是使计算机具有智能的根本途径,其应用遍及人工智能的各个领域。机器学习和深度学习通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、式教学习等技术。
为便于对本申请方案的理解,下面对本申请涉及的相关术语进行说明。
(1)图像视频质量评价:指通过主客观的方式对图像或是视频帧的失真进行感知、衡量与评价。主观评分方式一般利用平均主观得分(mean opinion score,MOS)或平均主观得分差(difference mean opinion score,DMOS)来表示,主观评分方式也可称为主观打分方式。客观评分方式一般采用算法模型计算出视频/图像的质量指标,客观评分方式也可称为利用质量评价模型输出质量评分的方式。
(2)主观打分平台:针对图像和视频做主观打分的标注平台,主观打分是指评测员/标注员针对某一张图片或是某一段视频的画质或是美学等做出评分。
图1是本申请实施例提供的主观打分平台的界面的示意图。
如图1所示,主观打分平台的界面可以包括待打分视频,以及打分的选项。例如,可以采用五分制打分,则打分的选项分别对应:很好、好、一般、差、很差共5档。一个待打分视频一般由多个评测员进行评分,评测员选出某一档的评分,该待打分视频最终的质量分数根据所有评测员评分的均值获取得到。还有其他的主观打分方式,比如配对比较,通过让评测员看两张图像或是视频,给出哪一个更好的选择。
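作为示意,上述“由多位评测员的评分求均值得到最终质量分数”的过程可以用如下Python片段表示(评分数据为假设值,仅用于说明五分制打分与均值计算):

```python
# 示意性示例:五分制打分与均值计算(评分数据为假设值)
# 假设评分映射为:很好=5,好=4,一般=3,差=2,很差=1
def mean_score(scores):
    """某一待打分视频最终的质量分数:所有评测员评分的均值。"""
    return sum(scores) / len(scores)

ratings = [5, 4, 4, 3, 4]       # 5 位评测员对同一待打分视频的评分(假设)
print(mean_score(ratings))      # 4.0
```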
(3)主动学习(active learning):主观打分平台需要从海量的图像视频库挑选出一部分训练样本提供给评测员做主观评分,通过评测员对训练样本进行主观打分的方式获取主观评分可称为以主观打分的方式获取训练样本的主观评分,标注有主观评分的样本可称为已标注样本。主动学习(active learning)能够通过特定的选择策略主动挑选出当前模型认为最难区分或是信息量较大的训练样本给评测员进行打分,通过这种方式可以在保证模型性能的同时,有效地减少需要标注的样本量。
(4)被动学习(passive learning):一般采用随机挑选的样本训练模型。
图2是本申请实施例提供的基于随机选择的样本训练质量评价模型的方法的示意性流程图。
如图2所示,质量评价模型的训练过程是一个“瀑布式”的算法开发流程,在海量的数据库中随机挑选若干(n)个训练样本放到主观打分平台供评测员(即评测员1至评测员t)打分,打分完成后再训练得到模型。针对这种训练质量评价模型的方法,随机挑选样本的方法很容易挑选到很多无价值的训练样本,特别是海量图像视频库中会有很多相似、冗余的数据。而且选择的训练样本数需要提前设定,不容易控制。此外,将主观打分和模型训练完全孤立,这样“瀑布式”的开发流程会导致如果发现主观打分后的数据集质量不高,则需要重新打分,耗时耗力且容错率很低。
(5)平均主观得分(mean opinion score,MOS),即上面提到的某个训练样本最终的质量分数,这个值的具体数值可以根据所有评测员的评分的均值获取得到。例如本申请中涉及的主观真值可以是MOS。
(6)拟合:拟合就是把平面上一系列的点,用一条光滑的曲线连接起来。因为连接的曲线有无数种可能,从而有各种拟合方法。拟合的曲线一般可以用函数表示,根据这个函数的不同有不同的拟合名字。常用的拟合方法有最小二乘曲线拟合法等。如果待定函数是线性,就叫线性拟合或者线性回归(主要在统计中),否则叫作非线性拟合或者非线性回归。表达式也可以是分段函数,这种情况下叫作样条拟合。例如本申请涉及的第一模型可以是一个通过拟合的方式得到的模型。该第一模型的预测结果可以是逼近主观真值的逼近值。
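作为对上述最小二乘曲线拟合的一个极简示意(数据点与线性形式均为假设,并非本申请限定的拟合方案),下面用最小二乘法对平面上一系列点拟合一条直线:

```python
# 示意性示例:最小二乘线性拟合(数据为假设值)
def linfit(xs, ys):
    """最小二乘线性拟合:返回斜率 k 与截距 c,使 y ≈ k*x + c。"""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    k = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return k, my - k * mx

xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # 样本的某一参数(假设)
ys = [2.1, 3.9, 6.0, 8.1, 9.9]       # 作为参考的逼近值(假设)
k, c = linfit(xs, ys)
print(round(k, 2), round(c, 2))      # 1.98 0.06
```

待定函数为非线性时,可类似地改用非线性拟合或样条拟合。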
(7)训练:将主观打分后的图像/视频数据集作为输入,通过训练可得到一个模型。例如本申请涉及的第二模型可以是一个通过训练的方式得到的模型。该第二模型的预测结果可以是逼近主观真值的逼近值。
(8)图像质量评价(image quality assessment,IQA):是图像处理中的基本技术之一,主要通过对图像进行特性分析研究,然后评估出图像优劣(图像失真程度)。图像质量评价在图像处理系统中,对于算法分析比较、系统性能评估等方面有着重要的作用。近年来,随着对数字图像领域的广泛研究,图像质量评价的研究也越来越受到研究者的关注,提出并完善了许多图像质量评价的指标和方法。
(9)视频质量评价(video quality assessment,VQA):是视频处理中的基本技术之一,主要通过对视频进行特性分析研究,然后评估出视频优劣(视频客观质量)。
通常情况下,视频主观打分模型和计算值可以反馈视频主客观质量情况,但是从计算复杂度和准确性来说都不是很理想。例如可以单纯利用编码器的量化参数(QP)、峰值信噪比(Peak Signal-to-Noise Ratio,PSNR)、结构相似性(Structural SIMilarity,SSIM)反馈游戏视频客观质量分值,这种方式的复杂度虽然不高,但是与主观质量评价相关性偏低,有一定的局限性,而且会引入一定的时延影响。再如可通过视频多评估融合(Video Multimethod Assessment Fusion,VMAF)等模型获取游戏视频客观质量分值,这种方式相对PSNR而言准确度要高,但是计算复杂度高很多,无法做到高帧率、高分辨率视频的实时计算。
此外,由于云游戏种类繁多,视频海量数据需要实时获取视频主观分值,但相关技术中并没有用于实时反馈主观质量的模型以及相关方案。
因此,本申请提供了一种基于逼近值评估图像视频质量的方法,能够在不增加服务器端硬件成本和保证评估准确度的情况下,基于实时反馈的逼近主观真值的逼近值评估图像视频质量。
图3是本申请实施例提供的基于逼近值评估图像视频质量的方法100的示意性流程图。需要说明的,本申请实施例提供的方案可通过任何具有数据处理能力的电子设备执行。例如,该电子设备可实施为服务器。服务器可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统,还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、以及大数据和人工智能平台等基础云计算服务的云服务器,服务器可以通过有线或无线通信方式进行直接或间接地连接,本申请在此不做限制。为便于描述,下面以服务系统为例进行说明。
如图3所示,该方法100可包括以下中的部分或全部内容:
S110,获取待评价样本,该待评价样本包括待评价视频或待评价图像。
S120,基于该待评价样本的参数,利用线上的第一模型计算逼近该待评价样本的主观真值的第一逼近值。
其中,该第一模型为基于离线的第二模型得到的模型,该第二模型为以k个训练样本和该k个训练样本的主观真值作为训练集得到的模型,该k个训练样本的主观真值以主观打分的方式获取,该第一模型为利用该第二模型获取的该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合得到的模型,k为正整数。
S130,基于该第一逼近值评估该待评价视频或该待评价图像的质量。
在云游戏运行模式下,游戏在服务器端运行,然后通过抓屏的方式获取渲染的游戏视频画面,利用视频编码器压缩后通过网络传送给用户客户端。基于此,用户客户端可以采用基于逼近值评估图像视频质量的方法,基于实时反馈的逼近主观真值的逼近值评估图像视频质量。
本申请实施例中,基于待评价样本的参数,利用线上的第一模型计算逼近该待评价样本的主观真值的第一逼近值;进而基于该第一逼近值评估该待评价视频或该待评价图像的质量。一方面,通过线上的第一模型,能够实时计算并反馈该第一逼近值,进而,可以基于实时反馈的逼近主观真值的逼近值评估图像视频质量。另一方面,将该第一模型构造为基于离线的第二模型得到的模型,且将该第二模型构造为以k个训练样本和该k个训练样本的主观真值作为训练集得到的模型,该k个训练样本的主观真值以主观打分的方式获取,该第一模型为利用该第二模型获取的该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合得到的模型,k为正整数;相当于,利用训练得到的第二模型通过拟合的方式得到第一模型,能够在不增加服务器端硬件成本下,保证第一模型的评估准确度。
综上,本申请提供的方法,能够在不增加服务器端硬件成本和保证评估准确度的情况下,基于实时反馈的逼近主观真值的逼近值评估图像视频质量。换言之,通过第一模型能够在线计算出逼近待评价样本的主观真值的逼近值,然后通过该第一模型计算的逼近值评估云游戏视频质量。相当于,能够解决在线玩游戏的过程中实时评估视频主观质量的问题。例如可用于在线监控云游戏大盘质量。
需要说明的是,本申请实施例中,用户可在用户客户端上利用一套贴近应用场景的云游戏视频质量评估方案,以主观打分的方式获取主观真值,作为第二模型的训练样本。例如可以采用图1所示的主观打分平台的界面,再如,可以采用图2所示的流程基于随机选择的样本训练第二模型,但本申请不限于此。
在一些实施例中,该待评价样本的参数包括以下参数中的至少一项:网络模块的反馈参数,云游戏模块的设定参数以及编解码模块的计算参数。在一种实现方式中,该待评价样本的参数的类型可以基于软件开发工具包(Software Development Kit,SDK)的分类进行定义。例如SDK包括但不限于网络SDK、云游戏SDK以及编解码SDK等。
图4是本申请实施例提供的第一模型的工作原理的示意性框图。
如图4所示,第一模型以待评价样本的参数1~参数n作为输入。这些参数可根据场景选择,例如与待评价样本的编码参数(如编码帧的参数和码流)相关,包括但不限于量化参数(QP)、运动矢量(MV)、帧率、帧长、帧复杂度参数以及帧类型等。帧复杂度参数例如可以是绝对变换差的和(Sum of Absolute Transformed Difference,SATD)。帧率指的是每秒钟播放的图片数量,如24帧即每秒钟播放24张图片,60帧即每秒钟播放60张图片,以此类推。帧长可以指数据帧的长度。第一模型通过对输入的参数1~参数n进行计算,可获得逼近待评价样本的主观真值的第一逼近值。
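作为示意,若假设第一模型被拟合为线性形式(系数与输入参数均为假设值,实际形式取决于场景与具体拟合方法),由参数1~参数n计算第一逼近值的过程可表示为:

```python
# 示意性示例:由输入参数计算逼近主观真值的逼近值(线性形式与数值均为假设)
def first_model(params, weights, bias):
    """由参数1~参数n加权求和得到逼近值,并裁剪到五分制区间。"""
    score = bias + sum(w * p for w, p in zip(weights, params))
    return max(1.0, min(5.0, score))

params  = [30.0, 60.0, 0.4]       # 假设输入:QP、帧率、归一化的帧复杂度(SATD)
weights = [-0.05, 0.02, -0.8]     # 假设的拟合系数
print(first_model(params, weights, bias=4.5))
```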
在一些实施例中,在基于该待评价样本的参数计算该第一逼近值(即S120)之前,该方法100还可包括:
基于该k个训练样本的参数,利用该第一模型计算该k个训练样本的逼近值;基于利用该第二模型获取的该k个训练样本的逼近值和利用该第一模型计算得到的该k个训练样本的逼近值,对该第一模型进行评估;针对该k个训练样本中每一个训练样本,若利用该第一模型计算的逼近值和利用该第二模型获取的逼近值之间的差值小于或等于第一预设阈值,则确定对该第一模型的评估结果为正向,否则确定对该第一模型的评估结果为负向。例如,针对该k个训练样本中任一个训练样本,若利用该第一模型计算的逼近值和利用该第二模型获取的逼近值之间的差值大于该第一预设阈值,则确定对该第一模型的评估结果为负向。
简言之,可利用第二模型获取的逼近值对第一模型的性能或准确度进行评估。
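上述逐样本的阈值判断逻辑可示意如下(阈值0.5为假设值,对应第一预设阈值):

```python
# 示意性示例:逐样本比较第一模型与第二模型的逼近值,评估第一模型(阈值为假设值)
def evaluate(approx_first, approx_second, threshold=0.5):
    """所有训练样本的差值均不超过阈值时评估结果为正向,否则为负向。"""
    ok = all(abs(a - b) <= threshold for a, b in zip(approx_first, approx_second))
    return "正向" if ok else "负向"

print(evaluate([4.1, 3.0, 2.2], [4.0, 3.2, 2.5]))   # 差值均 ≤ 0.5:正向
print(evaluate([4.1, 3.0, 2.2], [4.0, 3.2, 3.0]))   # 存在差值 0.8 > 0.5:负向
```

将比较对象换为主观真值,即得到基于第二预设阈值的另一种评估方式。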
在一些实施例中,在基于该待评价样本的参数计算该第一逼近值(即S120)之前,该方法100还可包括:
基于该k个训练样本的主观真值和利用该第一模型计算得到的该k个训练样本的逼近值,对该第一模型进行评估;针对该k个训练样本中每一个训练样本,若利用该第一模型计算的逼近值和主观真值之间的差值小于或等于第二预设阈值,则确定对该第一模型的评估结果为正向,否则确定对该第一模型的评估结果为负向。例如,针对该k个训练样本中任一个样本,若利用该第一模型计算的逼近值和主观真值之间的差值大于该第二预设阈值,则确定对该第一模型的评估结果为负向。
简言之,可利用主观真值对第一模型的性能或准确度进行评估。
图5是本申请实施例提供的第一模型的训练原理和评估原理的示意性框图。
如图5所示,针对第一模型的拟合过程可如图5中虚线单箭头表示的部分,一方面,需要以主观打分的方式获取解码后的序列集中的图像帧的主观真值A,以便以获取到的主观真值A作为训练集训练第二模型;另一方面,需要基于第二模型和待输入的参数集得到第一模型。针对第一模型的优化过程可如实线双箭头表示的部分,一方面,可以基于第二模型输出的主观真值的逼近值B和第一模型输出的主观真值的逼近值C对该第一模型进行评估,另一方面,也可以基于主观真值A和第一模型输出的主观真值的逼近值C对该第一模型进行评估。
在训练第二模型的过程中,可以包括以下步骤:
(1)、获取解码后序列集。
由于云游戏的种类比较繁多,复杂度不相同,主观感受也不相同。本申请可通过对云游戏视频进行场景分类,以便针对一种场景可以训练出第二模型中的一个第二子模型以及第一模型中的一个第一子模型。当然,也可以对基于场景分类的云游戏视频再基于游戏场合进行切分,以便针对一种场景下的一种游戏场合可以训练出第二模型中的一个第二子模型以及第一模型中的一个第一子模型。本申请实施例对场景和游戏场合的具体分类不作限定。例如,场景可以是用于播放该待评价视频或该待评价图像的设备的类型。游戏场合可以是游戏画面的场景,例如战斗场景或非战斗场景。对云游戏视频进行游戏分类后,针对分类后的某一类视频(例如采集的源序列集),可以根据编码配置方案(针对云游戏特点适配帧率、码率、分辨率等)对该某一类视频进行编码并压缩,以得到编码并压缩后的序列集。编码压缩序列集相对于采集源序列集进行了有损压缩,会带来视频图像质量细节上的损失。然后针对编码并压缩后的序列集进行解码,得到解码后序列集。可选的,该解码后序列集即可为上述k个训练样本。
(2)、以主观打分的方式获取解码后序列集中的图像帧的主观真值A。
由于云游戏的场景不同于实时通信主观视频场景,例如云游戏的场景可涉及手机移动端,固定PC以及电视TV终端。而且游戏玩家对体验的要求也不同于实时通信的用户对体验的要求。本申请中可通过一套云游戏视频主观评价的标准体系,以主观打分的方式获取解码后的序列集中的图像帧的主观真值A。例如,编码并压缩后的序列集通过第三方解码播放,然后组织人工(即评测员)根据云游戏视频主观评价的标准体系,以主观打分的方式获取解码后的序列集中的图像帧的主观真值A。
(3)、基于解码后序列集中的图像帧的主观真值A训练第二模型。
针对编码压缩序列集通过第三方解码制作解码后序列集,然后可作为第二模型的训练集训练第二模型,并利用训练好的第二模型获得解码后序列集中的图像帧的主观真值的逼近值B(无限的接近主观真值A)。本申请对第二模型的性能复杂度基本不做要求,只要求准确性比较高,即该第二模型可以是很复杂且准确性非常高的模型。
(4)、基于该第二模型得到该第一模型。
利用训练好的第二模型获得解码后序列集中的图像帧的主观真值的逼近值B后,可基于该逼近值B以及解码后序列集中的图像帧的参数进行拟合计算,以得到第一模型。拟合即把平面上一系列的点用一条光滑的曲线连接起来,常用的拟合方法如前文术语说明所述,包括最小二乘曲线拟合法等;若待定函数是线性,则为线性拟合或线性回归,否则为非线性拟合或非线性回归,表达式为分段函数时称为样条拟合。由此得到的第一模型的预测结果即为逼近主观真值的逼近值。
在一些实施例中,该方法100还可包括:
若对该第一模型的评估结果为正向,则将该第一模型集成至服务系统;若对该第一模型的评估结果为负向,则重新拟合该第一模型,直至对该第一模型的评估结果为正向。
换言之,若该第一模型达到预期评估效果,则可以将该第一模型集成至服务系统;若该第一模型未达到预期效果,则需要重新拟合该第一模型,直至该第一模型达到预期效果。
在一些实施例中,该方法100还可包括:
通过统计模块将该第一逼近值上报给该第二模型;利用该第二模型获取该待评价样本的第二逼近值;基于该第一逼近值和该第二逼近值,确定是否对该第一模型进行优化;若该第一逼近值和该第二逼近值之间的差值大于第三预设阈值,则利用该待评价样本的参数和该第二逼近值对该第一模型进行优化;若该第一逼近值和该第二逼近值之间的差值小于或等于该第三预设阈值,则确定不需要对该第一模型进行优化。
简言之,可以利用该第一逼近值对第一模型进行优化,以提升第一模型的准确度。
在一些实施例中,在基于该第一逼近值和该第二逼近值确定是否对该第一模型进行优化之前,该方法还包括:获取该待评价样本的主观真值;若该第二逼近值和该待评价样本的主观真值之间的差值大于第四预设阈值,则利用该待评价样本和该待评价样本的主观真值,优化该第二模型。
简言之,可以利用该待评价样本的主观真值对该第二模型进行优化,以提升第二模型的准确度。
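上述基于第三/第四预设阈值决定是否优化第一模型、第二模型的判断可示意如下(两个阈值均为假设值):

```python
# 示意性示例:基于逼近值差异决定是否触发模型优化(阈值均为假设值)
def need_optimize_first(v1, v2, th3=0.3):
    """第一逼近值与第二逼近值的差值大于第三预设阈值时,优化第一模型。"""
    return abs(v1 - v2) > th3

def need_optimize_second(v2, subjective, th4=0.3):
    """第二逼近值与主观真值的差值大于第四预设阈值时,优化第二模型。"""
    return abs(v2 - subjective) > th4

print(need_optimize_first(3.2, 3.8))    # True:0.6 > 0.3
print(need_optimize_second(3.8, 3.7))   # False:0.1 ≤ 0.3
```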
图6是本申请实施例提供的第一模型的优化原理的示意性框图。
如图6所示,首先得到用于计算逼近主观真值的逼近值的第一模型,然后将第一模型放到云游戏服务端系统中。基于此,首先获取采集帧(即待评价样本)后对采集帧进行编码,以得到编码帧;然后通过编码帧带过来的参数和码流信息,得到需要输入到第一模型的待评价样本的参数,将该待评价样本的参数输入到该第一模型后,该第一模型基于该待评价样本的参数进行计算,并得到该待评价样本的第一逼近值;此时,可以通过数据统计模块将该第一逼近值反馈至平台统计模块,以便平台统计模块基于该第一逼近值确定是否对第一模型或第二模型进行优化。
图7是本申请实施例提供的包括第一模型的服务系统的示意性框图。
如图7所示,将拟合完成的第一模型集成到服务系统中的编解码模块中,在服务系统中可以获取输入参数,如参数P1、参数P2以及参数P3等。参数P1表示网络模块的反馈参数,参数P2表示云游戏模块的设定参数,参数P3表示编解码模块的计算参数。然后,第一模型基于参数P1、参数P2以及参数P3计算待评价样本的第一逼近值。在一种可能的实现方式中,该第一逼近值可通过数据统计模块上报至服务系统的平台统计模块,以便平台统计模块基于该第一逼近值确定是否需要对第二模型以及第一模型进行优化。例如平台统计模块可进行统计分类,并确定是否需要对分类对应的第一子模型或第二子模型进行优化。当然,也可以在优化第二模型后,基于优化后的第二模型优化该第一模型,本申请实施例对此不作具体限定。通过优化第一模型,能够提升逼近值的准确度。
在一些实施例中,该第一模型包括多个场景分别对应的第一子模型,该第二模型包括该多个场景分别对应的第二子模型,该多个第一子模型分别为基于对应的第二子模型得到的模型,该多个场景包括该待评价样本所在的第一场景;基于此,该S120可包括:确定该第一场景对应的第一子模型;基于该待评价样本的参数,利用该第一场景对应的第一子模型,计算该第一逼近值。
换言之,基于不同的场景下的样本的参数可拟合为不同的第一子模型,基于不同的场景下的样本可以通过训练的方式得到不同的第二子模型。由此,可以提升第一子模型和第二子模型的准确度。
在一些实施例中,该多个场景包括用于播放该待评价视频或该待评价图像的设备的类型。
当然,在本申请的其他可替代实施例中,该多个场景也可以包括该待评价视频或待评价图像所属的应用程序或者播放方式等,本申请对此不作具体限定。
在一些实施例中,该第一模型为半参考模型或无参考模型,该第二模型为全参考模型或无参考模型,该半参考模型指参考压缩前的图像帧中的部分参数和压缩后的图像帧得到的模型,该无参考模型指仅参考压缩后的图像帧得到的模型,该全参考模型指参考压缩前的图像帧和压缩后的图像帧得到的模型。
当然,该半参考模型也可以指参考编码前的图像帧中的部分参数和编码后的图像帧得到的模型,该无参考模型也可以指仅参考编码后的图像帧得到的模型,该全参考模型也可以指参考编码前的图像帧和编码后的图像帧得到的模型,本申请对此不作具体限定。
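作为全参考方式的一个直观示例(示意性实现,像素数据为假设值,实际全参考模型不限于此),可由压缩前后的图像帧逐像素计算PSNR:

```python
import math

# 示意性示例:全参考指标 PSNR 的逐像素计算(8 比特像素,数据为假设值)
def psnr(ref, dist, peak=255.0):
    """由压缩前帧 ref 与压缩后帧 dist 计算 PSNR(单位 dB)。"""
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    if mse == 0:
        return float("inf")    # 两帧完全相同
    return 10.0 * math.log10(peak ** 2 / mse)

ref  = [100, 120, 130, 140]    # 压缩前帧的像素(假设)
dist = [101, 118, 130, 141]    # 压缩后帧的像素(假设)
print(round(psnr(ref, dist), 2))   # 46.37
```

半参考模型/无参考模型则只依赖压缩前帧的部分参数或仅依赖压缩后帧,无法进行这种完整的逐像素对照计算。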
综上,本申请提供了一种基于逼近值评估图像视频质量的方法,该方法能够实时评估视频主观质量。可选的,可以通过对云游戏种类以及平台等信息进行分类,针对某一分类,获取k个训练样本的主观真值A。然后通过离线的第二模型获取k个训练样本的逼近主观真值A的逼近值B(非常接近A)。然后利用获取到的逼近值B作为参考真值,同时利用输入k个训练样本的多个参数拟合成第一模型(用于计算逼近主观真值A的逼近值C且能快速计算逼近值C)。基于此,可通过该第一模型在线获取云游戏实时视频的逼近主观真值的逼近值。可选的,针对后续新增游戏,可以通过更新第二模型的方式基于更新后的第二模型优化在线第一模型。
图8是本申请实施例提供的第一模型的训练方法200的示意性流程图。需要说明的,本申请实施例提供的方案可通过任何具有数据处理能力的电子设备执行。例如,该电子设备可实施为服务器。服务器可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统,还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、以及大数据和人工智能平台等基础云计算服务的云服务器,服务器可以通过有线或无线通信方式进行直接或间接地连接,本申请在此不做限制。
如图8所示,该方法200可包括:
S210,获取k个训练样本,该k个训练样本的主观真值以主观打分的方式获取。
S220,以该k个训练样本和该k个训练样本的主观真值作为训练集,得到第二模型;
S230,以该k个训练样本作为输入,利用该第二模型获取逼近该k个训练样本的主观真值的逼近值。
S240,利用该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合以得到该第一模型。
需要说明的是,该方法200还可包括方法100中与第一模型评估和优化相关的方案。换言之,该方法200中关于第一模型的评估、优化等相关方案,可参考方法100中的相应方案,为避免重复,此处不再赘述。
以上结合附图详细描述了本申请的优选实施方式,但是,本申请并不限于上述实施方式中的具体细节,在本申请的技术构思范围内,可以对本申请的技术方案进行多种简单变型,这些简单变型均属于本申请的保护范围。例如,在上述具体实施方式中所描述的各个具体技术特征,在不矛盾的情况下,可以通过任何合适的方式进行组合,为了避免不必要的重复,本申请对各种可能的组合方式不再另行说明。又例如,本申请的各种不同的实施方式之间也可以进行任意组合,只要其不违背本申请的思想,其同样应当视为本申请所公开的内容。
还应理解,在本申请的各种方法实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。另外,本申请实施例中,术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系。具体地,A和/或B可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
上文对本申请实施例提供的方法进行了说明,下面对本申请实施例提供的装置进行说明。
图9是本申请实施例提供的基于逼近值评估图像视频质量的装置300的示意性框图。如图9所示,该装置300可包括:
获取单元310,用于获取待评价样本,该待评价样本包括待评价视频或待评价图像;
计算单元320,用于基于该待评价样本的参数,利用线上的第一模型计算逼近该待评价样本的主观真值的第一逼近值;
其中,该第一模型为基于离线的第二模型得到的模型,该第二模型为以k个训练样本和该k个训练样本的主观真值作为训练集得到的模型,该k个训练样本的主观真值以主观打分的方式获取,该第一模型为利用该第二模型获取的该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合得到的模型,k为正整数;
评估单元330,用于基于该第一逼近值评估该待评价视频或该待评价图像的质量。
在一些实施例中,该待评价样本的参数包括以下参数中的至少一项:网络模块的反馈参数,云游戏模块的设定参数以及编解码模块的计算参数。
在一些实施例中,该计算单元320基于该待评价样本的参数,利用线上的第一模型计算逼近该待评价样本的主观真值的第一逼近值之前,该评估单元330还用于:
基于该k个训练样本的参数,利用该第一模型计算该k个训练样本的逼近值;
基于利用该第二模型获取的该k个训练样本的逼近值和利用该第一模型计算得到的该k个训练样本的逼近值,对该第一模型进行评估;
针对该k个训练样本中每一个训练样本,若利用该第一模型计算的逼近值和利用该第二模型获取的逼近值之间的差值小于或等于第一预设阈值,则确定对该第一模型的评估结果为正向,否则确定对该第一模型的评估结果为负向。
在一些实施例中,该计算单元320基于该待评价样本的参数,利用线上的第一模型计算逼近该待评价样本的主观真值的第一逼近值之前,该评估单元330还用于:
基于该k个训练样本的主观真值和利用该第一模型计算得到的该k个训练样本的逼近值,对该第一模型进行评估;
针对该k个训练样本中每一个训练样本,若利用该第一模型计算的逼近值和主观真值之间的差值小于或等于第二预设阈值,则确定对该第一模型的评估结果为正向,否则确定对该第一模型的评估结果为负向。
在一些实施例中,该评估单元330还用于:
若对该第一模型的评估结果为正向,则将该第一模型集成至服务系统;若对该第一模型的评估结果为负向,则重新拟合该第一模型,直至对该第一模型的评估结果为正向。
在一些实施例中,该评估单元330还用于:
通过统计模块将该第一逼近值上报给该第二模型;
利用该第二模型获取该待评价样本的第二逼近值;
若该第一逼近值和该第二逼近值之间的差值大于第三预设阈值,则利用该待评价样本的参数和该第二逼近值对该第一模型进行优化;若该第一逼近值和该第二逼近值之间的差值小于或等于该第三预设阈值,则确定不需要对该第一模型进行优化。
在一些实施例中,该评估单元330基于该第一逼近值和该第二逼近值,确定是否对该第一模型进行优化之前,该评估单元330还用于:
获取该待评价样本的主观真值;
若该第二逼近值和该待评价样本的主观真值之间的差值大于第四预设阈值,则利用该待评价样本和该待评价样本的主观真值,优化该第二模型。
在一些实施例中,该第一模型包括多个场景分别对应的第一子模型,该多个场景包括该待评价样本所在的第一场景;其中,计算单元320具体用于:
确定该第一场景对应的第一子模型;
基于该待评价样本的参数,利用该第一场景对应的第一子模型,计算该第一逼近值。
在一些实施例中,该多个场景包括用于播放该待评价视频或该待评价图像的设备的类型。
在一些实施例中,该第一模型为半参考模型或无参考模型,该第二模型为全参考模型或无参考模型,该半参考模型指参考压缩前的图像帧中的部分参数和压缩后的图像帧得到的模型,该无参考模型指仅参考压缩后的图像帧得到的模型,该全参考模型指参考压缩前的图像帧和压缩后的图像帧得到的模型。
图10是本申请实施例提供的第一模型的训练装置400的示意性框图。
如图10所示,该装置400可包括:
第一获取单元410,用于获取k个训练样本,该k个训练样本的主观真值以主观打分的方式获取;
第一训练单元420,用于以该k个训练样本和该k个训练样本的主观真值作为训练集,得到第二模型;
第二获取单元430,用于以该k个训练样本作为输入,利用该第二模型获取逼近该k个训练样本的主观真值的逼近值;
第二训练单元440,用于利用该k个训练样本的逼近值为参考,对该k个训练样本的参数进行拟合以得到该第一模型。
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,该装置300可以对应于执行本申请实施例的方法100中的相应主体,并且该装置300中的各个单元分别为了实现方法100中的相应流程,类似的,装置400可对应于执行本申请实施例的方法200中的相应主体,并且该装置400中的单元可用于实现方法200中的流程,为了简洁,在此不再赘述。
还应当理解,本申请实施例涉及的视频处理装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其它实施例中,该装置300或该装置400也可以包括其它单元,在实际应用中,这些功能也可以由其它单元协助实现,并且可以由多个单元协作实现。根据本申请的另一个实施例,可以通过在包括例如中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的通用计算机的通用计算设备上运行能够执行相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造本申请实施例涉及的装置300或该装置400,以及来实现本申请实施例提供的基于逼近值评估图像视频质量的方法或第一模型的训练方法。计算机程序可以记载于例如计算机可读存储介质上,并通过计算机可读存储介质装载于电子设备中,并在其中运行,来实现本申请实施例的相应方法。
换言之,上文涉及的单元可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过软硬件结合的形式实现。具体地,本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件组合执行完成。可选地,软件可以位于随机存储器,闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图11是本申请实施例提供的电子设备500的示意结构图。
如图11所示,该电子设备500至少包括处理器510以及计算机可读存储介质520。其中,处理器510以及计算机可读存储介质520可通过总线或者其它方式连接。计算机可读存储介质520用于存储计算机程序521,计算机程序521包括计算机指令,处理器510用于执行计算机可读存储介质520存储的计算机指令。处理器510是电子设备500的计算核心以及控制核心,其适于实现一条或多条计算机指令,具体适于加载并执行一条或多条计算机指令从而实现相应方法流程或相应功能。
作为示例,处理器510也可称为中央处理器(Central Processing Unit,CPU)。处理器510可以包括但不限于:通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等等。
作为示例,计算机可读存储介质520可以是高速RAM存储器,也可以是非易失性的存储器(Non-Volatile Memory),例如至少一个磁盘存储器;可选的,还可以是至少一个位于远离前述处理器510的计算机可读存储介质。具体而言,计算机可读存储介质520包括但不限于:易失性存储器和/或非易失性存储器。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。
在一种实现方式中,该电子设备500可以是图9所示的基于逼近值评估图像视频质量的装置300;该计算机可读存储介质520中存储有计算机指令;由处理器510加载并执行计算机可读存储介质520中存放的计算机指令,以实现图3所示方法实施例中的相应步骤;具体实现中,计算机可读存储介质520中的计算机指令由处理器510加载并执行相应步骤,为避免重复,此处不再赘述。
在一种实现方式中,该电子设备500可以是图10所示的第一模型的训练装置400;该计算机可读存储介质520中存储有计算机指令;由处理器510加载并执行计算机可读存储介质520中存放的计算机指令,以实现图8所示方法实施例中的相应步骤;具体实现中,计算机可读存储介质520中的计算机指令由处理器510加载并执行相应步骤,为避免重复,此处不再赘述。
根据本申请的另一方面,本申请实施例还提供了一种计算机可读存储介质(Memory),计算机可读存储介质是电子设备500中的记忆设备,用于存放程序和数据。例如,计算机可读存储介质520。可以理解的是,此处的计算机可读存储介质520既可以包括电子设备500中的内置存储介质,当然也可以包括电子设备500所支持的扩展存储介质。计算机可读存储介质提供存储空间,该存储空间存储了电子设备500的操作系统。并且,在该存储空间中还存放了适于被处理器510加载并执行的一条或多条的计算机指令,这些计算机指令可以是一个或多个的计算机程序521(包括程序代码)。
根据本申请的另一方面,提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。例如,计算机程序521。此时,电子设备500可以是计算机,处理器510从计算机可读存储介质520读取该计算机指令,处理器510执行该计算机指令,使得该计算机执行上述各种可选方式中提供的基于逼近值评估图像视频质量的方法或第一模型的训练方法。
换言之,当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。该计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行该计算机程序指令时,全部或部分地运行本申请实施例的流程或实现本申请实施例的功能。该计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。该计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质进行传输,例如,该计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元以及流程步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
最后需要说明的是,以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应所述以权利要求的保护范围为准。

Claims (16)

  1. 一种基于逼近值评估图像视频质量的方法,所述方法由具有数据处理能力的电子设备执行,所述方法包括:
    获取待评价样本,所述待评价样本包括待评价视频或待评价图像;
    基于所述待评价样本的参数,利用线上的第一模型计算逼近所述待评价样本的主观真值的第一逼近值;
    其中,所述第一模型为基于离线的第二模型得到的模型,所述第二模型为以k个训练样本和所述k个训练样本的主观真值作为训练集得到的模型,所述k个训练样本的主观真值以主观打分的方式获取,所述第一模型为利用所述第二模型获取的所述k个训练样本的逼近值为参考,对所述k个训练样本的参数进行拟合得到的模型,k为正整数;
    基于所述第一逼近值评估所述待评价视频或所述待评价图像的质量。
  2. 根据权利要求1所述的方法,所述待评价样本的参数包括以下参数中的至少一项:网络模块的反馈参数,云游戏模块的设定参数以及编解码模块的计算参数。
  3. 根据权利要求1所述的方法,所述基于所述待评价样本的参数,利用线上的第一模型计算逼近所述待评价样本的主观真值的第一逼近值之前,所述方法还包括:
    基于所述k个训练样本的参数,利用所述第一模型计算所述k个训练样本的逼近值;
    基于利用所述第二模型获取的所述k个训练样本的逼近值和利用所述第一模型计算得到的所述k个训练样本的逼近值,对所述第一模型进行评估;
    针对所述k个训练样本中每一个训练样本,若利用所述第一模型计算的逼近值和利用所述第二模型获取的逼近值之间的差值小于或等于第一预设阈值,则确定对所述第一模型的评估结果为正向,否则确定对所述第一模型的评估结果为负向。
  4. 根据权利要求1所述的方法,所述基于所述待评价样本的参数,利用线上的第一模型计算逼近所述待评价样本的主观真值的第一逼近值之前,所述方法还包括:
    基于所述k个训练样本的主观真值和利用所述第一模型计算得到的所述k个训练样本的逼近值,对所述第一模型进行评估;
    针对所述k个训练样本中每一个训练样本,若利用所述第一模型计算的逼近值和主观真值之间的差值小于或等于第二预设阈值,则确定对所述第一模型的评估结果为正向,否则确定对所述第一模型的评估结果为负向。
  5. 根据权利要求3或4所述的方法,所述方法还包括:
    若对所述第一模型的评估结果为正向,则将所述第一模型集成至服务系统;若对所述第一模型的评估结果为负向,则重新拟合所述第一模型,直至对所述第一模型的评估结果为正向。
  6. 根据权利要求1所述的方法,所述方法还包括:
    通过统计模块将所述第一逼近值上报给所述第二模型;
    利用所述第二模型获取所述待评价样本的第二逼近值;
    若所述第一逼近值和所述第二逼近值之间的差值大于第三预设阈值,则利用所述待评价样本的参数和所述第二逼近值对所述第一模型进行优化;
    若所述第一逼近值和所述第二逼近值之间的差值小于或等于所述第三预设阈值,则确定不需要对所述第一模型进行优化。
  7. 根据权利要求6所述的方法,所述若所述第一逼近值和所述第二逼近值之间的差值大于第三预设阈值,则利用所述待评价样本的参数和所述第二逼近值对所述第一模型进行优化之前,所述方法还包括:
    获取所述待评价样本的主观真值;
    若所述第二逼近值和所述待评价样本的主观真值之间的差值大于第四预设阈值,则利用所述待评价样本和所述待评价样本的主观真值,优化所述第二模型。
  8. 根据权利要求1所述的方法,所述第一模型包括多个场景分别对应的第一子模型,所述多个场景包括所述待评价样本所在的第一场景;
    所述基于所述待评价样本的参数,利用线上的第一模型计算逼近所述待评价样本的主观真值的第一逼近值,包括:
    确定所述第一场景对应的第一子模型;
    基于所述待评价样本的参数,利用所述第一场景对应的第一子模型,计算所述第一逼近值。
  9. 根据权利要求8所述的方法,所述多个场景包括用于播放所述待评价视频或所述待评价图像的设备的类型。
  10. 根据权利要求1~9任一项所述的方法,所述第一模型为半参考模型或无参考模型,所述第二模型为全参考模型或无参考模型,所述半参考模型指参考压缩前的图像帧中的部分参数和压缩后的图像帧得到的模型,所述无参考模型指仅参考压缩后的图像帧得到的模型,所述全参考模型指参考压缩前的图像帧和压缩后的图像帧得到的模型。
  11. 一种基于逼近值评估图像视频质量的装置,所述装置部署在具有数据处理能力的电子设备上,所述装置包括:
    获取单元,用于获取待评价样本,所述待评价样本包括待评价视频或待评价图像;
    计算单元,用于基于所述待评价样本的参数,利用线上的第一模型计算逼近所述待评价样本的主观真值的第一逼近值;
    其中,所述第一模型为基于离线的第二模型得到的模型,所述第二模型为以k个训练样本和所述k个训练样本的主观真值作为训练集得到的模型,所述k个训练样本的主观真值以主观打分的方式获取,所述第一模型为利用所述第二模型获取的所述k个训练样本的逼近值为参考,对所述k个训练样本的参数进行拟合得到的模型,k为正整数;
    评估单元,用于基于所述第一逼近值评估所述待评价视频或所述待评价图像的质量。
  12. 一种第一模型的训练方法,所述方法由具有数据处理能力的电子设备执行,所述方法包括:
    获取k个训练样本,所述k个训练样本的主观真值以主观打分的方式获取;
    以所述k个训练样本和所述k个训练样本的主观真值作为训练集,得到第二模型;
    以所述k个训练样本作为输入,利用所述第二模型获取逼近所述k个训练样本的主观真值的逼近值;
    利用所述k个训练样本的逼近值为参考,对所述k个训练样本的参数进行拟合以得到所述第一模型。
  13. 一种第一模型的训练装置,所述装置部署在具有数据处理能力的电子设备上,所述装置包括:
    第一获取单元,用于获取k个训练样本,所述k个训练样本的主观真值以主观打分的方式获取;
    第一训练单元,用于以所述k个训练样本和所述k个训练样本的主观真值作为训练集,得到第二模型;
    第二获取单元,用于以所述k个训练样本作为输入,利用所述第二模型获取逼近所述k个训练样本的主观真值的逼近值;
    第二训练单元,用于利用所述k个训练样本的逼近值为参考,对所述k个训练样本的参数进行拟合以得到所述第一模型。
  14. 一种电子设备,包括:
    处理器,适于执行计算机程序;
    计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,所述计算机程序被所述处理器执行时,实现如权利要求1至10中任一项所述的基于逼近值评估图像视频质量的方法或如权利要求12所述的第一模型的训练方法。
  15. 一种计算机可读存储介质,用于存储计算机程序,所述计算机程序使得计算机执行如权利要求1至10中任一项所述的基于逼近值评估图像视频质量的方法或如权利要求12所述的第一模型的训练方法。
  16. 一种计算机程序产品,当所述计算机程序产品被执行时,用于执行如权利要求1至10中任一项所述的基于逼近值评估图像视频质量的方法或如权利要求12所述的第一模型的训练方法。
PCT/CN2022/080254 2021-04-13 2022-03-11 基于逼近值评估图像视频质量的方法和相关装置 WO2022218072A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2023558311A JP2024511103A (ja) 2021-04-13 2022-03-11 近似値に基づいて画像又はビデオの品質を評価する方法及び装置、第1のモデルの訓練方法及び装置、電子機器、記憶媒体、並びにコンピュータプログラム
US17/986,817 US20230072918A1 (en) 2021-04-13 2022-11-14 Assessing image/video quality using an online model to approximate subjective quality values

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110395015.2A CN115205188A (zh) 2021-04-13 2021-04-13 基于逼近值评估图像视频质量的方法和相关装置
CN202110395015.2 2021-04-13

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/986,817 Continuation US20230072918A1 (en) 2021-04-13 2022-11-14 Assessing image/video quality using an online model to approximate subjective quality values

Publications (1)

Publication Number Publication Date
WO2022218072A1 true WO2022218072A1 (zh) 2022-10-20

Family

ID=83571687

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/080254 WO2022218072A1 (zh) 2021-04-13 2022-03-11 基于逼近值评估图像视频质量的方法和相关装置

Country Status (4)

Country Link
US (1) US20230072918A1 (zh)
JP (1) JP2024511103A (zh)
CN (1) CN115205188A (zh)
WO (1) WO2022218072A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116506622B (zh) * 2023-06-26 2023-09-08 瀚博半导体(上海)有限公司 模型训练方法及视频编码参数优化方法和装置
CN117591815A (zh) * 2023-10-31 2024-02-23 中国科学院空天信息创新研究院 面向多模态伪造生成数据的综合质量评估方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1763248A1 (en) * 2005-09-13 2007-03-14 Siemens Aktiengesellschaft Computation of subjective video quality
CN104902267A (zh) * 2015-06-08 2015-09-09 浙江科技学院 一种基于梯度信息的无参考图像质量评价方法
CN108665455A (zh) * 2018-05-14 2018-10-16 北京航空航天大学 图像显著性预测结果的评价方法和装置
CN111524110A (zh) * 2020-04-16 2020-08-11 北京微吼时代科技有限公司 视频质量的评价模型构建方法、评价方法及装置
CN111741330A (zh) * 2020-07-17 2020-10-02 腾讯科技(深圳)有限公司 一种视频内容评估方法、装置、存储介质及计算机设备


Also Published As

Publication number Publication date
US20230072918A1 (en) 2023-03-09
CN115205188A (zh) 2022-10-18
JP2024511103A (ja) 2024-03-12

Similar Documents

Publication Publication Date Title
CN111950653B (zh) 视频处理方法和装置、存储介质及电子设备
WO2022218072A1 (zh) 基于逼近值评估图像视频质量的方法和相关装置
Chen et al. Learning generalized spatial-temporal deep feature representation for no-reference video quality assessment
CN111814620B (zh) 人脸图像质量评价模型建立方法、优选方法、介质及装置
Yang et al. 3D panoramic virtual reality video quality assessment based on 3D convolutional neural networks
US9105119B2 (en) Anonymization of facial expressions
US20230353828A1 (en) Model-based data processing method and apparatus
CN110751649B (zh) 视频质量评估方法、装置、电子设备及存储介质
Yu et al. Predicting the quality of compressed videos with pre-existing distortions
US11868738B2 (en) Method and apparatus for generating natural language description information
WO2021184754A1 (zh) 视频对比方法、装置、计算机设备和存储介质
Ji et al. Blind image quality assessment with semantic information
Li et al. Local and global sparse representation for no-reference quality assessment of stereoscopic images
CN116261009A (zh) 智能转化影视受众的视频检测方法、装置、设备及介质
CN117009577A (zh) 一种视频数据处理方法、装置、设备及可读存储介质
WO2020233536A1 (zh) Vr视频质量评估方法及装置
Cha et al. A Gaze-based Real-time and Low Complexity No-reference Video Quality Assessment Technique for Video Gaming
CN113538324A (zh) 评估方法、模型训练方法、装置、介质及电子设备
WO2024041268A1 (zh) 视频质量评估方法、装置、计算机设备、计算机存储介质及计算机程序产品
CN114863138B (zh) 图像处理方法、装置、存储介质及设备
WO2024109138A1 (zh) 视频编码方法、装置及存储介质
Hao et al. Context-adaptive online reinforcement learning for multi-view video summarization on Mobile devices
Imani et al. Stereoscopic video quality measurement with fine-tuning 3D ResNets
Dos Santos et al. GIBIS at MediaEval 2019: Predicting Media Memorability Task.
CN117440162B (zh) 一种多媒体互动教学方法及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22787302

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2301005487

Country of ref document: TH

WWE Wipo information: entry into national phase

Ref document number: 2023558311

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 11202306317S

Country of ref document: SG

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22787302

Country of ref document: EP

Kind code of ref document: A1