CN111862063A - Video quality evaluation method and device, computer equipment and storage medium - Google Patents

Video quality evaluation method and device, computer equipment and storage medium

Info

Publication number
CN111862063A
CN111862063A (application CN202010733201.8A)
Authority
CN
China
Prior art keywords
image, video, output, test, frame
Prior art date
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number
CN202010733201.8A
Other languages
Chinese (zh)
Inventor
Zou Fang (邹芳)
Liu Jichao (刘继超)
Current Assignee
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN202010733201.8A
Publication of CN111862063A
Legal status: Pending

Classifications

    • G06T 7/0002 — Image analysis; inspection of images, e.g. flaw detection
    • G06F 16/51 — Information retrieval of still image data; indexing; data structures therefor; storage structures
    • G06F 16/5866 — Retrieval of still image data characterised by manually generated metadata, e.g. tags, keywords, comments
    • G06F 18/22 — Pattern recognition; matching criteria, e.g. proximity measures
    • G06T 2207/10016 — Image acquisition modality: video; image sequence
    • G06T 2207/30168 — Subject of image: image quality inspection


Abstract

Embodiments of this application belong to the field of video processing and relate to a video quality evaluation method. The method comprises: extracting video images frame by frame from an original video; generating a test label for each video image according to the frame-extraction order and integrating the labels into the video images to obtain a test video; storing the test labels and the storage paths of the video images in a database to obtain an image index; inputting the test video into an audio/video system; when an output video is received from the output end of the audio/video system, obtaining a comparison image according to the output label of each output image in the output video and the video image corresponding to that output image; and calculating the similarity between the output image and the comparison image to obtain a quality evaluation result. The application also provides a video quality evaluation apparatus, a computer device and a storage medium. In addition, the application relates to blockchain technology: the original video may also be stored in a blockchain. The method addresses the prior-art technical problems of low accuracy and low reliability of evaluation results.

Description

Video quality evaluation method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of video processing, and in particular, to a method and an apparatus for evaluating video quality, a computer device, and a storage medium.
Background
Objective evaluation of video quality currently falls into two categories, reference evaluation and no-reference evaluation, which differ in whether evaluating the video under test depends on information from the original video. In the prior art, at least one calibration frame image is set in the video under test so that a video segment can subsequently be obtained and compared, by similarity, with the segment at the same position in the video under test. However, frame loss, frame skipping, stalling and other phenomena commonly occur during transmission of a video sequence; when they occur, the similarity comparison either cannot be performed or is inaccurate, resulting in the technical problems of low accuracy and low reliability of the evaluation result.
Disclosure of Invention
Based on this, the present application provides a video quality evaluation method, apparatus, computer device and storage medium to solve the prior-art technical problems of low accuracy and low reliability of evaluation results caused by similarity comparisons that cannot be performed, or are inaccurate.
A method of video quality assessment, the method comprising:
extracting video images from an original video frame by frame;
respectively generating test labels for the video images according to the frame extraction sequence, and integrating the test labels to the corresponding video images to be used as test videos;
storing the storage paths of the test labels and the corresponding video images in a database as image indexes;
inputting the test video into an audio and video system;
when an output video is received from the output end of the audio and video system, inquiring a video image corresponding to the output image according to an output label of the output image in the output video and the image index to serve as a comparison image;
and calculating the similarity between the output image and the comparison image to obtain a quality evaluation result.
Further, the querying, according to the output label of the output image in the output video and the image index, the video image corresponding to the output image as a comparison image includes:
extracting the output video frame by frame to obtain the output image;
according to the sequence of frame extraction, identifying an output label on the output image, and storing the output label and a storage path of the output image into a database;
inquiring a test label corresponding to the output label and a storage path of a video image corresponding to the test label from the image index;
and obtaining a video image corresponding to the output image according to the storage path to serve as a comparison image.
Further, the calculating a similarity between the output image and the comparison image to obtain a quality evaluation result includes:
taking an output image and the comparison image as matching frames;
calculating the similarity of each matched frame;
and calculating the average similarity of the matched frames according to the similarity, and taking the average similarity as the quality evaluation result.
Further, the calculating the similarity of the matching frames includes:
preprocessing the matched frame to obtain a corresponding depth image matrix;
calculating the mean square error between the depth image matrixes;
and calculating the similarity of the matched frames according to the mean square error.
Further, the preprocessing the matching frame to obtain a corresponding depth image matrix includes:
comparing the size of the output image and the size of the comparison image;
if the sizes are the same, converting the output image and the contrast image into a depth image matrix;
if the sizes are different, the scale of the output image is adjusted to be equal to that of the contrast image, and then the output image after the scale adjustment and the contrast image are converted into a depth image matrix.
Further, the calculating the similarity according to the mean square error includes:
according to the formula:
Figure BDA0002604057130000021
calculating the similarity;
wherein MSE is mean square error, MAXIThe number of gray levels for image quantization, PSNR, is the peak signal-to-noise ratio between the output image and the comparison image.
A video quality assessment apparatus, the apparatus comprising:
the extraction module is used for extracting video images from the original video frame by frame;
the integration module is used for respectively generating test labels for the video images according to the frame extraction sequence and integrating the test labels to the corresponding video images to be used as test videos;
the packaging module is used for storing the storage paths of the test labels and the corresponding video images in a database as image indexes;
the input module is used for inputting the test video into an audio and video system;
the query module is used for querying a video image corresponding to the output image according to the output label of the output image of the output video and the image index as a comparison image when receiving the output video from the output end of the audio/video system;
and the calculating module is used for calculating the similarity between the output image and the comparison image to obtain a quality evaluation result.
A computer device comprising a memory and a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
extracting video images from an original video frame by frame;
respectively generating test labels for the video images according to the frame extraction sequence, and integrating the test labels to the corresponding video images to be used as test videos;
storing the storage paths of the test labels and the corresponding video images in a database as image indexes;
inputting the test video into an audio and video system;
when an output video is received from the output end of the audio/video system, inquiring a video image corresponding to the output image according to an output label of the output image of the output video and the image index to serve as a comparison image;
and calculating the similarity between the output image and the comparison image to obtain a quality evaluation result.
A computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of:
extracting video images from an original video frame by frame;
respectively generating test labels for the video images according to the frame extraction sequence, and integrating the test labels to the corresponding video images to be used as test videos;
storing the storage paths of the test labels and the corresponding video images in a database as image indexes;
inputting the test video into an audio and video system;
when an output video is received from the output end of the audio/video system, inquiring a video image corresponding to the output image according to an output label of the output image of the output video and the image index to serve as a comparison image;
and calculating the similarity between the output image and the comparison image to obtain a quality evaluation result.
According to the video quality evaluation method, apparatus, computer device and storage medium, after all extracted image frames are numbered and associated to obtain the output video, frame-by-frame identification is performed to obtain the video image corresponding to each output image of the output video; each output image and its corresponding video image form a matching frame, and the average similarity over all matching frames is calculated.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a schematic diagram of an application environment of a video quality assessment method;
FIG. 2 is a flow chart of a video quality assessment method;
FIG. 3 is a schematic diagram of a video quality assessment apparatus;
FIG. 4 is a diagram of a computer device in one embodiment.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The video quality evaluation method provided by the embodiments of the invention can be applied to the application environment shown in fig. 1. The application environment may include a terminal 102, a server 104, and a network that provides a communication-link medium between them; the network may include various connection types, such as wired or wireless communication links or fiber-optic cables.
A user may use the terminal 102 to interact with the server 104 over a network to receive or send messages, etc. The terminal 102 may have installed thereon various communication client applications, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal 102 may be any of various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (MPEG Audio Layer III), MP4 players (MPEG Audio Layer IV), laptop portable computers, desktop computers, and the like.
The server 104 may be a server that provides various services, such as a background server that provides support for pages displayed on the terminal 102.
It should be noted that the video quality assessment method provided in the embodiments of the present application is generally executed by a server/terminal, and accordingly, the video quality assessment apparatus is generally disposed in the server/terminal device.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
It should be understood that the number of terminals, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Wherein, the terminal 102 communicates with the server 104 through the network. The server 104 acquires an original video from the terminal 102, extracts the original video frame by frame, labels the original video, packages the original video into a test video, inputs the test video into an audio/video system, acquires an output video, performs frame extraction processing on the output video to obtain an output image, identifies the output label on the output image, queries a video image corresponding to the output image according to the output label to obtain a comparison image, and calculates the similarity between the output image and the comparison image as a quality evaluation result. The terminal 102 and the server 104 are connected through a network, the network may be a wired network or a wireless network, the terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 104 may be implemented by an independent server or a server cluster formed by a plurality of servers.
In one embodiment, as shown in fig. 2, a video quality assessment method is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
step 202, extracting video images from the original video frame by frame.
Frame-by-frame extraction uses FFmpeg to decompose the original video into individual video images; the server can call the FFmpeg development interface to process the original video. The purpose of frame extraction is to obtain the video images from the original video, label them, and then re-synthesize and encapsulate them into a new video file. FFmpeg is an open-source suite of programs for recording, converting and streaming digital audio and video.
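As a sketch of this step only (the output directory and the `frame_%06d.png` naming pattern are illustrative assumptions, not part of the patent), frame extraction with the FFmpeg command-line tool might look like:

```python
import subprocess

def build_extract_cmd(video_path: str, out_dir: str) -> list:
    """Build an ffmpeg command that writes every frame of the video
    as a numbered PNG (frame_000001.png, frame_000002.png, ...)."""
    return ["ffmpeg", "-i", video_path, f"{out_dir}/frame_%06d.png"]

def extract_frames(video_path: str, out_dir: str) -> None:
    """Run the extraction; assumes the ffmpeg binary is on PATH."""
    subprocess.run(build_extract_cmd(video_path, out_dir), check=True)
```

The same effect can be achieved through the FFmpeg development libraries (libavformat/libavcodec) rather than the CLI; the CLI form is shown here only because it is the simplest to sketch.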
And 204, respectively generating test labels for the video images according to the frame extraction sequence, and integrating the test labels to the corresponding video images to be used as test videos.
A unique test label is generated for each obtained video image according to the frame-extraction order; each generated test label is unique and corresponds one-to-one with a video image. Further, the generated test label can be a two-dimensional code whose data is the encoded frame number of the video image; the two-dimensional-code format also carries redundant error-correction information, which ensures a high recognition rate even when the video image ends up heavily distorted.
The two-dimensional code is integrated into the video image by superimposing it at a specified position, typically the upper-right or upper-left corner of the video image.
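To make the integration step concrete, the following simplified sketch stamps the frame number, bit by bit, into the top-right corner of a grayscale frame. This bit strip is a hypothetical stand-in: a real implementation would render an actual two-dimensional (QR) code, e.g. with the `qrcode` package, to get the redundant error correction described above.

```python
import numpy as np

TAG_BITS = 16  # width in pixels of the simplified label strip (assumption)

def stamp_label(frame: np.ndarray, frame_no: int) -> np.ndarray:
    """Write frame_no into the top-right corner of a grayscale frame,
    one pixel per bit (white = 1, black = 0)."""
    out = frame.copy()
    for b in range(TAG_BITS):
        out[0, -TAG_BITS + b] = 255 if (frame_no >> b) & 1 else 0
    return out

def read_label(frame: np.ndarray) -> int:
    """Recover the frame number stamped by stamp_label."""
    strip = frame[0, -TAG_BITS:]
    return sum(1 << b for b, v in enumerate(strip) if v > 127)
```

Unlike a QR code, this strip has no error correction, so it would not survive the heavy distortion the patent anticipates; it only illustrates how a label rides along inside the image data.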
Step 206, storing the storage path of the test label and the corresponding video image in a database as an image index.
And storing the storage paths of the test labels and the video images into a database, and establishing a one-to-one correspondence relationship between the test labels and the storage paths of the video images in the database.
Specifically, the test label is encoded and stored in a database together with a storage path of the video image, and a one-to-one corresponding relationship is established.
The reason the label codes and the file paths are stored in a database is that frame loss, frame skipping, stalling and similar phenomena usually occur during transmission of the test video; after an output video image is acquired, the original video image (i.e., the integrated video image stored in the database) must be found in the database according to its two-dimensional-code label for comparison. For example, a table is built in the database with field img_code storing the label two-dimensional code (e.g., 1234ad) and field img_path storing the path of the image file; the goal is that the specific storage location of any image can be found through its label in subsequent processing.
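The table described above (fields img_code and img_path follow the example in the text; everything else here is an assumption) could be sketched with SQLite as follows:

```python
import sqlite3

def create_index(conn: sqlite3.Connection) -> None:
    # one row per test label: img_code holds the label code (e.g. "1234ad"),
    # img_path the storage path of the corresponding video image
    conn.execute(
        "CREATE TABLE IF NOT EXISTS img_index "
        "(img_code TEXT PRIMARY KEY, img_path TEXT NOT NULL)"
    )

def add_image(conn: sqlite3.Connection, img_code: str, img_path: str) -> None:
    conn.execute("INSERT OR REPLACE INTO img_index VALUES (?, ?)",
                 (img_code, img_path))

def lookup_path(conn: sqlite3.Connection, img_code: str):
    # returns the stored path for a label, or None if the label is unknown
    row = conn.execute("SELECT img_path FROM img_index WHERE img_code = ?",
                       (img_code,)).fetchone()
    return row[0] if row else None
```

Any relational store would do equally well; the one-to-one mapping from label code to path is the only property the later lookup step relies on.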
And step 208, inputting the test video into the audio and video system.
After the test video is obtained, it is used as the input to the audio/video system, and video is collected at the system's output end, for example through a dedicated capture card. The output video is acquired, and its similarity to the test video is calculated in order to evaluate the quality of the output video.
And step 210, when the output video is received from the output end of the audio/video system, inquiring a video image corresponding to the output image according to an output label of the output image in the output video and the image index to serve as a comparison image.
The output video is extracted frame by frame to obtain the output images, and the output labels on the output images are identified in frame-extraction order; the output labels generally correspond one-to-one with the test labels.
Specifically, extracting the output video frame by frame to obtain an output image; identifying an output label on the output image according to the sequence of frame extraction, and storing the output label and a storage path of the output image into a database; inquiring a test label corresponding to the output label and a storage path of a video image corresponding to the test label from the image index; and obtaining a video image corresponding to the output image according to the storage path as a comparison image.
Furthermore, after the output images are obtained, a corresponding comparison image can be matched to each output image according to the video images and test labels in the test video. When frame loss, missing frames or frame skipping occurs, some video images may have no corresponding output image, from which the user can promptly learn the frame-loss or frame-skipping rate of the video transmission. The similarity between each comparison image and its output image is then calculated and used as the video-quality evaluation result.
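The matching step, together with the frame-loss accounting it enables, can be sketched as follows (the function and parameter names are hypothetical):

```python
def match_frames(output_labels, index):
    """Pair each decoded output label with the stored path of its test
    frame; labels in the index that never appear in the output reveal
    dropped frames.

    output_labels: label codes decoded from the output video, in order.
    index: dict mapping label code -> storage path of the video image.
    """
    matched = [(code, index[code]) for code in output_labels if code in index]
    dropped = sorted(set(index) - set(output_labels))
    loss_rate = len(dropped) / len(index) if index else 0.0
    return matched, loss_rate
```

Because matching goes through the labels rather than frame positions, skipped or repeated frames simply produce fewer or extra pairs instead of misaligning the whole comparison.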
Calculating the similarity frame by frame avoids inaccurate evaluation results caused by the inability to compare similarity normally when frames are skipped or missing, and improves the accuracy of the quality evaluation.
And 212, calculating the similarity between the output image and the comparison image to obtain a quality evaluation result.
And taking the output image and the comparison image as matching frames, calculating the similarity of each matching frame, calculating the average similarity of the matching frames according to the similarity, and taking the average similarity as a quality evaluation result.
All matching frames are acquired, the similarity of each is calculated, and the average similarity is then computed from these values and used as the quality evaluation result. By calculating the similarity of every matching frame and averaging, the evaluation is not tied to any single frame, so an accurate similarity can still be obtained under special conditions such as frame loss, frame skipping and missing frames, improving both the accuracy of the quality evaluation and its robustness under these conditions.
Specifically, the database is traversed to find all matching frames. Then the image similarity between the test frame (the video image of the test video) and the output image is calculated for every matching record, and the average similarity over all matching frames is computed. Common algorithms for calculating image similarity include PSNR and SSIM; open-source deep-neural-network models can also be used, in which case the video data is annotated, the annotated data is used for model training, and the trained model computes the similarity, which can likewise achieve good results. SSIM is an index measuring the structural similarity of two images on a scale from 0 to 1, where one image is the undistorted original and the other the distorted output; if the two images are identical, SSIM equals 1. The higher the similarity, the lower the distortion of the output image and the better the video encoding, transmission, decoding and playback.
Specifically, preprocessing an output image and a test image to obtain a corresponding depth image matrix; calculating the mean square error between the output image and a depth image matrix corresponding to the test image; and calculating the similarity of the matched frames according to the mean square error. The mean square error is the mean square value of the pixel difference value of the original image and the distorted image, the technology for determining the distortion degree of the distorted image through the mean square value is mature, and the data processing efficiency is high.
Specifically, the images are preprocessed as follows: compare the size of the output image with that of the comparison image; if the sizes are the same, convert the output image and the comparison image into 32-bit depth image matrices; if the sizes differ, rescale the output image to the size of the comparison image, then convert the rescaled output image and the comparison image into 32-bit depth image matrices.
The output image is resized by scaling, one of the basic image operations: operating on pixel values or pixel coordinates to achieve a particular effect. Scaling the coordinates of an image is equivalent to scaling the image itself, which brings the output image to the same size as the comparison image. The output image is scaled to match the video image in order to facilitate the similarity calculation.
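Under the assumption that frames are held as NumPy arrays, the rescale-then-convert preprocessing might be sketched as below; production code would typically use cv2.resize with proper interpolation, whereas this nearest-neighbour version exists only to illustrate the step:

```python
import numpy as np

def to_matched_matrices(output_img: np.ndarray, ref_img: np.ndarray):
    """Bring output_img to ref_img's size (nearest-neighbour, for
    illustration only) and convert both to float matrices so the mean
    square error can be computed element-wise."""
    if output_img.shape != ref_img.shape:
        h, w = ref_img.shape[:2]
        # pick the source row/column for each destination pixel
        rows = np.arange(h) * output_img.shape[0] // h
        cols = np.arange(w) * output_img.shape[1] // w
        output_img = output_img[rows][:, cols]
    return output_img.astype(np.float64), ref_img.astype(np.float64)
```
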
PSNR, the peak signal-to-noise ratio, is usually used to evaluate the quality of a compressed image relative to the original; the higher the PSNR, the smaller the post-compression distortion. Specifically, this embodiment mainly defines two values: the mean square error MSE and the peak signal-to-noise ratio. In image compression and related fields, the peak signal-to-noise ratio is often used to measure the quality of signal reconstruction and is commonly defined via the mean square error (MSE). Given two m×n monochrome images I and K, where one is a noisy approximation of the other, their mean square error is defined by equation (1):
MSE = (1 / (m · n)) · Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} [ I(i, j) − K(i, j) ]²    (1)

where MSE is the mean square error between the output image and the video image, m is the height of the image (number of rows), n is the width (number of columns), I(i, j) is the output image, K(i, j) is the video image, i is the row index and j the column index of the image pixel matrix, PSNR is the peak signal-to-noise ratio, and MAX_I is the number of gray levels used for image quantization.
In addition, the peak signal-to-noise ratio PSNR is given by equation (2):

PSNR = 10 · log10( MAX_I² / MSE )    (2)
the calculation process comprises the following steps: and (3) carrying out image preprocessing on the video image to obtain a depth image matrix, and then calculating the Mean Square Error (MSE) according to a formula (1). Then, a peak signal-to-noise ratio is calculated according to the formula (2) and is used as the similarity, and the higher the finally obtained similarity is, the lower the distortion of the description image is, and the better the video coding, transmission, decoding and playing effects are. The advantages of image quality evaluation by PSNR are: the image quality can be roughly reflected by calculation and understanding, generally, the image quality with high PSNR value is relatively high, generally, when the PSNR value is more than 28, the image quality difference is not obvious, and when the PSNR value is more than 35-40, the difference can not be distinguished by naked eyes. And after the similarity of all the matched frames is obtained, calculating the average similarity, and taking the average similarity as a quality evaluation result of a section of output video.
It should be emphasized that, in order to further ensure the privacy and security of the video processing, data such as the original video, the test video, the output video, and the storage paths may also be stored in nodes of a blockchain.
In this embodiment, all video frames are labeled by extracting frames from the original video; label identification is then performed on all video frames of the output video (i.e., the video under test), and similarity calculation is performed on all matched frames. This solves the technical problem in the prior art that quality evaluation fails when packet loss prevents all matched frames from being obtained. Moreover, because each frame of the image is calibrated, this scheme can accurately obtain the packet loss rate of the video output by the audio/video system.
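Because every original frame carries a test label, the packet loss rate mentioned above reduces to simple counting: labels present in the test video but absent from the output video correspond to lost frames. A minimal sketch under that assumption, with hypothetical names:

```python
def packet_loss_rate(sent_labels, received_labels):
    """Fraction of labeled frames that never arrived at the output end."""
    # Labels sent but never identified on any output frame are lost frames.
    lost = set(sent_labels) - set(received_labels)
    return len(lost) / len(sent_labels)
```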
In the video quality evaluation method, after the extracted image frames are numbered and associated to obtain the output video, frame-by-frame identification is performed and the average similarity of all matched frames is calculated. This greatly improves the accuracy and reliability of the evaluation result, covers service scenarios with small, medium, and large packet loss, and prevents the situation in which frame loss during transmission makes normal similarity comparison impossible and renders the evaluation result inaccurate.
It should be understood that, although the steps in the flowchart of fig. 2 are shown sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in fig. 2 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 3, there is provided a video quality evaluation apparatus that corresponds one-to-one to the video quality evaluation methods in the above-described embodiments. The video quality evaluation apparatus includes:
an extraction module 302, configured to extract video images from the original video frame by frame;

an integration module 304, configured to generate test labels for the video images according to the frame extraction sequence, and to integrate the test labels into the corresponding video images as a test video;

a packaging module 306, configured to store the test labels and the storage paths of the corresponding video images in a database as an image index;

an input module 308, configured to input the test video into the audio/video system;

a query module 310, configured to, when an output video is received from the output end of the audio/video system, query the video image corresponding to an output image according to the output label of the output image in the output video and the image index, and to use the queried video image as a comparison image;

a calculating module 312, configured to calculate the similarity between the output image and the comparison image to obtain a quality evaluation result.
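The cooperation between the packaging module 306 and the query module 310 amounts to maintaining a label-to-path mapping. The following is a minimal dictionary-based sketch, an in-memory stand-in for the database; the class and method names are hypothetical:

```python
class ImageIndex:
    """Maps a test label to the storage path of its video image."""

    def __init__(self):
        self._index = {}

    def store(self, test_label, storage_path):
        """Packaging step: record the one-to-one label -> path correspondence."""
        self._index[test_label] = storage_path

    def lookup(self, output_label):
        """Query step: the output label read from a received frame equals the
        original test label, so it keys directly into the index.
        Returns None when no corresponding video image exists."""
        return self._index.get(output_label)
```

A query that returns None indicates a frame whose label could not be matched, for example because it was corrupted in transit.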
It should be emphasized that, in order to further ensure the privacy and security of the video, the original video, the test video, and the output video may also be stored in a node of a blockchain.
Further, the query module 310 includes:
the frame extraction submodule is used for extracting the output video frame by frame to obtain an output image;
the identification submodule is used for identifying an output label on an output image according to the frame extraction sequence and storing a storage path of the output image and the output image into a database;
the relation submodule is used for inquiring a test label corresponding to the output label and a storage path of a video image corresponding to the test label from the image index;
and the searching submodule is used for obtaining a video image corresponding to the output image according to the storage path of the video image and using the video image as a comparison image.
Further, the calculation module 312 includes:
the matching submodule is used for taking the output image and the comparison image as a matching frame;
the similarity submodule is used for calculating the similarity of each matched frame;
and the result submodule is used for calculating the average similarity of the matched frames according to the similarity and taking the average similarity as a quality evaluation result.
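The averaging performed by the result submodule is straightforward. A sketch assuming the per-frame similarities are already PSNR values; the function name is hypothetical:

```python
def average_similarity(similarities):
    """Mean similarity over all matched frames; used as the quality evaluation result."""
    if not similarities:
        raise ValueError("no matched frames: quality cannot be evaluated")
    return sum(similarities) / len(similarities)
```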
Further, the similarity submodule includes:
the processing unit is used for preprocessing the matched frame to obtain a corresponding depth image matrix;
an error unit for calculating a mean square error between the depth image matrices;
and the similarity unit is used for calculating the similarity of the matched frames according to the mean square error.
Further, the processing unit includes:
the comparison subunit is used for comparing the size of the output image with the size of the comparison image;
the first conversion subunit is used for converting the output image and the contrast image into a depth image matrix if the sizes are the same;
and the second conversion subunit is used for adjusting the size of the output image to be equal to the size of the contrast image and then converting the output image and the contrast image into a depth image matrix if the sizes are different.
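The branch logic of the processing unit can be sketched with a nearest-neighbour resize in plain NumPy; this is a dependency-free stand-in for whatever interpolation the real system uses, and the function name is hypothetical. Grayscale (2-D) images are assumed:

```python
import numpy as np

def to_depth_matrices(output_img, contrast_img):
    """Return both images as float matrices, resizing the output image first if needed."""
    if output_img.shape != contrast_img.shape:
        # Sizes differ: map each target pixel back to a source pixel
        # (nearest-neighbour index mapping), matching the contrast image's size.
        h, w = contrast_img.shape
        rows = np.arange(h) * output_img.shape[0] // h
        cols = np.arange(w) * output_img.shape[1] // w
        output_img = output_img[np.ix_(rows, cols)]
    # Sizes now match: convert both to float64 depth image matrices.
    return output_img.astype(np.float64), contrast_img.astype(np.float64)
```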
In this embodiment, all video frames are labeled by extracting frames from the original video; label identification is then performed on all video frames of the output video (i.e., the video under test), and similarity calculation is performed on all matched frames. This solves the technical problem in the prior art that quality evaluation fails when packet loss prevents all matched frames from being obtained.

Moreover, because each frame of the image is calibrated, this scheme can accurately obtain the packet loss rate of the video output by the audio/video system.
In the video quality evaluation device, after the extracted image frames are numbered and associated to obtain the output video, frame-by-frame identification is performed and the average similarity of all matched frames is calculated. This greatly improves the accuracy and reliability of the evaluation result, covers service scenarios with small, medium, and large packet loss, and prevents the situation in which frame loss during transmission makes normal similarity comparison impossible and renders the evaluation result inaccurate.
In one embodiment, a computer device is provided, which may be a server whose internal structure may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store videos. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements the video quality evaluation method. The extracted image frames are numbered and associated to obtain the output video, frame-by-frame identification is then performed, and the average similarity of all matched frames is calculated, which greatly improves the accuracy and reliability of the evaluation result, covers service scenarios with small, medium, and large packet loss, and prevents the situation in which frame loss during transmission makes normal similarity comparison impossible and renders the evaluation result inaccurate.
As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program when executed by a processor implements the steps of the video quality assessment method in the above-described embodiments, such as the steps 202 to 212 shown in fig. 2, or the processor implements the functions of the modules/units of the video quality assessment apparatus in the above-described embodiments, such as the functions of the modules 302 to 312 shown in fig. 3.
The extracted image frames are numbered and associated to obtain the output video, frame-by-frame identification is then performed, and the average similarity of all matched frames is calculated, which greatly improves the accuracy and reliability of the evaluation result, covers service scenarios with small, medium, and large packet loss, and prevents the situation in which frame loss during transmission makes normal similarity comparison impossible and renders the evaluation result inaccurate.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by instructing the relevant hardware through a computer program, which may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory may include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), and Direct Rambus Dynamic RAM (DRDRAM).
The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated using cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art may make several changes, modifications, and equivalent substitutions of some technical features without departing from the spirit and scope of the present invention, and such changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A method for video quality assessment, the method comprising:
extracting video images from an original video frame by frame;
respectively generating test labels for the video images according to the frame extraction sequence, and integrating the test labels to the corresponding video images to be used as test videos;
storing the storage paths of the test labels and the corresponding video images in a database as image indexes;
inputting the test video into an audio and video system;
when an output video is received from the output end of the audio and video system, inquiring a video image corresponding to the output image according to an output label of the output image in the output video and the image index to serve as a comparison image;
and calculating the similarity between the output image and the comparison image to obtain a quality evaluation result.
2. The method of claim 1, wherein the storing the storage path of the test tag and the corresponding video image in a database as an image index comprises:
and establishing a one-to-one correspondence relationship between the test labels and the storage paths of the video images, and storing the one-to-one correspondence relationship in the database as the image index.
3. The method according to claim 2, wherein the querying the video image corresponding to the output image as the comparison image according to the output label of the output image in the output video and the image index comprises:
extracting the output video frame by frame to obtain the output image;
according to the sequence of frame extraction, identifying an output label on the output image, and storing the output label and a storage path of the output image into a database;
inquiring a test label corresponding to the output label and a storage path of a video image corresponding to the test label from the image index;
and obtaining a video image corresponding to the output image according to the storage path of the video image, and using the video image as a comparison image.
4. The method according to claim 1, wherein there are a plurality of output images, and the calculating the similarity between the output image and the comparison image to obtain the quality evaluation result comprises:
taking an output image and the comparison image as matching frames;
calculating the similarity of each matched frame;
and calculating the average similarity of the matched frames according to the similarity, and taking the average similarity as the quality evaluation result.
5. The method of claim 4, wherein calculating the similarity of each of the matched frames comprises:
preprocessing the matched frame to obtain a corresponding depth image matrix;
calculating the mean square error between the depth image matrixes;
and calculating the similarity of the matched frames according to the mean square error.
6. The method of claim 5, wherein the preprocessing the matched frames to obtain a corresponding depth image matrix comprises:
comparing the size of the output image and the size of the comparison image;
if the sizes are the same, converting the output image and the comparison image into depth image matrices;

if the sizes are different, adjusting the size of the output image to be equal to that of the comparison image, and then converting the resized output image and the comparison image into depth image matrices.
7. The method of claim 5, wherein said calculating the similarity according to the mean square error comprises:
according to the formula:
$$\mathrm{PSNR}=10\cdot\log_{10}\!\left(\frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}\right)$$
calculating the similarity;
wherein MSE is the mean square error, $\mathrm{MAX}_I$ is the number of gray levels used for image quantization, and PSNR is the peak signal-to-noise ratio between the output image and the comparison image.
8. A video quality assessment apparatus, comprising:
the extraction module is used for extracting video images from the original video frame by frame;
the integration module is used for respectively generating test labels for the video images according to the frame extraction sequence and integrating the test labels to the corresponding video images to be used as test videos;
the packaging module is used for storing the storage paths of the test labels and the corresponding video images in a database as image indexes;
the input module is used for inputting the test video into an audio and video system;
the query module is used for querying a video image corresponding to the output image according to an output label of the output image in the output video and the image index as a comparison image when the output video is received from the output end of the audio/video system;
and the calculating module is used for calculating the similarity between the output image and the comparison image to obtain a quality evaluation result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202010733201.8A 2020-07-27 2020-07-27 Video quality evaluation method and device, computer equipment and storage medium Pending CN111862063A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010733201.8A CN111862063A (en) 2020-07-27 2020-07-27 Video quality evaluation method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111862063A true CN111862063A (en) 2020-10-30

Family

ID=72948860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010733201.8A Pending CN111862063A (en) 2020-07-27 2020-07-27 Video quality evaluation method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111862063A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112714309A (en) * 2020-12-22 2021-04-27 北京百度网讯科技有限公司 Video quality evaluation method, device, apparatus, medium, and program product
CN112911385A (en) * 2021-01-12 2021-06-04 平安科技(深圳)有限公司 Method, device and equipment for extracting picture to be identified and storage medium
CN113610414A (en) * 2021-08-13 2021-11-05 深圳市巨力方视觉技术有限公司 PCB (printed Circuit Board) management and control method and device based on machine vision and computer readable medium
CN113766214A (en) * 2021-09-07 2021-12-07 杭州雾联科技有限公司 Quality detection method, quality detection system and related device of streaming data
CN114760461A (en) * 2022-03-31 2022-07-15 中国信息通信研究院 Method and device for testing user experience of audio and video call service
CN116193189A (en) * 2022-10-25 2023-05-30 展讯半导体(成都)有限公司 Frame loss rate testing method, device and system, electronic equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111355950A (en) * 2020-03-13 2020-06-30 随锐科技集团股份有限公司 Video transmission quality detection method and system in real-time video communication




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination