CN118411583A - Immersion type video quality evaluation method and device based on multi-feature fusion - Google Patents
Immersion type video quality evaluation method and device based on multi-feature fusion Download PDFInfo
- Publication number
- CN118411583A CN118411583A CN202410836696.5A CN202410836696A CN118411583A CN 118411583 A CN118411583 A CN 118411583A CN 202410836696 A CN202410836696 A CN 202410836696A CN 118411583 A CN118411583 A CN 118411583A
- Authority
- CN
- China
- Prior art keywords
- texture
- depth
- feature
- video
- distorted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 44
- 238000007654 immersion Methods 0.000 title claims abstract description 30
- 238000013441 quality evaluation Methods 0.000 title claims abstract description 26
- 230000004927 fusion Effects 0.000 title claims abstract description 24
- 238000011156 evaluation Methods 0.000 claims abstract description 12
- 238000011176 pooling Methods 0.000 claims abstract description 11
- 238000000605 extraction Methods 0.000 claims abstract description 9
- 238000004364 calculation method Methods 0.000 claims description 20
- 238000004590 computer program Methods 0.000 claims description 8
- 230000002123 temporal effect Effects 0.000 claims description 3
- 238000004422 calculation algorithm Methods 0.000 abstract description 14
- 241000282414 Homo sapiens Species 0.000 abstract description 11
- 238000012545 processing Methods 0.000 abstract description 9
- 230000000007 visual effect Effects 0.000 abstract description 9
- 230000006870 function Effects 0.000 description 5
- 238000003646 Spearman's rank correlation coefficient Methods 0.000 description 4
- 230000033001 locomotion Effects 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 238000001303 quality assessment method Methods 0.000 description 4
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 230000003287 optical effect Effects 0.000 description 3
- 230000015556 catabolic process Effects 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 238000000802 evaporation-induced self-assembly Methods 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 230000002093 peripheral effect Effects 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Landscapes
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Abstract
The invention discloses an immersion type video quality evaluation method and device based on multi-feature fusion, which relate to the field of video processing and comprise the following steps: performing feature extraction on the reference texture video sequence and the distorted texture video sequence by adopting a 3D-LOG filter to obtain reference texture features and distorted texture features, calculating to obtain texture feature similarity, and obtaining texture video quality scores through a 3D-LOG pooling strategy based on the texture feature similarity; calculating to obtain reference depth characteristics and distortion depth characteristics according to the reference depth video sequence and the distortion depth video sequence; calculating to obtain depth feature similarity according to the reference depth feature and the distortion depth feature, determining gradient weight, and calculating to obtain depth video quality score according to the depth feature similarity and the gradient weight; and calculating the quality score of the immersive video to be evaluated according to the texture video quality score and the depth video quality score, and solving the problem that the existing video evaluation algorithm does not accord with the visual characteristics of human eyes and the characteristics of the immersive video.
Description
Technical Field
The invention relates to the field of video processing, in particular to an immersive video quality evaluation method and device based on multi-feature fusion.
Background
With the rapid development of high-speed network transmission, video acquisition, video processing and display technologies and the increasing demand of human beings for immersive experience, immersive video is in an coming burst period, becomes a video technology research hotspot, and is widely applied to the fields of remote offices, intelligent transportation, commercial broadcasting and the like. Compared with the traditional video, the immersive video has the characteristics of ultra-wide viewing angle, high freedom degree, high resolution and the like, and has extremely high immersive feeling and interaction. Wherein the high degree of freedom is embodied in the ability to provide translational movement on three primary axes and rotational movement on three secondary axes when a person views a video, which can bring a feeling of being immersive to the user.
The immersive video mainly adopts a video format of multi-view texture plus depth, and can be formed by computer generation or camera shooting. The video processing method is interfered by various distortions in the video processing process, so that the visual expression effect of the immersive scene is weakened, the user experience is influenced, and the subjective perception quality is reduced. It is therefore important to propose an algorithm that meets the visual characteristics of the human eye and that can accurately and quickly evaluate the quality of immersive video.
Most video quality evaluation algorithms at present are mainly concentrated in the field of natural videos, but the effect of directly migrating the quality evaluation algorithms related to the natural videos to the immersive videos is relatively poor because the immersive videos have different space-time characteristics in content and video formats and the natural videos. Therefore, the design of the video quality evaluation algorithm conforming to the human visual characteristics and the immersive video characteristics has important theoretical research significance and practical application value.
Disclosure of Invention
The application aims to provide an immersion type video quality evaluation method and device based on multi-feature fusion aiming at the technical problems.
In a first aspect, the invention provides an immersive video quality evaluation method based on multi-feature fusion, comprising the following steps:
Acquiring a reference immersive video and an immersive video to be evaluated, wherein the reference immersive video comprises a reference texture video sequence and a reference depth video sequence, the immersive video to be evaluated comprises a distorted texture video sequence and a distorted depth video sequence, and performing feature extraction on the reference texture video sequence and the distorted texture video sequence by adopting a 3D-LOG filter to obtain reference texture features and distorted texture features; calculating to obtain texture feature similarity according to the reference texture feature and the distortion texture feature, and obtaining texture video quality scores through a 3D-LOG pooling strategy based on the texture feature similarity;
Calculating to obtain reference depth characteristics and distortion depth characteristics according to the reference depth video sequence and the distortion depth video sequence; calculating to obtain depth feature similarity according to the reference depth feature and the distortion depth feature, determining gradient weight, and calculating to obtain depth video quality score according to the depth feature similarity and the gradient weight;
And calculating according to the texture video quality score and the depth video quality score to obtain the quality score of the immersive video to be evaluated.
Preferably, the feature extraction is performed on the reference texture video sequence and the distorted texture video sequence by using a 3D-LOG filter, so as to obtain the reference texture feature and the distorted texture feature, which specifically comprise:
the 3D-LOG filter has the following calculation formula:
;
Wherein, Representing horizontal, vertical and temporal coordinates in the space-time domain,Is the standard deviation of the 3D gaussian kernel function,,Representing a function corresponding to the 3D-LOG filter;
And respectively inputting the reference texture video sequence and the distorted texture video sequence into a 3D-LOG filter for convolution to obtain reference texture features and distorted texture features, wherein the reference texture features and the distorted texture features are shown in the following formula:
;
;
Wherein, AndRepresenting the luminance value corresponding to each pixel of the input reference texture video sequence and the distorted texture video sequence,AndRespectively representing the reference texture feature and the distorted texture feature extracted through the 3D-LOG filter,The symbols represent the convolution operation.
Preferably, the texture feature similarity is calculated according to the reference texture feature and the distortion texture feature, and the texture video quality score is obtained through a 3D-LOG pooling strategy based on the texture feature similarity, which specifically comprises the following steps:
Calculating the similarity of texture features by adopting the following method :
;
Wherein,AndRespectively representing the reference texture feature and the distorted texture feature extracted through the 3D-LOG,Is a constant which keeps the value stable;
Taking the maximum value of the reference texture feature and the distortion texture feature as texture weight The following formula is shown:
;
Wherein max represents the maximum value taken therein;
Weighting calculation is carried out on the texture weight and the similarity of the texture features to obtain the texture video quality score The following formula is shown:
。
preferably, the reference depth feature and the distortion depth feature are calculated according to the reference depth video sequence and the distortion depth video sequence, and the reference depth feature and the distortion depth feature are obtained specifically including:
And respectively calculating gradient amplitude values for the reference texture video sequence and the distorted texture video sequence to obtain corresponding depth characteristics:
;
;
;
Wherein, A depth video frame is represented and,AndRepresenting the partial derivatives in the horizontal and vertical directions respectively,AndRepresenting the gradient magnitude components in the horizontal and vertical directions respectively,Representing depth features, the depth features calculated from depth video frames in the reference depth video sequence and the distorted depth video sequence corresponding to the reference depth features, respectivelyAnd distortion depth features,The symbols represent the convolution operation.
Preferably, the depth feature similarity is calculated according to the reference depth feature and the distortion depth feature, the gradient weight is determined, and the depth video quality score is calculated according to the depth feature similarity and the gradient weight, specifically including:
The depth feature similarity is calculated using:
;
Wherein, AndRepresenting the reference depth feature and the distorted depth feature respectively,The degree of similarity of the depth features is indicated,A constant that is stable in value for another;
taking the maximum value of the reference depth characteristic and the distortion depth characteristic as gradient weight The following formula is shown:
;
Wherein max represents the maximum value taken therein;
weighting and calculating the gradient weight and the similarity of the depth features to obtain the depth video quality score The following formula is shown:
。
Preferably, the quality score of the immersive video to be evaluated is calculated according to the texture video quality score and the depth video quality score, which specifically comprises the following steps:
carrying out importance calculation on the texture video quality score and the depth video quality score to obtain an importance score The following formula is shown:
;
Wherein, Is a parameter for adjusting the relative importance between texture features and depth features;
Taking the maximum value of the absolute value of the texture video quality score and the absolute value of the depth video quality score as an evaluation weight The following formula is shown:
;
wherein max represents the maximum value taken among them, Is an absolute value symbol;
and carrying out weighted calculation on the evaluation weight and the importance score to obtain the quality score MMF of the immersive video to be evaluated, wherein the quality score MMF is shown in the following formula:
;
where N represents the number of immersive video sequences to be evaluated, i=1, 2, …, N.
In a second aspect, the present invention provides an immersion video quality assessment apparatus based on multi-feature fusion, comprising:
The texture video quality score calculation module is configured to acquire a reference immersive video and an immersive video to be evaluated, wherein the reference immersive video comprises a reference texture video sequence and a reference depth video sequence, the immersive video to be evaluated comprises a distorted texture video sequence and a distorted depth video sequence, and a 3D-LOG filter is adopted for feature extraction on the reference texture video sequence and the distorted texture video sequence to obtain reference texture features and distorted texture features; calculating to obtain texture feature similarity according to the reference texture feature and the distortion texture feature, and obtaining texture video quality scores through a 3D-LOG pooling strategy based on the texture feature similarity;
The depth video quality score calculation module is configured to calculate a reference depth feature and a distortion depth feature according to the reference depth video sequence and the distortion depth video sequence; calculating to obtain depth feature similarity according to the reference depth feature and the distortion depth feature, determining gradient weight, and calculating to obtain depth video quality score according to the depth feature similarity and the gradient weight;
And the quality score calculating module is configured to calculate the quality score of the immersion video to be evaluated according to the texture video quality score and the depth video quality score.
In a third aspect, the present invention provides an electronic device comprising one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
In a fifth aspect, the invention provides a computer program product comprising a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
(1) The immersion video quality evaluation method based on multi-feature fusion, provided by the invention, considers that the immersion video not only contains complicated edge information and texture space-time features of motion variation, but also has the depth information for providing immersion feeling and high degree of freedom, and focuses on considering the characteristics of a human eye vision system and the characteristics of the immersion video. Therefore, the 3D-LOG filter is utilized to extract the edge and contour information of texture video in a space domain and the motion information of the texture video in a time domain, the gradient amplitude in the depth video is calculated to sense quality degradation caused by flicker distortion, the texture video quality score and the depth video score are obtained by weighting based on the extracted texture features and the depth features, finally, a weighting strategy is designed by combining human visual characteristics to measure the contribution degree of texture and depth to the immersive video, the quality score of the immersive video is obtained, and the result is consistent with the sensing result of the human visual system.
(2) The immersion video quality evaluation method based on multi-feature fusion provided by the invention considers the visual characteristics of human eyes and the characteristics of immersion video from multiple aspects, and has better video quality evaluation performance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an immersion video quality assessment method based on multi-feature fusion according to an embodiment of the present application;
FIG. 2 is a flow chart of an immersion video quality assessment method based on multi-feature fusion according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an immersive video quality assessment device based on multi-feature fusion in accordance with an embodiment of the present application;
Fig. 4 is a schematic hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 shows an immersion video quality evaluation method based on multi-feature fusion, which comprises the following steps:
S1, acquiring a reference immersive video and an immersive video to be evaluated, wherein the reference immersive video comprises a reference texture video sequence and a reference depth video sequence, the immersive video to be evaluated comprises a distorted texture video sequence and a distorted depth video sequence, and performing feature extraction on the reference texture video sequence and the distorted texture video sequence by adopting a 3D-LOG filter to obtain reference texture features and distorted texture features; and calculating to obtain texture feature similarity according to the reference texture feature and the distortion texture feature, and obtaining texture video quality scores through a 3D-LOG pooling strategy based on the texture feature similarity.
In a specific embodiment, the method for extracting the characteristics of the reference texture video sequence and the distorted texture video sequence by adopting a 3D-LOG filter to obtain the reference texture characteristics and the distorted texture characteristics specifically comprises the following steps:
the 3D-LOG filter has the following calculation formula:
;
Wherein, Representing horizontal, vertical and temporal coordinates in the space-time domain,Is the standard deviation of the 3D gaussian kernel function,,Representing a function corresponding to the 3D-LOG filter;
And respectively inputting the reference texture video sequence and the distorted texture video sequence into a 3D-LOG filter for convolution to obtain reference texture features and distorted texture features, wherein the reference texture features and the distorted texture features are shown in the following formula:
;
;
Wherein, AndRepresenting the luminance value corresponding to each pixel of the input reference texture video sequence and the distorted texture video sequence,AndRespectively representing the reference texture feature and the distorted texture feature extracted through the 3D-LOG,The symbols represent the convolution operation.
In a specific embodiment, the texture feature similarity is calculated according to the reference texture feature and the distorted texture feature, and the texture video quality score is obtained through a 3D-LOG pooling strategy based on the texture feature similarity, which specifically comprises the following steps:
Calculating the similarity of texture features by adopting the following method :
;
Wherein,AndRespectively representing the reference texture feature and the distorted texture feature extracted through the 3D-LOG,Is a constant which keeps the value stable;
Taking the maximum value of the reference texture feature and the distortion texture feature as texture weight The following formula is shown:
;
Wherein max represents the maximum value taken therein;
Weighting calculation is carried out on the texture weight and the similarity of the texture features to obtain the texture video quality score The following formula is shown:
。
specifically, referring to fig. 2, the immersion video to be evaluated is directly extracted to a reference texture video sequence and a reference depth video sequence without processing, both the reference texture video sequence and the reference depth video sequence are in an undistorted state, and the immersion video to be evaluated is processed and extracted to obtain a distorted texture video sequence and a distorted depth video sequence. Firstly, a reference texture video sequence and a distorted texture video sequence are subjected to feature extraction by adopting a 3D-LOG filter to obtain reference texture features And distorted texture features. The 3D-LOG filter can well simulate the feedback process of human visual neurons on image processing, and more comprehensively measure and perceive the time-space quality characteristic degradation of texture areas in the immersive video. Further, the texture feature similarity is calculated based on the reference texture feature and the distorted texture feature, and in the calculation formula of the texture feature similarity,. Texture video quality score is obtained through a 3D-LOG pooling strategy based on texture feature similarity。
S2, calculating according to the reference depth video sequence and the distortion depth video sequence to obtain reference depth characteristics and distortion depth characteristics; and calculating according to the reference depth feature and the distortion depth feature to obtain the depth feature similarity, determining the gradient weight, and calculating according to the depth feature similarity and the gradient weight to obtain the depth video quality score.
In a specific embodiment, gradient magnitudes are calculated for the reference texture video sequence and the distorted texture video sequence, respectively, resulting in corresponding depth features:
;
;
;
Wherein, A depth video frame is represented and,AndRepresenting the partial derivatives in the horizontal and vertical directions respectively,AndRepresenting the gradient magnitude components in the horizontal and vertical directions respectively,Representing depth features, the depth features calculated from depth video frames in the reference depth video sequence and the distorted depth video sequence corresponding to the reference depth features, respectivelyAnd distortion depth features,The symbols represent the convolution operation.
In a specific embodiment, the depth feature similarity is calculated according to the reference depth feature and the distortion depth feature, the gradient weight is determined, and the depth video quality score is calculated according to the depth feature similarity and the gradient weight, specifically including:
The depth feature similarity is calculated using:
;
Wherein, AndRepresenting the reference depth feature and the distorted depth feature respectively,The degree of similarity of the depth features is indicated,A constant that is stable in value for another;
taking the maximum value of the reference depth characteristic and the distortion depth characteristic as gradient weight The following formula is shown:
;
Wherein max represents the maximum value taken therein;
weighting and calculating the gradient weight and the similarity of the depth features to obtain the depth video quality score The following formula is shown:
。
Specifically, gradient amplitude values are calculated for the reference depth video sequence and the distortion depth video sequence and serve as key features for detecting flicker distortion positions, and the reference depth features and the distortion depth features are obtained. Further calculating depth feature similarity according to the reference depth feature and the distortion depth feature, and determining in a calculation formula of the depth feature similarity through a large number of experiments . Calculating gradient weight by using larger values between reference and distortion gradient characteristics, inputting different attention degrees to different pixels when simulating human eyes to watch video, and then carrying out weighted calculation on the similarity between the different attention degrees and the extracted depth characteristics to obtain the depth video quality score。
And S3, calculating according to the texture video quality score and the depth video quality score to obtain the quality score of the immersive video to be evaluated.
In a specific embodiment, step S3 specifically includes:
carrying out importance calculation on the texture video quality score and the depth video quality score to obtain an importance score The following formula is shown:
;
Wherein, Is a parameter for adjusting the relative importance between texture features and depth features;
Taking the maximum value of the absolute value of the texture video quality score and the absolute value of the depth video quality score as an evaluation weight The following formula is shown:
;
wherein max represents the maximum value taken among them, Is an absolute value symbol;
and carrying out weighted calculation on the evaluation weight and the importance score to obtain the quality score MMF of the immersive video to be evaluated, wherein the quality score MMF is shown in the following formula:
;
where N represents the number of immersive video sequences to be evaluated, i=1, 2, …, N.
Specifically, texture features extracted by a 3D-LOG filter and depth features obtained by calculating gradient amplitude are combined to serve as evaluation indexes of immersive video quality, and the quality score of the immersive video is characterized by taking a larger quality score of the texture video quality score and the depth video quality score as a weighting strategy in combination with human eye visual characteristics. In the calculation process, parametersIs a parameter for adjusting the relative importance between the texture feature and the depth feature, and is set as follows through experimental analysis。
The above steps S1-S3 do not necessarily represent the order between steps, but the step symbols indicate that the order between steps is adjustable.
The advantages of the process according to the invention are demonstrated by the specific examples and data below.
Table 1 SIMVD results of the MMF and other algorithms combined performance comparisons of the proposed method on the database:
Table 1 is a comparison of experimental results of the immersion video quality evaluation method (expressed by MMF) and other advanced algorithms on the immersion video database SIMVD, where SSIM, MS-SSIM, GMSD, VSI, FSIM, SPSIM, GSS, ESIM, GFM, spEED, viS3, STMAD, VMAF, SGFTM, IV-PSNR are names of other advanced algorithms, PLCC (pearson linear correlation coefficient), SROCC (spearman rank correlation coefficient) and RMSE (root mean square error) are three general criteria, and are three classical correlation parameters for considering the quality of the evaluation method in the field of video quality evaluation, where the values of PLCC and SROCC are closer to 1, the RMSE value is smaller, the correlation between the result of the objective algorithm and the result of the subjective evaluation is higher, and the algorithm is illustrated to be more superior. In table 1, the first three algorithms with better performance are respectively represented by bold, from the data perspective, the values of PLCC and SROCC obtained by the immersion video quality evaluation method based on multi-feature fusion are both more similar to 1 than those of other algorithms, and RMSE is smaller than those of other algorithms, which indicates that the higher the correlation between the immersion video quality evaluation method based on multi-feature fusion and the subjective evaluation result, the more superior the algorithm, and the more superior the evaluation of the immersion video quality.
With further reference to fig. 3, as an implementation of the method shown in the foregoing figures, the present application provides an embodiment of an immersive video quality evaluation apparatus based on multi-feature fusion, where the apparatus embodiment corresponds to the method embodiment shown in fig. 1, and the apparatus is particularly applicable to various electronic devices.
The embodiment of the application provides an immersion type video quality evaluation device based on multi-feature fusion, which comprises the following components:
The texture video quality score calculating module 1 is configured to acquire a reference immersive video and an immersive video to be evaluated, wherein the reference immersive video comprises a reference texture video sequence and a reference depth video sequence, the immersive video to be evaluated comprises a distorted texture video sequence and a distorted depth video sequence, and a 3D-LOG filter is adopted for feature extraction on the reference texture video sequence and the distorted texture video sequence to obtain reference texture features and distorted texture features; calculating to obtain texture feature similarity according to the reference texture feature and the distortion texture feature, and obtaining texture video quality scores through a 3D-LOG pooling strategy based on the texture feature similarity;
a depth video quality score calculation module 2 configured to calculate a reference depth feature and a distorted depth feature from the reference depth video sequence and the distorted depth video sequence; calculating to obtain depth feature similarity according to the reference depth feature and the distortion depth feature, determining gradient weight, and calculating to obtain depth video quality score according to the depth feature similarity and the gradient weight;
the quality score calculating module 3 is configured to calculate the quality score of the immersion video to be evaluated according to the texture video quality score and the depth video quality score.
Fig. 4 is a schematic hardware structure of an electronic device according to an embodiment of the present invention. As shown in fig. 4, the electronic apparatus of the present embodiment includes: a processor 401 and a memory 402; wherein memory 402 is used to store computer-executable instructions; the processor 401 is configured to execute computer-executable instructions stored in the memory to implement the steps executed by the electronic device in the above-described embodiment. Reference may be made in particular to the relevant description of the embodiments of the method described above.
Alternatively, the memory 402 may be separate or integrated with the processor 401.
When the memory 402 is provided separately, the electronic device further comprises a bus 403 for connecting the memory 402 and the processor 401.
The embodiment of the invention also provides a computer storage medium, wherein computer execution instructions are stored in the computer storage medium, and when a processor executes the computer execution instructions, the method is realized.
The embodiment of the invention also provides a computer program product, comprising a computer program, which realizes the method when being executed by a processor.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the above-described device embodiments are merely illustrative, e.g., the division of modules is merely a logical function division, and there may be additional divisions of actual implementation, e.g., multiple modules may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to implement the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit. The units formed by the modules can be realized in a form of hardware or a form of hardware and software functional units.
The integrated modules, which are implemented in the form of software functional modules, may be stored in a computer readable storage medium. The software functional modules described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or processor to perform some of the steps of the methods of the various embodiments of the application.
It should be appreciated that the Processor may be a central processing unit (Central Processing Unit, abbreviated as CPU), or may be other general purpose Processor, digital signal Processor (DIGITAL SIGNAL Processor, abbreviated as DSP), application SPECIFIC INTEGRATED Circuit (ASIC), or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.
The memory may comprise a high-speed RAM memory, and may further comprise a non-volatile memory NVM, such as at least one magnetic disk memory, and may also be a U-disk, a removable hard disk, a read-only memory, a magnetic disk or optical disk, etc.
The bus may be an industry standard architecture (Industry Standard Architecture, ISA) bus, an external device interconnect (PERIPHERAL COMPONENT INTERCONNECT, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or to one type of bus.
The storage medium may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
An exemplary storage medium is coupled to the processor such the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an Application SPECIFIC INTEGRATED Circuits (ASIC). It is also possible that the processor and the storage medium reside as discrete components in an electronic device or a master device.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.
Claims (10)
1. The immersion type video quality evaluation method based on multi-feature fusion is characterized by comprising the following steps of:
Acquiring a reference immersive video and an immersive video to be evaluated, wherein the reference immersive video comprises a reference texture video sequence and a reference depth video sequence, the immersive video to be evaluated comprises a distorted texture video sequence and a distorted depth video sequence, and extracting features of the reference texture video sequence and the distorted texture video sequence by adopting a 3D-LOG filter to obtain reference texture features and distorted texture features; calculating to obtain texture feature similarity according to the reference texture feature and the distortion texture feature, and obtaining texture video quality scores through a 3D-LOG pooling strategy based on the texture feature similarity;
Calculating to obtain reference depth characteristics and distortion depth characteristics according to the reference depth video sequence and the distortion depth video sequence; calculating to obtain depth feature similarity according to the reference depth feature and the distortion depth feature, determining gradient weight, and calculating to obtain depth video quality score according to the depth feature similarity and the gradient weight;
And calculating the quality score of the immersive video to be evaluated according to the texture video quality score and the depth video quality score.
2. The immersion video quality evaluation method based on multi-feature fusion according to claim 1, wherein the feature extraction is performed on the reference texture video sequence and the distorted texture video sequence by using a 3D-LOG filter to obtain a reference texture feature and a distorted texture feature, and the method specifically comprises:
The calculation formula of the 3D-LOG filter is as follows:
;
Wherein, Representing horizontal, vertical and temporal coordinates in the space-time domain,Is the standard deviation of the 3D gaussian kernel function,,Representing a function corresponding to the 3D-LOG filter;
And respectively inputting the reference texture video sequence and the distorted texture video sequence into the 3D-LOG filter for convolution to obtain reference texture features and distorted texture features, wherein the reference texture features and the distorted texture features are shown in the following formula:
;
;
Wherein, AndRepresenting the luminance value corresponding to each pixel of the input reference texture video sequence and the distorted texture video sequence,AndRespectively representing the reference texture feature and the distorted texture feature extracted by the 3D-LOG filter,The symbols represent the convolution operation.
3. The immersion video quality evaluation method based on multi-feature fusion according to claim 1, wherein the calculating to obtain the texture feature similarity according to the reference texture feature and the distortion texture feature, and obtaining the texture video quality score through a 3D-LOG pooling strategy based on the texture feature similarity specifically comprises:
Calculating the similarity of texture features by adopting the following method :
;
Wherein,AndRespectively representing the reference texture feature and the distorted texture feature extracted through the 3D-LOG,Is a constant which keeps the value stable;
taking the maximum value of the reference texture feature and the distortion texture feature as a texture weight The following formula is shown:
;
Wherein max represents the maximum value taken therein;
weighting and calculating the texture weight and the similarity of the texture features to obtain a texture video quality score The following formula is shown:
。
4. The immersion video quality evaluation method based on multi-feature fusion according to claim 1, wherein the calculating according to the reference depth video sequence and the distorted depth video sequence to obtain the reference depth feature and the distorted depth feature specifically comprises:
And respectively calculating gradient amplitude values for the reference texture video sequence and the distorted texture video sequence to obtain corresponding depth characteristics:
;
;
;
Wherein, A depth video frame is represented and,AndRepresenting the partial derivatives in the horizontal and vertical directions respectively,AndRepresenting the gradient magnitude components in the horizontal and vertical directions respectively,Representing depth features, wherein the depth features calculated by the depth video frames in the reference depth video sequence and the distorted depth video sequence are respectively corresponding to the reference depth featuresAnd distortion depth features,The symbols represent the convolution operation.
5. The immersion video quality evaluation method based on multi-feature fusion according to claim 1, wherein the calculating depth feature similarity according to the reference depth feature and the distortion depth feature and determining gradient weight, calculating depth video quality score according to the depth feature similarity and the gradient weight, specifically comprises:
The depth feature similarity is calculated using:
;
Wherein, AndRepresenting the reference depth feature and the distorted depth feature respectively,The degree of similarity of the depth features is indicated,A constant that is stable in value for another;
Taking the maximum value of the reference depth characteristic and the distortion depth characteristic as gradient weight The following formula is shown:
;
Wherein max represents the maximum value taken therein;
Weighting and calculating the gradient weight and the depth characteristic similarity to obtain a depth video quality score The following formula is shown:
。
6. The multi-feature fusion-based immersive video quality evaluation method according to claim 1, wherein the calculating according to the texture video quality score and the depth video quality score obtains the quality score of the immersive video to be evaluated specifically comprises:
Carrying out importance calculation on the texture video quality score and the depth video quality score to obtain an importance score The following formula is shown:
;
Wherein, Is a parameter for adjusting the relative importance between texture features and depth features;
Taking the maximum value of the absolute value of the texture video quality score and the absolute value of the depth video quality score as an evaluation weight The following formula is shown:
;
wherein max represents the maximum value taken among them, Is an absolute value symbol;
And carrying out weighted calculation on the evaluation weight and the importance score to obtain the quality score MMF of the immersion video to be evaluated, wherein the quality score MMF is shown in the following formula:
;
Wherein N represents the number of the immersive video sequences to be evaluated, i=1, 2, …, N.
7. An immersive video quality evaluation device based on multi-feature fusion, comprising:
The texture video quality score calculation module is configured to acquire a reference immersive video and an immersive video to be evaluated, wherein the reference immersive video comprises a reference texture video sequence and a reference depth video sequence, the immersive video to be evaluated comprises a distorted texture video sequence and a distorted depth video sequence, and the reference texture video sequence and the distorted texture video sequence are subjected to feature extraction by adopting a 3D-LOG filter to obtain reference texture features and distorted texture features; calculating to obtain texture feature similarity according to the reference texture feature and the distortion texture feature, and obtaining texture video quality scores through a 3D-LOG pooling strategy based on the texture feature similarity;
a depth video quality score calculation module configured to calculate a reference depth feature and a distorted depth feature from the reference depth video sequence and the distorted depth video sequence; calculating to obtain depth feature similarity according to the reference depth feature and the distortion depth feature, determining gradient weight, and calculating to obtain depth video quality score according to the depth feature similarity and the gradient weight;
And the quality score calculating module is configured to calculate the quality score of the immersive video to be evaluated according to the texture video quality score and the depth video quality score.
8. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
When executed by the one or more processors, causes the one or more processors to implement the method of any of claims 1-6.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-6.
10. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410836696.5A CN118411583B (en) | 2024-06-26 | Immersion type video quality evaluation method and device based on multi-feature fusion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410836696.5A CN118411583B (en) | 2024-06-26 | Immersion type video quality evaluation method and device based on multi-feature fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118411583A true CN118411583A (en) | 2024-07-30 |
CN118411583B CN118411583B (en) | 2024-10-22 |
Family
ID=
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106412571A (en) * | 2016-10-12 | 2017-02-15 | 天津大学 | Video quality evaluation method based on gradient similarity standard deviation |
CN110930398A (en) * | 2019-12-09 | 2020-03-27 | 嘉兴学院 | Log-Gabor similarity-based full-reference video quality evaluation method |
US20210044791A1 (en) * | 2019-08-05 | 2021-02-11 | City University Of Hong Kong | Video quality determination system and method |
CN113888502A (en) * | 2021-09-29 | 2022-01-04 | 广州大学 | No-reference video quality evaluation method, device, equipment and storage medium |
CN115423769A (en) * | 2022-08-30 | 2022-12-02 | 重庆理工大学 | No-reference synthetic video quality evaluation method based on multi-modal learning |
CN117237259A (en) * | 2023-11-14 | 2023-12-15 | 华侨大学 | Compressed video quality enhancement method and device based on multi-mode fusion |
CN117475264A (en) * | 2023-10-27 | 2024-01-30 | 福州大学 | Multi-fraction stereoscopic video quality evaluation method based on double-layer attention |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106412571A (en) * | 2016-10-12 | 2017-02-15 | 天津大学 | Video quality evaluation method based on gradient similarity standard deviation |
US20210044791A1 (en) * | 2019-08-05 | 2021-02-11 | City University Of Hong Kong | Video quality determination system and method |
CN110930398A (en) * | 2019-12-09 | 2020-03-27 | 嘉兴学院 | Log-Gabor similarity-based full-reference video quality evaluation method |
CN113888502A (en) * | 2021-09-29 | 2022-01-04 | 广州大学 | No-reference video quality evaluation method, device, equipment and storage medium |
CN115423769A (en) * | 2022-08-30 | 2022-12-02 | 重庆理工大学 | No-reference synthetic video quality evaluation method based on multi-modal learning |
CN117475264A (en) * | 2023-10-27 | 2024-01-30 | 福州大学 | Multi-fraction stereoscopic video quality evaluation method based on double-layer attention |
CN117237259A (en) * | 2023-11-14 | 2023-12-15 | 华侨大学 | Compressed video quality enhancement method and device based on multi-mode fusion |
Non-Patent Citations (3)
Title |
---|
SHAN CHENG 等人: ""Screen Content Video Quality Assessment:Subjective and Objective Study"", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》, vol. 2020, 31 December 2020 (2020-12-31), pages 8636 - 8648 * |
ZHANGKAI NI 等人: ""Gradient Direction for Screen Content Image Quality Assessment"", 《IEEE SIGNAL PROCESSING LETTERS》, vol. 23, no. 10, 31 October 2016 (2016-10-31), pages 1394 - 1397 * |
张淑芳;韩泽欣;张聪;: "基于时域梯度相似度的视频质量评价模型", 控制工程, no. 08, 20 August 2018 (2018-08-20) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110324664B (en) | Video frame supplementing method based on neural network and training method of model thereof | |
Niu et al. | 2D and 3D image quality assessment: A survey of metrics and challenges | |
WO2021114868A1 (en) | Denoising method, terminal, and storage medium | |
CN107481271B (en) | Stereo matching method, system and mobile terminal | |
CN108986197B (en) | 3D skeleton line construction method and device | |
CN101394460A (en) | Image processing apparatus, image processing method, image processing program, and image capturing apparatus | |
CN114648482A (en) | Quality evaluation method and system for three-dimensional panoramic image | |
CN104010180B (en) | Method and device for filtering three-dimensional video | |
An et al. | Single-shot high dynamic range imaging via deep convolutional neural network | |
CN113222855A (en) | Image recovery method, device and equipment | |
CN111696034A (en) | Image processing method and device and electronic equipment | |
CN118411583B (en) | Immersion type video quality evaluation method and device based on multi-feature fusion | |
CN114202460A (en) | Super-resolution high-definition reconstruction method, system and equipment facing different damage images | |
CN111369435B (en) | Color image depth up-sampling method and system based on self-adaptive stable model | |
CN117408886A (en) | Gas image enhancement method, gas image enhancement device, electronic device and storage medium | |
CN118411583A (en) | Immersion type video quality evaluation method and device based on multi-feature fusion | |
CN111754412B (en) | Method and device for constructing data pair and terminal equipment | |
CN111738957A (en) | Intelligent beautifying method and system for image, electronic equipment and storage medium | |
CN116095291A (en) | Image preprocessing method for media stream image transmission | |
CN106358006A (en) | Video correction method and video correction device | |
CN109886280A (en) | A kind of heterologous image object matching process based on core correlation filtering | |
CN115220574A (en) | Pose determination method and device, computer readable storage medium and electronic equipment | |
CN103077396B (en) | The vector space Feature Points Extraction of a kind of coloured image and device | |
Huang et al. | Unsupervised image dehazing based on improved generative adversarial networks | |
CHEN et al. | Blind stereo image quality evaluation based on spatial domain and transform domain feature extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |