CN114567776A - Video low-complexity coding method based on panoramic visual perception characteristic - Google Patents
- Publication number
- CN114567776A (application number CN202210157533.5A)
- Authority
- CN
- China
- Legal status: Granted
Classifications
- H04N19/172 — adaptive coding characterised by the coding unit, the unit being a picture, frame or field
- H04N19/124 — quantisation
- H04N19/14 — coding unit complexity, e.g. amount of activity or edge presence estimation
- H04N19/182 — adaptive coding characterised by the coding unit, the unit being a pixel
- H04N19/42 — implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
Abstract
The invention discloses a video low-complexity coding method based on panoramic visual perception characteristics. The method uses the spatial-domain JND threshold as a spatial perception factor and derives a motion perception factor from a weighted gradient value, from which the average spatio-temporal weighted perception factor of all pixel points in each maximum coding unit is obtained. According to rate-distortion optimization theory, a Lagrange coefficient adjustment factor of the maximum coding unit based on the spatio-temporal weighted perception factor is calculated, yielding a quantization parameter variation based on that factor; at the same time, a quantization parameter variation of the maximum coding unit based on the dimension weight is calculated. A new coding quantization parameter of the maximum coding unit is then calculated from the two quantization parameter variations and applied during coding. The method preserves coding quality while effectively reducing the coding bit rate and the coding complexity, and significantly improves rate-distortion performance; the coding gain is largest when the initial coding quantization parameter is small.
Description
Technical Field
The invention relates to a video coding technology, in particular to a video low-complexity coding method based on panoramic visual perception characteristics.
Background
In recent years, panoramic video systems have become widely popular for their "immersive" visual experience and show great application prospects in fields such as virtual reality and driving simulation. However, current panoramic video systems still suffer from excessively high coding complexity, which poses a great challenge to their application. How to reduce coding complexity has therefore become an urgent technical problem to be solved in the field.
Existing low-complexity coding algorithms for panoramic video do not fully consider the perception characteristics of the Human Visual System (HVS) or the characteristics of panoramic video, so optimal coding performance is difficult to achieve. The main purpose of video coding is to reduce the coding bit rate as much as possible on the premise of guaranteeing a certain video quality, or, when the bit rate is limited, to code in the manner with minimum distortion. Therefore, how to combine the perception characteristics of the human visual system with the characteristics of panoramic video to guide the selection of coding parameters has become an important direction for research on reducing coding complexity in the field.
Disclosure of Invention
The technical problem the invention aims to solve is to provide a video low-complexity coding method based on panoramic visual perception characteristics that can effectively save coding bit rate and thereby effectively reduce coding complexity.
The technical scheme adopted by the invention for solving the technical problems is as follows: a video low complexity coding method based on panoramic visual perception characteristics is characterized by comprising the following steps:
step 1: defining a video frame to be coded currently in the panoramic video in the ERP projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height of the video frame in the panoramic video in the ERP projection format is H;
Step 2: judging whether the current frame is the 1st video frame; if so, encoding the current frame by adopting the original algorithm of an HEVC video encoder and then executing Step 10; otherwise, executing Step 3;
Step 3: performing spatial-domain JND threshold calculation on each pixel point in the current frame to obtain the panoramic spatial-domain JND threshold map of the current frame, denoted G1, in which the pixel value of each pixel point is the spatial-domain JND threshold of the corresponding pixel point in the current frame; and performing weighted gradient calculation on each pixel point in the current frame to obtain the weighted gradient map of the current frame, denoted G2, in which the pixel value of each pixel point is the weighted gradient value of the corresponding pixel point in the current frame;
Step 4: calculating the spatial-domain perception factor of each pixel point in the current frame, the spatial-domain perception factor of the pixel point at coordinate position (x, y) being denoted δA(x, y), with δA(x, y) = G1(x, y); calculating the motion perception factor of each pixel point in the current frame, the motion perception factor of the pixel point at (x, y) being denoted δT(x, y); then calculating the spatio-temporal weighted perception factor of each pixel point in the current frame, the spatio-temporal weighted perception factor of the pixel point at (x, y) being denoted δ(x, y), with δ(x, y) = δA(x, y) × δT(x, y); then calculating the average of the spatio-temporal weighted perception factors of all pixel points in the current frame, denoted Sδ; and calculating the dimension weight of each pixel point in the current frame, the dimension weight of the pixel point at (x, y) being denoted wERP(x, y); wherein 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, G1(x, y) denotes the pixel value of the pixel point at (x, y) in G1 and also the spatial-domain JND threshold of the pixel point at (x, y) in the current frame, G2(x, y) denotes the pixel value of the pixel point at (x, y) in G2 and also the weighted gradient value of the pixel point at (x, y) in the current frame, SF denotes the average of the pixel values of all pixel points in G2, i.e. the average of the weighted gradient values of all pixel points in the current frame, ε is a motion perception constant with ε ∈ [1, 2], and cos(·) is the cosine function;
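Because the printed formulas for δT(x, y), Sδ, and wERP(x, y) are not reproduced in this text, the following sketch assumes one plausible reading of Step 4: the motion perception factor is taken as the weighted gradient normalized by its frame average and raised to the power ε. The function name and this exact form of δT are illustrative assumptions, not the patent's verbatim formulas.

```python
def spatiotemporal_perception(jnd_map, grad_map, eps=1.0):
    """Per-pixel spatio-temporal weighted perception factors (sketch).

    jnd_map  -- panoramic spatial-domain JND threshold map G1 (2-D list)
    grad_map -- weighted gradient map G2 (2-D list, same size)
    eps      -- motion perception constant, stated to lie in [1, 2]

    delta_T = (G2 / S_F) ** eps is an ASSUMED form: the patent only
    states that the motion factor is obtained from the weighted
    gradient value.
    """
    h, w = len(grad_map), len(grad_map[0])
    s_f = sum(sum(row) for row in grad_map) / (h * w)  # mean weighted gradient S_F
    delta = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d_a = jnd_map[y][x]                      # spatial perception factor
            d_t = (grad_map[y][x] / s_f) ** eps      # motion factor (assumed form)
            delta[y][x] = d_a * d_t                  # spatio-temporal factor
    s_delta = sum(sum(row) for row in delta) / (h * w)  # frame-level mean S_delta
    return delta, s_delta
```

With a uniform gradient map, δT is 1 everywhere, so δ reduces to the JND map and Sδ to its mean.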
Step 5: defining the maximum coding unit currently to be processed in the current frame as the current maximum coding unit;
Step 6: calculating the average of the spatio-temporal weighted perception factors of all pixel points in the current maximum coding unit, denoted Sδ_LCU; then calculating the Lagrange coefficient adjustment factor of the current maximum coding unit based on the spatio-temporal weighted perception factor, denoted ΨLCU; then calculating the quantization parameter variation of the current maximum coding unit based on the spatio-temporal weighted perception factor, denoted ΔQP1, with ΔQP1 = 3log2(ΨLCU); wherein KLCU and BLCU are adjustment parameters, KLCU ∈ (0,1), BLCU ∈ (0,1);
Step 7: calculating the average of the dimension weights of all pixel points in the current maximum coding unit, denoted SwERP_LCU; then calculating the quantization parameter variation of the current maximum coding unit based on the dimension weight, denoted ΔQP2; wherein a and b are adjustment parameters, a ∈ (0,1), b ∈ (0,1), and b < a;
Step 8: calculating the new coding quantization parameter of the current maximum coding unit, denoted QPnew; then using QPnew to update the coding quantization parameter of the current maximum coding unit; then coding the current maximum coding unit; wherein QPorg denotes the original coding quantization parameter of the current maximum coding unit and ⌊·⌋ is the round-down (floor) operator;
Step 9: taking the next maximum coding unit to be processed in the current frame as the current maximum coding unit, then returning to Step 6 and continuing until all maximum coding units in the current frame have been processed; then executing Step 10;
Step 10: taking the next video frame to be coded in the panoramic video in the ERP projection format as the current frame, then returning to Step 2 and continuing until all video frames in the panoramic video in the ERP projection format have been coded.
In said Step 3, G1 is obtained as follows: a spatial-domain just-noticeable-distortion model is adopted to perform the spatial-domain JND threshold calculation on each pixel point in the current frame, yielding G1.
In said Step 3, G2 is obtained as follows: the pixel value of the pixel point at coordinate position (x, y) in G2 is denoted G2(x, y), wherein 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, and G2(x, y) also denotes the weighted gradient value of the pixel point at (x, y) in the current frame. The horizontal-direction, vertical-direction, and temporal-direction gradient values of the pixel point at (x, y) in the current frame are calculated with a 3D-Sobel operator; α denotes the gradient adjustment factor in the horizontal direction, β the gradient adjustment factor in the vertical direction, and γ the gradient adjustment factor in the temporal direction, with α + β + γ = 1.
Compared with the prior art, the invention has the advantages that:
the method fully considers the perception characteristics of a human eye visual system and the characteristics of panoramic video, utilizes a spatial domain JND threshold value (visual perception information) as a spatial domain perception factor, obtains a motion perception factor through a weighted gradient value (visual perception information), further obtains an average value of space-time weighted perception factors of all pixel points in a maximum coding unit through calculation, calculates Lagrange coefficient regulating factors of the maximum coding unit based on the space-time weighted perception factors according to a rate distortion optimization theory, and further obtains quantization parameter variable quantity of the maximum coding unit based on the space-time weighted perception factors; meanwhile, the method takes the dimension weight characteristics of the panoramic video in the ERP projection format into consideration, and calculates the quantitative parameter variation of the maximum coding unit based on the dimension weight; and calculating a new coding quantization parameter of the maximum coding unit according to the two quantization parameter variable quantities, and applying the new coding quantization parameter to coding. The method can adaptively adjust the coding quantization parameters aiming at the time-space domain and the panoramic latitude characteristics of the specific maximum coding unit, and experimental tests show that the method can effectively reduce the coding rate while ensuring the coding quality, thereby effectively reducing the coding complexity, obviously improving the rate-distortion performance, and particularly aiming at the condition that the initial coding quantization parameter is smaller, the coding effect is better.
Drawings
Fig. 1 is a block diagram of a general implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings and embodiments.
The overall implementation block diagram of the video low-complexity coding method based on the panoramic visual perception characteristic is shown in fig. 1, and the method comprises the following steps:
Step 1: defining the video frame currently to be coded in the panoramic video in ERP (equirectangular projection) format as the current frame; the width of a video frame in the ERP-format panoramic video is W, and its height is H.
Step 2: judging whether the current frame is the 1st video frame; if so, encoding the current frame by adopting the original algorithm of an HEVC video encoder and then executing Step 10; otherwise, executing Step 3.
Step 3: performing spatial JND (Just Noticeable Distortion) threshold calculation on each pixel point in the current frame to obtain the panoramic spatial-domain JND threshold map of the current frame, denoted G1, in which the pixel value of each pixel point is the spatial-domain JND threshold of the corresponding pixel point in the current frame; and performing weighted gradient calculation on each pixel point in the current frame to obtain the weighted gradient map of the current frame, denoted G2, in which the pixel value of each pixel point is the weighted gradient value of the corresponding pixel point in the current frame. A larger spatial-domain JND threshold indicates a larger just-noticeable distortion, i.e. stronger spatial-domain masking in the corresponding region; conversely, a smaller spatial-domain JND threshold indicates weaker spatial-domain masking.
In this embodiment, G1 is obtained as follows: an existing classical spatial-domain just-noticeable-distortion model is adopted to perform the spatial-domain JND threshold calculation on each pixel point in the current frame, yielding G1.
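The patent does not name the classical model it uses; one widely cited candidate is the luminance-adaptation component of Chou and Li's spatial-domain JND model, sketched below under that assumption (the full model also includes a texture-masking term, omitted here):

```python
import math

def luminance_jnd(bg):
    """Luminance-adaptation JND threshold for a background luminance
    bg in [0, 255], after Chou & Li's classic spatial-domain JND model.
    Choosing this particular model is an ASSUMPTION; the patent only
    says an 'existing classical spatial-domain JND model' is used.
    Darker backgrounds tolerate larger distortion (higher threshold)."""
    if bg <= 127:
        return 17.0 * (1.0 - math.sqrt(bg / 127.0)) + 3.0
    return 3.0 / 128.0 * (bg - 127) + 3.0
```

Applying this per pixel over a local background-luminance estimate yields a map like G1, where brighter thresholds mark regions with stronger masking.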
In this embodiment, G2 is obtained as follows: the pixel value of the pixel point at coordinate position (x, y) in G2 is denoted G2(x, y), wherein 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, and G2(x, y) also denotes the weighted gradient value of the pixel point at (x, y) in the current frame. The horizontal-direction, vertical-direction, and temporal-direction gradient values of the pixel point at (x, y) in the current frame are calculated with the existing 3D-Sobel operator; the temporal-direction gradient value is the gradient, along the temporal direction, between the pixel point at (x, y) in the current frame and the pixel point at (x, y) in the previous video frame. α denotes the gradient adjustment factor in the horizontal direction, β the gradient adjustment factor in the vertical direction, and γ the gradient adjustment factor in the temporal direction, with α + β + γ = 1; in this embodiment α = 0.25, β = 0.25, and γ = 0.5.
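A minimal sketch of the weighted-gradient computation, assuming the three directional gradients are combined as the weighted absolute sum α|Gh| + β|Gv| + γ|Gt| with the embodiment's weights; plain 2-D Sobel kernels plus a frame difference stand in here for the full 3D-Sobel operator, so this is an approximation rather than the patent's exact formula:

```python
def weighted_gradient(cur, prev, alpha=0.25, beta=0.25, gamma=0.5):
    """Weighted gradient map G2 of the current frame (sketch).

    cur, prev -- current and previous frame luma, 2-D lists of equal size.
    Interior pixels only; border values are left at 0.  The combination
    alpha*|Gh| + beta*|Gv| + gamma*|Gt| is an ASSUMED form, and 2-D Sobel
    kernels plus a frame difference approximate the 3D-Sobel operator.
    """
    h, w = len(cur), len(cur[0])
    g2 = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # horizontal Sobel response
            gh = (cur[y-1][x+1] + 2*cur[y][x+1] + cur[y+1][x+1]
                  - cur[y-1][x-1] - 2*cur[y][x-1] - cur[y+1][x-1])
            # vertical Sobel response
            gv = (cur[y+1][x-1] + 2*cur[y+1][x] + cur[y+1][x+1]
                  - cur[y-1][x-1] - 2*cur[y-1][x] - cur[y-1][x+1])
            # temporal-direction gradient: difference to previous frame
            gt = cur[y][x] - prev[y][x]
            g2[y][x] = alpha*abs(gh) + beta*abs(gv) + gamma*abs(gt)
    return g2
```

On a horizontal luminance ramp with a static previous frame, only the horizontal and temporal terms contribute.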
Step 4: calculating the spatial-domain perception factor of each pixel point in the current frame, the spatial-domain perception factor of the pixel point at coordinate position (x, y) being denoted δA(x, y), with δA(x, y) = G1(x, y); calculating the motion perception factor of each pixel point in the current frame, the motion perception factor of the pixel point at (x, y) being denoted δT(x, y); then calculating the spatio-temporal weighted perception factor of each pixel point in the current frame, the spatio-temporal weighted perception factor of the pixel point at (x, y) being denoted δ(x, y), with δ(x, y) = δA(x, y) × δT(x, y); then calculating the average of the spatio-temporal weighted perception factors of all pixel points in the current frame, denoted Sδ; and calculating the dimension weight of each pixel point in the current frame, the dimension weight of the pixel point at (x, y) being denoted wERP(x, y); wherein 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, G1(x, y) denotes the pixel value of the pixel point at (x, y) in G1 and also the spatial-domain JND threshold of the pixel point at (x, y) in the current frame, G2(x, y) denotes the pixel value of the pixel point at (x, y) in G2 and also the weighted gradient value of the pixel point at (x, y) in the current frame, SF denotes the average of the pixel values of all pixel points in G2, i.e. the average of the weighted gradient values of all pixel points in the current frame, ε is a motion perception constant with ε ∈ [1, 2] (in this embodiment ε = 1), cos(·) is the cosine function, and π = 3.14….
In the ERP projection format, each latitude is sampled at a different pixel density, so different latitudes in the plane carry different pixel redundancy, and the stretching redundancy at the two poles is the most pronounced. After the sphere is projected into the ERP format, with the center of the sphere as the base point, the longitude θ of the ERP format corresponds to the longitude on the sphere and the latitude of the ERP format corresponds to the latitude on the sphere, with θ ∈ [−π, π] and the latitude in [−π/2, π/2]. Taking this panoramic-latitude characteristic into account, the dimension weight parameter wERP(x, y) of the ERP projection format is introduced.
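The dimension weight's printed formula is not reproduced in this text; the sketch below uses the cosine latitude weight commonly applied to ERP frames (as in WS-PSNR), which matches the cos(·) term the text mentions but whose exact half-pixel offset is an assumption:

```python
import math

def erp_dimension_weight(y, height):
    """Latitude weight w_ERP for pixel row y of an ERP frame of the
    given height: ~1 at the equator, approaching 0 toward the poles.
    The form cos((y + 0.5 - H/2) * pi / H) follows the common WS-PSNR
    ERP weight; the patent's exact expression may differ (ASSUMPTION)."""
    return math.cos((y + 0.5 - height / 2.0) * math.pi / height)
```

For a 3840×1920 frame, rows near y = 960 (the equator) get weight ~1, while the top and bottom rows, which are heavily stretched by the projection, get weights near 0.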
Step 5: defining the Largest Coding Unit (LCU) currently to be processed in the current frame as the current largest coding unit.
Step 6: calculating the average of the spatio-temporal weighted perception factors of all pixel points in the current maximum coding unit, denoted Sδ_LCU; then calculating the Lagrange coefficient adjustment factor of the current maximum coding unit based on the spatio-temporal weighted perception factor, denoted ΨLCU; then calculating the quantization parameter variation of the current maximum coding unit based on the spatio-temporal weighted perception factor, denoted ΔQP1, with ΔQP1 = 3log2(ΨLCU); wherein 0 ≤ i ≤ 63 and 0 ≤ j ≤ 63, δLCU(i, j) denotes the spatio-temporal weighted perception factor of the pixel point at intra-block coordinate (i, j) in the current maximum coding unit, and KLCU and BLCU are adjustment parameters with KLCU ∈ (0,1) and BLCU ∈ (0,1); in this embodiment, a large number of experiments finally determined KLCU = BLCU = 0.5.
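The printed expression for ΨLCU is not reproduced in this text; the sketch below assumes the linear form ΨLCU = KLCU·(Sδ_LCU/Sδ) + BLCU, which is consistent with the stated parameters KLCU, BLCU and with ΔQP1 = 3log2(ΨLCU), but is an illustrative assumption rather than the patent's verbatim formula:

```python
import math

def delta_qp1(s_delta_lcu, s_delta_frame, k_lcu=0.5, b_lcu=0.5):
    """Quantization-parameter variation dQP1 = 3 * log2(Psi_LCU).

    Psi_LCU = k_lcu * (s_delta_lcu / s_delta_frame) + b_lcu is an
    ASSUMED form.  With the embodiment's k_lcu = b_lcu = 0.5, an LCU
    whose mean perception factor equals the frame mean gets Psi = 1
    and dQP1 = 0; an LCU with stronger masking gets a positive dQP1.
    """
    psi = k_lcu * (s_delta_lcu / s_delta_frame) + b_lcu
    return 3.0 * math.log2(psi)
```

Under this form, an LCU whose perception-factor mean is three times the frame mean gets Ψ = 2 and ΔQP1 = 3, i.e. a coarser quantizer where distortion is harder to notice.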
Step 7: calculating the average of the dimension weights of all pixel points in the current maximum coding unit, denoted SwERP_LCU; then calculating the quantization parameter variation of the current maximum coding unit based on the dimension weight, denoted ΔQP2; wherein wERP_LCU(i, j) denotes the dimension weight of the pixel point at intra-block coordinate (i, j) in the current maximum coding unit, and a and b are adjustment parameters with a ∈ (0,1), b ∈ (0,1), and b < a; in this embodiment, a large number of experiments finally determined a = 0.85 and b = 0.3.
Step 8: calculating the new coding quantization parameter of the current maximum coding unit, denoted QPnew; then using QPnew to update the coding quantization parameter of the current maximum coding unit, and coding the current maximum coding unit with the HEVC (High Efficiency Video Coding) encoder; wherein QPorg denotes the original coding quantization parameter of the current maximum coding unit, which can be read from the encoder's initialization parameter list, and ⌊·⌋ is the round-down (floor) operator.
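The printed expression for QPnew is likewise not reproduced; the sketch below assumes QPnew = QPorg + ⌊ΔQP1 + ΔQP2⌋, clipped to HEVC's valid QP range [0, 51]. The summation of the two variations and the clipping are assumptions; only the round-down operation is stated in the text.

```python
import math

def new_qp(qp_org, dqp1, dqp2):
    """New coding quantization parameter for the current LCU (sketch).

    Assumes QP_new = QP_org + floor(dQP1 + dQP2), clipped to HEVC's
    QP range [0, 51]; the patent's exact combination of the two
    variations is not reproduced here (ASSUMPTION).
    """
    qp = qp_org + math.floor(dqp1 + dqp2)
    return max(0, min(51, qp))
```

For example, with QPorg = 32, ΔQP1 = 1.7, and ΔQP2 = 0.6, this yields QPnew = 34.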
Step 9: taking the next maximum coding unit to be processed in the current frame as the current maximum coding unit, then returning to Step 6 and continuing until all maximum coding units in the current frame have been processed; then executing Step 10.
Step 10: taking the next video frame to be coded in the panoramic video in the ERP projection format as the current frame, then returning to Step 2 and continuing until all video frames in the panoramic video in the ERP projection format have been coded.
To further illustrate the performance of the method of the present invention, the method of the present invention was tested.
HEVC video encoder standard reference software HM16.14 was selected as the experimental test platform, running on an Intel(R) Core(TM) i7-10700 CPU at a main frequency of 2.9 GHz with 32 GB of memory under a 64-bit WIN10 operating system; VS2013 was selected as the development tool. Four panoramic video sequences were selected as standard test sequences: two 4K sequences, "AerialCity" and "DrivingInCity", and two 6K sequences, "BranCastle2" and "Landing2". For each standard test sequence, 100 frames were tested in intra-frame coding mode, with SearchRange set to 64, MaxPartitionDepth set to 4, and the initial coding quantization parameter QP (i.e. the original coding quantization parameter QPorg) set to 22, 27, 32, and 37, respectively.
Table 1 lists the relevant parameter information of the four panoramic video sequences "AerialCity", "DrivingInCity", "BranCastle2", and "Landing2".
TABLE 1 associated parameter information for panoramic video sequences
Panoramic video sequence | Video resolution |
AerialCity | 3840×1920 |
DrivingInCity | 3840×1920 |
BranCastle2 | 6144×3072 |
Landing2 | 6144×3072 |
Table 2 shows the coding bit-rate savings achieved by the method of the invention when coding the panoramic video sequences listed in Table 1, compared with the HM16.14 original platform method. The bit-rate saving of coding with the method of the invention over coding with the HM16.14 original platform method is defined as ΔRPRO, ΔRPRO = (RORG − RPRO)/RORG × 100 (%), wherein RPRO denotes the coding bit rate of the method of the invention and RORG denotes the coding bit rate of the HM16.14 original platform method.
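The bit-rate saving measure ΔRPRO defined above is a one-line computation:

```python
def rate_saving_percent(r_org, r_pro):
    """Bit-rate saving dR_PRO = (R_ORG - R_PRO) / R_ORG * 100 (%) of the
    proposed method relative to the HM16.14 original platform method.
    A positive value means the proposed method uses fewer bits."""
    return (r_org - r_pro) / r_org * 100.0
```

For example, if the original platform produces 1000 kbps and the proposed method 871 kbps for the same sequence, the saving is 12.9%, matching the average reported below.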
Table 2 comparison of code rate savings for coding using the method of the present invention compared to the HM16.14 original platform method
As can be seen from Table 2, coding with the method of the invention saves 12.9% of the coding bit rate on average. For the four panoramic video sequences with different scenes and different motion conditions, coding with the method effectively reduces the coding bit rate, and the coding gain is largest when the initial coding quantization parameter QP (i.e. the original coding quantization parameter QPorg) is small.
Table 3 lists the rate-distortion performance of the method of the invention when coding the panoramic video sequences listed in Table 1. The quality of the coded video was evaluated with a classical subjective quality evaluation method, using the Mean Opinion Score (MOS) as the quality evaluation index; the rate-distortion performance index BDBRMOS of each panoramic video sequence under the MOS index was calculated to evaluate the performance of the method comprehensively.
TABLE 3 Rate distortion Performance for encoding using the method of the present invention
As can be seen from Table 3, under the BDBRMOS rate-distortion index the method of the invention achieves an average BDBRMOS of about −7.4%, i.e. about 7.4% of the coding bit rate is saved at the same subjective quality under the MOS index. This shows that, compared with the HM16.14 original platform method, the method of the invention saves more coding bit rate at the same subjective perceptual quality. Table 3 also shows that, for panoramic video sequences with different scenes and different motion conditions, the method effectively saves coding bit rate and significantly improves rate-distortion performance.
Claims (3)
1. A video low complexity coding method based on panoramic visual perception characteristics is characterized by comprising the following steps:
step 1: defining a video frame to be coded currently in the panoramic video in the ERP projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height of the video frame in the panoramic video in the ERP projection format is H;
Step 2: judging whether the current frame is the 1st video frame; if so, encoding the current frame with the original algorithm of the HEVC video encoder and then executing step 10; otherwise, executing step 3;
Step 3: performing spatial JND threshold calculation on each pixel point in the current frame to obtain a panoramic spatial JND threshold map of the current frame, recording the map as G_1, where the pixel value of each pixel point in G_1 is the spatial JND threshold of the corresponding pixel point in the current frame; and performing weighted gradient calculation on each pixel point in the current frame to obtain a weighted gradient map of the current frame, recording the map as G_2, where the pixel value of each pixel point in G_2 is the weighted gradient value of the corresponding pixel point in the current frame;
Step 4: calculating the spatial perception factor of each pixel point in the current frame, recording the spatial perception factor of the pixel point with coordinate position (x, y) in the current frame as δ_A(x, y), where δ_A(x, y) = G_1(x, y); and calculating the motion perception factor of each pixel point in the current frame, recording the motion perception factor of the pixel point with coordinate position (x, y) in the current frame as δ_T(x, y); then calculating the spatio-temporal weighted perception factor of each pixel point in the current frame, recording the spatio-temporal weighted perception factor of the pixel point with coordinate position (x, y) in the current frame as δ(x, y), where δ(x, y) = δ_A(x, y) × δ_T(x, y); then calculating the average of the spatio-temporal weighted perception factors of all pixel points in the current frame, recorded as S_δ; and calculating the dimension weight of each pixel point in the current frame, recording the dimension weight of the pixel point with coordinate position (x, y) in the current frame as w_ERP(x, y); where 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1; G_1(x, y) denotes the pixel value of the pixel point with coordinate position (x, y) in G_1 and also represents the spatial JND threshold of the pixel point with coordinate position (x, y) in the current frame; G_2(x, y) denotes the pixel value of the pixel point with coordinate position (x, y) in G_2 and also represents the weighted gradient value of the pixel point with coordinate position (x, y) in the current frame; S_F denotes the average of the pixel values of all pixel points in G_2, i.e., the average of the weighted gradient values of all pixel points in the current frame; ε is a motion perception constant with ε ∈ [1, 2]; and cos() is the cosine function;
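The per-pixel quantities of step 4 can be sketched as follows. The patent's own formulas for δ_T and w_ERP are embedded as images in this text and are therefore unknown here: the power-law δ_T and the WS-PSNR-style cosine latitude weight below are only assumed stand-ins. Only δ_A(x, y) = G_1(x, y) and δ(x, y) = δ_A(x, y) × δ_T(x, y) are taken verbatim from the claim.

```python
import numpy as np

def perception_factors(jnd_map, grad_map, eps=1.5):
    """Spatio-temporal weighted perception factor per pixel (claim 1, step 4).

    jnd_map  -- G_1: spatial JND threshold of each pixel
    grad_map -- G_2: weighted gradient value of each pixel
    eps      -- motion perception constant, eps in [1, 2]
    """
    s_f = grad_map.mean()                 # S_F: mean weighted gradient of the frame
    delta_a = jnd_map                     # delta_A(x, y) = G_1(x, y), per the claim
    delta_t = (grad_map / s_f) ** eps     # ASSUMED form of delta_T (formula elided in source)
    delta = delta_a * delta_t             # delta = delta_A * delta_T, per the claim
    return delta, delta.mean()            # per-pixel map and its frame mean S_delta

def erp_dimension_weight(w, h):
    """Latitude weight map for an ERP frame. The claim's w_ERP formula is an
    image in the source; the standard WS-PSNR cosine weight is used here as
    a stand-in (1 at the equator, falling toward the poles)."""
    y = np.arange(h)
    col = np.cos((y + 0.5 - h / 2) * np.pi / h)
    return np.tile(col[:, None], (1, w))   # H x W weight map
```

Both helpers operate on whole frames as numpy arrays, matching the per-pixel definitions in the claim.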
Step 5: defining the maximum coding unit to be processed currently in the current frame as the current maximum coding unit;
Step 6: calculating the average of the spatio-temporal weighted perception factors of all pixel points in the current maximum coding unit, recorded as S_δ_LCU; then calculating the Lagrange-coefficient adjustment factor of the current maximum coding unit based on the spatio-temporal weighted perception factor, recorded as Ψ_LCU; then calculating the quantization parameter variation of the current maximum coding unit based on the spatio-temporal weighted perception factor, recorded as ΔQP_1, where ΔQP_1 = 3 log_2(Ψ_LCU); and where K_LCU and B_LCU are both adjustment parameters with K_LCU ∈ (0, 1) and B_LCU ∈ (0, 1);
Step 7: calculating the average of the dimension weights of all pixel points in the current maximum coding unit; then calculating the quantization parameter variation of the current maximum coding unit based on the dimension weight, recorded as ΔQP_2; where a and b are both adjustment parameters, a ∈ (0, 1), b ∈ (0, 1), and b < a;
Step 8: calculating the new coding quantization parameter of the current maximum coding unit, recorded as QP_new; then updating the coding quantization parameter of the current maximum coding unit with QP_new; then coding the current maximum coding unit; where QP_org denotes the original coding quantization parameter of the current maximum coding unit, and the symbol ⌊·⌋ is the round-down (floor) operator;
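A minimal sketch of the LCU-level QP update in steps 6 through 8. Only ΔQP_1 = 3·log_2(Ψ_LCU) and the final round-down are stated verbatim in the claims; the expressions used below for Ψ_LCU and ΔQP_2, the default parameter values, and the clamp to the HEVC QP range are all assumptions, since the patent's own formulas are embedded images.

```python
import math

def lcu_qp_update(qp_org, s_delta_lcu, s_delta,
                  k_lcu=0.5, b_lcu=0.5, w_mean_lcu=1.0, a=0.5, b=0.2):
    """New coding QP for one largest coding unit (claim 1, steps 6-8).

    qp_org      -- original coding quantization parameter of the LCU
    s_delta_lcu -- mean spatio-temporal perception factor over the LCU
    s_delta     -- mean spatio-temporal perception factor over the frame
    w_mean_lcu  -- mean dimension weight over the LCU
    """
    # ASSUMED: psi mixes the LCU's mean perception factor with the frame mean
    psi = k_lcu * (s_delta_lcu / s_delta) + b_lcu
    dqp1 = 3.0 * math.log2(psi)                  # given verbatim in the claim
    # ASSUMED: dQP2 scales linearly with the LCU's mean dimension weight
    dqp2 = a - b * w_mean_lcu
    qp_new = qp_org + math.floor(dqp1 + dqp2)    # floor operator per the claim
    return max(0, min(51, qp_new))               # clamp to the HEVC QP range [0, 51]
```

With these placeholder parameters, an LCU whose perception factor equals the frame average keeps its original QP, while smoother, less perceptually masked LCUs receive a lower (finer) QP.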
Step 9: taking the next maximum coding unit to be processed in the current frame as the current maximum coding unit, then returning to step 6 and continuing until all maximum coding units in the current frame have been processed, and then executing step 10;
Step 10: taking the next video frame to be coded in the panoramic video in the ERP projection format as the current frame, then returning to step 2 and continuing until all video frames in the panoramic video in the ERP projection format have been coded.
2. The video low-complexity coding method based on panoramic visual perception characteristics according to claim 1, wherein in step 3, G_1 is obtained as follows: performing spatial JND threshold calculation on each pixel point in the current frame with a spatial just-noticeable-distortion model to obtain G_1.
3. The video low-complexity coding method according to claim 1 or 2, wherein in step 3, G_2 is obtained as follows: recording the pixel value of the pixel point with coordinate position (x, y) in G_2 as G_2(x, y), where 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, and G_2(x, y) also represents the weighted gradient value of the pixel point with coordinate position (x, y) in the current frame; the horizontal-direction, vertical-direction, and temporal-direction gradient values of the pixel point with coordinate position (x, y) in the current frame are calculated with a 3D-Sobel operator; α denotes the gradient adjustment factor in the horizontal direction, β denotes the gradient adjustment factor in the vertical direction, γ denotes the gradient adjustment factor in the temporal direction, and α + β + γ = 1.
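The weighted gradient map of claim 3 can be sketched as below. The claim names a 3D-Sobel operator over three consecutive frames; scipy's separable Sobel filter on a 3-frame stack is used here as an approximation, and combining the three directional gradients by a weighted sum of absolute values is an assumption, since the claim's combining formula is an embedded image in the source. The default α, β, γ values are placeholders satisfying α + β + γ = 1.

```python
import numpy as np
from scipy.ndimage import sobel

def weighted_gradient_map(prev, cur, nxt, alpha=0.4, beta=0.4, gamma=0.2):
    """Weighted gradient map G_2 of the current frame (claim 3 sketch).

    prev, cur, nxt -- three consecutive grayscale frames (H x W arrays)
    alpha, beta, gamma -- horizontal / vertical / temporal gradient
    adjustment factors, alpha + beta + gamma = 1 (placeholder values).
    """
    stack = np.stack([prev, cur, nxt]).astype(float)  # axes: (t, y, x)
    g_t = sobel(stack, axis=0)[1]   # temporal gradient, middle (current) frame
    g_y = sobel(stack, axis=1)[1]   # vertical gradient of the current frame
    g_x = sobel(stack, axis=2)[1]   # horizontal gradient of the current frame
    # ASSUMED combination: weighted sum of gradient magnitudes
    return alpha * np.abs(g_x) + beta * np.abs(g_y) + gamma * np.abs(g_t)
```

Flat, static content yields G_2 ≈ 0 (low masking), while textured or moving regions yield large G_2, which is consistent with the motion perception factor defined in step 4 of claim 1.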
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210157533.5A CN114567776B (en) | 2022-02-21 | 2022-02-21 | Video low-complexity coding method based on panoramic visual perception characteristics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210157533.5A CN114567776B (en) | 2022-02-21 | 2022-02-21 | Video low-complexity coding method based on panoramic visual perception characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114567776A true CN114567776A (en) | 2022-05-31 |
CN114567776B CN114567776B (en) | 2023-05-05 |
Family
ID=81714022
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210157533.5A Active CN114567776B (en) | 2022-02-21 | 2022-02-21 | Video low-complexity coding method based on panoramic visual perception characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114567776B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6366705B1 (en) * | 1999-01-28 | 2002-04-02 | Lucent Technologies Inc. | Perceptual preprocessing techniques to reduce complexity of video coders |
US20100086063A1 (en) * | 2008-10-02 | 2010-04-08 | Apple Inc. | Quality metrics for coded video using just noticeable difference models |
CN103096079A (en) * | 2013-01-08 | 2013-05-08 | 宁波大学 | Multi-view video rate control method based on exactly perceptible distortion |
US20140169451A1 (en) * | 2012-12-13 | 2014-06-19 | Mitsubishi Electric Research Laboratories, Inc. | Perceptually Coding Images and Videos |
CN104954778A (en) * | 2015-06-04 | 2015-09-30 | 宁波大学 | Objective stereo image quality assessment method based on perception feature set |
CN107147912A (en) * | 2017-05-04 | 2017-09-08 | 浙江大华技术股份有限公司 | A kind of method for video coding and device |
Non-Patent Citations (2)
Title |
---|
YAFEN XING ET AL: "Spatiotemporal just noticeable difference modeling with heterogeneous temporal visual features" * |
DU, Baozhen: "Perceptual-threshold-based fast coding algorithm for stereoscopic video" *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116723330A (en) * | 2023-03-28 | 2023-09-08 | 成都师范学院 | Panoramic video coding method for self-adapting spherical domain distortion propagation chain length |
CN116723330B (en) * | 2023-03-28 | 2024-02-23 | 成都师范学院 | Panoramic video coding method for self-adapting spherical domain distortion propagation chain length |
Also Published As
Publication number | Publication date |
---|---|
CN114567776B (en) | 2023-05-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108924554B (en) | Panoramic video coding rate distortion optimization method based on spherical weighting structure similarity | |
CN110062234B (en) | Perceptual video coding method based on just noticeable distortion of region | |
CN104219525B (en) | Perception method for video coding based on conspicuousness and minimum discernable distortion | |
CN108063944B (en) | Perception code rate control method based on visual saliency | |
CN107241607B (en) | Visual perception coding method based on multi-domain JND model | |
CN111988611A (en) | Method for determining quantization offset information, image coding method, image coding device and electronic equipment | |
CN111193931B (en) | Video data coding processing method and computer storage medium | |
CN103096079B (en) | A kind of multi-view video rate control based on proper discernable distortion | |
CN103313002B (en) | Situation-based mobile streaming media energy-saving optimization method | |
DE102019218316A1 (en) | 3D RENDER-TO-VIDEO ENCODER PIPELINE FOR IMPROVED VISUAL QUALITY AND LOW LATENCY | |
CN111510766A (en) | Video coding real-time evaluation and playing tool | |
CN108521572B (en) | Residual filtering method based on pixel domain JND model | |
CN109451331B (en) | Video transmission method based on user cognitive demand | |
CN114567776B (en) | Video low-complexity coding method based on panoramic visual perception characteristics | |
CN108900838A (en) | A kind of Rate-distortion optimization method based on HDR-VDP-2 distortion criterion | |
CN103024384B (en) | A kind of Video coding, coding/decoding method and device | |
CN115174898A (en) | Rate distortion optimization method based on visual perception | |
JP3105335B2 (en) | Compression / expansion method by orthogonal transform coding of image | |
CN117440158A (en) | MIV immersion type video coding rate distortion optimization method based on three-dimensional geometric distortion | |
CN102685491A (en) | Method and system for realizing video coding | |
CN111757112B (en) | HEVC (high efficiency video coding) perception code rate control method based on just noticeable distortion | |
CN110933416B (en) | High dynamic range video self-adaptive preprocessing method | |
CN109451309B (en) | CTU (China train unit) layer code rate allocation method based on significance for HEVC (high efficiency video coding) full I frame coding | |
CN103517067B (en) | Initial quantitative parameter self-adaptive adjustment method and system | |
CN111464805A (en) | Three-dimensional panoramic video rapid coding method based on panoramic saliency |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 2024-01-18. Patentee after: Zhejiang Chuanzhi Electronic Technology Co., Ltd., Room 166, Building 1, No. 8 Xingye Avenue, Ningbo Free Trade Zone, Zhejiang Province, 315800. Patentee before: Ningbo Polytechnic, No. 388 Lushan East Road, Ningbo Economic and Technological Development Zone, Zhejiang Province, 315800.