CN114567776B - Video low-complexity coding method based on panoramic visual perception characteristics
- Publication number: CN114567776B (application CN202210157533.5A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; no legal analysis has been performed)
Classifications
- H04N19/172—Adaptive coding of digital video signals characterised by the coding unit, the unit being an image region, e.g. a picture, frame or field
- H04N19/124—Quantisation
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
- H04N19/182—Adaptive coding of digital video signals characterised by the coding unit, the unit being a pixel
- H04N19/42—Implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
Abstract
The invention discloses a video low-complexity coding method based on panoramic visual perception characteristics. The method uses the spatial JND threshold as a spatial perception factor and derives a motion perception factor from a weighted gradient value, from which it obtains the average spatio-temporal weighted perception factor of all pixels in each largest coding unit. Following rate-distortion optimization theory, it calculates a Lagrange-coefficient adjustment factor for the largest coding unit based on the spatio-temporal weighted perception factor, and from it the quantization parameter variation of the largest coding unit based on that factor; at the same time it calculates the quantization parameter variation of the largest coding unit based on the dimension weight of the ERP projection. A new coding quantization parameter for the largest coding unit is computed from the two quantization parameter variations and applied during coding. The advantage of the method is that, while maintaining coding quality, it effectively reduces the coding bit rate and thereby the coding complexity, and it significantly improves rate-distortion performance, with the largest gains when the initial coding quantization parameter is small.
Description
Technical Field
The invention relates to a video coding technology, in particular to a video low-complexity coding method based on panoramic visual perception characteristics.
Background
In recent years, panoramic video systems have become widely popular thanks to their "immersive" visual experience, and they have great application prospects in fields such as virtual reality and simulated driving. However, panoramic video systems still suffer from excessive encoding complexity, which poses a major challenge to their application. How to reduce coding complexity has therefore become a technical problem to be solved in this field.
Existing low-complexity panoramic video coding algorithms do not fully consider the perceptual characteristics of the human visual system (HVS) or the characteristics of panoramic video, and thus struggle to achieve optimal coding performance. The main purpose of video coding is to reduce the coding bit rate as much as possible while guaranteeing a certain video quality, or, when the bit rate is limited, to encode with minimum distortion. Consequently, how to exploit the perceptual characteristics of the human visual system together with the characteristics of panoramic video to guide the selection of coding parameters has become an important breakthrough direction for research on reducing coding complexity in this field.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a video low-complexity coding method based on panoramic visual perception characteristics that can effectively save coding bit rate and thereby effectively reduce coding complexity.
The technical scheme adopted for solving the technical problems is as follows: a video low-complexity coding method based on panoramic visual perception characteristics is characterized by comprising the following steps:
step 1: defining a current video frame to be coded in the panoramic video in the ERP projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height is H;
step 2: judge whether the current frame is the 1st video frame; if so, encode the current frame with the original algorithm of an HEVC video encoder and then execute step 10; otherwise, execute step 3;
step 3: perform spatial JND threshold calculation for each pixel in the current frame to obtain the panoramic spatial JND threshold map of the current frame, denoted G_1; the pixel value of each pixel in G_1 is the spatial JND threshold of the corresponding pixel in the current frame; likewise, perform weighted gradient calculation for each pixel in the current frame to obtain the weighted gradient map of the current frame, denoted G_2; the pixel value of each pixel in G_2 is the weighted gradient value of the corresponding pixel in the current frame;

step 4: calculate the spatial perception factor of each pixel in the current frame, denoting the spatial perception factor of the pixel at coordinate (x, y) in the current frame as δ_A(x, y), with δ_A(x, y) = G_1(x, y); calculate the motion perception factor of each pixel in the current frame, denoting the motion perception factor of the pixel at (x, y) as δ_T(x, y); then calculate the spatio-temporal weighted perception factor of each pixel in the current frame, denoting that of the pixel at (x, y) as δ(x, y), with δ(x, y) = δ_A(x, y) × δ_T(x, y); calculate the average of the spatio-temporal weighted perception factors of all pixels in the current frame, denoted S_δ; calculate the dimension weight of each pixel in the current frame, denoting the dimension weight of the pixel at (x, y) as w_ERP(x, y); wherein 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, G_1(x, y) denotes the pixel value of the pixel at coordinate (x, y) in G_1 and also the spatial JND threshold of the pixel at (x, y) in the current frame, G_2(x, y) denotes the pixel value of the pixel at (x, y) in G_2 and also the weighted gradient value of the pixel at (x, y) in the current frame, S_F denotes the average of the pixel values of all pixels in G_2 and also the average of the weighted gradient values of all pixels in the current frame, ε is a motion perception constant with ε ∈ [1, 2], and cos() is the cosine function;
step 5: define the largest coding unit currently to be processed in the current frame as the current largest coding unit;

step 6: calculate the average of the spatio-temporal weighted perception factors of all pixels in the current largest coding unit, denoted S_δ_LCU; then calculate the Lagrange-coefficient adjustment factor of the current largest coding unit based on the spatio-temporal weighted perception factor, denoted Ψ_LCU; then calculate the quantization parameter variation of the current largest coding unit based on the spatio-temporal weighted perception factor, denoted ΔQP_1, with ΔQP_1 = 3log_2(Ψ_LCU); wherein K_LCU and B_LCU are adjustment parameters, K_LCU ∈ (0, 1), B_LCU ∈ (0, 1);

step 7: calculate the average of the dimension weights of all pixels in the current largest coding unit, denoted S_wERP_LCU; then calculate the quantization parameter variation of the current largest coding unit based on the dimension weight, denoted ΔQP_2; wherein a and b are adjustment parameters, a ∈ (0, 1), b < a;

step 8: calculate the new coding quantization parameter of the current largest coding unit, denoted QP_new; then use QP_new to update the coding quantization parameter of the current largest coding unit, and encode the current largest coding unit; wherein QP_org denotes the original coding quantization parameter of the current largest coding unit, and the symbol ⌊·⌋ denotes the floor (round-down) operator;

step 9: take the next largest coding unit to be processed in the current frame as the current largest coding unit, return to step 6 and continue until all largest coding units in the current frame have been processed, then execute step 10;

step 10: take the next video frame to be encoded in the ERP-format panoramic video as the current frame and return to step 2, continuing until all video frames in the ERP-format panoramic video have been encoded.
In step 3, G_1 is obtained as follows: perform spatial JND threshold calculation on each pixel in the current frame with a spatial just-noticeable-distortion model to obtain G_1.
In step 3, G_2 is obtained as follows: denote the pixel value of the pixel at coordinate (x, y) in G_2 as G_2(x, y), where 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1; G_2(x, y) denotes the weighted gradient value of the pixel at (x, y) in the current frame, combining the horizontal gradient value, the vertical gradient value and the time-domain gradient value of the pixel at (x, y) in the current frame; these gradient values are computed with a 3D-Sobel operator; α denotes the gradient adjustment factor in the horizontal direction, β the gradient adjustment factor in the vertical direction and γ the gradient adjustment factor in the time-domain direction, with α + β + γ = 1.
Compared with the prior art, the invention has the advantages that:
the method fully considers the perception characteristics of a human eye visual system and the characteristics of panoramic video, utilizes an airspace JND threshold (visual perception information) as an airspace perception factor, obtains a motion perception factor through a weighted gradient value (visual perception information), further calculates the average value of space-time weighted perception factors of all pixel points in a maximum coding unit, calculates a Lagrange coefficient adjustment factor of the maximum coding unit based on the space-time weighted perception factors according to a rate distortion optimization theory, and further obtains the quantization parameter variation quantity of the maximum coding unit based on the space-time weighted perception factors; meanwhile, the method takes the dimension weight characteristics of the panoramic video in the ERP projection format into consideration, and calculates the quantization parameter variation of the maximum coding unit based on the dimension weight; and calculating a new coding quantization parameter of the maximum coding unit according to the two quantization parameter variation amounts, and applying the new coding quantization parameter to coding. The method can adaptively adjust the coding quantization parameter aiming at the time-space domain and panoramic latitude characteristics of a specific maximum coding unit, and experimental tests show that the method can effectively reduce the coding rate while guaranteeing the coding quality, thereby effectively reducing the coding complexity, remarkably improving the rate distortion performance, and having better coding effect especially aiming at the condition of smaller initial coding quantization parameter.
Drawings
Fig. 1 is a block diagram of a general implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and embodiments.
The invention provides a video low-complexity coding method based on panoramic visual perception characteristics; its overall implementation block diagram is shown in Fig. 1, and it comprises the following steps:
Step 1: define the current video frame to be encoded in the ERP (equirectangular projection) format panoramic video as the current frame; the width of a video frame in the ERP-format panoramic video is W and the height is H.
Step 2: judge whether the current frame is the 1st video frame; if so, encode the current frame with the original algorithm of an HEVC video encoder and then execute step 10; otherwise, execute step 3.
Step 3: perform spatial JND (Just Noticeable Distortion) threshold calculation for each pixel in the current frame to obtain the panoramic spatial JND threshold map of the current frame, denoted G_1; the pixel value of each pixel in G_1 is the spatial JND threshold of the corresponding pixel in the current frame. Likewise, perform weighted gradient calculation for each pixel in the current frame to obtain the weighted gradient map of the current frame, denoted G_2; the pixel value of each pixel in G_2 is the weighted gradient value of the corresponding pixel in the current frame. A larger spatial JND threshold represents a larger just noticeable distortion, i.e. stronger spatial masking in the corresponding region; conversely, a smaller spatial JND threshold represents weaker spatial masking.
In this embodiment, G_1 is obtained as follows: apply an existing classical spatial just-noticeable-distortion model to each pixel in the current frame to compute its spatial JND threshold, which yields G_1.
In this embodiment, G_2 is obtained as follows: denote the pixel value of the pixel at coordinate (x, y) in G_2 as G_2(x, y), where 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1; G_2(x, y) is the weighted gradient value of the pixel at (x, y) in the current frame, combining its horizontal gradient value, vertical gradient value and time-domain gradient value. The time-domain gradient value is the gradient along the time-domain direction between the pixel at (x, y) in the current frame and the pixel at (x, y) in the previous video frame. The three gradient values are computed with the existing 3D-Sobel operator; α denotes the gradient adjustment factor in the horizontal direction, β the gradient adjustment factor in the vertical direction and γ the gradient adjustment factor in the time-domain direction, with α + β + γ = 1; in this embodiment α is 0.25, β is 0.25 and γ is 0.5.
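As an illustration only (no code appears in the original patent), the following Python sketch shows one way the weighted gradient map of step 3 could be computed. The patent's exact equation for G_2 did not survive extraction, so the weighted sum of absolute horizontal, vertical and temporal gradients below, with a plain 2-D Sobel plus a co-located frame difference standing in for the named 3D-Sobel operator, is an assumption; only the weights α = 0.25, β = 0.25, γ = 0.5 come from the embodiment.

```python
import numpy as np
from scipy.ndimage import sobel

def weighted_gradient_map(cur, prev, alpha=0.25, beta=0.25, gamma=0.5):
    """Hypothetical G2 of step 3: weighted sum of gradient magnitudes."""
    cur = cur.astype(np.float64)
    gh = np.abs(sobel(cur, axis=1))              # horizontal gradient value
    gv = np.abs(sobel(cur, axis=0))              # vertical gradient value
    gt = np.abs(cur - prev.astype(np.float64))   # temporal gradient: co-located frame difference
    return alpha * gh + beta * gv + gamma * gt   # weights satisfy alpha + beta + gamma = 1
```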
Step 4: calculate the spatial perception factor of each pixel in the current frame, denoting the spatial perception factor of the pixel at coordinate (x, y) as δ_A(x, y), with δ_A(x, y) = G_1(x, y). Calculate the motion perception factor of each pixel in the current frame, denoting the motion perception factor of the pixel at (x, y) as δ_T(x, y). Then calculate the spatio-temporal weighted perception factor of each pixel, denoting that of the pixel at (x, y) as δ(x, y), with δ(x, y) = δ_A(x, y) × δ_T(x, y). Calculate the average of the spatio-temporal weighted perception factors of all pixels in the current frame, denoted S_δ, i.e. the mean of δ(x, y) over all W×H pixels. Calculate the dimension weight of each pixel in the current frame, denoting the dimension weight of the pixel at (x, y) as w_ERP(x, y). Here 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1; G_1(x, y) denotes the pixel value at (x, y) in G_1, i.e. the spatial JND threshold of the pixel at (x, y) in the current frame; G_2(x, y) denotes the pixel value at (x, y) in G_2, i.e. the weighted gradient value of the pixel at (x, y) in the current frame; S_F denotes the average of the pixel values of all pixels in G_2, i.e. the average weighted gradient value over the current frame; ε is a motion perception constant with ε ∈ [1, 2], taken as 1 in this embodiment; cos() is the cosine function and π is the circular constant.
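Step 4 can be sketched as below (illustrative Python, not part of the patent). δ_A(x, y) = G_1(x, y) and δ = δ_A × δ_T are given in the text, but the δ_T equation itself is missing from the extracted text; the normalized-gradient power form (G_2/S_F)^ε used here is only a guess assembled from the quantities the text names (G_2, S_F and ε ∈ [1, 2]).

```python
import numpy as np

def spatiotemporal_factors(G1, G2, eps=1.0):
    """Step 4 sketch; the exact delta_T formula is assumed, not the patent's."""
    S_F = max(float(G2.mean()), 1e-9)  # average weighted gradient value of the frame
    delta_A = G1                       # spatial perception factor (given: delta_A = G1)
    delta_T = (G2 / S_F) ** eps        # hypothetical motion perception factor
    delta = delta_A * delta_T          # spatio-temporal weighted perception factor
    return delta, float(delta.mean())  # per-pixel map and its frame average S_delta
```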
In this embodiment, because each latitude is sampled with a different pixel density, the planar image contains different amounts of pixel redundancy at different latitudes, and the redundancy is most pronounced at the two poles. After a sphere is projected into the ERP format, the sphere center is usually taken as the base point; the longitude θ of the ERP format corresponds to the longitude of the sphere and the latitude φ of the ERP format corresponds to the latitude of the sphere, with θ ∈ [−π, π] and φ ∈ [−π/2, π/2]. To account for this panoramic latitude characteristic, the dimension weight parameter w_ERP(x, y) of the ERP format is introduced.
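The dimension weight can be sketched as follows. The patent's w_ERP equation is likewise missing from the extracted text; the standard ERP latitude weight used by WS-PSNR, cos((y + 0.5 − H/2)·π/H), is assumed here because it matches the description above (full weight at the equator, vanishing toward the redundant poles).

```python
import numpy as np

def erp_dimension_weight(W, H):
    """Assumed w_ERP map: WS-PSNR-style cosine latitude weighting."""
    y = np.arange(H, dtype=np.float64)
    w_row = np.cos((y + 0.5 - H / 2.0) * np.pi / H)  # ~1 at the equator, ~0 at the poles
    return np.tile(w_row[:, None], (1, W))           # every column shares its row's weight
```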
Step 5: define the largest coding unit (LCU) currently to be processed in the current frame as the current largest coding unit.
Step 6: calculate the average of the spatio-temporal weighted perception factors of all pixels in the current largest coding unit, denoted S_δ_LCU, i.e. the mean of δ_LCU(i, j) over 0 ≤ i ≤ 63 and 0 ≤ j ≤ 63, where δ_LCU(i, j) denotes the spatio-temporal weighted perception factor of the pixel at intra-block coordinate (i, j) in the current largest coding unit. Then calculate the Lagrange-coefficient adjustment factor of the current largest coding unit based on the spatio-temporal weighted perception factor, denoted Ψ_LCU, and from it the quantization parameter variation of the current largest coding unit based on the spatio-temporal weighted perception factor, denoted ΔQP_1, with ΔQP_1 = 3log_2(Ψ_LCU). K_LCU and B_LCU are adjustment parameters with K_LCU ∈ (0, 1) and B_LCU ∈ (0, 1); in this embodiment, both K_LCU and B_LCU were finally set to 0.5 after numerous experiments.
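A sketch of step 6 for one 64×64 LCU follows. ΔQP_1 = 3log_2(Ψ_LCU) and the embodiment values K_LCU = B_LCU = 0.5 are stated above, but the Ψ_LCU equation is missing from the extracted text; the linear form K_LCU·(S_δ_LCU/S_δ) + B_LCU below is purely an illustrative assumption built from the quantities the text names.

```python
import numpy as np

def delta_qp1(delta_map, S_delta, x0, y0, K_lcu=0.5, B_lcu=0.5):
    """Step 6 sketch for the LCU whose top-left pixel is (x0, y0)."""
    S_delta_lcu = float(delta_map[y0:y0 + 64, x0:x0 + 64].mean())  # LCU mean of delta
    psi_lcu = K_lcu * (S_delta_lcu / S_delta) + B_lcu              # hypothetical Psi_LCU
    return 3.0 * np.log2(psi_lcu)                                  # given: 3 * log2(Psi_LCU)
```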
Step 7: calculate the average of the dimension weights of all pixels in the current largest coding unit, denoted S_wERP_LCU, i.e. the mean of w_ERP_LCU(i, j) over 0 ≤ i ≤ 63 and 0 ≤ j ≤ 63, where w_ERP_LCU(i, j) denotes the dimension weight of the pixel at intra-block coordinate (i, j) in the current largest coding unit. Then calculate the quantization parameter variation of the current largest coding unit based on the dimension weight, denoted ΔQP_2. Here a and b are adjustment parameters with a ∈ (0, 1) and b < a; in this embodiment, a was finally set to 0.85 and b to 0.3 after numerous experiments.
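Step 7 can be sketched the same way. The ΔQP_2 equation is also missing from the extracted text, so the affine map below, which raises the QP toward the poles (small dimension weight) and lowers it slightly near the equator, is just one plausible reading that uses the embodiment's a = 0.85 and b = 0.3.

```python
def delta_qp2(w_map, x0, y0, a=0.85, b=0.3):
    """Step 7 sketch; the exact mapping is assumed, not the patent's."""
    S_w_lcu = float(w_map[y0:y0 + 64, x0:x0 + 64].mean())  # LCU mean dimension weight
    # Hypothetical form: coarser quantization where the ERP weight is small
    # (high latitudes), slightly finer near the equator; a in (0, 1), b < a.
    return a * (1.0 - S_w_lcu) - b
```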
Step 8: calculate the new coding quantization parameter of the current largest coding unit, denoted QP_new; then use QP_new to update the coding quantization parameter of the current largest coding unit, and encode the current largest coding unit with the HEVC video encoder. Here QP_org denotes the original coding quantization parameter of the current largest coding unit, which can be read from the encoder's initialization parameter list, and the symbol ⌊·⌋ denotes the floor (round-down) operator.
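Step 8 then combines the two variations. The text states only that QP_new is computed from QP_org and the two quantization parameter variations and that a floor operation is involved, so the additive combination below, and the clamp to HEVC's valid QP range [0, 51], are assumptions.

```python
import math

def new_qp(qp_org, dqp1, dqp2, qp_min=0, qp_max=51):
    """Step 8 sketch: assumed QP_org + floor(dqp1 + dqp2), clamped."""
    qp_new = qp_org + math.floor(dqp1 + dqp2)  # floor operator is given in the text
    return max(qp_min, min(qp_max, qp_new))    # clamping is an added assumption
```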
Step 9: take the next largest coding unit to be processed in the current frame as the current largest coding unit, return to step 6 and continue until all largest coding units in the current frame have been processed, then execute step 10.
Step 10: take the next video frame to be encoded in the ERP-format panoramic video as the current frame and return to step 2, continuing until all video frames in the ERP-format panoramic video have been encoded.
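Putting the pieces together, the driver below mirrors the control flow of Fig. 1 (steps 1-10) using the sketches above. It is illustrative only: `spatial_jnd` stands in for the classical spatial JND model of step 3, and `encoder` is a hypothetical wrapper exposing `encode_frame`/`encode_lcu`; in practice the per-LCU QP update would be integrated inside the HM encoder in C++.

```python
def encode_panoramic_video(frames, qp_org, encoder, spatial_jnd):
    """Control-flow sketch of Fig. 1; `frames` holds grayscale luma arrays."""
    prev = None
    for t, frame in enumerate(frames):               # steps 1 and 10: frame loop
        H, W = frame.shape
        if t == 0:
            encoder.encode_frame(frame, qp_org)      # step 2: 1st frame, original algorithm
        else:
            G1 = spatial_jnd(frame)                          # step 3: spatial JND map
            G2 = weighted_gradient_map(frame, prev)          # step 3: weighted gradient map
            delta, S_delta = spatiotemporal_factors(G1, G2)  # step 4: delta map and S_delta
            w_map = erp_dimension_weight(W, H)               # step 4: dimension weights
            for y0 in range(0, H, 64):                       # steps 5-9: LCU loop
                for x0 in range(0, W, 64):
                    dqp1 = delta_qp1(delta, S_delta, x0, y0)  # step 6
                    dqp2 = delta_qp2(w_map, x0, y0)           # step 7
                    qp = new_qp(qp_org, dqp1, dqp2)           # step 8
                    encoder.encode_lcu(frame, x0, y0, qp)     # encode with updated QP
        prev = frame
```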
To further illustrate the performance of the method of the present invention, it was tested as follows.
The HEVC standard reference software HM16.14 is selected as the experimental test platform. The hardware is an Intel(R) Core(TM) i7-10700 CPU with a main frequency of 2.9 GHz and 32 GB of memory, running a 64-bit WIN10 operating system; VS2013 is used as the development tool. Four panoramic video sequences are selected as standard test sequences: two 4K sequences, "AerialCity" and "DrivingInCity", and two 6K sequences, "BranCastle2" and "Landing2". For each standard test sequence, 100 frames are tested in intra-frame coding mode, SearchRange is set to 64, MaxPartitionDepth is set to 4, and the initial coding quantization parameter QP (i.e., the original coding quantization parameter QP_org) is set to 22, 27, 32 and 37 respectively.
Table 1 lists the relevant parameter information of the 4 panoramic video sequences "AerialCity", "DrivingInCity", "BranCastle2" and "Landing2".
Table 1 Related parameter information of the panoramic video sequences

| Panoramic video sequence | Video resolution |
|---|---|
| AerialCity | 3840×1920 |
| DrivingInCity | 3840×1920 |
| BranCastle2 | 6144×3072 |
| Landing2 | 6144×3072 |
Table 2 shows the coding bit-rate savings when encoding the panoramic video sequences listed in Table 1 with the method of the present invention, compared with the HM16.14 original platform method. The bit-rate saving rate of the method of the present invention over the HM16.14 original platform method is defined as ΔR_PRO = (R_ORG − R_PRO)/R_ORG × 100%, where R_PRO denotes the bit rate of encoding with the method of the present invention and R_ORG denotes the bit rate of encoding with the HM16.14 original platform method (for example, reducing 1000 kb/s to 871 kb/s gives ΔR_PRO = 12.9%).
Table 2 Bit-rate savings of encoding with the method of the present invention compared with the HM16.14 original platform method
As can be seen from Table 2, encoding with the method of the present invention saves 12.9% of the coding bit rate on average. For the 4 panoramic video sequences with different scenes and different motion conditions, the method effectively reduces the coding bit rate, and the coding effect is best when the initial coding quantization parameter QP (i.e., the original coding quantization parameter QP_org) is small.
Table 3 lists the rate-distortion performance of encoding the panoramic video sequences listed in Table 1 with the method of the present invention. The quality of the coded video is evaluated with a classical subjective quality evaluation method: the mean opinion score (MOS) is used as the quality evaluation index, and the rate-distortion performance index BDBR_MOS of each panoramic video sequence under MOS is calculated to comprehensively evaluate the performance of the method of the present invention.
Table 3 rate distortion performance for encoding using the method of the present invention
As can be seen from Table 3, under the MOS quality evaluation index the method of the present invention achieves a BDBR_MOS of about −7.4%, i.e., an average bit-rate saving of about 7.4% at the same subjective quality. This shows that, compared with the HM16.14 original platform method, the method of the present invention saves more coding bit rate at the same subjective perceptual quality. Table 3 also shows that, for panoramic video sequences with different scenes and different motion conditions, the method effectively saves coding bit rate and significantly improves rate-distortion performance.
Claims (3)
1. A video low-complexity coding method based on panoramic visual perception characteristics is characterized by comprising the following steps:
step 1: defining a current video frame to be coded in the panoramic video in the ERP projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height is H;
step 2: judge whether the current frame is the 1st video frame; if so, encode the current frame with the original algorithm of an HEVC video encoder and then execute step 10; otherwise, execute step 3;
step 3: perform spatial JND threshold calculation for each pixel in the current frame to obtain the panoramic spatial JND threshold map of the current frame, denoted G_1; the pixel value of each pixel in G_1 is the spatial JND threshold of the corresponding pixel in the current frame; likewise, perform weighted gradient calculation for each pixel in the current frame to obtain the weighted gradient map of the current frame, denoted G_2; the pixel value of each pixel in G_2 is the weighted gradient value of the corresponding pixel in the current frame;
step 4: calculate the spatial perception factor of each pixel in the current frame, denoting the spatial perception factor of the pixel at coordinate (x, y) in the current frame as δ_A(x, y), with δ_A(x, y) = G_1(x, y); calculate the motion perception factor of each pixel in the current frame, denoting the motion perception factor of the pixel at (x, y) as δ_T(x, y); then calculate the spatio-temporal weighted perception factor of each pixel in the current frame, denoting that of the pixel at (x, y) as δ(x, y), with δ(x, y) = δ_A(x, y) × δ_T(x, y); calculate the average of the spatio-temporal weighted perception factors of all pixels in the current frame, denoted S_δ; calculate the dimension weight of each pixel in the current frame, denoting the dimension weight of the pixel at (x, y) as w_ERP(x, y); wherein 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, G_1(x, y) denotes the pixel value of the pixel at coordinate (x, y) in G_1 and also the spatial JND threshold of the pixel at (x, y) in the current frame, G_2(x, y) denotes the pixel value of the pixel at (x, y) in G_2 and also the weighted gradient value of the pixel at (x, y) in the current frame, S_F denotes the average of the pixel values of all pixels in G_2 and also the average of the weighted gradient values of all pixels in the current frame, ε is a motion perception constant with ε ∈ [1, 2], and cos() is the cosine function;
step 5: define the largest coding unit currently to be processed in the current frame as the current largest coding unit;
step 6: calculate the average of the spatio-temporal weighted perception factors of all pixels in the current largest coding unit, denoted S_δ_LCU; then calculate the Lagrange-coefficient adjustment factor of the current largest coding unit based on the spatio-temporal weighted perception factor, denoted Ψ_LCU; then calculate the quantization parameter variation of the current largest coding unit based on the spatio-temporal weighted perception factor, denoted ΔQP_1, with ΔQP_1 = 3log_2(Ψ_LCU); wherein K_LCU and B_LCU are adjustment parameters, K_LCU ∈ (0, 1), B_LCU ∈ (0, 1);
step 7: calculate the average of the dimension weights of all pixels in the current largest coding unit, denoted S_wERP_LCU; then calculate the quantization parameter variation of the current largest coding unit based on the dimension weight, denoted ΔQP_2; wherein a and b are adjustment parameters, a ∈ (0, 1), b < a;
step 8: calculate the new coding quantization parameter of the current largest coding unit, denoted QP_new; then use QP_new to update the coding quantization parameter of the current largest coding unit, and encode the current largest coding unit; wherein QP_org denotes the original coding quantization parameter of the current largest coding unit, and the symbol ⌊·⌋ denotes the floor (round-down) operator;
step 9: take the next largest coding unit to be processed in the current frame as the current largest coding unit, return to step 6 and continue until all largest coding units in the current frame have been processed, then execute step 10;
step 10: take the next video frame to be encoded in the ERP-format panoramic video as the current frame and return to step 2, continuing until all video frames in the ERP-format panoramic video have been encoded.
2. The video low-complexity coding method based on panoramic visual perception characteristics according to claim 1, wherein in step 3, G_1 is obtained as follows: perform spatial JND threshold calculation on each pixel in the current frame with a spatial just-noticeable-distortion model to obtain G_1.
3. The video low-complexity coding method based on panoramic visual perception characteristics according to claim 1 or 2, wherein in step 3, G_2 is obtained as follows: denote the pixel value of the pixel at coordinate (x, y) in G_2 as G_2(x, y), where 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1; G_2(x, y) denotes the weighted gradient value of the pixel at (x, y) in the current frame, combining the horizontal gradient value, the vertical gradient value and the time-domain gradient value of the pixel at (x, y) in the current frame; these gradient values are computed with a 3D-Sobel operator; α denotes the gradient adjustment factor in the horizontal direction, β the gradient adjustment factor in the vertical direction and γ the gradient adjustment factor in the time-domain direction, with α + β + γ = 1.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210157533.5A | 2022-02-21 | 2022-02-21 | Video low-complexity coding method based on panoramic visual perception characteristics |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN114567776A | 2022-05-31 |
| CN114567776B | 2023-05-05 |
Family ID: 81714022
Families Citing this family (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116723330B | 2023-03-28 | 2024-02-23 | Chengdu Normal University | Panoramic video coding method for self-adapting spherical domain distortion propagation chain length |
Citations (4)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6366705B1 | 1999-01-28 | 2002-04-02 | Lucent Technologies Inc. | Perceptual preprocessing techniques to reduce complexity of video coders |
| CN103096079A | 2013-01-08 | 2013-05-08 | Ningbo University | Multi-view video rate control method based on just noticeable distortion |
| CN104954778A | 2015-06-04 | 2015-09-30 | Ningbo University | Objective stereo image quality assessment method based on perception feature set |
| CN107147912A | 2017-05-04 | 2017-09-08 | Zhejiang Dahua Technology Co., Ltd. | Video coding method and device |
Family Cites Families (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100086063A1 | 2008-10-02 | 2010-04-08 | Apple Inc. | Quality metrics for coded video using just noticeable difference models |
| US9237343B2 | 2012-12-13 | 2016-01-12 | Mitsubishi Electric Research Laboratories, Inc. | Perceptually coding images and videos |
Non-Patent Citations (2)

- Yafen Xing et al., "Spatiotemporal just noticeable difference modeling with heterogeneous temporal visual features," Displays, 2021.
- Du Baozhen, "Fast stereoscopic video coding algorithm based on perceptual thresholds," Information & Computer (Theory Edition), 2020.
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| 2024-01-18 | TR01 | Transfer of patent right | Patentee after: Zhejiang Chuanzhi Electronic Technology Co., Ltd., Room 166, Building 1, No. 8 Xingye Avenue, Ningbo Free Trade Zone, Zhejiang Province, 315800; patentee before: Ningbo Polytechnic, No. 388 Lushan East Road, Ningbo Economic and Technological Development Zone, Zhejiang Province, 315800 |