CN114567776B - Video low-complexity coding method based on panoramic visual perception characteristics - Google Patents

Video low-complexity coding method based on panoramic visual perception characteristics

Info

Publication number
CN114567776B
CN114567776B (application CN202210157533.5A)
Authority
CN
China
Prior art keywords
current frame
pixel point
coding unit
current
perception
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210157533.5A
Other languages
Chinese (zh)
Other versions
CN114567776A (en)
Inventor
杜宝祯
张奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Chuanzhi Electronic Technology Co., Ltd.
Original Assignee
Ningbo Polytechnic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Polytechnic
Priority to CN202210157533.5A
Publication of CN114567776A
Application granted
Publication of CN114567776B
Legal status: Active (current)
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a video low-complexity coding method based on panoramic visual perception characteristics. The method uses the spatial JND threshold as a spatial perception factor and derives a motion perception factor from a weighted gradient value, from which the mean space-time weighted perception factor of all pixels in a largest coding unit is obtained. According to rate-distortion optimization theory, a Lagrange-coefficient adjustment factor of the largest coding unit based on the space-time weighted perception factor is calculated, yielding the quantization parameter variation of the largest coding unit based on the space-time weighted perception factor; at the same time, the quantization parameter variation of the largest coding unit based on the latitude weight is calculated. A new coding quantization parameter of the largest coding unit is then computed from the two quantization parameter variations and applied during coding. The advantages are that coding quality is preserved while the coding bit rate and the coding complexity are effectively reduced, rate-distortion performance is markedly improved, and the coding effect is particularly good when the initial coding quantization parameter is small.

Description

Video low-complexity coding method based on panoramic visual perception characteristics
Technical Field
The invention relates to a video coding technology, in particular to a video low-complexity coding method based on panoramic visual perception characteristics.
Background
In recent years, panoramic video systems have become widely popular for their immersive visual experience and show great application prospects in fields such as virtual reality and driving simulation. However, panoramic video systems still suffer from excessive coding complexity, which poses a great challenge to their application. How to reduce coding complexity has therefore become a technical problem to be solved in this field.
Existing low-complexity panoramic video coding algorithms do not fully consider the perception characteristics of the human visual system (HVS) or the characteristics of panoramic video, and therefore struggle to reach optimal coding performance. The main purpose of video coding is to reduce the coding bit rate as much as possible while guaranteeing a certain video quality, or, when the bit rate is limited, to code with minimum distortion. How to combine the perception characteristics of the human visual system with the characteristics of panoramic video to guide the selection of coding parameters has therefore become an important direction for research on reducing coding complexity in this field.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a video low-complexity coding method based on panoramic visual perception characteristics, which can effectively save coding bit rate and thereby effectively reduce coding complexity.
The technical scheme adopted to solve the above technical problem is as follows: a video low-complexity coding method based on panoramic visual perception characteristics, comprising the following steps:
step 1: defining a current video frame to be coded in the panoramic video in the ERP projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height is H;
step 2: judge whether the current frame is the 1st video frame; if so, encode it with the original algorithm of an HEVC video encoder and then execute step 10; otherwise, execute step 3;
step 3: perform spatial JND threshold calculation on each pixel in the current frame to obtain the panoramic spatial JND threshold map of the current frame, denoted G_1, where the pixel value of each pixel in G_1 is the spatial JND threshold of the corresponding pixel in the current frame; and perform weighted gradient calculation on each pixel in the current frame to obtain the weighted gradient map of the current frame, denoted G_2, where the pixel value of each pixel in G_2 is the weighted gradient value of the corresponding pixel in the current frame;
step 4: calculate the spatial perception factor of each pixel in the current frame, denoting the spatial perception factor of the pixel at coordinate (x, y) in the current frame as δ_A(x,y), δ_A(x,y) = G_1(x,y); calculate the motion perception factor of each pixel in the current frame, denoting the motion perception factor of the pixel at (x, y) as δ_T(x,y), which is computed from G_2(x,y), S_F and ε (the formula is given as an image in the original); then calculate the space-time weighted perception factor of each pixel in the current frame, denoting that of the pixel at (x, y) as δ(x,y), δ(x,y) = δ_A(x,y) × δ_T(x,y); calculate the mean of the space-time weighted perception factors of all pixels in the current frame, denoted S_δ; calculate the latitude weight of each pixel in the current frame, denoting the latitude weight of the pixel at (x, y) as w_ERP(x,y), w_ERP(x,y) = cos((y + 0.5 − H/2) × π/H); where 0 ≤ x ≤ W−1, 0 ≤ y ≤ H−1, G_1(x,y) denotes the pixel value of the pixel at (x, y) in G_1 and also the spatial JND threshold of the pixel at (x, y) in the current frame, G_2(x,y) denotes the pixel value of the pixel at (x, y) in G_2 and also the weighted gradient value of the pixel at (x, y) in the current frame, S_F denotes the mean of the pixel values of all pixels in G_2 and also the mean weighted gradient value of all pixels in the current frame, ε is the motion perception constant, ε ∈ [1,2], and cos() is the cosine function;
step 5: define the largest coding unit to be processed in the current frame as the current largest coding unit;
step 6: calculate the mean of the space-time weighted perception factors of all pixels in the current largest coding unit, denoted S_δ_LCU; then calculate the Lagrange-coefficient adjustment factor of the current largest coding unit based on the space-time weighted perception factor, denoted ψ_LCU, which is computed from S_δ_LCU and the adjustment parameters K_LCU and B_LCU (the formula is given as an image in the original); then calculate the quantization parameter variation of the current largest coding unit based on the space-time weighted perception factor, denoted ΔQP_1, ΔQP_1 = 3 log_2(ψ_LCU); where K_LCU and B_LCU are adjustment parameters, K_LCU ∈ (0,1), B_LCU ∈ (0,1);
step 7: calculate the mean of the latitude weights of all pixels in the current largest coding unit, denoted S_wERP_LCU; then calculate the quantization parameter variation of the current largest coding unit based on the latitude weight, denoted ΔQP_2, which is computed from S_wERP_LCU and the adjustment parameters a and b (the formula is given as an image in the original); where a and b are adjustment parameters, a ∈ (0,1), b < a;
step 8: calculate the new coding quantization parameter of the current largest coding unit, denoted QP_new, QP_new = QP_org + ⌊ΔQP_1 + ΔQP_2⌋; then update the coding quantization parameter of the current largest coding unit with QP_new and encode the current largest coding unit; where QP_org denotes the original coding quantization parameter of the current largest coding unit and ⌊ ⌋ is the floor (round-down) operator;
step 9: taking the next largest coding unit to be processed in the current frame as the current largest coding unit, returning to the step 6 to continue execution until all the largest coding units in the current frame are processed, and executing the step 10;
step 10: and taking a video frame to be encoded of the next frame in the panoramic video in the ERP projection format as a current frame, and returning to the step 2 to continue execution until all video frames in the panoramic video in the ERP projection format are encoded.
In step 3, G_1 is obtained as follows: the spatial JND threshold of each pixel in the current frame is calculated with a spatial just-noticeable-distortion model to obtain G_1.
In step 3, G_2 is obtained as follows: denote the pixel value of the pixel at coordinate (x, y) in G_2 as G_2(x,y), G_2(x,y) = α × |G_h(x,y)| + β × |G_v(x,y)| + γ × |G_t(x,y)|; where 0 ≤ x ≤ W−1, 0 ≤ y ≤ H−1, G_2(x,y) also denotes the weighted gradient value of the pixel at (x, y) in the current frame, the subscripts h, v and t denote the horizontal, vertical and temporal directions, G_h(x,y), G_v(x,y) and G_t(x,y) denote the horizontal, vertical and temporal gradient values of the pixel at (x, y) in the current frame, all calculated with a 3D-Sobel operator, α denotes the gradient adjustment factor in the horizontal direction, β the gradient adjustment factor in the vertical direction and γ the gradient adjustment factor in the temporal direction, with α + β + γ = 1.
Compared with the prior art, the invention has the advantages that:
the method fully considers the perception characteristics of a human eye visual system and the characteristics of panoramic video, utilizes an airspace JND threshold (visual perception information) as an airspace perception factor, obtains a motion perception factor through a weighted gradient value (visual perception information), further calculates the average value of space-time weighted perception factors of all pixel points in a maximum coding unit, calculates a Lagrange coefficient adjustment factor of the maximum coding unit based on the space-time weighted perception factors according to a rate distortion optimization theory, and further obtains the quantization parameter variation quantity of the maximum coding unit based on the space-time weighted perception factors; meanwhile, the method takes the dimension weight characteristics of the panoramic video in the ERP projection format into consideration, and calculates the quantization parameter variation of the maximum coding unit based on the dimension weight; and calculating a new coding quantization parameter of the maximum coding unit according to the two quantization parameter variation amounts, and applying the new coding quantization parameter to coding. The method can adaptively adjust the coding quantization parameter aiming at the time-space domain and panoramic latitude characteristics of a specific maximum coding unit, and experimental tests show that the method can effectively reduce the coding rate while guaranteeing the coding quality, thereby effectively reducing the coding complexity, remarkably improving the rate distortion performance, and having better coding effect especially aiming at the condition of smaller initial coding quantization parameter.
Drawings
Fig. 1 is a block diagram of a general implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and embodiments.
The invention provides a video low-complexity coding method based on panoramic visual perception characteristics, which is generally implemented as shown in a block diagram in fig. 1 and comprises the following steps:
step 1: defining a current video frame to be coded in the panoramic video in the ERP (Equirectangular Projection) projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height is H.
Step 2: judge whether the current frame is the 1st video frame; if so, encode it with the original algorithm of an HEVC video encoder and then execute step 10; otherwise, execute step 3.
Step 3: perform spatial JND (Just Noticeable Distortion) threshold calculation on each pixel in the current frame to obtain the panoramic spatial JND threshold map of the current frame, denoted G_1, where the pixel value of each pixel in G_1 is the spatial JND threshold of the corresponding pixel in the current frame; and perform weighted gradient calculation on each pixel in the current frame to obtain the weighted gradient map of the current frame, denoted G_2, where the pixel value of each pixel in G_2 is the weighted gradient value of the corresponding pixel in the current frame. A larger spatial JND threshold represents larger just noticeable distortion, i.e. stronger spatial masking in the corresponding region; conversely, a smaller spatial JND threshold means weaker spatial masking.
In the present embodiment, G_1 is obtained as follows: the spatial JND threshold of each pixel in the current frame is calculated with an existing classical spatial just-noticeable-distortion model to obtain G_1.
In the present embodiment, G_2 is obtained as follows: denote the pixel value of the pixel at coordinate (x, y) in G_2 as G_2(x,y), G_2(x,y) = α × |G_h(x,y)| + β × |G_v(x,y)| + γ × |G_t(x,y)|; where 0 ≤ x ≤ W−1, 0 ≤ y ≤ H−1, G_2(x,y) also denotes the weighted gradient value of the pixel at (x, y) in the current frame, the subscripts h, v and t denote the horizontal, vertical and temporal directions, G_h(x,y) denotes the horizontal gradient value of the pixel at (x, y) in the current frame, G_v(x,y) denotes the vertical gradient value of the pixel at (x, y) in the current frame, and G_t(x,y) denotes the temporal gradient value of the pixel at (x, y) in the current frame, i.e. the gradient along the temporal direction between the pixel at (x, y) in the current frame and the pixel at (x, y) in the previous video frame; G_h(x,y), G_v(x,y) and G_t(x,y) are calculated with the existing 3D-Sobel operator; α denotes the gradient adjustment factor in the horizontal direction, β the gradient adjustment factor in the vertical direction and γ the gradient adjustment factor in the temporal direction, with α + β + γ = 1; in this embodiment α is 0.25, β is 0.25 and γ is 0.5.
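For illustration, the weighted gradient computation of this embodiment can be sketched in a few lines of Python. This is a minimal sketch, not the patent's implementation: the weighted sum of absolute directional gradients is the reconstruction used above, and scipy's separable Sobel filter applied to a two-frame stack stands in for the 3D-Sobel operator, whose exact kernels the patent shows only as an image.

    import numpy as np
    from scipy.ndimage import sobel

    def weighted_gradient_map(cur, prev, alpha=0.25, beta=0.25, gamma=0.5):
        """Sketch of the weighted gradient map G_2 (assumed form).

        cur, prev: luma planes of shape (H, W) for the current and
        previous frame; axis 0 of the stack plays the temporal role.
        """
        vol = np.stack([prev, cur]).astype(np.float64)
        g_h = sobel(vol, axis=2)[1]  # horizontal gradient of the current frame
        g_v = sobel(vol, axis=1)[1]  # vertical gradient
        g_t = sobel(vol, axis=0)[1]  # temporal gradient (current vs. previous)
        # alpha + beta + gamma = 1, as required by the text above
        return alpha * np.abs(g_h) + beta * np.abs(g_v) + gamma * np.abs(g_t)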
Step 4: calculate the spatial perception factor of each pixel in the current frame, denoting the spatial perception factor of the pixel at coordinate (x, y) in the current frame as δ_A(x,y), δ_A(x,y) = G_1(x,y); calculate the motion perception factor of each pixel in the current frame, denoting the motion perception factor of the pixel at (x, y) as δ_T(x,y), which is computed from G_2(x,y), S_F and ε (the formula is given as an image in the original); then calculate the space-time weighted perception factor of each pixel in the current frame, denoting that of the pixel at (x, y) as δ(x,y), δ(x,y) = δ_A(x,y) × δ_T(x,y); calculate the mean of the space-time weighted perception factors of all pixels in the current frame, denoted S_δ, S_δ = (1/(W×H)) Σ_{x=0}^{W−1} Σ_{y=0}^{H−1} δ(x,y); calculate the latitude weight of each pixel in the current frame, denoting the latitude weight of the pixel at (x, y) as w_ERP(x,y), w_ERP(x,y) = cos((y + 0.5 − H/2) × π/H); where 0 ≤ x ≤ W−1, 0 ≤ y ≤ H−1, G_1(x,y) denotes the pixel value of the pixel at (x, y) in G_1 and also the spatial JND threshold of the pixel at (x, y) in the current frame, G_2(x,y) denotes the pixel value of the pixel at (x, y) in G_2 and also the weighted gradient value of the pixel at (x, y) in the current frame, S_F denotes the mean of the pixel values of all pixels in G_2 and also the mean weighted gradient value of all pixels in the current frame, S_F = (1/(W×H)) Σ_{x=0}^{W−1} Σ_{y=0}^{H−1} G_2(x,y); ε is the motion perception constant, ε ∈ [1,2], and in this embodiment ε is 1; cos() is the cosine function and π = 3.14….
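A minimal Python sketch of these per-pixel factors follows. Since the patent gives the δ_T formula only as an image, the normalized power law (G_2(x,y)/S_F)^ε is assumed here purely for illustration; δ_A, δ and S_δ follow the definitions above.

    import numpy as np

    def spatiotemporal_perception(g1, g2, eps=1.0):
        """Sketch of step 4: returns the map delta(x,y) and its mean S_delta.

        g1: spatial JND threshold map G_1 (delta_A equals G_1 itself).
        g2: weighted gradient map G_2.
        """
        s_f = g2.mean()                            # frame mean S_F of G_2
        delta_t = (g2 / max(s_f, 1e-12)) ** eps    # assumed motion factor
        delta = g1 * delta_t                       # delta = delta_A * delta_T
        return delta, delta.mean()                 # map and frame mean S_delta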
In this embodiment, because each latitude is sampled with a different pixel density, different latitudes of the projected plane carry different pixel redundancy, most pronounced near the two poles. When a sphere is projected to the ERP format, the sphere center is usually taken as the base point: the longitude θ of the ERP format corresponds to the longitude of the sphere and the latitude φ of the ERP format corresponds to the latitude of the sphere, with θ ∈ [−π, π] and φ ∈ [−π/2, π/2]. Considering this panoramic latitude characteristic, the latitude weight parameter w_ERP(x,y) of the ERP projection format is introduced.
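Under the cos((y + 0.5 − H/2) × π/H) reconstruction used above, the latitude weight depends only on the row index and can be precomputed once per resolution, as in this short sketch:

    import numpy as np

    def erp_latitude_weight_map(height, width):
        """Latitude weight w_ERP for every pixel of a W x H ERP frame.

        Row y maps to latitude phi = (y + 0.5 - H/2) * pi / H; the weight
        cos(phi) is 1 at the equator and approaches 0 at the poles,
        mirroring the pixel redundancy of the ERP projection.
        """
        y = np.arange(height)
        w_row = np.cos((y + 0.5 - height / 2.0) * np.pi / height)
        return np.tile(w_row[:, None], (1, width))  # constant along each row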
Step 5: the current largest coding unit (Largest Coding Unit, LCU) to be processed in the current frame is defined as the current largest coding unit.
Step 6: calculate the mean of the space-time weighted perception factors of all pixels in the current largest coding unit, denoted S_δ_LCU, S_δ_LCU = (1/(64×64)) Σ_{i=0}^{63} Σ_{j=0}^{63} δ_LCU(i,j); then calculate the Lagrange-coefficient adjustment factor of the current largest coding unit based on the space-time weighted perception factor, denoted ψ_LCU, which is computed from S_δ_LCU and the adjustment parameters K_LCU and B_LCU (the formula is given as an image in the original); then calculate the quantization parameter variation of the current largest coding unit based on the space-time weighted perception factor, denoted ΔQP_1, ΔQP_1 = 3 log_2(ψ_LCU); where 0 ≤ i ≤ 63, 0 ≤ j ≤ 63, δ_LCU(i,j) denotes the space-time weighted perception factor of the pixel at intra-block coordinate (i, j) in the current largest coding unit, K_LCU and B_LCU are adjustment parameters, K_LCU ∈ (0,1), B_LCU ∈ (0,1); in this embodiment, after numerous experiments, K_LCU and B_LCU are both finally set to 0.5.
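A sketch of the ΔQP_1 computation. Only ΔQP_1 = 3 log_2(ψ_LCU) is stated explicitly; because ψ_LCU itself is given only as an image, an affine function of the LCU mean normalized by the frame mean S_δ is assumed here for illustration.

    import math

    def delta_qp1(s_delta_lcu, s_delta, k_lcu=0.5, b_lcu=0.5):
        """Sketch of step 6 with an assumed form of psi_LCU."""
        # Assumed: psi_LCU = K_LCU * (S_delta_LCU / S_delta) + B_LCU
        psi_lcu = k_lcu * (s_delta_lcu / max(s_delta, 1e-12)) + b_lcu
        return 3.0 * math.log2(psi_lcu)  # Delta QP_1 = 3 * log2(psi_LCU)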
Step 7: calculate the mean of the latitude weights of all pixels in the current largest coding unit, denoted S_wERP_LCU, S_wERP_LCU = (1/(64×64)) Σ_{i=0}^{63} Σ_{j=0}^{63} w_ERP_LCU(i,j); then calculate the quantization parameter variation of the current largest coding unit based on the latitude weight, denoted ΔQP_2, which is computed from S_wERP_LCU and the adjustment parameters a and b (the formula is given as an image in the original); where w_ERP_LCU(i,j) denotes the latitude weight of the pixel at intra-block coordinate (i, j) in the current largest coding unit, a and b are adjustment parameters, a ∈ (0,1), b < a; in this embodiment, after numerous experiments, a is finally set to 0.85 and b to 0.3.
Step 8: calculate the new coding quantization parameter of the current largest coding unit, denoted QP_new, QP_new = QP_org + ⌊ΔQP_1 + ΔQP_2⌋; then update the coding quantization parameter of the current largest coding unit with QP_new and encode the current largest coding unit with the HEVC video encoder; where QP_org denotes the original coding quantization parameter of the current largest coding unit, which can be read from the encoder's initialization parameter list, and ⌊ ⌋ is the floor (round-down) operator.
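A sketch of steps 7 and 8 combined. The ΔQP_2 formula is given only as an image; a linear term that coarsens quantization toward the poles, a × (1 − S_wERP_LCU) − b, is assumed here for illustration, and the final clip to HEVC's valid QP range [0, 51] is an addition not stated in the patent.

    import math

    def new_lcu_qp(qp_org, dqp1, s_werp_lcu, a=0.85, b=0.3):
        """Sketch of steps 7-8: combine both variations into QP_new."""
        dqp2 = a * (1.0 - s_werp_lcu) - b          # assumed latitude term
        qp_new = qp_org + math.floor(dqp1 + dqp2)  # QP_org + floor(dQP1 + dQP2)
        return max(0, min(51, qp_new))             # clip to HEVC's QP range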
Step 9: and taking the next largest coding unit to be processed in the current frame as the current largest coding unit, returning to the step 6, continuing to execute until all the largest coding units in the current frame are processed, and executing the step 10.
Step 10: and taking a video frame to be encoded of the next frame in the panoramic video in the ERP projection format as a current frame, and returning to the step 2 to continue execution until all video frames in the panoramic video in the ERP projection format are encoded.
To further illustrate the performance of the inventive method, experimental tests were carried out.
The HEVC standard reference software HM16.14 was selected as the experimental test platform. The hardware was an Intel(R) Core(TM) i7-10700 CPU at 2.9 GHz with 32 GB of memory, running a 64-bit Windows 10 operating system; VS2013 was chosen as the development tool. Four panoramic video sequences were selected as standard test sequences: the two 4K sequences "AerialCity" and "DrivingInCity" and the two 6K sequences "BranCastle2" and "Landing2". For each standard test sequence, 100 frames were tested in intra coding mode, with SearchRange set to 64 and MaxPartitionDepth set to 4; the initial coding quantization parameter QP (i.e. the original coding quantization parameter QP_org) was set to 22, 27, 32 and 37 respectively.
Table 1 lists the relevant parameter information of the four panoramic video sequences "AerialCity", "DrivingInCity", "BranCastle2" and "Landing2".
Table 1 related parameter information for panoramic video sequences
Panoramic video sequence Video resolution
AerialCity 3840×1920
DrivingInCity 3840×1920
BranCastle2 6144×3072
Landing2 6144×3072
Table 2 shows the coding bit-rate saving obtained when coding the panoramic video sequences listed in Table 1 with the method of the present invention, compared with coding with the original HM16.14 platform method. The bit-rate saving rate of the inventive method relative to the original HM16.14 platform method is defined as ΔR_PRO, ΔR_PRO = (R_ORG − R_PRO)/R_ORG × 100(%), where R_PRO denotes the bit rate of coding with the inventive method and R_ORG denotes the bit rate of coding with the original HM16.14 platform method.
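As a worked example with hypothetical numbers (not measurements from the patent): if R_ORG = 1000 kbit/s and R_PRO = 871 kbit/s, then ΔR_PRO = (1000 − 871)/1000 × 100(%) = 12.9%, i.e. a bit-rate saving of 12.9%, which matches the average saving reported below.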
Table 2 Coding bit-rate saving of the inventive method compared with the original HM16.14 platform method
(table given as an image in the original)
As can be seen from Table 2, coding with the method of the present invention saves 12.9% of the coding bit rate on average. For the four panoramic video sequences with different scenes and different motion conditions, the method effectively reduces the coding bit rate, and the coding effect is particularly good when the initial coding quantization parameter QP (i.e. the original coding quantization parameter QP_org) is small.
Table 3 lists the rate-distortion performance of coding the panoramic video sequences listed in Table 1 with the method of the present invention. The quality of the coded video is evaluated with a classical subjective quality evaluation method: the Mean Opinion Score (MOS) is used as the quality evaluation index, and the rate-distortion performance index BDBR_MOS of each panoramic video sequence under the MOS evaluation is calculated to comprehensively evaluate the performance of the method.
Table 3 Rate-distortion performance of coding with the method of the present invention
(table given as an image in the original)
As can be seen from Table 3, under the MOS quality evaluation index the method of the present invention achieves an average BDBR_MOS of about −7.4%, i.e. at the same subjective quality the average coding bit-rate saving is about 7.4%. This shows that, compared with the original HM16.14 platform method, the inventive method saves more coding bit rate at the same subjective perceptual quality. Table 3 also shows that, for panoramic video sequences with different scenes and different motion conditions, the method effectively saves coding bit rate and markedly improves rate-distortion performance.

Claims (3)

1. A video low-complexity coding method based on panoramic visual perception characteristics, characterized by comprising the following steps:
step 1: defining a current video frame to be coded in the panoramic video in the ERP projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height is H;
step 2: judge whether the current frame is the 1st video frame; if so, encode it with the original algorithm of an HEVC video encoder and then execute step 10; otherwise, execute step 3;
step 3: perform spatial JND threshold calculation on each pixel in the current frame to obtain the panoramic spatial JND threshold map of the current frame, denoted G_1, where the pixel value of each pixel in G_1 is the spatial JND threshold of the corresponding pixel in the current frame; and perform weighted gradient calculation on each pixel in the current frame to obtain the weighted gradient map of the current frame, denoted G_2, where the pixel value of each pixel in G_2 is the weighted gradient value of the corresponding pixel in the current frame;
step 4: calculate the spatial perception factor of each pixel in the current frame, denoting the spatial perception factor of the pixel at coordinate (x, y) in the current frame as δ_A(x,y), δ_A(x,y) = G_1(x,y); calculate the motion perception factor of each pixel in the current frame, denoting the motion perception factor of the pixel at (x, y) as δ_T(x,y), which is computed from G_2(x,y), S_F and ε (the formula is given as an image in the original); then calculate the space-time weighted perception factor of each pixel in the current frame, denoting that of the pixel at (x, y) as δ(x,y), δ(x,y) = δ_A(x,y) × δ_T(x,y); calculate the mean of the space-time weighted perception factors of all pixels in the current frame, denoted S_δ; calculate the latitude weight of each pixel in the current frame, denoting the latitude weight of the pixel at (x, y) as w_ERP(x,y), w_ERP(x,y) = cos((y + 0.5 − H/2) × π/H); where 0 ≤ x ≤ W−1, 0 ≤ y ≤ H−1, G_1(x,y) denotes the pixel value of the pixel at (x, y) in G_1 and also the spatial JND threshold of the pixel at (x, y) in the current frame, G_2(x,y) denotes the pixel value of the pixel at (x, y) in G_2 and also the weighted gradient value of the pixel at (x, y) in the current frame, S_F denotes the mean of the pixel values of all pixels in G_2 and also the mean weighted gradient value of all pixels in the current frame, ε is the motion perception constant, ε ∈ [1,2], and cos() is the cosine function;
step 5: define the largest coding unit to be processed in the current frame as the current largest coding unit;
step 6: calculate the mean of the space-time weighted perception factors of all pixels in the current largest coding unit, denoted S_δ_LCU; then calculate the Lagrange-coefficient adjustment factor of the current largest coding unit based on the space-time weighted perception factor, denoted ψ_LCU, which is computed from S_δ_LCU and the adjustment parameters K_LCU and B_LCU (the formula is given as an image in the original); then calculate the quantization parameter variation of the current largest coding unit based on the space-time weighted perception factor, denoted ΔQP_1, ΔQP_1 = 3 log_2(ψ_LCU); where K_LCU and B_LCU are adjustment parameters, K_LCU ∈ (0,1), B_LCU ∈ (0,1);
step 7: calculate the mean of the latitude weights of all pixels in the current largest coding unit, denoted S_wERP_LCU; then calculate the quantization parameter variation of the current largest coding unit based on the latitude weight, denoted ΔQP_2, which is computed from S_wERP_LCU and the adjustment parameters a and b (the formula is given as an image in the original); where a and b are adjustment parameters, a ∈ (0,1), b < a;
step 8: calculate the new coding quantization parameter of the current largest coding unit, denoted QP_new, QP_new = QP_org + ⌊ΔQP_1 + ΔQP_2⌋; then update the coding quantization parameter of the current largest coding unit with QP_new and encode the current largest coding unit; where QP_org denotes the original coding quantization parameter of the current largest coding unit and ⌊ ⌋ is the floor (round-down) operator;
step 9: taking the next largest coding unit to be processed in the current frame as the current largest coding unit, returning to the step 6 to continue execution until all the largest coding units in the current frame are processed, and executing the step 10;
step 10: and taking a video frame to be encoded of the next frame in the panoramic video in the ERP projection format as a current frame, and returning to the step 2 to continue execution until all video frames in the panoramic video in the ERP projection format are encoded.
2. The video low-complexity coding method based on panoramic visual perception characteristics according to claim 1, characterized in that in step 3, G_1 is obtained as follows: the spatial JND threshold of each pixel in the current frame is calculated with a spatial just-noticeable-distortion model to obtain G_1.
3. The video low-complexity coding method based on panoramic visual perception characteristics according to claim 1 or 2, characterized in that in step 3, G_2 is obtained as follows: denote the pixel value of the pixel at coordinate (x, y) in G_2 as G_2(x,y), G_2(x,y) = α × |G_h(x,y)| + β × |G_v(x,y)| + γ × |G_t(x,y)|; where 0 ≤ x ≤ W−1, 0 ≤ y ≤ H−1, G_2(x,y) also denotes the weighted gradient value of the pixel at (x, y) in the current frame, the subscripts h, v and t denote the horizontal, vertical and temporal directions, G_h(x,y), G_v(x,y) and G_t(x,y) denote the horizontal, vertical and temporal gradient values of the pixel at (x, y) in the current frame, all calculated with a 3D-Sobel operator, α denotes the gradient adjustment factor in the horizontal direction, β the gradient adjustment factor in the vertical direction and γ the gradient adjustment factor in the temporal direction, with α + β + γ = 1.
CN202210157533.5A 2022-02-21 2022-02-21 Video low-complexity coding method based on panoramic visual perception characteristics Active CN114567776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210157533.5A CN114567776B (en) 2022-02-21 2022-02-21 Video low-complexity coding method based on panoramic visual perception characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210157533.5A CN114567776B (en) 2022-02-21 2022-02-21 Video low-complexity coding method based on panoramic visual perception characteristics

Publications (2)

Publication Number Publication Date
CN114567776A CN114567776A (en) 2022-05-31
CN114567776B 2023-05-05

Family

ID=81714022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210157533.5A Active CN114567776B (en) 2022-02-21 2022-02-21 Video low-complexity coding method based on panoramic visual perception characteristics

Country Status (1)

Country Link
CN (1) CN114567776B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116723330B * 2023-03-28 2024-02-23 成都师范学院 (Chengdu Normal University) Panoramic video coding method adaptive to spherical-domain distortion propagation chain length

Citations (4)

Publication number Priority date Publication date Assignee Title
US6366705B1 (en) * 1999-01-28 2002-04-02 Lucent Technologies Inc. Perceptual preprocessing techniques to reduce complexity of video coders
CN103096079A (en) * 2013-01-08 2013-05-08 宁波大学 Multi-view video rate control method based on exactly perceptible distortion
CN104954778A (en) * 2015-06-04 2015-09-30 宁波大学 Objective stereo image quality assessment method based on perception feature set
CN107147912A (en) * 2017-05-04 2017-09-08 浙江大华技术股份有限公司 A kind of method for video coding and device

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US20100086063A1 (en) * 2008-10-02 2010-04-08 Apple Inc. Quality metrics for coded video using just noticeable difference models
US9237343B2 (en) * 2012-12-13 2016-01-12 Mitsubishi Electric Research Laboratories, Inc. Perceptually coding images and videos

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US6366705B1 (en) * 1999-01-28 2002-04-02 Lucent Technologies Inc. Perceptual preprocessing techniques to reduce complexity of video coders
CN103096079A (en) * 2013-01-08 2013-05-08 宁波大学 Multi-view video rate control method based on exactly perceptible distortion
CN104954778A (en) * 2015-06-04 2015-09-30 宁波大学 Objective stereo image quality assessment method based on perception feature set
CN107147912A (en) * 2017-05-04 2017-09-08 浙江大华技术股份有限公司 A kind of method for video coding and device

Non-Patent Citations (2)

Title
Yafen Xing et al. Spatiotemporal just noticeable difference modeling with heterogeneous temporal visual features. Displays, 2021 (full text). *
杜宝祯 (Du Baozhen). Fast stereoscopic video coding algorithm based on perceptual thresholds. 信息与电脑(理论版) (Information & Computer (Theory Edition)), 2020 (full text). *

Also Published As

Publication number Publication date
CN114567776A (en) 2022-05-31

Similar Documents

Publication Publication Date Title
CN111988611B (en) Quantization offset information determining method, image encoding device and electronic equipment
CN110062234B (en) Perceptual video coding method based on just noticeable distortion of region
CN108063944B (en) Perception code rate control method based on visual saliency
CN104219525B (en) Perception method for video coding based on conspicuousness and minimum discernable distortion
CN111193931B (en) Video data coding processing method and computer storage medium
KR20170031202A (en) Adaptive inverse-quantization method and apparatus in video coding
CN103313047B (en) A kind of method for video coding and device
CN104378636B (en) A kind of video encoding method and device
CN114567776B (en) Video low-complexity coding method based on panoramic visual perception characteristics
CN112825557B (en) Self-adaptive sensing time-space domain quantization method aiming at video coding
CN103313002B (en) Situation-based mobile streaming media energy-saving optimization method
CN111131831A (en) Data transmission method and device
WO2017004889A1 (en) Jnd factor-based super-pixel gaussian filter pre-processing method
CN108521572B (en) Residual filtering method based on pixel domain JND model
CN112584153B (en) Video compression method and device based on just noticeable distortion model
CN116760988B (en) Video coding method and device based on human visual system
CN105049853A (en) SAO coding method and system based on fragment source analysis
CN112637596A (en) Code rate control system
CN111464805B (en) Three-dimensional panoramic video rapid coding method based on panoramic saliency
CN112738518B (en) Code rate control method for CTU (China train unit) level video coding based on perception
CN103517067B (en) Initial quantitative parameter self-adaptive adjustment method and system
CN107948643B (en) Method for reducing block effect of JPEG image
CN112929663A (en) Knowledge distillation-based image compression quality enhancement method
CN110944199A (en) Screen content video code rate control method based on space-time perception characteristics
CN112822490B (en) Coding method for fast decision of intra-frame coding unit size based on perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240118

Address after: Room 166, Building 1, No. 8 Xingye Avenue, Ningbo Free Trade Zone, Zhejiang Province, 315800

Patentee after: Zhejiang Chuanzhi Electronic Technology Co.,Ltd.

Address before: 315800 no.388, Lushan East Road, Ningbo Economic and Technological Development Zone, Zhejiang Province

Patentee before: Ningbo Polytechnic