CN114567776A - Video low-complexity coding method based on panoramic visual perception characteristic - Google Patents

Video low-complexity coding method based on panoramic visual perception characteristic Download PDF

Info

Publication number
CN114567776A
CN114567776A CN202210157533.5A CN202210157533A CN114567776A CN 114567776 A CN114567776 A CN 114567776A CN 202210157533 A CN202210157533 A CN 202210157533A CN 114567776 A CN114567776 A CN 114567776A
Authority
CN
China
Prior art keywords
current frame
pixel point
coding unit
pixel
maximum coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210157533.5A
Other languages
Chinese (zh)
Other versions
CN114567776B (en
Inventor
杜宝祯
张奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Chuanzhi Electronic Technology Co ltd
Original Assignee
Ningbo Polytechnic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Polytechnic filed Critical Ningbo Polytechnic
Priority to CN202210157533.5A priority Critical patent/CN114567776B/en
Publication of CN114567776A publication Critical patent/CN114567776A/en
Application granted granted Critical
Publication of CN114567776B publication Critical patent/CN114567776B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/14Coding unit complexity, e.g. amount of activity or edge presence estimation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a video low-complexity coding method based on panoramic visual perception characteristics, which utilizes a spatial domain JND threshold value as a spatial domain perception factor, obtains a motion perception factor through a weighted gradient value, further obtains an average value of space-time weighted perception factors of all pixel points in a maximum coding unit, calculates a Lagrange coefficient adjusting factor based on the space-time weighted perception factor of the maximum coding unit according to a rate distortion optimization theory, and further obtains a quantization parameter variable quantity based on the space-time weighted perception factor of the maximum coding unit; simultaneously calculating the quantization parameter variation quantity of the maximum coding unit based on the dimension weight; calculating a new coding quantization parameter of the maximum coding unit according to the two quantization parameter variable quantities, and applying the new coding quantization parameter to coding; the method has the advantages that the coding quality can be guaranteed, the coding rate can be effectively reduced, the coding complexity can be effectively reduced, the rate distortion performance is obviously improved, and particularly when the initial coding quantization parameter is small, the coding effect is better.

Description

Video low-complexity coding method based on panoramic visual perception characteristic
Technical Field
The invention relates to a video coding technology, in particular to a video low-complexity coding method based on panoramic visual perception characteristics.
Background
In recent years, panoramic video systems are widely popular with people for their "immersive" visual experience, and have great application prospects in the fields of virtual reality, simulated driving, and the like. However, the current panoramic video system still has the problem of too high coding complexity in terms of coding, which brings great challenges to the application of the panoramic video system. Therefore, how to reduce the encoding complexity has become an urgent technical problem to be solved in the field.
The existing low-complexity coding algorithm of the panoramic video does not fully consider the perception characteristic of a Human Visual System (HVS) and the characteristics of the panoramic video, and the optimal coding performance is difficult to achieve. The main purpose of video coding is to reduce the code rate of coding as much as possible on the premise of ensuring certain video quality; or under the condition that the code rate of the coding is limited, the coding is carried out by adopting a mode with the minimum distortion. Therefore, how to combine and utilize the perception characteristics of the human visual system and the characteristics of the panoramic video to guide the selection of the coding parameters becomes an important breakthrough direction for researching and reducing the coding complexity in the field.
Disclosure of Invention
The invention aims to solve the technical problem of providing a video low-complexity coding method based on panoramic visual perception characteristics, which can effectively save coding code rate and further effectively reduce coding complexity.
The technical scheme adopted by the invention for solving the technical problems is as follows: a video low complexity coding method based on panoramic visual perception characteristics is characterized by comprising the following steps:
step 1: defining a video frame to be coded currently in the panoramic video in the ERP projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height of the video frame in the panoramic video in the ERP projection format is H;
and 2, step: judging whether the current frame is a 1 st frame video frame, if so, encoding the current frame by adopting an original algorithm of an HEVC video encoder, and then executing the step 10; otherwise, executing step 3;
and step 3: performing spatial JND threshold calculation on each pixel point in the current frame to obtain a panoramic spatial JND threshold map of the current frame, and recording the map as G1,G1The pixel value of each pixel point in the current frame is the spatial JND threshold value of the corresponding pixel point in the current frame; and performing weighted gradient calculation on each pixel point in the current frame to obtain a weighted gradient image of the current frame, which is marked as G2,G2The pixel value of each pixel point in the current frame is the weighted gradient value of the corresponding pixel point in the current frame;
and 4, step 4: calculating the space domain perception factor of each pixel point in the current frame, and recording the space domain perception factor of the pixel point with the coordinate position of (x, y) in the current frame as deltaA(x,y),δA(x,y)=G1(x, y); calculating the motion perception factor of each pixel point in the current frame, and recording the motion perception factor of the pixel point with the coordinate position of (x, y) in the current frame as deltaT(x,y),
Figure BDA0003512810380000021
Then calculating the space-time weighted perception factor of each pixel point in the current frame, and recording the space-time weighted perception factor of the pixel point with the coordinate position of (x, y) in the current frame as delta (x, y), wherein the delta (x, y) is deltaA(x,y)×δT(x, y); then calculating the average value of the space-time weighted perception factors of all the pixel points in the current frame, and recording the average value as Sδ(ii) a Calculating the dimension weight of each pixel point in the current frame, and recording the dimension weight of the pixel point with the coordinate position of (x, y) in the current frame as wERP(x,y),
Figure BDA0003512810380000022
Wherein x is more than or equal to 0 and less than or equal to W-1, y is more than or equal to 0 and less than or equal to H-1, G1(x, y) denotes G1The pixel value G of the pixel point with the middle coordinate position (x, y)1(x, y) also represents the spatial JND threshold, G, of the pixel with coordinate position (x, y) in the current frame2(x, y) represents G2The pixel value G of the pixel point with the middle coordinate position (x, y)2(x, y) also represents the weighted gradient value of the pixel with coordinate position (x, y) in the current frame, SFRepresents G2All images inAverage value of pixel values of pixel points, SFAlso represents the average value of the weighted gradient values of all the pixels in the current frame, wherein epsilon is a motion perception constant and epsilon belongs to [1,2 ]]Cos () is a cosine function;
and 5: defining a maximum coding unit to be processed currently in a current frame as a current maximum coding unit;
step 6: calculating the average value of the space-time weighted perception factors of all the pixel points in the current maximum coding unit, and recording the average value as Sδ_LCU(ii) a Then, calculating Lagrange coefficient adjustment factors based on space-time weighting perception factors of the current maximum coding unit, and marking the Lagrange coefficient adjustment factors as psiLCU
Figure BDA0003512810380000031
Then, the quantization parameter variation quantity based on the space-time weighting perception factor of the current maximum coding unit is calculated and recorded as delta QP1,ΔQP1=3log2LCU) (ii) a Wherein, KLCUAnd BLCUAre all adjustment parameters, KLCU∈(0,1),BLCU∈(0,1);
And 7: calculating the average value of the dimensionality weights of all the pixel points in the current maximum coding unit, and recording the average value as SwERP_LCU(ii) a Then, the quantization parameter variation quantity based on the dimension weight of the current maximum coding unit is calculated and recorded as delta QP2
Figure BDA0003512810380000032
Wherein a and b are both adjusting parameters, a belongs to (0,1), b belongs to (0,1), and b is less than a;
and 8: calculating new coding quantization parameter, denoted as QP, of the current maximum coding unitnew
Figure BDA0003512810380000033
Then using QPnewUpdating the coding quantization parameter of the current maximum coding unit; then coding the current maximum coding unit; wherein, QPorgOriginal coded quantization parameter, symbol representing current maximum coding unit
Figure BDA0003512810380000034
Is a rounded-down operation sign;
and step 9: taking the next maximum coding unit to be processed in the current frame as the current maximum coding unit, then returning to the step 6 to continue executing until all the maximum coding units in the current frame are processed, and then executing the step 10;
step 10: and (3) taking the next frame of video frame to be coded in the panoramic video in the ERP projection format as the current frame, and then returning to the step (2) to continue executing until all the video frames in the panoramic video in the ERP projection format are coded.
In said step 3, G1The acquisition mode is as follows: performing spatial JND threshold calculation on each pixel point in the current frame by adopting a spatial domain just noticeable distortion model to obtain G1
In said step 3, G2The acquisition process comprises the following steps: g is to be2The pixel value of the pixel point with the middle coordinate position (x, y) is marked as G2(x,y),
Figure BDA0003512810380000035
Wherein x is more than or equal to 0 and less than or equal to W-1, y is more than or equal to 0 and less than or equal to H-1, G2(x, y) also represents the weighted gradient value of the pixel with coordinate position (x, y) in the current frame,
Figure BDA0003512810380000036
which is indicative of the horizontal direction,
Figure BDA0003512810380000037
which is indicative of the vertical direction of the,
Figure BDA0003512810380000038
which represents the direction of the time domain,
Figure BDA0003512810380000039
representing the horizontal direction gradient value of the pixel point with the coordinate position of (x, y) in the current frame,
Figure BDA0003512810380000041
indicating sitting in the current frameThe vertical gradient value of the pixel point with the index position of (x, y),
Figure BDA0003512810380000042
representing the time domain direction gradient value of the pixel point with the coordinate position of (x, y) in the current frame,
Figure BDA0003512810380000043
and
Figure BDA0003512810380000044
and calculating by using a 3D-sobel operator, wherein alpha represents a gradient adjustment factor in the horizontal direction, beta represents a gradient adjustment factor in the vertical direction, gamma represents a gradient adjustment factor in the time domain direction, and alpha + beta + gamma is equal to 1.
Compared with the prior art, the invention has the advantages that:
the method fully considers the perception characteristics of a human eye visual system and the characteristics of panoramic video, utilizes a spatial domain JND threshold value (visual perception information) as a spatial domain perception factor, obtains a motion perception factor through a weighted gradient value (visual perception information), further obtains an average value of space-time weighted perception factors of all pixel points in a maximum coding unit through calculation, calculates Lagrange coefficient regulating factors of the maximum coding unit based on the space-time weighted perception factors according to a rate distortion optimization theory, and further obtains quantization parameter variable quantity of the maximum coding unit based on the space-time weighted perception factors; meanwhile, the method takes the dimension weight characteristics of the panoramic video in the ERP projection format into consideration, and calculates the quantitative parameter variation of the maximum coding unit based on the dimension weight; and calculating a new coding quantization parameter of the maximum coding unit according to the two quantization parameter variable quantities, and applying the new coding quantization parameter to coding. The method can adaptively adjust the coding quantization parameters aiming at the time-space domain and the panoramic latitude characteristics of the specific maximum coding unit, and experimental tests show that the method can effectively reduce the coding rate while ensuring the coding quality, thereby effectively reducing the coding complexity, obviously improving the rate-distortion performance, and particularly aiming at the condition that the initial coding quantization parameter is smaller, the coding effect is better.
Drawings
Fig. 1 is a block diagram of a general implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying examples.
The overall implementation block diagram of the video low-complexity coding method based on the panoramic visual perception characteristic is shown in fig. 1, and the method comprises the following steps:
step 1: defining a video frame to be coded currently in a panoramic video in an ERP (equivalent projection) projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height of the video frame in the panoramic video in the ERP projection format is H.
Step 2: judging whether the current frame is a 1 st frame video frame, if so, encoding the current frame by adopting an original algorithm of an HEVC video encoder, and then executing the step 10; otherwise, step 3 is executed.
And step 3: performing spatial JND (Just Noticeable Distortion) threshold calculation on each pixel point in the current frame to obtain a panoramic spatial JND threshold map of the current frame, and recording the map as G1,G1The pixel value of each pixel point in the current frame is the spatial domain JND threshold value of the corresponding pixel point in the current frame; and performing weighted gradient calculation on each pixel point in the current frame to obtain a weighted gradient map of the current frame, and recording the weighted gradient map as G2,G2The pixel value of each pixel point in the current frame is the weighted gradient value of the corresponding pixel point in the current frame; the larger the spatial domain JND threshold value is, the larger the representation just noticeable distortion is, namely the stronger the spatial domain masking property of the corresponding region is; conversely, the smaller the spatial JND threshold, the weaker the spatial masking of the corresponding region.
In this embodiment, G1The acquisition mode is as follows: performing airspace JND threshold calculation on each pixel point in the current frame by adopting the conventional classical airspace just noticeable distortion model to obtain G1
In this embodiment, G2The acquisition process comprises the following steps: g is to be2The pixel value of the pixel point with the middle coordinate position (x, y) is marked as G2(x,y),
Figure BDA0003512810380000051
Wherein x is more than or equal to 0 and less than or equal to W-1, y is more than or equal to 0 and less than or equal to H-1, G2(x, y) also represents the weighted gradient value of the pixel with coordinate position (x, y) in the current frame,
Figure BDA0003512810380000052
which is indicative of the horizontal direction,
Figure BDA0003512810380000053
which is indicative of the vertical direction of the,
Figure BDA0003512810380000054
which represents the direction of the time domain,
Figure BDA0003512810380000055
representing the horizontal direction gradient value of the pixel point with the coordinate position of (x, y) in the current frame,
Figure BDA0003512810380000056
representing the vertical gradient value of the pixel point with the coordinate position of (x, y) in the current frame,
Figure BDA0003512810380000057
the time domain direction gradient value of the pixel point with the coordinate position (x, y) in the current frame is represented, namely the gradient value of the pixel point with the coordinate position (x, y) in the current frame and the pixel point with the coordinate position (x, y) in the previous frame of video frame along the time domain direction,
Figure BDA0003512810380000058
and
Figure BDA0003512810380000059
the method is calculated by an existing 3D-sobel operator, where α represents a gradient adjustment factor in a horizontal direction, β represents a gradient adjustment factor in a vertical direction, γ represents a gradient adjustment factor in a time domain direction, α + β + γ is 1, and in this embodiment, α is 0.25, β is 0.25, and γ is 0.5.
And 4, step 4: calculating the null of each pixel point in the current frameThe domain perception factor records the space domain perception factor of the pixel point with the coordinate position of (x, y) in the current frame as deltaA(x,y),δA(x,y)=G1(x, y); calculating the motion perception factor of each pixel point in the current frame, and recording the motion perception factor of the pixel point with the coordinate position of (x, y) in the current frame as deltaT(x,y),
Figure BDA0003512810380000061
Then calculating the space-time weighted perception factor of each pixel point in the current frame, and recording the space-time weighted perception factor of the pixel point with the coordinate position of (x, y) in the current frame as delta (x, y), wherein the delta (x, y) is deltaA(x,y)×δT(x, y); then calculating the average value of the space-time weighted perception factors of all the pixel points in the current frame, and recording the average value as Sδ
Figure BDA0003512810380000062
Calculating the dimension weight of each pixel point in the current frame, and recording the dimension weight of the pixel point with the coordinate position of (x, y) in the current frame as wERP(x,y),
Figure BDA0003512810380000063
Wherein x is more than or equal to 0 and less than or equal to W-1, y is more than or equal to 0 and less than or equal to H-1, G1(x, y) denotes G1The pixel value G of the pixel point with the middle coordinate position (x, y)1(x, y) also represents the spatial JND threshold, G, of the pixel with coordinate position (x, y) in the current frame2(x, y) denotes G2The pixel value G of the pixel point with the middle coordinate position (x, y)2(x, y) also represents the weighted gradient value of the pixel with coordinate position (x, y) in the current frame, SFRepresents G2Average value of pixel values of all pixel points in (S)FAlso represents the average value of the weighted gradient values of all the pixels in the current frame,
Figure BDA0003512810380000064
epsilon is a motion perception constant, epsilon is [1,2 ]]In this embodiment, epsilon is 1, cos () is a cosine function, and pi is 3.14 ….
In this embodiment, an ERP project gridBecause each latitude adopts pixel sampling of different degrees, different pixel redundancies exist in different dimensions in a plane, and extreme lifting redundancy of two poles is most obvious, after a sphere is projected to an ERP projection format, the center of the sphere is usually taken as a base point, the longitude theta of the ERP projection format corresponds to the longitude of the spherical surface of the sphere, and the latitude of the ERP projection format is
Figure BDA0003512810380000065
Corresponding to the latitude of the sphere, θ ∈ [ - π, π],
Figure BDA0003512810380000066
The characteristic of the panoramic latitude is considered, and a dimension weight parameter w of an ERP projection format is introducedERP(x,y)。
And 5: defining a Largest Coding Unit (LCU) to be processed currently in a current frame as a current Largest Coding Unit.
Step 6: calculating the average value of the space-time weighted perception factors of all the pixel points in the current maximum coding unit, and recording the average value as Sδ_LCU
Figure BDA0003512810380000071
Then, calculating Lagrange coefficient adjustment factors based on space-time weighting perception factors of the current maximum coding unit, and marking the Lagrange coefficient adjustment factors as psiLCU
Figure BDA0003512810380000072
Then, the quantization parameter variation based on the space-time weighting perception factor of the current maximum coding unit is calculated and recorded as delta QP1,ΔQP1=3log2LCU) (ii) a Wherein i is more than or equal to 0 and less than or equal to 63, j is more than or equal to 0 and less than or equal to 63, and deltaLCU(i, j) represents the space-time weighted perception factor of the pixel point with the coordinate position (i, j) in the block in the current maximum coding unit, KLCUAnd BLCUAre all adjustment parameters, KLCU∈(0,1),BLCUE (0,1), K is finally determined by a large number of experiments in this exampleLCUAnd BLCUAll values are 0.5.
And 7: calculating a current maximum coding unitThe average value of the dimension weights of all the pixel points is recorded as SwERP_LCU
Figure BDA0003512810380000073
Then, the quantization parameter variation quantity based on the dimension weight of the current maximum coding unit is calculated and recorded as delta QP2
Figure BDA0003512810380000074
Wherein, wERP_LCU(i, j) represents the dimension weight of a pixel point with the coordinate position (i, j) in the block in the current maximum coding unit, a and b are both adjustment parameters, a belongs to (0,1), b belongs to (0,1), and b is less than a, and in the embodiment, the value of a is finally determined to be 0.85 and the value of b is finally determined to be 0.3 through a large number of experiments.
And 8: calculating new coding quantization parameter, denoted as QP, of the current maximum coding unitnew
Figure BDA0003512810380000075
Then using QPnewUpdating the coding quantization parameter of the current maximum coding unit; coding the current maximum coding unit by adopting an HEVC (high efficiency video coding) encoder; wherein, QPorgOriginal coding quantization parameter, QP, representing the current maximum coding unitorgCan be read from the initialization parameter list of the encoder
Figure BDA0003512810380000076
Is a rounded-down operation sign.
And step 9: and taking the next maximum coding unit to be processed in the current frame as the current maximum coding unit, then returning to the step 6 to continue executing until all the maximum coding units in the current frame are processed, and then executing the step 10.
Step 10: and (3) taking the next frame of video frame to be coded in the panoramic video in the ERP projection format as the current frame, and then returning to the step (2) to continue executing until all the video frames in the panoramic video in the ERP projection format are coded.
To further illustrate the performance of the method of the present invention, the method of the present invention was tested.
The method is characterized in that HEVC video encoder standard reference software HM16.14 is selected as an experimental test platform, hardware configuration is conducted to an Intel (R) core (TM) i7-10700 CPU, a 64-bit WIN10 operating system with a main frequency of 2.9GHz and a memory of 32G, and a development tool selects VS 2013. Selecting 4 panoramic video sequences as standard test sequences, wherein the standard test sequences are as follows: two 4K sequences "Aeriological City", "DrivingCity" and two 6K sequences "BranCastle 2", "bonding 2". The number of test frames of each standard test sequence is 100 frames, an intra-frame coding mode is adopted, SearchRange is set to be 64, MaxPartitionDepth is set to be 4, and an initial coding quantization parameter QP (namely an original coding quantization parameter QP)org) Respectively 22, 27, 32, 37.
Table 1 lists the relevant parameter information of "audiocity", "drivingmincity", "BranCastle 2", "bonding 2" 4 panoramic video sequences.
TABLE 1 associated parameter information for panoramic video sequences
Panoramic video sequence Video resolution
AerialCity 3840×1920
DrivingInCity 3840×1920
BranCastle2 6144×3072
Landing2 6144×3072
Table 2 shows the coding rate savings associated with the inventive method for coding the panoramic video sequence shown in table 1, as compared to the HM16.14 original platform method. The code rate saving rate of the coding by adopting the method of the invention compared with the coding by adopting the HM16.14 original platform method is defined as delta RPRO,ΔRPRO=(RORG-RPRO)/RORGX 100 (%), wherein RPRORepresenting the code rate, R, of the code encoded by the method of the inventionORGIndicating the coding rate of coding using HM16.14 original platform method.
Table 2 comparison of code rate savings for coding using the method of the present invention compared to the HM16.14 original platform method
Figure BDA0003512810380000091
As can be seen from Table 2, the encoding rate can be averagely saved by 12.9% when the method of the present invention is used for encoding. Aiming at 4 panoramic video sequences with different scenes and different motion conditions, the coding rate can be effectively reduced by adopting the method for coding, and particularly aiming at an initial coding quantization parameter QP (namely an original coding quantization parameter QP)org) And in a smaller case, the coding effect is better.
Table 3 lists the rate-distortion performance of the coding of the panoramic video sequence listed in table 1 using the method of the present invention. Evaluating the quality of coded video by adopting a classical subjective quality evaluation method, wherein in the quality evaluation, a subjective quality evaluation method MOS (Mean Opinion Score) is adopted as a quality evaluation index, and rate distortion performance indexes BDBR of all panoramic video sequences under the subjective quality evaluation method MOS are respectively calculatedMOSTo evaluate the performance of the method of the invention comprehensively.
TABLE 3 Rate distortion Performance for encoding using the method of the present invention
Figure BDA0003512810380000101
As can be seen from Table 3, the process of the invention employs BDBRMOSThe rate distortion performance evaluation index represents that under the same subjective quality condition, the average value of the coding code rate is saved by about-7.4% under the quality evaluation index MOS. This shows that compared with the HM16.14 original platform method, the method of the present invention can save more coding rate under the same subjective perceptual quality. As can be seen from Table 3, the method of the present invention can effectively save coding rate and significantly improve rate distortion performance for different scenes and different motion conditions of a panoramic video sequence.

Claims (3)

1. A video low complexity coding method based on panoramic visual perception characteristics is characterized by comprising the following steps:
step 1: defining a video frame to be coded currently in the panoramic video in the ERP projection format as a current frame; the width of a video frame in the panoramic video in the ERP projection format is W, and the height of the video frame in the panoramic video in the ERP projection format is H;
step 2: judging whether the current frame is a 1 st frame video frame, if so, encoding the current frame by adopting an original algorithm of an HEVC video encoder, and then executing the step 10; otherwise, executing step 3;
and step 3: performing spatial JND threshold calculation on each pixel point in the current frame to obtain a panoramic spatial JND threshold map of the current frame, and recording the map as G1,G1The pixel value of each pixel point in the current frame is the spatial JND threshold value of the corresponding pixel point in the current frame; and performing weighted gradient calculation on each pixel point in the current frame to obtain a weighted gradient map of the current frame, and recording the weighted gradient map as G2,G2The pixel value of each pixel point in the current frame is the weighted gradient value of the corresponding pixel point in the current frame;
and 4, step 4: calculating the spatial domain perception factor of each pixel point in the current frame, and recording the spatial domain perception factor of the pixel point with the coordinate position of (x, y) in the current frame as deltaA(x,y),δA(x,y)=G1(x, y); and calculating the operation of each pixel point in the current frameDynamic perception factor, recording the motion perception factor of the pixel point with coordinate position (x, y) in the current frame as deltaT(x,y),
Figure FDA0003512810370000011
Then calculating the space-time weighted perception factor of each pixel point in the current frame, and recording the space-time weighted perception factor of the pixel point with the coordinate position of (x, y) in the current frame as delta (x, y), wherein the delta (x, y) is deltaA(x,y)×δT(x, y); then calculating the average value of the space-time weighted perception factors of all the pixel points in the current frame, and recording the average value as Sδ(ii) a Calculating the dimension weight of each pixel point in the current frame, and recording the dimension weight of the pixel point with the coordinate position of (x, y) in the current frame as wERP(x,y),
Figure FDA0003512810370000012
Wherein x is more than or equal to 0 and less than or equal to W-1, y is more than or equal to 0 and less than or equal to H-1, G1(x, y) denotes G1The pixel value G of the pixel point with the middle coordinate position (x, y)1(x, y) also represents the spatial JND threshold, G, of the pixel with coordinate position (x, y) in the current frame2(x, y) denotes G2The pixel value G of the pixel point with the middle coordinate position (x, y)2(x, y) also represents the weighted gradient value of the pixel with coordinate position (x, y) in the current frame, SFRepresents G2Average value of pixel values of all pixel points in (1), SFAlso represents the average value of the weighted gradient values of all the pixels in the current frame, wherein epsilon is a motion perception constant and epsilon belongs to [1,2 ]]Cos () is a cosine function;
and 5: defining a maximum coding unit to be processed currently in a current frame as a current maximum coding unit;
step 6: calculating the average value of the space-time weighted perception factors of all the pixel points in the current maximum coding unit, and recording the average value as Sδ_LCU(ii) a Then, calculating Lagrange coefficient adjustment factors based on space-time weighting perception factors of the current maximum coding unit, and marking the Lagrange coefficient adjustment factors as psiLCU
Figure FDA0003512810370000021
Then, the quantization parameter variation quantity based on the space-time weighting perception factor of the current maximum coding unit is calculated and recorded as delta QP1,ΔQP1=3log2LCU) (ii) a Wherein, KLCUAnd BLCUAre all adjustment parameters, KLCU∈(0,1),BLCU∈(0,1);
And 7: calculating the average value of the dimensionality weights of all the pixel points in the current maximum coding unit, and recording the average value as the average value
Figure FDA0003512810370000022
Then, the quantization parameter variation quantity based on the dimension weight of the current maximum coding unit is calculated and recorded as delta QP2
Figure FDA0003512810370000023
Wherein a and b are both adjusting parameters, a belongs to (0,1), b belongs to (0,1), and b is less than a;
and 8: calculating new coding quantization parameter, denoted as QP, of the current maximum coding unitnew
Figure FDA0003512810370000024
Then using QPnewUpdating the coding quantization parameter of the current maximum coding unit; then coding the current maximum coding unit; wherein, QPorgOriginal coded quantization parameter, symbol representing current maximum coding unit
Figure FDA0003512810370000025
Is a rounded-down operator;
and step 9: taking the next maximum coding unit to be processed in the current frame as the current maximum coding unit, then returning to the step 6 to continue executing until all the maximum coding units in the current frame are processed, and then executing the step 10;
step 10: and (3) taking the next frame of video frame to be coded in the panoramic video in the ERP projection format as the current frame, and then returning to the step (2) to continue executing until all the video frames in the panoramic video in the ERP projection format are coded.
2. The method of claim 1, wherein in step 3, G is the number of bits of video coding with low complexity based on the perceptual property of panoramic vision1The acquisition mode is as follows: performing spatial domain JND threshold calculation on each pixel point in the current frame by adopting a spatial domain just noticeable distortion model to obtain G1
3. The method for coding video with low complexity according to claim 1 or 2, wherein G is the number of G in step 32The acquisition process comprises the following steps: g is to be2The pixel value of the pixel point with the middle coordinate position (x, y) is marked as G2(x,y),
Figure FDA0003512810370000031
Wherein x is more than or equal to 0 and less than or equal to W-1, y is more than or equal to 0 and less than or equal to H-1, G2(x, y) also represents the weighted gradient value of the pixel with coordinate position (x, y) in the current frame,
Figure FDA0003512810370000032
which is indicative of the direction of the horizontal,
Figure FDA0003512810370000033
which is indicative of the vertical direction of the,
Figure FDA0003512810370000034
which represents the direction of the time domain,
Figure FDA0003512810370000035
representing the horizontal direction gradient value of the pixel point with the coordinate position of (x, y) in the current frame,
Figure FDA0003512810370000036
representing the vertical gradient value of the pixel point with the coordinate position of (x, y) in the current frame,
Figure FDA0003512810370000037
representing the time domain direction gradient value of the pixel point with the coordinate position of (x, y) in the current frame,
Figure FDA0003512810370000038
and
Figure FDA0003512810370000039
and calculating by using a 3D-sobel operator, wherein alpha represents a gradient adjustment factor in the horizontal direction, beta represents a gradient adjustment factor in the vertical direction, gamma represents a gradient adjustment factor in the time domain direction, and alpha + beta + gamma is equal to 1.
CN202210157533.5A 2022-02-21 2022-02-21 Video low-complexity coding method based on panoramic visual perception characteristics Active CN114567776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210157533.5A CN114567776B (en) 2022-02-21 2022-02-21 Video low-complexity coding method based on panoramic visual perception characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210157533.5A CN114567776B (en) 2022-02-21 2022-02-21 Video low-complexity coding method based on panoramic visual perception characteristics

Publications (2)

Publication Number Publication Date
CN114567776A true CN114567776A (en) 2022-05-31
CN114567776B CN114567776B (en) 2023-05-05

Family

ID=81714022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210157533.5A Active CN114567776B (en) 2022-02-21 2022-02-21 Video low-complexity coding method based on panoramic visual perception characteristics

Country Status (1)

Country Link
CN (1) CN114567776B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116723330A (en) * 2023-03-28 2023-09-08 成都师范学院 Panoramic video coding method for self-adapting spherical domain distortion propagation chain length

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6366705B1 (en) * 1999-01-28 2002-04-02 Lucent Technologies Inc. Perceptual preprocessing techniques to reduce complexity of video coders
US20100086063A1 (en) * 2008-10-02 2010-04-08 Apple Inc. Quality metrics for coded video using just noticeable difference models
CN103096079A (en) * 2013-01-08 2013-05-08 宁波大学 Multi-view video rate control method based on exactly perceptible distortion
US20140169451A1 (en) * 2012-12-13 2014-06-19 Mitsubishi Electric Research Laboratories, Inc. Perceptually Coding Images and Videos
CN104954778A (en) * 2015-06-04 2015-09-30 宁波大学 Objective stereo image quality assessment method based on perception feature set
CN107147912A (en) * 2017-05-04 2017-09-08 浙江大华技术股份有限公司 A kind of method for video coding and device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6366705B1 (en) * 1999-01-28 2002-04-02 Lucent Technologies Inc. Perceptual preprocessing techniques to reduce complexity of video coders
US20100086063A1 (en) * 2008-10-02 2010-04-08 Apple Inc. Quality metrics for coded video using just noticeable difference models
US20140169451A1 (en) * 2012-12-13 2014-06-19 Mitsubishi Electric Research Laboratories, Inc. Perceptually Coding Images and Videos
CN103096079A (en) * 2013-01-08 2013-05-08 宁波大学 Multi-view video rate control method based on exactly perceptible distortion
CN104954778A (en) * 2015-06-04 2015-09-30 宁波大学 Objective stereo image quality assessment method based on perception feature set
CN107147912A (en) * 2017-05-04 2017-09-08 浙江大华技术股份有限公司 A kind of method for video coding and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YAFEN XING ET AL: "Spatiotemporal just noticeable difference modeling with heterogeneous temporal visual features" *
杜宝祯: "基于感知阈值的立体视频快速编码算法" *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116723330A (en) * 2023-03-28 2023-09-08 成都师范学院 Panoramic video coding method for self-adapting spherical domain distortion propagation chain length
CN116723330B (en) * 2023-03-28 2024-02-23 成都师范学院 Panoramic video coding method for self-adapting spherical domain distortion propagation chain length

Also Published As

Publication number Publication date
CN114567776B (en) 2023-05-05

Similar Documents

Publication Publication Date Title
CN108924554B (en) Panoramic video coding rate distortion optimization method based on spherical weighting structure similarity
CN110062234B (en) Perceptual video coding method based on just noticeable distortion of region
CN104219525B (en) Perception method for video coding based on conspicuousness and minimum discernable distortion
CN108063944B (en) Perception code rate control method based on visual saliency
CN107241607B (en) Visual perception coding method based on multi-domain JND model
CN111988611A (en) Method for determining quantization offset information, image coding method, image coding device and electronic equipment
CN111193931B (en) Video data coding processing method and computer storage medium
CN103096079B (en) A kind of multi-view video rate control based on proper discernable distortion
CN103313002B (en) Situation-based mobile streaming media energy-saving optimization method
DE102019218316A1 (en) 3D RENDER-TO-VIDEO ENCODER PIPELINE FOR IMPROVED VISUAL QUALITY AND LOW LATENCY
CN111510766A (en) Video coding real-time evaluation and playing tool
CN108521572B (en) Residual filtering method based on pixel domain JND model
CN109451331B (en) Video transmission method based on user cognitive demand
CN114567776B (en) Video low-complexity coding method based on panoramic visual perception characteristics
CN108900838A (en) A kind of Rate-distortion optimization method based on HDR-VDP-2 distortion criterion
CN103024384B (en) A kind of Video coding, coding/decoding method and device
CN115174898A (en) Rate distortion optimization method based on visual perception
JP3105335B2 (en) Compression / expansion method by orthogonal transform coding of image
CN117440158A (en) MIV immersion type video coding rate distortion optimization method based on three-dimensional geometric distortion
CN102685491A (en) Method and system for realizing video coding
CN111757112B (en) HEVC (high efficiency video coding) perception code rate control method based on just noticeable distortion
CN110933416B (en) High dynamic range video self-adaptive preprocessing method
CN109451309B (en) CTU (China train unit) layer code rate allocation method based on significance for HEVC (high efficiency video coding) full I frame coding
CN103517067B (en) Initial quantitative parameter self-adaptive adjustment method and system
CN111464805A (en) Three-dimensional panoramic video rapid coding method based on panoramic saliency

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240118

Address after: Room 166, Building 1, No. 8 Xingye Avenue, Ningbo Free Trade Zone, Zhejiang Province, 315800

Patentee after: Zhejiang Chuanzhi Electronic Technology Co.,Ltd.

Address before: 315800 no.388, Lushan East Road, Ningbo Economic and Technological Development Zone, Zhejiang Province

Patentee before: Ningbo Polytechnic

TR01 Transfer of patent right