WO2017015958A1 - Video encoding and decoding method and apparatus - Google Patents

Video encoding and decoding method and apparatus

Info

Publication number
WO2017015958A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
frame
frame image
video sequence
background
Prior art date
Application number
PCT/CN2015/085586
Other languages
English (en)
French (fr)
Inventor
陈方栋
林四新
李厚强
李礼
Original Assignee
华为技术有限公司
中国科学技术大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司, 中国科学技术大学
Priority to CN201580077248.0A priority Critical patent/CN107409211B/zh
Priority to EP15899310.5A priority patent/EP3319317B1/en
Priority to PCT/CN2015/085586 priority patent/WO2017015958A1/zh
Publication of WO2017015958A1 publication Critical patent/WO2017015958A1/zh
Priority to US15/882,346 priority patent/US10560719B2/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/58 Motion compensation with long-term prediction, i.e. the reference frame for a current frame not being the temporally closest one
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/192 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding the adaptation method, adaptation tool or adaptation type being iterative or recursive

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a video encoding and decoding method and apparatus.
  • on the bandwidth-limited Internet, the first problem to be solved is compression coding of high-definition video.
  • the short-term reference frame is generally a reconstructed frame that is relatively close to the current frame and contains short-term features of the current frame, such as a moving foreground of similar shape; the long-term reference frame may be a reconstructed frame that is relatively far from the current frame, or may be a synthesized reference frame that contains long-term features of a video sequence, such as a background frame of the video sequence.
  • short-term reference frames are currently generally encoded with hierarchical quantization parameters (QP), whereas there is not yet an efficient method for setting the QP of long-term reference frames.
  • an existing technique uses the following formula (1) to encode a background frame to obtain a background long-term reference frame:
  • where QP_LT is the QP used to encode the background frame, and QP_inter is the minimum QP used to encode the short-term reference P or B frames; that is, the QP used to encode the background frame is derived from the quantization parameters of the encoded short-term reference P or B frames.
  • when this technique determines the quantization parameter for encoding the background frame, it does not take changes in the video content into account.
  • as a result, the quantization parameter of the background frame is determined only by the quantization parameter range of the short-term reference frames, which leads to low coding quality.
  • the embodiment of the invention provides a video encoding and decoding method and device, which effectively improves the overall video coding quality.
  • a first aspect of the embodiments of the present invention provides a video encoding method, including:
  • a quantization parameter QP for encoding the first background frame including:
  • the first statistical feature includes the variance of the K centroid values corresponding to K image blocks and the average coding distortion of the K coding distortions corresponding to the K image blocks, where the K image blocks are the image blocks located at the same position in each frame image of the first K frame images, the variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the mean of the sequence formed by the K coding distortions.
  • the K image blocks are the i-th image block of each frame image in the first K frame images, and i is a positive integer;
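As a minimal sketch of the first statistical feature, the Python below computes the variance of the centroid values and the average coding distortion for one block position across the first K frames. This excerpt does not say how a centroid value or a coding distortion is measured, so both are taken as precomputed inputs. The function name `block_statistics` and the choice of the population variance are assumptions made for illustration.

```python
from statistics import fmean, pvariance


def block_statistics(centroids, distortions):
    """Summarize one block position over the first K frames.

    centroids[k] and distortions[k] are the precomputed centroid value and
    coding distortion of the co-located block in frame k.
    Returns (variance of centroid values, average coding distortion).
    """
    if not centroids or len(centroids) != len(distortions):
        raise ValueError("need one centroid value and one distortion per frame")
    # Population variance is an assumption; the excerpt does not state which
    # normalization the method uses.
    return pvariance(centroids), fmean(distortions)


variance, avg_distortion = block_statistics([1.0, 2.0, 3.0], [4.0, 6.0, 8.0])
```

A block whose centroid values barely change over the K frames yields a small variance, which is the kind of signal the later QP decision relies on.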
  • the first j image blocks are the i-th image block of each frame image in the first j frame images of the video sequence
  • the variance of the j centroid values corresponding to the i-th image block of each frame image in the first j frame images, and the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images, are calculated according to B_j,i, where K ≥ j ≥ 2, and where
  • when the i-th image block of the first frame image of the video sequence belongs to the background region, B_1,i is 1; when the i-th image block of the first frame image of the video sequence does not belong to the background region,
  • B_1,i is 0;
  • the first j image blocks are the i-th image block of each frame image in the first j frame images; the variance of the j-1 centroid values corresponding to the i-th image block of each frame image in the first j-1 frame images of the video sequence is used as the variance of the j centroid values corresponding to the i-th image block of each frame image in the first j frame images, and
  • the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each frame image in the first j-1 frame images is used as the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images, where, when j is 2,
  • the variance of the j-1 centroid values corresponding to the i-th image block of each frame image in the first j-1 frame images is a first preset value, and the image of each frame in the first j-1 frame images
  • Determining whether the i-th image block of the j-th frame of the video sequence belongs to a background area including:
  • μ_j,i is the mean of the centroid values of the first j image blocks
  • μ_j-1,i is the mean of the centroid values of the first j-1 image blocks
  • ω_j,i is the centroid value of the i-th image block of the j-th frame
  • the variance of the centroid values of the first j image blocks is obtained from the variance of the centroid values of the first j-1 image blocks, where the first j-1 image blocks are the i-th image block of each frame image in the first j-1 frame images.
  • the average coding distortion of j coding distortions corresponding to the i-th image block of each frame image in the previous j-frame image is calculated by the following formula:
  • the first probability is calculated using the following formula:
  • P_i is the probability that the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image of the video sequence refers to the i-th image block of the first background frame
  • the variance used is that of the K centroid values corresponding to the K image blocks, where the K image blocks are the i-th image block of each frame image in the first K frame images, the positions of the K image blocks correspond to the position of the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image and to the position of the i-th image block of the first background frame, and i is a positive integer.
  • a QP used to encode the first background frame, including:
  • QP_i is the QP used to encode the i-th image block of the first background frame
  • P_i is the probability that the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image of the video sequence refers to the i-th image block of the first background frame
  • P_T is a preset value
  • QP_min is the minimum QP value allowed by the current encoding
  • QP_mode is the minimum of the QPs used by those image frames, among the first K frame images of the video sequence, that use the same encoding mode as the first background frame
  • the position of the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image corresponds to the position of the i-th image block of the first background frame.
  • the first K frame images are iterated by using the following formula to obtain a target iteration pixel value calculated from the pixel values of the K-th frame image, and the target iteration pixel value is used as the pixel value of the first background frame:
  • P_j,x,y is the iteration pixel value calculated from the pixel value at position (x, y) of the j-th frame of the video sequence
  • P'_j,x,y is the iteration pixel value calculated from the pixel value at position (x, y) of the (j-1)-th frame of the video sequence
  • o_j,x,y is the pixel value at position (x, y) in the j-th frame image of the video sequence.
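The iteration formula itself is not present in this excerpt. Purely to illustrate what iterating over the first K frames to build a background frame looks like, the Python sketch below substitutes a plain running average per pixel. The update rule, the function name `background_from_frames` and the nested-list frame layout are all assumptions, not the method's actual formula.

```python
def background_from_frames(frames):
    """Build a background frame from K equally sized frames.

    frames[j-1][y][x] plays the role of o_j,x,y. The running average used
    here is a stand-in update rule (an assumption), applied frame by frame so
    that the value after frame K is taken as the background pixel value.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    # Iteration value after the first frame is the first frame itself.
    background = [[float(p) for p in row] for row in frames[0]]
    for j, frame in enumerate(frames[1:], start=2):
        for y, row in enumerate(frame):
            for x, pixel in enumerate(row):
                # New iteration value from the previous one and o_j,x,y.
                background[y][x] += (pixel - background[y][x]) / j
    return background


result = background_from_frames([[[0, 10]], [[2, 20]], [[4, 30]]])
```

Each pass needs only the previous iteration value and the current frame, so the encoder does not have to keep all K frames in memory.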
  • the method further includes:
  • the second background frame being configured to provide a reference when encoding the K+L+1 frame image of the video sequence to the K+N frame image of the video sequence, where N>L+1 , N is a positive integer;
  • a second aspect of the embodiments of the present invention provides a video decoding method, including:
  • Decoding a first background frame where the first background frame is used to provide a reference when decoding the K+1 frame image of the video sequence to the K+L frame image of the video sequence, where L is a positive integer;
  • the method further includes:
  • the second background frame is used as a second long-term reference frame to decode the K+L+1 frame image to the K+N frame image of the video sequence.
  • a third aspect of the embodiments of the present invention provides a video encoding apparatus, including:
  • An acquiring unit configured to acquire a first statistical feature of the first K frame images of the video sequence, where K is a positive integer
  • a first determining unit configured to determine a first background frame, where the first background frame is used to provide a reference when encoding the (K+1)-th frame image to the (K+L)-th frame image of the video sequence, where L is a positive integer;
  • a second determining unit configured to determine, according to the first statistical feature acquired by the acquiring unit, a quantization parameter QP for encoding the first background frame determined by the first determining unit;
  • a first coding unit configured to encode the first background frame according to the QP determined by the second determining unit, to obtain a first background long-term reference frame
  • a second coding unit configured to encode the video sequence K+1 frame image to the video sequence K+L frame image according to the first background long-term reference frame obtained by the first coding unit.
  • the second determining unit is configured to:
  • the first statistical feature includes the variance of the K centroid values corresponding to K image blocks and the average coding distortion of the K coding distortions corresponding to the K image blocks, where the K image blocks are the image blocks located at the same position in each frame image of the first K frame images, the variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the mean of the sequence formed by the K coding distortions.
  • the K image blocks are the i-th image block of each frame image in the first K frame images, and i is a positive integer;
  • the obtaining unit is used to:
  • the first j image blocks are the i-th image block of each frame image in the first j frame images of the video sequence
  • the variance of the j centroid values corresponding to the i-th image block of each frame image in the first j frame images, and the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images, are calculated according to B_j,i, where K ≥ j ≥ 2, and where
  • when the i-th image block of the first frame image of the video sequence belongs to the background region, B_1,i is 1; when the i-th image block of the first frame image of the video sequence does not belong to the background region,
  • B_1,i is 0;
  • the first j image blocks are the i-th image block of each frame image in the first j frame images; the variance of the j-1 centroid values corresponding to the i-th image block of each frame image in the first j-1 frame images of the video sequence is used as the variance of the j centroid values corresponding to the i-th image block of each frame image in the first j frame images, and
  • the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each frame image in the first j-1 frame images is used as the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images, where, when j is 2,
  • the variance of the j-1 centroid values corresponding to the i-th image block of each frame image in the first j-1 frame images is a first preset value, and the image of each frame in the first j-1 frame images
  • the acquiring unit is configured to determine whether the horizontal component of the minimum motion vector among the motion vectors of the sub-image blocks of the i-th image block of the j-th frame image is smaller than a first preset threshold and whether the vertical component of the minimum motion vector is smaller than a second preset threshold; when the horizontal component is smaller than the first preset threshold and the vertical component is smaller than the second preset threshold, determine that the i-th image block of the j-th frame image belongs to the background region; and when the horizontal component is not less than the first preset threshold or the vertical component is not less than the second preset threshold, determine that the i-th image block of the j-th frame image does not belong to the background region.
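The background test just described can be sketched in Python as follows. The excerpt does not define what makes one motion vector the "minimum", so the sketch takes the vector with the smallest squared magnitude and compares absolute component values. Both choices, along with the function name `is_background_block`, are assumptions.

```python
def is_background_block(sub_block_mvs, thr_x, thr_y):
    """Classify a block as background from its sub-blocks' motion vectors.

    sub_block_mvs: list of (mv_x, mv_y) pairs, one per sub-image block.
    thr_x, thr_y:  first and second preset thresholds.
    """
    if not sub_block_mvs:
        raise ValueError("the block needs at least one sub-block motion vector")
    # Assumption: "minimum motion vector" means smallest magnitude.
    mv_x, mv_y = min(sub_block_mvs, key=lambda mv: mv[0] ** 2 + mv[1] ** 2)
    # Background only if BOTH components are below their thresholds.
    return abs(mv_x) < thr_x and abs(mv_y) < thr_y


still = is_background_block([(0, 0), (3, 4)], thr_x=1, thr_y=1)
moving = is_background_block([(2, 0), (3, 4)], thr_x=1, thr_y=1)
```

Here `still` is true because one sub-block has zero motion, while `moving` is false because even the smallest vector has a horizontal component that is not below the threshold.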
  • the acquiring unit is configured to acquire the centroid value and the coding distortion of the i-th image block of the j-th frame image; calculate, according to the centroid value of the i-th image block of the j-th frame image and B_j,i, the variance of the j centroid values corresponding to the i-th image block of each frame image in the first j frame images; and calculate, according to the coding distortion of the i-th image block of the j-th frame image and B_j,i, the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images.
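The per-frame bookkeeping described above can be sketched as an incremental update. The actual update formulas are not present in this excerpt, so the Python below uses the standard running mean and population variance recurrence as a stand-in. It treats B_j,i as a count of background blocks seen so far and carries the statistics over unchanged for non-background blocks; that reading, the class `BlockStats` and the function `update` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    count: int = 0           # stands in for B_j,i (background blocks so far)
    mean: float = 0.0        # running mean of centroid values
    variance: float = 0.0    # running population variance of centroid values
    avg_distortion: float = 0.0  # running mean of coding distortions


def update(stats, centroid, distortion, is_background):
    """Fold the i-th block of frame j into the running statistics."""
    if not is_background:
        # Statistics of the first j-1 blocks are reused for the first j blocks.
        return stats
    n = stats.count + 1
    delta = centroid - stats.mean
    mean = stats.mean + delta / n
    # Standard incremental variance recurrence (assumed, not from the source).
    variance = ((n - 1) * stats.variance + delta * (centroid - mean)) / n
    avg = stats.avg_distortion + (distortion - stats.avg_distortion) / n
    return BlockStats(n, mean, variance, avg)


stats = BlockStats()
samples = [(1.0, 4.0, True), (9.0, 9.0, False), (2.0, 6.0, True), (3.0, 8.0, True)]
for centroid, distortion, is_bg in samples:
    stats = update(stats, centroid, distortion, is_bg)
```

In this example the second sample is flagged as foreground, so it leaves the statistics untouched and the result matches a direct computation over the three background samples.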
  • the acquiring unit is configured to calculate a variance of j centroid values corresponding to the i-th image block of each frame image in the previous j-frame image by using the following formula:
  • μ_j,i is the mean of the centroid values of the first j image blocks
  • μ_j-1,i is the mean of the centroid values of the first j-1 image blocks
  • ω_j,i is the centroid value of the i-th image block of the j-th frame
  • the variance of the centroid values of the first j image blocks is obtained from the variance of the centroid values of the first j-1 image blocks, where the first j-1 image blocks are the i-th image block of each frame image in the first j-1 frame images.
  • the acquiring unit is configured to calculate an average coding distortion of j coding distortions corresponding to the ith image block of each frame image in the previous j frame image by using the following formula:
  • the second determining unit is configured to calculate the first probability by using the following formula:
  • P_i is the probability that the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image of the video sequence refers to the i-th image block of the first background frame
  • the variance used is that of the K centroid values corresponding to the K image blocks, where the K image blocks are the i-th image block of each frame image in the first K frame images, the positions of the K image blocks correspond to the position of the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image and to the position of the i-th image block of the first background frame, and i is a positive integer.
  • QP_i is the QP used to encode the i-th image block of the first background frame
  • P_i is the probability that the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image of the video sequence refers to the i-th image block of the first background frame
  • P_T is a preset value
  • QP_min is the minimum QP value allowed by the current encoding
  • QP_mode is the minimum of the QPs used by those image frames, among the first K frame images of the video sequence, that use the same encoding mode as the first background frame
  • the position of the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image corresponds to the position of the i-th image block of the first background frame.
  • the first determining unit is configured to iterate over the first K frame images by using the following formula to obtain a target iteration pixel value calculated from the pixel values of the K-th frame image, and to use the target iteration pixel value as the pixel value of the first background frame:
  • P_j,x,y is the iteration pixel value calculated from the pixel value at position (x, y) of the j-th frame of the video sequence
  • P'_j,x,y is the iteration pixel value calculated from the pixel value at position (x, y) of the (j-1)-th frame of the video sequence
  • o_j,x,y is the pixel value at position (x, y) in the j-th frame image of the video sequence.
  • the acquiring unit is further configured to acquire a second statistical feature of the K+1 frame image of the video sequence to the K+L frame image of the video sequence;
  • the first determining unit is further configured to determine a second background frame, where the second background frame is used to provide a reference when encoding the (K+L+1)-th frame image to the (K+N)-th frame image of the video sequence, where N > L+1 and N is a positive integer;
  • the second determining unit is further configured to: count the number of times the i-th image block of the first background frame is used as a reference block, where the positions of the K image blocks correspond to the position of the i-th image block of the first background frame; calculate a probability prediction error according to that number of times and the first probability; calculate, according to the second statistical feature acquired by the acquiring unit, a second probability that each frame image from the (K+L+1)-th frame image to the (K+N)-th frame image of the video sequence refers to the second background frame determined by the first determining unit; calculate, according to the second probability and the probability prediction error, a third probability that each frame image from the (K+L+1)-th frame image to the (K+N)-th frame image of the video sequence refers to the second background frame; and calculate, according to the third probability, the QP used to encode the second background frame;
  • the first coding unit is further configured to encode the second background frame according to the QP used by the second determining unit to encode the second background frame, to obtain a second background long-term reference frame;
  • the second coding unit is further configured to encode the video sequence K+L+1 frame image to the video sequence K+N frame image according to the second background long-term reference frame obtained by the first coding unit .
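The probability correction used for the second background frame can be illustrated with the short Python sketch below. The excerpt states only which quantities feed each step, not the formulas. The sketch therefore assumes that the prediction error is the observed reference frequency minus the first probability, and that the third probability is the second probability plus that error, clamped to [0, 1]. The function name `third_probability` and its parameters are hypothetical.

```python
def third_probability(ref_count, num_frames, first_prob, second_prob):
    """Correct the predicted reference probability for the next background frame.

    ref_count:   times the co-located block of the first background frame was
                 actually used as a reference block.
    num_frames:  number of frames that could have referenced it (L).
    first_prob:  probability predicted before encoding those frames.
    second_prob: probability predicted for the second background frame.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    observed = ref_count / num_frames
    # Assumed form of the probability prediction error.
    error = observed - first_prob
    # Assumed additive correction, clamped to a valid probability.
    return min(1.0, max(0.0, second_prob + error))


corrected = third_probability(ref_count=6, num_frames=10,
                              first_prob=0.5, second_prob=0.7)
```

With these inputs the first prediction was 0.1 too low, so the second prediction is raised by the same amount.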
  • a fourth aspect of the embodiments of the present invention provides a video decoding apparatus, including:
  • An image decoding unit configured to decode the first K frame images of the video sequence, where K is a positive integer
  • a background frame decoding unit configured to decode a first background frame, where the first background frame is used to provide a reference when decoding the (K+1)-th frame image to the (K+L)-th frame image of the video sequence, where L is a positive integer;
  • the image decoding unit is further configured to use the first background frame obtained by the background frame decoding unit as a first long-term reference frame to decode the (K+1)-th frame image to the (K+L)-th frame image of the video sequence, where L is a positive integer.
  • the background frame decoding unit is further configured to decode a second background frame, where the second background frame is used to provide a reference when decoding the K+L+1 frame image to the K+N frame image of the video sequence, where N>L+1, where N is a positive integer;
  • the image decoding unit is further configured to use the second background frame obtained by the background frame decoding unit as a second long-term reference frame to decode the (K+L+1)-th frame image to the (K+N)-th frame image of the video sequence.
  • the first statistical feature of the first K frame images of the video sequence is obtained, and the QP used to encode the first background frame is determined according to the first statistical feature.
  • the determined first background frame is encoded according to that QP to obtain a first background long-term reference frame, and the (K+1)-th frame image to the (K+L)-th frame image of the video sequence are encoded according to the first background long-term reference frame.
  • because the quantization parameter used to encode the background long-term reference frame is determined from a first statistical feature that reflects the video content of the first K frame images, the overall video coding quality is improved.
  • FIG. 1 is a schematic diagram of an embodiment of a video encoding method in an embodiment of the present invention
  • FIG. 2 is a schematic diagram of another embodiment of a video encoding method in an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of another embodiment of a video encoding method in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of an embodiment of a video decoding method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an embodiment of a video encoding apparatus according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of an embodiment of a video decoding apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of another embodiment of a video encoding apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of another embodiment of a video decoding apparatus according to an embodiment of the present invention.
  • the embodiment of the invention provides a video encoding and decoding method and device, which improves the overall video coding quality.
  • ITU-T mainly develops video coding standards for real-time video communication, such as video telephony, video conferencing and other applications.
  • Video coding standards for various applications have been successfully developed internationally, including: the MPEG-1 standard for VCD; the MPEG-2 standard for DVD and DVB; the H.261 standard and the H.263 standard for video conferencing; the MPEG-4 standard, which allows objects of arbitrary shape to be encoded; the more recently developed H.264/AVC standard; and the HEVC standard.
  • a video sequence includes a series of picture (English: picture) frames, the picture frame is further divided into slices (English: slice), and the slice is further divided into blocks (English: block).
  • Video coding is performed in units of blocks. Starting from the upper-left corner of the picture, the blocks can be encoded line by line, from left to right and from top to bottom.
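  • The block traversal order described above can be sketched as follows. This is a minimal illustration, assuming square blocks of a fixed size; the function name `block_raster_order` and the tuple layout are hypothetical and do not come from any codec specification.

```python
def block_raster_order(width, height, block_size):
    """Yield (left, top, block_width, block_height) for each block.

    Blocks are visited row by row: left to right within a row,
    and rows from top to bottom, starting at the upper-left corner.
    Blocks on the right or bottom edge are clipped to the picture.
    """
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            yield (
                left,
                top,
                min(block_size, width - left),
                min(block_size, height - top),
            )


if __name__ == "__main__":
    for block in block_raster_order(6, 4, 4):
        print(block)
```

  • In the usage example, a 6x4 picture with 4x4 blocks yields one full block followed by one clipped block on the right edge.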
  • The short-term reference frame is generally a reconstructed frame that is relatively close to the current frame and contains short-term features of the current frame, such as a moving foreground with a similar shape. The long-term reference frame may be a reconstructed frame that is relatively far from the current frame, or may be a synthesized reference frame; it contains long-term features of a certain segment of the video sequence, such as a background frame of the video sequence.
  • YUV is a color coding method adopted by European television systems (belonging to PAL), which is the color space used by PAL and SECAM analog color TV systems.
  • The image is usually captured by a three-tube color camera or a color CCD camera. The resulting color image signal is color-separated, and separately amplified and corrected to obtain RGB. A matrix conversion circuit then produces a luminance signal Y and two color difference signals, B-Y (that is, U) and R-Y (that is, V). Finally, the transmitting end encodes the luminance signal and the two color difference signals separately and transmits them over the same channel.
  • This representation of color is the so-called YUV color space representation.
  • the importance of using the YUV color space is that its luminance signal Y and chrominance signals U, V are separated.
  • "Y" indicates the brightness (luminance or luma), which is the grayscale value.
  • "U" and "V" indicate the chroma (chrominance), which describes the color and saturation of the image and is used to specify the color of a pixel.
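  • The relationship between RGB and the luminance and color difference signals can be sketched as follows. The luma weights (0.299, 0.587, 0.114) are the commonly used ITU-R BT.601 weights and are an assumption here, since the text does not give them. The color differences are left unscaled, exactly as B-Y and R-Y, whereas practical systems usually apply additional scaling factors.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB sample to (Y, U, V) with unscaled color differences.

    Y uses the BT.601 luma weights (an assumption, not from the text).
    U = B - Y and V = R - Y follow the description of the color
    difference signals; real systems normally scale U and V further.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = b - y
    v = r - y
    return y, u, v


if __name__ == "__main__":
    # A neutral gray sample has (almost) zero color difference.
    print(rgb_to_yuv(128, 128, 128))
    # A pure red sample has a large positive R-Y component.
    print(rgb_to_yuv(255, 0, 0))
```

  • Because Y is separated from U and V, a decoder that only needs a grayscale picture can use the Y component alone.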
  • An I frame is also called an intra picture.
  • An I frame is usually the first frame of each GOP. After moderate compression, it is used as a reference point for random access and can be regarded as an image.
  • part of the video frame sequence is compressed into an I frame; partially compressed into a P frame; and partially compressed into a B frame.
  • the I frame method is an intraframe compression method, also known as a "keyframe" compression method.
  • The I frame method is a compression technique based on the Discrete Cosine Transform (DCT), similar to the JPEG compression algorithm. With I frame compression, a compression ratio of 1/6 can be achieved without noticeable compression artifacts.
  • DCT Discrete Cosine Transform
  • I frame characteristics: the I frame is an intra-coded frame, compressed and encoded by using intra prediction.
  • When decoding, the complete image can be reconstructed from the data in the I frame alone; the I frame does not need to reference other pictures; the I frame is the reference frame of the P frames and B frames (its quality directly affects the quality of every later frame in the same group); the I frame is the base frame (the first frame) of the group of pictures (GOP), and there is only one I frame in the group; the I frame does not need to consider motion vectors; the amount of information occupied by the I frame is relatively large.
  • P frame: forward predictive coded frame.
  • Prediction and reconstruction of the P frame: the P frame uses the I frame as its reference frame.
  • The predicted value and the motion vector of a given point of the P frame are found in the I frame, and the prediction difference is transmitted along with the motion vector.
  • The predicted value of that point is found in the I frame according to the motion vector and is added to the difference to obtain the reconstructed value of the point, so that the complete P frame can be obtained.
  • P frame characteristics: the P frame is a coded frame that follows the I frame; the P frame uses motion compensation and transmits its difference from the previous frame (the prediction error) together with the motion vector; when decoding, the predicted value and the prediction error must be summed to reconstruct the complete P frame image; the P frame belongs to forward-predicted inter-frame coding and refers only to the previous frame; a P frame can be the reference frame of the P frame after it, or of the B frames before and after it; because the P frame is a reference frame, it may cause decoding errors to propagate; because only differences are transmitted, the compression ratio of the P frame is relatively high.
  • B frame: bidirectionally predicted interpolated coded frame.
  • Prediction and reconstruction of the B frame: the B frame uses the previous frame and the following frame as reference frames to find the predicted value of a given point of the B frame and two motion vectors, and the prediction difference and the motion vectors are transmitted.
  • The receiving end finds (calculates) the predicted value in the two reference frames according to the motion vectors and sums it with the difference to obtain the reconstructed value of that point, so that the complete B frame can be obtained.
  • B frame characteristics: the B frame is predicted from the previous frame and the following frame; the B frame transmits the prediction error and the motion vectors between it and the previous and following frames; the B frame is a bidirectional predictive coded frame.
  • The B frame has the highest compression ratio, because it only reflects the change of the moving subject between the reference frames, and the prediction is more accurate.
  • the adjacent two frames of images have many similar image blocks.
  • motion search is often performed from the reconstructed frame, and the block that matches the current block is found as the reference block.
  • In some cases the current block cannot find a good reference block in the closest image frame; therefore, a multi-reference frame coding technique is introduced.
  • The image block of the current frame finds the image block with the best coding performance by performing motion search on the first k reference frames.
  • The motion search starts from the closest reference frame, and the best motion vector with the minimum rate distortion cost is searched for within a certain search range.
  • The motion search then continues with the next reference frame; the final motion vector and its reference frame are determined only after the motion search of all reference frames is completed.
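  • The multi-reference motion search described above can be sketched as follows. This is a minimal illustration, not an encoder implementation: frames are plain lists of rows, the distortion is a sum of absolute differences (SAD), and the rate term `abs(dx) + abs(dy) + ref_index` is a crude, hypothetical stand-in for the number of bits needed to signal the motion vector and the reference index. The function names and the `lam` parameter are assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block and a displaced reference block."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def multi_reference_search(cur, refs, bx, by, size, search_range, lam=1.0):
    """Return (cost, ref_index, (dx, dy)) with the lowest cost over all reference frames.

    refs[0] is assumed to be the closest reference frame. Every reference
    frame is searched before the final motion vector is chosen.
    """
    height, width = len(cur), len(cur[0])
    best = None
    for ref_index, ref in enumerate(refs):
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                x0, y0 = bx + dx, by + dy
                # Skip candidates that fall outside the reference picture.
                if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                    continue
                rate = abs(dx) + abs(dy) + ref_index  # hypothetical rate proxy
                cost = sad(cur, ref, bx, by, dx, dy, size) + lam * rate
                if best is None or cost < best[0]:
                    best = (cost, ref_index, (dx, dy))
    return best
```

  • A block whose content is missing from the closest reference frame but present, slightly shifted, in an older one is matched to the older frame, which is the situation that motivates multi-reference coding.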
  • RDO Rate Distortion Optimization
  • Rate-distortion optimization seeks the minimum coding distortion under a given rate constraint by optimizing the selection of coding parameters.
  • A Lagrangian multiplier is used to convert the constrained problem into an unconstrained problem; that is, the mode that minimizes J in formula (2) is selected as the optimal mode.
  • J is generally called the rate distortion cost (RD cost).
  • λ is called the Lagrangian multiplier. It is generally adjusted by adjusting the quantization parameter (QP).
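  • Formulas (1) and (2) are not reproduced in this text. For orientation, the standard Lagrangian formulation that the description appears to refer to can be written as below; the symbols D (coding distortion), R (bit rate), and R_c (rate budget) are assumed names rather than the original notation.

```latex
\begin{align}
\min\; D \quad &\text{subject to } R \le R_c \tag{1}\\
\min\; J, \quad &J = D + \lambda \cdot R \tag{2}
\end{align}
```

  • A larger multiplier penalizes rate more heavily and pushes the choice toward cheaper, more distorted modes; a smaller multiplier does the opposite.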
  • QP quantization parameter
  • The QP factor is a constant related to the coding mode and is an important parameter that determines the quality of the video code stream.
  • the QP is described in detail in the H.264 protocol and the H.265/HEVC standard protocol, and therefore will not be described here.
  • the size of the QP directly determines whether the reconstructed frame is a low-rate-high-distortion reconstructed frame or a high-rate-low-distortion reconstructed frame, which directly affects the coding effect.
  • the video encoding method provided by the embodiment of the present invention is described.
  • The video encoding method provided by the embodiment of the present invention is performed by a video encoding device, which can be any device that needs to output and store video, such as a mobile phone, a notebook computer, a tablet, or a personal computer.
  • an embodiment of a video encoding method in an embodiment of the present invention includes:
  • When the first K frame images of the video sequence are encoded, they are encoded according to a preset short-term reference frame, where the video sequence is a video sequence acquired in advance by the video encoding device, and the preset short-term reference frame may be the default short-term reference frame in the HEVC standard.
  • The first statistical feature may be a video image statistical feature of the first K frame images of the video sequence. For example, the first statistical feature includes the variance of K centroid values corresponding to K image blocks and the average coding distortion of K coding distortions corresponding to the K image blocks.
  • The variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the average of the sequence formed by the K coding distortions.
  • The first statistical feature may also include the variances of centroid values at a plurality of positions and the average coding distortions at the plurality of positions, where each position corresponds to the K image blocks located at that position in each frame image of the first K frame images.
  • The variance of the centroid values at each position is the variance of the K centroid values corresponding to the K image blocks at that position, and the average coding distortion at each position is the average coding distortion of the K coding distortions corresponding to those K image blocks.
  • For example, the first statistical feature includes, for each position in the frame images of the first K frame images, the variance of the K centroid values corresponding to the K image blocks at that position and the average coding distortion of the K image blocks at that position, which is not limited herein.
  • L is a positive integer
  • the first background frame is used as a long-term reference frame for providing a reference when encoding the K+1 frame image of the video sequence to the K+L frame image of the video sequence.
  • the determining, according to the first statistical feature, the quantization parameter QP for encoding the first background frame may include: calculating, according to the first statistical feature, the K+1 frame image of the video sequence. And a first probability that the image block of each frame image in the K+L frame image of the video sequence refers to the first background frame; and the QP used to encode the first background frame is calculated according to the first probability.
  • In this embodiment, a first statistical feature of the first K frame images of the video sequence is acquired, and the quantization parameter (QP) used to encode the first background frame is determined according to the first statistical feature.
  • The determined first background frame is encoded according to that QP to obtain a first background long-term reference frame, and the (K+1)th frame image to the (K+L)th frame image of the video sequence are encoded according to the first background long-term reference frame.
  • Because the encoding quantization parameter of the background long-term reference frame is determined from a first statistical feature related to the video content of the first K frame images, the overall video coding quality is improved.
  • Determining, according to the first statistical feature, the quantization parameter QP for encoding the first background frame may include: calculating, according to the first statistical feature, a first probability that the image blocks of each frame image in the (K+1)th frame image to the (K+L)th frame image of the video sequence refer to the first background frame; and calculating, according to the first probability, the QP used to encode the first background frame. The video encoding method in the embodiment of the present invention has multiple implementation manners, which are introduced separately in the following examples:
  • an embodiment of a video encoding method in an embodiment of the present invention includes:
  • K is a positive integer.
  • The first statistical feature includes the variance of K centroid values corresponding to K image blocks and the average coding distortion of K coding distortions corresponding to the K image blocks, where the K image blocks are the i-th image block of each frame image in the first K frame images, and i is a positive integer.
  • The variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the average of the sequence formed by the K coding distortions.
  • There are multiple ways to acquire the first statistical feature of the first K frame images of the video sequence, which are introduced separately below:
  • (1) Acquiring the first statistical feature of the first K frame images of the video sequence may include:
  • the average coding distortion of the i-th image block of the previous frame image is the coding distortion of the i-th image block of the first frame image.
  • $\omega_{j,i}$ is the centroid value of the pixel luminance of the i-th image block of the j-th frame image
  • $N_i$ is the number of pixels of the i-th image block of the j-th frame image
  • $O_{j,i}(m)$ represents the pixel value of the m-th pixel in the i-th image block of the j-th frame image
  • the formulas also use the coding distortion of the i-th image block of the j-th frame image and the reconstructed pixel value of the m-th pixel in the i-th image block of the j-th frame image; m, i, and j are positive integers.
  • The variance of the j centroid values and the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images can be calculated by the following formulas (6), (7), and (8):
  • $\mu_{j,i}$ is the mean of the centroid values of the first j image blocks (the first j image blocks are the i-th image block of each frame image in the first j frame images), $\mu_{j-1,i}$ is the mean of the centroid values of the first j-1 image blocks, and $\omega_{j,i}$ is the centroid value of the i-th image block of the j-th frame;
  • the formulas also use the variance of the centroid values of the first j image blocks and the variance of the centroid values of the first j-1 image blocks, where the first j-1 image blocks are the i-th image block of each frame image in the first j-1 frame images.
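  • The per-block statistics described above can be sketched as follows. Formulas (4) to (8) are not reproduced in this text, so this is only a sketch under stated assumptions: the centroid value is taken to be the mean luminance of the block, the coding distortion is taken to be the mean squared error between original and reconstructed samples, and the running mean, variance, and average distortion use standard incremental update rules. The class name `BlockStats` and its fields are hypothetical.

```python
from dataclasses import dataclass


def centroid(block):
    """Mean sample value of a block given as a flat list of luminance samples (assumed definition)."""
    return sum(block) / len(block)


def coding_distortion(original, reconstructed):
    """Mean squared error between original and reconstructed samples (assumed metric)."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


@dataclass
class BlockStats:
    """Running statistics for the image block at one position across frames."""
    count: int = 0
    mean: float = 0.0            # mean of the centroid values seen so far
    variance: float = 0.0        # population variance of the centroid values
    avg_distortion: float = 0.0  # average coding distortion seen so far

    def update(self, original, reconstructed):
        """Fold the block of the next frame into the statistics."""
        w = centroid(original)
        d = coding_distortion(original, reconstructed)
        self.count += 1
        j = self.count
        prev_mean = self.mean
        self.mean = prev_mean + (w - prev_mean) / j
        # Incremental population variance: no need to store earlier centroid values.
        self.variance = ((j - 1) * self.variance + (w - prev_mean) * (w - self.mean)) / j
        self.avg_distortion = ((j - 1) * self.avg_distortion + d) / j
```

  • Updating incrementally means the encoder only keeps a few numbers per block position instead of the centroid values of all K frames.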
  • (2) Acquiring the first statistical feature of the K-frame image of the video sequence before the video sequence may also include:
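  • the first j image blocks are the i-th image block of each frame image in the first j frame images of the video sequence, and the variance of the j centroid values and the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images are calculated according to $B_{j,i}$;
  • when the i-th image block of the first frame image of the video sequence belongs to the background area, $B_{1,i}$ is 1, and when the i-th image block of the first frame image of the video sequence does not belong to the background area, $B_{1,i}$ is 0;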
  • the first j image blocks are the i-th image block of each frame image in the first j frame images; the variance of the j-1 centroid values corresponding to the i-th image block of each frame image in the first j-1 frame images of the video sequence is used as the variance of the j centroid values corresponding to the i-th image block of each frame image in the first j frame images, and the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each frame image in the first j-1 frame images of the video sequence is used as the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images, where j is 2
  • the variance of the j-1 centroid values corresponding to the i-th image block of each frame image in the first j-1 frame images is a first preset value (for example, 0), and the first j-1 frame images are The average coding distortion of the j-1 coding distortions corresponding to the
  • the determining whether the ith image block of the jth frame of the video sequence belongs to a background area may include:
  • The first preset threshold and the second preset threshold may be the same or different.
  • For example, the first preset threshold and the second preset threshold may both be 1, or the first preset threshold may be 0.99 and the second preset threshold 1.01, which is not limited herein.
  • Calculating, according to $B_{j,i}$, the variance of the j centroid values and the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images may include:
  • calculating, according to the centroid value of the i-th image block of the j-th frame image and $B_{j,i}$, the variance of the j centroid values corresponding to the i-th image block of each frame image in the first j frame images, which may include:
  • $\mu_{j,i}$ is the mean of the centroid values of the first j image blocks
  • $\mu_{j-1,i}$ is the mean of the centroid values of the first j-1 image blocks
  • $\omega_{j,i}$ is the centroid value of the i-th image block of the j-th frame (which can be calculated by formula (4) above)
  • the formula also uses the variance of the centroid values of the first j image blocks and the variance of the centroid values of the first j-1 image blocks, where the first j-1 image blocks are the i-th image block of each frame image in the first j-1 frame images.
  • calculating, according to the coding distortion of the i-th image block of the j-th frame image and $B_{j,i}$, the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the first j frame images, which may include:
  • the formula uses the coding distortion of the i-th image block of the j-th frame (which can be calculated by formula (5) above).
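  • The background-gated variant can be sketched as follows. The exact formulas that use B j,i are not reproduced in this text, so the sketch rests on two labeled assumptions: B is treated as a count of how many of the co-located blocks so far were judged to be background, and the statistics are updated with the usual incremental rules using B as the sample count. When a block is not background, the statistics of the earlier frames are carried over unchanged, which is what the text appears to describe. The function name and the tuple layout are hypothetical.

```python
def gated_update(state, is_background, centroid_value, distortion):
    """Update per-position statistics only when the block is judged to be background.

    state is (B, mean, variance, avg_distortion), where B counts the
    background blocks seen so far at this position (assumed meaning).
    Returns the new state as a tuple of the same shape.
    """
    count, mean, variance, avg_distortion = state
    if not is_background:
        # Carry the statistics of the earlier frames over unchanged.
        return state
    count += 1
    new_mean = mean + (centroid_value - mean) / count
    new_variance = (
        (count - 1) * variance + (centroid_value - mean) * (centroid_value - new_mean)
    ) / count
    new_avg_distortion = ((count - 1) * avg_distortion + distortion) / count
    return (count, new_mean, new_variance, new_avg_distortion)
```

  • Skipping foreground blocks keeps a passing object from inflating the variance of a position that is otherwise stable background.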
  • L is a positive integer
  • There are multiple ways to determine the first background frame, which are introduced separately below:
  • $P_{j,x,y}$ is the iterative pixel value calculated according to the pixel value at position (x, y) of the j-th frame of the video sequence
  • $P_{j,x,y}'$ is the iterative pixel value calculated according to the pixel value at position (x, y) of the (j-1)th frame of the video sequence, and $o_{j,x,y}$ represents the pixel value at position (x, y) in the j-th frame image of the video sequence.
  • Determining the first background frame may also include:
  • iterating over the first K frame images by using the following formula (13) to obtain a target iteration pixel value calculated according to the pixel values of the K-th frame image, and using the target iteration pixel value as the pixel value of the first background frame:
  • $P_{j,x,y}$ is the iterative pixel value calculated according to the pixel value at position (x, y) of the j-th frame of the video sequence
  • $P_{j,x,y}'$ is the iterative pixel value calculated according to the pixel value at position (x, y) of the (j-1)th frame of the video sequence, and $o_{j,x,y}$ represents the pixel value at position (x, y) in the j-th frame image of the video sequence.
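  • The idea of iterating over the frames to obtain background pixel values can be sketched as follows. The update formulas referenced above are not reproduced in this text, so this sketch swaps in a plain running average as the iteration rule; it should be read as a stand-in that has the same shape (previous iterate plus the current pixel value), not as the rule used in the original. Frames are lists of rows, and the function name is hypothetical.

```python
def build_background(frames):
    """Iterate over frames and return a background estimate per pixel.

    Stand-in update rule (an assumption): a running average, where the
    iterate after frame j is the mean of the pixel values of frames 1..j.
    """
    background = [[float(value) for value in row] for row in frames[0]]
    for j, frame in enumerate(frames[1:], start=2):
        for y, row in enumerate(frame):
            for x, pixel in enumerate(row):
                previous = background[y][x]  # iterate from the previous frame
                background[y][x] = previous + (pixel - previous) / j
    return background


if __name__ == "__main__":
    frames = [[[0, 10]], [[10, 10]], [[20, 10]]]
    print(build_background(frames))
```

  • The value obtained after the last of the frames is used as the background frame pixel value, mirroring the use of the target iteration pixel value described above.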
  • Calculating, according to the first statistical feature, the first probability that the image blocks of each frame image in the (K+1)th frame image to the (K+L)th frame image of the video sequence refer to the first background frame may include:
  • the first probability is calculated using the following formula (14):
  • $P_i$ is the probability that the i-th image block of each frame image in the (K+1)th frame image to the (K+L)th frame image of the video sequence refers to the i-th image block of the first background frame
  • the formula also uses the variance of the K centroid values corresponding to the K image blocks, where the K image blocks are the i-th image block of each frame image in the first K frame images; the positions of the K image blocks correspond to the position of the i-th image block of each frame image in the (K+1)th frame image to the (K+L)th frame image and to the position of the i-th image block of the first background frame; and i is a positive integer.
  • Alternatively, calculating, according to the first statistical feature of the first K frame images of the video sequence, the first probability for the image blocks of the (K+1)th frame image to the (K+L)th frame image of the video sequence may include:
  • the first probability is calculated using the following formula (15):
  • $P_i$ is the probability that the i-th image block of each frame image in the (K+1)th frame image to the (K+L)th frame image of the video sequence refers to the i-th image block of the first background frame
  • the formula also uses the variance of the K centroid values corresponding to the K image blocks, where the K image blocks are the i-th image block of each frame image in the first K frame images; the positions of the K image blocks correspond to the position of the i-th image block of each frame image in the (K+1)th frame image to the (K+L)th frame image and to the position of the i-th image block of the first background frame; and i is a positive integer.
  • In this case, the first probability may also be calculated by using formula (14) described above, which is not limited herein.
  • There are also multiple ways to calculate the QP for encoding the first background frame, which are introduced separately below:
  • Calculating the QP for encoding the first background frame according to the first probability may include:
  • the QP used to encode the first background frame is calculated using the following formula (16):
  • $QP_i$ is the QP for encoding the i-th image block of the first background frame
  • $P_i$ is the probability that the i-th image block of each frame image in the (K+1)th frame image to the (K+L)th frame image of the video sequence refers to the i-th image block of the first background frame
  • $P_T$ is a preset value
  • $QP_{min}$ is the minimum QP value allowed by the current encoding
  • $QP_{mode}$ is the minimum of the QPs adopted, among the first K frame images of the video sequence, by image frames that use the same coding mode as the first background frame; the position of the i-th image block of each frame image in the (K+1)th frame image to the (K+L)th frame image corresponds to the position of the i-th image block of the first background frame.
  • Alternatively, calculating the QP for encoding the first background frame according to the first probability may include:
  • the QP used to encode the first background frame is calculated using the following formula (17):
  • $QP_i$ is the QP for encoding the i-th image block of the first background frame
  • $P_i$ is the probability that the i-th image block of each frame image in the (K+1)th frame image to the (K+L)th frame image of the video sequence refers to the i-th image block of the first background frame
  • $P_T$ is a preset value
  • $QP_{min}$ is the minimum QP value allowed by the current encoding
  • $QP_{mode}$ is the minimum of the QPs adopted, among the first K frame images of the video sequence, by image frames that use the same coding mode as the first background frame; the position of the i-th image block of each frame image in the (K+1)th frame image to the (K+L)th frame image corresponds to the position of the i-th image block of the first background frame.
  • In this case, the QP for encoding the first background frame may also be calculated according to the first probability by using formula (16) described above, which is not limited herein.
  • $P_T$ is a preset value, for example, 0.05, and $QP_{min}$ is the minimum QP value allowed by the current encoding.
  • When the first background frame is encoded in the I frame mode, $QP_{mode}$ is the minimum QP used by the other I frames.
  • When the first background frame is encoded in the P frame or B frame mode, $QP_{mode}$ is the minimum of the QPs adopted by the other P or B frames.
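  • How a per-block QP might be derived from the reference probability can be sketched as follows. The formulas for the QP are not reproduced in this text, so the mapping below is purely illustrative and is an assumption: blocks whose reference probability does not exceed the preset threshold keep the mode QP, and above the threshold the QP is lowered linearly toward the minimum allowed QP. Only the roles of the four inputs (probability, preset threshold, minimum QP, mode QP) come from the description; the function name and the linear shape are hypothetical.

```python
def background_block_qp(p_i, p_t, qp_min, qp_mode):
    """Illustrative QP for one background-frame block (not the original formula).

    p_i     : probability that later frames reference this block
    p_t     : preset probability threshold, assumed to be below 1
    qp_min  : minimum QP allowed by the current encoding
    qp_mode : minimum QP used by earlier frames of the same coding mode
    """
    if p_i <= p_t:
        # Rarely referenced block: spend no extra bits on it.
        return qp_mode
    # Assumed linear interpolation between qp_mode (at p_t) and qp_min (at 1).
    scale = (p_i - p_t) / (1.0 - p_t)
    qp = round(qp_mode - scale * (qp_mode - qp_min))
    # Clamp so the result never leaves the [qp_min, qp_mode] range.
    return max(qp_min, min(qp_mode, qp))


if __name__ == "__main__":
    for p in (0.01, 0.525, 1.0):
        print(p, background_block_qp(p, 0.05, 10, 30))
```

  • The design intent shown here is that a block likely to be referenced by many later frames is encoded at higher quality, while the clamp keeps the result within the range the encoder allows.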
  • The (K+1)th frame image of the video sequence to the (K+L)th frame image of the video sequence are encoded according to the first background long-term reference frame.
  • In this embodiment, a first statistical feature of an already encoded segment of the image sequence (the first K frame images) is acquired; the first probability that the image blocks of the later images refer to the determined first background frame is calculated; a QP for encoding the first background frame is calculated according to the first probability; and the first background frame is encoded according to that QP to obtain the first background long-term reference frame.
  • the method may further include:
  • determining a second background frame, the second background frame being used to provide a reference when encoding the (K+L+1)th frame image of the video sequence to the (K+N)th frame image of the video sequence, where N>L+1 and N is a positive integer;
  • The second background frame may be determined by iterating over the (K+1)th frame image to the (K+L)th frame image of the video sequence by using the following formula (18) or formula (19) to obtain a target iteration pixel value calculated according to the pixel values of the (K+L)th frame image, and using the target iteration pixel value as the pixel value of the second background frame.
  • $P_{j,x,y}$ is the iterative pixel value calculated according to the pixel value at position (x, y) of the j-th frame of the video sequence
  • $P_{j,x,y}'$ is the iterative pixel value calculated according to the pixel value at position (x, y) of the (j-1)th frame of the video sequence, $o_{j,x,y}$ represents the pixel value at position (x, y) in the j-th frame image of the video sequence, and K+L ≥ j ≥ K+1.
  • For calculating a second probability $P_i'$ that the (K+L+1)th frame image to the (K+N)th frame image of the video sequence refer to the second background frame, refer to the manner of calculating the first probability according to the first statistical feature described in step 303; details are not described here.
  • Substituting $P_i'$ for $P_i$ in formula (16) or formula (17) yields $QP'$.
  • The second background frame is encoded according to the QP used to encode the second background frame to obtain a second background long-term reference frame, and the (K+L+1)th frame image to the (K+N)th frame image of the video sequence are encoded according to the second background long-term reference frame; for details, refer to the embodiment shown in FIG. 3, which are not described herein again.
  • The pixel value described in the embodiment of the present invention may include at least one of the luminance value Y, the chrominance value U, and the chrominance value V. For example, it may be all three values Y, U, and V, any one of Y, U, and V, or any two of Y, U, and V, which is not limited herein.
  • Described above is an embodiment of a video encoding method, and an embodiment of a video decoding method is described below.
  • The video decoding method provided by the embodiment of the present invention is performed by a video decoding device, which can be any device that needs to output and store video, such as a mobile phone, a notebook computer, a tablet computer, or a personal computer.
  • an embodiment of a video decoding method in the embodiment of the present invention includes the following content:
  • K is a positive integer.
  • When the encoding end (video encoding apparatus) writes reference frames into the video code stream, a reference frame parameter set and a reference frame index are formed, so that the decoding end can learn, according to the reference frame parameter set and the reference frame index, the order in which the short-term reference frames and the long-term reference frames are to be decoded. The decoding end (video decoding device) first decodes the reference frame that the encoding end wrote first, such as the default short-term reference frame in the HEVC standard.
  • Because the first K frame images of the video sequence may be encoded according to a preset short-term reference frame (for example, an HEVC default short-term reference frame), the short-term reference frame decoded from the video code stream may be used to decode the first K frame images of the video sequence.
  • L is a positive integer
  • After being decoded, the first background frame is stored in the memory of the video decoding device as the first long-term reference frame, but is not displayed.
  • The first background frame is used as the first long-term reference frame to decode the (K+1)th frame image to the (K+L)th frame image of the video sequence, where L is a positive integer.
  • Because the encoding end fully considers the influence of changes in the video content and thereby improves the encoding quality, the images decoded in the embodiment of the present invention have high quality and the decoding efficiency is high.
  • the method further includes:
  • The second background frame is used to provide a reference when decoding the (K+L+1)th frame image to the (K+N)th frame image of the video sequence, where N>L+1 and N is a positive integer. In the same way, the second background frame is also stored in the memory of the video decoding device as the second long-term reference frame, but is not displayed.
  • the second background frame is used as a second long-term reference frame to decode the K+L+1 frame image to the K+N frame image of the video sequence.
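  • The way the decoder switches long-term reference frames across the sequence can be sketched as follows. This is a minimal illustration using 1-based frame indexes; the function name and the returned labels are hypothetical, and only the segment boundaries (the first K frames, frames K+1 to K+L, and frames K+L+1 to K+N, with N > L+1) come from the description.

```python
def long_term_reference_for(frame_index, k, l, n):
    """Return which background frame serves as long-term reference for a frame.

    frame_index is 1-based. Frames 1..K use only short-term references,
    frames K+1..K+L use the first background frame, and frames
    K+L+1..K+N use the second background frame.
    """
    if k < 1 or l < 1 or n <= l + 1:
        raise ValueError("expected positive K and L, and N > L + 1")
    if frame_index < 1 or frame_index > k + n:
        raise ValueError("frame index outside the described range")
    if frame_index <= k:
        return None  # decoded with the short-term reference frame only
    if frame_index <= k + l:
        return "first_background"
    return "second_background"


if __name__ == "__main__":
    for index in (1, 5, 6, 10, 11, 17):
        print(index, long_term_reference_for(index, k=5, l=5, n=12))
```

  • Both background frames stay in decoder memory as reference pictures only; they are never part of the displayed output.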
  • An embodiment of the video encoding apparatus 500 provided in the embodiment of the present invention includes:
  • the obtaining unit 501 is configured to acquire a first statistical feature of the K-frame image of the video sequence, where K is a positive integer;
  • a first determining unit 502 configured to determine a first background frame, where the first background frame is used to provide a reference when encoding the K+1th frame image of the video sequence to the K+L frame image of the video sequence, L is a positive integer;
  • a second determining unit 503 configured to determine, according to the first statistical feature acquired by the acquiring unit 501, a quantization parameter QP for encoding the first background frame determined by the first determining unit;
  • a first coding unit 504 configured to encode the first background frame according to the QP determined by the second determining unit 503, to obtain a first background long-term reference frame;
  • a second encoding unit 505 configured to encode the K+1 frame image of the video sequence to the K+L frame image of the video sequence according to the first background long-term reference frame obtained by the first encoding unit 504 .
  • the acquiring unit 501 acquires the first statistical feature of the first K frame images of the video sequence; the second determining unit 503 determines, according to the first statistical feature, the quantization parameter QP used to encode the first background frame; the first encoding unit 504 encodes the determined first background frame according to that QP to obtain a first background long-term reference frame; and the second encoding unit 505 encodes the (K+1)-th to (K+L)-th frame images of the video sequence according to the first background long-term reference frame.
  • because the quantization parameter used to encode the background long-term reference frame is determined from the first statistical feature, which is tied to the video content of the first K frame images, overall video coding quality is improved.
  • the second determining unit 503 is configured to:
  • the first statistical feature includes a variance of K centroid values corresponding to K image blocks and an average coding distortion of K coding distortions corresponding to the K image blocks.
  • the K image blocks are the i-th image blocks of each of the first K frame images, where i is a positive integer, that is, the image blocks located at the same position in each of the first K frame images; the variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the mean of the sequence formed by the K coding distortions;
  • the acquiring unit 501 is configured to:
  • the first j image blocks are the i-th image blocks of each of the first j frame images of the video sequence, and the variance of the j centroid values and the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images are calculated according to B_{j,i};
  • B_{1,i} is 1 when the i-th image block of the first frame image of the video sequence belongs to the background area, and B_{1,i} is 0 when it does not;
  • when the i-th image block of the j-th frame image does not belong to the background area, the variance of the j-1 centroid values corresponding to the i-th image block of each of the first j-1 frame images of the video sequence is used as the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images, and the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each of the first j-1 frame images is used as the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images;
  • when j is 2, the variance of the j-1 centroid values corresponding to the i-th image block of each of the first j-1 frame images is a first preset value, and the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each of the first j-1 frame images is a second preset value or the coding distortion of the first frame image of the video sequence.
  • the acquiring unit 501 is configured to determine whether the horizontal component of the minimum motion vector among the motion vectors of the sub-image blocks of the i-th image block of the j-th frame image is smaller than a first preset threshold and the vertical component of the minimum motion vector is smaller than a second preset threshold; when the horizontal component is smaller than the first preset threshold and the vertical component is smaller than the second preset threshold, determine that the i-th image block of the j-th frame image belongs to the background area; and when the horizontal component is not smaller than the first preset threshold or the vertical component is not smaller than the second preset threshold, determine that the i-th image block of the j-th frame image does not belong to the background area.
  • the acquiring unit 501 is configured to acquire the centroid value and the coding distortion of the i-th image block of the j-th frame image; calculate, according to the centroid value of the i-th image block of the j-th frame image and B_{j,i}, the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images; and calculate, according to the coding distortion of the i-th image block of the j-th frame image and B_{j,i}, the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images.
  • the acquiring unit 501 is configured to calculate, according to formula (9) and formula (10) above, the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images.
  • the obtaining unit 501 is configured to calculate an average coding distortion of j coding distortions corresponding to the ith image block of each frame image in the previous j frame image by using the above formula (11).
  • the second determining unit 503 is configured to calculate the first probability using the above formula (15).
  • the second determining unit 503 is configured to calculate the first probability using the above formula (14).
  • the second determining unit 503 is configured to calculate a QP for encoding the first background frame using Equation (16) above.
  • the second determining unit 503 is configured to calculate a QP for encoding the first background frame using equation (17) above.
  • the first determining unit 502 is configured to iterate over the first K frame images by using formula (13) above, to obtain a target iterative pixel value calculated according to the pixel value of the K-th frame image, and to use the target iterative pixel value as the pixel value of the first background frame.
  • the acquiring unit 501 is further configured to acquire a second statistical feature of the K+1 frame image of the video sequence to the K+L frame image of the video sequence;
  • the first determining unit 502 is further configured to determine a second background frame, where the second background frame is used to provide a reference when the (K+L+1)-th to (K+N)-th frame images of the video sequence are encoded, N>L+1, and N is a positive integer;
  • the second determining unit 503 is further configured to: count the number of times the i-th image block of the first background frame is used as a reference block, where the position of the K image blocks corresponds to the position of the i-th image block of the first background frame; calculate a probability prediction error according to that number of times and the first probability; calculate, according to the second statistical feature acquired by the acquiring unit 501, a second probability that the (K+L+1)-th to (K+N)-th frame images of the video sequence refer to the second background frame determined by the first determining unit 502; calculate, according to the second probability and the probability prediction error, a third probability that each of the (K+L+1)-th to (K+N)-th frame images of the video sequence refers to the second background frame; and calculate, according to the third probability, a QP for encoding the second background frame;
  • the first encoding unit 504 is further configured to encode the second background frame according to the QP determined by the second determining unit 503 for encoding the second background frame, to obtain a second background long-term reference frame;
  • the second encoding unit 505 is further configured to encode the (K+L+1)-th to (K+N)-th frame images of the video sequence according to the second background long-term reference frame obtained by the first encoding unit 504.
  • an embodiment of the video decoding apparatus in the embodiment of the present invention includes:
  • the image decoding unit 601 is configured to decode the pre-K frame image in the video sequence, where K is a positive integer;
  • a background frame decoding unit 602 configured to decode a first background frame, where the first background frame is used to provide a reference when decoding the K+1th frame image of the video sequence to the K+L frame image of the video sequence, L is a positive integer;
  • the image decoding unit 601 is further configured to use the first background frame obtained by the background frame decoding unit 602 as a first long-term reference frame to decode the (K+1)-th to (K+L)-th frame images of the video sequence, where L is a positive integer.
  • the background frame decoding unit 602 is further configured to decode a second background frame, where the second background frame is used to provide a reference when the (K+L+1)-th to (K+N)-th frame images of the video sequence are decoded, N>L+1, and N is a positive integer;
  • the image decoding unit 601 is further configured to use the second background frame obtained by the background frame decoding unit 602 as a second long-term reference frame to decode the (K+L+1)-th to (K+N)-th frame images of the video sequence.
  • FIG. 7 is a schematic diagram of a video encoding apparatus 700 according to an embodiment of the present invention.
  • the video encoding apparatus 700 may include at least one bus 701, at least one processor 702 coupled to the bus 701, and at least one memory 703 coupled to the bus 701.
  • the processor 702 calls, through the bus 701, the code stored in the memory 703 to: acquire a first statistical feature of the first K frame images of the video sequence, where K is a positive integer; determine a first background frame, where the first background frame is used to provide a reference when the (K+1)-th to (K+L)-th frame images of the video sequence are encoded, and L is a positive integer; determine, according to the first statistical feature, a quantization parameter QP for encoding the first background frame; encode the first background frame according to that QP to obtain a first background long-term reference frame; and encode the (K+1)-th to (K+L)-th frame images of the video sequence according to the first background long-term reference frame.
  • the processor 702 calls the code stored in the memory 703 via the bus 701 to specifically: calculate, according to the first statistical feature, a first probability that image blocks of each of the (K+1)-th to (K+L)-th frame images of the video sequence refer to the first background frame; and calculate, according to the first probability, the QP for encoding the first background frame.
  • the first statistical feature includes the variance of K centroid values corresponding to K image blocks and the average coding distortion of K coding distortions corresponding to the K image blocks, where the K image blocks are the image blocks located at the same position in each of the first K frame images, the variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the mean of the sequence formed by the K coding distortions.
  • the K image blocks are the i-th image blocks of each of the first K frame images, and i is a positive integer;
  • the processor 702 calls the code stored in the memory 703 via the bus 701 to specifically use:
  • the first j image blocks are the i-th image blocks of each of the first j frame images, and the variance of the j centroid values and the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images are calculated according to B_{j,i};
  • B_{1,i} is 1 when the i-th image block of the first frame image of the video sequence belongs to the background area, and B_{1,i} is 0 when it does not;
  • when the i-th image block of the j-th frame image does not belong to the background area, the variance of the j-1 centroid values corresponding to the i-th image block of each of the first j-1 frame images of the video sequence is used as the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images, and the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each of the first j-1 frame images is used as the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images;
  • when j is 2, the variance of the j-1 centroid values corresponding to the i-th image block of each of the first j-1 frame images is a first preset value, and the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each of the first j-1 frame images is a second preset value or the coding distortion of the first frame image of the video sequence.
  • processor 702 calls the code stored in memory 703 via bus 701 for:
  • the horizontal component of the minimum motion vector is smaller than a first preset threshold, and the vertical component of the minimum motion vector is smaller than a second preset threshold;
  • the processor 702 calls the code stored in the memory 703 via the bus 701 to specifically:
  • processor 702 calls the code stored in memory 703 via bus 701 to specifically:
  • the variance of the j centroid values corresponding to the i-th image block of each frame image in the previous j-frame image is calculated by the above formula (9) and formula (10).
  • processor 702 calls the code stored in memory 703 via bus 701 to specifically:
  • the average coding distortion of the j coding distortions corresponding to the i-th image block of each frame image in the previous j-frame image is calculated by the above formula (11).
  • processor 702 calls the code stored in memory 703 via bus 701 to specifically:
  • the first probability is calculated using the above formula (15).
  • the processor 702 calls the code stored in the memory 703 via the bus 701 to specifically:
  • the first probability is calculated using equation (14) above.
  • processor 702 calls the code stored in memory 703 via bus 701 to specifically:
  • the QP used to encode the first background frame is calculated using equation (16) above.
  • processor 702 calls the code stored in memory 703 via bus 701 to specifically:
  • the QP used to encode the first background frame is calculated using equation (17) above.
  • the processor 702 calls the code stored in the memory 703 via the bus 701 to specifically determine the first background frame:
  • the first K frame images are iterated by using formula (13) above, to obtain a target iterative pixel value calculated according to the pixel value of the K-th frame image, and the target iterative pixel value is used as the pixel value of the first background frame.
  • after acquiring the first statistical feature of the first K frame images of the video sequence, the processor 702 further calls the code stored in the memory 703 via the bus 701 to:
  • determine a second background frame, where the second background frame is used to provide a reference when the (K+L+1)-th to (K+N)-th frame images of the video sequence are encoded, N>L+1, and N is a positive integer;
  • the video encoding device 700 can be any device that needs to output and play video, such as a notebook computer, a tablet computer, a personal computer, a mobile phone, and the like.
  • FIG. 8 is a schematic diagram of a video decoding apparatus 800 according to an embodiment of the present invention.
  • the video decoding apparatus 800 may include at least one bus 801, at least one processor 802 coupled to the bus 801, and at least one memory 803 coupled to the bus 801.
  • the processor 802 calls, through the bus 801, the code stored in the memory 803 to: decode the first K frame images of the video sequence, where K is a positive integer; decode a first background frame, where the first background frame is used to provide a reference when the (K+1)-th to (K+L)-th frame images of the video sequence are decoded, and L is a positive integer; and use the first background frame as a first long-term reference frame to decode the (K+1)-th to (K+L)-th frame images of the video sequence.
  • the processor 802 is further configured to: decode a second background frame, where the second background frame is used to provide a reference when the (K+L+1)-th to (K+N)-th frame images of the video sequence are decoded, N>L+1, and N is a positive integer; and use the second background frame as a second long-term reference frame to decode the (K+L+1)-th to (K+N)-th frame images of the video sequence.
  • the video decoding device 800 can be any device that needs to output and play video, such as a notebook computer, a tablet computer, a personal computer, a mobile phone, and the like.
  • the embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium can store a program, and the program includes some or all of the steps of at least the video encoding method or the video decoding method described in the foregoing method embodiments.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the present invention essentially, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.

Abstract

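An image encoding and decoding method, and a video encoding and decoding method and apparatus. The image encoding method includes: acquiring a first statistical feature of the first K frame images of a video sequence; determining a first background frame; determining, according to the first statistical feature, a QP for encoding the first background frame; encoding the first background frame according to the QP to obtain a first background long-term reference frame; and encoding the (K+1)-th to (K+L)-th frame images of the video sequence according to the first background long-term reference frame. Because the QP used to encode the background long-term reference frame, which serves as the reference when the (K+1)-th to (K+L)-th frame images are encoded, is determined from a first statistical feature tied to the video content of the first K frame images, overall video coding quality is improved.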

Description

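Video encoding and decoding method and apparatus
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a video encoding and decoding method and apparatus.
Background
With the rapid development of the Internet and the growing richness of people's material and cultural life, demand for video applications on the Internet, and for high-definition video in particular, keeps increasing. High-definition video carries a very large amount of data, so before it can be transmitted over the bandwidth-limited Internet, the first problem to solve is its compression coding.
Video has strong temporal correlation: two adjacent frame images contain many similar image blocks. Therefore, in inter prediction for video coding, a motion search is usually performed in reconstructed frames for the current block, and the block that best matches the current block is used as the reference block. The High Efficiency Video Coding (HEVC) standard has two main types of reference frames: short-term reference frames and long-term reference frames. A short-term reference frame is generally a reconstructed frame relatively close to the current frame and carries short-term features of the current frame, such as a moving foreground of similar shape. A long-term reference frame may be a reconstructed frame relatively far from the current frame, or a reference frame obtained by synthesis, and carries long-term features of a segment of the video sequence, such as the background frame of that segment.
In HEVC, for the low-delay configuration with a group of pictures (GOP) length of 4, short-term reference frames currently use a hierarchical quantization parameter (QP) setting, whereas no particularly effective method currently exists for setting the QP of long-term reference frames.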
One existing technique encodes the background frame according to formula (1) below to obtain a background long-term reference frame:
Figure PCTCN2015085586-appb-000001
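where
Figure PCTCN2015085586-appb-000002
denotes the floor (round-down) operation, QP_LT is the QP for encoding the background frame, and QP_inter is the minimum QP for encoding the short-term reference P or B frames. In other words, when the background frame is encoded, its QP is derived from the quantization parameter used for the short-term reference P or B frames.
This technique does not consider the effect of changes in the video content when determining the quantization parameter for the background frame; it determines that quantization parameter only from the quantization parameter range of the short-term reference frames, which results in low encoding quality.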
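Summary
Embodiments of the present invention provide a video encoding and decoding method and apparatus that effectively improve overall video coding quality.
A first aspect of the embodiments of the present invention provides a video encoding method, including:
acquiring a first statistical feature of the first K frame images of a video sequence, where K is a positive integer;
determining a first background frame, where the first background frame is used to provide a reference when the (K+1)-th to (K+L)-th frame images of the video sequence are encoded, and L is a positive integer;
determining, according to the first statistical feature, a quantization parameter QP for encoding the first background frame;
encoding the first background frame according to the QP to obtain a first background long-term reference frame; and
encoding the (K+1)-th to (K+L)-th frame images of the video sequence according to the first background long-term reference frame.
With reference to the first aspect, in a first possible implementation of the first aspect,
the determining, according to the first statistical feature, a quantization parameter QP for encoding the first background frame includes:
calculating, according to the first statistical feature, a first probability that image blocks of each of the (K+1)-th to (K+L)-th frame images of the video sequence refer to the first background frame; and
calculating, according to the first probability, the QP for encoding the first background frame.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect,
the first statistical feature includes the variance of K centroid values corresponding to K image blocks and the average coding distortion of K coding distortions corresponding to the K image blocks, where the K image blocks are the image blocks located at the same position in each of the first K frame images, the variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the mean of the sequence formed by the K coding distortions.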
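With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect,
the K image blocks are the i-th image blocks of each of the first K frame images, and i is a positive integer;
the acquiring a first statistical feature of the first K frame images of the video sequence includes:
performing iterative calculation in order from the second frame image to the K-th frame image of the first K frame images, to obtain the variance of the K centroid values corresponding to the i-th image block of each of the first K frame images and the average coding distortion of the K coding distortions corresponding to the i-th image block of each of the first K frame images, where the iterative calculation includes:
determining whether the i-th image block of the j-th frame image of the video sequence belongs to a background area;
when the i-th image block of the j-th frame image of the video sequence belongs to the background area, obtaining, according to B_{j,i} = B_{j-1,i} + 1, the number B_{j,i} of the first j image blocks that belong to the background area, where the first j image blocks are the i-th image blocks of each of the first j frame images of the video sequence, and calculating, according to B_{j,i}, the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images and the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images, where K ≥ j ≥ 2, B_{1,i} is 1 when the i-th image block of the first frame image of the video sequence belongs to the background area, and B_{1,i} is 0 when it does not; and
when the i-th image block of the j-th frame image of the video sequence does not belong to the background area, obtaining, according to B_{j,i} = B_{j-1,i}, the number B_{j,i} of the first j image blocks that belong to the background area, where the first j image blocks are the i-th image blocks of each of the first j frame images, using the variance of the j-1 centroid values corresponding to the i-th image block of each of the first j-1 frame images of the video sequence as the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images, and using the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each of the first j-1 frame images as the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images, where, when j is 2, the variance of the j-1 centroid values corresponding to the i-th image block of each of the first j-1 frame images is a first preset value, and the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each of the first j-1 frame images is a second preset value or the coding distortion of the first frame image of the video sequence.
With reference to the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect,
the determining whether the i-th image block of the j-th frame of the video sequence belongs to a background area includes:
determining whether the horizontal component of the minimum motion vector among the motion vectors of the sub-image blocks of the i-th image block of the j-th frame image is smaller than a first preset threshold and the vertical component of the minimum motion vector is smaller than a second preset threshold;
when the horizontal component is smaller than the first preset threshold and the vertical component is smaller than the second preset threshold, determining that the i-th image block of the j-th frame image belongs to the background area; and
when the horizontal component is not smaller than the first preset threshold or the vertical component is not smaller than the second preset threshold, determining that the i-th image block of the j-th frame image does not belong to the background area.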
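As an illustration only, the background test described above can be sketched in Python. The function name `is_background_block` and its parameter names are hypothetical. The sketch also assumes that the "minimum motion vector" is the sub-block vector with the smallest magnitude and that components are compared by absolute value; the text does not state either point explicitly.

```python
from typing import Iterable, Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical) components


def is_background_block(sub_block_mvs: Iterable[MotionVector],
                        thr_x: int, thr_y: int) -> bool:
    """Classify one image block as background from its sub-block motion vectors."""
    mvs = list(sub_block_mvs)
    if not mvs:
        raise ValueError("block has no sub-block motion vectors")
    # Assumption: "minimum" means smallest squared magnitude.
    min_mv = min(mvs, key=lambda mv: mv[0] ** 2 + mv[1] ** 2)
    # Assumption: components are compared by absolute value.
    # Background only if BOTH components are below their thresholds.
    return abs(min_mv[0]) < thr_x and abs(min_mv[1]) < thr_y
```

A block is therefore treated as background as soon as one of its sub-blocks is nearly static in both directions.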
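With reference to the third or fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect,
the calculating, according to B_{j,i}, the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images and the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images includes:
acquiring the centroid value and the coding distortion of the i-th image block of the j-th frame image;
calculating, according to the centroid value of the i-th image block of the j-th frame image and B_{j,i}, the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images; and
calculating, according to the coding distortion of the i-th image block of the j-th frame image and B_{j,i}, the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect,
the calculating, according to the centroid value of the i-th image block of the j-th frame image and B_{j,i}, the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images includes:
calculating the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images by using the following formulas:
Figure PCTCN2015085586-appb-000003
Figure PCTCN2015085586-appb-000004
where μ_{j,i} is the mean of the centroid values of the first j image blocks, μ_{j-1,i} is the mean of the centroid values of the first j-1 image blocks, ω_{j,i} is the centroid value of the i-th image block of the j-th frame,
Figure PCTCN2015085586-appb-000005
is the variance of the centroid values of the first j image blocks, and
Figure PCTCN2015085586-appb-000006
is the variance of the centroid values of the first j-1 image blocks, the first j-1 image blocks being the i-th image blocks of each of the first j-1 frame images.
With reference to the fifth or sixth possible implementation of the first aspect, in a seventh possible implementation of the first aspect,
the calculating, according to the coding distortion of the i-th image block of the j-th frame image and B_{j,i}, the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images includes:
calculating the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images by using the following formula:
Figure PCTCN2015085586-appb-000007
where
Figure PCTCN2015085586-appb-000008
is the coding distortion of the i-th image block of the j-th frame,
Figure PCTCN2015085586-appb-000009
is the average coding distortion of the first j image blocks, and
Figure PCTCN2015085586-appb-000010
is the average coding distortion of the first j-1 image blocks.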
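The per-block iteration over the frames can be sketched as follows. The update formulas themselves appear only as figures in this text, so the Python sketch below substitutes a standard incremental mean and variance update over the B_{j,i} background occurrences; it is a stand-in under that assumption, not the formulas of this embodiment. `BlockStats` and `update_block_stats` are hypothetical names, and the sketch starts from an all-zero state instead of the preset values mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockStats:
    count: int = 0          # B_{j,i}: times block i was background so far
    mean: float = 0.0       # running mean of centroid values
    var: float = 0.0        # running (population) variance of centroid values
    avg_dist: float = 0.0   # running average coding distortion


def update_block_stats(stats: BlockStats, is_background: bool,
                       centroid: float, distortion: float) -> BlockStats:
    """Advance the statistics of block i by one frame."""
    if not is_background:
        # B_{j,i} = B_{j-1,i}: carry the previous variance and average forward.
        return stats
    b = stats.count + 1                      # B_{j,i} = B_{j-1,i} + 1
    delta = centroid - stats.mean
    mean = stats.mean + delta / b
    # Assumed stand-in: incremental population-variance update.
    var = ((b - 1) * stats.var + delta * (centroid - mean)) / b
    avg_dist = stats.avg_dist + (distortion - stats.avg_dist) / b
    return BlockStats(b, mean, var, avg_dist)
```

Frames in which the block is not background leave the statistics unchanged, which mirrors the carry-forward rule in the third possible implementation.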
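With reference to the third to seventh possible implementations of the first aspect, in an eighth possible implementation of the first aspect,
the calculating, according to the first statistical feature, a first probability that image blocks of each of the (K+1)-th to (K+L)-th frame images of the video sequence refer to the first background frame includes:
calculating the first probability by using the following formula:
Figure PCTCN2015085586-appb-000011
where P_i is the probability that the i-th image block of each of the (K+1)-th to (K+L)-th frame images of the video sequence refers to the i-th image block of the first background frame,
Figure PCTCN2015085586-appb-000012
Figure PCTCN2015085586-appb-000013
is the average coding distortion of the K image blocks, and
Figure PCTCN2015085586-appb-000014
is the variance of the K centroid values corresponding to the K image blocks; the K image blocks are the i-th image blocks of each of the first K frame images, their position corresponds to the position of the i-th image block of each of the (K+1)-th to (K+L)-th frame images and to the position of the i-th image block of the first background frame, and i is a positive integer.
With reference to the second to seventh possible implementations of the first aspect, in a ninth possible implementation of the first aspect,
the calculating, according to the first statistical feature, a first probability that image blocks of each of the (K+1)-th to (K+L)-th frame images of the video sequence refer to the first background frame includes:
calculating the first probability by using the following formula:
Figure PCTCN2015085586-appb-000015
where P_i is the probability that the i-th image block of each of the (K+1)-th to (K+L)-th frame images of the video sequence refers to the i-th image block of the first background frame,
Figure PCTCN2015085586-appb-000016
Figure PCTCN2015085586-appb-000017
is the average coding distortion of the K image blocks, and
Figure PCTCN2015085586-appb-000018
is the variance of the K centroid values corresponding to the K image blocks; the K image blocks are the i-th image blocks of each of the first K frame images, their position corresponds to the position of the i-th image block of each of the (K+1)-th to (K+L)-th frame images and to the position of the i-th image block of the first background frame, and i is a positive integer.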
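With reference to the first to ninth possible implementations of the first aspect, in a tenth possible implementation of the first aspect,
the calculating, according to the first probability, the QP for encoding the first background frame includes:
calculating the QP for encoding the first background frame by using the following formula:
Figure PCTCN2015085586-appb-000019
where QP_i is the QP for encoding the i-th image block of the first background frame, P_i is the probability that the i-th image block of each of the (K+1)-th to (K+L)-th frame images of the video sequence refers to the i-th image block of the first background frame, P_T is a preset value, QP_min is the minimum QP value allowed by the current encoding, and QP_mode is the minimum of the QPs used by those of the first K frame images of the video sequence that use the same coding mode as the first background frame; the position of the i-th image block of each of the (K+1)-th to (K+L)-th frame images corresponds to the position of the i-th image block of the first background frame.
With reference to the third to ninth possible implementations of the first aspect, in an eleventh possible implementation of the first aspect,
the calculating, according to the first probability, the QP for encoding the first background frame includes:
calculating the QP for encoding the first background frame by using the following formula:
Figure PCTCN2015085586-appb-000020
where QP_i is the QP for encoding the i-th image block of the first background frame, P_i is the probability that the i-th image block of each of the (K+1)-th to (K+L)-th frame images of the video sequence refers to the i-th image block of the first background frame, P_T is a preset value, QP_min is the minimum QP value allowed by the current encoding, and QP_mode is the minimum of the QPs used by those of the first K frame images of the video sequence that use the same coding mode as the first background frame; the position of the i-th image block of each of the (K+1)-th to (K+L)-th frame images corresponds to the position of the i-th image block of the first background frame.
With reference to the third to eleventh possible implementations of the first aspect, in a twelfth possible implementation of the first aspect,
the determining a first background frame includes:
iterating over the first K frame images by using the following formula, to obtain a target iterative pixel value calculated according to the pixel value of the K-th frame image, and using the target iterative pixel value as the pixel value of the first background frame: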
Figure PCTCN2015085586-appb-000021
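where P_{j,x,y} is the iterative pixel value calculated from the pixel value at position (x, y) of the j-th frame of the video sequence, P_{j,x,y}' is the iterative pixel value calculated from the pixel value at position (x, y) of the (j-1)-th frame of the video sequence, and o_{j,x,y} is the pixel value at position (x, y) in the j-th frame image of the video sequence.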
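The iteration formula itself is available only as a figure here. Purely to illustrate the idea of building a background frame by iterating a per-pixel value over the first K frames, the Python sketch below uses a sign-based (sigma-delta style) update, in which each iterative pixel value moves one step toward the current frame's pixel. This update rule is an assumption chosen for illustration and may differ from the formula of this embodiment; `estimate_background` is a hypothetical name.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # frame[y][x] is the pixel value o_{j,x,y}


def estimate_background(frames: Sequence[Frame]) -> List[List[int]]:
    """Iterate over the first K frames and return the final iterative pixel values."""
    if not frames:
        raise ValueError("at least one frame is required")
    # Start from the first frame's pixel values.
    bg = [list(row) for row in frames[0]]
    for frame in frames[1:]:
        for y, row in enumerate(frame):
            for x, o in enumerate(row):
                # Assumed stand-in rule: step the estimate toward the observed pixel.
                if o > bg[y][x]:
                    bg[y][x] += 1
                elif o < bg[y][x]:
                    bg[y][x] -= 1
    return bg
```

The values left after the K-th frame play the role of the target iterative pixel values that are used as the pixel values of the first background frame.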
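With reference to the first to twelfth possible implementations of the first aspect, in a thirteenth possible implementation of the first aspect,
after the acquiring a first statistical feature of the first K frame images of the video sequence, the method further includes:
acquiring a second statistical feature of the (K+1)-th to (K+L)-th frame images of the video sequence;
determining a second background frame, where the second background frame is used to provide a reference when the (K+L+1)-th to (K+N)-th frame images of the video sequence are encoded, N>L+1, and N is a positive integer;
counting the number of times the i-th image block of the first background frame is used as a reference block, where the position of the K image blocks corresponds to the position of the i-th image block of the first background frame;
calculating a probability prediction error according to the number of times the i-th image block of the first background frame is used as a reference block and the first probability;
calculating, according to the second statistical feature, a second probability that the (K+L+1)-th to (K+N)-th frame images of the video sequence refer to the second background frame;
calculating, according to the second probability and the probability prediction error, a third probability that each of the (K+L+1)-th to (K+N)-th frame images of the video sequence refers to the second background frame;
calculating, according to the third probability, a QP for encoding the second background frame;
encoding the second background frame according to the QP for encoding the second background frame, to obtain a second background long-term reference frame; and
encoding the (K+L+1)-th to (K+N)-th frame images of the video sequence according to the second background long-term reference frame.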
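The text above does not give the expressions for the probability prediction error or the third probability. One simple way to realize the described feedback, shown here only as an assumption, is to compare the observed reference frequency of block i over the L frames with the predicted first probability and add the difference to the second probability. The function names and the additive correction are hypothetical.

```python
def probability_prediction_error(ref_count: int, num_frames: int,
                                 first_probability: float) -> float:
    """Observed reference frequency minus the predicted first probability."""
    if num_frames <= 0:
        raise ValueError("num_frames (L) must be positive")
    observed = ref_count / num_frames
    return observed - first_probability


def third_probability(second_probability: float, error: float) -> float:
    """Assumed correction: shift the second probability by the error, clamped to [0, 1]."""
    return min(1.0, max(0.0, second_probability + error))
```

For example, if block i of the first background frame was referenced in 8 of L = 10 frames while the first probability was 0.6, the error is about 0.2, and a second probability of 0.5 is corrected to about 0.7.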
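A second aspect of the embodiments of the present invention provides a video decoding method, including:
decoding the first K frame images of a video sequence, where K is a positive integer;
decoding a first background frame, where the first background frame is used to provide a reference when the (K+1)-th to (K+L)-th frame images of the video sequence are decoded, and L is a positive integer; and
decoding the (K+1)-th to (K+L)-th frame images of the video sequence by using the first background frame as a first long-term reference frame, where L is a positive integer.
With reference to the second aspect, in a first possible implementation of the second aspect,
after the decoding the (K+1)-th to (K+L)-th frame images of the video sequence by using the first background frame as a first long-term reference frame, the method further includes:
decoding a second background frame, where the second background frame is used to provide a reference when the (K+L+1)-th to (K+N)-th frame images of the video sequence are decoded, N>L+1, and N is a positive integer; and
decoding the (K+L+1)-th to (K+N)-th frame images of the video sequence by using the second background frame as a second long-term reference frame.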
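The decoding order above implies a simple mapping from a frame index to the background frame that serves as its long-term reference. The Python sketch below makes that mapping explicit; the function name and the string labels are hypothetical, and frames are numbered from 1 as in the text.

```python
from typing import Optional


def long_term_reference(n: int, K: int, L: int, N: int) -> Optional[str]:
    """Return which background frame is the long-term reference for frame n."""
    if K < 1 or L < 1 or N <= L + 1:
        raise ValueError("require K >= 1, L >= 1 and N > L + 1")
    if n < 1 or n > K + N:
        raise ValueError("frame index outside the range covered here")
    if n <= K:
        return None              # first K frames: no background frame yet
    if n <= K + L:
        return "background_1"    # frames K+1 .. K+L
    return "background_2"        # frames K+L+1 .. K+N
```

With K = 4, L = 3 and N = 6, frames 5 to 7 use the first background frame and frames 8 to 10 use the second.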
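A third aspect of the embodiments of the present invention provides a video encoding apparatus, including:
an acquiring unit, configured to acquire a first statistical feature of the first K frame images of a video sequence, where K is a positive integer;
a first determining unit, configured to determine a first background frame, where the first background frame is used to provide a reference when the (K+1)-th to (K+L)-th frame images of the video sequence are encoded, and L is a positive integer;
a second determining unit, configured to determine, according to the first statistical feature acquired by the acquiring unit, a quantization parameter QP for encoding the first background frame determined by the first determining unit;
a first encoding unit, configured to encode the first background frame according to the QP determined by the second determining unit, to obtain a first background long-term reference frame; and
a second encoding unit, configured to encode the (K+1)-th to (K+L)-th frame images of the video sequence according to the first background long-term reference frame obtained by the first encoding unit.
With reference to the third aspect, in a first possible implementation of the third aspect,
the second determining unit is configured to:
calculate, according to the first statistical feature, a first probability that image blocks of each of the (K+1)-th to (K+L)-th frame images of the video sequence refer to the first background frame; and
calculate, according to the first probability, the QP for encoding the first background frame.
With reference to the third aspect or the first possible implementation of the third aspect, in a second possible implementation of the third aspect,
the first statistical feature includes the variance of K centroid values corresponding to K image blocks and the average coding distortion of K coding distortions corresponding to the K image blocks, where the K image blocks are the image blocks located at the same position in each of the first K frame images, the variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the mean of the sequence formed by the K coding distortions.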
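With reference to the second possible implementation of the third aspect, in a third possible implementation of the third aspect,
the K image blocks are the i-th image blocks of each of the first K frame images, and i is a positive integer;
the acquiring unit is configured to:
perform iterative calculation in order from the second frame image to the K-th frame image of the first K frame images, to obtain the variance of the K centroid values corresponding to the i-th image block of each of the first K frame images and the average coding distortion of the K coding distortions corresponding to the i-th image block of each of the first K frame images, where the iterative calculation includes:
determining whether the i-th image block of the j-th frame image of the video sequence belongs to a background area;
when the i-th image block of the j-th frame image of the video sequence belongs to the background area, obtaining, according to B_{j,i} = B_{j-1,i} + 1, the number B_{j,i} of the first j image blocks that belong to the background area, where the first j image blocks are the i-th image blocks of each of the first j frame images of the video sequence, and calculating, according to B_{j,i}, the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images and the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images, where K ≥ j ≥ 2, B_{1,i} is 1 when the i-th image block of the first frame image of the video sequence belongs to the background area, and B_{1,i} is 0 when it does not; and
when the i-th image block of the j-th frame image of the video sequence does not belong to the background area, obtaining, according to B_{j,i} = B_{j-1,i}, the number B_{j,i} of the first j image blocks that belong to the background area, where the first j image blocks are the i-th image blocks of each of the first j frame images, using the variance of the j-1 centroid values corresponding to the i-th image block of each of the first j-1 frame images of the video sequence as the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images, and using the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each of the first j-1 frame images as the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images, where, when j is 2, the variance of the j-1 centroid values corresponding to the i-th image block of each of the first j-1 frame images is a first preset value, and the average coding distortion of the j-1 coding distortions corresponding to the i-th image block of each of the first j-1 frame images is a second preset value or the coding distortion of the first frame image of the video sequence.
With reference to the third possible implementation of the third aspect, in a fourth possible implementation of the third aspect,
the acquiring unit is configured to: determine whether the horizontal component of the minimum motion vector among the motion vectors of the sub-image blocks of the i-th image block of the j-th frame image is smaller than a first preset threshold and the vertical component of the minimum motion vector is smaller than a second preset threshold; when the horizontal component is smaller than the first preset threshold and the vertical component is smaller than the second preset threshold, determine that the i-th image block of the j-th frame image belongs to the background area; and when the horizontal component is not smaller than the first preset threshold or the vertical component is not smaller than the second preset threshold, determine that the i-th image block of the j-th frame image does not belong to the background area.
With reference to the third or fourth possible implementation of the third aspect, in a fifth possible implementation of the third aspect,
the acquiring unit is configured to: acquire the centroid value and the coding distortion of the i-th image block of the j-th frame image; calculate, according to the centroid value of the i-th image block of the j-th frame image and B_{j,i}, the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images; and calculate, according to the coding distortion of the i-th image block of the j-th frame image and B_{j,i}, the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images.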
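With reference to the fifth possible implementation of the third aspect, in a sixth possible implementation of the third aspect,
the acquiring unit is configured to calculate the variance of the j centroid values corresponding to the i-th image block of each of the first j frame images by using the following formulas:
Figure PCTCN2015085586-appb-000022
Figure PCTCN2015085586-appb-000023
where μ_{j,i} is the mean of the centroid values of the first j image blocks, μ_{j-1,i} is the mean of the centroid values of the first j-1 image blocks, ω_{j,i} is the centroid value of the i-th image block of the j-th frame,
Figure PCTCN2015085586-appb-000024
is the variance of the centroid values of the first j image blocks, and
Figure PCTCN2015085586-appb-000025
is the variance of the centroid values of the first j-1 image blocks, the first j-1 image blocks being the i-th image blocks of each of the first j-1 frame images.
With reference to the fifth or sixth possible implementation of the third aspect, in a seventh possible implementation of the third aspect,
the acquiring unit is configured to calculate the average coding distortion of the j coding distortions corresponding to the i-th image block of each of the first j frame images by using the following formula:
Figure PCTCN2015085586-appb-000026
where
Figure PCTCN2015085586-appb-000027
is the coding distortion of the i-th image block of the j-th frame,
Figure PCTCN2015085586-appb-000028
is the average coding distortion of the first j image blocks, and
Figure PCTCN2015085586-appb-000029
is the average coding distortion of the first j-1 image blocks.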
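With reference to the third to seventh possible implementations of the third aspect, in an eighth possible implementation of the third aspect,
the second determining unit is configured to calculate the first probability by using the following formula:
Figure PCTCN2015085586-appb-000030
where P_i is the probability that the i-th image block of each of the (K+1)-th to (K+L)-th frame images of the video sequence refers to the i-th image block of the first background frame,
Figure PCTCN2015085586-appb-000031
Figure PCTCN2015085586-appb-000032
is the average coding distortion of the K image blocks, and
Figure PCTCN2015085586-appb-000033
is the variance of the K centroid values corresponding to the K image blocks; the K image blocks are the i-th image blocks of each of the first K frame images, their position corresponds to the position of the i-th image block of each of the (K+1)-th to (K+L)-th frame images and to the position of the i-th image block of the first background frame, and i is a positive integer.
With reference to the second to seventh possible implementations of the third aspect, in a ninth possible implementation of the third aspect,
the second determining unit is configured to calculate the first probability by using the following formula:
Figure PCTCN2015085586-appb-000034
where P_i is the probability that the i-th image block of each of the (K+1)-th to (K+L)-th frame images of the video sequence refers to the i-th image block of the first background frame,
Figure PCTCN2015085586-appb-000035
Figure PCTCN2015085586-appb-000036
is the average coding distortion of the K image blocks, and
Figure PCTCN2015085586-appb-000037
is the variance of the K centroid values corresponding to the K image blocks; the K image blocks are the i-th image blocks of each of the first K frame images, their position corresponds to the position of the i-th image block of each of the (K+1)-th to (K+L)-th frame images and to the position of the i-th image block of the first background frame, and i is a positive integer.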
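With reference to the first to ninth possible implementations of the third aspect, in a tenth possible implementation of the third aspect,
the second determining unit is configured to calculate the QP for encoding the first background frame by using the following formula:
Figure PCTCN2015085586-appb-000038
where QP_i is the QP for encoding the i-th image block of the first background frame, P_i is the probability that the i-th image block of each of the (K+1)-th to (K+L)-th frame images of the video sequence refers to the i-th image block of the first background frame, P_T is a preset value, QP_min is the minimum QP value allowed by the current encoding, and QP_mode is the minimum of the QPs used by those of the first K frame images of the video sequence that use the same coding mode as the first background frame; the position of the i-th image block of each of the (K+1)-th to (K+L)-th frame images corresponds to the position of the i-th image block of the first background frame.
With reference to the third to ninth possible implementations of the third aspect, in an eleventh possible implementation of the third aspect,
the second determining unit is configured to calculate the QP for encoding the first background frame by using the following formula:
Figure PCTCN2015085586-appb-000039
where QP_i is the QP for encoding the i-th image block of the first background frame, P_i is the probability that the i-th image block of each of the (K+1)-th to (K+L)-th frame images of the video sequence refers to the i-th image block of the first background frame, P_T is a preset value, QP_min is the minimum QP value allowed by the current encoding, and QP_mode is the minimum of the QPs used by those of the first K frame images of the video sequence that use the same coding mode as the first background frame; the position of the i-th image block of each of the (K+1)-th to (K+L)-th frame images corresponds to the position of the i-th image block of the first background frame.
With reference to the third to eleventh possible implementations of the third aspect, in a twelfth possible implementation of the third aspect,
the first determining unit is configured to iterate over the first K frame images by using the following formula, to obtain a target iterative pixel value calculated according to the pixel value of the K-th frame image, and to use the target iterative pixel value as the pixel value of the first background frame:
Figure PCTCN2015085586-appb-000040
where P_{j,x,y} is the iterative pixel value calculated from the pixel value at position (x, y) of the j-th frame of the video sequence, P_{j,x,y}' is the iterative pixel value calculated from the pixel value at position (x, y) of the (j-1)-th frame of the video sequence, and o_{j,x,y} is the pixel value at position (x, y) in the j-th frame image of the video sequence.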
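With reference to the first to twelfth possible implementations of the third aspect, in a thirteenth possible implementation of the third aspect,
the acquiring unit is further configured to acquire a second statistical feature of the (K+1)-th to (K+L)-th frame images of the video sequence;
the first determining unit is further configured to determine a second background frame, where the second background frame is used to provide a reference when the (K+L+1)-th to (K+N)-th frame images of the video sequence are encoded, N>L+1, and N is a positive integer;
the second determining unit is further configured to: count the number of times the i-th image block of the first background frame is used as a reference block, where the position of the K image blocks corresponds to the position of the i-th image block of the first background frame; calculate a probability prediction error according to that number of times and the first probability; calculate, according to the second statistical feature acquired by the acquiring unit, a second probability that the (K+L+1)-th to (K+N)-th frame images of the video sequence refer to the second background frame determined by the first determining unit; calculate, according to the second probability and the probability prediction error, a third probability that each of the (K+L+1)-th to (K+N)-th frame images of the video sequence refers to the second background frame; and calculate, according to the third probability, a QP for encoding the second background frame;
the first encoding unit is further configured to encode the second background frame according to the QP determined by the second determining unit for encoding the second background frame, to obtain a second background long-term reference frame; and
the second encoding unit is further configured to encode the (K+L+1)-th to (K+N)-th frame images of the video sequence according to the second background long-term reference frame obtained by the first encoding unit.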
本发明实施例第四方面提供了一种视频解码装置,包括:
图像解码单元,用于解码视频序列中的前K帧图像,K为正整数;
背景帧解码单元,用于解码第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像解码时提供参考,L为正整数;
所述图像解码单元还用于将所述背景帧解码单元得到的所述第一背景帧作为第一长期参考帧解码所述视频序列第K+1帧图像到第K+L帧图像,L为正整数。
结合本发明实施例的第四方面,在本发明实施例的第四方面的第一种可能的实现方式中,
所述背景帧解码单元还用于解码第二背景帧,所述第二背景帧用于在对所述视频序列第K+L+1帧图像到第K+N帧图像解码时提供参考,其中N>L+1,N为正整数;
所述图像解码单元还用于将所述背景帧解码单元得到的所述第二背景帧作为第二长期参考帧解码所述视频序列第K+L+1帧图像到第K+N帧图像。
从以上技术方案可以看出,本发明实施例具有以下优点:
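In the embodiments of the present invention, a first statistical feature of the first K frames of the video sequence is obtained; the QP used to encode the first background frame is determined according to the first statistical feature; the determined first background frame is encoded according to that QP to obtain a first background long-term reference frame; and the (K+1)-th through (K+L)-th frames of the video sequence are encoded according to the first background long-term reference frame. Because the quantization parameter used to encode the background long-term reference frame, which is referenced when the (K+1)-th through (K+L)-th frames are encoded, is determined from the first statistical feature associated with the video content of the first K frames, overall video coding quality is improved.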
附图说明
图1是本发明实施例中视频编码方法的一个实施例示意图;
图2是本发明实施例中视频编码方法的另一个实施例示意图;
图3是本发明实施例中视频编码方法的另一个实施例示意图;
图4是本发明实施例中视频解码方法的一个实施例示意图;
图5是本发明实施例中视频编码装置的一个实施例示意图;
图6是本发明实施例中视频解码装置的一个实施例示意图;
图7是本发明实施例中视频编码装置的另一个实施例示意图;
图8是本发明实施例中视频解码装置的另一个实施例示意图。
具体实施方式
本发明实施例提供了一种视频编解码方法及装置,提高了整体的视频编码质量。
为了使本技术领域的人员更好地理解本发明方案,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分的实施例,而不是全部的实施例。根据本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都应当属于本发明保护的范围。
本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”等(如果存在)是用于区别类似的对象,而不用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
下面先对本发明实施例可能涉及的一些概念进行简单介绍。
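At present, two international organizations are dedicated to developing video coding standards: the Moving Picture Experts Group (MPEG) under the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), and the Video Coding Experts Group (VCEG) of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T). MPEG, established in 1988, is responsible for standards in the multimedia field, which are mainly applied to storage, broadcast television, and streaming media over the Internet or wireless networks. ITU-T mainly develops video coding standards for real-time video communication, such as video telephony and video conferencing. Over the past decades, video coding standards for various applications have been developed, mainly including: the MPEG-1 standard for VCD, the MPEG-2 standard for DVD and DVB, the H.261 and H.263 standards for video conferencing, the MPEG-4 standard that allows coding of arbitrarily shaped objects, the more recent H.264/AVC standard, and the HEVC standard.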
在HEVC标准中,视频序列包括一系列图像(英文:picture)帧,图像帧被进一步划分为切片(英文:slice),slice再被划分为块(英文:block)。视频编码以块为单位,可从picture的左上角位置开始从左到右从上到下一行一行进行编码处理。
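The raster-scan block order described above (left to right, top to bottom, starting at the top-left corner of the picture) can be sketched as follows. This is a minimal illustration only: the function name `raster_scan_blocks`, the fixed square block size, and the clipping of edge blocks are assumptions made for the example and are not taken from the HEVC standard or from this application.

```python
def raster_scan_blocks(width, height, block_size):
    """Yield (x, y, w, h) for each block in raster-scan order.

    Hypothetical helper: real codecs use CTUs and quadtree splits;
    here every block is a fixed-size square, clipped at picture edges.
    """
    for y in range(0, height, block_size):        # top to bottom
        for x in range(0, width, block_size):     # left to right
            w = min(block_size, width - x)        # clip at the right edge
            h = min(block_size, height - y)       # clip at the bottom edge
            yield (x, y, w, h)


blocks = list(raster_scan_blocks(width=40, height=24, block_size=16))
```

The list index of a block in `blocks` plays the role of the block index i used later in the description.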
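In the HEVC standard, there are two main types of reference frames: short-term reference frames and long-term reference frames. A short-term reference frame is generally a reconstructed frame relatively close to the current frame and carries short-term features of the current frame, such as a moving foreground of similar shape. A long-term reference frame may be a reconstructed frame relatively far from the current frame, or a reference frame obtained by synthesis, and carries long-term features of a segment of the video sequence, such as the background frame of that segment.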
YUV:YUV是被欧洲电视系统所采用的一种颜色编码方法(属于PAL),是PAL和SECAM模拟彩色电视制式采用的颜色空间。在现代彩色电视系统中,通常采用三管彩色摄影机或彩色CCD摄影机进行取像,然后把取得的彩色图像信号经分色、分别放大校正后得到RGB,再经过矩阵变换电路得到亮度信号Y和两个色差信号B-Y(即U)、R-Y(即V),最后发送端将亮度和色差三个信号分别进行编码,用同一信道发送出去。这种色彩的表示方法就是所谓的YUV色彩空间表示。采用YUV色彩空间的重要性是它的亮度信号Y和色度信号U、V是分离的。
YUV的原理是把亮度与色度分离,研究证明,人眼对亮度的敏感超过色度。利用这个原理,可以把色度信息减少一点,人眼也无法查觉这一点。其中“Y” 表示明亮度(Luminance或Luma),也就是灰阶值;而“U”和“V”表示的则是色度(Chrominance或Chroma),作用是描述影像色彩及饱和度,用于指定像素的颜色。
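The separation of luminance from the two color-difference signals B-Y (U) and R-Y (V) can be illustrated with a short conversion sketch. The numeric weights below are the commonly cited BT.601 luma weights and analog U/V scale factors; they are an assumption brought in for the example, since this application does not state any coefficients.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to analog-style YUV.

    Assumed coefficients (not from this application): BT.601 luma weights,
    U = 0.492 * (B - Y), V = 0.877 * (R - Y).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance (gray level)
    u = 0.492 * (b - y)                     # scaled B-Y color difference
    v = 0.877 * (r - y)                     # scaled R-Y color difference
    return y, u, v


white = rgb_to_yuv(255, 255, 255)
red = rgb_to_yuv(255, 0, 0)
```

For a gray or white pixel both color differences vanish, which is why the Y component alone is enough to show a gray-scale picture.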
I帧(I frame)又称为内部画面(intra picture),I帧通常是每个GOP的第一个帧,经过适度地压缩,做为随机访问的参考点,可以当成图象。在编码的过程中,部分视频帧序列压缩成为I帧;部分压缩成P帧;还有部分压缩成B帧。I帧法是帧内压缩法,也称为“关键帧”压缩法。I帧法是根据离散余弦变换DCT(Discrete Cosine Transform)的压缩技术,这种算法与JPEG压缩算法类似。采用I帧压缩可达到1/6的压缩比而无明显的压缩痕迹。
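Characteristics of I frames: an I frame is an intra-coded frame compressed using intra prediction; at decoding, the complete picture can be reconstructed from the data in the I frame alone; an I frame is generated without reference to other pictures; an I frame serves as a reference frame for P frames and B frames (its quality directly affects the quality of subsequent frames in the same group); an I frame is the base frame (first frame) of a GOP, and a group contains only one I frame; an I frame does not need motion vectors; an I frame carries a relatively large amount of data.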
P帧(P frame):前向预测编码帧。
P帧的预测与重构:P帧是以I帧为参考帧,在I帧中找出P帧“某点”的预测值和运动矢量,取预测差值和运动矢量一起传送。在接收端根据运动矢量从I帧中找出P帧“某点”的预测值并与差值相加以得到P帧“某点”重建值,从而可得到完整的P帧。
P帧特点:P帧是I帧后面的编码帧;P帧采用运动补偿的方法传送它与前面的帧的差值及运动矢量(预测误差);解码时必须将该帧的预测值与预测误差求和后才能重构完整的P帧图像;P帧属于前向预测的帧间编码。它只参考前面的帧;P帧可以是其后面P帧的参考帧,也可以是其前后的B帧的参考帧;由于P帧是参考帧,它可能造成解码错误的扩散;由于是差值传送,P帧的压缩比较高。
B帧(B frame):双向预测内插编码帧。
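Prediction and reconstruction of B frames: a B frame uses a preceding frame and a following frame as reference frames. The encoder finds the predicted value of a given point in the B frame and two motion vectors, and transmits the prediction difference together with the motion vectors. The receiver uses the motion vectors to find (compute) the predicted value in the two reference frames and adds it to the difference to obtain the reconstructed value of that point, from which the complete B frame can be obtained.
Characteristics of B frames: a B frame is predicted from a preceding frame and a following frame; a B frame transmits the prediction error and motion vectors relative to the preceding and following frames; a B frame is a bidirectionally predicted frame; a B frame has the highest compression ratio, because it reflects only the changes of the moving subject between the two reference frames, so prediction is relatively accurate.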
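The P-frame and B-frame reconstruction described above (fetch a prediction through the motion vector, then add the transmitted difference) can be sketched as follows. All function names are hypothetical, frames are plain nested lists, motion vectors are integer-pel, and the B-frame prediction is taken as the rounded average of the two references, which is one common choice and an assumption here rather than something the text specifies.

```python
def fetch_block(ref, x, y, mv, size):
    """Copy a size x size block from ref at position (x + mv_x, y + mv_y)."""
    mvx, mvy = mv
    return [[ref[y + mvy + r][x + mvx + c] for c in range(size)]
            for r in range(size)]


def reconstruct_p_block(ref, x, y, mv, residual):
    """P block: prediction from one earlier frame plus the residual."""
    size = len(residual)
    pred = fetch_block(ref, x, y, mv, size)
    return [[pred[r][c] + residual[r][c] for c in range(size)]
            for r in range(size)]


def reconstruct_b_block(ref_prev, ref_next, x, y, mv_prev, mv_next, residual):
    """B block: rounded average of two predictions plus the residual."""
    size = len(residual)
    p0 = fetch_block(ref_prev, x, y, mv_prev, size)
    p1 = fetch_block(ref_next, x, y, mv_next, size)
    return [[(p0[r][c] + p1[r][c] + 1) // 2 + residual[r][c]
             for c in range(size)] for r in range(size)]


# Toy 4x4 reference frames for illustration.
ref_a = [[10 * r + c for c in range(4)] for r in range(4)]
ref_b = [[10 * r + c + 2 for c in range(4)] for r in range(4)]
p_block = reconstruct_p_block(ref_a, 1, 1, (1, 0), [[1, -1], [0, 2]])
b_block = reconstruct_b_block(ref_a, ref_b, 1, 1, (1, 0), (0, 0),
                              [[0, 0], [0, 0]])
```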
下面介绍与本发明相关的一些背景技术。
由于视频存在很大的时间相关性,即相邻的两帧图像有很多相似图像块。因而,在视频编码的帧间预测中,对于当前块,往往从已重建帧中进行运动搜索,找到当前块最匹配的块作为参考块,然而,由于遮挡、光线变化、噪声以及照相机移动等问题可能会出现当前块在最接近的一帧图像中无法找到较好的参考块,因此,引入了多参考帧编码技术。如图1所示,当前帧的图像块通过在前k个参考帧分别进行运动搜索找到编码性能最好的图像块。
在多参考帧编码技术中,当某一个图像块在寻找最佳匹配模块时,先从前一个参考帧开始进行运动搜索,在一定的搜索范围内寻找具有最小率失真代价的最佳运动矢量,再继续下一个参考帧的运动搜索,直到完成全部参考帧的运动搜索之后,才能确定最后的运动矢量及其所采用的参考帧。
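The multi-reference search loop described above can be sketched as follows: each reference frame is searched in turn within a fixed range, and the final motion vector and reference index are decided only after all reference frames have been searched. To keep the sketch short, the sum of absolute differences (SAD) stands in for the rate-distortion cost named in the text; that substitution, the integer-pel full search, and all function names are assumptions made for illustration.

```python
def sad(cur, ref, x, y, size):
    """Sum of absolute differences between cur and the ref block at (x, y)."""
    return sum(abs(cur[r][c] - ref[y + r][x + c])
               for r in range(size) for c in range(size))


def search_reference(cur_block, ref, x0, y0, search_range):
    """Full search in one reference frame; returns (cost, (dx, dy)) or None."""
    size = len(cur_block)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = x0 + dx, y0 + dy
            # Skip candidates that fall outside the reference picture.
            if 0 <= x <= width - size and 0 <= y <= height - size:
                cost = sad(cur_block, ref, x, y, size)
                if best is None or cost < best[0]:
                    best = (cost, (dx, dy))
    return best


def multi_reference_search(cur_block, refs, x0, y0, search_range):
    """Search every reference frame; returns (cost, ref_index, mv)."""
    best = None
    for idx, ref in enumerate(refs):
        result = search_reference(cur_block, ref, x0, y0, search_range)
        if result is not None and (best is None or result[0] < best[0]):
            best = (result[0], idx, result[1])
    return best


flat = [[0] * 6 for _ in range(6)]
ramp = [[6 * r + c for c in range(6)] for r in range(6)]
current = [[15, 16], [21, 22]]          # equals the ramp block at x=3, y=2
match = multi_reference_search(current, [flat, ramp], 2, 2, 1)
```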
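During video compression, if the compression ratio is too high, the image is easily distorted; if it is too low, the data bandwidth cannot be reduced to within the range that the network or computer can handle. Rate-distortion optimization (RDO) was developed to address this problem. The core of RDO is to compute the coding cost of an image block under different coding modes and thereby find a reasonable balance between raising the compression ratio and avoiding distortion, that is, to maintain the compression ratio while preserving video quality.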
率失真优化技术受限于给定的码率条件,通过优化选择编码参数,期望获得最小的编码失真。目前主流的编码框架中,均使用拉格朗日乘子将该有约束问题转换为无约束问题,即采用最小化式(2)中的J来选择最优模式。
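$J = D + \lambda R$  (2)
where J is generally called the rate-distortion cost (RD cost), D is the coding distortion, R is the bit rate, and λ is the Lagrange multiplier. λ is generally adjusted through the quantization parameter (QP); λ and QP are related as follows:
$\lambda = \mathrm{QP}_{factor} \cdot 2^{(\mathrm{QP}-12)/3}$  (3)
where QPfactor is a constant related to the coding mode. QP is an important parameter that determines video quality and bit rate; it is described in detail in the H.264 and H.265/HEVC standards and is not described further here.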
由此可见,QP的大小直接决定了重建帧是低码率高失真的重建帧还是高码率低失真的重建帧,即直接影响编码效果。
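Formulas (2) and (3) can be exercised with a small sketch that computes λ from QP and then picks the candidate mode with the lowest cost J = D + λR. The candidate dictionaries, the mode names, and the numeric D and R values are made up for the example; only the two formulas come from the text.

```python
def lagrange_multiplier(qp, qp_factor):
    """Formula (3): lambda = QPfactor * 2 ** ((QP - 12) / 3)."""
    return qp_factor * 2.0 ** ((qp - 12) / 3.0)


def rd_cost(distortion, rate, lam):
    """Formula (2): J = D + lambda * R."""
    return distortion + lam * rate


def select_mode(candidates, qp, qp_factor):
    """Return the candidate with the smallest RD cost at the given QP."""
    lam = lagrange_multiplier(qp, qp_factor)
    return min(candidates, key=lambda c: rd_cost(c["D"], c["R"], lam))


# Hypothetical candidates: low distortion / many bits vs. the opposite.
modes = [
    {"name": "intra", "D": 100, "R": 50},
    {"name": "inter", "D": 150, "R": 10},
]
low_qp_choice = select_mode(modes, qp=12, qp_factor=1.0)["name"]
high_qp_choice = select_mode(modes, qp=18, qp_factor=1.0)["name"]
```

With the same candidates, a larger QP gives a larger λ, so the rate term weighs more and the cheaper-in-bits mode wins. This is the sense in which QP steers the trade-off between bit rate and distortion.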
下面继续探讨本发明实施例的技术方案。
先介绍本发明实施例提供的视频编码方法,本发明实施例提供的视频编码方法的执行主体是视频编码装置,其中,该视频编码装置可以是任何需要输出、存储视频的装置,如手机,笔记本电脑,平板电脑,个人电脑等设备。
请参阅图2,本发明实施例中视频编码方法的一个实施例包括:
201、获取视频序列前K帧图像的第一统计特征;
本实施例中,在编码视频序列的前K帧图像时,是根据预置的短期参考帧进行编码的,其中,所述视频序列是视频编码装置预先获取的视频序列,预置的短期参考帧可以是HEVC标准中默认的短期参考帧。
其中,K为正整数,所述第一统计特征可以为所述视频序列前K帧图像的视频图像统计特征,例如所述第一统计特征包括K个图像块对应的K个质心值的方差和所述K个图像块对应的K个编码失真的平均编码失真,所述K个图像块为位于所述前K帧图像中各帧图像的相同位置的图像块,其中,所述K个质心值的方差为所述K个质心值构成了质心值序列的方差,所述K个编码失真的平均编码失真为所述K个编码失真构成的编码失真序列的平均值,当然,所述第一统计特征也可以包括多个位置质心值的方差和所述多个位置平均编码失真,其中,每个位置对应所述前K帧图像中各帧图像的该位置的K个图像块,多个位置质心值的方差中,每个位置质心值的方差为该位置对应的K个图像块对应的K个质心值的方差,多个位置平均编码失真中每个位置平均编码失真为所述K个图像块对应的K个编码失真的平均编码失真,例如所述第一统计特征包括所述前K帧图像中各帧图像的每个位置对应的K个图像块对应的K个质心值的方差和每个位置对应的K个图像块的平均编码失真,此处不做限定。
202、确定第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像编码时提供参考。
其中,L为正整数,第一背景帧即作为长期参考帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像编码时提供参考。
203、根据所述第一统计特征确定用于编码所述第一背景帧的量化参数QP;
可选的,所述根据所述第一统计特征确定用于编码所述第一背景帧的量化参数QP,可以包括:根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率;根据所述第一概率,计算用于编码所述第一背景帧的QP。
204、根据用于编码所述第一背景帧的QP对所述第一背景帧进行编码,从而得到第一背景长期参考帧;
205、根据所述第一背景长期参考帧对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像进行编码。
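In this embodiment of the present invention, the first statistical feature of the first K frames of the video sequence is obtained; the QP used to encode the first background frame is determined according to the first statistical feature; the determined first background frame is encoded according to that QP to obtain the first background long-term reference frame; and the (K+1)-th through (K+L)-th frames of the video sequence are encoded according to the first background long-term reference frame. Because the quantization parameter used to encode the background long-term reference frame, which is referenced when the (K+1)-th through (K+L)-th frames are encoded, is determined from the first statistical feature associated with the video content of the first K frames, overall video coding quality is improved.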
图2所示的实施例中,当所述根据所述第一统计特征确定用于编码所述第一背景帧的量化参数QP,可以包括:根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率;根据所述第一概率,计算用于编码所述第一背景帧的QP,本发明实施例中视频编码方法有多种实现方式,下面分别举例进行介绍:
请参阅图3,本发明实施例中视频编码方法的一个实施例包括:
301、获取视频序列前K帧图像的第一统计特征;
其中,K为正整数。
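In this embodiment, the first statistical feature includes the variance of the K centroid values corresponding to K image blocks and the average coding distortion of the K coding distortions corresponding to the K image blocks. The K image blocks are the i-th image blocks of the respective frames in the first K frames, where i is a positive integer. The variance of the K centroid values is the variance of the sequence formed by the K centroid values, and the average coding distortion of the K coding distortions is the mean of the sequence formed by the K coding distortions.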
本实施例中,获取所述视频序列前K帧图像的第一统计特征有多种实现方式,下面分别介绍:
(1)获取所述视频序列前K帧图像的第一统计特征可以包括:
按从所述前K帧图像的第二帧图像到所述前K帧图像的第K帧图像的顺序进行迭代计算,从而获得所述前K帧图像中各帧图像的第i个图像块对应的K个质心值的方差,和所述前K帧图像中各帧图像的第i个图像块对应的K个编码失真的平均编码失真,其中所述迭代计算包括:
获取所述第j帧第i个图像块的质心值和编码失真;
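According to the centroid value and coding distortion of the i-th image block of the j-th frame, compute the variance of the j centroid values corresponding to the i-th image blocks of the first j frames and the average coding distortion of the j coding distortions corresponding to the i-th image blocks of the first j frames, where K ≥ j ≥ 2 and j is a positive integer. For initialization, when only the first frame is considered, the variance of the centroid value of the i-th image block is 0, and the average coding distortion of the i-th image block is the coding distortion of the i-th image block of the first frame.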
上述获取所述第j帧第i个图像块的质心值ωj,i和编码失真
Figure PCTCN2015085586-appb-000041
可以采用如下公式(4)和公式(5):
Figure PCTCN2015085586-appb-000042
Figure PCTCN2015085586-appb-000043
其中,ωj,i为所述第j帧编码图像的第i个图像块的像素亮度的质心值,Ni为所述第j帧的第i个图像块的像素个数,Oj,i(m)表示第j帧图像的第i个图像块内第m个像素的像素值;
Figure PCTCN2015085586-appb-000044
为第j帧编码图像的第i个图像块的编码失真,
Figure PCTCN2015085586-appb-000045
表示第j帧编码图像的第i个图像块内第m个像素的重建像素值,m,i,j为正整数。
根据所述第j帧第i个图像块的质心值和编码失真,计算前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差和所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,可以采用如下公式(6)、(7)和(8):
Figure PCTCN2015085586-appb-000046
Figure PCTCN2015085586-appb-000047
Figure PCTCN2015085586-appb-000048
其中,μj,i为所述前j个图像块(前j个图像块为所述前j帧图像中各帧图像的第i个图像块)的质心值的均值,μj-1,i为所述前j-1个图像块的质心值的均值,ωj,i为所述第j帧第i个图像块的质心值,
Figure PCTCN2015085586-appb-000049
为所述前j个图像块的质心值的方差,
Figure PCTCN2015085586-appb-000050
为前j-1个图像块的质心值的方差,所述前j-1个图像块为所述前j-1帧图像中各帧图像的第i个图像块。
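The iteration over frames 2 to K described above can be sketched as an incremental update of the mean and variance of the centroid values and of the average coding distortion. Formulas (4) to (8) are only available as images in this text, so the sketch does not reproduce them: it assumes the centroid value is the mean pixel value of the block, the coding distortion is the mean squared error against the reconstruction, and the variance is the population variance updated with the standard incremental recurrence. The class and function names are hypothetical.

```python
def centroid(block):
    """Assumed centroid value: mean of the block's pixel values."""
    return sum(block) / len(block)


def coding_distortion(original, reconstructed):
    """Assumed coding distortion: mean squared error over the block."""
    return sum((o - r) ** 2
               for o, r in zip(original, reconstructed)) / len(original)


class BlockStats:
    """Running statistics for the i-th block over frames 1..j."""

    def __init__(self, first_centroid, first_distortion):
        self.j = 1
        self.mean = first_centroid       # mean of centroid values
        self.var = 0.0                   # variance is 0 after one frame
        self.avg_dist = first_distortion

    def update(self, w, d):
        """Fold in the centroid w and distortion d of frame j."""
        j = self.j + 1
        delta = w - self.mean
        new_mean = self.mean + delta / j
        # Population variance recurrence (standard incremental form).
        self.var = ((j - 1) * self.var + delta * (w - new_mean)) / j
        self.mean = new_mean
        self.avg_dist = ((j - 1) * self.avg_dist + d) / j
        self.j = j


stats = BlockStats(first_centroid=10.0, first_distortion=4.0)
for w, d in [(12.0, 2.0), (14.0, 6.0)]:
    stats.update(w, d)
```

Only the previous mean, variance, and average need to be stored per block position, so the first K frames do not have to be kept in memory to obtain the statistics.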
(2)获取所述视频序列前K帧图像的第一统计特征也可以包括:
按从所述前K帧图像的第二帧图像到所述前K帧图像的第K帧图像的顺序进行迭代计算,从而获得所述前K帧图像中各帧图像的第i个图像块对应的K个质心值的方差,和所述前K帧图像中各帧图像的第i个图像块对应的K个编码失真的平均编码失真,其中所述迭代计算包括:
判断所述视频序列第j帧图像的第i个图像块是否属于背景区域(即图像中运动不明显的区域);
在所述视频序列第j帧图像第i个图像块属于背景区域时,根据Bj,i=Bj-1,i+1,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述视频序列前j帧图像中各帧图像的第i个图像块,根据所述Bj,i计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差和所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,K≥j≥ 2,j为正整数,其中,在所述视频序列第1帧图像的第i个图像块属于所述背景区域时,B1,i为1,在所述视频序列第1帧图像的第i个图像块不属于所述背景区域时,所述B1,i为0;
在所述视频序列第j帧图像第i个图像块不属于背景区域时,根据Bj,i=Bj-1,i,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述前j帧图像中各帧图像的第i个图像块,并将所述视频序列前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差作为所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差,将视频序列所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真作为所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,在j为2时,所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差为第一预设值(例如0),所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真为第二预设值或所述视频序列第1帧图像的编码失真。
可选的,所述判断所述视频序列第j帧第i个图像块是否属于背景区域,可以包括:
判断所述第j帧图像的第i个图像块的子图像块的运动矢量中的最小运动矢量的水平分量是否小于第一预设阈值,且所述最小运动矢量的竖直分量小于第二预设阈值;
在所述水平分量小于第一预设阈值,且所述竖直分量小于第二预设阈值时,确定所述第j帧图像的第i个图像块属于背景区域;
在所述水平分量不小于第一预设阈值或者所述竖直分量不小于第二预设阈值时,确定所述第j帧图像的第i个图像块不属于背景区域。
可选的,第一预设阈值和第二阈值可以相同也可以不同,例如第一预设阈值和第二预设阈值可以均取值为1,也可以是第一预设阈值为0.99,第二预设阈值为1.01等,此处不做限定。
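The background test described above can be sketched as follows. The text does not define how the "minimum motion vector" among the sub-blocks is selected or whether the components are compared by absolute value; the sketch assumes the vector with the smallest squared magnitude and a comparison of absolute component values. The function name and the default thresholds of 1 are illustrative.

```python
def is_background_block(sub_block_mvs, thr_x=1.0, thr_y=1.0):
    """Classify a block as background from its sub-block motion vectors.

    sub_block_mvs: list of (mv_x, mv_y) pairs, one per sub-block.
    Assumption: the 'minimum motion vector' is the one with the
    smallest squared magnitude, and components are compared by
    absolute value against the two preset thresholds.
    """
    mvx, mvy = min(sub_block_mvs, key=lambda mv: mv[0] ** 2 + mv[1] ** 2)
    return abs(mvx) < thr_x and abs(mvy) < thr_y


static_block = is_background_block([(0, 0), (3, 4)])
moving_block = is_background_block([(2, 0), (3, 4)])
```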
可选的,所述根据所述Bj,i计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差和所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,可以包括:
Obtain the centroid value of the i-th image block of the j-th frame and the coding distortion of the i-th image block of the j-th frame;
根据所述第j帧图像第i个图像块的质心值以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差;
根据所述第j帧图像的第i个图像块的编码失真以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真。
可选的,所述根据所述第j帧图像第i个图像块的质心值以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差,可以包括:
采用如下公式(9)和公式(10)计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差:
Figure PCTCN2015085586-appb-000051
Figure PCTCN2015085586-appb-000052
其中,μj,i为所述前j个图像块的质心值的均值,μj-1,i为所述前j-1个图像块的质心值的均值,ωj,i为所述第j帧第i个图像块的质心值(可以采用上述公式(4)计算得到),
Figure PCTCN2015085586-appb-000053
为所述前j个图像块的质心值的方差,
Figure PCTCN2015085586-appb-000054
为前j-1个图像块的质心值的方差,所述前j-1个图像块为所述前j-1帧图像中各帧图像的第i个图像块。
可选的,所述根据所述第j帧图像的第i个图像块的编码失真以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,可以包括:
采用如下公式(11)计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真:
Figure PCTCN2015085586-appb-000055
其中,
Figure PCTCN2015085586-appb-000056
为所述第j帧第i个图像块的编码失真(可以采用上述公式(5)计算得到),
Figure PCTCN2015085586-appb-000057
为所述前j个图像块的平均编码失真,
Figure PCTCN2015085586-appb-000058
为所述前j-1个图像块的平均编码失真。
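Manner (2) differs from manner (1) in that the statistics are updated only when the block is judged to be background, with the background count Bj,i taking the place of the frame index. A minimal sketch of that gating is shown below. Formulas (9) to (11) are only available as images here, so the update itself is an assumption: it reuses the standard incremental mean and population-variance recurrence with Bj,i as the sample count, and it starts from zero-valued statistics rather than from the preset values mentioned in the text.

```python
class BackgroundBlockStats:
    """Statistics for the i-th block, counting only background frames."""

    def __init__(self):
        self.count = 0        # plays the role of B_{j,i}
        self.mean = 0.0
        self.var = 0.0
        self.avg_dist = 0.0

    def update(self, w, d, is_background):
        """Fold in centroid w and distortion d of the current frame."""
        if not is_background:
            # B_{j,i} = B_{j-1,i}: previous statistics carry over unchanged.
            return
        self.count += 1       # B_{j,i} = B_{j-1,i} + 1
        b = self.count
        delta = w - self.mean
        new_mean = self.mean + delta / b
        self.var = ((b - 1) * self.var + delta * (w - new_mean)) / b
        self.mean = new_mean
        self.avg_dist = ((b - 1) * self.avg_dist + d) / b


bg_stats = BackgroundBlockStats()
observations = [(10.0, 4.0, True), (50.0, 100.0, False),
                (12.0, 2.0, True), (14.0, 6.0, True)]
for w, d, flag in observations:
    bg_stats.update(w, d, flag)
```

The non-background frame in the example (centroid 50, distortion 100) leaves the statistics untouched, so a passing foreground object does not inflate the variance attributed to the background at that position.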
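302. Determine a first background frame, where the first background frame provides a reference when the (K+1)-th through (K+L)-th frames of the video sequence are encoded, and L is a positive integer;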
本实施例中,确定第一背景帧有多种实现方式,下面分别进行介绍:
(1)采用如下公式(12)对所述前K帧图像进行迭代,从而获得根据所述第K帧图像的像素值计算得到的目标迭代像素值,并将所述目标迭代像素值作为所述第一背景帧的像素值:
Figure PCTCN2015085586-appb-000059
其中,Pj,x,y为根据所述视频序列第j帧位置为(x,y)处的像素值计算得到的迭代像素值,Pj,x,y'为根据所述视频序列第j-1帧位置(x,y)处的像素值计算得到的迭代像素值,oj,x,y表示所述视频序列第j帧图像中位置为(x,y)处的像素值。
(2)当步骤301中采用的是方式(2)获取所述视频序列前K帧图像的第一统计特征时,确定第一背景帧可以包括:
采用如下公式(13)对所述前K帧图像进行迭代,从而获得根据所述第K帧图像的像素值计算得到的目标迭代像素值,并将所述目标迭代像素值作为所述第一背景帧的像素值:
Figure PCTCN2015085586-appb-000060
其中,Pj,x,y为根据所述视频序列第j帧位置为(x,y)处的像素值计算得到的迭代像素值,Pj,x,y'为根据所述视频序列第j-1帧位置(x,y)处的像素值计算得到的迭代像素值,oj,x,y表示所述视频序列第j帧图像中位置为(x,y)处的像素值。
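The iteration that builds the background frame pixel by pixel can be sketched as follows. Formulas (12) and (13) are only available as images here, so the update rule is an assumption: a per-pixel running average in which the iterated value after frame j is the mean of the pixel values at (x, y) over frames 1 to j. The variant tied to manner (2) is not covered. Function names are hypothetical, and frames are nested lists of single-component pixel values.

```python
def update_background(prev, frame, j):
    """One iteration step for frame j (j >= 2).

    prev:  iterated pixel values obtained from frames 1..j-1
    frame: pixel values o_{j,x,y} of frame j
    Assumed rule: running average over the frames seen so far.
    """
    return [[((j - 1) * prev[y][x] + frame[y][x]) / j
             for x in range(len(frame[0]))]
            for y in range(len(frame))]


def build_background(frames):
    """Iterate over the first K frames; the result after frame K is
    taken as the background frame's pixel values."""
    background = [[float(p) for p in row] for row in frames[0]]
    for j in range(2, len(frames) + 1):
        background = update_background(background, frames[j - 1], j)
    return background


first_k_frames = [[[10, 20]], [[20, 20]], [[30, 50]]]   # three 1x2 frames
background_frame = build_background(first_k_frames)
```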
303、根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率;
本实施例中,根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率有多种实现方式,下面分别进行介绍:
(1)当步骤301中采用方式(1)获取所述视频序列前K帧图像的第一统计特征时,所述根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率,可以包括:
采用如下公式(14)计算所述第一概率:
Figure PCTCN2015085586-appb-000061
其中,Pi为所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的第i个图像块参考所述第一背景帧第i个图像块的概率,
Figure PCTCN2015085586-appb-000062
Figure PCTCN2015085586-appb-000063
为所述K个图像块的平均编码失真,
Figure PCTCN2015085586-appb-000064
为所述K个图像块对应的K个质心值的方差,所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,所述K个图像块的位置与所述第K+1帧图像至所述第K+L帧图像中各帧图像的第i个图像块的位置对应,所述K个图像块与所述第一背景帧的第i个图像块的位置对应,i为正整数。
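(2) When manner (2) is used in step 301 to obtain the first statistical feature of the first K frames of the video sequence, computing, according to the first statistical feature, the first probability that image blocks of each of the (K+1)-th through (K+L)-th frames of the video sequence reference the first background frame may include: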
采用如下公式(15)计算所述第一概率:
Figure PCTCN2015085586-appb-000065
其中,Pi为所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像第i个图像块参考所述第一背景帧第i个图像块的概率,
Figure PCTCN2015085586-appb-000066
Figure PCTCN2015085586-appb-000067
为所述K个图像块的平均编码失真,
Figure PCTCN2015085586-appb-000068
为所述K个图像块对应的K个质心值的方差,所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,所述K个图像块的位置与所述第K+1帧图像至所述第K+L帧图像中各帧图像的第i个图像块的位置对应,所述K个图像块与所述第一背景帧的第i个图像块的位置对应,i为正整数。
当然,也可以采用如上所述公式(14)计算所述第一概率,此处不做限定。
304、根据所述第一概率,计算用于编码所述第一背景帧的QP;
本实施例中,根据所述第一概率,计算用于编码所述第一背景帧的QP同样有多种实现方式,下面分别进行介绍:
(1)当步骤301中采用方式(1)获取所述视频序列前K帧图像的第一统计特征时,所述根据所述第一概率,计算用于编码所述第一背景帧的QP,可以包括:
采用如下公式(16)计算用于编码所述第一背景帧的QP:
Figure PCTCN2015085586-appb-000069
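where QPi is the QP used to encode the i-th image block of the first background frame; Pi is the probability that the i-th image block of each of the (K+1)-th through (K+L)-th frames of the video sequence references the i-th image block of the first background frame; PT is a preset value; QPmin is the minimum QP value allowed by the current encoding; QPmode is the minimum of the QPs used by those frames among the first K frames of the video sequence that use the same coding mode as the first background frame; and the position of the i-th image block of each of the (K+1)-th through (K+L)-th frames corresponds to the position of the i-th image block of the first background frame.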
(2)当步骤301中采用方式(2)获取所述视频序列前K帧图像的第一统计特征时,所述根据所述第一概率,计算用于编码所述第一背景帧的QP,可以包括:
采用如下公式(17)计算用于编码所述第一背景帧的QP:
Figure PCTCN2015085586-appb-000070
其中,QPi为用于编码所述第一背景帧第i个图像块的QP,Pi为所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的第i个图像块参考所述第一背景帧第i个图像块的概率,PT为预设值,QPmin为当前编码允许的最小QP值,QPmode为所述视频序列前K帧图像中与所述第一背景帧采用相同的编码模式的图像帧采用的QP中的最小值,所述第K+1帧图像至所述第K+L帧图像中各帧图像的第i个图像块的位置与所述第一背景帧的第i个图像块的位置对应。
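Of course, in this case the QP used to encode the first background frame may also be computed from the first probability using formula (16) above; this is not limited here.
In this embodiment, PT is a preset value, for example 0.05, and QPmin is the minimum QP value allowed by the current encoding. QPmode depends on the coding mode of the first background frame: for example, if the first background frame is coded as an I frame, QPmode is the minimum QP used by the other I frames; if the first background frame is coded as a P frame or B frame, QPmode is the minimum QP used by the other P frames or B frames.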
305、根据用于编码所述第一背景帧的QP对所述第一背景帧进行编码,从而得到第一背景长期参考帧;
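306. Encode the (K+1)-th through (K+L)-th frames of the video sequence according to the first background long-term reference frame.
In this embodiment of the present invention, the first probability that image blocks of each of the (K+1)-th through (K+L)-th frames of the video sequence reference the determined first background frame is estimated from the first statistical feature of an already encoded picture sequence (the first K frames). The QP used to encode the first background frame is computed from the first probability, and the first background frame is encoded with this QP to obtain the first background long-term reference frame used for encoding the (K+1)-th through (K+L)-th frames. Because the first background long-term reference frame is encoded with a quantization parameter determined from the first statistical feature of the first K frames, the influence of changes in video content is fully taken into account. This improves the coding quality of the overall video sequence while reducing the bit-rate increase caused by encoding the background long-term reference frame, and thus improves overall coding performance.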
在图2或图3所示实施例的基础上,在所述获取所述视频序列前K帧图像的第一统计特征之后,所述方法还可以包括:
获取所述视频序列第K+1帧图像至所述视频序列第K+L帧图像的第二统计特征;
确定第二背景帧,所述第二背景帧用于在对所述视频序列第K+L+1帧图像至所述视频序列第K+N帧图像编码时提供参考,其中N>L+1,N为正整数;
统计所述第一背景帧的第i个图像块作为参考块的次数,所述K个图像块的位置与所述第一背景帧的第i个图像块的位置对应;
根据所述第一背景帧的第i个图像块作为参考块的次数,及所述第一概率,计算概率预测误差;
根据所述第二统计特征,计算所述视频序列第K+L+1帧图像到第K+N帧图像参考所述第二背景帧的第二概率;
根据所述第二概率和所述概率预测误差,计算所述视频序列第K+L+1帧图像到所述视频序列第K+N帧图像中各帧图像参考所述第二背景帧的第三概率;
根据所述第三概率,计算用于编码所述第二背景帧的QP;
根据用于编码所述第二背景帧的QP对所述第二背景帧进行编码,从而得到第二背景长期参考帧;
Encode the (K+L+1)-th through (K+N)-th frames of the video sequence according to the second background long-term reference frame.
其中,获取所述视频序列第K+1帧图像至所述视频序列第K+L帧图像的第二统计特征的方式,可以参照步骤301中获取第一统计特征的方式,此处不再赘述;
确定第二背景帧的方式可以参照步骤302中的方式,采用如下公式(18)或公式(19)对所述第K+1帧图像至所述视频序列第K+L帧图像进行迭代,从而获得根据所述第K+L帧图像的像素值计算得到的目标迭代像素值,并将所述目标迭代像素值作为所述第二背景帧的像素值。
Figure PCTCN2015085586-appb-000071
Figure PCTCN2015085586-appb-000072
其中,Pj,x,y为根据所述视频序列第j帧位置为(x,y)处的像素值计算得到的迭代像素值,Pj,x,y'为所述视频序列第j-1帧位置(x,y)处的像素值计算得到的迭代像素值,oj,x,y表示所述视频序列第j帧图像中位置为(x,y)处的像素值,K+L≥j≥K+1。
上述根据所述第一背景帧的第i个图像块作为参考块的次数Si,及所述第一概率Pi,计算概率预测误差△Pi可以采用如下公式(20)
ΔPi=Si/K-Pi;  (20)
根据所述第二统计特征,计算所述视频序列第K+L+1帧图像到第K+N帧图像参考所述第二背景帧的第二概率Pi'可以参照步骤303中根据所述第一统计特征,计算所述第一概率的方式,此处不再赘述。
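According to the second probability Pi' and the probability prediction error ΔPi, the third probability Pi'' that each of the (K+L+1)-th through (K+N)-th frames of the video sequence references the second background frame may be computed as Pi'' = ΔPi + Pi';
According to the third probability Pi'', the QP' used to encode the second background frame is computed: substituting Pi'' for Pi in formula (16) or formula (17) yields QP'.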
此时,根据用于编码所述第二背景帧的QP对所述第二背景帧进行编码,从而得到第二背景长期参考帧;根据所述第二背景长期参考帧对所述视频序列第K+L+1帧图像至所述视频序列第K+N帧图像进行编码,同样可参照图3所示实施例,此处不再赘述。
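The probability correction used for the second background frame can be sketched directly from formula (20) and the relation between the second and third probabilities. The function names and sample numbers are illustrative. The text does not say whether the corrected probability is limited to the range 0 to 1, so the sketch applies no clamping.

```python
def probability_prediction_error(ref_count, k, first_prob):
    """Formula (20): delta_P_i = S_i / K - P_i.

    ref_count:  S_i, times the i-th block of the first background frame
                was used as a reference block
    k:          K, as written in formula (20)
    first_prob: P_i, the first probability predicted for that block
    """
    return ref_count / k - first_prob


def third_probability(second_prob, prediction_error):
    """Third probability = prediction error + second probability."""
    return prediction_error + second_prob


# Illustrative numbers only.
delta_p = probability_prediction_error(ref_count=30, k=100, first_prob=0.25)
p_third = third_probability(second_prob=0.4, prediction_error=delta_p)
```

Here the block was referenced more often than predicted, so the positive error raises the probability used when choosing the QP for the second background frame.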
需要说明的是,本发明实施例中描述的像素值,可以包括亮度值Y、色度值U和色度值V其中至少一项,例如可以是包括Y,U,V三个形成的像素值,也可以是Y或U或V其中任意一个值,也可以其中Y、U和V中任意两个值,此处不做限定。
上面介绍的是视频编码方法的实施例,下面介绍视频解码方法的实施例。
本发明实施例提供的视频解码方法的执行主体是视频解码装置,其中,该视频解码装置可以是任何需要输出、存储视频的装置,如手机,笔记本电脑,平板电脑,个人电脑等设备。
请参阅图4,本发明实施例中一种视频解码方法的实施例包括如下内容:
401、解码视频序列中的前K帧图像;
其中,K为正整数。
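Generally, when the encoder side (video encoding apparatus) writes reference frames into the video bitstream, it forms a reference frame parameter set and reference frame indices, so that the decoder side can tell from them whether to decode with a short-term reference frame or a long-term reference frame, and decodes in order. The decoder side (video decoding apparatus) first decodes the reference frame that the encoder side wrote first, for example the default short-term reference frame in the HEVC standard.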
本实施例中,由于视频序列的前K帧图像可以是根据预置的短期参考帧(例如HEVC默认的短期参考帧)编码的,因此可以利用视频码流中解码的短期参考帧解码视频序列中的前K帧图像。
402、解码第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像解码时提供参考;
其中,L为正整数。
在解码所述视频序列的前K帧图像后,解码后续视频序列第K+1帧图像到第K+L帧图像参考的第一背景帧。
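In this embodiment, the first background frame is stored in the memory of the video decoding apparatus as the first long-term reference frame but is not displayed.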
403、将所述第一背景帧作为第一长期参考帧解码所述视频序列第K+1帧图像到第K+L帧图像,L为正整数。
本发明实施例中由于编码端充分考虑了视频内容变化的影响,提升了编码质量,使得本发明实施例中视频解码中解码图像质量高,解码效率高。
可选的,在所述将所述第一背景帧作为长期参考帧解码所述视频序列第K+1帧图像到第K+L帧图像之后,所述方法还包括:
解码第二背景帧,所述第二背景帧用于在对所述视频序列第K+L+1帧图像到第K+N帧图像解码时提供参考,其中N>L+1,N为正整数,同样的,第二背景帧也是存储在视频解码装置的内存中作为第二长期参考帧,但不显示。
将所述第二背景帧作为第二长期参考帧解码所述视频序列第K+L+1帧图像到第K+N帧图像。
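The frame ranges served by the two background frames at the decoder can be summarized in a small lookup sketch. The function name and the returned labels are hypothetical. The sketch reports only which stored, non-displayed background frame is available as the long-term reference for a given frame number. It does not model bitstream parsing or the short-term references that a frame may also use.

```python
def background_reference_for(frame_number, k, l, n):
    """Return the long-term background reference for a 1-based frame number.

    Frames 1..K:           decoded before any background frame (None)
    Frames K+1..K+L:       first background frame
    Frames K+L+1..K+N:     second background frame (requires N > L + 1)
    """
    if n <= l + 1:
        raise ValueError("N must be greater than L + 1")
    if not 1 <= frame_number <= k + n:
        raise ValueError("frame number outside the described range")
    if frame_number <= k:
        return None
    if frame_number <= k + l:
        return "background_1"
    return "background_2"


plan = [background_reference_for(f, k=4, l=3, n=6) for f in range(1, 11)]
```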
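The following describes an embodiment of the video encoding apparatus in the embodiments of the present invention. Referring to FIG. 5, an embodiment of the video encoding apparatus 500 provided in the embodiments of the present invention includes: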
获取单元501,用于获取视频序列前K帧图像的第一统计特征,K为正整数;
第一确定单元502,用于确定第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像编码时提供参考,L为正整数;
第二确定单元503,用于根据所述获取单元501获取的所述第一统计特征确定用于编码所述第一确定单元确定的所述第一背景帧的量化参数QP;
第一编码单元504,用于根据所述第二确定单元503确定的所述QP对所述第一背景帧进行编码,从而得到第一背景长期参考帧;
第二编码单元505,用于根据所述第一编码单元504得到的所述第一背景长期参考帧对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像进行编码。
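In this embodiment of the present invention, the obtaining unit 501 obtains the first statistical feature of the first K frames of the video sequence; the second determining unit 503 determines, according to the first statistical feature, the QP used to encode the first background frame; the first encoding unit 504 encodes the determined first background frame according to that QP to obtain the first background long-term reference frame; and the second encoding unit 505 encodes the (K+1)-th through (K+L)-th frames of the video sequence according to the first background long-term reference frame. Because the quantization parameter used to encode the background long-term reference frame, which is referenced when the (K+1)-th through (K+L)-th frames are encoded, is determined from the first statistical feature associated with the video content of the first K frames, overall video coding quality is improved.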
在本发明的一些实施例中,所述第二确定单元503用于:
根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率;
根据所述第一概率,计算用于编码所述第一背景帧的QP。
在本发明的一些实施例中,所述第一统计特征包括K个图像块对应的K个质心值的方差和所述K个图像块对应的K个编码失真的平均编码失真。
在本发明的一些实施例中,所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,i为正整数,所述K个图像块为位于所述前K帧图像中各帧图像的相同位置的图像块,所述K个质心值的方差为所述K个质心值构成了质心值序列的方差,所述K个编码失真的平均编码失真为所述K个编码失真构成的编码失真序列的平均值;
所述获取单元501用于:
按从所述前K帧图像的第二帧图像到所述前K帧图像的第K帧图像的顺序进行迭代计算,从而获得所述前K帧图像中各帧图像的第i个图像块对应的K个质心值的方差,和所述前K帧图像中各帧图像的第i个图像块对应的K个编码失真的平均编码失真,其中所述迭代计算包括:
判断所述视频序列第j帧图像的第i个图像块是否属于背景区域;
在所述视频序列第j帧图像第i个图像块属于背景区域时,根据Bj,i=Bj-1,i+1,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述视频序列前j帧图像中各帧图像的第i个图像块,根据所述Bj,i计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差和所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,K≥j≥2,j为正整数,其中,在所述视频序列第1帧图像的第i个图像块属于所述背景区域时,B1,i为1,在所述视频序列第1帧图像的第i个图像块不属于所述背景区域时,所述B1,i为0;
在所述视频序列第j帧图像第i个图像块不属于背景区域时,根据Bj,i=Bj-1,i,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述前j帧图像中各帧图像的第i个图像块,并将所述视频序列前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差作为所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差,将所述视频序列前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真作为所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,在j为2时,所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差为第一预设值,所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真为第二预设值或所述视频序列第1帧图像的编码失真。
在本发明的一些实施例中,所述获取单元501用于判断所述第j帧图像的第i个图像块的子图像块的运动矢量中的最小运动矢量的水平分量是否小于第一预设阈值,且所述最小运动矢量的竖直分量小于第二预设阈值,在所述水平分量小于第一预设阈值,且所述竖直分量小于第二预设阈值时,则确定所述第j帧图像的第i个图像块属于背景区域,在所述水平分量不小于第一预设阈值或者所述竖直分量不小于第二预设阈值时,则确定所述第j帧图像的第i个图像块不属于背景区域。
在本发明的一些实施例中,所述获取单元501用于获取所述第j帧图像的第i个图像块的质心值和所述第j帧图像的第i个图像块的编码失真;根据所述第j帧图像第i个图像块的质心值以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差;根据所述第j帧图像的第i个图像块的编码失真以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真。
在本发明的一些实施例中,所述获取单元501用于采用如上公式(9)和公式(10)计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差。
在本发明的一些实施例中,所述获取单元501用于采用如上公式(11)计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真。
在本发明的一些实施例中,所述第二确定单元503用于采用如上公式(15)计算所述第一概率。
在本发明的一些实施例中,所述第二确定单元503用于采用如上公式(14)计算所述第一概率。
在本发明的一些实施例中,所述第二确定单元503用于采用如上公式(16)计算用于编码所述第一背景帧的QP。
在本发明的一些实施例中,所述第二确定单元503用于采用如上公式(17)计算用于编码所述第一背景帧的QP。
在本发明的一些实施例中,所述第一确定单元502用于采用如上公式(13)对所述前K帧图像进行迭代,从而获得根据所述第K帧图像的像素值计算得到的目标迭代像素值,并将所述目标迭代像素值作为所述第一背景帧的像素值。
在本发明的一些实施例中,所述获取单元501还用于获取所述视频序列第K+1帧图像至所述视频序列第K+L帧图像的第二统计特征;
所述第一确定单元502还用于确定第二背景帧,所述第二背景帧用于在对所述视频序列第K+L+1帧图像至所述视频序列第K+N帧图像编码时提供参考,其中N>L+1,N为正整数;
所述第二确定单元503还用于统计所述第一背景帧的第i个图像块作为参考块的次数,所述K个图像块的位置与所述第一背景帧的第i个图像块的位置对应,根据所述第一背景帧的第i个图像块作为参考块的次数,及所述第一概率,计算概率预测误差;根据所述获取单元获取的第二统计特征,计算所述视频序列第K+L+1帧图像到第K+N帧图像参考所述第一确定单元确定的所述第二背景帧的第二概率,根据所述第二概率和所述概率预测误差,计算所述视频序列第K+L+1帧图像到所述视频序列第K+N帧图像中各帧图像参考所述第二背景帧的第三概率,根据所述第三概率,计算用于编码所述第二背景帧的QP;
第一编码单元504还用于根据所述第二确定单元503确定的用于编码所述第二背景帧的QP对所述第二背景帧进行编码,从而得到第二背景长期参考帧;
第二编码单元505还用于根据所述第一编码单元504得到的所述第二背景长期参考帧对所述视频序列第K+L+1帧图像至所述视频序列第K+N帧图像进行编码。
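The following describes an embodiment of the video decoding apparatus in the embodiments of the present invention. Referring to FIG. 6, an embodiment of the video decoding apparatus 600 provided in the embodiments of the present invention includes: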
图像解码单元601,用于解码视频序列中的前K帧图像,K为正整数;
背景帧解码单元602,用于解码第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像解码时提供参考,L为正整数;
所述图像解码单元601还用于将所述背景帧解码单元602得到的所述第一背景帧作为第一长期参考帧解码所述视频序列第K+1帧图像到第K+L帧图像,L为正整数。
在本发明的一些实施例中,所述背景帧解码单元602还用于解码第二背景帧,所述第二背景帧用于在对所述视频序列第K+L+1帧图像到第K+N帧图像解码时提供参考,其中N>L+1,N为正整数;
所述图像解码单元601还用于将所述背景帧解码单元602得到的所述第二背景帧作为第二长期参考帧解码所述视频序列第K+L+1帧图像到第K+N帧图像。
下面介绍本发明实施例中视频编码装置的一种可能的具体实施例,参见图7,图7为本发明实施例提供的视频编码装置700的示意图,视频编码装置700可包括至少一个总线701、与总线701相连的至少一个处理器702以及与总线701相连的至少一个存储器703。
其中,处理器702通过总线701,调用存储器703中存储的代码以用于获取所述视频序列前K帧图像的第一统计特征,K为正整数;确定第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像编码时提供参考,L为正整数;根据所述第一统计特征确定用于编码所述第一背景帧的量化参数QP;根据用于编码所述第一背景帧的QP对所述第一背景帧进行编码,从而得到第一背景长期参考帧;根据所述第一背景长期参考帧对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像进行编码。
在本发明的一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以具体用于根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率;根据所述第一概率,计算用于编码所述第一背景帧的QP。
在本发明的一些实施例中,所述第一统计特征包括K个图像块对应的K个质心值的方差和所述K个图像块对应的K个编码失真的平均编码失真,所述K个图像块为位于所述前K帧图像中各帧图像的相同位置的图像块,所述K个质心值的方差为所述K个质心值构成了质心值序列的方差,所述K个编码失真的平均编码失真为所述K个编码失真构成的编码失真序列的平均值。
在本发明的一些实施例中,所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,i为正整数;
处理器702通过总线701,调用存储器703中存储的代码以具体用于:
按从所述前K帧图像的第二帧图像到所述前K帧图像的第K帧图像的顺序进行迭代计算,从而获得所述前K帧图像中各帧图像的第i个图像块对应的K个质心值的方差,和所述前K帧图像中各帧图像的第i个图像块对应的K个编码失真的平均编码失真,其中迭代计算的过程包括:
判断所述视频序列第j帧图像的第i个图像块是否属于背景区域;
在所述视频序列第j帧图像第i个图像块属于背景区域时,则根据Bj,i=Bj-1,i+1,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述前j帧图像中各帧图像的第i个图像块,并根据所述Bj,i计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差和所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,K≥j≥2,j为正整数,其中,在所述视频序列第1帧图像的第i个图像块属于所述背景区域时,所述B1,i为1,在所述视频序列第1帧图像的第i个图像块不属于所述背景区域时,所述B1,i为0;
在所述视频序列第j帧图像第i个图像块不属于背景区域时,则根据Bj,i=Bj-1,i,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述前j帧图像中各帧图像的第i个图像块,并将所述视频序列前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差作为所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差,将所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真作为所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,在j为2时,所述第j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差为第一预设值,所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失 真的平均编码失真为第二预设值或所述视频序列第1帧图像的编码失真。
在本发明一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以用于:
判断所述第j帧图像的第i个图像块的子图像块的运动矢量中的最小运动矢量是否满足:所述最小运动矢量的水平分量小于第一预设阈值,且所述最小运动矢量的竖直分量小于第二预设阈值;
在所述水平分量小于第一预设阈值,且所述竖直分量小于第二预设阈值时,则确定所述第j帧图像的第i个图像块属于背景区域;
在所述水平分量不小于第一预设阈值或者所述竖直分量不小于第二预设阈值时,则确定所述第j帧图像的第i个图像块不属于背景区域。
在本发明的另一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以具体用于:
获取所述第j帧图像的第i个图像块的质心值和所述第j帧图像的第i个图像块的编码失真;
根据所述第j帧图像第i个图像块的质心值以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差;
根据所述第j帧图像的第i个图像块的编码失真以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真。
在本发明的一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以具体用于:
采用如上公式(9)和公式(10)计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差。
在本发明的一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以具体用于:
采用如上公式(11)计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真。
在本发明的一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以具体用于:
采用如上公式(15)计算所述第一概率。
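In some embodiments of the present invention, the processor 702 invokes, through the bus 701, the code stored in the memory 703 to specifically:
compute the first probability using formula (14) above.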
在本发明的一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以具体用于:
采用如上公式(16)计算用于编码所述第一背景帧的QP。
在本发明的一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以具体用于:
采用如上公式(17)计算用于编码所述第一背景帧的QP。
在本发明的一些实施例中,处理器702通过总线701,调用存储器703中存储的代码以具体用于确定所述第一背景帧:
采用如上公式(13)对所述前K帧图像进行迭代,从而获得根据所述第K帧图像的像素值计算得到的目标迭代像素值,并将所述目标迭代像素值作为所述第一背景帧的像素值。
在本发明的一些实施例中,在所述处理器702获取所述视频序列前K帧图像的第一统计特征之后,处理器702通过总线701,调用存储器703中存储的代码以具体还用于:
获取所述视频序列第K+1帧图像至所述视频序列第K+L帧图像的第二统计特征;
确定第二背景帧,所述第二背景帧用于在对所述视频序列第K+L+1帧图像至所述视频序列第K+N帧图像编码时提供参考,其中N>L+1,N为正整数;
统计所述第一背景帧的第i个图像块作为参考块的次数,所述K个图像块的位置与所述第一背景帧的第i个图像块的位置对应;
根据所述第一背景帧的第i个图像块作为参考块的次数,及所述第一概率,计算概率预测误差;
根据所述第二统计特征,计算所述视频序列第K+L+1帧图像到第K+N帧图像参考所述第二背景帧的第二概率;
根据所述第二概率和所述概率预测误差,计算所述视频序列第K+L+1帧图像到所述视频序列第K+N帧图像中各帧图像参考所述第二背景帧的第三概率;
根据所述第三概率,计算用于编码所述第二背景帧的QP;
根据用于编码所述第二背景帧的QP对所述第二背景帧进行编码,从而得到第二背景长期参考帧;
根据所述第二背景长期参考帧对所述视频序列第K+L+1帧图像至所述视频序列第K+N帧图像进行编码。
可以理解的是,本实施例的视频编码装置700的各功能模块的功能可根据上述方法实施例中的方法具体实现,其具体实现过程可以参照上述方法实施例的相关描述,此处不再赘述。视频编码装置700可为任何需要输出、播放视频的装置,如笔记本电脑,平板电脑、个人电脑、手机等设备。
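The following describes a possible specific embodiment of the video decoding apparatus in the embodiments of the present invention. Referring to FIG. 8, FIG. 8 is a schematic diagram of a video decoding apparatus 800 provided in an embodiment of the present invention. The video decoding apparatus 800 may include at least one bus 801, at least one processor 802 connected to the bus 801, and at least one memory 803 connected to the bus 801.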
其中,处理器802通过总线801,调用存储器803中存储的代码以用于解码视频序列中的前K帧图像,K为正整数;解码第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像解码时提供参考,L为正整数;将所述第一背景帧作为第一长期参考帧解码所述视频序列第K+1帧图像到第K+L帧图像,L为正整数。
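In some embodiments of the present invention, after decoding the (K+1)-th through (K+L)-th frames of the video sequence using the first background frame as a long-term reference frame, the processor 802 further invokes, through the bus 801, the code stored in the memory 803 to: decode a second background frame, where the second background frame provides a reference when the (K+L+1)-th through (K+N)-th frames of the video sequence are decoded, N > L+1, and N is a positive integer; and decode the (K+L+1)-th through (K+N)-th frames of the video sequence using the second background frame as a second long-term reference frame.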
可以理解的是,本实施例的视频解码装置800的各功能模块的功能可根据上述方法实施例中的方法具体实现,其具体实现过程可以参照上述方法实施例的相关描述,此处不再赘述。视频解码装置800可为任何需要输出、播放视频的装置,如笔记本电脑,平板电脑、个人电脑、手机等设备。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
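An embodiment of the present invention further provides a computer storage medium. The computer storage medium may store a program, and when executed, the program performs some or all of the steps of at least the video encoding method or the video decoding method described in the foregoing method embodiments.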
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本发明各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
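The foregoing embodiments are merely intended to describe the technical solutions of the present invention, not to limit them. Although the present invention is described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some of their technical features, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.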

Claims (28)

  1. 一种视频编码方法,其特征在于,所述方法包括:
    获取视频序列前K帧图像的第一统计特征,K为正整数;
    确定第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像编码时提供参考,L为正整数;
    根据所述第一统计特征确定用于编码所述第一背景帧的量化参数QP;
    根据所述QP对所述第一背景帧进行编码,从而得到第一背景长期参考帧;
    根据所述第一背景长期参考帧对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像进行编码。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述第一统计特征确定用于编码所述第一背景帧的量化参数QP,包括:
    根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率;
    根据所述第一概率,计算用于编码所述第一背景帧的QP。
  3. 根据权利要求1或2所述的方法,其特征在于,所述第一统计特征包括K个图像块对应的K个质心值的方差和所述K个图像块对应的K个编码失真的平均编码失真,所述K个图像块为位于所述前K帧图像中各帧图像的相同位置的图像块,所述K个质心值的方差为所述K个质心值构成的质心值序列的方差,所述K个编码失真的平均编码失真为所述K个编码失真构成的编码失真序列的平均值。
  4. 根据权利要求3所述的方法,其特征在于,
    所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,i为正整数;
    所述获取所述视频序列前K帧图像的第一统计特征,包括:
    按从所述前K帧图像的第二帧图像到所述前K帧图像的第K帧图像的顺序进行迭代计算,从而获得所述前K帧图像中各帧图像的第i个图像块对应的K个质心值的方差,和所述前K帧图像中各帧图像的第i个图像块对应的K个编码失真的平均编码失真,其中所述迭代计算包括:
    判断所述视频序列第j帧图像的第i个图像块是否属于背景区域;
    在所述视频序列第j帧图像第i个图像块属于背景区域时,根据Bj,i= Bj-1,i+1,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述视频序列前j帧图像中各帧图像的第i个图像块;根据所述Bj,i计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差和所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,K≥j≥2,其中,在所述视频序列第1帧图像的第i个图像块属于所述背景区域时,B1,i为1,在所述视频序列第1帧图像的第i个图像块不属于所述背景区域时,所述B1,i为0;
    在所述视频序列第j帧图像第i个图像块不属于背景区域时,根据Bj,i=Bj-1,i,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述前j帧图像中各帧图像的第i个图像块,并将所述视频序列前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差作为所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差,将所述视频序列前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真作为所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,在j为2时,所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差为第一预设值,所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真为第二预设值或所述视频序列第1帧图像的编码失真。
  5. 根据权利要求4所述的方法,其特征在于,
    所述判断所述视频序列第j帧第i个图像块是否属于背景区域,包括:
    判断所述第j帧图像的第i个图像块的子图像块的运动矢量中的最小运动矢量的水平分量是否小于第一预设阈值,且所述最小运动矢量的竖直分量小于第二预设阈值;
    在所述水平分量小于第一预设阈值,且所述竖直分量小于第二预设阈值时,确定所述第j帧图像的第i个图像块属于背景区域;
    在所述水平分量不小于第一预设阈值或者所述竖直分量不小于第二预设阈值时,确定所述第j帧图像的第i个图像块不属于背景区域。
  6. 根据权利要求4或5所述的方法,其特征在于,
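    the computing, according to Bj,i, the variance of the j centroid values corresponding to the i-th image blocks of the first j frames and the average coding distortion of the j coding distortions corresponding to the i-th image blocks of the first j frames includes: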
    获取所述第j帧图像的第i个图像块的质心值和编码失真;
    根据所述第j帧图像第i个图像块的质心值以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差;
    根据所述第j帧图像的第i个图像块的编码失真以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真。
  7. 根据权利要求6所述的方法,其特征在于,
    采用如下公式计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差:
    Figure PCTCN2015085586-appb-100001
    Figure PCTCN2015085586-appb-100002
    其中,μj,i为所述前j个图像块的质心值的均值,μj-1,i为所述前j-1个图像块的质心值的均值,ωj,i为所述第j帧第i个图像块的质心值,
    Figure PCTCN2015085586-appb-100003
    为所述前j个图像块的质心值的方差,
    Figure PCTCN2015085586-appb-100004
    为前j-1个图像块的质心值的方差,所述前j-1个图像块为所述前j-1帧图像中各帧图像的第i个图像块。
  8. 根据权利要求6或7所述的方法,其特征在于,
    采用如下公式计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真:
    Figure PCTCN2015085586-appb-100005
    其中,
    Figure PCTCN2015085586-appb-100006
    为所述第j帧第i个图像块的编码失真,
    Figure PCTCN2015085586-appb-100007
    为所述前j个图像块的平均编码失真,
    Figure PCTCN2015085586-appb-100008
    为所述前j-1个图像块的平均编码失真。
  9. The method according to any one of claims 4 to 8, wherein
    采用如下公式计算所述第一概率:
    Figure PCTCN2015085586-appb-100009
    其中,Pi为所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像第i个图像块参考所述第一背景帧第i个图像块的概率,
    Figure PCTCN2015085586-appb-100010
    Figure PCTCN2015085586-appb-100011
    Figure PCTCN2015085586-appb-100012
    为所述K个图像块的平均编码失真,
    Figure PCTCN2015085586-appb-100013
    为所述K个图像块对应的K个质心值的方差,所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,所述K个图像块的位置与所述第K+1帧图像至所述第K+L帧图像中各帧图像的第i个图像块的位置对应,所述K个图像块与所述第一背景帧的第i个图像块的位置对应,i为正整数。
  10. 根据权利要求3至8中任一所述的方法,其特征在于,
    采用如下公式计算所述第一概率:
    Figure PCTCN2015085586-appb-100014
    其中,Pi为所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的第i个图像块参考所述第一背景帧第i个图像块的概率,
    Figure PCTCN2015085586-appb-100015
    Figure PCTCN2015085586-appb-100016
    Figure PCTCN2015085586-appb-100017
    为所述K个图像块的平均编码失真,
    Figure PCTCN2015085586-appb-100018
    为所述K个图像块对应的K个质心值的方差,所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,所述K个图像块的位置与所述第K+1帧图像至所述第K+L帧图像中各帧图像的第i个图像块的位置对应,所述K个图像块与所述第一背景帧的第i个图像块的位置对应,i为正整数。
  11. The method according to any one of claims 2 to 10, wherein
    采用如下公式计算用于编码所述第一背景帧的QP:
    Figure PCTCN2015085586-appb-100019
    其中,QPi为用于编码所述第一背景帧第i个图像块的QP,Pi为所述视 频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的第i个图像块参考所述第一背景帧第i个图像块的概率,PT为预设值,QPmin为当前编码允许的最小QP值,QPmode为所述视频序列前K帧图像中与所述第一背景帧采用相同的编码模式的图像帧采用的QP中的最小值,所述第K+1帧图像至所述第K+L帧图像中各帧图像的第i个图像块的位置与所述第一背景帧的第i个图像块的位置对应。
  12. 根据权利要求4至10中任一所述的方法,其特征在于,
    采用如下公式计算用于编码所述第一背景帧的QP:
    Figure PCTCN2015085586-appb-100020
    其中,QPi为用于编码所述第一背景帧第i个图像块的QP,Pi为所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的第i个图像块参考所述第一背景帧第i个图像块的概率,PT为预设值,QPmin为当前编码允许的最小QP值,QPmode为所述视频序列前K帧图像中与所述第一背景帧采用相同的编码模式的图像帧采用的QP中的最小值,所述第K+1帧图像至所述第K+L帧图像中各帧图像的第i个图像块的位置与所述第一背景帧的第i个图像块的位置对应。
  13. 根据权利要求4至12中任一所述的方法,其特征在于,
    所述确定第一背景帧,包括:
    采用如下公式对所述前K帧图像进行迭代,从而获得根据所述第K帧图像的像素值计算得到的目标迭代像素值,并将所述目标迭代像素值作为所述第一背景帧的像素值:
    Figure PCTCN2015085586-appb-100021
    其中,Pj,x,y为根据所述视频序列第j帧位置为(x,y)处的像素值计算得到的迭代像素值,Pj,x,y'为根据所述视频序列第j-1帧位置(x,y)处的像素值计算得到的迭代像素值,oj,x,y表示所述视频序列第j帧图像中位置为(x,y)处的像素值。
  14. 根据权利要求2至13中任一所述的方法,其特征在于,在所述获取所述视频序列前K帧图像的第一统计特征之后,所述方法还包括:
    获取所述视频序列第K+1帧图像至所述视频序列第K+L帧图像的第二统计特征;
    确定第二背景帧,所述第二背景帧用于在对所述视频序列第K+L+1帧图像至所述视频序列第K+N帧图像编码时提供参考,其中N>L+1,N为正整数;
    统计所述第一背景帧的第i个图像块作为参考块的次数,所述K个图像块的位置与所述第一背景帧的第i个图像块的位置对应;
    根据所述第一背景帧的第i个图像块作为参考块的次数,及所述第一概率,计算概率预测误差;
    根据所述第二统计特征,计算所述视频序列第K+L+1帧图像到第K+N帧图像参考所述第二背景帧的第二概率;
    根据所述第二概率和所述概率预测误差,计算所述视频序列第K+L+1帧图像到所述视频序列第K+N帧图像中各帧图像参考所述第二背景帧的第三概率;
    根据所述第三概率,计算用于编码所述第二背景帧的QP;
    根据用于编码所述第二背景帧的QP对所述第二背景帧进行编码,从而得到第二背景长期参考帧;
    根据所述第二背景长期参考帧对所述视频序列第K+L+1帧图像至所述视频序列第K+N帧图像进行编码。
  15. 一种视频编码装置,其特征在于,包括:
    获取单元,用于获取视频序列前K帧图像的第一统计特征,K为正整数;
    第一确定单元,用于确定第一背景帧,所述第一背景帧用于在对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像编码时提供参考,L为正整数;
    第二确定单元,用于根据所述获取单元获取的所述第一统计特征确定用于编码所述第一确定单元确定的所述第一背景帧的量化参数QP;
    第一编码单元,用于根据所述第二确定单元确定的所述QP对所述第一背景帧进行编码,从而得到第一背景长期参考帧;
    第二编码单元,用于根据所述第一编码单元得到的所述第一背景长期参考帧对所述视频序列第K+1帧图像至所述视频序列第K+L帧图像进行编码。
  16. 根据权利要求15所述的装置,其特征在于,
    所述第二确定单元用于:
    根据所述第一统计特征,计算所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像的图像块参考所述第一背景帧的第一概率;
    根据所述第一概率,计算用于编码所述第一背景帧的QP。
  17. 根据权利要求15或16所述的装置,其特征在于,
    所述第一统计特征包括K个图像块对应的K个质心值的方差和所述K个图像块对应的K个编码失真的平均编码失真,所述K个图像块为位于所述前K帧图像中各帧图像的相同位置的图像块,所述K个质心值的方差为所述K个质心值构成了质心值序列的方差,所述K个编码失真的平均编码失真为所述K个编码失真构成的编码失真序列的平均值。
  18. 根据权利要求17所述的装置,其特征在于,
    所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,i为正整数;
    所述获取单元用于:
    按从所述前K帧图像的第二帧图像到所述前K帧图像的第K帧图像的顺序进行迭代计算,从而获得所述前K帧图像中各帧图像的第i个图像块对应的K个质心值的方差,和所述前K帧图像中各帧图像的第i个图像块对应的K个编码失真的平均编码失真,其中所述迭代计算包括:
    判断所述视频序列第j帧图像的第i个图像块是否属于背景区域;
    在所述视频序列第j帧图像第i个图像块属于背景区域时,根据Bj,i=Bj-1,i+1,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述视频序列前j帧图像中各帧图像的第i个图像块,根据所述Bj,i计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差和所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,K≥j≥2,其中,在所述视频序列第1帧图像的第i个图像块属于所述背景区域时,B1,i为1,在所述视频序列第1帧图像的第i个图像块不属于所述背景区域时,所述B1,i为0;
    在所述视频序列第j帧图像第i个图像块不属于背景区域时,根据Bj,i=Bj-1,i,得到前j个图像块属于背景区域的个数Bj,i,所述前j个图像块为所述前j帧图像中各帧图像的第i个图像块,并将所述视频序列前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差作为所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差,将所述视频序列前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真作为所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真,其中,在j为2时,所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个质心值的方差为第一预设值,所述前j-1帧图像中各帧图像的第i个图像块对应的j-1个编码失真的平均编码失真为第二预设值或所述视频序列第1帧图像的编码失真。
  19. 根据权利要求18所述的装置,其特征在于,
    所述获取单元用于判断所述第j帧图像的第i个图像块的子图像块的运动矢量中的最小运动矢量的水平分量是否小于第一预设阈值,且所述最小运动矢量的竖直分量小于第二预设阈值,在所述水平分量小于第一预设阈值,且所述竖直分量小于第二预设阈值时,确定所述第j帧图像的第i个图像块属于背景区域,在所述水平分量不小于第一预设阈值或者所述竖直分量不小于第二预设阈值时,确定所述第j帧图像的第i个图像块不属于背景区域。
  20. 根据权利要求18或19所述的装置,其特征在于,
    所述获取单元用于获取所述第j帧图像的第i个图像块的质心值和编码失真;根据所述第j帧图像第i个图像块的质心值以及所述Bj,i,计算所述前j帧 图像中各帧图像的第i个图像块对应的j个质心值的方差;根据所述第j帧图像的第i个图像块的编码失真以及所述Bj,i,计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真。
  21. 根据权利要求20所述的装置,其特征在于,
    所述获取单元用于采用如下公式计算所述前j帧图像中各帧图像的第i个图像块对应的j个质心值的方差:
    Figure PCTCN2015085586-appb-100022
    Figure PCTCN2015085586-appb-100023
    其中,μj,i为所述前j个图像块的质心值的均值,μj-1,i为所述前j-1个图像块的质心值的均值,ωj,i为所述第j帧第i个图像块的质心值,
    Figure PCTCN2015085586-appb-100024
    为所述前j个图像块的质心值的方差,
    Figure PCTCN2015085586-appb-100025
    为前j-1个图像块的质心值的方差,所述前j-1个图像块为所述前j-1帧图像中各帧图像的第i个图像块。
  22. 根据权利要求20或21所述的装置,其特征在于,
    所述获取单元用于采用如下公式计算所述前j帧图像中各帧图像的第i个图像块对应的j个编码失真的平均编码失真:
    Figure PCTCN2015085586-appb-100026
    其中,
    Figure PCTCN2015085586-appb-100027
    为所述第j帧第i个图像块的编码失真,
    Figure PCTCN2015085586-appb-100028
    为所述前j个图像块的平均编码失真,
    Figure PCTCN2015085586-appb-100029
    为所述前j-1个图像块的平均编码失真。
  23. 根据权利要求18至22中任一所述的装置,其特征在于,
    所述第二确定单元用于采用如下公式计算所述第一概率:
    Figure PCTCN2015085586-appb-100030
    其中,Pi为所述视频序列第K+1帧图像至所述视频序列第K+L帧图像中各帧图像第i个图像块参考所述第一背景帧第i个图像块的概率,
    Figure PCTCN2015085586-appb-100031
    Figure PCTCN2015085586-appb-100032
    Figure PCTCN2015085586-appb-100033
    为所述K个图像块的平均编码失真,
    Figure PCTCN2015085586-appb-100034
    为所述K个图像块对应的K个质心值的方差,所述K个图像块为所述前K帧图像中各帧图像的第i个图像块,所述K个图像块的位置与所述第K+1帧图像至所述第K+L帧图像中各帧图像的第i个图像块的位置对应,所述K个图像块与所述第一背景帧的第i个图像块的位置对应,i为正整数。
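  24. The apparatus according to any one of claims 17 to 22, wherein
    the second determining unit is configured to calculate the first probability by using the following formula:
    Figure PCTCN2015085586-appb-100035
    where P_i is the probability that the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image of the video sequence references the i-th image block of the first background frame,
    Figure PCTCN2015085586-appb-100036
    Figure PCTCN2015085586-appb-100037
    Figure PCTCN2015085586-appb-100038
    is the average coding distortion of the K image blocks, and
    Figure PCTCN2015085586-appb-100039
    is the variance of the K centroid values corresponding to the K image blocks, where the K image blocks are the i-th image blocks of the respective frame images in the first K frame images, the positions of the K image blocks correspond to the positions of the i-th image blocks of the respective frame images from the (K+1)-th frame image to the (K+L)-th frame image, the positions of the K image blocks correspond to the position of the i-th image block of the first background frame, and i is a positive integer.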
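  25. The apparatus according to any one of claims 16 to 24, wherein
    the second determining unit is configured to calculate, by using the following formula, the QP used for encoding the first background frame:
    Figure PCTCN2015085586-appb-100040
    where QP_i is the QP used for encoding the i-th image block of the first background frame, P_i is the probability that the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image of the video sequence references the i-th image block of the first background frame, P_T is a preset value, QP_min is the minimum QP value allowed by the current encoding, QP_mode is the minimum of the QPs used by those image frames, among the first K frame images of the video sequence, that use the same coding mode as the first background frame, and the position of the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image corresponds to the position of the i-th image block of the first background frame.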
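  26. The apparatus according to any one of claims 18 to 24, wherein
    the second determining unit is configured to calculate, by using the following formula, the QP used for encoding the first background frame:
    Figure PCTCN2015085586-appb-100041
    where QP_i is the QP used for encoding the i-th image block of the first background frame, P_i is the probability that the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image of the video sequence references the i-th image block of the first background frame, P_T is a preset value, QP_min is the minimum QP value allowed by the current encoding, QP_mode is the minimum of the QPs used by those image frames, among the first K frame images of the video sequence, that use the same coding mode as the first background frame, and the position of the i-th image block of each frame image from the (K+1)-th frame image to the (K+L)-th frame image corresponds to the position of the i-th image block of the first background frame.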
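  27. The apparatus according to any one of claims 18 to 26, wherein
    the first determining unit is configured to iterate over the first K frame images by using the following formula, so as to obtain a target iterative pixel value calculated from the pixel values of the K-th frame image, and to use the target iterative pixel value as the pixel value of the first background frame:
    Figure PCTCN2015085586-appb-100042
    where P_{j,x,y} is the iterative pixel value calculated from the pixel value at position (x, y) in the j-th frame of the video sequence, P_{j,x,y}' is the iterative pixel value calculated from the pixel value at position (x, y) in the (j-1)-th frame of the video sequence, and o_{j,x,y} denotes the pixel value at position (x, y) in the j-th frame image of the video sequence.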
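  28. The apparatus according to any one of claims 16 to 27, wherein
    the obtaining unit is further configured to obtain a second statistical feature of the (K+1)-th frame image to the (K+L)-th frame image of the video sequence;
    the first determining unit is further configured to determine a second background frame, where the second background frame is used to provide a reference when the (K+L+1)-th frame image to the (K+N)-th frame image of the video sequence are encoded, N > L+1, and N is a positive integer;
    the second determining unit is further configured to: count the number of times the i-th image block of the first background frame is used as a reference block, where the positions of the K image blocks correspond to the position of the i-th image block of the first background frame; calculate a probability prediction error according to the number of times the i-th image block of the first background frame is used as a reference block and the first probability; calculate, according to the second statistical feature obtained by the obtaining unit, a second probability that the (K+L+1)-th frame image to the (K+N)-th frame image of the video sequence reference the second background frame determined by the first determining unit; calculate, according to the second probability and the probability prediction error, a third probability that each frame image from the (K+L+1)-th frame image to the (K+N)-th frame image of the video sequence references the second background frame; and calculate, according to the third probability, the QP used for encoding the second background frame;
    the first encoding unit is further configured to encode the second background frame according to the QP, determined by the second determining unit, for encoding the second background frame, so as to obtain a second background long-term reference frame; and
    the second encoding unit is further configured to encode the (K+L+1)-th frame image to the (K+N)-th frame image of the video sequence according to the second background long-term reference frame obtained by the first encoding unit.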
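The feedback step of claim 28 can be pictured with the sketch below. The claim does not say how the prediction error is formed or how it is combined with the second probability, so everything specific here is an assumption: the observed reference rate is taken as the reference count divided by the number of frames L, the error as observed minus predicted, and the third probability as the second probability plus that error, clamped to [0, 1]. Both function names are hypothetical.

```python
def probability_prediction_error(ref_count: int,
                                 num_frames: int,
                                 first_probability: float) -> float:
    """Difference between the observed reference rate and the predicted one.

    ref_count:         times the i-th block of the first background frame
                       was actually used as a reference block
    num_frames:        number of frames that could reference it (L)
    first_probability: the probability predicted before encoding those frames
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    observed = ref_count / num_frames
    return observed - first_probability


def corrected_probability(second_probability: float, error: float) -> float:
    """Apply the (assumed additive) correction and clamp to a valid probability."""
    return min(1.0, max(0.0, second_probability + error))
```

Under these assumptions, a block that was referenced more often than predicted raises the probability used for the second background frame, and a block referenced less often lowers it.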
PCT/CN2015/085586 2015-07-30 2015-07-30 Video encoding and decoding method and apparatus WO2017015958A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201580077248.0A CN107409211B (zh) 2015-07-30 2015-07-30 Video encoding and decoding method and apparatus
EP15899310.5A EP3319317B1 (en) 2015-07-30 2015-07-30 Video encoding and decoding method and device
PCT/CN2015/085586 WO2017015958A1 (zh) 2015-07-30 2015-07-30 Video encoding and decoding method and apparatus
US15/882,346 US10560719B2 (en) 2015-07-30 2018-01-29 Video encoding and decoding method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/085586 WO2017015958A1 (zh) 2015-07-30 2015-07-30 Video encoding and decoding method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/882,346 Continuation US10560719B2 (en) 2015-07-30 2018-01-29 Video encoding and decoding method and apparatus

Publications (1)

Publication Number Publication Date
WO2017015958A1 (zh) 2017-02-02

Family

ID=57884004

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/085586 WO2017015958A1 (zh) 2015-07-30 2015-07-30 Video encoding and decoding method and apparatus

Country Status (4)

Country Link
US (1) US10560719B2 (zh)
EP (1) EP3319317B1 (zh)
CN (1) CN107409211B (zh)
WO (1) WO2017015958A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10313686B2 (en) * 2016-09-20 2019-06-04 Gopro, Inc. Apparatus and methods for compressing video content using adaptive projection selection
CN113709473B (zh) 2019-03-11 2022-11-25 杭州海康威视数字技术股份有限公司 Encoding and decoding method, apparatus, and device thereof
CN112822520B (zh) * 2020-12-31 2023-06-16 武汉球之道科技有限公司 Server-side encoding method for online event video

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102595135A (zh) * 2012-02-24 2012-07-18 中国科学技术大学 Scalable video coding method and apparatus
CN102665077A (zh) * 2012-05-03 2012-09-12 北京大学 Fast and efficient encoding and transcoding method based on macroblock classification
GB2501495A (en) * 2012-04-24 2013-10-30 Canon Kk Selection of image encoding mode based on preliminary prediction-based encoding stage

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6055330A (en) * 1996-10-09 2000-04-25 The Trustees Of Columbia University In The City Of New York Methods and apparatus for performing digital image and video segmentation and compression using 3-D depth information
US6490319B1 (en) * 1999-06-22 2002-12-03 Intel Corporation Region of interest video coding
US6879723B1 (en) * 1999-11-12 2005-04-12 8X8, Inc. Method and apparatus for encoding frames of image data at a varying quality level
US7321624B1 (en) * 2001-03-16 2008-01-22 Objectvideo, Inc. Bit-rate allocation system for object-based video encoding
US8040949B2 (en) * 2003-01-09 2011-10-18 The Regents Of The University Of California Video encoding methods and devices
US20070092148A1 (en) * 2005-10-20 2007-04-26 Ban Oliver K Method and apparatus for digital image rudundancy removal by selective quantization
US7653130B2 (en) * 2006-12-27 2010-01-26 General Instrument Corporation Method and apparatus for bit rate reduction in video telephony
US8243790B2 (en) * 2007-09-28 2012-08-14 Dolby Laboratories Licensing Corporation Treating video information
WO2010057170A1 (en) * 2008-11-17 2010-05-20 Cernium Corporation Analytics-modulated coding of surveillance video
EP2224745B1 (en) * 2009-02-27 2019-11-06 STMicroelectronics Srl Temporal scalability in case of scene changes
CN103167283B (zh) * 2011-12-19 2016-03-02 华为技术有限公司 一种视频编码方法及设备
US9681125B2 (en) * 2011-12-29 2017-06-13 Pelco, Inc Method and system for video coding with noise filtering
US9565440B2 (en) 2013-06-25 2017-02-07 Vixs Systems Inc. Quantization parameter adjustment based on sum of variance and estimated picture encoding cost
US9584814B2 (en) * 2014-05-15 2017-02-28 Intel Corporation Content adaptive background foreground segmentation for video coding
US10070142B2 (en) * 2014-11-11 2018-09-04 Cisco Technology, Inc. Continuous generation of non-displayed reference frame in video encoding and decoding

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102595135A (zh) * 2012-02-24 2012-07-18 中国科学技术大学 Scalable video coding method and apparatus
GB2501495A (en) * 2012-04-24 2013-10-30 Canon Kk Selection of image encoding mode based on preliminary prediction-based encoding stage
CN102665077A (zh) * 2012-05-03 2012-09-12 北京大学 Fast and efficient encoding and transcoding method based on macroblock classification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BIN, LI ET AL.: "QP Refinement According to Lagrange Multiplier for High Efficiency Video Coding.", 2013 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS., 31 December 2012 (2012-12-31), XP032445959 *

Also Published As

Publication number Publication date
CN107409211A (zh) 2017-11-28
CN107409211B (zh) 2019-10-22
EP3319317A4 (en) 2018-05-09
EP3319317A1 (en) 2018-05-09
US10560719B2 (en) 2020-02-11
EP3319317B1 (en) 2021-04-28
US20180152728A1 (en) 2018-05-31

Similar Documents

Publication Publication Date Title
JP5479504B2 (ja) Decoder-side region-of-interest video processing
US10097821B2 (en) Hybrid-resolution encoding and decoding method and a video apparatus using the same
US9210420B1 (en) Method and apparatus for encoding video by changing frame resolution
US20220058775A1 (en) Video denoising method and apparatus, and storage medium
CN111819854B (zh) Method and apparatus for harmonizing multiple sign bit hiding and residual sign prediction
US10805606B2 (en) Encoding method and device and decoding method and device
JP2006519565A (ja) Video coding
KR20080098042A (ko) Method and apparatus for determining an encoding method based on a distortion value related to error concealment
US11064211B2 (en) Advanced video coding method, system, apparatus, and storage medium
KR102606414B1 (ko) Encoder, decoder, and corresponding method for deriving the boundary strength of a deblocking filter
WO2003061295A2 (en) Sharpness enhancement in post-processing of digital video signals using coding information and local spatial features
US20200021850A1 (en) Video data decoding method, decoding apparatus, encoding method, and encoding apparatus
CN101984665A (zh) Method and system for evaluating video transmission quality
US8379985B2 (en) Dominant gradient method for finding focused objects
EP3741127A1 (en) Loop filter apparatus and method for video coding
US10560719B2 (en) Video encoding and decoding method and apparatus
US9438925B2 (en) Video encoder with block merging and methods for use therewith
WO2021185257A1 (zh) Image encoding method, image decoding method, and related apparatus
WO2023051156A1 (zh) Video image processing method and apparatus
WO2012033972A1 (en) Methods and apparatus for pruning decision optimization in example-based data pruning compression
Vranjes et al. Subjective and objective quality evaluation of the H. 264/AVC coded video
CN114827616A (zh) Compressed video quality enhancement method based on spatio-temporal information balance
US9542611B1 (en) Logo detection for macroblock-based video processing
Zhang et al. Overview of the IEEE 1857 surveillance groups
Patnaik et al. H. 264/AVC/MPEG video coding with an emphasis to bidirectional prediction frames

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15899310

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2015899310

Country of ref document: EP