WO2022061613A1 - Video coding apparatus and method, and computer storage medium and mobile platform - Google Patents

Info

Publication number
WO2022061613A1
Authority
WO
WIPO (PCT)
Prior art keywords
module
encoding
mode
sub
video
Prior art date
Application number
PCT/CN2020/117220
Other languages
French (fr)
Chinese (zh)
Inventor
王悦名
郑萧桢
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2020/117220 (WO2022061613A1)
Priority to CN202080013403.3A (CN113454997A)
Publication of WO2022061613A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124: Quantisation
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157: Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159: Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/70: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/80: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
    • H04N19/90: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91: Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • The present invention relates to the technical field of video coding, and in particular to a video coding apparatus and method, a computer storage medium, and a movable platform.
  • The H.264 video coding standard is a high-compression digital video codec standard proposed by the Joint Video Team (JVT), formed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Compared with previous video coding standards, the H.264 video coding standard can provide better image quality at the same bandwidth.
  • The H.265 video coding standard is a newer video coding standard formulated by the ITU-T Video Coding Experts Group following the H.264 video coding standard. It retains some technologies of the H.264 video coding standard and improves on them.
  • A chip contains multiple independent encoders to implement video encoding under different video coding standards. To encode video streams in both the H.264 and H.265 formats, two different encoders must be provided. However, multiple encoders consume more hardware area.
  • A first aspect of the embodiments of the present invention provides a video encoding apparatus, where the video encoding apparatus includes:
  • an integer pixel search module for determining, within a plurality of predetermined ranges in a plurality of reference frames, a matching block that matches the current block in the current frame;
  • a sub-pixel search module electrically connected to the integer pixel search module and configured to determine at least one sub-pixel matching block for the matching block;
  • a mode decision module electrically connected to the sub-pixel search module, for performing mode decision using at least the coding cost of the sub-pixel matching block, to obtain the optimal prediction block of the current block for video coding;
  • wherein the sub-pixel search module includes a half-pixel interpolation module, and the half-pixel interpolation module can use a first interpolation filter to perform half-pixel interpolation on both a video stream in the H.264 encoding format and a video stream in the H.265 encoding format.
  • A second aspect of the embodiments of the present invention provides a video encoding method, where the video encoding method includes:
  • an integer pixel search module determines, within a plurality of predetermined ranges in a plurality of reference frames, a matching block that matches the current block in the current frame;
  • a sub-pixel search module electrically connected to the integer pixel search module determines at least one sub-pixel matching block for the matching block, wherein the sub-pixel search module includes a half-pixel interpolation module, and determining the at least one sub-pixel matching block for the matching block includes: the half-pixel interpolation module using a first interpolation filter to perform half-pixel interpolation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format;
  • a mode decision module electrically connected to the sub-pixel search module makes a mode decision using at least the coding cost of the sub-pixel matching block, to obtain an optimal prediction block of the current block for video encoding.
  • a third aspect of the embodiments of the present invention provides a computer storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the steps of the above video encoding method.
  • A fourth aspect of the embodiments of the present invention provides a movable platform, where the movable platform includes an imaging device and the above video encoding apparatus, the imaging device is used to collect video data, and the video encoding apparatus is used to perform video encoding on the video data collected by the imaging device.
  • the video encoding apparatus, method, computer storage medium and movable platform of the embodiments of the present invention multiplex part of the hardware structure to encode video streams in H.264 encoding format and H.265 encoding format, saving hardware area.
  • FIG. 1 shows a structural block diagram of a video encoding apparatus according to an embodiment of the present invention
  • FIG. 2 shows a schematic diagram of a pipeline stage of a video encoding apparatus according to an embodiment of the present invention
  • FIG. 3 shows a flowchart of a video encoding method according to an embodiment of the present invention
  • FIG. 4 shows a structural block diagram of a movable platform according to an embodiment of the present invention.
  • Both H.264 and H.265 video coding standards adopt a hybrid coding framework, both of which include basic processes such as prediction, transformation, quantization, inverse transformation, inverse quantization, entropy coding, and loop filtering.
  • the video frame input to the video coding device is firstly divided into sub-blocks, which are implemented as macroblocks in the H.264 video coding standard and as coding tree units in the H.265 video coding standard. After that, each sub-block can be further divided into smaller sub-blocks. Each divided sub-block needs to be predicted first.
  • Prediction is divided into intra-frame prediction and inter-frame prediction. Intra-frame prediction uses already-encoded image blocks in the same frame to predict the current block, and inter-frame prediction uses already-encoded image blocks in one or more previous frames to predict the current block.
  • the prediction block of the current block is obtained through the above prediction process, and the residual block is obtained by subtracting the prediction block from the current block.
  • The video encoding apparatus transforms the residual block, converting it from the spatial domain to the frequency domain, and quantizes the frequency-domain coefficients to reduce their magnitude.
  • On the one hand, the quantized coefficients are sent to the entropy encoder together with the encoded mode information to obtain a binary code stream; on the other hand, inverse quantization and inverse transformation are performed on them to restore the prediction residual block (i.e., the reconstructed residual block), and the reconstructed residual block is added to the prediction block to obtain the reconstructed block. Finally, in-loop filtering is performed on the reconstructed image to obtain the final reconstructed image, which is then used as a reference for inter-frame prediction of subsequently encoded images.
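  • The prediction, transform, quantization and reconstruction steps described above can be sketched for a single 4×4 block as follows. This is only an illustrative sketch: the 4×4 Hadamard matrix stands in for the integer transforms that the standards actually specify, the uniform quantization step `qstep` is an assumption, and the names `encode_block`, `matmul` and `H4` are hypothetical and do not come from the application.

```python
# Symmetric 4x4 Hadamard matrix; H4 * H4 = 4 * I, so the transform is invertible.
# Assumption: used here in place of the standards' integer transforms.
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def encode_block(current, prediction, qstep):
    """Return (quantized levels, reconstructed block) for one 4x4 block."""
    # Residual = current block minus prediction block.
    residual = [[c - p for c, p in zip(cr, pr)]
                for cr, pr in zip(current, prediction)]
    # Forward transform, then uniform quantization.
    coeffs = matmul(matmul(H4, residual), H4)
    levels = [[round(c / qstep) for c in row] for row in coeffs]
    # Decoder-side path: inverse quantization and inverse transform.
    dequant = [[lv * qstep for lv in row] for row in levels]
    recon_res = [[(v + 8) // 16 for v in row]
                 for row in matmul(matmul(H4, dequant), H4)]
    # Reconstructed block = prediction block plus reconstructed residual.
    recon = [[p + r for p, r in zip(pr, rr)]
             for pr, rr in zip(prediction, recon_res)]
    return levels, recon
```

  • With `qstep = 1` the sketch reconstructs the input exactly; larger steps shrink the levels passed to entropy coding at the price of reconstruction error.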
  • A chip includes multiple encoding devices, each of which performs video encoding based on its corresponding video coding standard. If each encoding device is independent, a large hardware area is consumed. Since the H.264 and H.265 video coding standards both adopt similar hybrid coding frameworks and share many similar or identical modules, the video coding apparatus according to the embodiment of the present invention integrates the common parts and reuses the same hardware to perform video encoding in both the H.264 and H.265 encoding formats, thereby saving hardware area.
  • FIG. 1 shows a structural block diagram of a video encoding apparatus 100 according to an embodiment of the present invention.
  • The video encoding apparatus 100 includes at least an integer pixel search module 110, a sub-pixel search module 120 and a mode decision module 130.
  • The integer pixel search module 110 is used to determine, within a plurality of predetermined ranges in a plurality of reference frames, a matching block that matches the current block in the current frame.
  • The sub-pixel search module 120 is electrically connected to the integer pixel search module 110 and is used to determine at least one sub-pixel matching block for the matching block.
  • The mode decision module 130 is electrically connected to the sub-pixel search module 120 and is used to make a mode decision using at least the encoding cost of the sub-pixel matching block, to obtain the optimal prediction block of the current block for video encoding.
  • The sub-pixel search module 120 includes a half-pixel interpolation module, and the half-pixel interpolation module can use a first interpolation filter to perform half-pixel interpolation on both video streams in the H.264 encoding format and video streams in the H.265 encoding format.
  • The mode decision module 130 subtracts the prediction block from the current block to obtain a residual block.
  • the mode decision module 130 transforms the residual block to obtain a coefficient block, and quantizes the coefficient block to obtain a quantized coefficient block.
  • the mode decision module 130 transmits the quantized coefficient block and mode information to the entropy encoding module for entropy encoding.
  • the mode information includes at least information related to block division and prediction mode.
  • The video encoding apparatus 100 multiplexes the same hardware structure to encode video streams in the H.264 and H.265 encoding formats, thereby saving hardware area. The multiplexed hardware structure includes at least the half-pixel interpolation module; that is, the half-pixel interpolation module in the video encoding apparatus 100 can be used to perform half-pixel interpolation on a video stream in the H.264 encoding format and can also be used to perform half-pixel interpolation on a video stream in the H.265 encoding format.
  • The integer pixel search module 110 and the sub-pixel search module 120 are used to perform inter-frame prediction on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format, finding the matching block of the current block in the reference frame, so that temporal redundancy is eliminated on the basis of already-encoded video frames.
  • The integer pixel search module 110 is also used to determine a first motion vector between the current block and the matching block, and the sub-pixel search module 120 is also used to determine a second motion vector of the current block relative to the at least one sub-pixel matching block. The precision of the second motion vector is higher than that of the first motion vector; that is, the first motion vector has integer-pixel precision and the second motion vector has sub-pixel precision.
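  • One common way to represent the relationship between the two motion vectors is to store the sub-pixel motion vector in units of a quarter pixel, so that the integer-precision vector and the sub-pixel refinement combine into a single pair of integers. The sketch below assumes quarter-pixel units (`precision=4`); the function names `combine_mv` and `split_mv` are hypothetical and are not taken from the application.

```python
def combine_mv(int_mv, frac_offset, precision=4):
    """Combine an integer-pixel MV with a sub-pixel offset.

    int_mv      -- (x, y) in whole pixels (the first motion vector)
    frac_offset -- (x, y) refinement in 1/precision pixel units
    Returns the refined MV in 1/precision pixel units (the second motion vector).
    """
    return (int_mv[0] * precision + frac_offset[0],
            int_mv[1] * precision + frac_offset[1])


def split_mv(mv, precision=4):
    """Split a sub-pixel MV into its integer part and non-negative fractional phase."""
    integer = (mv[0] // precision, mv[1] // precision)
    phase = (mv[0] % precision, mv[1] % precision)
    return integer, phase
```

  • The fractional phase returned by `split_mv` (0, 1, 2 or 3 for quarter-pixel units) is what would select between copying integer pixels, half-pixel interpolation and quarter-pixel interpolation.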
  • For H.264, the current block and the matching block are macroblocks or sub-macroblocks; for H.265, the current block is a coding unit and the matching block is a prediction unit.
  • The H.264 coding format supports macroblock and sub-macroblock partitions of 7 different sizes and shapes. It provides four macroblock partition modes for the luminance component (16×16, 16×8, 8×16 and 8×8); an 8×8 partition can be further divided into three sub-macroblock types of 8×4, 4×8 and 4×4; and each macroblock or sub-macroblock partition has its own motion vector.
  • In H.265, the similar partition structure is the coding tree unit (CTU), whose size can be at most 64×64 and at least 16×16.
  • A coding tree unit contains one luma coding tree block (CTB) and two chroma coding tree blocks at the same location, as well as some corresponding syntax elements.
  • A coding tree block can be used directly as a coding block (CB), or can be further divided into multiple smaller CBs in the form of a quadtree.
  • One luma CB, two chroma CBs and some related syntax elements together form a coding unit (CU).
  • Each CU can be divided into one or more corresponding prediction units (PUs), each PU can obtain its own motion vector, and the motion vector of each PU can be used to obtain prediction information from the reconstructed reference frame.
  • the current block in this embodiment of the present application refers to the smallest prediction unit divided according to the corresponding video coding standard, and the size of the current block at different positions in the current frame may be different.
  • For H.264 and H.265, the manner of integer pixel search is basically the same, so the integer pixel search module 110 performs integer pixel search on the video stream in the H.264 encoding format and the video stream in the H.265 encoding format based on exactly the same hardware structure. Specifically, it may include a candidate motion vector acquisition sub-module, a search area determination sub-module, and an integer pixel search sub-module; both the video stream in the H.264 encoding format and the video stream in the H.265 encoding format are searched at integer-pixel precision through these three sub-modules.
  • The candidate motion vector acquisition sub-module is used to acquire candidate motion vectors. The candidate motion vectors may be one or more of: the motion vectors of spatially adjacent blocks of the current block, the motion vectors of temporally adjacent blocks, the global motion vector, and the zero motion vector.
  • the search area determination sub-module is used for determining the search area of the integer pixel search according to the candidate motion vector.
  • the integer pixel search sub-module is used to take the position pointed by the candidate motion vector as the starting search point, and perform an integer pixel search on all or part of the points in the search area, calculate the coding cost at each point during the search, and select the one with the smallest coding cost. point as the optimal search result.
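  • The integer pixel search described above can be illustrated with a small software model. The sketch below assumes a square search window of ±`search_range` pixels around each candidate motion vector, an exhaustive search of every point in that window, and the sum of absolute differences as the coding cost; these choices and the names `sad_at` and `integer_search` are illustrative assumptions, not details given in the application.

```python
def sad_at(cur, ref, x, y):
    """Sum of absolute differences between cur and the ref block at (x, y)."""
    n = len(cur)
    return sum(abs(cur[j][i] - ref[y + j][x + i])
               for j in range(n) for i in range(n))


def integer_search(cur, ref, bx, by, candidate_mvs, search_range):
    """Search around each candidate MV; return (best cost, best MV) or None.

    cur           -- current block (square list of rows)
    ref           -- reference frame (list of rows)
    bx, by        -- top-left position of the current block in its frame
    candidate_mvs -- list of (mvx, mvy) starting points in whole pixels
    """
    n = len(cur)
    height, width = len(ref), len(ref[0])
    best = None
    for cmx, cmy in candidate_mvs:
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                mvx, mvy = cmx + dx, cmy + dy
                x, y = bx + mvx, by + mvy
                # Skip positions whose block would fall outside the frame.
                if x < 0 or y < 0 or x + n > width or y + n > height:
                    continue
                cost = sad_at(cur, ref, x, y)
                if best is None or cost < best[0]:
                    best = (cost, (mvx, mvy))
    return best
```

  • A hardware implementation would typically visit only part of the search area or evaluate many points in parallel; the exhaustive loop is used here only because it is the simplest correct instance of the idea.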
  • The sub-pixel search module 120 is electrically connected to the integer pixel search module 110, and is configured to perform a sub-pixel search on the basis of the matching block obtained by the integer pixel search, so as to further improve the search accuracy.
  • the sub-pixel search mainly includes two parts: interpolation and coding cost calculation.
  • If the motion vector points to an integer pixel position, the prediction block may be composed of the corresponding pixels of the reference frame; otherwise, the prediction block is obtained by interpolation, using a filter to produce pixels at non-integer positions.
  • Sub-pixel search includes half-pixel precision and quarter-pixel precision. For interpolation at half-pixel positions, as described above, the sub-pixel search module 120 uses the first interpolation filter to perform half-pixel interpolation on both the video stream in the H.264 encoding format and the video stream in the H.265 encoding format.
  • The first interpolation filter is an 8-tap interpolation filter; that is, an 8-tap interpolation filter is used to perform half-pixel interpolation for both the video stream in the H.264 encoding format and the video stream in the H.265 encoding format. However, because the predicted value of the sampled signal at a half-pixel position in the H.264 video coding standard is obtained by applying one-dimensional horizontal and vertical 6-tap filtering, two taps of the 8-tap interpolation filter do not participate in the operation when half-pixel interpolation is performed on the video stream in the H.264 encoding format.
  • The coefficients of the two taps in the 8-tap interpolation filter that do not participate in the operation may be set to 0.
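  • The shared half-pixel filter can be modelled as one 8-tap routine whose coefficients are switched by codec. The sketch below uses the half-sample luma coefficients of the two standards, with the H.264 6-tap filter padded to 8 taps by zeros on the outer taps. It is a one-dimensional illustration only: border clamping, 8-bit clipping and the single rounding stage are simplifying assumptions, and the names `HALF_PEL_TAPS` and `half_pel` are hypothetical.

```python
# (coefficients, right shift). H.264: 6-tap (1,-5,20,20,-5,1)/32 padded with
# two zero taps; H.265: 8-tap (-1,4,-11,40,40,-11,4,-1)/64.
HALF_PEL_TAPS = {
    "h264": ((0, 1, -5, 20, 20, -5, 1, 0), 5),
    "h265": ((-1, 4, -11, 40, 40, -11, 4, -1), 6),
}


def half_pel(samples, pos, codec):
    """Interpolate the half-pixel value between samples[pos] and samples[pos + 1].

    samples -- one row (or column) of integer-position pixel values, 0..255
    codec   -- "h264" or "h265"; selects the coefficient set only
    """
    taps, shift = HALF_PEL_TAPS[codec]
    acc = 0
    for k, coeff in enumerate(taps):
        # Taps cover pos-3 .. pos+4; clamp indices at the row borders.
        idx = min(max(pos - 3 + k, 0), len(samples) - 1)
        acc += coeff * samples[idx]
    value = (acc + (1 << (shift - 1))) >> shift  # round to nearest
    return min(max(value, 0), 255)               # clip to 8-bit range
```

  • Because only the coefficient table and the shift differ between the two codecs, the multiply-accumulate datapath is identical, which is the property that allows the hardware to be shared.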
  • The sub-pixel search module 120 includes a first quarter-pixel interpolation module and a second quarter-pixel interpolation module. The first quarter-pixel interpolation module performs quarter-pixel interpolation on the video stream in the H.264 encoding format based on a second interpolation filter, and the second quarter-pixel interpolation module performs quarter-pixel interpolation on the video stream in the H.265 encoding format based on a third interpolation filter. That is to say, because quarter-pixel interpolation differs greatly between the H.264 and H.265 video coding standards, the hardware used for quarter-pixel interpolation is not multiplexed; separate hardware is used for each format.
  • The second interpolation filter may be a 2-pixel mean filter, which averages two adjacent integer pixels or half pixels to obtain the pixel value at a quarter-pixel position. The two adjacent pixels used for the average may be integer pixels or half pixels in the horizontal, vertical or diagonal direction of the quarter-pixel position.
  • The third interpolation filter may be a 7-tap or 8-tap interpolation filter that computes a weighted average of adjacent integer pixels or half pixels to obtain pixel values at quarter-pixel positions.
  • The third interpolation filter may be a horizontal or vertical 7-tap interpolation filter.
  • Alternatively, the third interpolation filter may be a horizontal or vertical 8-tap interpolation filter, in which case the coefficient of the one tap in the 8-tap interpolation filter that does not participate in the operation is 0.
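  • The difference between the two quarter-pixel paths can be seen in a short sketch. For H.264 it uses the rounded mean of two neighbouring samples, and for H.265 it uses the 7-tap quarter-sample luma coefficients (-1, 4, -10, 58, 17, -5, 1)/64 padded to 8 taps with one zero. The one-dimensional form, border clamping and 8-bit clipping are simplifying assumptions, and the function names are hypothetical.

```python
# H.265 quarter-sample luma filter, padded to 8 taps; coefficients sum to 64.
QUARTER_TAPS_H265 = (-1, 4, -10, 58, 17, -5, 1, 0)


def quarter_pel_h264(a, b):
    """2-pixel mean filter: rounded average of two neighbouring integer or half pixels."""
    return (a + b + 1) >> 1


def quarter_pel_h265(samples, pos):
    """Quarter-pixel value located a quarter pixel after samples[pos]."""
    acc = 0
    for k, coeff in enumerate(QUARTER_TAPS_H265):
        # Taps cover pos-3 .. pos+4; clamp indices at the row borders.
        idx = min(max(pos - 3 + k, 0), len(samples) - 1)
        acc += coeff * samples[idx]
    value = (acc + 32) >> 6           # divide by 64 with rounding
    return min(max(value, 0), 255)    # clip to 8-bit range
```

  • The H.264 path needs a half-pixel value as one of its inputs, whereas the H.265 path filters integer pixels directly, which illustrates why the two are hard to merge into one datapath.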
  • The sub-pixel search module 120 further includes a coding cost calculation sub-module, which is used to calculate, based on the same hardware structure, a first encoding cost between the sub-pixel matching block and the current block for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format. That is to say, the sub-pixel search module 120 calculates the first encoding cost of the sub-pixel search for the video stream in the H.264 encoding format and the video stream in the H.265 encoding format based on the same hardware structure, so that this part of the hardware structure is multiplexed.
  • The coding cost calculation sub-module can use a SAD/SATD cost function model to calculate the coding cost of the sub-pixel search, where SAD is the sum of absolute differences and SATD is the sum of absolute transformed differences.
  • The SAD/SATD cost function model uses the difference between the predicted value and the image pixel value to calculate the cost, which essentially reflects the difference between the current block and the prediction block.
  • For SATD, the residual is transformed to the frequency domain and the absolute values of the transformed differences are summed; the coding cost can then be calculated from the SATD.
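  • As an illustration of the two cost measures, the sketch below computes SAD directly on the pixel differences and SATD on a 4×4 Hadamard transform of those differences. The Hadamard transform and the division of the SATD sum by 2 are common encoder conventions assumed here; the application does not specify the transform or the normalization, and the function names are hypothetical.

```python
# Symmetric 4x4 Hadamard matrix (assumed transform for SATD).
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sad(cur, pred):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(c - p) for cr, pr in zip(cur, pred) for c, p in zip(cr, pr))


def satd4x4(cur, pred):
    """Sum of absolute Hadamard-transformed differences of two 4x4 blocks."""
    diff = [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(cur, pred)]
    transformed = matmul(matmul(H4, diff), H4)
    # Normalization by 2 is an assumed convention, not taken from the application.
    return sum(abs(v) for row in transformed for v in row) // 2
```

  • SATD tends to track the number of bits needed after the real transform more closely than SAD, at the cost of the extra transform stage.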
  • the video encoding apparatus 100 further includes an intra-frame mode preliminary selection module 140 for selecting one or more optimal intra-frame prediction modes from multiple intra-frame prediction modes.
  • The intra-frame mode preliminary selection module 140 is connected to the mode decision module 130, and is configured to determine, according to the pixel values of at least one adjacent reference block in the current frame, at least one prediction block for the current block and a second encoding cost corresponding to the at least one prediction block, and to determine at least one intra-frame prediction mode according to the second encoding cost.
  • Intra-frame prediction can make full use of the relevant information of the adjacent reference blocks for coding according to the correlation between the current block and its adjacent reference blocks, thereby improving the coding efficiency.
  • In H.264, there are 4 optional prediction modes for 16×16 luminance blocks and 8×8 chrominance blocks: vertical mode, horizontal mode, DC mode and plane (Plane) mode.
  • For 4×4 and 8×8 luminance blocks, there are 9 optional prediction modes, including horizontal prediction, vertical prediction, DC mode, and 6 directional prediction modes such as left diagonal and right diagonal. The plane mode is based on the pixels directly above and to the left, and uses the linear function Plane to predict the pixel values of the current block.
  • In H.265, 35 intra-frame prediction modes are defined on the basis of the PU, including Planar mode, DC mode, vertical mode, horizontal mode and 31 other angular modes.
  • the prediction direction of each angle mode can be regarded as a certain offset in the vertical or horizontal direction.
  • On the one hand, the intra-frame mode preliminary selection module 140 multiplexes part of the hardware structure to perform the parts of intra-frame prediction that are the same in the H.264 and H.265 encoding formats; on the other hand, it provides different hardware structures for the H.264 and H.265 encoding formats to perform the parts of intra-frame prediction that differ between them.
  • The intra-frame mode preliminary selection module 140 includes a first intra-frame mode preliminary selection module, a second intra-frame mode preliminary selection module and a common intra-frame mode preliminary selection module. The common intra-frame mode preliminary selection module is hardware multiplexed by the H.264 and H.265 encoding formats: together with the first intra-frame mode preliminary selection module, it selects the intra-frame prediction mode for the video stream in the H.264 encoding format, and together with the second intra-frame mode preliminary selection module, it selects the intra-frame prediction mode for the video stream in the H.265 encoding format.
  • Intra-frame prediction mainly includes two parts: intra-frame prediction interpolation and coding cost calculation.
  • For H.264 and H.265, the hardware structure corresponding to the intra-frame prediction interpolation of some intra-frame prediction modes is the same, including horizontal prediction, vertical prediction and part of the direct current (DC) prediction. Therefore, the common intra-frame mode preliminary selection module includes a horizontal prediction sub-module, a vertical prediction sub-module and a partial DC prediction sub-module.
  • the horizontal prediction sub-module is used to perform intra-frame prediction interpolation in the horizontal mode on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure.
  • the vertical prediction sub-module is used to perform intra-frame prediction interpolation in vertical mode on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure.
  • The partial DC prediction sub-module is used to perform, based on the same hardware structure, intra-frame prediction interpolation in the DC mode on the image blocks corresponding to the luminance component in the video stream in the H.264 encoding format and on the image blocks corresponding to the luminance and chrominance components in the video stream in the H.265 encoding format.
  • The horizontal prediction sub-module uses the pixels directly to the left to horizontally predict the corresponding pixel values of the current block.
  • the vertical prediction sub-module utilizes the pixels directly above to vertically predict the corresponding pixel value of the current block.
  • The partial DC prediction sub-module is suitable for large flat areas, and uses the reference pixels directly above and to the left to predict the pixel values of the current block. When both the group of pixels directly above and the group of pixels to the left exist, the pixel value of the current block is the average value of these two groups of pixels; when only the group of pixels directly above or only the group of pixels to the left exists, the pixel value of the current block is the average value of that group of pixels.
  • When adjacent pixels are unavailable, the adjacent available pixels or a default value are used for filling; after filling, the adjacent left and upper pixels become available.
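  • The three shared prediction modes are simple enough to model in a few lines. The sketch below assumes square `n`×`n` blocks, 8-bit samples, reference pixels passed as the row directly above (`top`) and the column directly to the left (`left`), and a default value of 128 when no neighbour is available; the function names are hypothetical and the edge smoothing that H.265 applies in some cases is deliberately omitted.

```python
def predict_horizontal(left, n):
    """Each row of the block copies the reference pixel to its left."""
    return [[left[y]] * n for y in range(n)]


def predict_vertical(top, n):
    """Each column of the block copies the reference pixel directly above."""
    return [list(top[:n]) for _ in range(n)]


def predict_dc(top, left, n, default=128):
    """Fill the block with the rounded mean of the available reference pixels.

    top / left may be None when that neighbour is unavailable; if both are
    missing, the default value (assumed mid-grey for 8-bit video) is used.
    """
    refs = []
    if top is not None:
        refs += list(top[:n])
    if left is not None:
        refs += list(left[:n])
    if refs:
        dc = (sum(refs) + len(refs) // 2) // len(refs)
    else:
        dc = default
    return [[dc] * n for _ in range(n)]
```

  • For example, with `top = [10, 20, 30, 40]` and `left = [50, 60, 70, 80]`, DC prediction fills the block with the rounded mean of all eight reference pixels, while passing `left=None` averages only the upper row.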
  • the intra-frame prediction and interpolation of other intra-frame prediction modes are implemented by different hardware structures respectively.
  • the first intra-mode preliminary selection module further includes a first direction prediction sub-module, a first plane prediction sub-module, and a DC prediction sub-module of the chrominance component.
  • the first directional prediction sub-module and the first plane prediction sub-module are respectively used to perform intra-frame prediction interpolation in the directional mode and the plane (Plane) mode on the video stream in the H.264 encoding format.
  • the first direction prediction sub-module includes interpolation filters in six direction modes: left diagonal, right diagonal, vertical to right, horizontal to down, vertical to left, and horizontal to up.
  • the first plane prediction sub-module uses the linear function plane to predict the pixel value of the current block based on the pixels directly above and to the left.
  • the DC prediction sub-module of the chrominance component is used to perform intra-frame prediction interpolation in the DC mode on the image block corresponding to the chrominance component in the video stream in the H.264 encoding format.
  • The second intra-frame mode preliminary selection module includes a second direction prediction sub-module and a second plane prediction sub-module, which are respectively used to perform intra-frame prediction interpolation in the directional mode and the planar mode on the video stream in the H.265 encoding format.
  • the second direction prediction sub-module may include part or all of the 31 special direction modes in the H.265 video coding standard.
  • the second plane prediction sub-module uses two linear filters in the horizontal and vertical directions, and takes the average of the two as the prediction value of the pixels in the current block.
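  • The averaging of a horizontal and a vertical linear interpolation can be written out as follows, using the form of the H.265 Planar mode. The sketch assumes a square block whose side `n` is a power of two, and takes the reference pixels as separate arguments; the argument names and the function name `predict_planar` are hypothetical.

```python
def predict_planar(top, left, top_right, bottom_left, n):
    """Planar prediction for an n x n block (n a power of two).

    top         -- n reference pixels directly above the block
    left        -- n reference pixels directly to the left of the block
    top_right   -- the reference pixel just right of the top row
    bottom_left -- the reference pixel just below the left column
    """
    shift = n.bit_length()  # equals log2(n) + 1 when n is a power of two
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Horizontal linear interpolation between left[y] and top_right.
            hor = (n - 1 - x) * left[y] + (x + 1) * top_right
            # Vertical linear interpolation between top[x] and bottom_left.
            ver = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            # Average of the two interpolations, with rounding.
            pred[y][x] = (hor + ver + n) >> shift
    return pred
```

  • Each of the two interpolations is weighted by `n`, so the final shift by `log2(n) + 1` both normalizes them and takes their mean in a single step.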
  • the common intra mode primary selection module further includes a coding cost calculation sub-module for calculating the second coding cost for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure.
  • For H.264 and H.265, the method of calculating the encoding cost is basically the same. When performing intra-frame prediction on the video stream in the H.264 encoding format, the encoding cost can be calculated separately for each of the 9 prediction modes. For the video stream in the H.265 encoding format, since the H.265 video coding standard has as many as 35 intra-frame prediction modes, only some of the intra-frame prediction modes are used for intra-frame prediction and coding cost calculation (for example, the coding cost is calculated in only three of the intra-frame prediction modes), and one or two optimal intra-frame prediction modes are finally selected.
  • the coding cost calculation sub-module uses the cost function to calculate the coding cost of various intra-frame prediction modes, and then determines the best intra-frame prediction mode according to the size of the coding cost.
  • the intra-mode primary selection module 140 may calculate the coding cost using the SAD/SATD cost model as described above. Optionally, the intra-mode primary selection module 140 may instead use a rate-distortion optimization (RDO) cost model to calculate the coding cost.
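  • Selecting a prediction mode by coding cost can be sketched as follows. The helper names (`sad`, `satd4x4`, `best_mode`) and the candidate dictionary are hypothetical, and the unnormalized 4x4 Hadamard transform is one common way to build a transform-domain cost; the text does not specify the exact cost function used by the hardware.

```python
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]  # symmetric 4x4 Hadamard matrix


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(a - b) for ra, rb in zip(block, pred) for a, b in zip(ra, rb))


def satd4x4(block, pred):
    """Sum of absolute Hadamard-transformed differences (no normalization)."""
    diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(block, pred)]
    transformed = matmul(matmul(H4, diff), H4)  # H4 equals its own transpose
    return sum(abs(v) for row in transformed for v in row)


def best_mode(block, candidates, cost=sad):
    """Return the name of the candidate prediction with the lowest cost."""
    return min(candidates, key=lambda mode: cost(block, candidates[mode]))


block = [[5] * 4 for _ in range(4)]
candidates = {"dc": [[3] * 4 for _ in range(4)],
              "vertical": [[5] * 4 for _ in range(4)]}
chosen = best_mode(block, candidates, cost=satd4x4)
```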
  • the sub-pixel search module 120 and the intra-mode primary selection module 140 are both electrically connected to the mode decision module 130. The mode decision module 130 determines the optimal prediction block according to at least the at least one intra-frame prediction mode obtained by the intra-mode primary selection module 140 and the at least one motion vector obtained by the sub-pixel search module 120, and outputs a coefficient block. The coefficient block is obtained by transforming the residual block.
  • the mode decision module 130 can also output mode information. The mode information and coefficient blocks are finally passed to the entropy coding module for entropy coding.
  • the mode decision module 130 is also capable of outputting reconstruction blocks. The reconstruction block is then subjected to in-loop filtering, such as deblocking filtering.
  • the mode decision module 130 also participates in the mode decision by using the prediction results of the two special inter-frame prediction modes, the Skip mode and the Merge mode.
  • the Merge mode directly uses the motion vector of the adjacent block in the temporal or spatial domain as the motion vector of the current block, omitting the step of motion estimation.
  • Skip mode can be regarded as a special Merge mode. The difference is that Skip mode assumes the residual obtained after transformation and quantization is 0, that is, the residual is not encoded, and the prediction block in this mode is the reconstruction block. It should be noted that, for the H.264 and H.265 encoding formats, the Skip mode and the Merge mode acquire the MVs of adjacent blocks in different ways.
  • the mode decision module 130 also multiplexes part of the hardware structure to perform mode decision for the video streams of the H.264 and H.265 encoding formats.
  • the mode decision module 130 includes a first mode decision module, a second mode decision module and a common mode decision module.
  • the common mode decision module is the hardware structure multiplexed by the H.264 and H.265 encoding formats.
  • the first mode decision module and the second mode decision module are hardware structures dedicated to the H.264 encoding format and the H.265 encoding format, respectively.
  • the first mode decision module and the common mode decision module jointly select the division mode and the optimal prediction mode of the coding unit for the video stream in the H.264 encoding format, and obtain the residual block in the H.264 encoding format according to the optimal prediction mode; the second mode decision module and the common mode decision module jointly select the division mode and the optimal prediction mode of the coding unit for the video stream in the H.265 encoding format, and obtain the residual block in the H.265 encoding format according to the optimal prediction mode.
  • the mode decision mainly includes transformation, quantization, inverse transformation, inverse quantization, bit estimation and distortion estimation. Transformation and quantization can further remove the redundancy of the image and save the coding rate.
  • The purpose of transformation is to convert the image signal from the spatial domain to the frequency domain, where the signal can be represented with far fewer bits than in the spatial domain; quantization further reduces the length of the encoded image data.
  • the first mode decision module includes a first transform sub-module, a first quantization sub-module, a first inverse transform sub-module and a first inverse quantization sub-module, which are respectively used to transform, quantize, inverse transform and inverse quantize the video stream in the H.264 encoding format;
  • the second mode decision module includes a second transform sub-module, a second quantization sub-module, a second inverse transform sub-module and a second inverse quantization sub-module, which are respectively used to transform, quantize, inverse transform and inverse quantize the video stream in the H.265 encoding format.
  • the first transform sub-module and the second transform sub-module essentially multiply the residual matrix by the transform matrix, and the first inverse transform sub-module and the second inverse transform sub-module both multiply the coefficient matrix by the transform matrix.
  • The difference is that the second transform sub-module and the second inverse transform sub-module perform a shift operation at the end when performing matrix multiplication on the video stream in the H.265 encoding format, whereas the first transform sub-module and the first inverse transform sub-module perform a shift operation in the middle of the calculation when performing matrix multiplication on the video stream in the H.264 encoding format.
  • In addition, compared with H.265, the video stream in the H.264 encoding format undergoes additional Hadamard transform/inverse transform processes in some prediction modes (such as the chroma component and the 16x16 intra-frame mode), so the transform and inverse transform of H.264 and H.265 use different hardware structures.
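  • The "residual matrix times transform matrix" view of the transform can be sketched as follows. The two 4x4 integer matrices are the commonly published core matrices of H.264 and H.265; `transform_4x4`, `round_shift` and the total shift of 9 bits are assumptions chosen for this illustration. The sketch applies a single rounding shift at the end, as the text describes for the H.265 path, and does not model the mid-calculation shift attributed to the H.264 path.

```python
H264_CORE_4X4 = [[1, 1, 1, 1],
                 [2, 1, -1, -2],
                 [1, -1, -1, 1],
                 [1, -2, 2, -1]]

H265_DCT_4X4 = [[64, 64, 64, 64],
                [83, 36, -36, -83],
                [64, -64, -64, 64],
                [36, -83, 83, -36]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def round_shift(value, shift):
    """Arithmetic right shift with rounding; shift == 0 leaves the value unchanged."""
    if shift == 0:
        return value
    return (value + (1 << (shift - 1))) >> shift


def transform_4x4(residual, matrix, final_shift=0):
    """Compute matrix * residual * matrix^T, then apply one shift at the end."""
    coeffs = matmul(matmul(matrix, residual), transpose(matrix))
    return [[round_shift(v, final_shift) for v in row] for row in coeffs]


ones = [[1] * 4 for _ in range(4)]
c264 = transform_4x4(ones, H264_CORE_4X4)
c265 = transform_4x4(ones, H265_DCT_4X4, final_shift=9)  # assumed total shift
```

  • For a constant residual, both transforms concentrate all energy in the top-left (DC) coefficient, which illustrates how the transform removes spatial redundancy.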
  • Quantization and inverse quantization essentially multiply the transformed matrix by a coefficient, and then round to the nearest integer.
  • The difference is that the first quantization sub-module multiplies different positions by different coefficients when quantizing the video stream in the H.264 encoding format, whereas the second quantization sub-module multiplies different positions by the same coefficient when quantizing the video stream in the H.265 encoding format, so the quantization and inverse quantization of H.264 and H.265 use different hardware structures.
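  • The contrast between position-dependent and position-independent scaling can be sketched as follows. The scale values and `qbits=15` are illustrative numbers of the kind used for the smallest quantization step in each standard; they, and the function names, are assumptions for this example rather than values taken from the text. Rounding to the nearest integer follows the description above; practical encoders often use a different rounding offset.

```python
def quantize(coeffs, scale_at, qbits):
    """Multiply each coefficient by a scale, then round to the nearest integer."""
    offset = 1 << (qbits - 1)
    out = []
    for y, row in enumerate(coeffs):
        out_row = []
        for x, c in enumerate(row):
            magnitude = (abs(c) * scale_at(y, x) + offset) >> qbits
            out_row.append(magnitude if c >= 0 else -magnitude)
        out.append(out_row)
    return out


def h264_style_scale(y, x):
    """Position-dependent scale: three classes of positions in a 4x4 block."""
    if y % 2 == 0 and x % 2 == 0:
        return 13107
    if y % 2 == 1 and x % 2 == 1:
        return 5243
    return 8066


def h265_style_scale(y, x):
    """Position-independent scale: every position uses the same value."""
    return 26214


coeffs = [[100] * 4 for _ in range(4)]
q264 = quantize(coeffs, h264_style_scale, qbits=15)
q265 = quantize(coeffs, h265_style_scale, qbits=15)
```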
  • Bit estimation is to estimate the number of bits required by the current prediction mode according to the syntax elements to be encoded (including prediction information and coefficients, etc.) specified in the H.264 or H.265 video coding standard. Since the syntax elements for bit estimation specified in the H.264 and H.265 video coding standards are different, the process of bit estimation for video streams in H.264 and H.265 coding formats is also different.
  • Different hardware structures can be used to implement bit estimation for video streams in the H.264 and H.265 encoding formats respectively; that is, the first mode decision module further includes a first bit estimation sub-module for performing bit estimation on the video stream in the H.264 encoding format, and the second mode decision module further includes a second bit estimation sub-module for performing bit estimation on the video stream in the H.265 encoding format.
  • The syntax elements for bit estimation specified in the H.264 and H.265 video coding standards are similar to a certain extent, so in another embodiment, in order to save hardware area, the same hardware structure can also be reused for bit estimation of video streams in the H.264 and H.265 encoding formats.
  • the common mode decision module further includes an H.264 bit estimation sub-module, configured to perform bit estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on a first hardware structure. That is to say, in this implementation, the H.264 bit estimation circuit is multiplexed to realize bit estimation for video streams in both the H.264 and H.265 encoding formats; no matter which format the video stream is in, the syntax elements for bit estimation specified in the H.264 video coding standard are used.
  • Alternatively, the common mode decision module includes an H.265 bit estimation sub-module, configured to perform bit estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on a second hardware structure, wherein the syntax elements used by the H.264 bit estimation sub-module and the H.265 bit estimation sub-module are different. That is to say, in this implementation, the H.265 bit estimation circuit is multiplexed to realize bit estimation for video streams in both the H.264 and H.265 encoding formats; no matter which format the video stream is in, the syntax elements for bit estimation specified in the H.265 video coding standard are used.
  • the common mode decision module includes a distortion estimation sub-module, configured to perform distortion estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure.
  • the distortion estimation module usually calculates the SSE, SAD, etc. of the reconstructed and original pixels as the encoded distortion.
  • SAD is the sum of absolute differences in the spatial (pixel) domain, that is, the pixel difference between the reconstructed block and the current block
  • Lambda is the conversion factor
  • MVBits is the number of bits obtained by the bit estimation sub-module.
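  • The three quantities defined above read like the legend of a cost expression. A common form that combines them is cost = SAD + Lambda × MVBits; this form is an assumption here, since the expression itself does not appear in the text. A minimal sketch, with the hypothetical names `rd_cost` and `pick_lowest_cost`:

```python
def rd_cost(sad, lam, mv_bits):
    """Combine distortion and rate into one cost (assumed form: SAD + Lambda * MVBits)."""
    return sad + lam * mv_bits


def pick_lowest_cost(candidates, lam):
    """candidates maps a mode name to (sad, mv_bits); return the cheapest mode."""
    return min(candidates, key=lambda m: rd_cost(candidates[m][0], lam, candidates[m][1]))


candidates = {"mode_a": (120, 10), "mode_b": (100, 30)}
low_lambda_choice = pick_lowest_cost(candidates, lam=0.5)
high_lambda_choice = pick_lowest_cost(candidates, lam=4)
```

  • A larger conversion factor penalizes bits more heavily, so the choice can shift from the lower-distortion candidate to the cheaper-to-signal one.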
  • the video encoding apparatus 100 further includes an in-loop filtering module 150, which is electrically connected to the mode decision module 130 and configured to perform in-loop filtering processing on the reconstruction block.
  • the in-loop filtering module 150 includes a first deblocking filter (DBF) sub-module and a second deblocking filter sub-module, which are respectively used to perform deblocking filtering on the video stream in the H.264 encoding format and the video stream in the H.265 encoding format.
  • the main function of deblocking filtering is to remove high-frequency components at block boundaries to reduce blockiness in decoded images.
  • Blocking refers to the phenomenon that when an image is compressed in blocks, discontinuous blocks that are easily noticeable to the human eye are generated at the boundaries of the blocks during decoding.
  • The first deblocking filtering sub-module and the second deblocking filtering sub-module use deblocking boundaries of different sizes. If the filtering conditions are satisfied, the first deblocking filtering sub-module deblocks the boundaries of 4×4 blocks in the H.264 encoded video stream, and the second deblocking filtering sub-module deblocks the boundaries of 8×8 blocks in the H.265 encoded video stream. In addition, the first deblocking filtering sub-module and the second deblocking filtering sub-module judge the filtering strength differently, and the filters used under different filtering strengths are also different.
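  • The effect of the different boundary sizes can be sketched by listing the candidate edge positions inside a block. The function name `deblock_edge_positions` is hypothetical, and the sketch only enumerates positions; the filtering conditions and strength decisions mentioned above are not modelled.

```python
def deblock_edge_positions(size, grid):
    """Return the internal edge offsets (in pixels) on a grid of the given spacing.

    Offset 0 (the outer boundary of the region) is left out of this illustration.
    """
    return list(range(grid, size, grid))


h264_edges = deblock_edge_positions(16, 4)  # 4x4 grid
h265_edges = deblock_edge_positions(16, 8)  # 8x8 grid
```

  • For a 16-pixel span, the 4×4 grid yields three candidate edges while the 8×8 grid yields one, which is part of why the two sub-modules differ.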
  • the H.265 video coding standard involves two kinds of loop filtering.
  • In addition to deblocking filtering, it also includes sample adaptive offset (SAO).
  • SAO analyzes the original data and the reconstructed data of the current frame and applies offset compensation to the image after deblocking filtering, so that the reconstructed image is as close to the original image as possible. Therefore, in one embodiment, the in-loop filtering module 150 further includes a SAO parameter estimation sub-module and a SAO filtering sub-module for performing SAO parameter estimation and SAO filtering on the video stream in the H.265 encoding format; the H.264 encoding format does not involve SAO, so the reconstructed image after deblocking filtering is output directly.
  • the image after the inverse quantization operation is processed by the deblocking filter sub-module, and then passed to the SAO parameter estimation sub-module as an input.
  • SAO includes 4 EO (Edge Offset) modes and 1 BO (Band Offset) mode. In EO mode, the compensation values need to be determined; in BO mode, both the bands to be compensated and the compensation values need to be determined.
  • the SAO parameter estimation sub-module is used for estimating the above compensation mode and parameters of SAO to obtain the optimal compensation mode and parameters.
  • SAO filtering is to perform the actual filtering operation according to the obtained optimal compensation mode and parameters.
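  • The two SAO mode families can be sketched as follows. The edge categories and the 32-band split follow the widely documented H.265 conventions; the function names and the example offsets are assumptions for illustration, and parameter estimation (choosing the best mode and offsets) is not shown.

```python
def eo_category(a, c, b):
    """Classify sample c against its two neighbours a and b along one direction."""
    if c < a and c < b:
        return 1  # local minimum
    if (c < a and c == b) or (c == a and c < b):
        return 2  # concave corner
    if (c > a and c == b) or (c == a and c > b):
        return 3  # convex corner
    if c > a and c > b:
        return 4  # local maximum
    return 0      # no offset applied


def band_index(sample, bit_depth=8):
    """Map a sample to one of 32 equal-width intensity bands."""
    return sample >> (bit_depth - 5)


def apply_band_offset(samples, start_band, offsets, bit_depth=8):
    """Add offsets[k] to samples falling in band start_band + k, with clipping."""
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        k = band_index(s, bit_depth) - start_band
        if 0 <= k < len(offsets):
            s = min(max(s + offsets[k], 0), max_val)
        out.append(s)
    return out


compensated = apply_band_offset([100, 200], start_band=12, offsets=[2, 0, 0, 0])
```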
  • the reconstructed image output by the SAO filtering sub-module will be buffered in the encoder as a subsequent reference frame.
  • the entropy encoding module 160 performs context-based arithmetic encoding on the syntax elements: it first converts the syntax elements into binary strings and then arithmetic-encodes the strings into a code stream. The most frequent information is represented by short codes and less frequent information by long codes, so as to minimize the average code length.
  • the decoder can restore the original information without distortion according to the entropy-encoded code stream.
  • the entropy coding mode adopted by the entropy coding module 160 is CABAC (Context-based Adaptive Binary Arithmetic Coding).
  • CABAC is an adaptive arithmetic coding method based on context models. It uses the correlation between symbols and the statistical characteristics of the video stream to continuously and automatically adjust the probability of occurrence of each symbol, so that the amount of information output per codeword approaches the symbol entropy rate, thereby obtaining higher coding efficiency.
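  • The benefit of adapting symbol probabilities can be sketched with an idealized cost model. The exponential update rule, the adaptation rate and the clamping bounds below are simplifications assumed for this example; CABAC itself uses table-driven probability states, which are not reproduced here.

```python
import math


def adaptive_cost(bins, rate=1 / 16, p_one=0.5):
    """Return the ideal code length in bits when the probability of '1' adapts per bin."""
    total_bits = 0.0
    for b in bins:
        p = p_one if b == 1 else 1.0 - p_one
        total_bits += -math.log2(p)          # ideal arithmetic-coding cost of this bin
        p_one += rate * (b - p_one)          # move the estimate toward the observed bin
        p_one = min(max(p_one, 0.01), 0.99)  # keep the estimate away from 0 and 1
    return total_bits


skewed_bits = adaptive_cost([1] * 100)
fixed_bits = 100 * -math.log2(0.5)  # cost with a non-adaptive 50/50 model
```

  • On a heavily skewed bin sequence the adaptive estimate spends far fewer bits than a fixed 50/50 model, which is the effect the paragraph above describes.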
  • the entropy encoding module 160 also multiplexes part of the hardware structure to perform entropy encoding on the video streams in H.264 and H.265 encoding formats.
  • Entropy coding mainly includes two steps. The first is binarization, which converts the syntax elements to be encoded into binary strings; the syntax elements to be encoded include the division method of the current block, prediction information, residual information, filtering information, etc. The second is arithmetic coding, which encodes a binary string into a code stream.
  • The syntax elements to be encoded in the binarization process specified in the H.264 video coding standard and the H.265 video coding standard are quite different, so binarization is implemented by different hardware structures; the arithmetic coding process is the same, so the same set of hardware structure is reused.
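  • Binarization, the step that turns a syntax element value into a binary string, can be sketched with two generic schemes: a unary code and the plain order-0 Exp-Golomb code. These are textbook codes used here for illustration; the exact binarization applied to each syntax element is defined by the respective standard and is not reproduced in this sketch.

```python
def unary(value):
    """Unary code: 'value' ones followed by a terminating zero."""
    return "1" * value + "0"


def exp_golomb0(value):
    """Order-0 Exp-Golomb code: leading zeros, then value + 1 in binary."""
    code = bin(value + 1)[2:]
    return "0" * (len(code) - 1) + code


examples = {v: (unary(v), exp_golomb0(v)) for v in range(5)}
```

  • Small values get short strings in both schemes, while Exp-Golomb grows only logarithmically for large values.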
  • the entropy encoding module 160 includes a first entropy encoding module, a second entropy encoding module, and a common entropy encoding module. The common entropy encoding module is the hardware structure multiplexed by the video streams in the H.264 and H.265 encoding formats. The first entropy encoding module and the common entropy encoding module are used to perform entropy encoding of residual blocks on the video stream in the H.264 encoding format, and the second entropy encoding module and the common entropy encoding module are used to perform entropy encoding of residual blocks on the video stream in the H.265 encoding format.
  • the first entropy coding module is used to obtain the syntax elements of the H.264 encoding format according to the residual block of the H.264 encoding format, the second entropy coding module is used to obtain the syntax elements of the H.265 encoding format according to the residual block of the H.265 encoding format, and the common entropy encoding module is used to provide an arithmetic encoding kernel to entropy encode the syntax elements of the H.264 encoding format or the syntax elements of the H.265 encoding format.
  • the video encoding apparatus 100 further includes a reference frame management module 170 electrically connected to the integer pixel search module 110, the sub-pixel search module 120 and the mode decision module 130, for acquiring reference frames and sending the reference frames to the integer pixel search module 110, the sub-pixel search module 120 and the mode decision module 130.
  • This part is the same for the H.264 video coding standard and the H.265 video coding standard, so it can be realized by multiplexing the same hardware structure.
  • the video encoding apparatus 100 in this embodiment of the present application is implemented in a pipelined manner.
  • the video encoding apparatus 100 includes a total of 5 pipeline stages: the integer pixel search module 110 is located in the first stage, the sub-pixel search module 120 and the intra-mode preliminary selection module 140 are located in the second stage, the mode decision module 130 is located in the third stage, the SAO parameter estimation sub-module and the deblocking filtering sub-module are located in the fourth stage, and the entropy coding module 160 and the SAO filtering module are located in the fifth stage.
  • The integer pixel search module 110 is electrically connected to the sub-pixel search module 120, the sub-pixel search module 120 is electrically connected to the mode decision module 130, the mode decision module 130 is electrically connected to the SAO parameter estimation sub-module and the deblocking filtering sub-module respectively, and the SAO parameter estimation sub-module and the deblocking filtering sub-module are electrically connected to the entropy encoding module 160 and the SAO filtering module respectively. Electrical connection here means that the above-mentioned modules are correspondingly electrically connected.
  • the integer pixel search module 110 can output a matching block matching the current block in the current frame to the sub-pixel search module 120.
  • the integer pixel search module 110 can be set at the first pipeline stage, and the sub-pixel search module 120 can be set at the second pipeline stage.
  • Since the sub-pixel search module 120 and the mode decision module 130 are electrically connected to each other, the sub-pixel search module 120 can output at least one sub-pixel matching block to the mode decision module 130.
  • the sub-pixel search module 120 can be set at the second pipeline stage, and the mode decision module 130 can be set at the third pipeline stage.
  • the SAO parameter estimation sub-module and the deblocking filtering sub-module may be set at the fourth pipeline stage, and the entropy encoding module 160 and the SAO filtering module may be set at the fifth pipeline stage.
  • the N+2th block when the N+2th block is performing integer pixel search, the N+1th block is performing sub-pixel search and intra-mode primary selection, and the Nth block is performing mode decision.
  • the mode decision module 130 since the calculation amount of the mode decision module 130 is relatively large, it can also be implemented in two-stage pipeline.
  • the division of the pipeline stages as shown in FIG. 2 is only an example, and the division of the actual pipeline stages can also be done in different ways.
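  • The overlap between blocks in a five-stage pipeline can be sketched as a simple schedule table. The stage labels follow the example division described above; `pipeline_schedule` is a hypothetical helper, and one block per stage per time slot is an idealization that ignores stalls and unequal stage latencies.

```python
STAGES = [
    "integer pixel search",
    "sub-pixel search + intra-mode primary selection",
    "mode decision",
    "deblocking + SAO parameter estimation",
    "entropy coding + SAO filtering",
]


def pipeline_schedule(num_blocks, num_stages=len(STAGES)):
    """Return, for each time slot, which block index occupies each stage (or None)."""
    schedule = []
    for slot in range(num_blocks + num_stages - 1):
        row = []
        for stage in range(num_stages):
            block = slot - stage  # a block enters stage s exactly s slots after stage 0
            row.append(block if 0 <= block < num_blocks else None)
        schedule.append(row)
    return schedule


schedule = pipeline_schedule(3)
```

  • In time slot 2 of this schedule, block 2 is in integer pixel search, block 1 is in sub-pixel search and intra-mode primary selection, and block 0 is in mode decision, matching the N+2 / N+1 / N relationship described above.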
  • the video encoding apparatus of the embodiment of the present application multiplexes part of the hardware structure to encode video streams in the H.264 encoding format and the H.265 encoding format, which saves hardware area.
  • FIG. 3 shows a flowchart of a video encoding method 300 according to an embodiment of the present invention.
  • the video encoding method 300 may be implemented by the video encoding apparatus 100 described above. Only the main steps of the video encoding method 300 will be described below, and for further details, reference may be made to the above.
  • the video coding method 300 includes the following steps:
  • Step S310 the integer pixel search module determines a matching block that matches the current block in the current frame within a plurality of predetermined ranges in the plurality of reference frames;
  • Step S320, a sub-pixel search module electrically connected to the integer pixel search module determines at least one sub-pixel matching block for the matching block, wherein the sub-pixel search module includes a half-pixel interpolation module, and determining the at least one sub-pixel matching block for the matching block includes: the half-pixel interpolation module performing half-pixel interpolation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format using a first interpolation filter;
  • Step S330 the mode decision module electrically connected to the sub-pixel search module performs mode decision at least by using the coding cost of the sub-pixel matching block, so as to obtain the optimal prediction block of the current block for video coding.
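  • The integer pixel search of step S310 can be sketched as an exhaustive block-matching search within one search window. The function name, the SAD matching criterion and the single reference frame are assumptions made for this example; the hardware search strategy is not specified in this passage.

```python
def integer_pixel_search(cur, ref, cy, cx, search_range):
    """Find the integer displacement (dy, dx) minimizing SAD around (cy, cx).

    cur is an n x n block; ref is a 2-D reference frame.
    Returns (best_sad, dy, dx); the first minimum found wins ties.
    """
    n = len(cur)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = cy + dy, cx + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + n > len(ref) or x + n > len(ref[0]):
                continue
            cost = sum(abs(cur[i][j] - ref[y + i][x + j])
                       for i in range(n) for j in range(n))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [row[4:6] for row in ref[3:5]]  # the 2x2 block located at (3, 4)
result = integer_pixel_search(cur, ref, cy=2, cx=2, search_range=2)
```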
  • the first interpolation filter used in step S320 is an 8-tap interpolation filter.
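  • An 8-tap half-pixel interpolation along one row can be sketched as follows. The tap values are the well-known H.265 luma half-sample taps; whether the first interpolation filter uses exactly these values is an assumption, as are the border clamping and the function name.

```python
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)  # taps sum to 64


def half_pixel_row(row, bit_depth=8):
    """Interpolate the half-pixel sample between each pair of neighbouring pixels."""
    max_val = (1 << bit_depth) - 1
    n = len(row)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            idx = min(max(i - 3 + k, 0), n - 1)  # repeat edge pixels at the borders
            acc += tap * row[idx]
        # Normalize by 64 with rounding, then clip to the valid sample range.
        out.append(min(max((acc + 32) >> 6, 0), max_val))
    return out


flat_half = half_pixel_row([100] * 10)
ramp_half = half_pixel_row(list(range(0, 100, 10)))
```

  • On a linear ramp the interior half-pixel samples land midway between their neighbours, as expected from an interpolation filter.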
  • the sub-pixel search module further includes a first quarter-pixel interpolation module and a second quarter-pixel interpolation module, and the method further includes: performing, by the first quarter-pixel interpolation module, quarter-pixel interpolation on the video stream in the H.264 encoding format based on a second interpolation filter, or performing, by the second quarter-pixel interpolation module, quarter-pixel interpolation on the video stream in the H.265 encoding format based on a third interpolation filter.
  • the sub-pixel search module further includes an encoding cost calculation sub-module, and the method further includes: calculating, by the encoding cost calculation sub-module based on the same hardware structure, a first encoding cost between the sub-pixel matching block and the current block for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  • the method further includes: determining, by an intra-frame mode primary selection module connected to the mode decision module, based on pixel values corresponding to at least one adjacent reference block in the current frame, at least one prediction block of the current block and a second encoding cost corresponding to the at least one prediction block, and determining at least one intra-frame prediction mode according to the second encoding cost; determining, by the mode decision module, an optimal prediction block according to the at least one intra-frame prediction mode and the at least one motion vector, and outputting mode information, a coefficient block and a reconstruction block; performing, by an in-loop filtering module electrically connected to the mode decision module, in-loop filtering processing on the reconstruction block; and performing, by an entropy coding module electrically connected to the in-loop filtering module, entropy coding on the mode information and the coefficient block.
  • the intra-frame mode preliminary selection module includes a first intra-frame mode preliminary selection module, a second intra-frame mode preliminary selection module, and a common intra-frame mode preliminary selection module, and the method further includes: the first intra-frame mode preliminary selection module and the common intra-frame mode preliminary selection module select the intra prediction mode for the video stream in the H.264 encoding format, or the second intra-frame mode preliminary selection module and the common intra-frame mode preliminary selection module select the intra prediction mode for the video stream in the H.265 encoding format.
  • the common intra mode primary selection module includes a horizontal prediction sub-module, a vertical prediction sub-module and a DC prediction sub-module, and the method further includes: the horizontal prediction sub-module, the vertical prediction sub-module and the DC prediction sub-module perform intra-frame prediction interpolation in horizontal mode, vertical mode and DC mode on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure.
  • the common intra mode primary selection module further includes an encoding cost calculation sub-module, and the method further includes: the encoding cost calculation sub-module calculates the second encoding cost for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure.
  • the mode decision module includes a first mode decision module, a second mode decision module, and a common mode decision module, and the mode decision includes: the first mode decision module and the common mode decision module select the division mode and the optimal prediction mode of the coding unit for the video stream in the H.264 encoding format, and obtain the residual block in the H.264 encoding format according to the optimal prediction mode, or the second mode decision module and the common mode decision module select the division mode and the optimal prediction mode of the coding unit for the video stream in the H.265 encoding format, and obtain the residual block in the H.265 encoding format according to the optimal prediction mode.
  • the first mode decision module further includes a first bit estimation submodule, and the mode decision further includes the first bit estimation submodule to perform bit estimation on the video stream in the H.264 encoding format;
  • the second mode decision module further includes a second bit estimation submodule, and the mode decision further includes the second bit estimation submodule to perform bit estimation on the video stream in the H.265 encoding format.
  • the common mode decision module further includes an H.264 bit estimation sub-module, and the mode decision further includes the H.264 bit estimation sub-module performing bit estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the first hardware structure; or, the common mode decision module includes an H.265 bit estimation sub-module, and the mode decision further includes the H.265 bit estimation sub-module performing bit estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the second hardware structure, wherein the H.264 bit estimation sub-module and the H.265 bit estimation sub-module use different syntax elements.
  • the common mode decision module further includes a distortion estimation sub-module, and the mode decision further includes the distortion estimation sub-module performing distortion estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure.
  • the in-loop filtering includes SAO parameter estimation and SAO filtering of the video stream in H.265 encoding format.
  • the in-loop filtering further includes performing deblocking filtering on the video stream in the H.264 encoding format and the video stream in the H.265 encoding format based on the same hardware structure.
  • the entropy encoding module includes a first entropy encoding module, a second entropy encoding module, and a common entropy encoding module, and the entropy encoding includes: the first entropy encoding module and the common entropy encoding module perform entropy encoding of the residual block on the video stream in the H.264 encoding format, or the second entropy encoding module and the common entropy encoding module perform entropy encoding of the residual block on the video stream in the H.265 encoding format.
  • the entropy encoding includes: the first entropy encoding module obtains syntax elements in the H.264 encoding format according to the residual block in the H.264 encoding format, the second entropy encoding module obtains syntax elements in the H.265 encoding format according to the residual block in the H.265 encoding format, and the common entropy encoding module provides an arithmetic encoding kernel to entropy encode the syntax elements of the H.264 encoding format or the syntax elements of the H.265 encoding format.
  • an embodiment of the present invention further provides a computer storage medium, on which a computer program is stored.
  • When the computer program is executed by a processor, the aforementioned video encoding apparatus 100 shown in FIG. 1 can be controlled to implement the steps of the aforementioned video encoding method 300 shown in FIG. 3.
  • the computer storage medium is a computer-readable storage medium.
  • Computer storage media may include, for example, memory cards for smartphones, storage components for tablet computers, hard drives for personal computers, read only memory (ROM), erasable programmable read only memory (EPROM), portable compact disk read only memory (CD-ROM), USB memory, or any combination of the above storage media.
  • a computer-readable storage medium can be any combination of one or more computer-readable storage media.
  • FIG. 4 is a schematic structural diagram of a movable platform 400 according to an embodiment of the present invention.
  • the movable platform 400 of this embodiment includes an imaging device 410 and a video encoding device 420 .
  • the imaging device 410 is used to collect video data
  • the video encoding device 420 is used to perform video encoding on the video data collected by the imaging device 410.
  • the video encoding apparatus 420 may adopt the structure of the embodiment shown in FIG. 1 , and correspondingly, the specific details thereof can be referred to the above, which will not be repeated here.
  • the movable platform includes at least one of an unmanned aerial vehicle, a car, a remote control car, a robot, a camera, and a gimbal.
  • the video encoding device 420 and the imaging device 410 are mounted on the movable platform body of the movable platform.
  • When the movable platform is an unmanned aerial vehicle, the movable platform body is the fuselage of the unmanned aerial vehicle.
  • When the movable platform is a car, the movable platform body is the body of the car.
  • The car may be an autonomous driving vehicle or a semi-autonomous driving vehicle, which is not limited herein.
  • When the movable platform is a remote control car, the movable platform body is the body of the remote control car.
  • When the movable platform is a robot, the movable platform body is the robot.
  • When the movable platform is a camera, the movable platform body is the camera itself.
  • When the movable platform is a gimbal, the movable platform body is the gimbal body.
  • the gimbal can be a handheld gimbal, or a gimbal mounted on a car or an aircraft.
  • the video encoding method, video encoding device, computer storage medium and mobile platform of the embodiments of the present invention multiplex part of the hardware structure to encode video streams in the H.264 encoding format and the H.265 encoding format, saving hardware area.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave, etc.).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that includes an integration of one or more available media.
  • the usable media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., digital video disc (DVD)), or semiconductor media (e.g., solid state disk (SSD)), etc.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another device, or some features may be omitted or not implemented.
  • Various component embodiments of the present invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some modules according to the embodiments of the present invention.
  • the present invention may also be implemented as apparatus programs (eg, computer programs and computer program products) for performing part or all of the methods described herein.
  • Such a program implementing the present invention may be stored on a computer-readable medium, or may be in the form of one or more signals. Such signals may be downloaded from Internet sites, or provided on carrier signals, or in any other form.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video coding apparatus and method, and a computer storage medium and a mobile platform. The video coding apparatus comprises: an integer pixel search module, which is used for determining a matching block that matches the current block in the current frame within multiple predetermined ranges in multiple reference frames; a sub-pixel search module, which is electrically connected to the integer pixel search module and used for determining at least one sub-pixel matching block with regard to the matching block; and a mode decision making module, which is electrically connected to the sub-pixel search module and used for executing mode decision making by at least using a coding cost of the sub-pixel matching blocks, so as to obtain an optimal prediction block of the current block for video coding, wherein the sub-pixel search module comprises a half-pixel interpolation module, which can use a first interpolation filter to perform half-pixel interpolation on video streams of an H.264 coding format and video streams of an H.265 coding format. In the video coding solution, part of the hardware structure is reused to code video streams of both the H.264 coding format and the H.265 coding format, thereby saving hardware area.

Description

Video encoding apparatus, method, computer storage medium, and movable platform
Specification
Technical Field
The present invention relates to the technical field of video coding, and in particular to a video encoding apparatus, a video encoding method, a computer storage medium, and a movable platform.
Background
The H.264 video coding standard is a high-compression digital video codec standard proposed by the Joint Video Team (JVT), which was formed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Compared with earlier video coding standards, the H.264 video coding standard can provide better image quality at the same bandwidth. The H.265 video coding standard is a newer video coding standard formulated by the ITU-T Video Coding Experts Group after the H.264 video coding standard. The H.265 video coding standard retains some of the techniques of the H.264 video coding standard and improves on them.
A chip usually contains multiple independent encoders, each implementing video encoding under a different video coding standard. To encode video streams in both the H.264 and H.265 formats, two different encoders must be provided. However, multiple encoders consume a large hardware area.
Summary of the Invention
This Summary introduces a series of concepts in simplified form, which are described in further detail in the Detailed Description. The Summary is not intended to define the key features or essential technical features of the claimed technical solution, nor is it intended to determine the protection scope of the claimed technical solution.
In view of the deficiencies of the prior art, a first aspect of the embodiments of the present invention provides a video encoding apparatus, the video encoding apparatus comprising:
an integer-pixel search module, configured to determine, within multiple predetermined ranges in multiple reference frames, a matching block that matches the current block in the current frame;
a sub-pixel search module, electrically connected to the integer-pixel search module and configured to determine at least one sub-pixel matching block for the matching block;
a mode decision module, electrically connected to the sub-pixel search module and configured to perform mode decision using at least the coding cost of the sub-pixel matching block, so as to obtain an optimal prediction block of the current block for video encoding;
wherein the sub-pixel search module includes a half-pixel interpolation module, and the half-pixel interpolation module is capable of using a first interpolation filter to perform half-pixel interpolation on a video stream in the H.264 encoding format and on a video stream in the H.265 encoding format.
A second aspect of the embodiments of the present invention provides a video encoding method, the video encoding method comprising:
determining, by an integer-pixel search module, within multiple predetermined ranges in multiple reference frames, a matching block that matches the current block in the current frame;
determining, by a sub-pixel search module electrically connected to the integer-pixel search module, at least one sub-pixel matching block for the matching block, wherein the sub-pixel search module includes a half-pixel interpolation module, and determining the at least one sub-pixel matching block for the matching block includes: performing, by the half-pixel interpolation module using a first interpolation filter, half-pixel interpolation on a video stream in the H.264 encoding format or a video stream in the H.265 encoding format;
performing, by a mode decision module electrically connected to the sub-pixel search module, mode decision using at least the coding cost of the sub-pixel matching block, so as to obtain an optimal prediction block of the current block for video encoding.
A third aspect of the embodiments of the present invention provides a computer storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the above video encoding method.
A fourth aspect of the embodiments of the present invention provides a movable platform, the movable platform including an imaging device and the video encoding apparatus described above, wherein the imaging device is configured to capture video data, and the video encoding apparatus is configured to perform video encoding on the video data captured by the imaging device.
The video encoding apparatus, method, computer storage medium, and movable platform of the embodiments of the present invention reuse part of the hardware structure to encode video streams in the H.264 encoding format and the H.265 encoding format, thereby saving hardware area.
Brief Description of the Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
In the drawings:
FIG. 1 shows a structural block diagram of a video encoding apparatus according to an embodiment of the present invention;
FIG. 2 shows a schematic diagram of the pipeline stages of a video encoding apparatus according to an embodiment of the present invention;
FIG. 3 shows a flowchart of a video encoding method according to an embodiment of the present invention;
FIG. 4 shows a structural block diagram of a movable platform according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention more apparent, example embodiments according to the present invention are described in detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein. All other embodiments obtained by those skilled in the art, based on the embodiments described herein and without creative effort, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are given to provide a more thorough understanding of the present invention. It will be apparent to those skilled in the art, however, that the present invention may be practiced without one or more of these details. In other instances, some technical features well known in the art are not described, to avoid obscuring the present invention.
It should be understood that the present invention can be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the terms "comprising" and/or "including," when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups. As used herein, the term "and/or" includes any and all combinations of the associated listed items.
For a thorough understanding of the present invention, detailed steps and detailed structures are set forth in the following description to explain the technical solutions proposed by the present invention. Preferred embodiments of the present invention are described in detail below; beyond these detailed descriptions, however, the present invention may also have other embodiments.
Both the H.264 and H.265 video coding standards adopt a hybrid coding framework, and both include basic processes such as prediction, transform, quantization, inverse quantization, inverse transform, entropy coding, and in-loop filtering. Specifically, a video frame input to the video encoding apparatus is first divided into sub-blocks, which are macroblocks in the H.264 video coding standard and coding tree units in the H.265 video coding standard. Each sub-block can then be further divided into smaller sub-blocks. Each resulting sub-block is first predicted. Prediction is divided into intra prediction and inter prediction: intra prediction predicts the current block from already-encoded image blocks in the same frame, and inter prediction predicts the current block from already-encoded image blocks in one or more previous frames.
The prediction process yields a prediction block of the current block, and subtracting the prediction block from the current block gives the residual block. The video encoding apparatus then transforms the residual block, converting it from the spatial domain to the frequency domain, and quantizes the frequency-domain coefficients to reduce their magnitude.
The quantized coefficients are, on the one hand, sent together with the coding mode information to the entropy encoder to obtain a binary bitstream. On the other hand, they undergo inverse quantization and inverse transform to recover the prediction residual block (that is, the reconstructed residual block), and adding the reconstructed residual block to the prediction block gives the reconstructed block. Finally, in-loop filtering is applied to the reconstructed image to obtain the final reconstructed image, which is provided to subsequently encoded images for inter prediction.
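The residual, transform, quantization, and reconstruction steps described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the 4×4 Hadamard matrix stands in for the integer transforms that real H.264 and H.265 encoders use, the uniform quantizer is a simplification, and all names (`encode_block`, `qstep`, and so on) are hypothetical. Entropy coding and in-loop filtering are omitted.

```python
# Stand-in 4x4 transform (assumption: real codecs use DCT-like integer transforms).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def encode_block(current, prediction, qstep):
    """Return (quantized levels, reconstructed block) for one 4x4 block."""
    # Residual block = current block minus prediction block.
    residual = [[c - p for c, p in zip(cr, pr)]
                for cr, pr in zip(current, prediction)]
    # Forward transform: spatial domain -> frequency domain.
    coeffs = matmul(matmul(H4, residual), transpose(H4))
    # Quantization reduces the coefficient magnitudes (this is the lossy step).
    levels = [[int(round(c / qstep)) for c in row] for row in coeffs]
    # Inverse quantization and inverse transform give the reconstructed residual.
    dequant = [[lv * qstep for lv in row] for row in levels]
    tmp = matmul(matmul(transpose(H4), dequant), H4)
    # H4 * H4^T = 4 * I, so the 2-D inverse needs a division by 16.
    recon_residual = [[int(round(v / 16)) for v in row] for row in tmp]
    # Reconstructed block = prediction + reconstructed residual, clipped to 8 bits.
    recon = [[min(max(p + r, 0), 255) for p, r in zip(pr, rr)]
             for pr, rr in zip(prediction, recon_residual)]
    return levels, recon
```

With `qstep` equal to 1 the sketch is lossless, and larger steps trade reconstruction accuracy for smaller levels. The encoder keeps the reconstructed block, not the original, so that its reference data matches what a decoder would have.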
Generally, a chip contains multiple encoding apparatuses, each performing video encoding based on its corresponding video coding standard. If the encoding apparatuses are independent of one another, a large hardware area is consumed. Because the H.264 and H.265 video coding standards both adopt similar hybrid coding frameworks and have many similar or identical modules, the video encoding apparatus of the embodiments of the present invention integrates these identical parts and reuses part of the same hardware for video encoding in both the H.264 and H.265 encoding formats, thereby saving hardware area.
The video encoding apparatus, method, computer storage medium, and movable platform of the embodiments of the present invention are described in detail below with reference to the accompanying drawings. Where no conflict arises, the features of the embodiments and implementations described below may be combined with one another.
FIG. 1 shows a structural block diagram of a video encoding apparatus 100 according to an embodiment of the present invention. As shown in FIG. 1, the video encoding apparatus 100 includes at least an integer-pixel search module 110, a sub-pixel search module 120, and a mode decision module 130. The integer-pixel search module 110 is configured to determine, within multiple predetermined ranges in multiple reference frames, a matching block that matches the current block in the current frame. The sub-pixel search module 120 is electrically connected to the integer-pixel search module 110 and is configured to determine at least one sub-pixel matching block for the matching block. The mode decision module 130 is electrically connected to the sub-pixel search module 120 and is configured to perform mode decision using at least the coding cost of the sub-pixel matching block, so as to obtain an optimal prediction block of the current block for video encoding. The sub-pixel search module 120 includes a half-pixel interpolation module, which is capable of using a first interpolation filter to perform half-pixel interpolation on a video stream in the H.264 encoding format and on a video stream in the H.265 encoding format.
In one embodiment, after the mode decision module 130 determines the optimal prediction block, the mode decision module 130 subtracts the prediction block from the current block to obtain a residual block. Next, the mode decision module 130 transforms the residual block to obtain a coefficient block and quantizes the coefficient block to obtain a quantized coefficient block. Finally, the mode decision module 130 transmits the quantized coefficient block and mode information to the entropy encoding module for entropy encoding. The mode information includes at least information related to block partitioning and the prediction mode.
The video encoding apparatus 100 according to the embodiment of the present invention reuses part of the same hardware structure to encode video streams in both the H.264 and H.265 encoding formats, thereby saving hardware area. The reused hardware structure includes at least the half-pixel interpolation module; that is, the half-pixel interpolation module in the video encoding apparatus 100 can be used to perform half-pixel interpolation both on a video stream in the H.264 encoding format and on a video stream in the H.265 encoding format.
The integer-pixel search module 110 and the sub-pixel search module 120 are used to perform inter prediction on a video stream in the H.264 encoding format or the H.265 encoding format, finding a matching block for the current block in a reference frame and thereby removing temporal redundancy on the basis of already-encoded video frames. The integer-pixel search module 110 is further configured to determine a first motion vector of the current block relative to the matching block, and the sub-pixel search module 120 is further configured to determine a second motion vector of the current block relative to the at least one sub-pixel matching block. The precision of the second motion vector is higher than that of the first motion vector: the first motion vector has integer-pixel precision, and the second motion vector has sub-pixel precision. For the H.264 format, the current block and the matching block are macroblocks or sub-macroblocks; for H.265, the current block is a coding unit and the matching block is a prediction unit.
Specifically, the H.264 encoding format supports partitioning into macroblocks and sub-macroblocks of 7 different sizes and shapes. It provides four macroblock partition modes for the luma component, 16×16, 16×8, 8×16, and 8×8, and each 8×8 block can be further divided into 8×4, 4×8, and 4×4 sub-macroblocks; each partition has its own motion vector. In the H.265 encoding format, the analogous partition structure is the coding tree unit (CTU), whose size can be at most 64×64 and at least 16×16. A CTU contains one luma coding tree block (CTB) and two chroma CTBs at the same position, together with some associated syntax elements. A CTB can be used directly as a single coding block (CB) or be further divided, in quadtree form, into multiple smaller CBs. One luma CB, two chroma CBs, and some associated syntax elements together form a coding unit (CU). Each CU can be split into one or more corresponding prediction units (PUs), each PU can obtain its own motion vector, and the motion vector of each PU can be used to obtain prediction information from the reconstructed reference frame. The current block in the embodiments of the present application refers to the smallest prediction unit obtained by partitioning according to the corresponding video coding standard, and current blocks at different positions in the current frame may differ in size.
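The two partition schemes described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `split_ctu`, `should_split`, and `min_size` are hypothetical names, the split decision is supplied by the caller (a real encoder would decide by coding cost), and the default minimum coding block size of 8 is a choice made for this sketch.

```python
# Luma partition sizes (width, height) available in H.264, as listed above.
H264_MACROBLOCK_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
H264_SUB_MACROBLOCK_PARTITIONS = [(8, 4), (4, 8), (4, 4)]


def split_ctu(x, y, size, should_split, min_size=8):
    """Recursively split a square H.265 CTB into coding blocks (quadtree).

    should_split(x, y, size) is a caller-supplied decision function.
    Returns a list of (x, y, size) leaves that tile the original block.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    split_ctu(x + dx, y + dy, half, should_split, min_size))
        return leaves
    return [(x, y, size)]


if __name__ == "__main__":
    # Example: keep splitting only the block whose corner is at the origin.
    for leaf in split_ctu(0, 0, 64, lambda x, y, s: x == 0 and y == 0):
        print(leaf)
```

Because the leaves differ in size depending on where the splits occur, the "current block" handled by the search modules may have a different size at different positions in the frame, as the paragraph above notes.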
For video streams in the H.264 and H.265 encoding formats, the integer-pixel search is essentially the same. The integer-pixel search module 110 therefore performs the integer-pixel search on both H.264 and H.265 video streams using exactly the same hardware structure, which may specifically include a candidate motion vector acquisition sub-module, a search area determination sub-module, and an integer-pixel search sub-module. Whether the video stream is in the H.264 or the H.265 encoding format, the integer-pixel search is performed by these three sub-modules. Specifically, the candidate motion vector acquisition sub-module acquires candidate motion vectors, which may be one or more of the motion vectors of spatially neighboring blocks of the current block, the motion vectors of temporally neighboring blocks, a global motion vector, and the zero motion vector. The search area determination sub-module determines the search area of the integer-pixel search according to the candidate motion vectors. The integer-pixel search sub-module takes the position pointed to by a candidate motion vector as the starting search point and performs an integer-pixel search on all or some of the points in the search area; during the search it calculates the coding cost at each point and selects the point with the lowest coding cost as the optimal search result.
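The three steps above (candidate motion vectors, search area, exhaustive cost comparison) can be sketched in Python. This is a simplified illustration, not the hardware design of the embodiment: the cost is a plain sum of absolute differences (SAD) with no motion vector rate term, the search area is a square window around each candidate, and the names `integer_search`, `candidates`, and `search_range` are hypothetical.

```python
def sad(cur, ref, rx, ry):
    """Sum of absolute differences between cur and the ref block at (rx, ry)."""
    n = len(cur)
    return sum(abs(cur[j][i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def integer_search(cur, ref, bx, by, candidates, search_range):
    """Full search around each candidate motion vector.

    cur          : n x n current block (list of rows)
    ref          : reference frame (list of rows)
    (bx, by)     : position of the current block in the current frame
    candidates   : list of (mvx, mvy) start points, e.g. the zero vector
    search_range : half-width of the square window around each candidate
    Returns (best_cost, (mvx, mvy)), or None if no position is valid.
    """
    n = len(cur)
    height, width = len(ref), len(ref[0])
    best = None
    for cx, cy in candidates:
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                mvx, mvy = cx + dx, cy + dy
                rx, ry = bx + mvx, by + mvy
                # Skip positions whose block would fall outside the frame.
                if not (0 <= rx <= width - n and 0 <= ry <= height - n):
                    continue
                cost = sad(cur, ref, rx, ry)
                if best is None or cost < best[0]:
                    best = (cost, (mvx, mvy))
    return best
```

Nothing in this routine depends on the coding standard, which is consistent with the statement that the same structure can serve both formats.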
The sub-pixel search module 120 is electrically connected to the integer-pixel search module 110 and performs a further sub-pixel search based on the matching block obtained by the integer-pixel search, so as to further improve the search precision. The sub-pixel search mainly includes two parts: interpolation and coding cost calculation. When the motion vector points to an integer-pixel position, the prediction block can be composed of the corresponding pixels of the reference frame; otherwise, the prediction block is obtained by interpolating with a filter to generate pixels at non-integer positions.
The sub-pixel search includes half-pixel precision and quarter-pixel precision. For interpolation at half-pixel positions, as described above, the sub-pixel search module 120 uses the first interpolation filter to perform half-pixel interpolation on video streams in both the H.264 and H.265 encoding formats. In one embodiment, the first interpolation filter is an 8-tap interpolation filter; that is, half-pixel interpolation uses an 8-tap interpolation filter for both H.264 and H.265 video streams. However, because the predicted value of a sample at a half-pixel position in the H.264 video coding standard is obtained by applying one-dimensional horizontal and vertical six-tap filtering, two taps of the 8-tap interpolation filter do not participate in the operation when half-pixel interpolation is performed on a video stream in the H.264 encoding format. In another embodiment, the coefficients of the two non-participating taps of the 8-tap interpolation filter may be set to 0.
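The sharing of one 8-tap datapath between the two formats can be illustrated with a one-dimensional Python sketch. The coefficient sets are the standard luma half-sample filters, (1, −5, 20, 20, −5, 1)/32 for H.264 and (−1, 4, −11, 40, 40, −11, 4, −1)/64 for H.265; the H.264 set is padded with two zero taps as the paragraph above describes. The function name, the edge replication, and the direct rounding to 8 bits are assumptions of this sketch (real codecs keep higher intermediate precision for two-dimensional positions).

```python
# (taps, right-shift) per format; the H.264 six-tap filter is zero-padded to 8 taps.
HALF_PEL_FILTERS = {
    "h264": ((0, 1, -5, 20, 20, -5, 1, 0), 5),          # sum of taps = 32
    "h265": ((-1, 4, -11, 40, 40, -11, 4, -1), 6),      # sum of taps = 64
}


def half_pel(samples, pos, codec):
    """Interpolate the half-sample between samples[pos] and samples[pos + 1].

    The same 8-tap multiply-accumulate loop is used for both formats;
    only the coefficient set and the normalization shift change.
    """
    taps, shift = HALF_PEL_FILTERS[codec]
    last = len(samples) - 1
    acc = 0
    for k, coeff in enumerate(taps):
        # Replicate edge samples when the filter window leaves the row.
        idx = min(max(pos - 3 + k, 0), last)
        acc += coeff * samples[idx]
    value = (acc + (1 << (shift - 1))) >> shift   # round to nearest
    return min(max(value, 0), 255)                # clip to the 8-bit range


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50, 60, 70]
    print(half_pel(ramp, 3, "h264"), half_pel(ramp, 3, "h265"))  # prints: 35 35
```

On the linear ramp both filters return the midpoint of the two neighboring samples, as expected.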
For interpolation at quarter-pixel positions, the sub-pixel search module 120 includes a first quarter-pixel interpolation module and a second quarter-pixel interpolation module. The first quarter-pixel interpolation module performs quarter-pixel interpolation on the video stream in the H.264 encoding format based on a second interpolation filter, and the second quarter-pixel interpolation module performs quarter-pixel interpolation on the video stream in the H.265 encoding format based on a third interpolation filter. That is, because quarter-pixel interpolation differs considerably between the H.264 and H.265 video coding standards, the hardware used for quarter-pixel interpolation is not multiplexed; instead, different interpolation filters are used for the video streams in the H.264 and H.265 encoding formats. The second interpolation filter may be a 2-pixel mean filter, which averages adjacent integer pixels or half pixels to obtain the pixel value at the quarter-pixel position. The two adjacent pixels used for the weighted average may be integer pixels or half pixels in the horizontal, vertical, or diagonal direction relative to the quarter-pixel position. The third interpolation filter may be a 7-tap or 8-tap interpolation filter, which averages adjacent integer pixels or half pixels to obtain the pixel value at the quarter-pixel position. Specifically, the third interpolation filter may be a horizontal or vertical 7-tap interpolation filter. Alternatively, the third interpolation filter may be a horizontal or vertical 8-tap interpolation filter in which the coefficient of the one tap that does not participate in the operation is 0.
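The two separate quarter-pixel paths might be sketched as follows. The 7-tap coefficients are the standard H.265 quarter-sample luma filter, padded here with one zero coefficient as the text describes; the function names and the simplified single-pass rounding are assumptions for illustration.

```python
def h264_quarter(a, b):
    """H.264-style quarter sample: rounded mean of two neighbouring
    integer-pixel or half-pixel values."""
    return (a + b + 1) >> 1


# H.265 quarter-sample luma filter: 7 taps plus one zero-valued tap.
H265_QUARTER = (-1, 4, -10, 58, 17, -5, 1, 0)  # coefficients sum to 64


def h265_quarter(samples, bit_depth=8):
    """Quarter-sample position just after samples[3], one 1-D pass (assumed)."""
    assert len(samples) == 8
    acc = sum(c * s for c, s in zip(H265_QUARTER, samples))
    value = (acc + 32) >> 6
    return max(0, min((1 << bit_depth) - 1, value))
```

The two functions share no arithmetic structure, which is consistent with the statement that this part of the hardware is not multiplexed.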
The sub-pixel search module 120 further includes a coding cost calculation sub-module, which calculates, based on the same hardware structure, a first coding cost between the sub-pixel matching block and the current block for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format. That is, the sub-pixel search module 120 calculates the first coding cost of the sub-pixel search for both formats on the same hardware structure, so that this part of the hardware is multiplexed.
Specifically, the coding cost calculation sub-module may use a SAD/SATD cost function model to calculate the coding cost of the sub-pixel search. The SAD/SATD cost function model calculates the cost from the difference between the predicted values and the image pixel values, which essentially reflects the degree of difference between the current block and the prediction block. To reflect the cost of each mode more accurately, in actual calculation the residual may be transformed to the frequency domain by a Hadamard transform to obtain the sum of absolute transformed differences (SATD), and the coding cost is then calculated from the SATD.
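As a hedged illustration of the two cost measures, the sketch below computes SAD directly on the pixel differences and SATD on a 4×4 Hadamard-transformed residual. The 4×4 block size and the division of the SATD by 2 are common conventions assumed here; the application does not specify them.

```python
def sad(cur, pred):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(c - p) for rc, rp in zip(cur, pred) for c, p in zip(rc, rp))


# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def satd4x4(cur, pred):
    """SATD of a 4x4 block: Hadamard-transform the residual, sum magnitudes.

    The final halving is an assumed normalisation, not part of the source.
    """
    diff = [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(cur, pred)]
    transformed = matmul(matmul(H4, diff), H4)
    return sum(abs(v) for row in transformed for v in row) // 2
```

Because neither measure depends on the coding format, a single implementation can serve both H.264 and H.265, in line with the shared cost hardware described above.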
In one embodiment, the video encoding apparatus 100 further includes an intra-mode preliminary selection module 140 for selecting one or more optimal intra prediction modes from among multiple intra prediction modes. Specifically, the intra-mode preliminary selection module 140 is connected to the mode decision module 130 and is configured to determine, according to the pixel values of at least one adjacent reference block in the current frame, at least one prediction block for the current block and a second coding cost corresponding to the at least one prediction block, and to determine at least one intra prediction mode according to the second coding cost. Intra prediction exploits the correlation between the current block and its adjacent reference blocks, using the information of the adjacent reference blocks for coding and thereby improving coding efficiency.
In H.264, 16×16 luma blocks and 8×8 chroma blocks have four selectable prediction modes: vertical mode, horizontal mode, DC mode, and Plane mode. 4×4 and 8×8 luma blocks have nine selectable prediction modes: horizontal prediction, vertical prediction, DC mode, and six directional prediction modes such as left diagonal and right diagonal. Plane mode predicts the pixel values of the current block from the pixels directly above and directly to the left using a linear function.
The H.265 video coding standard defines 35 intra prediction modes on the basis of the PU: Planar mode, DC mode, vertical mode, horizontal mode, and 31 other angular modes. The prediction direction of each angular mode can be regarded as an offset from the vertical or horizontal direction.
Given the similarities and differences between intra prediction in the H.264 and H.265 video coding standards, the intra-mode preliminary selection module 140 multiplexes part of its hardware structure for the portions of intra prediction that are the same in both formats, and provides separate hardware structures for the portions that differ between the H.264 and H.265 encoding formats.
Specifically, the intra-mode preliminary selection module 140 includes a first intra-mode preliminary selection module, a second intra-mode preliminary selection module, and a common intra-mode preliminary selection module. The common intra-mode preliminary selection module is the hardware multiplexed by the H.264 and H.265 encoding formats: together with the first intra-mode preliminary selection module it selects the intra prediction mode for the video stream in the H.264 encoding format, and together with the second intra-mode preliminary selection module it selects the intra prediction mode for the video stream in the H.265 encoding format.
Intra prediction mainly consists of two parts: intra prediction interpolation and coding cost calculation. In the H.264 and H.265 encoding formats, the hardware structure for intra prediction interpolation is the same for some intra prediction modes, namely horizontal prediction, vertical prediction, and part of DC prediction. The common intra-mode preliminary selection module therefore includes a horizontal prediction sub-module, a vertical prediction sub-module, and a partial DC prediction sub-module. The horizontal prediction sub-module performs intra prediction interpolation in horizontal mode on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure. The vertical prediction sub-module performs intra prediction interpolation in vertical mode on the video stream in either format based on the same hardware structure. The partial DC prediction sub-module performs intra prediction interpolation in DC mode, based on the same hardware structure, on image blocks of the luma component in the video stream in the H.264 encoding format and on image blocks of the luma and chroma components in the video stream in the H.265 encoding format.

The horizontal prediction sub-module uses the pixels directly to the left to horizontally predict the corresponding pixel values of the current block. The vertical prediction sub-module uses the pixels directly above to vertically predict the corresponding pixel values of the current block. The partial DC prediction sub-module is suited to large flat areas and uses the reference pixels directly above and directly to the left to predict the pixel values of the current block. In the H.264 encoding format, for an image block of the luma component, when the pixels directly above and directly to the left both exist, the pixel value of the current block is the average of these two groups of pixels; when only one of the two groups exists, the pixel value of the current block is the average of that group. In the H.265 encoding format, for image blocks of the luma and chroma components, unavailable reference pixels are filled with adjacent available pixels or with a default value, after which the adjacent pixels to the left and above become available.
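The three shared prediction modes can be sketched in a few lines. The sketch follows the H.264-style availability handling described above (average whichever reference groups exist); the function names, the use of `None` for an unavailable reference row or column, and the default value of 128 for 8-bit samples are illustrative assumptions.

```python
def predict_vertical(top, n):
    """Every row of the n x n block copies the reference row directly above."""
    return [list(top[:n]) for _ in range(n)]


def predict_horizontal(left, n):
    """Every column of the n x n block copies the reference column to the left."""
    return [[left[y]] * n for y in range(n)]


def predict_dc(top, left, n, default=128):
    """Fill the block with the rounded mean of the available references.

    `top` or `left` is None when that reference group is unavailable.
    """
    refs = []
    if top is not None:
        refs += list(top[:n])
    if left is not None:
        refs += list(left[:n])
    dc = default if not refs else (sum(refs) + len(refs) // 2) // len(refs)
    return [[dc] * n for _ in range(n)]
```

For the H.265 case, the padding step described above would run first, so that both `top` and `left` are always populated before `predict_dc` is called.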
Apart from the three intra prediction modes above, for which the hardware structure is multiplexed for the H.264 and H.265 video streams, intra prediction interpolation for the remaining intra prediction modes is implemented by separate hardware structures. Specifically, the first intra-mode preliminary selection module further includes a first direction prediction sub-module, a first plane prediction sub-module, and a DC prediction sub-module for the chroma component. The first direction prediction sub-module and the first plane prediction sub-module perform intra prediction interpolation in the directional modes and in Plane mode, respectively, on the video stream in the H.264 encoding format. The first direction prediction sub-module includes interpolation filters for six directional modes: left diagonal, right diagonal, vertical-right, horizontal-down, vertical-left, and horizontal-up. The first plane prediction sub-module predicts the pixel values of the current block from the pixels directly above and directly to the left using a linear function. The DC prediction sub-module for the chroma component performs intra prediction interpolation in DC mode on image blocks of the chroma component in the video stream in the H.264 encoding format.
The second intra-mode preliminary selection module includes a second direction prediction sub-module and a second plane prediction sub-module, which perform intra prediction interpolation in the directional modes and in Planar mode, respectively, on the video stream in the H.265 encoding format. Specifically, the second direction prediction sub-module may include some or all of the 31 other directional modes of the H.265 video coding standard. The second plane prediction sub-module uses two linear filters, one horizontal and one vertical, and takes the average of the two as the predicted value of each pixel in the current block.
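The averaging of a horizontal and a vertical linear interpolation in Planar mode can be sketched as below. The formula follows the usual H.265 Planar definition; the function and parameter names are illustrative, and `n` is assumed to be a power of two.

```python
def predict_planar(top, left, top_right, bottom_left, n):
    """H.265-style Planar prediction for an n x n block (n a power of two).

    top[x] and left[y] are the reference samples above and to the left;
    top_right and bottom_left are the single samples just beyond them.
    """
    shift = n.bit_length()  # equals log2(n) + 1 for power-of-two n
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            hor = (n - 1 - x) * left[y] + (x + 1) * top_right
            ver = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            pred[y][x] = (hor + ver + n) >> shift  # rounded average of both
    return pred
```

Each predicted sample is the rounded mean of a horizontal blend (left reference towards the top-right sample) and a vertical blend (top reference towards the bottom-left sample).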
Because the coding cost calculation for intra prediction is essentially the same in the H.264 and H.265 encoding formats, the common intra-mode preliminary selection module further includes a coding cost calculation sub-module, which calculates the second coding cost for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure. Although the coding cost is calculated in essentially the same way, when intra prediction is performed on a video stream in the H.264 encoding format the coding cost can be calculated for each of the nine prediction modes. For a video stream in the H.265 encoding format, because the H.265 video coding standard has as many as 35 intra prediction modes, intra prediction and coding cost calculation are performed in only some of those modes (for example, in only three of them), and one or two optimal intra prediction modes are finally selected.
The coding cost calculation sub-module uses a cost function to calculate the coding cost of each intra prediction mode and then determines the best intra prediction mode according to the magnitude of the coding cost. By way of example, the intra-mode preliminary selection module 140 may calculate the coding cost using the SAD/SATD cost model described above. Optionally, the intra-mode preliminary selection module 140 may instead use a rate-distortion optimization (RDO) cost model to calculate the coding cost.
The sub-pixel search module 120 and the intra-mode preliminary selection module 140 are both electrically connected to the mode decision module 130. The mode decision module 130 determines the optimal prediction block according to at least the at least one intra prediction mode obtained by the intra-mode preliminary selection module 140 and the at least one motion vector obtained by the sub-pixel search module 120, and outputs a coefficient block. The coefficient block is obtained by transforming the residual block. The mode decision module 130 can also output mode information. The mode information and the coefficient block are finally sent to the entropy encoding module for entropy encoding. In one embodiment, the mode decision module 130 can also output a reconstructed block, after which deblocking filtering and entropy coding filtering are performed on the reconstructed block. Further, the mode decision module 130 also takes into account the prediction results of two special inter prediction modes, Skip mode and Merge mode. Merge mode directly uses the motion vector of a temporally or spatially adjacent block as the motion vector of the current block, omitting the motion estimation step. Skip mode can be regarded as a special Merge mode; the difference is that Skip mode treats the residual obtained after transform and quantization as 0, that is, the residual is not encoded and the prediction block in this mode is the reconstructed block. Note that Skip mode and Merge mode obtain the MVs of adjacent blocks differently in the H.264 and H.265 encoding formats.
The mode decision module 130 likewise multiplexes part of its hardware structure to perform mode decision for video streams in both the H.264 and H.265 encoding formats. Specifically, the mode decision module 130 includes a first mode decision module, a second mode decision module, and a common mode decision module. The common mode decision module is the hardware structure multiplexed by the H.264 and H.265 encoding formats, while the first mode decision module and the second mode decision module are hardware structures dedicated to the H.264 encoding format and the H.265 encoding format, respectively. The first mode decision module and the common mode decision module together select the partitioning of the coding unit and the optimal prediction mode for the video stream in the H.264 encoding format, and obtain the residual block in the H.264 encoding format according to the optimal prediction mode. The second mode decision module and the common mode decision module together select the partitioning of the coding unit and the optimal prediction mode for the video stream in the H.265 encoding format, and obtain the residual block in the H.265 encoding format according to the optimal prediction mode.
Mode decision mainly comprises transform, quantization, inverse transform, inverse quantization, bit estimation, and distortion estimation. Transform and quantization further remove redundancy from the image and reduce the coding bit rate. The purpose of the transform is to convert the image signal from the time domain to the frequency domain; compared with the time-domain signal, the frequency-domain signal requires a considerably lower bit rate. Quantization reduces the length of the encoded image data.
Because transform, quantization, inverse transform, and inverse quantization differ considerably between the H.264 and H.265 video coding standards, they need to be implemented separately. The first mode decision module therefore includes a first transform sub-module, a first quantization sub-module, a first inverse transform sub-module, and a first inverse quantization sub-module, which respectively transform, quantize, inverse transform, and inverse quantize the video stream in the H.264 encoding format. The second mode decision module includes a second transform sub-module, a second quantization sub-module, a second inverse transform sub-module, and a second inverse quantization sub-module, which respectively transform, quantize, inverse transform, and inverse quantize the video stream in the H.265 encoding format.
The first transform sub-module and the second transform sub-module both essentially multiply the residual matrix by a transform matrix, and the first inverse transform sub-module and the second inverse transform sub-module both multiply the coefficient matrix by a transform matrix. However, the second transform sub-module and the second inverse transform sub-module perform the shift operation at the end of the matrix multiplication for a video stream in the H.265 encoding format, whereas the first transform sub-module and the first inverse transform sub-module perform shift operations partway through the calculation for a video stream in the H.264 encoding format. In addition, compared with H.265, a video stream in the H.264 encoding format undergoes an additional Hadamard transform/inverse transform in certain prediction modes (for example, for the chroma component, and for the luma component in 16×16 intra mode). The transform and inverse transform for H.264 and H.265 therefore use different hardware structures.
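The statement that a transform is essentially a multiplication of the residual matrix by a transform matrix can be illustrated with the 4×4 integer core transform matrix of H.264. This is a minimal sketch of the matrix product only: scaling, the placement of shift operations, and the extra Hadamard stage discussed above are deliberately left out, and the function names are illustrative.

```python
# 4x4 forward integer core transform matrix of H.264.
H264_CORE = [[1, 1, 1, 1],
             [2, 1, -1, -2],
             [1, -1, -1, 1],
             [1, -2, 2, -1]]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def forward_transform(residual, matrix):
    """Separable 2-D transform: matrix * residual * matrix^T."""
    return matmul(matmul(matrix, residual), transpose(matrix))
```

A constant residual block concentrates all of its energy in the top-left (DC) coefficient, which is what allows the later quantization stage to discard most of the remaining coefficients.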
Quantization and inverse quantization both essentially multiply the transformed matrix by a coefficient and then round to the nearest integer. The difference is that the first quantization sub-module multiplies different positions by different coefficients when quantizing a video stream in the H.264 encoding format, whereas the second quantization sub-module multiplies all positions by the same coefficient when quantizing a video stream in the H.265 encoding format. The quantization and inverse quantization for H.264 and H.265 therefore use different hardware structures.
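The position-dependent versus uniform multipliers can be sketched as follows. The multiplier values are the commonly tabulated ones for a quantization parameter of 0 (H.264 4×4 multiplication factors and the first H.265 quantization scale); the helper names, the round-to-nearest offset, and the shift of 15 used in the usage note are simplifying assumptions, since real encoders use standard-specific shifts and rounding offsets.

```python
# H.264 4x4 multiplication factors for QP % 6 == 0: they depend on position.
H264_MF_QP0 = [[13107, 8066, 13107, 8066],
               [8066, 5243, 8066, 5243],
               [13107, 8066, 13107, 8066],
               [8066, 5243, 8066, 5243]]

# H.265 quantization scale for QP % 6 == 0: identical for every position
# (assuming a flat scaling list).
H265_SCALE_QP0 = 26214


def uniform(value, n=4):
    """Build an n x n multiplier matrix holding the same value everywhere."""
    return [[value] * n for _ in range(n)]


def quantize(coeffs, multipliers, shift):
    """Multiply each coefficient by its multiplier, round to nearest, shift."""
    half = 1 << (shift - 1)
    out = []
    for row_c, row_m in zip(coeffs, multipliers):
        row = []
        for c, m in zip(row_c, row_m):
            level = (abs(c) * m + half) >> shift
            row.append(level if c >= 0 else -level)
        out.append(row)
    return out
```

Calling `quantize(block, H264_MF_QP0, 15)` yields levels that vary with position for a constant input, while `quantize(block, uniform(H265_SCALE_QP0), 15)` yields the same level everywhere, which is the structural difference the text attributes to the two quantization sub-modules.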
Bit estimation estimates the number of bits required by the current prediction mode according to the syntax elements to be encoded (including prediction information, coefficients, and so on) as specified in the H.264 or H.265 video coding standard. Because the syntax elements used for bit estimation differ between the H.264 and H.265 video coding standards, the bit estimation process also differs between video streams in the two formats. Therefore, in one embodiment, different hardware structures may be used for bit estimation of video streams in the H.264 and H.265 encoding formats: the first mode decision module further includes a first bit estimation sub-module for performing bit estimation on the video stream in the H.264 encoding format, and the second mode decision module further includes a second bit estimation sub-module for performing bit estimation on the video stream in the H.265 encoding format.
However, the syntax elements for bit estimation specified in the H.264 and H.265 video coding standards are also similar to a certain degree. In another embodiment, therefore, the same hardware structure may be multiplexed for bit estimation of video streams in both the H.264 and H.265 encoding formats in order to save hardware area.
In one implementation, the common mode decision module further includes an H.264 bit estimation sub-module, which performs bit estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on a first hardware structure. That is, in this implementation the bit estimation circuit of the H.264 encoding format is multiplexed for bit estimation of video streams in both formats, and the syntax elements for bit estimation specified in the H.264 video coding standard are used regardless of the format of the video stream.
In another implementation, the common mode decision module includes an H.265 bit estimation sub-module, which performs bit estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on a second hardware structure, where the H.264 bit estimation sub-module and the H.265 bit estimation sub-module use different syntax elements. That is, in this implementation the bit estimation circuit of the H.265 encoding format is multiplexed for bit estimation of video streams in both formats, and the syntax elements for bit estimation specified in the H.265 video coding standard are used regardless of the format of the video stream.
The distortion estimation process for H.264 and H.265 can use the same calculation. The common mode decision module therefore includes a distortion estimation sub-module, which performs distortion estimation for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format based on the same hardware structure. The distortion estimation sub-module typically calculates a value such as the SSE or SAD between the reconstructed pixels and the original pixels as the coding distortion. The calculation is as follows: Distortion = SAD + Lambda*MVBits, where SAD is the sum of absolute differences in the time domain, that is, the pixel difference between the reconstructed block and the current block; Lambda is a conversion factor; and MVBits is the number of bits obtained by the bit estimation sub-module.
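The cost expression above, combined with the choice of the lowest-cost candidate, can be sketched as below. The function names and the tuple layout of the candidates are illustrative assumptions; the numbers in the usage note are invented inputs, not measurements.

```python
def mode_cost(sad_value, bits, lam):
    """Cost as given in the text: SAD + Lambda * MVBits."""
    return sad_value + lam * bits


def best_mode(candidates, lam):
    """Return the name of the candidate with the lowest cost.

    `candidates` is a list of (name, sad_value, estimated_bits) tuples.
    """
    return min(candidates, key=lambda c: mode_cost(c[1], c[2], lam))[0]
```

For instance, `best_mode([("intra", 120, 10), ("inter", 100, 20), ("skip", 150, 1)], 4)` compares costs of 160, 180 and 154 and selects `"skip"`, showing how a mode with higher distortion can still win when it needs very few bits.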
Further, the video encoding apparatus 100 also includes an in-loop filtering module 150, which is electrically connected to the mode decision module 130 and performs in-loop filtering on the residual block.
In one embodiment, the in-loop filtering module 150 includes a first deblocking filter (DBF) sub-module and a second deblocking filter sub-module, which perform deblocking filtering on the video stream in the H.264 encoding format and the video stream in the H.265 encoding format, respectively. The main purpose of deblocking filtering is to remove high-frequency components at block boundaries in order to reduce blocking artifacts in the decoded image. Blocking artifacts are visible discontinuities that appear at block boundaries after decoding when an image has been compressed block by block. They arise for two reasons: first, during inter-frame motion compensation, adjacent blocks are predicted from non-adjacent blocks, which produces discontinuities between blocks; second, transforming, quantizing, and encoding the residual block introduces quantization distortion. The first and second deblocking filter sub-modules use different boundary sizes for deblocking filtering: if the filtering conditions are met, the first deblocking filter sub-module performs deblocking filtering on the boundaries of 4×4 blocks in the video stream in the H.264 encoding format, whereas the second deblocking filter sub-module performs deblocking filtering on the boundaries of 8×8 blocks in the video stream in the H.265 encoding format. In addition, the two sub-modules determine the filtering strength differently and use different filters at different filtering strengths.
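The difference in boundary size can be made concrete with a small sketch that lists the candidate edge positions along one dimension of a block. The function name is hypothetical, and the sketch only enumerates candidate positions; whether an edge is actually filtered, and how strongly, depends on conditions that are not modelled here.

```python
def candidate_edges(block_size, grid):
    """Offsets (in samples) of candidate deblocking edges inside a block.

    `grid` is 4 for the H.264 case and 8 for the H.265 case described above.
    """
    return list(range(0, block_size, grid))
```

For a 16-sample-wide block this gives four candidate edges on the 4×4 grid and two on the 8×8 grid, so the H.265 path evaluates half as many edges per dimension.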
The H.265 video coding standard involves two kinds of loop filtering: in addition to deblocking filtering, it includes sample adaptive offset (SAO). SAO analyzes the original data and the reconstructed data of the current frame and applies offset compensation to the image after deblocking filtering, so that the reconstructed image is as close as possible to the original image. Therefore, in one embodiment, the in-loop filtering module 150 further includes an SAO parameter estimation sub-module and an SAO filtering sub-module, which perform SAO parameter estimation and SAO filtering on the video stream in the H.265 encoding format. Because H.264 does not involve SAO, the reconstructed image after deblocking filtering is output directly for H.264.
For a video stream in the H.265 encoding format, the image after inverse quantization is deblocked by the deblocking filter sub-module and then passed as input to the SAO parameter estimation sub-module. SAO includes four EO (Edge Offset) modes and one BO (Band Offset) mode. In EO mode, the magnitude of the offset value must also be determined; in BO mode, the bands to be compensated and the offset values must also be determined. The SAO parameter estimation sub-module estimates these SAO modes and parameters to obtain the optimal offset mode and parameters. SAO filtering performs the actual filtering operation according to the optimal offset mode and parameters obtained. The reconstructed image output by the SAO filtering sub-module is buffered in the encoder as a subsequent reference frame.
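The two SAO families can be sketched as follows. The edge category rule and the 32-band split follow the usual H.265 definitions; the function names, the list-based interface, and the 8-bit default are illustrative assumptions, and parameter estimation (choosing the mode, bands and offsets) is not shown.

```python
def eo_category(a, c, b):
    """Edge Offset category of sample c given neighbours a and b along the
    chosen direction (horizontal, vertical, or one of the two diagonals)."""
    if c < a and c < b:
        return 1  # local minimum
    if (c < a and c == b) or (c == a and c < b):
        return 2  # concave corner
    if (c > a and c == b) or (c == a and c > b):
        return 3  # convex corner
    if c > a and c > b:
        return 4  # local maximum
    return 0      # no offset applied


def band_index(sample, bit_depth=8):
    """Band Offset: the sample range is split into 32 equal bands."""
    return sample >> (bit_depth - 5)


def apply_band_offset(samples, start_band, offsets, bit_depth=8):
    """Add offsets[k] to samples in band start_band + k, clipping the result."""
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        k = band_index(s, bit_depth) - start_band
        if 0 <= k < len(offsets):
            s = max(0, min(max_val, s + offsets[k]))
        out.append(s)
    return out
```

The four EO modes mentioned in the text correspond to the four neighbour directions passed to `eo_category`; the BO parameters to be estimated are `start_band` and `offsets`.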
The entropy encoding module 160 performs context-based arithmetic coding on the syntax elements: it encodes the syntax elements into binary strings and then applies arithmetic coding to encode the strings into a bitstream. The most common information is represented with short codes and less common information with long codes, so as to minimize the average code length. The decoder can recover the original information from the entropy-coded bitstream without loss. In one embodiment, the entropy coding mode used by the entropy encoding module 160 is CABAC (Context-based Adaptive Binary Arithmetic Coding). CABAC is an adaptive arithmetic coding scheme based on context models; it uses the correlation between symbols and the statistical characteristics of the video stream to continuously and automatically adjust the probability of each symbol, so that the amount of information output in the codewords is almost equal to the symbol entropy rate, achieving high coding efficiency.
The entropy encoding module 160 likewise multiplexes part of its hardware structure to entropy encode video streams in the H.264 and H.265 encoding formats. Specifically, entropy coding mainly comprises two steps. The first is binarization, which converts the syntax elements to be encoded into binary strings; these syntax elements include the partitioning of the current block, prediction information, residual information, filtering information, and so on. The second is arithmetic coding, which encodes the binary strings into a bitstream. The syntax elements to be encoded in the binarization process differ considerably between the H.264 and H.265 video coding standards, so binarization is implemented with separate hardware structures; the arithmetic coding core used in the arithmetic coding process is the same, so it is implemented by multiplexing the same hardware structure.
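The binarization step can be illustrated with three generic binarization schemes of the kind used ahead of a binary arithmetic coder. This is a sketch only: which scheme applies to which syntax element is format-specific and not shown, the arithmetic coding core itself is not implemented, and the function names are illustrative.

```python
def unary(n):
    """Unary code: n ones followed by a terminating zero."""
    return "1" * n + "0"


def truncated_unary(n, c_max):
    """Unary code whose terminating zero is dropped when n equals c_max."""
    return "1" * n + ("0" if n < c_max else "")


def fixed_length(n, bits):
    """Plain binary representation of n on a fixed number of bits."""
    return format(n, "0{}b".format(bits))


def binarize(elements, scheme):
    """Concatenate the bin strings of several syntax element values.

    In the architecture described above, this format-specific stage would
    feed a shared arithmetic coding core, which is outside this sketch.
    """
    return "".join(scheme(v) for v in elements)
```

Keeping the binarizers separate while handing their output to one downstream stage mirrors the split between the format-specific and the shared parts of the entropy encoding module.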
因此,熵编码模块160包括第一熵编码模块、第二熵编码模块和公共熵编码模块,公共熵编码模块即H.264和H.265编码格式的视频流复用的硬件结构;第一熵编码模块和公共熵编码模块用于对H.264编码格式的视频流进行残差块的熵编码,第二熵编码和所述公共熵编码模块用于对H.265编码格式的视频流进行残差块的熵编码。其中,第一熵编码模块用于根据H.264编码格式的残差块获得H.264编码格式的语法元素,第二熵编码模块用于根据H.265编码格式的残差块获得H.265编码格式的语法元素,公共熵编码模块用于提供算数编码核,以对所述H.264编码格式的语法元素或所述H.265编码格式的语法元素进行熵编码。Therefore, the entropy encoding module 160 includes a first entropy encoding module, a second entropy encoding module, and a common entropy encoding module, and the common entropy encoding module is the hardware structure of video stream multiplexing in H.264 and H.265 encoding formats; the first entropy encoding module The encoding module and the common entropy encoding module are used to perform entropy encoding of residual blocks on the video stream in the H.264 encoding format, and the second entropy encoding and the common entropy encoding module are used for performing residual block encoding on the video stream in the H.265 encoding format. Entropy coding of difference blocks. Wherein, the first entropy coding module is used to obtain the syntax elements of the H.264 coding format according to the residual block of the H.264 coding format, and the second entropy coding module is used to obtain the H.265 coding format according to the residual block of the H.265 coding format The syntax elements of the encoding format, the common entropy encoding module is used to provide an arithmetic encoding kernel to entropy encode the syntax elements of the H.264 encoding format or the syntax elements of the H.265 encoding format.
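The split between format-specific binarization and a shared arithmetic coding core can be sketched in Python as follows. Everything here is hypothetical: the two binarizers are simple placeholders (unary and fixed-length codes) rather than the binarization schemes defined by either standard, and `SharedArithmeticCore` only collects bins instead of arithmetic-coding them. The point is the structure, with two front ends feeding one back end.

```python
class SharedArithmeticCore:
    """Stand-in for the common entropy encoding module.
    Placeholder behaviour: it only records the bins it receives."""

    def __init__(self):
        self.bins = []

    def encode(self, bins):
        self.bins.extend(bins)
        return len(bins)  # number of bins consumed

def binarize_unary(value):
    """Placeholder front end A: unary code, e.g. 2 -> [1, 1, 0]."""
    return [1] * value + [0]

def binarize_fixed(value, width=4):
    """Placeholder front end B: fixed-length code, most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(width))]

# Hypothetical mapping of each format to its own binarization front end.
BINARIZERS = {"h264": binarize_unary, "h265": binarize_fixed}

def entropy_encode(codec, values, core):
    """Binarize with the format-specific front end, then use the shared core."""
    binarize = BINARIZERS[codec]
    return sum(core.encode(binarize(v)) for v in values)

core = SharedArithmeticCore()
print(entropy_encode("h264", [2, 0], core))  # bins produced by front end A
print(entropy_encode("h265", [2, 0], core))  # bins produced by front end B
```

Both calls pass through the same `core` object, mirroring how only the binarization stage is duplicated per format in the described hardware.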
In some embodiments, the video encoding apparatus 100 further includes a reference frame management module 170 electrically connected to the integer-pixel search module 110, the sub-pixel search module 120 and the mode decision module 130. The reference frame management module 170 obtains reference frames and sends them to the integer-pixel search module 110, the sub-pixel search module 120 and the mode decision module 130. This part is the same for the H.264 and H.265 video coding standards and can therefore be implemented by reusing the same hardware structure.
In terms of hardware structure, the video encoding apparatus 100 of the embodiments of the present application is implemented as a pipeline. In one embodiment, referring to FIG. 2, the video encoding apparatus 100 contains five pipeline stages in total: the integer-pixel search module 110 is in the first stage; the sub-pixel search module 120 and the intra mode pre-selection module 140 are in the second stage; the mode decision module 130 is in the third stage; the SAO parameter estimation sub-module and the deblocking filtering sub-module are in the fourth stage; and the entropy encoding module 160 and the SAO filtering module are in the fifth stage. It should be noted that the integer-pixel search module 110 is electrically connected to the sub-pixel search module 120, the sub-pixel search module 120 is electrically connected to the mode decision module 130, the mode decision module 130 is electrically connected to both the SAO parameter estimation sub-module and the deblocking filtering sub-module, and the SAO parameter estimation sub-module and the deblocking filtering sub-module are electrically connected to the entropy encoding module 160 and the SAO filtering module, respectively. Electrically connected here means that the modules concerned are electrically coupled to one another. Because the integer-pixel search module 110 and the sub-pixel search module 120 are electrically connected, the integer-pixel search module 110 can output the matching block that matches the current block in the current frame to the sub-pixel search module 120. For the same reason, the integer-pixel search module 110 can be placed in the first pipeline stage and the sub-pixel search module 120 in the second. Because the sub-pixel search module 120 and the mode decision module 130 are electrically connected, the sub-pixel search module 120 can output at least one sub-pixel matching block to the mode decision module 130. For the same reason, the sub-pixel search module 120 can be placed in the second pipeline stage and the mode decision module 130 in the third. On similar grounds, the SAO parameter estimation sub-module and the deblocking filtering sub-module can be placed in the fourth pipeline stage, and the entropy encoding module 160 and the SAO filtering module in the fifth.
In the pipeline, while the (N+2)-th block undergoes integer-pixel search, the (N+1)-th block undergoes sub-pixel search and intra mode pre-selection, and the N-th block undergoes mode decision. In some embodiments, because the mode decision module 130 is computationally heavy, it may also be implemented as two pipeline stages. The pipeline division shown in FIG. 2 is only an example; an actual implementation may divide the stages differently.
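The block-level overlap described above can be made concrete with a small scheduling sketch in Python. The stage names follow the five-stage embodiment; the function name and the assumption that every stage takes exactly one time slot per block are illustrative simplifications, not details from the application.

```python
# Stage labels follow the five-stage embodiment of FIG. 2.
STAGES = [
    "integer-pixel search",
    "sub-pixel search + intra mode pre-selection",
    "mode decision",
    "SAO parameter estimation + deblocking",
    "entropy coding + SAO filtering",
]

def schedule(num_blocks):
    """For each time slot, map stage index -> index of the block in that stage.
    Assumes (for illustration) that each stage takes one slot per block."""
    slots = []
    for t in range(num_blocks + len(STAGES) - 1):
        slot = {s: t - s for s in range(len(STAGES)) if 0 <= t - s < num_blocks}
        slots.append(slot)
    return slots

for t, slot in enumerate(schedule(4)):
    busy = ", ".join(f"stage {s + 1}: block {b}" for s, b in sorted(slot.items()))
    print(f"slot {t}: {busy}")
```

In slot 2, for example, block 2 is in the first stage, block 1 in the second and block 0 in the third, which matches the N+2, N+1, N relationship stated in the text.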
The video encoding apparatus of the embodiments of the present application reuses part of its hardware structure to encode video streams in both the H.264 and H.265 encoding formats, which saves hardware area.
FIG. 3 is a flowchart of a video encoding method 300 according to an embodiment of the present invention. The video encoding method 300 may be implemented by the video encoding apparatus 100 described above. Only the main steps of the video encoding method 300 are described below; for further details, refer to the description above.
As shown in FIG. 3, the video encoding method 300 of the embodiments of the present application includes the following steps:
Step S310: an integer-pixel search module determines, within a plurality of predetermined ranges in a plurality of reference frames, a matching block that matches the current block in the current frame;
Step S320: a sub-pixel search module electrically connected to the integer-pixel search module determines at least one sub-pixel matching block for the matching block, wherein the sub-pixel search module includes a half-pixel interpolation module, and determining the at least one sub-pixel matching block includes: the half-pixel interpolation module performs half-pixel interpolation on a video stream in the H.264 encoding format or a video stream in the H.265 encoding format using a first interpolation filter;
Step S330: a mode decision module electrically connected to the sub-pixel search module makes a mode decision using at least the encoding cost of the sub-pixel matching block, so as to obtain the optimal prediction block of the current block for video encoding.
In one embodiment, the first interpolation filter used in step S320 is an 8-tap interpolation filter.
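A one-dimensional sketch of 8-tap half-pixel interpolation is given below in Python. The application does not list the filter coefficients, so the taps used here are an assumption: they are the luma half-sample coefficients of the H.265 standard. The function name, the edge-replication padding and the 8-bit clipping are likewise illustrative choices.

```python
# Assumed coefficients: the H.265 luma half-sample filter (taps sum to 64).
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)

def half_pel_row(row, bit_depth=8):
    """Interpolate the half-sample position between row[i] and row[i + 1]
    for every i, using samples row[i - 3] .. row[i + 4]."""
    n = len(row)
    max_val = (1 << bit_depth) - 1
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            j = min(max(i - 3 + k, 0), n - 1)  # clamp index: edge replication
            acc += tap * row[j]
        value = (acc + 32) >> 6                # round and divide by 64
        out.append(min(max(value, 0), max_val))
    return out

print(half_pel_row([100] * 8))             # flat input
print(half_pel_row(list(range(0, 80, 10))))  # linear ramp
```

A flat row is reproduced unchanged because the taps sum to 64, and away from the borders a linear ramp yields values at or within rounding of the midpoint between neighbouring samples.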
In one embodiment, the sub-pixel search module further includes a first quarter-pixel interpolation module and a second quarter-pixel interpolation module, and the method further includes: the first quarter-pixel interpolation module performs quarter-pixel interpolation on a video stream in the H.264 encoding format based on a second interpolation filter, or the second quarter-pixel interpolation module performs quarter-pixel interpolation on a video stream in the H.265 encoding format based on a third interpolation filter.
Further, the sub-pixel search module also includes an encoding cost calculation sub-module, and the method further includes: the encoding cost calculation sub-module, using the same hardware structure for the video stream in the H.264 encoding format and the video stream in the H.265 encoding format, calculates a first encoding cost between the sub-pixel matching block and the current block.
In one embodiment, the method further includes: an intra mode pre-selection module connected to the mode decision module determines, according to pixel values corresponding to at least one adjacent reference block in the current frame, at least one prediction block for the current block and a second encoding cost corresponding to the at least one prediction block, and determines at least one intra prediction mode according to the second encoding cost; the mode decision module determines the optimal prediction block according to the at least one intra prediction mode and the at least one motion vector, and outputs mode information, a coefficient block and a reconstructed block; an in-loop filtering module electrically connected to the mode decision module performs in-loop filtering on the reconstructed block; and an entropy encoding module electrically connected to the in-loop filtering module entropy-encodes the mode information and the coefficient block.
Exemplarily, the intra mode pre-selection module includes a first intra mode pre-selection module, a second intra mode pre-selection module and a common intra mode pre-selection module, and the method further includes: the first intra mode pre-selection module and the common intra mode pre-selection module select an intra prediction mode for a video stream in the H.264 encoding format, or the second intra mode pre-selection module and the common intra mode pre-selection module select an intra prediction mode for a video stream in the H.265 encoding format.
The common intra mode pre-selection module includes a horizontal prediction sub-module, a vertical prediction sub-module and a DC prediction sub-module, and the method further includes: the horizontal prediction sub-module, the vertical prediction sub-module and the DC prediction sub-module each use the same hardware structure to perform intra prediction interpolation in the horizontal mode, the vertical mode and the DC mode, respectively, on a video stream in the H.264 encoding format or a video stream in the H.265 encoding format.
The common intra mode pre-selection module further includes an encoding cost calculation sub-module, and the method further includes: the encoding cost calculation sub-module, using the same hardware structure for the video stream in the H.264 encoding format and the video stream in the H.265 encoding format, calculates the second encoding cost.
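The three prediction modes handled by the common sub-modules, together with a cost comparison, can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the cost metric is taken to be the sum of absolute differences (the passage above does not name the metric), and any boundary smoothing that a standard applies to the predicted samples is omitted.

```python
def predict_block(mode, left, top):
    """Build an N x N prediction from reconstructed neighbours.
    left: N pixels of the column to the left; top: N pixels of the row above."""
    n = len(left)
    if mode == "horizontal":
        return [[left[y]] * n for y in range(n)]   # copy left pixel along each row
    if mode == "vertical":
        return [list(top) for _ in range(n)]       # copy top row downwards
    if mode == "dc":
        dc = (sum(left) + sum(top) + n) // (2 * n)  # rounded mean of neighbours
        return [[dc] * n for _ in range(n)]
    raise ValueError(f"unknown mode: {mode}")

def sad(a, b):
    """Sum of absolute differences, used here as the assumed encoding cost."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def best_mode(block, left, top):
    """Return the cheapest of the three modes and all three costs."""
    costs = {m: sad(block, predict_block(m, left, top))
             for m in ("horizontal", "vertical", "dc")}
    return min(costs, key=costs.get), costs

left = [10, 20, 30, 40]
top = [50, 52, 54, 56]
block = [list(top) for _ in range(4)]  # a block with purely vertical structure
print(best_mode(block, left, top))
```

Because the same three predictors and the same cost routine serve both formats, only the format-specific modes need separate logic, which is the reuse the passage describes.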
In one embodiment, the mode decision module includes a first mode decision module, a second mode decision module and a common mode decision module, and the mode decision includes: the first mode decision module and the common mode decision module select the coding unit partitioning and the optimal prediction mode for a video stream in the H.264 encoding format and obtain an H.264 residual block according to the optimal prediction mode, or the second mode decision module and the common mode decision module select the coding unit partitioning and the optimal prediction mode for a video stream in the H.265 encoding format and obtain an H.265 residual block according to the optimal prediction mode.
Exemplarily, the first mode decision module further includes a first bit estimation sub-module, and the mode decision further includes the first bit estimation sub-module performing bit estimation on the video stream in the H.264 encoding format; the second mode decision module further includes a second bit estimation sub-module, and the mode decision further includes the second bit estimation sub-module performing bit estimation on the video stream in the H.265 encoding format.
Exemplarily, the common mode decision module further includes an H.264 bit estimation sub-module, and the mode decision further includes the H.264 bit estimation sub-module performing bit estimation, based on a first hardware structure, on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format; alternatively, the common mode decision module includes an H.265 bit estimation sub-module, and the mode decision further includes the H.265 bit estimation sub-module performing bit estimation, based on a second hardware structure, on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format, wherein the H.264 bit estimation sub-module and the H.265 bit estimation sub-module use different syntax elements.
Exemplarily, the common mode decision module further includes a distortion estimation sub-module, and the mode decision further includes the distortion estimation sub-module performing distortion estimation, based on the same hardware structure, for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
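How a bit estimate and a distortion estimate might be combined into a single mode decision can be sketched in Python with the conventional Lagrangian rate-distortion cost J = D + λ·R. The passage above does not state the cost formula, so the formula, the function names and the numeric values are all assumptions for illustration.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * bits

def choose_mode(candidates, lam):
    """candidates maps a mode name to (estimated distortion, estimated bits).
    Returns the mode with the lowest cost."""
    return min(candidates, key=lambda m: rd_cost(*candidates[m], lam))

# Hypothetical estimates from the distortion and bit estimation sub-modules.
candidates = {"intra": (100, 10), "inter": (60, 30)}
print(choose_mode(candidates, lam=1.0))  # low lambda favours low distortion
print(choose_mode(candidates, lam=4.0))  # high lambda favours fewer bits
```

Under this formulation the distortion term can be computed identically for both formats, while the rate term depends on format-specific syntax elements, which is consistent with sharing the distortion estimation hardware and distinguishing the bit estimation sub-modules.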
In one embodiment, the in-loop filtering includes performing SAO parameter estimation and SAO filtering on a video stream in the H.265 encoding format. The in-loop filtering further includes performing deblocking filtering, based on the same hardware structure, on a video stream in the H.264 encoding format and a video stream in the H.265 encoding format.
In one embodiment, the entropy encoding module includes a first entropy encoding module, a second entropy encoding module and a common entropy encoding module, and the entropy encoding includes: the first entropy encoding module and the common entropy encoding module entropy-encode the residual block of a video stream in the H.264 encoding format, or the second entropy encoding module and the common entropy encoding module entropy-encode the residual block of a video stream in the H.265 encoding format.
Further, the entropy encoding includes: the first entropy encoding module obtains H.264 syntax elements from the residual block in the H.264 encoding format, the second entropy encoding module obtains H.265 syntax elements from the residual block in the H.265 encoding format, and the common entropy encoding module provides an arithmetic coding core that entropy-encodes the H.264 syntax elements or the H.265 syntax elements.
In addition, an embodiment of the present invention provides a computer storage medium on which a computer program is stored. When the computer program is executed by a processor, it can control the video encoding apparatus 100 shown in FIG. 1 to carry out the steps of the video encoding method 300 shown in FIG. 3. For example, the computer storage medium is a computer-readable storage medium. The computer storage medium may include, for example, a memory card of a smartphone, a storage component of a tablet computer, a hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of these storage media. The computer-readable storage medium may be any combination of one or more computer-readable storage media.
An embodiment of the present invention also provides a movable platform. FIG. 4 is a schematic structural diagram of a movable platform 400 according to an embodiment of the present invention. As shown in FIG. 4, the movable platform 400 of this embodiment includes an imaging device 410 and a video encoding apparatus 420. The imaging device 410 captures video data, and the video encoding apparatus 420 encodes the video data captured by the imaging device 410. The video encoding apparatus 420 may adopt the structure of the embodiment shown in FIG. 1; its details are described above and are not repeated here.
In some implementations, the movable platform includes at least one of an unmanned aerial vehicle, a car, a remote-controlled vehicle, a robot, a camera and a gimbal. The video encoding apparatus 420 and the imaging device 410 are mounted on the body of the movable platform. When the movable platform is an unmanned aerial vehicle, the body is the fuselage of the unmanned aerial vehicle. When the movable platform is a car, the body is the body of the car; the car may be a self-driving or semi-self-driving car, which is not limited here. When the movable platform is a remote-controlled vehicle, the body is the body of the remote-controlled vehicle. When the movable platform is a robot, the body is the robot itself. When the movable platform is a camera, the body is the camera itself. When the movable platform is a gimbal, the body is the gimbal body; the gimbal may be a handheld gimbal or a gimbal mounted on a car or an aircraft.
In summary, the video encoding method, video encoding apparatus, computer storage medium and movable platform of the embodiments of the present invention reuse part of the hardware structure to encode video streams in both the H.264 and H.265 encoding formats, which saves hardware area.
The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented in software, they may be implemented in whole or in part as a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., digital video disc (DVD)), a semiconductor medium (e.g., solid state disk (SSD)), or the like.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or of another form.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be determined by the protection scope of the claims.
Although example embodiments have been described herein with reference to the accompanying drawings, it should be understood that these example embodiments are merely illustrative and are not intended to limit the scope of the present invention to them. Those of ordinary skill in the art may make various changes and modifications without departing from the scope and spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as claimed in the appended claims.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not executed.
Numerous specific details are set forth in the description provided here. It will be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
类似地,应当理解,为了精简本发明并帮助理解各个发明方面中的一个或多个,在对本发明的示例性实施例的描述中,本发明的各个特征有时被一起分组到单个实施例、图、或者对其的描述中。然而,并不应将该本发明的方法解释成反映如下意图:即所要求保护的本发明要求比在每个权利要求中所明确记载的特征更多的特征。更确切地说,如相应的权利要求书所反映的那样,其发明点在于可以用少于某个公开的单个实施例的所有特征的特征来解决相应的技术问题。因此,遵循具体实施方式的权利要求书由此明确地并入该具体实施方式,其中每个权利要求本身都作为本发明的单独实施例。Similarly, it is to be understood that in the description of the exemplary embodiments of the invention, various features of the invention are sometimes grouped together , or in its description. However, this method of the invention should not be interpreted as reflecting the intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the corresponding claims reflect, the invention lies in the fact that the corresponding technical problem may be solved with less than all features of a single disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will understand that all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination, except where the features are mutually exclusive. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art will appreciate that although some embodiments described herein include certain features that are included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some modules according to embodiments of the present invention. The present invention may also be implemented as device programs (for example, computer programs and computer program products) for performing part or all of the methods described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet site, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not denote any order. These words may be interpreted as names.
The above is merely a specific embodiment of the present invention or a description of a specific embodiment, and the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.
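By way of non-limiting illustration of the software-module implementation mentioned above, the following minimal sketch shows an integer-pixel full search followed by horizontal half-pixel interpolation with a single 8-tap filter. All function names, the sum-of-absolute-differences cost, the exhaustive search order and the edge clamping are assumptions made for this illustration and are not taken from the application. The taps shown are the H.265 luma half-sample taps; whether the disclosed first interpolation filter uses these exact coefficients is likewise an assumption.

```python
# Illustrative sketch only: names, cost metric and search order are hypothetical.

# H.265 (HEVC) luma half-sample filter taps; they sum to 64.
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def integer_pixel_search(cur, ref, x, y, size, search_range):
    """Exhaustive search in one reference frame.

    Returns ((mv_x, mv_y), cost) for the lowest-cost candidate whose
    block lies entirely inside the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for mv_y in range(-search_range, search_range + 1):
        for mv_x in range(-search_range, search_range + 1):
            rx, ry = x + mv_x, y + mv_y
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate falls outside the reference frame
            cost = sad(cur, x, y, ref, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mv_x, mv_y), cost
    return best_mv, best_cost


def half_pel_row(row):
    """Half-sample values between row[i] and row[i + 1] for 8-bit samples.

    Samples beyond the row ends are replaced by the nearest edge sample.
    """
    n = len(row)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            idx = min(max(i - 3 + k, 0), n - 1)  # taps cover i-3 .. i+4
            acc += tap * row[idx]
        out.append(min(max((acc + 32) >> 6, 0), 255))  # round, divide by 64, clip
    return out
```

In such a sketch the integer-pixel result would seed the sub-pixel stage: the rows around the best integer match are interpolated, and the same cost function is evaluated again at the half-pixel positions.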

Claims (39)

  1. A video encoding apparatus, characterized in that the video encoding apparatus comprises:
    an integer-pixel search module, configured to determine, within a plurality of predetermined ranges in a plurality of reference frames, a matching block that matches a current block in a current frame;
    a sub-pixel search module, electrically connected to the integer-pixel search module and configured to determine at least one sub-pixel matching block with respect to the matching block;
    a mode decision module, electrically connected to the sub-pixel search module and configured to perform mode decision using at least the coding cost of the sub-pixel matching block, so as to obtain an optimal prediction block of the current block for video encoding;
    wherein the sub-pixel search module comprises a half-pixel interpolation module, and the half-pixel interpolation module is capable of using a first interpolation filter to perform half-pixel interpolation on a video stream in the H.264 encoding format and on a video stream in the H.265 encoding format.
  2. The video encoding apparatus according to claim 1, characterized in that the integer-pixel search module is further configured to determine a first motion vector of the current block relative to the matching block;
    the sub-pixel search module is further configured to determine a second motion vector of the current block relative to the at least one sub-pixel matching block, the precision of the second motion vector being higher than that of the first motion vector.
  3. The video encoding apparatus according to claim 1, characterized in that the integer-pixel search module performs integer-pixel search on the video stream in the H.264 encoding format and the video stream in the H.265 encoding format based on the same hardware structure.
  4. The video encoding apparatus according to claim 1, characterized in that the first interpolation filter is an 8-tap interpolation filter.
  5. The video encoding apparatus according to claim 1, characterized in that the sub-pixel search module further comprises a first quarter-pixel interpolation module and a second quarter-pixel interpolation module, the first quarter-pixel interpolation module performing quarter-pixel interpolation on the video stream in the H.264 encoding format based on a second interpolation filter, and the second quarter-pixel interpolation module performing quarter-pixel interpolation on the video stream in the H.265 encoding format based on a third interpolation filter.
  6. The video encoding apparatus according to claim 1, characterized in that the sub-pixel search module further comprises a coding cost calculation sub-module, configured to calculate, based on the same hardware structure, a first coding cost between the sub-pixel matching block and the current block for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  7. The video encoding apparatus according to claim 1, characterized by further comprising:
    an intra mode preliminary selection module, connected to the mode decision module and configured to determine, according to pixel values corresponding to at least one adjacent reference block in the current frame, at least one prediction block for the current block and a second coding cost corresponding to the at least one prediction block, and to determine at least one intra prediction mode according to the second coding cost;
    wherein the mode decision module is configured to determine the optimal prediction block according to the at least one intra prediction mode and the at least one motion vector, and to output mode information, a coefficient block and a reconstructed block;
    an in-loop filtering module, electrically connected to the mode decision module and configured to perform in-loop filtering on the reconstructed block;
    an entropy coding module, electrically connected to the in-loop filtering module and configured to entropy-code the mode information and the coefficient block.
  8. The video encoding apparatus according to claim 7, characterized in that the intra mode preliminary selection module comprises a first intra mode preliminary selection module, a second intra mode preliminary selection module and a common intra mode preliminary selection module, the first intra mode preliminary selection module and the common intra mode preliminary selection module being configured to select an intra prediction mode for the video stream in the H.264 encoding format, and the second intra mode preliminary selection module and the common intra mode preliminary selection module being configured to select an intra prediction mode for the video stream in the H.265 encoding format.
  9. The video encoding apparatus according to claim 8, characterized in that the common intra mode preliminary selection module comprises a horizontal prediction sub-module, a vertical prediction sub-module and a DC prediction sub-module, respectively configured to perform, based on the same hardware structure, intra prediction interpolation in horizontal mode, vertical mode and DC mode on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  10. The video encoding apparatus according to claim 8, characterized in that the first intra mode preliminary selection module comprises a first directional prediction sub-module and a first planar prediction sub-module, respectively configured to perform intra prediction interpolation in directional mode and planar mode on the video stream in the H.264 encoding format;
    the second intra mode preliminary selection module comprises a second directional prediction sub-module and a second planar prediction sub-module, respectively configured to perform intra prediction interpolation in directional mode and planar mode on the video stream in the H.265 encoding format.
  11. The video encoding apparatus according to claim 8, characterized in that the common intra mode preliminary selection module comprises a coding cost calculation sub-module, configured to calculate the second coding cost, based on the same hardware structure, for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  12. The video encoding apparatus according to claim 1, characterized in that the mode decision module comprises a first mode decision module, a second mode decision module and a common mode decision module, the first mode decision module and the common mode decision module being configured to select a coding unit partitioning and an optimal prediction mode for the video stream in the H.264 encoding format and to obtain a residual block in the H.264 encoding format according to the optimal prediction mode, and the second mode decision module and the common mode decision module being configured to select a coding unit partitioning and an optimal prediction mode for the video stream in the H.265 encoding format and to obtain a residual block in the H.265 encoding format according to the optimal prediction mode.
  13. The video encoding apparatus according to claim 12, characterized in that the first mode decision module comprises a first transform sub-module, a first quantization sub-module, a first inverse transform sub-module and a first inverse quantization sub-module, respectively configured to perform transform, quantization, inverse transform and inverse quantization on the video stream in the H.264 encoding format;
    the second mode decision module comprises a second transform sub-module, a second quantization sub-module, a second inverse transform sub-module and a second inverse quantization sub-module, respectively configured to perform transform, quantization, inverse transform and inverse quantization on the video stream in the H.265 encoding format.
  14. The video encoding apparatus according to claim 12, characterized in that the first mode decision module further comprises a first bit estimation sub-module, configured to perform bit estimation on the video stream in the H.264 encoding format;
    the second mode decision module further comprises a second bit estimation sub-module, configured to perform bit estimation on the video stream in the H.265 encoding format.
  15. The video encoding apparatus according to claim 12, characterized in that the common mode decision module further comprises an H.264 bit estimation sub-module, configured to perform bit estimation, based on a first hardware structure, on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format; or, the common mode decision module comprises an H.265 bit estimation sub-module, configured to perform bit estimation, based on a second hardware structure, on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format, wherein the H.264 bit estimation sub-module and the H.265 bit estimation sub-module use different syntax elements.
  16. The video encoding apparatus according to claim 12, characterized in that the common mode decision module comprises a distortion estimation sub-module, configured to perform, based on the same hardware structure, distortion estimation for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  17. The video encoding apparatus according to claim 7, characterized in that the in-loop filtering module comprises an SAO parameter estimation sub-module and an SAO filtering sub-module, configured to perform SAO parameter estimation and SAO filtering on the video stream in the H.265 encoding format.
  18. The video encoding apparatus according to claim 7, characterized in that the in-loop filtering module comprises a first deblocking filtering sub-module and a second deblocking filtering sub-module, respectively configured to perform deblocking filtering on the video stream in the H.264 encoding format and the video stream in the H.265 encoding format.
  19. The video encoding apparatus according to claim 7, characterized in that the entropy coding module comprises a first entropy coding module, a second entropy coding module and a common entropy coding module, the first entropy coding module and the common entropy coding module being configured to perform entropy coding of the coefficient block for the video stream in the H.264 encoding format, and the second entropy coding module and the common entropy coding module being configured to perform entropy coding of the coefficient block for the video stream in the H.265 encoding format.
  20. The video encoding apparatus according to claim 19, characterized in that the first entropy coding module is configured to obtain syntax elements of the H.264 encoding format according to the residual block in the H.264 encoding format, the second entropy coding module is configured to obtain syntax elements of the H.265 encoding format according to the residual block in the H.265 encoding format, and the common entropy coding module is configured to provide an arithmetic coding core for entropy-coding the syntax elements of the H.264 encoding format or the syntax elements of the H.265 encoding format.
  21. The video encoding apparatus according to claim 7, characterized by further comprising a reference frame management module electrically connected to the integer-pixel search module, the sub-pixel search module and the mode decision module, configured to obtain the reference frames and send the reference frames to the integer-pixel search module, the sub-pixel search module and the mode decision module.
  22. A video encoding method, characterized in that the method comprises:
    determining, by an integer-pixel search module, within a plurality of predetermined ranges in a plurality of reference frames, a matching block that matches a current block in a current frame;
    determining, by a sub-pixel search module electrically connected to the integer-pixel search module, at least one sub-pixel matching block with respect to the matching block, wherein the sub-pixel search module comprises a half-pixel interpolation module, and determining the at least one sub-pixel matching block with respect to the matching block comprises: performing, by the half-pixel interpolation module using a first interpolation filter, half-pixel interpolation on a video stream in the H.264 encoding format or a video stream in the H.265 encoding format;
    performing, by a mode decision module electrically connected to the sub-pixel search module, mode decision using at least the coding cost of the sub-pixel matching block, so as to obtain an optimal prediction block of the current block for video encoding.
  23. The video encoding method according to claim 22, characterized in that the first interpolation filter is an 8-tap interpolation filter.
  24. The video encoding method according to claim 22, characterized in that the sub-pixel search module further comprises a first quarter-pixel interpolation module and a second quarter-pixel interpolation module, and the method further comprises:
    performing, by the first quarter-pixel interpolation module, quarter-pixel interpolation on the video stream in the H.264 encoding format based on a second interpolation filter, or, performing, by the second quarter-pixel interpolation module, quarter-pixel interpolation on the video stream in the H.265 encoding format based on a third interpolation filter.
  25. The video encoding method according to claim 22, characterized in that the sub-pixel search module further comprises a coding cost calculation sub-module, and the method further comprises:
    calculating, by the coding cost calculation sub-module based on the same hardware structure, a first coding cost between the sub-pixel matching block and the current block for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  26. The video encoding method according to claim 22, characterized by further comprising:
    determining, by an intra mode preliminary selection module connected to the mode decision module, according to pixel values corresponding to at least one adjacent reference block in the current frame, at least one prediction block for the current block and a second coding cost corresponding to the at least one prediction block, and determining at least one intra prediction mode according to the second coding cost;
    determining, by the mode decision module, the optimal prediction block according to the at least one intra prediction mode and the at least one motion vector, and outputting mode information, a coefficient block and a reconstructed block;
    performing, by an in-loop filtering module electrically connected to the mode decision module, in-loop filtering on the reconstructed block;
    performing, by an entropy coding module electrically connected to the in-loop filtering module, entropy coding according to the mode information and the coefficient block after the in-loop filtering.
  27. The video encoding method according to claim 26, characterized in that the intra mode preliminary selection module comprises a first intra mode preliminary selection module, a second intra mode preliminary selection module and a common intra mode preliminary selection module, and the method further comprises:
    selecting, by the first intra mode preliminary selection module and the common intra mode preliminary selection module, an intra prediction mode for the video stream in the H.264 encoding format, or, selecting, by the second intra mode preliminary selection module and the common intra mode preliminary selection module, an intra prediction mode for the video stream in the H.265 encoding format.
  28. The video encoding method according to claim 27, characterized in that the common intra mode preliminary selection module comprises a horizontal prediction sub-module, a vertical prediction sub-module and a DC prediction sub-module, and the method further comprises:
    performing, by the horizontal prediction sub-module, the vertical prediction sub-module and the DC prediction sub-module respectively, based on the same hardware structure, intra prediction interpolation in horizontal mode, vertical mode and DC mode on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  29. The video encoding method according to claim 27, characterized in that the common intra mode preliminary selection module comprises a coding cost calculation sub-module, and the method further comprises:
    calculating, by the coding cost calculation sub-module based on the same hardware structure, the second coding cost for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  30. The video encoding method according to claim 22, characterized in that the mode decision module comprises a first mode decision module, a second mode decision module and a common mode decision module, and the mode decision comprises:
    selecting, by the first mode decision module and the common mode decision module, a coding unit partitioning and an optimal prediction mode for the video stream in the H.264 encoding format, and obtaining a residual block in the H.264 encoding format according to the optimal prediction mode, or, selecting, by the second mode decision module and the common mode decision module, a coding unit partitioning and an optimal prediction mode for the video stream in the H.265 encoding format, and obtaining a residual block in the H.265 encoding format according to the optimal prediction mode.
  31. The video encoding method according to claim 30, characterized in that the first mode decision module further comprises a first bit estimation sub-module, and the mode decision further comprises performing, by the first bit estimation sub-module, bit estimation on the video stream in the H.264 encoding format;
    the second mode decision module further comprises a second bit estimation sub-module, and the mode decision further comprises performing, by the second bit estimation sub-module, bit estimation on the video stream in the H.265 encoding format.
  32. The video encoding method according to claim 31, characterized in that the common mode decision module further comprises an H.264 bit estimation sub-module, and the mode decision further comprises performing, by the H.264 bit estimation sub-module based on a first hardware structure, bit estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format; or, the common mode decision module comprises an H.265 bit estimation sub-module, and the mode decision further comprises performing, by the H.265 bit estimation sub-module based on a second hardware structure, bit estimation on the video stream in the H.264 encoding format or the video stream in the H.265 encoding format, wherein the H.264 bit estimation sub-module and the H.265 bit estimation sub-module use different syntax elements.
  33. The video encoding method according to claim 31, characterized in that the common mode decision module comprises a distortion estimation sub-module, and the mode decision further comprises performing, by the distortion estimation sub-module based on the same hardware structure, distortion estimation for the video stream in the H.264 encoding format or the video stream in the H.265 encoding format.
  34. The video encoding method according to claim 26, characterized in that the in-loop filtering comprises performing SAO parameter estimation and SAO filtering on the video stream in the H.265 encoding format.
  35. The video encoding method according to claim 26, characterized in that the in-loop filtering comprises performing deblocking filtering on the video stream in the H.264 encoding format and the video stream in the H.265 encoding format.
  36. The video encoding method according to claim 26, characterized in that the entropy coding module comprises a first entropy coding module, a second entropy coding module and a common entropy coding module, and the entropy coding comprises:
    performing, by the first entropy coding module and the common entropy coding module, entropy coding of the coefficient block for the video stream in the H.264 encoding format, or, performing, by the second entropy coding module and the common entropy coding module, entropy coding of the coefficient block for the video stream in the H.265 encoding format.
  37. The video encoding method according to claim 36, characterized in that the entropy coding comprises: obtaining, by the first entropy coding module, syntax elements of the H.264 encoding format according to the residual block in the H.264 encoding format; obtaining, by the second entropy coding module, syntax elements of the H.265 encoding format according to the residual block in the H.265 encoding format; and providing, by the common entropy coding module, an arithmetic coding core for entropy-coding the syntax elements of the H.264 encoding format or the syntax elements of the H.265 encoding format.
  38. A computer storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the video encoding method according to any one of claims 22 to 37.
  39. A movable platform, characterized in that the movable platform comprises an imaging device and the video encoding apparatus according to any one of claims 1 to 21, the imaging device being configured to capture video data, and the video encoding apparatus being configured to perform video encoding on the video data captured by the imaging device.
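As a non-limiting illustration of the preliminary intra mode selection recited in the claims above, the following minimal sketch generates horizontal, vertical and DC prediction blocks from neighbouring reference samples and ranks them by cost. These three predictors have the same basic form in H.264 and H.265, which is what makes a shared implementation plausible. The function names, the use of the sum of absolute differences as the second coding cost, and the omission of standard-specific refinements such as boundary smoothing are assumptions of this sketch and are not taken from the application.

```python
# Illustrative sketch only: names and the SAD cost metric are hypothetical.


def predict_horizontal(left, size):
    """Each row repeats the reconstructed sample to its left."""
    return [[left[y]] * size for y in range(size)]


def predict_vertical(top, size):
    """Each column repeats the reconstructed sample above it."""
    return [list(top[:size]) for _ in range(size)]


def predict_dc(top, left, size):
    """Every sample is the rounded mean of the top and left neighbours."""
    dc = (sum(top[:size]) + sum(left[:size]) + size) // (2 * size)
    return [[dc] * size for _ in range(size)]


def block_sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(
        abs(b - p)
        for row_b, row_p in zip(block, pred)
        for b, p in zip(row_b, row_p)
    )


def preselect_intra_mode(block, top, left):
    """Return (best_mode_name, costs) over the three common modes."""
    size = len(block)
    candidates = {
        "horizontal": predict_horizontal(left, size),
        "vertical": predict_vertical(top, size),
        "dc": predict_dc(top, left, size),
    }
    costs = {name: block_sad(block, pred) for name, pred in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs
```

In a fuller sketch, format-specific directional and planar candidates would be added to the same candidate table, so that only the candidate generators differ between the two encoding formats while the cost calculation is shared.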
PCT/CN2020/117220 2020-09-23 2020-09-23 Video coding apparatus and method, and computer storage medium and mobile platform WO2022061613A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/117220 WO2022061613A1 (en) 2020-09-23 2020-09-23 Video coding apparatus and method, and computer storage medium and mobile platform
CN202080013403.3A CN113454997A (en) 2020-09-23 2020-09-23 Video encoding apparatus, method, computer storage medium, and removable platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/117220 WO2022061613A1 (en) 2020-09-23 2020-09-23 Video coding apparatus and method, and computer storage medium and mobile platform

Publications (1)

Publication Number Publication Date
WO2022061613A1 true WO2022061613A1 (en) 2022-03-31

Family

ID=77808738

Family Applications (1)

Application Number Publication Number Priority Date Filing Date Title
PCT/CN2020/117220 WO2022061613A1 (en) 2020-09-23 2020-09-23 Video coding apparatus and method, and computer storage medium and mobile platform

Country Status (2)

Country Link
CN (1) CN113454997A (en)
WO (1) WO2022061613A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230100352A1 (en) * 2021-09-28 2023-03-30 Meta Platforms, Inc. Systems and methods for dynamic early termination of mode decision in hardware video encoders
CN116389763A (en) * 2023-06-05 2023-07-04 瀚博半导体(上海)有限公司 Video coding method and device based on multiple encoders
CN117440168A (en) * 2023-12-19 2024-01-23 福州时芯科技有限公司 Hardware architecture for realizing parallel spiral search algorithm

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117837142A (en) * 2021-10-27 2024-04-05 深圳市大疆创新科技有限公司 Video encoding method, apparatus and computer readable storage medium
CN116208775A (en) * 2023-03-03 2023-06-02 格兰菲智能科技有限公司 Motion estimation method, motion estimation device, computer equipment and hardware encoder

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060126726A1 (en) * 2004-12-10 2006-06-15 Lin Teng C Digital signal processing structure for decoding multiple video standards
US20070047651A1 (en) * 2005-08-24 2007-03-01 Samsung Electronics Co., Ltd. Video prediction apparatus and method for multi-format codec and video encoding/decoding apparatus and method using the video prediction apparatus and method
CN102547294A (en) * 2012-02-16 2012-07-04 复旦大学 Context-based adaptive binary arithmetic coding (CABAC) hardware decoder architecture applied to H.264 and high efficiency video coding (HEVC) video standards
CN103997650A (en) * 2014-05-30 2014-08-20 华为技术有限公司 Video decoding method and video decoder
CN104683820A (en) * 2015-02-28 2015-06-03 华为技术有限公司 Loop filtering method and loop filter
CN105611299A (en) * 2015-12-25 2016-05-25 北京工业大学 Motion estimation method based on HEVC

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103037218B (en) * 2012-10-22 2015-05-13 北京航空航天大学 Multi-view stereoscopic video compression and decompression method based on fractal and H.264
CN106658024B (en) * 2016-10-20 2019-07-16 杭州当虹科技股份有限公司 A kind of quick method for video coding

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230100352A1 (en) * 2021-09-28 2023-03-30 Meta Platforms, Inc. Systems and methods for dynamic early termination of mode decision in hardware video encoders
US11683484B2 (en) * 2021-09-28 2023-06-20 Meta Platforms, Inc. Systems and methods for dynamic early termination of mode decision in hardware video encoders
CN116389763A (en) * 2023-06-05 2023-07-04 瀚博半导体(上海)有限公司 Video coding method and device based on multiple encoders
CN116389763B (en) * 2023-06-05 2023-08-11 瀚博半导体(上海)有限公司 Video coding method and device based on multiple encoders
CN117440168A (en) * 2023-12-19 2024-01-23 福州时芯科技有限公司 Hardware architecture for realizing parallel spiral search algorithm
CN117440168B (en) * 2023-12-19 2024-03-08 福州时芯科技有限公司 Hardware architecture for realizing parallel spiral search algorithm

Also Published As

Publication number Publication date
CN113454997A (en) 2021-09-28

Similar Documents

Publication Publication Date Title
KR102412271B1 (en) Video encoding/decoding method and apparatus using prediction based on in-loop filtering
WO2022061613A1 (en) Video coding apparatus and method, and computer storage medium and mobile platform
JP5922244B2 (en) Sample adaptive offset merged with adaptive loop filter in video coding
KR101647376B1 (en) A method and an apparatus for processing a video signal
KR101752989B1 (en) Mode decision simplification for intra prediction
US9838718B2 (en) Secondary boundary filtering for video coding
KR102062568B1 (en) Enhanced intra-prediction coding using planar representations
JP5944044B2 (en) Improvement of intra prediction in lossless coding of HEVC
KR101538704B1 (en) Method and apparatus for coding and decoding using adaptive interpolation filters
KR101641808B1 (en) Largest coding unit (lcu) or partition-based syntax for adaptive loop filter and sample adaptive offset in video coding
US20120170649A1 (en) Video coding using mapped transforms and scanning modes
JP2016201808A (en) Non-square transform units and prediction units in video coding
JP2017511620A (en) Innovations in block vector prediction and estimation of reconstructed sample values in overlapping areas
US20120106862A1 (en) Image processing device, method, and program
RU2760234C2 (en) Data encoding and decoding
US20120147960A1 (en) Image Processing Apparatus and Method
KR20130112374A (en) Video coding method for fast intra prediction and apparatus thereof
KR20170072637A (en) Video Coding/Encoding Method and Apparatus thereof
US11962803B2 (en) Method and device for intra-prediction
US20220417511A1 (en) Methods and systems for performing combined inter and intra prediction
JP4360093B2 (en) Image processing apparatus and encoding apparatus and methods thereof
KR101700410B1 (en) Method and apparatus for image interpolation having quarter pixel accuracy using intra prediction modes
JP6402520B2 (en) Encoding apparatus, method, program, and apparatus
WO2023223705A1 (en) Video coding device, video coding method, and video system
US20230199196A1 (en) Methods and Apparatuses of Frequency Domain Mode Decision in Video Encoding Systems

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20954466

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20954466

Country of ref document: EP

Kind code of ref document: A1