WO2017101350A1 - Variable resolution coding mode prediction method and apparatus - Google Patents

Variable resolution coding mode prediction method and apparatus

Info

Publication number
WO2017101350A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
code stream
block
candidate reference
resolution
Prior art date
Application number
PCT/CN2016/088715
Other languages
English (en)
French (fr)
Inventor
白茂生
Original Assignee
乐视控股(北京)有限公司
乐视云计算有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 乐视控股(北京)有限公司, 乐视云计算有限公司
Priority to US15/246,684 (US20170180745A1)
Publication of WO2017101350A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding, characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding, characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154 Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding, characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding, characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding, characterised by the coding unit, the unit being an image region, e.g. an object, the region being a picture, frame or field
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding, characterised by the coding unit, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H04N19/184 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding, characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
    • H04N19/40 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • Embodiments of the present invention relate to the field of video technologies, and in particular, to a variable resolution coding mode prediction method and apparatus.
  • A 4K TV is a television whose screen has 4K resolution.
  • 4K resolution is an emerging standard for the resolution of digital cinema and digital content. Its name comes from its horizontal resolution of about 4,000 pixels; the exact figure varies slightly by application. 4K-class resolution provides more than 8.8 million pixels (display quality approaching 10 million pixels), achieves cinema-grade quality, and offers at least four times the fineness of 1080p, currently the top mainstream resolution.
  • However, the cost of ultra-high definition is also very high.
  • The data volume of each frame reaches 50 MB, so a top-configuration machine is required for both decoding and editing.
  • Video is usually transcoded into several code streams of different quality grades so that it plays smoothly under different bandwidths, but real-time transcoding consumes a huge amount of transcoder resources.
  • Embodiments of the present invention provide a variable resolution coding mode prediction method and apparatus that address the prior-art defect of huge resource consumption in real-time transcoding, and that achieve high-quality variable resolution real-time transcoding while effectively reducing coding complexity.
  • An embodiment of the present invention provides a variable resolution encoding mode prediction method, including:
  • Decoding a current input code stream and acquiring code stream information in a decoding process, where the code stream information includes a frame type of the currently decoded frame and macroblock coding information;
  • An embodiment of the present invention provides a variable resolution coding mode prediction apparatus, including:
  • An information acquiring module configured to decode the current input code stream, and obtain code stream information in a decoding process, where the code stream information includes a frame type of the currently decoded frame and macroblock coding information;
  • A transcoding module configured to predict, according to the code stream information, a frame type of the transcoded frame corresponding to the input code stream, and to predict the coding information of the transcoded frame according to a mapping relationship between the resolution of the input code stream and the transcoding target resolution.
  • The present application also discloses a video denoising device, including a memory and a processor, wherein:
  • The memory is configured to store one or more instructions, where the one or more instructions are for execution by the processor;
  • The processor is configured to decode the current input code stream and obtain code stream information in the decoding process, where the code stream information includes a frame type of the currently decoded frame and macroblock coding information;
  • The variable resolution coding mode prediction method and apparatus provided by the embodiments of the present invention predict the encoding mode of the frame to be encoded, which saves encoding time to some extent. At the same time, the embodiments apply only a simple re-optimization to the predicted mode, maintaining the same video quality as the full encoding mode.
  • FIG. 1 is a technical flowchart of Embodiment 1 of the present invention.
  • FIG. 2 is a technical flowchart of Embodiment 2 of the present invention.
  • FIG. 3 is a technical flowchart of Embodiment 3 of the present invention.
  • FIG. 5 is a schematic diagram of a motion direction of a candidate reference block according to Embodiment 3 of the present invention.
  • FIG. 6 is a schematic structural diagram of a device according to Embodiment 4 of the present invention.
  • FIG. 7 is a schematic structural diagram of a device according to Embodiment 5 of the present invention.
  • The main idea of the invention is to dynamically detect the noise intensity of the video and dynamically perform video denoising according to that intensity; through two layers of spatial denoising, the denoising function is completed while retaining as much of the low-frequency image data in each frame of the video as possible.
  • A computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • The memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer readable media includes both permanent and non-persistent, removable and non-removable media.
  • Information storage can be implemented by any method or technology.
  • The information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic tape cartridges, magnetic tape storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible to a computing device.
  • Computer readable media does not include transitory computer readable media, such as modulated data signals and carrier waves.
  • If a first device is coupled to a second device, the first device can be directly electrically coupled to the second device, or indirectly electrically coupled to the second device through other devices or coupling means.
  • The description is intended to illustrate the preferred embodiments of the invention. The scope of protection of the application is subject to the definition of the appended claims.
  • The embodiment of the invention is applied to a variable resolution 4K real-time transcoding system.
  • In prior-art transcoding, the decoded macroblock is directly encoded according to the target transcoding resolution.
  • By contrast, the core of the technical solution in the embodiment of the present invention is that, after the input original code stream is decoded, the code stream information of the input code stream is first obtained.
  • The coding information of the output streams of different resolutions is then predicted according to this code stream information, thereby implementing fast and efficient coding.
  • A variable resolution coding mode prediction method mainly includes two major steps:
  • Step 110: Decode a current input code stream, and obtain code stream information in the decoding process, where the code stream information includes a frame type of the currently decoded frame and macroblock coding information.
  • The input 4K code stream is first decoded, and the decoded video frames are then encoded.
  • The core of the embodiment of the present invention is that, before the decoded frame is encoded, the original coding information of the input code stream is acquired and inherited, so that coding information can be predicted for subsequent high-quality coding.
  • Encoding uses H.264 video coding by default.
  • The frame types of the input code stream include the intra-predictive coded frame (I_FRAME), the forward predictive coded frame (P_FRAME), and the bidirectional predictive coded frame (B_FRAME).
  • A frame is a still picture, and consecutive frames form a moving picture, such as a TV image.
  • The I frame is an intra-predictive coded frame, which uses intra-frame compression.
  • An I frame can be decoded using only its own frame data (because it depends only on the macroblock coding information of adjacent positions).
  • The P frame is a forward predictive coded frame and uses inter-frame coding.
  • The P frame represents the difference between this frame and the previous reference frame; the residual data plus the prediction data obtained by forward motion compensation reconstructs the current P frame.
  • The B frame is a bidirectional difference frame, that is, it records the difference between the current frame and both the preceding and following reference frames. Decoding requires both the forward reference frame and the backward reference frame; the residual data plus the prediction data obtained by forward and backward motion compensation reconstructs the current B frame.
  • The macroblock coding information includes the encoding mode, reference frame, and motion vector of each macroblock in the original input code stream, so that subsequent encoding can combine this coding information with the mapping between the input resolution and the target transcoding resolution to perform efficient coding prediction.
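  • As a minimal sketch of the code stream information gathered in Step 110, the Python data classes below hold a frame type plus per-macroblock coding information (mode, reference frames, motion vector). All class and field names are hypothetical; the source states only which pieces of information are recorded, not how they are stored.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class FrameType(Enum):
    I_FRAME = "I"  # intra-predictive coded frame
    P_FRAME = "P"  # forward predictive coded frame
    B_FRAME = "B"  # bidirectional predictive coded frame


@dataclass
class MacroblockInfo:
    """Hypothetical record of one decoded macroblock's coding information."""
    mode: str                      # original encoding mode, e.g. "B_16x16"
    ref_frames: Tuple[int, ...]    # indices of reference frames (empty for intra)
    mv: Tuple[int, int] = (0, 0)   # motion vector (x, y)


@dataclass
class CodeStreamInfo:
    """Hypothetical container for what the decoder reports per frame."""
    frame_type: FrameType
    macroblocks: List[MacroblockInfo] = field(default_factory=list)


# Example: one B frame with a single inter-coded macroblock.
info = CodeStreamInfo(FrameType.B_FRAME)
info.macroblocks.append(MacroblockInfo(mode="B_16x16", ref_frames=(0,), mv=(4, -2)))
```

  • Keeping the decoder's output in one structure like this lets the encoder side consult it per macroblock without re-parsing the input stream.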
  • Step 120: Predict, according to the code stream information, a frame type of the transcoded frame corresponding to the input code stream, and predict the coding information of the transcoded frame according to a mapping relationship between the resolution of the input code stream and the transcoding target resolution.
  • The target resolution in the embodiment of the present invention may be 1080P, 720P, etc.; the prediction method is the same for each.
  • The candidate reference block corresponding to the current coded macroblock is selected in the input code stream, and the encoding mode of the current coded macroblock is predicted according to the original coding mode of the candidate reference block.
  • When an intra macroblock of an intra-predictive coded frame is encoded, each candidate reference block is first traversed to determine, according to its original partition mode, whether it is a detail block; the detail blocks are then counted, and the encoding mode of the current coded macroblock is predicted according to that number.
  • When the current coded frame is a bidirectionally predictive coded frame, each candidate reference block is traversed to determine whether it is an inter prediction block or an intra prediction block. If it is an intra prediction block, it is determined whether it is a detail block, and the detail blocks are counted; if it is an inter prediction block, the inter prediction blocks are counted. The encoding mode of the current coded macroblock is then predicted according to the number of detail blocks and the number of intra prediction blocks.
  • Predicting the coding mode of the frame to be encoded saves coding time to a certain extent, improves coding efficiency, and reduces the technical cost of transcoding, while guaranteeing the same video quality as the full encoding mode.
  • FIG. 2 is a technical flowchart of Embodiment 2 of the present invention.
  • Embodiment 2 is an implementation of intra-frame coding information prediction in the embodiment of the present invention, which mainly includes the following steps:
  • Step 210: Select, according to a mapping relationship between a resolution of the input code stream and the transcoding target resolution, a candidate reference block corresponding to a current coded macroblock in the input code stream.
  • The physical resolution of a 4K TV reaches 3840*2160, which is 4 times that of Full HD (FHD, 1920*1080) and 9 times that of HD (1280*720).
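  • The pixel-count ratios quoted above can be checked with a few lines of arithmetic. This is only a verification sketch; the variable names are illustrative.

```python
# Pixels per frame at each resolution mentioned in the text.
uhd = 3840 * 2160  # 4K (2160P)
fhd = 1920 * 1080  # Full HD (1080P)
hd = 1280 * 720    # HD (720P)

# Integer ratios of 4K to the two target resolutions.
ratio_fhd = uhd // fhd
ratio_hd = uhd // hd
print(uhd, ratio_fhd, ratio_hd)
```

  • Both divisions are exact, which is why a 4K block grid maps cleanly onto the 1080P and 720P grids by a factor of 2 and 3 per axis respectively.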
  • When a 4K code stream is transcoded from 2160P to 1080P or 720P, the reference blocks in 2160P that correspond to the current coded macroblock are of great reference value.
  • The 4K to 1080P resolution mapping is 1:2, that is, the current 1080P block (0,0) corresponds to the 4K blocks (0,0), (0,1), (1,0), and (1,1). The prediction mode of the current coded macroblock is therefore selected from these four candidate reference blocks.
  • The four candidate reference blocks are obtained by rounding according to the corresponding resolution mapping relationship.
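  • The selection in Step 210 can be sketched as follows. The function name and the exact rounding rule are assumptions: the source gives only the 1080P example and says the candidates are obtained by rounding, so this sketch takes the rounded, scaled block index as the first candidate and its three neighbours as the rest.

```python
from typing import List, Tuple


def candidate_blocks(bx: int, by: int,
                     source_w: int, source_h: int,
                     dest_w: int, dest_h: int) -> List[Tuple[int, int]]:
    """Return four candidate block indices in the source (input) block grid.

    (bx, by) is the block index in the target grid. Assumption: the first
    candidate is the rounded, scaled index; the other three are the adjacent
    blocks that complete a 2x2 group.
    """
    x0 = round(bx * source_w / dest_w)
    y0 = round(by * source_h / dest_h)
    return [(x0, y0), (x0, y0 + 1), (x0 + 1, y0), (x0 + 1, y0 + 1)]


# 4K -> 1080P, target block (0, 0): the example given in the text.
print(candidate_blocks(0, 0, 3840, 2160, 1920, 1080))
```

  • A production transcoder would also clamp the indices to the source grid; that is omitted here to keep the mapping itself visible.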
  • Step 220: Traverse each of the candidate reference blocks, and determine, according to the original partition mode of the candidate reference block, whether the candidate reference block is a detail block.
  • The candidate reference block is marked as a detail block.
  • Step 230: Count the number of the detail blocks and predict an encoding mode of the currently coded macroblock according to the number.
  • The predictive coding mode of the current coded macroblock is marked as I_16x16;
  • The predictive coding mode of the current coded macroblock is marked as I_4x4;
  • The predictive coding mode of the current coded macroblock is marked as I_8x8.
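  • A hedged sketch of Step 230 follows. The text names the three possible outcomes (I_16x16, I_4x4, I_8x8) but does not give the counts that trigger each one, so the thresholds `low` and `high` are hypothetical parameters that the caller must supply. The ordering (few detail blocks gives a large partition, many gives a small one) is also an assumption.

```python
def predict_intra_mode(num_detail_blocks: int, low: int, high: int) -> str:
    """Map the number of detail blocks to a predicted intra coding mode.

    low, high: hypothetical thresholds, not taken from the source.
    Assumed ordering: few detail blocks -> I_16x16, many -> I_4x4,
    anything in between -> I_8x8.
    """
    if low >= high:
        raise ValueError("low must be smaller than high")
    if num_detail_blocks <= low:
        return "I_16x16"
    if num_detail_blocks >= high:
        return "I_4x4"
    return "I_8x8"


# Example with placeholder thresholds over four candidate blocks.
print(predict_intra_mode(2, low=0, high=4))
```

  • Passing the thresholds explicitly keeps the placeholder values out of the function itself, so they can be tuned without changing the decision logic.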
  • In this embodiment, the coding information of the source code stream is reused to predict the coding information of the transcoded stream, which makes reasonable use of the source coding information and improves the efficiency of transcoding. At the same time, candidate reference blocks are selected for the current coded macroblock according to the mapping relationship between the input code stream and the output code stream, and each candidate reference block is checked for being a detail block. This largely preserves image detail after transcoding, improves transcoding quality, and gives the user a better visual experience.
  • FIG. 3 is a technical flowchart of Embodiment 3 of the present invention, and Embodiment 3 illustrates an implementation manner of coding information prediction of a bidirectional predictive coding frame in the embodiment of the present invention.
  • FIG. 4 is a further detailed diagram of FIG. 3, and in conjunction with FIG. 3 and FIG. 4, the third embodiment of the present invention mainly includes the following steps:
  • Step 310: Select, according to a mapping relationship between a resolution of the input code stream and the transcoding target resolution, a candidate reference block corresponding to a current coded macroblock in the input code stream.
  • This step is executed in the same way as step 210.
  • When an input code stream of 2160P resolution is transcoded to a 1080P output code stream, four candidate reference blocks are selected for the current coded macroblock; similarly, when a 2160P input code stream is transcoded to a 720P output code stream, four candidate reference blocks are selected for the current coded macroblock.
  • The following sections describe the embodiment of the present invention with four candidate reference blocks.
  • Step 320: Traverse each of the candidate reference blocks and determine whether the candidate reference block is an inter prediction block or an intra prediction block; if it is an intra prediction block, perform step 330; if it is an inter prediction block, perform step 340.
  • Step 330: Determine whether the intra prediction block is a detail block and count the number of the detail blocks.
  • For each intra prediction block, the parameter i_intra is incremented (i_intra++); after all candidate reference blocks have been traversed, the number of intra prediction blocks is given by the value of i_intra.
  • Step 340: Calculate an average MV value of the candidate reference blocks, determine whether the inter prediction block is a detail block, and predict a reference frame of the inter prediction block.
  • A moving image can be divided into several blocks or macroblocks, and a search is made for the position of each block or macroblock in the adjacent frame image.
  • The relative offset in spatial position between the two is then obtained; this relative offset is what is usually called the motion vector.
  • The process of obtaining the motion vector is called motion estimation.
  • The motion vector and the prediction error obtained after motion matching are sent together to the decoding end. There, the corresponding block or macroblock is found in the decoded adjacent reference frame image at the position indicated by the motion vector, and the prediction error is added to obtain the block or macroblock at its position in the current frame.
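  • To make the idea of motion estimation concrete, here is a minimal full-search block-matching sketch using the sum of absolute differences (SAD) as the matching cost. This is a textbook technique chosen for illustration, not the search used by any particular encoder or by the patent; the function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # frame[y][x] = luma sample


def sad(cur: Frame, ref: Frame, cx: int, cy: int, rx: int, ry: int, n: int) -> int:
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def motion_search(cur: Frame, ref: Frame, cx: int, cy: int,
                  n: int, search: int) -> Tuple[int, int]:
    """Return the motion vector (dx, dy) that minimises SAD within +/- search."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, cx, cy, cx, cy, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if 0 <= rx <= w - n and 0 <= ry <= h - n:
                cost = sad(cur, ref, cx, cy, rx, ry, n)
                if cost < best_cost:
                    best_mv, best_cost = (dx, dy), cost
    return best_mv
```

  • The returned offset is the motion vector; the remaining SAD corresponds to the prediction error that would be transmitted alongside it.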
  • MV stands for motion vector.
  • First, the direction of the MV of the candidate reference block is determined.
  • The values 0 to 8 denote the directions of the nine reference MVs.
  • The MV direction is 0 for MV (0, 0) and 8 for MV (-1, 1).
  • The direction of the current candidate reference block is recorded as mb_candinate[i]->direction (i is the sequence number of the candidate reference block; for 1080P, i ranges from 0 to 3).
  • The MV values are accumulated and the average MV is calculated for the prediction of subsequent MVs. After the average MV is obtained, the original partition mode of the candidate reference block is examined. If the partition block size is less than or equal to 8×8, the candidate reference block is marked as a detail block.
  • Next, it is determined whether the coding mode is B_SKIP or B_DIRECT; if so, the block is marked as a non-detail block and the parameter i_fast_block is incremented (i_fast_block++).
  • The forward reference frame and the backward reference frame used by each candidate reference block are used to predict whether the current coded macroblock uses a forward reference frame or a backward reference frame.
  • Forward reference frames are counted in the parameter i_ref0 and backward reference frames in the parameter i_ref1. If the number of forward reference frames of the candidate reference block is greater than 1, i_ref0 is incremented (i_ref0++); if the number of backward reference frames of the candidate reference block is greater than 1, i_ref1 is incremented (i_ref1++).
  • After all four candidate reference blocks have been traversed, the current coded macroblock is predicted to use the forward reference frame or the backward reference frame according to the relative sizes of i_ref0 and i_ref1.
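  • Steps 320 to 340 amount to one pass over the candidate reference blocks that fills a handful of counters. The sketch below follows the counter names used in the text (i_intra, i_fast_block, i_ref0, i_ref1); the `Candidate` record, the `i_detail` counter name, the floating-point average, and the tie-break toward the forward reference frame are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    """Hypothetical record of one candidate reference block."""
    is_intra: bool
    mode: str                      # original mode, e.g. "B_SKIP", "B_8x8", "I_4x4"
    partition: Tuple[int, int]     # (width, height) of the original partition
    mv: Tuple[int, int] = (0, 0)   # motion vector (x, y); unused for intra blocks
    n_fwd_refs: int = 0            # number of forward reference frames
    n_bwd_refs: int = 0            # number of backward reference frames


def is_detail(c: Candidate) -> bool:
    # Detail block: original partition no larger than 8x8.
    return c.partition[0] <= 8 and c.partition[1] <= 8


def collect_stats(cands: List[Candidate]) -> Dict[str, object]:
    i_intra = i_detail = i_fast_block = i_ref0 = i_ref1 = 0
    sum_x = sum_y = n_inter = 0
    for c in cands:
        if c.is_intra:                        # Step 330
            i_intra += 1
            i_detail += int(is_detail(c))
            continue
        n_inter += 1                          # Step 340
        sum_x += c.mv[0]
        sum_y += c.mv[1]
        if c.mode in ("B_SKIP", "B_DIRECT"):
            i_fast_block += 1                 # counted as a non-detail block
        elif is_detail(c):
            i_detail += 1
        if c.n_fwd_refs > 1:                  # threshold as stated in the text
            i_ref0 += 1
        if c.n_bwd_refs > 1:
            i_ref1 += 1
    avg_mv = (sum_x / n_inter, sum_y / n_inter) if n_inter else (0.0, 0.0)
    return {
        "i_intra": i_intra, "i_detail": i_detail, "i_fast_block": i_fast_block,
        "i_ref0": i_ref0, "i_ref1": i_ref1, "avg_mv": avg_mv,
        # Assumption: ties favour the forward reference frame.
        "use_forward_ref": i_ref0 >= i_ref1,
    }
```

  • Collecting everything in a single traversal keeps the per-macroblock cost proportional to the number of candidates, which is four in the cases described here.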
  • Step 350: Predict the coding mode of the currently coded macroblock and predict the corresponding MV.
  • In the conditions referred to below, mb_candinate[i]->direction is the direction of the current candidate reference block, i is the sequence number of the candidate reference block (0 to 3), and && denotes logical AND.
  • Determination A: if the number of intra prediction blocks is greater than 2, the current coded macroblock is coded as an intra prediction block, and the coding information prediction described in Embodiment 2 is performed according to the number of detail blocks obtained;
  • Determination B: if the number of non-detail blocks is greater than 2, the coding mode of the current coded macroblock is predicted to be B_DIRECT;
  • Determination C: if the MV of the current candidate reference block satisfies Condition 1, the coding mode of the current coded macroblock is predicted to be B_16×16;
  • Determination D: if the MV of the current candidate reference block satisfies Condition 2, the coding mode of the current coded macroblock is predicted to be B_16×8;
  • Determination E: if the MV of the current candidate reference block satisfies Condition 3, the coding mode of the current coded macroblock is predicted to be B_8×16;
  • Determination F: if the current candidate reference block satisfies none of determinations A to E, the coding mode of the current coded macroblock is predicted to be B_8×8.
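  • Once the possible coding modes of the current coded macroblock have been determined, the reference MV corresponding to each mode is calculated.
  • For the B_16×16 coding mode, the motion vector MV is calculated with Equation 1:
  • Mv[x] = ((mvc[0].x + mvc[1].x + mvc[2].x + mvc[3].x) >> 2) / scale_x
  • Mv[y] = ((mvc[0].y + mvc[1].y + mvc[2].y + mvc[3].y) >> 2) / scale_y
  • scale_x = round(source_x / dest_x);
  • scale_y = round(source_y / dest_y);
  • In Equation 1, Mv[x] is the motion vector in the x direction and Mv[y] is the motion vector in the y direction;
  • mvc[0] to mvc[3] are the MVs of the four candidate reference blocks; mvc[0].x to mvc[3].x are their x-direction MVs and mvc[0].y to mvc[3].y are their y-direction MVs;
  • (mvc[0].x + mvc[1].x + mvc[2].x + mvc[3].x) >> 2 and (mvc[0].y + mvc[1].y + mvc[2].y + mvc[3].y) >> 2 are the x and y components of the average MV calculated in step 340;
  • source_x and source_y are the x- and y-direction resolutions of the input code stream;
  • dest_x and dest_y are the target x- and y-direction resolutions; scale_x and scale_y are intermediate parameters in the x and y directions used in the subsequent calculation; the round() function returns a value rounded to the specified number of digits; >> is the right-shift operator.
  • For the B_16×8 coding mode, the motion vector MV is calculated with Equation 2:
  • Mv[0][x] = ((mvc[0].x + mvc[1].x) >> 1) / scale_x
  • Mv[0][y] = ((mvc[0].y + mvc[1].y) >> 1) / scale_y
  • Mv[1][x] = ((mvc[2].x + mvc[3].x) >> 1) / scale_x
  • Mv[1][y] = ((mvc[2].y + mvc[3].y) >> 1) / scale_y
  • For the B_8×16 coding mode, the motion vector MV is calculated with Equation 3:
  • Mv[0][x] = ((mvc[2].x + mvc[0].x) >> 1) / scale_x
  • Mv[0][y] = ((mvc[2].y + mvc[0].y) >> 1) / scale_y
  • Mv[1][x] = ((mvc[1].x + mvc[3].x) >> 1) / scale_x
  • Mv[1][y] = ((mvc[1].y + mvc[3].y) >> 1) / scale_y
  • A 16x16 macroblock consists of two 16x8 blocks; Mv[0] and Mv[1] are the motion vectors of the two 16x8 blocks; Mv[0][x] is the x-direction MV of Mv[0] and Mv[0][y] is the y-direction MV of Mv[0].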
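  • In the embodiments of the present invention, P frames have no backward prediction blocks; their prediction mode is otherwise similar to that of B frames and is not described again here.
  • In this embodiment, the coding mode to be used for encoding is predicted by reusing the coding information of the source code stream, which saves coding time to a certain extent; at the same time, this embodiment applies a simple re-optimization to the predicted mode, which ensures the same video quality as the full coding mode.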
  • FIG. 6 is a schematic structural diagram of an apparatus according to Embodiment 4 of the present invention.
  • A variable-resolution coding mode prediction apparatus includes the following modules: an information acquisition module 610 and a transcoding module 620.
  • The information acquisition module 610 is configured to decode the current input code stream and obtain code stream information during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
  • The transcoding module 620 is configured to predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and to predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
  • The transcoding module 620 is further configured to: when H264 is used as the video coding format, use the frame type corresponding to the input code stream as the frame type of the transcoded frame, where the frame type includes an intra-prediction coded frame, a forward-prediction coded frame, and a bidirectional-prediction coded frame.
  • The transcoding module 620 is further configured to: select, according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, the candidate reference blocks in the input code stream that correspond to the current coded macroblock, and predict the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks.
  • The transcoding module 620 is further configured to: when encoding an intra macroblock of the intra-prediction coded frame, traverse each candidate reference block and determine, according to the original partition mode of the candidate reference block, whether it is a detail block; count the detail blocks and predict the coding mode of the current coded macroblock according to that count.
  • The transcoding module 620 is further configured to: when encoding the bidirectional-prediction coded frame, traverse each candidate reference block and determine whether it is an inter prediction block or an intra prediction block; if it is an intra prediction block, determine whether the intra prediction block is a detail block and count the detail blocks; if it is an inter prediction block, count the inter prediction blocks; and predict the coding mode of the current coded macroblock according to the number of detail blocks and the number of intra prediction blocks.
  • The apparatus of FIG. 6 carries out the embodiments shown in FIG. 1 to FIG. 5; its execution steps and technical effects are as described for those embodiments and are not repeated here.
  • The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
  • FIG. 7 is a schematic structural diagram of a device according to Embodiment 5 of the present invention.
  • An embodiment of the present invention provides a variable-resolution coding mode prediction device, including a memory 701 and a processor 702.
  • The memory 701 is configured to store one or more instructions, where the one or more instructions are to be invoked and executed by the processor;
  • The processor 702 is configured to decode the current input code stream and obtain code stream information during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information; and to predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
  • When predicting the frame type of the transcoded frame according to the code stream information, the processor 702 is further configured to: when H264 is used as the video coding format, use the frame type corresponding to the input code stream as the frame type of the transcoded frame, where the frame type includes an intra-prediction coded frame, a forward-prediction coded frame, and a bidirectional-prediction coded frame.
  • When predicting the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, the processor 702 is further configured to: select, according to that mapping relationship, the candidate reference blocks in the input code stream that correspond to the current coded macroblock, and predict the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks.
  • The processor 702 is further configured to: when encoding an intra macroblock of the intra-prediction coded frame, traverse each candidate reference block and determine, according to the original partition mode of the candidate reference block, whether it is a detail block; count the detail blocks and predict the coding mode of the current coded macroblock according to that count.
  • The processor 702 is further configured to: when encoding the bidirectional-prediction coded frame, traverse each candidate reference block and determine whether it is an inter prediction block or an intra prediction block; if it is an intra prediction block, determine whether the intra prediction block is a detail block and count the detail blocks; if it is an inter prediction block, count the inter prediction blocks; and predict the coding mode of the current coded macroblock according to the number of detail blocks and the number of intra prediction blocks.

Abstract

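A variable-resolution coding mode prediction method and apparatus. A current input code stream is decoded, and code stream information is obtained during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information; the frame type of the transcoded frame corresponding to the input code stream is predicted according to the code stream information, and the coding information of the transcoded frame is predicted according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution. Transcoding time is saved while transcoding quality is ensured.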

Description

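Variable-Resolution Coding Mode Prediction Method and Apparatus
Cross-Reference
This application refers to Chinese Patent Application No. 2015109593384, filed on December 18, 2015 and entitled "Variable-Resolution Coding Mode Prediction Method and Apparatus", which is incorporated into this application by reference in its entirety.
Technical Field
Embodiments of the present invention relate to the field of video technology, and in particular to a variable-resolution coding mode prediction method and apparatus.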
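Background
With the spread of 4K televisions and the growth of home bandwidth, demand for high-quality live video keeps increasing. A 4K television is a television whose screen uses 4K resolution. 4K resolution is an emerging resolution standard for digital cinema and digital content, named for its horizontal resolution of about 4000 pixels; the exact figure differs slightly between application fields. 4K resolution provides more than 8.8 million pixels, giving display quality of nearly ten million pixels and cinema-grade picture quality. That is more than four times the current top-tier 1080p resolution, with display fineness more than 4 times that of 1080p.
Ultra-high definition comes at a considerable cost, however: in 4K display the data volume of each frame reaches 50 MB, so decoding, playback, and editing all require top-end machines. To serve live viewers on different bandwidths, the prior art usually transcodes the video into several code streams of different quality and grade so that playback stays smooth at each bandwidth. Real-time transcoding, however, consumes enormous transcoder resources.
A high-quality variable-resolution real-time video transcoding method that effectively reduces coding complexity is therefore urgently needed.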
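Summary
Embodiments of the present invention provide a variable-resolution coding mode prediction method and apparatus, to overcome the prior-art defect that real-time transcoding consumes enormous transcoder resources, and to achieve high-quality variable-resolution real-time transcoding while effectively reducing coding complexity.
An embodiment of the present invention provides a variable-resolution coding mode prediction method, including:
decoding a current input code stream, and obtaining code stream information during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
predicting, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and predicting the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
An embodiment of the present invention provides a variable-resolution coding mode prediction apparatus, including:
an information acquisition module, configured to decode a current input code stream and obtain code stream information during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
a transcoding module, configured to predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and to predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
This application further discloses a video denoising device, including a memory and a processor, wherein:
the memory is configured to store one or more instructions, where the one or more instructions are to be invoked and executed by the processor;
the processor is configured to decode a current input code stream and obtain code stream information during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
and to predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
The variable-resolution coding mode prediction method and apparatus provided by the embodiments of the present invention save coding time to a certain extent by predicting the coding mode to be used for encoding; at the same time, the embodiments apply a simple re-optimization to the predicted mode, which maintains the same video quality as the full coding mode.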
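Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a technical flowchart of Embodiment 1 of the present invention;
FIG. 2 is a technical flowchart of Embodiment 2 of the present invention;
FIG. 3 is a technical flowchart of Embodiment 3 of the present invention;
FIG. 4 is a further technical flowchart of Embodiment 3 of the present invention;
FIG. 5 is a schematic diagram of the motion vector directions of candidate reference blocks in Embodiment 3 of the present invention;
FIG. 6 is a schematic structural diagram of an apparatus according to Embodiment 4 of the present invention;
FIG. 7 is a schematic structural diagram of a device according to Embodiment 5 of the present invention.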
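Detailed Description
The main idea of the present invention is to denoise video dynamically by automatically detecting the noise intensity of the video and acting on that intensity; through two layers of spatial-domain denoising, the low-frequency image data in each video frame is preserved as far as possible while denoising is accomplished.
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and specific embodiments. In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-persistent storage in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Certain terms are used in the specification and claims to refer to particular components. Those skilled in the art will understand that hardware manufacturers may use different names for the same component. The specification and claims distinguish components by differences in function, not by differences in name. "Comprising", as used throughout the specification and claims, is an open-ended term and should be interpreted as "including but not limited to". "Approximately" means within an acceptable error range, within which those skilled in the art can solve the stated technical problem and substantially achieve the stated technical effect. The term "coupled" here covers any direct or indirect means of electrical coupling. Thus, if the text says that a first apparatus is coupled to a second apparatus, the first apparatus may be electrically coupled to the second apparatus directly, or indirectly through other apparatuses or coupling means. The rest of the specification describes preferred embodiments of this application; the description is intended to illustrate the general principles of this application and not to limit its scope. The scope of protection of this application is defined by the appended claims.
It should also be noted that the terms "comprise", "include", and any variants of them are intended to cover non-exclusive inclusion, so that a product or system that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that product or system. Without further limitation, an element introduced by "comprising a ..." does not exclude the presence of additional identical elements in the product or system that includes it.
Embodiments of the present invention apply to a variable-resolution 4K real-time transcoding system. Whereas prior-art transcoding encodes the decoded macroblocks directly at the target transcoding resolution, the technical core of the embodiments is that, during transcoding, after the input original code stream is decoded, the code stream information of the input code stream is first obtained, and the coding information of the output code streams at different resolutions is predicted from that information, enabling fast and efficient encoding.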
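Embodiment 1
FIG. 1 is a technical flowchart of Embodiment 1 of the present invention. With reference to FIG. 1, the variable-resolution coding mode prediction method of this embodiment consists of two main steps:
Step 110: Decode the current input code stream and obtain code stream information during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
In operation, the transcoding system first decodes the input 4K code stream and then encodes the decoded video frames. The core of the embodiments is that, before a decoded frame is encoded, the original coding information of the input code stream is obtained and inherited, so that coding information can be predicted for subsequent high-quality encoding.
In the embodiments of the present invention, H264 video coding is used by default. The frame types of the input code stream include the intra-prediction coded frame (I_FRAME), the forward-prediction coded frame (P_FRAME), and the bidirectional-prediction coded frame (B_FRAME).
Data is transmitted over a network in small units called frames; a frame consists of several parts that perform different functions. A frame is a single still picture, and consecutive frames form a moving picture, such as a television image.
In actual compression, various algorithms are used to reduce the data volume, of which IPB is the most common. An I frame is an intra-prediction coded frame and uses intra-frame compression; decoding an I frame requires only the data of that frame (because it depends only on the coding information of macroblocks at neighboring positions).
A P frame is a forward-prediction coded frame and uses inter-frame coding. A P frame represents the difference between this frame and a previous reference frame; the current P frame is reconstructed from the residual data plus the prediction data obtained by forward motion compensation.
A B frame is a bidirectional difference frame: it records the difference between this frame and the preceding and following reference frames. Decoding requires both a forward and a backward reference frame, and the current B frame is reconstructed from the residual data plus the prediction data obtained by forward and backward motion compensation.
In the embodiments of the present invention, the macroblock coding information includes the coding mode, reference frame, and motion vector of each macroblock in the original input code stream. Subsequent encoding uses this information, together with the mapping relationship between the original resolution and the target transcoding resolution, to predict coding efficiently.
Step 120: Predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
The target resolution in the embodiments of the present invention may be 1080P, 720P, and so on; both are predicted in the same way. In actual coding mode prediction, the candidate reference blocks in the input code stream that correspond to the current coded macroblock are first selected according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, and the coding mode of the current coded macroblock is predicted according to the original coding mode of the candidate reference blocks.
If the current coded frame is an intra-prediction coded frame, then when an intra macroblock of that frame is encoded, each candidate reference block is traversed and, according to its original partition mode, it is determined whether the candidate reference block is a detail block; the detail blocks are counted and the coding mode of the current coded macroblock is predicted according to that count.
If the current coded frame is a bidirectional-prediction coded frame, then when that frame is encoded, each candidate reference block is traversed and it is determined whether it is an inter prediction block or an intra prediction block. If it is an intra prediction block, it is determined whether the intra prediction block is a detail block and the detail blocks are counted; if it is an inter prediction block, the inter prediction blocks are counted. The coding mode of the current coded macroblock is then predicted according to the number of detail blocks and the number of intra prediction blocks.
In this embodiment, the coding information of the source code stream is obtained during transcoding and used to predict the coding mode for encoding. This saves coding time to a certain extent, improves coding efficiency, and lowers the technical cost of transcoding, while ensuring the same video quality as the full coding mode.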
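The code stream information gathered in step 110 can be pictured as a small record per frame. The following is a minimal sketch only: the class and field names (`StreamInfo`, `MacroblockInfo`, `predict_frame_type`) are hypothetical and do not come from the source, which only states that the information comprises the frame type plus each macroblock's coding mode, reference frame, and motion vector, and that with H264 the input frame type is reused for the transcoded frame.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class FrameType(Enum):
    I_FRAME = "I"  # intra-prediction coded frame
    P_FRAME = "P"  # forward-prediction coded frame
    B_FRAME = "B"  # bidirectional-prediction coded frame


@dataclass
class MacroblockInfo:
    mode: str               # original coding mode, e.g. "I_4x4" or "B_16x16"
    ref_frames: List[int]   # indices of the reference frames used
    mv: Tuple[int, int]     # motion vector (x, y)


@dataclass
class StreamInfo:
    frame_type: FrameType
    macroblocks: List[MacroblockInfo]


def predict_frame_type(info: StreamInfo) -> FrameType:
    # With H264 as the coding format, the decoded frame type is reused as is.
    return info.frame_type
```

Keeping the per-macroblock data next to the frame type means the later mode prediction steps need no second pass over the input stream.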
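Embodiment 2
FIG. 2 is a technical flowchart of Embodiment 2 of the present invention. Embodiment 2 is one implementation of intra coding information prediction and mainly includes the following steps:
Step 210: Select, according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, the candidate reference blocks in the input code stream that correspond to the current coded macroblock;
The physical resolution of a 4K television reaches 3840*2160, four times that of Full HD (FHD, 1920*1080) and nine times that of HD (1280*720). For real-time transcoding, the same content encoded at different bit rates or resolutions has much in common, so the coding information of the source code stream can be reused. When a 4K code stream is transcoded from 2160P to 1080P and 720P, the reference blocks in 2160P that correspond to the current coded macroblock are therefore of great value.
Taking 1080P encoding as an example, the resolution mapping from 4K to 1080P is 1:2, that is, the block corresponding to the current 1080P block (0,0) consists of the 4K blocks (0,0), (0,1), (1,0), and (1,1). The prediction mode of the current coded macroblock is therefore chosen from these four candidate reference blocks. In the embodiments of the present invention, when transcoding to a lower resolution, if the resolution mapping is not an integer, four candidate reference blocks are selected by rounding according to the corresponding resolution mapping.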
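The block mapping of step 210 can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: the function name `candidate_blocks` is hypothetical, block coordinates are (row, column) macroblock indices, and the scaled index is truncated to find the top-left source block before a 2×2 group is taken. The source only says that four candidates are chosen by rounding when the mapping is not an integer. Frame-boundary clamping is left out.

```python
def candidate_blocks(dst_row, dst_col, src_res, dst_res):
    """Return four (row, col) source-block indices for one target block."""
    src_w, src_h = src_res
    dst_w, dst_h = dst_res
    # Resolution mapping in each direction, e.g. 3840 / 1920 = 2.0.
    scale_x = src_w / dst_w
    scale_y = src_h / dst_h
    # Assumption: truncate the scaled index to locate the top-left source block.
    top = int(dst_row * scale_y)
    left = int(dst_col * scale_x)
    return [(top, left), (top, left + 1), (top + 1, left), (top + 1, left + 1)]


# Pixel-count ratios quoted in the text: 4K is 4x FHD and 9x HD.
PIXELS_4K = 3840 * 2160
```

For the 4K to 1080P case this reproduces the example in the text: target block (0,0) maps to source blocks (0,0), (0,1), (1,0), and (1,1).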
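Step 220: Traverse each candidate reference block and determine, according to its original partition mode, whether the candidate reference block is a detail block;
If the partition mode of a candidate reference block is I_8x8 or I_4x4, that candidate reference block is marked as a detail block.
Step 230: Count the detail blocks and predict the coding mode of the current coded macroblock according to that count.
If the number of detail blocks is less than or equal to 1, the predicted coding mode of the current coded macroblock is marked as I_16x16;
If the number of detail blocks is greater than or equal to 2, the predicted coding mode of the current coded macroblock is marked as I_4x4;
If the number of detail blocks satisfies neither of the above two cases, the predicted coding mode of the current coded macroblock is marked as I_8x8.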
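Steps 220 and 230 amount to a count followed by a threshold test. The sketch below follows the thresholds exactly as stated; the function names are hypothetical. Note that with the thresholds as written (at most 1, at least 2) every integer count falls into one of the first two cases, so the I_8x8 branch is kept only to mirror the text.

```python
DETAIL_MODES = {"I_8x8", "I_4x4"}


def is_detail_block(partition_mode: str) -> bool:
    # A candidate partitioned as I_8x8 or I_4x4 is treated as a detail block.
    return partition_mode in DETAIL_MODES


def predict_intra_mode(candidate_modes) -> str:
    n_detail = sum(1 for m in candidate_modes if is_detail_block(m))
    if n_detail <= 1:
        return "I_16x16"
    if n_detail >= 2:
        return "I_4x4"
    # Fallback named in the text; not reachable for integer counts.
    return "I_8x8"
```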
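In this embodiment, the coding information of the transcoded stream is predicted by reusing the coding information of the source code stream, which makes good use of that information and improves transcoding efficiency. At the same time, candidate reference blocks are selected for the current coded macroblock according to the mapping relationship between the input and output code streams, and each is checked for being a detail block. This largely preserves image detail after transcoding, improves transcoding quality, and gives users a better visual experience.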
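Embodiment 3
FIG. 3 is a technical flowchart of Embodiment 3 of the present invention. Embodiment 3 illustrates one implementation of coding information prediction for bidirectional-prediction coded frames. FIG. 4 is a more detailed version of FIG. 3. With reference to FIG. 3 and FIG. 4, Embodiment 3 mainly includes the following steps:
Step 310: Select, according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, the candidate reference blocks in the input code stream that correspond to the current coded macroblock;
This step is performed in the same way as step 210. When an input code stream at 2160P is transcoded to a 1080P output code stream, four candidate reference blocks are selected for the current coded macroblock; likewise, when a 2160P input code stream is transcoded to a 720P output code stream, the four nearest candidate reference blocks are selected for the current coded macroblock. The description below uses four candidate reference blocks throughout.
Step 320: Traverse each candidate reference block and determine whether it is an inter prediction block or an intra prediction block; if it is an intra prediction block, perform step 330; if it is an inter prediction block, perform step 340.
Step 330: Determine whether the intra prediction block is a detail block and count the detail blocks;
If the block is an intra prediction block, the parameter i_intra is incremented (i_intra++); after all candidate reference blocks have been traversed, the value of i_intra gives the number of intra prediction blocks.
Step 340: Calculate the average MV of the candidate reference blocks, determine whether the inter prediction block is a detail block, and predict the reference frame of the inter prediction block;
P frames use a mixed mode of forward-reference-frame coding and intra coding. In inter-prediction coding, the scenes in neighboring frames of a moving picture are correlated to some degree. The moving picture can therefore be divided into blocks or macroblocks, and the position of each block or macroblock in a neighboring frame is searched for, giving the relative spatial offset between the two. This relative offset is what is usually called the motion vector, and the process of obtaining it is called motion estimation. The motion vector and the prediction error obtained after motion matching are sent together to the decoder. The decoder locates the corresponding block or macroblock in the already decoded neighboring reference frame at the position indicated by the motion vector, and adding the prediction error yields the block or macroblock at its position in the current frame.
Because the motion vectors of the macroblocks at the corresponding positions in the original input code stream are highly reusable, the embodiments of the present invention use the MV (Motion Vector) of the input code stream as the reference for subsequent motion estimation.
As shown in FIG. 5, taking 1080P output as an example, the direction of each candidate reference block's MV is determined. In the figure, 0 to 8 are the directions of the nine reference MVs; in 1080P, the MV direction is 0 for MV (0,0) and 8 for MV (-1,1). The direction of the current candidate reference block is recorded as mb_candinate[i]->direction (i is the index of the candidate reference block; in 1080P, i ranges from 0 to 3). After the MV of each candidate reference block is obtained, the MV values are accumulated and the average MV is calculated for use in subsequent MV prediction. After the average MV is obtained, the original partition mode of the candidate reference block is examined: if the partition block size is less than or equal to 8×8, the candidate reference block is marked as a detail block.
In this step it is also determined whether the current coded macroblock is B_SKIP or B_DIRECT; if so, the current coded macroblock is marked as a non-detail block and the parameter i_fast_block is incremented (i_fast_block++).
In the embodiments of the present invention, the forward and backward reference frames used by each candidate reference block are used to predict whether the current coded macroblock uses a forward reference frame or a backward reference frame. The forward reference frame is counted in the parameter i_ref0 and the backward reference frame in the parameter i_ref1. If the number of forward reference frames of a candidate reference block is greater than 1, i_ref0 is incremented (i_ref0++); if the number of backward reference frames of a candidate reference block is greater than 1, i_ref1 is incremented (i_ref1++). After all four candidate reference blocks have been traversed, the current coded macroblock is predicted to use the forward or the backward reference frame according to the sizes of the resulting i_ref0 and i_ref1.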
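The traversal in steps 320 to 340 can be sketched as a single loop that maintains the counters named in the text (i_intra, i_fast_block, i_ref0, i_ref1) and the accumulated MV. Several points are assumptions made only for illustration: the `Candidate` record and `traverse` function are hypothetical, the B_SKIP/B_DIRECT test is applied to each candidate block, the average is taken over inter blocks with floor division, and a tie between i_ref0 and i_ref1 is resolved in favor of the forward reference. The source does not specify any of these details.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    is_intra: bool
    mode: str               # original coding mode, e.g. "B_SKIP" or "B_16x16"
    mv: Tuple[int, int]     # (x, y); ignored for intra blocks
    n_fwd_refs: int = 0     # number of forward reference frames used
    n_bwd_refs: int = 0     # number of backward reference frames used


def traverse(cands: List[Candidate]) -> dict:
    i_intra = i_fast_block = i_ref0 = i_ref1 = 0
    sum_x = sum_y = n_inter = 0
    for c in cands:
        if c.is_intra:
            i_intra += 1            # counted in step 330
            continue
        n_inter += 1
        sum_x += c.mv[0]            # accumulate MVs for the average
        sum_y += c.mv[1]
        if c.mode in ("B_SKIP", "B_DIRECT"):
            i_fast_block += 1       # non-detail block
        if c.n_fwd_refs > 1:        # threshold as stated in the text
            i_ref0 += 1
        if c.n_bwd_refs > 1:
            i_ref1 += 1
    # Assumption: average over inter blocks only, using floor division.
    avg_mv = (sum_x // n_inter, sum_y // n_inter) if n_inter else (0, 0)
    return {
        "i_intra": i_intra,
        "i_fast_block": i_fast_block,
        "avg_mv": avg_mv,
        # Assumption: the larger counter wins and ties go to forward.
        "use_forward": i_ref0 >= i_ref1,
    }
```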
Step 350: Predict the coding mode of the current coded macroblock and predict the corresponding MV.
In this step, three conditions, Condition 1, Condition 2, and Condition 3, are first defined on the directions of the candidate reference blocks, as follows:
Condition 1:
(mb_candinate[1]->direction - mb_candinate[0]->direction) <= 1 &&
(mb_candinate[2]->direction - mb_candinate[0]->direction) <= 1 &&
(mb_candinate[3]->direction - mb_candinate[0]->direction) <= 1
Condition 2:
(mb_candinate[1]->direction - mb_candinate[0]->direction) <= 1 &&
(mb_candinate[3]->direction - mb_candinate[2]->direction) <= 1 &&
(mb_candinate[3]->direction - mb_candinate[1]->direction) > 1 ||
(mb_candinate[3]->direction - mb_candinate[1]->direction) > 1
Condition 3:
(mb_candinate[2]->direction - mb_candinate[0]->direction) <= 1 &&
(mb_candinate[3]->direction - mb_candinate[1]->direction) <= 1 &&
(mb_candinate[3]->direction - mb_candinate[2]->direction) > 1
Here, mb_candinate[i]->direction is the direction of the current candidate reference block, i is the index of the candidate reference block and ranges from 0 to 3, && denotes logical AND, and || denotes logical OR.
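The three conditions can be written as predicates over a list `d` of the four direction values. The sketch below transcribes the expressions exactly as printed, using signed differences and the usual precedence in which AND binds tighter than OR. As printed, the last two terms of Condition 2 are identical, so under that precedence the whole expression reduces to its final term; the sketch keeps the printed form and does not guess at what was intended. The function names are hypothetical.

```python
def condition1(d):
    # All three other blocks point within one direction step of block 0.
    return (d[1] - d[0]) <= 1 and (d[2] - d[0]) <= 1 and (d[3] - d[0]) <= 1


def condition2(d):
    # Transcribed as printed; the two trailing terms are identical in the text.
    return (
        (d[1] - d[0]) <= 1 and (d[3] - d[2]) <= 1 and (d[3] - d[1]) > 1
    ) or (d[3] - d[1]) > 1


def condition3(d):
    # Blocks 0/2 agree, blocks 1/3 agree, and blocks 2/3 differ.
    return (d[2] - d[0]) <= 1 and (d[3] - d[1]) <= 1 and (d[3] - d[2]) > 1
```

With directions `[0, 0, 0, 0]` only Condition 1 holds, which matches the intuition that four blocks moving the same way can share one 16×16 motion vector.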
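After all candidate reference blocks have completed the traversal in step 320, the following judgments are made:
Judgment A: if the number of intra prediction blocks is greater than two, the current coded macroblock is coded as an intra prediction block, and the coding information prediction described in Embodiment 2 is performed according to the counted number of detail blocks.
Judgment B: if the number of non-detail blocks is greater than 2, the coding mode of the current coded macroblock is predicted to be B_DIRECT;
Judgment C: if the MV of the current candidate reference block satisfies Condition 1, the coding mode of the current coded macroblock is predicted to be B_16×16;
Judgment D: if the MV of the current candidate reference block satisfies Condition 2, the coding mode of the current coded macroblock is predicted to be B_16×8;
Judgment E: if the MV of the current candidate reference block satisfies Condition 3, the coding mode of the current coded macroblock is predicted to be B_8×16;
Judgment F: if the current candidate reference block satisfies none of judgments A to E, the coding mode of the current coded macroblock is predicted to be B_8×8.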
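Judgments A to F can be read as a decision cascade. The sketch below assumes they are evaluated in order and that the first match wins; the source lists the judgments but does not state an evaluation order, and it also speaks of "possible" modes, so an implementation might keep more than one candidate. The function name and the "INTRA" return value are placeholders for illustration.

```python
def predict_b_mode(n_intra, n_non_detail, cond1, cond2, cond3):
    """Return the predicted mode for a macroblock in a B frame."""
    if n_intra > 2:
        # Judgment A: continue with the intra prediction of Embodiment 2.
        return "INTRA"
    if n_non_detail > 2:
        return "B_DIRECT"      # Judgment B
    if cond1:
        return "B_16x16"       # Judgment C
    if cond2:
        return "B_16x8"        # Judgment D
    if cond3:
        return "B_8x16"        # Judgment E
    return "B_8x8"             # Judgment F
```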
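Once the possible coding modes of the current coded macroblock have been determined, the reference MV corresponding to each mode is calculated.
For the B_16×16 coding mode, the motion vector MV is calculated with Equation 1:
Equation 1
Mv[x] = ((mvc[0].x + mvc[1].x + mvc[2].x + mvc[3].x) >> 2) / scale_x
Mv[y] = ((mvc[0].y + mvc[1].y + mvc[2].y + mvc[3].y) >> 2) / scale_y
scale_x = round(source_x / dest_x);
scale_y = round(source_y / dest_y);
In Equation 1, Mv[x] is the motion vector in the x direction and Mv[y] is the motion vector in the y direction;
mvc[0] to mvc[3] are the MVs of the four candidate reference blocks; mvc[0].x to mvc[3].x are their x-direction MVs and mvc[0].y to mvc[3].y are their y-direction MVs;
(mvc[0].x + mvc[1].x + mvc[2].x + mvc[3].x) >> 2 is the x-direction motion vector of the average MV calculated in step 340;
(mvc[0].y + mvc[1].y + mvc[2].y + mvc[3].y) >> 2 is the y-direction motion vector of the average MV calculated in step 340;
source_x and source_y are the x- and y-direction resolutions of the input code stream;
dest_x and dest_y are the target x- and y-direction resolutions; scale_x and scale_y are intermediate parameters in the x and y directions used in the subsequent calculation; the round() function returns a value rounded to the specified number of digits; >> is the right-shift operator.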
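Equation 1 can be checked numerically with a short sketch. Two details are assumptions, since the source does not specify them: the final division truncates toward zero as C integer division would, and round() rounds halves up (Python's built-in `round` rounds halves to even, so a helper is used instead). MVs are (x, y) tuples and the function names are hypothetical.

```python
import math


def round_half_up(x: float) -> int:
    # Round to the nearest integer with halves rounded up.
    return math.floor(x + 0.5)


def scale_factors(source, dest):
    # scale_x = round(source_x / dest_x), scale_y = round(source_y / dest_y)
    return round_half_up(source[0] / dest[0]), round_half_up(source[1] / dest[1])


def mv_16x16(mvc, source, dest):
    scale_x, scale_y = scale_factors(source, dest)
    # Average of the four candidate MVs via a right shift by 2.
    avg_x = sum(m[0] for m in mvc) >> 2
    avg_y = sum(m[1] for m in mvc) >> 2
    # Assumption: truncate toward zero when dividing by the scale factor.
    return int(avg_x / scale_x), int(avg_y / scale_y)
```

For a 3840×2160 source and a 1920×1080 target both scale factors are 2, so candidate MVs averaging (8, 4) yield a predicted MV of (4, 2).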
For the B_16×8 coding mode, the motion vector MV is calculated with Equation 2:
Equation 2
Mv[0][x] = ((mvc[0].x + mvc[1].x) >> 1) / scale_x
Mv[0][y] = ((mvc[0].y + mvc[1].y) >> 1) / scale_y
Mv[1][x] = ((mvc[2].x + mvc[3].x) >> 1) / scale_x
Mv[1][y] = ((mvc[2].y + mvc[3].y) >> 1) / scale_y
For the B_8×16 coding mode, the motion vector MV is calculated with Equation 3:
Equation 3
Mv[0][x] = ((mvc[2].x + mvc[0].x) >> 1) / scale_x
Mv[0][y] = ((mvc[2].y + mvc[0].y) >> 1) / scale_y
Mv[1][x] = ((mvc[1].x + mvc[3].x) >> 1) / scale_x
Mv[1][y] = ((mvc[1].y + mvc[3].y) >> 1) / scale_y
A 16x16 macroblock consists of two 16x8 blocks; Mv[0] and Mv[1] are the motion vectors of the two 16x8 blocks; Mv[0][x] is the x-direction MV of Mv[0] and Mv[0][y] is the y-direction MV of Mv[0].
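Equations 2 and 3 pair the candidate blocks differently: rows (0,1) and (2,3) for B_16×8, columns (0,2) and (1,3) for B_8×16. The sketch below assumes the y component of Mv[0] in Equation 2 averages mvc[0].y and mvc[1].y, by symmetry with its x component, and again assumes truncation toward zero in the final division. The function names are hypothetical and MVs are (x, y) tuples.

```python
def _pair_mv(a, b, scale_x, scale_y):
    # Average two MVs with a right shift by 1, then scale each component.
    # Assumption: truncate toward zero when dividing by the scale factor.
    x = int(((a[0] + b[0]) >> 1) / scale_x)
    y = int(((a[1] + b[1]) >> 1) / scale_y)
    return (x, y)


def mv_16x8(mvc, scale_x, scale_y):
    # Mv[0] from candidates 0 and 1, Mv[1] from candidates 2 and 3.
    return [
        _pair_mv(mvc[0], mvc[1], scale_x, scale_y),
        _pair_mv(mvc[2], mvc[3], scale_x, scale_y),
    ]


def mv_8x16(mvc, scale_x, scale_y):
    # Mv[0] from candidates 2 and 0, Mv[1] from candidates 1 and 3.
    return [
        _pair_mv(mvc[2], mvc[0], scale_x, scale_y),
        _pair_mv(mvc[1], mvc[3], scale_x, scale_y),
    ]
```

The pairing matches the block layout implied by step 210, where candidates 0 and 1 form the upper row of the 2×2 group and candidates 0 and 2 its left column.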
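In the embodiments of the present invention, P frames have no backward prediction blocks; their prediction mode is otherwise similar to that of B frames and is not described again here.
In this embodiment, the coding mode to be used for encoding is predicted by reusing the coding information of the source code stream, which saves coding time to a certain extent; at the same time, this embodiment applies a simple re-optimization to the predicted mode, which ensures the same video quality as the full coding mode.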
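Embodiment 4
FIG. 6 is a schematic structural diagram of an apparatus according to Embodiment 4 of the present invention. With reference to FIG. 6, the variable-resolution coding mode prediction apparatus of this embodiment includes the following modules: an information acquisition module 610 and a transcoding module 620.
The information acquisition module 610 is configured to decode the current input code stream and obtain code stream information during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
The transcoding module 620 is configured to predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and to predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
Specifically, the transcoding module 620 is further configured to: when H264 is used as the video coding format, use the frame type corresponding to the input code stream as the frame type of the transcoded frame, where the frame type includes an intra-prediction coded frame, a forward-prediction coded frame, and a bidirectional-prediction coded frame.
Specifically, the transcoding module 620 is further configured to: select, according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, the candidate reference blocks in the input code stream that correspond to the current coded macroblock, and predict the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks.
Specifically, the transcoding module 620 is further configured to: when encoding an intra macroblock of the intra-prediction coded frame, traverse each candidate reference block and determine, according to its original partition mode, whether the candidate reference block is a detail block; count the detail blocks and predict the coding mode of the current coded macroblock according to that count.
Specifically, the transcoding module 620 is further configured to: when encoding the bidirectional-prediction coded frame, traverse each candidate reference block and determine whether it is an inter prediction block or an intra prediction block; if it is an intra prediction block, determine whether the intra prediction block is a detail block and count the detail blocks; if it is an inter prediction block, count the inter prediction blocks; and predict the coding mode of the current coded macroblock according to the number of detail blocks and the number of intra prediction blocks.
The apparatus of FIG. 6 carries out the embodiments shown in FIG. 1 to FIG. 5; its execution steps and technical effects are as described for those embodiments and are not repeated here.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.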
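Embodiment 5
FIG. 7 is a schematic structural diagram of a device according to Embodiment 5 of the present invention. With reference to FIG. 7, the variable-resolution coding mode prediction device of this embodiment includes a memory 701 and a processor 702. The memory 701 is configured to store one or more instructions, where the one or more instructions are to be invoked and executed by the processor;
The processor 702 is configured to decode the current input code stream and obtain code stream information during decoding, where the code stream information includes the frame type of the currently decoded frame and macroblock coding information; and to predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
When predicting the frame type of the transcoded frame according to the code stream information, the processor 702 is further configured to: when H264 is used as the video coding format, use the frame type corresponding to the input code stream as the frame type of the transcoded frame, where the frame type includes an intra-prediction coded frame, a forward-prediction coded frame, and a bidirectional-prediction coded frame.
When predicting the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, the processor 702 is further configured to: select, according to that mapping relationship, the candidate reference blocks in the input code stream that correspond to the current coded macroblock, and predict the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks.
When predicting the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks, the processor 702 is further configured to: when encoding an intra macroblock of the intra-prediction coded frame, traverse each candidate reference block and determine, according to its original partition mode, whether the candidate reference block is a detail block; count the detail blocks and predict the coding mode of the current coded macroblock according to that count.
When predicting the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks, the processor 702 is further configured to: when encoding the bidirectional-prediction coded frame, traverse each candidate reference block and determine whether it is an inter prediction block or an intra prediction block; if it is an intra prediction block, determine whether the intra prediction block is a detail block and count the detail blocks; if it is an inter prediction block, count the inter prediction blocks; and predict the coding mode of the current coded macroblock according to the number of detail blocks and the number of intra prediction blocks.
The technical solution of this device and the functional features and connections of its modules correspond to the features and technical solutions described in the embodiments of FIG. 1 to FIG. 5; for anything not covered here, refer to those embodiments.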

Claims (11)

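  1. A variable-resolution coding mode prediction method, characterized by comprising the following steps:
    decoding a current input code stream, and obtaining code stream information during decoding, wherein the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
    predicting, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and predicting the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
  2. The method according to claim 1, characterized in that predicting, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream further comprises:
    when H264 is used as the video coding format, using the frame type corresponding to the input code stream as the frame type of the transcoded frame, wherein the frame type includes an intra-prediction coded frame, a forward-prediction coded frame, and a bidirectional-prediction coded frame.
  3. The method according to claim 1 or 2, characterized in that predicting the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution further comprises:
    selecting, according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, the candidate reference blocks in the input code stream that correspond to the current coded macroblock, and predicting the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks.
  4. The method according to claim 3, characterized in that predicting the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks further comprises:
    when encoding an intra macroblock of the intra-prediction coded frame, traversing each candidate reference block and determining, according to the original partition mode of the candidate reference block, whether the candidate reference block is a detail block;
    counting the detail blocks and predicting the coding mode of the current coded macroblock according to that count.
  5. The method according to claim 3, characterized in that predicting the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks further comprises:
    when encoding the bidirectional-prediction coded frame, traversing each candidate reference block and determining whether the candidate reference block is an inter prediction block or an intra prediction block;
    if it is an intra prediction block, determining whether the intra prediction block is a detail block and counting the detail blocks; if it is an inter prediction block, counting the inter prediction blocks; and predicting the coding mode of the current coded macroblock according to the number of detail blocks and the number of intra prediction blocks.
  6. A variable-resolution coding mode prediction apparatus, characterized by comprising the following modules:
    an information acquisition module, configured to decode a current input code stream and obtain code stream information during decoding, wherein the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
    a transcoding module, configured to predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and to predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.
  7. The apparatus according to claim 5, characterized in that the transcoding module is further configured to:
    when H264 is used as the video coding format, use the frame type corresponding to the input code stream as the frame type of the transcoded frame, wherein the frame type includes an intra-prediction coded frame, a forward-prediction coded frame, and a bidirectional-prediction coded frame.
  8. The apparatus according to claim 6 or 7, characterized in that the transcoding module is further configured to:
    select, according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution, the candidate reference blocks in the input code stream that correspond to the current coded macroblock, and predict the coding mode of the current coded macroblock according to the original coding mode of the candidate reference blocks.
  9. The apparatus according to claim 8, characterized in that the transcoding module is further configured to:
    when encoding an intra macroblock of the intra-prediction coded frame, traverse each candidate reference block and determine, according to the original partition mode of the candidate reference block, whether the candidate reference block is a detail block;
    count the detail blocks and predict the coding mode of the current coded macroblock according to that count.
  10. The apparatus according to claim 8, characterized in that the transcoding module is further configured to:
    when encoding the bidirectional-prediction coded frame, traverse each candidate reference block and determine whether the candidate reference block is an inter prediction block or an intra prediction block;
    if it is an intra prediction block, determine whether the intra prediction block is a detail block and count the detail blocks; if it is an inter prediction block, count the inter prediction blocks; and predict the coding mode of the current coded macroblock according to the number of detail blocks and the number of intra prediction blocks.
  11. A video denoising device, comprising a memory and a processor, wherein
    the memory is configured to store one or more instructions, wherein the one or more instructions are to be invoked and executed by the processor;
    the processor is configured to decode a current input code stream and obtain code stream information during decoding, wherein the code stream information includes the frame type of the currently decoded frame and macroblock coding information;
    and to predict, according to the code stream information, the frame type of the transcoded frame corresponding to the input code stream, and predict the coding information of the transcoded frame according to the mapping relationship between the resolution of the input code stream and the transcoding target resolution.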
PCT/CN2016/088715 2015-12-18 2016-07-05 Variable-resolution coding mode prediction method and apparatus WO2017101350A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/246,684 US20170180745A1 (en) 2015-12-18 2016-08-25 Prediction method and Electronic Apparatus of encoding mode of variable resolution

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510959338.4A CN105898308A (zh) 2015-12-18 2015-12-18 Variable-resolution coding mode prediction method and apparatus
CN201510959338.4 2015-12-18

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/246,684 Continuation US20170180745A1 (en) 2015-12-18 2016-08-25 Prediction method and Electronic Apparatus of encoding mode of variable resolution

Publications (1)

Publication Number Publication Date
WO2017101350A1 true WO2017101350A1 (zh) 2017-06-22

Family

ID=57002254

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/088715 WO2017101350A1 (zh) 2015-12-18 2016-07-05 Variable-resolution coding mode prediction method and apparatus

Country Status (3)

Country Link
US (1) US20170180745A1 (zh)
CN (1) CN105898308A (zh)
WO (1) WO2017101350A1 (zh)

Also Published As

Publication number Publication date
CN105898308A (zh) 2016-08-24
US20170180745A1 (en) 2017-06-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16874412

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16874412

Country of ref document: EP

Kind code of ref document: A1