US20140140402A1 - Method and apparatus for selecting a coding mode - Google Patents
- Publication number
- US20140140402A1 (application US 14/164,653)
- Authority
- US
- United States
- Prior art keywords
- picture
- frame
- field
- coding mode
- macroblock
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals. All codes below fall under this class:
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/112—Selection of coding mode or of prediction mode according to a given display mode, e.g. for interlaced or progressive display mode
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
- H04N19/172—Adaptive coding characterised by the coding unit, the unit being an image region, the region being a picture, frame or field
- H04N19/176—Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/182—Adaptive coding characterised by the coding unit, the unit being a pixel
- H04N19/51—Motion estimation or motion compensation
- Legacy classification codes: H04N19/00018, H04N19/00145, H04N19/00278, H04N19/00303, H04N19/00587
Definitions
- the present invention relates to video encoders and, more particularly, to a method and apparatus for selecting a coding mode (e.g., a frame coding mode or a field coding mode).
- the International Telecommunication Union (ITU) H.264 video coding standard is able to compress video much more efficiently than earlier video coding standards, such as ITU H.263, MPEG-2 (Moving Picture Experts Group), and MPEG-4.
- H.264 is also known as MPEG-4 Part 10 and Advanced Video Coding (AVC).
- H.264 exhibits a combination of new techniques and increased degrees of freedom in using existing techniques.
- the new techniques defined in H.264 are 4 ⁇ 4 and 8 ⁇ 8 integer transform (e.g., DCT-like integer transform), multi-frame prediction, context adaptive variable length coding (CAVLC), SI/SP frames, context-adaptive binary arithmetic coding (CABAC), and adaptive frame/field coding.
- the H.264 standard belongs to the hybrid motion-compensated DCT (MC-DCT) family of codecs. H.264 is able to generate an efficient representation of the source video by reducing temporal and spatial redundancies. Temporal redundancies are removed by a combination of motion estimation (ME) and motion compensation (MC). ME is the process of estimating the motion of a current frame in the source video from previously coded frame(s). This motion information is used to motion compensate the previously coded frame(s) to form a prediction for the current frame. The prediction is then subtracted from the original current frame to form a displaced frame difference (DFD). The motion information is present for each block of pixel data.
- a 16 ⁇ 16 pixel macroblock can be tessellated into the following partitions: (A) one 16 ⁇ 16 macroblock region; (B) two 16 ⁇ 8 tessellations; (C) two 8 ⁇ 16 tessellations; and (D) four 8 ⁇ 8 tessellations.
- each of the 8 ⁇ 8 tessellations can be decomposed into: (a) one 8 ⁇ 8 region; (b) two 8 ⁇ 4 regions; (c) two 4 ⁇ 8 regions; and (d) four 4 ⁇ 4 regions.
- the motion vector for each block is unique and can point to different reference frames.
- the job of the encoder is to find the optimal way of breaking down a 16 ⁇ 16 macroblock into smaller blocks (along with the corresponding motion vectors) in order to maximize compression efficiency. This breaking down of the macroblock into a specific pattern is commonly referred to as “mode selection” or “mode decision.”
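The tessellation options described above can be sketched in code. The following is an illustrative Python sketch, not part of the patent; the names `MB_PARTITIONS`, `SUB_PARTITIONS`, and `partition_blocks` are invented for the example.

```python
# Illustrative sketch of the H.264 macroblock tessellations described above.
# A 16x16 macroblock splits into 16x16, 16x8, 8x16, or 8x8 partitions, and
# each 8x8 partition may further split into 8x8, 8x4, 4x8, or 4x4 blocks.
MB_PARTITIONS = {
    "16x16": [(16, 16)],
    "16x8":  [(16, 8)] * 2,
    "8x16":  [(8, 16)] * 2,
    "8x8":   [(8, 8)] * 4,
}
SUB_PARTITIONS = {
    "8x8": [(8, 8)],
    "8x4": [(8, 4)] * 2,
    "4x8": [(4, 8)] * 2,
    "4x4": [(4, 4)] * 4,
}

def partition_blocks(mb_mode, sub_modes=None):
    """Return the list of (width, height) blocks for a macroblock mode.

    For the 8x8 mode, each of the four 8x8 regions may be decomposed further
    according to sub_modes (one entry per 8x8 region).
    """
    if mb_mode != "8x8":
        return list(MB_PARTITIONS[mb_mode])
    sub_modes = sub_modes or ["8x8"] * 4
    blocks = []
    for sm in sub_modes:
        blocks.extend(SUB_PARTITIONS[sm])
    return blocks

# Every tessellation must cover exactly 16*16 = 256 pixels.
for mode in MB_PARTITIONS:
    assert sum(w * h for w, h in partition_blocks(mode)) == 256
```

Each block in a tessellation carries its own motion vector, which is why the encoder's mode decision trades partition granularity against motion-vector overhead.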
- H.264 standard allows for the adaptive switching between frame coding and field coding modes. Notably, this type of switching can occur at both the picture and the macroblock (MB) pair levels.
- present day processes are typically exhaustive in the sense that H.264 encoders encode a picture by completely executing both frame coding and field coding techniques and subsequently comparing the two end products to see which one performed better. Namely, each picture is encoded in its entirety twice. This approach is computationally expensive.
- a method and apparatus for selecting a coding mode are described. For example, the method receives at least one block of a signal to be encoded. The method determines a frame vertical pixel difference in the at least one block and determines a field vertical pixel difference in the at least one block. The method then compares the frame vertical pixel difference with the field vertical pixel difference to determine a first coding mode for the at least one block.
- a method and apparatus for selecting a coding mode are described. For example, the method receives at least one block of a signal to be encoded. The method then determines a field coding cost of the at least one block in accordance with a field coding mode and determines a frame coding cost of the at least one block in accordance with a frame coding mode. The method then compares the frame coding cost with the field coding cost to determine a coding mode for the at least one block.
- FIG. 1 is a block diagram depicting an exemplary embodiment of a video encoder
- FIG. 2 is a flow diagram depicting an exemplary embodiment of a method for selecting a coding mode in accordance with one or more aspects of the invention
- FIG. 3 is a flow diagram depicting an exemplary embodiment of a method for a frame and field mode selection process in accordance with one or more aspects of the invention
- FIG. 4 is a flow diagram depicting an exemplary embodiment of a method for a frame and field mode selection process for an I-picture in accordance with one or more aspects of the invention
- FIG. 5 is a flow diagram depicting an exemplary embodiment of a method for a frame and field mode selection process for a P- and B-picture in accordance with one or more aspects of the invention.
- FIG. 6 is a block diagram depicting an exemplary embodiment of a general computer suitable for implementing the processes and methods described herein.
- One or more aspects of the invention relate to predictive frame (i.e., INTER) mode selection in an H.264 video encoder or H.264-like encoder.
- although the present invention is disclosed in the context of an H.264-like encoder, it is not so limited. Namely, the present invention can be adapted to other motion compensation (MC) encoding standards.
- the INTER mode selection is independent of the motion estimation algorithm.
- the INTER mode selection is a one-pass decision algorithm aiming to approximate the multi-pass R-D optimization based on encoder parameters and statistical data.
- the algorithm uses a bits model in which some of the components are measured exactly, and the residual block bits are estimated through a statistical model.
- the statistical model can be adapted based on the actual encoded bits.
- FIG. 1 is a block diagram depicting an exemplary embodiment of a video encoder 100 .
- the video encoder is compliant with the H.264 standard.
- the video encoder 100 includes a subtractor 102 , a discrete cosine transform (DCT) module 104 , a quantizer 106 , an entropy coder 108 , an inverse quantizer 110 , an inverse DCT module 112 , an adder 114 , a deblocking filter 116 , a frame memory 118 , a motion compensated predictor 120 , an intra/inter switch 122 , and a motion estimator 124 .
- the video encoder 100 receives a sequence of source frames.
- the subtractor 102 receives a source frame from the input sequence and a predicted frame from the intra/inter switch 122 .
- the subtractor 102 computes a difference between the source frame and the predicted frame, which is provided to the DCT module 104 .
- the predicted frame is generated by the motion compensated predictor 120 .
- a predicted MB is formed by the pixels from the neighboring MBs in the same frame.
- the DCT module 104 transforms the difference signal from the pixel domain to the frequency domain using a DCT-like algorithm to produce a set of coefficients.
- the quantizer 106 quantizes the DCT coefficients.
- the entropy coder 108 codes the quantized DCT coefficients to produce a coded frame.
- the inverse quantizer 110 performs the inverse operation of the quantizer 106 to recover the DCT coefficients.
- the inverse DCT module 112 performs the inverse operation of the DCT-like module 104 to produce a reconstructed difference signal.
- the reconstructed difference signal is added to the predicted frame by the adder 114 to produce a reconstructed frame, which is coupled to the deblocking filter 116 .
- the deblocking filter smoothes the reconstructed frame and stores the reconstructed frame in the frame memory 118 .
- the motion compensated predictor 120 and the motion estimator 124 are coupled to the frame memory 118 and are configured to obtain one or more previously reconstructed frames (previously coded frames).
- the motion estimator 124 also receives the source frame.
- the motion estimator 124 performs a motion estimation algorithm using the source frames and previous reconstructed frames (i.e., reference frames) to produce motion estimation data.
- the motion estimation data includes motion vectors and associated references.
- the motion estimation data is provided to the entropy coder 108 and the motion compensated predictor 120 .
- the entropy coder 108 codes the motion estimation data to produce coded motion data.
- the motion compensated predictor 120 performs a motion compensation algorithm using a previous reconstructed frame and the motion estimation data to produce the predicted frame, which is coupled to the intra/inter switch 122 .
- Motion estimation and motion compensation algorithms are well known in the art.
- the motion estimator 124 includes mode decision logic 126 .
- the mode decision logic 126 is configured to select a mode for each macroblock, or pair of macroblocks, in a predictive (INTER) frame.
- the “mode” of a macroblock is the partitioning scheme. That is, in one embodiment, the mode decision logic 126 selects MODE for each macroblock in a predictive frame.
- the present invention sets forth at least one solution for selecting either a frame coding mode or a field coding mode on either a per-picture basis or a per-macroblock (MB) pair basis.
- one solution may include two separate yet related methods.
- the first method utilizes frame and field vertical pixel difference comparisons, while the second method entails the use of a coding cost procedure.
- the reasoning for using the frame and field vertical pixel difference comparison method is two-fold. First, for stationary areas of a picture comprising two fields, the difference between the consecutive pixels of the picture in the vertical direction tends to be smaller than the difference between the consecutive pixels of each of two fields of the picture in the vertical direction. Secondly, for moving areas of a picture comprising two fields, the difference between the consecutive pixels of the picture in the vertical direction tends to be larger than the difference between the consecutive pixels of each of two fields of the picture in the vertical direction.
- the coding costs (e.g., the motion estimation costs) of a MB pair in frame and field modes may also be used to determine whether a frame or a field coding mode is more suitable for a particular MB pair.
- the coding cost (J) is defined as:
- J = SAD + λ · Func(MV, refIdx, mbType) Eq. 1
- SAD (the sum of absolute differences) is a difference measurement between the pixels and their (temporal or spatial) predictions for every MB or sub-MB partition.
- pixel predictions come from either a temporal prediction or a spatial prediction.
- SAD may also represent the distortion present in an MB pair.
- the MV variable represents motion vectors
- refIdx is the reference picture index
- mbType is the type of macroblocks or MB partitions.
- An MB partition may include any sub-macroblock configuration derived from a 16 ⁇ 16 MB, such as a 4 ⁇ 4 block, an 8 ⁇ 8 block, or the like.
- the function portion of the equation, which attempts to serve as a representation of the number of coding bits, varies in accordance with the type of coding process that is conducted.
- the variable ⁇ represents a constant depending upon the quantization parameter and other coding parameters.
- the variable ⁇ may be utilized to nullify the units resulting from the function portion of the equation so that J results in a unitless value.
- the coding cost can be measured in either frame or field mode. As demonstrated below, a frame/field mode selection of an MB pair or picture can be based upon the frame and field coding costs.
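The cost computation above can be sketched as follows. This is an illustrative Python sketch: the patent only says the function portion approximates the number of coding bits, so `func_bits` below is a hypothetical stand-in, and `lam` stands for the constant λ.

```python
# Sketch of the coding cost J = SAD + lambda * Func(MV, refIdx, mbType) (Eq. 1).
def sad(block, prediction):
    """Sum of absolute differences between a block's pixels and its prediction."""
    return sum(abs(a - b) for a, b in zip(block, prediction))

def func_bits(mv, ref_idx, mb_type):
    """Hypothetical stand-in for the bits model; the patent does not specify it.

    Uses a crude unary-style motion-vector cost plus the reference index and
    a token cost for the macroblock/partition type name.
    """
    mv_bits = sum(2 * abs(c) + 1 for c in mv)
    return mv_bits + ref_idx + len(mb_type)

def coding_cost(block, prediction, mv, ref_idx, mb_type, lam):
    """J = SAD + lambda * Func(MV, refIdx, mbType)."""
    return sad(block, prediction) + lam * func_bits(mv, ref_idx, mb_type)
```

Because λ is chosen to cancel the units of the function portion, J comes out unitless, which is what allows frame and field costs to be compared directly.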
- FIG. 2 is a flow diagram that depicts an exemplary embodiment of the present invention. Namely, method 200 describes the steps in which a coding mode is determined by one embodiment of the present invention. The method 200 begins at step 202 and proceeds to step 204 where at least one block (e.g., a macroblock or a macroblock pair) of a signal to be encoded is received.
- a frame vertical pixel difference in the at least one block is determined.
- a field vertical pixel difference in the at least one block is determined.
- the frame vertical pixel difference is compared with the field vertical pixel difference to determine a first coding mode for the at least one block.
- method 200 performs the optional step of computing the motion estimation (ME) cost for both the frame coding mode and the field coding mode, where the two ME costs are then compared to further assist in determining a proper coding mode for the at least one block. A detailed description of this step is provided below.
- the method 200 ends at step 212 .
- although FIG. 2 illustrates the motion estimation (ME) cost computation as being performed after the vertical pixel difference computation, the two computations can be implemented in combination or separately.
- the motion estimation (ME) cost computation and the vertical pixel difference computation can be performed in parallel or in any sequential order as required for a particular implementation.
- an implementation of the present invention may implement the motion estimation (ME) cost computation without the vertical pixel difference computation or vice versa.
- FIG. 3 is a flow diagram depicting an exemplary embodiment of a method 300 for utilizing a frame and field vertical pixel comparison in accordance with one or more aspects of the invention. Although the following method specifically describes the processing of a block on an MB pair level, the method 300 can be similarly applied to processing of a picture on a picture level.
- the method 300 begins at step 302 and proceeds to step 304, where the sums of the absolute frame and field pixel differences in the vertical direction are calculated.
- the vertical frame pixel difference is determined to be:
- Δ_FRM = Σ_{i,j} |X_{i,j} − X_{i,j+1}| Eq. 2
- i and j are the pixel horizontal and vertical indices, and (i, j) range over the MB pair.
- This formula essentially involves the determination of the difference between two consecutive lines in the same frame picture (i.e., X_{i,j} represents a first line and X_{i,j+1} represents the next line of the same MB).
- the vertical field pixel difference is determined to be:
- Δ_FLD = Σ_{i,j} |X_{i,2j} − X_{i,2(j+1)}| + Σ_{i,j} |X_{i,2j+1} − X_{i,2(j+1)+1}| Eq. 3
- This formula pertains to the sum of differences between lines within each field of a MB. More specifically, the first term of the equation deals with the difference between two lines in a first field of the MB (e.g., the top field), and the second term deals with the difference between two lines in a second field of the same MB (e.g., the bottom field).
- the frame and field pixel differences in the vertical direction are compared. Namely, in one embodiment, if the frame pixel difference is greater than the field pixel difference (i.e., Δ_FRM > Δ_FLD), then the method 300 continues to step 310. At step 310, field coding mode is selected for the MB pair. Conversely, if the frame pixel difference is not greater than the field pixel difference, then the method 300 continues to step 308. At step 308, frame coding mode is selected for the MB pair. The method 300 ends at step 312.
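Method 300's comparison can be sketched as follows. This is an illustrative Python sketch (not from the patent) that assumes the MB pair is given as a 2D list `x` of luma samples, with each inner list being one line; even rows belong to the top field and odd rows to the bottom field.

```python
# Sketch of the frame/field vertical-difference test of method 300.
def frame_vertical_diff(x):
    """Delta_FRM: sum over columns i and lines j of |X[i,j] - X[i,j+1]| (Eq. 2)."""
    return sum(abs(x[j][i] - x[j + 1][i])
               for j in range(len(x) - 1) for i in range(len(x[0])))

def field_vertical_diff(x):
    """Delta_FLD: same-parity line differences within each field (Eq. 3)."""
    rows, cols = len(x), len(x[0])
    total = 0
    for j in range(rows // 2 - 1):
        for i in range(cols):
            total += abs(x[2 * j][i] - x[2 * (j + 1)][i])          # top field
            total += abs(x[2 * j + 1][i] - x[2 * (j + 1) + 1][i])  # bottom field
    return total

def select_mode(x):
    """Field mode when Delta_FRM > Delta_FLD, frame mode otherwise."""
    return "field" if frame_vertical_diff(x) > field_vertical_diff(x) else "frame"
```

A flat (stationary) MB pair yields zero for both sums and therefore frame mode, while an MB pair whose two fields differ sharply line-to-line (as in a moving interlaced area) drives Δ_FRM up and selects field mode, matching the two-fold reasoning given earlier.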
- FIG. 4 is a flow diagram depicting a method 400 for determining a coding mode in accordance with one or more aspects of the invention.
- the method 400 begins at step 402 and proceeds to step 404 where at least one intra frame prediction is performed.
- an intra frame prediction is performed for all of the possible prediction directions for an intra 4 ⁇ 4 sub-MB partition, an intra 8 ⁇ 8 sub-MB partition, and an intra 16 ⁇ 16 macroblock for each MB of a given MB pair.
- both the intra 4 ⁇ 4 MB and the 8 ⁇ 8 MB have nine directions to be considered.
- the 16 ⁇ 16 MB has four directions to be considered.
- a minimum cost for each MB of a MB pair is determined.
- the minimum costs of each MB in the MB pair are added together for both the frame mode and the field mode. More specifically, each of the directions calculated in step 404 (along with the respective block type information) is applied to the coding cost formula (see Eq. 1). Since method 400 pertains to I-pictures only, the coding cost formula does not consider temporal predictions or motion estimation. Afterwards, a minimum cost is selected for each MB of the MB pair.
- a first minimum cost (regardless of the block type used to determine that minimum cost) is selected for the top MB and a second minimum cost (regardless of the block type used to determine that minimum cost) is selected for the bottom MB of the MB pair.
- a final minimum cost is then calculated by summing the minimum cost for the top MB to the minimum cost of the bottom MB. Notably, this calculation is conducted for both the frame coding mode and the field coding mode so that two separate final minimum costs, i.e., the minimum field cost (J FLDmin ) and the minimum frame cost (J FRMmin ), are respectively determined.
- the calculated J FLDmin and the J FRMmin are compared. If J FRMmin is not found to be greater than J FLDmin , then the method 400 proceeds to step 410 where the frame coding mode is selected for the MB pair. Alternatively, if J FRMmin is found to be greater than J FLDmin , then the method 400 proceeds to step 412 where the field coding mode is selected for the MB pair. The method 400 ends at step 414 .
- the method 400 may also be used to determine a minimum coding cost on the picture level (as opposed to MB pair level).
- the alternative method is identical to method 400 except that after step 406, all of the minimum frame coding costs and the minimum field coding costs per MB pair are summed separately over the entire picture to produce J_sumFRMmin and J_sumFLDmin.
- step 408 of method 400 would then be replaced with the comparison of J_sumFRMmin and J_sumFLDmin. Consequently, if J_sumFRMmin is found to be greater than J_sumFLDmin, then the field coding mode is selected for the picture. Alternatively, if J_sumFRMmin is not found to be greater than J_sumFLDmin, then the frame coding mode is selected for the picture.
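The picture-level decision can be sketched as follows. This is an illustrative Python sketch that assumes the per-MB-pair minimum frame and field costs (however they were computed, e.g., via Eq. 1) are already available as `(j_frm_min, j_fld_min)` tuples.

```python
# Sketch of the picture-level frame/field decision: sum the per-MB-pair minimum
# costs separately for each mode and pick the mode with the smaller aggregate,
# keeping frame mode on ties.
def picture_mode(pair_costs):
    """pair_costs: list of (j_frm_min, j_fld_min) tuples, one per MB pair."""
    j_sum_frm = sum(frm for frm, _ in pair_costs)
    j_sum_fld = sum(fld for _, fld in pair_costs)
    return "field" if j_sum_frm > j_sum_fld else "frame"
```
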
- FIG. 5 is a flow diagram depicting a method 500 for determining a coding mode in accordance with one or more aspects of the invention.
- the method 500 begins at step 502 and proceeds to step 504 where motion estimations (MEs) are performed for all possible MBs or sub-MB partitions for each of the MB of a MB pair for both frame mode and field mode.
- an inter frame prediction is performed for all of the possible prediction directions for a 4 ⁇ 4 sub-MB partition, a 4 ⁇ 8 sub-MB partition, an 8 ⁇ 4 sub-MB partition, an 8 ⁇ 8 sub-MB partition, an 8 ⁇ 16 sub-MB partition, a 16 ⁇ 8 sub-MB partition, and a 16 ⁇ 16 macroblock for each of the two MBs of a MB pair.
- the MB/sub-MB partition type with the minimum coding cost is found for each MB of a MB pair in both the frame coding mode and the field coding mode.
- the present invention calculates a motion estimation cost using the coding cost formula (i.e., Eq. 1).
- the formula is applied to each of the seven different types of MB/sub-MB partitions twice, once in the frame mode and then in the field mode, so that a minimum ME cost for both frame coding and field coding is calculated for each MB/sub-MB partition.
- a minimum ME cost is selected for each MB of the MB pair. For example, a first minimum cost (regardless of the block type used to determine that minimum cost) is selected for the top MB and a second minimum ME cost (regardless of the block type used to determine that minimum cost) is selected for the bottom MB of the MB pair.
- a final minimum ME cost is then calculated by summing the minimum cost for the top MB to the minimum cost of the bottom MB. Notably, this calculation is conducted for both the frame coding mode and the field coding mode so that two separate final minimum costs, i.e., the minimum field cost (J FLDmin ) and the minimum frame cost (J FRMmin ), are respectively determined.
- the calculated J FLDmin and the J FRMmin are compared. If J FRMmin is found to be greater than J FLDmin , then the method 500 proceeds to step 510 where the field mode is selected for the MB pair. Alternatively, if J FRMmin is not found to be greater than J FLDmin , then the method 500 proceeds to step 512 where the frame mode is selected for the MB pair. The method 500 ends at step 514 .
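The MB-pair decision of method 500 can be sketched as follows. This is an illustrative Python sketch: the ME cost values are assumed to be given (they would come from Eq. 1), and the data layout, mapping each mode to a list of per-MB partition-to-cost dictionaries, is invented for the example.

```python
# Sketch of steps 506-508 of method 500: for each MB of the pair, take the
# minimum ME cost over all MB/sub-MB partition types in each mode, sum the
# top and bottom MB minima, and compare the two final costs.
def mb_pair_mode(me_costs):
    """me_costs: {mode: [top_mb_costs, bottom_mb_costs]} where each entry maps
    a partition type (e.g. "16x16") to its ME cost in that mode."""
    finals = {}
    for mode in ("frame", "field"):
        finals[mode] = sum(min(me_costs[mode][mb].values()) for mb in (0, 1))
    # Field mode only when the frame cost is strictly greater (step 508).
    return "field" if finals["frame"] > finals["field"] else "frame"
```

Note that the minimum for each MB is taken regardless of which partition type produced it, so the top and bottom MBs may contribute minima from different partitions.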
- the method 500 may also be used to determine a minimum coding cost for a P-picture or B-picture on the picture level (as opposed to the MB pair level).
- the alternative method is identical to method 500 except that after step 506, all of the minimum frame coding costs and minimum field coding costs are summed separately over the entire picture to produce J_sumFRMmin and J_sumFLDmin.
- step 508 of method 500 would then be replaced with the comparison of J_sumFRMmin and J_sumFLDmin. Specifically, if J_sumFRMmin is found to be greater than J_sumFLDmin, then the field coding mode is selected for the picture. Alternatively, if J_sumFRMmin is not found to be greater than J_sumFLDmin, then the frame coding mode is selected for the picture.
- the present invention determines if the outcomes of the frame/field mode selection process based on vertical pixel difference (e.g., method 300 ) and the coding cost process (e.g., method 400 for I pictures and method 500 for P and B pictures) are the same. If the results are indeed the same, then the result is considered final. If the results are different or there is some type of discrepancy, then additional criteria or calculations may be required. For instance, the following formula could be used to determine whether frame or field coding should be implemented:
- the formula uses a constant ranging from 0 to 1.0. If the formula holds true, then the result from the coding cost formula should be used. Otherwise, the result from the frame/field vertical pixel difference comparison should be used.
- the present invention determines if the outcomes of the frame/field mode selection process and the coding cost process, as processed on the picture level, are the same. If the results are indeed are the same, then the result is considered final. If the results are different or there is some type of discrepancy, then additional criteria may be required. For example, the final decision on the frame and field mode per picture may be determined using Table 1 below. The final decision is based on the decision from the above comparisons of the aforementioned alternative embodiments.
- Table 1 is biased towards a frame coding mode because in a frame picture, MBAFF can be turned on, which may further compensate an incorrect decision made at the picture level (if any).
- FIG. 6 is a block diagram depicting an exemplary embodiment of a video encoder 600 in accordance with one or more aspects of the invention.
- the video encoder 600 includes a processor 601 , a memory 603 , various support circuits 604 , and an I/O interface 602 .
- the processor 601 may be any type of processing element known in the art, such as a microcontroller, digital signal processor (DSP), instruction-set processor, dedicated processing logic, or the like.
- the support circuits 604 for the processor 601 include conventional clock circuits, data registers, I/O interfaces, and the like.
- the I/O interface 602 may be directly coupled to the memory 603 or coupled through the processor 601 .
- the I/O interface 602 may be coupled to a frame buffer and a motion compensator, as well as to receive input frames.
- the memory 603 may include one or more of the following random access memory, read only memory, magneto-resistive read/write memory, optical read/write memory, cache memory, magnetic read/write memory, and the like, as well as signal-bearing media as described below.
- the memory 603 stores processor-executable instructions and/or data that may be executed by and/or used by the processor 601 as described further below. These processor-executable instructions may comprise hardware, firmware, software, and the like, or some combination thereof. Modules having processor-executable instructions that are stored in the memory 603 may include a mode selection module 612 .
- the mode selection module 612 is configured to perform the methods 200 , 300 , 400 , and 500 of FIGS. 2 , 3 , 4 , and 5 respectively.
- An aspect of the invention is implemented as a program product for execution by a processor.
- Program(s) of the program product defines functions of embodiments and can be contained on a variety of signal-bearing media (computer readable media), which include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM or DVD-ROM disks readable by a CD-ROM drive or a DVD drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive or read/writable CD or read/writable DVD); or (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications.
- a communications medium such as through a computer or telephone network, including wireless communications.
- the latter embodiment specifically includes information downloaded from the Internet and other networks.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
- 1. Field of the Invention
- The present invention relates to video encoders and, more particularly, to a method and apparatus for selecting a coding mode (e.g., a frame coding mode or a field coding mode).
- 2. Description of the Background Art
- The International Telecommunication Union (ITU) H.264 video coding standard is able to compress video much more efficiently than earlier video coding standards, such as ITU H.263, MPEG-2 (Moving Picture Experts Group), and MPEG-4. H.264 is also known as MPEG-4 Part 10 and Advanced Video Coding (AVC). H.264 exhibits a combination of new techniques and increased degrees of freedom in using existing techniques. Among the new techniques defined in H.264 are 4×4 and 8×8 integer transforms (e.g., DCT-like integer transforms), multi-frame prediction, context-adaptive variable length coding (CAVLC), SI/SP frames, context-adaptive binary arithmetic coding (CABAC), and adaptive frame/field coding. The increased degrees of freedom come about by allowing multiple reference frames for prediction and many more tessellations of a 16×16 pixel macroblock (MB). These new tools and methods add to the coding efficiency at the cost of increased encoding and decoding complexity in terms of logic, memory, and number of operations. This complexity far exceeds that of H.263 and MPEG-4 and calls for efficient implementations.
- The H.264 standard belongs to the hybrid motion-compensated DCT (MC-DCT) family of codecs. H.264 is able to generate an efficient representation of the source video by reducing temporal and spatial redundancies. Temporal redundancies are removed by a combination of motion estimation (ME) and motion compensation (MC). ME is the process of estimating the motion of a current frame in the source video from previously coded frame(s). This motion information is used to motion compensate the previously coded frame(s) to form a prediction for the current frame. The prediction is then subtracted from the original current frame to form a displaced frame difference (DFD). The motion information is present for each block of pixel data. In H.264, there are seven possible block sizes within a macroblock, e.g., 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4 (also referred to as tessellations or partitions). Thus, a 16×16 pixel macroblock (MB) can be tessellated into the following partitions: (A) one 16×16 macroblock region; (B) two 16×8 tessellations; (C) two 8×16 tessellations; and (D) four 8×8 tessellations. Furthermore, each of the 8×8 tessellations can be decomposed into: (a) one 8×8 region; (b) two 8×4 regions; (c) two 4×8 regions; and (d) four 4×4 regions.
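As a rough illustration of the tessellations listed above, the distinct block sizes can be enumerated. This is an illustrative sketch only; the helper names are not from the patent.

```python
# Illustrative sketch: enumerate the H.264 inter block sizes described above.

MB_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]   # macroblock-level tessellations
SUB_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]      # decompositions of each 8x8 region

def partition_shapes():
    """Return every distinct block size a 16x16 macroblock can be split into."""
    return sorted(set(MB_PARTITIONS) | set(SUB_PARTITIONS), reverse=True)

print(partition_shapes())
# seven distinct sizes: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4
```

The seven distinct sizes produced here match the seven possible block sizes within a macroblock named in the text.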
- Furthermore, the motion vector for each block is unique and can point to different reference frames. The job of the encoder is to find the optimal way of breaking down a 16×16 macroblock into smaller blocks (along with the corresponding motion vectors) in order to maximize compression efficiency. This breaking down of the macroblock into a specific pattern is commonly referred to as “mode selection” or “mode decision.”
- In addition, the H.264 standard allows for the adaptive switching between frame coding and field coding modes. Notably, this type of switching can occur at both the picture and the macroblock (MB) pair levels. However, present day processes are typically exhaustive in the sense that H.264 encoders encode a picture by completely executing both frame coding and field coding techniques and subsequently comparing the two end products to see which one performed better. Namely, each picture is encoded in its entirety twice. This approach is computationally expensive.
- Accordingly, there exists a need in the art for a method and apparatus for an improved adaptive frame/field mode selection encoding method.
- In one embodiment, a method and apparatus for selecting a coding mode are described. For example, the method receives at least one block of a signal to be encoded. The method determines a frame vertical pixel difference in the at least one block and determines a field vertical pixel difference in the at least one block. The method then compares the frame vertical pixel difference with the field vertical pixel difference to determine a first coding mode for the at least one block.
- In an alternate embodiment, a method and apparatus for selecting a coding mode are described. For example, the method receives at least one block of a signal to be encoded. The method then determines a field coding cost of the at least one block in accordance with a field coding mode and determines a frame coding cost of the at least one block in accordance with a frame coding mode. The method then compares the frame coding cost with the field coding cost to determine a coding mode for the at least one block.
- So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
- FIG. 1 is a block diagram depicting an exemplary embodiment of a video encoder;
- FIG. 2 is a flow diagram depicting an exemplary embodiment of a method for selecting a coding mode in accordance with one or more aspects of the invention;
- FIG. 3 is a flow diagram depicting an exemplary embodiment of a method for a frame and field mode selection process in accordance with one or more aspects of the invention;
- FIG. 4 is a flow diagram depicting an exemplary embodiment of a method for a frame and field mode selection process for an I-picture in accordance with one or more aspects of the invention;
- FIG. 5 is a flow diagram depicting an exemplary embodiment of a method for a frame and field mode selection process for a P- and B-picture in accordance with one or more aspects of the invention; and
- FIG. 6 is a block diagram depicting an exemplary embodiment of a general computer suitable for implementing the processes and methods described herein.
- To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.
- A method and apparatus for mode selection in a video encoder are described. One or more aspects of the invention relate to predictive frame (i.e., INTER) mode selection in an H.264 video encoder or H.264-like encoder. Although the present invention is disclosed in the context of an H.264-like encoder, the present invention is not so limited. Namely, the present invention can be adapted to other motion compensation (MC) encoding standards. The INTER mode selection is independent of the motion estimation algorithm. In one embodiment, the INTER mode selection is a one-pass decision algorithm aiming to approximate the multi-pass R-D optimization based on encoder parameters and statistical data. The algorithm uses a bits model in which some of the components are measured exactly, and the residual block bits are estimated through a statistical model. The statistical model can be adapted based on the actual encoded bits.
- Embodiments of the invention use the following definitions:
- R Rate (bit-rate) of the encoder
- D Coding distortion of the encoder
- SAD Sum of absolute differences between a block and its corresponding reference block or any similar metric
- QP Quantization parameter
- MV Motion vector for a macroblock or block
- MB_TYPE Partitioning of macroblock: one of 16×16, 16×8, 8×16, and 8×8
- SUB_MB_TYPE Partitioning of 8×8 block: one of 8×8, 8×4, 4×8, and 4×4
- MODE INTER macroblock partitioning. This is the set of values of MB_TYPE and SUB_MB_TYPE
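Using the definitions above, the size of the MODE space for one macroblock can be counted. This is a hedged illustration under a simplifying assumption (reference indices and special modes are ignored; the 3 counts the whole-MB tessellations other than 8×8, and an 8×8 MB_TYPE picks an independent SUB_MB_TYPE for each of its four 8×8 blocks); the patent itself does not give this count.

```python
# Illustrative count of the MODE space defined above (names are not from the patent).
MB_TYPES = ["16x16", "16x8", "8x16", "8x8"]
SUB_MB_TYPES = ["8x8", "8x4", "4x8", "4x4"]

def mode_space():
    """Distinct MODE values for one MB: three whole-MB partitionings (16x16,
    16x8, 8x16), plus one SUB_MB_TYPE choice per 8x8 block when MB_TYPE is 8x8."""
    return 3 + len(SUB_MB_TYPES) ** 4

print(mode_space())  # 3 + 4^4 = 259
```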
- FIG. 1 is a block diagram depicting an exemplary embodiment of a video encoder 100. In one embodiment, the video encoder is compliant with the H.264 standard. The video encoder 100 includes a subtractor 102, a discrete cosine transform (DCT) module 104, a quantizer 106, an entropy coder 108, an inverse quantizer 110, an inverse DCT module 112, an adder 114, a deblocking filter 116, a frame memory 118, a motion compensated predictor 120, an intra/inter switch 122, and a motion estimator 124. The video encoder 100 receives a sequence of source frames. The subtractor 102 receives a source frame from the input sequence and a predicted frame from the intra/inter switch 122. The subtractor 102 computes a difference between the source frame and the predicted frame, which is provided to the DCT module 104. In INTER mode, the predicted frame is generated by the motion compensated predictor 120. In INTRA mode, a predicted MB is formed by the pixels from the neighboring MBs in the same frame.
- The DCT module 104 transforms the difference signal from the pixel domain to the frequency domain using a DCT-like algorithm to produce a set of coefficients. The quantizer 106 quantizes the DCT coefficients. The entropy coder 108 codes the quantized DCT coefficients to produce a coded frame.
- The inverse quantizer 110 performs the inverse operation of the quantizer 106 to recover the DCT coefficients. The inverse DCT module 112 performs the inverse operation of the DCT-like module 104 to produce a reconstructed difference signal. The reconstructed difference signal is added to the predicted frame by the adder 114 to produce a reconstructed frame, which is coupled to the deblocking filter 116. The deblocking filter smoothes the reconstructed frame and stores the reconstructed frame in the frame memory 118. The motion compensated predictor 120 and the motion estimator 124 are coupled to the frame memory 118 and are configured to obtain one or more previously reconstructed frames (previously coded frames).
- The motion estimator 124 also receives the source frame. The motion estimator 124 performs a motion estimation algorithm using the source frames and previously reconstructed frames (i.e., reference frames) to produce motion estimation data. The motion estimation data includes motion vectors and associated references. The motion estimation data is provided to the entropy coder 108 and the motion compensated predictor 120. The entropy coder 108 codes the motion estimation data to produce coded motion data. The motion compensated predictor 120 performs a motion compensation algorithm using a previously reconstructed frame and the motion estimation data to produce the predicted frame, which is coupled to the intra/inter switch 122. Motion estimation and motion compensation algorithms are well known in the art. In one embodiment, the motion estimator 124 includes mode decision logic 126. The mode decision logic 126 is configured to select a mode for each macroblock, or pair of macroblocks, in a predictive (INTER) frame. The "mode" of a macroblock is the partitioning scheme. That is, in one embodiment, the mode decision logic 126 selects MODE for each macroblock in a predictive frame.
- The present invention sets forth at least one solution for selecting either a frame coding mode or a field coding mode on either a per picture basis or a per macroblock (MB) pair basis. In one embodiment, one solution may include two separate yet related methods. Notably, the first method utilizes frame and field vertical pixel difference comparisons, while the second method entails the use of a coding cost procedure.
- The reasoning for using the frame and field vertical pixel difference comparison method is two-fold. First, for stationary areas of a picture comprising two fields, the difference between consecutive pixels of the picture in the vertical direction tends to be smaller than the difference between consecutive pixels of each of the two fields of the picture in the vertical direction. Second, for moving areas of a picture comprising two fields, the difference between consecutive pixels of the picture in the vertical direction tends to be larger than the difference between consecutive pixels of each of the two fields of the picture in the vertical direction.
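This two-fold observation can be sketched as a toy decision rule. The code below is an illustrative sketch, not the patent's implementation; it assumes the frame difference is taken over consecutive lines of the whole picture area and the field difference over consecutive lines within each field, for one macroblock pair stored as a 32-row by 16-column luma array.

```python
# Illustrative frame/field vertical pixel difference comparison for one MB pair.
# X is a list of 32 rows, each a list of 16 luma samples (assumed layout).

def frame_diff(X):
    """Sum of absolute differences between vertically adjacent picture lines."""
    rows, cols = len(X), len(X[0])
    return sum(abs(X[j][i] - X[j + 1][i]) for j in range(rows - 1) for i in range(cols))

def field_diff(X):
    """Sum of absolute differences between adjacent lines within each field:
    even lines form the top field, odd lines the bottom field."""
    rows, cols = len(X), len(X[0])
    top = sum(abs(X[j][i] - X[j + 2][i]) for j in range(0, rows - 2, 2) for i in range(cols))
    bot = sum(abs(X[j][i] - X[j + 2][i]) for j in range(1, rows - 2, 2) for i in range(cols))
    return top + bot

def select_mode(X):
    """Field mode when the frame difference dominates, frame mode otherwise."""
    return "field" if frame_diff(X) > field_diff(X) else "frame"

# Moving (interlaced-looking) content: even lines bright, odd lines dark.
X = [[200 if j % 2 == 0 else 50] * 16 for j in range(32)]
print(select_mode(X))  # large frame diff, zero field diff -> "field"
```

On a stationary (uniform) block both differences are small and the rule falls back to frame mode, matching the reasoning above.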
- Similarly, the coding costs (e.g., the motion estimation costs) of a MB pair in frame and field modes may also be used to determine whether a frame or a field coding mode is more suitable for a particular MB pair. In one embodiment, the coding cost (J) is defined as:
-
J = SAD + λ × f(MV, refIdx, mbType)   (Eq. 1)
- where SAD is a difference measurement between the pixels and their (temporal or spatial) predictions for every MB or sub-MB partition. Namely, pixel predictions come from either a temporal prediction or a spatial prediction. In one embodiment, SAD may also represent the distortion present in an MB pair. Similarly, the MV variable represents motion vectors, refIdx is the reference picture index, and mbType is the type of macroblock or MB partition. An MB partition may include any sub-macroblock configuration derived from a 16×16 MB, such as a 4×4 block, an 8×8 block, or the like. The function portion of the equation, which attempts to serve as a representation of the number of coding bits, varies in accordance with the type of coding process that is conducted. For example, if the coding cost for intra prediction coding is needed, then the function does not consider the motion vector or refIdx variables. The variable λ represents a constant depending upon the quantization parameter and other coding parameters. In one embodiment, the variable λ may be utilized to nullify the units resulting from the function portion of the equation so that J results in a unitless value. The coding cost can be measured in either frame or field mode. As demonstrated below, a frame/field mode selection of an MB pair or picture can be based upon the frame and field coding costs.
-
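Eq. 1 and the cost-based frame/field selection it supports can be sketched as follows. This is a hedged illustration: the function names, the flat bits model, and all cost numbers are made-up stand-ins; the patent only defines J = SAD + λ × f(MV, refIdx, mbType) and the compare-the-minima rule.

```python
# Illustrative sketch of Eq. 1 and the frame/field cost comparison.

def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(a - b) for a, b in zip(block, prediction))

def coding_cost(block, prediction, lam, bits_model, mv=None, ref_idx=None, mb_type=None):
    """J = SAD + lambda * f(MV, refIdx, mbType) (Eq. 1)."""
    return sad(block, prediction) + lam * bits_model(mv, ref_idx, mb_type)

def mb_pair_mode(frame_costs_top, frame_costs_bot, field_costs_top, field_costs_bot):
    """Each argument maps partition type -> cost J for one MB. Take the minimum
    per MB, sum top + bottom, and pick the cheaper of frame vs field coding."""
    j_frm_min = min(frame_costs_top.values()) + min(frame_costs_bot.values())
    j_fld_min = min(field_costs_top.values()) + min(field_costs_bot.values())
    return "field" if j_frm_min > j_fld_min else "frame"

def picture_mode(pair_minima):
    """Picture-level variant: sum the per-MB-pair minima over the whole picture."""
    j_sum_frm = sum(frm for frm, _ in pair_minima)
    j_sum_fld = sum(fld for _, fld in pair_minima)
    return "field" if j_sum_frm > j_sum_fld else "frame"

# Toy cost: SAD = 1+0+1+0 = 2, flat 6-bit model, lambda = 0.5 -> J = 5.0.
J = coding_cost([10, 12, 8, 9], [9, 12, 9, 9], lam=0.5,
                bits_model=lambda mv, r, t: 6, mv=(1, -1))
print(J)  # 5.0

# Made-up per-partition costs: field coding is cheaper for this MB pair.
frm_top, frm_bot = {"16x16": 120, "16x8": 110}, {"16x16": 115, "8x8": 118}
fld_top, fld_bot = {"16x16": 90, "16x8": 95}, {"16x16": 88, "8x8": 99}
print(mb_pair_mode(frm_top, frm_bot, fld_top, fld_bot))   # 225 vs 178 -> field
print(picture_mode([(225, 178), (140, 150), (200, 190)]))  # 565 vs 518 -> field
```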
FIG. 2 is a flow diagram that depicts an exemplary embodiment of the present invention. Namely, method 200 describes the steps by which a coding mode is determined in one embodiment of the present invention. The method 200 begins at step 202 and proceeds to step 204, where at least one block (e.g., a macroblock or a macroblock pair) of a signal to be encoded is received.
- At step 206, a frame vertical pixel difference in the at least one block is determined.
- At step 208, a field vertical pixel difference in the at least one block is determined.
- At step 210, the frame vertical pixel difference is compared with the field vertical pixel difference to determine a first coding mode for the at least one block.
- At step 211, method 200 performs the optional step of computing the motion estimation (ME) cost for both the frame coding mode and the field coding mode; the two ME costs are then compared to further assist in determining a proper coding mode for the at least one block. This step is described in detail below. The method 200 ends at step 212.
- It should be noted that although FIG. 2 illustrates the motion estimation (ME) cost computation as being performed after the vertical pixel difference computation, this is only illustrative. In other words, the ME cost computation and the vertical pixel difference computation can be implemented in combination or separately. As such, the two computations can be performed in parallel or in any sequential order as required for a particular implementation. Thus, an implementation of the present invention may use the ME cost computation without the vertical pixel difference computation, or vice versa. -
FIG. 3 is a flow diagram depicting an exemplary embodiment of a method 300 for utilizing a frame and field vertical pixel comparison in accordance with one or more aspects of the invention. Although the following method specifically describes the processing of a block on an MB pair level, the method 300 can be similarly applied to the processing of a picture on a picture level.
- The method 300 begins at step 302 and proceeds to step 304, where the sums of the absolute frame and field pixel differences in the vertical direction are calculated. In one embodiment, the vertical frame pixel difference is determined to be:

Δ_FRM = Σ_(i,j) | X_(i,j) − X_(i,j+1) |

- where i and j are the pixel horizontal and vertical indices, and (i, j) range over the MB pair. This formula essentially involves the determination of the difference between two lines in the same frame picture (i.e., X_(i,j) represents a first line and X_(i,j+1) represents the next line of the same MB). Similarly, in one embodiment, the vertical field pixel difference is determined to be:

Δ_FLD = Σ_(i,j) | X_(i,2j) − X_(i,2j+2) | + Σ_(i,j) | X_(i,2j+1) − X_(i,2j+3) |

- This formula pertains to the sum of differences between lines within each field of a MB. More specifically, the first component of the equation deals with the difference between two lines in a first field of a MB (e.g., the top field of the MB), and the second component of the equation deals with the difference between two lines in a second field of the same MB (e.g., the bottom field of the same MB).
- At step 306, the frame and field pixel differences in the vertical direction are compared. Namely, in one embodiment, if the frame pixel difference is greater than the field pixel difference (i.e., Δ_FRM > Δ_FLD), then the method 300 continues to step 310. At step 310, field coding mode is selected for the MB pair. Conversely, if the frame pixel difference is not greater than the field pixel difference, then the method 300 continues to step 308. At step 308, frame coding mode is selected for the MB pair. The method 300 ends at step 312.
- The present invention also employs an optional procedure for utilizing a coding cost to determine the use of frame or field mode for an I-picture. An
exemplary method 400 depicts one embodiment of such a process for an I-picture. Namely, FIG. 4 is a flow diagram depicting a method 400 for determining a coding mode in accordance with one or more aspects of the invention.
- The method 400 begins at step 402 and proceeds to step 404, where at least one intra frame prediction is performed. In one embodiment, an intra frame prediction is performed for all of the possible prediction directions for an intra 4×4 sub-MB partition, an intra 8×8 sub-MB partition, and an intra 16×16 macroblock for each MB of a given MB pair. For example, both the intra 4×4 and intra 8×8 partitions have nine directions to be considered, while the intra 16×16 MB has four directions to be considered. Because this step is conducted in both the frame and field coding modes for both the top and bottom MB of the MB pair, a total of 736 direction calculations (i.e., (9 directions × 16 4×4 blocks + 9 directions × 4 8×8 blocks + 4 directions × one 16×16 block) × 2 MBs in a MB pair × 2 coding modes = 736) may take place per MB pair in at least one embodiment.
- At step 406, in one embodiment, a minimum cost for each MB of a MB pair is determined. In one embodiment, the minimum costs of the two MBs in the MB pair are added together for both the frame mode and the field mode. More specifically, each of the directions calculated in step 404 (along with the respective block type information) is applied to the coding cost formula (see Eq. 1). Since method 400 pertains to I-pictures only, the coding cost formula does not consider temporal predictions or motion estimation. Afterwards, a minimum cost is selected for each MB of the MB pair. For example, a first minimum cost (regardless of the block type used to determine that minimum cost) is selected for the top MB, and a second minimum cost (regardless of the block type used to determine that minimum cost) is selected for the bottom MB of the MB pair. A final minimum cost is then calculated by summing the minimum cost for the top MB with the minimum cost of the bottom MB. Notably, this calculation is conducted for both the frame coding mode and the field coding mode so that two separate final minimum costs, i.e., the minimum field cost (J_FLDmin) and the minimum frame cost (J_FRMmin), are respectively determined.
- At step 408, the calculated J_FLDmin and J_FRMmin are compared. If J_FRMmin is not found to be greater than J_FLDmin, then the method 400 proceeds to step 410, where the frame coding mode is selected for the MB pair. Alternatively, if J_FRMmin is found to be greater than J_FLDmin, then the method 400 proceeds to step 412, where the field coding mode is selected for the MB pair. The method 400 ends at step 414.
- In an alternative embodiment, the
method 400 may also be used to determine a minimum coding cost on the picture level (as opposed to the MB pair level). Notably, the alternative method is identical to method 400 with the exception that after step 406, all of the minimum frame coding costs and the minimum field coding costs per MB pair are summed separately over the entire picture. For example,

J_sum_FRMmin = Σ (over all MB pairs in the picture) J_FRMmin
J_sum_FLDmin = Σ (over all MB pairs in the picture) J_FLDmin

- Similarly, step 408 of method 400 would be replaced with the comparison of J_sum_FRMmin and J_sum_FLDmin. Consequently, if J_sum_FRMmin is found to be greater than J_sum_FLDmin, then the field coding mode is selected for the picture. Alternatively, if J_sum_FRMmin is not found to be greater than J_sum_FLDmin, then the frame coding mode is selected for the picture.
- The present invention also employs a procedure for utilizing a coding cost to determine the use of frame or field mode for P-pictures or B-pictures. An
exemplary method 500 depicts one embodiment of such a process. Namely, FIG. 5 is a flow diagram depicting a method 500 for determining a coding mode in accordance with one or more aspects of the invention.
- In one embodiment, the method 500 begins at step 502 and proceeds to step 504, where motion estimations (MEs) are performed for all possible MB or sub-MB partitions for each MB of a MB pair, in both frame mode and field mode. In one embodiment, an inter frame prediction is performed for all of the possible prediction directions for a 4×4 sub-MB partition, a 4×8 sub-MB partition, an 8×4 sub-MB partition, an 8×8 sub-MB partition, an 8×16 sub-MB partition, a 16×8 sub-MB partition, and a 16×16 macroblock for each of the two MBs of a MB pair.
- At step 506, the MB/sub-MB partition type with the minimum coding cost is found for each MB of a MB pair in both the frame coding mode and the field coding mode. In one embodiment, the present invention calculates a motion estimation cost using the coding cost formula (i.e., Eq. 1). Notably, the formula is applied to each of the seven different types of MB/sub-MB partitions twice, once in the frame mode and then in the field mode, so that a minimum ME cost for both frame coding and field coding is calculated for each MB/sub-MB partition.
- Afterwards, a minimum ME cost is selected for each MB of the MB pair. For example, a first minimum ME cost (regardless of the block type used to determine that minimum cost) is selected for the top MB, and a second minimum ME cost (regardless of the block type used to determine that minimum cost) is selected for the bottom MB of the MB pair. A final minimum ME cost is then calculated by summing the minimum cost for the top MB with the minimum cost of the bottom MB. Notably, this calculation is conducted for both the frame coding mode and the field coding mode so that two separate final minimum costs, i.e., the minimum field cost (J_FLDmin) and the minimum frame cost (J_FRMmin), are respectively determined.
- At step 508, the calculated J_FLDmin and J_FRMmin are compared. If J_FRMmin is found to be greater than J_FLDmin, then the method 500 proceeds to step 510, where the field mode is selected for the MB pair. Alternatively, if J_FRMmin is not found to be greater than J_FLDmin, then the method 500 proceeds to step 512, where the frame mode is selected for the MB pair. The method 500 ends at step 514.
- In an alternative embodiment, the
method 500 may also be used to determine a minimum coding cost for a P-picture or B-picture on the picture level (as opposed to the MB pair level). Notably, the alternative method is identical to method 500 with the exception that after step 506, all of the minimum frame coding costs and minimum field coding costs are summed separately over the entire picture. For example,

J_sum_FRMmin = Σ (over all MB pairs in the picture) J_FRMmin
J_sum_FLDmin = Σ (over all MB pairs in the picture) J_FLDmin

- Similarly, step 508 of method 500 would be replaced with the comparison of J_sum_FRMmin and J_sum_FLDmin. Specifically, if J_sum_FRMmin is found to be greater than J_sum_FLDmin, then the field coding mode is selected for the picture. Alternatively, if J_sum_FRMmin is not found to be greater than J_sum_FLDmin, then the frame coding mode is selected for the picture.
- In one embodiment, on the MB pair level, the present invention determines if the outcomes of the frame/field mode selection process based on vertical pixel difference (e.g.,
method 400 for I pictures and method 500 for P and B pictures) are the same. If the results are indeed the same, then the result is considered final. If the results are different or there is some type of discrepancy, then additional criteria or calculations may be required. For instance, the following formula could be used to determine whether frame or field coding should be implemented:
- where α is a constant ranging from 0 to 1.0. If the above formula holds true, then the result from the coding cost formula should be used. Otherwise, the result from the frame/field vertical pixel difference comparison should be used.
- In an alternative embodiment, the present invention determines if the outcomes of the frame/field mode selection process and the coding cost process, as processed on the picture level, are the same. If the results are indeed are the same, then the result is considered final. If the results are different or there is some type of discrepancy, then additional criteria may be required. For example, the final decision on the frame and field mode per picture may be determined using Table 1 below. The final decision is based on the decision from the above comparisons of the aforementioned alternative embodiments.
-
TABLE 1

| Decision of Approach 1 | Decision of Approach 2 | Final Decision |
|---|---|---|
| Frame | Frame | Frame |
| Frame | Field | Frame |
| Field | Frame | Frame |
| Field | Field | Field |
-
FIG. 6 is a block diagram depicting an exemplary embodiment of a video encoder 600 in accordance with one or more aspects of the invention. The video encoder 600 includes a processor 601, a memory 603, various support circuits 604, and an I/O interface 602. The processor 601 may be any type of processing element known in the art, such as a microcontroller, digital signal processor (DSP), instruction-set processor, dedicated processing logic, or the like. The support circuits 604 for the processor 601 include conventional clock circuits, data registers, I/O interfaces, and the like. The I/O interface 602 may be directly coupled to the memory 603 or coupled through the processor 601. The I/O interface 602 may be coupled to a frame buffer and a motion compensator, as well as to receive input frames. The memory 603 may include one or more of the following: random access memory, read only memory, magneto-resistive read/write memory, optical read/write memory, cache memory, magnetic read/write memory, and the like, as well as signal-bearing media as described below.
- In one embodiment, the memory 603 stores processor-executable instructions and/or data that may be executed by and/or used by the processor 601 as described further below. These processor-executable instructions may comprise hardware, firmware, software, and the like, or some combination thereof. Modules having processor-executable instructions that are stored in the memory 603 may include a mode selection module 612. For example, the mode selection module 612 is configured to perform the methods 200, 300, 400, and 500 of FIGS. 2, 3, 4, and 5, respectively. Although one or more aspects of the invention are disclosed as being implemented as a processor executing a software program, those skilled in the art will appreciate that the invention may be implemented in hardware, software, or a combination of hardware and software. Such implementations may include a number of processors independently executing various programs and dedicated hardware, such as ASICs.
- An aspect of the invention is implemented as a program product for execution by a processor. Program(s) of the program product define functions of embodiments and can be contained on a variety of signal-bearing media (computer readable media), which include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer, such as CD-ROM or DVD-ROM disks readable by a CD-ROM drive or a DVD drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive, or read/writable CD or read/writable DVD); or (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications. The latter embodiment specifically includes information downloaded from the Internet and other networks. Such signal-bearing media, when carrying computer-readable instructions that direct functions of the invention, represent embodiments of the invention.
- While the foregoing is directed to illustrative embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/164,653 US9270985B2 (en) | 2007-12-17 | 2014-01-27 | Method and apparatus for selecting a coding mode |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/957,750 US8670484B2 (en) | 2007-12-17 | 2007-12-17 | Method and apparatus for selecting a coding mode |
US14/164,653 US9270985B2 (en) | 2007-12-17 | 2014-01-27 | Method and apparatus for selecting a coding mode |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/957,750 Continuation US8670484B2 (en) | 2007-12-17 | 2007-12-17 | Method and apparatus for selecting a coding mode |
Publications (2)
Publication Number | Publication Date |
---|---|
US20140140402A1 true US20140140402A1 (en) | 2014-05-22 |
US9270985B2 US9270985B2 (en) | 2016-02-23 |
Family
ID=40753220
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/957,750 Active 2032-04-19 US8670484B2 (en) | 2007-12-17 | 2007-12-17 | Method and apparatus for selecting a coding mode |
US14/164,653 Active US9270985B2 (en) | 2007-12-17 | 2014-01-27 | Method and apparatus for selecting a coding mode |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/957,750 Active 2032-04-19 US8670484B2 (en) | 2007-12-17 | 2007-12-17 | Method and apparatus for selecting a coding mode |
Country Status (4)
Country | Link |
---|---|
US (2) | US8670484B2 (en) |
EP (1) | EP2235940B1 (en) |
CA (1) | CA2706711C (en) |
WO (1) | WO2009079318A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8670484B2 (en) * | 2007-12-17 | 2014-03-11 | General Instrument Corporation | Method and apparatus for selecting a coding mode |
JP2010016806A (en) * | 2008-06-04 | 2010-01-21 | Panasonic Corp | Frame coding and field coding determination method, image coding method, image coding apparatus, and program |
JP5163429B2 (en) * | 2008-11-05 | 2013-03-13 | ソニー株式会社 | Motion vector detection apparatus, processing method thereof, and program |
JP5759269B2 (en) * | 2011-06-01 | 2015-08-05 | 株式会社日立国際電気 | Video encoding device |
KR101277354B1 (en) * | 2011-12-21 | 2013-06-20 | Intel Corporation | Perceptual lossless compression of image data to reduce memory bandwidth and storage
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1494483A2 (en) * | 2003-07-04 | 2005-01-05 | Nextream France | Video coder with control of GOP structure by spatial and temporal activity |
US20060050786A1 (en) * | 2004-09-09 | 2006-03-09 | Kabushiki Kaisha Toshiba | Moving picture coding apparatus and computer program product |
US20080260022A1 (en) * | 2007-04-20 | 2008-10-23 | Media Tek Inc. | Method for making macroblock adaptive frame/field decision |
US20090086820A1 (en) * | 2007-09-28 | 2009-04-02 | Edward Hong | Shared memory with contemporaneous access for use in video encoding and methods for use therewith |
US8670484B2 (en) * | 2007-12-17 | 2014-03-11 | General Instrument Corporation | Method and apparatus for selecting a coding mode |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5742346A (en) * | 1994-08-09 | 1998-04-21 | Picture Tel Corporation | Spatially adaptive blur filter |
JPH08275160A (en) * | 1995-03-27 | 1996-10-18 | Internatl Business Mach Corp <Ibm> | Discrete cosine conversion method |
US5878166A (en) * | 1995-12-26 | 1999-03-02 | C-Cube Microsystems | Field frame macroblock encoding decision |
US7092442B2 (en) * | 2002-12-19 | 2006-08-15 | Mitsubishi Electric Research Laboratories, Inc. | System and method for adaptive field and frame video encoding using motion activity |
US20060198439A1 (en) * | 2005-03-01 | 2006-09-07 | Qin-Fan Zhu | Method and system for mode decision in a video encoder |
-
2007
- 2007-12-17 US US11/957,750 patent/US8670484B2/en active Active
-
2008
- 2008-12-11 CA CA2706711A patent/CA2706711C/en not_active Expired - Fee Related
- 2008-12-11 WO PCT/US2008/086340 patent/WO2009079318A1/en active Application Filing
- 2008-12-11 EP EP08863281.5A patent/EP2235940B1/en not_active Not-in-force
-
2014
- 2014-01-27 US US14/164,653 patent/US9270985B2/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1494483A2 (en) * | 2003-07-04 | 2005-01-05 | Nextream France | Video coder with control of GOP structure by spatial and temporal activity |
US20060050786A1 (en) * | 2004-09-09 | 2006-03-09 | Kabushiki Kaisha Toshiba | Moving picture coding apparatus and computer program product |
US20080260022A1 (en) * | 2007-04-20 | 2008-10-23 | Media Tek Inc. | Method for making macroblock adaptive frame/field decision |
US20090086820A1 (en) * | 2007-09-28 | 2009-04-02 | Edward Hong | Shared memory with contemporaneous access for use in video encoding and methods for use therewith |
US8670484B2 (en) * | 2007-12-17 | 2014-03-11 | General Instrument Corporation | Method and apparatus for selecting a coding mode |
Also Published As
Publication number | Publication date |
---|---|
EP2235940B1 (en) | 2019-02-20 |
WO2009079318A1 (en) | 2009-06-25 |
US9270985B2 (en) | 2016-02-23 |
US8670484B2 (en) | 2014-03-11 |
CA2706711C (en) | 2013-12-10 |
CA2706711A1 (en) | 2009-06-25 |
EP2235940A4 (en) | 2013-05-01 |
US20090154555A1 (en) | 2009-06-18 |
EP2235940A1 (en) | 2010-10-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9374577B2 (en) | Method and apparatus for selecting a coding mode | |
US7738714B2 (en) | Method of and apparatus for lossless video encoding and decoding | |
EP1891810B1 (en) | Method and apparatus for coding motion and prediction weighting parameters | |
Kamp et al. | Multihypothesis prediction using decoder-side motion vector derivation in inter-frame video coding | |
JP5491187B2 (en) | Deblocking filtering method and apparatus | |
US9232223B2 (en) | Method for decoding a stream representative of a sequence of pictures, method for coding a sequence of pictures and coded data structure | |
US20070098067A1 (en) | Method and apparatus for video encoding/decoding | |
US20120230405A1 (en) | Video coding methods and video encoders and decoders with localized weighted prediction | |
US20220030249A1 (en) | Image encoding/decoding method and device | |
US20080137726A1 (en) | Method and Apparatus for Real-Time Video Encoding | |
US20090161757A1 (en) | Method and Apparatus for Selecting a Coding Mode for a Block | |
US20110150074A1 (en) | Two-pass encoder | |
KR20080073157A (en) | Method and apparatus for encoding and decoding based on inter prediction | |
US9270985B2 (en) | Method and apparatus for selecting a coding mode | |
US20120147960A1 (en) | Image Processing Apparatus and Method | |
US8050324B2 (en) | Method and apparatus for selecting a reference frame for motion estimation in video encoding | |
Bichon et al. | Inter-block dependencies consideration for intra coding in H.264/AVC and HEVC standards | |
Xin et al. | Combined inter-intra prediction for high definition video coding | |
US20060274832A1 (en) | Device for encoding a video data stream | |
KR101841352B1 (en) | Reference frame selection method and apparatus | |
KR100986992B1 (en) | Fast Inter Mode Decision Method in H.264 Encoding | |
US20080107183A1 (en) | Method and apparatus for detecting zero coefficients | |
KR101895389B1 (en) | Method and Apparatus for image encoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: ARRIS TECHNOLOGY, INC., GEORGIA Free format text: MERGER AND CHANGE OF NAME;ASSIGNORS:GENERAL INSTRUMENT CORPORATION;ARRIS TECHNOLOGY, INC.;REEL/FRAME:035133/0286 Effective date: 20150101 |
|
AS | Assignment |
Owner name: BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT, NORTH CAROLINA Free format text: SECURITY INTEREST;ASSIGNORS:ARRIS GROUP, INC.;ARRIS ENTERPRISES, INC.;ARRIS INTERNATIONAL LIMITED;AND OTHERS;REEL/FRAME:036020/0789 Effective date: 20150618 |
|
AS | Assignment |
Owner name: ARRIS ENTERPRISES, INC., GEORGIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ARRIS TECHNOLOGY, INC;REEL/FRAME:037328/0341 Effective date: 20151214 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: BIG BAND NETWORKS, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARRIS TECHNOLOGY, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: GIC INTERNATIONAL HOLDCO LLC, PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: GIC INTERNATIONAL CAPITAL LLC, PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: JERROLD DC RADIO, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: POWER GUARD, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARCHIE U.S. MERGER LLC, PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: TEXSCAN CORPORATION, PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARRIS INTERNATIONAL LIMITED, PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARRIS GLOBAL SERVICES, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARRIS ENTERPRISES, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARCHIE U.S. HOLDINGS LLC, PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARRIS SOLUTIONS, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARRIS GROUP, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: ARRIS HOLDINGS CORP. OF ILLINOIS, INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 Owner name: NEXTLEVEL SYSTEMS (PUERTO RICO), INC., PENNSYLVANIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:050721/0401 Effective date: 20190404 |
|
AS | Assignment |
Owner name: ARRIS, GEORGIA Free format text: CHANGE OF NAME;ASSIGNOR:ARRIS ENTERPRISES. INC;REEL/FRAME:049669/0652 Effective date: 20151231 |
|
AS | Assignment |
Owner name: JPMORGAN CHASE BANK, N.A., NEW YORK Free format text: TERM LOAN SECURITY AGREEMENT;ASSIGNORS:COMMSCOPE, INC. OF NORTH CAROLINA;COMMSCOPE TECHNOLOGIES LLC;ARRIS ENTERPRISES LLC;AND OTHERS;REEL/FRAME:049905/0504 Effective date: 20190404 Owner name: JPMORGAN CHASE BANK, N.A., NEW YORK Free format text: ABL SECURITY AGREEMENT;ASSIGNORS:COMMSCOPE, INC. OF NORTH CAROLINA;COMMSCOPE TECHNOLOGIES LLC;ARRIS ENTERPRISES LLC;AND OTHERS;REEL/FRAME:049892/0396 Effective date: 20190404 Owner name: WILMINGTON TRUST, NATIONAL ASSOCIATION, AS COLLATERAL AGENT, CONNECTICUT Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:ARRIS ENTERPRISES LLC;REEL/FRAME:049820/0495 Effective date: 20190404 |
|
AS | Assignment |
Owner name: ARRIS ENTERPRISES LLC, GEORGIA Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE'S NAME PREVIOUSLY RECORDED AT REEL: 049820 FRAME: 0495. ASSIGNOR(S) HEREBY CONFIRMS THE CHANGE OF NAME;ASSIGNOR:ARRIS ENTERPRISES, INC.;REEL/FRAME:049858/0161 Effective date: 20151231 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
|
AS | Assignment |
Owner name: WILMINGTON TRUST, DELAWARE Free format text: SECURITY INTEREST;ASSIGNORS:ARRIS SOLUTIONS, INC.;ARRIS ENTERPRISES LLC;COMMSCOPE TECHNOLOGIES LLC;AND OTHERS;REEL/FRAME:060752/0001 Effective date: 20211115 |
|
AS | Assignment |
Owner name: ARRIS ENTERPRISES, INC., GEORGIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ARRIS TECHNOLOGY, INC.;REEL/FRAME:060791/0583 Effective date: 20151214 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |