EP3350996A1 - Image encoding method and equipment for implementing the method - Google Patents
- Publication number
- EP3350996A1 (application EP16766945.6A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- encoding
- blocks
- image
- block
- search area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/43—Hardware specially adapted for motion estimation or compensation
- H04N19/433—Hardware specially adapted for motion estimation or compensation characterised by techniques for memory access
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/436—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/533—Motion estimation using multistep search, e.g. 2D-log search or one-at-a-time search [OTS]
Definitions
- the present invention relates to an image encoding method and a device for implementing this method. It applies in particular to the coding of images of a video stream.
- video data is generally subjected to source coding to compress it, in order to limit the resources required for its transmission and/or storage.
- there are many source coding standards, such as H.264/AVC, H.265/HEVC and MPEG-2, that can be used for this purpose.
- a video stream comprising a set of images is considered.
- the images of the video stream to be encoded are typically considered in an encoding sequence, and each is divided into sets of pixels which are also processed sequentially, for example beginning at the top left and ending at the bottom right of each image.
- the encoding of an image of the stream is thus performed by dividing a matrix of pixels corresponding to the image into several sets, for example blocks of fixed size 16 ⁇ 16, 32 ⁇ 32 or 64 ⁇ 64 pixels, and encoding these blocks of pixels according to a given processing sequence.
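This raster-order division can be sketched as follows (a hypothetical helper, not the patent's implementation; the image and block dimensions are arbitrary illustrative values):

```python
# Illustrative sketch: enumerate the encoding blocks of a width x height
# image in raster order (top-left to bottom-right), as described above.
def block_order(width, height, block_size):
    """Return the (x, y) top-left corners of the blocks in processing order."""
    blocks = []
    for y in range(0, height, block_size):
        for x in range(0, width, block_size):
            blocks.append((x, y))
    return blocks

# A 64 x 32 image with 16 x 16 blocks yields 8 blocks, starting at (0, 0)
# and ending at (48, 16).
order = block_order(64, 32, 16)
print(order[0], order[-1], len(order))
```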
- Some standards, such as H.264/AVC, provide the possibility of splitting blocks of size 16 x 16 pixels (called macroblocks) into sub-blocks, for example of size 8 x 8 or 4 x 4, in order to perform encoding with finer granularity.
- the H.265 / HEVC standard provides for the use of fixed size blocks up to 64 x 64 pixels, which can be partitioned to a minimum size of 8 x 8 pixels.
- the existing video compression techniques can be divided into two main categories: on the one hand, so-called "Intra" compression, in which the compression processing is performed on the pixels of a single image or video frame, and on the other hand, so-called "Inter" compression, in which the compression processing is performed on several images or video frames.
- in "Intra" compression, the processing of a block (or set) of pixels typically comprises a prediction of the pixels of the block made using causal (previously coded) pixels present in the image being encoded (referred to as the "current image"); this is known as "Intra prediction".
- in "Inter" compression, the processing of a block (or set) of pixels typically comprises a prediction of the pixels of the block made using pixels from one or more previously coded images; this is known as "Inter prediction" or "motion compensation".
- This exploitation of the spatial and/or temporal redundancies makes it possible to avoid transmitting or storing the value of the pixels of each block (or set) of pixels, by representing at least some of the blocks by a pixel residual representing the difference (or distance) between the prediction values of the pixels of the block and the actual values of the pixels of the predicted block.
- the pixel residual information is present in the data generated by the encoder after a transform (e.g. a discrete cosine transform, or DCT) and a quantization intended to reduce the entropy of the data generated by the encoder.
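The residual/transform/quantization chain described above can be sketched as follows (a minimal illustration, not the patent's implementation; the 2 x 2 block size and the quantization step are arbitrary, and real coders use larger blocks and standard-defined quantization):

```python
import math

# Sketch: pixel residual, then an orthonormal DCT-II, then uniform
# quantization, illustrating the transform/quantization step above.
def residual(block, prediction):
    return [[b - p for b, p in zip(rb, rp)] for rb, rp in zip(block, prediction)]

def dct2(block):
    """2D orthonormal DCT-II of a square block (rows then columns)."""
    n = len(block)
    def dct1(v):
        return [sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
                * (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
                for k in range(n)]
    rows = [dct1(r) for r in block]
    cols = [dct1([rows[i][j] for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def quantize(coeffs, step):
    return [[round(c / step) for c in row] for row in coeffs]

cur = [[10, 12], [14, 16]]
pred = [[9, 11], [13, 15]]
res = residual(cur, pred)        # a constant residual of 1 per pixel
q = quantize(dct2(res), step=2)  # only the DC coefficient survives
print(res, q)
```

A constant residual concentrates all its energy in the DC coefficient, which is why the transform helps entropy coding.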
- a video encoder typically performs an encoding mode selection corresponding to a selection of encoding parameters for a processed block of pixels. This decision can be implemented by optimizing a rate-distortion metric, the encoding parameters selected by the encoder being those that minimize a rate-distortion criterion. The choice of the encoding mode then has an impact on the performance of the encoder, both in terms of rate gain and of visual quality.
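The rate-distortion decision can be sketched as minimizing J = D + λR over the candidate modes (the per-mode distortion and rate figures below are made-up numbers for illustration only):

```python
# Sketch of a rate-distortion mode decision: among candidate modes, pick
# the one minimizing J = D + lambda * R.
def best_mode(candidates, lam):
    """candidates: dict mapping mode name -> (distortion, rate_bits)."""
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])

modes = {"Intra": (120.0, 40), "Inter": (90.0, 70), "Skip": (200.0, 2)}
print(best_mode(modes, 0.5))  # low lambda favors distortion: Inter wins
print(best_mode(modes, 5.0))  # high lambda favors rate: Skip wins
```

The trade-off parameter λ shifts the decision between visual quality (low λ) and bitrate savings (high λ).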
- the implementation of a video encoder for which real-time processing performance is sought may be performed in the form of a combination of hardware and software elements, such as a software program to be loaded and executed on a hardware component.
- ASIC (Application-Specific Integrated Circuit)
- programmable logic circuits of the FPGA (Field-Programmable Gate Array) type
- An ASIC is a dedicated electronic circuit that brings together custom features for a given application.
- An encoder can also use hybrid architectures, such as CPU+FPGA based architectures, a Graphics Processing Unit (GPU), or a Massively Parallel Processor Array (MPPA).
- video encoders implemented on a dedicated component, especially those that perform processing in parallel, are often limited by the bandwidth available for the transfer of data between the component and an external memory in which the data of the video stream to be encoded are stored.
- This limitation is usually overcome by implementing a cache memory on the component, which thus benefits from a much higher bandwidth than that of an external memory.
- the limitations imposed on the encoder are various. Generally, they result in a limitation of the motion estimation algorithms allowed by the encoder.
- Another object of the present invention is to provide an image encoding method using an improved cache memory for a real-time implementation.
- a method of encoding a first image in a set of images is proposed, wherein the first image is divided into blocks, each block being encoded according to one of a plurality of encoding modes comprising at least one temporal-correlation prediction coding mode using a plurality of images of the set of images, the method comprising, for a current block of the first image: defining, in a second image of the set of images, distinct from the first image and previously coded according to a predefined encoding sequence of the images of the set of images, a unitary search area for motion estimation vectors; loading the data of the unitary search area into a cache memory; determining, by a search in the cached unitary search area, a motion estimation vector of the current block, the motion estimation vector pointing to a block of the search area correlated with the current block; and using the motion estimation vector to decide the encoding of the current block according to one of the plurality of encoding modes; wherein the unitary search area comprises a set of data of the second
- the proposed method optimizes the shape of the search area to be loaded into the cache, so as to minimize the amount of data loaded into the cache memory and not used by subsequent processing, such as, for example, processing related to motion estimation or motion compensation.
- the unitary search area has a substantially ovoid shape.
- the unit search area may also be determined such that at least a portion of the unit search area is substantially in the shape of an ellipsoid portion.
- the ellipsoidal shape advantageously makes it possible to increase the excursion of the vector components without complexity of implementation or use of significant additional resources.
- This embodiment of the proposed method has the advantage of increasing, for the same memory space, the excursion of the components of the motion vectors, without loss due, where applicable, to the configuration shape of the blocks of the group of encoding blocks intended to be encoded in parallel.
- the unit search area may be determined having a substantially ellipsoidal shape.
- the unitary search area may be determined with a contour that defines a polygon of substantially elliptical shape.
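Assuming the elliptical contour discussed above, a sketch of the idea is to load into the cache only the positions whose offsets from the co-located block fall inside an ellipse, rather than the full bounding rectangle (the half-width and half-height below are arbitrary illustrative values):

```python
# Sketch: build an elliptical set of reference-area offsets to load into
# cache, instead of the full bounding rectangle of the search window.
def elliptical_area(half_w, half_h):
    """Return offsets (dx, dy) with (dx/half_w)^2 + (dy/half_h)^2 <= 1."""
    area = []
    for dy in range(-half_h, half_h + 1):
        for dx in range(-half_w, half_w + 1):
            if (dx / half_w) ** 2 + (dy / half_h) ** 2 <= 1.0:
                area.append((dx, dy))
    return area

ellipse = elliptical_area(8, 4)
rectangle = (2 * 8 + 1) * (2 * 4 + 1)  # 153 positions for the bounding box
# The ellipse keeps the full horizontal and vertical excursion (+/-8, +/-4)
# while loading noticeably fewer positions than the rectangle.
print(len(ellipse), rectangle)
```

This is the intuition behind the claimed cache saving: corners of the rectangular window, which rarely hold the best match, are simply not loaded.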
- the proposed method can also advantageously be applied to the case of a group of encoding blocks intended to be encoded in parallel, such as for example a group of 2, 3, or 4 encoding blocks.
- a multiple search area is thus defined for a plurality of encoding blocks, as the union of the unitary search areas respectively corresponding to the encoding blocks of the plurality of encoding blocks.
- the data of the multiple search area are loaded into the cache memory; a search in the multiple search area loaded in the cache memory determines a plurality of estimation vectors respectively corresponding to the encoding blocks of the plurality of encoding blocks, and the estimation vectors thus determined are used for the encoding of the encoding blocks of the plurality of encoding blocks.
- the proposed method can furthermore be adapted to different configuration shapes of the blocks of the group of encoding blocks intended to be encoded in parallel, such as for example the MBAFF configuration of H.264-type encoders (in English, "Macroblock-Adaptive Frame/Field Coding").
- a device for encoding a first image in a set of images comprising: an input interface configured to receive the first image; a video encoding unit, operatively coupled to the input interface, and configured to encode the first image using the proposed method.
- a computer program loadable into a memory associated with a processor, and comprising portions of code for implementing the steps of the proposed method when said program is executed by the processor, as well as a set of data representing, for example by compression or encoding, said computer program.
- Another aspect relates to a non-transitory storage medium for a computer-executable program, comprising a data set representing one or more programs, said one or more programs comprising instructions which, when executed by a computer comprising a processing unit operatively coupled to memory means and an input/output interface module, cause the computer to encode a first image in a set of images according to the proposed method.
- the proposed method is particularly well suited, although not exclusively, to encoding or compressing an image of an image sequence according to an H.264/AVC (Advanced Video Coding) scheme. But it is also suitable for encoding images according to any video coding scheme operating on images divided into blocks, in which the blocks are encoded according to a plurality of coding modes comprising at least one temporal-correlation prediction coding mode using a plurality of images of the video stream to be encoded, such as an H.265/HEVC encoding scheme.
- the proposed method may advantageously be implemented in cases where the temporal-correlation prediction coding mode using a plurality of images of the set of images is of a type using motion prediction from previously coded images (a coding mode referred to in some video coders as "Inter").
- FIG. 1 is a diagram illustrating an encoder type H.264 / AVC
- FIG. 2 is a diagram illustrating the architecture of an encoder for the implementation of the proposed method
- FIGS. 3a, 3b, and 3c are diagrams illustrating Intra prediction modes
- FIG. 4 is a diagram illustrating a median vector determination for coding in Inter prediction mode
- FIG. 5 is a diagram illustrating an encoder architecture using an FPGA component and an external memory
- FIG. 6a is a diagram illustrating a fractional pixel position determined in the context of an Inter prediction according to one embodiment
- FIGS. 6b and 6c are diagrams illustrating a candidate motion vector and a set of vectors tested in the context of an Inter prediction according to one embodiment
- FIG. 7 is a diagram illustrating an encoder architecture for implementing the proposed method
- FIGS. 8a, 8b, 8c, 8d, 8e, and 8f are diagrams illustrating the loading of cached data for the encoding of a pair of encoding blocks
- FIG. 9a is a diagram illustrating the loading of cached data for the encoding of a group of four encoding blocks
- FIGS. 9b and 9c are diagrams illustrating the configuration of a group of four encoding blocks to be encoded in parallel;
- FIGS. 9d and 9e are diagrams illustrating the loading of cached data for the encoding of a group of four encoding blocks
- FIG. 10 is a diagram illustrating the proposed method according to an implementation mode
- FIGS. 11a, 11b, 11c, 11d, 11e, 11f, 11g, 11h and 11i are diagrams illustrating different configurations of search areas according to different implementation modes.
- the terms "pixel" and "sample" are used interchangeably to designate an element of a digital image.
- the proposed method can be implemented by any type of image encoder of a set of images, such as, for example, a video codec conforming to H.264 / AVC, H.265 / HEVC, and / or MPEG standards. -2.
- an H.264 / AVC type encoder using macroblocks (MB) of size 16 ⁇ 16 pixels can be adapted to an H.265 / HEVC type encoder by replacing the MB16x16 by blocks of type CTB16, CTB32 and CTB64, of respective sizes 16x16, 32x32 and 64x64, defined by the standard HEVC.
- FIG. 1 illustrates an exemplary encoder architecture (10) of the H.264/AVC type.
- An image stream to be encoded (Fn) is provided at the input of the encoder (10).
- Each image Fn (11) of the input stream is divided into macroblocks of size 16 x 16 pixels, to be encoded according to a predetermined macroblock encoding sequence, for example from top to bottom and from left to right.
- Macroblocks are predicted using causal pixels (previously coded) present in the current image ("Intra" prediction), or using pixels from one or more previously coded images ("Inter" prediction). This exploitation of the spatial and temporal redundancies makes it possible to represent the coding units by a pixel residual as small as possible, which is then transmitted to the decoder, possibly after transformation and quantization.
- Each macroblock to be encoded is provided as input to a motion estimation unit (12) ("ME", for "Motion Estimation"), which generates data relating to the motion of the block being encoded relative to one or more previously encoded images F'n-1 (13), commonly referred to as reference images, which are also input to the motion estimation unit (12).
- the motion data produced by the motion estimation unit is supplied to a motion compensation (or Inter prediction) unit (14) ("MC", for "Motion Compensation"), which further receives as input the reference image(s) used by the motion estimation unit (12).
- the motion compensation unit (14) generates Inter prediction data, which is provided to an encoding decision unit (15).
- the data of the block to be encoded are also provided to an Intra prediction choice unit (16) which evaluates different neighboring blocks of the block to be encoded in the current image as part of the Intra prediction.
- the Intra prediction choice unit (16) provides, as input to an Intra prediction unit (17), data of one or more neighboring blocks of the current block (being encoded) for the Intra prediction, and the Intra prediction unit (17) outputs Intra prediction data, which is provided to the encoding decision unit (15), which selects an Inter-type prediction or an Intra-type prediction according to the prediction data received for these two modes.
- the Intra prediction choice unit (16) and the Intra prediction unit (17) receive as input encoded image data uF'n.
- a determination (18) of residual D n is made from the data of the current image (for the current block) F n and the prediction data selected by the encoding decision-making unit (15).
- This pixel residual is then processed by transformation (T) (19) and quantization (Q) (20), and the quantized data (X) is encoded by entropy encoding (21) to generate an encoded stream (NAL).
- An image reconstruction loop based on the encoding data retrieves the quantized data (X) for processing by inverse quantization (Q⁻¹) (22) and inverse transform (T⁻¹) (23) operations.
- the inverse operation (24) of the residual determination operation is further applied to reconstruct already-encoded blocks uF'n, which will be used by the Intra prediction units to provide data of neighboring blocks of the block being encoded. The reconstructed blocks will then be filtered (25) for the reconstruction of whole images (26) F'n, which will provide the reference images for the Inter prediction units.
- the images are sequentially considered and divided into sets of sequentially processed pixels starting at the top left and ending at the bottom right.
- These sets of pixels are called "coding units" in the HEVC standard, and are of maximum size 64 x 64 pixels; the coding units of this size are called "Largest Coding Units", or "LCU".
- These sets of pixels are predicted using causal pixels (previously coded) present in the current image ("Intra" prediction), or using pixels from one or more previously coded images ("Inter" prediction). This exploitation of the spatial and temporal redundancies makes it possible to represent the coding units by a pixel residual as small as possible, which is then transmitted to the decoder, possibly after transformation and quantization.
- the encoder 100 receives at input 109 an input video stream 101 comprising a plurality of images to be processed in order to perform the encoding of the stream.
- the encoder 100 comprises a controller 102, operatively coupled to the input interface 109, which drives a motion pre-estimation unit (PRE-ME) 112, a motion estimation unit (ME) 110 and a motion compensation (MC) prediction unit 104 for Inter, Merge and/or Skip type predictions (described below), as well as an Intra mode prediction unit 103.
- the data received on the input interface 109 are transmitted to the input of the Intra mode prediction unit 103, the motion pre-estimation unit 112, and the controller 102.
- the assembly formed by the controller 102, the motion estimation unit 110, the prediction unit 104 for the Inter, Merge and Skip predictions, and the Intra mode prediction unit 103 forms an encoding unit 111 operably coupled to the input interface 109.
- the encoding unit 111 is further operably coupled to a memory unit 113, for example of the RAM type, through the controller 102 in the example illustrated in FIG. 2.
- the Intra mode prediction unit 103 generates Intra prediction data 107 which is inputted to an entropy encoder 105.
- the motion pre-estimation unit 112 generates, for an encoding block, a list of potential candidate vectors for the Inter decision, which is provided to the motion estimation unit 110.
- the motion estimation unit 110 and the Inter/Merge/Skip prediction unit 104 perform a refinement of the potential candidate vectors and then select a best candidate.
- the Inter / Merge / Skip mode prediction unit 104 generates Inter, Merge, or Skip prediction data 106 that is inputted to the entropy encoder 105.
- the data supplied to the decoder for an Inter-type prediction may include a residual of pixels and information concerning one or more motion vectors.
- This information relating to one or more motion vectors may comprise one or more indices identifying a predictor vector in a list of predictor vectors known to the decoder.
- the data provided to the decoder for a Skip type prediction will typically not include any residual pixels, and may also include information identifying a predictor vector in a list of predictors known to the decoder.
- the list of predictor vectors used for Inter coding will not necessarily be the same as the list of predictor vectors used for Skip coding.
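The predictor vectors mentioned above are often derived from neighboring blocks; a common H.264-style choice is the component-wise median of the motion vectors of the left, top and top-right neighbors. A minimal sketch, assuming three such neighbors are available:

```python
# Sketch: median motion-vector predictor computed from three neighboring
# blocks (left, top, top-right), in the style of H.264 coders.
def median_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three neighboring motion vectors."""
    med = lambda a, b, c: sorted((a, b, c))[1]
    return (med(mv_a[0], mv_b[0], mv_c[0]), med(mv_a[1], mv_b[1], mv_c[1]))

mvp = median_predictor((2, 0), (4, -2), (3, 5))
print(mvp)  # (3, 0)
```

Only the difference between the actual motion vector and this predictor needs to be transmitted, which is what keeps the motion information cheap to code.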
- the controller 102 generates control data 108 which is also inputted to the entropy encoder 105.
- the controller 102 is configured to drive the Intra mode prediction unit 103 and the Inter/Merge/Skip mode prediction unit 104 so as to control the prediction data which are respectively supplied to the entropy coder 105 by the Intra mode prediction unit 103 and the Inter/Merge/Skip mode prediction unit 104.
- the controller 102 may further be configured to select, from among the different types of prediction mode (Intra mode, Inter mode, Merge mode or Skip mode, depending on the coding modes implemented in the encoding unit 111), the mode for which prediction data will be transmitted to the entropy coder 105.
- the encoding scheme may include a decision for each processed encoding block to select the type of prediction for which data will be transmitted to the entropy encoder 105.
- This choice will typically be implemented by the controller, to decide whether to apply the Inter prediction mode, the Intra prediction mode, the Merge prediction mode or the Skip prediction mode to the current block (or coding unit) being processed.
- This makes it possible to control the sending to the entropy coder of Intra prediction data 107 or of Inter, Merge or Skip prediction data 106, as a function of the decision made by the controller 102.
- the encoder 100 may be a computer, a computer network, an electronic component, or another apparatus having a processor operatively coupled to a memory and, depending on the selected embodiment, a data storage unit, and other associated hardware elements such as a network interface and a media drive for reading from and writing to a removable storage medium (not shown in the figure).
- the removable storage medium may be, for example, a compact disc (CD), a digital video / versatile disc (DVD), a flash disk, a USB key, etc.
- the memory, the data storage unit, or the removable storage medium contains instructions that, when executed by the controller 102, cause the controller 102 to perform or control the interface portions of the device.
- the controller 102 may be a component implementing a processor or a calculation unit for encoding images according to the proposed method and for controlling the units 109, 110, 112, 103, 104, 105 of the encoder 100.
- the encoder 100 can be implemented in software form, as described above, in which case it takes the form of a program executable by a processor, or in hardware form, as an application-specific integrated circuit (ASIC) or a system-on-chip (SoC), or in the form of a combination of hardware and software elements, such as for example a software program intended to be loaded and executed on an FPGA (Field-Programmable Gate Array) component.
- systems-on-chip (SoC) are embedded systems that integrate all the components of an electronic system into a single chip.
- An encoder can also use hybrid architectures, such as CPU+FPGA based architectures, a Graphics Processing Unit (GPU), or a Massively Parallel Processor Array (MPPA).
- the image being processed is divided into encoding blocks or coding units (in English "Coding Unit” or CU), the shape and size of which are determined in particular as a function of the size of the matrix of pixels representing the image, for example square macroblocks of 16 x 16 pixels.
- a set of blocks is defined for which a processing sequence (also called “processing path") is defined.
- for blocks of square shape, one can for example process the blocks of the current image starting with the one located at the top left of the image, followed by the one immediately to its right, and so on until reaching the end of the first row of blocks, then moving to the left-most block of the row of blocks immediately below, the processing ending with the bottom-right-most block of the image.
- the processing of the current block may include partitioning the block into sub-blocks, in order to process the block with a finer spatial granularity than that obtained with the block.
- the processing of a block furthermore comprises the prediction of the pixels of the block, by exploiting the spatial correlation (in the same image) or the temporal correlation (in one or more other previously coded images) between the pixels.
- the prediction of the pixels of the block typically comprises selecting a prediction type for the block and the prediction information corresponding to the selected type, the whole forming a set of encoding parameters.
- the prediction of the processed pixel block makes it possible to calculate a residual of pixels, which corresponds to the difference between the pixels of the current block and the pixels of the prediction block, and is in some cases transmitted to the decoder after transformation and quantization.
- This coding information 106-108 may notably comprise the coding mode (for example the particular type of predictive coding among the "Intra” and “Inter” codings, or among the "Intra", “Inter”, “Merge” and “Skip” described below), the partitioning (in the case of one or more blocks partitioned into sub-blocks), and a motion information 106 in the case of an "Inter” type predictive coding, "Merge” or “Skip” and an Intra 107 prediction mode in the case of an "Intra” type predictive coding.
- the coding modes "Inter", “Skip” and “Merge” these last two pieces of information can also be predicted in order to reduce their coding cost, for example by exploiting the information of the neighboring blocks of the current block.
- the HEVC standard uses a quadtree coding scheme, described below, combined with a dynamic block size selection.
- HEVC allows partitioning of each current image into blocks ranging in size from 64 x 64 pixels to 8 x 8 pixels.
- the video stream to be encoded can thus be traversed with blocks of 64 x 64, each block of size 64 x 64 can be cut into blocks of smaller size (the finest cutting authorized being that in 8 x 8 blocks, each of size 8 x 8 pixels).
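A sketch of this quadtree traversal, with a caller-supplied split predicate standing in for the encoder's proprietary criterion (as noted above, the standard defines the partitioning syntax, not the split decision):

```python
# Sketch: recursively split a block from 64 x 64 down to a minimum of
# 8 x 8, quadtree-style. `should_split(x, y, size)` stands in for the
# encoder's own (e.g. rate-distortion based) decision.
def quadtree(x, y, size, should_split, min_size=8):
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for (ox, oy) in ((0, 0), (half, 0), (0, half), (half, half)):
            leaves += quadtree(x + ox, y + oy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]

# Splitting everything down to 16 x 16: a 64 x 64 block yields 16 leaves.
leaves = quadtree(0, 0, 64, lambda x, y, s: s > 16)
print(len(leaves), leaves[0])
```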
- the encoder typically chooses the size of blocks used according to proprietary criteria that are not defined by the standard.
- the video encoder may further use a YCbCr-type representation of the color space of the video signals, with a sampling that may be 4:2:2 or 4:2:0 (chroma subsampling).
- the video signal to be encoded carries one piece of luminance information (Y signal) and two pieces of chrominance information (Cb and Cr signals).
- the samples of each component (Y, Cb, Cr) can be encoded on 8 bits, 10 bits or more.
- in the 4:2:2 representation, an image whose luminance component comprises H x L pixels (or samples) has chrominance components of size H/2 x L each, which amounts to subsampling the colors in the horizontal direction only.
- the 4:2:2 representation corresponds to the so-called SDI (Serial Digital Interface) signal format.
- in the 4:2:0 representation, an image whose luminance component comprises H x L pixels (or samples) has chrominance components of size H/2 x L/2 each, which amounts to subsampling the colors in both the horizontal and vertical directions.
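The chrominance plane sizes for these samplings can be summarized with a small illustrative helper, using the H x L convention of the text (H being the horizontally subsampled dimension):

```python
# Sketch: chrominance plane dimensions for the YCbCr samplings discussed
# above, relative to an H x L luminance plane.
def chroma_size(h, l, sampling):
    if sampling == "4:2:2":   # horizontal subsampling only
        return (h // 2, l)
    if sampling == "4:2:0":   # horizontal and vertical subsampling
        return (h // 2, l // 2)
    if sampling == "4:4:4":   # no subsampling
        return (h, l)
    raise ValueError(sampling)

print(chroma_size(1920, 1080, "4:2:2"))  # (960, 1080)
print(chroma_size(1920, 1080, "4:2:0"))  # (960, 540)
```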
- the predictive coding in "Intra” mode includes a prediction of the pixels of a block (or set) of pixels being processed using the previously coded pixels of the current image.
- in the DC prediction mode, the values of the neighboring pixels of the current block belonging to previously coded blocks are used, and an average of the values of these neighboring pixels is calculated.
- the predictive block is constructed using for each pixel the average value obtained.
- the two sets of 8 neighboring pixels 201, 202, belonging to the neighboring block disposed to the left of the current block and to the neighboring block disposed above the current block, are used.
- An average value M of the values of these 16 pixels, which is used to fill the pixel values of the predictive block 200, is calculated.
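A minimal sketch of this DC prediction for an 8 x 8 block (the neighboring pixel values below are arbitrary):

```python
# Sketch of DC Intra prediction: average the 8 pixels above and the
# 8 pixels to the left, then fill the predictive block with that mean.
def dc_prediction(above, left, size=8):
    m = round(sum(above + left) / (len(above) + len(left)))
    return [[m] * size for _ in range(size)]

pred = dc_prediction(above=[100] * 8, left=[120] * 8)
print(pred[0][0])  # 110: every pixel of the predictive block is the mean
```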
- in the "Intra" prediction mode VL ("Vertical-Left"), each of the 8 neighboring pixels is copied into the corresponding column of the predictive block 220 in a left diagonal projection direction, as illustrated in FIG. 3c.
- the H.264 / AVC video coding standard provides for 9 Intra prediction modes (including the DC, H, V, VL prediction modes described above).
- the HEVC video coding standard provides for a greater number of 35 Intra prediction modes for luminance samples, and 5 modes for chrominance samples.
- video coding standards also provide special cases for intra prediction.
- the H.264/AVC standard allows 16 x 16 pixel blocks to be split into smaller blocks, down to a size of 4 x 4 pixels, in order to increase the granularity of the predictive coding process.
- the information of the Intra prediction mode is itself predicted in order to reduce its coding cost. Indeed, transmitting in the encoded stream an index identifying the Intra prediction mode becomes more expensive as the number of usable prediction modes grows. Even in the case of H.264/AVC coding, transmitting an index between 1 and 9 identifying the Intra prediction mode used for each block among the 9 possible modes turns out to be expensive in terms of coding cost.
- the HEVC standard provides for the determination of at most three predicted Intra modes. If the encoder makes an encoding decision using one of these modes, only information relating to its index (sometimes referred to as "mpm_index") and an indicator signaling that one of the predicted modes has been chosen is transmitted by the encoder. Otherwise, the encoder transmits information relating to a deviation from the predicted modes (sometimes denoted "rem_intra_pred_mode").
- the MPM (for "Most Probable Mode") is the result of the prediction of the Intra prediction mode used to encode the current block.
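A hedged sketch of the signaling choice described above: if the chosen mode is among the predicted modes, an indicator and the "mpm_index" are transmitted; otherwise a deviation "rem_intra_pred_mode" is transmitted. The derivation of the deviation is simplified here; the actual HEVC procedure is more involved.

```python
def signal_intra_mode(mode, mpm_list):
    """Return the syntax elements to transmit for `mode` (simplified sketch)."""
    if mode in mpm_list:
        # one of the predicted modes was chosen: indicator + index suffice
        return {"prev_intra_pred_flag": 1, "mpm_index": mpm_list.index(mode)}
    # deviation: rank of the mode once the predicted modes are removed
    rem = mode - sum(1 for m in mpm_list if m < mode)
    return {"prev_intra_pred_flag": 0, "rem_intra_pred_mode": rem}
```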
- when the Intra mode is selected for the encoding of the current block, it will typically be possible to transmit to the decoder a set of coefficients corresponding to the transformed and quantized pixel residual, together with the MPM.
- the time correlated prediction type prediction coding referenced for certain video coders under the name "Inter" includes a prediction of the pixels of a block (or set) of pixels being processed using pixels derived from one or more previously encoded images (pixels that are not derived from the current image, unlike the Intra prediction mode).
- the Inter prediction mode typically uses one or two sets of pixels respectively located in one or two previously encoded images to predict the pixels of the current block. That said, it is possible to envisage, for an Inter prediction mode, the use of more than two sets of pixels respectively located in more than two pairwise distinct, previously coded images.
- this technique, called motion compensation, involves the determination of one or two vectors, called motion vectors, which respectively indicate the position of the set or sets of pixels to be used for the prediction in the image or images.
- the previously encoded images used in this way are usually referred to as "reference images".
- the vectors used for the "Inter" mode are to be chosen by the encoder 100 by means of the motion estimation unit 110 and the Inter/Merge/Skip mode prediction unit 104.
- the implementation of the motion estimation within the encoder 100 can therefore provide, depending on the case, the determination of a single vector of motion estimation, two or more motion estimation vectors that point to different images.
- the motion estimation vector (s) generated at the output of the motion estimation unit 110 will be supplied to the Inter / Merge / Skip mode prediction unit 104 for the generation of Inter prediction vectors.
- Each prediction vector Inter can indeed be generated from a corresponding motion estimation vector.
- Motion estimation can consist in studying the displacement of the blocks between two images by exploiting the temporal correlation between the pixels. For a given block in the current image (the “current block” or “original block”), the motion estimation makes it possible to select a most similar block (referred to as “reference block”) in a previously coded image, called “reference image”, by representing the movement of this block, for example with a two-dimensional vector (and therefore two components representing for example respectively a horizontal displacement and a vertical displacement).
- the motion estimation method is non-normative and is therefore likely to differ from one encoder to another.
- the motion estimation method may include searching in a more or less extended area of the reference image, for example defined from the block of the reference image corresponding to the original block in the original image, in order to test the resemblance of the original block with a larger or smaller number of candidate blocks of the reference image.
- the correlation between a block and its displacement according to a motion estimation vector can be calculated using the Sum of Absolute Differences (SAD): SAD = Σ_x Σ_y |p_{x,y} − p′_{x,y}|, where p_{x,y} is a pixel of the current block and p′_{x,y} the corresponding pixel of the displaced block of the reference image.
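The SAD measure mentioned above can be sketched as follows; `cur` is the current image, `ref` the reference image, `(x, y)` the position of the current block and `(dx, dy)` the tested motion estimation vector (names are illustrative):

```python
def sad(cur, ref, x, y, dx, dy, size):
    """Sum of absolute differences between the current block and its displaced match."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[y + j][x + i] - ref[y + j + dy][x + i + dx])
    return total
```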
- the vector resulting from the motion estimation may serve as a basis for the determination of a vector Inter prediction.
- the Inter prediction method may include optimizations aimed at selecting a vector distinct from the vector resulting from the motion estimation, in order to obtain the least expensive possible prediction for the mode being tested.
- This optimization may for example include the testing of one or more vectors around the vector resulting from the motion estimation likely to give a better result depending on the objective pursued. Therefore, the vector used for the Inter prediction with respect to a given reference image will not necessarily be identical to the vector resulting from the motion estimation for this reference image.
- when the Inter mode is selected for the encoding of the current block, it will typically be possible to transmit to the decoder the pixel residual (calculated for each Inter prediction vector as a function of the pixels of the current block and of the pixels of the block pointed to by the Inter prediction vector) and information concerning the corresponding Inter prediction vector or vectors.
- the Inter prediction vector or vectors can represent a significant cost in video encoders. Some encoders reduce this coding cost by exploiting the vectors of the neighboring blocks of the block being encoded. This optimization involves a prediction of the Inter prediction vector (s), like the prediction of the Intra prediction mode in the case of a predictive coding of the Intra mode block.
- the information concerning each Inter prediction vector can thus be reduced in size by transmitting, in place of the coordinates of the vector for example, an index of a predictor vector in a known dictionary of the encoder and the decoder, and a residual quantifying the distance between the prediction vector and the predictor vector.
- a median predictor vector mv_pred is used to predict the vector to be encoded mv.
- the principle used in the HEVC standard is similar in that it provides for the transmission of a residual vector e_mv, which is however not calculated using a median predictor vector.
- the standard actually specifies a method for calculating a set of predicted vectors.
- the encoder then chooses a predictor from among these possible predicted vectors. It can thus transmit, with the residual vector, an index number of the predictor vector retained, so that the decoder can use the same.
- the bidirectional prediction technique typically involves a weighted average of two Inter-type predictions.
- the encoder selects a set of prediction parameters for a first "direction" (for a first reference image) and then for a second "direction" (for a second reference image, distinct from the first reference image).
- the encoder determines whether it retains only one of the two directions, or both, in which case an average of the two generated predictions is determined before calculating a corresponding pixel residual, which will eventually be processed by transformation and quantization.
- the bidirectional prediction therefore corresponds in principle to an "Inter" type prediction with two predicted vectors.
- aspects of the proposed Inter prediction method are applicable to bidirectional prediction.
- FIG. 4 illustrates the determination of a predictor vector, corresponding in the illustrated example to the median of the vectors of the previously encoded neighboring blocks.
- the current block (being encoded) 241 is surrounded by four previously encoded neighboring blocks 243a, 243b, 243c, 243d and by three neighboring blocks remaining to be coded.
- the example presented assumes an encoding step of the blocks of the image such that, for each block being encoded, the blocks on the left or above the current block have already been encoded, so that on FIG. 4, the previously encoded neighboring blocks 243a, 243b, 243c, 243d are situated on the left 243a or above 243b, 243c, 243d of the current block 241.
- the predictor vector mv_pred 244 of the current block 241 corresponds to the median of the respective vectors 245a, 245b, 245c, 245d of the previously coded blocks 243a, 243b, 243c (or 243a, 243b, 243d when the block 243c is, for example, not available).
- the block 243c is encoded according to an Intra mode predictive coding.
- a bad predictor vector will result in coding overhead for the current block 241.
- the H.264/AVC and HEVC standards provide rules making it possible to use one of the available vectors when the median is not computable.
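A minimal sketch of the median predictor computation of FIG. 4, applied component by component to the vectors of the available previously encoded neighboring blocks (shown here for an odd number of vectors, as in the three-block case mentioned above):

```python
def median_predictor(vectors):
    """Component-wise median of the neighboring blocks' motion vectors."""
    def med(values):
        s = sorted(values)
        return s[len(s) // 2]          # median for an odd count of vectors
    return (med([v[0] for v in vectors]), med([v[1] for v in vectors]))
```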
- Some coders use, sometimes in the context of the "Inter" prediction mode, a mode referenced in some video coders under the name "Skip” in which, as in the case of Inter mode, the current block is predicted using pixels from previously coded images (one or two images, or more depending on the implementation).
- the Skip mode is sometimes presented as a sub-mode of the Inter mode, because it corresponds to an "Inter" prediction mode without transmission (or generation in the encoded stream) of a prediction vector or of a pixel residual.
- Skip mode is applicable when the pixel residual is small enough that it is considered unnecessary to transmit it into the encoder output stream.
- the prediction vector or vectors used for this mode will typically not be transmitted to the decoder, and will be deduced by the decoder from a predetermined list of possible predictor vectors, which will make it possible, for example, to transmit only the position of the predictor vector (for example a position index in the list of predictors) instead of transmitting its value (such as its coordinates).
- a predictor vector is directly selected from a predetermined list known to the decoder, the selection of the predictor vector being effected from neighboring blocks of the current block which have previously been coded.
- the respective lists of predictor vectors, whether by their size or by their respective contents, will not necessarily be identical.
- the HEVC standard provides another mode of predictive coding, called "Merge", similar to the Skip mode described above, with the difference that a pixel residual can be transmitted.
- the Merge mode can thus also correspond to an Inter prediction mode without transmission (or generation in the encoded stream) of a prediction vector, but in which a pixel residual is generated and transmitted in the encoded stream.
- reference images are usually stored in a large-capacity memory, in order to store a set of multiple images. For example, storing 10 HD images of 1920x1080 pixels with 4:2:2 8-bit sampling requires about 40 MB of storage space.
- the memories used for this storage typically have average performance in terms of bandwidth, typically of the order of 2 GB/s in the case of a DDR3-1333 SDRAM module. For example, with a reading efficiency of 70% for large bursts of data, we obtain a bandwidth of 1333333333 Hz x 16 bits x 0.7, i.e. approximately 1.87 GB/s.
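The two orders of magnitude quoted above can be recomputed as follows (counting the storage in MiB, which yields the ~40 MB figure, and assuming a 16-bit memory interface):

```python
luma = 1920 * 1080                       # luminance samples per HD image
image_bytes = 2 * luma                   # 4:2:2, 8 bits: chroma doubles the luma size
storage_mib = 10 * image_bytes / 2**20   # ~40 MiB for 10 reference images
bandwidth = 1_333_333_333 * 2 * 0.7      # Hz x 2 bytes x 70% efficiency, ~1.87 GB/s
```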
- FIG. 5 illustrates a hardware architecture of this type, in which an encoder 400 implemented on a component 402 of the FPGA type stores reference images in a RAM 401.
- the encoder may perform a motion estimation phase as part of an Inter, Skip or Merge prediction, which will require reading reference data (denoted F′) from the storage RAM 401.
- the encoder will be able to write into the RAM 401 the data (denoted F′) of the encoded image reconstructed from the decision taken, for use during the encoding of the following images in the video stream.
- the encoder 400 can therefore be provided with a motion estimation unit (ME) 403 and a motion compensation unit (MC) 404; the motion estimation unit 403 can also be configured to perform processing on data generated by a motion pre-estimation unit 405, for example data 406 relating to candidate motion vectors, as explained above with reference to FIG. 2.
- the amount of reference data (F′) required for the functions of the motion estimation unit (ME) 403 and the motion compensation unit (MC) 404 of the encoder 400 may be large enough to justify the use of a cache system, especially to achieve the performance required in a hardware implementation of real-time processing.
- this amount of data may be significantly larger than the amount of data corresponding to a single image, due in particular to the number of candidate vectors tested, to the increase in the pixel area required for the refining of the candidates (ME part of the processing), and to the increase in the pixel area necessary for the computation of the Inter prediction (MC part of the processing).
- the area needed to calculate the Inter prediction of an encoding block is equal to the size of this block increased by two rings, one of which is necessary for the quarter-pixel interpolation, and the other for the motion estimation excursion.
- the motion vectors can be determined with fractional pixel precision, in that a motion vector can point to a fractional pixel position generated between two neighboring pixels. In this case, fractional samples will be generated between two neighboring samples, for example by interpolation between these two samples.
- the HEVC standard provides for the generation of fractional luminance samples by defining an 8-coefficient interpolation filter for the half-sample (or half-pixel) positions and a 7-coefficient interpolation filter for the quarter-sample (or quarter-pixel) positions.
- the HEVC standard thus enables the generation of motion vectors with a precision equal to one quarter of the distance between two luminance samples.
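Half-sample interpolation with the HEVC 8-coefficient luma filter (coefficients [-1, 4, -11, 40, 40, -11, 4, -1], normalized by 64) can be sketched as follows on a single line of samples. The half-pixel lies between `row[i]` and `row[i + 1]`, so four samples are needed on each side; function and variable names are illustrative.

```python
HEVC_HALF = (-1, 4, -11, 40, 40, -11, 4, -1)   # 8-tap half-sample luma filter

def half_pel(row, i):
    """Half-pixel between row[i] and row[i + 1] (needs row[i-3] .. row[i+4])."""
    acc = sum(c * row[i - 3 + k] for k, c in enumerate(HEVC_HALF))
    return (acc + 32) >> 6                      # rounded division by 64
```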
- FIG. 6a illustrates this calculation, and shows 5 pixels 601a-601e of a reference image 600.
- the pixels 601d and 601e belong to an encoding block 602 in the course of Inter-prediction encoding.
- the calculation of the half-pixel 603 between the pixels 601d and 601e can use the values of the 601a-601e pixels as a function of the implementation.
- the motion estimation function can comprise, as a function of the implementation, the testing of a set of vectors close to an initial vector, called the candidate vector, and the choice among the tested vectors of a vector minimizing a correlation function (often of the SAD or SSD type) between the prediction block and the block to be encoded.
- this operation performs a refinement of candidate vectors identified by the motion pre-estimation function.
- FIG. 6c illustrates this increase by showing a set of tested vectors covering a 5 x 5 pixel area that ultimately requires (when taking into account the ring necessary for quarter-pixel interpolation) an area of 21 x 21 pixels.
- FIG. 6b shows a reference image 605 on which is represented the block 606 co-located in the reference image 605 with the encoding block being encoded in the current image.
- a candidate vector 608 points to a pixel (half-pixel or quarter-pixel depending on the selected granularity) of the reference image 605, and a set 607 of tested vectors respectively point to pixels (half-pixels or quarter-pixels depending on the selected granularity) of the reference image 605.
- FIG. 6c shows the end of the candidate vector 606, and the pixel 608 toward which this vector 606 points, as well as two test vectors 607a and 607b and the pixels 609a and 609b to which these vectors 607a and 607b respectively point.
- all the pixels to which the tested vectors respectively point (referred to as "tested vector pixels" in FIG. 6c) will have to be included in the reference image data retrieved from the RAM for the purposes of the Inter prediction processing.
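The refinement step described above (testing a set of vectors around a candidate and keeping the one minimizing the correlation function) can be sketched as follows; `cost` stands for the SAD- or SSD-type function, and the window radius is an illustrative parameter:

```python
def refine(candidate, cost, radius=2):
    """Return the vector minimizing `cost` in a window around the candidate."""
    cx, cy = candidate
    tested = [(cx + dx, cy + dy)
              for dy in range(-radius, radius + 1)
              for dx in range(-radius, radius + 1)]   # 5x5 vectors for radius 2
    return min(tested, key=cost)
```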
- certain encoding standards, such as the H.264/AVC and H.265/HEVC standards, allow the partitioning of a block to be encoded, to split the Inter prediction into several zones each having a specific vector. This allows a better match of the macroblock to be encoded with the reference image, especially on boundaries of objects having different movements.
- this partitioning increases the number of candidate vectors that one may wish to test in the encoder for an encoding block, and therefore the amount of data necessary for the Inter prediction decision of an encoding block.
- partitioning a block of 16 pixels x 16 pixels in four 8 pixel x 8 pixel partitions requires four areas of 13 pixels x 13 pixels, making a total area of 52 pixels x 52 pixels.
- the H.264 standard allows multiple partitioning of a macroblock of 16 pixels x 16 pixels, up to partitioning in blocks of 4 pixels x 4 pixels.
- a cache memory, that is to say a memory space, often internal to the component (for example an ASIC or FPGA) on which the video encoder is implemented, having a much more effective read bandwidth than that of an external memory, overcomes this bandwidth limitation of an external memory.
- a cache memory can be implemented within an Altera Stratix-III (FPGA) type FPGA (EP3SL340) using 32 M144K internal memories each of which can contain 2048 72-bit words.
- FIG. 7 shows an implementation of the encoder on an FPGA component, of the type illustrated in FIG. 5, using a cache memory within the Inter prediction decision unit.
- An encoder 700 is implemented on a component 702 of the FPGA type, and stores in a RAM memory 701 external to the FPGA component 702 reference images F '.
- the encoder 700 may perform a motion estimation phase as part of an Inter, Skip or Merge prediction, and to do so read reference data (denoted F′) from a cache memory 707 implemented on the FPGA component 702 (and not from the storage RAM 701 as in the architecture illustrated in FIG. 5).
- the encoder 700 can write into the external RAM memory 701 the data (denoted F′) of the encoded image reconstructed from the decision taken, for use during the encoding of subsequent images in the video stream.
- the encoder 700 may therefore be provided with a motion estimation unit (ME) 703 and a motion compensation unit (MC) 704, configured to read reference data in a local cache memory 707 rather than in an external memory 701.
- the motion estimation unit 703 may also be configured to perform processing on data generated by a motion pre-estimation unit 705, for example data 706 relating to candidate motion vectors, as explained above with reference to FIG. 2.
- the type of cache and its efficiency, as well as its implementation complexity, will preferably be chosen according to the coherence of the different zones necessary for the motion estimation and motion compensation processes.
- the data necessary for encoding processing of a block will be preloaded in the cache for the macroblock being processed.
- an area around the block co-located in the reference image to the block being encoded may be preloaded in the cache.
- the use of a systematic cache may be preferred to that of a Hit-Miss cache, because the systematic cache makes it possible to obtain an almost invariable latency for obtaining the data during the encoding processes of several blocks in parallel.
- a bounded search zone, intended to be pre-loaded into a cache and defined along the encoding path of the blocks to be encoded, will therefore be considered.
- FIGS. 8a and 8b illustrate these two factors and show the case of two blocks encoded in parallel at distinct encoding instants T1 and T2, for which it is necessary to pre-load into the cache, before their processing, a search area covering the data needed to process both blocks.
- FIG. 8a shows the search area 801 of data of a reference image 800 that can be pre-loaded in cache memory for the two blocks 802 and 803 corresponding to two blocks of the current image to be encoded in parallel.
- FIG. 8b shows the search area 804 of data of the reference image 800 to pre-load in cache memory for the two blocks 805 and 806 corresponding to two blocks of the current image to be encoded in parallel after the two blocks 802 and 803 (assuming a block encoding path from left to right and from top to bottom, as indicated by the black arrow in FIGS. 8a and 8b).
- the search areas 801 and 804 to be loaded in the cache memory are defined around blocks of the reference image co-located with the blocks to be encoded in parallel.
- the data of the reference image can be grouped into virtual lines whose height is that of a block, as illustrated in FIGS. 8a to 8f, which show a reference image 800 comprising 8 virtual lines.
- the virtual lines may have different heights, and the images to be encoded may have a number of virtual lines different from that illustrated in FIGS. 8a to 8f, which show an example of implementation of the proposed method.
- Figs. 8c-8f illustrate the caching of data of a reference image 800 for different pairs of parallel encoded blocks.
- FIG. 8c shows two blocks 807 and 808 co-located in the reference image 800 with blocks to be encoded in parallel located in the upper part of the current image (first and second virtual lines).
- the size of the search area 809 to be loaded in the cache is such that the encoding of two virtual lines of the current image leads to the caching of 4 virtual lines of the reference image.
- FIG. 8d illustrates the cache loading of 6 virtual lines for the encoding of two blocks 810 and 811 co-located in the reference image 800 with blocks to be encoded in parallel located on the third and fourth virtual lines of the current image.
- FIG. 8e illustrates the cache loading of six virtual lines for the encoding of two blocks 812 and 813 co-located in the reference image 800 with blocks to be encoded in parallel located on the fifth and sixth virtual lines of the current image.
- FIG. 8f illustrates the cache loading of 4 virtual lines for the encoding of two blocks 814 and 815 co-located in the reference image 800 with blocks to encode in parallel located on the seventh and eighth virtual lines of the current image.
- the amount of data loaded into the cache memory for the encoding of all the blocks of a current image, the encoding being performed in parallel for sets of two blocks, will correspond to 20 times the width of the reference image, for a reference image of 8 virtual lines, the height of a virtual line corresponding to that of a block to be encoded, as shown by the following count: 4 + 6 + 6 + 4 = 20 virtual lines.
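The count of 20 virtual lines can be reproduced with the following sketch, which assumes (as read from FIGS. 8c to 8f) a vertical excursion of two virtual lines on each side of the pair of lines being encoded, clipped to the image:

```python
def cached_lines(num_lines, pair_height=2, excursion=2):
    """Total virtual lines cached when encoding an image pair-of-lines at a time."""
    total = 0
    for top in range(1, num_lines + 1, pair_height):
        first = max(1, top - excursion)                            # clip at image top
        last = min(num_lines, top + pair_height - 1 + excursion)   # clip at image bottom
        total += last - first + 1
    return total
```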
- the following table shows the case of the encoding of a 1080p60 image, with a search zone corresponding to a vertical excursion of 96 pixels (i.e. 6 encoding blocks of size 16x16 pixels), the encoding being performed with parallel processing of 4 encoding blocks:
- Legend for the table: MB = macroblock; ZR = search area; // = parallel; Y = vertical position (expressed in MB count).
- the reference image is read 3.76 times. If we use two reference images, for bidirectional predictions, we arrive at a quantity of data equivalent to 7.5 images, a gain of 3.6 compared to the 27 images in the table of FIG. 11. These 7.5 images represent 0.88 GB/s for a frame rate of 60 frames per second, considering only the luminance, and a rate of 1.76 GB/s when also considering the chrominance.
- the search area usually used in encoder implementations using a cache memory, in which the data of a motion estimation vector search area is pre-loaded, is a square-shaped or, more generally, rectangular zone, to better reflect the aspect ratio of the reference image.
- a search zone of 64 MB x 16 MB is obtained for a single reference image, and of 32 MB x 16 MB for two reference images in the case of a bidirectional prediction, the size of the search area being defined in macroblocks (MB), for example of size 16x16 pixels for the luminance component Y, the size of the chrominance components U and V depending on the sampling of these components.
- FIG. 9a shows a search zone 900 intended to be loaded in a cache memory, of rectangular shape and defined around four blocks 901-904 corresponding in a reference image to blocks co-located with four blocks of a current image being encoded in parallel.
- An additional encoding block column 905 may further be pre-loaded so as not to block the downstream processing stream as indicated above.
- the search area 900 occupies a memory space of 32 x 16 MB
- the configuration of the encoding blocks being encoded in parallel, illustrated in FIG. 9a by the group of four blocks 901-904 of the reference image respectively co-located with encoding blocks of the current image, can advantageously be replaced by a so-called "staircase" configuration, illustrated in FIG. 9b, to account for the encoding dependencies of each of the blocks being encoded.
- the prediction of a block according to the Intra prediction mode or the Inter prediction mode may involve neighboring blocks, in the current image, of the encoding block that have already been encoded.
- for the Inter prediction, it will be possible, depending on the embodiment, to seek to predict the motion estimation vector by using the vectors determined for the neighboring blocks, where appropriate, because they are already encoded.
- for the Intra prediction, depending on the embodiment, it will be possible to predict the pixels of the block being encoded (current block) as a function of the pixels of one or more neighboring blocks.
- the definition of these neighboring blocks may therefore depend on the encoding path chosen for the blocks of the image.
- the staircase configuration illustrated in FIGS. 9b, 9c and 9d is provided by way of example of a configuration of a plurality of blocks intended to be encoded in parallel (or of corresponding blocks respectively co-located in a reference image), positioned relative to each other so that no block of the configuration is a neighbor of another block of the configuration usable for encoding that other block, given the encoding path chosen for the blocks of the image being encoded.
- other staircase configurations can of course be used for the implementation of the proposed method.
- FIG. 9c illustrates a coding block 906 currently being encoded (current block) with four neighboring blocks 907-910 located in the immediate vicinity of the current block 906.
- the four neighboring blocks are defined according to the encoding path of the blocks of the image 911 being encoded, which in the illustrated example runs from left to right and from top to bottom.
- block 912, located immediately below and to the left of neighbor block 910 immediately to the left of current block 906, is a block that can be encoded in parallel with current block 906.
- the search area 900 shown in FIG. 9a (horizontal excursion of 15 MB and vertical excursion of 6 MB) is reduced to a horizontal excursion of 12 MB because of the change from a vertical configuration of the blocks to be encoded in parallel to a staircase-type configuration, a configuration reflected in that of the corresponding blocks 901-904 respectively co-located in a reference image.
- the management of the image edges can be performed by duplication of data or by the use of a dedicated logic.
- the duplication of data has the advantage of avoiding the cost of implementing a logic dedicated to managing the edges of the image to be encoded.
- the duplication of the data for the management of the image edges can for example lead to the definition of a search area such as that illustrated in FIG. 9e.
- FIG. 9e shows a rectangular-shaped search zone 915 intended to be pre-loaded in a cache memory in order to accelerate the processing related to a temporal correlation prediction of four blocks to be encoded in parallel, positioned in a staircase configuration.
- this staircase configuration is found on the four corresponding blocks 901-904 respectively co-located in a reference image, as illustrated in FIG. 9e.
- a row of blocks 913 and a block column 914 are further copied in cache memory for management of the edges of the reference image.
- the excursion of the vector components of the coded motion is limited to +/- 160 pixels for the horizontal component, which corresponds to a cache memory occupancy of 2 x 10 MB, and to +/- 96 pixels for the vertical component, which corresponds to a cache memory occupancy of 2 x 6 MB.
- the search area 915 further includes two preloaded block columns 916a-916b so as not to block the downstream processing stream.
- the search area 915 shown in FIG. 9e thus occupies 17 x 30 MB of cache memory.
- the inventors of the proposed method have noticed that the data of the sets of blocks 917 and 918, respectively situated in the lower right and upper left parts of the search zone 915, were not used by the processing related to the motion estimation or motion compensation.
- the proposed method overcomes this disadvantage by optimizing the shape of the search area to be loaded into the cache, so as to minimize the amount of data loaded into cache memory but not used by the subsequent processing related to motion estimation or motion compensation.
- the proposed method advantageously makes it possible to increase the excursion of the vector components without additional implementation complexity or use of significant additional resources.
- Fig. 10 is a diagram illustrating the proposed method according to one embodiment.
- a first image of a set of images of a video sequence is envisaged.
- This first image is divided into blocks on which image encoding processes are applied.
- Each block is thus encoded, according to one of a plurality of coding modes comprising at least one temporal correlation prediction type coding mode using a plurality of images of the set of images, such as, for example, the predictions of type Inter, Merge, and Skip described above.
- a search area, at least a portion of which substantially has the shape of an ovoid portion, is defined in a reference image for searching motion estimation vectors for the current block.
- the reference image used is chosen distinct from the image being encoded (first image), and has previously been encoded according to an encoding sequence of the images of the set of images.
- the search area thus defined is unitary in the sense that it is defined for an encoding block being encoded.
- the search area data is then loaded (1003) into a cache memory, and then a motion estimation vector pointing to a block of the search area correlated to the current block is determined (1004) by searching the search area loaded into cache.
- a coding decision of the current block according to one of the coding modes is then taken (1005) using the motion estimation vector.
- the proposed method thus uses a search area that is not defined with a rectangular or square shape.
- the infinity norm of a vector is the norm defined by the following relation: ‖v‖∞ = max(|v_x|, |v_y|), where v_x and v_y are the two components of the vector v.
- an ovoid shape makes it possible, among other things, to avoid the caching of data that are not used by the motion vector search algorithm in the search area.
- a search zone is defined, a portion of which is substantially in the shape of an ellipsoid portion.
- the search area may have in its entirety an ovoid shape, or, in a particular embodiment, an ellipsoid shape. In the latter case, the outline of the search area defines a polygon of substantially elliptical shape.
- the shape of the unit search area is predetermined. Indeed, one or more shapes of unit search area can be prerecorded. For example, shape definition data (ovoid, ellipsoid, or other) may be loaded from a memory to define the unit search area. Once loaded, this zone shape is predetermined for any image(s) of a sequence to be encoded.
- these shape-defining data may define a curve corresponding to a portion or all of the search area.
- the ellipse can be defined over a quarter of the space, as illustrated in FIG. 11b, from shape definition data loaded in memory.
- the search area (1101) can then be defined for a block (1102) co-located in the selected reference picture with the current block being encoded, by applying the shape defined over one quadrant to the four quadrants, as shown in Figure 11c.
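The per-row horizontal extents of such an area can be derived from the quarter-ellipse definition and mirrored into the four quadrants around the co-located block; a sketch, in which the integer rounding and the parameter names are illustrative:

```python
def quarter_extents(rx, ry):
    """Horizontal extent x(y) of a quarter-ellipse of radii (rx, ry),
    for each row y in [0, ry]: x(y) = rx * sqrt(1 - (y/ry)^2)."""
    return [int(rx * (1.0 - (y / ry) ** 2) ** 0.5) for y in range(ry + 1)]

def full_area_rows(cx, cy, rx, ry):
    """Mirror the quarter-ellipse into all four quadrants around the
    co-located block position (cx, cy); returns, for each row of the
    search area, its (x_min, x_max) horizontal span."""
    ext = quarter_extents(rx, ry)
    rows = {}
    for y in range(-ry, ry + 1):
        e = ext[abs(y)]          # symmetry: same extent above and below
        rows[cy + y] = (cx - e, cx + e)
    return rows
```

Only the quarter-space extents need to be stored; the vertical and horizontal symmetries reconstruct the full outline.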
- Figures 11d, 11e and 11f show other examples of search areas (1104) and (1105) defined from an ovoid portion (1103) used for the definition of the quarter-space search area.
- a multiple search area is defined which unites the unit search areas respectively definable for each encoding block encoded in parallel.
- FIG. 11g shows a multiple search area (1106) uniting the unit search areas respectively corresponding to four blocks (1107 - 1110) respectively co-located in a reference image with four encoding blocks being encoded, arranged in a staircase configuration as described above.
- the multiple search area is defined from a unit search area definition having an ellipsoidal shape of the type illustrated in FIG. 11c.
- the cache loading of a search area for a new encoding block to be encoded may, in one or more embodiments, be defined by a series of offset values in the horizontal direction (denoted delta_x_load or Δx_load(y)), respectively corresponding to each of the possible vertical coordinate values of the blocks of the search area, determined on the basis of the shape of the search area to be loaded in cache memory.
- the cache loading of the multiple search area for a new set of four encoding blocks to be encoded may, in one or more embodiments, be defined by a series of offset values in the horizontal direction (denoted delta_x_load or Δx_load(y)), respectively corresponding to each of the possible vertical coordinate values of the blocks of the search area, determined on the basis of the shape of the area to be loaded in cache and according to the encoding path of the blocks of the current image.
- Figure 11h illustrates a set of offset values Δx_load(y) for the multiple search area illustrated in Figure 11g.
- the Δx_load(y) values correspond respectively to the 16 possible vertical coordinate values.
- the offset values Δx_load(y) are determined relative to the position of one (1110) of the four blocks (1107 - 1110) co-located with the blocks being encoded in parallel, chosen as the reference block.
- another of the four blocks (1107 - 1110) co-located with the blocks being encoded in parallel, or another block of the multiple search area (1106), could be used as the reference block.
- the offset values Δx_load(y) can be written in a memory, for example of ROM type. This does not significantly degrade the access performance of the DDR3 controller, compared with the random DDR3 accesses of a system without a cache, because the memory read requests concern whole blocks (for example 512 bytes, aligned with the size of a macroblock in the case of an H.264/AVC-type encoder).
- since each pair (y, Δx_load(y)) corresponds to a macroblock of size 16x16 (512 bytes with 4:2:2 8-bit sampling), i.e. to 32 DDR3-aligned words, access to 16x16 MBs (corresponding to 32 contiguous addresses) is more efficient than access to random addresses.
- the values of Δx_load(y) for each vertical position y, ranging for example from 0 to 15, are grouped in the table below (the values of Δx_load(y) being natural numbers):
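Each (y, Δx_load(y)) pair thus maps directly to one whole-macroblock burst. A sketch of how such a table could be turned into DDR read requests, assuming macroblock-linear frame storage; the frame width, base address, and offset table below are hypothetical:

```python
MB_BYTES = 512          # one 16x16 macroblock, 4:2:2 8-bit sampling
PIC_WIDTH_MB = 120      # hypothetical picture width, in macroblocks

def mb_address(base, mb_x, mb_y):
    """Linear address of a whole macroblock (32 contiguous DDR3-aligned
    words), assuming macroblock-linear frame storage from `base`."""
    return base + (mb_y * PIC_WIDTH_MB + mb_x) * MB_BYTES

def prefetch_requests(base, ref_x, ref_y, dx_load):
    """One whole-macroblock read per row: for each vertical position y,
    fetch the block at horizontal offset dx_load[y] from the reference
    block (ref_x, ref_y), as an (address, length) burst request."""
    return [(mb_address(base, ref_x + dx_load[y], ref_y + y), MB_BYTES)
            for y in range(len(dx_load))]
```

Because every request covers one full macroblock at a contiguous address range, the DDR3 controller sees sequential bursts rather than scattered word accesses, which is the efficiency argument made above.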
- the release of the cache area for a block whose encoding is complete may, in one or more embodiments, be defined by a series of offset values in the horizontal direction (denoted delta_lib_x or Δx_lib(y)), respectively corresponding to each of the possible vertical coordinate values of the blocks of the cached search area to be updated, determined on the basis of the shape of the search area loaded in cache and as a function of the chosen encoding path.
- the release of the cache area for a set of four encoding blocks whose encoding is complete may, in one or more embodiments, be defined by a series of offset values in the horizontal direction (denoted delta_lib_x or Δx_lib(y)), respectively corresponding to each of the possible vertical coordinate values of the blocks of the search area, determined on the basis of the shape of the search area loaded in cache memory and according to the encoding path of the blocks of the current image.
- FIG. 11i illustrates a set of offset values Δx_lib(y) for the multiple search area shown in FIG. 11g.
- the offset values Δx_lib(y) correspond respectively to the 16 possible vertical coordinate values.
- the offset values Δx_lib(y) are determined with respect to the position of one (1107) of the four blocks (1107 - 1110) co-located with the blocks being encoded in parallel, chosen as the reference block.
- another of the four blocks (1107 - 1110) co-located with the blocks being encoded in parallel, or another block of the multiple search area (1106), could be used as the reference block.
- one of the four blocks (1107 - 1110) can be chosen as the reference block so as to lead to the smallest offset values Δx_lib(y), in order to minimize the memory space required for their storage.
- the release offset values Δx_lib(y) can be stored in a memory, for example of ROM type.
- the release offset values Δx_lib(y) for each vertical position y are grouped in the table below (the values of Δx_lib(y) being signed integers):
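Taken together, the load offsets Δx_load(y) and release offsets Δx_lib(y) describe an incremental cache update as encoding advances by one block column. A minimal sketch, modelling the cache as a set of macroblock coordinates; the data structure and the two reference-block arguments are illustrative:

```python
def slide_window(cache, dx_load, dx_lib, ref_load, ref_lib):
    """Update the cached search area when encoding advances: for each
    vertical position y, load one new macroblock at offset dx_load[y]
    from the load reference block, and free the macroblock at offset
    dx_lib[y] from the release reference block. `cache` is a set of
    (x, y) macroblock coordinates; both offset tables index the same
    vertical positions."""
    lx, ly = ref_load
    fx, fy = ref_lib
    for y in range(len(dx_load)):
        cache.add((lx + dx_load[y], ly + y))      # prefetch new column
        cache.discard((fx + dx_lib[y], fy + y))   # release stale column
    return cache
```

Per row, exactly one block enters and one leaves, so the cache footprint stays constant and equal to the area of the (multiple) search zone.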
- the invention thus makes it possible to obtain, compared with a systematic rectangular mask, a gain in coding efficiency, because, for the same internal memory space of the FPGA, a larger excursion of the motion vectors is allowed.
- This improvement in the excursion of the motion vector components is due, on the one hand, to the saving of the memory space loaded unnecessarily with a rectangular mask, as illustrated by the memory space portions 917 and 918 in FIG. 9e, and, on the other hand, to the ovoid-shaped search area, which allows larger vectors in the horizontal and vertical directions and smaller vectors in the diagonal directions, thanks to the use in some embodiments of the quadratic norm ‖v‖₂ instead of the infinite norm.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1558768A FR3041495B1 (fr) | 2015-09-17 | 2015-09-17 | Procede d'encodage d'image et equipement pour la mise en oeuvre du procede |
PCT/EP2016/071875 WO2017046273A1 (fr) | 2015-09-17 | 2016-09-15 | Procede d'encodage d'image et equipement pour la mise en oeuvre du procede |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3350996A1 true EP3350996A1 (fr) | 2018-07-25 |
Family
ID=54366432
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP16766945.6A Ceased EP3350996A1 (fr) | 2015-09-17 | 2016-09-15 | Procede d'encodage d'image et equipement pour la mise en oeuvre du procede |
Country Status (5)
Country | Link |
---|---|
US (1) | US10999586B2 (fr) |
EP (1) | EP3350996A1 (fr) |
FR (1) | FR3041495B1 (fr) |
IL (1) | IL257846A (fr) |
WO (1) | WO2017046273A1 (fr) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3612023A4 (fr) | 2017-04-20 | 2021-05-12 | Egenesis, Inc. | Procédés de production d'animaux génétiquement modifiés |
WO2019190339A1 (fr) * | 2018-03-26 | 2019-10-03 | Huawei Technologies Co., Ltd. | Appareil d'inter-prédiction et procédé de codage vidéo |
CN114222136B (zh) * | 2019-06-25 | 2024-10-01 | Oppo广东移动通信有限公司 | 运动补偿的处理方法、编码器、解码器以及存储介质 |
CN111163319B (zh) * | 2020-01-10 | 2023-09-15 | 上海大学 | 一种视频编码方法 |
US11875427B2 (en) | 2021-09-13 | 2024-01-16 | Apple Inc. | Guaranteed real-time cache carveout for displayed image processing systems and methods |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI450591B (zh) * | 2009-04-16 | 2014-08-21 | Univ Nat Taiwan | 視訊處理晶片組及其中移動評估的資料讀取之方法 |
EP2425622B1 (fr) | 2009-04-30 | 2016-10-05 | Telefonaktiebolaget LM Ericsson (publ) | Mémoire cache interne efficace pour estimation de mouvement matérielle |
US20110228851A1 (en) | 2010-03-18 | 2011-09-22 | Amir Nusboim | Adaptive search area in motion estimation processes |
US8897355B2 (en) * | 2011-11-30 | 2014-11-25 | Avago Technologies General Ip (Singapore) Pte. Ltd. | Cache prefetch during motion estimation |
-
2015
- 2015-09-17 FR FR1558768A patent/FR3041495B1/fr active Active
-
2016
- 2016-09-15 EP EP16766945.6A patent/EP3350996A1/fr not_active Ceased
- 2016-09-15 WO PCT/EP2016/071875 patent/WO2017046273A1/fr active Application Filing
- 2016-09-15 US US15/759,039 patent/US10999586B2/en active Active
-
2018
- 2018-03-04 IL IL257846A patent/IL257846A/en unknown
Also Published As
Publication number | Publication date |
---|---|
IL257846A (en) | 2018-04-30 |
US10999586B2 (en) | 2021-05-04 |
US20200228810A1 (en) | 2020-07-16 |
FR3041495A1 (fr) | 2017-03-24 |
WO2017046273A1 (fr) | 2017-03-23 |
FR3041495B1 (fr) | 2017-10-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017046273A1 (fr) | Procede d'encodage d'image et equipement pour la mise en oeuvre du procede | |
EP3318061B1 (fr) | Procede d'encodage d'image et equipement pour la mise en oeuvre du procede | |
EP3225029B1 (fr) | Procede d'encodage d'image et equipement pour la mise en oeuvre du procede | |
FR2906433A1 (fr) | Procedes et dispositifs de codage et de decodage d'images, programme d'ordinateur les mettant en oeuvre et support d'informaton permettant de les mettre en oeuvre | |
FR2932637A1 (fr) | Procede et dispositif de codage d'une sequence d'images | |
WO2010043811A1 (fr) | Procede et dispositif de codage d'une sequence d'image mettant en oeuvre des blocs de taille differente, signal, support de donnees, procede et dispositif de decodage, et programmes d'ordinateur correspondants | |
FR2951345A1 (fr) | Procede et dispositif de traitement d'une sequence video | |
FR2933565A1 (fr) | Procede et dispositif de codage d'une sequence d'images mettant en oeuvre une prediction temporelle, signal, support de donnees, procede et dispositif de decodage, et produit programme d'ordinateur correspondants | |
EP3075155B1 (fr) | Procédé de codage et de décodage d'images, dispositif de codage et de décodage d'images et programmes d'ordinateur correspondants | |
FR3002062A1 (fr) | Systeme et procede de reduction dynamique de l'entropie d'un signal en amont d'un dispositif de compression de donnees. | |
EP3972247B1 (fr) | Procédé de codage et de décodage d'images, dispositif de codage et de décodage d'images et programmes d'ordinateur correspondants | |
EP3158749B1 (fr) | Procédé de codage et de décodage d'images, dispositif de codage et de décodage d'images et programmes d'ordinateur correspondants | |
EP2279620B1 (fr) | Prediction d'images par determination prealable d'une famille de pixels de reference, codage et decodage utilisant une telle prediction | |
FR3057130B1 (fr) | Procede de codage d'une image, procede de decodage, dispositifs, equipement terminal et programmes d'ordinateurs associes | |
FR2959093A1 (fr) | Procede et dispositif de prediction d'une information de complexite de texture contenue dans une image | |
FR2957744A1 (fr) | Procede de traitement d'une sequence video et dispositif associe | |
FR3079098A1 (fr) | Procede d'encodage et de decodage video faible latence | |
WO2011051596A1 (fr) | Procédés et dispositifs de codage et de décodage d'images, et programmes d'ordinateur correspondants | |
FR3035761A1 (fr) | Procede de codage et de decodage d'images, dispositif de codage et de decodage d'images et programmes d'ordinateur correspondants | |
WO2024042286A1 (fr) | Lissage hors boucle de codage d'une frontière entre deux zones d'image | |
FR3100679A1 (fr) | Procede d’encodage d’image et equipement pour la mise en œuvre du procede | |
FR3081656A1 (fr) | Procedes et dispositifs de codage et de decodage d'un flux de donnees representatif d'au moins une image. | |
FR2956552A1 (fr) | Procede de codage ou de decodage d'une sequence video, dispositifs associes | |
FR3111253A1 (fr) | Procede de traitement d’image et equipement pour la mise en œuvre du procede | |
EP3854085A1 (fr) | Procédés et dispositifs de codage et de décodage d'un flux de données représentatif d'au moins une image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20180227 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20190130 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R003 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
|
18R | Application refused |
Effective date: 20211104 |