RU2720648C1

RU2720648C1 - Method and device for encoding or decoding an image with inter-layer motion information prediction according to a motion information compression scheme

Info

Publication number: RU2720648C1
Application number: RU2020102692A
Authority: RU
Inventors: Christophe GISQUET; Patrice ONNO; Guillaume LAROCHE; Edouard FRANCOIS
Original assignee: Canon Kabushiki Kaisha
Priority date: 2013-04-05
Filing date: 2020-01-23
Publication date: 2020-05-12
Also published as: EP3700213B1; JP6701409B2; JP2020114013A; BR122017024393B1; RU2015147403A; KR20190122884A; JP2016519499A; RU2673277C1; WO2014161740A1; CN105284117B; GB2512829A; CN109246436A; KR102078103B1; CN105284117A; ES2803426T3; KR20190122883A; ES2733110T3; JP6526292B2; RU2693649C1; US20200107042A1

Abstract

FIELD: data processing.

SUBSTANCE: invention relates to scalable encoding and decoding of images. Technical result is achieved by determining a position in a reference layer picture using a position in a 16×16 block of a resampled picture; and determining a set of candidate motion information predictors that includes a candidate motion information predictor based on motion information associated with an image area belonging to the reference layer picture.

EFFECT: technical result is higher accuracy of encoding and decoding images.

6 cl, 13 dwg

Description

The invention generally relates to the field of scalable video coding and decoding, in particular to scalable video coding and decoding that would extend the High Efficiency Video Coding (HEVC) standard. More particularly, the invention concerns a method, device, and computer program for motion vector prediction in a scalable video encoder and decoder.

Video coding is a way of transforming a sequence of video images into a compact digitized bitstream so that the video images can be transmitted or stored. An encoding device is used to code the video images, and an associated decoding device is available to reconstruct the bitstream for display and viewing. The general aim is to form a bitstream smaller than the original video information. This advantageously reduces the capacity required of a transmission network or storage device to transmit or store the bitstream.

Common standardized approaches have been adopted for the format and method of the coding process, especially with respect to the decoding part. The great majority of past video standards split the video images into smaller sections (called macroblocks or blocks). In the High Efficiency Video Coding (HEVC) video compression standard, which is now being finalized, macroblocks are replaced by so-called largest coding units (LCUs), also called coding tree blocks (CTBs), which are partitioned and adjusted into blocks now called coding units (CUs) according to the characteristics of the original image segment under consideration. This allows more detailed coding of areas of the video image that contain relatively more information, and less coding effort for areas with fewer features. Note that an image area is also known under the following terms in the video compression literature: pixel block, block of pixels, block, coding unit (CU), and macroblock.

A scalable extension of HEVC is currently being defined. In this extension, images are considered as consisting of a plurality of hierarchical layers. The hierarchical layers include a base layer, equivalent to a collection of low-quality versions of the images (or frames) of the original video sequence, and one or more enhancement layers (also known as refinement layers).

Video images were originally processed by coding each macroblock individually, in a manner resembling the digital coding of still images or pictures. Later coding models allow prediction of the features in one frame, either from neighboring macroblocks (spatial or intra prediction) or by association with a similar macroblock in a neighboring frame (temporal prediction). This makes it possible to use coded information that is already available, thereby reducing the coding bit rate needed overall. Differences between the original block to encode and the block used for prediction are captured in a residual set of values. The original block is then encoded in the form of an identifier of the block used for prediction and a residual. Many different types of prediction are possible. Effective coding chooses the best prediction mode, the one providing the highest quality for the block upon decoding, while taking account of the bitstream size induced by each prediction mode to represent said block in the bitstream. The general goal is a trade-off between the decoded picture quality and the reduction in the required bit rate, also known as the rate/distortion trade-off.
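The rate/distortion trade-off described above is commonly expressed as a Lagrangian cost, distortion plus a multiplier times the bit count. The following Python sketch illustrates mode selection under that assumption; the `ModeResult` type, the mode names, and the numeric values are hypothetical and are not taken from this document.

```python
from dataclasses import dataclass


@dataclass
class ModeResult:
    """Hypothetical outcome of trying one prediction mode on a block."""
    name: str
    distortion: float  # e.g. sum of squared errors after decoding
    bits: int          # bits needed to represent the block in this mode


def best_mode(candidates, lagrange_multiplier):
    """Return the candidate minimising distortion + lambda * bits."""
    return min(
        candidates,
        key=lambda m: m.distortion + lagrange_multiplier * m.bits,
    )


# Illustrative values only.
modes = [
    ModeResult("intra", distortion=120.0, bits=40),
    ModeResult("inter_uni", distortion=90.0, bits=60),
    ModeResult("inter_bi", distortion=70.0, bits=95),
]
```

A larger multiplier favors cheaper modes, while a multiplier of zero selects purely on quality.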

In the case of temporal prediction, several types of prediction are possible, and they can be gathered into two main types: unidirectional prediction and bidirectional prediction. In the case of unidirectional prediction, the block to predict is associated with one predictor. The location of the predictor is encoded as motion information. This motion information consists of the index of the reference frame containing the predictor, called ref_idx in the standard, and a vector, defined by a vertical and a horizontal displacement, that gives the location of the predictor block in the referenced frame. In the case of bidirectional prediction, the block to encode is associated with two predictors taken from two different reference frames. As a consequence, the motion information comprises two reference frame indices and two vectors.
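The motion information described above can be pictured as a small data structure. The following is a minimal Python sketch; the type and field names (`MotionVector`, `MotionInfo`, `ref_idx_0`, and so on) are hypothetical, and only the content (one reference index and vector for unidirectional prediction, two of each for bidirectional prediction) comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MotionVector:
    dx: int  # horizontal displacement
    dy: int  # vertical displacement


@dataclass(frozen=True)
class MotionInfo:
    """One reference index and vector, or two of each for bi-prediction."""
    ref_idx_0: int
    mv_0: MotionVector
    ref_idx_1: Optional[int] = None
    mv_1: Optional[MotionVector] = None

    @property
    def is_bidirectional(self) -> bool:
        return self.ref_idx_1 is not None and self.mv_1 is not None


def predictor_position(block_x, block_y, mv):
    """Location of the predictor block in the referenced frame."""
    return (block_x + mv.dx, block_y + mv.dy)


uni = MotionInfo(ref_idx_0=0, mv_0=MotionVector(3, -2))
bi = MotionInfo(0, MotionVector(3, -2), 1, MotionVector(-1, 4))
```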

Motion information may itself be predictively encoded. Motion information obtained from a neighboring coding unit in the same frame may be used as a spatial motion information predictor. Motion information obtained from a co-located coding unit in other frames may be used as a temporal motion information predictor. The motion information to be encoded for the block to encode is then encoded with an index identifying the motion information predictor used and residual information representing the difference between the selected motion information predictor and the motion information to be encoded.
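The index-plus-residual representation described above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, vectors are plain `(dx, dy)` tuples, and choosing the predictor with the smallest absolute difference is an assumption, since the text does not say how the predictor is selected.

```python
def encode_motion_vector(mv, predictors):
    """Return (predictor index, residual) for a motion vector.

    Assumption: the predictor closest to mv (sum of absolute
    differences) is chosen; a real encoder may use another criterion.
    """
    def cost(p):
        return abs(mv[0] - p[0]) + abs(mv[1] - p[1])

    index = min(range(len(predictors)), key=lambda i: cost(predictors[i]))
    px, py = predictors[index]
    return index, (mv[0] - px, mv[1] - py)


def decode_motion_vector(index, residual, predictors):
    """Rebuild the motion vector from the same predictor list."""
    px, py = predictors[index]
    return (px + residual[0], py + residual[1])


# Hypothetical spatial and temporal predictors.
predictors = [(4, 0), (10, -3), (0, 0)]
index, residual = encode_motion_vector((9, -2), predictors)
```

The decoder must build the identical predictor list for the index to be meaningful, which is why both sides need access to the same stored motion information.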

Predicting motion information from the motion information of previous images requires that the encoder and the decoder store the motion field of previously encoded images. This motion field may represent a huge amount of data to store, all the more so for videos with a large resolution, such as 4k2k or 8k4k videos. To limit the storage requirements of HEVC codecs, the HEVC standard has adopted a strategy of using compressed motion fields for motion information prediction instead of the full motion field.
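One way to picture a compressed motion field is as a subsampled grid. The sketch below assumes that motion is stored on a grid of small units and that compression keeps only the top-left entry of each group of 4×4 units (for example, one entry per 16×16 samples if a unit covers 4×4 samples). The grid layout, the ratio, and the function names are assumptions made for illustration and are not stated in the paragraph above.

```python
def compress_motion_field(field, ratio=4):
    """Keep one entry per ratio x ratio group of units: the top-left one."""
    return [row[::ratio] for row in field[::ratio]]


def lookup(compressed, x_unit, y_unit, ratio=4):
    """Motion seen at unit (x_unit, y_unit) after compression."""
    return compressed[y_unit // ratio][x_unit // ratio]


# Hypothetical 8x8-unit field where each entry records its own position.
field = [[(x, y) for x in range(8)] for y in range(8)]
compressed = compress_motion_field(field)
```

With these assumptions, storage drops by a factor of 16, at the price that every unit in a group sees the motion of the group's top-left unit.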

Storage of a motion field is required only when the motion information of previous images is used. In HEVC, the use of temporal motion information predictors may be deactivated. In that case, a further reduction in the storage requirements of an HEVC codec may be obtained by not storing any motion information.

One of the main ideas in a scalable codec is to reuse information from a reference layer (RL), coded using a given codec (e.g., HEVC), in order to encode the information of the enhancement layer.

It would be desirable to use the motion information determined in the reference layer for the predictive encoding of motion information in the enhancement layer. In particular, if the use of temporal motion information predictors is deactivated in the reference layer, it may happen that no motion information is available for predicting motion information in the enhancement layer.

The present invention has been devised to address one or more of the foregoing concerns. It concerns the process of determining a motion information predictor in the enhancement layer of a scalable coding scheme, also known as the motion derivation process. It comprises a correction of the position in the reference layer used to pick up the more relevant motion information available as a result of the compression scheme.

According to a first aspect of the invention, there is provided a method of encoding an image according to a scalable encoding format, said encoding format comprising at least a reference layer and an enhancement layer, at least part of the image being predictively encoded based on motion information, said motion information being itself predictively encoded based on a set of motion information predictors. The method comprises, at least for an image area of the enhancement layer: determining a set of motion information predictors based on motion information of another part of images belonging to the reference layer; determining a co-located position, in the reference layer, of the image area to be encoded in the enhancement layer, in order to select motion information associated with said position as part of said set of motion information predictors; and correcting at least one coordinate of said co-located position, said correction comprising adding a determined value to at least one of the coordinates of said co-located position to obtain a modified coordinate value, and applying a rounding function to the modified coordinate value.
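The correction step in this aspect (add a determined value to a coordinate, then apply a rounding function) can be sketched in Python as below. The offset of 4 and the rounding down to a multiple of 16 are assumptions chosen for illustration; the paragraph above specifies neither the value added nor the rounding function, and `correct_position` is a hypothetical name.

```python
def correct_position(coord, offset=4, granularity_log2=4):
    """Add an offset, then round down to a multiple of 2**granularity_log2.

    Assumed defaults: offset 4 and a 16-sample grid. Both are
    illustrative choices, not values given in the surrounding text.
    """
    return ((coord + offset) >> granularity_log2) << granularity_log2


# Corrected co-located coordinates for a few sample positions.
corrected = [correct_position(c) for c in (11, 12, 27, 28)]
```

Under these assumptions, the offset moves the switching point between grid positions: coordinates 12 to 27 map to 16 rather than 16 to 31, so a position is attracted to the next grid point earlier than with plain truncation.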

According to a further aspect of the invention, there is provided a method of decoding an image according to a scalable encoding format, said encoding format comprising at least a reference layer and an enhancement layer, at least part of the image being predictively encoded based on motion information, said motion information being itself predictively encoded based on a set of motion information predictors. The method comprises, at least for an image area of the enhancement layer: determining a set of motion information predictors based on motion information of another part of images belonging to the reference layer; determining a co-located position, in the reference layer, of the image area to be decoded in the enhancement layer, in order to select motion information associated with said position as part of said set of motion information predictors; and correcting at least one coordinate of said co-located position, said correction comprising adding a determined value to at least one of the coordinates of said co-located position to obtain a modified coordinate value, and applying a rounding function to the modified coordinate value.

According to another aspect of the invention, there is provided a device for encoding an image according to a scalable encoding format, said encoding format comprising at least a reference layer and an enhancement layer, at least part of the image being predictively encoded based on motion information, said motion information being itself predictively encoded based on a set of motion information predictors. The device comprises, at least for an image area of the enhancement layer: a predictor determining module for determining a set of motion information predictors based on motion information of another part of images belonging to the reference layer; a position determining module for determining a co-located position, in the reference layer, of the image area to be encoded in the enhancement layer, in order to select motion information associated with said position as part of said set of motion information predictors; and a position correcting module for correcting at least one coordinate of said co-located position, said correction comprising adding a determined value to at least one of the coordinates of said co-located position to obtain a modified coordinate value, and applying a rounding function to the modified coordinate value.

According to another aspect of the invention, there is provided a device for decoding an image according to a scalable encoding format, said encoding format comprising at least a reference layer and an enhancement layer, at least part of the image being predictively encoded based on motion information, said motion information being itself predictively encoded based on a set of motion information predictors. The device comprises, at least for an image area of the enhancement layer: a predictor determining module for determining a set of motion information predictors based on motion information of another part of images belonging to the reference layer; a position determining module for determining a co-located position, in the reference layer, of the image area to be decoded in the enhancement layer, in order to select motion information associated with said position as part of said set of motion information predictors; and a position correcting module for correcting at least one coordinate of said co-located position, said correction comprising adding a determined value to at least one of the coordinates of said co-located position to obtain a modified coordinate value, and applying a rounding function to the modified coordinate value.

At least parts of the methods according to the invention may be computer implemented. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit", "module", or "system". Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer-usable program code embodied in the medium.

Since the present invention can be implemented in software, it can be embodied as computer-readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible carrier medium may comprise a storage medium such as a floppy disk, a CD-ROM, a hard disk drive, a magnetic tape device, or a solid-state memory device, and the like. A transient carrier medium may include a signal such as an electrical signal, an electronic signal, an optical signal, an acoustic signal, a magnetic signal, or an electromagnetic signal, e.g., a microwave or radio-frequency signal.

Embodiments of the invention will now be described, by way of example only, with reference to the following drawings, in which:

FIG. 1 illustrates a block diagram of a classical scalable video encoder;

FIG. 2 illustrates spatial and temporal positions for motion information predictors in an embodiment of the invention;

FIG. 3 illustrates prediction of motion information in the enhancement layer using motion information of the reference layer in an embodiment of the invention;

FIG. 4 illustrates a flowchart of the decoding process in an embodiment of the invention;

FIG. 5 illustrates the granularity of motion information in an embodiment of the invention;

FIG. 6 schematically illustrates the principles of the TextureRL approach in an embodiment of the invention;

FIG. 7 illustrates the adapted motion information predictor derivation process in the context of both the AMVP and merge modes of the TextureRL approach in the enhancement layer in an embodiment of the invention;

FIG. 8 illustrates the adapted process in the context of the reference frame index approach in an embodiment of the invention;

FIG. 9 is a flowchart of the motion information derivation process of the merge modes in an embodiment of the invention;

FIG. 10 presents a block diagram of a scalable decoder in an embodiment of the invention;

FIG. 11 is a block diagram of a computing device for implementing one or more embodiments of the invention;

FIG. 12 shows the derivation of the set of AMVP motion information predictors in an embodiment of the invention;

FIG. 13 illustrates details of a memory area in an embodiment of the invention.

FIG. 1 illustrates a block diagram of a classical scalable video encoder, which may comprise a number of subparts or stages and which is representative of the scalable extension of HEVC. Two subparts or stages, A10 and B10, are illustrated here, producing data corresponding to a base layer 1.13 and data corresponding to one enhancement layer 1.14. Each of the subparts A10 and B10 follows the principles of a standard video encoder, with the transformation, quantization, and entropy coding steps applied in two separate passes, one for each layer.

The first stage B10 aims at encoding the H.264/AVC or HEVC compliant base layer of the scalable output stream. The input to this non-scalable encoder is the original sequence of frame images obtained by applying downsampling 1.17 to the images 1.1. This encoder successively performs the following steps to encode a standard video bitstream. The picture or frame to be encoded (compressed) is divided into pixel blocks in step 1.2; these blocks are called coding units (CUs) in the HEVC standard. Each block first undergoes a motion estimation operation 1.3, which searches the reference pictures stored in a dedicated memory buffer 1.4 for reference blocks that would provide a good prediction of the block. This motion estimation step provides one or more indices of the reference pictures that contain the found reference blocks, as well as the corresponding motion vectors. A motion compensation step 1.5 then applies the estimated motion vectors to the found reference blocks to obtain a temporal residual block, which is the difference between the motion-compensated prediction block and the original block to be predicted. In addition, an intra prediction step 1.6 determines the spatial prediction mode that would give the best performance for predicting the current block. A spatial residual block is again computed, this time as the difference between the spatial predictor and the original block to be predicted.

A coding mode selection mechanism 1.7 then chooses, among the spatial and temporal predictions, the coding mode that provides the best rate-distortion trade-off for coding the current block. Depending on the selected prediction mode, the prediction residual block then undergoes a transform (DCT) and quantization 1.8. Entropy coding 1.10 of the quantized coefficients QTC (and the associated motion data MD) is performed. The compressed texture data 1.13 associated with the coded current block 1.2 is sent for output.

To further improve coding efficiency, the motion information associated with inter blocks, which undergo the motion compensation step, is predictively encoded using the motion information of neighboring blocks. The neighboring blocks in this case comprise spatially and, optionally, temporally neighboring blocks. As a consequence, if temporally neighboring blocks are used, the motion information of previously encoded images must be stored to allow the prediction. In the current version of the standard, this information is stored in compressed form by the encoder and the decoder to limit the memory usage of the encoding and decoding process. However, as mentioned earlier, when the temporal predictor is not used in motion information prediction, storing the motion field of previous images is not necessary.

The current block is then reconstructed through inverse quantization (also called scaling) and inverse transform 1.11. Where necessary, the inverse-transformed residual is summed with the prediction block of the current block to form the reconstructed block. The reconstructed blocks are added to a buffer to form the reconstructed frame. This reconstructed frame is then post-filtered 1.12. The reconstructed frame after this post-filter is stored in the memory buffer 1.4, called the decoded picture buffer (DPB), so that it is available for use as a reference picture to predict any subsequent pictures to be encoded.
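The reconstruction step described above can be sketched as follows. This is a minimal illustration, not the encoder's actual implementation: the function name, the uniform quantization step `qstep`, the 8-bit clipping range, and the omission of the inverse transform (treated as identity) are all assumptions made to keep the example short.

```python
def reconstruct_block(levels, prediction, qstep):
    """Rebuild a block from quantized levels and its prediction block.

    Hypothetical helper: the inverse transform is deliberately omitted
    (treated as identity), and a single uniform step `qstep` is assumed.
    """
    # Inverse quantization (scaling) of each quantized level.
    residual = [[level * qstep for level in row] for row in levels]
    # Sum residual and prediction, clipping to an assumed 8-bit sample range.
    return [
        [max(0, min(255, p + r)) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

The same summation applies whether the prediction block comes from motion compensation or from intra prediction.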

Finally, the last entropy coding step is given the coding mode and, in the case of an inter block, the motion data, as well as the previously computed quantized DCT coefficients. This entropy coder encodes each of these pieces of data into binary form and encapsulates the encoded block in a container called a NAL unit (network abstraction layer unit). A NAL unit contains all the encoded coding units from a given slice. An encoded HEVC bitstream consists of a series of NAL units.

Next, the second stage A10 illustrates the coding of an enhancement layer using the base layer as the reference layer. Here, this enhancement layer brings a refinement of the spatial resolution to the upsampled base layer. As illustrated in FIG. 1, the coding scheme of this enhancement layer is similar to that of the base layer, except that for each coding unit of a current picture from the stream 1.10 being compressed, additional modes based on inter-layer prediction are considered. The following modifications are typically included.

An additional mode called IntraBL 1.90, which consists in predicting a block of the enhancement layer using the upsampled co-located block of the reference layer, is added to the list of modes considered for an enhancement layer block.

The motion information prediction used in inter coding modes can be modified to take into account motion information coming from the reference layer. In addition, specific entropy coding of the motion information can be applied during the entropy coding step 1.20.

For these new tools, an intermediate inter-layer module 1.18 can provide information (motion information, samples), possibly upsampled according to the change in spatial resolution, from the reference layer to the different modules of the enhancement layer encoding, such as the motion estimation module 1.23, the IntraBL mode 1.90 or the intra prediction module 1.26. In particular, in the reference frame approach, module 1.18 upsamples both the sample data and the motion data of the resulting frame in DPB 1.4 to match the enhancement layer dimensions, and inserts the resulting data (image and its motion) into DPB 1.24, which in turn affects operations 1.25 and 1.23.

FIG. 10 presents a block diagram of a scalable decoder applied to a scalable bitstream made of two scalability layers, for example a base layer and an enhancement layer. This decoding process is thus the reverse of the scalable coding process of FIG. 1. The scalable stream 10.10 being decoded is made of one base layer and one spatial enhancement layer on top of the base layer, which are demultiplexed in step 10.20 into their respective layers.

The first stage of FIG. 10 concerns the base layer decoding process B12. This decoding process starts with entropy decoding 10.30 of each coding unit or block of each coded picture in the base layer. This entropy decoding 10.30 provides the coding mode, the motion information (reference picture indices and motion vectors of inter-coded blocks), the prediction direction for intra prediction, and the residual data. This residual data consists of quantized and transformed DCT coefficients. These quantized DCT coefficients then undergo inverse quantization and inverse transform operations 10.31. The motion-compensated 10.32 or intra-predicted 10.33 data can be added to this residual through operation 10.34.

A deblocking filter step 10.35 is then applied. The reconstructed image is then stored in the frame buffer 10.40.

Next, the decoded motion information for inter blocks and the reconstructed blocks are stored in a frame buffer in the first of the scalable decoders of FIG. 10 (B12). Such frames contain data that can be used as reference data to predict an upper scalability layer.

Next, the second stage of FIG. 10 decodes a spatial enhancement layer A12 on top of the base layer decoded by the first stage. This spatial enhancement layer decoding involves entropy decoding of the second layer, which provides the coding modes, motion information and intra prediction information, as well as the transformed and quantized residual information of the blocks of the second layer.

The next step consists in predicting the blocks in the enhancement picture. The choice 10.51 between the different types of block prediction (intra prediction, inter prediction or, in the case of the TextureRL approach, inter-layer prediction) depends on the prediction mode obtained from the entropy decoding step 10.50.

As for intra blocks, their treatment depends on the type of intra coding unit. In the case of an inter-layer predicted intra block (IntraBL coding mode) 10.57, if residual data was encoded for the block, the result of the entropy decoding 10.50 undergoes inverse quantization and inverse transform 10.52 and is then added by operation 10.53 to the block co-located with the current block in the base picture, in its decoded, post-filtered and upsampled (in the case of spatial scalability) version. In the case of an intra block, such a block is fully reconstructed through inverse quantization and inverse transform to obtain the residual data in the spatial domain, followed by intra prediction 10.54 to obtain the fully reconstructed block.

As for inter blocks, their reconstruction involves their motion compensation 10.55, computed from the frame memory 10.60, the decoding of the residual data, and then the addition of their decoded residual information to their temporal predictor block. In this inter block decoding process, the motion information associated with the considered block can be decoded predictively, as a refinement of the motion information of the co-located block in the base picture. This aspect is detailed below.

As in the base layer, a deblocking filter step 10.56 can be applied to the images output from 10.53, and they are stored in frame memory 10.60 before being returned by the decoder as fully decoded frames 10.70. Note that in an embodiment of the invention, motion compensation 10.55 actually uses data from the enhancement layer image buffer 10.60 and from the base layer image buffer 10.40. Together with the sample data, module 10.80 may be in charge of providing such data from the reference layer by upsampling it.

Two approaches for image prediction may be considered. More particularly, image prediction comprises motion information prediction. Both approaches affect motion information prediction, but in different ways. The two image prediction approaches are described below. A feature of these approaches is that they allow the motion information of the reference layer to be used for predicting motion information in the enhancement layer. This feature is described in more detail with reference to FIG. 3, and affects how the frame memory 1.4 of the reference layer in FIG. 1 is accessed.

The case of motion information prediction, which is the key point of this invention, is then explained in detail for both approaches.

The first approach is usually referred to as TextureRL, since use of the IntraBL mode is allowed. This approach uses low-level syntax at the block level to signal the use of the IntraBL mode. It is sometimes referred to by some experts as the "IntraBL approach".

The second approach, called reference frame insertion, relies mostly on high-level changes. In particular, no syntax change is made at the block level. The main feature of the reference frame index approach is to introduce images of the reference layer (possibly upsampled when the resolution differs), called ILR images (inter-layer reference images), into the decoded picture buffer of the enhancement layer. These images are then inserted at the end of specific reference image lists (lists L0 and L1) used as reference images in the DPB (decoded picture buffer). The insertion depends on the type of the current slice of the enhancement layer. In a P slice, the ILR image is inserted at the end of list L0. In a B slice, the ILR image is inserted at the end of both list L0 and list L1. This approach is sometimes referred to by some experts as the "ref_idx approach". With this approach, the motion information of a given block can be predictively encoded using a temporal motion information predictor of the reference layer, co-located in the reference layer.
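The list-insertion rule of the reference frame index approach can be illustrated with a short sketch. The function name, the use of plain Python lists for L0 and L1, and the string labels for pictures are hypothetical; only the rule itself (append to L0 for P, append to both L0 and L1 for B) comes from the description.

```python
def insert_ilr_picture(slice_type, list0, list1, ilr_picture):
    """Return copies of (L0, L1) with the ILR picture appended as described.

    Hypothetical helper: pictures are represented by arbitrary labels.
    """
    l0, l1 = list(list0), list(list1)
    if slice_type in ("P", "B"):
        l0.append(ilr_picture)  # P and B: ILR picture goes at the end of L0
    if slice_type == "B":
        l1.append(ilr_picture)  # B only: also at the end of L1
    return l0, l1
```

For example, calling `insert_ilr_picture("B", ["prev"], ["next"], "ILR")` leaves the temporal references first in each list and places the inter-layer reference last, so it receives the highest reference index in both lists.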

It is worth noting that IntraBL in the TextureRL approach and the use of an inserted reference frame in the reference frame index approach are two ways of using base layer information to predict the enhancement layer. In the description of this invention, and for the sake of simplicity, we consider one of these approaches at a time, not both together.

We now describe general information regarding motion information that is valid for both approaches.

A typical video codec exploits both spatial and temporal correlations between pixels in its respective intra and inter modes. We focus here on the inter modes, which exploit the temporal correlation between pixels of the current frame and previously encoded/decoded frames.

In HEVC (and SHVC by extension), the inter mode is a prediction mode that defines a temporal prediction direction. Zero to two sets of motion information are defined depending on this temporal direction. If the inter prediction direction is 0, the block is coded with the intra mode and contains no motion information. If the inter prediction direction is 1, the block contains motion information from a list of reference frames called L0. If the inter prediction direction is 2, the block contains motion information from another list of reference frames called L1. If the inter prediction direction is 3, the block contains motion information from both lists L0 and L1.
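The mapping from inter prediction direction to the number of motion information sets can be written as a small lookup. The function name and the tuple representation are illustrative assumptions; the four cases follow the description above.

```python
def reference_lists_for(inter_pred_dir):
    """Return the reference lists that carry motion information for a block."""
    table = {
        0: (),            # intra-coded: no motion information
        1: ("L0",),       # uni-directional, list L0
        2: ("L1",),       # uni-directional, list L1
        3: ("L0", "L1"),  # bi-directional, both lists
    }
    if inter_pred_dir not in table:
        raise ValueError("inter prediction direction must be 0, 1, 2 or 3")
    return table[inter_pred_dir]
```

The length of the returned tuple is the number of motion information sets (zero to two) attached to the block.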

The motion information consists of the following: an index (ref_idx) into the list of reference frames, and a motion vector with two components, the horizontal and vertical motion values. These values correspond to a spatial displacement, in pixels, between the position of the current block and that of the temporal predictor block in the reference frame. This displacement can have sub-pixel precision (0, 1, 2 or 3 quarters of a pixel).
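As an illustration of quarter-pixel precision, the sketch below assumes that each motion vector component is stored as an integer counted in quarter-pixel units; this storage convention and the helper names are assumptions, not something stated in the text.

```python
from fractions import Fraction


def split_quarter_pel(mv_component):
    """Split a component in quarter-pixel units into (whole pixels, quarters)."""
    whole_pixels = mv_component >> 2  # floor division by 4
    quarters = mv_component & 3       # 0, 1, 2 or 3 quarters of a pixel
    return whole_pixels, quarters


def displacement_in_pixels(mv_component):
    """Exact displacement in pixels for a component in quarter-pixel units."""
    return Fraction(mv_component, 4)
```

With this convention a stored value of 9 means two whole pixels plus one quarter, and a stored value of -5 splits into -2 whole pixels plus three quarters, which is -1.25 pixels.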

The inter prediction directions 1 and 2 mentioned above correspond to uni-directional predictions; they cannot be used in I slices (intra-coded slices), but can be used in P slices (predicted slices) and B slices (bi-predicted slices). An image of a particular type (I, P or B) is made of at least one slice of the same type. Inter prediction direction 3 is called bi-directional prediction and can only be used in B slices. In this case, two block predictors are considered, one for each of the lists L0 and L1. Consequently, two reference frame indices are considered, as well as two motion vectors. The inter block predictor for bi-prediction is the pixel-by-pixel average of the two blocks pointed to by these two motion vectors. The block predictor here corresponds to the notion of prediction unit or prediction block in HEVC or SHVC.
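The pixel-by-pixel averaging used for bi-prediction can be sketched as below. The function name and the round-half-up integer average are assumptions; a real codec typically averages at a higher intermediate precision, which this simplified example does not model.

```python
def bi_predict(block_l0, block_l1):
    """Average two equally sized predictor blocks sample by sample.

    Simplified sketch: integer samples, rounding half up.
    """
    return [
        [(a + b + 1) >> 1 for a, b in zip(row0, row1)]
        for row0, row1 in zip(block_l0, block_l1)
    ]
```

Here `block_l0` and `block_l1` stand for the blocks pointed to by the L0 and L1 motion vectors, after motion compensation.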

As described above, motion information in HEVC is coded by predictive coding using a plurality of motion information predictors, among which is the temporally co-located motion information. It is therefore necessary for each frame that is used as a reference frame to store its related motion information on both the encoder and decoder sides. This motion information is compressed to reduce its size in the dedicated motion information memory.

HEVC therefore uses a particular granularity to represent motion. This is depicted in FIG. 5. For every block 5.0 of 16×16 pixels, the minimum granularity used by HEVC is 4×4 pixels, resulting in 16 potential pieces of motion information, one for each 4×4 block. Motion information compression consists in keeping only the motion information corresponding to the top-left 4×4 block 5.1 of a given block 5.0.
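A minimal sketch of this compression, assuming the motion field is held as a two-dimensional list with one entry per 4×4 block (so a 16×16 block spans 4×4 entries). The data layout and function name are hypothetical.

```python
def compress_motion_field(field):
    """Replace every entry by that of the top-left 4x4 block of its 16x16 block.

    `field[r][c]` is the motion information of the 4x4 block at row r, column c
    (indices counted in 4x4 units, not in pixels).
    """
    return [
        [field[(r // 4) * 4][(c // 4) * 4] for c in range(len(row))]
        for r, row in enumerate(field)
    ]
```

After compression, at most one distinct piece of motion information remains per 16×16 block, which is the 16-to-1 reduction described above; an implementation would store only that one entry rather than a full-size field.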

The motion information compression process may occur as soon as the final choice has been made for a 16×16 block and it has been encoded, but it is simpler to visualize it as being performed once the whole image has been encoded. For simplicity, it can be considered to be performed after the adaptive loop filter process and before the decoded picture is put into the decoded picture buffer (DPB). This compression process can be described as a particular lookup: for given pixel coordinates X and Y, the motion information is obtained from the position X' = (X>>4)<<4 and Y' = (Y>>4)<<4, where the '>>' and '<<' operators are described as follows.

x>>y represents the arithmetic right shift of a two's complement integer representation of x by y binary digits. This function is defined only for non-negative integer values of y. Bits shifted into the most significant bits (MSBs) as a result of the right shift have a value equal to the MSB of x prior to the shift operation.

x<<y represents the arithmetic left shift of a two's complement integer representation of x by y binary digits. This function is defined only for non-negative integer values of y. Bits shifted into the least significant bits (LSBs) as a result of the left shift have a value equal to 0.
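As an illustration of the lookup described above, the following sketch maps a pixel position to the position whose motion information survives compression, that is, the top-left corner of the enclosing 16×16 block. The names `BLOCK_SHIFT` and `compressed_position` are hypothetical. Python's `>>` on integers behaves as an arithmetic shift, which matches the operator definition given here.

```python
BLOCK_SHIFT = 4  # log2(16): one motion entry is kept per 16x16 block


def compressed_position(x, y):
    """Return (X', Y') = ((X >> 4) << 4, (Y >> 4) << 4)."""
    return (x >> BLOCK_SHIFT) << BLOCK_SHIFT, (y >> BLOCK_SHIFT) << BLOCK_SHIFT


# Every pixel of the 16x16 block starting at (32, 16) maps to that block's top-left corner.
position = compressed_position(37, 18)
```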

It may be noted that some implementations may use a buffer to store the corresponding compressed motion.

In the HEVC design, motion vectors are coded by predictive coding using a plurality of motion information predictors. For an inter-coded block, there are three sub-modes, called the Skip, Inter and Merge block coding sub-modes. The Inter sub-mode uses a particular motion prediction method called AMVP and uses residual texture data. The Skip and Merge sub-modes use the same motion prediction method (but the former does not use residual data). This prediction method allows the selection of the best motion information predictor from a given set, the set being composed of spatial and temporal motion information.

We will describe the motion information prediction mode called the Merge mode and how it applies to both approaches mentioned above: TextureRL and reference frame index. It is used for two inter coding sub-modes, the Skip and Merge sub-modes. We will then detail an equivalent scheme that may be used in AMVP mode.

FIG. 3 shows a generic flowchart of the Merge motion information predictor scheme for the Merge and Skip sub-modes on the encoder side, referred to in short as "the Merge mode". The principle of the Merge mode is to use motion vector prediction for motion compensation without coding any motion refinement. The motion information predictor generation module 3.3 generates the motion information predictor set 3.4 based on the motion information field 3.1, as described in detail below. The rate-distortion selection 3.7 of the best motion information predictor is applied among the motion information predictor set 3.4. This generates a motion vector predictor index 3.8 that is to be coded.

The conversion module 3.14 converts said index into a truncated unary code 3.15: to encode a value N, a codeword of length N+1 is generated, except for the maximum value of N, which requires N bits instead. This code consists of N bits set to 1 and a final bit set to 0. If the value N is equal to the maximum number of candidates, this terminating bit is not needed, and the codeword length is thus N. Because of this maximum value, and since the number of Merge candidates (usually 5 for HEVC) can be selected at the slice level (syntax element five_minus_max_num_Merge_cand in HEVC), step 3.14 takes into account the maximum number of predictors 3.16.

The generated codeword 3.15 is then entropy coded by the entropy coding step 3.12:

- The first bit uses arithmetic coding with a specific context;

- The remaining bits use bypass coding, i.e. an actual bit is generated.
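The truncated unary binarization of the predictor index can be sketched as follows, assuming the usual convention of N bits equal to 1 followed by a terminating 0 that is dropped for the largest codable value. The function name `truncated_unary` and the parameter `max_value` (the largest value that can be coded) are hypothetical; how `max_value` is derived from the number of candidates is left to the caller. The entropy coding of the resulting bits (context-coded first bit, bypass-coded remainder) is not modelled.

```python
def truncated_unary(n, max_value):
    """Return the truncated unary codeword for n as a list of bits."""
    if not 0 <= n <= max_value:
        raise ValueError("n must lie in [0, max_value]")
    bits = [1] * n
    if n < max_value:
        bits.append(0)  # terminating bit, omitted only for the maximum value
    return bits


codewords = [truncated_unary(n, 4) for n in range(5)]
```

With `max_value` equal to 4, the codewords have lengths 1, 2, 3, 4 and 4, so the last two values cost the same number of bits.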

FIG. 4 shows the flowchart of the corresponding decoding process. In a first step, module 4.2 generates the motion information predictor set 4.8 based on the motion information field 4.1 of the current frame and of previous frames. The maximum number 4.16 of motion predictors is decoded from the syntax element five_minus_max_num_Merge_cand located in the slice header. It is then used in step 4.6 to extract the motion information predictor codeword 4.14. This codeword is converted by step 4.15 into a predictor index 4.9. The motion information predictor 4.10 to use is then extracted from the set 4.8 according to this predictor index value 4.9. This predictor is then used as the actual motion information during motion compensation.
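The decoder-side conversion from codeword to predictor can be sketched in the same spirit. This is an illustrative sketch only: the names `decode_truncated_unary` and `select_predictor` are hypothetical, the bits are assumed to be already entropy decoded, and the largest codable index is assumed to be the number of candidates minus one.

```python
def decode_truncated_unary(bit_iter, max_value):
    """Read bits until a 0 is seen or max_value ones have been consumed."""
    value = 0
    while value < max_value and next(bit_iter) == 1:
        value += 1
    return value


def select_predictor(bit_iter, candidates):
    """Decode the predictor index and return the matching candidate."""
    # Assumption: the largest codable index is len(candidates) - 1.
    index = decode_truncated_unary(bit_iter, len(candidates) - 1)
    return candidates[index]


chosen = select_predictor(iter([1, 0]), ["cand0", "cand1", "cand2"])
```

Knowing the maximum value in advance is what lets the decoder stop without reading a terminating bit for the last index.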

A motion information predictor, or candidate, contains all the motion information: direction (i.e. the availability of a motion vector and reference index within a list), reference frame index and motion vectors. Several candidates, each having an index, are generated by the Merge derivation process described below. In HEVC, the maximum number of candidates Max_Cand is by default equal to 5, but may be reduced down to 1. We describe here the Merge motion information predictor determination, with specific parts for the TextureRL and reference frame index approaches.

FIG. 9 is a flowchart of the motion information derivation process of the Merge modes. In the first step of the derivation, in the HEVC core as well as in the TextureRL and reference frame index approaches, 7 block positions, 9.1 to 9.7, are considered.

In addition, in the case of the TextureRL approach, another candidate, the SMVP 9.0 (spatial motion vector predictor), is considered, as described above. These positions are the spatial and temporal positions depicted in FIG. 2. Each position has the same name in both figures. This SMVP does not exist in the reference frame index approach.

Module 9.8 checks the availability of the spatial motion information and, in the TextureRL approach, also of the SMVP for an enhancement layer. It selects at most 4 motion information predictors. In this module, a predictor is available if it exists in the reference layer and if this block is not intra coded. In addition, in the following, within the TextureRL approach, any candidate to be added is also compared to the SMVP in addition to any other motion information, and is actually added only if it is different. For instance, the "Left" candidate, referenced A1 or 9.1, is also compared to the SMVP and is added as the second candidate if motion exists at position X2, or as the first otherwise. This comparison, here and in the following, is performed by checking that:

- The motion information from the two candidates has the same prediction direction;

- If that is the case, for each piece of motion information associated with the prediction direction:

  - The same frame is referenced (i.e. same value of the ref_idx index);

  - The motion vectors are identical in both their vertical and horizontal coordinates.
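The comparison criteria listed above can be sketched as a small predicate. The data layout is an assumption made for illustration: each candidate carries one optional entry per reference list (L0 and L1), and the set of present entries stands for the prediction direction. The names `ListMotion`, `Candidate` and `same_motion` are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class ListMotion(NamedTuple):
    ref_idx: int          # index of the reference frame in the list
    mv: Tuple[int, int]   # (horizontal, vertical) motion vector components


class Candidate(NamedTuple):
    l0: Optional[ListMotion]  # None when list L0 is not used
    l1: Optional[ListMotion]  # None when list L1 is not used


def same_motion(a, b):
    """True when both candidates have the same direction, ref_idx and vectors."""
    for ma, mb in ((a.l0, b.l0), (a.l1, b.l1)):
        if (ma is None) != (mb is None):
            return False  # different prediction direction
        if ma is not None and (ma.ref_idx != mb.ref_idx or ma.mv != mb.mv):
            return False  # different reference frame or motion vector
    return True


left = Candidate(ListMotion(0, (3, -1)), None)
above = Candidate(ListMotion(0, (3, -1)), None)
duplicate = same_motion(left, above)
```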

The selection and the check of these 5 motion vectors are described by the following conditions:

- In the TextureRL approach, if the motion information 9.0 from the co-located position X2 of the center position X1 of the PU is available 9.8, it is scaled and used as the first candidate in list 9.10.

- If the "Left" A1 motion information 9.1 is available 9.8, meaning that it exists and that this block is not intra coded, the motion information of the "Left" block is selected and used as the first candidate in list 9.10.

- If the "Above" B1 motion information 9.2 is available 9.8, the "Above" block candidate is compared 9.9 to A1 (if it exists). If B1 is equal to A1, B1 is not added to the list of spatial candidates 9.10; otherwise it is added.

- If the "Above Right" B0 motion information 9.3 is available 9.8, the motion vector of the "Above Right" block is compared 9.9 to B1. If B0 is equal to B1, B0 is not added to the list of spatial candidates 9.10; otherwise it is added.

- If the "Below Left" A0 motion vector 9.4 is available 9.8, the motion information of the "Below Left" block is compared 9.9 to A1. If A0 is equal to A1, A0 is not added to the list of spatial candidates 9.10; otherwise it is added.

- If the list of spatial candidates does not contain 4 candidates, the availability 9.8 of the "Above Left" B2 motion information 9.5 is checked; if it is available, the motion vector of the "Above Left" B2 block is compared 9.9 to A1 and B1. If B2 is equal to A1 or B1, B2 is not added to the list of spatial candidates 9.10; otherwise it is added.

At the end of this stage, list 9.10 contains from 0 to 4 candidates.
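The cascade of availability checks and pairwise comparisons for the spatial candidates can be sketched as below. This is a simplified illustration: the function name `spatial_merge_candidates` is hypothetical, an unavailable neighbour is passed as `None`, plain equality stands in for the candidate comparison, and the TextureRL-specific SMVP candidate and its extra comparisons are deliberately left out.

```python
def spatial_merge_candidates(a1=None, b1=None, b0=None, a0=None, b2=None):
    """Build the spatial candidate list from the five neighbouring positions."""
    candidates = []
    if a1 is not None:                    # "Left"
        candidates.append(a1)
    if b1 is not None and b1 != a1:       # "Above", compared to A1
        candidates.append(b1)
    if b0 is not None and b0 != b1:       # "Above Right", compared to B1
        candidates.append(b0)
    if a0 is not None and a0 != a1:       # "Below Left", compared to A1
        candidates.append(a0)
    # "Above Left" is considered only while fewer than 4 candidates were found.
    if len(candidates) < 4 and b2 is not None and b2 != a1 and b2 != b1:
        candidates.append(b2)
    return candidates


# B1 duplicates A1 and A0 is unavailable, so B2 is still examined.
merged = spatial_merge_candidates(a1="m1", b1="m1", b0="m2", b2="m3")
```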

For the temporal candidate, two positions can be used: H 9.6, corresponding to the bottom-right position BR1 of the co-located block, or the center C 9.7 of the co-located block ("co-located" means the block at the same position in a temporally different frame), corresponding to the center position X1 of the current block. These positions are depicted in FIG. 2.

First, the availability of the block at the H position 9.6 is checked 9.11. If it is not available, the block at the center position 9.7 is then checked 9.11. If motion information is available at at least one of these positions, this temporal motion information can be scaled if needed 9.12 to be homogeneous with motion information coming from the reference frame with index 0, for both lists L0 and L1 if needed, in order to create the temporal candidate 9.13; the temporal candidate is then inserted in the Merge candidate list just after the spatial candidates.

Furthermore, the final position for the temporal candidate, H or center depending on availability, is constrained to remain within the same CTB (coding tree block) or its right neighbor so as to reduce memory accesses.

It is important to note that, for all layers and all approaches, but most importantly in the reference layer, this motion information predictor is conditionally determined and added depending on:

- whether said temporal motion information predictor (TMVP) is disabled at the sequence level, for instance using the flag sps_temporal_mvp_enable_flag located in the SPS (sequence parameter set); this is particularly relevant to an embodiment of the invention;

- if it is enabled at the sequence level, whether it is disabled at the slice level, for instance using the flag enable_temporal_mvp_flag located in the slice header.
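The conditional derivation of the temporal candidate can be summarized in a short sketch. The function name `temporal_candidate` and its parameters are hypothetical; the two Boolean arguments stand for the sequence-level and slice-level flags mentioned above, and an unavailable position is passed as `None`. Motion vector scaling and the CTB position constraint are omitted.

```python
def temporal_candidate(h_motion, center_motion, sps_tmvp_enabled, slice_tmvp_enabled):
    """Return the temporal candidate, or None when it is disabled or unavailable."""
    # The slice-level flag only matters when the sequence-level flag enables TMVP.
    if not (sps_tmvp_enabled and slice_tmvp_enabled):
        return None
    if h_motion is not None:   # bottom-right position H is tried first
        return h_motion
    return center_motion       # fall back to the center position (may be None)


fallback = temporal_candidate(None, (4, -2), True, True)
```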

The fact that this motion information predictor can be disabled, together with how it is affected by motion vector memory compression, plays an important role in the described process and in how the SMVP predictor 9.0 is derived.

Second, in the reference frame index approach, this temporal motion information predictor may come from the inserted frame. As will be described below, said motion information is actually derived from the compressed motion field of the reference layer frame.

If the number of candidates (Nb_Cand) 9.14 is strictly less than the maximum number of candidates Max_Cand, which is 5 by default and at most, the combined candidates are generated in step 9.15; otherwise the final list of Merge candidates is built in step 9.18. Module 9.15 is used only when the current frame is a B frame, and it generates several candidates based on the candidates available in the two Merge lists. This generation consists in combining the motion information of one candidate from list L0 with the motion information of a different candidate from list L1.

If the number of candidates (Nb_Cand) 9.16 is strictly less than the maximum number of candidates Max_Cand, null motion information candidates with no displacement (0,0) (i.e. all motion vector values equal to zero) are added in step 9.17, incrementing Nb_Cand, until Nb_Cand equals Max_Cand.

At the end of this process, the final list of Merge candidates is built in step 9.18.
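The completion of the Merge list with combined and zero candidates can be sketched as follows. Several points are assumptions made for illustration: a candidate is a pair `(l0, l1)` whose entries are either `None` or `(ref_idx, (mvx, mvy))`, the order in which candidate pairs are combined is a plain nested loop rather than any standardized order, and zero candidates use reference index 0. The name `complete_merge_list` is hypothetical.

```python
ZERO = (0, (0, 0))  # (ref_idx, motion vector); ref_idx 0 is an assumption


def complete_merge_list(candidates, max_cand, is_b_frame):
    """Append combined candidates (B frames only), then zero candidates."""
    result = list(candidates[:max_cand])
    if is_b_frame:
        originals = list(result)
        for i, first in enumerate(originals):
            for j, second in enumerate(originals):
                if len(result) >= max_cand:
                    break
                if i == j or first[0] is None or second[1] is None:
                    continue
                # L0 motion of one candidate combined with L1 motion of another.
                combined = (first[0], second[1])
                if combined not in result:
                    result.append(combined)
    zero = (ZERO, ZERO if is_b_frame else None)
    while len(result) < max_cand:
        result.append(zero)  # pad until Nb_Cand equals Max_Cand
    return result


start = [((0, (1, 2)), None), (None, (0, (3, 4)))]
final_list = complete_merge_list(start, 5, True)
```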

The current specification for SHVC (scalable extension of HEVC) does not use a motion information predictor obtained from the reference layer in AMVP mode, but this may be introduced as follows.

FIG. 12 shows the derivation of the AMVP motion information predictor set. This process is used to predictively code the motion information. Compared to the Merge mode, additional information must be transmitted: a prediction direction and, for each piece of motion information to transmit, a reference frame index, a predictor index and a motion information residual.

The predictor index depends on the number of candidates: HEVC needs to generate at most 2 motion information predictors. In that case, Max_Cand in this figure is set equal to 2, but one can imagine using 3 for an enhancement layer. The first spatial candidate is selected among the left blocks A0 12.1 and A1 12.2, for positions similar to those of the Merge mode.

The two spatial motion information predictors of the AMVP mode are chosen among the above blocks and the left blocks, including the above corner blocks and the left corner block, again as for the Merge mode. The left predictor Cand1 12.9 is selected 12.8 among the "Below Left" A0 and "Left" A1 blocks. In this specific order, the following conditions are evaluated until a motion information value is found: motion information from the same reference list and the same reference picture, or motion information from the other reference list and the same reference picture.

The above motion information predictor Cand2 12.11 is selected in step 12.10 among the "Above Right" B0 12.3, "Above" B1 12.4 and "Above Left" B2 12.5 blocks, in this specific order, with the same conditions as described below. Cand1 and Cand2 are then compared in order to remove one of these motion information predictors if they are equal 12.15. After this pruning, the number of candidates is compared to Max_Cand in step 12.16: if they are equal, the list of AMVP motion information candidates is fully determined and the process ends at step 12.23.

Otherwise, if the number of candidates is below Max_Cand, the temporal motion predictor Cand3 12.14 is derived as in the Merge mode and added, if it exists, in step 12.17. To do so, the bottom-right (H) position 12.6 is first considered in the availability check module 12.12. If it does not exist, the center 12.7 of the co-located block is selected.

Next, the number of added candidates is compared again to the maximum number of candidates in step 12.18. If this maximum number is reached, the final list of AMVP predictors is built in step 12.23. Otherwise, step 12.19 checks whether the list is being built for an enhancement layer. If not, classical list building resumes at step 12.22, where as many zero candidates as needed to reach the maximum are added to the list, thereby completing it, and ends at step 12.23. Otherwise, specific processing occurs, in which the SMVP candidate is obtained from the reference layer following the computation described below. When this is done, normal processing resumes at step 12.22.
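The overall AMVP list construction, including the optional inter-layer candidate, can be sketched as below. This is only an outline under stated assumptions: the name `build_amvp_list` is hypothetical, candidates are plain `(mvx, mvy)` tuples with `None` meaning "not found", the derivation of Cand1, Cand2, Cand3 and the SMVP is assumed to have been done elsewhere, and the SMVP is placed after the temporal candidate with a simple duplicate check.

```python
def build_amvp_list(cand1, cand2, temporal, smvp=None, max_cand=2,
                    enhancement_layer=False):
    """Assemble the AMVP predictor list from already derived candidates."""
    candidates = [c for c in (cand1, cand2) if c is not None]
    # Pruning: drop Cand2 when it equals Cand1.
    if len(candidates) == 2 and candidates[0] == candidates[1]:
        candidates.pop()
    # Temporal candidate, only when the list is not yet full.
    if len(candidates) < max_cand and temporal is not None:
        candidates.append(temporal)
    # Inter-layer SMVP candidate, only for an enhancement layer.
    if (len(candidates) < max_cand and enhancement_layer
            and smvp is not None and smvp not in candidates):
        candidates.append(smvp)
    # Zero candidates fill the remaining entries.
    while len(candidates) < max_cand:
        candidates.append((0, 0))
    return candidates


amvp = build_amvp_list((1, 1), (1, 1), (2, 2))
```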

We have chosen to illustrate the use of this candidate after Cand3. It is of course possible to contemplate using it before Cand3, between Cand1 and Cand2 or between Cand2 and Cand3. In all cases, an additional comparison may be performed in the pruning process to take the new SMVP candidate into account.

When considering the application of the Merge mode to the TextureRL approach, the Merge mode adds a new motion information predictor in the enhancement layer, the SMVP, obtained from its reference layer. Said motion information predictor currently comes from the motion information used in determining the temporal candidate in the reference layer, which is compressed. FIG. 6 schematically illustrates the principles of the TextureRL approach. The description here refers to the luma color component of the image, but the process applies to the chroma color components as well.

FIG. 6 shows an enhancement layer image 6.1 and its reference layer image 6.2, with a spatial ratio R (typically 1, 1.5 or 2) between 6.1 and 6.2. Whatever the value of R, we use the term "upsampling" for the resampling process applied to the reference layer to match the enhancement layer dimensions. If R equals 1, the resampling produces an output identical to the input. The inverse resampling is called "downsampling". The enhancement layer image is subdivided into a grid representing the granularity of the image. Each of the smaller squares is called a sample in the following text.

Now, for a given prediction unit 6.3, represented by a bold square, the process is as follows:

A. Compute the center location 6.4 (xPCtr, yPCtr) of the luma prediction block 6.3 under consideration, which is derived as follows:

xPCtr = xP + nPbW / 2

yPCtr = yP + nPbH / 2

xP, yP specify the top-left sample 6.6 of the current luma prediction block relative to the top-left luma sample 6.7 of the current picture.

nPbW and nPbH specify the width and height of the luma prediction block.

B. Downscale these coordinates according to the scaling factor R (1, 1.5 or 2.0) to find the co-located position 6.5 in the reference layer image 6.2:

xPCtrRL = (xPCtr * PicWRL + ScaledW / 2) / ScaledW

yPCtrRL = (yPCtr * PicHRL + ScaledH / 2) / ScaledH

The variables PicWRL and PicHRL are set equal to the width and height of the reference layer picture.

ScaledH takes the value R * PicHRL, and ScaledW the value R * PicWRL.

C. Retrieve the motion information at that location from the reference layer image 6.2 by identifying the luma prediction block bIPb 6.8, numbered 1, covering the modified location given by ((xPCtrRL >> 4) << 4, (yPCtrRL >> 4) << 4) inside the reference layer picture. This corresponds to the motion summarization step of the reference layer.

The luma location (xPRL, yPRL) is then set equal to the top-left sample 6.8 of the co-located luma prediction block specified by bIPb, relative to the top-left luma sample of the reference layer picture.

D. If the corresponding information is not intra, extract the motion vectors MV_RL and upscale them according to the ratio R. The operation basically generates the spatial motion vector whose value SMVP is: SMVP = rnd(R * MV_RL(rnd(xPRL / R), rnd(yPRL / R))), where rnd(.) denotes a rounding process. Together with the reference frame indices associated with this spatial motion vector, this constitutes the spatial motion vector predictor inserted at the start of the set.
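Steps A to D can be illustrated by the following minimal sketch. It assumes integer (floor) division for non-negative coordinates and a round-half-up rule for rnd(.), neither of which is specified above; the function names are chosen for illustration only.

```python
import math

def rnd(value):
    # Assumption: round to nearest, halves rounded up (not specified in the text).
    return math.floor(value + 0.5)

def block_center(xP, yP, nPbW, nPbH):
    # Step A: center of the luma prediction block.
    return xP + nPbW // 2, yP + nPbH // 2

def to_reference_layer(xPCtr, yPCtr, PicWRL, PicHRL, R):
    # Step B: downscale the center to the reference layer picture.
    ScaledW = int(R * PicWRL)
    ScaledH = int(R * PicHRL)
    xPCtrRL = (xPCtr * PicWRL + ScaledW // 2) // ScaledW
    yPCtrRL = (yPCtr * PicHRL + ScaledH // 2) // ScaledH
    return xPCtrRL, yPCtrRL

def compressed_block_origin(xPCtrRL, yPCtrRL):
    # Step C: top-left corner of the 16x16 block covering the position.
    return (xPCtrRL >> 4) << 4, (yPCtrRL >> 4) << 4

def upscale_motion_vector(mv_rl, R):
    # Step D: scale each motion vector component by the spatial ratio.
    return tuple(rnd(R * component) for component in mv_rl)

# Example: a 16x8 block at (32, 16), 960x540 reference layer, R = 2.
center = block_center(32, 16, 16, 8)                        # (40, 20)
position = to_reference_layer(*center, 960, 540, 2)         # (20, 10)
origin = compressed_block_origin(*position)                 # (16, 0)
smvp = upscale_motion_vector((3, -5), 2)                    # (6, -10)
```

The shift pair in step C always snaps to the top-left corner of the enclosing 16×16 block, whichever 4×4 block the co-located point actually falls in.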

The current SHVC design for TextureRL mandates that the reference layer motion vectors MV_RL be obtained from the motion compression buffer of the reference layer, as seen in step C above. This is needed because the motion information corresponding to the 4×4 block 6.8 is the only one kept for the whole 16×16 block by the motion information compression process.

Considering now the reference frame index approach, the motion information of the new frame, which is inserted in the reference list of the enhancement layer, also comes from said compressed motion information field. That motion information can then be used to determine the temporal predictor, as described above.

Let us detail how this motion is derived. For a given 16×16 block, the center of that block is selected, and this position is used, equivalently to what is described above, to find the corresponding motion information. We detail the corresponding steps for the luma component. Note that most parts are essentially identical to the process described in relation to FIG. 6, and the definitions remain the same for identical variables.

A. The center location (xPCtr, yPCtr) of the luma prediction block is derived as follows (the variable names are defined in the previous section):

xPCtr = xP + 8

yPCtr = yP + 8

B. Downscale these coordinates according to the scaling factor R (1, 1.5 or 2.0) to find the co-located position (xRef, yRef) in the reference layer image.

ScaledH takes the value R * PicHRL, and ScaledW the value R * PicWRL.

C. The co-located position (xRL, yRL) is derived as follows:

xRL = (xRef >> 4) << 4

yRL = (yRef >> 4) << 4.

D. The reference layer motion vector is derived as follows. The operation basically generates the motion vectors of the reference layer, with value RL_MV, as follows: RL_MV = rnd(R * MV(rnd(xPRL / R), rnd(yPRL / R))).
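For a single 16×16 block, steps A to C of the reference frame index approach can be sketched as below. The text does not give the formula for xRef and yRef; the sketch assumes they follow the same downscaling formula as step B of the TextureRL process, and the function name is hypothetical.

```python
def colocated_for_16x16_block(xP, yP, PicWRL, PicHRL, ScaledW, ScaledH):
    # Step A: center of the 16x16 block.
    xPCtr, yPCtr = xP + 8, yP + 8
    # Step B (assumption: same downscaling formula as in the TextureRL process).
    xRef = (xPCtr * PicWRL + ScaledW // 2) // ScaledW
    yRef = (yPCtr * PicHRL + ScaledH // 2) // ScaledH
    # Step C: snap to the 16x16 grid of the compressed motion field.
    xRL = (xRef >> 4) << 4
    yRL = (yRef >> 4) << 4
    return xRL, yRL

# Example: block at (32, 16), 960x540 reference layer upscaled to 1920x1080.
print(colocated_for_16x16_block(32, 16, 960, 540, 1920, 1080))  # (16, 0)
```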

This information is then used as if it were the output of the motion information compression. This allows the motion information from the reference layer to be used to predict motion information in the enhancement layer. In contrast to the TextureRL approach, the motion information has a coarser granularity, but it can be used as a temporal motion information predictor both in the merge list determination process and in the AMVP list determination process used for inter blocks.

Now that we have presented the overall architecture of a scalable codec, the two approaches, and how they use motion information for prediction, we can summarize the following.

In the TextureRL approach, a new motion information predictor in the enhancement layer is obtained from its reference layer. Said motion information predictor typically comes from the motion information used in determining the temporal candidate in the reference layer, which is compressed. The compression therefore affects its derivation, and thus merge mode. In AMVP mode, if a scalable candidate is present, the compression also affects it. The AMVP and merge modes in the reference frame index approach are always affected, since they also use the temporal motion vector predictor and, if the referenced frame is the inserted one, that predictor comes from the motion of the reference layer frame.

In the reference frame index approach, the motion information of the new frame, which is inserted in the reference list of the enhancement layer, also comes from said compressed motion information.

As explained in relation to FIG. 6, the motion information stored for the reference layer is compressed. This means that for a complete 16×16 block, where up to 16 pieces of motion information initially exist, one for each 4×4 block contained in the 16×16 block, only one is kept, typically the one relating to the top-left 4×4 block.
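The effect of this compression can be sketched as follows, assuming for illustration that the motion field is held as a two-dimensional list with one entry per 4×4 block; this data layout is an assumption, not something prescribed by the text.

```python
def compress_motion_field(motion_4x4):
    """Keep only the top-left 4x4 entry of every 16x16 block.

    motion_4x4[row][col] holds the motion information of one 4x4 block,
    so a 16x16 block spans 4 rows and 4 columns of this grid.
    """
    return [row[::4] for row in motion_4x4[::4]]

# Example: a 32x32 area holds 8x8 entries; 2x2 survive the compression.
field = [[(row, col) for col in range(8)] for row in range(8)]
print(compress_motion_field(field))  # [[(0, 0), (0, 4)], [(4, 0), (4, 4)]]
```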

In the process of deriving motion information predictors, when reference layer motion information is needed, the motion information used is, because of this compression, the one that is available, namely the motion information associated with the top-left 4×4 block. Referring again to FIG. 6, when looking for the motion information associated with the co-located point 6.5 corresponding to the center 6.4 of the coding unit to be encoded, the motion information associated with the top-left 4×4 block 6.8, numbered 1, is used. It may be noted that the motion information associated with the top-left 4×4 block numbered 3, which is the motion information kept after compression for the 16×16 block located below, is closer to the position of the co-located point 6.5 and is therefore likely to be more relevant than the motion information of the 4×4 block 6.8.

This non-optimal choice of motion information, due to the compression process applied to the motion information in the reference layer, can be measured to cause a loss of coding efficiency. In an embodiment of the invention, the motion information predictor derivation process is adapted to overcome this position problem.

FIG. 7 details the adapted process in the context of the TextureRL approach. It may be applied in both the AMVP and the merge derivation processes in the enhancement layer. This modified merge derivation process may be located in the motion estimation module 1.23 of the encoder in FIG. 1 and in the motion estimation module 10.55 of the decoder in FIG. 10. Essentially, all of this occurs when determining the SMVP candidate 9.0 in FIG. 9.

Step 7.1 initializes the process by computing the position for which to determine motion in the reference layer, for example by setting the current prediction unit information (dimensions/position) and deriving the center of said prediction unit. The main adaptation lies in step 7.3, which corrects the position. This is first done through the two following possibilities.

In a first embodiment, for a given coordinate X obtained for the position in the reference layer, for example either xPCtrRL or yPCtrRL described above, a new value is computed by performing a rounding operation according to two parameters r and M.

For example, the new value X' may be computed as follows:

X' = ⌊(X + r) / M⌋ * M

where ⌊x⌋ represents the truncation of x, meaning taking its integer part. M may be a power of 2; in this embodiment, M = 16 to match the granularity of HEVC compressed motion. In this embodiment, r = 4 is used rather than the more natural choice r = 8, as it provides better coding efficiency.

The same may be applied to the other coordinate. Other values of the parameters r and M may be chosen.
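The rounding operation can be illustrated by the short sketch below, which assumes non-negative integer coordinates; the function name is chosen for illustration.

```python
def correct_position(X, r=4, M=16):
    # X' = floor((X + r) / M) * M, for a non-negative integer coordinate X.
    return ((X + r) // M) * M

# With r = 4 and M = 16, only the last quarter of each 16-sample
# interval is moved to the start of the next interval.
for X in (20, 27, 28, 31):
    print(X, correct_position(X))  # 20 16, 27 16, 28 32, 31 32
```

With r = 0 the operation reduces to the plain alignment (X >> 4) << 4, and with r = 8 it rounds to the nearest multiple of 16.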

The position correction may also be based on a lookup table. In that case, given coordinates (X, Y), a table of corrections F[X, Y] may be defined for at least one of the coordinates. This table may be different for each coordinate. The table may also be indexed by only one of the coordinates, namely X or Y. The table may also be reduced by using as index a value relating to the coordinate instead of the coordinate itself; for example, the correction may be obtained as F[X mod M] instead of F[X], where M = 2^N is a typical value. In one example, M = 16.
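A table-based correction could look like the following sketch. The table contents are hypothetical, and the sketch assumes that the correction F[X mod M] is an offset added to the coordinate before the usual alignment to the 16×16 grid; the text does not fix either point.

```python
M = 16
# Hypothetical table: no offset for the first 12 residues, +4 for the last 4.
F = [0] * 12 + [4] * 4

def correct_with_table(X, table=F):
    # Assumption: the table entry is an offset added to the coordinate.
    return X + table[X % len(table)]

def aligned(X):
    # Usual alignment to the 16x16 grid of the compressed motion field.
    return (X >> 4) << 4

print(aligned(correct_with_table(27)), aligned(correct_with_table(28)))  # 16 32
```

With this particular table, the aligned result matches the rounding rule with r = 4 and M = 16; other tables would allow corrections that a single offset r cannot express.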

In all cases, the corrective value (either the value of r or a table for at least one component) may be transmitted in, and retrieved from, high-level syntax information, for example in the Video Parameter Set, Sequence Parameter Set, Picture Parameter Set, or slice header. When at least one value of r is transmitted:

- A bit flag may indicate whether the value of r is a first value or a second one, for example 0 and 4 (in which case it can be seen as an on/off flag for the correction);

- A code may indicate the explicit value of r, for example a truncated unary code representing the value of r minus 4, e.g. the binary sequences '0' for r = 4, '10' for r = 5, and '110', '1110' and '1111' for the other values.
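The truncated unary code mentioned in the last item can be sketched as follows. The mapping of '110', '1110' and '1111' to r = 6, 7 and 8 is an assumption, since the text only assigns '0' and '10' explicitly; the function names are illustrative.

```python
def encode_r(r, base=4, max_symbol=4):
    # Truncated unary code for r - base: n ones, then a terminating zero
    # that is omitted for the largest symbol.
    n = r - base
    if not 0 <= n <= max_symbol:
        raise ValueError("r is outside the codable range")
    return "1" * n + ("0" if n < max_symbol else "")

def decode_r(bits, base=4, max_symbol=4):
    # Count leading ones, stopping at the first zero or at max_symbol ones.
    n = 0
    while n < max_symbol and bits[n] == "1":
        n += 1
    return base + n

print([encode_r(r) for r in range(4, 9)])  # ['0', '10', '110', '1110', '1111']
```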

In the above, it is important to note that only one coordinate may be affected, in particular the abscissa, since modifying the ordinate could lead to retrieving motion information from a different memory area and thus cause additional memory accesses.

Following this requirement of reducing memory accesses, at least one corrected value may be changed to another value, possibly the original value, if said corrected value does not match a criterion, such as satisfying a threshold. Said threshold may be the image dimension along that coordinate, so that no lookup can take place outside the image. Alternatively, said threshold may be the limit of a memory area along that coordinate. The memory area typically corresponds to a predetermined set of largest coding units in the reference layer. This memory area is illustrated in more detail with FIG. 13.
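A minimal sketch of such a fallback is given below, assuming that the criterion is "the corrected value lies strictly below a limit" and that the replacement value is the original coordinate; both are only one of the options allowed by the text.

```python
def correct_with_limit(X, limit, r=4, M=16):
    # Apply the rounding correction, then fall back to the original value
    # when the corrected coordinate reaches or exceeds the limit
    # (image width or right edge of the memory area).
    corrected = ((X + r) // M) * M
    return corrected if corrected < limit else X

# In a 64-sample-wide picture, position 60 would be corrected to 64,
# which lies outside the picture, so the original value is kept.
print(correct_with_limit(28, 64), correct_with_limit(60, 64))  # 32 60
```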

The normal predictor determination then resumes in step 7.4. The motion information is retrieved from the compressed motion buffer using the position output by step 7.3. If it is intra (i.e. there is no motion), the candidate is marked as such in step 7.8, in particular without computing or adding any predictor to the merge candidate list, and the derivation process therefore ends in step 7.9. Otherwise, the corresponding motion is upscaled to match the enhancement layer dimensions.
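The flow of FIG. 7 from the corrected position onward can be sketched as follows. The compressed motion buffer is modelled here as a dictionary keyed by the (column, row) of each 16×16 block, holding either a (motion vector, reference index) pair or None for intra; this layout, the round-half-up rule and the function name are assumptions for illustration.

```python
import math

def derive_smvp(x_rl, y_rl, motion_buffer, R, r=4, M=16):
    """Return the upscaled (motion_vector, ref_idx), or None for intra."""
    # Step 7.3: correct the abscissa only, to avoid extra memory accesses.
    x_corrected = ((x_rl + r) // M) * M
    # Step 7.4: fetch the motion kept for the enclosing 16x16 block.
    entry = motion_buffer.get((x_corrected >> 4, y_rl >> 4))
    if entry is None:
        # Steps 7.8 and 7.9: intra, so no predictor is added.
        return None
    motion_vector, ref_idx = entry
    # Upscale the motion vector to the enhancement layer dimensions
    # (assumption: round to nearest, halves rounded up).
    upscaled = tuple(math.floor(R * c + 0.5) for c in motion_vector)
    return upscaled, ref_idx

buffer = {(1, 0): ((3, -5), 0), (2, 0): None}
print(derive_smvp(20, 10, buffer, 2))  # ((6, -10), 0)
print(derive_smvp(28, 10, buffer, 2))  # None
```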

FIG. 8 illustrates the adapted process in the context of the reference frame index approach. It may be applied in both the AMVP and the merge derivation processes in the enhancement layer. This adapted process is located either in the frame buffer 1.24 or in the motion estimation module 1.23 of the encoder in FIG. 1, and in the frame buffer 10.60 or in the motion estimation module 10.55 of the decoder in FIG. 10. Indeed, it affects the content of the frame memory as regards compressed motion information.

Step 8.1 therefore initializes the motion information predictor derivation process by setting the current 16×16 block to the first one in the enhancement layer image. In step 8.2, the position of the center of the 16×16 coding unit is determined, and the corresponding co-located position in the reference layer is found in step 8.3. Step 8.4 is new: there, the position found is corrected. Reference may be made to step 7.1 above for the details of this correction; the same applies here.

In step 8.5, it is checked whether the motion at that position is intra. If so, the motion of the 16×16 block is set to intra in step 8.7; otherwise, the motion vectors are obtained and upscaled to match the enhancement layer dimensions, and the upscaled motion vectors, reference indices and availabilities are set as the motion information predictors of the current 16×16 block in step 8.8.

Step 8.9 prepares the next iteration by checking whether the current block is the last one in the image. If so, the motion information for the new frame is completely determined and the process ends in step 8.11. Otherwise, the current block is set to the next 16×16 block in step 8.10, and the iteration loops back to step 8.2.
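The loop of FIG. 8 can be sketched as below. The reference layer motion is again modelled as a dictionary keyed by the (column, row) of each 16×16 block, with None for intra. The downscaling formula, the fallback to the uncorrected abscissa at the picture edge and the round-half-up rule are assumptions of this sketch, as are all names.

```python
import math

def build_inserted_frame_motion(el_w, el_h, rl_w, rl_h, rl_motion, R, r=4, M=16):
    """Motion field of the inserted frame, one entry per 16x16 block."""
    scaled_w, scaled_h = int(R * rl_w), int(R * rl_h)
    field = {}
    # Steps 8.1, 8.9 and 8.10: visit every 16x16 block of the enhancement layer.
    for yP in range(0, el_h, 16):
        for xP in range(0, el_w, 16):
            # Step 8.2: center of the current block.
            xPCtr, yPCtr = xP + 8, yP + 8
            # Step 8.3: co-located position in the reference layer.
            xRef = (xPCtr * rl_w + scaled_w // 2) // scaled_w
            yRef = (yPCtr * rl_h + scaled_h // 2) // scaled_h
            # Step 8.4: correct the abscissa, keeping it inside the picture.
            x_corrected = ((xRef + r) // M) * M
            if x_corrected >= rl_w:
                x_corrected = xRef
            entry = rl_motion.get((x_corrected >> 4, yRef >> 4))
            key = (xP >> 4, yP >> 4)
            if entry is None:
                # Steps 8.5 and 8.7: the block is marked as intra.
                field[key] = None
            else:
                # Step 8.8: upscale and store the motion information.
                mv, ref_idx = entry
                field[key] = (tuple(math.floor(R * c + 0.5) for c in mv), ref_idx)
    return field

rl_motion = {(0, 0): ((1, 1), 0), (1, 0): None,
             (0, 1): ((2, -2), 1), (1, 1): ((0, 4), 0)}
field = build_inserted_frame_motion(64, 64, 32, 32, rl_motion, 2)
print(field[(0, 0)], field[(1, 0)], field[(3, 3)])
```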

FIG. 13 illustrates the details of what we define as a memory area. Given a reference layer image 13.4 and its associated enhancement layer image 13.5, an area 13.6 can be defined on which to apply the motion determination process, for example a CTB. A memory access restriction may apply (for example, in the case of pipelined encoding or decoding, where enhancement layer CTBs are processed right after the co-located reference layer CTBs), and thus, in a first aspect of the invention, a co-located area 13.1 can be defined inside the reference layer frame 13.4. The memory area mentioned in steps 7.1 and 8.4 corresponds, in a first aspect, to the area 13.2 containing 13.1, here made of two reference layer CTBs: the corrected positions found for any part of 13.6 must remain within the area 13.2. In a less restrictive fashion, the memory area may be allowed to comprise an additional column of CTBs to the right of area 13.2, resulting in area 13.3. It should be understood that the constraint here is based on area 13.6, but any size of area in the enhancement layer, or an augmented memory area in the reference layer, may be used.

FIG. 11 is a schematic block diagram of a computing device 11.0 for implementing one or more embodiments of the invention. The computing device 11.0 may be a device such as a microcomputer, a workstation or a portable device. The computing device 11.0 comprises a communication bus connected to:

- a central processing unit 11.1, such as a microprocessor, denoted CPU;

- a random access memory 11.2, denoted RAM, for storing the executable code of the method of embodiments of the invention, as well as registers adapted to record the variables and parameters needed to implement the method for encoding or decoding at least part of an image according to embodiments of the invention; its memory capacity can be expanded by an optional RAM connected, for example, to an expansion port;

- постоянным запоминающим устройством 11.3, обозначенным как ПЗУ, для хранения компьютерных программ для реализации вариантов осуществления изобретения;- read only memory 11.3, designated as ROM, for storing computer programs for implementing embodiments of the invention;

- сетевым интерфейсом 11.4, обычно соединенным с сетью связи, по которой передаются и принимаются цифровые данные для обработки. Сетевой интерфейс 11.4 может являться единственным сетевым интерфейсом или составленным из множества различных сетевых интерфейсов (например, проводных и беспроводных интерфейсов или различных видов проводных или беспроводных интерфейсов). Пакеты данных записываются в сетевой интерфейс для передачи или считываются из сетевого интерфейса для приема под управлением прикладной программы, работающей в ЦП 11.1;- a network interface 11.4, usually connected to a communication network through which digital data is transmitted and received for processing. The network interface 11.4 may be a single network interface or composed of many different network interfaces (e.g., wired and wireless interfaces or various kinds of wired or wireless interfaces). Data packets are written to the network interface for transmission or read from the network interface for reception under the control of an application running on CPU 11.1;

- пользовательским интерфейсом 11.5, который может использоваться для приема ввода от пользователя или отображения информации пользователю;- user interface 11.5, which can be used to receive input from the user or display information to the user;

- жестким диском 11.6, обозначенным как ЖД, который может быть обеспечен как устройство массового хранения;- a hard disk 11.6, designated as a hard disk drive, which can be provided as a mass storage device;

- модулем 11.7 ввода/вывода, который может использоваться для приема данных от внешних устройств и отправки данных внешним устройства, таким как источник видеосигнала или дисплей.- an I / O module 11.7, which can be used to receive data from external devices and send data to external devices, such as a video source or display.

The executable code may be stored in the read-only memory 11.3, on the hard disk 11.6, or on a removable digital medium such as a disk. According to a variant, the executable code of the programs may be received over a communication network via the network interface 11.4 and stored, before execution, in one of the storage means of the communication device 11.0, such as the hard disk 11.6.

The central processing unit 11.1 is adapted to control and direct the execution of the instructions or portions of program code according to embodiments of the invention, these instructions being stored in one of the aforementioned storage means. After power-on, the CPU 11.1 is able to execute instructions of the application program from the RAM 11.2 once those instructions have been loaded, for example, from the program ROM 11.3 or the hard disk 11.6. Such an application program, when executed by the CPU 11.1, causes the steps of the flowcharts shown in Figs. 1 to 4 to be performed.

Any step of the algorithm shown in Fig. 7 may be implemented in software, by execution of a set of instructions or a program by a programmable computing machine such as a personal computer (PC), a digital signal processor (DSP) or a microcontroller; or else implemented in hardware, by a machine or a dedicated component such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).

Although the present invention has been described above with reference to specific embodiments, it is not limited to those embodiments, and modifications that lie within the scope of the present invention will be apparent to a person skilled in the art.

Many further modifications and variations will suggest themselves to those skilled in the art upon reading the foregoing illustrative embodiments, which are given by way of example only and are not intended to limit the scope of the invention, that scope being determined solely by the appended claims. In particular, different features from different embodiments may be interchanged where appropriate.

In the claims, the word "comprising" does not exclude other elements or steps, and the singular form does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.

Claims

1. A method of encoding an image in accordance with a scalable format, wherein said format uses a reference level picture and a resampled picture, an image area of said image is predictively encoded based on motion information, and said motion information is itself predictively encoded based on a motion information predictor from a set of motion information predictor candidates, said method comprising the steps of:

determining a position in the reference level picture using a position in a 16×16 block of the resampled picture; and

determining a set of motion information predictor candidates that includes a motion information predictor candidate based on motion information associated with an image area belonging to the reference level picture,

wherein determining the position comprises the steps of:

- identifying a central position in the 16×16 block of the resampled picture;

- scaling at least one coordinate value of the central position using a scaling factor to identify, in the reference level picture, a position corresponding to said central position; and

- deriving a value X' from at least one coordinate X of said corresponding position using X' = ((X + 4) >> 4) << 4, and

wherein determining the set of motion information predictor candidates comprises using, if available, motion information associated with said determined position in the reference level picture, said determined position being indicated by the value X', when obtaining the motion information predictor candidate to be included in said set of motion information predictor candidates.
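The position derivation recited in claim 1 can be illustrated with a short sketch. The rounding expression ((X + 4) >> 4) << 4 is taken from the claim; everything else is an assumption made for illustration: the function names are hypothetical, the central position is taken as the top-left corner plus 8, the scaling factor is modelled as a simple ratio of picture dimensions with rounding, and the same rounding is applied to both coordinates although the claim only requires at least one.

```python
from typing import Tuple


def round_to_16_grid(x: int) -> int:
    """Rounding from the claim: X' = ((X + 4) >> 4) << 4."""
    return ((x + 4) >> 4) << 4


def block_center(bx: int, by: int, size: int = 16) -> Tuple[int, int]:
    """Assumed central position of a block whose top-left corner is (bx, by)."""
    return bx + size // 2, by + size // 2


def scale_coord(v: int, ref_dim: int, enh_dim: int) -> int:
    """Assumed scaling: map a resampled-picture coordinate to the reference
    level by the ratio of picture dimensions, rounding to nearest."""
    return (v * ref_dim + enh_dim // 2) // enh_dim


def reference_position(bx: int, by: int,
                       ref_w: int, ref_h: int,
                       enh_w: int, enh_h: int) -> Tuple[int, int]:
    """Center of the 16x16 block -> scaled position -> value on the 16-sample grid."""
    cx, cy = block_center(bx, by)
    x = scale_coord(cx, ref_w, enh_w)
    y = scale_coord(cy, ref_h, enh_h)
    return round_to_16_grid(x), round_to_16_grid(y)


# With this rounding, X in 0..11 maps to 0, X in 12..27 maps to 16, and so on.
pos = reference_position(32, 16, 960, 540, 1920, 1080)  # (16, 16)
```

The result always lies on a multiple of 16, so a position obtained this way addresses motion information stored at 16×16 granularity in the reference level picture.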

2. An apparatus for encoding an image in accordance with a scalable format, wherein said format uses a reference level picture and a resampled picture, an image area of said image is predictively encoded based on motion information, and said motion information is itself predictively encoded based on a motion information predictor from a set of motion information predictor candidates, said apparatus comprising:

a position determining unit configured to determine a position in the reference level picture using a position in a 16×16 block of the resampled picture; and

a predictor determining unit configured to determine a set of motion information predictor candidates that includes a motion information predictor candidate based on motion information associated with an image area belonging to the reference level picture,

wherein the determination of the position by the position determining unit comprises:

- identifying a central position in the 16×16 block of the resampled picture;

- deriving a value X' from at least one coordinate X of said corresponding position using X' = ((X + 4) >> 4) << 4, and

wherein the determination of the set of motion information predictor candidates by the predictor determining unit comprises using, if available, motion information associated with said determined position in the reference level picture, said determined position being indicated by the value X', when obtaining the motion information predictor candidate to be included in said set of motion information predictor candidates.

3. A method of decoding an image in accordance with a scalable format, wherein said format uses a reference level picture and a resampled picture, an image area of said image is predicted based on motion information, and said motion information is itself predicted based on a motion information predictor from a set of motion information predictor candidates, said method comprising the steps of:

determining a set of motion information predictor candidates that includes a motion information predictor candidate based on motion information associated with an image area belonging to the reference level picture,

wherein determining the position comprises the steps of:

4. An apparatus for decoding an image in accordance with a scalable format, wherein said format uses a reference level picture and a resampled picture, an image area of said image is predicted based on motion information, and said motion information is itself predicted based on a motion information predictor from a set of motion information predictor candidates, said apparatus comprising:

a predictor determining unit configured to determine a set of motion information predictor candidates that includes a motion information predictor candidate based on motion information associated with an image area belonging to the reference level picture,

wherein the determination of the position by the position determining unit comprises:

5. A computer-readable storage medium storing processor-executable code for executing a method of encoding an image in accordance with a scalable format, wherein said format uses a reference level picture and a resampled picture, an image area of said image is predictively encoded based on motion information, and said motion information is itself predictively encoded based on a motion information predictor from a set of motion information predictor candidates, said method comprising the steps of:

wherein determining the position comprises the steps of:

6. A computer-readable storage medium storing processor-executable code for executing a method of decoding an image in accordance with a scalable format, wherein said format uses a reference level picture and a resampled picture, an image area of said image is predicted based on motion information, and said motion information is decoded with prediction based on a motion information predictor from a set of motion information predictor candidates, said method comprising the steps of:

determining a position in the reference level picture using a position of a 16×16 block of the resampled picture; and

wherein determining the position comprises the steps of: