RU2575690C2

RU2575690C2 - Motion vector prediction in video coding

Info

Publication number: RU2575690C2
Application number: RU2013151612/08A
Authority: RU
Inventors: Ying Chen; Peisong Chen; Marta Karczewicz
Original assignee: Qualcomm Incorporated
Priority date: 2011-04-20
Filing date: 2012-04-20
Publication date: 2016-02-20

Abstract

FIELD: physics.

SUBSTANCE: the claimed method comprises identifying a first block of video data at a first temporal location from a first view, the first block being associated with a first motion disparity vector. The method also comprises determining that a second motion vector, for a second block of video data in a second view different from the first view, is a motion disparity vector, and determining a motion vector predictor for the second motion vector, the motion vector predictor being based on the first motion disparity vector. The determined motion vector predictor is added to a list of motion vector predictor candidates for predicting the second motion vector. When the second motion vector is not a motion disparity vector, the ability to determine a motion vector predictor from the first motion disparity vector is disabled, so that said motion disparity vector is not added to the list of candidates. Prediction data for the second block are encoded using a motion vector predictor from the list of candidates.

EFFECT: prevents the use of an incompatible motion vector as the motion vector predictor.

76 cl, 11 dwg

Description

[0001] This disclosure claims priority to U.S. Provisional Patent Application No. 61/477,561, filed April 20, 2011, and U.S. Provisional Patent Application No. 61/512,765, filed July 28, 2011, the contents of both of which are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

[0002] This disclosure relates to video coding.

BACKGROUND

[0003] Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called "smartphones", video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), the High Efficiency Video Coding (HEVC) standard presently under development, and extensions of such standards. By implementing such video compression techniques, video devices can transmit, receive, encode, decode, and/or store digital video information more efficiently.

[0004] Video compression techniques perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove the redundancy inherent in video sequences. For block-based video coding, a video slice (i.e., a picture or a portion of a picture) may be partitioned into video blocks, which may also be referred to as "treeblocks", "coding units (CUs)", and/or "coding nodes". Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture, or temporal prediction with respect to reference samples in other reference pictures.

[0005] Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which may then be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.
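The quantize-then-scan step described above can be illustrated with a small sketch. The text does not name a quantizer or a scan order, so the uniform quantizer, the step size, and the anti-diagonal scan below are all assumptions chosen for illustration, as are the function names.

```python
def quantize(coeffs, step):
    """Uniform quantization (assumed): divide by the step and truncate toward zero."""
    return [[int(c / step) for c in row] for row in coeffs]


def diagonal_scan(block):
    """Flatten a square 2-D block into a 1-D list, one anti-diagonal at a time.

    The scan order is an illustrative choice; the text only says the
    two-dimensional array is scanned into a one-dimensional vector.
    """
    n = len(block)
    out = []
    for s in range(2 * n - 1):      # s = row + column index of the diagonal
        for r in range(n):
            c = s - r
            if 0 <= c < n:
                out.append(block[r][c])
    return out


# Hypothetical 4x4 block of residual transform coefficients.
coeffs = [
    [52, -9, 3, 0],
    [-7, 4, 0, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 0],
]
scanned = diagonal_scan(quantize(coeffs, 4))
```

With this ordering the nonzero values cluster at the front of the one-dimensional vector and the trailing entries are zeros, which is the property that makes the subsequent entropy coding effective.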

SUMMARY OF THE INVENTION

[0006] In general, this disclosure describes techniques for coding video data. This disclosure describes techniques for performing motion vector prediction, motion estimation, and motion compensation when inter-mode coding (i.e., coding a current block relative to blocks of other pictures) in multiview video coding (MVC). In general, MVC is a video coding standard for encapsulating multiple views of video data. Each view may correspond to a different perspective, or angle, at which the corresponding video data of a common scene was captured. The techniques of this disclosure generally include predicting motion prediction data in the context of multiview video coding. That is, for example, according to the techniques of this disclosure, a motion disparity vector from a block in the same view as, or a different view than, the block currently being coded may be used to predict the motion vector of the current block. In another example, according to the techniques of this disclosure, a temporal motion vector from a block in the same view as, or a different view than, the block currently being coded may be used to predict the motion vector of the current block.

[0007] In one example, aspects of this disclosure relate to a method of encoding video data, the method comprising: identifying a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector; determining a motion vector predictor for a second motion vector associated with a second block of video data, wherein the motion vector predictor is based on the first motion disparity vector; wherein, when the second motion vector comprises a motion disparity vector, determining the motion vector predictor comprises scaling the first motion disparity vector to generate a scaled motion vector predictor, wherein scaling the first motion disparity vector comprises applying, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector; and encoding prediction data for the second block using the scaled motion vector predictor.
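As a rough illustration of the scaling described in this paragraph, the sketch below multiplies each component of the first motion disparity vector by the ratio of the two view distances. How a view distance is measured is not stated in this passage, so the integer distances, the rounding to whole units, and the function name `scale_disparity_vector` are assumptions made for the example.

```python
from fractions import Fraction


def scale_disparity_vector(first_mv, second_view_distance, first_view_distance):
    """Scale a (horizontal, vertical) disparity vector by a ratio of view distances.

    factor = view distance of the second vector / view distance of the first vector.
    Rounding to integer units is an assumption, not taken from the text.
    """
    if first_view_distance == 0:
        raise ValueError("first view distance must be nonzero")
    factor = Fraction(second_view_distance, first_view_distance)
    return tuple(round(component * factor) for component in first_mv)


# Hypothetical case: the second vector spans two views, the first spans one.
scaled_predictor = scale_disparity_vector((8, 0), 2, 1)
```

When the two view distances are equal the factor is 1 and the first motion disparity vector is used as the predictor unchanged.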

[0008] In another example, aspects of this disclosure relate to an apparatus for encoding video data comprising one or more processors, the one or more processors being configured to: identify a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector; determine a motion vector predictor for a second motion vector associated with a second block of video data, wherein the motion vector predictor is based on the first motion disparity vector; wherein, when the second motion vector comprises a motion disparity vector, the one or more processors are configured to determine the motion vector predictor by scaling the first motion disparity vector to generate a scaled motion vector predictor, wherein scaling the first motion disparity vector comprises applying, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector; and encode prediction data for the second block based on the scaled motion vector predictor.

[0009] In another example, aspects of this disclosure relate to an apparatus for encoding video data comprising: means for identifying a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector; means for determining a motion vector predictor for a second motion vector associated with a second block of video data, wherein the motion vector predictor is based on the first motion disparity vector; wherein, when the second motion vector comprises a motion disparity vector, the means for determining the motion vector predictor is configured to determine the motion vector predictor by scaling the first motion disparity vector to generate a scaled motion vector predictor, wherein scaling the first motion disparity vector comprises applying, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector; and means for encoding prediction data for the second block based on the scaled motion vector predictor.

[0010] In another example, aspects of this disclosure relate to a computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to: identify a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector; determine a motion vector predictor for a second motion vector associated with a second block of video data, wherein the motion vector predictor is based on the first motion disparity vector; wherein, when the second motion vector comprises a motion disparity vector, the instructions cause the one or more processors to determine the motion vector predictor by scaling the first motion disparity vector to generate a scaled motion vector predictor, wherein scaling the first motion disparity vector comprises applying, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector; and encode prediction data for the second block based on the scaled motion vector predictor.

[0011] In another example, aspects of this disclosure relate to a method of encoding video data, the method comprising: identifying a first block of video data at a first temporal location from a first view, wherein the first block of video data is associated with a first temporal motion vector; determining, when a second motion vector associated with a second block of video data comprises a temporal motion vector and the second block is from a second view, a motion vector predictor for the second motion vector based on the first temporal motion vector; and encoding prediction data for the second block using the motion vector predictor.

[0012] In another example, aspects of this disclosure relate to an apparatus for encoding video data comprising one or more processors configured to: identify a first block of video data at a first temporal location from a first view, wherein the first block of video data is associated with a first temporal motion vector; determine, when a second motion vector associated with a second block of video data comprises a temporal motion vector and the second block is from a second view, a motion vector predictor for the second motion vector based on the first temporal motion vector; and encode prediction data for the second block using the motion vector predictor.

[0013] In another example, aspects of this disclosure relate to an apparatus for encoding video data comprising: means for identifying a first block of video data at a first temporal location from a first view, wherein the first block of video data is associated with a first temporal motion vector; means for determining, when a second motion vector associated with a second block of video data comprises a temporal motion vector and the second block is from a second view, a motion vector predictor for the second motion vector based on the first temporal motion vector; and means for encoding prediction data for the second block using the motion vector predictor.

[0014] In one example, aspects of this disclosure relate to a computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to: identify a first block of video data at a first temporal location from a first view, wherein the first block of video data is associated with a first temporal motion vector; determine, when a second motion vector associated with a second block of video data comprises a temporal motion vector and the second block is from a second view, a motion vector predictor for the second motion vector based on the first temporal motion vector; and encode prediction data for the second block using the motion vector predictor.

[0015] The details of one or more embodiments of this disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the techniques described in this disclosure will be apparent from the description and drawings, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0016] FIG. 1 is a block diagram illustrating an example video encoding and decoding system that may use the techniques described in this disclosure.

[0017] FIG. 2 is a block diagram illustrating an example video encoder that may implement the techniques described in this disclosure.

[0018] FIG. 3 is a block diagram illustrating an example video decoder that may implement the techniques described in this disclosure.

[0019] FIG. 4 is a conceptual diagram illustrating an example multiview video coding (MVC) prediction pattern.

[0020] FIG. 5 is a block diagram illustrating example locations for motion vector predictor candidates.

[0021] FIG. 6 is a conceptual diagram illustrating the generation and scaling of a motion vector predictor, according to aspects of this disclosure.

[0022] FIG. 7 is another conceptual diagram illustrating the generation and scaling of a motion vector predictor, according to aspects of this disclosure.

[0023] FIG. 8 is another conceptual diagram illustrating the generation and scaling of a motion vector predictor, according to aspects of this disclosure.

[0024] FIG. 9 is a flowchart illustrating an example method of encoding prediction information for a block of video data.

[0025] FIG. 10 is a conceptual diagram illustrating the generation of a motion vector predictor from a block in a different view than the current block.

[0026] FIG. 11 is a flowchart illustrating an example method of generating a motion vector predictor from a block in a different view than the current block.

DETAILED DESCRIPTION OF THE INVENTION

[0027] According to certain video coding systems, motion estimation and motion compensation may be used to reduce temporal redundancy in a video sequence so as to achieve data compression. In this case, a motion vector can be generated that identifies a predictive block of video data, for example, a block from another video picture or slice, which can be used to predict the values of the current video block being coded. The values of the predictive video block are subtracted from the values of the current video block to produce a block of residual data. Motion information (e.g., a motion vector, motion vector indices, prediction directions, or other information) is communicated from the video encoder to the video decoder, along with the residual data. The decoder can locate the same predictive block (based on the motion vector) and reconstruct the encoded video block by combining the residual data with the data of the predictive block.
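The subtract-then-recombine relationship in this paragraph can be shown with a minimal sketch. It ignores transform and quantization, so reconstruction here is exact; the 2x2 sample values and the function names are hypothetical.

```python
def residual_block(current, predictive):
    """Encoder side: residual = current block minus predictive block, per sample."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predictive)]


def reconstruct_block(residual, predictive):
    """Decoder side: reconstructed block = residual plus predictive block, per sample."""
    return [[r + p for r, p in zip(res_row, pred_row)]
            for res_row, pred_row in zip(residual, predictive)]


# Hypothetical 2x2 sample values.
current = [[120, 122], [119, 125]]
predictive = [[118, 121], [119, 128]]   # block located via the motion vector

residual = residual_block(current, predictive)
reconstructed = reconstruct_block(residual, predictive)
```

Because the decoder locates the same predictive block as the encoder, only the motion information and the (typically small) residual need to be transmitted.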

[0028] In some cases, predictive coding of motion vectors is also applied to further reduce the amount of data needed to communicate the motion vector. When a motion vector is established, it runs from a target image to a reference image. A motion vector can be spatially or temporally predicted. A spatially predicted motion vector is associated with available spatial blocks (a block of the same time instance). A temporally predicted motion vector is associated with available temporal blocks (a block of a different time instance). In the case of motion vector prediction, rather than encoding and communicating the motion vector itself, the encoder encodes and communicates a motion vector difference (MVD) relative to a known (or knowable) motion vector. In H.264/AVC, the known motion vector, which may be used with the MVD to define the current motion vector, can be defined by a so-called motion vector predictor (MVP). To be a valid MVP, the motion vector must point to the same image as the motion vector currently being coded by the MVP and the MVD.
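The relationship between a motion vector, its predictor, and the transmitted difference can be illustrated with a short sketch. The function names and the vector values are hypothetical; vectors are modeled as simple `(x, y)` integer tuples.

```python
def encode_mvd(mv, mvp):
    """Encoder side: the MVD is the motion vector minus its predictor."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(mvd, mvp):
    """Decoder side: the motion vector is the predictor plus the MVD."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


mv = (14, -3)    # hypothetical motion vector of the current block
mvp = (12, -2)   # hypothetical predictor known to encoder and decoder

mvd = encode_mvd(mv, mvp)       # small difference, cheaper to transmit
recovered = decode_mv(mvd, mvp)
```

The closer the predictor is to the actual motion vector, the smaller the MVD components that have to be coded.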

[0029] A video encoder may build a motion vector predictor candidate list that includes several neighboring blocks in the spatial and temporal directions as candidates for the MVP. In this case, the video encoder may select the most accurate predictor from the candidate set based on an analysis of coding rate and distortion (for example, using a rate-distortion cost analysis or another coding efficiency analysis). A motion vector predictor index (mvp_idx) can be transmitted to the video decoder to inform the decoder where to locate the MVP. The MVD is also communicated. The decoder can combine the MVD with the MVP (identified by the motion vector predictor index) to reconstruct the motion vector.
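A minimal sketch of candidate selection and signalling follows. It assumes, purely for illustration, that the encoder picks the candidate with the smallest sum of absolute MVD components; an actual encoder would use a rate-distortion cost as described above. The names `select_mvp` and `reconstruct_mv` and the candidate values are hypothetical.

```python
def select_mvp(mv, candidates):
    """Pick the candidate that minimizes a simple MVD cost proxy.

    Assumption: |dx| + |dy| stands in for the real rate-distortion cost.
    Returns the index to signal (mvp_idx) and the MVD to transmit.
    """
    def mvd_cost(mvp):
        return abs(mv[0] - mvp[0]) + abs(mv[1] - mvp[1])

    mvp_idx = min(range(len(candidates)), key=lambda i: mvd_cost(candidates[i]))
    mvp = candidates[mvp_idx]
    return mvp_idx, (mv[0] - mvp[0], mv[1] - mvp[1])


def reconstruct_mv(mvp_idx, mvd, candidates):
    """Decoder side: look up the MVP by index and add the MVD."""
    mvp = candidates[mvp_idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Hypothetical candidate list built identically by encoder and decoder.
candidates = [(0, 0), (12, -2), (15, 4)]
mv = (14, -3)

mvp_idx, mvd = select_mvp(mv, candidates)
recovered = reconstruct_mv(mvp_idx, mvd, candidates)
```

The scheme only works if the decoder builds the same candidate list in the same order, since `mvp_idx` is just a position in that list.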

[0030] A so-called "merge mode" may also be available, in which the motion information (such as motion vectors, reference image indices, prediction directions, or other information) of a neighboring video block is inherited for the current video block being coded. An index value may be used to identify the neighbor from which the current video block inherits its motion information.
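Merge mode can be pictured as copying a neighbor's complete motion information, selected by an index. The sketch below is an assumption-laden illustration: the `MotionInfo` fields, the `merge` helper, and the neighbor values are hypothetical and do not reflect any particular bitstream syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionInfo:
    mv: tuple        # motion vector as (x, y)
    ref_idx: int     # reference image index
    direction: str   # prediction direction, e.g. "L0" or "L1"


def merge(neighbor_candidates, merge_idx):
    """Inherit the motion information of the neighbor chosen by merge_idx."""
    return neighbor_candidates[merge_idx]


# Hypothetical motion information of two neighboring blocks.
neighbors = [
    MotionInfo(mv=(3, 0), ref_idx=0, direction="L0"),
    MotionInfo(mv=(12, -2), ref_idx=1, direction="L0"),
]

current_block_info = merge(neighbors, merge_idx=1)
```

In contrast to MVP-based coding, no MVD is involved here: only the index identifying the neighbor is needed.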

[0031] Multiview video coding (MVC) is a video coding standard for encapsulating multiple views of video data. In general, each view corresponds to a different perspective, or angle, at which the corresponding video data of a common scene was captured. MVC provides a set of metadata, i.e., descriptive data for the views collectively and individually.

[0032] The coded views can be used for three-dimensional (3D) display of video data. For example, two views (e.g., left-eye and right-eye views of a human viewer) may be displayed simultaneously or nearly simultaneously using different polarizations of light, and the viewer may wear passive polarized glasses so that each of the viewer's eyes receives the respective one of the views. Alternatively, the viewer may wear active glasses that shutter each eye independently, and the display may rapidly alternate between the images for each eye in synchronization with the glasses.

[0033] In MVC, a particular image of a particular view is referred to as a "view component". That is, a view component of a view corresponds to a particular time instance of the view. Typically, the same or corresponding objects of two views are not co-located. The term "disparity vector" may be used to refer to a vector that indicates the displacement of an object in an image of one view relative to the corresponding object in a different view. Such a vector may also be referred to as a "displacement vector". A disparity vector may also be applicable to a pixel or a block of video data of an image. For example, a pixel in an image of a first view may be displaced with respect to the corresponding pixel in an image of a second view by a particular disparity related to the differing camera locations from which the first view and the second view are captured. In some examples, disparity can be used to predict a motion vector from one view to another view.

[0034] In the context of MVC, images of one view may be predicted from images of another view. For example, a block of video data may be predicted relative to a block of video data in a reference image of the same time instance but of a different view. In an example, the block that is currently being coded may be referred to as the "current block". A motion vector that predicts the current block from a block in a different view but at the same time instance is called a "motion disparity vector". A motion disparity vector is typically applicable in the context of multiview video coding, where more than one view may be available. According to this disclosure, the "view distance" of a motion disparity vector may refer to the translation difference between the view of the reference image and the view of the target image. That is, the view distance may be represented as the difference between the view identifier of the reference image and the view identifier of the target image.

[0035] Another type of motion vector is a "temporal motion vector". In the context of multiview video coding, a temporal motion vector refers to a motion vector that predicts the current block from a block at a different time instance but within the same view. According to this disclosure, the "temporal distance" of a temporal motion vector may refer to the picture order count (POC) distance from the reference image to the target image.
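The two motion vector types and their distances can be expressed with a small data structure. This is a sketch under stated assumptions: the `Picture` and `MotionVector` classes are hypothetical, and the sign convention (reference minus target) is chosen only for illustration, since the text defines the distances as differences without fixing a sign.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    view_id: int   # view identifier
    poc: int       # picture order count (time instance)


@dataclass(frozen=True)
class MotionVector:
    dx: int
    dy: int
    target: Picture      # image containing the block being predicted
    reference: Picture   # image the vector points to

    def kind(self):
        """Classify the vector as described in the two paragraphs above."""
        same_view = self.target.view_id == self.reference.view_id
        same_time = self.target.poc == self.reference.poc
        if not same_view and same_time:
            return "disparity"   # different view, same time instance
        if same_view and not same_time:
            return "temporal"    # same view, different time instance
        return "other"

    def view_distance(self):
        # Assumed sign convention: reference view id minus target view id.
        return self.reference.view_id - self.target.view_id

    def temporal_distance(self):
        # Assumed sign convention: reference POC minus target POC.
        return self.reference.poc - self.target.poc


# Hypothetical vectors: one inter-view, one temporal.
dmv = MotionVector(5, 0, target=Picture(view_id=1, poc=4),
                   reference=Picture(view_id=0, poc=4))
tmv = MotionVector(2, 1, target=Picture(view_id=1, poc=4),
                   reference=Picture(view_id=1, poc=2))
```

Keeping the type explicit matters because a motion disparity vector and a temporal motion vector measure different things and are not interchangeable as predictors.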

[0036] Certain techniques of this disclosure are directed to using motion information (e.g., a motion vector, motion vector indices, prediction directions, or other information) associated with a block of video data in a multiview setting to predict the motion information of the block currently being coded. For example, according to aspects of this disclosure, a motion vector predicted from a different view may be added as a candidate to one or more motion vector lists used for motion vector prediction of the current block. In some examples, the video encoder may use a motion disparity vector associated with a block in a different view than the block currently being coded to predict a motion vector for the current block, and may add the predicted motion disparity vector to a motion vector candidate list. In other examples, the video encoder may use a temporal motion vector associated with a block in a different view than the block currently being coded to predict a motion vector for the current block, and may add the predicted temporal motion vector to a motion vector candidate list.

[0037] According to aspects of this disclosure, a motion disparity vector may be scaled before being used as a motion vector predictor for the block currently being coded. For example, if the motion disparity vector identifies a reference image that has the same view identifier as the current motion vector being predicted, and the motion disparity vector has a target image with the same view identifier as the current motion vector being predicted, the motion disparity vector need not be scaled before being used to predict the motion vector for the current block. In other cases, the motion disparity vector may be scaled before being used to predict the motion vector for the current block.

[0038] In another example, a motion disparity vector may be predicted from a motion disparity vector associated with a spatially neighboring block. In this example, if the view identifier of the reference image of the motion disparity vector is the same as the view identifier of the reference image of the motion vector to be predicted (e.g., the motion vector associated with the block currently being predicted), no scaling may be required. Otherwise, the motion disparity vector may be scaled based on the camera location of the camera used to capture the video data. That is, for example, the motion disparity vector being used for prediction may be scaled according to the difference between the view identifier of the reference image of the motion disparity vector and the view identifier of the target image of the motion vector. In some examples, the scaling of motion disparity vectors may be based on the translations of the views.
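One way to picture view-based scaling is as a ratio of view distances. The sketch below is an assumption, not a formula stated in this section: the text says only that scaling is based on differences in view identifiers, so the linear ratio, the function name `scale_disparity_mv`, and the example values are hypothetical.

```python
def scale_disparity_mv(mv, cand_view_distance, cur_view_distance):
    """Scale a candidate motion disparity vector to the current view distance.

    mv                 -- candidate vector as (x, y)
    cand_view_distance -- view id difference spanned by the candidate
    cur_view_distance  -- view id difference spanned by the vector to predict

    Assumption: disparity grows linearly with view distance.
    """
    if cand_view_distance == cur_view_distance:
        return mv  # same view distance, no scaling required
    if cand_view_distance == 0:
        raise ValueError("candidate is not an inter-view vector")
    factor = cur_view_distance / cand_view_distance
    return (mv[0] * factor, mv[1] * factor)


unscaled = scale_disparity_mv((8, 0), cand_view_distance=2, cur_view_distance=2)
scaled = scale_disparity_mv((8, 0), cand_view_distance=2, cur_view_distance=1)
```

A real codec would round the result to its motion vector precision; the sketch leaves the scaled components as floating-point values.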

[0039] In another example, a motion disparity vector may be predicted from a motion disparity vector associated with a temporally neighboring block. In this example, if the view identifier of the reference image of the motion disparity vector is the same as the view identifier of the reference image of the motion vector to be predicted, and the view identifier of the target image of the motion disparity vector is the same as the view identifier of the reference image of the motion vector to be predicted, no scaling may be required. Otherwise, the motion disparity vector may be scaled based on the difference in view identifier, as described with respect to the previous example.

[0040] With respect to temporal motion vector prediction, according to aspects of this disclosure, a temporal motion vector that has a target image in a first view may be used to predict a temporal motion vector that has a target image in a second, different view. In some examples, the block in the target image of the temporal motion vector being used for prediction may be co-located with the block currently being predicted in the other view. In other examples, the block in the target image of the temporal motion vector being used for prediction may be offset from the current block due to the disparity between the two views.

[0041] In some examples, when the motion vector being predicted from a different view is a temporal motion vector, the motion vector may be scaled based on a difference in picture order count (POC) distances. For example, according to aspects of this disclosure, if the reference image of the temporal motion vector being used for prediction has the same POC value as the reference image of the current motion vector being predicted, and the target image of the temporal motion vector being used for prediction has the same POC value as the reference image of the current motion vector being predicted, the motion vector being used for prediction need not be scaled. Otherwise, the motion vector being used for prediction may be scaled based on the difference in POC value between the reference image of the motion vector being used for prediction and the reference image of the motion vector currently being predicted.
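POC-based scaling can be sketched in the same spirit. The following is an illustrative assumption rather than the method of this disclosure: it treats the no-scaling case as matching reference and target POC values, and otherwise scales by the ratio of POC distances, as is conventional for temporal motion vector scaling. The function name and values are hypothetical.

```python
def scale_temporal_mv(mv, cand_target_poc, cand_ref_poc,
                      cur_target_poc, cur_ref_poc):
    """Scale a candidate temporal motion vector by the ratio of POC distances.

    Assumption: motion is linear in time, so the vector scales with the
    POC distance between target and reference image.
    """
    if cand_ref_poc == cur_ref_poc and cand_target_poc == cur_target_poc:
        return mv  # identical POC values, no scaling needed
    cand_distance = cand_ref_poc - cand_target_poc
    cur_distance = cur_ref_poc - cur_target_poc
    if cand_distance == 0:
        raise ValueError("candidate is not a temporal vector")
    factor = cur_distance / cand_distance
    return (mv[0] * factor, mv[1] * factor)


# Candidate spans POC 4 -> 0; the current vector spans POC 4 -> 2.
same = scale_temporal_mv((6, -2), 4, 0, 4, 0)
halved = scale_temporal_mv((6, -2), 4, 0, 4, 2)
```

Here the current vector covers half the POC distance of the candidate, so the predictor is halved.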

[0042] According to some aspects of this disclosure, temporal motion vectors and/or motion disparity vectors from different views may be used as MVP candidates. For example, the temporal motion vectors and/or motion disparity vectors may be used to calculate an MVD for the current block. According to other aspects of this disclosure, temporal motion vectors and/or motion disparity vectors from different views may be used as merge candidates. For example, the temporal motion vectors and/or motion disparity vectors may be inherited for the current block. In such examples, an index value may be used to identify the neighbor from which the current video block inherits its motion information. In either case, a motion disparity vector and/or temporal motion vector from a different view that is used as an MVP or merge candidate may be scaled before being used as the MVP or merge candidate.

[0043] FIG. 1 is a block diagram illustrating an example video encoding and decoding system 10 that may use techniques for motion vector prediction in multiview coding. As shown in FIG. 1, system 10 includes a source device 12 that provides encoded video data to be decoded at a later time by a destination device 14. In particular, source device 12 provides the video data to destination device 14 via a computer-readable medium 16. Source device 12 and destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called smartphones, so-called smart pads, televisions, cameras, display devices, digital media players, video game consoles, video streaming devices, and the like. In some cases, source device 12 and destination device 14 may be equipped for wireless communication.

[0044] Destination device 14 may receive the encoded video data to be decoded via computer-readable medium 16. Computer-readable medium 16 may comprise any type of medium or device capable of moving the encoded video data from source device 12 to destination device 14. In one example, computer-readable medium 16 may comprise a communication medium that enables source device 12 to transmit encoded video data directly to destination device 14 in real time.

[0045] The encoded video data may be modulated according to a communication standard, such as a wireless communication protocol, and transmitted to destination device 14. The communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 12 to destination device 14.

[0046] In some examples, encoded data may be output from output interface 22 to a storage device. Similarly, encoded data may be accessed from the storage device by an input interface. The storage device may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data. In a further example, the storage device may correspond to a file server or another intermediate storage device that may store the encoded video generated by source device 12.

[0047] Destination device 14 may access stored video data from the storage device via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting that encoded video data to destination device 14. Example file servers include a web server (e.g., for a website), an FTP server, network-attached storage (NAS) devices, or a local disk drive. Destination device 14 may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.

[0048] The techniques of this disclosure are not necessarily limited to wireless applications or settings. The techniques may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications. In some examples, system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.

[0049] In the example of FIG. 1, source device 12 includes a video source 18, a video encoder 20, and an output interface 22. Destination device 14 includes an input interface 28, a video decoder 30, and a display device 32. In accordance with this disclosure, video encoder 20 of source device 12 may be configured to apply the techniques for motion vector prediction in multiview coding. In other examples, a source device and a destination device may include other components or arrangements. For example, source device 12 may receive video data from an external video source 18, such as an external camera. Likewise, destination device 14 may interface with an external display device rather than including an integrated display device.

[0050] The illustrated system 10 of FIG. 1 is merely one example. The methods for motion vector prediction in multiview coding may be performed by any digital video encoding and/or decoding device. Although the methods of this disclosure are generally performed by a video encoding device, they may also be performed by a video encoder/decoder, typically referred to as a "codec". Moreover, the methods of this disclosure may also be performed by a video preprocessor. Source device 12 and destination device 14 are merely examples of such coding devices, in which source device 12 generates coded video data for transmission to destination device 14. In some examples, devices 12, 14 may operate in a substantially symmetrical manner such that each of devices 12, 14 includes video encoding and decoding components. Hence, system 10 may support one-way or two-way video transmission between video devices 12, 14, for example, for video streaming, video playback, video broadcasting, or video telephony.

[0051] Video source 18 of source device 12 may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or a video feed interface to receive video from a video content provider. As a further alternative, video source 18 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. In some cases, if video source 18 is a video camera, source device 12 and destination device 14 may form so-called camera phones or video phones. As mentioned above, however, the methods described in this disclosure may be applicable to video coding in general, and may be applied to wireless and/or wired applications. In each case, the captured, pre-captured, or computer-generated video may be encoded by video encoder 20. The encoded video information may then be output by output interface 22 onto computer-readable medium 16.

[0052] Computer-readable medium 16 may include transient media, such as a wireless broadcast or wired network transmission, or storage media (that is, non-transitory storage media), such as a hard disk, flash drive, compact disc, digital video disc, Blu-ray disc, or other computer-readable media. In some examples, a network server (not shown) may receive encoded video data from source device 12 and provide the encoded video data to destination device 14, for example, via network transmission. Similarly, a computing device of a medium production facility, such as a disc stamping facility, may receive encoded video data from source device 12 and produce a disc containing the encoded video data. Therefore, computer-readable medium 16 may be understood to include one or more computer-readable media of various forms, in various examples.

[0053] Input interface 28 of destination device 14 receives information from computer-readable medium 16. The information of computer-readable medium 16 may include syntax information defined by video encoder 20, which is also used by video decoder 30, and which includes syntax elements that describe characteristics and/or processing of blocks and other coded units, for example, GOPs. Display device 32 displays the decoded video data to a user, and may comprise any of a variety of display devices such as a cathode ray tube (CRT) display, a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or another type of display device.

[0054] Video encoder 20 and video decoder 30 may operate according to a video coding standard, such as the High Efficiency Video Coding (HEVC) standard presently under development, and may conform to the HEVC Test Model (HM). Alternatively, video encoder 20 and video decoder 30 may operate according to other proprietary or industry standards, such as the ITU-T H.264 standard, alternatively referred to as MPEG-4, Part 10, Advanced Video Coding (AVC), or extensions of such standards. The methods of this disclosure, however, are not limited to any particular coding standard. Other examples of video coding standards include MPEG-2 and ITU-T H.263. Although not shown in FIG. 1, in some aspects, video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units, or other hardware and software, to handle encoding of both audio and video in a common data stream or in separate data streams. If applicable, the multiplexer-demultiplexer units may conform to the ITU H.223 multiplexer protocol, or to other protocols such as the User Datagram Protocol (UDP).

[0055] The ITU-T H.264/MPEG-4 (AVC) standard was formulated by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG) as the product of a collective partnership known as the Joint Video Team (JVT). In some aspects, the methods described in this disclosure may be applied to devices that generally conform to the H.264 standard. The H.264 standard is described in ITU-T Recommendation H.264, "Advanced Video Coding for Generic Audiovisual Services", by the ITU-T Study Group, dated March 2005, which may be referred to herein as the H.264 standard or H.264 specification, or the H.264/AVC standard or specification. The Joint Video Team (JVT) continues to work on extensions to H.264/MPEG-4 AVC.

[0056] The JCT-VC is working on development of the HEVC standard. The HEVC standardization efforts are based on an evolving model of a video coding device referred to as the HEVC Test Model (HM). The HM presumes several additional capabilities of video coding devices relative to existing devices according to, for example, ITU-T H.264/AVC. For example, whereas H.264 provides nine intra-prediction encoding modes, the HM may provide as many as thirty-three intra-prediction encoding modes.

[0057] In general, the working model of the HM describes that a video picture (or "frame") may be divided into a sequence of treeblocks or largest coding units (LCUs) that include both luma and chroma samples. Syntax data within a bitstream may define a size for the LCU, which is the largest coding unit in terms of the number of pixels. A slice includes a number of consecutive treeblocks in coding order. A picture may be partitioned into one or more slices. Each treeblock may be split into coding units (CUs) according to a quadtree. In general, a quadtree data structure includes one node per CU, with a root node corresponding to the treeblock. If a CU is split into four sub-CUs, the node corresponding to the CU includes four leaf nodes, each of which corresponds to one of the sub-CUs.

[0058] Each node of the quadtree data structure may provide syntax data for the corresponding CU. For example, a node in the quadtree may include a split flag, indicating whether the CU corresponding to the node is split into sub-CUs. Syntax elements for a CU may be defined recursively, and may depend on whether the CU is split into sub-CUs. If a CU is not split further, it is referred to as a "leaf CU". In this disclosure, four sub-CUs of a leaf CU are also referred to as leaf CUs even if there is no explicit splitting of the original leaf CU. For example, if a CU of size 16x16 is not split further, its four 8x8 sub-CUs are also referred to as leaf CUs although the 16x16 CU was never split.
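The recursive use of split flags described above can be illustrated with a minimal Python sketch. It is not the HM parsing process: the function name `parse_cu_quadtree`, the depth-first flag order, and the rule that no flag is read for a CU already at the minimum size are assumptions made for this illustration only.

```python
from typing import Iterator, List, Tuple

Leaf = Tuple[int, int, int]  # (x, y, size) of a leaf CU, in pixels


def parse_cu_quadtree(flags: Iterator[int], x: int, y: int,
                      size: int, min_size: int) -> List[Leaf]:
    """Consume split flags depth-first and return the resulting leaf CUs.

    Assumption of this sketch: a CU at `min_size` cannot split, so no flag
    is read for it.
    """
    if size > min_size and next(flags) == 1:
        half = size // 2
        leaves: List[Leaf] = []
        # Visit the four sub-CUs in raster order (top-left, top-right,
        # bottom-left, bottom-right).
        for dy in (0, half):
            for dx in (0, half):
                leaves += parse_cu_quadtree(flags, x + dx, y + dy,
                                            half, min_size)
        return leaves
    return [(x, y, size)]


# A 64x64 treeblock: the root splits, and only its second 32x32 sub-CU
# splits again into four 16x16 leaf CUs.
leaves = parse_cu_quadtree(iter([1, 0, 1, 0, 0, 0, 0, 0, 0]), 0, 0, 64, 8)
```

Because each node either contributes one leaf or recurses into exactly four children, the leaf CUs always tile the treeblock without gaps or overlap.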

[0059] A CU has a purpose similar to that of a macroblock of the H.264 standard, except that a CU does not have a size distinction. For example, a treeblock may be split into four child nodes (also referred to as sub-CUs), and each child node may in turn be a parent node and be split into another four child nodes. A final, unsplit child node, referred to as a leaf node of the quadtree, comprises a coding node, also referred to as a leaf CU. Syntax data associated with a coded bitstream may define a maximum number of times a treeblock may be split, referred to as a maximum CU depth, and may also define a minimum size of the coding nodes. Accordingly, a bitstream may also define a smallest coding unit (SCU). This disclosure uses the term "block" to refer to any of a CU, PU, or TU in the context of HEVC, or to similar data structures in the context of other standards (for example, macroblocks and sub-blocks thereof in H.264/AVC).
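As a small worked illustration of how a maximum CU depth bounds the coding node size, the Python sketch below halves the treeblock side once per split. The helper name `allowed_cu_sizes` and the assumption that the SCU size equals the treeblock size halved `max_cu_depth` times are illustrative only; an actual bitstream may signal these quantities differently.

```python
from typing import List


def allowed_cu_sizes(treeblock_size: int, max_cu_depth: int) -> List[int]:
    """List CU side lengths from the treeblock down to the smallest CU.

    Each quadtree split halves the side length, so depth d corresponds to
    a side of treeblock_size >> d.
    """
    if treeblock_size <= 0 or max_cu_depth < 0:
        raise ValueError("size must be positive and depth non-negative")
    if treeblock_size % (1 << max_cu_depth) != 0:
        raise ValueError("treeblock size must be divisible by 2**max_cu_depth")
    return [treeblock_size >> depth for depth in range(max_cu_depth + 1)]


sizes = allowed_cu_sizes(64, 3)
scu_size = sizes[-1]  # smallest coding unit under this sketch's assumption
```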

[0060] A CU includes a coding node and the prediction units (PUs) and transform units (TUs) associated with the coding node. The size of the CU corresponds to the size of the coding node and must be square in shape. The size of the CU may range from 8x8 pixels up to the size of the treeblock, with a maximum of 64x64 pixels or greater. Each CU may contain one or more PUs and one or more TUs. Syntax data associated with a CU may describe, for example, partitioning of the CU into one or more PUs. Partitioning modes may differ depending on whether the CU is skip or direct mode encoded, intra-prediction mode encoded, or inter-prediction mode encoded. PUs may be partitioned to be non-square in shape. Syntax data associated with a CU may also describe, for example, partitioning of the CU into one or more TUs according to a quadtree. A TU may be square or non-square (for example, rectangular) in shape.

[0061] The HEVC standard allows for transformations according to TUs, which may be different for different CUs. The TUs are typically sized based on the size of the PUs within a given CU defined for a partitioned LCU, although this may not always be the case. The TUs are typically the same size as or smaller than the PUs. In some examples, residual samples corresponding to a CU may be subdivided into smaller units using a quadtree structure known as a "residual quadtree" (RQT). The leaf nodes of the RQT may be referred to as transform units (TUs). Pixel difference values associated with the TUs may be transformed to produce transform coefficients, which may be quantized.

[0062] A leaf CU may include one or more prediction units (PUs). In general, a PU represents a spatial area corresponding to all or a portion of the corresponding CU, and may include data for retrieving a reference sample for the PU. Moreover, a PU includes data related to prediction. For example, when the PU is intra-mode encoded, data for the PU may be included in a residual quadtree (RQT), which may include data describing an intra-prediction mode for a TU corresponding to the PU. As another example, when the PU is inter-mode encoded, the PU may include data defining one or more motion vectors for the PU. The data defining the motion vector for a PU may describe, for example, a horizontal component of the motion vector, a vertical component of the motion vector, a resolution for the motion vector (for example, one-quarter pixel precision or one-eighth pixel precision), a reference frame to which the motion vector points, and/or a reference list (for example, List 0, List 1, or List C) for the motion vector.
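The motion vector data listed above can be pictured as a small record. The Python sketch below is illustrative only: the class name `MotionVectorData`, its field names, and the choice to store components as integers in units of 1/precision pixel are assumptions, not syntax taken from HEVC or the HM.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MotionVectorData:
    """Hypothetical container for the motion data of one inter-coded PU."""
    mv_x: int        # horizontal component, in units of 1/precision pixel
    mv_y: int        # vertical component, in units of 1/precision pixel
    precision: int   # 4 for quarter-pixel, 8 for eighth-pixel precision
    ref_frame: int   # index of the reference frame the vector points to
    ref_list: str    # "L0", "L1", or "LC" (List 0, List 1, List C)

    def in_pixels(self) -> Tuple[float, float]:
        """Return the displacement expressed in whole-pixel units."""
        return (self.mv_x / self.precision, self.mv_y / self.precision)


mv = MotionVectorData(mv_x=-6, mv_y=10, precision=4, ref_frame=0, ref_list="L0")
displacement = mv.in_pixels()
```

Storing the components as scaled integers keeps the record exact at sub-pixel precision; the conversion to pixel units is only needed for display or debugging.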

[0063] A leaf CU having one or more PUs may also include one or more transform units (TUs). The transform units may be specified using an RQT (also referred to as a TU quadtree structure), as discussed above. For example, a split flag may indicate whether a leaf CU is split into four transform units. Each transform unit may then be split further into sub-TUs. When a TU is not split further, it may be referred to as a leaf TU. Generally, for intra coding, all the leaf TUs belonging to a leaf CU share the same intra-prediction mode. That is, the same intra-prediction mode is generally applied to calculate predicted values for all TUs of a leaf CU. For intra coding, video encoder 20 may calculate a residual value for each leaf TU using the intra-prediction mode, as the difference between the portion of the CU corresponding to the TU and the original block. A TU is not necessarily limited to the size of a PU. Thus, TUs may be larger or smaller than a PU. For intra coding, a PU may be co-located with a corresponding leaf TU for the same CU. In some examples, the maximum size of a leaf TU may correspond to the size of the corresponding leaf CU.

[0064] Moreover, TUs of leaf CUs may also be associated with respective quadtree data structures, referred to as residual quadtrees (RQTs). That is, a leaf CU may include a quadtree indicating how the leaf CU is partitioned into TUs. The root node of a TU quadtree generally corresponds to a leaf CU, while the root node of a CU quadtree generally corresponds to a treeblock (or LCU). TUs of the RQT that are not split are referred to as leaf TUs. In general, this disclosure uses the terms CU and TU to refer to a leaf CU and a leaf TU, respectively, unless noted otherwise.

[0065] A video sequence typically includes a series of pictures. As described herein, "picture" and "frame" may be used interchangeably. That is, a picture containing video data may be referred to as a video frame, or simply a "frame". A group of pictures (GOP) generally comprises a series of one or more of the video pictures. A GOP may include syntax data in a header of the GOP, in a header of one or more of the pictures, or elsewhere, that describes the number of pictures included in the GOP. Each slice of a picture may include slice syntax data that describes an encoding mode for the respective slice. Video encoder 20 typically operates on video blocks within individual video slices in order to encode the video data. A video block may correspond to a coding node within a CU. The video blocks may have fixed or varying sizes, and may differ in size according to a specified coding standard.

[0066] As an example, the HM supports prediction in various PU sizes. Assuming that the size of a particular CU is 2Nx2N, the HM supports intra-prediction in PU sizes of 2Nx2N or NxN, and inter-prediction in symmetric PU sizes of 2Nx2N, 2NxN, Nx2N, or NxN. The HM also supports asymmetric partitioning for inter-prediction in PU sizes of 2NxnU, 2NxnD, nLx2N, and nRx2N. In asymmetric partitioning, one direction of a CU is not partitioned, while the other direction is partitioned into 25% and 75%. The portion of the CU corresponding to the 25% partition is indicated by an "n" followed by an indication of "Up", "Down", "Left", or "Right". Thus, for example, "2NxnU" refers to a 2Nx2N CU that is partitioned horizontally with a 2Nx0.5N PU on top and a 2Nx1.5N PU on bottom.
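The partition geometry described above can be made concrete with a short Python sketch that returns the (width, height) of each PU for a 2Nx2N CU. The function name `pu_sizes` and the top-to-bottom, left-to-right ordering of the returned PUs are assumptions of this illustration; the mode names are those used in the paragraph above.

```python
from typing import List, Tuple


def pu_sizes(mode: str, n: int) -> List[Tuple[int, int]]:
    """Return (width, height) of each PU of a 2Nx2N CU for a partition mode."""
    if n < 2 or n % 2 != 0:
        raise ValueError("N must be an even integer >= 2 so that 0.5N is whole")
    two_n = 2 * n
    small = n // 2          # 0.5N: the 25% partition
    large = two_n - small   # 1.5N: the 75% partition
    table = {
        # Symmetric modes
        "2Nx2N": [(two_n, two_n)],
        "2NxN": [(two_n, n), (two_n, n)],
        "Nx2N": [(n, two_n), (n, two_n)],
        "NxN": [(n, n)] * 4,
        # Asymmetric modes: the letter after "n" says where the 25% part lies
        "2NxnU": [(two_n, small), (two_n, large)],
        "2NxnD": [(two_n, large), (two_n, small)],
        "nLx2N": [(small, two_n), (large, two_n)],
        "nRx2N": [(large, two_n), (small, two_n)],
    }
    if mode not in table:
        raise ValueError(f"unknown partition mode: {mode}")
    return table[mode]


# For a 32x32 CU (N = 16), 2NxnU yields a 32x8 PU above a 32x24 PU.
parts = pu_sizes("2NxnU", 16)
```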

[0067] In this disclosure, "NxN" and "N by N" may be used interchangeably to refer to the pixel dimensions of a video block in terms of vertical and horizontal dimensions, for example, 16x16 pixels or 16 by 16 pixels. In general, a 16x16 block has 16 pixels in the vertical direction (y = 16) and 16 pixels in the horizontal direction (x = 16). Likewise, an NxN block generally has N pixels in the vertical direction and N pixels in the horizontal direction, where N represents a non-negative integer value. The pixels in a block may be arranged in rows and columns. Moreover, blocks need not have the same number of pixels in the horizontal direction as in the vertical direction. For example, blocks may comprise NxM pixels, where M is not necessarily equal to N.

[0068] Following intra-predictive or inter-predictive coding using the PUs of a CU, video encoder 20 may calculate residual data for the TUs of the CU. The PUs may comprise syntax data describing a method or mode of generating predictive pixel data in the spatial domain (also called the "pixel domain"), and the TUs may comprise coefficients in the transform domain following application of a transform, for example, a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform, to residual video data. The residual data may correspond to pixel differences between pixels of the unencoded image and prediction values corresponding to the PUs. Video encoder 20 may form the TUs including the residual data for the CU, and then transform the TUs to generate transform coefficients for the CU.

[0069] Following any transforms to generate transform coefficients, video encoder 20 may quantize the transform coefficients. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the coefficients, providing further compression. The quantization process may reduce the bit depth associated with some or all of the coefficients. For example, an n-bit value may be rounded down to an m-bit value during quantization, where n is greater than m.
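The bit-depth reduction mentioned above can be sketched as a right shift. This is only an illustration of rounding an n-bit value down to m bits; the function names are hypothetical, and a practical quantizer would normally be driven by a quantization step size rather than a plain shift.

```python
def round_down_to_m_bits(value, n, m):
    """Keep the m most significant bits of an unsigned n-bit value."""
    if not 0 < m < n:
        raise ValueError("requires 0 < m < n")
    if not 0 <= value < (1 << n):
        raise ValueError("value does not fit in n bits")
    return value >> (n - m)  # discards the n - m low-order bits


def approximate_restore(level, n, m):
    """Scale an m-bit level back to the n-bit range (the discarded bits are lost)."""
    return level << (n - m)


level = round_down_to_m_bits(183, n=8, m=4)       # 0b10110111 -> 0b1011
restored = approximate_restore(level, n=8, m=4)   # 0b10110000
print(level, restored)
```

The difference between the original value and the restored value is the quantization error, which is the price paid for the smaller representation.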

[0070] Following quantization, the video encoder may scan the transform coefficients, producing a one-dimensional vector from the two-dimensional matrix that includes the quantized transform coefficients. The scan may be designed to place higher-energy (and therefore lower-frequency) coefficients at the front of the array and lower-energy (and therefore higher-frequency) coefficients at the back of the array. In some examples, video encoder 20 may use a predefined scan order to scan the quantized transform coefficients so as to produce a serialized vector that can be entropy encoded. In other examples, video encoder 20 may perform an adaptive scan. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy encode the one-dimensional vector, for example, according to context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method. Video encoder 20 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 30 in decoding the video data.
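As a sketch of a predefined scan order, the example below serializes a square coefficient block in zigzag order. The zigzag pattern is an assumption chosen for illustration; the text does not name a specific scan, and the function name `zigzag_scan` is hypothetical.

```python
def zigzag_scan(block):
    """Serialize a square 2-D block into a 1-D list in zigzag order.

    Positions are visited one anti-diagonal at a time, alternating direction,
    so low-frequency (top-left) coefficients come first.
    """
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (
            rc[0] + rc[1],                               # which anti-diagonal
            rc[0] if (rc[0] + rc[1]) % 2 else rc[1],     # direction alternates
        ),
    )
    return [block[r][c] for r, c in order]


quantized = [
    [9, 5, 1],
    [4, 2, 0],
    [1, 0, 0],
]
print(zigzag_scan(quantized))
```

For this toy block the nonzero values end up at the front of the vector and the zeros are grouped at the end, which is the property an entropy coder can exploit.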

[0071] To perform CABAC, video encoder 20 may assign a context within a context model to a symbol to be transmitted. The context may relate to, for example, whether neighboring values of the symbol are non-zero or not. To perform CAVLC, video encoder 20 may select a variable-length code for a symbol to be transmitted. Codewords in VLC may be constructed such that relatively shorter codes correspond to more probable symbols, while longer codes correspond to less probable symbols. In this way, the use of VLC may achieve a bit savings over, for example, using equal-length codewords for each symbol to be transmitted. The probability determination may be based on the context assigned to the symbol.
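The bit savings of variable-length codewords can be illustrated with a small sketch. Huffman coding is used here as a stand-in for a variable-length code and is not the CAVLC scheme itself; the symbol probabilities and the function name are made up for the example.

```python
import heapq
from itertools import count


def huffman_code_lengths(probabilities):
    """Map each symbol to the length in bits of its Huffman codeword."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), [sym]) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probabilities}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (p1 + p2, next(tiebreak), syms1 + syms2))
    return lengths


probabilities = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_code_lengths(probabilities)
average_bits = sum(probabilities[s] * lengths[s] for s in probabilities)
fixed_bits = 2  # four symbols need 2 bits each with equal-length codewords
print(lengths, average_bits, fixed_bits)
```

With these assumed probabilities the most probable symbol receives a 1-bit codeword, and the average codeword length drops below the 2 bits needed by an equal-length code.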

[0072] Video encoder 20 may further send syntax data, such as block-based syntax data, image-based syntax data, and GOP-based syntax data, to video decoder 30, for example, in an image header, a block header, a slice header, or a GOP header. The GOP syntax data may describe the number of images in the respective GOP, and the image syntax data may indicate an encoding/prediction mode used to encode the corresponding image.

[0073] Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder or decoder circuitry, as applicable, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic circuitry, software, hardware, firmware, or any combination thereof. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined video encoder/decoder (codec). A device including video encoder 20 and/or video decoder 30 may comprise an integrated circuit, a microprocessor, and/or a wireless communication device, such as a cellular telephone.

[0074] FIG. 2 is a block diagram illustrating an example video encoder 20 that may implement the techniques described in this disclosure for predicting motion vectors in multi-view coding. Video encoder 20 may perform intra- and inter-coding of video blocks within slices. Intra-coding relies on spatial prediction to reduce or remove spatial redundancy in video within a given image. Inter-coding relies on temporal prediction to reduce or remove temporal redundancy in video within adjacent images of a video sequence. Intra mode (I mode) may refer to any of several spatial compression modes. Inter modes, such as unidirectional prediction (P mode) or bi-prediction (B mode), may refer to any of several temporal compression modes.

[0075] As shown in FIG. 2, video encoder 20 receives video data to be encoded. In the example of FIG. 2, video encoder 20 includes a mode selection module 40, an adder 50, a transform module 52, a quantization module 54, an entropy encoding module 56, and a reference image memory 64. Mode selection module 40, in turn, includes a motion estimation module 42, a motion compensation module 44, an intra prediction module 46, and a partition module 48. For video block reconstruction, video encoder 20 also includes an inverse quantization module 58, an inverse transform module 60, and an adder 62. A deblocking filter (not shown in FIG. 2) may also be included to filter block boundaries to remove blockiness artifacts from reconstructed video. If desired, the deblocking filter would typically filter the output of adder 62. Additional loop filters (in-loop or post-loop) may also be used in addition to the deblocking filter. Such filters are not shown for brevity but, if desired, may filter the output of adder 50 (as an in-loop filter).

[0076] During the encoding process, video encoder 20 receives an image or slice to be coded. The image or slice may be divided into multiple video blocks. Motion estimation module 42 and motion compensation module 44 perform inter-predictive coding of the received video block relative to one or more blocks in one or more reference images to provide temporal compression. Intra prediction module 46 may alternatively perform intra-predictive coding of the received video block relative to one or more neighboring blocks in the same image or slice as the block to be coded to provide spatial compression. Video encoder 20 may perform multiple coding passes, for example, to select an appropriate coding mode for each block of video data.

[0077] Moreover, partition module 48 may partition blocks of video data into sub-blocks, based on evaluation of previous partitioning schemes in previous coding passes. For example, partition module 48 may initially partition an image or slice into LCUs, and partition each of the LCUs into sub-CUs based on rate-distortion analysis (for example, rate-distortion optimization). Mode selection module 40 may further generate a quadtree data structure indicating the partitioning of an LCU into sub-CUs. Leaf-node CUs of the quadtree may include one or more PUs and one or more TUs.
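A quadtree partitioning decision of this kind can be sketched as a recursive comparison of costs. The exhaustive bottom-up search, the function name `split_quadtree`, and the `cost` callable (a stand-in for a rate-distortion cost) are all assumptions for illustration, not the partitioning method of the disclosure.

```python
def split_quadtree(x, y, size, min_size, cost):
    """Return (leaf CUs as (x, y, size) tuples, total cost) for one block.

    A block is split into four sub-blocks only if the summed cost of the
    best sub-partitions is lower than the cost of coding the block whole.
    """
    whole = cost(x, y, size)
    if size <= min_size:
        return [(x, y, size)], whole
    half = size // 2
    leaves, total = [], 0.0
    for dy in (0, half):
        for dx in (0, half):
            sub_leaves, sub_cost = split_quadtree(x + dx, y + dy, half, min_size, cost)
            leaves += sub_leaves
            total += sub_cost
    if total < whole:
        return leaves, total
    return [(x, y, size)], whole


def toy_cost(x, y, size):
    """Made-up cost: the top-left 16x16 region is expensive unless coded as 8x8 blocks."""
    return 100.0 if (x < 16 and y < 16 and size > 8) else 1.0


leaves, total = split_quadtree(0, 0, 32, 8, toy_cost)
print(leaves, total)
```

With this toy cost, the 32x32 block is split, the detailed top-left quadrant is split again into 8x8 leaves, and the other three quadrants remain 16x16 leaves.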

[0078] Mode selection module 40 may select one of the coding modes, intra or inter, for example, based on error results, and provides the resulting intra- or inter-coded block to adder 50 to generate residual block data and to adder 62 to reconstruct the encoded block for use as a reference image. Mode selection module 40 also provides syntax elements, such as motion vectors, intra-mode indicators, partition information, and other such syntax information, to entropy encoding module 56.

[0079] Motion estimation module 42, motion vector prediction module 43, and motion compensation module 44 may be highly integrated, but are illustrated separately for conceptual purposes. Motion estimation, performed by motion estimation module 42, is the process of generating motion vectors, which estimate motion for video blocks. A motion vector, for example, may indicate the displacement of a PU of a video block within a current image relative to a predictive block within a reference image (or other coded unit), relative to the current block being coded within the current image (or other coded unit).

[0080] A predictive block is a block that is found to closely match the block to be coded in terms of pixel difference, which may be determined by sum of absolute differences (SAD), sum of squared differences (SSD), or other difference metrics. In some examples, video encoder 20 may calculate values for sub-integer pixel positions of reference images stored in reference image memory 64, which may also be referred to as a "reference image buffer". For example, video encoder 20 may interpolate values of one-quarter pixel positions, one-eighth pixel positions, or other fractional pixel positions of the reference image. Therefore, motion estimation module 42 may perform a motion search relative to full pixel positions and fractional pixel positions and output a motion vector with fractional pixel precision.
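The difference metrics named above, and a search that uses one of them, can be sketched as follows. This is a minimal full-pixel exhaustive search under assumed names (`sad`, `ssd`, `crop`, `full_pixel_search`); it does not cover interpolation or fractional pixel positions.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))


def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((p - q) ** 2 for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))


def crop(image, x, y, size):
    """Cut a size x size block whose top-left corner is at column x, row y."""
    return [row[x:x + size] for row in image[y:y + size]]


def full_pixel_search(current_block, reference, x0, y0, search_range):
    """Return (cost, dx, dy) of the lowest-SAD block within the search window."""
    size = len(current_block)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = x0 + dx, y0 + dy
            if 0 <= x <= width - size and 0 <= y <= height - size:
                cost = sad(current_block, crop(reference, x, y, size))
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


# Toy reference image in which every sample value is unique.
reference = [[r * 10 + c for c in range(6)] for r in range(6)]
current = crop(reference, 3, 2, 2)   # content that sits at (3, 2) in the reference
print(full_pixel_search(current, reference, 1, 1, 2))
```

Here the current block is assumed to sit at position (1, 1), so the search reports a zero-cost match at a displacement of two samples right and one sample down.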

[0081] Motion estimation module 42 calculates a motion vector for a PU of a video block in an inter-coded slice by comparing the position of the PU to the position of a predictive block of a reference image. Accordingly, in general, the data for a motion vector may include a reference image list, an index into the reference image list (ref_idx), a horizontal component, and a vertical component. The reference image may be selected from a first reference image list (list 0), a second reference image list (list 1), or a combined reference image list (list c), each of which identifies one or more reference images stored in reference image memory 64.

[0082] Motion estimation module 42 may generate and send a motion vector that identifies the predictive block of the reference image to entropy encoding module 56 and motion compensation module 44. That is, motion estimation module 42 may generate and send motion vector data that identifies the reference image list containing the predictive block, an index into the reference image list identifying the image of the predictive block, and a horizontal and a vertical component to locate the predictive block within the identified image.

[0083] In some examples, rather than sending the actual motion vector for a current PU, motion vector prediction module 43 may predict the motion vector to further reduce the amount of data needed to communicate the motion vector. In this case, rather than encoding and communicating the motion vector itself, motion vector prediction module 43 may generate a motion vector difference (MVD) relative to a known (or knowable) motion vector. The known motion vector, which may be used with the MVD to define the current motion vector, can be defined by a so-called motion vector predictor (MVP). In general, to be a valid MVP, the motion vector being used for prediction must point to the same reference image as the motion vector currently being coded.
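The relationship between a motion vector, its predictor, and the MVD can be sketched as below. The `MotionVector` type and the function name are hypothetical, and comparing the (list, index) pair is a simplification of the requirement that both vectors point to the same reference image.

```python
from typing import NamedTuple


class MotionVector(NamedTuple):
    ref_list: int   # which reference image list (0 or 1)
    ref_idx: int    # index into that list
    dx: int         # horizontal component
    dy: int         # vertical component


def motion_vector_difference(mv, mvp):
    """Return the MVD (dx, dy) of mv relative to the predictor mvp.

    Simplifying assumption: two vectors point to the same reference image
    exactly when their (ref_list, ref_idx) pairs are equal.
    """
    if (mv.ref_list, mv.ref_idx) != (mvp.ref_list, mvp.ref_idx):
        raise ValueError("predictor must point to the same reference image")
    return (mv.dx - mvp.dx, mv.dy - mvp.dy)


current = MotionVector(ref_list=0, ref_idx=1, dx=13, dy=-4)
predictor = MotionVector(ref_list=0, ref_idx=1, dx=12, dy=-6)
print(motion_vector_difference(current, predictor))
```

When the predictor is close to the actual vector, the MVD components are small, which is what makes them cheaper to signal than the vector itself.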

[0084] In some examples, as described in greater detail with respect to FIG. 5 below, motion vector prediction module 43 may build a motion vector predictor candidate list that includes several neighboring blocks in spatial and/or temporal directions as candidates for the MVP. According to aspects of this disclosure, as described in greater detail below, motion vector predictor candidates may also be identified in images of different views (for example, in multi-view coding). When multiple motion vector predictor candidates are available (from multiple candidate blocks), motion vector prediction module 43 may determine a motion vector predictor for a current block according to predetermined selection criteria. For example, motion vector prediction module 43 may select the most accurate predictor from the candidate set based on analysis of encoding rate and distortion (for example, using a rate-distortion cost analysis or other coding efficiency analysis). In other examples, motion vector prediction module 43 may generate an average of the motion vector predictor candidates. Other methods of selecting a motion vector predictor are also possible.
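Two of the selection strategies mentioned above can be sketched as follows. The cost used here, the sum of the absolute MVD components, is only an assumed proxy for a rate-distortion cost, and the function names are hypothetical.

```python
def select_predictor(mv, candidates):
    """Return (index, candidate) of the candidate closest to the vector mv.

    Closeness is measured by |dx| + |dy| of the resulting MVD, an assumed
    stand-in for the cost of signaling that MVD.
    """
    def mvd_cost(index):
        cx, cy = candidates[index]
        return abs(mv[0] - cx) + abs(mv[1] - cy)

    best = min(range(len(candidates)), key=mvd_cost)
    return best, candidates[best]


def average_predictor(candidates):
    """Component-wise average of the candidates (floor division keeps integers)."""
    n = len(candidates)
    return (sum(c[0] for c in candidates) // n,
            sum(c[1] for c in candidates) // n)


candidates = [(4, 0), (6, 2), (-1, 7)]   # made-up candidate vectors
print(select_predictor((6, 1), candidates))
print(average_predictor(candidates))
```

The first strategy requires an index to be signaled so that the decoder can pick the same candidate; an averaged predictor can be derived identically at both ends without an index.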

[0085] Upon selecting a motion vector predictor, motion vector prediction module 43 may determine a motion vector predictor index (mvp_flag), which may be used to inform a video decoder (for example, video decoder 30) where to locate the MVP in a reference image list containing MVP candidate blocks. Motion vector prediction module 43 may also determine the MVD between the current block and the selected MVP. The MVP index and the MVD may be used to reconstruct the motion vector.
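The reconstruction step on the decoder side can be sketched as below, assuming the decoder has built the same candidate list as the encoder. The function name and the list contents are illustrative only.

```python
def reconstruct_motion_vector(mvp_index, mvd, candidates):
    """Rebuild the motion vector as the indexed predictor plus the MVD."""
    mvp = candidates[mvp_index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Assumed to be identical to the list the encoder used.
candidates = [(4, 0), (6, 2), (-1, 7)]
print(reconstruct_motion_vector(1, (1, -1), candidates))
```

Only the index and the MVD need to be signaled; the predictor itself is recovered from data the decoder already holds.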

[0086] In some examples, motion vector prediction module 43 may instead implement a so-called "merge mode", in which motion vector prediction module 43 may "merge" motion information (such as motion vectors, reference image indices, prediction directions, or other information) of a predictive video block with a current video block. Accordingly, in merge mode, the current video block inherits the motion information from another known (or knowable) video block. Motion vector prediction module 43 may build a merge mode candidate list that includes several neighboring blocks in spatial and/or temporal directions as candidates for merge mode. Motion vector prediction module 43 may determine an index value (for example, merge_idx), which may be used to inform a video decoder (for example, video decoder 30) where to locate the merging video block in a reference image list containing merge candidate blocks.
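Merge mode can be contrasted with MVD-based signaling in a short sketch: only an index is sent, and the motion information is inherited whole. The `MotionInfo` type, its field names, and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionInfo:
    mv: tuple        # (horizontal, vertical) components
    ref_idx: int     # reference image index
    direction: str   # prediction direction label, e.g. "L0" (illustrative)


def choose_merge_index(wanted, candidates):
    """Encoder side: index of the first candidate equal to `wanted`, else None."""
    for index, candidate in enumerate(candidates):
        if candidate == wanted:
            return index
    return None


def apply_merge(merge_idx, candidates):
    """Decoder side: the current block inherits the indexed candidate's motion info."""
    return candidates[merge_idx]


candidates = [
    MotionInfo(mv=(3, -1), ref_idx=0, direction="L0"),
    MotionInfo(mv=(8, 2), ref_idx=1, direction="L1"),
]
merge_idx = choose_merge_index(MotionInfo((8, 2), 1, "L1"), candidates)
print(merge_idx, apply_merge(merge_idx, candidates))
```

If no candidate matches the motion information the encoder wants, `choose_merge_index` returns None and the encoder in this sketch would fall back to signaling a predictor index and an MVD.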

[0087] According to aspects of this disclosure, motion vector prediction module 43 may identify a motion vector predictor, for example, for generating an MVD or merging, in multi-view coding. For example, motion vector prediction module 43 may identify a motion disparity vector from a block in a different view component than the current block to predict a motion vector for the current block. In other examples, motion vector prediction module 43 may identify a temporal motion vector from a block in a different view component than the current block to predict a motion vector for the current block.

[0088] Regarding motion disparity vector prediction, motion vector prediction module 43 may identify a motion disparity vector candidate from a candidate block to predict a motion vector for the video block currently being encoded (referred to as the "current block"). The current block may be located in the same picture as the candidate block (e.g., spatially adjacent to the candidate block), or may be located in another picture within the same view as the candidate block. In some examples, motion vector prediction module 43 may identify a motion vector predictor that refers to a reference picture in a different view than the motion vector for the current block. In such cases, according to the techniques of this disclosure, motion vector prediction module 43 may scale the motion vector predictor based on the difference in camera locations between the two views (e.g., the view referred to by the motion vector predictor and the view referred to by the current motion vector). For example, motion vector prediction module 43 may scale the motion disparity vector predictor according to the difference between the two views. In some examples, the difference between the two views may be represented by the difference between the view identifiers (view_id) associated with the views.
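As a rough illustration of the scaling described in paragraph [0088], the following sketch scales a candidate motion disparity vector by the ratio of two view distances, each taken as a difference of view_id values. The linear scaling rule, the function name, and the parameter names are assumptions made for this example; they presume that view_id differences are proportional to camera separation.

```python
from fractions import Fraction
from typing import Tuple


def scale_disparity_predictor(
    mv: Tuple[int, int],
    cand_view_id: int,
    cand_ref_view_id: int,
    cur_view_id: int,
    cur_ref_view_id: int,
) -> Tuple[int, int]:
    """Scale a candidate disparity vector to the view distance of the
    current motion vector (hypothetical linear rule)."""
    cand_dist = cand_ref_view_id - cand_view_id  # distance spanned by the predictor
    cur_dist = cur_ref_view_id - cur_view_id     # distance spanned by the current vector
    if cand_dist == 0:
        raise ValueError("candidate does not refer to a different view")
    factor = Fraction(cur_dist, cand_dist)
    # Each component is scaled and rounded to the nearest integer.
    return (round(mv[0] * factor), round(mv[1] * factor))
```

For instance, a predictor that spans one view and is reused for a vector spanning two views is doubled, while a predictor that spans the same distance is left unchanged.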

[0089] Regarding temporal motion vector prediction, motion vector prediction module 43 may identify a temporal motion vector candidate from a candidate block in a different view than the current block to predict a motion vector for the current block. For example, motion vector prediction module 43 may identify a temporal motion vector predictor candidate in a first view that refers to a block in a picture at another temporal location of the first view. According to aspects of this disclosure, motion vector prediction module 43 may use the identified temporal motion vector predictor candidate to predict a motion vector associated with a current block in a second, different view. The candidate block (which includes the motion vector predictor candidate) and the current block may be co-located. However, the relative location of the candidate block may be offset from the current block due to the disparity between the two views.

[0090] According to aspects of this disclosure, motion vector prediction module 43 may generate an MVP index (mvp_flag) and an MVD, or may generate a merge index (merge_idx). For example, motion vector prediction module 43 may generate a list of MVP or merge candidates. According to aspects of this disclosure, the MVP and/or merge candidates include one or more video blocks located in a different view than the video block currently being decoded.

[0091] Motion compensation, performed by motion compensation module 44, may involve fetching or generating the predictive block based on the motion vector determined by motion estimation module 42 and/or the information from motion vector prediction module 43. Motion estimation module 42, motion vector prediction module 43, and motion compensation module 44 may be functionally integrated in some examples. Upon receiving the motion vector for the PU of the current video block, motion compensation module 44 may locate the predictive block to which the motion vector points in one of the reference picture lists.

[0092] Adder 50 forms a residual video block by subtracting pixel values of the predictive block from the pixel values of the current video block being coded, forming pixel difference values, as discussed below. In general, motion estimation module 42 performs motion estimation relative to luma components, and motion compensation module 44 uses motion vectors calculated based on the luma components for both chroma components and luma components. Mode selection module 40 may also generate syntax elements associated with the video blocks and the video slice for use by video decoder 30 in decoding the video blocks of the slice.

[0093] Intra prediction module 46 may intra-predict a current block, as an alternative to the inter prediction performed by motion estimation module 42 and motion compensation module 44, as described above. In particular, intra prediction module 46 may determine an intra prediction mode to use to encode a current block. In some examples, intra prediction module 46 may encode a current block using various intra prediction modes, e.g., during separate encoding passes, and intra prediction module 46 (or mode selection module 40, in some examples) may select an appropriate intra prediction mode to use from the tested modes.

[0094] For example, intra prediction module 46 may calculate rate-distortion values using a rate-distortion analysis for the various tested intra prediction modes, and select the intra prediction mode having the best rate-distortion characteristics among the tested modes. Rate-distortion analysis generally determines an amount of distortion (or error) between an encoded block and the original, unencoded block that was encoded to produce the encoded block, as well as the bit rate (that is, the number of bits) used to produce the encoded block. Intra prediction module 46 may calculate ratios from the distortions and rates for the various encoded blocks to determine which intra prediction mode exhibits the best rate-distortion value for the block.
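The mode decision in paragraph [0094] can be sketched as follows. The disclosure speaks of ratios computed from distortions and rates; this example instead uses the widely used Lagrangian cost J = D + λ·R with a sum-of-squared-differences distortion, which is a swapped-in technique rather than the method of the disclosure. The function names, the mode names, and the value of λ are hypothetical.

```python
from typing import Dict, List, Tuple


def ssd(original: List[int], reconstructed: List[int]) -> int:
    """Distortion: sum of squared differences between two sample lists."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed))


def pick_intra_mode(
    original: List[int],
    trials: Dict[str, Tuple[List[int], int]],
    lam: float,
) -> Tuple[str, float]:
    """trials maps a mode name to (reconstructed samples, bits used).
    Returns the mode with the lowest cost D + lam * R and that cost."""
    best_mode, best_cost = "", float("inf")
    for mode, (recon, bits) in trials.items():
        cost = ssd(original, recon) + lam * bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```

A larger λ weights the bit count more heavily, so a cheaper but less accurate mode can win over a mode that reproduces the block exactly.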

[0095] After selecting an intra prediction mode for a block, intra prediction module 46 may provide information indicative of the selected intra prediction mode for the block to entropy encoding module 56. Entropy encoding module 56 may encode the information indicating the selected intra prediction mode. Video encoder 20 may include configuration data in the transmitted bitstream, which may include a plurality of intra prediction mode index tables and a plurality of modified intra prediction mode index tables (also referred to as "codeword mapping tables"), definitions of encoding contexts for various blocks, and indications of a most probable intra prediction mode, an intra prediction mode index table, and a modified intra prediction mode index table to use for each of the contexts.

[0096] Video encoder 20 forms a residual video block by subtracting the prediction data from mode selection module 40 from the original video block being coded. Adder 50 represents the component or components that perform this subtraction operation. Transform processor 52 applies a transform, such as a discrete cosine transform (DCT) or a conceptually similar transform, to the residual block, producing a video block comprising residual transform coefficient values. Transform processor 52 may perform other transforms that are conceptually similar to the DCT. Wavelet transforms, integer transforms, sub-band transforms, or other types of transforms could also be used. In any case, transform processor 52 applies the transform to the residual block, producing a block of residual transform coefficients. The transform may convert the residual information from the pixel domain to a transform domain, such as the frequency domain.

[0097] Transform processor 52 may send the resulting transform coefficients to quantization module 54. Quantization module 54 quantizes the transform coefficients to further reduce the bit rate. The quantization process may reduce the bit depth associated with some or all of the coefficients. The degree of quantization may be modified by adjusting a quantization parameter. In some examples, quantization module 54 may then perform a scan of the matrix including the quantized transform coefficients. Alternatively, entropy encoding module 56 may perform the scan.

[0098] Following quantization, entropy encoding module 56 entropy encodes the quantized transform coefficients. For example, entropy encoding module 56 may perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding technique. In the case of context-based entropy coding, the context may be based on neighboring blocks. Following entropy coding by entropy encoding module 56, the encoded bitstream may be transmitted to another device (e.g., video decoder 30) or archived for later transmission or retrieval.

[0099] Inverse quantization module 58 and inverse transform module 60 apply inverse quantization and inverse transformation, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block. Motion compensation module 44 may calculate a reference block by adding the residual block to a predictive block of one of the pictures of reference picture memory 64. Motion compensation module 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for use in motion estimation. Adder 62 adds the reconstructed residual block to the motion-compensated predictive block produced by motion compensation module 44 to produce a reconstructed video block for storage in reference picture memory 64. The reconstructed video block may be used by motion estimation module 42 and motion compensation module 44 as a reference block to inter-code a block in a subsequent picture.
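The encoder-side loop of subtraction, quantization, inverse quantization, and reconstruction can be sketched in a few lines. This is a deliberately simplified illustration: the transform stage is skipped, the quantizer is a plain uniform scalar quantizer with a hypothetical `step` parameter, and the function names are not taken from the disclosure.

```python
from typing import List


def encode_block(current: List[int], prediction: List[int], step: int) -> List[int]:
    """Form the residual (the subtraction) and quantize it.
    Python's round() sends exact ties to the nearest even integer."""
    residual = [c - p for c, p in zip(current, prediction)]
    return [round(r / step) for r in residual]


def reconstruct_block(levels: List[int], prediction: List[int], step: int) -> List[int]:
    """Inverse-quantize the levels and add the prediction back, as done
    before a block is stored for use as a reference."""
    recon_residual = [lv * step for lv in levels]
    return [p + r for p, r in zip(prediction, recon_residual)]
```

With a step of 1 the round trip is lossless; larger steps lower the number of distinct levels to be coded at the price of reconstruction error, which is why the encoder reconstructs the block itself rather than reusing the original samples as a reference.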

[0100] FIG. 3 is a block diagram illustrating an example video decoder 30 that may implement the techniques described in this disclosure for predicting motion vectors in multiview coding. In the example of FIG. 3, video decoder 30 includes entropy decoding module 80, prediction module 81, inverse quantization module 86, inverse transform module 88, adder 90, and reference picture memory 92. Prediction module 81 includes motion compensation module 82 and intra prediction module 84.

[0101] During the decoding process, video decoder 30 receives, from video encoder 20, an encoded video bitstream that represents video blocks of an encoded video slice and associated syntax elements. Entropy decoding module 80 of video decoder 30 entropy decodes the bitstream to generate quantized coefficients, motion vectors, and other syntax elements. Entropy decoding module 80 forwards the motion vectors and other syntax elements to prediction module 81. Video decoder 30 may receive the syntax elements at the video slice level and/or the video block level.

[0102] For example, by way of background, video decoder 30 may receive compressed video data that has been packaged for transmission over a network into so-called "network abstraction layer units," or NAL units. Each NAL unit may include a header that identifies the type of data stored in the NAL unit. Two types of data are commonly stored in NAL units. The first type of data stored in a NAL unit is video coding layer (VCL) data, which includes the compressed video data. The second type of data stored in a NAL unit is referred to as non-VCL data, which includes additional information such as parameter sets that define header data common to a large number of NAL units, and supplemental enhancement information (SEI).
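To make the role of the NAL unit header concrete, the sketch below parses a header byte and classifies the payload as VCL or non-VCL. It assumes the one-byte H.264/AVC header layout (1-bit forbidden_zero_bit, 2-bit nal_ref_idc, 5-bit nal_unit_type), in which types 1 through 5 carry coded slice data; the disclosure does not specify a layout, and other codecs and multiview extensions use different or extended headers. The function name is hypothetical.

```python
def parse_h264_nal_header(first_byte: int) -> dict:
    """Split an H.264/AVC NAL unit header byte into its fields
    (layout assumed for illustration only)."""
    nal_unit_type = first_byte & 0x1F
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x1,
        "nal_ref_idc": (first_byte >> 5) & 0x3,
        "nal_unit_type": nal_unit_type,
        # In H.264/AVC, types 1..5 are coded slice data (VCL); types such
        # as 6 (SEI), 7 (SPS) and 8 (PPS) are non-VCL.
        "is_vcl": 1 <= nal_unit_type <= 5,
    }
```

Under this assumption, a header byte of 0x65 identifies a coded slice of an IDR picture (VCL), while 0x67 identifies a sequence parameter set (non-VCL).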

[0103] For example, parameter sets may contain sequence-level header information (e.g., in sequence parameter sets (SPS)) and infrequently changing picture-level header information (e.g., in picture parameter sets (PPS)). The infrequently changing information contained in the parameter sets need not be repeated for each sequence or picture, thereby improving coding efficiency. In addition, the use of parameter sets enables out-of-band transmission of header information, thereby avoiding the need for redundant transmissions for error resilience.

[0104] When the video slice is coded as an intra-coded (I) slice, intra prediction module 84 of prediction module 81 may generate prediction data for a video block of the current video slice based on a signaled intra prediction mode and data from previously decoded blocks of the current picture. When the picture is coded as an inter-coded (i.e., B, P, or GPB) slice of macroblocks, motion compensation module 82 of prediction module 81 produces predictive blocks for a video block of the current video slice based on the motion vectors and other syntax elements received from entropy decoding module 80. The predictive blocks may be produced from one of the reference pictures within one of the reference picture lists. Video decoder 30 may construct the reference picture lists, List 0 and List 1, using default construction techniques based on reference pictures stored in reference picture memory 92.

[0105] Motion compensation module 82 determines prediction information for a video block of the current video slice by parsing the motion vectors and other syntax elements, and uses the prediction information to produce the predictive blocks for the current video block being decoded. For example, motion compensation module 82 uses some of the received syntax elements to determine a prediction mode (e.g., intra or inter prediction) used to code the video blocks of the video slice, an inter prediction slice type (e.g., B slice, P slice, or GPB slice), construction information for one or more of the reference picture lists for the slice, motion vectors for each inter-coded video block of the slice, inter prediction status for each inter-coded video block of the slice, and other information to decode the video blocks in the current video slice. In some examples, motion compensation module 82 may receive certain motion information from motion vector prediction module 83.

[0106] According to aspects of this disclosure, motion vector prediction module 83 may receive prediction data indicating where to retrieve motion information for a current block. For example, motion vector prediction module 83 may receive motion vector prediction information such as an MVP index (mvp_flag), an MVD, a merge flag (merge_flag), and/or a merge index (merge_idx), and use such information to identify the motion information used to predict the current block. That is, as noted above with respect to video encoder 20, according to aspects of this disclosure, motion vector prediction module 83 may receive an MVP index (mvp_flag) and an MVD, and use such information to determine the motion vector used to predict the current block. Motion vector prediction module 83 may generate a list of MVP or merge candidates. According to aspects of this disclosure, the MVP and/or merge candidates may include one or more video blocks located in a different view than the video block currently being decoded.

[0107] Motion vector prediction module 83 may use an MVP or merge index to identify the motion information used to predict the motion vector of the current block. That is, for example, motion vector prediction module 83 may identify an MVP from a reference picture list using the MVP index (mvp_flag). Motion vector prediction module 83 may combine the identified MVP with a received MVD to determine the motion vector for the current block. In other examples, motion vector prediction module 83 may identify a merge candidate from a reference picture list using a merge index (merge_idx) to determine motion information for the current block. In any event, after determining motion information for the current block, motion vector prediction module 83 may generate the predictive block for the current block.
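The relationship between the MVP index, the MVD, and the reconstructed motion vector can be sketched as below. The candidate list contents, the sum-of-absolute-components selection cost, and the function names are hypothetical; the sketch only shows that the decoder recovers the motion vector by adding the signaled difference to the selected predictor.

```python
from typing import List, Tuple

Vec = Tuple[int, int]


def encode_mv(candidates: List[Vec], mv: Vec) -> Tuple[int, Vec]:
    """Encoder side: pick the predictor giving the smallest difference
    (sum of absolute components) and return (mvp_idx, mvd)."""
    best_idx, best_mvd, best_cost = 0, (0, 0), float("inf")
    for idx, (px, py) in enumerate(candidates):
        mvd = (mv[0] - px, mv[1] - py)
        cost = abs(mvd[0]) + abs(mvd[1])
        if cost < best_cost:
            best_idx, best_mvd, best_cost = idx, mvd, cost
    return best_idx, best_mvd


def decode_mv(candidates: List[Vec], mvp_idx: int, mvd: Vec) -> Vec:
    """Decoder side: motion vector = selected predictor + received MVD."""
    mvp = candidates[mvp_idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

As long as both sides derive the same candidate list, the round trip is exact regardless of which predictor the encoder chooses; the choice only affects how many bits the difference costs.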

[0108] According to aspects of this disclosure, motion vector prediction unit 83 may determine a motion vector predictor in multi-view coding. For example, motion vector prediction unit 83 may receive information identifying a disparity motion vector from a block in a different view component than the current block, which is used to predict the motion vector for the current block. In other examples, motion vector prediction unit 83 may receive information identifying a temporal motion vector from a block in a different view component than the current block, which is used to predict the motion vector for the current block.

[0109] With respect to disparity motion vector prediction, motion vector prediction unit 83 may predict a disparity motion vector for the current block from a candidate block. The candidate block may be located in the same picture as the current block (e.g., as a spatial neighbor of the current block), or in another picture of the same view as the current block. The candidate block may also be located in a picture of a different view, but at the same time instance as the current block.

[0110] For example, for either MVP or merge mode, the target picture and the reference picture for a disparity motion vector "A" of the current block to be predicted are known (previously determined). For purposes of explanation, assume that the motion vector from the candidate block is "B". According to aspects of this disclosure, if motion vector B is not a disparity motion vector, motion vector prediction unit 83 may consider the candidate block unavailable (e.g., not available for predicting motion vector A). That is, motion vector prediction unit 83 may disable the ability to use the candidate block for purposes of motion vector prediction.

[0111] If motion vector B is a disparity motion vector, the reference picture of motion vector B belongs to the same view as the reference picture of disparity motion vector A, and the target picture of motion vector B belongs to the same view as the target picture of disparity motion vector A, motion vector prediction unit 83 may use motion vector B directly as a candidate predictor of motion vector A. Otherwise, motion vector prediction unit 83 may scale disparity motion vector B before it can be used as a candidate predictor of motion vector A. In such cases, according to the techniques of this disclosure, motion vector prediction unit 83 may scale the disparity motion vector based on the view distance of motion vector A and the view distance of motion vector B. For example, motion vector prediction unit 83 may scale disparity motion vector B by a scaling factor equal to the view distance of motion vector A divided by the view distance of motion vector B. In some examples, motion vector prediction unit 83 may perform such scaling using the view identifiers of the reference pictures and target pictures.
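
The availability check and view-distance scaling described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `CandidateMV` type and `predict_disparity_mv` are hypothetical names, the view distance is assumed to be the difference between the view identifiers of the reference and target pictures, and rounding to the nearest integer is an assumption.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional, Tuple


@dataclass(frozen=True)
class CandidateMV:
    """Motion vector B of a candidate block (hypothetical container)."""
    x: int
    y: int
    target_view: int  # view_id of the picture containing the candidate block
    ref_view: int     # view_id of the picture that B points to
    same_time: bool   # True if target and reference share a time instance

    @property
    def is_disparity(self) -> bool:
        # A disparity motion vector points across views at one time instance.
        return self.same_time and self.target_view != self.ref_view


def predict_disparity_mv(a_target_view: int, a_ref_view: int,
                         b: CandidateMV) -> Optional[Tuple[int, int]]:
    """Return a candidate predictor for disparity vector A, or None if unavailable."""
    if not b.is_disparity:
        return None  # candidate treated as unavailable
    if b.target_view == a_target_view and b.ref_view == a_ref_view:
        return (b.x, b.y)  # same views: use B directly
    # Assumed view distance: view_id(reference) - view_id(target).
    scale = Fraction(a_ref_view - a_target_view, b.ref_view - b.target_view)
    return (round(b.x * scale), round(b.y * scale))


# B spans one view (1 -> 0); A spans two views (2 -> 0), so B is doubled.
b = CandidateMV(x=8, y=0, target_view=1, ref_view=0, same_time=True)
scaled = predict_disparity_mv(a_target_view=2, a_ref_view=0, b=b)
```

Using `Fraction` keeps the scaling factor exact until the final rounding step; a real codec would more likely use fixed-point arithmetic.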

[0112] With respect to temporal motion vector prediction, motion vector prediction unit 83 may predict a temporal motion vector for the current block from a candidate block in a different view than the current block. For example, motion vector prediction unit 83 may identify a temporal motion vector predictor candidate having a target picture in a first view that refers to a block in a reference picture at another temporal location of the first view. According to aspects of this disclosure, motion vector prediction unit 83 may use the identified temporal motion vector predictor candidate to predict a motion vector associated with the current block in a second, different view.

[0113] For example, for either MVP or merge mode, the target picture and the reference picture for a temporal motion vector "A" of the current block to be predicted are known (previously determined). For purposes of explanation, assume that the motion vector from the candidate block is "B". According to aspects of this disclosure, if motion vector B from the candidate block is not a temporal motion vector, motion vector prediction unit 83 may consider the candidate block unavailable (e.g., not available for predicting motion vector A). That is, in some examples, motion vector prediction unit 83 may disable the ability to use the candidate block for purposes of motion vector prediction.

[0114] If motion vector B is a temporal motion vector, the POC of the reference picture of motion vector B is the same as that of the reference picture of motion vector A, and the POC of the target picture of motion vector B is the same as that of the target picture of motion vector A, motion vector prediction unit 83 may use motion vector B directly as a candidate predictor of motion vector A. Otherwise, motion vector prediction unit 83 may scale temporal motion vector B based on temporal distance. The candidate block (which includes the motion vector predictor candidate) and the current block may be co-located in different views. However, the relative location of the candidate block may be offset from the current block due to the disparity between the two views.
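
The temporal case can be sketched in the same way. The names `TemporalCandidate` and `predict_temporal_mv` are hypothetical, the temporal distance is assumed to be the POC difference between target and reference pictures, and the rounding is an assumption.

```python
from fractions import Fraction
from typing import NamedTuple, Optional, Tuple


class TemporalCandidate(NamedTuple):
    """Motion vector B of a candidate block (hypothetical container)."""
    x: int
    y: int
    target_poc: int  # POC of the picture containing the candidate block
    ref_poc: int     # POC of the picture that B points to
    same_view: bool  # True if target and reference are in the same view


def predict_temporal_mv(a_target_poc: int, a_ref_poc: int,
                        b: TemporalCandidate) -> Optional[Tuple[int, int]]:
    """Return a candidate predictor for temporal vector A, or None if unavailable."""
    is_temporal = b.same_view and b.target_poc != b.ref_poc
    if not is_temporal:
        return None  # candidate treated as unavailable
    if (b.target_poc, b.ref_poc) == (a_target_poc, a_ref_poc):
        return (b.x, b.y)  # matching POC values: use B directly
    # Assumed temporal distance: POC(target) - POC(reference).
    scale = Fraction(a_target_poc - a_ref_poc, b.target_poc - b.ref_poc)
    return (round(b.x * scale), round(b.y * scale))


# B spans two POC steps (4 -> 2); A spans four (4 -> 0), so B is doubled.
b = TemporalCandidate(x=6, y=-2, target_poc=4, ref_poc=2, same_view=True)
scaled = predict_temporal_mv(a_target_poc=4, a_ref_poc=0, b=b)
```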

[0115] Inverse quantization unit 86 inverse quantizes, i.e., de-quantizes, the quantized transform coefficients provided in the bitstream and decoded by entropy decoding unit 80. The inverse quantization process may include use of a quantization parameter calculated by video encoder 20 for each video block in the slice to determine a degree of quantization and, likewise, a degree of inverse quantization that should be applied.

[0116] Inverse transform unit 88 applies an inverse transform, e.g., an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, to the transform coefficients in order to produce residual blocks in the pixel domain. According to aspects of this disclosure, inverse transform unit 88 may determine the manner in which transforms were applied to residual data. That is, for example, inverse transform unit 88 may determine an RQT that represents the manner in which transforms (e.g., DCT, integer transform, wavelet transform, or one or more other transforms) were applied to the residual luma samples and the residual chroma samples associated with a block of received video data.

[0117] After motion compensation unit 82 generates the predictive block for the current video block based on the motion vectors and other syntax elements, video decoder 30 forms a decoded video block by summing the residual blocks from inverse transform unit 88 with the corresponding predictive blocks generated by motion compensation unit 82. Summer 90 represents the component or components that perform this summation operation. If desired, a deblocking filter may also be applied to filter the decoded blocks in order to remove blockiness artifacts. Other loop filters (either in the coding loop or after the coding loop) may also be used to smooth pixel transitions or otherwise improve the video quality. The decoded video blocks in a given picture are then stored in reference picture memory 92, which stores reference pictures used for subsequent motion compensation. Reference picture memory 92 also stores decoded video for later presentation on a display device, such as display device 32 of FIG. 1.
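
The summation performed by summer 90 can be illustrated with a short sketch. The function name `reconstruct_block`, the 8-bit default sample depth, and the clipping to the valid sample range are assumptions made for this illustration; the text above only states that residual and predictive blocks are summed.

```python
from typing import List

Block = List[List[int]]


def reconstruct_block(prediction: Block, residual: Block,
                      bit_depth: int = 8) -> Block:
    """Add residual samples to predicted samples, clipping to the sample range."""
    max_val = (1 << bit_depth) - 1  # 255 for the assumed 8-bit samples
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


decoded = reconstruct_block([[250, 10], [100, 100]], [[10, -20], [5, -5]])
```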

[0118] FIG. 4 is a conceptual diagram illustrating an example MVC prediction pattern. In the example of FIG. 4, eight views are illustrated, and twelve temporal locations are illustrated for each view. In general, each row in FIG. 4 corresponds to a view, while each column indicates a temporal location. Each of the views may be identified using a view identifier ("view_id"), which may be used to indicate a relative camera location with respect to the other views. In the example shown in FIG. 4, the view identifiers are indicated as "S0" through "S7", although numeric view identifiers may also be used. In addition, each of the temporal locations may be identified using a picture order count (POC) value, which indicates the display order of the pictures. In the example shown in FIG. 4, the POC values are indicated as "T0" through "T11".

[0119] Although MVC has a so-called base view that is decodable by H.264/AVC decoders, and a stereo view pair can be supported by MVC, MVC may also support more than two views as a 3D video input. Accordingly, a renderer of a client having an MVC decoder may expect 3D video content with multiple views.

[0120] Pictures in FIG. 4 are indicated using a shaded block including a letter, designating whether the corresponding picture is intra-coded (that is, an I-frame), inter-coded in one direction (that is, a P-frame), or inter-coded in multiple directions (that is, a B-frame). In general, predictions are indicated by arrows, where the pointed-to picture uses the pointed-from object as its prediction reference. For example, the P-frame of view S2 at temporal location T0 is predicted from the I-frame of view S0 at temporal location T0.

[0121] As with single-view video encoding, pictures of a multi-view video sequence may be predictively encoded with respect to pictures at different temporal locations. For example, the b-frame of view S0 at temporal location T1 has an arrow pointing to it from the I-frame of view S0 at temporal location T0, indicating that the b-frame is predicted from the I-frame. Additionally, however, in the context of multi-view video encoding, pictures may be inter-view predicted. That is, a view component can use the view components in other views for reference. In MVC, for example, inter-view prediction is realized as if the view component in another view were an inter-prediction reference. The potential inter-view references may be signaled in the Sequence Parameter Set (SPS) MVC extension and may be modified by the reference picture list construction process, which enables flexible ordering of the inter-prediction or inter-view prediction references.

[0122] FIG. 4 provides various examples of inter-view prediction. Pictures of view S1, in the example of FIG. 4, are illustrated as being predicted from pictures at different temporal locations of view S1, as well as inter-view predicted from pictures of views S0 and S2 at the same temporal locations. For example, the b-frame of view S1 at temporal location T1 is predicted from each of the B-frames of view S1 at temporal locations T0 and T2, as well as the b-frames of views S0 and S2 at temporal location T1.

[0123] In the example of FIG. 4, capital "B" and lowercase "b" are intended to indicate different hierarchical relationships between pictures, rather than different encoding methodologies. In general, capital "B" frames are relatively higher in the prediction hierarchy than lowercase "b" frames. FIG. 4 also illustrates variations in the prediction hierarchy using different levels of shading, where pictures with a greater amount of shading (that is, relatively darker) are higher in the prediction hierarchy than pictures having less shading (that is, relatively lighter). For example, all I-frames in FIG. 4 are illustrated with full shading, while P-frames have somewhat lighter shading, and B-frames (and lowercase b-frames) have various levels of shading relative to each other, but always lighter than the shading of the P-frames and the I-frames.

[0124] In general, the prediction hierarchy is related to view order indexes, in that pictures relatively higher in the prediction hierarchy should be decoded before pictures that are relatively lower in the hierarchy, such that the pictures relatively higher in the hierarchy can be used as reference pictures during decoding of the pictures relatively lower in the hierarchy. A view order index is an index that indicates the decoding order of view components in an access unit. The view order indexes may be implied in a parameter set, such as an SPS.

[0125] In this manner, pictures used as reference pictures may be decoded before the pictures that are encoded with reference to them. A view order index is an index that indicates the decoding order of view components in an access unit. For each view order index i, the corresponding view_id is signaled. The decoding of the view components follows the ascending order of the view order indexes. If all the views are presented, then the set of view order indexes comprises a consecutively ordered set from zero to one less than the full number of views.
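
The relationship between view order indexes and decoding order can be sketched as follows. The function `decoding_order` and the example mapping are hypothetical; the sketch only assumes, as stated above, that each view order index maps to one signaled view_id and that decoding follows ascending index order.

```python
from typing import Dict, List


def decoding_order(view_id_by_order_index: Dict[int, str]) -> List[str]:
    """Return view_ids in decoding order (ascending view order index)."""
    count = len(view_id_by_order_index)
    # With all views present, the indexes must run from 0 to count - 1.
    if sorted(view_id_by_order_index) != list(range(count)):
        raise ValueError("view order indexes must be consecutive from zero")
    return [view_id_by_order_index[i] for i in range(count)]


# Hypothetical signaling: S2 is decoded before S1 so that S1 can refer to it.
order = decoding_order({0: "S0", 2: "S1", 1: "S2"})
```

The example shows that the decoding order need not match the numeric order of the view identifiers.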

[0126] In MVC, a subset of the whole bitstream can be extracted to form a sub-bitstream that still conforms to MVC. There are many possible sub-bitstreams that specific applications may require, based on, for example, the service provided by a server, the capacity, support, and capabilities of the decoders of one or more clients, and/or the preferences of one or more clients. For example, a client might require only three views, and there might be two scenarios. In one example, one client may require a smooth viewing experience and might prefer views with view_id values S0, S1, and S2, while another client may require view scalability and prefer views with view_id values S0, S2, and S4. Note that both of these sub-bitstreams can be decoded as independent MVC bitstreams and can be supported simultaneously.
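
A sub-bitstream extraction of this kind can be sketched as a filter over view components. Everything here is illustrative: `required_views` and `extract_sub_bitstream` are hypothetical names, view components are modeled as (view_id, temporal location) pairs, and the dependency map is an assumed example rather than a transcription of FIG. 4. The sketch also assumes that views needed for inter-view reference must be retained for the result to remain decodable.

```python
from typing import Dict, Iterable, List, Set, Tuple

ViewComponent = Tuple[str, str]  # (view_id, temporal location), e.g. ("S1", "T0")


def required_views(wanted: Iterable[str],
                   depends_on: Dict[str, List[str]]) -> Set[str]:
    """Collect the wanted views plus every view they reference, transitively."""
    needed: Set[str] = set()
    stack = list(wanted)
    while stack:
        view = stack.pop()
        if view not in needed:
            needed.add(view)
            stack.extend(depends_on.get(view, ()))
    return needed


def extract_sub_bitstream(components: List[ViewComponent],
                          wanted: Iterable[str],
                          depends_on: Dict[str, List[str]]) -> List[ViewComponent]:
    """Keep only view components that belong to the required views."""
    keep = required_views(wanted, depends_on)
    return [vc for vc in components if vc[0] in keep]


# Assumed inter-view dependencies, for illustration only.
deps = {"S1": ["S0", "S2"], "S2": ["S0"], "S4": ["S2"]}
stream = [(v, "T0") for v in ("S0", "S1", "S2", "S3", "S4")]
smooth = extract_sub_bitstream(stream, ["S0", "S1", "S2"], deps)
scalable = extract_sub_bitstream(stream, ["S0", "S2", "S4"], deps)
```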

[0127] FIG. 5 is a block diagram illustrating potential motion vector predictor candidates when performing motion vector prediction (including merge mode). That is, for block 100 currently being coded, motion information (e.g., a motion vector comprising a horizontal component and a vertical component, motion vector indexes, prediction directions, or other information) from neighboring blocks A₀, A₁, B₀, B₁, and B₂ may be used to predict the motion information for block 100. In addition, motion information associated with co-located block COL may also be used to predict the motion information for block 100. The neighboring blocks A₀, A₁, B₀, B₁, and B₂ and co-located block COL, in the context of motion vector prediction, may generally be referred to below as "motion vector predictor candidates".

[0128] In some examples, the motion vector predictor candidates shown in FIG. 5 may be identified when performing motion vector prediction (e.g., whether generating an MVD or performing merge mode). In other examples, different candidates may be identified when performing merge mode and motion vector prediction. That is, a video coder may identify a different set of motion vector predictor candidates for performing merge mode than for performing motion vector prediction.

[0129] To perform merge mode, in an example, a video encoder (such as video encoder 20) may initially determine which motion vectors from the motion vector predictor candidates are available to merge with block 100. That is, in some instances, motion information from one or more of the motion vector predictor candidates may be unavailable because, for example, the motion vector predictor candidate is intra-coded, not yet coded, or does not exist (e.g., one or more of the motion vector predictor candidates are located in another picture or slice). Video encoder 20 may construct a motion vector predictor candidate list that includes each of the available motion vector predictor candidate blocks.

[0130] After constructing the candidate list, video encoder 20 may select a motion vector from the candidate list to be used as the motion vector for current block 100. In some examples, video encoder 20 may select the motion vector from the candidate list that best matches the motion vector for block 100. That is, video encoder 20 may select the motion vector from the candidate list according to a rate-distortion analysis.

[0131] Video encoder 20 may provide an indication that block 100 is encoded using merge mode. For example, video encoder 20 may set a flag or other syntax element indicating that the motion vector for block 100 is predicted using merge mode. In an example, video encoder 20 may indicate that the inter prediction parameters for block 100 are inferred from a motion vector predictor candidate by setting merge_flag[x0][y0]. In this example, the array indices x0, y0 may specify the location (x0, y0) of the top-left luma sample of the prediction block relative to the top-left luma sample of the picture (or slice).

[0132] In addition, in some examples, video encoder 20 may provide an index identifying the merge candidate from which block 100 inherits its motion vector. For example, merge_idx[x0][y0] may specify the merge candidate index, which identifies a picture in the merge candidate list, where x0, y0 specify the location (x0, y0) of the top-left luma sample of the prediction block relative to the top-left luma sample of the picture (or slice).

[0133] A video decoder (such as video decoder 30) may perform similar steps to identify the appropriate merge candidate when decoding block 100. For example, video decoder 30 may receive an indication that block 100 is predicted using merge mode. In an example, video decoder 30 may receive merge_flag[x0][y0], where (x0, y0) specifies the location of the top-left luma sample of the prediction block relative to the top-left luma sample of the picture (or slice).

[0134] In addition, video decoder 30 may construct a merge candidate list. For example, video decoder 30 may receive one or more syntax elements (e.g., flags) indicating the video blocks that are available for motion vector prediction. Video decoder 30 may construct the merge candidate list based on the received flags. According to some examples, video decoder 30 may construct the merge candidate list (e.g., mergeCandList) according to the following sequence:

1. A₁, if availableFlagA₁ is equal to 1

2. B₁, if availableFlagB₁ is equal to 1

3. B₀, if availableFlagB₀ is equal to 1

4. A₀, if availableFlagA₀ is equal to 1

5. B₂, if availableFlagB₂ is equal to 1

6. Col, if availableFlagCol is equal to 1

If several merge candidates have identical motion vectors and identical reference indices, the duplicate merge candidates may be removed from the list.
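The list construction above can be sketched as follows. This is a minimal illustration, not the method of any particular codec: the `MotionInfo` type and the `build_merge_cand_list` function are hypothetical names, an unavailable candidate (availableFlag equal to 0) is modeled as `None`, and the sketch assumes that when two candidates are identical, the one examined first is kept.

```python
from typing import NamedTuple, Optional


class MotionInfo(NamedTuple):
    """Hypothetical container for the motion data of one candidate."""
    mv: tuple[int, int]  # (horizontal, vertical) displacement
    ref_idx: int         # reference index


# Order in which the candidates are examined, as listed above.
MERGE_ORDER = ("A1", "B1", "B0", "A0", "B2", "Col")


def build_merge_cand_list(
    candidates: dict[str, Optional[MotionInfo]],
) -> list[MotionInfo]:
    """Collect available candidates in order, dropping exact duplicates."""
    merge_cand_list: list[MotionInfo] = []
    for name in MERGE_ORDER:
        info = candidates.get(name)
        if info is None:
            # Models availableFlag == 0: the candidate is skipped.
            continue
        if info in merge_cand_list:
            # Same motion vector and same reference index: treated as a
            # duplicate (assumption: the earlier entry is the one kept).
            continue
        merge_cand_list.append(info)
    return merge_cand_list
```

With this layout, the received merge_idx value would simply index into the returned list.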

[0135] Video decoder 30 may identify the appropriate merge candidate according to a received index. For example, video decoder 30 may receive an index identifying the merge candidate from which block 100 inherits its motion vector. In an example, merge_idx[x0][y0] may specify the merge candidate index, which identifies a picture in the merge candidate list, where x0, y0 specify the location (x0, y0) of the top-left luma sample of the prediction block relative to the top-left luma sample of the picture (or slice).

[0136] In some examples, video decoder 30 may scale the motion vector predictor before merging the motion information of the candidate block with block 100. For example, with respect to a temporal motion vector predictor, if the motion vector predictor refers to a prediction block in a reference picture that is located at a different temporal location than the prediction block referred to by block 100 (e.g., by the actual motion vector for block 100), video decoder 30 may scale the motion vector predictor. For example, video decoder 30 may scale the motion vector predictor so that it refers to the same reference picture as the reference picture for block 100. In some examples, video decoder 30 may scale the motion vector predictor according to a difference in picture order count (POC) values. That is, video decoder 30 may scale the motion vector predictor based on a difference between the POC distance between the candidate block and the prediction block referred to by the motion vector predictor, and the POC distance between block 100 and the current reference picture (e.g., the picture referred to by the actual motion vector for block 100). After selecting the appropriate motion vector predictor, video decoder 30 may merge the motion information associated with the motion vector predictor with the motion information for block 100.
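The POC-based scaling described above can be illustrated with a short sketch. The function name `scale_mv_by_poc` and its parameters are hypothetical, and the sketch assumes that the scale factor is the ratio of the two POC distances, applied with exact rational arithmetic and rounding. Practical codecs typically use fixed-point arithmetic with clipping instead.

```python
from fractions import Fraction


def scale_mv_by_poc(
    mv: tuple[int, int],
    cand_poc: int,
    cand_ref_poc: int,
    cur_poc: int,
    cur_ref_poc: int,
) -> tuple[int, int]:
    """Scale a candidate motion vector by the ratio of POC distances."""
    # POC distance covered by the candidate's own motion vector.
    td = cand_poc - cand_ref_poc
    # POC distance between the current block and its reference picture.
    tb = cur_poc - cur_ref_poc
    if td == 0:
        raise ValueError("candidate POC distance is zero; cannot scale")
    factor = Fraction(tb, td)
    # Both the horizontal and vertical components are scaled.
    return (round(mv[0] * factor), round(mv[1] * factor))
```

For instance, a candidate vector that spans four pictures is halved when the current block refers to a picture only two pictures away.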

[0137] A similar process may be implemented by video encoder 20 and video decoder 30 to perform motion vector prediction for a current block of video data. For example, video encoder 20 may initially determine which motion vectors from the motion vector predictor candidates are available for use as MVPs. Motion information from one or more of the motion vector predictor candidates may be unavailable because, for example, the candidate is intra-coded, has not yet been coded, or does not exist.

[0138] To determine which of the motion vector predictor candidates are available, video encoder 20 may analyze each of the motion vector predictor candidates in turn according to a predetermined priority-based scheme. For example, for each motion vector predictor candidate, video encoder 20 may determine whether the motion vector predictor refers to the same reference picture as the actual motion vector for block 100. If the motion vector predictor refers to the same reference picture, video encoder 20 may add the motion vector predictor candidate to the MVP candidate list. If the motion vector predictor does not refer to the same reference picture, the motion vector predictor may be scaled (e.g., scaled based on POC distances, as discussed above) before being added to the MVP candidate list.
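The priority-based scheme above might be sketched as follows. The `Candidate` type and `build_mvp_list` function are hypothetical, the input sequence is assumed to be already sorted by priority with `None` marking an unavailable candidate, and the POC-ratio scaling with rounding is a simplification rather than the arithmetic of any specific codec.

```python
from fractions import Fraction
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """Hypothetical description of one motion vector predictor candidate."""
    mv: tuple[int, int]
    poc: int      # POC of the picture containing the candidate block
    ref_poc: int  # POC of the picture the candidate's vector refers to


def build_mvp_list(
    candidates: Sequence[Optional[Candidate]],
    cur_poc: int,
    cur_ref_poc: int,
) -> list[tuple[int, int]]:
    """Add each available candidate, scaling it when its reference differs."""
    mvp_list: list[tuple[int, int]] = []
    for cand in candidates:  # assumed to be in priority order
        if cand is None:
            continue  # unavailable candidate
        if cand.ref_poc == cur_ref_poc:
            # Same reference picture: the vector is added unchanged.
            mvp_list.append(cand.mv)
            continue
        td = cand.poc - cand.ref_poc
        if td == 0:
            continue  # assumption: a zero-distance candidate is skipped
        factor = Fraction(cur_poc - cur_ref_poc, td)
        mvp_list.append((round(cand.mv[0] * factor), round(cand.mv[1] * factor)))
    return mvp_list
```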

[0139] With respect to the co-located block COL, if the co-located block includes more than one motion vector predictor (e.g., COL is predicted as a B-frame), video encoder 20 may select one of the temporal motion vector predictors according to the current list and the current reference picture (for block 100). Video encoder 20 may then add the selected temporal motion vector predictor to the motion vector predictor candidate list.

[0140] Video encoder 20 may signal that one or more motion vector predictors are available by setting enable_temporal_mvp_flag. After building the candidate list, video encoder 20 may select a motion vector from the candidates to be used as the motion vector predictor for block 100. In some examples, video encoder 20 may select the candidate motion vector according to a rate-distortion analysis.

[0141] Video encoder 20 may signal the selected motion vector predictor using an MVP index (mvp_flag), which identifies the MVP in the candidate list. For example, video encoder 20 may set mvp_l0_flag[x0][y0] to specify the motion vector predictor index of list 0, where x0, y0 specify the location (x0, y0) of the top-left luma sample of the candidate block relative to the top-left luma sample of the picture. In another example, video encoder 20 may set mvp_l1_flag[x0][y0] to specify the motion vector predictor index of list 1, where x0, y0 specify the location (x0, y0) of the top-left luma sample of the candidate block relative to the top-left luma sample of the picture. In still another example, video encoder 20 may set mvp_lc_flag[x0][y0] to specify the motion vector predictor index of list c, where x0, y0 specify the location (x0, y0) of the top-left luma sample of the candidate block relative to the top-left luma sample of the picture.

[0142] Video encoder 20 may also generate a motion vector difference (MVD) value. The MVD may constitute the difference between the selected motion vector predictor and the actual motion vector for block 100. Video encoder 20 may signal the MVD together with the MVP index.

[0143] Video decoder 30 may perform similar operations to determine a motion vector for a current block using a motion vector predictor. For example, video decoder 30 may receive an indication in a parameter set (e.g., a picture parameter set (PPS)) indicating that motion vector prediction is enabled for one or more pictures. That is, in an example, video decoder 30 may receive enable_temporal_mvp_flag in a PPS. When a particular picture refers to a PPS having enable_temporal_mvp_flag equal to zero, the reference pictures in the reference picture memory may be marked as "unused for temporal motion vector prediction."

[0144] If motion vector prediction is implemented, upon receiving block 100, video decoder 30 may construct an MVP candidate list. Video decoder 30 may use the same scheme discussed above with respect to video encoder 20 to construct the MVP candidate list. In some instances, video decoder 30 may perform motion vector scaling similar to that described above with respect to video encoder 20. For example, if a motion vector predictor does not refer to the same reference picture as block 100, the motion vector predictor may be scaled (e.g., scaled based on POC distances, as discussed above) before being added to the MVP candidate list. Video decoder 30 may identify the appropriate motion vector predictor for block 100 using the received MVP index (mvp_flag), which identifies the MVP in the candidate list. Video decoder 30 may then generate the motion vector for block 100 using the MVP and the received MVD.
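The exchange of an MVP index and an MVD between encoder and decoder can be illustrated with the following sketch. The function names are hypothetical. Two assumptions are made for illustration: the MVD is taken as the actual motion vector minus the predictor, and the encoder picks the predictor with the smallest absolute MVD as a simple stand-in for a rate-distortion analysis.

```python
def choose_mvp_and_mvd(
    actual_mv: tuple[int, int],
    mvp_list: list[tuple[int, int]],
) -> tuple[int, tuple[int, int]]:
    """Encoder side: pick an MVP index and compute the MVD to signal."""
    def cost(mvp: tuple[int, int]) -> int:
        # Stand-in for rate-distortion cost: magnitude of the residual vector.
        return abs(actual_mv[0] - mvp[0]) + abs(actual_mv[1] - mvp[1])

    mvp_idx = min(range(len(mvp_list)), key=lambda i: cost(mvp_list[i]))
    mvp = mvp_list[mvp_idx]
    # Assumed sign convention: MVD = actual motion vector - predictor.
    mvd = (actual_mv[0] - mvp[0], actual_mv[1] - mvp[1])
    return mvp_idx, mvd


def reconstruct_mv(
    mvp_list: list[tuple[int, int]],
    mvp_idx: int,
    mvd: tuple[int, int],
) -> tuple[int, int]:
    """Decoder side: rebuild the motion vector from the MVP index and MVD."""
    mvp = mvp_list[mvp_idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

Because both sides build the same candidate list, the decoder recovers the encoder's motion vector exactly from the two signaled values.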

[0145] FIG. 5 generally illustrates merge mode and motion vector prediction in a single view. It should be understood that the motion vector predictor candidate blocks shown in FIG. 5 are provided for purposes of example only; more, fewer, or different blocks may be used for predicting motion information. According to aspects of this disclosure, as described below, merge mode and motion vector prediction may also be applied when more than one view is coded (such as in MVC). In such instances, the motion vector predictors and prediction blocks may be located in different views than block 100.

[0146] FIG. 6 is a conceptual diagram illustrating the generation and scaling of a motion vector predictor in multiview coding. For example, according to aspects of this disclosure, a video coder (such as video encoder 20 or video decoder 30) may scale a disparity motion vector 120 (mv) from a disparity motion vector predictor candidate block 122 ("candidate block") to generate a motion vector predictor 124 (mv') for current block 126. Although FIG. 6 is described with respect to video decoder 30, it should be understood that the techniques of this disclosure may be carried out by a variety of other video coders, including other processors, processing units, hardware-based coding units such as encoder/decoders (CODECs), and the like.

[0147] In the example of FIG. 6, candidate block 122 spatially neighbors current block 126 in view component two (view_id 2). Candidate block 122 is inter-predicted and includes motion vector 120, which refers (or "points") to a prediction block in view component zero (view_id 0). For example, motion vector 120 has a target picture in view two (view_id 2) and a reference picture in view zero (view_id 0). Current block 126 is also inter-predicted and includes an actual motion vector (not shown) that refers to a prediction block in view component one (view_id 1). That is, for example, the actual motion vector for current block 126 has a target picture in view two (view_id 2) and a reference block in view one (view_id 1).

[0148] According to aspects of this disclosure, video decoder 30 may generate motion vector predictor 124 for current block 126 using a scaled version of motion vector 120. For example, video decoder 30 may scale motion vector 120 based on a difference in view distances between motion vector 120 and the actual motion vector for current block 126. That is, video decoder 30 may scale motion vector 120 based on a difference between the camera location of the camera used to capture the prediction block (in the reference picture) for candidate block 122 and that of the camera used to capture the prediction block (in the reference picture) for current block 126. Accordingly, video decoder 30 may scale disparity motion vector 120 (e.g., the motion vector being used for prediction) according to a difference between the view component referred to by motion vector 120 for candidate block 122 and the view component referred to by the actual motion vector for current block 126.

[0149] In an example, video decoder 30 may generate the scaled motion vector predictor for the current block according to equation (1) shown below:

$mv' = mv \left( \frac{\mathrm{ViewDistance}(mv')}{\mathrm{ViewDistance}(mv)} \right)$, (1)

where ViewDistance(mv) is equal to the difference between the view ID of the reference picture of motion vector 120 (e.g., ViewId(RefPic(mv))) and the view ID of the target picture of motion vector 120 (e.g., ViewId(TargetPic(mv))), and ViewDistance(mv') is equal to the difference between the view ID of the reference picture of motion vector predictor 124 (e.g., ViewId(RefPic(mv'))) and the view ID of the target picture of motion vector predictor 124 (e.g., ViewId(TargetPic(mv'))). Accordingly, in this example, the reference picture of motion vector predictor 124, RefPic(mv'), belongs to the new target view, and the target picture of motion vector predictor 124, TargetPic(mv'), belongs to the current view. Similarly, the reference picture of motion vector 120, RefPic(mv), belongs to the view that the candidate motion vector points to, and the target picture of motion vector 120, TargetPic(mv), belongs to the current view. Accordingly, video decoder 30 may generate the scaled motion vector predictor according to equation (2) below:

$mv' = mv \left( \frac{\mathrm{ViewID}(NewTarget) - \mathrm{ViewID}(Current)}{\mathrm{ViewID}(Candidate) - \mathrm{ViewID}(Current)} \right)$, (2)

where mv' represents the scaled motion vector predictor for the current block, mv represents the motion vector for the candidate block, ViewID(NewTarget) is the view component referred to by the actual motion vector for the current block, ViewID(Current) is the view component of the current block, and ViewID(Candidate) is the view component of the candidate block.

[0150] When equation (2) is applied to the example of FIG. 6, mv' represents the scaled motion vector predictor for current block 126, mv represents motion vector 120, ViewID(NewTarget) is the view component referred to by motion vector 124, ViewID(Current) is the view component of current block 126, and ViewID(Candidate) is the view component of candidate block 122. Accordingly, in the example shown in FIG. 6, motion vector predictor 124 is motion vector 120 scaled by a factor of one-half (e.g., $mv' = mv \left(\frac{1 - 2}{0 - 2}\right)$). In other words, video decoder 30 may scale both the horizontal displacement component and the vertical displacement component of motion vector 120 by a factor of one-half to generate motion vector predictor 124 for current block 126.
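The one-half scaling in the FIG. 6 example can be checked with a minimal sketch. The function name `scale_motion_vector` and the displacement values of `mv_120` are hypothetical and chosen only for illustration; the text supplies only the view-ID ratio (1 - 2) / (0 - 2).

```python
from fractions import Fraction


def scale_motion_vector(mv, numerator_distance, denominator_distance):
    """Scale both displacement components of mv by the ratio of two view distances."""
    if denominator_distance == 0:
        raise ValueError("denominator view distance must be non-zero")
    factor = Fraction(numerator_distance, denominator_distance)
    # The same factor is applied to the horizontal and the vertical component.
    return (mv[0] * factor, mv[1] * factor)


# Hypothetical displacement for motion vector 120 (not given in the text).
mv_120 = (8, -4)

# View-ID ratio quoted for FIG. 6: (1 - 2) / (0 - 2) = 1/2.
mv_124 = scale_motion_vector(mv_120, 1 - 2, 0 - 2)
```

Exact fractions are used here only to keep the arithmetic transparent; an actual codec would work with fixed-point integers and a rounding rule that this passage does not specify.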

[0151] The motion vector scaling described with respect to FIG. 6 may be performed for both merge mode and motion vector prediction. That is, for example, video decoder 30 may scale motion vector 120 before merging motion vector 120 with the motion information for current block 126. In another example, video decoder 30 may scale motion vector 120 before calculating a motion vector difference (MVD) value according to the difference between motion vector predictor 124 and the actual motion vector for current block 126.
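As a rough illustration of the MVD relationship mentioned above, the sketch below computes the difference between an actual motion vector and a scaled predictor, then recovers the actual vector from the predictor and the difference. The function names and all numeric values are hypothetical, and the reconstruction step is the ordinary inverse of the subtraction rather than something this paragraph spells out.

```python
def motion_vector_difference(actual_mv, predictor):
    """Component-wise difference between the actual motion vector and its predictor."""
    return (actual_mv[0] - predictor[0], actual_mv[1] - predictor[1])


def reconstruct_motion_vector(predictor, mvd):
    """Inverse operation: add the difference back onto the predictor."""
    return (predictor[0] + mvd[0], predictor[1] + mvd[1])


# Hypothetical values: a predictor that has already been scaled, and an actual vector.
predictor_124 = (4, -2)
actual_mv = (5, -2)

mvd = motion_vector_difference(actual_mv, predictor_124)
recovered = reconstruct_motion_vector(predictor_124, mvd)
```

The closer the scaled predictor is to the actual vector, the smaller the components of `mvd` become.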

[0152] As shown in the example of FIG. 6, candidate block 122 and current block 126 may be located in the same view component. In other examples, however, as described in greater detail with respect to FIGS. 7 and 8, the candidate block may be located in a different view component than the current block.

[0153] FIG. 7 is another conceptual diagram illustrating the generation and scaling of a motion vector predictor. For example, according to aspects of this disclosure, a video coder (such as video encoder 20 or video decoder 30) may scale a disparity motion vector 130 (mv) from a disparity motion vector predictor candidate block 132 (x', y') to generate motion vector predictor 134 (mv') for current block 136 (x, y), where candidate block 132 belongs to a different view component than current block 136. Accordingly, the process shown and described with respect to FIG. 7 may generally be referred to as "inter-view disparity motion vector prediction". Although FIG. 7 is described with respect to video decoder 30, it should be understood that the techniques of this disclosure may be carried out by a variety of other video coders, including other processors, processing units, and hardware-based coding units such as encoder/decoders (codecs).

[0154] In the example shown in FIG. 7, candidate block 132 is located in view component one (view_id 1). Candidate block 132 is inter predicted and includes motion vector 130 (mv), which refers to a predictive block in view component zero (view_id 0). For example, motion vector 130 has a target picture in view one (view_id 1) and a reference picture in view zero (view_id 0). Current block 136 is co-located with candidate block 132 and is located in view component two (view_id 2). As described in greater detail below, in some examples current block 136 may include an actual motion vector (not shown) that identifies a block in a first reference view (view_id 1). That is, for example, the actual motion vector for current block 136 has a target picture in view two (view_id 2) and may have a reference block in view one (view_id 1). In other examples, the current block may include an actual motion vector that identifies a block in a second reference view (view_id 0). That is, for example, the actual motion vector for current block 136 has a target picture in view two (view_id 2) and may have a reference block in view zero (view_id 0). Accordingly, motion vector predictor 134 (mv') may refer to a block in the first reference view (view_id 1). In another example, a second motion vector predictor 138 (mv'') may refer to a block in the second reference view (view_id 0).

[0155] In some examples, second motion vector predictor 138 may not be available for motion vector prediction. For example, second motion vector predictor 138 may be generated only if the predictive block in the second reference view is available for direct inter-view prediction. The availability of the predictive block in the second reference view may be indicated, for example, in a parameter set (such as a sequence parameter set (SPS) or a picture parameter set (PPS)) or in the slice header associated with current block 136.

[0156] According to aspects of this disclosure, the video decoder may perform inter-view disparity motion vector prediction using merge mode or using motion vector prediction. With respect to merge mode, video decoder 30 may initially select a "target view" for current block 136. In general, the target view includes the predictive block for current block 136. In some examples, the target view may be the first reference view (shown in FIG. 7 as view_id 1). In other examples, the target view may be the second reference view (shown in FIG. 7 as view_id 0). As noted above, however, in some examples the second reference view may be used as the target view only if the predictive block in the second reference view is available for inter-view prediction.

[0157] In some examples, video decoder 30 may select the first reference view as the target view. In other examples, video decoder 30 may select the second reference view, when available, as the target view. The selection of the target view may be determined, for example, based on the availability of the predictive block and/or a predetermined selection algorithm. The reference index (ref_idx) of current block 136 corresponds to the index of the picture containing the predictive block of the target view, which is added to the reference picture list of current block 136.

[0158] After selecting the target view, video decoder 30 may locate candidate block 132. In an example for purposes of illustration, assume that the top-left luma sample of current block 136 is located in a picture (or slice) at coordinates (x, y). Video decoder 30 may determine the co-located coordinates in view component one for candidate block 132. In addition, in some examples, video decoder 30 may adjust the coordinates based on the disparity between the view component of current block 136 (view component two) and the view component of candidate block 132 (view component one). Accordingly, video decoder 30 may determine the coordinates for candidate block 132 as (x', y'), where (x', y') = (x, y) + disparity. In some examples, the disparity may be included and/or calculated in an SPS, a PPS, a slice header, CU syntax, and/or PU syntax.
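The coordinate adjustment (x', y') = (x, y) + disparity can be sketched as follows. The function name `locate_candidate` and the numeric values are hypothetical; the text does not give concrete coordinates or say how the disparity value is derived from the signalled syntax.

```python
def locate_candidate(x, y, disparity=(0, 0)):
    """Return (x', y') = (x, y) + disparity for the candidate block.

    With the default zero disparity this reduces to the co-located position.
    """
    dx, dy = disparity
    return (x + dx, y + dy)


# Hypothetical position of the top-left luma sample of the current block.
current_xy = (64, 32)

# Co-located position, with no disparity adjustment applied.
co_located = locate_candidate(*current_xy)

# Position adjusted by a hypothetical, purely horizontal disparity.
adjusted = locate_candidate(*current_xy, disparity=(-6, 0))
```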

[0159] After locating candidate block 132, video decoder 30 may scale motion vector 130 for candidate block 132 based on the difference in view distances between motion vector 130 and the actual motion vector for current block 136. That is, video decoder 30 may scale motion vector 130 based on the difference in camera location between the camera used to capture the predictive block for candidate block 132 and the camera used to capture the predictive block for current block 136 (e.g., the predictive block in the target view). In other words, video decoder 30 may scale disparity motion vector 130 (e.g., the motion vector being used for prediction) according to the difference between the view component referred to by motion vector 130 for candidate block 132 and the view component of the target view.

[0160] In an example, video decoder 30 may generate a scaled motion vector predictor for the current block according to equation (3), shown below:

$mv' = mv \left(\frac{\mathrm{ViewID}(\mathrm{Target}) - \mathrm{ViewID}(\mathrm{Current})}{\mathrm{ViewID}(\mathrm{SecondReference}) - \mathrm{ViewID}(\mathrm{Reference})}\right)$, (3)

where mv' represents the scaled motion vector predictor for the current block, mv represents the motion vector of the candidate block, ViewID(Target) is the view component of the selected target view, ViewID(Current) is the view component of the current block, ViewID(SecondReference) is the view component of the second reference view (if available), and ViewID(Reference) is the view component of the first reference view. In some examples, ViewID(Target) minus ViewID(Current) may be referred to as the "view distance" of motion vector predictor 134, while ViewID(SecondReference) minus ViewID(Reference) may be referred to as the "view distance" of motion vector 130. That is, the view distance of motion vector predictor 134 is the difference between the target picture (view_id 1) and the reference picture (view_id 2) of motion vector predictor 134, while the view distance of motion vector 130 is the difference between the target picture (view_id 0) and the reference picture (view_id 1) of motion vector 130.

[0161] When equation (3) is applied to the example of FIG. 7, mv' represents either scaled motion vector predictor 134 or scaled motion vector predictor 138, depending on which view component is selected as the target view. For example, if the first reference view (view_id 1) is selected as the target view, mv' represents scaled motion vector predictor 134, mv represents motion vector 130, ViewID(Target) is the view component referred to by motion vector predictor 134, ViewID(Current) is the view component of current block 136, ViewID(SecondReference) is the view component of the second reference view (view_id 0), and ViewID(Reference) is the view component of the first reference view (view_id 1). Accordingly, in the example shown in FIG. 7, motion vector predictor 134 is motion vector 130 scaled by a factor of one (e.g., $mv' = mv \left(\frac{1 - 2}{0 - 1}\right)$). In other words, the horizontal displacement component and the vertical displacement component of motion vector 130 may be identical to the horizontal displacement component and the vertical displacement component of motion vector predictor 134.

[0162] Alternatively, if the second reference view (view_id 0) is selected as the target view, mv' represents scaled motion vector predictor 138, mv represents motion vector 130, ViewID(Target) is the view component referred to by motion vector predictor 138, ViewID(Current) is the view component of current block 136, ViewID(SecondReference) is the view component of the second reference view (view_id 0), and ViewID(Reference) is the view component of the first reference view (view_id 1). Accordingly, in the example shown in FIG. 7, motion vector predictor 138 is motion vector 130 scaled by a factor of two (e.g., $mv' = mv \left(\frac{0 - 2}{0 - 1}\right)$). In other words, video decoder 30 may scale both the horizontal displacement component and the vertical displacement component of motion vector 130 by a factor of two to generate motion vector predictor 138 for current block 136.
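The two target-view choices for FIG. 7 can be worked through with a small sketch of the view-ID ratio: (target minus current) divided by (second reference minus reference). The function name `scale_by_view_distance` and the displacement values of `mv_130` are hypothetical; only the view IDs come from the text.

```python
from fractions import Fraction


def scale_by_view_distance(mv, target, current, second_reference, reference):
    """Scale mv by (target - current) / (second_reference - reference)."""
    candidate_distance = second_reference - reference
    if candidate_distance == 0:
        raise ValueError("view distance of the candidate vector must be non-zero")
    factor = Fraction(target - current, candidate_distance)
    # Both displacement components are scaled by the same factor.
    return tuple(component * factor for component in mv)


# Hypothetical displacement for motion vector 130 (not given in the text).
mv_130 = (6, 2)

# First reference view (view_id 1) as target: (1 - 2) / (0 - 1) = 1.
mv_134 = scale_by_view_distance(mv_130, target=1, current=2,
                                second_reference=0, reference=1)

# Second reference view (view_id 0) as target: (0 - 2) / (0 - 1) = 2.
mv_138 = scale_by_view_distance(mv_130, target=0, current=2,
                                second_reference=0, reference=1)
```

With these view IDs the first choice leaves the vector unchanged and the second doubles it, matching the factors of one and two stated above.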

[0163] According to aspects of this disclosure, video decoder 30 may perform similar steps when performing motion vector prediction (e.g., generating an MVP). For example, video decoder 30 may select a target view, which may be the first reference view (view_id 1) or the second reference view (view_id 0). However, if the reference picture of the view component containing the predictive block for the current block is not available for inter-view prediction, the corresponding predictor cannot be used. Accordingly, the selection of the target view may be determined, for example, based on the availability of the predictive block and/or a predetermined selection algorithm.

[0164] If a predictive block for current block 136 is not available for direct inter-view prediction in either the first reference view (view_id 1) or the second reference view (view_id 0), video decoder 30 may not perform motion vector prediction. If at least one predictive block is available, video decoder 30 may select the reference view that includes the predictive block associated with the actual motion vector for current block 136.
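The availability check described above can be sketched as a simple selection routine. The function name `select_target_view`, the availability dictionary, and the preference order (first reference view before second) are assumptions for illustration; the text leaves the "predetermined selection algorithm" unspecified.

```python
def select_target_view(available, preference=(1, 0)):
    """Return the first view_id in `preference` whose predictive block is available.

    `available` maps view_id -> bool. Returns None when no reference view is
    usable, in which case motion vector prediction is skipped.
    The preference order is an assumption, not taken from the source.
    """
    for view_id in preference:
        if available.get(view_id, False):
            return view_id
    return None


# Only the second reference view (view_id 0) is usable.
target_a = select_target_view({1: False, 0: True})

# Neither reference view is usable: no target view, no prediction.
target_b = select_target_view({1: False, 0: False})

# Both usable: the assumed preference order picks view_id 1.
target_c = select_target_view({1: True, 0: True})
```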

[0165] After selecting the target view, video decoder 30 may then repeat the steps described above with respect to merge mode. For example, video decoder 30 may locate candidate block 132. That is, video decoder 30 may determine the co-located coordinates in view component one for candidate block 132. In addition, in some examples, video decoder 30 may adjust the coordinates based on the disparity between the view component of current block 136 (view component two) and the view component of candidate block 132 (view component one).

[0166] In addition, after locating candidate block 132, video decoder 30 may scale motion vector 130 for candidate block 132 based on the difference in camera location between the camera used to capture the predictive block for candidate block 132 and the camera used to capture the predictive block for current block 136 (e.g., the predictive block in the target view). That is, video decoder 30 may scale disparity motion vector 130 (e.g., the motion vector being used for prediction) according to the difference between the view component referred to by motion vector 130 for candidate block 132 and the view component of the target view. In some examples, video decoder 30 may perform motion vector predictor scaling using equation (2) above. In other examples, as described with respect to FIG. 8 below, motion vector predictor scaling may be extended to other views.

[0167] Video decoder 30 may add candidate block 132 to a candidate list when performing merge mode and/or motion vector prediction (described, for example, with respect to FIG. 5 above). According to aspects of this disclosure, the candidate block may be added to the motion vector predictor candidate list (e.g., for either merge mode or motion vector prediction with an MVP) in a variety of ways. For example, video decoder 30 may construct the candidate list by locating merge mode candidates according to the following scheme:

2. V, if availableFlagV is equal to 1

3. B₁, if availableFlagB₁ is equal to 1

4. B₀, if availableFlagB₀ is equal to 1

5. A₀, if availableFlagA₀ is equal to 1

6. B₂, if availableFlagB₂ is equal to 1

7. Col, if availableFlagCol is equal to 1

where V represents candidate block 132. In other examples, candidate block 132 may be located and added to the candidate list at any other position in the candidate list.
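The ordering scheme above can be sketched as a filter over a fixed candidate order. The names `MERGE_ORDER` and `build_candidate_list`, and the sample flag and candidate values, are hypothetical. Only the entries shown in the scheme (items 2 through 7) are modelled, and ASCII names such as `B1` stand in for the subscripted labels.

```python
# Candidate order as listed in the scheme above (items 2 through 7 only).
MERGE_ORDER = ("V", "B1", "B0", "A0", "B2", "Col")


def build_candidate_list(available_flags, candidates):
    """Keep the candidates whose availability flag equals 1, in scheme order."""
    return [
        candidates[name]
        for name in MERGE_ORDER
        if available_flags.get(name, 0) == 1 and name in candidates
    ]


# Hypothetical availability flags and motion vectors for each position.
flags = {"V": 1, "B1": 0, "B0": 1, "A0": 0, "B2": 0, "Col": 1}
motion_vectors = {
    "V": (6, 2), "B1": (1, 1), "B0": (3, 0),
    "A0": (0, 0), "B2": (2, 2), "Col": (5, -1),
}

merge_list = build_candidate_list(flags, motion_vectors)
```

Placing V at a different position, as the text allows, would only require reordering `MERGE_ORDER`.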

[0168] FIG. 8 is another conceptual diagram illustrating the generation and scaling of a motion vector predictor, according to aspects of this disclosure. For example, according to aspects of this disclosure, a video coder (such as video encoder 20 or video decoder 30) may scale a disparity motion vector 140 (mv) from a disparity motion vector predictor candidate block 142 to generate motion vector predictor 144 (mv') for current block 146, where candidate block 142 belongs to a different view component than current block 146. Although FIG. 8 is described with respect to video decoder 30, it should be understood that the techniques of this disclosure may be carried out by a variety of other video coders, including other processors, processing units, and hardware-based coding units such as encoder/decoders (codecs).

[0169] The example shown in FIG. 8 extends the motion vector prediction shown and described with respect to FIG. 7 to an environment that includes more than three views. For example, as shown in FIG. 8, candidate block 142 is located in view component two (view_id 2). Candidate block 142 is inter predicted and includes motion vector 140 (mv), which refers to a predictive block in view component one (view_id 1). For example, motion vector 140 has a target picture in view two (view_id 2) and a reference picture in view one (view_id 1). Current block 146 is co-located with candidate block 142 and is located in view component three (view_id 3).

[0170] According to aspects of this disclosure, video decoder 30 may select view component zero (view_id 0) as the target view for current block 146. For example, the target view generally includes the predictive block for the current block. If the picture containing the predictive block is an inter-view reference picture, and the predictive block for current block 146 is located in a third reference view (view_id 0), video decoder 30 may select the third reference view as the target view.

[0171] After selecting the target view, video decoder 30 may locate candidate block 142. For example, assuming that the top-left luma sample of current block 146 is located in a picture (or slice) at coordinates (x, y) in view component three, video decoder 30 may determine the co-located coordinates in view component two for candidate block 142. In addition, as noted above, video decoder 30 may adjust the coordinates based on the disparity between the view component of current block 146 (view component three) and the view component of candidate block 142 (view component two).

[0172] After locating candidate block 142, video decoder 30 may scale motion vector 140 for candidate block 142 based on the difference in view distances between motion vector 140 and the actual motion vector for current block 146. That is, video decoder 30 may scale motion vector 140 based on the difference in location of the cameras used to capture the predictive block for candidate block 142 and the predictive block for current block 146 (e.g., the predictive block in the target view). In other words, video decoder 30 may scale disparity motion vector 140 (e.g., the motion vector being used for prediction) according to the difference between the view component referred to by motion vector 140 for candidate block 142 and the view component of the target view (view_id 0).

[0173] In an example, video decoder 30 may generate a scaled motion vector predictor for the current block according to equation (4) shown below:

$mv' = mv \left( \frac{ViewID(Third) - ViewID(Current)}{ViewID(SecondReference) - ViewID(Reference)} \right)$, (4)

where mv' represents the scaled motion vector predictor for the current block, mv represents the motion vector for the candidate block, ViewID(Third) is the view component of the third reference view, ViewID(Current) is the view component of the current block, ViewID(SecondReference) is the view component of the second reference view (if available), and ViewID(Reference) is the view component of the first reference view. In some examples, ViewID(Third) minus ViewID(Current) may be referred to as the "view distance" of motion vector predictor 144, while ViewID(SecondReference) minus ViewID(Reference) may be referred to as the "view distance" of motion vector 140. That is, the view distance of motion vector predictor 144 is the difference between the target picture (view_id 0) and the reference picture (view_id 3) of motion vector predictor 144, while the view distance of motion vector 140 is the difference between the target picture (view_id 1) and the reference picture (view_id 2) of motion vector 140.

[0174] Applying equation (4) to the example in FIG. 8, mv' represents scaled motion vector predictor 144 and mv represents motion vector 140. ViewID(Third) is the third reference view (view_id 0), ViewID(Current) is the view component of current block 146 (view_id 3), ViewID(SecondReference) is the view component of the second reference view (view_id 1), and ViewID(Reference) is the view component of the first reference view (view_id 2). Accordingly, in the example shown in FIG. 8, motion vector predictor 144 is motion vector 140 scaled by a factor of three (e.g., $mv' = mv \left( \frac{0 - 3}{1 - 2} \right)$). That is, video decoder 30 may scale the horizontal displacement component and the vertical displacement component of motion vector 140 by three to generate motion vector predictor 144.

[0175] Although FIGS. 7-8 provide examples of inter-view disparity motion vector prediction, it should be understood that such examples are provided merely for purposes of illustration. That is, the techniques for disparity motion vector prediction may be applied to more or fewer views than those shown. Additionally or alternatively, the techniques for disparity motion vector prediction may be applied in situations in which the views have different view identifiers.

[0176] FIG. 9 is a flowchart illustrating an example method of coding prediction information for a block of video data. The example shown in FIG. 9 is generally described as being performed by a video coder. It should be understood that, in some examples, the method of FIG. 9 may be performed by video encoder 20 (FIGS. 1 and 2) or video decoder 30 (FIGS. 1 and 3), described above. In other examples, the method of FIG. 9 may be performed by a variety of other processors, processing units, hardware-based coding units such as encoders/decoders (codecs), and the like.

[0177] According to the example method shown in FIG. 9, a video coder may identify a first block of video data in a first view, where the first block of video data is associated with a first disparity motion vector (160). For example, the motion vector for the first block of video data may be a disparity motion vector that identifies a reference block in another view component. The video coder may then determine whether a second motion vector associated with a second block of video data is a disparity motion vector (162).

[0178] If the second motion vector is not a disparity motion vector (the "No" branch of step 162), the video coder may identify a different motion vector predictor candidate (164). That is, according to some aspects of this disclosure, the ability to use a disparity motion vector (e.g., the first motion vector) to predict a temporal motion vector (e.g., the second motion vector, when the second motion vector is a temporal motion vector) may be disabled. In such instances, the video coder may identify the first motion vector as unavailable for purposes of motion vector prediction.

[0179] If the second motion vector is a disparity motion vector (the "Yes" branch of step 162), the video coder may scale the first motion vector to generate a motion vector predictor for the second motion vector (166). For example, according to aspects of this disclosure, the video coder may scale the first motion vector to generate the disparity motion vector predictor based on differences in the view distances associated with the first disparity motion vector and the second motion vector. That is, in some examples, the video coder may scale the motion vector predictor for the second block based on camera locations. For example, the video coder may scale the second motion vector according to a difference in view identifiers, as shown and described with respect to FIGS. 6-8.

[0180] The video coder may then code prediction data for the second block using the scaled motion vector predictor (168). For example, the video coder may code the prediction data for the second block using merge mode or using motion vector prediction. For merge mode, the video coder may directly code the prediction data for the second block using the scaled second motion vector predictor. For motion vector prediction, the video coder may code the prediction data for the second block by generating a motion vector difference (MVD). The MVD may include the difference between the first motion vector and the scaled second motion vector.

[0181] It should also be understood that the steps shown and described with respect to FIG. 9 are provided as merely one example. That is, the steps of the method of FIG. 9 need not necessarily be performed in the order shown in FIG. 9, and fewer, additional, or alternative steps may be performed.
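The decision flow of FIG. 9 (steps 162 to 168) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the scaling is assumed to be the ratio of view distances in the manner of the FIG. 6-8 examples, and the MVD is assumed to follow the conventional definition (the block's actual motion vector minus its predictor).

```python
from fractions import Fraction


def predict_second_mv(first_mv, first_view_distance,
                      second_is_disparity, second_view_distance):
    """Return a scaled predictor for the second motion vector, or None."""
    if not second_is_disparity:
        # "No" branch of step 162: the first (disparity) vector is treated
        # as unavailable, so a different candidate must be identified (164).
        return None
    if first_view_distance == 0:
        raise ValueError("first motion vector has zero view distance")
    # Step 166: scale by the ratio of view distances (assumed form).
    factor = Fraction(second_view_distance, first_view_distance)
    return (first_mv[0] * factor, first_mv[1] * factor)


def motion_vector_difference(actual_mv, predictor):
    """Assumed conventional MVD: actual motion vector minus its predictor."""
    return (actual_mv[0] - predictor[0], actual_mv[1] - predictor[1])


# Hypothetical values: view distances -1 and -3 give a scale factor of 3.
predictor = predict_second_mv((4, -2), -1, True, -3)
mvd = motion_vector_difference((13, -6), predictor)
```

In merge mode the predictor would be used directly; in motion vector prediction only the small difference `mvd` would need to be signaled.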

[0182] FIG. 10 is a conceptual diagram illustrating the generation of a motion vector predictor from a block in a different view than the current block. For example, according to aspects of this disclosure, a video coder (such as video encoder 20 or video decoder 30) may use temporal motion vector 180 (mv) from temporal motion vector predictor candidate block 182 to generate motion vector predictor 184 (mv') for current block 186, where candidate block 182 belongs to a different view component than current block 186. While FIG. 10 is described with respect to video decoder 30, it should be understood that the techniques of this disclosure may be carried out by a variety of other video coders, including other processors, processing units, hardware-based coding units such as encoders/decoders (codecs), and the like.

[0183] As shown in FIG. 10, current block 186 is located in view component one (view_id 1). Candidate block 182 is located in view component zero (view_id 0). Candidate block 182 is temporally predicted and includes motion vector 180 (mv), which refers to a predictive block at a different temporal location within the same view component. That is, in the example shown in FIG. 10, motion vector 180 identifies a predictive block in a picture having a reference index equal to variable i (ref_idx = i).

[0184] Assume that the top-left luma sample of current block 186 is located in a picture (or slice) at coordinates (x, y). Video decoder 30 may locate candidate block 182 by determining the co-located coordinates in view component zero for candidate block 182. In some examples, video decoder 30 may adjust the coordinates of candidate block 182 based on the disparity between the view component of current block 186 (view_id 1) and the view component of candidate block 182 (view_id 0). Accordingly, video decoder 30 may determine the coordinates for candidate block 182 as (x', y'), where (x', y') = (x, y) + disparity. In some examples, the disparity may be included and/or calculated in an SPS, a PPS, a slice header, CU syntax, and/or PU syntax.
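The coordinate adjustment (x', y') = (x, y) + disparity amounts to a component-wise offset. A minimal sketch, assuming the disparity is available as a two-component integer offset (the function name and sample values are hypothetical):

```python
def locate_candidate(x, y, disparity):
    """Return (x', y') = (x, y) + disparity for the candidate block."""
    dx, dy = disparity  # assumed (horizontal, vertical) offset in samples
    return (x + dx, y + dy)


# Hypothetical values: a purely horizontal disparity of -8 samples.
candidate_xy = locate_candidate(64, 32, (-8, 0))
```

How the disparity value is obtained (for example from an SPS, a PPS, or a slice header) is outside the scope of this sketch.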

[0185] According to aspects of this disclosure, video decoder 30 may then re-map the reference index of motion vector 180 being used for purposes of prediction. In general, as noted above, the data for a motion vector includes a reference picture list, an index into the reference picture list (referred to as "ref_idx"), a horizontal component, and a vertical component. In HEVC, there may be two regular reference picture lists (e.g., list 0 and list 1) and a combined reference picture list (e.g., list c). Without loss of generality, assume that the current reference picture list is list t (which may correspond to any of list 0, list 1, or list c). According to the example shown in FIG. 10, motion vector 180 for candidate block 182 may identify a predictive block in a picture located in view component zero (view_id 0) having a POC value of two and a ref_idx equal to i. According to aspects of this disclosure, video decoder 30 may identify a co-located predictive block for current block 186 at the same time instance as that of current block 186. That is, the predictive block for candidate block 182 and the predictive block for current block 186 have the same temporal location, but are located in pictures of two different views.

[0186] In an example, if the identified predictive block for current block 186 corresponds to the j-th reference picture in reference picture list t for the current picture, video decoder 30 may predict the reference index (ref_idx) for current block 186 as j, and video decoder 30 may set motion vector predictor 184 to the same value as motion vector 180. Accordingly, video decoder 30 effectively re-maps the reference index for current block 186 from ref_idx i to ref_idx j. That is, video decoder 30 determines that motion vector predictor 184 for current block 186 has the same reference picture list, horizontal component, and vertical component as candidate block 182; however, motion vector predictor 184 refers to the j-th reference picture in the reference picture list, rather than the i-th reference picture in the reference picture list.
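The re-mapping from ref_idx i to ref_idx j can be sketched as a lookup in the current picture's reference picture list. The data model below is an assumption made for illustration: each reference picture is represented by a (view_id, poc) pair, and the picture sought for the current block has the same POC as the candidate's reference picture but lies in the current block's view. All names and sample values are hypothetical.

```python
from typing import NamedTuple


class Picture(NamedTuple):
    view_id: int
    poc: int


class MotionVector(NamedTuple):
    ref_list: str  # e.g. "t"
    ref_idx: int
    dx: int        # horizontal component
    dy: int        # vertical component


def remap_reference_index(candidate_mv, candidate_ref_poc,
                          current_view_id, current_ref_list):
    """Return the predictor with ref_idx re-mapped to j, or None if absent."""
    # Same temporal location as the candidate's reference picture, but in
    # the view of the current block.
    target = Picture(current_view_id, candidate_ref_poc)
    for j, picture in enumerate(current_ref_list):
        if picture == target:
            # Same list and components as the candidate; only ref_idx changes.
            return candidate_mv._replace(ref_idx=j)
    return None


list_t = [Picture(1, 4), Picture(1, 2), Picture(0, 3)]  # hypothetical list t
mv_180 = MotionVector("t", 0, 5, -1)                     # ref_idx i = 0
mv_184 = remap_reference_index(mv_180, 2, 1, list_t)
```

Here the picture with POC two in view one sits at index 1 of the list, so the predictor keeps the components (5, -1) and carries ref_idx j = 1.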

[0187] According to aspects of this disclosure, in some examples, the video decoder may also scale motion vector predictor 184. For example, if the picture containing the identified predictive block for current block 186 is not included in reference picture list t, video decoder 30 may identify a second picture that is the closest one in reference picture list t. In some examples, if two pictures have identical distances to the picture containing the identified predictive block for current block 186, video decoder 30 may select the picture that is closer to the picture containing current block 186 as the second picture. For purposes of explanation, assume that the identified picture has a reference index of k. In this example, video decoder 30 may then predict the reference index of motion vector predictor 184 as k, and video decoder 30 may scale motion vector predictor 184 based on a difference in picture order count (POC). That is, video decoder 30 may scale motion vector predictor 184 based on the difference between the distance between current block 186 and the picture at reference index j, and the distance between current block 186 and the picture at reference index k.
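The selection of the fallback reference index k, including the tie-break, can be sketched as below. The sketch assumes that "closest" is measured as an absolute POC difference and that list t is given as POC values in index order; the function name and sample values are hypothetical.

```python
def closest_reference_index(ref_pocs, target_poc, current_poc):
    """Pick index k of the list-t picture nearest the missing target picture.

    Ties on distance to the target are broken in favor of the picture
    closer to the picture containing the current block.
    """
    if not ref_pocs:
        raise ValueError("reference picture list is empty")
    return min(
        range(len(ref_pocs)),
        key=lambda k: (abs(ref_pocs[k] - target_poc),    # distance to target
                       abs(ref_pocs[k] - current_poc)),  # tie-break
    )


# Hypothetical list t with POCs 0, 4, 8; the wanted picture (POC 2) is absent.
k = closest_reference_index([0, 4, 8], target_poc=2, current_poc=5)
```

POCs 0 and 4 are equally far from POC 2, and POC 4 is closer to the current picture (POC 5), so index 1 is chosen. The POC-based scaling that follows is left out because the passage does not give its formula.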

[0188] According to some examples, video decoder 30 may perform the same process when performing motion vector prediction. However, after determining motion vector predictor 184, video decoder 30 may generate the motion vector for current block 186 using an MVD. Motion vector prediction may use the same process. In another example, with respect to motion vector prediction, if the predictive block for current block 186 cannot be located (identified above as being located at reference index j), video decoder 30 may not perform merge mode or motion vector prediction for current block 186. That is, rather than scaling motion vector predictor 184, video decoder 30 may consider motion vector predictor 184 unavailable.

[0189] Video decoder 30 may add candidate block 182 to a candidate list for performing merge mode and/or motion vector prediction (described, for example, with respect to FIG. 5 above). According to aspects of this disclosure, candidate block 182 may be added to the motion vector predictor candidate list (e.g., for either merge mode or motion vector prediction with an MVP) in a variety of ways. For example, video decoder 30 may construct the candidate list by locating candidates according to the following scheme:

2. V, if availableFlagV is equal to 1

7. Col, if availableFlagCol is equal to 1

where V represents candidate block 182. In other examples, candidate block 182 may be located and added to the candidate list at any other position of the candidate list.
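The availability-gated ordering above can be sketched as a filter over an ordered sequence of slots. Only the V and Col slots and their flags appear in the scheme as given here, so the sketch includes only those; the function name, the tuple layout, and the placeholder candidate values are assumptions.

```python
def build_candidate_list(ordered_slots):
    """Keep, in order, the candidates whose availability flag equals 1.

    ordered_slots: iterable of (name, available_flag, candidate) tuples.
    """
    return [(name, candidate)
            for name, available_flag, candidate in ordered_slots
            if available_flag == 1]


# Placeholder candidates: V stands for candidate block 182, Col for the
# co-located temporal candidate. Here Col is marked unavailable.
slots = [
    ("V", 1, {"ref_idx": 1, "mv": (5, -1)}),
    ("Col", 0, None),
]
candidates = build_candidate_list(slots)
```

Because the slots are processed in order, moving V to another position in `ordered_slots` changes its position in the resulting list, which corresponds to adding candidate block 182 at any other position of the candidate list.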

[0190] FIG. 11 is a flowchart illustrating an example method of generating a motion vector predictor. The example shown in FIG. 11 is generally described as being performed by a video coder. It should be understood that, in some examples, the method of FIG. 11 may be performed by video encoder 20 (FIGS. 1 and 2) or video decoder 30 (FIGS. 1 and 3), described above. In other examples, the method of FIG. 11 may be performed by a variety of other processors, processing units, hardware-based coding units such as encoders/decoders (codecs), and the like.

[0191] According to the example shown in FIG. 11, a video coder may identify a first block of video data at a first temporal location of a first view, where the first block is associated with a first temporal motion vector (202). According to aspects of this disclosure, when a second motion vector associated with a second block of video data is a temporal motion vector and the second block is from a second, different view than the first block (the "Yes" branch of step 204), the video coder may determine a motion vector predictor based on the first motion vector (206). That is, for example, the video coder may determine a motion vector predictor for predicting the second motion vector from the first motion vector. The video coder may also code prediction data for the second block using the motion vector predictor (208). For example, the video coder may use the motion vector predictor in merge mode or to generate an MVD value.

[0192] If the second motion vector is not a temporal motion vector and/or the second block of video data is not from a different view than the first block of video data ("No" branch of step 204), the video coder may determine whether the second motion vector is a motion disparity vector (210). According to aspects of this disclosure, if the second motion vector is not a motion disparity vector ("No" branch of step 210), the video coder may identify a different motion vector predictor candidate (212). That is, in some examples, the video coder may not use the first motion vector to predict the second motion vector.

[0193] If the second motion vector is a motion disparity vector ("Yes" branch of step 210), the video coder may determine whether motion disparity vector prediction is disabled (214). That is, according to some aspects of this disclosure, the ability to use a temporal motion vector (e.g., the first motion vector) to predict a motion disparity vector (e.g., the second motion vector, when the second motion vector is a motion disparity vector) may be disabled. In such cases, the video coder may identify a different motion vector predictor candidate (212) ("No" branch of step 214).

[0194] If the video coder determines that motion disparity vector prediction is enabled (or, for example, that the ability to enable/disable this function is not present), the video coder may determine a motion vector predictor for the second motion vector based on the first motion vector (206) ("Yes" branch of step 214). In addition, the video coder may also code prediction data for the second block using the motion vector predictor (208). For example, the video coder may use the motion vector predictor in merge mode or to generate an MVD value.
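
The decision flow of FIG. 11, as described in paragraphs [0191] to [0194], can be illustrated with a minimal Python sketch. This is not the implementation of the disclosure: the names `MvType`, `Block`, and `predictor_from_first_block` are hypothetical, the first motion vector is returned unchanged (any scaling in step 206 is omitted), and a return value of `None` is an assumed way to signal that another motion vector predictor candidate should be identified (step 212).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class MvType(Enum):
    TEMPORAL = "temporal"
    DISPARITY = "disparity"
    OTHER = "other"  # hypothetical catch-all, not named in the text


@dataclass(frozen=True)
class Block:
    view_id: int            # view that the block belongs to
    mv: Tuple[int, int]     # (x, y) motion vector of the block
    mv_type: MvType         # kind of motion vector carried by the block


def predictor_from_first_block(
    first: Block,
    second: Block,
    disparity_prediction_enabled: bool = True,
) -> Optional[Tuple[int, int]]:
    """Return a predictor taken from the first block, or None (step 212)."""
    # Step 204: second vector is temporal and the blocks are in different views.
    if second.mv_type is MvType.TEMPORAL and second.view_id != first.view_id:
        return first.mv  # step 206
    # Step 210: otherwise, check whether the second vector is a disparity vector.
    if second.mv_type is not MvType.DISPARITY:
        return None  # step 212: identify another candidate
    # Step 214: disparity vector prediction may be switched off.
    if not disparity_prediction_enabled:
        return None  # step 212: identify another candidate
    return first.mv  # step 206
```

The returned predictor would then be used to code prediction data for the second block (step 208), for example in merge mode or as the basis for an MVD value.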

[0195] It should also be understood that the steps shown and described with respect to FIG. 11 are provided as merely one example. That is, the steps of the method of FIG. 11 need not necessarily be performed in the order shown in FIG. 11, and fewer, additional, or alternative steps may be performed.

[0196] It should be understood that, depending on the example, certain acts or events of any of the methods described herein may be performed in a different sequence, or may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the method). Moreover, in certain examples, acts or events may be performed concurrently rather than sequentially, e.g., through multi-threaded processing, interrupt processing, or multiple processors. In addition, while certain aspects of this disclosure are described, for purposes of clarity, as being performed by a single module or unit, it should be understood that the techniques of this disclosure may be performed by a combination of units or modules associated with a video coder.

[0197] In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium, as one or more instructions or code, and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol.

[0198] In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media that are non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementing the techniques described in this disclosure. A computer program product may include a computer-readable medium.

[0199] By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.

[0200] It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transient media, but are instead directed to non-transient, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

[0201] Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

[0202] The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

[0203] Various aspects of the disclosure have been described. These and other aspects are within the scope of the following claims.

Claims

1. A method of encoding video data, the method comprising:
identifying a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector;
identifying a second motion vector for a second block of video data in a second view, wherein the second view is different from the first view;
based on the second motion vector being a motion disparity vector, determining a motion vector predictor for the second motion vector, wherein the motion vector predictor is based on the first motion disparity vector, and
adding the determined motion vector predictor to a list of motion vector predictor candidates for predicting the second motion vector;
based on the second motion vector not being a motion disparity vector, disabling the ability to determine the motion vector predictor from the first motion disparity vector, such that the motion disparity vector is not added to the candidate list; and
encoding prediction data for the second block using the motion vector predictor from the list of motion vector predictor candidates.

2. The method of claim 1, wherein the list of motion vector predictor candidates comprises the motion vector predictor and one or more motion vector predictors including one or more temporal motion vector predictors.

3. The method of claim 1, wherein the second block of video data is temporally adjacent to the first block of video data.

4. The method of claim 1, wherein the second block of video data is located at the first temporal location.

5. The method of claim 4, further comprising identifying the first block using a disparity value between the second block of the second view and the first block of the first view.

6. The method of claim 1, wherein the second block of video data is located at a second temporal location that is different from the first temporal location.

7. The method of claim 1,
wherein, when the second motion vector comprises a motion disparity vector, determining the motion vector predictor comprises scaling the first motion disparity vector to generate a scaled motion vector predictor,
wherein scaling the first motion disparity vector comprises applying, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector, and
wherein adding the motion vector predictor to the candidate list comprises adding the scaled motion vector predictor to the candidate list.

8. The method of claim 7, wherein the view distance of a motion disparity vector is a difference between a view identifier of a reference image and a view identifier of a target image associated with that motion disparity vector.

9. The method of claim 7, wherein the view distance of a motion disparity vector is a geometric distance between a camera location of the view containing the reference image and a camera location of the view containing the target image associated with that motion disparity vector.
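
For illustration only, the scaling recited in claims 7 to 9 can be sketched in Python as follows. The function names, the tuple representation of a vector, and the rounding to integer components are assumptions made for this sketch and are not taken from the claims; a practical codec would more likely use fixed-point arithmetic. The view distance may be supplied either as a view identifier difference (claim 8) or as a geometric camera distance (claim 9).

```python
from typing import Tuple, Union

Number = Union[int, float]


def view_distance_by_id(reference_view_id: int, target_view_id: int) -> int:
    """View distance as a difference of view identifiers (as in claim 8)."""
    return reference_view_id - target_view_id


def scale_disparity_vector(
    first_mv: Tuple[int, int],
    first_view_distance: Number,
    second_view_distance: Number,
) -> Tuple[int, int]:
    """Scale the first vector by second_view_distance / first_view_distance."""
    if first_view_distance == 0:
        raise ValueError("view distance of the first vector must be non-zero")
    factor = second_view_distance / first_view_distance
    # Rounding to integers is an assumption of this sketch.
    return (round(first_mv[0] * factor), round(first_mv[1] * factor))
```

For example, if the first vector spans one view and the second vector spans two views, the first vector is doubled before being added to the candidate list.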

10. A method for decoding video data, the method comprising:
identifying a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector;
identifying a second motion vector for a second block of video data in a second view, wherein the second view is different from the first view;
based on the second motion vector being a motion disparity vector, determining a motion vector predictor for the second motion vector, wherein the motion vector predictor is based on the first motion disparity vector, and
adding the determined motion vector predictor to a list of motion vector predictor candidates for predicting the second motion vector;
based on the second motion vector not being a motion disparity vector, disabling the ability to determine the motion vector predictor from the first motion disparity vector, such that the motion disparity vector is not added to the candidate list; and
decoding prediction data for the second block using the motion vector predictor from the list of motion vector predictor candidates.

11. The method of claim 10, wherein the list of motion vector predictor candidates comprises the motion vector predictor and one or more motion vector predictors including one or more temporal motion vector predictors.

12. The method of claim 10, wherein the second block of video data is temporally adjacent to the first block of video data.

13. The method of claim 10, wherein the second block of video data is located at the first temporal location.

14. The method of claim 13, further comprising identifying the first block using a disparity value between the second block of the second view and the first block of the first view.

15. The method of claim 10, wherein the second block of video data is located at a second temporal location that is different from the first temporal location.

16. The method of claim 10,
wherein, when the second motion vector comprises a motion disparity vector, determining the motion vector predictor comprises scaling the first motion disparity vector to generate a scaled motion vector predictor,
wherein scaling the first motion disparity vector comprises applying, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector, and
wherein adding the motion vector predictor to the candidate list comprises adding the scaled motion vector predictor to the candidate list.

17. The method of claim 16, wherein the view distance of a motion disparity vector is a difference between a view identifier of a reference image and a view identifier of a target image associated with that motion disparity vector.

18. The method of claim 16, wherein the view distance of a motion disparity vector is a geometric distance between a camera location of the view containing the reference image and a camera location of the view containing the target image associated with that motion disparity vector.

19. The method of claim 10, wherein the list of motion vector predictor candidates comprises one or more other motion vector predictors, the method further comprising selecting a motion vector predictor from the candidate list based on a motion vector predictor index value.
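
As a hedged illustration of the index-based selection in claim 19, the sketch below picks a predictor from a candidate list by index. The names `select_predictor` and `reconstruct_motion_vector` are hypothetical, and adding an MVD to the selected predictor is an assumed decoder-side step suggested by the MVD usage mentioned in the description, not something recited in the claim.

```python
from typing import Sequence, Tuple

Mv = Tuple[int, int]


def select_predictor(candidates: Sequence[Mv], mvp_index: int) -> Mv:
    """Pick the candidate identified by the motion vector predictor index."""
    if not 0 <= mvp_index < len(candidates):
        raise IndexError("motion vector predictor index out of range")
    return candidates[mvp_index]


def reconstruct_motion_vector(predictor: Mv, mvd: Mv) -> Mv:
    """Assumed reconstruction: predictor plus motion vector difference."""
    return (predictor[0] + mvd[0], predictor[1] + mvd[1])
```

With a candidate list of `[(1, 1), (16, 0)]` and an index of 1, the predictor `(16, 0)` is selected; an MVD of `(-1, 2)` would then give `(15, 2)`.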

20. A device for encoding video data, the device comprising one or more processors configured to:
identify a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector;
identify a second motion vector for a second block of video data in a second view, wherein the second view is different from the first view;
based on the second motion vector being a motion disparity vector, determine a motion vector predictor for the second motion vector, wherein the motion vector predictor is based on the first motion disparity vector, and
add the determined motion vector predictor to a list of motion vector predictor candidates for predicting the second motion vector;
based on the second motion vector not being a motion disparity vector, disable the ability to determine the motion vector predictor from the first motion disparity vector, such that the motion disparity vector is not added to the candidate list; and
encode prediction data for the second block based on the motion vector predictor from the list of motion vector predictor candidates.

21. The device of claim 20, wherein the list of motion vector predictor candidates comprises the motion vector predictor and one or more motion vector predictors including one or more temporal motion vector predictors.

22. The device of claim 20, wherein the second block of video data is temporally adjacent to the first block of video data.

23. The device of claim 20, wherein the second block of video data is located at the first temporal location.

24. The device of claim 23, wherein the one or more processors are further configured to identify the first block using a disparity value between the second block of the second view and the first block of the first view.

25. The device of claim 20, wherein the second block of video data is located at a second temporal location that is different from the first temporal location.

26. The device of claim 20,
wherein, when the second motion vector comprises a motion disparity vector, to determine the motion vector predictor, the one or more processors are configured to scale the first motion disparity vector to generate a scaled motion vector predictor,
wherein, to scale the first motion disparity vector, the one or more processors are configured to apply, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector, and
wherein, to add the motion vector predictor to the candidate list, the one or more processors are configured to add the scaled motion vector predictor to the candidate list.

27. The device of claim 26, wherein the view distance of a motion disparity vector is a difference between a view identifier of a reference image and a view identifier of a target image associated with that motion disparity vector.

28. The device of claim 26, wherein the view distance of a motion disparity vector is a geometric distance between a camera location of the view containing the reference image and a camera location of the view containing the target image associated with that motion disparity vector.

29. A device for decoding video data, the device comprising one or more processors configured to:
identify a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector;
identify a second motion vector for a second block of video data in a second view, wherein the second view is different from the first view;
based on the second motion vector being a motion disparity vector, determine a motion vector predictor for the second motion vector, wherein the motion vector predictor is based on the first motion disparity vector, and
add the determined motion vector predictor to a list of motion vector predictor candidates for predicting the second motion vector;
based on the second motion vector not being a motion disparity vector, disable the ability to determine the motion vector predictor from the first motion disparity vector, such that the motion disparity vector is not added to the candidate list; and
decode prediction data for the second block based on the motion vector predictor from the list of motion vector predictor candidates.

30. The device of claim 29, wherein the list of motion vector predictor candidates comprises the motion vector predictor and one or more motion vector predictors including one or more temporal motion vector predictors.

31. The device of claim 29, wherein the second block of video data is temporally adjacent to the first block of video data.

32. The device of claim 29, wherein the second block of video data is located at the first temporal location.

33. The device of claim 32, wherein the one or more processors are further configured to identify the first block using a disparity value between the second block of the second view and the first block of the first view.

34. The device of claim 29, wherein the second block of video data is located at a second temporal location that is different from the first temporal location.

35. The device of claim 29,
wherein, when the second motion vector comprises a motion disparity vector, to determine the motion vector predictor, the one or more processors are configured to scale the first motion disparity vector to generate a scaled motion vector predictor,
wherein, to scale the first motion disparity vector, the one or more processors are configured to apply, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector, and
wherein, to add the motion vector predictor to the candidate list, the one or more processors are configured to add the scaled motion vector predictor to the candidate list.

36. The device of claim 35, wherein the view distance of a motion disparity vector is a difference between a view identifier of a reference image and a view identifier of a target image associated with that motion disparity vector.

37. The device of claim 35, wherein the view distance of a motion disparity vector is a geometric distance between a camera location of the view containing the reference image and a camera location of the view containing the target image associated with that motion disparity vector.

38. The device of claim 29, wherein the list of motion vector predictor candidates comprises one or more other motion vector predictors, and wherein the one or more processors are further configured to select a motion vector predictor from the candidate list based on a motion vector predictor index value.

39. A device for encoding video data, comprising:
means for identifying a first block of video data at a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector;
means for identifying a second motion vector for a second block of video data in a second view, wherein the second view is different from the first view;
means for determining, based on the second motion vector being a motion disparity vector, a motion vector predictor for the second motion vector, wherein the motion vector predictor is based on the first motion disparity vector;
means for adding the motion vector predictor to a list of motion vector predictor candidates for predicting the second motion vector;
means for disabling, based on the second motion vector not being a motion disparity vector, the ability to determine the motion vector predictor from the first motion disparity vector, such that the motion disparity vector is not added to the candidate list; and
means for encoding prediction data for the second block based on the motion vector predictor from the list of motion vector predictor candidates.

40. The device according to claim 39, wherein the list of motion vector predictor candidates comprises a motion vector predictor and one or more motion vector predictors including one or more temporal motion vector predictors.

41. The device of claim 39, wherein the second block of video data is temporally adjacent to the first block of video data.

42. The device of claim 39, wherein the second block of video data is located at the first temporal location.

43. The device of claim 42, further comprising means for identifying the first block using a disparity value between the second block of the second view and the first block of the first view.

44. The device of claim 39, wherein the second block of video data is located at a second temporal location that is different from the first temporal location.

45. The device of claim 39,
wherein, when the second motion vector comprises a motion disparity vector, the means for determining the motion vector predictor comprises means for scaling the first motion disparity vector to generate a scaled motion vector predictor,
wherein the means for scaling the first motion disparity vector comprises means for applying, to the first motion disparity vector, a scaling factor comprising a view distance of the second motion disparity vector divided by a view distance of the first motion vector, and
wherein the means for adding the motion vector predictor to the candidate list comprises means for adding the scaled motion vector predictor to the candidate list.

46. The device of claim 45, wherein the view distance of a motion disparity vector is a difference between a view identifier of a reference image and a view identifier of a target image associated with that motion disparity vector.

47. The device of claim 45, wherein the view distance of a motion disparity vector is a geometric distance between a camera location of the view containing the reference image and a camera location of the view containing the target image associated with that motion disparity vector.

48. An apparatus for decoding video data, comprising:
means for identifying a first block of video data in a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector;
means for identifying a second motion vector for a second block of video data in a second view, wherein the second view is different from the first view;
means for determining, based on the second motion vector being a motion disparity vector, a motion vector predictor for the second motion vector, wherein the motion vector predictor is based on the first motion disparity vector;
means for adding the motion vector predictor to a list of motion vector predictor candidates for predicting the second motion vector;
means for disabling, based on the second motion vector not being a motion disparity vector, the ability to determine a motion vector predictor from the first motion disparity vector, so that the motion disparity vector is not added to the candidate list; and
means for decoding prediction data for the second block based on the motion vector predictor from the list of motion vector predictor candidates.

49. The device of claim 48, wherein the list of motion vector predictor candidates comprises a motion vector predictor and one or more motion vector predictors including one or more temporal motion vector predictors.

50. The device according to claim 48, in which the second block of video data is temporally adjacent to the first block of video data.

51. The device according to claim 48, in which the second block of video data is located in the first temporal location.

52. The device according to claim 51, further comprising means for identifying the first block using a disparity value between the second block of the second view and the first block of the first view.

53. The device according to claim 48, in which the second block of video data is in a second temporal location other than the first temporal location.

54. The device according to claim 48,
in which, when the second motion vector comprises a motion disparity vector, the means for determining a motion vector predictor comprises means for scaling the first motion disparity vector to generate a scaled motion vector predictor,
in which the means for scaling the first motion disparity vector comprises means for applying, to the first motion disparity vector, a scaling factor comprising the distance between the views of the second motion disparity vector divided by the distance between the views of the first motion vector, and
wherein the means for adding a motion vector predictor to the candidate list comprises means for adding a scaled motion vector predictor to said candidate list.

55. The device according to claim 54, in which the distance between the views of a motion disparity vector is the difference between the view identifier of the reference image and the view identifier of the target image associated with said motion disparity vector.

56. The device according to claim 54, in which the distance between the views of the motion disparity vector is the geometric distance between the location of the camera of the view containing the reference image and the location of the camera of the view containing the target image associated with said motion disparity vector.

57. The apparatus of claim 48, wherein the list of motion vector predictor candidates comprises one or more other motion vector predictors, the apparatus further comprising means for selecting a motion vector predictor from said candidate list based on a motion vector predictor index value.

58. A computer-readable storage medium having instructions stored on it that, when executed, cause one or more processors to:
identify a first block of video data in a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector;
identify a second motion vector for a second block of video data in a second view, wherein the second view is different from the first view;
determine, based on the second motion vector being a motion disparity vector, a motion vector predictor for the second motion vector, wherein the motion vector predictor is based on the first motion disparity vector;
add the determined motion vector predictor to a list of motion vector predictor candidates for predicting the second motion vector;
disable, based on the second motion vector not being a motion disparity vector, the ability to determine a motion vector predictor from the first motion disparity vector, so that the motion disparity vector is not added to the candidate list; and
encode the prediction data for the second block based on the motion vector predictor from the list of motion vector predictor candidates.

59. The computer-readable storage medium according to claim 58, wherein the list of motion vector predictor candidates comprises a motion vector predictor and one or more motion vector predictors including one or more temporal motion vector predictors.

60. The computer-readable storage medium according to claim 58, wherein the second block of video data is temporally adjacent to the first block of video data.

61. The computer-readable storage medium according to claim 58, wherein the second block of video data is in the first temporal location.

62. The computer-readable storage medium of claim 61, further comprising instructions that cause said one or more processors to identify the first block using a disparity value between the second block of the second view and the first block of the first view.

63. The computer-readable storage medium of claim 58, wherein the second block of video data is in a second temporal location other than the first temporal location.

64. A computer-readable storage medium according to claim 58,
in which, when the second motion vector contains a motion disparity vector, to determine the motion vector predictor, the instructions cause the one or more processors to scale the first motion disparity vector to generate a scaled motion vector predictor,
in which, for scaling the first motion disparity vector, the instructions cause said one or more processors to apply a scaling factor comprising the distance between the views of the second motion disparity vector divided by the distance between the views of the first motion vector to the first motion disparity vector, and
wherein in order to add a motion vector predictor to said candidate list, instructions cause said one or more processors to add a scaled motion vector predictor to said candidate list.

65. The computer-readable storage medium according to claim 64, wherein the distance between the views of a motion disparity vector is the difference between the view identifier of the reference image and the view identifier of the target image associated with said motion disparity vector.

66. A computer-readable storage medium according to claim 64, wherein the distance between the views of the motion disparity vector is the geometric distance between the location of the camera of the view containing the reference image and the location of the camera of the view containing the target image associated with said motion disparity vector.

67. A computer-readable storage medium having instructions stored on it that, when executed, cause one or more processors to:
identify a first block of video data in a first temporal location from a first view, wherein the first block is associated with a first motion disparity vector;
identify a second motion vector for a second block of video data in a second view, wherein the second view is different from the first view;
determine, based on the second motion vector being a motion disparity vector, a motion vector predictor for the second motion vector, wherein the motion vector predictor is based on the first motion disparity vector;
add the determined motion vector predictor to a list of motion vector predictor candidates for predicting the second motion vector;
disable, based on the second motion vector not being a motion disparity vector, the ability to determine a motion vector predictor from the first motion disparity vector, so that the motion disparity vector is not added to the candidate list; and
decode prediction data for the second block based on the motion vector predictor from the list of motion vector predictor candidates.

68. The computer-readable storage medium according to claim 67, wherein the list of motion vector predictor candidates comprises a motion vector predictor and one or more motion vector predictors including one or more temporal motion vector predictors.

69. The computer-readable storage medium according to claim 67, wherein the second block of video data is temporally adjacent to the first block of video data.

70. The computer-readable storage medium of claim 67, wherein the second block of video data is in the first temporal location.

71. The computer-readable storage medium of claim 70, further comprising instructions that cause said one or more processors to identify the first block using a disparity value between the second block of the second view and the first block of the first view.

72. The computer-readable storage medium of claim 67, wherein the second block of video data is in a second temporal location other than the first temporal location.

73. The computer-readable storage medium according to claim 67,
in which, when the second motion vector contains a motion disparity vector, to determine the motion vector predictor, the instructions cause the one or more processors to scale the first motion disparity vector to generate a scaled motion vector predictor,
in which, for scaling the first motion disparity vector, the instructions cause said one or more processors to apply a scaling factor comprising the distance between the views of the second motion disparity vector divided by the distance between the views of the first motion vector to the first motion disparity vector, and
wherein in order to add a motion vector predictor to said candidate list, instructions cause said one or more processors to add a scaled motion vector predictor to said candidate list.

74. The computer-readable storage medium according to claim 73, wherein the distance between the views of a motion disparity vector is the difference between the view identifier of the reference image and the view identifier of the target image associated with said motion disparity vector.

75. The computer-readable storage medium according to claim 73, wherein the distance between the views of the motion disparity vector is the geometric distance between the location of the camera of the view containing the reference image and the location of the camera of the view containing the target image associated with the motion disparity vector.

76. The computer-readable storage medium according to claim 67, wherein the list of motion vector predictor candidates comprises one or more other motion vector predictors, and further comprising instructions that cause said one or more processors to select a motion vector predictor from said candidate list based on a motion vector predictor index value.
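
Illustrative example (not part of the claims). The following is a minimal sketch of the candidate-list logic recited above: a predictor derived from the first motion disparity vector is added only when the second motion vector is itself a motion disparity vector, and it is scaled by the ratio of view distances, with the view distance taken as the difference of view identifiers. The names MotionVector, scale_disparity and build_candidates, the integer vector components and the rounding rule are assumptions made for illustration only; they are not taken from this document or from any codec specification.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List


@dataclass(frozen=True)
class MotionVector:
    """Hypothetical container; field names are illustrative assumptions."""
    x: int
    y: int
    ref_view_id: int     # view identifier of the reference image
    target_view_id: int  # view identifier of the target image

    @property
    def view_distance(self) -> int:
        # View distance as a difference of view identifiers.
        return self.ref_view_id - self.target_view_id

    @property
    def is_disparity(self) -> bool:
        # A vector that points into another view is treated as a disparity vector.
        return self.view_distance != 0


def scale_disparity(first: MotionVector, second: MotionVector) -> MotionVector:
    """Scale the first disparity vector by second distance / first distance."""
    factor = Fraction(second.view_distance, first.view_distance)
    # Rounding to the nearest integer is an assumption of this sketch.
    return MotionVector(round(first.x * factor), round(first.y * factor),
                        second.ref_view_id, second.target_view_id)


def build_candidates(first: MotionVector, second: MotionVector,
                     others: List[MotionVector]) -> List[MotionVector]:
    """Return the candidate list used to predict the second motion vector."""
    candidates = list(others)
    if first.is_disparity and second.is_disparity:
        candidates.append(scale_disparity(first, second))
    # Otherwise derivation from the first disparity vector is disabled,
    # so nothing based on it enters the list.
    return candidates
```

For example, with a first vector (8, -4) whose view distance is -1 and a second disparity vector whose view distance is -2, the scaling factor is 2 and the added candidate is (16, -8). If the second vector instead refers to an image in its own view, the list contains only the other candidates. A decoder would then pick one entry from the list using a signalled index, which this sketch leaves out.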