RU2799709C1 - Method and device for video encoding based on long-term reference frame, computing device and data carrier

Info

Publication number: RU2799709C1
Application number: RU2022129325A
Authority: RU
Inventors: Тунбин ЦУЙ
Original assignee: Биго Текнолоджи Пте. Лтд.
Priority date: 2020-04-21
Filing date: 2021-04-21
Publication date: 2023-07-10

Abstract

FIELD: image processing technology.

SUBSTANCE: the invention relates to video encoding based on a long-term reference frame. The method includes the following steps: setting the long-term reference frame according to attribute information of an image frame; determining a reference index for an image frame to be encoded based on a normal reference frame and the long-term reference frame; and obtaining a target matching block by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded; wherein setting the long-term reference frame according to the attribute information of the image frame includes setting the long-term reference frame based on the degree of redundancy in the time domain and the degree of redundancy in the spatial domain of the image frame.

EFFECT: increasing the efficiency of video compression.

12 cl, 7 dwg, 1 tbl

Description

Cross-reference to related application

[0001] This application is a Russian national phase entry of International Application No. PCT/CN2021/088586, filed on April 21, 2020, which claims priority to Chinese Patent Application No. 202010318681.1, entitled "METHOD AND APPARATUS FOR ENCODING VIDEO BASED ON LONG-TERM REFERENCE FRAME, COMPUTING DEVICE, AND STORAGE MEDIUM", the disclosure of which is incorporated herein by reference in its entirety.

Technical field

[0002] Embodiments of the present disclosure relate to the field of image processing technologies and, in particular, to a method and apparatus for encoding video based on a long-term reference frame, a computing device, and a storage medium.

Background of the invention

[0003] Inter-prediction coding is an important technique widely used in video coding to improve compression efficiency. In this technique, motion estimation searches a reference frame for the optimal matching block, motion compensation then derives a prediction block for the current block so that a residual block is obtained, and the subsequent compression steps of video coding are applied to the residual block: transform, quantization, and entropy coding.

Summary of the invention

[0004] The present disclosure provides a method and apparatus for encoding video based on a long-term reference frame, a computing device, and a storage medium.

[0005] According to some embodiments of the present disclosure, a method for encoding video based on a long-term reference frame is provided. The method includes the following steps: setting a long-term reference frame according to attribute information of an image frame; determining a reference index for an image frame to be encoded based on a normal reference frame and the long-term reference frame; and obtaining a target matching block by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded; wherein setting the long-term reference frame according to the attribute information of the image frame includes: setting the long-term reference frame based on the degree of redundancy in the time domain and the degree of redundancy in the spatial domain of the image frame, wherein the degree of redundancy in the time domain indicates the degree to which the same coding macroblock is present in the image frame and in its adjacent image frame, and the degree of redundancy in the spatial domain indicates the degree to which the same coding macroblock is present within the image frame.

[0006] According to some embodiments of the present disclosure, an apparatus for encoding video based on a long-term reference frame is provided. The apparatus includes: a setting module, configured to set a long-term reference frame according to attribute information of an image frame; an index module, configured to determine a reference index for an image frame to be encoded based on a normal reference frame and the long-term reference frame; and an encoding module, configured to obtain a target matching block by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded; wherein the setting module is further configured to set the long-term reference frame based on the degree of redundancy in the time domain and the degree of redundancy in the spatial domain of the image frame, wherein the degree of redundancy in the time domain indicates the degree to which the same coding macroblock is present in the image frame and in its adjacent image frame, and the degree of redundancy in the spatial domain indicates the degree to which the same coding macroblock is present within the image frame.

[0007] According to some embodiments of the present disclosure, a computing device for encoding video based on a long-term reference frame is provided. The computing device includes one or more processors; and a memory configured to store one or more programs, wherein the one or more programs, when loaded and executed by the one or more processors, cause the one or more processors to perform the method for encoding video based on a long-term reference frame as defined in any of the embodiments of the present disclosure.

[0008] According to some embodiments of the present disclosure, a non-volatile computer-readable storage medium is provided. The non-volatile computer-readable storage medium stores one or more computer programs, wherein the one or more computer programs, when loaded and executed by a processor of a computing device, cause the computing device to perform the method for encoding video based on a long-term reference frame as defined in any of the embodiments of the present disclosure.

Brief description of the drawings

[0009] FIG. 1 is a flowchart of a method for encoding video based on a long-term reference frame according to Embodiment 1 of the present disclosure;

[0010] FIG. 2 is a flowchart of a method for encoding video based on a long-term reference frame according to Embodiment 2 of the present disclosure;

[0011] FIG. 3 is a flowchart of a method for encoding video based on a long-term reference frame according to Embodiment 3 of the present disclosure;

[0012] FIG. 4 is a schematic diagram of searching for the location of a coding macroblock in a first reference index according to Embodiment 3 of the present disclosure;

[0013] FIG. 5 is an example diagram of prediction modes according to Embodiment 3 of the present disclosure;

[0014] FIG. 6 is a simplified block diagram of an apparatus for encoding video based on a long-term reference frame according to Embodiment 4 of the present disclosure; and

[0015] FIG. 7 is a simplified block diagram of a computing device for encoding video according to Embodiment 5 of the present disclosure.

Detailed description of the invention

[0016] The present disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments. It should be understood that the embodiments described herein are intended only to explain the present disclosure and not to limit it. It should further be noted that, for ease of description, the accompanying drawings show only some, rather than all, of the structures relevant to the present disclosure.

[0017] Embodiment 1

[0018] FIG. 1 is a flowchart of a method for encoding video based on a long-term reference frame according to Embodiment 1 of the present disclosure. This embodiment is applicable where a long-term reference frame is used to encode images. The method may be performed by an apparatus for encoding video based on a long-term reference frame, and the apparatus may be implemented in hardware and/or software. As shown in FIG. 1, the method for encoding video based on a long-term reference frame according to this embodiment of the present disclosure may include the following steps.

[0019] In step 101, a long-term reference frame is set according to attribute information of an image frame.

[0020] An image frame may be any image frame in a video to be encoded. The attribute information may be data describing the image frame and may be stored in association with the image frame. The attribute information may indicate whether the corresponding image frame is a key frame or a scene-switching frame. The attribute information may be generated by an analysis module known in the prior art, and its generation is not described in detail in this embodiment of the present disclosure. There are two types of reference frames for inter-prediction coding: short-term reference frames and long-term reference frames. Compared with a short-term reference frame, a long-term reference frame can serve as a reference over a longer period, the interval between two long-term reference frames in the time domain is longer, and the compression efficiency of coding with a long-term reference frame is better for video with little background change. The long-term reference frame may be a reference image frame stored in a decoded picture buffer (DPB); the DPB may contain the picture being encoded and may store short-term reference frames, long-term reference frames, non-reference frames, and the like. However, the video coding scheme known to the inventor merely supports the long-term reference frame at the functional level, which greatly increases coding complexity and reduces the speed of compression coding.

[0021] According to embodiments of the present disclosure, the long-term reference frame may be set according to the attribute information of each image frame among a plurality of image frames in the video. For example, where the degree of redundancy in the spatial domain of an image frame exceeds a threshold, the image frame may be set as a long-term reference frame. Where the attribute information of an image frame indicates that the image frame is the first image frame in the video, the image frame may be set as a long-term reference frame.

[0022] In step 102, a reference index of an image frame to be encoded is determined based on a normal reference frame and the long-term reference frame.

[0023] The normal reference frame may be a reference frame used for inter-prediction in the prior art, and may be an image frame adjacent in time sequence to the image frame to be encoded; for example, it may be the previous frame or the subsequent frame of the image frame to be encoded. The image frame to be encoded may be an image frame in the video that needs to be encoded, and one image frame to be encoded may be encoded at a time during video encoding. The reference index may be frame-level relative position information of the matching block obtained from the reference frame set for the image frame to be encoded, and may include a frame-level index number of the normal reference frame and/or a frame-level index number of the long-term reference frame.

[0024] According to this embodiment of the present disclosure, the reference frame most suitable for inter-prediction of the image frame to be encoded may be searched for among the normal reference frame and the long-term reference frame, and the frame-level index number of that reference frame may be stored as the reference index. It can be understood that the reference frame most suitable for inter-prediction of the image frame to be encoded may be the image frame with the minimum coding cost among the normal reference frame and/or the long-term reference frame, or may be the image frame closest in time sequence to the image frame to be encoded among the normal reference frame and/or the long-term reference frame. The reference index may include at least one of a frame-level index number of the normal reference frame and a frame-level index number of the long-term reference frame.
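The selection described in paragraph [0024] can be illustrated with a minimal Python sketch. The `RefFrame` type, its fields, and the way the two criteria are combined (minimum coding cost first, temporal distance as a tie-break) are assumptions made for illustration only; the disclosure does not specify how the coding cost is computed or how the two criteria interact.

```python
from dataclasses import dataclass


@dataclass
class RefFrame:
    index: int        # frame-level index number (hypothetical field)
    order: int        # position of the frame in time sequence (hypothetical field)
    long_term: bool   # True for a long-term reference frame
    cost: float       # coding cost against the current frame, assumed precomputed


def select_reference_index(candidates, current_order):
    """Return the index of the candidate with the lowest coding cost.

    Ties are broken in favour of the frame closest in time to the
    frame being encoded (an assumed combination of the two criteria).
    """
    if not candidates:
        raise ValueError("at least one reference frame is required")
    best = min(candidates, key=lambda f: (f.cost, abs(current_order - f.order)))
    return best.index
```

For example, with a normal reference frame of cost 120.0 and a long-term reference frame of cost 80.0, the sketch returns the index of the long-term frame; with equal costs it prefers the temporally closer frame.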

[0025] In step 103, a target matching block is obtained by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded.

[0026] The target matching block may be a coding block that is identical or similar to a coding block in the image frame to be encoded, that has the minimum coding cost, and that is found by performing inter-prediction based on the reference index.

[0027] The image frame to be encoded may be segmented into at least one coding macroblock, and for the at least one coding macroblock, a target matching block similar to the coding macroblock may be searched for based on the reference index. The image frame to be encoded may be compressed and encoded based on the relevant information of the target matching block. It can be understood that step 103 and step 102 may be performed simultaneously. For example, each time one reference index of the image frame to be encoded is determined, the target matching block corresponding to the image frame to be encoded may be obtained based on the reference frame. Alternatively, step 103 may be performed after step 102 has been completed in full. For example, once all reference indexes of the image frame to be encoded have been determined, the target matching block is searched for based on the plurality of reference indexes.
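The search for a target matching block in paragraph [0027] can be sketched as a full-search block match over the reference frames named by the reference indexes. This is an illustrative sketch only: the sum of absolute differences (SAD) is used here as a stand-in for the coding cost, and the function names, the block size, and the search range are assumptions that do not come from the disclosure.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(frame_a[ay + r][ax + c] - frame_b[by + r][bx + c])
        for r in range(size)
        for c in range(size)
    )


def find_target_matching_block(current, x, y, references, ref_indices,
                               size=4, search=2):
    """Search the listed reference frames for the block most similar to
    the macroblock at (x, y) in the current frame.

    Returns (cost, ref_index, dx, dy), or None if no candidate fits.
    """
    best = None
    for idx in ref_indices:
        ref = references[idx]
        height, width = len(ref), len(ref[0])
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                bx, by = x + dx, y + dy
                # Skip candidate blocks that fall outside the reference frame.
                if bx < 0 or by < 0 or bx + size > width or by + size > height:
                    continue
                candidate = (sad(current, x, y, ref, bx, by, size), idx, dx, dy)
                if best is None or candidate < best:
                    best = candidate
    return best
```

Restricting `ref_indices` to the indexes chosen in step 102 is what keeps the search small in this sketch: frames that were not selected are never scanned.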

[0028] According to this embodiment of the present disclosure, an image frame is set as the long-term reference frame according to the attribute information; the reference index of the image frame to be encoded is determined based on the normal reference frame and the long-term reference frame; and the target matching block of the image frame to be encoded is obtained by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded. Because the long-term reference frame is set in advance according to the attribute information of the image frame, the long-term reference frame is set accurately. Because the reference index is determined based on the long-term reference frame and the target matching block is then obtained, the computational overhead of the encoding process is reduced, the coding complexity is reduced while the compression efficiency of image encoding is maintained, the video encoding time is shortened, and highly efficient encoding of the video images is achieved.

[0029] Based on the above embodiment of the present disclosure, setting the long-term reference frame according to the attribute information of the image frame includes: extracting, for each image frame among a plurality of image frames, the frame type from the attribute information corresponding to that image frame; and setting the image frame corresponding to the frame type as the long-term reference frame where the frame type is a key frame and/or a scene-switching frame.

[0030] The frame type may be information describing the use or the content of an image frame. Where the frame type describes the use of the image frame, the frame type may include key frame I, non-key frame P, non-key frame B, and the like. Where the frame type describes the content, the frame type may include a scene-switching frame, a scene frame, and the like. A key frame may indicate that the image frame is used as a reference frame in the encoding process, and a scene-switching frame may indicate that the content of the image frame includes a scene switch. For example, where the content of an image frame is an image of a person walking from indoors to outdoors, the image frame may be a scene-switching frame.

[0031] The attribute information of each image frame of the plurality of image frames in the video may be obtained. In the case where the frame type in the attribute information of an image frame is a key frame and/or a scene switching frame, the image frame may be set as a long-term reference frame. For example, the image frame may be stored in the DPB, and a long-term reference frame identifier may be set for it.
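The type-based selection described above can be sketched as follows. This is a minimal illustration only: the `FrameType` enumeration, the `Frame` record, and the `is_long_term` flag are hypothetical names introduced here, and the sketch models the long-term identifier as a boolean rather than an actual DPB entry.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FrameType(Enum):
    # Hypothetical enumeration of the frame types named in the text.
    KEY = auto()           # key frame I
    NON_KEY_P = auto()     # non-key frame P
    NON_KEY_B = auto()     # non-key frame B
    SCENE_SWITCH = auto()  # scene switching frame
    SCENE = auto()         # ordinary scene frame


@dataclass
class Frame:
    index: int
    frame_type: FrameType
    is_long_term: bool = False  # stands in for the long-term identifier


def mark_long_term_by_type(frames):
    """Flag key frames and scene switching frames as long-term references.

    Returns the indices of all frames flagged as long-term.
    """
    long_term_types = {FrameType.KEY, FrameType.SCENE_SWITCH}
    for frame in frames:
        if frame.frame_type in long_term_types:
            frame.is_long_term = True
    return [frame.index for frame in frames if frame.is_long_term]
```

A real encoder would additionally keep the flagged picture in its decoded picture buffer; that bookkeeping is outside the scope of this sketch.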

[0032] Based on the above embodiment of the present disclosure, setting the long-term reference frame according to the attribute information of the image frame includes: setting the long-term reference frame based on the degree of redundancy in the time domain and the degree of redundancy in the spatial domain of the image frame.

[0033] According to this embodiment of the present disclosure, the degree of redundancy in the time domain may indicate the extent to which the same coding macroblocks are present in an image frame and in an adjacent image frame; a larger number of identical coding macroblocks in the image frame and the adjacent image frame indicates a higher degree of redundancy in the time domain. The degree of redundancy in the spatial domain may indicate the extent to which the same coding macroblocks are present within an image frame; a larger number of identical coding macroblocks within the image frame indicates a higher degree of redundancy in the spatial domain. Since a long-term reference frame needs to serve as a reference frame for a long time, and the degree of redundancy in the spatial domain of a long-term reference frame is relatively high, an image frame whose degree of redundancy in the spatial domain is greater than its degree of redundancy in the time domain may be selected and set as the long-term reference frame. Alternatively, the ratio of the degree of redundancy in the time domain to the degree of redundancy in the spatial domain is determined, and an image frame for which this ratio is less than a threshold is set as the long-term reference frame.

[0034] Embodiment 2

[0035] FIG. 2 is a flowchart of a method for encoding video based on a long-term reference frame according to Embodiment 2 of the present disclosure. This embodiment is described on the basis of the above embodiment. The long-term reference frame is set based on the degree of redundancy in the time domain and the degree of redundancy in the spatial domain of the image frame. As shown in FIG. 2, the method for encoding video based on a long-term reference frame provided by this embodiment of the present disclosure includes the following steps.

[0036] In step 201, an inter-frame coding cost and an intra-frame coding cost are determined for each image frame of the plurality of image frames, wherein the inter-frame coding cost reflects the degree of redundancy in the time domain, the intra-frame coding cost reflects the degree of redundancy in the spatial domain, the degree of redundancy in the time domain is inversely proportional to the inter-frame coding cost, and the degree of redundancy in the spatial domain is inversely proportional to the intra-frame coding cost.

[0037] The inter-frame coding cost may represent the coding complexity incurred during inter-frame coding of an image frame. The lower the coding complexity, the lower the inter-frame coding cost may be. The inter-frame coding cost may be inversely proportional to the degree of redundancy in the time domain, so a higher degree of redundancy in the time domain of an image frame may correspond to a lower inter-frame coding cost. The intra-frame coding cost may represent the coding complexity incurred during intra-frame coding of an image frame. The lower the coding complexity, the lower the intra-frame coding cost may be. The intra-frame coding cost may be inversely proportional to the degree of redundancy in the spatial domain, so a higher degree of redundancy in the spatial domain of an image frame may correspond to a lower intra-frame coding cost.

[0038] According to this embodiment of the present disclosure, the intra-frame coding cost and the inter-frame coding cost may be determined by a pre-analysis module of the encoding system. The intra-frame coding cost and the inter-frame coding cost may be stored together with the image frame as attribute information. During video encoding, the attribute information of an image frame may be obtained in advance. The intra-frame coding cost may reflect the degree of redundancy in the spatial domain of the image frame, and the inter-frame coding cost may reflect the degree of redundancy in the time domain of the image frame.

[0039] In step 202, a current encoding cost ratio of the current image frame is determined based on the intra-frame coding cost and the inter-frame coding cost. The current image frame is one of the plurality of image frames in the video to be encoded.

[0040] The current encoding cost ratio may be the ratio of the intra-frame coding cost to the inter-frame coding cost, and may be used to compare the degree of redundancy in the spatial domain with the degree of redundancy in the time domain of the image frame. Redundancy is inversely proportional to coding cost. For example, the smaller the current encoding cost ratio, the higher the degree of redundancy in the spatial domain relative to the degree of redundancy in the time domain of the image frame, and the image frame may be used as a long-term reference frame during inter-frame coding.

[0041] According to this embodiment of the present disclosure, the ratio of the intra-frame coding cost to the inter-frame coding cost of each image frame of the plurality of image frames in the video may be calculated as the current encoding cost ratio of that image frame, wherein the inter-frame coding cost may reflect the degree of redundancy in the time domain of the image frame, the intra-frame coding cost may reflect the degree of redundancy in the spatial domain of the image frame, and the degree of redundancy is inversely proportional to the coding cost.

[0042] In step 203, a first encoding cost ratio of the image frame preceding the current image frame and a second encoding cost ratio of the image frame before that previous image frame are determined respectively, and a ratio variance of the current encoding cost ratio, the first encoding cost ratio, and the second encoding cost ratio is determined.

[0043] The previous image frame of the current image frame may be the image frame at the moment immediately preceding the current image frame in the time sequence, and the image frame before the previous image frame may be the image frame at the second moment preceding the current image frame in the time sequence. The first encoding cost ratio may be the ratio of the intra-frame coding cost to the inter-frame coding cost of the previous image frame of the current image frame, and the second encoding cost ratio may be the ratio of the intra-frame coding cost to the inter-frame coding cost of the image frame before the previous image frame.

[0044] The two image frames preceding each image frame may be looked up in the DPB, the intra-frame coding cost and the inter-frame coding cost of each of these two preceding image frames may be obtained separately, and the first encoding cost ratio of the previous image frame and the second encoding cost ratio of the image frame before it may be calculated separately. The ratio variance is calculated from the encoding cost ratio of the current image frame, the encoding cost ratio of the previous image frame, and the encoding cost ratio of the image frame before the previous image frame, so as to determine how the encoding cost ratio changes at the position of each image frame.
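The quantities of steps 202 and 203 can be written compactly as follows. The symbols are notation introduced here for illustration and do not appear in the original text: $r_t$ is the current encoding cost ratio, $r_{t-1}$ the first encoding cost ratio, and $r_{t-2}$ the second encoding cost ratio. The use of the population variance (division by 3) is an assumption, since the text does not state which form of variance is meant.

```latex
r_t = \frac{C^{\mathrm{intra}}_t}{C^{\mathrm{inter}}_t}, \qquad
\bar{r} = \frac{r_{t-2} + r_{t-1} + r_t}{3}, \qquad
\sigma^2 = \frac{1}{3} \sum_{k=0}^{2} \left( r_{t-k} - \bar{r} \right)^2
```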

[0045] In step 204, the current image frame is set as the long-term reference frame in the case where it is determined that the second encoding cost ratio, the first encoding cost ratio, and the current encoding cost ratio decrease successively and the ratio variance is less than a threshold value.

[0046] The threshold value may be a value close to zero. In the case where the ratio variance is less than the threshold value, the ratio variance may be regarded as equal to 0.

[0047] The change in the encoding cost ratio at the position of the current image frame may be determined based on the current encoding cost ratio, the first encoding cost ratio, and the second encoding cost ratio. In the case where it is determined that the current encoding cost ratio is less than the first encoding cost ratio and the first encoding cost ratio is less than the second encoding cost ratio, the encoding cost ratios of the image frames are gradually decreasing. In the case where the ratio variance is less than the threshold value, it may be determined that the change in the encoding cost ratio at the position of the current image frame is 0, and that the current encoding cost ratio of the current image frame is an extreme point of the encoding cost ratios of several adjacent image frames, so the image frame can serve as a reference frame for adjacent image frames for a long time. In this case, the current image frame may be set as the long-term reference frame.
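The decision rule of step 204 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the default threshold of `1e-3`, and the choice of population variance are all hypothetical, since the text only says that the threshold is close to zero and does not specify the variance formula.

```python
from statistics import pvariance


def is_long_term_candidate(second_ratio: float,
                           first_ratio: float,
                           current_ratio: float,
                           threshold: float = 1e-3) -> bool:
    """Return True if the current frame should be set as a long-term reference.

    second_ratio:  intra/inter cost ratio of the frame two positions back
    first_ratio:   intra/inter cost ratio of the previous frame
    current_ratio: intra/inter cost ratio of the current frame
    threshold:     assumed near-zero bound on the variance of the three ratios
    """
    # Condition 1: the ratios decrease successively up to the current frame.
    decreasing = second_ratio > first_ratio > current_ratio
    # Condition 2: the three ratios barely differ (variance close to zero).
    variance = pvariance([second_ratio, first_ratio, current_ratio])
    return decreasing and variance < threshold
```

For instance, ratios of 1.03, 1.02, and 1.01 decrease successively with a variance of roughly 6.7e-5, so the current frame qualifies, whereas ratios of 3.0, 2.0, and 1.0 decrease but vary too much to pass the threshold.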

[0048] In step 205, a reference index corresponding to the image frame to be encoded is determined based on the normal reference frame and the long-term reference frame, successively according to a first-type prediction mode, a second-type prediction mode, and a third-type prediction mode, wherein the first-type prediction mode includes at least a MERGE prediction mode and/or a SKIP prediction mode, the second-type prediction mode includes at least a 2N*2N prediction mode and/or a BIDIR prediction mode, and the third-type prediction mode includes at least a 2N*N prediction mode, an N*2N prediction mode, a 2N*nD prediction mode, a 2N*nU prediction mode, an nR*2N prediction mode, and/or an nL*2N prediction mode.

[0049] There may be a plurality of prediction modes. For example, the High Efficiency Video Coding (HEVC) standard may contain 10 types of prediction modes, namely MERGE/SKIP/2N*2N/BIDIR/2N*N/N*2N/2N*nD/2N*nU/nR*2N/nL*2N. In the case where the reference index is selected separately in each prediction mode, the computational complexity increases significantly. Therefore, the prediction modes may be classified into a first-type prediction mode, a second-type prediction mode, and a third-type prediction mode, and the reference index of the image frame to be encoded may be determined successively, in order, for the first-type prediction mode, the second-type prediction mode, and the third-type prediction mode. The reference index for the second-type prediction mode may be generated by directly using the reference index for the first-type prediction mode, and the reference index for the third-type prediction mode may be generated by directly using the reference index for the first-type prediction mode and the reference index for the second-type prediction mode. In this way, repeated generation of the reference index is avoided, and the complexity of generating the reference index is reduced. According to this embodiment of the present disclosure, the prediction modes may be classified into three types, wherein the first-type prediction mode includes at least a MERGE prediction mode and/or a SKIP prediction mode, the second-type prediction mode includes at least a 2N*2N prediction mode and/or a BIDIR prediction mode, and the third-type prediction mode includes at least a 2N*N prediction mode, an N*2N prediction mode, a 2N*nD prediction mode, a 2N*nU prediction mode, an nR*2N prediction mode, and/or an nL*2N prediction mode.
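The three-way classification and the staged reuse of reference indices can be sketched as follows. The mode names are taken from the text; everything else is hypothetical. In particular, the three callables passed to `determine_reference_indices` are placeholders for the encoder's own derivation logic, which this paragraph does not specify.

```python
# Grouping of the ten prediction modes named in the text into three types.
PREDICTION_MODE_TYPES = {
    1: ("MERGE", "SKIP"),
    2: ("2N*2N", "BIDIR"),
    3: ("2N*N", "N*2N", "2N*nD", "2N*nU", "nR*2N", "nL*2N"),
}


def mode_type(mode: str) -> int:
    """Return 1, 2 or 3 for the type a prediction mode belongs to."""
    for type_id, modes in PREDICTION_MODE_TYPES.items():
        if mode in modes:
            return type_id
    raise ValueError(f"unknown prediction mode: {mode}")


def determine_reference_indices(derive_first, derive_second, derive_third):
    """Derive the reference index for each type in order, reusing earlier results.

    derive_first()               -> index for the first-type modes
    derive_second(first)         -> index for the second-type modes
    derive_third(first, second)  -> index for the third-type modes
    All three callables are assumed placeholders, not part of the source.
    """
    first = derive_first()
    second = derive_second(first)        # built from the first-type result
    third = derive_third(first, second)  # built from both earlier results
    return first, second, third
```

The point of the ordering is that each later stage starts from an already computed index instead of searching all reference frames again for every one of the ten modes.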

[0050] In step 206, a target matching block is obtained by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded.

[0051] According to this embodiment of the present disclosure, the inter-frame coding cost of each image frame is obtained to reflect the degree of redundancy in the time domain, and the intra-frame coding cost of each image frame is obtained to reflect the degree of redundancy in the spatial domain; the ratio of the intra-frame coding cost to the inter-frame coding cost of each image frame is used as the current encoding cost ratio; the first encoding cost ratio of the previous image frame and the second encoding cost ratio of the image frame before the previous image frame are obtained separately for each image frame, and the ratio variance is determined based on the current encoding cost ratio, the first encoding cost ratio, and the second encoding cost ratio; and in the case where it is determined, based on the current encoding cost ratio, the first encoding cost ratio, the second encoding cost ratio, and the ratio variance, that the current image frame is an extreme point, the current image frame is set as the long-term reference frame. Then the reference index of the image frame corresponding to the first-type prediction mode, the reference index of the image frame corresponding to the second-type prediction mode, and the reference index of the image frame corresponding to the third-type prediction mode are determined based on the normal reference frame and the long-term reference frame, successively according to the first-type prediction mode, the second-type prediction mode, and the third-type prediction mode. A target matching block is searched for based on the reference index, so as to encode the image frame to be encoded. As a result, the long-term reference frame is set accurately, encoding efficiency is improved, encoding complexity is reduced, and the image compression ratio is improved while image compression performance is ensured, thereby reducing bandwidth cost and improving user experience.

[0052] Embodiment 3

[0053] FIG. 3 is a flowchart of a method for encoding video based on a long-term reference frame according to Embodiment 3 of the present disclosure. This embodiment is described on the basis of the above embodiment; in it, the search range for the inter-prediction mode is reduced, so that encoding complexity is reduced while compression performance is ensured. As shown in FIG. 3, the method for encoding video based on a long-term reference frame provided by this embodiment of the present disclosure includes the following steps.

[0054] In step 301, the long-term reference frame is set according to the attribute information of the image frame.

[0055] In step 302, a coding macroblock of the image frame to be encoded is obtained according to the first-type prediction mode, and a first reference index corresponding to the image frame to be encoded in the first-type prediction mode is generated by combining the target reference index information of the left block, the upper left block, the upper block, the upper right block, and the lower left block of the coding macroblock with the target reference index information of a reference coding macroblock whose position corresponds to the coding macroblock in the time domain, wherein the reference coding macroblock is located in the normal reference frame and/or in the long-term reference frame.

[0056] A coded macroblock may be a set of pixels in the image frame to be encoded. For example, 64 pixels may form one coded macroblock, and the coded macroblock may be the smallest unit for encoding the image frame. A reference coded macroblock may be the smallest unit for encoding the long-term reference frame and/or the normal reference frame, and may likewise be formed by a plurality of pixels. The left block may be the coded macroblock located to the left of the current coded macroblock, and the other blocks may be the coded macroblocks located at the corresponding positions relative to the current coded macroblock.

[0057] When the reference index is obtained according to the first-type prediction mode, that is, the MERGE/SKIP prediction mode, motion estimation is not required. Motion compensation is performed directly on the basis of the inter-frame coding information of the neighboring blocks in the spatial domain and of the block whose position corresponds to the coded macroblock in the time domain, and the inter-frame coding information of the target macroblock with the minimum coding cost is finally determined as the inter-frame coding information of the coded macroblock. During motion compensation, the inter-frame coding information of the left block, the upper-left block, the upper block, the upper-right block, and the lower-left block of the coded macroblock may be used in the spatial domain, and the inter-frame coding information of the reference coded macroblock at the same relative position as the coded macroblock may be used in the time domain, where the reference frame in the time domain may include the normal reference frame and/or the long-term reference frame. The reference index of the reference coded macroblock with the optimal reference information may be determined as the target reference index information, for example, the frame-level index number of the coded macroblock with the minimum coding cost, or the frame-level index number of the coded macroblock in the normal reference frame and/or the long-term reference frame that is closest in time to the image frame to be encoded. The target reference index information of the left block, the upper-left block, the upper block, the upper-right block, and the lower-left block, and the target reference index information of the reference coded macroblock whose position corresponds to the coded macroblock in the time domain, may be stored in the first reference index. FIG. 4 is a simplified diagram of the position search for the coded macroblock in the first reference index according to Embodiment 3 of the present disclosure. As shown in FIG. 4, the reference frame and the image frame may be segmented into a plurality of reference coded macroblocks 11 and a plurality of coded macroblocks 21, respectively. The optimal reference indices of the spatially neighboring blocks of the coded macroblock 21 and of the reference coded macroblock whose position corresponds to the coded macroblock in the time domain may be merged to obtain the merged reference index RefMask_d, formed by combining the per-block indices RefMask_i, where d represents the depth of the coded block and RefMask_i represents the reference index of the (i+1)-th block. The first block is the left block of the current coded macroblock 21 in the image frame, the second block is the upper-left block, the third block is the upper block, the fourth block is the upper-right block, the fifth block is the lower-left block, and the sixth block is the reference coded macroblock whose position corresponds to the coded macroblock. In the HEVC standard, k=5, that is, there are a total of k+2 coded macroblocks, which include the left block 23, the upper-left block 24, the upper block 25, the upper-right block 26, and the lower-left block 22 of the current coded macroblock 21 in the image frame, and the reference coded macroblock 11 in the reference frame whose position corresponds to the coded macroblock. When the reference index is generated, the reference frame may be a forward reference frame or a backward reference frame. The reference index RefMask_i of the (i+1)-th block satisfies RefMask_i = (1 << RefForward_i) + ((1 << RefBackward_i) << 16), where RefForward_i represents the forward reference index of the (i+1)-th block (the frame-level index number of the forward reference frame) and RefBackward_i represents the backward reference index of the (i+1)-th block (the frame-level index number of the backward reference frame). The frame-level index number of the backward reference frame may be stored in the 16 most significant bits of the first reference index, and the frame-level index number of the forward reference frame may be stored in the 16 least significant bits. That is, when the reference frame containing the (i+1)-th block is a forward reference frame, RefMask_i is obtained by shifting 1 left by RefForward_i bits; when the reference frame containing the (i+1)-th block is a backward reference frame, RefMask_i is obtained by shifting 1 left by RefBackward_i bits and then by a further 16 bits. A logical OR is performed on the corresponding bit positions of the reference indices of the six blocks to obtain the first reference index of the coded macroblock 21, where "+" denotes the OR operation. When the reference frame containing the (i+1)-th block is a forward reference frame, the backward reference index of the (i+1)-th block is 0, and when the reference frame containing the (i+1)-th block is a backward reference frame, the forward reference index of the (i+1)-th block is 0.
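The bit layout described in paragraph [0057] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the encoder's actual code: the function names `ref_mask` and `merge_ref_masks` are hypothetical, `None` is an assumed way of marking "no reference in this direction", and the "+" of the text is implemented as a bitwise OR, as the paragraph states.

```python
from typing import Iterable, Optional


def ref_mask(ref_forward: Optional[int], ref_backward: Optional[int]) -> int:
    """Build RefMask_i for one block (hypothetical helper).

    A forward frame-level index sets one bit in the low 16 bits;
    a backward frame-level index sets one bit in the high 16 bits.
    None (an assumption of this sketch) leaves that half of the mask at 0.
    """
    mask = 0
    if ref_forward is not None:
        if not 0 <= ref_forward < 16:
            raise ValueError("forward index must fit in the low 16 bits")
        mask |= 1 << ref_forward
    if ref_backward is not None:
        if not 0 <= ref_backward < 16:
            raise ValueError("backward index must fit in the high 16 bits")
        mask |= (1 << ref_backward) << 16
    return mask


def merge_ref_masks(masks: Iterable[int]) -> int:
    """OR the per-block masks together to form the first reference index."""
    merged = 0
    for m in masks:
        merged |= m
    return merged


# Made-up example values for the six blocks: left, upper-left, upper,
# upper-right, lower-left, and the co-located block in the reference frame.
neighbour_masks = [
    ref_mask(0, None),
    ref_mask(0, None),
    ref_mask(1, None),
    ref_mask(None, 0),
    ref_mask(0, None),
    ref_mask(2, None),  # could, for instance, be the long-term reference frame
]
first_ref_index = merge_ref_masks(neighbour_masks)
```

Because each reference frame occupies one bit, the merged value records the set of frames used by any of the six blocks, so a later search only needs to test the bits that are set.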

[0058] In step 303, recursive encoding is performed on the coded macroblock of the image frame to be encoded, a parent merge reference index is determined based on the target reference index information of the parent coded block of the coded macroblock, and a child merge reference index is determined based on the target reference index information of the child coded block of the coded macroblock.

[0059] Recursive encoding is the process of repeatedly segmenting the coded macroblock. That is, during recursive encoding the coded macroblock may be segmented further. For example, a coded macroblock of size 64*64 may be segmented into four child coded macroblocks of size 32*32. During recursive encoding, the size of a child coded block varies with the segmentation depth. A child coded block may be generated by segmenting the coded macroblock, and a parent coded block may be a coded macroblock whose depth is at least one less than the depth of its child coded block. For example, for a child coded block with a depth of 3, the depth of the parent coded block may be 2, 1, or 0.
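The segmentation in paragraph [0059] can be illustrated with a short Python sketch. The names `split_block` and `parent_depths` and the `(x, y, size, depth)` tuple layout are assumptions made for this illustration. The child ordering (upper left, upper right, lower left, lower right) follows the ordering the description gives for the four child coded blocks.

```python
from typing import List, Tuple

# (x, y, size, depth): top-left corner, edge length in pixels, quadtree depth.
Block = Tuple[int, int, int, int]


def split_block(block: Block) -> List[Block]:
    """Split one square block into four children, one level deeper."""
    x, y, size, depth = block
    half = size // 2
    return [
        (x, y, half, depth + 1),                # first child: upper left
        (x + half, y, half, depth + 1),         # second child: upper right
        (x, y + half, half, depth + 1),         # third child: lower left
        (x + half, y + half, half, depth + 1),  # fourth child: lower right
    ]


def parent_depths(depth: int) -> List[int]:
    """Depths of all ancestors of a block at the given depth."""
    return list(range(depth - 1, -1, -1))


children = split_block((0, 0, 64, 0))
```

Calling `split_block` again on any child continues the recursion; `parent_depths(3)` lists the depths 2, 1, and 0 mentioned in the paragraph.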

[0060] According to this embodiment of the present disclosure, recursive encoding may be performed on the image frame to be encoded: the coded macroblock of the image frame is segmented further to generate child coded blocks of different depths, and the reference index of each child coded block is determined. For each child coded block, the reference index of the corresponding parent coded block and the reference index of the corresponding child coded block may be generated. For example, for a child coded block, the reference indices of the left, upper-left, upper, upper-right, and lower-left child blocks of the current child coded block, together with the reference index of the reference child coded block located at the same position in the time domain as the current child coded block, are obtained as the reference indices corresponding to the current child coded block. In this way, the child merge reference indices of all child coded blocks and the parent merge reference indices of all parent coded blocks of the current child coded block are obtained.

[0061] For example, when the depth d of the child coded block is not less than 1, the parent merge reference index of the parent coded blocks of the child coded block may be obtained by combining RefMask_j over the parent depths j (j < d), where RefMask_j represents the first reference index of the parent coded block of the child coded block at depth j. When recursion can continue downward from the child coded block, the reference index of each of its child coded blocks can be obtained after the recursion completes. The child merge reference index may be obtained by collecting the reference indices of the plurality of child coded blocks; more specifically, the child merge reference index satisfies $\mathrm{SplitRefSum} = \sum_{m=0}^{3} \mathrm{SplitRef}_m$, where SplitRef_m represents the reference index of the (m+1)-th child coded block, and the maximum value of m is 3, indicating that the coded macroblock is segmented into four child coded blocks.
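A minimal Python sketch of the two aggregations in paragraph [0061] follows. The function names are hypothetical, the combination is taken to be the same bitwise OR that the description assigns to "+", and the parent range j = 0 to d-1 is an assumption inferred from the statement that a parent's depth is at least one less than its child's.

```python
from functools import reduce
from operator import or_
from typing import Sequence


def parent_merge(ref_mask_by_depth: Sequence[int], d: int) -> int:
    """OR together RefMask_j for the assumed parent depths j = 0 .. d-1."""
    if d < 1:
        raise ValueError("a block at depth 0 has no parent")
    if d > len(ref_mask_by_depth):
        raise ValueError("missing RefMask_j for some parent depth")
    return reduce(or_, ref_mask_by_depth[:d], 0)


def child_merge(split_refs: Sequence[int]) -> int:
    """OR together SplitRef_0 .. SplitRef_3 of the four child coded blocks."""
    if len(split_refs) != 4:
        raise ValueError("expected exactly four child coded blocks")
    return reduce(or_, split_refs, 0)


# Made-up masks: RefMask_j at depths 0, 1, 2, and four child masks.
masks_by_depth = [0b001, 0b010, 0b100]
parent_sum = parent_merge(masks_by_depth, 2)
split_ref_sum = child_merge([1, 2, 1, 1 << 16])
```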

[0062] In step 304, the first reference index, the parent merge reference index, and the child merge reference index are determined as the second reference index corresponding to the image frame to be encoded in the second-type prediction mode.

[0063] The second reference index corresponding to the image frame to be encoded in the second-type prediction mode may be obtained by merging the first reference index, the parent merge reference index, and the child merge reference index. For example, the combination of the three may be determined as the second reference index RefMaskSum, in which SplitRefSum represents the child merge reference index and RefMask_d represents the first reference index. The merged second reference index RefMaskSum is used to restrict the search in the 2N*2N mode. The reference index at this point adaptively excludes or includes the long-term reference index, so that coding efficiency can be improved without increasing coding complexity.
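How such a merged mask might restrict a 2N*2N search can be sketched as below. This is an illustration only: `second_ref_index` and `allowed_refs` are hypothetical names, the merge is assumed to be a bitwise OR, and the 16-bit forward/backward split follows the layout described for RefMask_i.

```python
from typing import List, Tuple


def second_ref_index(ref_mask_d: int, parent_sum: int, split_ref_sum: int) -> int:
    """Combine the first, parent merge, and child merge reference indices."""
    return ref_mask_d | parent_sum | split_ref_sum


def allowed_refs(mask: int) -> Tuple[List[int], List[int]]:
    """Decode a mask into the forward and backward frame indices it permits."""
    forward = [i for i in range(16) if (mask >> i) & 1]
    backward = [i for i in range(16) if (mask >> (16 + i)) & 1]
    return forward, backward


ref_mask_sum = second_ref_index(0b001, 0b010, 1 << 16)
forward_refs, backward_refs = allowed_refs(ref_mask_sum)
```

A search loop would then visit only the frames in `forward_refs` and `backward_refs`. If the bit assigned to the long-term reference frame is not set in any contributing mask, that frame is skipped, which is one way to read "adaptively excludes or includes the long-term reference index".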

[0064] In step 305, a third reference index corresponding to the image frame to be encoded is looked up in a predetermined index configuration table based on the mode type of the third-type prediction mode.

[0065] The predetermined index configuration table may store the reference index information generated during recursive encoding, and may store each prediction mode in association with the reference index obtained by the method corresponding to that prediction mode.

[0066] According to this embodiment of the present disclosure, the corresponding reference index may be looked up according to the mode type of the third-type prediction mode. For example, FIG. 5 shows an example diagram of prediction modes according to Embodiment 3 of the present disclosure. As shown in FIG. 5, the prediction modes may include the 2N*N prediction mode, the N*2N prediction mode, the 2N*nD prediction mode, the 2N*nU prediction mode, the nR*2N prediction mode, and the nL*2N prediction mode.

[0067] Based on the above embodiment of the present disclosure, the correspondence between the mode type and the third reference index in the predetermined index configuration table includes at least one of the following: when the mode type is the 2N*N prediction mode, the upper index of the third reference index is determined based on the target reference index information of the first child coded block and of the second child coded block in the recursive encoding process, and the lower index of the third reference index is determined based on the target reference index information of the third child coded block and of the fourth child coded block in the recursive encoding process; when the mode type is the N*2N prediction mode, the left index of the third reference index is determined based on the target reference index information of the first child coded block and of the third child coded block in the recursive encoding process, and the right index of the third reference index is determined based on the target reference index information of the second child coded block and of the fourth child coded block in the recursive encoding process; when the mode type is the 2N*nD prediction mode, the upper index of the third reference index is equal to the second reference index, and the lower index of the third reference index is determined based on the target reference index information of the third child coded block and of the fourth child coded block in the recursive encoding process; when the mode type is the 2N*nU prediction mode, the upper index of the third reference index is determined based on the target reference index information of the first child coded block and of the second child coded block in the recursive encoding process, and the lower index of the third reference index is equal to the second reference index; when the mode type is the nR*2N prediction mode, the left index of the third reference index is equal to the second reference index, and the right index of the third reference index is determined based on the target reference index information of the second child coded block and of the fourth child coded block in the recursive encoding process; when the mode type is the nL*2N prediction mode, the left index of the third reference index is determined based on the target reference index information of the first child coded block and of the third child coded block in the recursive encoding process, and the right index of the third reference index is equal to the second reference index.

[0068] The first child coded block, the second child coded block, the third child coded block, and the fourth child coded block may be the four child coded blocks generated by segmenting the coded macroblock in the image frame to be encoded during recursive encoding, and the four child coded blocks may be denoted the first, second, third, and fourth child coded blocks in the order upper left, upper right, lower left, and lower right.

[0069] According to this embodiment of the present disclosure, in the third-type prediction mode the reference index may be obtained based on the position correspondence in the predetermined index configuration table, where the predetermined index configuration table may be shown as the following table:

[0070] UPPER may represent the upper index, LOWER may represent the lower index, LEFT may represent the left index, and RIGHT may represent the right index. SplitRef₀, SplitRef₁, SplitRef₂, SplitRef₃, and Ref2N*2N may respectively represent the target reference index information of the first child coded block, the target reference index information of the second child coded block, the target reference index information of the third child coded block, the target reference index information of the fourth child coded block, and the second reference index.
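The mode-to-index correspondence described above can be expressed as a small lookup, sketched here in Python. The function name `third_ref_index` is hypothetical, and "determined based on" two child blocks is assumed to mean the bitwise OR of their masks, consistent with how the other merged indices are built; the description itself does not spell out that operation for this table.

```python
from typing import Dict, Sequence, Tuple


def third_ref_index(mode: str, split_refs: Sequence[int], ref_2nx2n: int) -> Tuple[int, int]:
    """Return the pair of masks for the two partitions of a third-type mode.

    split_refs holds the masks of the first to fourth child coded blocks
    (upper left, upper right, lower left, lower right); ref_2nx2n is the
    second reference index. The pair is (upper, lower) for horizontal
    splits and (left, right) for vertical splits.
    """
    s0, s1, s2, s3 = split_refs
    table: Dict[str, Tuple[int, int]] = {
        "2N*N": (s0 | s1, s2 | s3),
        "N*2N": (s0 | s2, s1 | s3),
        "2N*nD": (ref_2nx2n, s2 | s3),
        "2N*nU": (s0 | s1, ref_2nx2n),
        "nR*2N": (ref_2nx2n, s1 | s3),
        "nL*2N": (s0 | s2, ref_2nx2n),
    }
    return table[mode]


# Made-up masks chosen so each child contributes a distinct bit.
example_children = (0b0001, 0b0010, 0b0100, 0b1000)
example_2nx2n = 0b1111
```

In the asymmetric modes, the larger partition takes the full second reference index while the smaller partition takes only the masks of the two child blocks it overlaps.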

[0071] In step 306, a target matching block is searched for, based on the reference index, in the normal reference frame and/or the long-term reference frame.

[0072] The normal reference frame may be a conventional short-term reference frame known in the prior art, and a short-term reference frame may be a frame adjacent to the image frame to be encoded. For the image frame to be encoded, when the target matching block is searched for based on the reference index, the search may be carried out in both the normal reference frame and the long-term reference frame, and the target matching block may be a coding macroblock, in the normal reference frame and/or the long-term reference frame, that resembles the image frame to be encoded. Different reference indexes may be used to search for the target matching block in different prediction modes. For example, the first reference index is used for the first type prediction mode, the second reference index for the second type prediction mode, and the third reference index for the third type prediction mode.
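The selection of a reference index by prediction mode type can be sketched as a simple dispatch. The mode names below follow those listed in the claims; the function name `reference_index_for_mode` and the representation of modes as strings are assumptions made only for this illustration.

```python
# Mode names as listed in the claims; grouping into three types follows the text.
FIRST_TYPE_MODES = {"MERGE", "SKIP"}
SECOND_TYPE_MODES = {"2N*2N", "BIDP"}
THIRD_TYPE_MODES = {"2N*N", "N*2N", "2N*nD", "2N*nU", "nR*2N", "nL*2N"}


def reference_index_for_mode(mode, first_index, second_index, third_index):
    """Pick the reference index that guides the search in the given mode."""
    if mode in FIRST_TYPE_MODES:
        return first_index
    if mode in SECOND_TYPE_MODES:
        return second_index
    if mode in THIRD_TYPE_MODES:
        return third_index
    raise ValueError(f"unknown prediction mode: {mode}")
```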

[0073] In step 307, compression encoding is performed on the image frame to be encoded based on the target matching block.

[0074] The image frame to be encoded can be compressed based on the relative position information of the target matching block and the image data in the target matching block, and the image frame to be encoded can be represented as the relative offsets of a plurality of target matching blocks, thereby achieving compression encoding of the image frame to be encoded.

[0075] According to this embodiment of the present disclosure, the long-term reference frame is set according to the attributes of the image frame. A coding macroblock of the image frame to be encoded, its neighboring coding macroblocks in the spatial domain, and its neighboring coding macroblock in the time domain are obtained according to the first type prediction mode, and the first reference index is generated from the target reference index information of the spatial-domain neighboring coding macroblocks and the target reference index information of the reference coding macroblock. Recursive encoding is performed on the coding macroblock; during the recursive encoding process a child merge reference index and a parent merge reference index are generated, and the second reference index used for the second type prediction mode is generated by combination. The third reference index for the third type prediction mode may be generated by querying the predetermined index configuration table using the mode type. The target matching block is searched for based on the reference index, and the image frame to be encoded is encoded based on the target matching block. In this way the long-term reference frame is properly searched and encoded, and the target matching block is searched for sequentially according to the prediction modes of the different types, which reduces the complexity of obtaining the reference index, narrows the search range, and improves encoding efficiency and user experience.

[0076] Based on the above embodiment of the disclosure, searching for the target matching block in the normal reference frame and/or the long-term reference frame based on the reference index includes: in the 2N*2N prediction mode, when it is determined that the normal reference frame and/or the long-term reference frame is not the target reference frame of the MERGE prediction mode, narrowing the search range for the target matching block in the normal reference frame and/or the long-term reference frame.

[0077] According to this embodiment of the present disclosure, when the target matching block is searched for in the 2N*2N prediction mode and the reference frame currently being searched is not the optimal reference frame determined in the MERGE prediction mode, the probability that the target matching block is found in the current reference frame is very low; the current reference frame may be a normal reference frame or a long-term reference frame. To improve search efficiency, the search range for the target matching block in the current reference frame can be narrowed, for example to 1/3 of the initial search range, with only one third of the reference frames among the normal reference frames and the long-term reference frames being selected for the search.
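The range narrowing can be sketched as follows. This is a minimal sketch, assuming that reference frames are identified by integer ids and that the search range is a single integer radius in pixels; the function name `search_range_for` and the parameter `shrink` are hypothetical. The factor of 3 matches the 1/3 example given above.

```python
def search_range_for(ref_frame_id, merge_best_ref_id, base_range, shrink=3):
    """Return the motion search range to use in 2N*2N mode for one reference frame.

    If the frame is the one that MERGE mode found optimal, keep the full range;
    otherwise a match is unlikely there, so shrink the range (1/3 by default).
    """
    if ref_frame_id == merge_best_ref_id:
        return base_range
    # Never go below one pixel so the search still examines some candidates.
    return max(1, base_range // shrink)
```

For example, with an initial range of 48 pixels, a reference frame other than the MERGE-optimal one would be searched within 16 pixels.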

[0078] Based on the above embodiment of the disclosure, searching for the target matching block in the normal reference frame and/or the long-term reference frame based on the reference index includes the step of: searching for the target matching block in the normal reference frame and/or the long-term reference frame based on a diamond search algorithm and the reference index when the attribute information of the image frame to be encoded includes B-frame information and non-key-frame information.

[0079] When the attribute information of the image frame to be encoded includes B-frame information and non-key-frame information, the image frame to be encoded may be a small B-frame. Such a frame does not play a prominent role in the video, so the diamond search algorithm can be used to speed up the search for the target matching block.
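The disclosure does not specify which variant of diamond search is used. The sketch below shows one common textbook form: a large diamond pattern is applied repeatedly until its center is the best candidate, followed by one refinement with a small diamond pattern. Frames are modeled as lists of rows of luma values, the matching cost is the sum of absolute differences (SAD), and all function names are hypothetical.

```python
# Candidate offsets (dx, dy); the center comes first so that ties keep the center.
LARGE_DIAMOND = [(0, 0), (0, -2), (0, 2), (-2, 0), (2, 0),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
SMALL_DIAMOND = [(0, 0), (0, -1), (0, 1), (-1, 0), (1, 0)]


def use_diamond_search(frame_type, is_key_frame):
    """Gate from the text: diamond search applies to non-key B-frames."""
    return frame_type == "B" and not is_key_frame


def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block of ref
    displaced by (dx, dy); infinite if the displaced block leaves the frame."""
    x0, y0 = bx + dx, by + dy
    if x0 < 0 or y0 < 0 or x0 + n > len(ref[0]) or y0 + n > len(ref):
        return float("inf")
    return sum(abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
               for j in range(n) for i in range(n))


def diamond_search(cur, ref, bx, by, n, max_steps=32):
    """Return the motion vector (dx, dy) found by diamond search."""
    mvx, mvy = 0, 0
    for _ in range(max_steps):
        step = min(LARGE_DIAMOND,
                   key=lambda d: sad(cur, ref, bx, by, mvx + d[0], mvy + d[1], n))
        if step == (0, 0):  # center is best: stop moving the large diamond
            break
        mvx, mvy = mvx + step[0], mvy + step[1]
    # Final refinement with the small diamond around the current center.
    step = min(SMALL_DIAMOND,
               key=lambda d: sad(cur, ref, bx, by, mvx + d[0], mvy + d[1], n))
    return mvx + step[0], mvy + step[1]
```

Diamond search examines far fewer candidates than an exhaustive search over the same range, at the cost of possibly stopping in a local minimum, which is an acceptable trade-off for frames that are not used heavily as references.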

[0080] Embodiment 4

[0081] FIG. 6 is a simplified block diagram of a device for encoding video based on a long-term reference frame according to Embodiment 4 of the present disclosure. The device may perform the method for encoding video based on a long-term reference frame provided in any embodiment of the present disclosure, and includes the functional modules needed to perform the method. The device may be implemented in hardware and/or software and includes a setting module 401, an index module 402, and an encoding module 403.

[0082] The setting module 401 is configured to set the long-term reference frame according to the attribute information of the image frame.

[0083] The index module 402 is configured to determine a reference index of the image frame to be encoded based on the normal reference frame and the long-term reference frame.

[0084] The encoding module 403 is configured to obtain a target matching block by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded.

[0085] According to this embodiment of the present disclosure, the setting module sets an image frame as the long-term reference frame according to the attribute information, the index module determines the reference index of the image frame to be encoded based on the long-term reference frame and the normal reference frame, and the encoding module obtains the target matching block for the image frame to be encoded by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded. The long-term reference frame is set in advance according to the attribute information of the image frame, so the long-term reference frame is set accurately. The reference index is determined based on the long-term reference frame and the target matching block is obtained, which reduces the computational overhead of the encoding process and lowers encoding complexity while preserving the compression efficiency of image encoding.
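The division of labor among the three modules can be sketched as a small pipeline. This is an illustrative skeleton only: the class name `LongTermReferenceEncoder`, the dictionary representation of frames, and the example function `set_long_term_refs` are assumptions. The example setting rule (key frames and scene switching frames become long-term reference frames) follows claim 2; the index and encoding modules are passed in as callables because their internals are described elsewhere in the disclosure.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class LongTermReferenceEncoder:
    """Hypothetical wiring of the three modules of Embodiment 4."""
    setting_module: Callable[[Sequence[dict]], list]   # module 401
    index_module: Callable[[dict, list, list], list]   # module 402
    encoding_module: Callable[[dict, list], bytes]     # module 403

    def encode(self, frames, frame_to_encode, normal_refs):
        # 401: choose long-term reference frames from frame attribute information.
        long_term_refs = self.setting_module(frames)
        # 402: derive the reference index from normal and long-term references.
        reference_index = self.index_module(frame_to_encode, normal_refs, long_term_refs)
        # 403: inter-prediction guided by the reference index, then encoding.
        return self.encoding_module(frame_to_encode, reference_index)


def set_long_term_refs(frames):
    """Example setting rule: key frames and scene switching frames (see claim 2)."""
    return [f["id"] for f in frames if f["type"] in ("key", "scene_switch")]
```

Passing the modules in as callables mirrors the statement that the device may be implemented in hardware and/or software: each module can be replaced without changing the overall flow.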

[0086] Embodiment 5

[0087] FIG. 7 is a simplified block diagram of a computing device for encoding video according to Embodiment 5 of the present disclosure. As shown in FIG. 7, the computing device includes a processor 50, a memory 51, input hardware 52, and output hardware 53. The device may have one or more processors 50; FIG. 7 uses one processor 50 as an example. The processor 50, the memory 51, the input hardware 52, and the output hardware 53 of the computing device may be connected to one another by a bus or in another manner; FIG. 7 uses a bus connection as an example.

[0088] As a computer-readable storage medium, the memory 51 may be configured to store one or more software programs, one or more computer-executable programs, and one or more modules, for example the modules of the device for encoding video according to Embodiment 4 of the present disclosure (that is, the setting module 401, the index module 402, and the encoding module 403). By loading and executing the one or more programs, instructions, or modules stored in the memory 51, the processor 50 performs the functional applications and data processing of the computing device, that is, it performs the above method for encoding video based on a long-term reference frame.

[0089] The memory 51 may include a program storage area and a data storage area. The program storage area may store an operating system and at least one application program required for a function; the data storage area may store data created through use of the terminal, and the like. In addition, the memory 51 may include high-speed random access memory and may further include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some examples, the memory 51 may further include memory located remotely from the processor 50, and the remote memory may be connected to the device over a network. Examples of such a network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

[0090] The input hardware 52 may be configured to receive numeric or text input and to generate key input signals related to a user configuring or controlling the device. The output hardware 53 may include a display device, for example a display screen.

[0091] Embodiment 6

[0092] Embodiment 6 of the present disclosure further provides a non-volatile storage medium storing one or more computer-executable instructions. The one or more instructions, when loaded and executed by a computer processor, cause the computer processor to perform a method for encoding video, the method comprising the steps of: setting a long-term reference frame according to attribute information of an image frame; determining a reference index of an image frame to be encoded based on a normal reference frame and the long-term reference frame; and obtaining a target matching block by performing inter-prediction based on the reference index, so as to encode the image frame to be encoded.

[0093] The computer-executable instructions stored on the storage medium provided in this embodiment of the present disclosure are not limited to performing the above method, and may further perform operations related to the method for encoding video provided in any embodiment of the present disclosure.

[0094] From the above descriptions of the embodiments, a person skilled in the art will clearly understand that the present disclosure can be implemented by software together with the necessary general-purpose hardware, or can be implemented by hardware alone. On this understanding, the technical solutions of the present disclosure, in essence or in the part that contributes over the prior art, may be embodied as a software product. The computer software product may be stored on a computer-readable storage medium, for example a floppy disk, read-only memory (ROM), random access memory (RAM), flash memory, a hard disk, or an optical disk of a computer, and may include several instructions that direct a computing device (which may be a personal computer, a server, a network device, or the like) to perform the methods according to the embodiments of the present disclosure.

[0095] It should be noted that, in the above embodiment of the device for encoding video, the units and modules it includes are divided only according to functional logic; the division is not limited to the one described, and other divisions may be used as long as the corresponding functions are implemented. In addition, the names of the functional units serve only to distinguish them from one another and are not intended to limit the scope of protection of the present disclosure. Furthermore, the term "and/or", when used with a list of elements, means any one of the elements or any combination of them, unless the context clearly indicates otherwise. For example, "X, Y and/or Z" means "X", "Y", "Z", "X and Y", "X and Z", "Y and Z", or "X, Y and Z".

Claims

1. A method for encoding video based on a long-term reference frame, comprising the steps:

setting the long-term reference frame according to the image frame attribute information;

determining a reference index of an image frame to be encoded based on the normal reference frame and the long-term reference frame; and

obtaining a target matching block by performing inter-prediction based on the reference index to perform encoding of an image frame to be coded;

wherein setting the long-term reference frame according to the image frame attribute information includes the steps of:

setting the long-term reference frame based on a degree of redundancy in the time domain and a degree of redundancy in the spatial domain of the image frame, wherein the degree of redundancy in the time domain indicates the extent to which the same coding macroblock is present in the image frame and its adjacent image frame, and the degree of redundancy in the spatial domain indicates the extent to which the same coding macroblock is present within the image frame.

2. The method of claim 1, wherein setting the long-term reference frame according to the image frame attribute information comprises the steps of:

extracting, for each image frame from the plurality of image frames, a frame type from attribute information corresponding to the image frame; and

setting an image frame corresponding to the frame type as a long-term reference frame in the case where the frame type is at least one of a key frame or a scene switching frame.

3. The method of claim 1, wherein setting the long-term reference frame based on the degree of redundancy in the time domain and the degree of redundancy in the spatial domain for the image frame comprises the steps of:

determining, for each image frame of a plurality of image frames, an inter-frame coding cost and an intra-frame coding cost, wherein the inter-frame coding cost reflects the degree of redundancy in the time domain and the intra-frame coding cost reflects the degree of redundancy in the spatial domain, the degree of redundancy in the time domain being inversely proportional to the inter-frame coding cost and the degree of redundancy in the spatial domain being inversely proportional to the intra-frame coding cost;

determining a current encoding cost ratio for the current image frame based on intra-frame coding cost and inter-frame coding cost values;

determining a first encoding cost ratio of the image frame preceding the current image frame and a second encoding cost ratio of the image frame preceding that previous image frame, and determining a ratio variance of the current encoding cost ratio, the first encoding cost ratio, and the second encoding cost ratio; and

setting the current image frame as a long-term reference frame when it is determined that the second encoding cost ratio, the first encoding cost ratio, and the current encoding cost ratio decrease sequentially and the ratio variance is less than a threshold value.

4. The method according to any one of claims 1-3, wherein determining a reference index of an image frame to be encoded based on a normal reference frame and a long-term reference frame includes the steps of:

determining, based on the normal reference frame and the long-term reference frame, and sequentially according to the first type prediction mode, the second type prediction mode, and the third type prediction mode, the reference index corresponding to the image frame to be encoded, wherein

the first type prediction mode at least includes a MERGE prediction mode or a SKIP prediction mode, a second type prediction mode at least includes a 2N*2N prediction mode or a BIDP prediction mode, and a third type prediction mode at least includes a 2N*N prediction mode, N*2N prediction mode, 2N*nD prediction mode, 2N*nU prediction mode, nR*2N prediction mode, or nL*2N prediction mode.

5. The method of claim 4, wherein determining, based on the normal reference frame and the long-term reference frame and sequentially according to the first type prediction mode, the second type prediction mode, and the third type prediction mode, the reference index corresponding to the image frame to be encoded, comprises the steps of:

obtaining a coding macroblock of the image frame to be encoded according to the first type prediction mode, and generating a first reference index corresponding to the image frame to be encoded in the first type prediction mode by combining the target reference index information of the left block, the upper left block, the upper block, the upper right block, and the lower left block of the coding macroblock and of a reference coding macroblock whose position corresponds to the coding macroblock in the time domain, the reference coding macroblock being located in at least one of the normal reference frame or the long-term reference frame;

performing recursive encoding on the coding macroblock of the image frame to be encoded, determining a parent merge reference index based on the target reference index information of the parent coding block in the coding macroblock, and determining a child merge reference index based on the target reference index information of the child coding block in the coding macroblock;

determining the first reference index, the parent merge reference index, and the child merge reference index as a second reference index corresponding to the image frame to be encoded in the second type prediction mode; and

searching for a third reference index corresponding to the image frame to be encoded in a predetermined index configuration table based on the mode type of the third type prediction mode.
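
By way of non-limiting illustration, the combination of reference index information recited in claim 5 might be sketched in Python as follows. The sketch assumes that each reference index is represented as a set of integer positions in the reference list, that an unavailable neighboring block is represented by None, and that "combining" means taking a set union. The function names are hypothetical and do not appear in the claims.

```python
from typing import Iterable, Optional, Set


def first_reference_index(spatial_neighbors: Iterable[Optional[int]],
                          temporal_colocated: Optional[int]) -> Set[int]:
    """Combine the target reference indices of the five spatial neighbors
    (left, upper left, upper, upper right, lower left) with that of the
    temporally co-located reference macroblock (assumed: set union)."""
    candidates = list(spatial_neighbors) + [temporal_colocated]
    # Unavailable neighbors (None) contribute nothing.
    return {idx for idx in candidates if idx is not None}


def second_reference_index(first: Set[int],
                           parent_merge: Set[int],
                           child_merge: Set[int]) -> Set[int]:
    """The second reference index gathers the first reference index and the
    parent and child merge reference indices (assumed: set union)."""
    return first | parent_merge | child_merge


# Example: two neighbors are unavailable; the co-located block uses index 2.
first = first_reference_index([0, None, 1, 0, None], temporal_colocated=2)
second = second_reference_index(first, parent_merge={3}, child_merge={1})
```

Under these assumptions, the candidate set examined in the second type prediction mode is never smaller than the one examined in the first type prediction mode.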

6. The method of claim 5, wherein the mapping relationship between the mode type and the third reference index in the predetermined index configuration table includes at least one of the following:

in the case where the mode type is the 2N*N prediction mode, the upper index of the third reference index is determined based on the target reference index information of the first child encoded block and the target reference index information of the second child encoded block in the recursive encoding process, and the lower index of the third reference index is determined based on the target reference index information of the third child encoded block and the target reference index information of the fourth child encoded block in the recursive encoding process;

in the case where the mode type is the N*2N prediction mode, the left index of the third reference index is determined based on the target reference index information of the first child encoded block and the target reference index information of the third child encoded block in the recursive encoding process, and the right index of the third reference index is determined based on the target reference index information of the second child encoded block and the target reference index information of the fourth child encoded block in the recursive encoding process;

in the case where the mode type is the 2N*nD prediction mode, the upper index of the third reference index is equal to the second reference index, and the lower index of the third reference index is determined based on the target reference index information of the third child encoded block and the target reference index information of the fourth child encoded block in the recursive encoding process;

in the case where the mode type is the 2N*nU prediction mode, the upper index of the third reference index is determined based on the target reference index information of the first child encoded block and the target reference index information of the second child encoded block in the recursive encoding process, and the lower index of the third reference index is equal to the second reference index;

in the case where the mode type is the nR*2N prediction mode, the left index of the third reference index is equal to the second reference index, and the right index of the third reference index is determined based on the target reference index information of the second child encoded block and the target reference index information of the fourth child encoded block in the recursive encoding process; and

in the case where the mode type is the nL*2N prediction mode, the left index of the third reference index is determined based on the target reference index information of the first child encoded block and the target reference index information of the third child encoded block in the recursive encoding process, and the right index of the third reference index is equal to the second reference index,

wherein the first child encoded block, the second child encoded block, the third child encoded block, and the fourth child encoded block are respectively the top left, the top right, the bottom left, and the bottom right child encoded block among the four child encoded blocks created by segmenting the encoded macroblock during the recursive encoding process.
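
By way of non-limiting illustration, the index configuration table of claim 6 might be expressed as a lookup table, sketched here in Python. The sketch assumes that reference indices are sets of integers, that child blocks are numbered 0 to 3 (top left, top right, bottom left, bottom right), and that "determined based on" two child blocks means the union of their indices. The names INDEX_CONFIG, SECOND, and third_reference_index are hypothetical.

```python
from typing import Dict, List, Set, Tuple, Union

SECOND = "second"  # marker: reuse the second reference index for this part
Source = Union[str, Tuple[int, int]]

# mode type -> (source of upper/left index, source of lower/right index)
INDEX_CONFIG: Dict[str, Tuple[Source, Source]] = {
    "2N*N":  ((0, 1), (2, 3)),   # upper, lower
    "N*2N":  ((0, 2), (1, 3)),   # left, right
    "2N*nD": (SECOND, (2, 3)),   # upper, lower
    "2N*nU": ((0, 1), SECOND),   # upper, lower
    "nR*2N": (SECOND, (1, 3)),   # left, right
    "nL*2N": ((0, 2), SECOND),   # left, right
}


def third_reference_index(mode: str,
                          child_refs: List[Set[int]],
                          second_ref: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Look up both parts of the third reference index for a mode type."""

    def resolve(source: Source) -> Set[int]:
        if source == SECOND:
            return set(second_ref)
        # Assumed combination rule: union over the listed child blocks.
        return set().union(*(child_refs[i] for i in source))

    first_part, second_part = INDEX_CONFIG[mode]
    return resolve(first_part), resolve(second_part)


# Example: each child block found a different best reference index.
children = [{0}, {1}, {2}, {3}]
upper, lower = third_reference_index("2N*N", children, second_ref={9})
```

Keeping the mapping in a table means the asymmetric modes differ from the symmetric ones only in their table entries, not in control flow.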

7. The method according to claim 1 or 5, wherein obtaining the target matching block by performing inter-prediction based on the reference index to encode the image frame to be encoded comprises the steps of:

searching for the target matching block based on the reference index in at least one of the normal reference frame or the long-term reference frame; and

performing compression encoding on the image frame to be encoded based on the target matching block.

8. The method of claim 7, wherein searching for the target matching block based on the reference index in at least one of the normal reference frame or the long-term reference frame comprises the step of:

in the 2N*2N prediction mode, in the case where it is determined that at least one of the normal reference frame or the long-term reference frame is not the target reference frame in the MERGE prediction mode, narrowing the search range of the target matching block in the at least one of the normal reference frame or the long-term reference frame.
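
By way of non-limiting illustration, the narrowing of the search range in claim 8 might be sketched in Python as follows. The claim does not state by how much the range is narrowed; the values 16 and 4, the use of integer frame identifiers, and the function name search_range are assumptions made only for this sketch.

```python
def search_range(mode: str,
                 frame_id: int,
                 merge_target_frame_id: int,
                 full_range: int = 16,
                 narrowed_range: int = 4) -> int:
    """Return the motion search range (in samples) to use in a reference frame.

    In the 2N*2N prediction mode, a reference frame that was not selected as
    the target reference frame in the MERGE prediction mode is searched with
    a narrowed range; otherwise the full range is used.
    """
    if mode == "2N*2N" and frame_id != merge_target_frame_id:
        return narrowed_range
    return full_range


# Example: frame 1 was not the MERGE target (frame 0), so its range shrinks.
narrowed = search_range("2N*2N", frame_id=1, merge_target_frame_id=0)
```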

9. The method of claim 7, wherein searching for the target matching block based on the reference index in at least one of the normal reference frame or the long-term reference frame comprises the step of:

searching, using a diamond search algorithm, for the target matching block in at least one of the normal reference frame or the long-term reference frame in the case where the attribute information of the image frame to be encoded includes B-frame information and non-key-frame information.
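
By way of non-limiting illustration, the diamond search of claim 9 might be sketched in Python as follows. The claim names the algorithm but gives no details, so the sketch uses the commonly described two-stage form (a large diamond pattern repeated until the center is best, then one small diamond refinement) with the sum of absolute differences as the matching cost. The block size, step limit, frame representation, and all function names are assumptions.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # luma samples, indexed as frame[y][x]

# Offsets (dx, dy) of the two search patterns, center excluded.
LARGE_DIAMOND = [(0, -2), (0, 2), (-2, 0), (2, 0),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
SMALL_DIAMOND = [(0, -1), (0, 1), (-1, 0), (1, 0)]


def use_diamond_search(frame_type: str, is_key_frame: bool) -> bool:
    """Condition of claim 9: a B-frame that is not a key frame."""
    return frame_type == "B" and not is_key_frame


def sad(cur: Frame, ref: Frame, bx: int, by: int,
        dx: int, dy: int, size: int) -> Optional[int]:
    """Sum of absolute differences; None if the candidate leaves the frame."""
    x0, y0 = bx + dx, by + dy
    if x0 < 0 or y0 < 0 or y0 + size > len(ref) or x0 + size > len(ref[0]):
        return None
    return sum(abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
               for j in range(size) for i in range(size))


def diamond_search(cur: Frame, ref: Frame, bx: int, by: int,
                   size: int = 4,
                   max_steps: int = 16) -> Tuple[Tuple[int, int], int]:
    """Return ((dx, dy), cost) of the best match found for the block at (bx, by)."""
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, size)
    if best_cost is None:
        raise ValueError("block must lie inside the reference frame")

    def refine(center: Tuple[int, int], pattern: List[Tuple[int, int]]) -> None:
        nonlocal best, best_cost
        for ox, oy in pattern:
            cand = (center[0] + ox, center[1] + oy)
            cost = sad(cur, ref, bx, by, cand[0], cand[1], size)
            if cost is not None and cost < best_cost:
                best, best_cost = cand, cost

    # Stage 1: move the large diamond until its center is the best point.
    for _ in range(max_steps):
        center = best
        refine(center, LARGE_DIAMOND)
        if best == center:
            break
    # Stage 2: one refinement pass with the small diamond.
    refine(best, SMALL_DIAMOND)
    return best, best_cost


# Synthetic example: a horizontal gradient shifted by two samples.
ref = [[10 * x for x in range(16)] for _ in range(16)]
cur = [[10 * (x + 2) for x in range(16)] for _ in range(16)]
if use_diamond_search("B", is_key_frame=False):
    mv, cost = diamond_search(cur, ref, bx=6, by=6)  # mv == (2, 0), cost == 0
```

Diamond search evaluates only a handful of candidates per step, which is consistent with reserving it for frames on which other frames depend less, such as non-key B-frames; being a local search, it may settle on a local minimum in textured content.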

10. A device for encoding video, comprising:

a setting module configured to set the long-term reference frame according to the attribute information of the image frame;

an index module configured to determine a reference index of an image frame to be encoded based on a normal reference frame and the long-term reference frame; and

an encoding module configured to obtain a target matching block by performing inter-prediction based on the reference index to encode the image frame to be encoded;

wherein the setting module is further configured to set the long-term reference frame based on the degree of redundancy in the time domain and the degree of redundancy in the spatial domain for the image frame, wherein the degree of redundancy in the time domain indicates the degree to which the same encoded macroblock is present in the image frame and in its adjacent image frame, and the degree of redundancy in the spatial domain indicates the degree to which the same encoded macroblock is present within the image frame.

11. A computing device for encoding video based on a long-term reference frame, comprising:

one or more processors; and

a memory configured to store one or more programs, wherein

the one or more programs, when loaded and executed by the one or more processors, cause the one or more processors to execute the method for encoding video based on a long-term reference frame as defined in any one of claims 1-9.

12. A non-volatile computer-readable storage medium that stores one or more computer programs, wherein the one or more computer programs, when loaded and executed by a processor of a computing device, cause the computing device to execute the method for encoding video based on a long-term reference frame as defined in any one of claims 1-9.