WO2021168624A1 - Video image coding method and device, and movable platform

Info

Publication number: WO2021168624A1
Application number: PCT/CN2020/076469
Authority: WO
Inventors: 邱孟品; 赵文军
Original assignee: 深圳市大疆创新科技有限公司
Priority date: 2020-02-24
Filing date: 2020-02-24
Publication date: 2021-09-02
Also published as: CN112204971A

Abstract

A video image coding method and device, and a movable platform. The method comprises: dividing each of multiple channels of an image frame in a video code stream to obtain a coding block to be processed; calculating a division cost corresponding to performing block division on said coding block in each of preset division modes; determining a target division mode and performing block division on said coding block according to the target division mode to obtain prediction blocks; and calculating a prediction cost corresponding to predicting each prediction block of said coding block in each of preset prediction modes, so as to determine a target prediction mode and predict each prediction block of said coding block according to the target prediction mode. The method can reduce data dependence and the accumulation of data errors, facilitates parallel data processing in hardware, and reduces hardware implementation resources; because no reconstruction is needed during quantization, the coding speed is improved.

Description

Video image coding method, equipment and movable platform

Technical field

The embodiments of the present application relate to coding technology, and in particular to a video image coding method, device, and movable platform.

Background art

With the development of information technology, video image services such as high-definition television, web conferencing, interactive Internet television (IPTV), and three-dimensional (3D) television have developed rapidly. Because of their intuitiveness and efficiency, video image signals have become the most important way for people to obtain information in daily life. Because a video image signal contains a large amount of data, it occupies a large amount of transmission bandwidth and storage space. For effective transmission and storage, video image signals need to be compressed and encoded.

In related technologies, differential pulse-code modulation (DPCM), a simple prediction method, is widely used in image compression coding. DPCM generally predicts a pixel from the already encoded pixels to its left or above it.

However, because DPCM predicts from adjacent pixels, it creates data dependence, is prone to cumulative errors, and is difficult to process in parallel in hardware, which in turn leads to higher coding error rates and slower coding speeds.

Summary of the invention

The embodiments of the present application provide a video image encoding method, device, and movable platform to overcome at least one of the above-mentioned problems.

In the first aspect, an embodiment of the present application provides a video image encoding method, including:

Dividing each of the multiple channels of an image frame in a video code stream to obtain a coding block to be processed;

Calculating the division cost corresponding to performing block division on the coding block to be processed in each of the preset division modes;

Selecting a target division mode from the preset division modes according to the division cost, and performing block division on the coding block to be processed according to the target division mode to obtain prediction blocks;

Calculating the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes;

Selecting a target prediction mode from the preset prediction modes according to the prediction cost, and predicting each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual;

Performing encoding according to the prediction residual.

In a second aspect, an embodiment of the present application provides a video image encoding device, including a memory, a processor, and computer-executable instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer-executable instructions:

Encoding is performed according to the prediction residual.

In the third aspect, an embodiment of the present application provides a movable platform, including:

The movable platform body; and

The video image encoding device described in the second aspect and the various possible designs of the second aspect, where the video image encoding device is installed on the movable platform body.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium that stores computer-executable instructions. When a processor executes the computer-executable instructions, the video image encoding method described in the first aspect and the various possible designs of the first aspect is implemented.

In the video image encoding method, device, and movable platform provided by the embodiments of the present application, each of the multiple channels of an image frame in a video code stream is divided to obtain a coding block to be processed. The division cost corresponding to performing block division on the coding block to be processed in each of the preset division modes is then calculated, a target division mode is determined, and the coding block to be processed is divided according to the target division mode to obtain prediction blocks. Next, the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes is calculated, a target prediction mode is determined, and each prediction block of the coding block to be processed is predicted according to the target prediction mode. This reduces data dependence and the accumulation of data errors, facilitates parallel data processing in hardware, and reduces hardware implementation resources; because no reconstruction is needed during quantization, the coding speed is improved.

Description of the drawings

The drawings herein are incorporated into the specification and constitute a part of the specification, show embodiments that conform to the application, and are used together with the specification to explain the principle of the application.

FIG. 1 is a schematic diagram of the architecture of a video image coding system provided by an embodiment of the application;

FIG. 2 is a schematic flowchart of a video image encoding method provided by an embodiment of the application;

FIG. 3 is a schematic diagram of dividing each of multiple channels of an image frame in a video code stream according to an embodiment of the application;

FIG. 4 is a schematic diagram of a preset division mode provided by an embodiment of the application;

FIG. 5 is a schematic flowchart of another video image encoding method provided by an embodiment of the application;

FIG. 6 is a schematic flowchart of still another video image encoding method provided by an embodiment of this application;

FIG. 7 is a schematic diagram of a first prediction mode provided by an embodiment of the application;

FIG. 8 is a schematic flowchart of another video image encoding method provided by an embodiment of the application;

FIG. 9 is a schematic diagram of a second prediction mode provided by an embodiment of the application;

FIG. 10 is a schematic flowchart of another video image encoding method provided by an embodiment of this application;

FIG. 11 is a schematic structural diagram of a video image encoding device provided by an embodiment of this application;

FIG. 12 is a schematic diagram of the hardware structure of a video image encoding device provided by an embodiment of the application;

FIG. 13 is a schematic structural diagram of a movable platform provided by an embodiment of the application.

Through the above drawings, the specific embodiments of the present application have been shown, which will be described in more detail later. These drawings and text description are not intended to limit the scope of the concept of the present application in any way, but to explain the concept of the present application for those skilled in the art by referring to specific embodiments.

Detailed description

The exemplary embodiments will be described in detail here, and examples thereof are shown in the accompanying drawings. When the following description refers to the accompanying drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The implementation manners described in the following exemplary embodiments do not represent all implementation manners consistent with the present application. On the contrary, they are merely examples of devices and methods consistent with some aspects of the application as detailed in the appended claims.

First, the terms involved in this application are explained:

Video image coding: usually refers to processing a sequence of pictures that form a video or video sequence. In the field of video coding, the terms "picture", "frame", and "image" can be used as synonyms. The video encoding in this application is performed on the source side and usually includes processing (for example, compressing) the original video picture to reduce the amount of data required to represent the video picture, so that it can be stored and/or transmitted more efficiently. Video decoding is performed on the destination side and usually involves inverse processing relative to the encoder to reconstruct the video picture.

Each picture of a video sequence is usually divided into a set of non-overlapping blocks and is usually coded at the block level. In other words, the encoder side usually processes, that is, encodes, the video at the block (video block) level. For example, a prediction block is generated by prediction, the prediction block is subtracted from the current block (the block currently being processed or to be processed) to obtain a residual block, and the residual block is transformed in the transform domain and quantized to reduce the amount of data to be transmitted (compression). The decoder side applies the inverse processing relative to the encoder to the encoded or compressed block to reconstruct the current block for representation. In addition, the encoder duplicates the decoder processing loop so that the encoder and decoder generate the same predictions and/or reconstructions for processing, that is, encoding, subsequent blocks.

In related technologies, differential pulse-code modulation (DPCM), a simple prediction method, is widely used in image compression coding. DPCM generally predicts a pixel from the already encoded pixels to its left or above it. However, because DPCM predicts from adjacent pixels, it creates data dependence, is prone to cumulative errors, and is difficult to process in parallel in hardware, which in turn leads to higher encoding error rates and slower encoding speeds.
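The dependence described above can be illustrated with a minimal Python sketch of left-neighbor DPCM. The function names, the initial predictor of 0, and the uniform quantization step are illustrative assumptions and are not taken from this application.

```python
def dpcm_encode_row(row, step=1):
    """Left-neighbor DPCM over one row of pixels (illustrative sketch).

    Assumptions: the first pixel is predicted from 0, and residuals are
    quantized uniformly with `step`.
    """
    predictor = 0
    levels = []
    for pixel in row:
        residual = pixel - predictor
        level = round(residual / step)
        levels.append(level)
        # The next prediction needs this pixel's reconstructed value,
        # so pixels must be processed strictly one after another.
        predictor = predictor + level * step
    return levels


def dpcm_decode_row(levels, step=1):
    """Inverse of dpcm_encode_row: accumulate the dequantized residuals."""
    predictor = 0
    row = []
    for level in levels:
        predictor = predictor + level * step
        row.append(predictor)
    return row
```

Because every decoded pixel is the running sum of all earlier residuals, a single corrupted level shifts every pixel that follows it, and the loop cannot be split across parallel hardware units without breaking the prediction chain.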

Therefore, in view of the above problems, the present application provides a video image coding method. The method obtains the coding block to be processed by dividing each of the multiple channels of an image frame in a video code stream. It then calculates the division cost corresponding to performing block division on the coding block to be processed in each of the preset division modes, determines a target division mode, and divides the coding block to be processed according to the target division mode to obtain prediction blocks. Next, it calculates the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes, determines a target prediction mode, and predicts each prediction block of the coding block to be processed according to the target prediction mode. This reduces data dependence and the accumulation of data errors, facilitates parallel data processing in hardware, and reduces hardware implementation resources; because no reconstruction is needed during quantization, the coding speed is improved.

The video image encoding method provided by this application can be applied to the video image encoding system architecture shown in FIG. 1. As shown in FIG. 1, the video image encoding system 10 includes a source device 12 and a target device 14. The source device 12 includes a picture acquisition device 121, a preprocessor 122, an encoder 123, and a communication interface 124. The target device 14 includes a display device 141, a processor 142, a decoder 143, and a communication interface 144. The source device 12 sends the encoded data 13 obtained by encoding to the target device 14. The method of this application is applied to the encoder 123.

The source device 12 may be referred to as a video encoding device or video encoding apparatus. The target device 14 may be referred to as a video decoding device or video decoding apparatus. The source device 12 and the target device 14 may be examples of video coding devices or video coding apparatuses.

The source device 12 and the target device 14 may include any of a variety of devices, including any type of handheld or stationary device, for example, a notebook or laptop computer, mobile phone, smart phone, tablet or tablet computer, video camera, desktop computer, set-top box, television, display device, digital media player, video game console, video streaming device (such as a content service server or content distribution server), broadcast receiver device, or broadcast transmitter device, and may use no operating system or any type of operating system.

In some cases, source device 12 and target device 14 may be equipped for wireless communication. Therefore, the source device 12 and the target device 14 may be wireless video and image encoding devices.

The video image encoding system 10 shown in FIG. 1 is only an example, and the technology of this application can be applied to video coding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding and decoding devices. In other instances, the data can be retrieved from local storage, streamed over a network, and so on. The video encoding device can encode data and store the data in a memory, and/or the video decoding device can retrieve the data from the memory and decode the data. In some instances, encoding and decoding are performed by devices that do not communicate with each other but only encode data to the memory and/or retrieve data from the memory and decode the data.

In some cases, the encoder 123 of the video image encoding system 10 may also be referred to as a video encoder, and the decoder 143 may also be referred to as a video decoder.

In some cases, the picture acquisition device 121 may include or may be any type of picture capture device, for example, for capturing real-world pictures, and/or any type of picture generating device (for screen content coding, some text on the screen is also considered part of the picture or image to be encoded), for example, a computer graphics processor for generating computer animation pictures, or any type of device for obtaining and/or providing real-world pictures or computer animation pictures (for example, screen content or virtual reality (VR) pictures), and/or any combination thereof (for example, augmented reality (AR) pictures). A picture is or can be regarded as a two-dimensional array or matrix of sampling points with brightness values. The sampling points in the array may also be called pixels or pels (short for picture elements). The number of sampling points of the array in the horizontal and vertical directions (or axes) defines the size and/or resolution of the picture. To represent colors, three color components are usually used, that is, a picture can be represented as or contain three sample arrays. In the RGB format or color space, a picture includes corresponding red, green, and blue sample arrays. However, in video coding, each pixel is usually expressed in a luminance/chrominance format or color space, for example, YCbCr, which includes the luminance (luma) component indicated by Y (sometimes indicated by L) and the two chrominance (chroma) components indicated by Cb and Cr. The luminance component Y represents luminance or gray-level intensity (for example, the two are the same in a grayscale picture), and the two chrominance components Cb and Cr represent the chrominance or color information. Correspondingly, a picture in the YCbCr format includes a luminance sample array of the luminance component (Y) and two chrominance sample arrays of the chrominance components (Cb and Cr). Pictures in RGB format can be converted or transformed to the YCbCr format, and vice versa. This process is also called color transformation or conversion.
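As an illustration of the color conversion mentioned above, the following Python sketch converts one 8-bit RGB pixel to YCbCr. This application does not state which conversion matrix is used; the full-range BT.601 coefficients (as used in JFIF) and the function name rgb_to_ycbcr are assumptions chosen for illustration only.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr.

    Assumption: full-range BT.601 coefficients (the JFIF convention).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(value):
        # Round to the nearest integer and keep the result in [0, 255].
        return max(0, min(255, int(round(value))))

    return clamp(y), clamp(cb), clamp(cr)
```

For any gray pixel (equal R, G, and B) the two chroma outputs sit at the mid-level 128, which matches the statement that the luminance component alone carries the gray-level intensity.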

In addition, the picture acquisition device 121 may be, for example, a camera for capturing pictures, a memory such as a picture memory that includes or stores previously captured or generated pictures, and/or any type of (internal or external) interface for acquiring or receiving pictures. The camera may be, for example, a local camera or a camera integrated in the source device, and the memory may be, for example, a local memory or a memory integrated in the source device. The interface may be, for example, an external interface for receiving pictures from an external video source. The external video source is, for example, an external picture capturing device such as a camera, an external memory, or an external picture generating device; the external picture generating device is, for example, an external computer graphics processor, computer, or server. In addition, the interface can be any type of interface, such as a wired interface, a wireless interface, or an optical interface, according to any proprietary or standardized interface protocol. The interface for acquiring the picture data 125 in FIG. 1 may be the same interface as the communication interface 124 or a part of the communication interface 124. The picture data 125 (for example, video data) may be referred to as the original picture or original picture data.

In some cases, the pre-processor 122 is used to receive the picture data 125 and perform pre-processing on the picture data 125 to obtain a pre-processed picture (or pre-processed picture data) 126. The preprocessing performed by the preprocessor 122 may include trimming, color format conversion (for example, conversion from RGB to YCbCr), toning or denoising. It can be understood that the pre-processor 122 may be an optional component.

In some cases, the encoder 123 (eg, a video encoder) is used to receive pre-processed pictures (or pre-processed picture data) 126 and provide encoded picture data 127.

In some cases, the communication interface 124 of the source device 12 can be used to receive the encoded picture data 127 and transmit it to another device, for example, the target device 14 or any other device, for storage or direct reconstruction, or to process the encoded picture data 127 before correspondingly storing the encoded data 13 and/or transmitting the encoded data 13 to another device, such as the target device 14 or any other device, for decoding or storage. The communication interface 144 of the target device 14 is used, for example, to directly receive the encoded picture data 127 or the encoded data 13 from the source device 12 or any other source. The other source is, for example, a storage device, such as an encoded picture data storage device.

The communication interface 124 and the communication interface 144 can be used to transmit or receive the encoded picture data 127 or the encoded data 13 through a direct communication link between the source device 12 and the target device 14 or through any type of network. The direct communication link is, for example, a direct wired or wireless connection, and the network is, for example, a wired or wireless network or any combination thereof, or any type of private or public network or any combination thereof. The communication interface 124 may be used, for example, to encapsulate the encoded picture data 127 into a suitable format, such as packets, for transmission over a communication link or communication network. The communication interface 144, which forms the counterpart of the communication interface 124, may be used, for example, to decapsulate the encoded data 13 to obtain the encoded picture data 127. Both the communication interface 124 and the communication interface 144 can be configured as one-way communication interfaces, as indicated by the arrow for the encoded picture data 127 pointing from the source device 12 to the target device 14 in FIG. 1, or as two-way communication interfaces, and can be used, for example, to send and receive messages to establish a connection and to confirm and exchange any other information related to the communication link and/or the data transmission, such as the transmission of encoded picture data.

In some cases, the decoder 143 is used to receive encoded picture data 127 and provide decoded picture data (or decoded picture) 145.

In some cases, the processor 142 of the target device 14 is used to post-process the decoded picture data (or decoded picture) 145 to obtain post-processed picture data 146, for example, a post-processed picture. The post-processing performed by the processor 142 may include, for example, color format conversion (for example, conversion from YCbCr to RGB), toning, trimming, or resampling, or any other processing for preparing the decoded picture data 145 for display by, for example, the display device 141.

In some cases, the display device 141 of the target device 14 is used to receive the post-processed picture data 146 and display the picture to, for example, a user or viewer. The display device 141 may be or may include any type of display for presenting the reconstructed picture, for example, an integrated or external display or monitor. For example, the display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other type of display.

In addition, although FIG. 1 depicts the source device 12 and the target device 14 as separate devices, a device embodiment may also include both the source device 12 and the target device 14 or the functionality of both, that is, the source device 12 or its corresponding functionality and the target device 14 or its corresponding functionality. In such embodiments, the source device 12 or its corresponding functionality and the target device 14 or its corresponding functionality may be implemented using the same hardware and/or software, separate hardware and/or software, or any combination thereof. The existence and (exact) division of the functionality of the different units, or of the functionality of the source device 12 and/or the target device 14 shown in FIG. 1, may vary according to the actual device and application.

In some cases, both the encoder 123 (e.g., a video encoder) and the decoder 143 (e.g., a video decoder) can be implemented as any of various suitable circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the technology is partially implemented in software, the device can store the software instructions in a suitable non-transitory computer-readable storage medium and can use one or more processors to execute the instructions in hardware to carry out the technology of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) can be regarded as one or more processors. Each of the encoder 123 and the decoder 143 may be included in one or more encoders or decoders, and any one of the encoders or decoders may be integrated as part of a combined encoder/decoder (codec) in the corresponding device.

It should be understood that for each of the examples described above with reference to the encoder 123, the decoder 143 may be used to perform the reverse process. Regarding signaling syntax elements, the decoder 143 can be used to receive and parse such syntax elements, and decode related video data accordingly. In some examples, the encoder 123 may entropy encode one or more defined syntax elements into an encoded video bitstream. In such instances, the decoder 143 can parse such syntax elements and decode related video data accordingly.

The technical solution of the present application and how the technical solution of the present application solves the above technical problems will be described in detail below with specific embodiments. The following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present application will be described below in conjunction with the accompanying drawings.

FIG. 2 is a schematic flowchart of a video image encoding method provided by an embodiment of this application. The execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in FIG. 2, the method includes:

S201: Divide each of the multiple channels of the image frame in the video code stream to obtain a coding block to be processed.

Here, the above-mentioned video code stream may be in a preset video format, such as the YUV420 video format, which is not particularly limited in this application.

Illustratively, taking the video code stream as the YUV420 video format as an example, the encoder may divide each of the Y, U, and V channels of the image frame in the video code stream.

Optionally, the foregoing dividing of each of the multiple channels of the image frame in the video code stream to obtain the coding block to be processed includes:

Dividing each of the multiple channels of the image frame in the video code stream into slices of a first preset size;

Separately dividing the channel components in each slice after division into coding blocks of a second preset size;

Obtaining the above-mentioned coding block to be processed from the above-mentioned coding blocks.

The above-mentioned first preset size and second preset size can be set according to actual conditions, which are not particularly limited in the embodiments of the present application.

Specifically, as shown in FIG. 3, taking a video code stream in the YUV420 video format as an example, the encoder can divide each of the multiple channels of the input image frame into non-overlapping 64x4 slices, each of which can be encoded and decoded independently. The Y component in a 64x4 slice is divided into 16 4x4 coding blocks, and the U and V components are divided into 4x2 coding blocks. Finally, the coding block to be processed is obtained from these coding blocks. For example, a 4x4 coding block of the Y component may be taken as the coding block to be processed; similarly, a 4x2 coding block of the U component may also be taken as the coding block to be processed. This is not particularly limited in the embodiments of the present application.
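The slice and coding block division of S201 can be sketched in Python as follows. The helper name split_plane, the example plane sizes, and the assumption that the chroma part of a 64x4 slice covers 32x2 samples (because YUV420 halves the chroma resolution in both directions) are illustrative and are not stated in this application. The sketch also assumes that the plane dimensions are exact multiples of the block size.

```python
def split_plane(plane, block_w, block_h):
    """Split a 2-D sample array (list of rows) into non-overlapping
    block_w x block_h blocks, returned in raster order.

    Assumes the plane dimensions are exact multiples of the block size.
    """
    height, width = len(plane), len(plane[0])
    return [
        [row[left:left + block_w] for row in plane[top:top + block_h]]
        for top in range(0, height, block_h)
        for left in range(0, width, block_w)
    ]


# A hypothetical 128x8 luma plane and its 64x4 chroma plane.
y_plane = [[(x + y) % 256 for x in range(128)] for y in range(8)]
u_plane = [[128] * 64 for _ in range(4)]

y_slices = split_plane(y_plane, 64, 4)      # 64x4 luma slices
y_blocks = split_plane(y_slices[0], 4, 4)   # 4x4 luma coding blocks
# Assumption: the chroma part of a 64x4 slice covers 32x2 samples.
u_slices = split_plane(u_plane, 32, 2)
u_blocks = split_plane(u_slices[0], 4, 2)   # 4x2 chroma coding blocks
```

With these sizes, one luma slice yields the 16 4x4 coding blocks described above, and no block refers to samples outside its own slice, which is what allows each slice to be coded independently.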

S202: Calculate the division cost corresponding to performing block division on the coding block to be processed in each of the preset division modes.

The foregoing preset division modes may be set according to actual conditions. For example, as shown in FIG. 4, the preset division modes may include None, Horizontal, Vertical, and Split, etc. The embodiments of the present application do not impose any special restrictions on this.

Here, the encoder calculates the division cost corresponding to performing block division on the coding block to be processed in each of the preset division modes, and then determines the final division mode according to the division costs.

S203: Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.

Optionally, the foregoing selection of the target division mode from the foregoing preset division modes according to the foregoing division cost includes:

Obtaining the minimum division cost from the above division costs;

Selecting the division mode corresponding to the minimum division cost from the preset division modes, and using the selected division mode as the target division mode.

Besides the above method, the target division mode can also be selected according to the actual situation. For example, a target division cost is obtained from the above division costs, the division mode corresponding to the target division cost is selected from the preset division modes, and the selected division mode is used as the target division mode; this is not particularly limited in the embodiments of the present application. The target division cost may be a division cost other than the minimum division cost and can be determined according to actual conditions.
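Steps S202 and S203 can be sketched in Python as follows. This application does not define the division cost or the exact geometry of the modes in FIG. 4, so several points here are assumptions: Horizontal is taken to split a block into top and bottom halves, Vertical into left and right halves, and Split into four quadrants; the cost is a hypothetical sum of each prediction block's distances to its minimum sample plus a hypothetical HEADER_COST per prediction block.

```python
HEADER_COST = 8  # hypothetical side-information cost per prediction block


def divide(block, mode):
    """Return the prediction blocks produced by one division mode."""
    h, w = len(block), len(block[0])
    hh, hw = h // 2, w // 2

    def crop(top, left, bh, bw):
        return [row[left:left + bw] for row in block[top:top + bh]]

    if mode == "None":
        return [block]
    if mode == "Horizontal":  # assumed: top half and bottom half
        return [crop(0, 0, hh, w), crop(hh, 0, h - hh, w)]
    if mode == "Vertical":    # assumed: left half and right half
        return [crop(0, 0, h, hw), crop(0, hw, h, w - hw)]
    if mode == "Split":       # assumed: four quadrants
        return [crop(0, 0, hh, hw), crop(0, hw, hh, w - hw),
                crop(hh, 0, h - hh, hw), crop(hh, hw, h - hh, w - hw)]
    raise ValueError(f"unknown division mode: {mode}")


def division_cost(block, mode):
    """Hypothetical cost: residual magnitude plus per-block overhead."""
    cost = 0
    for pb in divide(block, mode):
        samples = [s for row in pb for s in row]
        cost += sum(s - min(samples) for s in samples) + HEADER_COST
    return cost


def select_division_mode(block,
                         modes=("None", "Horizontal", "Vertical", "Split")):
    """Pick the mode with the minimum cost and return its prediction blocks."""
    costs = {mode: division_cost(block, mode) for mode in modes}
    target = min(costs, key=costs.get)
    return target, divide(block, target)
```

The per-block overhead matters in this sketch: without it, finer divisions could never cost more than coarser ones, and Split would always tie or win. A block whose left and right halves are each flat selects Vertical under this cost, while a completely flat block selects None.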

S204: Calculate the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes.

The foregoing preset prediction modes may be set according to actual conditions. For example, the preset prediction modes include a first prediction mode, maxDPCM, and a second prediction mode, minDPCM, which are not particularly limited in the embodiments of the present application.

maxDPCM predicts according to the maximum pixel value in the prediction block of the coding block to be processed: the maximum value in the prediction block is selected first, and each of the remaining values is then subtracted from the maximum value to obtain the prediction residual. minDPCM predicts according to the minimum pixel value in the prediction block of the coding block to be processed: the minimum value in the prediction block is selected first, and the minimum value is then subtracted from each of the other values to obtain the prediction residual.

Because the residuals obtained by the above two prediction modes are never negative, there is no need to encode sign bits, which increases the coding speed. Moreover, the above two prediction modes do not use adjacent pixels for prediction, which reduces data dependence and makes cumulative errors less likely. In addition, subsequent quantization requires no reconstruction, so the prediction blocks can be processed in parallel, which addresses the difficulty of parallel processing in hardware.
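The two prediction modes can be sketched in Python as follows. The function names are illustrative, and returning the reference value (the maximum or minimum) together with the residuals is an assumption made so that the block can be reconstructed; this application does not describe how the reference value is carried.

```python
def max_dpcm(pb):
    """maxDPCM: subtract every sample from the block maximum."""
    samples = [s for row in pb for s in row]
    ref = max(samples)
    return ref, [ref - s for s in samples]


def min_dpcm(pb):
    """minDPCM: subtract the block minimum from every sample."""
    samples = [s for row in pb for s in row]
    ref = min(samples)
    return ref, [s - ref for s in samples]


def reconstruct(ref, residuals, mode):
    """Recover the samples (in raster order) from the reference and residuals."""
    if mode == "maxDPCM":
        return [ref - r for r in residuals]
    if mode == "minDPCM":
        return [ref + r for r in residuals]
    raise ValueError(f"unknown prediction mode: {mode}")
```

Each residual depends only on its own sample and the block-level reference, never on a neighboring reconstructed pixel, so all residuals of a prediction block could be computed at the same time and are non-negative by construction.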

Here, the encoder calculates the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes, and then determines the final prediction mode according to the prediction costs.

S205: Select a target prediction mode from the preset prediction modes according to the prediction cost, and predict each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual.

Optionally, the foregoing selection of the target prediction mode from the foregoing preset prediction modes according to the foregoing prediction cost includes:

Obtain the smallest predicted cost from the above predicted costs;

The prediction mode corresponding to the minimum prediction cost is selected from the preset prediction modes, and the selected prediction mode is used as the target prediction mode.

Besides the above way of selecting the target prediction mode, the target prediction mode can also be selected according to the actual situation. For example, a target prediction cost is obtained from the above prediction costs, the prediction mode corresponding to the target prediction cost is selected from the above preset prediction modes, and the selected prediction mode is used as the aforementioned target prediction mode, which is not particularly limited in the embodiments of the present application. Here, the target prediction cost may be a prediction cost other than the aforementioned minimum prediction cost, and can be determined according to actual conditions.

S206: Perform encoding according to the foregoing prediction residual.

Here, based on the above prediction residual, the encoder can run multiple quantization and encoding processes at the same time, so as to select the encoding parameters that give the best image quality at the same bit rate and obtain the output bit stream.

In the video image encoding method provided by this embodiment, each of the multiple channels of the image frame in the video code stream is divided to obtain the coding block to be processed. The division cost corresponding to block division of the coding block to be processed is then calculated in each of the preset division modes, the target division mode is determined, and the coding block to be processed is divided into blocks according to the target division mode to obtain the prediction blocks. Next, the prediction cost corresponding to prediction of each prediction block of the coding block to be processed is calculated in each of the preset prediction modes, the target prediction mode is determined, and each prediction block of the coding block to be processed is predicted according to the target prediction mode. This reduces data dependence and the accumulation of data errors, facilitates hardware parallel data processing, and reduces hardware implementation resources; in addition, no reconstruction is needed in the quantization process, which improves the coding speed.

In addition, the embodiment of the present application can also obtain the division cost based on the prediction cost and the header information cost. FIG. 5 is a schematic flowchart of another video image encoding method proposed in an embodiment of this application. The execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 5, the method includes:

S501. Divide each of the multiple channels of the image frame in the video code stream to obtain a code block to be processed.

Wherein, step S501 is implemented in the same manner as the foregoing step S201, and will not be repeated here.

S502: In each division mode of the preset division mode, calculate the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each prediction mode of the preset prediction mode.

Optionally, the foregoing preset division mode may include None, Horizontal, Vertical, Split, etc. as shown in FIG. 4.

The preset prediction mode may include the first prediction mode maxDPCM, the second prediction mode minDPCM, and so on.

Here, within one division mode, all prediction blocks of the above coding block to be processed use the same prediction mode. For example, in the no-division mode, the prediction blocks of the coding block to be processed either all use the above first prediction mode maxDPCM or all use the above second prediction mode minDPCM.

Exemplarily, in each of the foregoing division modes, the prediction residual of each prediction block of the coding block to be processed is calculated in each of the foregoing prediction modes; whenever a prediction residual is greater than the preset residual threshold, the preset cost value is added to the accumulated prediction cost of the coding block to be processed in that prediction mode, giving the prediction cost predCost.

The preset residual threshold and the preset cost value can be set according to actual conditions, which is not specifically limited in the embodiments of the present application. For example, each time a prediction residual is greater than 15, the prediction cost predCost is increased by 8:

for (int i = 0; i < size; i++) {
    if (resi[i] > 15)    /* preset residual threshold */
        predCost += 8;   /* preset cost value */
}

S503. In each of the foregoing division modes, calculate the header information cost (headerCost) corresponding to the coding block to be processed, where the header information cost includes at least one of the number of coded bits corresponding to the division mode, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixel.

Optionally, the foregoing preset division mode includes no division (None).

The foregoing calculation of the header information cost corresponding to the foregoing coding block to be processed in each of the foregoing division modes includes:

The number of coded bits corresponding to the non-division, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels are added to obtain the first header information cost corresponding to the coded block to be processed.

Exemplarily, for None: headerCost = 2 bits (the number of coded bits corresponding to no division) + 1 bit (the number of coded bits corresponding to the prediction mode) + 8 bits (the number of coded bits corresponding to the preset base pixel) = 11 bits.

Optionally, the foregoing preset division mode includes horizontal dichotomy (Horizontal).

According to the horizontal dichotomy and the number of encoding bits corresponding to the preset base pixels, the total number of encoding bits of base pixels corresponding to the encoding block to be processed is calculated. For example, the total number of base pixel encoding bits=2*8bit (the number of encoding bits corresponding to the preset base pixel).

The number of coding bits corresponding to the horizontal dichotomy, the number of coding bits corresponding to the prediction mode, and the total number of base pixel coding bits corresponding to the coding block to be processed are added to obtain the second header information cost corresponding to the coding block to be processed.

Exemplarily, for Horizontal: headerCost = 2 bits (the number of coded bits corresponding to the horizontal dichotomy) + 1 bit (the number of coded bits corresponding to the prediction mode) + 2*8 bits (the total number of base pixel coded bits) = 19 bits.

Optionally, the foregoing preset division mode includes vertical dichotomy (Vertical).

According to the vertical dichotomy and the number of coded bits corresponding to the preset base pixel, calculate the total number of base pixel coded bits corresponding to the coding block to be processed. For example, the total number of base pixel coded bits = 2*8 bits (the number of coded bits corresponding to the preset base pixel).

Add the number of coding bits corresponding to the vertical dichotomy, the number of coding bits corresponding to the prediction mode, and the total number of base pixel coding bits corresponding to the coding block to be processed to obtain the third header information cost corresponding to the coding block to be processed.

Exemplarily, for Vertical: headerCost = 2 bits (the number of coded bits corresponding to the vertical dichotomy) + 1 bit (the number of coded bits corresponding to the prediction mode) + 2*8 bits (the total number of base pixel coded bits) = 19 bits.

Optionally, the foregoing preset division mode includes a cross division (Split).

According to the above cross division and the number of coded bits corresponding to the preset base pixel, calculate the total number of base pixel coded bits corresponding to the coding block to be processed. For example, the total number of base pixel coded bits = 4*8 bits (the number of coded bits corresponding to the preset base pixel).

The number of coded bits corresponding to the cross division, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coded block to be processed are added to obtain the fourth header information cost corresponding to the coded block to be processed.

Exemplarily, for Split: headerCost = 2 bits (the number of coded bits corresponding to the above cross division) + 1 bit (the number of coded bits corresponding to the above prediction mode) + 4*8 bits (the total number of base pixel coded bits) = 35 bits.
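The four header cost examples follow one pattern: a fixed number of bits for the division mode and the prediction mode, plus one base pixel per prediction block. A minimal C sketch of that arithmetic is given below; the enum and function names are hypothetical, and the bit widths (2, 1 and 8) are simply the example values used above, not fixed requirements.

```c
/* Division modes as named in FIG. 4 (identifiers are illustrative). */
enum division_mode { DIV_NONE, DIV_HORIZONTAL, DIV_VERTICAL, DIV_SPLIT };

/* Example bit widths taken from the worked examples above. */
enum { DIV_MODE_BITS = 2, PRED_MODE_BITS = 1, BASE_PIXEL_BITS = 8 };

/* Number of prediction blocks produced by each division mode;
 * each prediction block carries one base pixel. */
unsigned prediction_block_count(enum division_mode m)
{
    switch (m) {
    case DIV_NONE:       return 1;
    case DIV_HORIZONTAL: return 2;
    case DIV_VERTICAL:   return 2;
    case DIV_SPLIT:      return 4;
    }
    return 0; /* unreachable for valid modes */
}

/* headerCost = division-mode bits + prediction-mode bits
 *            + (number of base pixels) * bits per base pixel */
unsigned header_cost(enum division_mode m)
{
    return DIV_MODE_BITS + PRED_MODE_BITS
         + prediction_block_count(m) * BASE_PIXEL_BITS;
}
```

With these example widths the function reproduces the values 11, 19, 19 and 35 given for None, Horizontal, Vertical and Split.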

S504. Add the calculated prediction cost and the header information cost to obtain the division cost corresponding to the block division of the to-be-processed coding block in each division mode.

Here, the division cost is the sum of the aforementioned predCost and headerCost, and the division mode is then selected according to the division cost.
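As a rough illustration of steps S504 and S505 with the minimum-cost selection rule, the following C sketch sums the two costs per division mode and returns the index of the smallest sum. The function name, the fixed count of four modes, and the tie-breaking rule (the lowest index wins) are assumptions of this sketch; the application does not specify how ties are resolved.

```c
#include <stddef.h>

#define NUM_DIV_MODES 4 /* None, Horizontal, Vertical, Split */

/* Fills divCost[m] = predCost[m] + headerCost[m] for every division mode
 * and returns the index of the smallest division cost.
 * Ties go to the lowest index (an assumption of this sketch). */
size_t choose_division_mode(const unsigned predCost[NUM_DIV_MODES],
                            const unsigned headerCost[NUM_DIV_MODES],
                            unsigned divCost[NUM_DIV_MODES])
{
    size_t best = 0;
    for (size_t m = 0; m < NUM_DIV_MODES; m++) {
        divCost[m] = predCost[m] + headerCost[m];
        if (divCost[m] < divCost[best])
            best = m;
    }
    return best;
}
```

A finer division tends to lower predCost while raising headerCost, so the sum lets the encoder trade one against the other.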

S505: Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.

S506: Calculate the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each prediction mode of the preset prediction mode.

S507. Select a target prediction mode from the preset prediction modes according to the prediction cost, and predict each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual.

S508. Perform encoding according to the foregoing prediction residual.

Wherein, steps S505-S508 are implemented in the same manner as the foregoing steps S203-S206, and will not be repeated here.

In the video image coding method provided in this embodiment, the prediction cost corresponding to prediction of each prediction block of the coding block to be processed in each prediction mode, and the header information cost corresponding to the coding block to be processed, are calculated in each division mode. From these, the division cost corresponding to block division of the coding block to be processed in each division mode is determined accurately, which ensures the accuracy of subsequent processing. In addition, each of the multiple channels of the image frame in the video bitstream is divided to obtain the coding block to be processed; the division cost corresponding to block division of the coding block to be processed is calculated in each of the preset division modes, the target division mode is determined, and the coding block to be processed is divided into blocks according to the target division mode to obtain the prediction blocks. The prediction cost corresponding to prediction of each prediction block of the coding block to be processed is then calculated in each of the preset prediction modes, the target prediction mode is determined, and each prediction block of the coding block to be processed is predicted according to the target prediction mode. This reduces data dependence and the accumulation of data errors, facilitates hardware parallel data processing, and reduces hardware implementation resources; in addition, no reconstruction is needed in the quantization process, which improves the coding speed.

In addition, the embodiment of the present application can also calculate the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in the first prediction mode. FIG. 6 is a schematic flowchart of another video image encoding method proposed by an embodiment of the application. The execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 6, the method includes:

S601. Divide each of the multiple channels of the image frame in the video code stream to obtain a code block to be processed.

S602: Calculate the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode.

S603: Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.

Wherein, steps S601-S603 are implemented in the same manner as the foregoing steps S201-S203, and will not be repeated here.

S604. The preset prediction modes include a first prediction mode, in which prediction is performed according to the maximum pixel value in the prediction block of the coding block to be processed. Calculate a first prediction residual according to the maximum pixel value and the pixels of the first remaining prediction block of the coding block to be processed, where the first remaining prediction block is the prediction block remaining, among the prediction blocks of the coding block to be processed, other than the prediction block of the maximum pixel value.

Exemplarily, as shown in FIG. 7, the maximum pixel value in the prediction block of the coding block to be processed is 26. The remaining values are subtracted from this maximum value, so that the prediction residual is smallest and contains no negative numbers; no sign needs to be coded afterwards, which further improves the subsequent encoding speed.

S605: If the first prediction residual is greater than the first preset residual threshold, add a first preset cost value to the prediction cost of the coding block to be processed in the first prediction mode to obtain the first prediction cost.

Here, if a prediction residual is greater than the preset residual threshold, the preset cost value is added to the prediction cost of the coding block to be processed in that prediction mode to obtain the prediction cost. The first preset residual threshold and the first preset cost value can be set according to actual conditions, which is not particularly limited in the embodiments of the present application. Exemplarily, each time a prediction residual is greater than 15, the prediction cost predCost is increased by 8.

S606. Select a target prediction mode from the preset prediction modes according to the prediction cost, and predict each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual.

S607: Perform encoding according to the foregoing prediction residual.

Wherein, steps S606-S607 are implemented in the same manner as the foregoing steps S205-S206, and will not be repeated here.

In the video image encoding method provided by this embodiment, each of the multiple channels of the image frame in the video code stream is divided to obtain the coding block to be processed. The division cost corresponding to block division of the coding block to be processed is then calculated in each of the preset division modes, the target division mode is determined, and the coding block to be processed is divided into blocks according to the target division mode to obtain the prediction blocks. Next, the prediction cost corresponding to prediction of each prediction block of the coding block to be processed is calculated in each of the preset prediction modes, the target prediction mode is determined, and each prediction block of the coding block to be processed is predicted according to the target prediction mode. Since the residuals obtained by the above prediction mode are never negative, no sign needs to be encoded, which improves the encoding speed; and since the above prediction mode does not use adjacent pixels for prediction, data dependence is reduced, cumulative errors are unlikely, and the difficulty of hardware parallel processing is resolved.

In addition, the embodiment of the present application can also calculate the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in the second prediction mode. FIG. 8 is a schematic flowchart of another video image encoding method proposed by an embodiment of the application. The execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 8, the method includes:

S801: Divide each of the multiple channels of the image frame in the video code stream to obtain a code block to be processed.

S802: Calculate the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode.

S803: Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.

Wherein, steps S801-S803 are implemented in the same manner as the foregoing steps S201-S203, and will not be repeated here.

S804. The preset prediction modes include a second prediction mode, in which prediction is performed according to the minimum pixel value in the prediction block of the coding block to be processed. Calculate a second prediction residual according to the pixels of the second remaining prediction block of the coding block to be processed and the minimum pixel value, where the second remaining prediction block is the prediction block remaining, among the prediction blocks of the coding block to be processed, other than the prediction block of the minimum pixel value.

Exemplarily, as shown in FIG. 9, the minimum pixel value in the prediction block of the coding block to be processed is 22. This minimum value is subtracted from each of the remaining values, so that the prediction residual is smallest and contains no negative numbers; no sign needs to be coded afterwards, which further improves the subsequent encoding speed.

S805: If the second prediction residual is greater than a second preset residual threshold, add a second preset cost value to the prediction cost of the coding block to be processed in the second prediction mode to obtain a second prediction cost.

Here, if a prediction residual is greater than the preset residual threshold, the preset cost value is added to the prediction cost of the coding block to be processed in that prediction mode to obtain the prediction cost. The second preset residual threshold and the second preset cost value can be set according to actual conditions, which is not particularly limited in the embodiments of the present application. Exemplarily, each time a prediction residual is greater than 15, the prediction cost predCost is increased by 8.

S806. Select a target prediction mode from the preset prediction modes according to the prediction cost, and predict each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual.

S807. Perform encoding according to the foregoing prediction residual.

Wherein, steps S806-S807 are implemented in the same manner as the foregoing steps S205-S206, and will not be repeated here.

In addition, the embodiment of the present application can also quantize the prediction residual according to a preset quantization step size, and perform coding according to a preset coding method. FIG. 10 is a schematic flowchart of another video image encoding method proposed by an embodiment of this application. The execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 10, the method includes:

S1001. Divide each of the multiple channels of the image frame in the video code stream to obtain a code block to be processed.

S1002: Calculate the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode.

S1003. Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.

S1004. Calculate the prediction cost corresponding to the prediction of each prediction block of the to-be-processed coding block in each prediction mode of the preset prediction mode.

S1005. Select a target prediction mode from the preset prediction modes according to the prediction cost, and predict each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual.

Wherein, steps S1001-S1005 are implemented in the same manner as the foregoing steps S201-S205, and will not be repeated here.

S1006. Quantize the prediction residual according to a preset quantization step size to obtain a quantized residual.

The preset quantization step size can be set according to actual conditions, for example 2. The above prediction residual is quantized according to the preset quantization step size to obtain the quantized residual, for example qResi[i] = round(resi[i]/Q), where qResi[i] denotes the quantized residual, Q denotes the quantization step size, and round() is a rounding function.
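A minimal C sketch of this quantization step is shown below. It relies on the residuals being non-negative, as they are for maxDPCM and minDPCM, and replaces round() with integer arithmetic in which halves round up; that rounding rule, the function name `quantize_residuals`, and the 8-bit types are assumptions of the sketch, since the application does not define round() further.

```c
#include <stddef.h>
#include <stdint.h>

/* qResi[i] = round(resi[i] / Q) for non-negative residuals.
 * Integer form: (resi + Q/2) / Q, i.e. halves round up (assumed rule).
 * Assumes Q >= 1. */
void quantize_residuals(const uint8_t *resi, size_t n, unsigned Q,
                        uint8_t *qResi)
{
    for (size_t i = 0; i < n; i++)
        qResi[i] = (uint8_t)((resi[i] + Q / 2) / Q);
}
```

Each output depends only on its own input residual, so no reconstructed neighbour is needed and the loop iterations can run in parallel.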

S1007: According to the above-mentioned quantized residual and the preset coding method, respectively encode each of the above-mentioned multiple channels to obtain an output code stream.

Optionally, before the encoding is performed on each of the multiple channels according to the quantized residual and the preset encoding method, the method further includes:

Determine whether the above prediction residual is greater than a preset threshold;

If the prediction residual is greater than the preset threshold, it is determined that the preset encoding method is VLC encoding;

If the prediction residual is less than or equal to the preset threshold, it is determined that the preset encoding method is FLC encoding.

The above preset threshold is determined according to the residual value at which coding switches from FLC coding to VLC coding: if the prediction residual is greater than the preset threshold, VLC coding is adopted; otherwise, FLC coding is adopted.

Optionally, the foregoing encoding each of the foregoing multiple channels separately according to the foregoing quantization residual and a preset encoding manner includes:

Encoding the MSB of each of the multiple channels separately by using the above preset encoding method according to the above-mentioned quantized residual;

When the number of encoded bits of each of the above multiple channels reaches the preset encoding bit rate, stop encoding the MSB of the channel;

According to the quantized residual, the LSB of each of the multiple channels is separately coded by using the preset coding method;

When the number of encoded bits of each of the multiple channels reaches the preset encoding bit rate, stop encoding the LSB of the channel.

Here, fixed bit rate coding is used. For example, taking the above video bit stream in the YUV420 video format as an example:

(1) If the total number of coded bits is less than the given bit rate and the remaining bits can hold at least one MSB codeword, encode the MSB of the Y component and accumulate the number of coded bits; otherwise, stop encoding and go to (5).

(2) If the total number of coded bits is less than the given bit rate and the remaining bits can hold at least one MSB codeword, encode the MSB of the U component and accumulate the number of coded bits; otherwise, stop encoding and go to (5).

(3) If the total number of coded bits is less than the given bit rate and the remaining bits can hold at least one MSB codeword, encode the MSB of the V component and accumulate the total number of coded bits; otherwise, stop encoding and go to (5).

(4) Following the same order as for MSB encoding, encode the LSB in the same manner until the given bit rate is reached.

(5) If the output is still a few bits short of the fixed bit rate and encoding cannot continue, pad with 0s; otherwise, output directly.
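The bit budget logic of steps (1) to (5) can be sketched in C as follows. This is only a sketch under stated assumptions: each part of the stream (for example the MSB part of Y, then U, then V, then the LSB parts in the same order) is represented purely by the lengths of its codewords, encoding stops at the first codeword that no longer fits, and the `segment` structure and `fixed_rate_fill` function are hypothetical names, not taken from the application.

```c
#include <stddef.h>

/* One part of the stream (e.g. the MSB part of the Y channel),
 * described only by the lengths of its codewords in bits. */
struct segment {
    const unsigned *codeword_bits;
    size_t count;
};

/* Walks the segments in the given order and counts codewords as emitted
 * while they still fit into budget_bits. Encoding stops at the first
 * codeword that does not fit. Returns the number of zero padding bits
 * needed to reach the fixed size; *used_bits receives the payload size. */
unsigned fixed_rate_fill(const struct segment *segs, size_t nsegs,
                         unsigned budget_bits, unsigned *used_bits)
{
    unsigned used = 0;
    int full = 0;
    for (size_t s = 0; s < nsegs && !full; s++) {
        for (size_t i = 0; i < segs[s].count; i++) {
            unsigned len = segs[s].codeword_bits[i];
            if (len > budget_bits - used) { /* next codeword does not fit */
                full = 1;
                break;
            }
            used += len;
        }
    }
    *used_bits = used;
    return budget_bits - used; /* step (5): pad the remainder with 0s */
}
```

Passing the segments in the order Y MSB, U MSB, V MSB, Y LSB, U LSB, V LSB gives the priority described above: the more significant bits of every channel are written before any less significant bits.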

Optionally, the foregoing encoding the MSB of each of the foregoing multiple channels separately by using the foregoing preset encoding method according to the foregoing quantized residual includes:

Convert the above quantized residual into bit-plane form;

Based on the above bit-plane form, use the above preset coding method to perform Huffman table look-up coding on the MSB of each of the above multiple channels, starting from the first row and proceeding from left to right, with every preset number of bits taken as one codeword.

Here, taking the above-mentioned video code stream as the YUV420 video format as an example, the MSB adopts the Huffman coding method. Due to the different probability distributions, different code tables are used for the Y and UV components. The code table is shown in Table 1.

Table 1 MSB code table

Specifically, the above quantized residual is converted into bit-plane form. Then, for the MSB part, starting from the first row and proceeding from left to right, every preset number of bits, for example every 4 bits, is taken as one codeword for Huffman table look-up encoding. After the first row is encoded, encoding continues with the second row, until all MSBs have been encoded or the given bit rate is reached.
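The bit-plane conversion and the grouping into 4-bit codewords can be sketched in C as below. The sketch stops at the 4-bit symbol that would be used as the index into the Huffman table, because the contents of Table 1 are not reproduced here. The function name `bit_plane_symbols`, the choice of which bit index forms a row, and the packing order (the leftmost residual becomes the most significant bit of the symbol) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Extracts bit plane `bit` (0 = least significant) from n quantized
 * residuals and packs every four consecutive bits, left to right, into
 * one 4-bit symbol. The first residual of each group lands in the most
 * significant position of the symbol (assumed order).
 * Returns the number of symbols written; a trailing group of fewer than
 * four residuals is ignored in this sketch. */
size_t bit_plane_symbols(const uint8_t *qResi, size_t n, unsigned bit,
                         uint8_t *symbols)
{
    size_t count = 0;
    for (size_t i = 0; i + 4 <= n; i += 4) {
        uint8_t sym = 0;
        for (size_t k = 0; k < 4; k++)
            sym = (uint8_t)((sym << 1) | ((qResi[i + k] >> bit) & 1u));
        symbols[count++] = sym;
    }
    return count;
}
```

Each returned symbol lies in the range 0 to 15, so a 16-entry code table per component type would be sufficient for the look-up.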

Optionally, the foregoing encoding the LSB of each of the foregoing multiple channels separately using the foregoing preset encoding method according to the foregoing quantization residual includes:

Based on the above-mentioned bit-plane method, the above-mentioned preset coding method is used to respectively start from the first line of the LSB of each of the above-mentioned multiple channels, and perform coding bit by bit from left to right.

Exemplarily, the above quantized residual is converted into bit-plane form, and the LSB then uses fixed-length coding: starting from the first row of the LSB part, coding proceeds bit by bit from left to right; after the first row is encoded, the next row is encoded, until the given bit rate is reached.

In the video image encoding method provided by this embodiment, each of the multiple channels of the image frame in the video code stream is divided to obtain the coding block to be processed. The division cost corresponding to block division of the coding block to be processed is then calculated in each of the preset division modes, the target division mode is determined, and the coding block to be processed is divided into blocks according to the target division mode to obtain the prediction blocks. Next, the prediction cost corresponding to prediction of each prediction block of the coding block to be processed is calculated in each of the preset prediction modes, the target prediction mode is determined, and each prediction block of the coding block to be processed is predicted according to the target prediction mode. The prediction residual is quantized according to the preset quantization step size to obtain the quantized residual, and encoding is then performed according to the quantized residual and the preset encoding method to obtain the output code stream. This encoding method reduces data dependence and the accumulation of data errors, facilitates hardware parallel data processing, and reduces hardware implementation resources; in addition, no reconstruction is needed in the quantization process, which improves the coding speed.

FIG. 11 is a schematic structural diagram of a video image encoding device provided by an embodiment of the application. For ease of description, only the parts related to the embodiments of the present application are shown. As shown in FIG. 11, the video image encoding device 110 includes: a first division module 1101, a first calculation module 1102, a second division module 1103, a second calculation module 1104, a prediction module 1105, and an encoding module 1106.

Wherein, the first dividing module 1101 is configured to divide each of the multiple channels of the image frame in the video bitstream to obtain the code block to be processed.

The first calculation module 1102 is configured to calculate the division cost corresponding to the block division of the coded block to be processed in each division mode of the preset division mode.

The second division module 1103 is configured to select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.

The second calculation module 1104 is configured to calculate the prediction cost corresponding to the prediction of each prediction block of the coded block to be processed in each prediction mode of the preset prediction mode.

The prediction module 1105 is configured to select a target prediction mode from the preset prediction modes according to the prediction cost, and predict each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual.

The encoding module 1106 is configured to perform encoding according to the prediction residual.

In a possible design, the first division module 1101 is specifically used for:

Obtain the to-be-processed coding block from the coding block.

In a possible design, the first calculation module 1102 is specifically used for:

In each of the division modes, calculating the prediction cost corresponding to the prediction of each prediction block of the to-be-processed coding block in each prediction mode;

In each division mode, calculate the header information cost corresponding to the coding block to be processed, where the header information cost includes at least one of the number of coded bits corresponding to the division mode, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixel;

The calculated prediction cost and the header information cost are added to obtain the division cost corresponding to the block division of the to-be-processed coding block in each division mode.

In a possible design, the second division module 1103 selects a target division mode from the preset division modes according to the division cost, including:

Obtain the smallest division cost from the division costs;

The division mode corresponding to the smallest division cost is selected from the preset division modes, and the selected division mode is used as the target division mode.

In a possible design, the preset prediction mode includes a first prediction mode, and the first prediction mode is prediction based on a maximum value of pixels in a prediction block of the coding block to be processed;

The second calculation module 1104 is specifically used for:

Calculate a first prediction residual from the pixel maximum value and the pixels of the first remaining prediction blocks of the to-be-processed coding block, where the first remaining prediction blocks are the prediction blocks of the to-be-processed coding block other than the prediction block in which the pixel maximum value is located;

If the first prediction residual is greater than a first preset residual threshold, a first preset cost value is added to the prediction cost of the to-be-processed coding block in the first prediction mode to obtain the first prediction cost.

In a possible design, the preset prediction mode includes a second prediction mode, and the second prediction mode is to perform prediction according to a minimum value of pixels in a prediction block of the coding block to be processed;

The second calculation module 1104 is specifically used for:

Calculate a second prediction residual from the pixel minimum value and the pixels of the second remaining prediction blocks of the to-be-processed coding block, where the second remaining prediction blocks are the prediction blocks of the to-be-processed coding block other than the prediction block in which the pixel minimum value is located;

If the second prediction residual is greater than a second preset residual threshold, a second preset cost value is added to the prediction cost of the to-be-processed coding block in the second prediction mode to obtain the second prediction cost.
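The threshold-based cost accumulation of the first and second prediction modes can be sketched as below. This is one possible reading, not the applicant's implementation: it assumes the residual is the absolute difference between the reference value (maximum or minimum) and each remaining pixel, that the cost starts at zero, and that the preset cost value is added once per residual exceeding the threshold. The function names and example values are hypothetical.

```python
from typing import List, Sequence


def penalty_cost(reference: int, remaining: Sequence[int],
                 threshold: int, penalty: int) -> int:
    """Add `penalty` once for every residual that exceeds `threshold`."""
    cost = 0
    for pixel in remaining:
        residual = abs(reference - pixel)  # assumed residual definition
        if residual > threshold:
            cost += penalty
    return cost


def mode_cost(pixels: Sequence[int], use_max: bool,
              threshold: int, penalty: int) -> int:
    """Cost of predicting from the maximum (first mode) or minimum (second mode)."""
    reference = max(pixels) if use_max else min(pixels)
    remaining: List[int] = list(pixels)
    remaining.remove(reference)  # the reference itself is not predicted
    return penalty_cost(reference, remaining, threshold, penalty)


pixels = [10, 12, 11, 30]  # hypothetical pixel values
first_mode_cost = mode_cost(pixels, use_max=True, threshold=4, penalty=8)
second_mode_cost = mode_cost(pixels, use_max=False, threshold=4, penalty=8)
```

In this example, predicting from the maximum (30) leaves three large residuals, while predicting from the minimum (10) leaves only one, so the second mode has the lower cost.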

In a possible design, the prediction module 1105 selects a target prediction mode from the preset prediction modes according to the prediction cost, including:

Obtain the smallest prediction cost from the prediction costs;

The prediction mode corresponding to the smallest prediction cost is selected from the preset prediction modes, and the selected prediction mode is used as the target prediction mode.

In a possible design, the preset division mode includes no division;

The first calculation module 1102 calculates the header information cost corresponding to the to-be-processed coding block in each division mode, including:

In a possible design, the preset division mode includes horizontal dichotomy;

Calculating the total number of encoding bits of base pixels corresponding to the encoding block to be processed according to the horizontal dichotomy and the number of encoding bits corresponding to the preset base pixels;

The number of coded bits corresponding to the horizontal dichotomy, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the second header information cost corresponding to the coding block to be processed.

In a possible design, the preset division mode includes vertical dichotomy;

Calculating the total number of base pixel encoding bits corresponding to the encoding block to be processed according to the vertical dichotomy and the number of encoding bits corresponding to the preset base pixel;

The number of coded bits corresponding to the vertical dichotomy, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the third header information cost corresponding to the coding block to be processed.

In a possible design, the preset division mode includes cross division;

Calculating the total number of encoding bits of base pixels corresponding to the encoding block to be processed according to the cross division and the number of encoding bits corresponding to the preset base pixels;

The number of coded bits corresponding to the cross division, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the fourth header information cost corresponding to the coding block to be processed.
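The header information cost for the four division modes can be sketched as follows. The sketch assumes that each prediction block produced by a division mode carries one base pixel, so that the total number of base pixel coded bits is the number of prediction blocks multiplied by the bits per base pixel; this multiplier, the mode names, and the bit counts are assumptions made for illustration.

```python
# Assumed number of prediction blocks (and hence base pixels) per division mode.
PREDICTION_BLOCKS = {"none": 1, "horizontal": 2, "vertical": 2, "cross": 4}


def header_information_cost(mode: str, division_mode_bits: int,
                            prediction_mode_bits: int,
                            base_pixel_bits: int) -> int:
    """Sum of division-mode bits, prediction-mode bits and total base-pixel bits."""
    total_base_pixel_bits = PREDICTION_BLOCKS[mode] * base_pixel_bits
    return division_mode_bits + prediction_mode_bits + total_base_pixel_bits


# Hypothetical bit counts: 2 bits for the division mode, 1 bit for the
# prediction mode, 8 bits per base pixel.
header_costs = {
    mode: header_information_cost(mode, 2, 1, 8) for mode in PREDICTION_BLOCKS
}
```

Under these assumptions the header cost grows with the number of prediction blocks, which is why a finer division only pays off when it lowers the prediction cost by more than the extra header bits.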

In a possible design, the encoding module 1106 performs encoding according to the prediction residual, including:

Quantize the prediction residual according to a preset quantization step to obtain a quantized residual;

According to the quantized residual and the preset encoding method, each channel of the multiple channels is respectively encoded to obtain an output code stream.
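A minimal sketch of quantizing the prediction residual with a preset quantization step is given below. The rounding rule (truncation toward zero with the sign preserved) and the step value are assumptions; the application only states that a preset quantization step is used.

```python
from typing import List


def quantize(residuals: List[int], step: int) -> List[int]:
    """Quantize each residual by `step`, truncating toward zero (assumed rule)."""
    if step <= 0:
        raise ValueError("quantization step must be positive")
    quantized = []
    for r in residuals:
        magnitude = abs(r) // step
        quantized.append(magnitude if r >= 0 else -magnitude)
    return quantized


quantized_residual = quantize([-7, -1, 0, 3, 9], step=4)
```

Each residual is quantized independently of the others, which matches the stated aim of avoiding reconstruction during quantization.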

In a possible design, before the encoding of each of the multiple channels according to the quantized residual and the preset encoding mode, the encoding module 1106 is further configured to:

Judging whether the prediction residual is greater than a preset threshold;

If the prediction residual is greater than the preset threshold, determining that the preset coding mode is VLC coding;

If the prediction residual is less than or equal to the preset threshold, it is determined that the preset coding mode is FLC coding.
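The choice between VLC (variable-length) and FLC (fixed-length) coding can be sketched as below. The sketch assumes that the comparison is made on the largest absolute residual of the block; the application does not state which measure of the prediction residual is compared with the preset threshold.

```python
from typing import Sequence


def choose_coding_mode(residuals: Sequence[int], threshold: int) -> str:
    """Return "VLC" if the residual exceeds the threshold, otherwise "FLC"."""
    # Assumption: the block is characterized by its largest absolute residual.
    magnitude = max((abs(r) for r in residuals), default=0)
    return "VLC" if magnitude > threshold else "FLC"


mode_large = choose_coding_mode([1, -9, 3], threshold=4)
mode_small = choose_coding_mode([1, -2, 4], threshold=4)
```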

In a possible design, the encoding module 1106 separately encodes each of the multiple channels according to the quantized residual and a preset encoding method, including:

According to the quantized residual, respectively encode the MSB of each of the multiple channels by using the preset encoding manner;

When the number of encoded bits in each of the multiple channels reaches the preset encoding bit rate, stop encoding the MSB of the channel;

Encoding the LSB of each of the multiple channels separately by using the preset encoding method according to the quantized residual;

When the number of encoded bits in each of the multiple channels reaches the preset encoding code rate, stop encoding the LSB of the channel.
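The two-pass order described above (MSB first, then LSB, each pass stopping once the bit budget of the channel is reached) can be sketched as follows for a single channel. The inputs are assumed to be already-encoded codewords given as bit strings, and the sketch assumes that encoding stops as soon as the next codeword would exceed the budget; both points are illustrative assumptions.

```python
from typing import List


def encode_channel(msb_codes: List[str], lsb_codes: List[str],
                   budget_bits: int) -> str:
    """Emit MSB codewords, then LSB codewords, until the bit budget is reached."""
    output: List[str] = []
    used = 0
    for codes in (msb_codes, lsb_codes):  # MSB pass first, LSB pass second
        for code in codes:
            if used + len(code) > budget_bits:
                return "".join(output)  # budget reached: stop encoding
            output.append(code)
            used += len(code)
    return "".join(output)


# Hypothetical codewords for one channel.
stream_full = encode_channel(["101", "11", "0"], ["1", "0", "1", "1"], budget_bits=8)
stream_tight = encode_channel(["101", "11", "0"], ["1", "0", "1", "1"], budget_bits=4)
```

Because the MSB pass always runs first, a tight budget drops LSB information before it drops any MSB information.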

In a possible design, the encoding module 1106 uses the preset encoding method to encode the MSB of each of the multiple channels separately according to the quantized residual, including:

Converting the quantized residual into bit-plane form;

Based on the bit-plane form, using the preset encoding method and starting from the first row of the MSB of each of the multiple channels, every preset number of bits taken from left to right is treated as one codeword and encoded by Huffman table lookup.

In a possible design, the encoding module 1106 uses the preset encoding method to encode the LSB of each of the multiple channels separately according to the quantized residual, including:

Converting the quantized residual into bit-plane form;

Based on the bit-plane form, the preset encoding method is used to encode bit by bit, from left to right, starting from the first row of the LSB of each of the multiple channels.
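The bit-plane conversion and the two scanning orders can be sketched as below. The sketch works on non-negative residual magnitudes only (sign handling is omitted), treats the top two planes as the MSB part and the last plane as the LSB part, and uses a toy prefix-free table in place of the real Huffman table; the split, the group size, and the table are all hypothetical.

```python
from typing import Dict, List


def to_bit_planes(magnitudes: List[int], bit_depth: int) -> List[List[int]]:
    """Return one row of bits per plane, most significant plane first."""
    return [[(m >> b) & 1 for m in magnitudes]
            for b in range(bit_depth - 1, -1, -1)]


def encode_msb_rows(rows: List[List[int]], group_size: int,
                    table: Dict[str, str]) -> str:
    """Scan each row left to right, looking up every `group_size` bits in `table`."""
    # Assumes the row length is a multiple of group_size.
    out = []
    for row in rows:
        for i in range(0, len(row), group_size):
            symbol = "".join(str(bit) for bit in row[i:i + group_size])
            out.append(table[symbol])
    return "".join(out)


def encode_lsb_rows(rows: List[List[int]]) -> str:
    """Scan each row left to right and emit the bits one by one."""
    return "".join(str(bit) for row in rows for bit in row)


TOY_TABLE = {"00": "0", "01": "10", "10": "110", "11": "111"}  # hypothetical

planes = to_bit_planes([5, 2, 7, 0], bit_depth=3)
msb_code = encode_msb_rows(planes[:2], group_size=2, table=TOY_TABLE)
lsb_code = encode_lsb_rows(planes[2:])
```

Here the magnitudes 5, 2, 7, 0 give the rows 1010, 0110 and 1010; the first two rows are grouped into two-bit symbols for table lookup and the last row is emitted bit by bit.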

The device provided in this embodiment can be used to implement the technical solutions of the foregoing method embodiments, and its implementation principles and technical effects are similar, and will not be repeated here in this embodiment.

FIG. 12 is a schematic diagram of the hardware structure of a video image encoding device provided by an embodiment of the application. As shown in FIG. 12, the video image encoding device 120 of this embodiment includes: a memory 1201 and a processor 1202; where

The memory 1201 is used to store program instructions;

The processor 1202 is configured to execute program instructions stored in the memory, and when the program instructions are executed, the processor executes the following steps:

Encoding is performed according to the prediction residual.

In a possible design, the dividing each of the multiple channels of the image frame in the video bitstream to obtain the code block to be processed includes:

Obtain the to-be-processed coding block from the coding block.

In a possible design, the calculating the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode includes:

In a possible design, the selecting a target division mode from the preset division modes according to the division cost includes:

Obtain the smallest division cost from the division costs;

The calculation of the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each prediction mode of the preset prediction mode includes:

In a possible design, the selecting a target prediction mode from the preset prediction modes according to the prediction cost includes:

Obtain the smallest prediction cost from the prediction costs;

In a possible design, the preset division mode includes no division;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

In a possible design, the preset division mode includes horizontal dichotomy;

In a possible design, the preset division mode includes vertical dichotomy;

In a possible design, the preset division mode includes cross division;

In a possible design, the encoding according to the prediction residual includes:

In a possible design, before the encoding each of the multiple channels separately according to the quantized residual and the preset encoding mode, the method further includes:

Judging whether the prediction residual is greater than a preset threshold;

In a possible design, the separately encoding each of the multiple channels according to the quantized residual and a preset encoding manner includes:

In a possible design, the respectively encoding the MSB of each of the multiple channels by using the preset encoding method according to the quantized residual includes:

Converting the quantized residual into bit-plane form;

In a possible design, the step of separately encoding the LSB of each of the multiple channels according to the quantized residual using the preset encoding manner includes:

Converting the quantized residual into bit-plane form;

In a possible design, the memory 1201 may be independent or integrated with the processor 1202.

When the memory 1201 is independently provided, the video image encoding device further includes a bus 1203 for connecting the memory 1201 and the processor 1202.

In a possible design, the video image encoding device 120 may be a single device that includes the complete set of the foregoing memory 1201, processor 1202, and so on. Alternatively, it may be integrated in a distributed manner into another device, as determined by the actual situation.

FIG. 13 is a schematic structural diagram of a movable platform provided by an embodiment of the application. As shown in FIG. 13, the movable platform 130 of this embodiment includes a movable platform body 1301 and a video image encoding device 1302; the video image encoding device 1302 is provided on the movable platform body 1301, and the movable platform body 1301 and the video image encoding device 1302 are connected wirelessly or by wire.

The video image encoding device 1302 divides each of the multiple channels of the image frame in the video code stream to obtain a code block to be processed;

Encoding is performed according to the prediction residual.

Obtain the to-be-processed coding block from the coding block.

In a possible design, the calculation of the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode includes:

Obtain the smallest division cost from the division costs;

Obtain the smallest prediction cost from the prediction costs;

In a possible design, the preset division mode includes no division;

In a possible design, the preset division mode includes horizontal dichotomy;

In a possible design, the preset division mode includes vertical dichotomy;

In a possible design, the preset division mode includes cross division;

Judging whether the prediction residual is greater than a preset threshold;

Converting the quantized residual into bit-plane form;

The movable platform provided by this embodiment includes a movable platform body and a video image encoding device provided on the movable platform body. The video image encoding device divides each of the multiple channels of an image frame in the video code stream to obtain a coding block to be processed, calculates the division cost of performing block division on the coding block to be processed in each of the preset division modes, and determines a target division mode. It then performs block division on the coding block to be processed according to the target division mode to obtain prediction blocks, calculates the prediction cost of predicting each prediction block of the coding block to be processed in each of the preset prediction modes, determines a target prediction mode, and predicts each prediction block of the coding block to be processed according to the target prediction mode. This reduces data dependence and the accumulation of data errors, facilitates parallel data processing in hardware, and reduces hardware implementation resources; because no reconstruction is needed in the quantization process, the encoding speed is also improved.

An embodiment of the present application provides a computer-readable storage medium having program instructions stored in the computer-readable storage medium, and when a processor executes the program instructions, the video image encoding method described above is implemented.

In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division of the modules is only a division by logical function, and other divisions are possible in actual implementation. For example, multiple modules may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the displayed or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be in electrical, mechanical, or other forms.

The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional modules in the various embodiments of the present invention may be integrated into one processing unit, or each module may exist alone physically, or two or more modules may be integrated into one unit. The units formed by the above-mentioned modules can be implemented in the form of hardware, or in the form of hardware plus software functional units.

The above-mentioned integrated modules, when implemented in the form of software functional modules, may be stored in a computer-readable storage medium. Such a software functional module is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute part of the steps of the methods of the various embodiments of the present application.

It should be understood that the foregoing processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in combination with the invention may be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.

The memory may include high-speed RAM, and may also include non-volatile memory (NVM), such as at least one magnetic disk storage; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.

The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, the buses in the drawings of this application are not limited to only one bus or one type of bus.

The above-mentioned storage medium can be realized by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disks, or optical disks. The storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.

An exemplary storage medium is coupled to the processor, so that the processor can read information from the storage medium and write information to the storage medium. Of course, the storage medium may also be an integral part of the processor. The processor and the storage medium may be located in Application Specific Integrated Circuits (ASIC for short). Of course, the processor and the storage medium may also exist as discrete components in the electronic device or the main control device.

A person of ordinary skill in the art can understand that all or part of the steps in the foregoing method embodiments can be implemented by a program instructing relevant hardware. The aforementioned program can be stored in a computer readable storage medium. When the program is executed, it executes the steps including the foregoing method embodiments; and the foregoing storage medium includes: ROM, RAM, magnetic disk, or optical disk and other media that can store program codes.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some or all of their technical features can be equivalently replaced; and these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

A video image coding method, characterized in that it comprises:

Divide each of the multiple channels of the image frame in the video code stream to obtain the code block to be processed;

Calculating the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode;

Selecting a target division mode from the preset division modes according to the division cost, and performing block division on the coding block to be processed according to the target division mode to obtain a prediction block;

Calculating the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each prediction mode of the preset prediction mode;

Selecting a target prediction mode from the preset prediction modes according to the prediction cost, and predicting each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual;

Encoding is performed according to the prediction residual.
The method according to claim 1, wherein the dividing each of the multiple channels of the image frame in the video code stream to obtain the code block to be processed comprises:

Dividing each of the multiple channels of the image frame in the video code stream into slices of a first preset size;

Separately dividing the channel components in each slice after division into coding blocks of a second preset size;

Obtain the to-be-processed coding block from the coding block.
The method according to claim 1 or 2, wherein the calculating the division cost corresponding to the block division of the coded block to be processed in each division mode of the preset division mode comprises:

In each of the division modes, calculating the prediction cost corresponding to the prediction of each prediction block of the to-be-processed coding block in each prediction mode;

In each division mode, calculate the header information cost corresponding to the to-be-processed coding block, where the header information cost includes at least one of: the number of coded bits corresponding to the division mode, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixel;

The calculated prediction cost and the header information cost are added to obtain the division cost corresponding to the block division of the to-be-processed coding block in each division mode.
The method according to claim 1, wherein the selecting a target division mode from the preset division modes according to the division cost comprises:

Obtain the smallest division cost from the division costs;

The division mode corresponding to the smallest division cost is selected from the preset division modes, and the selected division mode is used as the target division mode.
The method according to any one of claims 1 to 4, wherein the preset prediction modes comprise a first prediction mode, and the first prediction mode performs prediction based on a pixel maximum value in a prediction block of the to-be-processed coding block;

The calculating the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each of the preset prediction modes includes:

Calculate a first prediction residual from the pixel maximum value and the pixels of the first remaining prediction blocks of the to-be-processed coding block, where the first remaining prediction blocks are the prediction blocks of the to-be-processed coding block other than the prediction block in which the pixel maximum value is located;

If the first prediction residual is greater than a first preset residual threshold, a first preset cost value is added to the prediction cost of the to-be-processed coding block in the first prediction mode to obtain the first prediction cost.
The method according to any one of claims 1 to 4, wherein the preset prediction modes comprise a second prediction mode, and the second prediction mode performs prediction based on a pixel minimum value in a prediction block of the to-be-processed coding block;

The calculating the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each of the preset prediction modes includes:

Calculate a second prediction residual from the pixel minimum value and the pixels of the second remaining prediction blocks of the to-be-processed coding block, where the second remaining prediction blocks are the prediction blocks of the to-be-processed coding block other than the prediction block in which the pixel minimum value is located;

If the second prediction residual is greater than a second preset residual threshold, a second preset cost value is added to the prediction cost of the to-be-processed coding block in the second prediction mode to obtain the second prediction cost.
The method according to claim 1, wherein the selecting a target prediction mode from the preset prediction modes according to the prediction cost comprises:

Obtain the smallest prediction cost from the prediction costs;

The prediction mode corresponding to the smallest prediction cost is selected from the preset prediction modes, and the selected prediction mode is used as the target prediction mode.
The method according to claim 3, wherein the preset division mode includes no division;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

The number of coded bits corresponding to the non-division, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels are added to obtain the first header information cost corresponding to the coded block to be processed.
The method according to claim 3, wherein the preset division mode includes horizontal dichotomy;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

Calculating the total number of encoding bits of base pixels corresponding to the encoding block to be processed according to the horizontal dichotomy and the number of encoding bits corresponding to the preset base pixels;

The number of coded bits corresponding to the horizontal dichotomy, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the second header information cost corresponding to the coding block to be processed.
The method according to claim 3, wherein the preset division mode includes vertical dichotomy;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

Calculating the total number of base pixel encoding bits corresponding to the encoding block to be processed according to the vertical dichotomy and the number of encoding bits corresponding to the preset base pixel;

The number of coded bits corresponding to the vertical dichotomy, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the third header information cost corresponding to the coding block to be processed.
The method according to claim 3, wherein the preset division mode includes cross division;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

Calculating the total number of encoding bits of base pixels corresponding to the encoding block to be processed according to the cross division and the number of encoding bits corresponding to the preset base pixels;

The number of coded bits corresponding to the cross division, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the fourth header information cost corresponding to the coding block to be processed.
The method according to claim 1, wherein the encoding according to the prediction residual comprises:

Quantize the prediction residual according to a preset quantization step to obtain a quantized residual;

According to the quantized residual and the preset encoding method, each channel of the multiple channels is respectively encoded to obtain an output code stream.
The method according to claim 12, characterized in that, before said separately encoding each of the multiple channels according to the quantized residual and a preset encoding method, the method further comprises:

Judging whether the prediction residual is greater than a preset threshold;

If the prediction residual is greater than the preset threshold, determining that the preset coding mode is VLC coding;

If the prediction residual is less than or equal to the preset threshold, it is determined that the preset coding mode is FLC coding.
The method according to claim 12 or 13, wherein the separately encoding each of the multiple channels according to the quantized residual and a preset encoding method comprises:

According to the quantized residual, respectively encode the MSB of each of the multiple channels by using the preset encoding manner;

When the number of encoded bits in each of the multiple channels reaches the preset encoding bit rate, stop encoding the MSB of the channel;

Encoding the LSB of each of the multiple channels separately by using the preset encoding method according to the quantized residual;

When the number of encoded bits in each of the multiple channels reaches the preset encoding code rate, stop encoding the LSB of the channel.
The method according to claim 14, wherein the separately encoding the MSB of each of the multiple channels according to the quantized residual using the preset encoding method comprises:

Converting the quantized residual into bit-plane form;

Based on the bit-plane form, using the preset encoding method and starting from the first row of the MSB of each of the multiple channels, every preset number of bits taken from left to right is treated as one codeword and encoded by Huffman table lookup.
The method according to claim 14, wherein the encoding the LSB of each of the multiple channels by using the preset encoding method according to the quantized residual comprises:

Converting the quantized residual into bit-plane form;

Based on the bit-plane form, the preset encoding method is used to encode bit by bit, from left to right, starting from the first row of the LSB of each of the multiple channels.
A video image encoding device, characterized by comprising a memory, a processor, and computer-executable instructions stored in the memory and executable on the processor, wherein when the processor executes the computer-executable instructions, the following steps are implemented:

Divide each of the multiple channels of the image frame in the video code stream to obtain the code block to be processed;

Calculating the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode;

Selecting a target division mode from the preset division modes according to the division cost, and performing block division on the coding block to be processed according to the target division mode to obtain a prediction block;

Calculating the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each prediction mode of the preset prediction mode;

Selecting a target prediction mode from the preset prediction modes according to the prediction cost, and predicting each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual;

Encoding is performed according to the prediction residual.
The device according to claim 17, wherein the dividing each of the multiple channels of the image frame in the video code stream to obtain the code block to be processed comprises:

Dividing each of the multiple channels of the image frame in the video code stream into slices of a first preset size;

Separately dividing the channel components in each slice after division into coding blocks of a second preset size;

Obtain the to-be-processed coding block from the coding block.
The device according to claim 17 or 18, wherein calculating the division cost corresponding to performing block division on the coding block to be processed in each of the preset division modes comprises:

In each of the division modes, calculating the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each prediction mode;

In each of the division modes, calculating a header information cost corresponding to the coding block to be processed, where the header information cost includes at least one of: the number of coded bits corresponding to the division mode, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels;

Adding the calculated prediction cost and the header information cost to obtain the division cost corresponding to performing block division on the coding block to be processed in each of the division modes.
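As a minimal sketch of the addition recited above, the division cost for one division mode may be illustrated as a prediction cost plus a header information cost. The function name `division_cost`, the mode labels, and the numbers are hypothetical. The claim does not state which prediction mode's cost enters the sum, so taking the cheapest prediction mode is an assumption of this sketch.

```python
from typing import Dict, List


def division_cost(pred_costs_by_mode: Dict[str, List[int]], header_bits: int) -> int:
    """Division cost of one division mode.

    pred_costs_by_mode maps a prediction-mode label to the costs of the
    individual prediction blocks under that mode.
    Assumption: the prediction mode with the smallest summed cost is used.
    """
    best_prediction_cost = min(sum(costs) for costs in pred_costs_by_mode.values())
    return best_prediction_cost + header_bits


# Two prediction blocks (e.g. after a bisection), two candidate prediction modes.
cost = division_cost({"max": [12, 9], "min": [7, 10]}, header_bits=11)
```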
The device according to claim 17, wherein the selecting a target division mode from the preset division modes according to the division cost comprises:

Obtaining the smallest division cost from the division costs;

Selecting, from the preset division modes, the division mode corresponding to the smallest division cost as the target division mode.
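The selection recited above amounts to taking the mode with the minimum cost. A minimal sketch follows; the mode labels and cost values are hypothetical, and resolving ties in favour of the first listed mode is an assumption, since the claim does not address ties.

```python
from typing import Dict


def select_target_mode(costs: Dict[str, int]) -> str:
    """Return the mode with the smallest cost.

    min() returns the first minimal key it meets, and dicts keep insertion
    order, so ties go to the mode listed first (an assumption of this sketch).
    """
    return min(costs, key=costs.get)


division_costs = {"none": 40, "horizontal": 28, "vertical": 31, "cross": 28}
target = select_target_mode(division_costs)
```

The same helper would serve for choosing a target prediction mode from per-mode prediction costs.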
The device according to any one of claims 17 to 20, wherein the preset prediction modes comprise a first prediction mode, and the first prediction mode performs prediction based on the pixel maximum value of a prediction block of the coding block to be processed;

Calculating the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes comprises:

Calculating a first prediction residual based on the pixel maximum value and the pixels of a first remaining prediction block of the coding block to be processed, where the first remaining prediction block is a prediction block, among the prediction blocks of the coding block to be processed, other than the prediction block containing the pixel maximum value;

If the first prediction residual is greater than a first preset residual threshold, adding a first preset cost value to the prediction cost of the coding block to be processed in the first prediction mode to obtain a first prediction cost.
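One possible reading of the maximum-value prediction mode recited above is sketched below. Several points are assumptions of this sketch rather than statements of the application: each prediction block is a flat list of pixels, the residual is computed per pixel as the maximum minus the pixel, and the preset value is added once for every pixel whose residual exceeds the threshold. The name `first_mode_cost` and all numbers are hypothetical.

```python
from typing import List


def first_mode_cost(pred_blocks: List[List[int]],
                    threshold: int,
                    penalty: int,
                    base_cost: int = 0) -> int:
    """Prediction cost under a max-based prediction mode (illustrative only)."""
    block_maxima = [max(block) for block in pred_blocks]
    max_idx = block_maxima.index(max(block_maxima))  # block holding the maximum
    pixel_max = block_maxima[max_idx]

    cost = base_cost
    for idx, block in enumerate(pred_blocks):
        if idx == max_idx:
            continue  # only the remaining prediction blocks are examined
        for pixel in block:
            residual = pixel_max - pixel
            if residual > threshold:
                cost += penalty  # assumed: one penalty per offending pixel
    return cost


# Maximum is 100 (block 0); among the other blocks only the pixel 60 exceeds the threshold.
cost = first_mode_cost([[100, 98], [97, 60], [99, 95]], threshold=8, penalty=4)
```

A minimum-based mode would mirror this with `min` and a residual of pixel minus minimum.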
The device according to any one of claims 17 to 20, wherein the preset prediction modes comprise a second prediction mode, and the second prediction mode performs prediction based on the pixel minimum value of a prediction block of the coding block to be processed;

Calculating the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes comprises:

Calculating a second prediction residual based on the pixels of a second remaining prediction block of the coding block to be processed and the pixel minimum value, where the second remaining prediction block is a prediction block, among the prediction blocks of the coding block to be processed, other than the prediction block containing the pixel minimum value;

If the second prediction residual is greater than a second preset residual threshold, adding a second preset cost value to the prediction cost of the coding block to be processed in the second prediction mode to obtain a second prediction cost.
The device according to claim 17, wherein the selecting a target prediction mode from the preset prediction modes according to the prediction cost comprises:

Obtaining the smallest prediction cost from the prediction costs;

The prediction mode corresponding to the smallest prediction cost is selected from the preset prediction modes, and the selected prediction mode is used as the target prediction mode.
The device according to claim 19, wherein the preset division mode includes no division;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

Adding the number of coded bits corresponding to no division, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels to obtain a first header information cost corresponding to the coding block to be processed.
The device according to claim 19, wherein the preset division mode comprises horizontal dichotomy;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

Calculating the total number of base pixel coded bits corresponding to the coding block to be processed according to the horizontal dichotomy and the number of coded bits corresponding to the preset base pixels;

Adding the number of coded bits corresponding to the horizontal dichotomy, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed to obtain a second header information cost corresponding to the coding block to be processed.
The device according to claim 19, wherein the preset division mode includes vertical dichotomy;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

Calculating the total number of base pixel coded bits corresponding to the coding block to be processed according to the vertical dichotomy and the number of coded bits corresponding to the preset base pixels;

Adding the number of coded bits corresponding to the vertical dichotomy, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed to obtain a third header information cost corresponding to the coding block to be processed.
The device according to claim 19, wherein the preset division mode comprises a cross division;

The calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:

Calculating the total number of base pixel coded bits corresponding to the coding block to be processed according to the cross division and the number of coded bits corresponding to the preset base pixels;

Adding the number of coded bits corresponding to the cross division, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed to obtain a fourth header information cost corresponding to the coding block to be processed.
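The four header information costs recited above (no division, horizontal dichotomy, vertical dichotomy, cross division) share one pattern, which may be sketched as follows. The sketch assumes that the total number of base pixel bits equals the number of resulting prediction blocks (1, 2, 2 and 4 respectively) multiplied by the bits of one base pixel. That multiplication, the table `SUB_BLOCKS`, the function name, and the bit counts are assumptions for illustration only.

```python
# Assumed number of prediction blocks produced by each division mode.
SUB_BLOCKS = {"none": 1, "horizontal": 2, "vertical": 2, "cross": 4}


def header_cost(mode: str,
                division_mode_bits: int,
                prediction_mode_bits: int,
                base_pixel_bits: int) -> int:
    """Header information cost, in bits, for one division mode."""
    total_base_pixel_bits = SUB_BLOCKS[mode] * base_pixel_bits  # assumption
    return division_mode_bits + prediction_mode_bits + total_base_pixel_bits


# Hypothetical sizes: 2 bits for the division mode, 1 for the prediction mode,
# 8 bits per base pixel.
header_costs = {mode: header_cost(mode, 2, 1, 8) for mode in SUB_BLOCKS}
```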
The device according to claim 17, wherein the encoding according to the prediction residual comprises:

Quantizing the prediction residual according to a preset quantization step to obtain a quantized residual;

Encoding each of the multiple channels separately according to the quantized residual and a preset encoding method to obtain an output code stream.
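A minimal sketch of the quantization step recited above is given below. The claim names only a preset quantization step, so the rounding rule used here (truncation of the magnitude toward zero, with the sign preserved) and the function name `quantize` are assumptions of this sketch.

```python
from typing import List


def quantize(residuals: List[int], step: int) -> List[int]:
    """Uniform quantization with truncation toward zero (assumed rounding rule)."""
    if step <= 0:
        raise ValueError("quantization step must be positive")
    return [(abs(r) // step) * (1 if r >= 0 else -1) for r in residuals]


quantized = quantize([7, -7, 3, 0, -12], step=4)
```

Because the quantizer depends only on the residual and the step, it requires no reconstructed pixels, which is consistent with the abstract's remark that no reconstruction is needed during quantization.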
The device according to claim 28, characterized in that, before separately encoding each of the multiple channels according to the quantized residual and the preset encoding method, the steps further comprise:

Judging whether the prediction residual is greater than a preset threshold;

If the prediction residual is greater than the preset threshold, determining that the preset encoding method is VLC coding;

If the prediction residual is less than or equal to the preset threshold, determining that the preset encoding method is FLC coding.
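The threshold test recited above may be sketched as follows. The claim does not say how the residuals of a block are reduced to a single value for the comparison, so using the largest absolute residual is an assumption of this sketch, as is the function name `choose_coding_mode`.

```python
from typing import List


def choose_coding_mode(residuals: List[int], threshold: int) -> str:
    """Pick variable-length (VLC) or fixed-length (FLC) coding for a block."""
    peak = max((abs(r) for r in residuals), default=0)  # assumed reduction
    return "VLC" if peak > threshold else "FLC"


mode_large = choose_coding_mode([1, -2, 9], threshold=8)
mode_small = choose_coding_mode([1, -2, 8], threshold=8)
```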
The device according to claim 28 or 29, wherein separately encoding each of the multiple channels according to the quantized residual and the preset encoding method comprises:

Encoding the MSB of each of the multiple channels separately by using the preset encoding method according to the quantized residual;

When the number of encoded bits of any of the multiple channels reaches a preset encoding bit rate, stopping encoding the MSB of that channel;

Encoding the LSB of each of the multiple channels separately by using the preset encoding method according to the quantized residual;

When the number of encoded bits of any of the multiple channels reaches the preset encoding bit rate, stopping encoding the LSB of that channel.
The device according to claim 30, wherein encoding the MSB of each of the multiple channels separately by using the preset encoding method according to the quantized residual comprises:

Converting the quantized residual into a bit-plane representation;

Based on the bit-plane representation, using the preset encoding method to start from the first row of the MSB of each of the multiple channels and, from left to right, take every preset number of bits as a codeword for Huffman table-lookup coding.
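The bit-plane conversion and grouped table lookup recited above might be illustrated as follows. This is a sketch under stated assumptions: the values are non-negative magnitudes (sign handling is not shown), row 0 is the most significant bit plane, the row length is a multiple of the group size, and `HYPOTHETICAL_TABLE` is an invented prefix-free code, not a table from the application.

```python
from typing import List


def to_bit_planes(values: List[int], bit_depth: int) -> List[List[int]]:
    """Return bit planes, most significant first; each row holds one bit per value."""
    return [[(v >> b) & 1 for v in values] for b in range(bit_depth - 1, -1, -1)]


def group_symbols(row: List[int], group_size: int) -> List[str]:
    """Group a bit-plane row, left to right, into fixed-size symbols."""
    return ["".join(str(bit) for bit in row[i:i + group_size])
            for i in range(0, len(row), group_size)]


# Invented prefix-free table for 2-bit symbols (illustration only).
HYPOTHETICAL_TABLE = {"00": "0", "01": "10", "10": "110", "11": "111"}

planes = to_bit_planes([5, 2, 7, 0], bit_depth=3)  # 101, 010, 111, 000
msb_symbols = group_symbols(planes[0], group_size=2)
encoded = "".join(HYPOTHETICAL_TABLE[s] for s in msb_symbols)
```

For the LSB rows, the corresponding claim recites bit-by-bit coding instead, which would correspond to emitting each row of `planes` directly without grouping.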
The device according to claim 30, wherein encoding the LSB of each of the multiple channels separately by using the preset encoding method according to the quantized residual comprises:

Converting the quantized residual into a bit-plane representation;

Based on the bit-plane representation, using the preset encoding method to encode bit by bit from left to right, starting from the first row of the LSB of each of the multiple channels.
A movable platform, characterized in that it comprises:

A movable platform body; and

The video image encoding device according to any one of claims 17 to 32, wherein the video image encoding device is installed on the movable platform body.
A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions, and when a processor executes the computer-executable instructions, the video image coding method is implemented.