WO2021168624A1 - Video image encoding method and device, and movable platform - Google Patents
- Publication number: WO2021168624A1 (PCT/CN2020/076469)
- Authority: WO, WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/587—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
Definitions
- The embodiments of the present application relate to coding technology, and in particular to a video image encoding method, device, and movable platform.
- Owing to their intuitiveness and efficiency, video image signals have become the most important way for people to obtain information in daily life. Because a video image signal contains a large amount of data, it occupies a large amount of transmission bandwidth and storage space. For effective transmission and storage, video image signals need to be compressed and encoded.
- Differential pulse-code modulation (DPCM), a simple prediction method, is widely used in image compression coding.
- DPCM generally uses the already encoded pixels to the left or above for prediction.
- Because DPCM predicts from adjacent pixels, it creates data dependence, is prone to cumulative errors, and is difficult to process in parallel in hardware, which in turn leads to higher coding error rates and slower coding speeds.
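The trade-off described above can be illustrated with a minimal sketch. It is not taken from the application: the function names, the uniform quantizer, and the `step` parameter are assumptions chosen for illustration. The closed-loop variant predicts from the reconstructed left neighbour, so each pixel must wait for the previous one; the open-loop variant predicts from the original left neighbour, which can be computed in parallel but lets quantization error accumulate at the decoder.

```python
import math


def quantize(value, step):
    """Uniform quantizer (illustrative): map a residual to an integer level."""
    return math.floor(value / step + 0.5)


def dpcm_encode_closed_loop(row, step):
    """Predict each pixel from the reconstructed left neighbour.

    Pixel i cannot be coded before pixel i-1 has been reconstructed,
    which is the serial data dependence discussed in the text.
    """
    levels, prev = [], 0
    for pixel in row:
        level = quantize(pixel - prev, step)
        levels.append(level)
        prev += level * step  # reconstruction needed before the next pixel
    return levels


def dpcm_encode_open_loop(row, step):
    """Predict each pixel from the original left neighbour.

    All residuals are independent of each other, but the decoder only has
    reconstructed neighbours, so quantization errors add up along the row.
    """
    predictions = [0] + list(row[:-1])
    return [quantize(p - q, step) for p, q in zip(row, predictions)]


def dpcm_decode(levels, step):
    """Rebuild the row by accumulating dequantized residuals."""
    out, prev = [], 0
    for level in levels:
        prev += level * step
        out.append(prev)
    return out
```

Decoding a slowly rising row such as `[8, 9, ..., 16]` with `step=4` shows the difference: the closed-loop error stays within half a quantization step, while the open-loop error grows along the row.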
- The embodiments of the present application provide a video image encoding method, device, and movable platform to overcome at least one of the above-mentioned problems.
- An embodiment of the present application provides a video image encoding method, including:
- Encoding is performed according to the prediction residual.
- An embodiment of the present application provides a video image encoding device, including a memory, a processor, and computer-executable instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer-executable instructions:
- Encoding is performed according to the prediction residual.
- An embodiment of the present application provides a movable platform, including:
- a movable platform body; and
- a video image encoding device installed on the movable platform body.
- An embodiment of the present application provides a computer-readable storage medium that stores computer-executable instructions.
- When a processor executes the computer-executable instructions, the video image encoding method described in the first aspect above is implemented.
- In the video image encoding method, device, and movable platform provided by the embodiments of the present application, each of the multiple channels of an image frame in the video bitstream is divided to obtain the coding block to be processed. The division cost of dividing the coding block to be processed in each of the preset division modes is calculated, the target division mode is determined, and the coding block to be processed is divided according to the target division mode to obtain prediction blocks. The prediction cost of predicting each prediction block of the coding block to be processed in each of the preset prediction modes is then calculated, the target prediction mode is determined, and each prediction block of the coding block to be processed is predicted according to the target prediction mode. This reduces data dependence and the accumulation of data errors, facilitates parallel data processing in hardware, reduces hardware implementation resources, and requires no reconstruction during quantization, which improves coding speed.
- FIG. 1 is a schematic diagram of the architecture of a video image coding system provided by an embodiment of the application
- FIG. 2 is a schematic flowchart of a video image encoding method provided by an embodiment of the application
- FIG. 3 is a schematic diagram of dividing each of multiple channels of an image frame in a video bitstream according to an embodiment of the application
- FIG. 4 is a schematic diagram of a preset division mode provided by an embodiment of the application.
- FIG. 5 is a schematic flowchart of another video image encoding method provided by an embodiment of the application.
- FIG. 6 is a schematic flowchart of still another video image encoding method provided by an embodiment of this application.
- FIG. 7 is a schematic diagram of a first prediction mode provided by an embodiment of the application.
- FIG. 8 is a schematic flowchart of another video image encoding method provided by an embodiment of the application.
- FIG. 9 is a schematic diagram of a second prediction mode provided by an embodiment of the application.
- FIG. 10 is a schematic flowchart of another video image encoding method provided by an embodiment of this application.
- FIG. 11 is a schematic structural diagram of a video image encoding device provided by an embodiment of this application.
- FIG. 12 is a schematic diagram of the hardware structure of a video image encoding device provided by an embodiment of the application.
- FIG. 13 is a schematic structural diagram of a movable platform provided by an embodiment of the application.
- Video image coding usually refers to processing a sequence of pictures that form a video or video sequence.
- The terms "picture", "frame" and "image" can be used as synonyms.
- The video encoding in this application is performed on the source side, and usually includes processing (for example, by compressing) the original video picture to reduce the amount of data required to represent the video picture (and thus to store and/or transmit it more efficiently).
- Video decoding is performed on the destination side and usually involves inverse processing relative to the encoder to reconstruct the video picture.
- Each picture of a video sequence is usually divided into a set of non-overlapping blocks, and is usually coded at the block level.
- The encoder side usually processes, that is, encodes, the video at the block (video block) level.
- A prediction block is generated by prediction, and the prediction block is subtracted from the current block (the block currently being processed or to be processed) to obtain the residual block.
- The residual block is transformed and quantized in the transform domain to reduce the amount of data to be transmitted (compression), and the decoder side applies the inverse processing relative to the encoder to the encoded or compressed block to reconstruct the current block for representation.
- The encoder duplicates the decoder processing loop so that the encoder and decoder generate the same prediction and/or reconstruction for processing, that is, encoding, subsequent blocks.
- Differential pulse-code modulation (DPCM), a simple prediction method, is widely used in image compression coding.
- DPCM generally uses the already encoded pixels to the left or above for prediction.
- Because DPCM predicts from adjacent pixels, it creates data dependence, is prone to cumulative errors, and is difficult to process in parallel in hardware, which in turn leads to higher encoding error rates and slower encoding speeds.
- The present application therefore provides a video image encoding method that divides each of the multiple channels of an image frame in the video bitstream to obtain the coding block to be processed, calculates the division cost of dividing the coding block to be processed in each of the preset division modes, determines the target division mode, and divides the coding block to be processed according to the target division mode to obtain prediction blocks. It then calculates the prediction cost of predicting each prediction block of the coding block to be processed in each of the preset prediction modes, determines the target prediction mode, and predicts each prediction block of the coding block to be processed according to the target prediction mode. This reduces data dependence and the accumulation of data errors, facilitates parallel data processing in hardware, reduces hardware implementation resources, and requires no reconstruction during quantization, which improves coding speed.
- The video image encoding method provided by this application can be applied to the video image encoding system architecture shown in FIG. 1.
- The video image encoding system 10 includes a source device 12 and a target device 14, and the source device 12 includes a picture acquisition device 121, a preprocessor 122, an encoder 123, and a communication interface 124.
- The target device 14 includes a display device 141, a processor 142, a decoder 143, and a communication interface 144.
- The source device 12 sends the encoded data 13 obtained by encoding to the target device 14.
- The method of this application is applied to the encoder 123.
- The source device 12 may be referred to as a video encoding device or video encoding apparatus.
- The target device 14 may be referred to as a video decoding device or video decoding apparatus.
- The source device 12 and the target device 14 may be examples of video coding devices or video coding apparatuses.
- The source device 12 and the target device 14 may include any of a variety of devices, including any type of handheld or stationary device, for example, notebook or laptop computers, mobile phones, smartphones, tablets or tablet computers, video cameras, desktop computers, set-top boxes, televisions, display devices, digital media players, video game consoles, video streaming devices (such as content service servers or content distribution servers), broadcast receiver devices, broadcast transmitter devices, etc., and may use no operating system or any type of operating system.
- The source device 12 and the target device 14 may be equipped for wireless communication. Therefore, the source device 12 and the target device 14 may be wireless video image encoding devices.
- The video image encoding system 10 shown in FIG. 1 is only an example, and the technology of this application can be applied to video coding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding and decoding devices.
- The data can be retrieved from local storage, streamed over a network, etc.
- The video encoding device can encode data and store the data in memory, and/or the video decoding device can retrieve the data from memory and decode the data.
- Encoding and decoding are performed by devices that do not communicate with each other but only encode data to memory and/or retrieve data from memory and decode the data.
- The encoder 123 of the video image encoding system 10 may also be referred to as a video encoder, and the decoder 143 may also be referred to as a video decoder.
- The picture acquisition device 121 may include or be any type of picture capture device, for example one for capturing real-world pictures, and/or any type of device for generating pictures or comments (for screen content encoding, some text on the screen is also considered part of the picture or image to be encoded), for example a computer graphics processor for generating computer-animated pictures, or any type of device for obtaining and/or providing real-world pictures or computer-animated pictures (for example, screen content or virtual reality (VR) pictures), and/or any combination thereof (for example, augmented reality (AR) pictures).
- A picture is or can be regarded as a two-dimensional array or matrix of sampling points with brightness values.
- The sampling points in the array may also be called pixels or pels (short for picture elements).
- The number of sampling points of the array in the horizontal and vertical directions (or axes) defines the size and/or resolution of the picture.
- To represent color, three color components are usually used, that is, a picture can be represented as or contain three sample arrays.
- In the RGB format or color space, a picture includes corresponding red, green, and blue sample arrays.
- Each pixel is usually expressed in a luminance/chrominance format or color space, for example YCbCr, which includes a luminance (luma) component indicated by Y (sometimes indicated by L) and two chrominance (chroma) components indicated by Cb and Cr.
- The luminance component Y represents luminance or gray-level intensity (for example, the two are the same in a grayscale picture), and the two chrominance components Cb and Cr represent the chrominance or color information components.
- A picture in the YCbCr format includes a luminance sample array of the luminance component (Y) and two chrominance sample arrays of the chrominance components (Cb and Cr).
- Pictures in RGB format can be converted or transformed to YCbCr format, and vice versa. This process is also called color conversion or color transformation.
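As an illustration of such a color conversion, the following sketch converts a single 8-bit pixel between RGB and YCbCr. The application does not state which conversion matrix is used; the full-range BT.601 coefficients (as used in JFIF) are assumed here, and the function names are hypothetical.

```python
def _clip8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return min(255, max(0, int(round(value))))


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (assumed: full-range BT.601)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return _clip8(y), _clip8(cb), _clip8(cr)


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, up to rounding."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return _clip8(r), _clip8(g), _clip8(b)
```

For gray pixels (equal R, G, and B) both chroma components sit at the mid value 128, which matches the statement that luminance and gray-level intensity coincide in a grayscale picture.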
- The picture acquisition device 121 may be, for example, a camera for capturing pictures, a memory such as a picture memory that contains or stores previously captured or generated pictures, and/or any type of (internal or external) interface for acquiring or receiving pictures.
- The camera may be, for example, a local camera or a camera integrated in the source device, and the memory may be a local memory or, for example, a memory integrated in the source device.
- The interface may be, for example, an external interface for receiving pictures from an external video source.
- The external video source is, for example, an external picture capturing device such as a camera, an external memory, or an external picture generating device.
- The external picture generating device is, for example, an external computer graphics processor, computer, or server.
- The interface can be any type of interface according to any proprietary or standardized interface protocol, such as a wired or wireless interface or an optical interface.
- The interface for acquiring the picture data 125 in FIG. 1 may be the same interface as the communication interface 124 or a part of the communication interface 124.
- The picture data 125 (for example, video data) may be referred to as the original picture or original picture data.
- The preprocessor 122 is used to receive the picture data 125 and perform preprocessing on the picture data 125 to obtain a preprocessed picture (or preprocessed picture data) 126.
- The preprocessing performed by the preprocessor 122 may include trimming, color format conversion (for example, conversion from RGB to YCbCr), toning, or denoising. It can be understood that the preprocessor 122 may be an optional component.
- The encoder 123 (e.g., a video encoder) is used to receive the preprocessed picture (or preprocessed picture data) 126 and provide encoded picture data 127.
- The communication interface 124 of the source device 12 can be used to receive the encoded picture data 127 and transmit it to other devices, for example the target device 14 or any other device, for storage or direct reconstruction, or to process the encoded picture data 127 before correspondingly storing the encoded data 13 and/or transmitting the encoded data 13 to other devices, such as the target device 14 or any other device for decoding or storage.
- The communication interface 144 of the target device 14 is used, for example, to directly receive the encoded picture data 127 or the encoded data 13 from the source device 12 or any other source. Any other source is, for example, a storage device, such as an encoded picture data storage device.
- The communication interface 124 and the communication interface 144 can be used to transmit or receive the encoded picture data 127 or the encoded data 13 over a direct communication link between the source device 12 and the target device 14 or over any type of network.
- The direct communication link is, for example, a direct wired or wireless connection, and the network is, for example, a wired or wireless network or any combination thereof, or any type of private network and public network, or any combination thereof.
- The communication interface 124 may be used, for example, to encapsulate the encoded picture data 127 into a suitable format, such as packets, for transmission over a communication link or communication network.
- The communication interface 144, which forms the counterpart of the communication interface 124, may be used, for example, to decapsulate the encoded data 13 to obtain the encoded picture data 127.
- Both the communication interface 124 and the communication interface 144 can be configured as one-way communication interfaces, as indicated by the arrow for the encoded picture data 127 pointing from the source device 12 to the target device 14 in FIG. 1, or as two-way communication interfaces, and can be used, for example, to send and receive messages to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or data transmission, such as the transmission of encoded picture data.
- The decoder 143 is used to receive the encoded picture data 127 and provide decoded picture data (or a decoded picture) 145.
- The processor 142 of the target device 14 is used to post-process the decoded picture data (or decoded picture) 145 to obtain post-processed picture data 146, for example, a post-processed picture.
- The post-processing performed by the processor 142 may include, for example, color format conversion (for example, conversion from YCbCr to RGB), toning, trimming, or resampling, or any other processing for preparing the decoded picture data 145 for display by, for example, the display device 141.
- The display device 141 of the target device 14 is used to receive the post-processed picture data 146 to display the picture to, for example, a user or viewer.
- The display device 141 may be or may include any type of display for presenting the reconstructed picture, for example, an integrated or external display or monitor.
- The display may include a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other type of display.
- Although FIG. 1 depicts the source device 12 and the target device 14 as separate devices, a device embodiment may also include both the source device 12 and the target device 14 or the functionality of both, that is, the source device 12 or its corresponding functionality and the target device 14 or its corresponding functionality.
- The same hardware and/or software, separate hardware and/or software, or any combination thereof may be used to implement the source device 12 or its corresponding functionality and the target device 14 or its corresponding functionality.
- The existence and (exact) division of the functionality of the different units, or of the functionality of the source device 12 and/or the target device 14 shown in FIG. 1, may vary according to the actual device and application.
- Both the encoder 123 (e.g., a video encoder) and the decoder 143 (e.g., a video decoder) may be implemented using any of a variety of suitable circuits, for example a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA).
- Where the technology is implemented in software, the device can store the software instructions in a suitable non-transitory computer-readable storage medium and can use one or more processors to execute the instructions in hardware to carry out the technology of the present application.
- Each of the encoder 123 and the decoder 143 may be included in one or more encoders or decoders, and any one of the encoders or decoders may be integrated as part of a combined encoder/decoder (codec) in the corresponding device.
- The decoder 143 may be used to perform the reverse process.
- The decoder 143 can be used to receive and parse such syntax elements, and decode the related video data accordingly.
- The encoder 123 may entropy encode one or more defined syntax elements into an encoded video bitstream. In such instances, the decoder 143 can parse such syntax elements and decode the related video data accordingly.
- FIG. 2 is a schematic flowchart of a video image encoding method provided by an embodiment of this application.
- The execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1.
- The method includes:
- The above-mentioned video bitstream may be in a preset video format, such as the YUV420 video format, which is not particularly limited in this application.
- The encoder may divide each of the Y, U, and V channels of the image frame in the video bitstream.
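Before each channel can be divided, the Y, U, and V planes have to be separated. The sketch below assumes a planar I420 buffer layout (all Y samples, then U, then V, with chroma subsampled by two in both directions) and even frame dimensions; the application only says "YUV420", so the layout and the function name `split_i420` are assumptions.

```python
def split_i420(frame, width, height):
    """Split a planar I420 buffer into Y, U and V planes (lists of rows).

    Assumes even width and height. The chroma planes are half the luma
    size in each direction, so they hold a quarter of the samples each.
    """
    chroma_w, chroma_h = width // 2, height // 2
    y_size, c_size = width * height, chroma_w * chroma_h
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("buffer length does not match an I420 frame")

    def rows(buf, row_width):
        # Cut a flat buffer into rows of row_width samples.
        return [list(buf[i:i + row_width]) for i in range(0, len(buf), row_width)]

    y_plane = rows(frame[:y_size], width)
    u_plane = rows(frame[y_size:y_size + c_size], chroma_w)
    v_plane = rows(frame[y_size + c_size:], chroma_w)
    return y_plane, u_plane, v_plane
```

Each returned plane can then be divided independently, which is what allows the three channels to be handled by separate processing paths.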
- the foregoing division of each of the multiple channels of the image frame in the video code stream to obtain the code block to be processed includes:
- the above-mentioned first preset size and second preset size can be set according to actual conditions, which are not particularly limited in the embodiment of the present application.
- the encoder can divide each of the multiple channels of the input image frame into non-overlapping 64x4 slices. It can be coded and decoded independently.
- the Y component in the 64x4 slice block is divided into 16 4x4 coding blocks; the UV component is divided into 4x2 coding blocks.
- the coded block to be processed is obtained from the coded block.
- the 4x4 coded block of the Y component is obtained from the coded block as the coded block to be processed.
- the 4x2 coding block of the U component can also be obtained from the coding blocks as the above-mentioned coding block to be processed, which is not particularly limited in the embodiment of the present application.
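- the slice and block division described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the helper name `tile`, the 128x8 test plane, and the requirement that the plane size be an exact multiple of the tile size are all assumptions made for the example.

```python
def tile(plane, tile_w, tile_h):
    """Cut a 2-D plane (list of rows) into non-overlapping tile_w x tile_h tiles,
    scanning left to right, then top to bottom."""
    height, width = len(plane), len(plane[0])
    # Assumption: the plane size is an exact multiple of the tile size.
    assert width % tile_w == 0 and height % tile_h == 0
    return [
        [row[x:x + tile_w] for row in plane[y:y + tile_h]]
        for y in range(0, height, tile_h)
        for x in range(0, width, tile_w)
    ]


# Hypothetical 128x8 Y plane filled with a running index.
y_plane = [[x + 128 * y for x in range(128)] for y in range(8)]
slices = tile(y_plane, 64, 4)   # non-overlapping 64x4 slices
blocks = tile(slices[0], 4, 4)  # 4x4 coding blocks of the first slice
print(len(slices), len(blocks))
```

- the same helper can be applied to the U and V planes with a 4x2 tile size to obtain the chroma coding blocks.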
- S202 Calculate the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode.
- the foregoing preset division mode may be set according to actual conditions.
- the foregoing preset division mode may include no division (None), horizontal dichotomy (Horizontal), vertical dichotomy (Vertical), cross division (Split), etc.
- the embodiments of the present application do not impose any special restrictions on this.
- the encoder calculates the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode, and then determines the final division mode according to the division cost.
- S203 Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.
- the foregoing selection of the target division mode from the foregoing preset division modes according to the foregoing division cost includes:
- the division mode corresponding to the minimum division cost is selected from the preset division modes, and the selected division mode is used as the target division mode.
- the target division mode can also be selected according to the actual situation, for example, the target division cost is obtained from the above division cost, and the division mode corresponding to the target division cost is selected from the above preset division mode.
- the selected division mode is used as the above-mentioned target division mode, which is not particularly limited in the embodiment of the present application.
- the aforementioned target division cost may be other division costs except the aforementioned minimum division cost, which can be specifically determined according to actual conditions.
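- a minimal sketch of the minimum-cost selection, assuming the division costs have already been computed and are held in a dictionary keyed by mode name. The function name `select_target_mode`, the cost values, and the tie-breaking rule (the first mode listed wins) are assumptions for illustration; selecting a non-minimum target cost, which the text also allows, is not covered.

```python
def select_target_mode(division_costs):
    """Return the division mode with the smallest cost.
    Ties go to the mode listed first (an assumption, not stated in the text)."""
    return min(division_costs, key=division_costs.get)


# Hypothetical division costs for one coding block to be processed.
costs = {"None": 43, "Horizontal": 35, "Vertical": 51, "Split": 35}
print(select_target_mode(costs))
```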
- S204 Calculate the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each prediction mode of the preset prediction mode.
- the foregoing preset prediction mode may be set according to actual conditions.
- the foregoing preset prediction mode includes the first prediction mode maxDPCM and the second prediction mode minDPCM, which are not particularly limited in the embodiment of the present application.
- maxDPCM predicts according to the maximum pixel value in the prediction block of the coding block to be processed: the maximum value in the prediction block is selected first, and each remaining value is then subtracted from the maximum value to obtain the prediction residual.
- minDPCM predicts according to the minimum pixel value in the prediction block of the coding block to be processed: the minimum value in the prediction block is selected first, and the minimum value is then subtracted from each of the other values to obtain the prediction residual.
- neither of the above two prediction modes produces negative residuals, so there is no need to encode signs, which increases the coding speed. Moreover, the above two prediction modes do not need adjacent pixels for prediction, which reduces data dependence and makes cumulative errors unlikely. In addition, subsequent quantization does not require reconstruction, so parallel processing can be used, which addresses the difficulty of hardware parallel processing.
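- the two prediction modes can be sketched as follows. This is an illustration under assumptions: a prediction block is represented as a flat list of pixel values, the function names `max_dpcm` and `min_dpcm` are hypothetical, and the base pixel itself is kept in the residual list (where it is 0) for simplicity, rather than being excluded.

```python
def max_dpcm(block):
    """maxDPCM: subtract every pixel from the block maximum (residuals >= 0)."""
    base = max(block)
    return base, [base - p for p in block]


def min_dpcm(block):
    """minDPCM: subtract the block minimum from every pixel (residuals >= 0)."""
    base = min(block)
    return base, [p - base for p in block]


# Hypothetical prediction block with maximum 26 and minimum 22.
block = [22, 24, 26, 23]
print(max_dpcm(block))
print(min_dpcm(block))
```

- both functions read only the pixels of the block itself, which is why no neighbouring reconstructed pixels are required.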
- the encoder calculates the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each prediction mode of the preset prediction mode, and then determines the final prediction mode according to the prediction cost.
- the foregoing selection of the target prediction mode from the foregoing preset prediction modes according to the foregoing prediction cost includes:
- the prediction mode corresponding to the minimum prediction cost is selected from the preset prediction modes, and the selected prediction mode is used as the target prediction mode.
- the target prediction mode can also be selected according to the actual situation; for example, a target prediction cost is obtained from the above prediction costs, the prediction mode corresponding to the target prediction cost is selected from the above preset prediction modes, and the selected prediction mode is used as the aforementioned target prediction mode, which is not particularly limited in the embodiment of the present application.
- the aforementioned target prediction cost may be a prediction cost other than the aforementioned minimum prediction cost, which can be specifically determined according to actual conditions.
- the encoder can run multiple quantization and encoding processes at the same time on the above prediction residuals, and select the encoding parameters that give the best image quality at the same bit rate, to obtain the output bit stream.
- the video image encoding method divides each of the multiple channels of the image frame in the video code stream to obtain the coding block to be processed; calculates the division cost corresponding to block division of the coding block to be processed in each of the preset division modes, and determines the target division mode accordingly; divides the coding block to be processed according to the target division mode to obtain the prediction blocks; calculates the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes, and determines the target prediction mode accordingly; and predicts each prediction block of the coding block to be processed according to the target prediction mode. This can reduce data dependence and the accumulation of data errors, is conducive to hardware parallel data processing, reduces hardware implementation resources, and requires no reconstruction in the quantization process, which improves the coding speed.
- FIG. 5 is a schematic flowchart of another video image encoding method proposed in an embodiment of this application.
- the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 5, the method includes:
- step S501 is implemented in the same manner as the foregoing step S201, and will not be repeated here.
- the foregoing preset division mode may include None, Horizontal, Vertical, Split, etc. as shown in FIG. 4.
- the preset prediction mode may include the first prediction mode maxDPCM, the second prediction mode minDPCM, and so on.
- the prediction mode of each prediction block of the above-mentioned coding block to be processed is the same: all are the above-mentioned first prediction mode maxDPCM, or all are the above-mentioned second prediction mode minDPCM.
- in each of the foregoing division modes, the prediction residual corresponding to each prediction block of the foregoing coding block to be processed is calculated in each of the foregoing prediction modes; if a prediction residual is greater than the preset residual threshold, the preset cost value is added to the accumulated prediction cost of the coding block to be processed in that prediction mode, to obtain the prediction cost predCost.
- the above-mentioned preset residual threshold and preset cost value can be set according to actual conditions, and the embodiment of the present application does not specifically limit this. For example, whenever the prediction residual is greater than 15, 8 is added to the prediction cost predCost.
- headerCost includes at least one of the number of coded bits corresponding to the division mode, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixel.
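- a minimal sketch of the predCost accumulation, using the example values from the text (threshold 15, cost value 8) as defaults. The function name `pred_cost` and the flat list of residuals are assumptions for illustration.

```python
def pred_cost(residuals, residual_threshold=15, cost_value=8):
    """Add cost_value once for every residual above residual_threshold."""
    cost = 0
    for r in residuals:
        if r > residual_threshold:  # strictly greater, as in the text
            cost += cost_value
    return cost


# Hypothetical residuals of one coding block in one prediction mode.
print(pred_cost([0, 3, 16, 20, 15]))
```

- a residual equal to the threshold adds nothing, because the text requires the residual to be greater than the threshold.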
- the foregoing preset division mode includes no division (None).
- the foregoing calculation of the header information cost corresponding to the foregoing coding block to be processed in each of the foregoing division modes includes:
- the number of coded bits corresponding to the non-division, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels are added to obtain the first header information cost corresponding to the coded block to be processed.
- the foregoing preset division mode includes horizontal dichotomy (Horizontal).
- the foregoing calculation of the header information cost corresponding to the foregoing coding block to be processed in each of the foregoing division modes includes:
- the number of coding bits corresponding to the horizontal dichotomy, the number of coding bits corresponding to the prediction mode, and the total number of base pixel coding bits corresponding to the coding block to be processed are added to obtain the second header information cost corresponding to the coding block to be processed.
- the foregoing preset division mode includes vertical dichotomy (Vertical).
- the foregoing calculation of the header information cost corresponding to the foregoing coding block to be processed in each of the foregoing division modes includes:
- the foregoing preset division mode includes a cross division (Split).
- the foregoing calculation of the header information cost corresponding to the foregoing coding block to be processed in each of the foregoing division modes includes:
- the number of coded bits corresponding to the cross division, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coded block to be processed are added to obtain the fourth header information cost corresponding to the coded block to be processed.
- the division cost is the sum of the aforementioned predCost and headerCost, and the division mode is then selected according to the division cost.
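- the relationship between headerCost, predCost, and the division cost can be sketched as follows. The bit widths (2 bits for the division mode, 1 bit for the prediction mode, 8 bits per base pixel) are hypothetical, as is the assumption that the total base pixel bits equal one base pixel per prediction block (1, 2, 2, and 4 blocks for None, Horizontal, Vertical, and Split); the text does not give these numbers.

```python
# Assumed number of prediction blocks produced by each division mode.
NUM_PRED_BLOCKS = {"None": 1, "Horizontal": 2, "Vertical": 2, "Split": 4}


def header_cost(mode, mode_bits=2, pred_bits=1, base_pixel_bits=8):
    """Division-mode bits + prediction-mode bits + base pixel bits (all assumed widths)."""
    return mode_bits + pred_bits + NUM_PRED_BLOCKS[mode] * base_pixel_bits


def division_cost(pred_cost_value, mode):
    """Division cost = predCost + headerCost, as stated in the text."""
    return pred_cost_value + header_cost(mode)


# Hypothetical predCost of 16 for a horizontally divided block.
print(division_cost(16, "Horizontal"))
```

- finer division modes pay for more base pixels in the header, so they are only selected when they lower predCost by more than the added header bits.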
- S505 Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.
- S506 Calculate the prediction cost corresponding to the prediction of each prediction block of the coding block to be processed in each prediction mode of the preset prediction mode.
- steps S505-S508 are implemented in the same manner as the foregoing steps S203-S206, and will not be repeated here.
- the video image coding method provided in this embodiment calculates, in each division mode, the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each prediction mode, as well as the header information cost corresponding to the coding block to be processed, and from these accurately determines the division cost corresponding to block division of the coding block to be processed in each division mode, ensuring the accuracy of subsequent processing. In addition, by dividing each of the multiple channels of the image frame in the video bitstream and predicting each prediction block of the coding block to be processed, the method can reduce data dependence and the accumulation of data errors, which is conducive to hardware parallel data processing, reduces hardware implementation resources, and requires no reconstruction in the quantization process, which improves the coding speed.
- FIG. 6 is a schematic flowchart of another video image encoding method proposed by an embodiment of the application.
- the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 6, the method includes:
- S602 Calculate the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode.
- S603 Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.
- steps S601-S603 are implemented in the same manner as the foregoing steps S201-S203, and will not be repeated here.
- the preset prediction mode includes a first prediction mode.
- the first prediction mode predicts according to the pixel maximum value in the prediction block of the to-be-processed coding block: the first prediction residual is calculated from the pixel maximum value and the pixels of the first remaining prediction block of the to-be-processed coding block, where the first remaining prediction block is the prediction block that remains, among the prediction blocks of the coding block to be processed, apart from the prediction block of the pixel maximum value.
- exemplarily, the pixel maximum value in the prediction block of the coding block to be processed is 26, and each of the remaining values is subtracted from the maximum value; the prediction residual is then smallest and contains no negative numbers, so no sign needs to be coded later, which can further improve the subsequent encoding speed.
- the preset cost value is added to the accumulated prediction cost of the coding block to be processed in the prediction mode to obtain the prediction cost.
- the above-mentioned first preset residual threshold and first preset cost value can be set according to actual conditions, which are not particularly limited in the embodiment of the present application. Exemplarily, whenever the prediction residual is greater than 15, 8 is added to the prediction cost predCost.
- steps S606-S607 are implemented in the same manner as the foregoing steps S205-S206, and will not be repeated here.
- the video image encoding method provided by this embodiment divides each of the multiple channels of the image frame in the video code stream to obtain the coding block to be processed; calculates the division cost corresponding to block division of the coding block to be processed in each of the preset division modes, and determines the target division mode accordingly; divides the coding block to be processed according to the target division mode to obtain the prediction blocks; calculates the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes, and determines the target prediction mode accordingly; and predicts each prediction block of the coding block to be processed according to the target prediction mode.
- FIG. 8 is a schematic flowchart of another video image encoding method proposed by an embodiment of the application.
- the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 8, the method includes:
- S802 Calculate the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode.
- S803 Select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.
- steps S801-S803 are implemented in the same manner as the foregoing steps S201-S203, and will not be repeated here.
- the preset prediction mode includes a second prediction mode.
- the second prediction mode predicts according to the pixel minimum value in the prediction block of the to-be-processed coding block: the second prediction residual is calculated from the pixels of the second remaining prediction block of the to-be-processed coding block and the pixel minimum value, wherein the second remaining prediction block is the prediction block that remains, among the prediction blocks of the coding block to be processed, apart from the prediction block of the pixel minimum value.
- exemplarily, the pixel minimum value in the prediction block of the coding block to be processed is 22, and the minimum value is subtracted from each of the remaining values; the prediction residual is then smallest and contains no negative numbers, so no sign needs to be coded later, which can further improve the subsequent encoding speed.
- the preset cost value is added to the accumulated prediction cost of the coding block to be processed in the prediction mode to obtain the prediction cost.
- the above-mentioned second preset residual threshold and second preset cost value can be set according to actual conditions, which are not particularly limited in the embodiment of the present application. Exemplarily, whenever the prediction residual is greater than 15, 8 is added to the prediction cost predCost.
- steps S806-S807 are implemented in the same manner as the foregoing steps S205-S206, and will not be repeated here.
- the video image encoding method provided by this embodiment divides each of the multiple channels of the image frame in the video code stream to obtain the coding block to be processed; calculates the division cost corresponding to block division of the coding block to be processed in each of the preset division modes, and determines the target division mode accordingly; divides the coding block to be processed according to the target division mode to obtain the prediction blocks; calculates the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes, and determines the target prediction mode accordingly; and predicts each prediction block of the coding block to be processed according to the target prediction mode.
- FIG. 10 is a schematic flowchart of another video image encoding method proposed by an embodiment of this application.
- the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 10, the method includes:
- S1002 Calculate the division cost corresponding to the block division of the coding block to be processed in each division mode of the preset division mode.
- steps S1001-S1005 are implemented in the same manner as the foregoing steps S201-S205, and will not be repeated here.
- the method further includes:
- the preset encoding method is VLC encoding
- the preset encoding method is FLC encoding.
- the above-mentioned preset threshold is determined according to the residual during the conversion from FLC coding to VLC coding. If the prediction residual is greater than the preset threshold, VLC coding is adopted, otherwise, FLC coding is adopted.
- the foregoing encoding each of the foregoing multiple channels separately according to the foregoing quantization residual and a preset encoding manner includes:
- the LSB of each of the multiple channels is separately coded by using the preset coding method
- the LSB of the encoding channel is stopped.
- fixed bit rate coding is used. For example, taking a video bit stream in the YUV420 video format: (1) if the total number of encoded bits is less than the given bit rate and the MSB of at least one codeword length can still be accommodated, encode the MSB of the Y component; (2) if the total number of encoded bits is less than the given bit rate and the MSB of at least one codeword length can still be accommodated, encode the MSB of the U component and accumulate the number of encoded bits; otherwise, stop encoding and go to (5); (3) if the total number of encoded bits is less than the given bit rate and the MSB of at least one codeword length can still be accommodated, encode the MSB of the V component and accumulate the total number of encoded bits; otherwise, stop encoding and go to (5); (4) following the order of MSB encoding, encode the LSB in the same manner until the encoded bits are greater than the given bit rate; (5) if a few bits still remain before the fixed bit rate is reached and encoding cannot be continued, padding will be
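- the coding order under a fixed bit budget can be sketched as follows. This is an illustration under assumptions, not the method of the application: codewords are modelled as bit strings that have already been produced, the function name `encode_fixed_rate` is hypothetical, encoding stops at the first codeword that no longer fits, and the leftover bits are padded with zeros (the padding value is an assumption, since the text does not specify it here).

```python
def encode_fixed_rate(msb, lsb, budget_bits, order=("Y", "U", "V")):
    """msb / lsb map a channel name to its list of codewords (bit strings)."""
    stream = ""
    # All MSB codewords come first (Y, then U, then V), then the LSB in the same order.
    queue = [cw for planes in (msb, lsb) for ch in order for cw in planes[ch]]
    for codeword in queue:
        if len(stream) + len(codeword) > budget_bits:
            break  # the next codeword no longer fits: stop encoding
        stream += codeword
    # Assumption: fill the remaining bits with zeros to hit the fixed rate exactly.
    return stream + "0" * (budget_bits - len(stream))


# Hypothetical codewords for one slice.
msb = {"Y": ["101", "0"], "U": ["11"], "V": ["0"]}
lsb = {"Y": ["1", "1"], "U": ["0"], "V": ["1"]}
print(encode_fixed_rate(msb, lsb, budget_bits=10))
```

- the output length always equals `budget_bits`, which is the property a fixed bit rate scheme relies on.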
- the foregoing encoding the MSB of each of the foregoing multiple channels separately by using the foregoing preset encoding method according to the foregoing quantized residual includes:
- the above-mentioned preset coding method is used to start from the first row of the MSB of each of the above-mentioned multiple channels and, from left to right, take each preset number of bits as a codeword for Huffman table look-up coding.
- the MSB adopts the Huffman coding method. Due to the different probability distributions, different code tables are used for the Y and UV components.
- the code table is shown in Table 1.
- the above-mentioned quantized residual is converted into a bit-plane representation; then, for the MSB part, starting from the first row, each preset number of bits from left to right (for example, every 4 bits) is used as a codeword for Huffman table look-up encoding. After the first row is encoded, the second row is encoded, and so on until all MSBs are encoded or the given bit rate is reached.
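- the bit-plane conversion and 4-bit grouping can be sketched as follows. The layout is an assumption: each row of the bit-plane matrix holds one bit position of every quantized residual, with row 0 the most significant bit, and the MSB part is taken to be the top `msb_rows` rows. The function names are hypothetical, and the sketch stops at the 4-bit symbols that would index the Huffman code table, since Table 1 is not reproduced in this section.

```python
def to_bit_planes(residuals, bit_depth):
    """Row k holds bit (bit_depth - 1 - k) of every residual, so row 0 is the MSB plane."""
    return [[(r >> b) & 1 for r in residuals] for b in range(bit_depth - 1, -1, -1)]


def msb_symbols(planes, msb_rows, group=4):
    """Scan the MSB rows top to bottom, left to right, packing `group` bits per symbol."""
    symbols = []
    for row in planes[:msb_rows]:
        for i in range(0, len(row), group):
            bits = row[i:i + group]
            symbols.append(int("".join(str(b) for b in bits), 2))
    return symbols


# Hypothetical 3-bit quantized residuals of one block.
planes = to_bit_planes([4, 2, 0, 3, 1, 0, 7, 5], bit_depth=3)
print(msb_symbols(planes, msb_rows=2))
```

- each resulting symbol lies in the range 0 to 15 and would be replaced by its variable-length code from the table for the corresponding component.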
- the foregoing encoding the LSB of each of the foregoing multiple channels separately using the foregoing preset encoding method according to the foregoing quantization residual includes:
- the above-mentioned preset coding method is used to respectively start from the first line of the LSB of each of the above-mentioned multiple channels, and perform coding bit by bit from left to right.
- the above-mentioned quantized residual is converted into a bit-plane representation, and the LSB then adopts fixed-length coding: starting from the first row of the LSB part, the bits are coded one by one from left to right, and after the first row is encoded the next row is encoded, until the given bit rate is reached.
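- a minimal sketch of the bit-by-bit LSB scan, assuming the LSB part is given as a list of bit-plane rows and the remaining bit budget is known. The function name `encode_lsb_rows` and the input values are hypothetical.

```python
def encode_lsb_rows(lsb_rows, remaining_bits):
    """Emit LSB bits row by row, left to right, until the bit budget is used up."""
    out = []
    for row in lsb_rows:
        for bit in row:
            if len(out) >= remaining_bits:
                return out  # given bit rate reached: stop mid-row if necessary
            out.append(bit)
    return out


# Hypothetical LSB rows with a remaining budget of 6 bits.
print(encode_lsb_rows([[0, 0, 0, 1], [1, 0, 1, 1]], remaining_bits=6))
```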
- the video image encoding method divides each of the multiple channels of the image frame in the video code stream to obtain the coding block to be processed; calculates the division cost corresponding to block division of the coding block to be processed in each of the preset division modes, and determines the target division mode accordingly; divides the coding block to be processed according to the target division mode to obtain the prediction blocks; calculates the prediction cost corresponding to predicting each prediction block of the coding block to be processed in each of the preset prediction modes, and determines the target prediction mode accordingly; predicts each prediction block of the coding block to be processed according to the target prediction mode; quantizes the prediction residual according to the preset quantization step size to obtain the quantized residual; and encodes according to the quantized residual and the preset encoding method to obtain the output code stream.
- the encoding method can reduce data dependence and the accumulation of data errors, which is conducive to hardware parallel data processing and reduces hardware implementation resources.
- FIG. 11 is a schematic structural diagram of a video image encoding device provided by an embodiment of the application. For ease of description, only the parts related to the embodiments of the present application are shown.
- the video image encoding device 110 includes: a first division module 1101, a first calculation module 1102, a second division module 1103, a second calculation module 1104, a prediction module 1105, and an encoding module 1106.
- the first dividing module 1101 is configured to divide each of the multiple channels of the image frame in the video bitstream to obtain the code block to be processed.
- the first calculation module 1102 is configured to calculate the division cost corresponding to the block division of the coded block to be processed in each division mode of the preset division mode.
- the second division module 1103 is configured to select a target division mode from the preset division modes according to the division cost, and perform block division on the coding block to be processed according to the target division mode to obtain a prediction block.
- the second calculation module 1104 is configured to calculate the prediction cost corresponding to the prediction of each prediction block of the coded block to be processed in each prediction mode of the preset prediction mode.
- the prediction module 1105 is configured to select a target prediction mode from the preset prediction modes according to the prediction cost, and predict each prediction block of the coding block to be processed according to the target prediction mode to obtain a prediction residual;
- the encoding module 1106 is configured to perform encoding according to the prediction residual.
- the first division module 1101 is specifically used for:
- the first calculation module 1102 is specifically used for:
- in each division mode, the header information cost corresponding to the coding block to be processed is calculated, where the header information cost includes at least one of the number of coded bits corresponding to the division mode, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels;
- the calculated prediction cost and the header information cost are added to obtain the division cost corresponding to the block division of the to-be-processed coding block in each division mode.
- the second division module 1103 selects a target division mode from the preset division modes according to the division cost, including:
- the division mode corresponding to the smallest division cost is selected from the preset division modes, and the selected division mode is used as the target division mode.
- the preset prediction mode includes a first prediction mode, and the first prediction mode is prediction based on a maximum value of pixels in a prediction block of the coding block to be processed;
- the second calculation module 1104 is specifically used for:
- the first prediction residual is calculated based on the pixel maximum value and the pixels of the first remaining prediction block of the to-be-processed coding block, where the first remaining prediction block is the prediction block that remains, among the prediction blocks of the to-be-processed coding block, apart from the prediction block of the pixel maximum value;
- if the first prediction residual is greater than a first preset residual threshold, a first preset cost value is added to the prediction cost of the to-be-processed coding block in the first prediction mode to obtain the first prediction cost.
- the preset prediction mode includes a second prediction mode, and the second prediction mode is to perform prediction according to a minimum value of pixels in a prediction block of the coding block to be processed;
- the second calculation module 1104 is specifically used for:
- the second prediction residual is calculated according to the pixels of the second remaining prediction block of the to-be-processed coding block and the pixel minimum value, wherein the second remaining prediction block is the prediction block that remains, among the prediction blocks of the to-be-processed coding block, apart from the prediction block of the pixel minimum value;
- the second preset cost value is added to the prediction cost of the coding block to be processed in the second prediction mode to obtain the second prediction cost.
- the prediction module 1105 selects a target prediction mode from the preset prediction modes according to the prediction cost, including:
- the prediction mode corresponding to the smallest prediction cost is selected from the preset prediction modes, and the selected prediction mode is used as the target prediction mode.
- the preset division mode includes no division
- the first calculation module 1102 calculates the header information cost corresponding to the to-be-processed coding block in each division mode, including:
- the number of coded bits corresponding to the non-division, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels are added to obtain the first header information cost corresponding to the coded block to be processed.
- the preset division mode includes horizontal dichotomy
- the first calculation module 1102 calculates the header information cost corresponding to the to-be-processed coding block in each division mode, including:
- the number of coding bits corresponding to the horizontal dichotomy, the number of coding bits corresponding to the prediction mode, and the total number of base pixel coding bits corresponding to the coding block to be processed are added to obtain the second header information cost corresponding to the coding block to be processed.
- the preset division mode includes vertical dichotomy
- the first calculation module 1102 calculates the header information cost corresponding to the to-be-processed coding block in each division mode, including:
- the preset division mode includes cross division
- the first calculation module 1102 calculates the header information cost corresponding to the to-be-processed coding block in each division mode, including:
- the number of coded bits corresponding to the cross division, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coded block to be processed are added to obtain the fourth header information cost corresponding to the coded block to be processed.
- the encoding module 1106 performs encoding according to the prediction residual, including:
- each channel of the multiple channels is respectively encoded to obtain an output code stream.
- the encoding module 1106 is further configured to:
- the preset coding mode is FLC coding.
- the encoding module 1106 separately encodes each of the multiple channels according to the quantized residual and a preset encoding method, including:
- the encoding module 1106 uses the preset encoding method to encode the MSB of each of the multiple channels separately according to the quantized residual, including:
- the preset encoding method is used to start from the first row of the MSB of each of the multiple channels and, from left to right, take each preset number of bits as a codeword for Huffman table look-up coding.
- the encoding module 1106 uses the preset encoding method to encode the LSB of each of the multiple channels separately according to the quantized residual, including:
- the preset coding mode is used to perform coding from left to right bit by bit starting from the first row of the LSB of each of the multiple channels respectively.
- the device provided in this embodiment can be used to implement the technical solutions of the foregoing method embodiments, and its implementation principles and technical effects are similar, and will not be repeated here in this embodiment.
- FIG. 12 is a schematic diagram of the hardware structure of a video image encoding device provided by an embodiment of the application.
- the video image encoding device 120 of this embodiment includes: a memory 1201 and a processor 1202; where
- the memory 1201 is used to store program instructions
- the processor 1202 is configured to execute program instructions stored in the memory, and when the program instructions are executed, the processor executes the following steps:
- Encoding is performed according to the prediction residual.
- dividing each of the multiple channels of the image frame in the video bitstream to obtain the coding block to be processed includes:
- calculating the division cost corresponding to block division of the coding block to be processed in each of the preset division modes includes:
- in each division mode, calculating the header information cost corresponding to the coding block to be processed, where the header information cost includes at least one of: the number of coded bits corresponding to the division mode, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels;
- the calculated prediction cost and the header information cost are added to obtain the division cost corresponding to block division of the coding block to be processed in each division mode.
- selecting a target division mode from the preset division modes according to the division cost includes:
- the division mode corresponding to the smallest division cost is selected from the preset division modes, and the selected division mode is used as the target division mode.
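The selection rule above (division cost = prediction cost + header information cost, then take the smallest) can be sketched as follows. This is a minimal illustration, not the implementation from the application: the function names, the mode labels, and the example cost values are all hypothetical, and ties are broken by dictionary order as an assumption.

```python
from typing import Dict, Tuple


def division_cost(prediction_cost: int, header_cost: int) -> int:
    """Division cost of one mode: prediction cost plus header information cost."""
    return prediction_cost + header_cost


def select_target_division(costs: Dict[str, Tuple[int, int]]) -> str:
    """Return the division mode with the smallest division cost.

    `costs` maps a mode label to (prediction_cost, header_cost).
    Ties go to the first mode in dictionary order (an assumption).
    """
    totals = {mode: division_cost(p, h) for mode, (p, h) in costs.items()}
    return min(totals, key=totals.get)


# Hypothetical example values, chosen only to exercise the rule.
example = {
    "none": (40, 11),        # total 51
    "horizontal": (20, 19),  # total 39
    "cross": (5, 35),        # total 40
}
target = select_target_division(example)
```

With these made-up numbers the horizontal split wins, because its lower prediction cost outweighs the extra header bits.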
- the preset prediction modes include a first prediction mode, in which prediction is based on the maximum pixel value in a prediction block of the coding block to be processed;
- calculating the prediction cost corresponding to prediction of each prediction block of the coding block to be processed in each of the preset prediction modes includes:
- a first prediction residual is calculated from the pixel maximum value and the pixels of the first remaining prediction blocks of the coding block to be processed, where the first remaining prediction blocks are the prediction blocks of the coding block to be processed other than the prediction block containing the maximum pixel;
- if the first prediction residual is greater than a first preset residual threshold, a first preset cost value is added to the prediction cost of the coding block to be processed in the first prediction mode to obtain the first prediction cost.
- the preset prediction modes include a second prediction mode, in which prediction is based on the minimum pixel value in a prediction block of the coding block to be processed;
- calculating the prediction cost corresponding to prediction of each prediction block of the coding block to be processed in each of the preset prediction modes includes:
- a second prediction residual is calculated from the pixels of the second remaining prediction blocks of the coding block to be processed and the pixel minimum value, where the second remaining prediction blocks are the prediction blocks of the coding block to be processed other than the prediction block containing the minimum pixel;
- a second preset cost value is added to the prediction cost of the coding block to be processed in the second prediction mode to obtain the second prediction cost.
- selecting a target prediction mode from the preset prediction modes according to the prediction cost includes:
- the prediction mode corresponding to the smallest prediction cost is selected from the preset prediction modes, and the selected prediction mode is used as the target prediction mode.
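A rough sketch of the two prediction modes and the minimum-cost choice is given below. Several simplifications are assumptions rather than statements from the text: each remaining prediction block is reduced to a single pixel value, the cost value is added once per residual that exceeds the threshold, and the second mode is assumed to use the same thresholding as the first. All names and numbers are hypothetical.

```python
from typing import Dict, List


def thresholded_prediction_cost(
    pixels: List[int], base: int, threshold: int, penalty: int
) -> int:
    """Add `penalty` for every remaining pixel whose residual against `base`
    exceeds `threshold`. One occurrence of `base` itself is skipped."""
    cost = 0
    base_skipped = False
    for pixel in pixels:
        if pixel == base and not base_skipped:
            base_skipped = True  # the base pixel is not predicted from itself
            continue
        if abs(pixel - base) > threshold:
            cost += penalty
    return cost


def prediction_costs(pixels: List[int], threshold: int, penalty: int) -> Dict[str, int]:
    """Cost of the max-based (first) and min-based (second) prediction modes."""
    return {
        "max": thresholded_prediction_cost(pixels, max(pixels), threshold, penalty),
        "min": thresholded_prediction_cost(pixels, min(pixels), threshold, penalty),
    }


# Hypothetical block with one bright outlier.
costs = prediction_costs([10, 12, 40, 11], threshold=4, penalty=1)
target_mode = min(costs, key=costs.get)
```

In this example the outlier makes every residual against the maximum large, so the min-based mode has the lower cost and is selected.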
- the preset division modes include no division;
- calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:
- the number of coded bits corresponding to no division, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels are added to obtain the first header information cost corresponding to the coding block to be processed.
- the preset division modes include horizontal dichotomy;
- calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:
- the number of coded bits corresponding to the horizontal dichotomy, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the second header information cost corresponding to the coding block to be processed.
- the preset division modes include vertical dichotomy;
- calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:
- the preset division modes include cross division;
- calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:
- the number of coded bits corresponding to the cross division, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the fourth header information cost corresponding to the coding block to be processed.
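The header information cost sums described above can be illustrated with a short sketch. It assumes that the "total number of base pixel coded bits" equals the number of sub-blocks times the bits of one base pixel (one sub-block for no division, two for the horizontal dichotomy, four for the cross division); that reading, the function name, and the bit counts are hypothetical. The vertical dichotomy is left out because its sum is not spelled out in this passage.

```python
# Assumed number of sub-blocks (and hence base pixels) per division mode.
SUB_BLOCKS = {"none": 1, "horizontal": 2, "cross": 4}


def header_info_cost(
    mode: str, division_bits: int, prediction_bits: int, base_pixel_bits: int
) -> int:
    """Header cost = division-mode bits + prediction-mode bits + base pixel bits.

    The base pixel term is scaled by the assumed sub-block count of `mode`.
    """
    if mode not in SUB_BLOCKS:
        raise ValueError(f"unsupported division mode: {mode}")
    return division_bits + prediction_bits + SUB_BLOCKS[mode] * base_pixel_bits


# Hypothetical bit counts: 2 bits for the division mode, 1 bit for the
# prediction mode, 8 bits per base pixel.
first_cost = header_info_cost("none", 2, 1, 8)
second_cost = header_info_cost("horizontal", 2, 1, 8)
fourth_cost = header_info_cost("cross", 2, 1, 8)
```

Under these assumptions, finer divisions pay more header bits, which is what the division cost has to trade off against a lower prediction cost.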
- encoding according to the prediction residual includes:
- each channel of the multiple channels is respectively encoded to obtain an output code stream.
- before each of the multiple channels is separately encoded according to the quantized residual and the preset encoding method, the method further includes:
- the preset encoding method is FLC coding.
- separately encoding each of the multiple channels according to the quantized residual and the preset encoding method includes:
- separately encoding the MSB of each of the multiple channels according to the quantized residual using the preset encoding method includes:
- the preset encoding method is applied starting from the first row of the MSB of each of the multiple channels, and each preset number of bits, taken from left to right, is treated as one codeword and coded by Huffman table lookup.
- separately encoding the LSB of each of the multiple channels according to the quantized residual using the preset encoding method includes:
- the preset encoding method is applied bit by bit, from left to right, starting from the first row of the LSB of each of the multiple channels.
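The row-by-row scan of the MSB and LSB parts can be sketched as below. This is only an illustration under stated assumptions: each row is given as a string of `0`/`1` characters, the row length is a multiple of the group size, and the prefix-free lookup table is invented for the example; it is not the table used in the application.

```python
from typing import Dict, List

# Hypothetical prefix-free table for 2-bit groups (an assumption).
EXAMPLE_TABLE: Dict[str, str] = {"00": "0", "01": "10", "10": "110", "11": "111"}


def encode_msb(rows: List[str], group_bits: int, table: Dict[str, str]) -> str:
    """Scan rows top to bottom, left to right; code each group by table lookup."""
    out = []
    for row in rows:
        if len(row) % group_bits != 0:
            raise ValueError("row length must be a multiple of group_bits")
        for start in range(0, len(row), group_bits):
            out.append(table[row[start:start + group_bits]])
    return "".join(out)


def encode_lsb(rows: List[str]) -> str:
    """Emit the LSB bits one by one, left to right, starting from the first row."""
    return "".join(bit for row in rows for bit in row)


msb_stream = encode_msb(["0001", "1100"], 2, EXAMPLE_TABLE)
lsb_stream = encode_lsb(["10", "01"])
```

Splitting the work this way reflects the usual intuition that high-order bits are skewed enough to benefit from variable-length codes, while low-order bits are close to random and are cheapest to pass through unchanged.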
- the memory 1201 may be independent or integrated with the processor 1202.
- the video image encoding device further includes a bus 1203 for connecting the memory 1201 and the processor 1202.
- the video image encoding device 120 may be a single device that includes a complete set of the foregoing memory 1201, processor 1202, and so on, or it may be distributed on a certain device, as determined by the actual situation.
- FIG. 13 is a schematic structural diagram of a movable platform provided by an embodiment of the application.
- the movable platform 130 of this embodiment includes: a movable platform body 1301 and a video image encoding device 1302; the video image encoding device 1302 is provided on the movable platform body 1301, and the movable platform body 1301 and the video image encoding device 1302 are connected wirelessly or by wire.
- the video image encoding device 1302 divides each of the multiple channels of the image frame in the video code stream to obtain a code block to be processed;
- Encoding is performed according to the prediction residual.
- dividing each of the multiple channels of the image frame in the video bitstream to obtain the coding block to be processed includes:
- calculating the division cost corresponding to block division of the coding block to be processed in each of the preset division modes includes:
- in each division mode, calculating the header information cost corresponding to the coding block to be processed, where the header information cost includes at least one of: the number of coded bits corresponding to the division mode, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels;
- the calculated prediction cost and the header information cost are added to obtain the division cost corresponding to block division of the coding block to be processed in each division mode.
- selecting a target division mode from the preset division modes according to the division cost includes:
- the division mode corresponding to the smallest division cost is selected from the preset division modes, and the selected division mode is used as the target division mode.
- the preset prediction modes include a first prediction mode, in which prediction is based on the maximum pixel value in a prediction block of the coding block to be processed;
- calculating the prediction cost corresponding to prediction of each prediction block of the coding block to be processed in each of the preset prediction modes includes:
- a first prediction residual is calculated from the pixel maximum value and the pixels of the first remaining prediction blocks of the coding block to be processed, where the first remaining prediction blocks are the prediction blocks of the coding block to be processed other than the prediction block containing the maximum pixel;
- if the first prediction residual is greater than a first preset residual threshold, a first preset cost value is added to the prediction cost of the coding block to be processed in the first prediction mode to obtain the first prediction cost.
- the preset prediction modes include a second prediction mode, in which prediction is based on the minimum pixel value in a prediction block of the coding block to be processed;
- calculating the prediction cost corresponding to prediction of each prediction block of the coding block to be processed in each of the preset prediction modes includes:
- a second prediction residual is calculated from the pixels of the second remaining prediction blocks of the coding block to be processed and the pixel minimum value, where the second remaining prediction blocks are the prediction blocks of the coding block to be processed other than the prediction block containing the minimum pixel;
- a second preset cost value is added to the prediction cost of the coding block to be processed in the second prediction mode to obtain the second prediction cost.
- selecting a target prediction mode from the preset prediction modes according to the prediction cost includes:
- the prediction mode corresponding to the smallest prediction cost is selected from the preset prediction modes, and the selected prediction mode is used as the target prediction mode.
- the preset division modes include no division;
- calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:
- the number of coded bits corresponding to no division, the number of coded bits corresponding to the prediction mode, and the number of coded bits corresponding to the preset base pixels are added to obtain the first header information cost corresponding to the coding block to be processed.
- the preset division modes include horizontal dichotomy;
- calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:
- the number of coded bits corresponding to the horizontal dichotomy, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the second header information cost corresponding to the coding block to be processed.
- the preset division modes include vertical dichotomy;
- calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:
- the preset division modes include cross division;
- calculating the header information cost corresponding to the coding block to be processed in each of the division modes includes:
- the number of coded bits corresponding to the cross division, the number of coded bits corresponding to the prediction mode, and the total number of base pixel coded bits corresponding to the coding block to be processed are added to obtain the fourth header information cost corresponding to the coding block to be processed.
- encoding according to the prediction residual includes:
- each channel of the multiple channels is respectively encoded to obtain an output code stream.
- before each of the multiple channels is separately encoded according to the quantized residual and the preset encoding method, the method further includes:
- the preset encoding method is FLC coding.
- separately encoding each of the multiple channels according to the quantized residual and the preset encoding method includes:
- separately encoding the MSB of each of the multiple channels according to the quantized residual using the preset encoding method includes:
- the preset encoding method is applied starting from the first row of the MSB of each of the multiple channels, and each preset number of bits, taken from left to right, is treated as one codeword and coded by Huffman table lookup.
- separately encoding the LSB of each of the multiple channels according to the quantized residual using the preset encoding method includes:
- the preset encoding method is applied bit by bit, from left to right, starting from the first row of the LSB of each of the multiple channels.
- the movable platform includes: a movable platform body, and a video image encoding device.
- the video image encoding device is provided on the movable platform body. The video image encoding device divides each of the multiple channels of the image frame in the video code stream to obtain the coding block to be processed, calculates the division cost corresponding to block division of the coding block to be processed in each of the preset division modes, determines the target division mode, and performs block division on the coding block to be processed according to the target division mode to obtain the prediction blocks. It then calculates the prediction cost corresponding to prediction of each prediction block of the coding block to be processed in each of the preset prediction modes, determines the target prediction mode, and predicts each prediction block of the coding block to be processed according to the target prediction mode. This reduces data dependence and the accumulation of data errors, facilitates parallel data processing in hardware, reduces hardware implementation resources, and requires no reconstruction in the quantization process, which improves encoding speed.
- An embodiment of the present application provides a computer-readable storage medium having program instructions stored in the computer-readable storage medium, and when a processor executes the program instructions, the video image encoding method described above is implemented.
- the disclosed device and method may be implemented in other ways.
- the device embodiments described above are merely illustrative.
- the division of the modules is only a logical function division, and there may be other divisions in actual implementation; for example, multiple modules can be combined or integrated into another system, or some features can be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or modules, and may be in electrical, mechanical or other forms.
- modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional modules in the various embodiments of the present invention may be integrated into one processing unit, or each module may exist alone physically, or two or more modules may be integrated into one unit.
- the units formed by the above-mentioned modules can be implemented in the form of hardware, or in the form of hardware plus software functional units.
- the above-mentioned integrated modules implemented in the form of software functional modules may be stored in a computer readable storage medium.
- the above-mentioned software functional module is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute part of the methods of the various embodiments of the present application.
- the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), and so on.
- the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
- the steps of the method disclosed in combination with the invention may be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
- the memory may include high-speed RAM, and may also include non-volatile memory (NVM), such as at least one disk storage; it may also be a USB flash drive, a mobile hard disk, a read-only memory, a magnetic disk, or an optical disk.
- the bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc.
- the bus can be divided into address bus, data bus, control bus and so on.
- the buses in the drawings of this application are not limited to only one bus or one type of bus.
- the above-mentioned storage medium can be realized by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disks, or optical disks.
- the storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
- An exemplary storage medium is coupled to the processor, so that the processor can read information from the storage medium and write information to the storage medium.
- the storage medium may also be an integral part of the processor.
- the processor and the storage medium may be located in an application-specific integrated circuit (ASIC).
- the processor and the storage medium may also exist as discrete components in the electronic device or the main control device.
- a person of ordinary skill in the art can understand that all or part of the steps in the foregoing method embodiments can be implemented by a program instructing relevant hardware.
- the aforementioned program can be stored in a computer-readable storage medium. When the program is executed, it performs the steps of the foregoing method embodiments; the foregoing storage medium includes ROM, RAM, magnetic disks, optical disks, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Disclosed herein are a video image encoding method and device, and a movable platform. The method includes: dividing each of multiple channels of an image frame in a video code stream to obtain a coding block to be processed; calculating a division cost corresponding to block division of the coding block to be processed in each of the preset division modes; determining a target division mode; performing block division on the coding block to be processed according to the target division mode to obtain prediction blocks; and then calculating a prediction cost corresponding to prediction of each prediction block of the coding block to be processed in each of the preset prediction modes, so as to determine a target prediction mode and predict each prediction block of the coding block to be processed according to the target prediction mode. The method can reduce data dependence and the accumulation of data errors, facilitates parallel data processing in hardware, and reduces hardware implementation resources; no reconstruction is needed during quantization, which improves encoding speed.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202080002903.7A CN112204971A (zh) | 2020-02-24 | 2020-02-24 | 视频图像编码方法、设备及可移动平台 |
PCT/CN2020/076469 WO2021168624A1 (fr) | 2020-02-24 | 2020-02-24 | Procédé et dispositif de codage d'image vidéo et plateforme mobile |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/076469 WO2021168624A1 (fr) | 2020-02-24 | 2020-02-24 | Procédé et dispositif de codage d'image vidéo et plateforme mobile |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021168624A1 (fr) | 2021-09-02 |
Family
ID=74033870
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/076469 WO2021168624A1 (fr) | 2020-02-24 | 2020-02-24 | Procédé et dispositif de codage d'image vidéo et plateforme mobile |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112204971A (fr) |
WO (1) | WO2021168624A1 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114363620A (zh) * | 2021-12-23 | 2022-04-15 | 中山大学 | 一种基于预测块子块位置交换的编码算法和系统 |
CN116156174A (zh) * | 2023-02-23 | 2023-05-23 | 格兰菲智能科技有限公司 | 数据编码处理方法、装置、计算机设备和存储介质 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090274211A1 (en) * | 2008-04-30 | 2009-11-05 | Omnivision Technologies, Inc. | Apparatus and method for high quality intra mode prediction in a video coder |
CN103327328A (zh) * | 2007-06-28 | 2013-09-25 | 三菱电机株式会社 | 图像编码装置以及图像解码装置 |
CN105791863A (zh) * | 2016-03-24 | 2016-07-20 | 西安电子科技大学 | 基于层的3d-hevc深度图帧内预测编码方法 |
CN106162176A (zh) * | 2016-10-09 | 2016-11-23 | 北京数码视讯科技股份有限公司 | 帧内预测模式选择方法和装置 |
CN109302613A (zh) * | 2018-10-26 | 2019-02-01 | 西安科锐盛创新科技有限公司 | 带宽压缩中基于宏块分割的预测方法 |
CN110213581A (zh) * | 2019-05-20 | 2019-09-06 | 广州市数字视频编解码技术国家工程实验室研究开发与产业化中心 | 一种基于块划分模式跳过的编码方法、装置及存储介质 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190014325A1 (en) * | 2017-07-05 | 2019-01-10 | Industrial Technology Research Institute | Video encoding method, video decoding method, video encoder and video decoder |
-
2020
- 2020-02-24 CN CN202080002903.7A patent/CN112204971A/zh active Pending
- 2020-02-24 WO PCT/CN2020/076469 patent/WO2021168624A1/fr active Application Filing
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114363620A (zh) * | 2021-12-23 | 2022-04-15 | 中山大学 | 一种基于预测块子块位置交换的编码算法和系统 |
CN116156174A (zh) * | 2023-02-23 | 2023-05-23 | 格兰菲智能科技有限公司 | 数据编码处理方法、装置、计算机设备和存储介质 |
CN116156174B (zh) * | 2023-02-23 | 2024-02-13 | 格兰菲智能科技有限公司 | 数据编码处理方法、装置、计算机设备和存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN112204971A (zh) | 2021-01-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11936884B2 (en) | Coded-block-flag coding and derivation | |
WO2018184468A1 (fr) | Dispositif, procédé de traitement de fichier d'image et support d'informations | |
KR102120571B1 (ko) | 넌-4:4:4 크로마 서브-샘플링의 디스플레이 스트림 압축 (dsc) 을 위한 엔트로피 코딩 기법들 | |
WO2018184457A1 (fr) | Procédé de traitement de fichier d'image, et équipement et système associés | |
US11558631B2 (en) | Super-resolution loop restoration | |
WO2020119449A1 (fr) | Procédé et dispositif de prédiction d'un bloc de chrominance | |
US10136147B2 (en) | Efficient transcoding for backward-compatible wide dynamic range codec | |
WO2019210822A1 (fr) | Procédé, dispositif, et système de codage et de décodage vidéo, et support de stockage | |
WO2020103800A1 (fr) | Procédé de décodage vidéo et décodeur vidéo | |
US8526745B2 (en) | Embedded graphics coding: reordered bitstream for parallel decoding | |
CN111741302B (zh) | 数据处理方法、装置、计算机可读介质及电子设备 | |
US10382767B2 (en) | Video coding using frame rotation | |
WO2023040600A1 (fr) | Procédé et appareil de codage d'image, procédé et appareil de décodage d'image, dispositif électronique et support | |
KR102185027B1 (ko) | 디스플레이 스트림 압축을 위한 벡터 기반 엔트로피 코딩을 위한 장치 및 방법 | |
WO2021168624A1 (fr) | Procédé et dispositif de codage d'image vidéo et plateforme mobile | |
WO2018184465A1 (fr) | Procédé de traitement de fichier d'image, appareil et support de stockage | |
WO2011031592A2 (fr) | Syntaxe de flux binaire pour une compression en mode graphique dans une haute définition sans fil 1.1 | |
US20210337189A1 (en) | Prediction mode determining method and apparatus | |
CN111246208B (zh) | 视频处理方法、装置及电子设备 | |
WO2021147464A1 (fr) | Procédé et appareil de traitement vidéo, et dispositif électronique | |
US8355057B2 (en) | Joint scalar embedded graphics coding for color images | |
CN114125448B (zh) | 视频编码方法、解码方法及相关装置 | |
US10609411B1 (en) | Cross color prediction for image/video compression | |
US11870993B2 (en) | Transforms for large video and image blocks | |
WO2021180220A1 (fr) | Procédé et appareil de codage et de décodage d'image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20921747 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20921747 Country of ref document: EP Kind code of ref document: A1 |