CN116918331A - Coding method and coding device

Info

Publication number: CN116918331A
Application number: CN202180094389.9A
Inventors: 郑萧桢, 缪泽翔, 李蔚然, 郭泽
Applicant/Assignee: SZ DJI Technology Co., Ltd.
Original language: Chinese (zh)
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters

Abstract

The present application provides an encoding method and an encoding apparatus. The method includes: obtaining image complexity information of each of n1 tiles in a current frame, where the image complexity information includes transform coefficients obtained by performing Photo Core Transform (PCT) processing on the pixel values of each tile, and n1 is a positive integer greater than or equal to 2; and updating the quantization parameters of the n1 tiles according to the image complexity information. By updating or calculating the QP of each tile of the image to be encoded according to the image complexity information, the scheme provided by the present application can improve encoding efficiency and reduce hardware resource consumption without sacrificing rate control precision, while keeping the quantization parameters flexible enough to avoid an uncontrollable output code rate.

Description

Coding method and coding device

Technical Field
The present application relates to the field of encoding and decoding, and more particularly, to an encoding method and an encoding apparatus.
Background
Joint Photographic Experts Group extended range (Joint Photographic Experts Group Extended Range, JPEG XR) is a continuous-tone still image compression algorithm and file format.
The code rate of an image compressed by a JPEG XR encoder ultimately depends on the degree of quantization, which in turn depends on the specified quantization parameter (Quantization Parameter, QP). Some current rate control algorithms for JPEG XR require multiple encoding passes, and the extra complexity introduces encoding delay, which affects the running speed of the encoder and is not conducive to real-time encoding. Other algorithms use fixed quantization parameters, which, however, lead to an uncontrollable output code rate.
Therefore, how to improve coding efficiency while ensuring the flexibility of the QP is a problem to be solved.
Disclosure of Invention
The embodiments of the present application provide an encoding method and an encoding apparatus, which can improve encoding efficiency and reduce hardware resource consumption without sacrificing rate control precision, while keeping the quantization parameters flexible enough to avoid an uncontrollable output code rate.
In a first aspect, an encoding method is provided, including: obtaining image complexity information of each of n1 tiles in a current frame, where the image complexity information includes transform coefficients obtained by performing Photo Core Transform (PCT) processing on the pixel values of each tile, and n1 is a positive integer greater than or equal to 2; and updating the quantization parameters of the n1 tiles according to the image complexity information.
According to the scheme provided by the embodiments of the present application, the image complexity information is related to the transform coefficients of the tiles in the current frame, and these transform coefficients are obtained by PCT processing of the pixel values of the tiles. By updating or calculating the QP of each tile of the image to be encoded according to the image complexity information, encoding efficiency can be improved and hardware resource consumption reduced without sacrificing rate control precision, while the flexibility of the quantization parameters is preserved to avoid an uncontrollable output code rate.
In a second aspect, an encoding method is provided, including: obtaining image complexity information of a current frame, where the image complexity information includes transform coefficients obtained by performing Photo Core Transform (PCT) processing on the pixel values of the current frame; determining an initial quantization parameter (initial QP) of the current frame according to the image complexity information; and updating the initial QP of a target frame according to the initial QP of the current frame, where the target frame is the preceding x frames and/or the following y frames of the current frame, and x and y are positive integers greater than or equal to 1.
According to the scheme provided by the embodiments of the present application, the image complexity information is related to the transform coefficients of the current frame, and these transform coefficients are obtained by PCT processing of the pixel values in the current frame. By determining the initial QP of the current frame according to the image complexity information and updating the initial QP of the target frame accordingly, encoding efficiency can be improved and hardware resource consumption reduced without sacrificing rate control precision, while the flexibility of the quantization parameters is preserved to avoid an uncontrollable output code rate.
In a third aspect, an encoding apparatus is provided, including: a complexity calculation module, configured to obtain image complexity information of each of n1 tiles in a current frame, where the image complexity information includes transform coefficients obtained by performing Photo Core Transform (PCT) processing on the pixel values of each tile, and n1 is a positive integer greater than or equal to 2; and a code rate control module, configured to update the quantization parameters of the n1 tiles according to the image complexity information.
The advantages of the third aspect may refer to those of the first aspect, and are not described in detail herein.
In a fourth aspect, an encoding apparatus is provided, including: a complexity calculation module, configured to obtain image complexity information of a current frame, where the image complexity information includes transform coefficients obtained by performing Photo Core Transform (PCT) processing on the pixel values of the current frame; and a code rate control module, configured to determine an initial quantization parameter (initial QP) of the current frame according to the image complexity information, and further configured to update the initial QP of a target frame according to the initial QP of the current frame, where the target frame is the preceding x frames and/or the following y frames of the current frame, and x and y are positive integers greater than or equal to 1.
The advantages of the fourth aspect may refer to those of the second aspect, and will not be described here.
In a fifth aspect, an encoding apparatus is provided, including a processor configured to: obtain image complexity information of each of n1 tiles in a current frame, where the image complexity information includes transform coefficients obtained by performing Photo Core Transform (PCT) processing on the pixel values of each tile, and n1 is a positive integer greater than or equal to 2; and update the quantization parameters of the n1 tiles according to the image complexity information.
The advantages of the fifth aspect may refer to those of the first aspect, and will not be described here again.
In a sixth aspect, an encoding apparatus is provided, including a processor configured to: obtain image complexity information of a current frame, where the image complexity information includes transform coefficients obtained by performing Photo Core Transform (PCT) processing on the pixel values of the current frame; determine an initial quantization parameter (initial QP) of the current frame according to the image complexity information; and update the initial QP of a target frame according to the initial QP of the current frame, where the target frame is the preceding x frames and/or the following y frames of the current frame, and x and y are positive integers greater than or equal to 1.
The advantages of the sixth aspect may refer to those of the second aspect, and will not be described here.
In a seventh aspect, an encoding apparatus is provided that includes a processor and a memory. The memory is used for storing a computer program, and the processor is used for calling and running the computer program stored in the memory and executing the method in the first aspect or the second aspect or each implementation manner thereof.
An eighth aspect provides a chip for implementing the method of the first aspect or the second aspect or each implementation manner thereof.
Specifically, the chip includes: a processor for calling and running a computer program from a memory, causing a device on which the chip is mounted to perform the method as in the first or second aspect or implementations thereof described above.
A ninth aspect provides a computer readable storage medium storing a computer program comprising instructions for performing the method of the first to second aspects or any possible implementation of the first to second aspects.
In a tenth aspect, there is provided a computer program product comprising computer program instructions for causing a computer to perform the above-described first to second aspects or the methods in each implementation of the first to second aspects.
Drawings
The drawings used in the embodiments of the present application are briefly described below.
Fig. 1 is a schematic diagram of a technical solution to which an embodiment of the present application is applied.
Fig. 2 is a schematic diagram of a video coding framework 2 according to an embodiment of the present application.
Fig. 3 is a schematic diagram of the five layers, from largest to smallest, of an image processed by JPEG XR according to an embodiment of the present application.
Fig. 4 is a schematic block diagram of a JPEG XR encoder according to an embodiment of the present application.
Fig. 5 is a schematic diagram of forming transform coefficients based on a macroblock according to an embodiment of the present application.
Fig. 6 is a schematic diagram of an encoding method according to an embodiment of the application.
Fig. 7a is a schematic diagram of the division of an image to be encoded according to an embodiment of the present application.
Fig. 7b is a schematic diagram of the division of an image to be encoded according to another embodiment of the present application.
Fig. 7c is a schematic diagram of the division of an image to be encoded according to a further embodiment of the present application.
Fig. 7d is a schematic diagram of the division of an image to be encoded according to still another embodiment of the present application.
Fig. 7e is a schematic diagram of the division of an image to be encoded according to still another embodiment of the present application.
Fig. 7f is a schematic diagram of the division of an image to be encoded according to still another embodiment of the present application.
Fig. 8 is a schematic diagram of a matrix position conversion function implemented according to an embodiment of the present application.
Fig. 9 is a schematic diagram of a mapping relationship of a block according to an embodiment of the present application.
Fig. 10 is a schematic diagram of an encoding method according to another embodiment of the present application.
Fig. 11 is a schematic diagram of an encoding method according to another embodiment of the present application.
Fig. 12 is a schematic block diagram of an encoding apparatus according to an embodiment of the present application.
Fig. 13a is a schematic block diagram of a JPEG XR encoder according to another embodiment of the present application.
Fig. 13b is a schematic block diagram of a JPEG XR encoder according to a further embodiment of the present application.
Fig. 14 is a schematic block diagram of an encoding apparatus according to another embodiment of the present application.
Fig. 15 is a schematic block diagram of an encoding apparatus according to still another embodiment of the present application.
Fig. 16 is a schematic block diagram of an encoding apparatus according to still another embodiment of the present application.
Fig. 17 is a schematic structural diagram of a chip provided in an embodiment of the present application.
Detailed Description
The following describes the technical solution in the embodiment of the present application.
Unless defined otherwise, all technical and scientific terms used in the embodiments of the application have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application.
Fig. 1 is a schematic diagram of a technical solution to which an embodiment of the present application is applied.
As shown in fig. 1, the system 100 may receive data to be processed 102, process the data to be processed 102, and generate processed data 108. For example, the system 100 may receive data to be encoded, encode the data to be encoded to produce encoded data, or the system 100 may receive data to be decoded, decode the data to be decoded to produce decoded data. In some embodiments, components in system 100 may be implemented by one or more processors, which may be processors in a computing device or processors in a mobile device (e.g., a drone). The processor may be any kind of processor, which is not limited in this embodiment of the present application. In some possible designs, the processor may include an encoder, decoder, codec, or the like. One or more memories may also be included in the system 100. The memory may be used to store instructions and data, such as computer-executable instructions, data to be processed 102, processed data 108, etc., that implement aspects of embodiments of the present application. The memory may be any type of memory, and embodiments of the present application are not limited in this regard.
The data to be encoded may include text, images, graphical objects, animation sequences, audio, video, or any other data that needs to be encoded. In some cases, the data to be encoded may include sensory data from a sensor, which may be a vision sensor (e.g., camera, infrared sensor), microphone, near-field sensor (e.g., ultrasonic sensor, radar), position sensor, temperature sensor, touch sensor, or the like. In some cases, the data to be encoded may include information from the user, e.g., biometric information, which may include facial features, fingerprint scans, retinal scans, voice recordings, DNA samples, and the like.
Fig. 2 is a schematic diagram of a video coding framework 2 according to an embodiment of the present application. As shown in fig. 2, after the video to be encoded is received, each frame of the video is encoded in sequence, starting from its first frame. The current frame to be encoded mainly goes through prediction (Prediction), transform (Transform), quantization (Quantization), and entropy coding (Entropy Coding), and finally the code stream of the current encoded frame is output. Correspondingly, the decoding process generally decodes the received code stream by the inverse of the above process to recover the video frame information as it was before encoding.
Specifically, as shown in fig. 2, the video encoding framework 2 includes an encoding control module 201, which is used for performing decision control actions and parameter selection in the encoding process. For example, as shown in fig. 2, the encoding control module 201 controls parameters used in transformation, quantization, inverse quantization, and inverse transformation, controls parameter control for performing intra-mode or inter-mode selection, and motion estimation and filtering, and the control parameters of the encoding control module 201 are also input to the entropy encoding module to perform encoding to form a part of the encoded code stream.
When encoding of the current frame starts, the frame is first divided 202: it is divided into slices (slices), and the slices are then divided into blocks. Optionally, in one example, the frame is divided into a plurality of non-overlapping coding tree units (CTUs), and each CTU may be further iteratively divided into a series of smaller coding units (Coding Units, CUs) in a quadtree, binary-tree, or ternary-tree manner. In some examples, a CU may further include prediction units (Prediction Units, PUs) and transform units (Transform Units, TUs) associated with it, where a PU is the basic unit of prediction and a TU is the basic unit of transform and quantization. In some examples, the PU and the TU are each partitioned into one or more blocks on a CU basis, where one PU contains a plurality of prediction blocks (Prediction Blocks, PBs) and associated syntax elements. In some examples, the PU and the TU may be the same, or may be derived from a CU by different partitioning methods. In some examples, at least two of the CU, PU, and TU are identical; for example, without distinguishing among CUs, PUs, and TUs, prediction, quantization, and transform are all performed in units of CUs. For convenience of description, CTUs, CUs, or other formed data units are hereinafter referred to as coding blocks.
It should be appreciated that in embodiments of the present application, the data units for which video encoding is intended may be frames, slices, coding tree units, coding blocks, or a group of any of the above. In different embodiments, the size of the data units may vary.
Specifically, as shown in fig. 2, after the frame to be encoded is divided into a plurality of coding blocks, a prediction process is performed to remove spatially and temporally redundant information from the current frame. Currently, the commonly used predictive coding methods include intra prediction and inter prediction. Intra prediction predicts the current coding block using only reconstructed information within the current frame, while inter prediction predicts the current coding block using information from other, previously reconstructed frames (also referred to as reference frames). Specifically, in the embodiments of the present application, the encoding control module 201 decides whether to select intra prediction or inter prediction.
When the intra prediction mode is selected, the intra prediction 203 process includes: obtaining the reconstructed blocks of encoded neighboring blocks around the current coding block as reference blocks; calculating a prediction value based on the pixel values of these reference blocks using a prediction-mode method to generate a prediction block; subtracting the corresponding pixel values of the prediction block from those of the current coding block to obtain the residual of the current coding block; and transforming 204, quantizing 205, and entropy coding 210 the residual of the current coding block to form the code stream of the current coding block. Further, after all coding blocks of the current frame have undergone the above encoding process, they form part of the encoded code stream of the frame. In addition, the control and reference data generated in the intra prediction 203 are also encoded by the entropy encoding 210 and form part of the encoded code stream.
In particular, the transform 204 is used to remove the residual correlation of the image block in order to improve the coding efficiency. For transformation of residual data of a current coding block, two-dimensional discrete cosine transform (Discrete Cosine Transform, DCT) transformation and two-dimensional discrete sine transform (Discrete Sine Transform, DST) transformation are generally adopted, for example, residual information of the coding block is multiplied by an n×m transformation matrix and a transposed matrix thereof at a coding end, and transformation coefficients of the current coding block are obtained after multiplication.
After the transform coefficients are generated, compression efficiency is further improved by quantization 205: the transform coefficients are quantized to obtain quantized coefficients, and the quantized coefficients are then entropy coded 210 to obtain the residual code stream of the current coding block, where the entropy coding methods include, but are not limited to, context-adaptive binary arithmetic coding (Context Adaptive Binary Arithmetic Coding, CABAC). Finally, the bitstream obtained by entropy coding, together with the encoded coding-mode information, is stored or transmitted to the decoding end. At the encoding end, the quantized result is also inverse quantized 206, and the inverse-quantized result is inverse transformed 207. After the inverse transform 207, reconstructed pixels are obtained using the inverse-transform result and the motion compensation result. The reconstructed pixels are then filtered (i.e., loop filtered) 211. After the filtering 211, the filtered reconstructed image (belonging to the reconstructed video frame) is output. Subsequently, the reconstructed image may serve as a reference frame image for inter prediction of other frames. In the embodiments of the present application, the reconstructed image may also be called a reconstructed picture.
Specifically, the encoded neighboring blocks in the intra prediction 203 process are neighboring blocks that were encoded before the current coding block; the residual generated when encoding a neighboring block is transformed 204, quantized 205, inverse quantized 206, and inverse transformed 207, and the result is added to the prediction block of that neighboring block to obtain the reconstructed block. Correspondingly, inverse quantization 206 and inverse transform 207 are the inverse processes of quantization 205 and transform 204, and are used to recover the residual data before quantization and transform.
As shown in fig. 2, when the inter prediction mode is selected, the inter prediction process includes motion estimation (Motion Estimation, ME) 208 and motion compensation (Motion Compensation, MC) 209. Specifically, the encoding end may perform motion estimation 208 according to a reference frame image among the reconstructed video frames, and search, in one or more reference frame images and according to a certain matching criterion, for the image block most similar to the current coding block as the prediction block; the relative displacement between the prediction block and the current coding block is the motion vector (Motion Vector, MV) of the current coding block. The residual of the coding block is obtained by subtracting the corresponding pixel values of the prediction block from the original pixel values of the coding block. The residual of the current coding block is transformed 204, quantized 205, and entropy coded 210 to form part of the encoded code stream of the frame. For the decoding end, motion compensation 209 may be performed based on the determined motion vector and prediction block to obtain the current coding block.
As shown in fig. 2, the reconstructed video frame is obtained after the filtering 211. The reconstructed video frame includes one or more reconstructed images. The filtering 211 is used to reduce distortions such as blocking and ringing effects generated during encoding. During encoding, the reconstructed video frame provides reference frames for inter prediction; during decoding, the reconstructed video frame is output, after post-processing, as the final decoded video.
In particular, the inter prediction mode may include an advanced motion vector prediction (Advanced Motion Vector Prediction, AMVP) mode, a Merge (Merge) mode, or a skip (skip) mode.
For the AMVP mode, a motion vector prediction (Motion Vector Prediction, MVP) may first be determined. After the MVP is obtained, the starting point of motion estimation may be determined according to the MVP, and a motion search is performed near the starting point. When the search is completed, the optimal MV is obtained; the MV determines the position of the reference block in the reference image; the reference block is subtracted from the current block to obtain a residual block; the MVP is subtracted from the MV to obtain a motion vector difference (Motion Vector Difference, MVD); and the MVD and the index of the MVP are transmitted to the decoding end through the code stream.
For the Merge mode, the MVP may first be determined and used directly as the MV of the current block. To obtain the MVP, an MVP candidate list (merge candidate list) may first be constructed, which includes at least one candidate MVP, each corresponding to an index. After selecting an MVP from the MVP candidate list, the encoding end may write the MVP index into the code stream; the decoding end can then find the MVP corresponding to this index in the MVP candidate list, thereby decoding the image block.
It should be appreciated that the above process is but one specific implementation of the Merge mode. The Merge mode may also have other implementations.
For example, skip mode is a special case of Merge mode. After obtaining the MV according to the Merge mode, if the encoding end determines that the current block is substantially the same as the reference block, it is not necessary to transmit residual data, only the index of the MVP is transferred, and further a flag may be transferred, which may indicate that the current block may be directly obtained from the reference block.
That is, the Merge mode is characterized by: MV = MVP (MVD = 0); the Skip mode has one further characteristic: the reconstructed value rec = the predicted value pred (residual value resi = 0).
The Merge mode may be applied in geometric prediction techniques. In the geometric prediction technology, an image block to be encoded can be divided into a plurality of sub-image blocks with polygonal shapes, a motion vector can be respectively determined for each sub-image block from a motion information candidate list, a prediction sub-block corresponding to each sub-image block is determined based on the motion vector of each sub-image block, and a prediction block of a current image block is constructed based on the prediction sub-block corresponding to each sub-image block, so that encoding of the current image block is realized.
The decoding end performs operations corresponding to those of the encoding end. Residual information is first obtained through entropy decoding, inverse quantization, and inverse transform, and whether intra prediction or inter prediction is used for the current image block is determined from the decoded code stream. In the case of intra prediction, prediction information is constructed from the reconstructed image blocks in the current frame according to the intra prediction method. In the case of inter prediction, the motion information needs to be parsed, and the reference block is determined in the reconstructed image using the parsed motion information to obtain the prediction information. The prediction information and the residual information are then superimposed, and the reconstructed information is obtained through the filtering operation.
As described above, encoding video based on the video coding framework 2 shown in fig. 2 can save the space or traffic occupied by video image storage and transmission. In general, uncompressed raw image data captured by a camera occupies a large amount of storage. Taking an image with a resolution of 3840 x 2160 and a storage format of YUV 4:2:2 at 10 bits per sample as an example, storing the image without compression requires about 20 megabytes; typically, an 8 GB memory card can store only about 400 uncompressed pictures of this specification, and transmitting one such uncompressed picture over a network consumes 20 megabytes of traffic. Therefore, to save the space or traffic occupied by image storage and transmission, image data needs to be compressed by encoding.
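As a quick check of the figure above, the uncompressed size can be computed directly. A minimal sketch, assuming YUV 4:2:2 sampling (two samples per pixel on average) at 10 bits per sample:

```python
# Uncompressed size of a 3840 x 2160, YUV 4:2:2, 10-bit image.
width, height = 3840, 2160
samples_per_pixel = 2        # YUV 4:2:2: one Y plus half-resolution U and V per pixel
bits_per_sample = 10

size_bytes = width * height * samples_per_pixel * bits_per_sample // 8
print(size_bytes / 2**20)    # ~19.8 MiB, i.e. roughly 20 megabytes per picture
```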
Joint Photographic Experts Group extended range (Joint Photographic Experts Group Extended Range, JPEG XR) is a continuous-tone still image compression algorithm and file format, also known as HD Photo or Windows Media Photo, developed by Microsoft as part of the Windows Media family. It supports both lossy and lossless data compression and is the preferred image format for Microsoft's XML Paper Specification (XPS) documents, where XML is the Extensible Markup Language. Software that currently supports it includes the .NET Framework (3.0 or later), operating systems (Windows Vista / Windows 7), Internet Explorer (IE) 9, Flash Player 11, and so on.
JPEG XR is an image codec that can achieve high-dynamic-range image encoding and requires only integer arithmetic for compression and decompression. It supports images in monochrome, red-green-blue (RGB), cyan-magenta-yellow-black (CMYK), and multi-channel color formats, with 16-bit unsigned integer or 32-bit fixed-point or floating-point representations, and it also supports the RGBE Radiance image format. It can optionally embed International Color Consortium (ICC) color profiles to achieve color consistency across different devices. An alpha channel can represent transparency, and the Exchangeable Image File (EXIF) and Extensible Metadata Platform (XMP) metadata formats are supported. The format also supports including multiple images in one file. Partial decoding of an image is supported as well: for some specific operations such as cropping, downsampling, horizontal or vertical flipping, or rotation, the entire image need not be decoded.
Fig. 3 schematically shows the five layers, from largest to smallest, of an image processed by JPEG XR: the image (image), the tile (tile), the macroblock (macro block), the block (block), and the pixel (pixel). An image may be composed of one or more tiles. A tile located at the right or bottom edge of the image is padded to an integer number of 16 x 16 macroblocks. Each macroblock may contain 16 blocks of 4 x 4, and each block may contain 4 x 4 pixels. JPEG XR performs a two-stage transform: a first stage on each 4 x 4 block, and a second stage on the low-pass block reorganized from the results within each 16 x 16 macroblock.
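A minimal sketch of this hierarchy (the function and names below are illustrative, not from the JPEG XR specification): padding a tile at the right/bottom edge up to whole 16 x 16 macroblocks, each of which holds 16 blocks of 4 x 4 pixels.

```python
def padded_macroblock_grid(tile_width, tile_height, mb_size=16):
    """Number of macroblocks after padding a tile to whole 16x16 macroblocks."""
    mbs_x = (tile_width + mb_size - 1) // mb_size   # ceiling division
    mbs_y = (tile_height + mb_size - 1) // mb_size
    return mbs_x, mbs_y

mbs_x, mbs_y = padded_macroblock_grid(100, 50)
print(mbs_x * mbs_y)   # 7 * 4 = 28 macroblocks
# Each macroblock contains 16 blocks of 4x4 pixels, i.e. 256 pixels.
```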
As shown in fig. 4, a schematic block diagram of a JPEG XR encoder according to an embodiment of the present application is provided. The JPEG XR encoder may include five modules: a filtering module 410, a transform module 420, a quantization module 430, a prediction module 440, and an entropy encoding module 450, which function similarly to the corresponding modules described above with reference to fig. 2. Specifically, the filtering module 410 may mitigate blocking artifacts of the decoded reconstructed image by smoothing neighboring pixels; the transform module 420 may transform the image information from the spatial domain to the frequency domain, removing part of the spatially redundant information; the quantization module 430 may scale down the frequency-domain coefficients to reduce the magnitudes of the coefficients to be encoded, the degree of reduction depending on the specified quantization parameter (Quantization Parameter, QP); the prediction module 440 may remove the correlation of some coefficients between neighboring blocks through prediction; and the entropy encoding module 450 may encode the resulting coefficients into a binary code stream.
From the functions of the above five modules and their description in connection with fig. 2, it can be seen that the size of the final code stream (i.e., the code rate) mainly depends on the degree of quantization, which plays a decisive role, as well as on the prediction efficiency and the entropy coding performance.
The transform module and the quantization module of JPEG XR are described below.
1. Transform module
The transform of JPEG XR is an integer-based transform, and each macroblock participates in a two-stage transform performed on a 4 x 4 block basis. As shown in fig. 5, the first-stage transform may be applied to the 16 blocks within a macroblock, producing 16 low-pass coefficients (Low Pass Coefficient, LP coefficient) and 240 high-pass coefficients (High Pass Coefficient, HP coefficient); that is, each of the 16 blocks produces 1 LP coefficient and 15 HP coefficients. The second-stage transform may be applied to the block reorganized from the 16 LP coefficients obtained in the first stage, and this further transform finally produces 1 direct-current coefficient (Direct Current Coefficient, DC coefficient) and 15 LP coefficients.
2. Quantization module
Quantization in JPEG XR is highly flexible, since the quantization parameter may vary among tiles and macroblocks and among the DC, LP, and HP coefficients. The quantization parameter of JPEG XR is an integer from 0 to 255, where quantization parameters 0 and 1 correspond to lossless compression and 255 corresponds to the lossiest compression. The mapping from the quantization parameter to the scaling factor (Scaling Factor, SF) is shown in the following formula (1):
The quantized coefficients are obtained by dividing the original coefficients by the scaling factor corresponding to the selected quantization parameter and rounding the results to integers.
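A hedged sketch of this quantization step follows. The QP-to-SF mapping of formula (1) is not reproduced in this text, so the scaling factor is assumed to be given, and the rounding rule shown is illustrative only.

```python
def quantize(coefficient, scaling_factor):
    """Divide by the scaling factor of the selected QP and round to an integer.
    Python's round() stands in for the codec's rounding rule (an assumption)."""
    return round(coefficient / scaling_factor)

# Example with an assumed scaling factor of 8: a transform coefficient of 146
# becomes the quantized coefficient 18 (146 / 8 = 18.25, rounded).
print(quantize(146, 8))   # 18
```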
It follows that the final code rate of an image compressed with a JPEG XR encoder depends on the degree of quantization, which in turn depends on the specified quantization parameter. Some current rate control algorithms for JPEG XR require multiple encoding passes, and the extra complexity introduces encoding delay, which affects the running speed of the encoder and is not conducive to real-time encoding. Other algorithms use fixed quantization parameters, which, however, lead to an uncontrollable output code rate.
In view of the above problems, the present application provides an encoding method that can improve encoding efficiency and reduce hardware resource consumption without sacrificing rate control precision, while keeping the quantization parameters flexible enough to avoid an uncontrollable output code rate.
As shown in fig. 6, an encoding method 600 according to an embodiment of the present application may include steps 610-620.
610, obtaining image complexity information of each of the n1 tiles in the current frame, where the image complexity information includes transform coefficients obtained by performing Photo Core Transform (Photo Core Transform, PCT) processing on the pixel values of each tile, and n1 is a positive integer greater than or equal to 2.
In the embodiments of the present application, the image to be encoded may be divided either by a fixed width or a fixed height, or not by a fixed width or height.
In addition, in the embodiments of the present application, the image to be encoded may be divided horizontally or vertically.
It should be understood that the horizontal division in the embodiment of the present application may refer to division of an image to be encoded from a horizontal direction, and the vertical division may refer to division of an image to be encoded from a vertical direction.
For example, as shown in fig. 7a, a schematic diagram of dividing an image to be encoded is provided according to an embodiment of the present application. The image to be encoded can be vertically divided according to a preset fixed width, assuming that the preset fixed width is 384, as can be seen from fig. 7a, the image to be encoded can be divided into 3 tiles, and the widths of the 3 tiles are all the same, and are all 384.
As shown in fig. 7b, a schematic diagram of the division of an image to be encoded is provided for another embodiment of the present application. Likewise, the image to be encoded may be vertically divided according to a preset fixed width, assuming that the preset fixed width is 384, as can be seen from fig. 7b, the image to be encoded may be divided into 3 tiles, and the widths of tile 1 and tile 2 are the same, and are 384, and the width of tile 3 is smaller than 384.
It will be appreciated that, with the division manners of figs. 7a and 7b above, the heights of the divided tiles (i.e., tile 1, tile 2, and tile 3) are the same, all equal to the height of the image to be encoded.
As shown in fig. 7c, a schematic diagram of the division of an image to be encoded is provided for a further embodiment of the present application. The image to be encoded can be horizontally divided according to a preset fixed height, and the preset fixed height is 384, as can be seen from fig. 7c, the image to be encoded can be divided into 3 tiles, and the heights of the 3 tiles are all the same and are 384.
As shown in fig. 7d, a schematic diagram of the division of the image to be encoded is provided for a further embodiment of the present application. The image to be encoded can be horizontally divided according to a preset fixed height, assuming that the preset fixed height is 384, it can be seen from fig. 7d that the image to be encoded can be divided into 3 tiles, the heights of tile 1 and tile 2 are the same, and are 384, and the height of tile 3 is less than 384.
It will be appreciated that, with the division manners of figs. 7c and 7d above, the widths of the divided tiles are the same, all equal to the width of the image to be encoded.
In some embodiments, the image to be encoded may not be divided by a fixed width or a fixed height, and the image to be encoded may be divided by a preset plurality of widths or a preset plurality of heights.
As shown in fig. 7e, a schematic diagram of the division of an image to be encoded is provided for a further embodiment of the present application. The image to be encoded can be vertically divided according to a number of preset fixed widths, assuming that the preset fixed widths include 160 and 384, it can be seen from fig. 7e that the width of tile 1 is 160, the width of tile 2 is 384, and the width of tile 3 is 224.
As shown in fig. 7f, a schematic diagram of the division of an image to be encoded according to a further embodiment of the present application is provided. The image to be encoded can be divided horizontally according to a plurality of preset fixed heights, assuming 160 and 384, it can be seen from fig. 7f that tile 1 has a height of 160, tile 2 has a height of 384, and tile 3 has a height of 224.
It should be understood that the above values are illustrative, and other values are possible, which are not particularly limited by the present application.
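The divisions of figs. 7a-7f can be expressed compactly. The sketch below is illustrative (not from the patent): it splits one dimension of the image by a fixed size, leaving a smaller final tile when the dimension is not an integer multiple of that size.

```python
def split_dimension(length, fixed=384):
    """Split an image dimension into tiles of size `fixed`, plus a possible remainder tile."""
    sizes = [fixed] * (length // fixed)
    if length % fixed:
        sizes.append(length % fixed)   # rightmost/bottommost tile, smaller than `fixed`
    return sizes

print(split_dimension(1152))   # [384, 384, 384] -> the equal tiles of fig. 7a / 7c
print(split_dimension(868))    # [384, 384, 100] -> the smaller last tile of fig. 7b / 7d
```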
It should be noted that, in some embodiments, the fixed width or fixed height is set in relation to the number of pixels included in the PCT-processed block when the image to be encoded is divided by the fixed width or fixed height, as described later.
In some embodiments, the image complexity information may instead be computed by other operators, including but not limited to a Hadamard transform or a mean squared error.
620, updating the quantization parameters of the n1 tiles according to the image complexity information.
The image complexity information in the embodiment of the present application may refer to transform coefficients of tiles included in an image to be encoded, and the QP of the tile is updated according to the transform coefficients of the tile.
According to the scheme provided by the embodiments of the present application, the image complexity information is related to the transform coefficients of the tiles in the current frame, and these transform coefficients are obtained by PCT processing of the pixel values of the tiles. By updating or calculating the QP of each tile of the image to be encoded according to the image complexity information, encoding efficiency can be improved and hardware resource consumption reduced without sacrificing rate control precision, while the flexibility of the quantization parameters is preserved to avoid an uncontrollable output code rate.
It is noted above that the image complexity information includes transform coefficients obtained by subjecting the pixel values of each tile to PCT processing, and PCT processing involved therein will be described below.
In PCT processing, each 4 x 4 block included in a macroblock may be transformed according to the following procedure.
In the following, _2x2T_h(a, b, c, d, flag), _T_odd(a, b, c, d), _T_odd_odd(a, b, c, d), and _FwdPermute(a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p) denote 4 different calculations, and the letters in brackets denote the inputs and outputs of each calculation.
PCT4×4(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p)
_2×2T_h(a,d,m,p,0)
_2×2T_h(f,g,j,k,0)
_2×2T_h(b,c,n,o,0)
_2×2T_h(e,h,i,l,0)
_2×2T_h(a,b,e,f,1)
_T_odd(c,d,g,h)
_T_odd(i,m,j,n)
_T_odd_odd(k,l,o,p)
_FwdPermute(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p)
1) The calculation process of _2x2T_h(a, b, c, d, flag) is shown in formulas (2)-(9)
a=a+d (2)
b=b-c (3)
t1=((a-b+flag)>>1) (4)
t2=c (5)
c=t1-d (6)
d=t1-t2 (7)
a=a-d (8)
b=b+c (9)
Wherein t1 and t2 are temporary values, and the >> symbol denotes a right-shift operation.
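Formulas (2)-(9) translate directly into code. The sketch below is illustrative (arguments are returned rather than modified in place, and Python's `>>` is an arithmetic right shift, which is assumed to match the intended semantics); it reproduces the worked _2x2T_h(a, d, m, p, 0) example given later in the text.

```python
def t2x2_h(a, b, c, d, flag):
    """_2x2T_h per formulas (2)-(9); t1 and t2 are temporaries."""
    a = a + d                    # (2)
    b = b - c                    # (3)
    t1 = (a - b + flag) >> 1     # (4)
    t2 = c                       # (5)
    c = t1 - d                   # (6)
    d = t1 - t2                  # (7)
    a = a - d                    # (8)
    b = b + c                    # (9)
    return a, b, c, d

# Worked example from the text: _2x2T_h(a, d, m, p, 0) with a=1, d=4, m=3, p=1.
print(t2x2_h(1, 4, 3, 1, 0))    # (5, 0, -1, -3), i.e. a=5, d=0, m=-1, p=-3
```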
2) The calculation process of _T_odd(a, b, c, d) is shown in formulas (10)-(21)
b=b-c (10)
a=a+d (11)
c=c+((b+1)>>1) (12)
d=((a+1)>>1)-d (13)
b=b-((a*3+4)>>3) (14)
a=a+((b*3+4)>>3) (15)
d=d-((c*3+4)>>3) (16)
c=c+((d*3+4)>>3) (17)
d=d+(b>>1) (18)
c=c-((a+1)>>1) (19)
b=b-d (20)
a=a+c (21)
3) The calculation process of _T_odd_odd(a, b, c, d) is shown in formulas (22)-(36)
b=-1*b (22)
c=-1*c (23)
d=d+a (24)
c=c-b (25)
t1=d>>1 (26)
t2=c>>1 (27)
a=a-t1 (28)
b=b+t2 (29)
a=a+((b*3+4)>>3) (30)
b=b-((a*3+4)>>2) (31)
a=a+((b*3+3)>>3) (32)
b=b-t2 (33)
a=a+t1 (34)
c=c+b (35)
d=d-a (36)
Wherein t1 and t2 are temporary values, and the >> symbol denotes a right-shift operation.
4) The calculation process of _FwdPermute(a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p) is shown in fig. 8
This process implements a matrix position conversion function, i.e., converting the position of the letters of the left-hand graphic shown in fig. 8 to the position of the letters of the right-hand graphic shown in fig. 8.
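Putting the four calculations together, the PCT of a 4 x 4 block can be sketched as below, reusing `t2x2_h` from the sketch above. The mapping of the letters a..p to list indices 0..15 in raster order is an assumption, _T_odd and _T_odd_odd follow formulas (10)-(21) and (22)-(36) literally, and the final _FwdPermute step is omitted because the position table of fig. 8 is not reproduced here.

```python
def t_odd(a, b, c, d):
    """_T_odd per formulas (10)-(21)."""
    b -= c; a += d                     # (10), (11)
    c += (b + 1) >> 1                  # (12)
    d = ((a + 1) >> 1) - d             # (13)
    b -= (a * 3 + 4) >> 3              # (14)
    a += (b * 3 + 4) >> 3              # (15)
    d -= (c * 3 + 4) >> 3              # (16)
    c += (d * 3 + 4) >> 3              # (17)
    d += b >> 1                        # (18)
    c -= (a + 1) >> 1                  # (19)
    b -= d; a += c                     # (20), (21)
    return a, b, c, d

def t_odd_odd(a, b, c, d):
    """_T_odd_odd per formulas (22)-(36); t1 and t2 are temporaries."""
    b, c = -b, -c                      # (22), (23)
    d += a; c -= b                     # (24), (25)
    t1, t2 = d >> 1, c >> 1            # (26), (27)
    a -= t1; b += t2                   # (28), (29)
    a += (b * 3 + 4) >> 3              # (30)
    b -= (a * 3 + 4) >> 2              # (31)
    a += (b * 3 + 3) >> 3              # (32)
    b -= t2; a += t1                   # (33), (34)
    c += b; d -= a                     # (35), (36)
    return a, b, c, d

def pct_4x4(v):
    """Apply the PCT call sequence to a 4x4 block stored as a 16-element list
    (letters a..p assumed to be v[0]..v[15] in raster order)."""
    def apply(fn, idx, *extra):
        for i, x in zip(idx, fn(*(v[i] for i in idx), *extra)):
            v[i] = x
    apply(t2x2_h, (0, 3, 12, 15), 0)   # _2x2T_h(a, d, m, p, 0)
    apply(t2x2_h, (5, 6, 9, 10), 0)    # _2x2T_h(f, g, j, k, 0)
    apply(t2x2_h, (1, 2, 13, 14), 0)   # _2x2T_h(b, c, n, o, 0)
    apply(t2x2_h, (4, 7, 8, 11), 0)    # _2x2T_h(e, h, i, l, 0)
    apply(t2x2_h, (0, 1, 4, 5), 1)     # _2x2T_h(a, b, e, f, 1)
    apply(t_odd, (2, 3, 6, 7))         # _T_odd(c, d, g, h)
    apply(t_odd, (8, 12, 9, 13))       # _T_odd(i, m, j, n)
    apply(t_odd_odd, (10, 11, 14, 15)) # _T_odd_odd(k, l, o, p)
    # _FwdPermute(a..p) omitted: the permutation of fig. 8 is not reproduced here.
    return v
```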
The PCT processing flow is illustrated below with simple numerical values. Assume that the values of a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, and p are 1, 2, 3, 4, 2, 3, 5, 4, 2, 3, 4, 0, 3, 2, 1, 1, respectively, and PCT processing is performed according to the procedure described above.
1) Performing _2x2T_h (a, b, c, d, flag) processing
(1) The procedure for _2x2T_h (a, d, m, p, 0) is as follows
a=a+p=1+1=2
d=d-m=4-3=1
t1=((a-d+flag)>>1)=0
t2=m=3
m=t1-p=-1
p=t1-t2=0-3=-3
a=a-p=2-(-3)=5
d=d+m=1+(-1)=0
(2) The procedure for _2x2T_h (f, g, j, k, 0) is as follows
f=f+k=3+4=7
g=g-j=5-3=2
t1=((f-g+flag)>>1)=2
t2=j=3
j=t1-k=2-4=-2
k=t1-t2=2-3=-1
f=f-k=7-(-1)=8
g=g+j=2+(-2)=0
(3) The procedure for _2x2T_h (b, c, n, o, 0) is as follows
b=b+o=2+1=3
c=c-n=3-2=1
t1=((b-c+flag)>>1)=1
t2=n=2
n=t1-o=1-1=0
o=t1-t2=1-2=-1
b=b-o=3-(-1)=4
c=c+n=1+0=1
(4) The procedure for _2x2T_h (e, h, i, l, 0) is as follows
e=e+l=2+0=2
h=h-i=4-2=2
t1=((e-h+flag)>>1)=0
t2=i=2
i=t1-l=0-0=0
l=t1-t2=0-2=-2
e=e-l=2-(-2)=4
h=h+i=2+0=2
(5) The procedure of _2x2T_h (a, b, e, f, 1) is as follows
a=a+f=5+8=13
b=b-e=4-4=0
t1=((a-b+flag)>>1)=7
t2=e=4
e=t1-f=7-8=-1
f=t1-t2=7-4=3
a=a-f=13-3=10
b=b+e=0+(-1)=-1
Based on this, the values of a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p are updated to 10, -1, 1, 0, -1, 3, 0, 2, 0, -2, -1, -2, -1, 0, -1, -3, respectively.
2) Processing of the _T_odd (a, b, c, d)
(1) The flow of _T_odd (c, d, g, h) is as follows
From the above, the values of c, d, g and h are 1,0,0,2 respectively.
d=d-g=0-0=0
c=c+h=1+2=3
g=g+((d+1)>>1)=0
h=((c+1)>>1)-h=0
d=d-((c*3+4)>>3)=0
c=c+((d*3+4)>>3)=3
h=h-((g*3+4)>>3)=0
g=g+((h*3+4)>>3)=0
h=h+(d>>1)=0
g=g-((c+1)>>1)=-2
d=d-h=0-0=0
c=c+g=3+(-2)=1
Therefore, the updated values of c, d, g and h are respectively 1,0, -2 and 0.
(2) The flow of _T_odd (i, m, j, n) is as follows
From the above, the values of i, m, j, n are 0, -1, -1, 0, respectively.
m=m-j=-1-(-1)=0
i=i+n=0+0=0
j=j+((m+1)>>1)=-1
n=((i+1)>>1)-n=0
m=m-((i*3+4)>>3)=0
i=i+((m*3+4)>>3)=0
n=n-((j*3+4)>>3)=0
j=j+((n*3+4)>>3)=-1
n=n+(m>>1)=0
j=j-((i+1)>>1)=-1
m=m-n=0
i=i+j=-1
Thus, the updated values of i, m, j, n are -1, 0, -1, 0, respectively.
3) Processing of the _T_odd_odd (k, l, o, p)
From the above, the values of k, l, o and p are -1, -2, -1 and -3, respectively.
l=-1*l=2
o=-1*o=1
p=p+k=(-3)+(-1)=-4
o=o-l=1-2=-1
t1=p>>1=2
t2=o>>1=0
k=k-t1=(-1)-2=-3
l=l+t2=2+0=2
k=k+((l*3+4)>>3)=-2
l=l-((k*3+4)>>2)=1
k=k+((l*3+3)>>3)=-2
l=l-t2=1-0=1
k=k+t1=(-2)+2=0
o=o+l=(-1)+1=0
p=p-k=(-4)-0=-4
Thus, the updated values of k, l, o, p are 0,1,0, -4, respectively.
Based on this, the updated values of a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p are 10, -1, 1, 0, -1, 3, -2, 0, -1, -1, 0, 1, 0, 0, 0, -4, respectively.
4) Performing _FwdPermute (a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p) processing
According to the matrix transformation processing of the input values as shown in fig. 8, the values of a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p after final updating are respectively 10, -1,0,1, -4,0, -1,0,3, -1,0,1, -2,0.
It should be appreciated that the above process is illustrated with simple values only, and that in an actual encoding process, the pixel values in the image to be encoded may be much larger than the values listed above, but the calculation flow is essentially consistent.
Optionally, in some embodiments, the updating of the QPs of the n1 tiles according to the image complexity information includes: calculating a cumulative value of the target number of bytes of the i-th tile among the n1 tiles according to the image complexity information, where i is a positive integer less than or equal to n1; and updating the QP of the i-th tile according to the cumulative value of the target number of bytes of the i-th tile.
The cumulative value of the target number of bytes of the i-th tile in the embodiments of the present application may refer to the sum of the target numbers of bytes of the first i tiles (including the i-th tile).
For example, the cumulative value of the target number of bytes of the 2nd tile may refer to the sum of the target numbers of bytes of the 1st and 2nd tiles; the cumulative value of the target number of bytes of the 5th tile may refer to the sum of the target numbers of bytes of the 1st through 5th tiles; and, by analogy, the cumulative value of the target number of bytes of the n1-th tile may refer to the sum of the target numbers of bytes of the 1st through n1-th tiles.
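In other words, the cumulative value is a running (prefix) sum over the tiles in order. A one-line illustration with made-up per-tile target byte counts:

```python
from itertools import accumulate

target_bytes = [1200, 900, 1500, 800, 1100]   # hypothetical per-tile targets
print(list(accumulate(target_bytes)))          # [1200, 2100, 3600, 4400, 5500]
# The i-th entry is the cumulative value of the target number of bytes of the i-th tile.
```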
In the embodiments of the present application, the QP of the i-th tile may be updated according to the cumulative value of the target number of bytes of the i-th tile. For example, the QP of the 1st tile may be updated based on the cumulative value of the target number of bytes of the 1st tile (i.e., the target number of bytes of the 1st tile), and the QP of the 3rd tile may be updated based on the cumulative value of the target number of bytes of the 3rd tile.
It should be noted that the 1st tile may have an initial QP, and the initial QP of the 1st tile is then updated according to the target number of bytes of the 1st tile.
As described above, the image complexity information in the embodiments of the present application includes the transform coefficients obtained by PCT processing of the pixel values of each tile. This is because the image complexity information calculated using the transform form of the JPEG XR transform module has a strong correlation with the final bitstream size; therefore, the image complexity information used in the embodiments of the present application may be calculated by accumulating the PCT transform coefficients as described above.
The obtaining of the image complexity information of each of the n1 tiles in the current frame includes: mapping the blocks in the tile and then performing a preset transform to generate preset transform parameters, and generating the image complexity information of the tile according to the preset transform parameters. The preset transform includes the PCT transform.
It is noted that, to reduce hardware boundary conditions, in some embodiments, if the width of the image to be encoded is not an integer multiple of the fixed width, the complexity of the pixels in the rightmost tile whose width is less than the fixed width may not be calculated; similarly, if the height of the image to be encoded is not an integer multiple of the fixed height, the complexity of the pixels in the bottommost tile whose height is less than the fixed height may not be calculated.
In other embodiments, if the width of the image to be encoded is not an integer multiple of the fixed width, the image complexity of the rightmost tile of the image to be encoded may be calculated by a proportional relationship, or if the height of the image to be encoded is not an integer multiple of the fixed height, the image complexity of the bottommost tile of the image to be encoded may be calculated by a proportional relationship.
As described above, the input to PCT processing may include blocks of 4 x 4 pixels. To save hardware line-cache resources in a system-on-chip (System On Chip, SOC) implementation, a 2 x 8 block may be mapped to a 4 x 4 block as input. This operation can save 2 lines of cache resources.
Fig. 9 is a schematic diagram of a mapping relationship of a block according to an embodiment of the present application. Here, (a) in fig. 9 is a schematic diagram of a block before mapping, and (b) in fig. 9 is a schematic diagram of a block after mapping.
The mapping can be understood as moving the last four pixels of the first row to the position below the second row, i.e., into the third row, and moving the last four pixels of the second row to the position below the third row, i.e., into the fourth row.
It will be appreciated that the above operations involve only a change in the position of the pixel and not a change in the value of the pixel.
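The mapping just described can be written out directly; the sketch below is illustrative and, as the text notes, only moves pixel positions.

```python
def map_2x8_to_4x4(block):
    """Map a 2x8 block (two rows of eight pixels) to a 4x4 block:
    the last four pixels of row 1 become row 3, those of row 2 become row 4."""
    row1, row2 = block
    return [row1[:4], row2[:4], row1[4:], row2[4:]]

block_2x8 = [list(range(1, 9)), list(range(9, 17))]
for row in map_2x8_to_4x4(block_2x8):
    print(row)
# [1, 2, 3, 4]
# [9, 10, 11, 12]
# [5, 6, 7, 8]
# [13, 14, 15, 16]
```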
Furthermore, as noted above, in some embodiments, when the image to be encoded is divided by a fixed width or a fixed height, the setting of the fixed width or fixed height is related to the number of pixels included in a PCT-processed block.
Since the mapping maps a 2 x 8 block to a 4 x 4 block, i.e., a block before mapping includes 2 rows and 8 columns of pixels, the fixed width may be set to an integer multiple of 8, so that the rightmost blocks of a tile can be mapped exactly into 4 x 4 blocks.
Similarly, the fixed height may be set to an integer multiple of 2, so that the bottommost blocks of a tile can be mapped exactly into 4 x 4 blocks.
As described above, when the image to be encoded is divided vertically by a fixed width or horizontally by a fixed height, the width of the rightmost tile may not reach the fixed width, or the height of the bottommost tile may not reach the fixed height. In that case, the image complexity value of the rightmost or bottommost tile may be calculated in the following manner.
Taking vertical division by a fixed width as an example, if the image to be encoded is divided into n1 tiles, the width of the rightmost tile (i.e., the n1-th tile) may be less than 384. Therefore, the image complexity values of the 1st to (n1-1)-th tiles may be calculated first, and the image complexity value of the n1-th tile may then be calculated according to a proportional relationship.
After all 2 x 8 blocks within tiles 1 to n1-1 are mapped and PCT-processed, the absolute values of all DC coefficients and of all coefficients other than the DC coefficients (referred to as LPHP coefficients) within each resulting tile may be accumulated separately as the image complexity values of that tile, as shown in formulas (37) and (38), where i = 1, 2, 3, …, n1-1.

complexityDC_i = Σ_(2x8 blocks in tile i) |DC coefficient|   (37)

complexityLPHP_i = Σ_(2x8 blocks in tile i) |LPHP coefficients|   (38)

where complexityDC_i denotes the image complexity value of the DC coefficients of the i-th tile, Σ_(2x8 blocks in tile i) |DC coefficient| denotes the sum of the absolute values of the DC coefficients of all blocks in the i-th tile, complexityLPHP_i denotes the image complexity value of the LPHP coefficients of the i-th tile, and Σ_(2x8 blocks in tile i) |LPHP coefficients| denotes the sum of the absolute values of the LPHP coefficients of all blocks in the i-th tile.
For the rightmost tile whose width does not reach the fixed width 384, the image complexity values can be calculated by formulas (39) and (40):

complexityDC_n1 = complexityDC_(n1-1) * tilewidth_n1 / 384   (39)

complexityLPHP_n1 = complexityLPHP_(n1-1) * tilewidth_n1 / 384   (40)

where complexityDC_n1 denotes the image complexity value of the DC coefficients of the rightmost tile, complexityDC_(n1-1) denotes the image complexity value of the DC coefficients of the tile adjacent to the left of the rightmost tile, complexityLPHP_n1 denotes the image complexity value of the LPHP coefficients of the rightmost tile, complexityLPHP_(n1-1) denotes the image complexity value of the LPHP coefficients of the tile adjacent to the left of the rightmost tile, and tilewidth_n1 denotes the actual width of the rightmost tile.
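Formulas (37)-(40) can be sketched as follows (illustrative names; `tile_coeffs` is assumed to hold the PCT coefficients of every mapped 2 x 8 block of one tile, with the DC coefficient first in each block; the text's numeric example below implies an actual width of 100 for the rightmost tile).

```python
def tile_complexity(tile_coeffs):
    """Formulas (37)-(38): accumulate |DC| and |LPHP| over all blocks of one tile."""
    complexity_dc = sum(abs(block[0]) for block in tile_coeffs)
    complexity_lphp = sum(abs(c) for block in tile_coeffs for c in block[1:])
    return complexity_dc, complexity_lphp

def rightmost_tile_complexity(dc_prev, lphp_prev, tile_width, fixed_width=384):
    """Formulas (39)-(40): scale the left neighbour's complexity by the width ratio."""
    return dc_prev * tile_width // fixed_width, lphp_prev * tile_width // fixed_width

# Numeric example from the text: tile 2 has complexityDC = 146 and
# complexityLPHP = 100, and the rightmost tile is 100 pixels wide.
print(rightmost_tile_complexity(146, 100, 100))   # (38, 26)
```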
As shown in fig. 7b above, vertically dividing the image to be encoded by the fixed width 384 yields 3 tiles: tile 1, tile 2, and tile 3. The widths of tile 1 and tile 2 are both 384, and the width of tile 3 is less than 384.
The image complexity values for tile 1 and tile 2 can be obtained according to formulas (37) and (38) above, and the image complexity value for tile 3 can be obtained according to formulas (39) and (40) above. Wherein, the DC coefficient and the LPHP coefficient in the tile have been described above, and are not described herein for brevity.
Assuming that the sum of the absolute values of the DC coefficients in tile 1 is 100 and the sum of the absolute values of the lphp coefficients is 80; the sum of the absolute values of the DC coefficients in tile 2 is 146 and the sum of the absolute values of the lphp coefficients is 100. Namely:
(1) Tile 1
complexityDC_1 = \sum_{2x8 blocks in tile 1} |DC coefficient| = 100
complexityLPHP_1 = \sum_{2x8 blocks in tile 1} |LPHP coefficient| = 80
Then the image complexity value of the DC coefficients of tile 1 is 100, and the image complexity value of the LPHP coefficients of tile 1 is 80.
(2) Tile 2
complexityDC_2 = \sum_{2x8 blocks in tile 2} |DC coefficient| = 146
complexityLPHP_2 = \sum_{2x8 blocks in tile 2} |LPHP coefficient| = 100
Then the image complexity value of the DC coefficients of tile 2 is 146, and the image complexity value of the LPHP coefficients of tile 2 is 100.
(3) Tile 3
complexityDC_3 = complexityDC_2 * tilewidth_3 / 384 = 146 * 100 / 384 ≈ 38
complexityLPHP_3 = complexityLPHP_2 * tilewidth_3 / 384 = 100 * 100 / 384 ≈ 26
Then the image complexity value of the DC coefficients of tile 3 is 38, and the image complexity value of the LPHP coefficients of tile 3 is 26.
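As a minimal illustration of formulas (37) to (40), the following Python sketch reproduces the tile-complexity bookkeeping of this example; the names FIXED_WIDTH and estimate_last_tile are illustrative, and the per-tile coefficient sums are assumed to have been produced by the PCT stage already.

```python
# Illustrative sketch of formulas (37)-(40); the inputs are the sums of
# |DC| and |LPHP| coefficients per full-width tile, as defined above.

FIXED_WIDTH = 384

def estimate_last_tile(prev_complexity: float, last_tile_width: int) -> float:
    """Formulas (39)/(40): scale the left-neighbor tile's complexity by the width ratio."""
    return prev_complexity * last_tile_width / FIXED_WIDTH

# Worked example: tiles 1 and 2 are full-width, tile 3 is 100 pixels wide.
complexity_dc = [100.0, 146.0]
complexity_lphp = [80.0, 100.0]
last_width = 100

complexity_dc.append(estimate_last_tile(complexity_dc[-1], last_width))      # 146*100/384 ≈ 38
complexity_lphp.append(estimate_last_tile(complexity_lphp[-1], last_width))  # 100*100/384 ≈ 26

print([round(c) for c in complexity_dc])    # [100, 146, 38]
print([round(c) for c in complexity_lphp])  # [80, 100, 26]
```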
After the image complexity value of each tile is obtained, the target byte number of each tile may be calculated from the image complexity value, and the QP of the i-th tile is then updated based on the accumulated value of the target byte number of the i-th tile. The calculation of the target byte number of each tile from its image complexity value is described below.
According to the scheme provided by the embodiment of the application, the QP of the ith tile is updated according to the accumulated value of the target byte number of the ith tile included in the image to be coded, so that the coding efficiency can be improved, the hardware resource consumption can be reduced on the premise of not sacrificing the code rate control precision, and meanwhile, the flexibility of quantization parameters can be ensured to avoid the problem of uncontrollable output code rate.
Optionally, in some embodiments, the updating the QP for the ith tile according to the accumulated value of the target number of bytes for the ith tile includes: updating the QP of the ith tile according to the absolute value of a first difference value and a first threshold, wherein the first difference value is the difference value between the accumulated value of the target byte number of the ith tile and the accumulated value of the actual code byte number of the ith tile.
Similarly, the accumulated value of the actual coded byte number of the i-th tile in the embodiments of the present application may refer to the sum of the actual coded byte numbers of all of the first i tiles (including the i-th tile).
For example, the accumulated value of the actual coded byte number of the 2nd tile may refer to the sum of the actual coded byte numbers of the 1st and 2nd tiles; the accumulated value of the actual coded byte number of the 5th tile may refer to the sum of the actual coded byte numbers of the 1st to 5th tiles; and by analogy, the accumulated value of the actual coded byte number of the n1-th tile may refer to the sum of the actual coded byte numbers of the 1st to n1-th tiles.
The actual number of encoded bytes of the ith tile in the embodiment of the present application may be obtained by the encoder during the process of encoding the image to be encoded.
The first threshold in the embodiment of the present application may be fixed or may be continuously adjusted, which is not specifically limited in the present application.
In the embodiment of the application, the encoder may encode in units of tiles. Before image encoding starts, the initial QP_0 of the first tile may be calculated from the original size of the image and the target byte number of the current frame; thereafter, when encoding the current tile, its QP may be updated according to the difference between the accumulated value of the target byte number and the accumulated value of the actual coded byte number of the current tile.
The calculation of the target number of bytes for the current frame and the target number of bytes for each tile is as follows.
a. The target byte number of the current frame can be calculated by formula (41):
targetByte = width * height * bitdepth * m / (8 * compressratio)  (41)
Wherein targetByte represents the target byte number of the image (i.e., the current frame in the embodiment of the application), width represents the width of the image, height represents the height of the image, bitdepth represents the bit depth of the image, m represents the ratio of the number of all pixels to the number of luminance pixels (e.g., m is 2 when encoding an image in YUV422 format), and compressratio represents the image compression ratio.
b. The target byte number of each tile can be calculated by formula (42):
targetByte_i = targetByte * complexityLPHP_i / \sum_{j=1}^{n} complexityLPHP_j  (42)
Wherein n represents the total number of tiles into which the current image is divided, i represents the i-th tile, targetByte_i represents the target byte number of the i-th tile, complexityLPHP_i represents the image complexity of the LPHP coefficients of the i-th tile, and the summation represents the sum of the image complexities of the LPHP coefficients of all tiles in the current frame.
It is noted that not all of the image complexity information used in this embodiment serves as guidance information based on a priori data; for example, the QP of each tile is calculated below from the image complexity value of the LPHP coefficients.
Continuing with the description of fig. 7b, assuming that the image has a width of 868 (as above), a height of 100, a bit depth of 8, a compression ratio of 200, and a YUV422 encoding format, the target byte number of the current frame is:
targetByte = 868 * 100 * 8 * 2 / (8 * 200) = 868
The target byte number of each tile is:
targetByte_1 = 868 * 80 / 206 ≈ 337
targetByte_2 = 868 * 100 / 206 ≈ 421
targetByte_3 = 868 * 26 / 206 ≈ 109.5
after the target number of bytes for each tile is obtained, the difference between the cumulative value of the target number of bytes for each tile and the cumulative value of the actual number of code bytes may be calculated, and then the QP for each tile may be updated or calculated based on the obtained difference.
The first threshold in the embodiment of the present application may include a plurality of thresholds, for example two thresholds, which may be calculated by formulas (43) and (44), for example as:
threshold1 = targetByte / (n * alpha)  (43)
threshold2 = targetByte / (n * beta)  (44)
Wherein targetByte represents the target byte number of the image (i.e., the current frame in the embodiment of the present application), n represents the total number of tiles into which the current image to be encoded is divided, threshold1 and threshold2 represent threshold 1 and threshold 2, and alpha and beta are preset parameters; one setting may be alpha = 4 and beta = 16.
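The following sketch collects formulas (41) to (44) as written above; the function names are illustrative, and the form of (43)/(44) is the reconstruction given above (consistent with the worked example) rather than text taken verbatim from the original.

```python
# Hedged sketch of the byte-budget calculations, formulas (41)-(44).

def frame_target_bytes(width, height, bitdepth, m, compress_ratio):
    # Formula (41): source bits divided by 8 * compressratio gives target bytes.
    return width * height * bitdepth * m / (8 * compress_ratio)

def tile_target_bytes(frame_target, complexity_lphp):
    # Formula (42): split the frame byte budget in proportion to the
    # LPHP complexity of each tile.
    total = sum(complexity_lphp)
    return [frame_target * c / total for c in complexity_lphp]

def thresholds(frame_target, n_tiles, alpha=4, beta=16):
    # Formulas (43)/(44) as reconstructed: per-tile budget scaled down
    # by the preset parameters alpha and beta.
    return frame_target / (n_tiles * alpha), frame_target / (n_tiles * beta)

target = frame_target_bytes(868, 100, 8, m=2, compress_ratio=200)   # 868.0
per_tile = tile_target_bytes(target, [80, 100, 26])                 # ≈ [337, 421, 109.5]
thr1, thr2 = thresholds(target, 3)                                  # ≈ (72.3, 18.1)
```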
In addition, as noted above, the initial QP_0 of the first tile may be calculated first, and the QP of each tile is then updated or calculated based on the initial QP_0.
Since the image complexity can characterize the information amount of the original image content, and the target byte number represents the information amount after compression, it is reasonable and efficient to calculate the parameter QP, which characterizes the degree of compression, by establishing a mathematical model combining the complexity and the target byte number. In order to bring the complexity closer to the range of the target byte number, both are converted into the logarithmic domain when calculating the QP.
The initial QP can be calculated by formulas (45) to (49). In this embodiment, formula (45) normalizes the byte number to the data amount bpp per pixel and converts it into the logarithmic domain:
log(bpp) = log(targetByte / (width * height))  (45)
Wherein log denotes the natural logarithm and log10 denotes the base-10 logarithm. The complexity is converted into the logarithmic domain by formulas (46) and (47); since compression is a nonlinear process, in establishing the mathematical model the complexity is further converted into a term multiplicative with log(bpp) (formula (46)) and an additive term (formula (47)), so as to improve the fit of the model.
Wherein paramA, paramB, paramC, paramD are preset parameters; a typical configuration from a priori knowledge may be paramA = 0.02213, paramB = -24.32, paramC = 30.45, paramD = -76.78.
In this embodiment, QP is found using formula (48), built from the complexity-derived multiplicative term x, the additive term y, log(bpp), and the variable parameter z.
QP=x*log(bpp)+y+z (48)
Where z is a variable parameter, which may be updated later, and the initial value may be set to 0.
QP_0 = Clip3(minQP, maxQP, QP)  (49)
Wherein QP_0 in the formula is the final initial QP, and minQP and maxQP are the minimum and maximum values of QP, respectively; minQP may be not less than 0 and maxQP may be not more than 255. A typical configuration may be minQP = 5, maxQP = 150.
The meaning of formula (49) is that QP_0 takes one of the values in the brackets: if the QP calculated by formula (48) is between minQP and maxQP, QP_0 is QP; if it is smaller than minQP, QP_0 is minQP; and if it is greater than maxQP, QP_0 is maxQP.
Continuing with the example of fig. 7b above, the width and height of the image to be encoded are 868 and 100, respectively, so that bpp = targetByte / (width * height) = 868 / (868 * 100) = 0.01 and log(bpp) = log(0.01).
QP = x * log(bpp) + y + z = 147.47
QP_0 = Clip3(5, 150, 147.47)
The determined initial QP_0 of 147.47 can then be obtained.
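A minimal sketch of the initial-QP computation of formulas (45), (48) and (49) follows; since the exact form of formulas (46) and (47) is not reproduced here, x and y are passed in as parameters rather than derived from the complexity.

```python
import math

# Minimal sketch of formulas (45), (48) and (49); x and y are assumed
# to have been produced by formulas (46)/(47) from the image complexity.

def clip3(lo, hi, value):
    """Clip3(min, max, value): clamp value into [lo, hi], per formula (49)."""
    return max(lo, min(hi, value))

def initial_qp(target_bytes, width, height, x, y, z=0.0, min_qp=5, max_qp=150):
    log_bpp = math.log(target_bytes / (width * height))  # formula (45)
    qp = x * log_bpp + y + z                             # formula (48)
    return clip3(min_qp, max_qp, qp)                     # formula (49)

# With the example's targetByte = 868 and an 868x100 image, bpp = 0.01;
# the text reports QP_0 = 147.47 for its (unstated) x, y and z = 0.
```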
In the subsequent encoding process, the QP of tile 1 in the current frame may be updated based on the determined initial QP_0, the QP of tile 2 may then be calculated based on the updated QP of tile 1, and the QP of tile 3 may finally be calculated based on the QP of tile 2.
According to the scheme provided by the embodiment of the application, the QP of the tile included in the image to be coded is updated according to the absolute value of the first difference value and the first threshold value, so that the coding efficiency can be further improved, the hardware resource consumption can be reduced on the premise of not sacrificing the code rate control precision, and meanwhile, the flexibility of the quantization parameter can be ensured so as to avoid the problem of uncontrollable output code rate.
The encoding end may update the QP of the i-th tile according to the absolute value of the first difference and the first threshold, which will be described in detail below.
Optionally, in some embodiments, the updating the QP of the i-th tile according to the absolute value of the first difference and the first threshold includes: if the first difference is a positive value, taking the difference between the QP of the (i-1)-th tile and a first offset QP as the QP of the i-th tile; if the first difference is a negative value, taking the sum of the QP of the (i-1)-th tile and the first offset QP as the QP of the i-th tile; wherein the first offset QP is derived based on the absolute value of the first difference and the first threshold.
In the embodiment of the application, the QP of the i-th tile of the image to be encoded can be updated in combination with the sign of the first difference.
If the i-th tile is currently being encoded, the accumulated value of the target byte number and the accumulated value of the actual coded byte number of the i-th tile can be calculated according to formulas (50) and (51):
accTarBytes_i = \sum_{j=1}^{i} targetByte_j  (50)
accActBytes_i = \sum_{j=1}^{i} actualBytes_j  (51)
Wherein accTarBytes_i represents the accumulated value of the target byte number of the i-th tile, i.e., the sum of the target byte numbers of all of the first i tiles (including the i-th tile); accActBytes_i represents the accumulated value of the actual coded byte number of the i-th tile, i.e., the sum of the actual coded byte numbers of all of the first i tiles (including the i-th tile); and actualBytes_j denotes the actual coded byte number of the j-th tile.
The first difference is then calculated from formulas (50) and (51) as shown in formula (52), and the offset value of the QP is obtained by formulas (53) and (54):
deltaBytes_i = accTarBytes_i - accActBytes_i  (52)
QPoffset_i = |deltaBytes_i| > threshold1 ? offsetA : offsetB  (53)
offsetB = |deltaBytes_i| > threshold2 ? 1 : 0  (54)
Wherein deltaBytes_i represents the difference between the accumulated value of the target byte number of the i-th tile and the accumulated value of the actual coded byte number, i.e., the first difference; QPoffset_i represents the offset value of the QP of the i-th tile; and offsetA is a preset parameter, which may be set to 2.
The meaning of formula (53) is: if the absolute value |deltaBytes_i| of the difference between the accumulated target byte number and the accumulated actual coded byte number of the currently encoded i-th tile is greater than the preset threshold threshold1, the offset value QPoffset_i of the QP of the currently encoded i-th tile takes the value offsetA; otherwise it takes the value offsetB.
Similarly, the meaning of formula (54) is: if |deltaBytes_i| is greater than the preset threshold threshold2, offsetB takes the value 1; otherwise it takes the value 0.
The QP used by the i-th tile is then updated, with the range of QP still kept within [minQP, maxQP]. The QP of the i-th tile may be updated according to formulas (55) and (56):
QP_i = QP_{i-1} - sign(deltaBytes_i) * QPoffset_i  (55)
QP_i = Clip3(minQP, maxQP, QP_i)  (56)
Wherein QP_i represents the QP of the i-th tile, QP_{i-1} represents the QP of the (i-1)-th tile, and sign(deltaBytes_i) represents the sign of deltaBytes_i.
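The tile-level update of formulas (50) to (56) can be sketched as follows; offset_a = 2 follows the suggested setting above, the accumulated byte counts are passed in directly, and the worked example below (tiles 1 to 3 of fig. 7b) is reproduced at the end of the sketch.

```python
# Sketch of the tile-level QP update, formulas (52)-(56) as written above.

def sign(v):
    return (v > 0) - (v < 0)

def update_tile_qp(prev_qp, acc_target, acc_actual, thr1, thr2,
                   offset_a=2, min_qp=5, max_qp=150):
    delta = acc_target - acc_actual                    # formula (52)
    mag = abs(delta)
    offset_b = 1 if mag > thr2 else 0                  # formula (54)
    qp_offset = offset_a if mag > thr1 else offset_b   # formula (53)
    qp = prev_qp - sign(delta) * qp_offset             # formula (55)
    return max(min_qp, min(max_qp, qp))                # formula (56)

# Reproducing the worked example below (thr1 ≈ 72.3, thr2 ≈ 18.1):
qp1 = update_tile_qp(147.47, 337.0, 300.0, 72.3, 18.1)   # 146.47
qp2 = update_tile_qp(qp1, 758.0, 770.0, 72.3, 18.1)      # 146.47
qp3 = update_tile_qp(qp2, 867.5, 820.0, 72.3, 18.1)      # 145.47
```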
Continuing with the description of FIG. 7b above, QPs for tile 1, tile 2 and tile 3 are updated or calculated, respectively.
a. Calculation of QP for tile 1
First, the preset threshold 1 (threshold1) and the preset threshold 2 (threshold2) may be calculated according to formulas (43) and (44) above, respectively. Assuming alpha = 4 and beta = 16, then threshold1 = 868 / (3 * 4) ≈ 72.3 and threshold2 = 868 / (3 * 16) ≈ 18.1.
Next, as described above, assuming that the target byte number of tile 1 is 337 and the actual coded byte number of tile 1 is 300, the difference between the accumulated target byte number and the accumulated actual coded byte number of tile 1 obtained by formula (52) is deltaBytes_1 = accTarBytes_1 - accActBytes_1 = 337 - 300 = 37.
The offset value of the QP of tile 1 is calculated according to formulas (53) and (54). Since the difference 37 does not exceed the preset threshold 1 but is greater than the preset threshold 2, the offset value of the QP of tile 1 is 1.
The QP_1 of tile 1 may be updated according to the offset value of the QP of tile 1 and the sign of the first difference as described above: QP_1 = QP_0 - sign(deltaBytes_1) * QPoffset_1 = 147.47 - 1 = 146.47.
b. Calculation of QP for tile 2
As described above, the target byte number of tile 2 is 421. Assuming that the actual coded byte number of tile 2 is 470, the accumulated value of the target byte number of tile 2 is the sum of the target byte numbers of tile 1 and tile 2, i.e., 337 + 421 = 758; the accumulated value of the actual coded byte number of tile 2 is the sum of the actual coded byte numbers of tile 1 and tile 2, i.e., 300 + 470 = 770.
The difference between the accumulated target byte number and the accumulated actual coded byte number of tile 2 obtained by formula (52) is deltaBytes_2 = accTarBytes_2 - accActBytes_2 = 758 - 770 = -12.
The offset value of the QP of tile 2 is calculated according to formulas (53) and (54). Since the difference is -12, its absolute value 12 is less than the preset threshold 2, so the offset value of the QP of tile 2 is 0.
The QP_2 of tile 2 may be updated according to the offset value of the QP of tile 2 and the sign of the first difference: QP_2 = QP_1 - sign(deltaBytes_2) * QPoffset_2 = 146.47 - 0 = 146.47.
c. Calculation of QP for tile 3
As described above, the target byte number of tile 3 is 109.5. Assuming that the actual coded byte number of tile 3 is 50, the accumulated value of the target byte number of tile 3 is the sum of the target byte numbers of tiles 1, 2 and 3, i.e., 337 + 421 + 109.5 = 867.5; the accumulated value of the actual coded byte number of tile 3 is the sum of the actual coded byte numbers of tiles 1, 2 and 3, i.e., 300 + 470 + 50 = 820.
The difference between the accumulated target byte number and the accumulated actual coded byte number of tile 3 obtained by formula (52) is deltaBytes_3 = accTarBytes_3 - accActBytes_3 = 867.5 - 820 = 47.5.
The offset value of the QP of tile 3 is calculated according to formulas (53) and (54). Since the difference 47.5 does not exceed the preset threshold 1 but is greater than the preset threshold 2, the offset value of the QP of tile 3 is 1.
The QP_3 of tile 3 may be updated according to the offset value of the QP of tile 3 and the sign of the first difference: QP_3 = QP_2 - sign(deltaBytes_3) * QPoffset_3 = 146.47 - 1 = 145.47.
Optionally, in some embodiments, the target number of bytes of the i-th tile is related to the first information; the first information is at least one of the following information: the target byte number of the current frame, the transform coefficient of the i-th tile, or the transform coefficient of the current frame.
Optionally, in some embodiments, the target number of bytes of the current frame is related to second information; the second information is at least one of the following information: the width of the current frame, the height of the current frame, the bit depth of the current frame, the encoding format of the current frame or the image compression ratio of the current frame.
In the embodiment of the present application, the target byte number of the i-th tile may be related to the target byte number of the current frame, the transform coefficient of the i-th tile, or the transform coefficient of the current frame. For example, the target byte number of the i-th tile may be calculated by the above formula (42), and for details, reference may be made to the above formula (42), and for brevity, details will not be repeated here.
In the embodiment of the present application, the target byte number of the current frame may be related to the width of the current frame, the height of the current frame, the bit depth of the current frame, the encoding format of the current frame, or the image compression ratio of the current frame. For example, the target number of bytes of the current frame may be calculated by the above formula (41), and for details, reference may be made to the above formula (41), which is not repeated here for brevity.
Based on this, the calculation of the QP of the n1 tiles of the current frame has been described above; the calculation of the QP of the tiles in the target frame is described below.
Optionally, in some embodiments, the method further comprises: updating QP of n2 tiles in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of the second threshold.
The second difference value is a difference value between a target byte number of the current frame and an actual code byte number of the current frame, the target frame is a front x frame and/or a rear y frame of the current frame, x and y are positive integers greater than or equal to 1, and n2 is a positive integer greater than or equal to 2.
In the embodiment of the application, the QP of n2 tiles in the target frame can be updated according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of the second threshold value.
In the embodiment of the present application, if the current encoding method is forward prediction, the target frame may be the previous x frame of the current frame; if the current coding method is backward prediction, the target frame can be a backward y frame of the current frame; if the current encoding method is bi-directional prediction, the target frame may be a front x-frame and a rear y-frame of the current frame.
Similarly, in the embodiment of the present application, when the image to be encoded (the target frame) is divided, it may or may not be divided by a fixed width or a fixed height, similar to the above-described manner of dividing the current frame; this is not limited.
In the embodiment of the application, the image to be encoded may be divided horizontally or vertically.
It should be understood that n2 in the embodiments of the present application may be the same as or different from n1 in the above description, which is not particularly limited in the present application.
According to the scheme provided by the embodiment of the application, the QP of the tiles included in the target frame is updated through the ratio of the second difference (the difference between the target byte number of the current frame and the actual coded byte number) to the target byte number of the current frame, so that the coding efficiency can be further improved, and meanwhile the flexibility of the quantization parameter can be ensured so as to avoid the problem of an uncontrollable output code rate.
Optionally, in some embodiments, the updating the QP for n2 tiles in the target frame according to the ratio of the absolute value of the second difference to the target number of bytes of the current frame and the size of the second threshold includes:
if the second difference is a positive value, taking the difference between the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter; calculating QP of the n2 tiles according to the updated first parameters;
or,
if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter; and calculating QP of the n2 tiles according to the updated first parameters.
In the embodiment of the present application, to update the QP of the n2 tiles of the target frame, the initial QP_0 of the first tile of the target frame may be obtained by first updating the parameter z in formula (48) above (i.e., the first parameter in the present application) and then calculating the initial QP_0 of the target frame based on the updated parameter z.
First, the actual coded byte number of the current frame can be calculated by formula (57):
accActBytes_{n1} = \sum_{i=1}^{n1} actualBytes_i  (57)
Wherein accActBytes_{n1} represents the actual coded byte number of the current frame, and n1 represents the number of tiles into which the current frame is divided.
The difference between the target byte number of the current frame and the actual coded byte number is calculated according to formula (58):
deltaBytes_{n1} = targetByte_{n1} - accActBytes_{n1}  (58)
Wherein deltaBytes_{n1} represents the difference between the target byte number of the current frame and the actual coded byte number, targetByte_{n1} represents the target byte number of the current frame, and accActBytes_{n1} represents the actual coded byte number of the current frame.
The updated parameter z is determined according to formulas (59) to (61), which may take the form:
delta z = |deltaBytes_{n1}| / targetByte_{n1} > deltaThres0 ? tmpOffset1 : 0  (59)
tmpOffset1 = |deltaBytes_{n1}| / targetByte_{n1} > deltaThres1 ? tmpOffset2 : deltaOffset1  (60)
tmpOffset2 = |deltaBytes_{n1}| / targetByte_{n1} > deltaThres2 ? deltaOffset3 : deltaOffset2  (61)
Wherein delta z represents the difference between the parameter z before and after updating; deltaThres0, deltaThres1 and deltaThres2 in formulas (59) to (61) are preset threshold coefficients, for example deltaThres0 = 0.02, deltaThres1 = 0.05, deltaThres2 = 0.1; and tmpOffset1 and tmpOffset2 are temporary variables.
deltaOffset1, deltaOffset2 and deltaOffset3 in formulas (60) and (61) are preset offset values, and may be preset, for example, to deltaOffset1 = 1, deltaOffset2 = 2, deltaOffset3 = 3.
After delta z is obtained, the updated parameter z may be calculated by formula (62):
z_new = z - sign(deltaBytes_{n1}) * delta z  (62)
Wherein z_new represents the value of the parameter z after updating, z represents the value of the parameter z before updating, and sign(deltaBytes_{n1}) represents the sign of deltaBytes_{n1}.
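The frame-level update of the parameter z, formulas (57) to (62), can be sketched as follows; the cascade of strict comparisons is the reconstruction of formulas (59) to (61) given above, consistent with the worked example that follows.

```python
# Sketch of the frame-level parameter-z update, formulas (57)-(62);
# the comparison cascade is an assumption consistent with the example below.

def sign(v):
    return (v > 0) - (v < 0)

def update_z(z, frame_target, frame_actual,
             d_thres=(0.02, 0.05, 0.1), d_offsets=(1, 2, 3)):
    delta = frame_target - frame_actual        # formula (58)
    ratio = abs(delta) / frame_target
    if ratio > d_thres[2]:
        dz = d_offsets[2]                      # deltaOffset3
    elif ratio > d_thres[1]:
        dz = d_offsets[1]                      # deltaOffset2
    elif ratio > d_thres[0]:
        dz = d_offsets[0]                      # deltaOffset1
    else:
        dz = 0
    return z - sign(delta) * dz                # formula (62)

# Worked example below: target 867.5, actual 820 -> ratio ≈ 0.055, dz = 2.
print(update_z(0.0, 867.5, 820.0))  # -2
```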
Still taking fig. 7b above as an example, the parameter z is updated as follows.
The actual coded byte number of the current frame obtained by formula (57) above is accActBytes_3 = 300 + 470 + 50 = 820. As described above, the target byte number of the current frame is 867.5.
The difference between the target byte number of the current frame and the actual coded byte number obtained by formula (58) above is deltaBytes_3 = targetByte_3 - accActBytes_3 = 867.5 - 820 = 47.5.
After the difference between the target byte number of the current frame and the actual coded byte number is obtained, the ratio of its absolute value to the target byte number may be calculated: 47.5 / 867.5 ≈ 0.055. By formula (59), since this ratio is greater than the preset threshold coefficient deltaThres0 = 0.02, delta z = tmpOffset1; further, by formula (60), since the ratio is greater than deltaThres1 = 0.05, tmpOffset1 = tmpOffset2; and by formula (61), since the ratio is less than deltaThres2 = 0.1, tmpOffset2 = deltaOffset2. With deltaOffset2 = 2, the difference between the parameter z before and after updating is 2.
The updated parameter is then calculated by formula (62) above: z_new = z - sign(deltaBytes_3) * delta z = 0 - 2 = -2. When the initial QP_0 of the first tile of the target frame is subsequently calculated, the updated parameter z_new is used in the calculation.
In other words, in the process of encoding the target frame, the value of the parameter z in formula (48) above is replaced by the updated value z_new, i.e., the initial QP_0 of the first tile of the target frame is calculated with z = -2. The initial QP of the first tile of the target frame may then be updated based on the image complexity information of the first tile to obtain QP_1; the QP_2 of the second tile is calculated based on the updated QP_1 of the first tile; the QP_3 of the third tile is calculated based on the QP_2 of the second tile; and so on, until the QP calculation of all tiles of the target frame is complete.
According to the scheme provided by the embodiment of the application, the updated first parameter is determined by combining the sign of the second difference value, and the QP of the tile included in the target frame is updated based on the updated first parameter, so that the coding efficiency can be further improved.
Fig. 10 is a schematic diagram of an encoding method 1000 according to another embodiment of the present application. The encoding method 1000 may include steps 1010-1060. The encoding method provided by the embodiment of the present application is summarized below in conjunction with fig. 10.
At 1010, a target number of bytes for the current frame and a target number of bytes for each tile are calculated.
1020, determining if the first tile is currently encoded.
If the first tile is currently encoded, then step 1030 is performed; if the first tile is not currently encoded, step 1040 is performed.
1030, the initial QP for the first tile is calculated.
And 1040, performing intra-frame QP updating of the current frame.
1050, determine if the last tile of the current frame is encoded.
If yes, go to step 1060, if not, go back to step 1020.
1060, frame level parameters are updated.
Details of steps 1010-1060 are described above with respect to fig. 6-9, and are not repeated here for brevity.
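Chaining the sketches above, the overall flow of fig. 10 may be summarized as follows; encode_frame and encode_tile are illustrative names, with encode_tile standing in for the actual JPEG XR tile encoder and only needing to return the number of bytes actually produced for a tile.

```python
# High-level sketch of the flow of fig. 10 (steps 1010-1060), reusing the
# helper sketches above (frame_target_bytes, tile_target_bytes, thresholds,
# initial_qp, update_tile_qp, update_z).

def encode_frame(tile_complexities, width, height, bitdepth, m,
                 compress_ratio, x, y, z, encode_tile):
    n = len(tile_complexities)
    target = frame_target_bytes(width, height, bitdepth, m, compress_ratio)  # step 1010
    per_tile = tile_target_bytes(target, tile_complexities)                  # step 1010
    thr1, thr2 = thresholds(target, n)
    qp = initial_qp(target, width, height, x, y, z)       # step 1030: first tile
    acc_target = acc_actual = 0.0
    for i in range(n):                                    # steps 1020/1050 loop
        acc_actual += encode_tile(i, qp)                  # encode tile i with current QP
        acc_target += per_tile[i]
        if i < n - 1:                                     # step 1040: intra-frame QP update
            qp = update_tile_qp(qp, acc_target, acc_actual, thr1, thr2)
    return update_z(z, target, acc_actual)                # step 1060: frame-level parameter
```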
In the above, only single-channel encoding of the image to be encoded has been described. In some embodiments, the image to be encoded may be multi-channel encoded; in this case, acquiring the image complexity information of each tile in the current frame may involve acquiring the image complexity information of a plurality of components, as described in more detail below.
Optionally, in some embodiments, if the current frame is multi-channel coded, the acquiring the image complexity information of each tile of the n1 tiles in the current frame includes: respectively acquiring image complexity information of components of each tile in the n1 tiles, wherein the components comprise a luminance component and/or at least one chrominance component of the current frame; updating QP for n1 tiles according to the image complexity information, including: the QP for the n1 tiles is updated according to the image complexity information for at least one of the components.
The multi-channel coding in the embodiment of the application can comprise YUV coding or RGB coding, without limitation.
YUV refers to a pixel format in which the luminance parameter and the chrominance parameters are represented separately; YUV encoding formats may include YUV444, YUV422, YUV420 and YUV411. YUV422 is taken as an example below.
Y component: the image complexity information of the tiles of the Y component may be obtained as described above.
UV component: since the YUV encoding format is YUV422, the number of UV samples may be half the number of pixel values of the image to be encoded, and the image complexity information of the tile is acquired based on these samples.
In the embodiment of the application, the image complexity information of the Y component and the UV component of each tile in the n1 tiles can be respectively acquired, and the QP of the n1 tiles can be updated based on the average value of the image complexity information of the Y component and the image complexity information of the UV component; the QP for n1 tiles may also be updated based on the image complexity information for the Y component; the QP for n1 tiles may also be updated based on the image complexity information for the UV component; and are not limited.
It should be appreciated that updating the QP of n1 tiles based on the image complexity information of at least one of the components is not limited to the above-listed, but may be in other manners, such as the image complexity information of the Y component and the root mean square value of the image complexity information of the UV component, etc., which is not particularly limited in this regard.
According to the scheme provided by the embodiment of the application, the precision of the quantization parameter can be improved by updating the QP of the tile according to the acquired image complexity information of at least one component in the image complexity information of the components of each tile.
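As an illustration, the alternatives listed above for combining per-component complexity (Y only, UV only, mean, or root mean square) can be sketched as follows; the mode parameter and its names are hypothetical and not part of the described scheme.

```python
import math

# Illustrative sketch of combining per-component complexity values for a
# multi-channel (e.g., YUV422) image; all mode names are assumptions.

def combine_complexity(y_complexity, uv_complexity, mode="mean"):
    if mode == "y_only":
        return y_complexity
    if mode == "uv_only":
        return uv_complexity
    if mode == "rms":
        # Root mean square of the two component complexities.
        return math.sqrt((y_complexity ** 2 + uv_complexity ** 2) / 2)
    return (y_complexity + uv_complexity) / 2   # arithmetic mean

print(combine_complexity(80.0, 40.0))           # 60.0
print(combine_complexity(80.0, 40.0, "rms"))    # ≈ 63.25
```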
Optionally, in some embodiments, the acquiring the image complexity information of each tile of the n1 tiles in the current frame includes: acquiring a plurality of transformation parameters of each tile; and selecting one parameter from the plurality of transformation parameters as the image complexity information of each tile.
In the embodiment of the application, one parameter can be selected from the acquired multiple transformation parameters to calculate the image complexity information of each tile. In the foregoing, the image complexity information of each tile is calculated based on the LPHP coefficient of the tile, in some embodiments, the image complexity information of each tile may also be calculated according to the DC coefficient of the tile, or the image complexity information of each tile may also be calculated according to the average value of the LPHP coefficient and the DC coefficient of the tile, which is not limited in detail in the present application.
Optionally, in some embodiments, the method 600 may further include:
If the updated QP of the n1 tiles is less than a third threshold, taking the third threshold as the updated QP of the n1 tiles; or,
if the updated QP of the n1 tiles is greater than a fourth threshold, taking the fourth threshold as the updated QP of the n1 tiles;
wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
In the embodiment of the present application, in the process of encoding the image to be encoded, a minimum value (i.e., a third threshold) and a maximum value (i.e., a fourth threshold) of the QP may be set. The third threshold and/or the fourth threshold in the embodiments of the present application may be fixed, or may be continuously adjusted, and are not limited.
If the QP of the tile calculated by the above formula is less than the minimum value, the QP of the tile may be calculated subsequently based on the set minimum value; if the QP for the tile calculated by the above formula is greater than the maximum value, the QP for the tile may be calculated subsequently based on the set maximum value.
Illustratively, as described above, the minimum and maximum values of QP may be set to 5 and 150, respectively, and if the QP of a tile calculated by the above formula is any one of values 5 to 150, the next tile may be encoded based on the calculated QP; if the QP for a tile calculated by the above formula is 3, then the next tile can be encoded based on 5; if the QP for a tile calculated by the above formula is 160, then the next tile can be encoded based on 150.
Optionally, in some embodiments, the encoding method is applied in a JPEG XR encoding format.
As described above, the JPEG XR encoding format is a continuous tone still image compression algorithm and file format that can support lossy data compression as well as lossless data compression.
The JPEG XR encoding format has certain advantages over the JPEG encoding format.
First, JPEG uses 8-bit encoding, realizing 256 levels per channel, while JPEG XR can use 16 bits or more, providing better results and more editing flexibility.
Second, the JPEG XR encoding format uses a more efficient compression algorithm: compared with a JPEG file of the same size, the image quality may be twice as high, or the same quality may require only half the file size. Moreover, unlike JPEG, the highest-quality compression of JPEG XR may not lose any information.
Fig. 11 is a schematic diagram of an encoding method 1100 according to another embodiment of the present application, where the encoding method 1100 may include steps 1110-1130.
1110, obtaining image complexity information of a current frame, where the image complexity information includes a transform coefficient obtained by performing an image kernel transform (PCT) on a pixel value of the current frame.
In the embodiment of the present application, the PCT processing flow may refer to the descriptions of the above formulas (2) to (36), and for brevity, the description is omitted here.
And 1120, determining an initial quantization parameter (initial QP) of the current frame according to the image complexity information.
In the embodiment of the present application, the initial QP of the current frame may be determined according to the image complexity information; the initial QP may be determined according to formulas (45) to (49) above, which are not described here again for brevity.
It should be noted that, since the encoding method 1100 is performed in units of frames, the initial QP in the embodiment of the present application is the initial QP of the current frame; the encoding method 600 above is performed in units of tiles, and thus the initial QP in the above embodiment is the initial QP for the first tile in the current frame.
1130, updating an initial QP of a target frame according to the initial QP of the current frame, wherein the target frame is a front x frame and/or a rear y frame of the current frame, and x and y are positive integers greater than or equal to 1.
In the embodiment of the present application, if the current encoding method is forward prediction, the target frame may be the previous x frame of the current frame; if the current coding method is backward prediction, the target frame can be a backward y frame of the current frame; if the current encoding method is bi-directional prediction, the target frame may be a front x-frame and a rear y-frame of the current frame.
According to the scheme provided by the embodiment of the application, the image complexity information is related to the transform coefficients of the current frame, which are obtained by performing PCT processing on the pixel values in the current frame. By determining the initial QP of the current frame according to the image complexity information and updating the initial QP of the target frame accordingly, the coding efficiency can be improved and the hardware resource consumption can be reduced without sacrificing rate-control precision, while the flexibility of the quantization parameter can be ensured to avoid the problem of an uncontrollable output code rate.
Optionally, in some embodiments, the updating the initial QP of the target frame based on the initial QP of the current frame includes: updating an initial QP in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of a second threshold; the second difference value is a difference value between the target byte number of the current frame and the actual code byte number of the current frame.
The second threshold in the embodiment of the present application may be a fixed value, or may be continuously adjusted, which is not specifically limited in the present application.
Optionally, in some embodiments, the updating the initial QP for the target frame according to the ratio of the absolute value of the second difference to the target number of bytes for the current frame and the size of the second threshold includes:
If the second difference value is a positive value, taking the difference value between the first parameter and the offset parameter in the initial QP for calculating the current frame as an updated first parameter;
calculating an initial QP of the target frame according to the updated first parameter;
or,
if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the initial QP calculation for the current frame as the updated first parameter;
and calculating the initial QP of the target frame according to the updated first parameter.
The first parameter in the embodiment of the present application may be the parameter z in the above formula (48), in the process of calculating the initial QP of the current frame, the initial value of the parameter z may be set to 0, and then, when calculating the initial QP of the target frame, the parameter z in the formula (48) may be updated according to the image complexity information of the current frame.
Specifically, the difference between the number of target bytes of the current frame and the number of actual code bytes may be calculated first, the parameter z in the formula (48) is updated based on the ratio of the absolute value of the difference to the number of target bytes of the current frame and the size of the second threshold, and then the initial QP of the target frame is updated based on the updated parameter z and the image complexity information of the target frame, and for details, reference may be made to the above formulas (57) to (62) and formulas (45) to (49), which are not repeated herein for brevity.
According to the scheme provided by the embodiment of the application, the updated first parameter is determined by combining the sign of the second difference value, and the QP of the target frame is updated based on the updated first parameter, so that the coding efficiency can be further improved.
In the above, only single-channel encoding of the image to be encoded has been described. In some embodiments, the image to be encoded may be multi-channel encoded; in this case, obtaining the image complexity information of the current frame may involve obtaining the image complexity information of a plurality of components, as described in detail below.
Optionally, in some embodiments, if the current frame is multi-channel coded, the obtaining the image complexity information of the current frame includes: respectively acquiring image complexity information of components of the current frame, wherein the components comprise a brightness component and/or at least one chromaticity component of the current frame; updating the initial QP of the target frame based on the initial QP of the current frame, including: the initial QP for the target frame is updated according to the image complexity information for at least one of the components.
The multi-channel coding in the embodiment of the application can comprise YUV coding or RGB coding, without limitation.
YUV refers to a pixel format in which the luminance parameter and the chrominance parameters are represented separately; YUV encoding formats may include YUV444, YUV422, YUV420 and YUV411. YUV422 is taken as an example below.
Y component: the image complexity information of the Y component may be obtained as described above.
UV component: since the YUV encoding format is YUV422, the number of UV samples may be half the number of pixel values of the image to be encoded, and the image complexity information is acquired based on these samples.
Assuming that the image complexity information of the Y component and the UV component of the current frame is acquired respectively, the initial QP of the target frame may be updated based on the average value of the image complexity information of the Y component and the image complexity information of the UV component; the initial QP for the target frame may also be updated based on the image complexity information for the Y component; the initial QP for the target frame may also be updated based on the image complexity information for the UV component; and are not limited.
It should be appreciated that updating the initial QP of the target frame based on the image complexity information of at least one of the components is not limited to the above-listed one, but may be in other manners, for example, the image complexity information of the Y component and the root mean square value of the image complexity information of the UV component, etc., which are not particularly limited in the present application.
According to the scheme provided by the embodiment of the application, the accuracy of the quantization parameter can be improved by updating the initial QP of the target frame according to the acquired image complexity information of at least one component in the image complexity information of the components of the current frame.
Optionally, in some embodiments, the acquiring the image complexity information of the current frame includes: acquiring a plurality of transformation parameters of the current frame; and selecting one parameter from the plurality of transformation parameters as image complexity information of the current frame.
In the embodiment of the application, one parameter can be selected from the acquired multiple transformation parameters to calculate the image complexity information of the current frame. In the above description, the image complexity information of the current frame may be calculated based on the LPHP coefficients of the current frame; in some embodiments, it may also be calculated based on the DC coefficients of the current frame, or based on the average value of the LPHP coefficients and the DC coefficients of the current frame, which is not particularly limited in the present application.
Optionally, in some embodiments, the method may further comprise: if the updated initial QP of the target frame is smaller than a third threshold, taking the third threshold as the updated initial QP of the target frame; or, if the updated initial QP of the target frame is greater than a fourth threshold, taking the fourth threshold as the updated initial QP of the target frame; wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
In the embodiment of the present application, in the process of encoding the image to be encoded, the minimum value (i.e., the third threshold) and the maximum value (i.e., the fourth threshold) of the initial QP may be set.
If the initial QP of the current frame calculated by the above formula is less than the minimum value, then the initial QP of the target frame may be calculated based on the set minimum value; if the initial QP of the current frame calculated by the above formula is greater than the maximum value, the initial QP of the target frame may be calculated based on the set maximum value later.
Illustratively, as described above, the minimum and maximum values of the initial QP set may be 5 and 150, respectively, and if the initial QP of the current frame calculated by the above formula is any one of values 5 to 150, the target frame may be encoded based on the calculated initial QP; if the initial QP of the current frame calculated by the above formula is 3, the target frame may be encoded based on 5; if the initial QP for the current frame calculated by the above equation is 160, the target frame may be encoded based on 150.
Optionally, in some embodiments, the encoding method is applied in a JPEG XR encoding format.
As described above, the JPEG XR encoding format is a continuous tone still image compression algorithm and file format that can support lossy data compression as well as lossless data compression.
The JPEG XR encoding format has certain advantages over the JPEG encoding format.
First, JPEG uses 8-bit encoding, realizing 256 levels per channel, while JPEG XR can use 16 bits or more, providing better results and more editing flexibility.
Second, the JPEG XR encoding format uses a more efficient compression algorithm: compared with a JPEG file of the same size, the image quality may be twice as high, or the same quality may require only half the file size. Moreover, unlike JPEG, the highest-quality compression of JPEG XR may not lose any information.
The method embodiments of the present application are described in detail above with reference to fig. 1 to 11, and the apparatus embodiments of the present application are described below with reference to fig. 12 to 17, where the apparatus embodiments and the method embodiments correspond to each other, and thus portions that are not described in detail can refer to the method embodiments of the previous portions.
Fig. 12 is a schematic block diagram of an encoding apparatus 1200 according to an embodiment of the present application, where the encoding apparatus 1200 may include a complexity calculating module 1210 and a rate control module 1220.
The complexity calculation module 1210 is configured to obtain image complexity information of each tile of n1 tiles in the current frame, where the image complexity information includes a transform coefficient obtained by performing an image kernel transform (PCT process) on a pixel value of each tile, and n1 is a positive integer greater than or equal to 2.
A rate control module 1220 is configured to update Quantization Parameters (QP) of the n1 tiles according to the image complexity information.
Fig. 13a is a schematic block diagram of a JPEG XR encoder according to an embodiment of the present application. The schematic diagram may include a filtering module 410, a transformation module 420, a quantization module 430, a prediction module 440, an entropy encoding module 450, a complexity calculation module 460, and a rate control module 470.
The functions of the filtering module 410, the transforming module 420, the quantizing module 430, the predicting module 440, and the entropy encoding module 450 are similar to those of the modules shown in fig. 2.
The complexity calculating module 460 and the code rate controlling module 470 may be the complexity calculating module 1210 and the code rate controlling module 1220 in the embodiment of the present application, and may implement updating of QP of a tile in the embodiment of the present application.
The complexity calculating module 460 may obtain the image complexity information and output the image complexity information to the rate controlling module 470, and meanwhile, the rate controlling module 470 may receive the size of the actual code stream as input to update the rate control parameter, that is, to realize the updating of the QP.
Fig. 13b is a schematic block diagram of a JPEG XR encoder according to another embodiment of the present application. A filtering module 410, a transformation module 420, a quantization module 430, a prediction module 440, an entropy coding module 450, a complexity calculation module 460, and a rate control module 470 may also be included in the diagram.
Unlike fig. 13b, in fig. 13a the complexity calculation module 460 is located in the processor while the rate control module 470 is located in the JPEG XR encoder; in fig. 13b, both modules are located in the JPEG XR encoder.
Both JPEG XR encoders shown in fig. 13a and fig. 13b can implement the updating of the tile QP according to the embodiment of the present application. The difference is that, in the encoder shown in fig. 13a, the complexity calculation of the current image to be encoded is completed before its encoding starts, so encoding can only begin after the complexity calculation finishes, which slightly increases the encoding time compared with the encoder shown in fig. 13b; in the encoder shown in fig. 13b, the complexity information is obtained during encoding, so its use is delayed by one frame, and the first frame has no a priori information as input.
Optionally, in some embodiments, the code rate control module 1220 is further configured to: calculate an accumulated value of the target byte number of the i-th tile in the n1 tiles according to the image complexity information, wherein i is a positive integer less than or equal to n1; and update the QP of the i-th tile according to the accumulated value of the target byte number of the i-th tile.
Optionally, in some embodiments, the code rate control module 1220 is further configured to: updating the QP of the ith tile according to the absolute value of a first difference value and a first threshold, wherein the first difference value is the difference value between the accumulated value of the target byte number of the ith tile and the accumulated value of the actual code byte number of the ith tile.
Optionally, in some embodiments, the code rate control module 1220 is further configured to: if the first difference value is a positive value, taking the difference value between the QP of the ith-1 tile and the first offset QP as the QP of the ith tile; if the first difference value is a negative value, taking the sum of the QP of the i-1 th tile and the first offset QP as the QP of the i-th tile; wherein the first offset QP is derived based on an absolute value of the first difference and the first threshold.
Optionally, in some embodiments, the target number of bytes of the i-th tile is related to the first information; the first information is at least one of the following information: the target byte number of the current frame, the transform coefficient of the i-th tile, or the transform coefficient of the current frame.
Optionally, in some embodiments, the target number of bytes of the current frame is related to second information; the second information is at least one of the following information: the width of the current frame, the height of the current frame, the bit depth of the current frame, the encoding format of the current frame or the image compression ratio of the current frame.
Optionally, in some embodiments, the code rate control module 1220 is further configured to: updating QP of n2 tiles in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of a second threshold; the second difference value is a difference value between a target byte number of the current frame and an actual code byte number of the current frame, the target frame is a front x frame and/or a rear y frame of the current frame, x and y are positive integers greater than or equal to 1, and n2 is a positive integer greater than or equal to 2.
Optionally, in some embodiments, the code rate control module 1220 is further configured to: if the second difference is a positive value, taking the difference between the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter; calculating QP of the n2 tiles according to the updated first parameters; or if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the QP for calculating the n1 tiles as the updated first parameter; and calculating QP of the n2 tiles according to the updated first parameters.
Optionally, in some embodiments, if the current frame is multi-channel coded, the complexity calculation module 1210 is further configured to: respectively acquiring image complexity information of components of each tile in the n1 tiles, wherein the components comprise a luminance component and/or at least one chrominance component of the current frame; the rate control module 1220 is further configured to: updating the QP of the n1 tiles according to the image complexity information of at least one of the components.
Optionally, in some embodiments, the complexity calculation module 1210 is further configured to: acquiring a plurality of transformation parameters of each tile; and selecting one parameter from the plurality of transformation parameters as the image complexity information of each tile.
Optionally, in some embodiments, the code rate control module 1220 is further configured to: if the QP of the updated n1 tiles is less than a third threshold, taking the third threshold as the QP of the updated n1 tiles; or, if the QP of the updated n1 tiles is greater than a fourth threshold, using the fourth threshold as the QP of the updated n1 tiles; wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
Alternatively, in some embodiments, the encoding apparatus 1200 is applied in a JPEG XR encoding format.
Fig. 14 is a schematic block diagram of an encoding apparatus 1400 according to another embodiment of the present application, where the encoding apparatus 1400 may include a complexity calculating module 1410 and a rate controlling module 1420.
The complexity calculation module 1410 is configured to acquire image complexity information of a current frame, where the image complexity information includes a transform coefficient obtained by performing an image kernel transform (PCT) on the pixel values of the current frame.
A rate control module 1420 is configured to determine an initial quantization parameter (initial QP) for the current frame based on the image complexity information.
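The text does not give the mapping from complexity information to the initial QP; the sketch below assumes a simple monotone mapping in which larger PCT coefficients (a busier image) yield a higher initial QP so the frame still fits its byte budget. The logarithmic form, base_qp, and scale are all illustrative assumptions.

```python
import math

def initial_qp(frame_complexity: float,
               base_qp: int = 30,
               scale: float = 4.0) -> int:
    """Map a frame's PCT-based complexity measure to an initial QP."""
    return base_qp + int(scale * math.log2(1.0 + frame_complexity))
```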
The code rate control module 1420 is further configured to: update the initial QP of a target frame according to the initial QP of the current frame, wherein the target frame comprises the x frames preceding and/or the y frames following the current frame, and x and y are positive integers greater than or equal to 1.
The complexity calculation module 460 and the rate control module 470 in fig. 13b may serve as the complexity calculation module 1210 and the rate control module 1220 of the embodiments of the present application, and may implement the updating of the QP of the target frame described in the embodiments of the present application.
Optionally, in some embodiments, the code rate control module 1420 is further configured to: update the initial QP of the target frame according to the ratio of the absolute value of the second difference to the target byte number of the current frame and the size of a second threshold; the second difference is the difference between the target byte number of the current frame and the actual code byte number of the current frame.
Optionally, in some embodiments, the code rate control module 1420 is further configured to: if the second difference is a positive value, take the difference between the first parameter used in calculating the initial QP of the current frame and the offset parameter as the updated first parameter, and calculate the initial QP of the target frame according to the updated first parameter; or, if the second difference is a negative value, take the sum of the first parameter used in calculating the initial QP of the current frame and the offset parameter as the updated first parameter, and calculate the initial QP of the target frame according to the updated first parameter.
Optionally, in some embodiments, if the current frame is multi-channel coded, the complexity calculation module 1410 is further configured to: respectively acquire image complexity information of the components of the current frame, wherein the components include a luminance component and/or at least one chrominance component of the current frame; the code rate control module 1420 is further configured to: update the initial QP of the target frame according to the image complexity information of at least one of the components.
Optionally, in some embodiments, the complexity calculation module 1410 is further configured to: acquire a plurality of transform parameters of the current frame, and select one of the plurality of transform parameters as the image complexity information of the current frame.
Optionally, in some embodiments, the code rate control module 1420 is further configured to: if the updated initial QP of the target frame is less than a third threshold, take the third threshold as the updated initial QP of the target frame; or, if the updated initial QP of the target frame is greater than a fourth threshold, take the fourth threshold as the updated initial QP of the target frame; wherein the third threshold is the minimum QP used for encoding and the fourth threshold is the maximum QP used for encoding.
Optionally, in some embodiments, the encoding apparatus 1400 is applied in a JPEG XR encoding format.
Fig. 15 illustrates an encoding apparatus 1500 according to another embodiment of the present application, wherein the encoding apparatus 1500 may include a processor 1510.
A processor 1510, configured to: acquire image complexity information of each tile of n1 tiles in a current frame, wherein the image complexity information includes a transform coefficient obtained by performing an image kernel transform (PCT) on the pixel values of each tile, and n1 is a positive integer greater than or equal to 2; and update the quantization parameters (QP) of the n1 tiles according to the image complexity information.
Optionally, in some embodiments, the processor 1510 is further configured to: calculate an accumulated value of the target byte number of the i-th tile in the n1 tiles according to the image complexity information, wherein i is a positive integer less than or equal to n1; and update the QP of the i-th tile according to the accumulated value of the target byte number of the i-th tile.
Optionally, in some embodiments, the processor 1510 is further configured to: update the QP of the i-th tile according to the absolute value of a first difference and a first threshold, wherein the first difference is the difference between the accumulated value of the target byte number of the i-th tile and the accumulated value of the actual code byte number of the i-th tile.
Optionally, in some embodiments, the processor 1510 is further configured to: if the first difference is a positive value, take the difference between the QP of the (i-1)-th tile and the first offset QP as the QP of the i-th tile; if the first difference is a negative value, take the sum of the QP of the (i-1)-th tile and the first offset QP as the QP of the i-th tile; wherein the first offset QP is derived based on the absolute value of the first difference and the first threshold.
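Taken together, the three paragraphs above suggest a per-tile update loop of roughly the following shape. The derivation of the first offset QP from the absolute difference and the first threshold is not spelled out in the text, so the step rule below (one QP step per multiple of the threshold, capped) is an assumption.

```python
def update_tile_qp(prev_tile_qp: int,
                   cum_target_bytes: int,   # accumulated target bytes of tiles 1..i
                   cum_actual_bytes: int,   # accumulated coded bytes of tiles 1..i
                   first_threshold: int,
                   max_offset: int = 2) -> int:
    """Derive the QP of the i-th tile from the QP of the (i-1)-th tile."""
    first_diff = cum_target_bytes - cum_actual_bytes
    # Assumed derivation of the first offset QP from |first_diff| and the threshold.
    first_offset_qp = min(abs(first_diff) // first_threshold, max_offset)
    if first_diff > 0:
        # Under budget so far: lower the QP (positive-difference branch).
        return prev_tile_qp - first_offset_qp
    # Over budget so far: raise the QP (negative-difference branch).
    return prev_tile_qp + first_offset_qp
```

The returned value would then be clamped to the encoder's QP range as sketched earlier.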
Optionally, in some embodiments, the target number of bytes of the i-th tile is related to the first information; the first information is at least one of the following information: the target byte number of the current frame, the transform coefficient of the i-th tile, or the transform coefficient of the current frame.
Optionally, in some embodiments, the target number of bytes of the current frame is related to second information; the second information is at least one of the following information: the width of the current frame, the height of the current frame, the bit depth of the current frame, the encoding format of the current frame or the image compression ratio of the current frame.
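The two relations above leave the exact formulas open. One reading consistent with them, sketched below, is that the frame budget follows from the raw image size divided by the compression ratio, and each tile receives a share of the frame budget proportional to its transform coefficient; both formulas are assumptions for illustration.

```python
def frame_target_bytes(width: int, height: int, bit_depth: int,
                       num_components: int, compression_ratio: float) -> int:
    """Frame byte budget derived from the 'second information'."""
    raw_bytes = width * height * num_components * bit_depth // 8
    return int(raw_bytes / compression_ratio)

def tile_target_bytes(frame_budget: int,
                      tile_coeff: float,
                      frame_coeff: float) -> int:
    """Tile byte budget derived from the 'first information': a share of the
    frame budget proportional to the tile's transform coefficient."""
    return int(frame_budget * tile_coeff / frame_coeff)
```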
Optionally, in some embodiments, the processor 1510 is further configured to: update the QP of n2 tiles in the target frame according to the ratio of the absolute value of the second difference to the target byte number of the current frame and the size of a second threshold; the second difference is the difference between the target byte number of the current frame and the actual code byte number of the current frame, the target frame comprises the x frames preceding and/or the y frames following the current frame, x and y are positive integers greater than or equal to 1, and n2 is a positive integer greater than or equal to 2.
Optionally, in some embodiments, the processor 1510 is further configured to: if the second difference is a positive value, take the difference between the first parameter used in calculating the QP of the n1 tiles and the offset parameter as the updated first parameter, and calculate the QP of the n2 tiles according to the updated first parameter; or, if the second difference is a negative value, take the sum of the first parameter used in calculating the QP of the n1 tiles and the offset parameter as the updated first parameter, and calculate the QP of the n2 tiles according to the updated first parameter.
Optionally, in some embodiments, if the current frame is multi-channel coded, the processor 1510 is further configured to: respectively acquire image complexity information of the components of each tile in the n1 tiles, wherein the components include a luminance component and/or at least one chrominance component of the current frame; and update the QP of the n1 tiles according to the image complexity information of at least one of the components.
Optionally, in some embodiments, the processor 1510 is further configured to: acquire a plurality of transform parameters of each tile, and select one of the plurality of transform parameters as the image complexity information of that tile.
Optionally, in some embodiments, the processor 1510 is further configured to: if the updated QP of the n1 tiles is less than a third threshold, take the third threshold as the updated QP of the n1 tiles; or, if the updated QP of the n1 tiles is greater than a fourth threshold, take the fourth threshold as the updated QP of the n1 tiles; wherein the third threshold is the minimum QP used for encoding and the fourth threshold is the maximum QP used for encoding.
Optionally, in some embodiments, the encoding apparatus 1500 is applied in a JPEG XR encoding format.
Optionally, the encoding apparatus 1500 may further include a memory 1520. Wherein the processor 1510 may invoke and run a computer program from the memory 1520 to implement the method in embodiments of the present application.
Wherein the memory 1520 may be a separate device from the processor 1510 or may be integrated into the processor 1510.
Optionally, the encoding apparatus 1500 may further include a transceiver 1530. Wherein the transceiver 1530 may be a separate device from the processor 1510, or may be integrated into the processor 1510.
Optionally, the encoding device may be, for example, an encoder, a terminal (including but not limited to a mobile phone, a camera, an unmanned aerial vehicle, etc.), and the encoding device may implement a corresponding flow in the encoding method 600 of the embodiment of the present application, which is not described herein for brevity.
Fig. 16 illustrates an encoding apparatus 1600 according to another embodiment of the present application, wherein the encoding apparatus 1600 may include a processor 1610.
A processor 1610, configured to: acquire image complexity information of a current frame, wherein the image complexity information includes a transform coefficient obtained by performing an image kernel transform (PCT) on the pixel values of the current frame; determine an initial quantization parameter (initial QP) of the current frame according to the image complexity information; and update the initial QP of a target frame according to the initial QP of the current frame, wherein the target frame comprises the x frames preceding and/or the y frames following the current frame, and x and y are positive integers greater than or equal to 1.
Optionally, in some embodiments, the processor 1610 is further configured to: update the initial QP of the target frame according to the ratio of the absolute value of the second difference to the target byte number of the current frame and the size of a second threshold; the second difference is the difference between the target byte number of the current frame and the actual code byte number of the current frame.
Optionally, in some embodiments, the processor 1610 is further configured to: if the second difference is a positive value, take the difference between the first parameter used in calculating the initial QP of the current frame and the offset parameter as the updated first parameter, and calculate the initial QP of the target frame according to the updated first parameter; or, if the second difference is a negative value, take the sum of the first parameter used in calculating the initial QP of the current frame and the offset parameter as the updated first parameter, and calculate the initial QP of the target frame according to the updated first parameter.
Optionally, in some embodiments, if the current frame is multi-channel coded, the processor 1610 is further configured to: respectively acquire image complexity information of the components of the current frame, wherein the components include a luminance component and/or at least one chrominance component of the current frame; and update the initial QP of the target frame according to the image complexity information of at least one of the components.
Optionally, in some embodiments, the processor 1610 is further configured to: acquire a plurality of transform parameters of the current frame, and select one of the plurality of transform parameters as the image complexity information of the current frame.
Optionally, in some embodiments, the processor 1610 is further configured to: if the updated initial QP of the target frame is less than a third threshold, take the third threshold as the updated initial QP of the target frame; or, if the updated initial QP of the target frame is greater than a fourth threshold, take the fourth threshold as the updated initial QP of the target frame; wherein the third threshold is the minimum QP used for encoding and the fourth threshold is the maximum QP used for encoding.
Optionally, in some embodiments, the encoding apparatus 1600 is applied in a JPEG XR encoding format.
Optionally, the encoding device 1600 may also include a memory 1620. Wherein the processor 1610 may call and run a computer program from the memory 1620 to implement the method in an embodiment of the present application.
Wherein memory 1620 may be a separate device from processor 1610 or may be integrated within processor 1610.
Optionally, the encoding device 1600 may also include a transceiver 1630. Wherein transceiver 1630 may be a separate device from processor 1610 or may be integrated into processor 1610.
Optionally, the encoding device may be, for example, an encoder, a terminal (including but not limited to a mobile phone, a camera, an unmanned aerial vehicle, etc.), and the encoding device may implement a corresponding flow in the encoding method 1100 of the embodiment of the present application, which is not described herein for brevity.
Fig. 17 is a schematic structural view of a chip of an embodiment of the present application. Chip 1700 shown in fig. 17 includes a processor 1710, and processor 1710 may call and run a computer program from memory to implement a method in an embodiment of the application.
Optionally, as shown in fig. 17, chip 1700 may also include memory 1720. Wherein the processor 1710 may call and run a computer program from the memory 1720 to implement the method in an embodiment of the present application.
Wherein the memory 1720 may be a separate device from the processor 1710 or may be integrated in the processor 1710.
Optionally, the chip 1700 may also include an input interface 1730. Wherein the processor 1710 may control the input interface 1730 to communicate with other devices or chips, and in particular, may obtain information or data sent by other devices or chips.
Optionally, the chip 1700 may also include an output interface 1740. Wherein the processor 1710 may control the output interface 1740 to communicate with other devices or chips, and in particular, may output information or data to the other devices or chips.
It should be understood that the chip referred to in the embodiments of the present application may also be referred to as a system-on-chip (SoC), a system chip, or the like.
It should be appreciated that the processor of an embodiment of the present application may be an integrated circuit chip having signal processing capability. In implementation, the steps of the above method embodiments may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The methods, steps, and logical blocks disclosed in the embodiments of the present application may be implemented or performed by such a processor. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
It will be appreciated that the memory in embodiments of the application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache. By way of example, and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
The memory in embodiments of the application may provide instructions and data to the processor. A portion of the memory may also include non-volatile random access memory. For example, the memory may also store information of the device type. The processor may be configured to execute the instructions stored in the memory, and when the processor executes the instructions, the processor may perform the steps corresponding to the terminal device in the above method embodiment.
It should also be understood that, in the embodiments of the present application, the pixels in an image may be located in different rows and/or columns. The length of A may correspond to the number of pixels of A located in the same row, and the height of A may correspond to the number of pixels of A located in the same column. In addition, the length and the height of A may also be referred to as the width and the depth of A, respectively, which are not limited in the embodiments of the present application.
It should also be understood that, in the embodiments of the present application, "distributed at an interval from the boundary of A" may mean being at least one pixel away from the boundary of A, and may also be described as "not adjacent to the boundary of A" or "not located at the boundary of A", where A may be an image, a rectangular region, a sub-image, and so on.
It should also be understood that the foregoing description of the embodiments of the present application emphasizes the differences between the embodiments; for the same or similar parts that are not mentioned, reference may be made to one another, and details are not repeated herein for brevity.
The embodiment of the application also provides a computer readable storage medium for storing a computer program.
Optionally, the computer readable storage medium may be applied to the encoding apparatus in the embodiment of the present application, and the computer program causes a computer to execute a corresponding flow implemented by the encoding apparatus in each method of the embodiment of the present application, which is not described herein for brevity.
The embodiment of the application also provides a computer program product comprising computer program instructions.
Optionally, the computer program product may be applied to the encoding apparatus in the embodiment of the present application, and the computer program instructions cause the computer to execute the corresponding processes implemented by the encoding apparatus in the methods in the embodiments of the present application, which are not described herein for brevity.
The embodiment of the application also provides a computer program.
Optionally, the computer program may be applied to the encoding apparatus in the embodiments of the present application, and when the computer program runs on a computer, the computer is caused to execute corresponding processes implemented by the encoding apparatus in each method of the embodiments of the present application, which are not described herein for brevity.
It should be understood that, in the embodiments of the present application, the term "and/or" merely describes an association relationship between associated objects, indicating that three relationships may exist. For example, A and/or B may represent: A alone, both A and B, and B alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein may be implemented in electronic hardware, in computer software, or in a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above generally in terms of function. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementations should not be considered beyond the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units is merely a logical function division, and there may be other divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, or may be an electrical, mechanical, or other form of connection.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present application.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
While the application has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the application. Therefore, the protection scope of the application shall be subject to the protection scope of the claims.

Claims (60)

  1. A method of encoding, comprising:
    acquiring image complexity information of each tile of n1 tiles in a current frame, wherein the image complexity information comprises a transformation coefficient obtained by performing image kernel transformation (PCT) on pixel values of each tile, and n1 is a positive integer greater than or equal to 2;
    and updating the quantization parameters (QP) of the n1 tiles according to the image complexity information.
  2. The encoding method of claim 1, wherein the updating the QP for the n1 tiles according to the image complexity information comprises:
    calculating an accumulated value of a target byte number of an ith tile in the n1 tiles according to the image complexity information, wherein i is a positive integer less than or equal to n1;
    updating the QP of the ith tile according to the accumulated value of the target byte number of the ith tile.
  3. The encoding method of claim 2, wherein the updating the QP for the i-th tile based on the accumulated value of the target number of bytes for the i-th tile comprises:
    updating the QP of the ith tile according to the absolute value of a first difference value and a first threshold, wherein the first difference value is the difference value between the accumulated value of the target byte number of the ith tile and the accumulated value of the actual code byte number of the ith tile.
  4. The encoding method of claim 3, wherein the updating the QP for the i-th tile based on the absolute value of the first difference and the first threshold comprises:
    if the first difference value is a positive value, taking the difference value between the QP of the (i-1)th tile and the first offset QP as the QP of the ith tile;
    if the first difference value is a negative value, taking the sum of the QP of the (i-1)th tile and the first offset QP as the QP of the ith tile;
    wherein the first offset QP is derived based on an absolute value of the first difference and the first threshold.
  5. The encoding method according to claim 3 or 4, characterized in that the target number of bytes of the i-th tile is related to the first information;
    the first information is at least one of the following information:
    the target byte number of the current frame, the transform coefficient of the i-th tile, or the transform coefficient of the current frame.
  6. The encoding method according to claim 5, wherein the target number of bytes of the current frame is related to second information;
    the second information is at least one of the following information:
    the width of the current frame, the height of the current frame, the bit depth of the current frame, the encoding format of the current frame or the image compression ratio of the current frame.
  7. The encoding method according to any one of claims 1 to 6, characterized in that the method further comprises:
    updating QP of n2 tiles in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of a second threshold;
    the second difference value is a difference value between a target byte number of the current frame and an actual code byte number of the current frame, the target frame is a front x frame and/or a rear y frame of the current frame, x and y are positive integers greater than or equal to 1, and n2 is a positive integer greater than or equal to 2.
  8. The encoding method of claim 7, wherein the updating the QP for the n2 tiles in the target frame based on the ratio of the absolute value of the second difference to the target number of bytes of the current frame and the size of the second threshold comprises:
    if the second difference is a positive value, taking the difference between the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter;
    calculating QP of the n2 tiles according to the updated first parameters;
    or,
    if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter;
    and calculating the QP of the n2 tiles according to the updated first parameter.
  9. The encoding method according to any one of claims 1 to 8, wherein, if the current frame is multi-channel encoded, the acquiring the image complexity information of each of the n1 tiles in the current frame comprises:
    respectively acquiring image complexity information of components of each tile in the n1 tiles, wherein the components comprise a luminance component and/or at least one chrominance component of the current frame;
    updating QP for n1 tiles according to the image complexity information, including:
    updating the QP of the n1 tiles according to the image complexity information of at least one of the components.
  10. The encoding method according to any one of claims 1 to 9, wherein the acquiring the image complexity information of each of the n1 tiles in the current frame includes:
    acquiring a plurality of transformation parameters of each tile;
    and selecting one parameter from the plurality of transformation parameters as the image complexity information of each tile.
  11. The encoding method according to any one of claims 1 to 10, wherein the acquiring the image complexity information of each of the n1 tiles in the current frame includes:
    mapping the blocks in each tile, then performing a preset transform to generate preset transform parameters, and generating the image complexity information of each tile according to the preset transform parameters.
  12. The encoding method according to claim 11, wherein the preset transform comprises PCT transform.
  13. The encoding method according to any one of claims 1 to 12, characterized in that the method further comprises:
    if the QP of the updated n1 tiles is less than a third threshold, taking the third threshold as the QP of the updated n1 tiles; or,
    if the updated QP for the n1 tiles is greater than a fourth threshold, taking the fourth threshold as the updated QP for the n1 tiles;
    wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
  14. The encoding method according to any one of claims 1 to 13, wherein the encoding method is applied in a joint photographic experts group extended range encoding format (JPEG XR encoding format).
  15. A method of encoding, comprising:
    acquiring image complexity information of a current frame, wherein the image complexity information comprises a transformation coefficient obtained by performing image kernel transformation (PCT) on pixel values of the current frame;
    determining an initial quantization parameter (initial QP) for the current frame based on the image complexity information;
    and updating the initial QP of a target frame according to the initial QP of the current frame, wherein the target frame comprises the x frames preceding and/or the y frames following the current frame, and x and y are positive integers greater than or equal to 1.
  16. The encoding method of claim 15, wherein the updating the initial QP of the target frame based on the initial QP of the current frame comprises:
    updating an initial QP in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of a second threshold;
    the second difference value is a difference value between the target byte number of the current frame and the actual code byte number of the current frame.
  17. The encoding method of claim 16, wherein the updating the initial QP in the target frame based on the ratio of the absolute value of the second difference to the target number of bytes of the current frame and the size of the second threshold comprises:
    if the second difference value is a positive value, taking the difference value between the first parameter and the offset parameter in the initial QP for calculating the current frame as an updated first parameter;
    calculating an initial QP of the target frame according to the updated first parameter;
    or,
    if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the initial QP used for calculating the current frame as the updated first parameter;
    and calculating the initial QP of the target frame according to the updated first parameter.
  18. The encoding method according to any one of claims 15 to 17, wherein, if the current frame is multi-channel encoded, the acquiring the image complexity information of the current frame comprises:
    respectively acquiring image complexity information of components of the current frame, wherein the components comprise a luminance component and/or at least one chrominance component of the current frame;
    updating the initial QP of the target frame based on the initial QP of the current frame, including:
    updating the initial QP of the target frame according to the image complexity information of at least one of the components.
  19. The encoding method according to any one of claims 15 to 18, wherein the acquiring image complexity information of the current frame includes:
    acquiring a plurality of transformation parameters of the current frame;
    and selecting one parameter from the plurality of transformation parameters as image complexity information of the current frame.
  20. The encoding method according to any one of claims 15 to 19, characterized in that the method further comprises:
    If the updated initial QP of the target frame is smaller than a third threshold, taking the third threshold as the updated initial QP of the target frame; or alternatively, the first and second heat exchangers may be,
    if the updated initial QP of the target frame is greater than a fourth threshold, taking the fourth threshold as the updated initial QP of the target frame;
    wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
  21. The encoding method according to any one of claims 15 to 20, wherein the encoding method is applied in a joint photographic experts group extended range encoding format (JPEG XR encoding format).
  22. An encoding device, comprising:
    the complexity calculation module is used for obtaining image complexity information of each tile of n1 tiles in the current frame, wherein the image complexity information comprises a transformation coefficient obtained by performing image kernel transformation (PCT) on pixel values of each tile, and n1 is a positive integer greater than or equal to 2;
    and the code rate control module is used for updating the Quantization Parameters (QP) of the n1 tiles according to the image complexity information.
  23. The encoding apparatus of claim 22, wherein the rate control module is further configured to:
    calculating an accumulated value of a target byte number of an ith tile in the n1 tiles according to the image complexity information, wherein i is a positive integer less than or equal to n1;
    updating the QP of the ith tile according to the accumulated value of the target byte number of the ith tile.
  24. The encoding apparatus of claim 23, wherein the rate control module is further configured to:
    updating the QP of the ith tile according to the absolute value of a first difference value and a first threshold, wherein the first difference value is the difference value between the accumulated value of the target byte number of the ith tile and the accumulated value of the actual code byte number of the ith tile.
  25. The encoding apparatus of claim 24, wherein the rate control module is further configured to:
    if the first difference value is a positive value, taking the difference value between the QP of the (i-1)th tile and the first offset QP as the QP of the ith tile;
    if the first difference value is a negative value, taking the sum of the QP of the (i-1)th tile and the first offset QP as the QP of the ith tile;
    wherein the first offset QP is derived based on an absolute value of the first difference and the first threshold.
  26. The encoding device according to claim 24 or 25, wherein the target number of bytes of the i-th tile is related to the first information;
    The first information is at least one of the following information:
    the target byte number of the current frame, the transform coefficient of the i-th tile, or the transform coefficient of the current frame.
  27. The encoding apparatus according to claim 26, wherein the target number of bytes of the current frame is related to second information;
    the second information is at least one of the following information:
    the width of the current frame, the height of the current frame, the bit depth of the current frame, the encoding format of the current frame or the image compression ratio of the current frame.
  28. The encoding apparatus according to any one of claims 22 to 27, wherein the rate control module is further configured to:
    updating QP of n2 tiles in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of a second threshold;
    the second difference value is a difference value between a target byte number of the current frame and an actual code byte number of the current frame, the target frame comprises the x frames preceding and/or the y frames following the current frame, x and y are positive integers greater than or equal to 1, and n2 is a positive integer greater than or equal to 2.
  29. The encoding apparatus of claim 28, wherein the rate control module is further configured to:
    If the second difference is a positive value, taking the difference between the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter;
    calculating QP of the n2 tiles according to the updated first parameters;
    or,
    if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter;
    and calculating QP of the n2 tiles according to the updated first parameters.
  30. The encoding apparatus according to any one of claims 22 to 29, wherein if the current frame is multi-channel encoded, the complexity calculation module is further configured to:
    respectively acquiring image complexity information of components of each tile in the n1 tiles, wherein the components comprise a luminance component and/or at least one chrominance component of the current frame;
    the code rate control module is further configured to:
    updating the QP of the n1 tiles according to the image complexity information of at least one of the components.
  31. The encoding apparatus according to any one of claims 22 to 30, wherein the complexity calculation module is further configured to:
    Acquiring a plurality of transformation parameters of each tile;
    and selecting one parameter from the plurality of transformation parameters as the image complexity information of each tile.
  32. The encoding apparatus according to any one of claims 22 to 31, wherein the rate control module is further configured to:
    if the QP of the updated n1 tiles is less than a third threshold, taking the third threshold as the QP of the updated n1 tiles; or,
    if the updated QP for the n1 tiles is greater than a fourth threshold, taking the fourth threshold as the updated QP for the n1 tiles;
    wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
  33. The encoding device according to any one of claims 22 to 32, wherein the encoding device is applied in a joint photographic experts group extended range encoding format (JPEG XR encoding format).
  34. An encoding device, comprising:
    the complexity calculation module is used for acquiring image complexity information of the current frame, wherein the image complexity information comprises a transformation coefficient obtained by performing image kernel transformation (PCT) on pixel values of the current frame;
    A code rate control module for determining an initial quantization parameter (initial QP) of the current frame according to the image complexity information;
    the code rate control module is further configured to: update the initial QP of a target frame according to the initial QP of the current frame, wherein the target frame comprises the x frames preceding and/or the y frames following the current frame, and x and y are positive integers greater than or equal to 1.
  35. The encoding apparatus of claim 34, wherein the rate control module is further configured to:
    updating an initial QP in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of a second threshold;
    the second difference value is a difference value between the target byte number of the current frame and the actual code byte number of the current frame.
  36. The encoding apparatus of claim 35, wherein the rate control module is further configured to:
    if the second difference value is a positive value, taking the difference value between the first parameter and the offset parameter in the initial QP for calculating the current frame as an updated first parameter;
    calculating an initial QP of the target frame according to the updated first parameter;
    or,
    if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the initial QP used for calculating the current frame as the updated first parameter;
    and calculating the initial QP of the target frame according to the updated first parameter.
  37. The encoding apparatus according to any one of claims 34 to 36, wherein if the current frame is multi-channel encoded, the complexity calculation module is further configured to:
    respectively acquiring image complexity information of components of the current frame, wherein the components comprise a luminance component and/or at least one chrominance component of the current frame;
    the code rate control module is further configured to:
    updating the initial QP of the target frame according to the image complexity information of at least one of the components.
  38. The encoding apparatus according to any one of claims 34 to 37, wherein the complexity calculation module is further configured to:
    acquiring a plurality of transformation parameters of the current frame;
    and selecting one parameter from the plurality of transformation parameters as image complexity information of the current frame.
  39. The encoding apparatus according to any one of claims 34 to 38, wherein the rate control module is further configured to:
    if the updated initial QP of the target frame is smaller than a third threshold, taking the third threshold as the updated initial QP of the target frame; or,
    If the updated initial QP of the target frame is greater than a fourth threshold, taking the fourth threshold as the updated initial QP of the target frame;
    wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
  40. The encoding device according to any one of claims 34 to 39, wherein the encoding device is applied in a joint photographic experts group extended range encoding format (JPEG XR encoding format).
  41. An encoding device, comprising:
    a processor for: acquiring image complexity information of each tile of n1 tiles in a current frame, wherein the image complexity information comprises a transformation coefficient obtained by performing image kernel transformation (PCT) on pixel values of each tile, and n1 is a positive integer greater than or equal to 2;
    and updating the quantization parameters (QP) of the n1 tiles according to the image complexity information.
  42. The encoding device of claim 41, wherein the processor is further configured to:
    calculating an accumulated value of a target byte number of an ith tile in the n1 tiles according to the image complexity information, wherein i is a positive integer less than or equal to n1;
    Updating the QP of the ith tile according to the accumulated value of the target byte number of the ith tile.
  43. The encoding device of claim 42, wherein the processor is further configured to:
    updating the QP of the ith tile according to the absolute value of a first difference value and a first threshold, wherein the first difference value is the difference value between the accumulated value of the target byte number of the ith tile and the accumulated value of the actual code byte number of the ith tile.
  44. The encoding device of claim 43, wherein the processor is further configured to:
    if the first difference value is a positive value, taking the difference value between the QP of the (i-1)th tile and the first offset QP as the QP of the ith tile;
    if the first difference value is a negative value, taking the sum of the QP of the (i-1)th tile and the first offset QP as the QP of the ith tile;
    wherein the first offset QP is derived based on an absolute value of the first difference and the first threshold.
  45. The encoding device of claim 43 or 44, wherein the target number of bytes of the i-th tile is related to the first information;
    the first information is at least one of the following information:
    The target byte number of the current frame, the transform coefficient of the i-th tile, or the transform coefficient of the current frame.
  46. The encoding device of claim 45, wherein the target number of bytes of the current frame is related to second information;
    the second information is at least one of the following information:
    the width of the current frame, the height of the current frame, the bit depth of the current frame, the encoding format of the current frame or the image compression ratio of the current frame.
  47. The encoding apparatus according to any one of claims 41 to 46, wherein the processor is further configured to:
    updating QP of n2 tiles in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of a second threshold;
    the second difference value is a difference value between a target byte number of the current frame and an actual code byte number of the current frame, the target frame comprises the x frames preceding and/or the y frames following the current frame, x and y are positive integers greater than or equal to 1, and n2 is a positive integer greater than or equal to 2.
  48. The encoding device of claim 47, wherein the processor is further configured to:
    If the second difference is a positive value, taking the difference between the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter;
    calculating QP of the n2 tiles according to the updated first parameters;
    or,
    if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the QP for calculating the n1 tiles as an updated first parameter;
    and calculating QP of the n2 tiles according to the updated first parameters.
  49. The encoding apparatus according to any one of claims 41 to 48, wherein if the current frame is multi-channel encoded, the processor is further configured to:
    respectively acquiring image complexity information of components of each tile in the n1 tiles, wherein the components comprise a luminance component and/or at least one chrominance component of the current frame;
    updating the QP of the n1 tiles according to the image complexity information of at least one of the components.
  50. The encoding apparatus according to any one of claims 41 to 49, wherein the processor is further configured to:
    acquiring a plurality of transformation parameters of each tile;
    and selecting one parameter from the plurality of transformation parameters as the image complexity information of each tile.
  51. The encoding apparatus according to any one of claims 41 to 50, wherein the processor is further configured to:
    if the QP of the updated n1 tiles is less than a third threshold, taking the third threshold as the QP of the updated n1 tiles; or,
    if the updated QP for the n1 tiles is greater than a fourth threshold, taking the fourth threshold as the updated QP for the n1 tiles;
    wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
  52. The encoding device according to any one of claims 41 to 51, wherein the encoding device is applied in a joint photographic experts group extended range encoding format (JPEG XR encoding format).
  53. An encoding device, comprising:
    a processor for:
    acquiring image complexity information of a current frame, wherein the image complexity information comprises a transformation coefficient obtained by performing image kernel transformation (PCT) on pixel values of the current frame;
    determining an initial quantization parameter (initial QP) for the current frame based on the image complexity information;
    and updating the initial QP of a target frame according to the initial QP of the current frame, wherein the target frame comprises the x frames preceding and/or the y frames following the current frame, and x and y are positive integers greater than or equal to 1.
  54. The encoding device of claim 53, wherein the processor is further configured to:
    updating an initial QP in the target frame according to the ratio of the absolute value of the second difference value to the target byte number of the current frame and the size of a second threshold;
    the second difference value is a difference value between the target byte number of the current frame and the actual code byte number of the current frame.
  55. The encoding device of claim 54, wherein the processor is further configured to:
    if the second difference value is a positive value, taking the difference value between the first parameter and the offset parameter in the initial QP for calculating the current frame as an updated first parameter;
    calculating an initial QP of the target frame according to the updated first parameter;
    or,
    if the second difference is a negative value, taking the sum of the first parameter and the offset parameter in the initial QP used for calculating the current frame as the updated first parameter;
    and calculating the initial QP of the target frame according to the updated first parameter.
  56. The encoding apparatus according to any one of claims 53 to 55, wherein if the current frame is multi-channel encoded, the processor is further configured to:
    respectively acquiring image complexity information of components of the current frame, wherein the components comprise a luminance component and/or at least one chrominance component of the current frame;
    updating the initial QP of the target frame according to the image complexity information of at least one of the components.
  57. The encoding apparatus according to any one of claims 53 to 56, wherein the processor is further configured to:
    acquiring a plurality of transformation parameters of the current frame;
    and selecting one parameter from the plurality of transformation parameters as image complexity information of the current frame.
  58. The encoding apparatus according to any one of claims 53 to 57, wherein the processor is further configured to:
    if the updated initial QP of the target frame is smaller than a third threshold, taking the third threshold as the updated initial QP of the target frame; or,
    if the updated initial QP of the target frame is greater than a fourth threshold, taking the fourth threshold as the updated initial QP of the target frame;
    wherein the third threshold is the minimum value of the QP used for encoding and the fourth threshold is the maximum value of the QP used for encoding.
  59. The encoding device according to any one of claims 53 to 58, wherein the encoding device is applied in a joint photographic experts group extended range encoding format (JPEG XR encoding format).
  60. A computer readable storage medium comprising program instructions which, when executed by a computer, perform the encoding method of any one of claims 1 to 21.
CN202180094389.9A 2021-06-04 2021-06-04 Coding method and coding device Pending CN116918331A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/098383 WO2022252222A1 (en) 2021-06-04 2021-06-04 Encoding method and encoding device

Publications (1)

Publication Number Publication Date
CN116918331A true CN116918331A (en) 2023-10-20

Family

ID=84323728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180094389.9A Pending CN116918331A (en) 2021-06-04 2021-06-04 Coding method and coding device

Country Status (2)

Country Link
CN (1) CN116918331A (en)
WO (1) WO2022252222A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117478891B (en) * 2023-12-28 2024-03-15 辽宁云也智能信息科技有限公司 Intelligent management system for building construction

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06292019A (en) * 1993-04-02 1994-10-18 Fujitsu Ltd Picture data compressor and picture code compressor
CN100562118C (en) * 2007-07-03 2009-11-18 上海富瀚微电子有限公司 A kind of bit rate control method of video coding
KR101450645B1 (en) * 2013-04-26 2014-10-15 주식회사 코아로직 A method and an apparatus for controlling a video bitrate
WO2019127136A1 (en) * 2017-12-27 2019-07-04 深圳市大疆创新科技有限公司 Bit rate control method and encoding device
WO2019191983A1 (en) * 2018-04-04 2019-10-10 深圳市大疆创新科技有限公司 Encoding method and device, image processing system, and computer readable storage medium

Also Published As

Publication number Publication date
WO2022252222A1 (en) 2022-12-08

Similar Documents

Publication Publication Date Title
JP6588523B2 (en) Intra-prediction method and encoder and decoder using the same
CN113574898B (en) Adaptive loop filter
TW202044834A (en) Method and system for processing video content
CN108886621B (en) Non-local self-adaptive loop filtering method
JP7308983B2 (en) Cross-component adaptive loop filter for chroma
WO2013109898A1 (en) Reference pixel reduction for intra lm prediction
CN103782598A (en) Fast encoding method for lossless coding
CN114145016A (en) Matrix weighted intra prediction for video signals
WO2021134706A1 (en) Loop filtering method and device
US8379985B2 (en) Dominant gradient method for finding focused objects
JP6502739B2 (en) Image coding apparatus, image processing apparatus, image coding method
CN113196783B (en) Deblocking filtering adaptive encoder, decoder and corresponding methods
US20220215593A1 (en) Multiple neural network models for filtering during video coding
CN114788284B (en) Method and apparatus for encoding video data in palette mode
CN115836525A (en) Method and system for prediction from multiple cross components
WO2021196035A1 (en) Video coding method and apparatus
CN116918331A (en) Coding method and coding device
CN114902670A (en) Method and apparatus for signaling sub-picture division information
CN113068026B (en) Coding prediction method, device and computer storage medium
CN114793282B (en) Neural network-based video compression with bit allocation
CN112335245A (en) Video image component prediction method and device, and computer storage medium
US8811474B2 (en) Encoder and encoding method using coded block pattern estimation
JP2018032909A (en) Image coding device, control method therefor, imaging apparatus and program
CN114303380A (en) Encoder, decoder and corresponding methods for CABAC coding of indices of geometric partitioning flags
CN112913242B (en) Encoding method and encoding device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination