WO2021143634A1 - Arithmetic encoder, method for implementing arithmetic coding, and image coding method - Google Patents

Info

Publication number
WO2021143634A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
current
character
count
interval
Prior art date
Application number
PCT/CN2021/071024
Other languages
English (en)
French (fr)
Inventor
范益波
闫霄
李敏江
李威
虞旭林
王文强
邱鹏程
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2021143634A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/184: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
    • H04N19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • H04N19/439: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using cascaded computational arrangements for performing a single operation, e.g. filtering
    • H04N19/90: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91: Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • This application relates to, but is not limited to, image processing technology, in particular to an arithmetic encoder and a method for implementing arithmetic coding and an image coding method.
  • JPEG: Joint Photographic Experts Group
  • MPEG-1: Moving Picture Experts Group-1
  • MPEG-2
  • MPEG-4
  • video coding standards such as H.261, H.264, H.265, etc.
  • the present application provides an arithmetic encoder, a method for realizing arithmetic coding, and an image coding method, which can improve throughput and speed up processing.
  • the embodiment of the present invention provides an arithmetic encoder, including: a first-stage processing unit, a second-stage processing unit, a third-stage processing unit, and an output unit; wherein the first-stage processing unit is used to process N coded characters in parallel within one clock cycle to obtain the bit value by which the current encoding interval size needs to be shifted left and the interval size of the current encoding interval; the second-stage processing unit is used to process, in parallel within one clock cycle, the N bit values by which the current encoding interval size needs to be shifted left, to obtain the bit position of the current encoded output bit within a byte, the flag information of the bit position, and the offset of the current coded character; the third-stage processing unit is used to process, in parallel within one clock cycle, the N coded characters, the N bit values by which the current encoding interval size needs to be shifted left, the offsets of the N current coded characters, and the flag information of the N bit positions, to obtain the interval lower limit value of the current encoding interval and
  • the embodiment of the present invention provides a method for realizing arithmetic coding.
  • the method includes: the arithmetic encoder processes N coded characters in parallel to obtain the bit value by which the current encoding interval size needs to be shifted left and the interval size of the current encoding interval.
  • the arithmetic encoder processes, in parallel, the N bit values by which the current encoding interval size needs to be shifted left, and obtains the bit position of the current encoded output bit within a byte, the flag information of the bit position, and the offset of the current coded character; the arithmetic encoder processes, in parallel, the N coded characters, the N bit values by which the current encoding interval size needs to be shifted left, the offsets of the N current coded characters, and the flag information of the N bit positions, to obtain the interval lower limit of the current encoding interval and the output code stream of the coded characters; the arithmetic encoder converts, in order, the N output code streams input in parallel into a single serial output code stream.
  • the embodiment of the present invention provides a method for realizing arithmetic coding.
  • the bit position in a byte, the flag information of the bit position, and the offset of the current coded character; the arithmetic encoder processes, in parallel, number coded characters, number bit values by which the current encoding interval size needs to be shifted left, the offsets of number current coded characters, and the flag information of number bit positions, to obtain the lower limit of the current encoding interval and the output code stream of the coded characters; the arithmetic encoder converts, in order, the number output code streams input in parallel into a single serial output code stream.
  • the embodiment of the present invention provides an image encoding method, which includes: preprocessing an image to be processed to obtain a plurality of image blocks; transforming the obtained image blocks to obtain the corresponding coded characters and encoding probabilities; and inputting the coded characters and encoding probabilities corresponding to the image blocks to an encoder for encoding; wherein the encoder includes an arithmetic encoder.
  • the embodiment of the present invention provides a computer-readable storage medium that stores computer-executable instructions, and the computer-executable instructions are used to execute a method for realizing arithmetic coding.
  • the embodiment of the present invention provides a device for realizing arithmetic coding, including a memory and a processor, wherein the memory stores the following instructions that can be executed by the processor: for executing the steps of realizing arithmetic coding.
  • the arithmetic encoder provided by the present application adopts a multi-channel parallel circuit structure, so it can process multiple coded characters in one clock cycle, which improves throughput and speeds up processing.
  • the embodiment of the present application splits the unsigned 16-bit multiplication operation into four unsigned 8-bit multiplication operations, three shift operations, and three unsigned 8-bit addition operations.
  • the critical path of unsigned 16-bit multiplication operations is reduced, and the processing efficiency is improved.
  • a number signal is introduced into the circuit structure of the embodiment of the present application to control the number of currently valid coded characters, which makes the arithmetic encoder of the present application more flexible to apply.
  • FIG. 1 is a schematic diagram of the composition and structure of the arithmetic encoder of this application;
  • FIG. 2 is a schematic diagram of the composition structure of an embodiment of the first-stage processing unit in the arithmetic encoder of this application;
  • FIG. 3 is a schematic diagram of the composition structure of an embodiment of an unsigned 8-bit multiplier of this application;
  • FIG. 4 is a schematic diagram of the composition structure of an embodiment of the second-stage processing unit in the arithmetic encoder of this application;
  • FIG. 5 is a schematic flowchart of an embodiment in which the second-stage processing unit of this application implements processing;
  • FIG. 6 is a schematic diagram of the composition structure of an embodiment of the third-stage processing unit in the arithmetic encoder of this application;
  • FIG. 7 is a schematic flowchart of an embodiment in which the third-stage processing unit of this application implements processing;
  • FIG. 8 is a schematic diagram of an embodiment of a first-in first-out queue buffer of this application;
  • FIG. 9 is a flowchart of a method for implementing arithmetic coding in this application.
  • the computing device includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.
  • the memory may include non-permanent memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM).
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology.
  • the information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by computing devices.
  • computer-readable media does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
  • Arithmetic coding is an entropy coding method.
  • the code stream generated by arithmetic coding can be decoded to restore the original data without distortion.
  • Entropy coding is based on the statistical characteristics of random processes. It gathers statistics on source symbols with different occurrence probabilities to obtain the probability distribution, and re-encodes the source symbols according to their occurrence probabilities: source symbols with a higher occurrence probability are allocated shorter codewords, and source symbols with a lower occurrence probability are allocated longer codewords. In this way, more source symbols can be represented with fewer bits overall.
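  • As a small numeric illustration of the relationship between occurrence probability and codeword length, the sketch below computes the ideal code length, -log2(p) bits, for each symbol of a made-up distribution. The function names and the example probabilities are assumptions chosen for illustration and do not come from this application.

```python
import math

def ideal_code_length(p):
    """Ideal codeword length in bits for a symbol with probability p."""
    return -math.log2(p)

def expected_bits(dist):
    """Average bits per symbol when every symbol gets its ideal length."""
    return sum(p * ideal_code_length(p) for p in dist.values())

# Hypothetical source: 'a' is common, 'c' and 'd' are rare.
dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = {symbol: ideal_code_length(p) for symbol, p in dist.items()}
```

  • With this distribution the frequent symbol needs 1 bit and the rare ones need 3 bits, so the average is below the 2 bits per symbol of a fixed-length code.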
  • Lepton uses the VP8 binary arithmetic encoder, which encodes based on an 8-bit encoding probability prob.
  • In the recursive calculation process of arithmetic coding, the encoder must save the interval lower limit lowvalue of the current interval, the interval size range of the current interval, and the bit position count. Among them, lowvalue and range determine the current encoding interval, and count records the position of the output bit within a byte after the current encoding; when a byte is full, it is output to the output bit stream.
  • the interval lower limit lowvalue is shifted left until the interval size range is within the range, and the bits shifted out of lowvalue are output to the output bit stream.
  • the interval lower limit lowvalue is appended to the output bit stream, and the encoding ends.
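  • The serial recursion on lowvalue and range can be sketched as follows. This is a minimal sketch, not the circuit of this application: it assumes the split formula and the renormalization range [128, 255] of the VP8 boolean encoder described in RFC 6386, and it leaves out the byte output driven by count. The function name encode_bit_serial is hypothetical.

```python
def encode_bit_serial(lowvalue, rng, bit, prob):
    """One serial coding step: split the interval, keep one part, renormalize.

    Assumption: split = 1 + (((range - 1) * prob) >> 8), as in the VP8
    boolean encoder (RFC 6386). Byte output via count is omitted here.
    """
    split = 1 + (((rng - 1) * prob) >> 8)
    if bit:
        lowvalue += split   # keep the upper part of the interval
        rng -= split
    else:
        rng = split         # keep the lower part of the interval
    shift = 0
    while rng < 128:        # assumed target: range back in [128, 255]
        rng <<= 1
        lowvalue <<= 1
        shift += 1
    return lowvalue, rng, shift

# Starting state: lowvalue 0 and range 255; coding a 1 with prob 128.
state = encode_bit_serial(0, 255, 1, 128)
```

  • Each step depends on the range produced by the previous step, which is why a purely serial circuit handles only one coded character per clock cycle.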
  • the implementation circuit structure is a single serial circuit structure, that is, only one encoded character can be received and processed per clock cycle. Therefore, when the rate of the input code stream is relatively fast, such a circuit structure cannot process the input code stream in time, which not only causes the accumulation of the input code stream, but also limits the further improvement of its throughput rate to a certain extent.
  • This application proposes an arithmetic encoder that processes multiple encoded characters in parallel in each clock cycle, makes full use of each clock cycle, and improves the encoding speed. Further, when the rate of the input code stream is relatively fast, it can ensure that the input code stream is processed in time, thereby avoiding the accumulation of the input code stream, and to a certain extent promote the further improvement of the throughput rate.
  • FIG. 1 is a schematic diagram of the composition and structure of the arithmetic encoder of this application. As shown in FIG. 1, it includes: a first-stage processing unit, a second-stage processing unit, a third-stage processing unit, and an output unit; among them,
  • the first-stage processing unit is used to process N coded characters in parallel within one clock cycle to obtain the bit value by which the current encoding interval size needs to be shifted left and the interval size of the current encoding interval;
  • the second-stage processing unit is used to process, in parallel within one clock cycle, the N bit values by which the current encoding interval size needs to be shifted left, and to obtain the bit position of the current encoded output bit within a byte, the flag information of the bit position, and the offset of the current coded character;
  • the third-stage processing unit is used to process, in parallel within one clock cycle, the N coded characters, the N bit values by which the current encoding interval size needs to be shifted left, the offsets of the N current coded characters, and the flag information of the N bit positions, to obtain the lower limit of the current encoding interval and the output code stream of the coded characters;
  • the output unit is used to convert, in order, the N output code streams input in parallel into a single serial output code stream.
  • N is an integer greater than or equal to 1.
  • the value of N may be a fixed value or a configurable parameter value.
  • the size of N depends on the length of a clock cycle. If a clock cycle is longer, the value of N can be larger, and if a clock cycle is shorter, the value of N can be smaller.
  • the arithmetic encoder of the present application is a VP8 binary arithmetic encoder.
  • the arithmetic encoder provided in this application adopts a multi-channel parallel circuit structure, so it can process multiple coded characters in one clock cycle, which improves throughput and speeds up processing.
  • FIG. 2 is a schematic diagram of the composition structure of an embodiment of the first-stage processing unit in the arithmetic encoder of this application.
  • the first encoding interval processing module is used to: receive, in the current clock cycle, the first coded character bin_0 to be processed, the encoding probability prob_0 of the first coded character, and the encoding interval size range_(N-1) of the previous coded character; calculate the first split value split_0 from the received prob_0 and range_(N-1) (note that for the first calculation, the encoding interval size of the previous coded character is an initial value, which can be set to, for example, 8'd255); calculate the current encoding interval size from split_0 and bin_0; and, according to the current encoding interval size, look up a table (as shown in Table 1) to obtain the bit value by which the current encoding interval size needs to be shifted left, that is, the first left-shift bit value shift_0, and the left-shifted encoding interval size value range
  • the second encoding interval processing module is used to: receive, in the current clock cycle, the second coded character bin_1 to be processed, the encoding probability prob_1 of the second coded character, and the encoding interval size range_0 of the previous coded character; calculate the second split value split_1 from the received prob_1 and range_0; calculate the current encoding interval size from split_1 and bin_1; and, according to the current encoding interval size, look up a table (as shown in Table 1) to obtain the bit value by which the current encoding interval size needs to be shifted left, that is, the second left-shift bit value shift_1, and the left-shifted encoding interval size value range_after_shift; the left-shifted encoding interval size value range_after_shift is used as the range value output by the second encoding interval processing module, namely range_1.
  • the third encoding interval processing module is used to: receive, in the current clock cycle, the third coded character bin_2 to be processed, the encoding probability prob_2 of the third coded character, and the encoding interval size range_1 of the previous coded character; calculate the third split value split_2 from the received prob_2 and range_1; calculate the current encoding interval size from split_2 and bin_2; and, according to the current encoding interval size, look up a table (as shown in Table 1) to obtain the bit value by which the current encoding interval size needs to be shifted left, that is, the third left-shift bit value shift_2, and the left-shifted encoding interval size value range_after_shift; the left-shifted encoding interval size value range_after_shift is used as the range value output by the third encoding interval processing module, namely range_
  • the Nth encoding interval processing module is used to: receive, in the current clock cycle, the Nth coded character bin_(N-1) to be processed, the encoding probability prob_(N-1) of the Nth coded character, and the encoding interval size range_(N-2) of the previous coded character; calculate the Nth split value split_(N-1) from the received prob_(N-1) and range_(N-2); calculate the current encoding interval size from split_(N-1) and bin_(N-1); and, according to the current encoding interval size, look up a table (as shown in Table 1) to obtain the bit value by which the current encoding interval size needs to be shifted left, that is, the Nth left-shift bit value shift_(N-1), and the left-shifted encoding interval size value range_after_shift; the encoding interval size value after the left
  • the first temporary register is used to temporarily store the range value output by the Nth encoding interval processing module, namely range_(N-1), and output it to the first encoding interval processing module in the next clock cycle.
  • the first pipeline register is used to register, by one pipeline beat, the coded characters bin of the current clock cycle, namely the first coded character bin_0, the second coded character bin_1, the third coded character bin_2 ... the Nth coded character bin_(N-1), and output them to the second-stage processing unit.
  • N can be a fixed value or a configurable parameter value.
  • the size of N depends on the length of a clock cycle. If a clock cycle is longer, the value of N can be larger, and if a clock cycle is shorter, the value of N can be smaller.
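  • The chain of N encoding interval processing modules can be modelled in software as below. This is a sketch under stated assumptions: the split formula is the one of the VP8 boolean encoder in RFC 6386, and NORM is an assumed stand-in for Table 1 (which is not reproduced in this text) that gives the left shift needed to bring an 8-bit range back to 128 or more. The names stage1 and NORM are hypothetical.

```python
# Assumed stand-in for Table 1: left shift that brings an 8-bit range
# value back to >= 128 (entry 0 is unused and set to 0).
NORM = [0] + [8 - r.bit_length() for r in range(1, 256)]

def stage1(bins, probs, prev_range=255):
    """Model N chained interval modules evaluated within one clock cycle.

    prev_range plays the role of range_(N-1) held in the first temporary
    register; 255 is the initial value mentioned in the text (8'd255).
    Returns [(split_i, shift_i, range_i), ...] and the range to carry over.
    """
    results = []
    rng = prev_range
    for bit, prob in zip(bins, probs):
        split = 1 + (((rng - 1) * prob) >> 8)   # assumed VP8 split formula
        current = rng - split if bit else split  # current interval size
        shift = NORM[current]                    # table lookup
        rng = current << shift                   # range_after_shift
        results.append((split, shift, rng))
    return results, rng

# Two coded characters processed in the same modelled clock cycle.
lanes, carried_range = stage1([0, 1], [128, 128])
```

  • In hardware the N modules form one combinational chain, so the loop above corresponds to the critical path of a single clock cycle, which is consistent with the value of N depending on the clock period.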
  • the unsigned 8-bit multiplication operation is divided into four unsigned 4-bit multiplication operations, three shift operations, and three unsigned 4-bit addition operations. Each unsigned 4-bit multiplication operation can be realized by a lookup table. As shown in FIG. 3, the symbol "×" or "*" denotes multiplication and the symbol "+" denotes addition; A denotes one unsigned 8-bit multiplier and B denotes the other unsigned 8-bit multiplier; A_Hi (4-bit) denotes the unsigned high 4 bits split from A, A_Lo (4-bit) denotes the unsigned low 4 bits split from A, B_Hi (4-bit) denotes the unsigned high 4 bits split from B, and B_Lo (4-bit) denotes the unsigned low 4 bits split from B; different line styles represent different unsigned multiplication operations.
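  • The decomposition of the 8-bit multiplication can be checked with a short software model. The sketch below builds a 16×16 lookup table for the 4-bit products and combines four lookups with three shifts and three additions; the names MUL4 and mul8_split are hypothetical, and the adder widths of the actual circuit are not modelled.

```python
# Lookup table for all unsigned 4-bit x 4-bit products.
MUL4 = [[a * b for b in range(16)] for a in range(16)]

def mul8_split(a, b):
    """Unsigned 8-bit multiply built from four 4-bit table lookups."""
    a_hi, a_lo = a >> 4, a & 0xF
    b_hi, b_lo = b >> 4, b & 0xF
    hh = MUL4[a_hi][b_hi] << 8   # shift 1: high x high
    hl = MUL4[a_hi][b_lo] << 4   # shift 2: high x low
    lh = MUL4[a_lo][b_hi] << 4   # shift 3: low x high
    ll = MUL4[a_lo][b_lo]        # low x low, no shift
    return hh + hl + lh + ll     # three additions

product = mul8_split(254, 128)
```

  • Because the four lookups are independent, they can be evaluated side by side, leaving only the shifts and additions on the critical path.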
  • the first-level processing unit may further include: a first data selector
  • the first data selector is used under the control of the settable signal number.
  • the number signal is introduced into the circuit structure to control the number of currently valid coded characters. Therefore, the number of coded characters that the VP8 binary arithmetic encoder can process in each clock cycle is configurable, such as N. This makes the arithmetic encoder of this application more flexible to apply.
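  • One plausible reading of the data selector is sketched below: when only number of the N lanes carry valid coded characters, the value fed back to the temporary register is taken from lane number-1 instead of lane N-1. This behaviour is an assumption inferred from the surrounding description, not a statement of the actual circuit, and the name select_feedback is hypothetical.

```python
def select_feedback(lane_outputs, number):
    """Return the output of the last valid lane (assumed selector behaviour).

    lane_outputs: per-lane values, e.g. range_0 ... range_(N-1).
    number: how many lanes hold valid coded characters this cycle.
    """
    if not 1 <= number <= len(lane_outputs):
        raise ValueError("number must be between 1 and N")
    return lane_outputs[number - 1]

# With N = 4 lanes but only 2 valid characters, lane 1 feeds the register.
feedback = select_feedback([128, 200, 160, 255], 2)
```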
  • the first-level pipeline structure that is, the top-level interface of the first-level processing unit, is shown in Table 2:
  • FIG. 4 is a schematic diagram of the composition structure of an embodiment of the second-stage processing unit in the arithmetic encoder of this application.
  • Step 500: Calculate the bit position of the current character after encoding, that is, the first bit position value count_0; then determine the first offset offset_0 of the current coded character according to the value of count_0, and update the first left-shift bit value shift_0 and the first bit position value count_0.
  • the first encoding position processing module is used to receive, in the current clock cycle, the first left-shift bit value shift_0 and the bit position count_(N-1) of the previous encoded output bit within a byte (note that in the first calculation,
  • the second temporary register is used to temporarily store the count value output by the Nth encoding position processing module, that is, count_(N-1), and output to the first encoding position processing module in the next clock cycle.
  • the second pipeline register is used to register, by one pipeline beat, the coded characters bin of the current clock cycle, namely the first coded character bin_0, the second coded character bin_1, the third coded character bin_2 ... the Nth coded character bin_(N-1), and the split values of the current clock cycle, namely the first split value split_0, the second split value split_1, the third split value split_2 ... the Nth split value split_(N-1), and output them to the third-stage processing unit.
  • Fig. 5 is a schematic flow chart of an embodiment of the second-level processing unit of the application for processing.
  • the value of the offset offset_(i-1) is equal to the difference between the value of shift_(i-1) and the value of count_(i-1); the value of shift_(i-1) is set equal to the value of count_(i-1); and the value of count_(i-1) is updated to the value of count_(i-1) minus 8.
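  • The count, offset and shift updates can be modelled as below. The three assignments inside the branch follow the text; the condition that triggers them (the updated count being non-negative, reported here as a boolean count_flag) is an assumption borrowed from the libvpx-style VP8 boolean encoder, because the condition itself is not visible in this text. The previous count is passed in explicitly and the name stage2 is hypothetical.

```python
def stage2(shifts, prev_count):
    """Model N chained position modules evaluated within one clock cycle.

    shifts: shift_0 ... shift_(N-1) from the first stage.
    prev_count: count_(N-1) held in the second temporary register.
    Returns [(count_i, count_flag_i, offset_i, shift_i), ...] and the
    count to carry over to the next cycle.
    """
    results = []
    count = prev_count
    for shift in shifts:
        count += shift
        flag = count >= 0          # assumed: a byte is complete
        offset = 0
        if flag:
            offset = shift - count  # offset = shift - count
            shift = count           # shift  = count
            count -= 8              # count  = count - 8
        results.append((count, flag, offset, shift))
    return results, count

lanes, carried_count = stage2([1, 3], prev_count=-2)
```

  • In the second lane of this example the byte boundary is crossed, so the 3-bit shift is divided into an offset of 1 bit before the byte is taken and 2 bits after it.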
  • the second-level processing unit may further include: a second data selector
  • the second data selector is used under the control of the settable signal number.
  • at this time, the N encoding position processing modules all operate in parallel. Therefore, the second temporary register stores the count output by the Nth encoding position processing module.
  • the number signal is introduced into the circuit structure to control the number of currently valid coded characters. Therefore, the number of coded characters that the VP8 binary arithmetic encoder can process in each clock cycle is configurable, such as N. This makes the arithmetic encoder of this application more flexible to apply.
  • the second-level pipeline structure that is, the top-level interface of the second-level processing unit, is shown in Table 3:
  • FIG. 6 is a schematic diagram of the composition structure of an embodiment of the third-level processing unit in the arithmetic encoder of the present application.
  • the first encoding interval limit processing module is used to receive, in the current clock cycle, the first coded character bin_0, the first left-shift bit value shift_0, the first offset offset_0, the first bit position flag information count_flag_0, and the lower limit value lowvaule_(N-1) of the encoding interval of the previous coded character (note that in the first calculation, the initial value of the lower limit of the encoding interval of the previous coded character can be set to 32'd0), and then to determine, according to the current first coded character bin_0 and the first bit position flag information count_flag_0, the lower limit value lowvaule_0 of the first encoding interval and the first output code stream data_0.
  • the second encoding interval limit processing module is used to receive, in the current clock cycle, the second coded character bin_1, the second left-shift bit value shift_1, the second offset offset_1, the second bit position flag information count_flag_1, and the lower limit value lowvaule_0 of the encoding interval of the previous coded character, and then to determine, according to the current second coded character bin_1 and the second bit position flag information count_flag_1, the lower limit value lowvaule_1 of the second encoding interval and the second output code stream data_1.
  • the third encoding interval limit processing module is used to receive, in the current clock cycle, the third coded character bin_2, the third left-shift bit value shift_2, the third offset offset_2, the third bit position flag information count_flag_2, and the lower limit value lowvaule_1 of the encoding interval of the previous coded character, and then to determine, according to the current third coded character bin_2 and the third bit position flag information count_flag_2, the lower limit value lowvaule_2 of the third encoding interval and the third output code stream data_2.
  • the Nth encoding interval limit processing module is used to receive the Nth encoded character bin_(N-1), the Nth left shift bit value shift_(N-1), and the Nth offset in the current clock cycle.
  • the third temporary register is used to temporarily store the count_flag value output by the Nth encoding interval limit processing module, that is, count_flag_(N-1), and output it to the first encoding interval limit processing in the next clock cycle.
  • FIG. 7 is a schematic flow chart of an embodiment of the processing implemented by the third-level processing unit of this application.
  • determining the lower limit value lowvaule_(i-1) of the encoding interval of the coded character and the output code stream data_(i-1), i = 1, 2, 3...N, includes:
  • the lower limit value lowvaule_(i-1) of the encoding interval of the current coded character is updated to the sum of the lowvaule value of the previous coded character and split_(i-1) of the current coded character, that is, lowvaule_(i-1) = lowvaule_(i-1) + split_(i-1);
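  • A software model of one encoding interval limit processing module is sketched below. Only the addition of split_(i-1) to the lower limit comes from the text above; the condition on the coded character, the byte extraction, the carry test and the 32-bit and 24-bit masks are assumptions modelled on the libvpx-style VP8 boolean encoder, and the name stage3_step is hypothetical. The shift and offset inputs are the values after the second-stage update.

```python
def stage3_step(lowvalue, bit, split, shift, offset, count_flag):
    """Update the interval lower limit and extract an output byte if due.

    Returns (new_lowvalue, data, carry), where data is None when no byte
    is produced and carry tells whether earlier output needs a +1.
    """
    data = None
    carry = False
    if bit:
        lowvalue += split                      # lowvalue = lowvalue + split
    if count_flag:
        # Assumed carry test: bit 31 of lowvalue shifted by offset - 1.
        carry = bool((lowvalue << (offset - 1)) & 0x80000000)
        data = (lowvalue >> (24 - offset)) & 0xFF   # assumed byte position
        lowvalue = (lowvalue << offset) & 0xFFFFFF  # drop the emitted byte
    lowvalue = (lowvalue << shift) & 0xFFFFFFFF     # 32-bit register model
    return lowvalue, data, carry

no_output = stage3_step(0, 1, 128, 0, 0, False)
with_output = stage3_step(0x00C00001, 0, 0, 1, 2, True)
```

  • Propagating a reported carry into bytes that were already written is left to the caller in this sketch.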
  • the third-level processing unit may further include: a third data selector
  • the third data selector is used under the control of the settable signal number.
  • the third temporary register stores the Nth encoding interval limit processing.
  • the number signal is introduced into the circuit structure to control the number of currently valid coded characters. Therefore, the number of coded characters that the VP8 binary arithmetic encoder can process in each clock cycle is configurable, such as N. This makes the arithmetic encoder of this application more flexible to apply.
  • the third-level pipeline structure that is, the top-level interface of the third-level processing unit, is shown in Table 4:
  • the circuit structure of the output unit of the present application may include: a first-in first-out queue buffer data_refineFIFO, which is used to convert, in order, the N 8-bit code streams input in parallel into a single serial 8-bit output code stream.
  • FIG. 8 is a schematic diagram of an embodiment of the first-in-first-out queue buffer data_refineFIFO.
  • the depth of the data_refineFIFO is 4 and the width is (2+32) bits, where the leading 2 bits store the number of valid code streams and the trailing 32 bits store the corresponding valid code streams.
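  • The parallel-to-serial behaviour of the output unit can be modelled with a small queue. The sketch below keeps, per entry, the number of valid bytes and the bytes themselves, and drains them in arrival order; it does not reproduce the (2+32)-bit packing of the hardware word. The class name DataRefineFifo, its methods, and the use of None for an idle lane are assumptions for illustration.

```python
from collections import deque

class DataRefineFifo:
    """Behavioural model of a FIFO that serializes per-cycle output bytes."""

    def __init__(self, depth=4):
        self.depth = depth
        self.entries = deque()

    def push(self, lane_bytes):
        """Store the valid bytes of one cycle; None marks a lane without output."""
        valid = [b for b in lane_bytes if b is not None]
        if not valid:
            return True                 # nothing to store this cycle
        if len(self.entries) >= self.depth:
            return False                # queue full, caller must stall
        self.entries.append((len(valid), valid))
        return True

    def drain(self):
        """Emit all stored bytes as one serial stream, oldest first."""
        out = []
        while self.entries:
            count, data = self.entries.popleft()
            out.extend(data[:count])
        return out

fifo = DataRefineFifo()
fifo.push([0x12, None, 0x34, None])
fifo.push([None, None, None, None])
fifo.push([None, 0x56, None, None])
stream = fifo.drain()
```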
  • Fig. 9 is a flowchart of a method for implementing arithmetic coding in this application. As shown in Fig. 9, in one clock cycle, the method includes:
  • Step 900: The arithmetic encoder processes the N coded characters in parallel, and obtains the number of bits by which the current coding interval size needs to be shifted left and the interval size of the current coding interval.
  • step 900 may include:
  • the i-th split value split_(i-1) is calculated;
  • according to the current coding interval size, a table lookup (as shown in Table 1) gives the number of bits by which the current coding interval size needs to be shifted left, that is, the i-th left-shift bit value shift_(i-1), and the left-shifted coding interval size value range_after_shift; range_after_shift is used as the range value output by the i-th coding interval processing module, namely range_(i-1).
  • i = 1, 2, 3...N.
  • in the first calculation, the coding interval size of the previous coded character takes an initial value, which can be set to 8'd255, for example.
  • step 900 further includes:
  • the present application also provides a method for realizing arithmetic coding, which in one clock cycle includes:
  • the arithmetic encoder processes, in parallel, number left-shift bit values of the current coding interval size, and obtains the bit position within a byte of the bit output after the current coding, the bit position flag information, and the offset of the current coded character;
  • the arithmetic encoder processes, in parallel, number coded characters, number left-shift bit values of the current coding interval size, number offsets of the current coded characters, and number pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters;
  • the arithmetic encoder converts the number output code streams input in parallel, in order, into a single serially output code stream.
  • Step 901: The arithmetic encoder processes, in parallel, the N left-shift bit values of the current coding interval size, and obtains the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character.
  • step 901 may include:
  • i = 1, 2, 3...N.
  • in the first calculation, the initial value of the bit position within a byte of the bit output after the previous coding can be set to -32'd24, for example.
  • determining the i-th offset offset_(i-1) of the current coded character according to the count_(i-1) value, and updating the i-th left-shift bit value shift_(i-1) and the i-th bit position value count_(i-1), includes:
  • step 901 further includes:
  • Step 902: The arithmetic encoder processes, in parallel, the N coded characters, the N left-shift bit values of the current coding interval size, the N offsets of the current coded characters, and the N pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters.
  • step 902 may include:
  • i = 1, 2, 3...N.
  • step 902 may further include:
  • Step 903: The arithmetic encoder converts the N output code streams input in parallel, in order, into a single serially output code stream.
  • the value of N may be a fixed value or a configurable parameter value.
  • the size of N depends on the length of a clock cycle. If a clock cycle is longer, the value of N can be larger, and if a clock cycle is shorter, the value of N can be smaller.
  • the arithmetic encoder of the present application is VP8 binary arithmetic encoding.
  • the VP8 binary arithmetic encoder provided in this application adopts an N-way parallel circuit structure, so it can process at most N coded characters in one clock cycle, which improves throughput and speeds up processing.
  • the method for implementing arithmetic coding in the embodiments of the present application can process multiple coded characters in parallel within one clock cycle, which improves the throughput rate and speeds up the processing speed.
  • the present application also provides a computer-readable storage medium that stores computer-executable instructions, and the computer-executable instructions are used to execute any one of the foregoing methods for realizing arithmetic coding.
  • the present application further provides a device for realizing arithmetic coding, including a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of any of the methods for realizing arithmetic coding described above.
  • This application also provides an image encoding method, including:
  • the encoder includes the arithmetic encoder described in any one of the embodiments of the present application.
  • preprocessing the image to be processed to obtain multiple image blocks, and performing conversion processing on the obtained image blocks to obtain coded characters and coding probabilities, can be accomplished using related technologies; the specific implementation is not intended to limit the scope of protection of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

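This application discloses an arithmetic encoder, a method for implementing arithmetic coding, and an image encoding method. Multiple coded characters can be processed in parallel within one clock cycle, which improves throughput and speeds up processing.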

Description

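Arithmetic encoder, method for implementing arithmetic coding, and image encoding method
This application claims priority to Chinese patent application No. 202010051282.3, filed on January 17, 2020 and entitled "Arithmetic encoder, method for implementing arithmetic coding, and image encoding method", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to, but is not limited to, image processing technology, and in particular to an arithmetic encoder, a method for implementing arithmetic coding, and an image encoding method.
Background
With the development of mobile communications and the Internet, the demand for images keeps growing, which puts great pressure on limited transmission bandwidth and storage space. Compressing and encoding image data before transmission can effectively improve the data transmission rate of images. As a result, image compression coding technology has developed rapidly, been widely applied, and become increasingly mature.
For still images, the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) established the coding standard Joint Photographic Experts Group (JPEG), which is an image file format. For moving images, ISO/IEC established compression coding standards such as Moving Picture Experts Group-1 (MPEG-1), MPEG-2 and MPEG-4. For videophone and video conferencing, the International Telecommunication Union (ITU) established video coding standards such as H.261, H.264 and H.265. These standard image coding algorithms combine various high-performing traditional image coding methods, summarize traditional image coding technology, and represent the current state of image coding. In addition, Lepton is a lossless image compression coding technology open-sourced by Dropbox. Because it replaces Huffman coding with VP8 arithmetic coding, it achieves a higher compression ratio on top of existing JPEG image compression.
Summary
This application provides an arithmetic encoder, a method for implementing arithmetic coding, and an image encoding method, which can improve throughput and speed up processing.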
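An embodiment of the present invention provides an arithmetic encoder, including: a first-level processing unit, a second-level processing unit, a third-level processing unit and an output unit. The first-level processing unit is configured to process N coded characters in parallel within one clock cycle, to obtain the number of bits by which the current coding interval size needs to be shifted left (the left-shift bit value) and the interval size of the current coding interval. The second-level processing unit is configured to process the N left-shift bit values in parallel within one clock cycle, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character. The third-level processing unit is configured to process, in parallel within one clock cycle, the N coded characters, the N left-shift bit values, the N offsets of the current coded characters and the N pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters. The output unit is configured to convert the N output code streams input in parallel, in order, into a single serially output code stream. N is an integer greater than or equal to 1.
An embodiment of the present invention provides a method for implementing arithmetic coding, which, within one clock cycle, includes: the arithmetic encoder processes N coded characters in parallel, to obtain the number of bits by which the current coding interval size needs to be shifted left (the left-shift bit value) and the interval size of the current coding interval; the arithmetic encoder processes the N left-shift bit values in parallel, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character; the arithmetic encoder processes, in parallel, the N coded characters, the N left-shift bit values, the N offsets of the current coded characters and the N pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters; and the arithmetic encoder converts the N output code streams input in parallel, in order, into a single serially output code stream.
An embodiment of the present invention provides a method for implementing arithmetic coding, which, within one clock cycle, includes: the arithmetic encoder, according to a settable signal number, processes number channels of coded characters in parallel, to obtain the number of bits by which the current coding interval size needs to be shifted left (the left-shift bit value) and the interval size of the current coding interval, where number=1,2,3…N; the arithmetic encoder processes the number left-shift bit values in parallel, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character; the arithmetic encoder processes, in parallel, the number coded characters, the number left-shift bit values, the number offsets of the current coded characters and the number pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters; and the arithmetic encoder converts the number output code streams input in parallel, in order, into a single serially output code stream.
An embodiment of the present invention provides an image encoding method, including: preprocessing an image to be processed to obtain multiple image blocks; converting each obtained image block to obtain the corresponding coded characters and coding probabilities; and inputting the coded characters and coding probabilities corresponding to the image blocks into an encoder for encoding, where the encoder includes the arithmetic encoder.
An embodiment of the present invention provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the method for implementing arithmetic coding.
An embodiment of the present invention provides a device for implementing arithmetic coding, including a memory and a processor, where the memory stores instructions executable by the processor for performing the steps of implementing arithmetic coding.
The arithmetic encoder provided in this application adopts a multi-way parallel circuit structure. It can therefore process multiple coded characters within one clock cycle, which improves throughput and speeds up processing.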
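In an illustrative example, the embodiment of this application splits the unsigned 8-bit multiplication into four unsigned 4-bit multiplications, three shift operations and three unsigned 4-bit additions. This shortens the critical path of the unsigned 8-bit multiplication and improves processing efficiency.
In an illustrative example, the circuit structure of the embodiment of this application introduces a number signal to control the number of currently valid coded characters, which makes the arithmetic encoder of this application more flexible to apply.
Other features and advantages of the present invention will be set forth in the description that follows, and in part will become apparent from the description or be learned by practicing the invention. The objectives and other advantages of the invention can be realized and obtained by the structures particularly pointed out in the description, the claims and the drawings.
Brief Description of the Drawings
The drawings are provided for a further understanding of the technical solution of this application and form part of the specification. Together with the embodiments of this application they serve to explain the technical solution, and do not limit it.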
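FIG. 1 is a schematic diagram of the structure of the arithmetic encoder of this application;
FIG. 2 is a schematic diagram of the structure of an embodiment of the first-level processing unit in the arithmetic encoder of this application;
FIG. 3 is a schematic diagram of the structure of an embodiment of the unsigned 8-bit multiplier of this application;
FIG. 4 is a schematic diagram of the structure of an embodiment of the second-level processing unit in the arithmetic encoder of this application;
FIG. 5 is a schematic flow chart of an embodiment of the processing implemented by the second-level processing unit of this application;
FIG. 6 is a schematic diagram of the structure of an embodiment of the third-level processing unit in the arithmetic encoder of this application;
FIG. 7 is a schematic flow chart of an embodiment of the processing implemented by the third-level processing unit of this application;
FIG. 8 is a schematic diagram of an embodiment of the first-in first-out queue buffer of this application;
FIG. 9 is a flowchart of the method for implementing arithmetic coding in this application.
Detailed Description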
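To make the objectives, technical solutions and advantages of this application clearer, the embodiments of this application are described in detail below with reference to the drawings. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments can be combined with one another arbitrarily.
In a typical configuration of this application, a computing device includes one or more processors (CPU), input/output interfaces, network interfaces and memory.
Memory may include non-permanent memory in computer-readable media, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in a different order.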
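Arithmetic coding is an entropy coding method; the code stream it generates can be decoded to recover the original data without distortion. Entropy coding builds on the statistical properties of a random process: source symbols with different occurrence probabilities are counted to obtain a probability distribution, and the symbols are re-encoded according to those probabilities. In other words, source symbols with higher probability are assigned shorter codewords and source symbols with lower probability are assigned longer codewords, so that overall more source symbols are represented with fewer bits. Lepton uses the VP8 binary arithmetic encoder, which encodes on the basis of an 8-bit coding probability prob. During the recursive computation of arithmetic coding, the encoder must keep the interval lower limit value lowvaule of the current interval, the interval size range of the current interval, and the bit position count. Here lowvalue and range determine the current coding interval, and count records the position within a byte of the bit output after the current coding; when a byte is full it needs to be written to the output bit stream.
The VP8 arithmetic coding process roughly includes the following. First, an unsigned 8-bit split value is calculated according to split=1+((range-1)*prob>>8); the split value can be understood as the point at which the interval is divided into sub-intervals. Then, the interval lower limit value lowvalue and the interval size range of the current coding interval are calculated from the current coded character (0 or 1) and the split value. To ensure correct coding, the interval size range must lie within [128,255]. If range falls outside this interval during the computation, the interval lower limit value lowvalue is shifted left until range is back within it, and the bits shifted out of lowvalue are written to the output bit stream. When the last character has been coded, the lowvalue value is appended to the output bit stream and coding ends.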
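The interval split described above can be sketched in a few lines of Python. This is a minimal illustration and not the circuit of this application: the function names are hypothetical, and the rule for updating range (range becomes split for character 0, range minus split for character 1) is an assumption taken from the conventional VP8 boolean encoder, since the text here states only the split formula and the lowvalue update. Renormalisation and byte output are left out.

```python
def split_value(range_: int, prob: int) -> int:
    """split = 1 + (((range - 1) * prob) >> 8) for an 8-bit range and 8-bit prob."""
    return 1 + (((range_ - 1) * prob) >> 8)


def interval_update(range_: int, lowvalue: int, prob: int, bin_: int):
    """Return (range, lowvalue) after coding one binary character, before any left shift."""
    split = split_value(range_, prob)
    if bin_:
        # Character 1: keep the upper sub-interval (assumed conventional VP8 rule).
        return range_ - split, lowvalue + split
    # Character 0: keep the lower sub-interval, lowvalue is unchanged.
    return split, lowvalue


print(interval_update(255, 0, 128, 0))  # (128, 0)
print(interval_update(255, 0, 128, 1))  # (127, 128)
```

With the initial range 255 and prob 128, coding a 1 leaves range at 127, which is below 128 and would therefore trigger a one-bit left shift in the next step.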
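In the related art, although coding with arithmetic coding technology is pipelined, the circuit structure is single-channel and serial; that is, only one coded character can be received and processed per clock cycle. When the input code stream arrives quickly, such a circuit cannot process it in time, which causes the input code stream to pile up and also limits further improvement of throughput.
This application proposes an arithmetic encoder that processes multiple coded characters in parallel in each clock cycle, making full use of every clock cycle and increasing coding speed. Furthermore, when the input code stream arrives quickly, it can be processed in time, which avoids a pile-up of the input code stream and helps raise throughput further.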
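FIG. 1 is a schematic diagram of the structure of the arithmetic encoder of this application. As shown in FIG. 1, it includes: a first-level processing unit, a second-level processing unit, a third-level processing unit and an output unit; wherein,
the first-level processing unit is configured to process N coded characters in parallel within one clock cycle, to obtain the number of bits by which the current coding interval size needs to be shifted left (the left-shift bit value) and the interval size of the current coding interval;
the second-level processing unit is configured to process the N left-shift bit values in parallel within one clock cycle, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character;
the third-level processing unit is configured to process, in parallel within one clock cycle, the N coded characters, the N left-shift bit values, the N offsets of the current coded characters and the N pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters;
the output unit is configured to convert the N output code streams input in parallel, in order, into a single serially output code stream.
Here N is an integer greater than or equal to 1.
In an illustrative example, the value of N can be fixed or can be a configurable parameter. The size of N depends on the length of a clock cycle: with a longer clock cycle N can be larger, and with a shorter clock cycle N can be smaller.
In an illustrative example, the arithmetic encoder of this application is a VP8 binary arithmetic encoder.
The arithmetic encoder provided in this application adopts a multi-way parallel circuit structure. It can therefore process multiple coded characters within one clock cycle, which improves throughput and speeds up processing.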
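FIG. 2 is a schematic diagram of the structure of an embodiment of the first-level processing unit in the arithmetic encoder of this application. As shown in FIG. 2, in an illustrative example the first-level processing unit may include: N serially connected coding interval processing modules (the first, second, third … N-th coding interval processing modules shown in FIG. 1; FIG. 2 takes only N=4 as an example), a first temporary register and a first pipeline register; wherein,
the first coding interval processing module is configured to: receive the first coded character bin_0 to be processed in the current clock cycle, the coding probability prob_0 of the first coded character, and the coding interval size range_(N-1) of the previous coded character; calculate the first split value split_0 from prob_0 and range_(N-1) (note that in the first calculation the coding interval size of the previous coded character takes an initial value, which can be set to 8'd255, for example); calculate the current coding interval size from split_0 and the first coded character bin_0; obtain from the current coding interval size, by table lookup (as shown in Table 1), the number of bits by which the current coding interval size needs to be shifted left, that is, the first left-shift bit value shift_0, and the left-shifted coding interval size value range_after_shift; and use range_after_shift as the range value output by the first coding interval processing module, namely range_0.
The second coding interval processing module is configured to: receive the second coded character bin_1 to be processed in the current clock cycle, the coding probability prob_1 of the second coded character, and the coding interval size range_0 of the previous coded character; calculate the second split value split_1 from prob_1 and range_0; calculate the current coding interval size from split_1 and the second coded character bin_1; obtain from the current coding interval size, by table lookup (as shown in Table 1), the second left-shift bit value shift_1 and the left-shifted coding interval size value range_after_shift; and use range_after_shift as the range value output by the second coding interval processing module, namely range_1.
The third coding interval processing module is configured to: receive the third coded character bin_2 to be processed in the current clock cycle, the coding probability prob_2 of the third coded character, and the coding interval size range_1 of the previous coded character; calculate the third split value split_2 from prob_2 and range_1; calculate the current coding interval size from split_2 and the third coded character bin_2; obtain from the current coding interval size, by table lookup (as shown in Table 1), the third left-shift bit value shift_2 and the left-shifted coding interval size value range_after_shift; and use range_after_shift as the range value output by the third coding interval processing module, namely range_2.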
vpx_nom
range shift range_after_shift
8′b1xxxxxxx 0 8′b1xxxxxxx
8′b01xxxxxx 1 8′b1xxxxxx0
8′b001xxxxx 2 8′b1xxxxx00
8′b0001xxxx 3 8′b1xxxx000
8′b00001xxx 4 8′b1xxx0000
8′b000001xx 5 8′b1xx00000
8′b0000001x 6 8′b1x000000
8′b00000001 7 8′b10000000
8′b00000000 0 8′b00000000
Table 1
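Table 1 is equivalent to counting the leading zeros of the 8-bit range and shifting them out. The short Python sketch below shows that reading; the function name normalize is hypothetical and the code is a software model, not the lookup-table circuit itself.

```python
def normalize(range_: int):
    """Return (shift, range_after_shift) for an 8-bit range, following Table 1."""
    if range_ == 0:
        # Last row of Table 1: an all-zero input is passed through with shift 0.
        return 0, 0
    shift = 8 - range_.bit_length()          # number of leading zeros in 8 bits
    return shift, (range_ << shift) & 0xFF   # top bit of the result is always 1


print(normalize(200))  # (0, 200)
print(normalize(127))  # (1, 254)
print(normalize(1))    # (7, 128)
```

For every non-zero input the returned range_after_shift lies in [128,255], which is the interval the coding process requires.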
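By analogy, the N-th coding interval processing module is configured to: receive the N-th coded character bin_(N-1) to be processed in the current clock cycle, the coding probability prob_(N-1) of the N-th coded character, and the coding interval size range_(N-2) of the previous coded character; calculate the N-th split value split_(N-1) from prob_(N-1) and range_(N-2); calculate the current coding interval size from split_(N-1) and the N-th coded character bin_(N-1); obtain from the current coding interval size, by table lookup (as shown in Table 1), the N-th left-shift bit value shift_(N-1) and the left-shifted coding interval size value range_after_shift; and use range_after_shift as the range value output by the N-th coding interval processing module, namely range_(N-1).
The first temporary register is configured to temporarily store the range value output by the N-th coding interval processing module, namely range_(N-1), and to output it to the first coding interval processing module in the next clock cycle.
The first pipeline register is configured to register the signals in the pipeline for one clock, storing the coded characters bin of the current clock cycle, that is, the first coded character bin_0, the second coded character bin_1, the third coded character bin_2 … the N-th coded character bin_(N-1), and to output them to the second-level processing unit.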
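It should be noted that the value of N can be fixed or can be a configurable parameter. The size of N depends on the length of a clock cycle: with a longer clock cycle N can be larger, and with a shorter clock cycle N can be smaller.
Performing the unsigned 8-bit multiplication directly has a long critical path, which lowers the maximum clock frequency, that is, reduces the number of clock cycles in a given time, and so lowers processing efficiency. To shorten the critical path of the first-level pipeline structure, that is, the first-level processing unit, the embodiment of this application computes the N-th split value split_(N-1) from the received coding probability prob_(N-1) of the N-th coded character and the coding interval size range_(N-2) of the previous coded character according to split=1+((range-1)*prob>>8) as follows. In an illustrative example, as shown in FIG. 3, the unsigned 8-bit multiplication is split into four unsigned 4-bit multiplications, three shift operations and three unsigned 4-bit additions, where each unsigned 4-bit multiplication can be implemented by a lookup table. In FIG. 3, the symbol "×" or "*" denotes multiplication and the symbol "+" denotes addition; A denotes one unsigned 8-bit multiplier and B denotes the other; A_Hi (4-bit) is the unsigned 4-bit high part split from A and A_Lo (4-bit) is the unsigned 4-bit low part split from A; B_Hi (4-bit) is the unsigned 4-bit high part split from B and B_Lo (4-bit) is the unsigned 4-bit low part split from B; different line styles denote different unsigned multiplications.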
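The decomposition can be checked numerically. The Python sketch below is a software model under stated assumptions: the names MUL4, mul8_split and split_from_lut are hypothetical, the 16×16 table stands in for the hardware lookup tables, and the exact wiring and adder widths of FIG. 3 are not modelled. It uses the identity A*B = (A_Hi*B_Hi << 8) + (A_Hi*B_Lo << 4) + (A_Lo*B_Hi << 4) + A_Lo*B_Lo, which takes four 4-bit products, three shifts and three additions.

```python
# 16 x 16 lookup table of 4-bit by 4-bit products (stand-in for the hardware tables).
MUL4 = [[a * b for b in range(16)] for a in range(16)]


def mul8_split(a: int, b: int) -> int:
    """Unsigned 8-bit x 8-bit product built from four 4-bit table lookups."""
    a_hi, a_lo = a >> 4, a & 0xF
    b_hi, b_lo = b >> 4, b & 0xF
    hh = MUL4[a_hi][b_hi]
    hl = MUL4[a_hi][b_lo]
    lh = MUL4[a_lo][b_hi]
    ll = MUL4[a_lo][b_lo]
    # Three shifts and three additions recombine the partial products.
    return (hh << 8) + (hl << 4) + (lh << 4) + ll


def split_from_lut(range_: int, prob: int) -> int:
    """split = 1 + (((range - 1) * prob) >> 8), using the decomposed multiplier."""
    return 1 + (mul8_split(range_ - 1, prob) >> 8)


print(mul8_split(254, 128))      # 32512
print(split_from_lut(255, 128))  # 128
```

In hardware the four lookups can run side by side, so the delay is one table access plus the adder chain rather than a full 8-bit multiplier.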
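In an illustrative example, the first-level processing unit may further include a first data selector.
The first data selector operates under the control of the settable signal number. When number=N, the N coding interval processing modules process in parallel, so the first temporary register stores the range value output by the N-th coding interval processing module, namely range_(N-1); …; when number=4, four coding interval processing modules process in parallel, so the first temporary register stores the range value output by the fourth coding interval processing module, namely range_3; when number=3, three coding interval processing modules process in parallel, so the first temporary register stores the range value output by the third coding interval processing module, namely range_2; and so on.
In particular, when number=0, none of the coding interval processing modules processes anything, so the range value held in the first temporary register stays unchanged.
In this embodiment of the arithmetic encoder, the circuit structure introduces the number signal to control the number of currently valid coded characters. The number of coded characters that the VP8 binary arithmetic encoder can process in each clock cycle is therefore configurable (for example, N), which makes the arithmetic encoder of this application more flexible to apply.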
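One clock cycle of the first-level processing unit, including the effect of the number signal, can be modelled as below. This is a behavioural sketch only: stage1_cycle and its argument names are hypothetical, the range update for a coded 1 (range minus split) is assumed from the conventional VP8 boolean encoder, and the table lookup is replaced by a leading-zero count.

```python
def stage1_cycle(range_reg: int, bins, probs, number: int):
    """Process the first `number` coded characters of one cycle.

    Returns (shifts, ranges, new_range_reg), where new_range_reg is the value the
    first temporary register would hold for the next clock cycle.
    """
    shifts, ranges = [], []
    rng = range_reg
    for i in range(number):
        split = 1 + (((rng - 1) * probs[i]) >> 8)
        rng = rng - split if bins[i] else split   # assumed conventional VP8 rule
        shift = 8 - rng.bit_length()              # leading zeros, as in Table 1
        rng = (rng << shift) & 0xFF               # range_after_shift
        shifts.append(shift)
        ranges.append(rng)
    # With number == 0 the loop does not run and the register value is kept.
    return shifts, ranges, rng


print(stage1_cycle(255, [0, 1, 0, 1], [128] * 4, 4))  # ([0, 1, 1, 1], [128, 128, 128, 128], 128)
print(stage1_cycle(255, [0, 1, 0, 1], [128] * 4, 0))  # ([], [], 255)
```

The loop makes the serial dependency visible: module i needs the range produced by module i-1, which is why the modules are chained within the cycle and why the length of that chain limits how large N can be for a given clock period.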
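It should be noted that, in addition to the signals mentioned above, the top-level interface of the first-level pipeline structure, that is, the first-level processing unit, is shown in Table 2:
Figure PCTCN2021071024-appb-000001
Table 2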
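FIG. 4 is a schematic diagram of the structure of an embodiment of the second-level processing unit in the arithmetic encoder of this application. As shown in FIG. 4, in an illustrative example the second-level processing unit may include: N serially connected coding position processing modules (the first, second, third … N-th coding position processing modules shown in FIG. 1; FIG. 4 takes only N=4 as an example), a second temporary register and a second pipeline register; wherein,
the first coding position processing module is configured to receive the first left-shift bit value shift_0 of the current clock cycle and the bit position within a byte of the bit output after the previous coding, count_(N-1) (note that in the first calculation this initial value can be set to -32'd24, for example), and to calculate the bit position after coding the current character, that is, the first bit position value count_0, according to count_i=count_(i-1)+shift_i (here i=0 and the previous bit position is count_(N-1); see step 500 in FIG. 5); it then determines the first offset offset_0 of the current coded character according to the count_0 value, and updates the first left-shift bit value shift_0 and the first bit position value count_0.
The second coding position processing module is configured to receive the second left-shift bit value shift_1 of the current clock cycle and the bit position count_0 within a byte of the bit output after the previous coding, and to calculate the bit position after coding the current character, that is, the second bit position value count_1, according to count_i=count_(i-1)+shift_i (here i=1; see step 500 in FIG. 5); it then determines the second offset offset_1 of the current coded character according to the count_1 value, and updates the second left-shift bit value shift_1 and the second bit position value count_1.
The third coding position processing module is configured to receive the third left-shift bit value shift_2 of the current clock cycle and the bit position count_1 within a byte of the bit output after the previous coding, and to calculate the bit position after coding the current character, that is, the third bit position value count_2, according to count_i=count_(i-1)+shift_i (here i=2; see step 500 in FIG. 5); it then determines the third offset offset_2 of the current coded character according to the count_2 value, and updates the third left-shift bit value shift_2 and the third bit position value count_2.
By analogy, the N-th coding position processing module is configured to receive the N-th left-shift bit value shift_(N-1) of the current clock cycle and the bit position count_(N-2) within a byte of the bit output after the previous coding, and to calculate the bit position after coding the current character, that is, the N-th bit position value count_(N-1), according to count_i=count_(i-1)+shift_i (here i=N-1; see step 500 in FIG. 5); it then determines the N-th offset offset_(N-1) of the current coded character according to the count_(N-1) value, and updates the N-th left-shift bit value shift_(N-1) and the N-th bit position value count_(N-1).
The second temporary register is configured to temporarily store the count value output by the N-th coding position processing module, namely count_(N-1), and to output it to the first coding position processing module in the next clock cycle.
The second pipeline register is configured to register the signals in the pipeline for one clock, storing the coded characters bin of the current clock cycle, that is, the first coded character bin_0, the second coded character bin_1, the third coded character bin_2 … the N-th coded character bin_(N-1), together with the split values of the current clock cycle, that is, the first split value split_0, the second split value split_1, the third split value split_2 … the N-th split value split_(N-1), and to output them to the third-level processing unit.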
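FIG. 5 is a schematic flow chart of an embodiment of the processing implemented by the second-level processing unit of this application. In an illustrative example, as shown in FIG. 5, determining the offset offset_(i-1) of the current coded character according to the count_(i-1) value, and updating the i-th left-shift bit value shift_(i-1) and the i-th bit position value count_(i-1), i=1,2,3…N, includes:
checking the count_(i-1) value. If count_(i-1)<0 (that is, the bit position flag information count_flag_(i-1)=1, as shown in step 501 in FIG. 5), then, as shown in step 502 in FIG. 5, the offset offset_(i-1) of the current coded character is 0, and the values of shift_(i-1) and count_(i-1) remain unchanged. If count_(i-1)≥0 (that is, the bit position flag information count_flag_(i-1)=0, as shown in step 501 in FIG. 5), then, as shown in step 503 in FIG. 5, the offset offset_(i-1) of the current coded character equals the value of shift_(i-1) minus the value of count_(i-1), shift_(i-1) is set to the value of count_(i-1), and count_(i-1) is updated to its value minus 8.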
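The rule of steps 500 to 503 is small enough to state as one function. The Python sketch below follows the text literally; the function name position_update and the tuple layout of the result are hypothetical.

```python
def position_update(prev_count: int, shift: int):
    """One coding position processing module.

    Returns (count, shift, offset, count_flag) after the update of steps 500-503.
    """
    count = prev_count + shift          # step 500
    if count < 0:
        # Steps 501/502: byte not yet complete, nothing changes and offset is 0.
        return count, shift, 0, 1
    # Step 503: a byte is ready.
    offset = shift - count
    return count - 8, count, offset, 0


print(position_update(-24, 1))  # (-23, 1, 0, 1)
print(position_update(-2, 3))   # (-7, 1, 2, 0)
```

In the second call the shift of 3 is split in two parts: 2 bits (the offset) complete the current byte, and the remaining 1 bit (the updated shift) is applied after the byte has been taken out.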
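In an illustrative example, the second-level processing unit may further include a second data selector.
The second data selector operates under the control of the settable signal number. When number=N, the N coding position processing modules process in parallel, so the second temporary register stores the count value output by the N-th coding position processing module, namely count_(N-1); …; when number=4, four coding position processing modules process in parallel, so the second temporary register stores the count value output by the fourth coding position processing module, namely count_3; when number=3, three coding position processing modules process in parallel, so the second temporary register stores the count value output by the third coding position processing module, namely count_2; and so on.
In this embodiment of the arithmetic encoder, the circuit structure introduces the number signal to control the number of currently valid coded characters. The number of coded characters that the VP8 binary arithmetic encoder can process in each clock cycle is therefore configurable (for example, N), which makes the arithmetic encoder of this application more flexible to apply.
It should be noted that, in addition to the signals mentioned above, the top-level interface of the second-level pipeline structure, that is, the second-level processing unit, is shown in Table 3: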
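Name Bit In/Out Description
clk 1 In System clock signal
rst_n 1 In System reset signal
en 1 In Module enable signal: 0 = module idle, 1 = module active
enable_i 1 In Input enable signal: 0 = input signals invalid, 1 = input signals valid
number_i 3 In Number of currently valid coded characters (≤3'd4)
bin_0_i 1 In Coded character 0
split_0_i 8 In split value of coded character 0
shift_0_i 4 In shift value of coded character 0
bin_1_i 1 In Coded character 1
split_1_i 8 In split value of coded character 1
shift_1_i 4 In shift value of coded character 1
bin_2_i 1 In Coded character 2
split_2_i 8 In split value of coded character 2
shift_2_i 4 In shift value of coded character 2
bin_3_i 1 In Coded character 3
split_3_i 8 In split value of coded character 3
shift_3_i 4 In shift value of coded character 3
enable_o 1 Out Output enable signal: 0 = output signals invalid, 1 = output signals valid
number_o 1 Out Number of currently valid coded characters (≤3'd4)
bin_0_o 1 Out Coded character 0
split_0_o 8 Out split value of coded character 0
count_flag_0_o 1 Out Sign bit of the count value of coded character 0
shift_0_o 4 Out shift value of coded character 0
offset_0_o 32 Out offset value of coded character 0
bin_1_o 1 Out Coded character 1
split_1_o 8 Out split value of coded character 1
count_flag_1_o 1 Out Sign bit of the count value of coded character 1
shift_1_o 4 Out shift value of coded character 1
offset_1_o 32 Out offset value of coded character 1
bin_2_o 1 Out Coded character 2
split_2_o 8 Out split value of coded character 2
count_flag_2_o 1 Out Sign bit of the count value of coded character 2
shift_2_o 4 Out shift value of coded character 2
offset_2_o 32 Out offset value of coded character 2
bin_3_o 1 Out Coded character 3
split_3_o 8 Out split value of coded character 3
count_flag_3_o 1 Out Sign bit of the count value of coded character 3
shift_3_o 4 Out shift value of coded character 3
offset_3_o 32 Out offset value of coded character 3
Table 3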
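FIG. 6 is a schematic diagram of the structure of an embodiment of the third-level processing unit in the arithmetic encoder of this application. As shown in FIG. 6, in an illustrative example the third-level processing unit may include: N serially connected coding interval limit processing modules (the first, second, third … N-th coding interval limit processing modules shown in FIG. 1; FIG. 6 takes only N=4 as an example) and a third temporary register; wherein,
the first coding interval limit processing module is configured to receive the first coded character bin_0, the first left-shift bit value shift_0, the first offset offset_0 and the first bit position flag information count_flag_0 of the current clock cycle, together with the coding interval lower limit value lowvaule_(N-1) of the previous coded character (note that in the first calculation this initial value can be set to 32'd0, for example), and then to determine the first coding interval lower limit value lowvaule_0 and the first output code stream data_0 according to the current first coded character bin_0 and the first bit position flag information count_flag_0.
The second coding interval limit processing module is configured to receive the second coded character bin_1, the second left-shift bit value shift_1, the second offset offset_1 and the second bit position flag information count_flag_1 of the current clock cycle, together with the coding interval lower limit value lowvaule_0 of the previous coded character, and then to determine the second coding interval lower limit value lowvaule_1 and the second output code stream data_1 according to the current second coded character bin_1 and the second bit position flag information count_flag_1.
The third coding interval limit processing module is configured to receive the third coded character bin_2, the third left-shift bit value shift_2, the third offset offset_2 and the third bit position flag information count_flag_2 of the current clock cycle, together with the coding interval lower limit value lowvaule_1 of the previous coded character, and then to determine the third coding interval lower limit value lowvaule_2 and the third output code stream data_2 according to the current third coded character bin_2 and the third bit position flag information count_flag_2.
By analogy, the N-th coding interval limit processing module is configured to receive the N-th coded character bin_(N-1), the N-th left-shift bit value shift_(N-1), the N-th offset offset_(N-1) and the N-th bit position flag information count_flag_(N-1) of the current clock cycle, together with the coding interval lower limit value lowvaule_(N-2) of the previous coded character, and then to determine the N-th coding interval lower limit value lowvaule_(N-1) and the N-th output code stream data_(N-1) according to the current N-th coded character bin_(N-1) and the N-th bit position flag information count_flag_(N-1).
The third temporary register is configured to temporarily store the count_flag value output by the N-th coding interval limit processing module, namely count_flag_(N-1), and to output it to the first coding interval limit processing module in the next clock cycle.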
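FIG. 7 is a schematic flow chart of an embodiment of the processing implemented by the third-level processing unit of this application. In an illustrative example, as shown in FIG. 7, determining the coding interval lower limit value lowvaule_(i-1) of the coded character and the output code stream data_(i-1) according to the current coded character bin_(i-1) and the bit position flag information count_flag_(i-1), i=1,2,3…N, includes:
First, the coding interval lower limit value lowvaule_(i-1) of the current coded character is updated according to the current coded character bin_(i-1). If bin_(i-1)=0, as shown in step 702 in FIG. 7, lowvaule_(i-1) keeps the lowvaule value of the previous coded character, that is, the value of lowvaule_(i-1) does not change. If bin_(i-1)=1, as shown in step 701 in FIG. 7, lowvaule_(i-1) is updated to the sum of the lowvaule value of the previous coded character and split_(i-1) of the current coded character, that is, lowvaule_(i-1)=lowvaule_(i-1)+split_(i-1);
Then, the output code stream is determined, and the coding interval lower limit value lowvaule_(i-1) of the current coded character is updated further, according to the bit position flag information count_flag_(i-1) of the current coded character. If count_(i-1)<0 (that is, count_flag_(i-1)=1), as shown in step 705 in FIG. 7, no code stream is output, and lowvaule_(i-1) of the current character is updated to the lowvalue value shifted left by shift_(i-1) bits, that is, lowvaule_(i-1)=lowvaule_(i-1)<<shift_(i-1). If count_(i-1)≥0 (that is, count_flag_(i-1)=0), as shown in steps 704 and 705 in FIG. 7, the output code stream data_(i-1) equals lowvalue_(i-1) shifted right by (24-offset_(i-1)) bits, that is, data_(i-1)=lowvalue_(i-1)>>(24-offset_(i-1)), and lowvalue_(i-1) of the current coded character is updated to lowvalue_(i-1) shifted left by (offset_(i-1)+shift_(i-1)) bits, that is, lowvalue_(i-1)=lowvalue_(i-1)<<(offset_(i-1)+shift_(i-1)).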
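The two-step update of FIG. 7 can be sketched as follows. Several points are assumptions rather than statements of the text: the lower limit is modelled as a 32-bit register (suggested by the initial value 32'd0), the output is truncated to 8 bits (suggested by the 8-bit code streams of the output unit), shift and offset are the values already updated by the second-level processing unit, and carry propagation into bytes that were output earlier is not modelled because the text does not describe it. The name low_update is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width


def low_update(prev_low: int, bin_: int, split: int, shift: int, offset: int, count_flag: int):
    """One coding interval limit processing module.

    Returns (low, data), where data is None when no byte is output.
    """
    low = prev_low
    if bin_ == 1:
        # Step 701: move the lower limit up by split.
        low = (low + split) & MASK32
    if count_flag == 1:
        # count < 0: no output, only the left shift.
        return (low << shift) & MASK32, None
    # count >= 0: take one byte out, then shift by offset + shift.
    data = (low >> (24 - offset)) & 0xFF
    low = (low << (offset + shift)) & MASK32
    return low, data


print(low_update(0, 1, 128, 1, 0, 1))           # (256, None)
print(low_update(0x00AB0000, 0, 0, 1, 2, 0))    # (89653248, 2)
```

The second call returns 89653248, which is 0x05580000: the byte value 2 has been taken from the top of the register and the remaining bits have moved up by offset + shift = 3 positions.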
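In an illustrative example, the third-level processing unit may further include a third data selector.
The third data selector operates under the control of the settable signal number. When number=N, the N coding interval limit processing modules process in parallel, so the third temporary register stores the count_flag value output by the N-th coding interval limit processing module, namely count_flag_(N-1); …; when number=4, four coding interval limit processing modules process in parallel, so the third temporary register stores the count_flag value output by the fourth coding interval limit processing module, namely count_flag_3; when number=3, three coding interval limit processing modules process in parallel, so the third temporary register stores the count_flag value output by the third coding interval limit processing module, namely count_flag_2; and so on.
In this embodiment of the arithmetic encoder, the circuit structure introduces the number signal to control the number of currently valid coded characters. The number of coded characters that the VP8 binary arithmetic encoder can process in each clock cycle is therefore configurable (for example, N), which makes the arithmetic encoder of this application more flexible to apply.
It should be noted that, in addition to the signals mentioned above, the top-level interface of the third-level pipeline structure, that is, the third-level processing unit, is shown in Table 4:
Figure PCTCN2021071024-appb-000002
Figure PCTCN2021071024-appb-000003
Table 4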
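In an illustrative example, the circuit structure of the output unit of this application, that is, the fourth-level pipeline structure, may include a first-in first-out queue buffer data_refineFIFO, which is used to convert the N 8-bit code streams input in parallel, in order, into a single serially output 8-bit code stream.
In an illustrative example, FIG. 8 shows a schematic diagram of an embodiment of the first-in first-out queue buffer data_refineFIFO. In this embodiment, data_refineFIFO has a depth of 4 and a width of (2+32) bits, where the first 2 bits store the number of valid code streams and the last 32 bits store the corresponding valid code streams. When the write enable signal wr_enable is valid, the entry addressed by the write pointer wr_point (num[wr_point] and data[wr_point]) is assigned, and the write pointer wr_point is incremented by 1. When the read enable signal rd_enable is valid, a value is taken from the entry addressed by the read pointer rd_point (num[rd_point] and data[rd_point]) according to the counter data_cnt: when data_cnt=0 the output code stream is data[rd_point][7:0], when data_cnt=1 the output code stream is data[rd_point][15:8], and so on until data_cnt=num[rd_point]; after the code stream is output, the read pointer rd_point is incremented by 1.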
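The read and write behaviour of data_refineFIFO can be modelled in software as below. This is a sketch under assumptions: the class name and the occupancy counter used are hypothetical, the 2-bit num field is taken to hold the number of valid bytes minus one (so that reading stops when data_cnt equals num[rd_point]), entries with no valid byte are assumed not to be written, and the first valid byte is assumed to sit in bits [7:0].

```python
class DataRefineFifo:
    """Software model of a depth-4 FIFO whose entries hold up to four bytes."""

    DEPTH = 4

    def __init__(self):
        self.num = [0] * self.DEPTH    # assumed: valid bytes minus one (2 bits)
        self.data = [0] * self.DEPTH   # 32-bit payload, byte k in bits [8k+7:8k]
        self.wr_point = 0
        self.rd_point = 0
        self.data_cnt = 0
        self.used = 0                  # hypothetical occupancy counter

    def write(self, stream_bytes):
        """wr_enable: store 1 to 4 bytes produced in one clock cycle."""
        assert 1 <= len(stream_bytes) <= 4 and self.used < self.DEPTH
        word = 0
        for k, b in enumerate(stream_bytes):
            word |= (b & 0xFF) << (8 * k)
        self.num[self.wr_point] = len(stream_bytes) - 1
        self.data[self.wr_point] = word
        self.wr_point = (self.wr_point + 1) % self.DEPTH
        self.used += 1

    def read(self):
        """rd_enable: return the next byte, or None when the FIFO is empty."""
        if self.used == 0:
            return None
        byte = (self.data[self.rd_point] >> (8 * self.data_cnt)) & 0xFF
        if self.data_cnt == self.num[self.rd_point]:
            # Last valid byte of this entry: advance the read pointer.
            self.data_cnt = 0
            self.rd_point = (self.rd_point + 1) % self.DEPTH
            self.used -= 1
        else:
            self.data_cnt += 1
        return byte


fifo = DataRefineFifo()
fifo.write([0x11, 0x22])
fifo.write([0x33])
print([fifo.read() for _ in range(4)])  # [17, 34, 51, None]
```

One entry is written per clock cycle but may take several cycles to drain, so the depth of 4 gives the serial side some slack when consecutive cycles each produce several bytes.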
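It should be noted that, in addition to the signals mentioned above, the top-level interface of the fourth-level pipeline structure, that is, the output processing unit, is shown in Table 5:
Figure PCTCN2021071024-appb-000004
Figure PCTCN2021071024-appb-000005
Table 5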
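FIG. 9 is a flowchart of the method for implementing arithmetic coding in this application. As shown in FIG. 9, within one clock cycle, the method includes:
Step 900: The arithmetic encoder processes N coded characters in parallel, to obtain the number of bits by which the current coding interval size needs to be shifted left and the interval size of the current coding interval.
In an illustrative example, step 900 may include:
receiving the i-th coded character bin_(i-1) to be processed in the current clock cycle, the coding probability prob_(i-1) of the i-th coded character, and the coding interval size range_(i-2) of the previous coded character;
calculating the i-th split value split_(i-1) from the received coding probability prob_(i-1) of the i-th coded character and the coding interval size range_(i-2) of the previous coded character;
calculating the current coding interval size from the calculated i-th split value split_(i-1) and the i-th coded character bin_(i-1);
obtaining from the current coding interval size, by table lookup (as shown in Table 1), the number of bits by which the current coding interval size needs to be shifted left, that is, the i-th left-shift bit value shift_(i-1), and the left-shifted coding interval size value range_after_shift; and using range_after_shift as the range value output by the i-th coding interval processing module, namely range_(i-1).
Here i=1,2,3…N.
It should be noted that in the first calculation the coding interval size of the previous coded character takes an initial value, which can be set to 8'd255, for example.
In an illustrative example, calculating the N-th split value split_(N-1) according to the formula split=1+((range-1)*prob>>8) may include:
splitting the unsigned 8-bit multiplication into four unsigned 4-bit multiplications, three shift operations and three unsigned 4-bit additions, where each unsigned 4-bit multiplication can be implemented by a lookup table.
In an illustrative example, step 900 further includes:
processing number channels of coded characters in parallel according to the settable signal number, where number=1,2,3…N. This makes the arithmetic encoder of this application more flexible to apply.
In particular, when number=0, none of the coding interval processing modules processes anything, so the range value held in the first temporary register stays unchanged.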
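In an illustrative example, this application further provides a method for implementing arithmetic coding, which, within one clock cycle, includes:
the arithmetic encoder, according to the settable signal number, processes number channels of coded characters in parallel, to obtain the number of bits by which the current coding interval size needs to be shifted left (the left-shift bit value) and the interval size of the current coding interval, where number=1,2,3…N;
the arithmetic encoder processes the number left-shift bit values in parallel, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character;
the arithmetic encoder processes, in parallel, the number coded characters, the number left-shift bit values, the number offsets of the current coded characters and the number pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters;
the arithmetic encoder converts the number output code streams input in parallel, in order, into a single serially output code stream.
Step 901: The arithmetic encoder processes, in parallel, the N left-shift bit values of the current coding interval size, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character.
In an illustrative example, step 901 may include:
receiving the i-th left-shift bit value shift_(i-1) of the current clock cycle and the bit position count_(i-2) within a byte of the bit output after the previous coding;
calculating the bit position after coding the current character, that is, the i-th bit position value count_(i-1), according to count_(i-1)=count_(i-2)+shift_(i-1);
determining the i-th offset offset_(i-1) of the current coded character according to the count_(i-1) value, and updating the i-th left-shift bit value shift_(i-1) and the i-th bit position value count_(i-1).
Here i=1,2,3…N.
It should be noted that in the first calculation the initial value of the bit position within a byte of the bit output after the previous coding can be set to -32'd24, for example.
In an illustrative example, determining the i-th offset offset_(i-1) of the current coded character according to the count_(i-1) value, and updating the i-th left-shift bit value shift_(i-1) and the i-th bit position value count_(i-1), includes:
checking the count_(i-1) value. If count_(i-1)<0 (that is, the bit position flag information count_flag_(i-1)=1), the offset offset_(i-1) of the current coded character is 0, and the values of shift_(i-1) and count_(i-1) remain unchanged. If count_(i-1)≥0 (that is, the bit position flag information count_flag_(i-1)=0), the offset offset_(i-1) of the current coded character equals the value of shift_(i-1) minus the value of count_(i-1), shift_(i-1) is set to the value of count_(i-1), and count_(i-1) is updated to its value minus 8.
In an illustrative example, step 901 further includes:
processing number channels of coding positions in parallel according to the settable signal number, where number=1,2,3…N. This makes the arithmetic encoder of this application more flexible to apply.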
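Step 902: The arithmetic encoder processes, in parallel, the N coded characters, the N left-shift bit values of the current coding interval size, the N offsets of the current coded characters and the N pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters.
In an illustrative example, step 902 may include:
receiving the i-th coded character bin_(i-1), the i-th left-shift bit value shift_(i-1), the i-th offset offset_(i-1) and the i-th bit position flag information count_flag_(i-1) of the current clock cycle, together with the coding interval lower limit value lowvaule_(i-2) of the previous coded character;
determining the i-th coding interval lower limit value lowvaule_(i-1) and the i-th output code stream data_(i-1) according to the current i-th coded character bin_(i-1) and the i-th bit position flag information count_flag_(i-1).
Here i=1,2,3…N.
In an illustrative example, determining the i-th coding interval lower limit value lowvaule_(i-1) and the i-th output code stream data_(i-1) according to the current i-th coded character bin_(i-1) and the i-th bit position flag information count_flag_(i-1), where i=1,2,3…N, includes:
updating the coding interval lower limit value lowvaule_(i-1) of the current coded character according to the current coded character bin_(i-1): if bin_(i-1)=0, lowvaule_(i-1) keeps the lowvaule value of the previous coded character, that is, the value of lowvaule_(i-1) does not change; if bin_(i-1)=1, lowvaule_(i-1) is updated to the sum of the lowvaule value of the previous coded character and split_(i-1) of the current coded character, that is, lowvaule_(i-1)=lowvaule_(i-1)+split_(i-1);
determining the output code stream, and further updating the coding interval lower limit value lowvaule_(i-1) of the current coded character, according to the bit position flag information count_flag_(i-1) of the current coded character: if count_(i-1)<0 (that is, count_flag_(i-1)=1), no code stream is output, and lowvaule_(i-1) of the current coded character is updated to the lowvalue value shifted left by shift_(i-1) bits, that is, lowvaule_(i-1)=lowvaule_(i-1)<<shift_(i-1); if count_(i-1)≥0 (that is, count_flag_(i-1)=0), the output code stream data_(i-1) equals lowvalue_(i-1) shifted right by (24-offset_(i-1)) bits, that is, data_(i-1)=lowvalue_(i-1)>>(24-offset_(i-1)), and lowvalue_(i-1) of the current coded character is updated to lowvalue_(i-1) shifted left by (offset_(i-1)+shift_(i-1)) bits, that is, lowvalue_(i-1)=lowvalue_(i-1)<<(offset_(i-1)+shift_(i-1)).
In an illustrative example, step 902 may further include:
processing number channels of coding interval limits in parallel according to the settable signal number, where number=1,2,3…N. This makes the arithmetic encoder of this application more flexible to apply.
Step 903: The arithmetic encoder converts the N output code streams input in parallel, in order, into a single serially output code stream.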
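In an illustrative example, the value of N can be fixed or can be a configurable parameter. The size of N depends on the length of a clock cycle: with a longer clock cycle N can be larger, and with a shorter clock cycle N can be smaller.
In an illustrative example, the arithmetic encoder of this application is a VP8 binary arithmetic encoder.
The VP8 binary arithmetic encoder provided in this application adopts an N-way parallel circuit structure. It can therefore process at most N coded characters within one clock cycle, which improves throughput and speeds up processing.
The method for implementing arithmetic coding in the embodiments of this application can process multiple coded characters in parallel within one clock cycle, which improves throughput and speeds up processing.
This application further provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to execute any one of the above methods for implementing arithmetic coding.
This application further provides a device for implementing arithmetic coding, including a memory and a processor, where the memory stores instructions executable by the processor for performing the steps of any one of the above methods for implementing arithmetic coding.
This application further provides an image encoding method, including:
preprocessing an image to be processed to obtain multiple image blocks;
performing conversion processing on each obtained image block to obtain coded characters and coding probabilities;
inputting the coded characters and coding probabilities corresponding to the image blocks into an encoder for encoding;
where the encoder includes the arithmetic encoder described in any one of the embodiments of this application.
It should be noted that preprocessing the image to be processed to obtain multiple image blocks, and performing conversion processing on the obtained image blocks to obtain coded characters and coding probabilities, can be accomplished using related technologies; the specific implementation is not intended to limit the scope of protection of this application.
Although the embodiments disclosed in this application are as described above, the content described is only an implementation adopted to facilitate understanding of this application and is not intended to limit it. Any person skilled in the art to which this application belongs may make any modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed in this application, but the scope of patent protection of this application shall still be defined by the appended claims.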

Claims (26)

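  1. An arithmetic encoder, comprising: a first-level processing unit, a second-level processing unit, a third-level processing unit and an output unit; wherein,
    the first-level processing unit is configured to process N coded characters in parallel within one clock cycle, to obtain the number of bits by which the current coding interval size needs to be shifted left (the left-shift bit value) and the interval size of the current coding interval;
    the second-level processing unit is configured to process the N left-shift bit values in parallel within one clock cycle, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character;
    the third-level processing unit is configured to process, in parallel within one clock cycle, the N coded characters, the N left-shift bit values, the N offsets of the current coded characters and the N pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters;
    the output unit is configured to convert the N output code streams input in parallel, in order, into a single serially output code stream;
    wherein N is an integer greater than or equal to 1.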
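  2. The arithmetic encoder according to claim 1, wherein the first-level processing unit comprises: N serially connected coding interval processing modules, namely an i-th coding interval processing module, i=1,2,3…N, as well as a first temporary register and a first pipeline register; wherein,
    the i-th coding interval processing module is configured to: receive the i-th coded character bin_(i-1) to be processed in the current clock cycle, the coding probability prob_(i-1) of the i-th coded character, and the coding interval size range_(i-2) of the previous coded character; calculate the i-th split value split_(i-1) from prob_(i-1) and range_(i-2); calculate the current coding interval size from split_(i-1) and the i-th coded character bin_(i-1); obtain from the current coding interval size, by table lookup, the number of bits shift_(i-1) by which the current coding interval size needs to be shifted left and the left-shifted coding interval size value range_after_shift; and use range_after_shift as the range value range_(i-1) output by the i-th coding interval processing module;
    the first temporary register is configured to temporarily store the range value range_(i-1) output by the i-th coding interval processing module, and to output it to the first coding interval processing module in the next clock cycle;
    the first pipeline register is configured to register the signals in the pipeline for one clock, storing the coded characters bin of the current clock cycle, namely the i-th coded character bin_(i-1), and to output them to the second-level processing unit.
  3. The arithmetic encoder according to claim 2, further comprising: a first data selector;
    the first data selector is configured to process, under the control of a settable signal number, number channels of coded characters in parallel, where number=1,2,3…N.
  4. The arithmetic encoder according to claim 2 or 3, wherein,
    the i-th split value split_(i-1) is calculated by splitting the unsigned 8-bit multiplication into four unsigned 4-bit multiplications, three shift operations and three unsigned 4-bit additions, where each unsigned 4-bit multiplication is implemented by a lookup table.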
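  5. The arithmetic encoder according to claim 1, wherein the second-level processing unit comprises: N serially connected coding position processing modules, namely an i-th coding position processing module, i=1,2,3…N, as well as a second temporary register and a second pipeline register; wherein,
    the i-th coding position processing module is configured to receive the i-th left-shift bit value shift_(i-1) of the current clock cycle and the bit position count_(i-2) within a byte of the bit output after the previous coding, and to calculate the bit position count_(i-1) after coding the current character according to count_(i-1)=count_(i-2)+shift_(i-1); and to determine the i-th offset offset_(i-1) of the current coded character according to the count_(i-1) value, and update the i-th left-shift bit value shift_(i-1) and the i-th bit position value count_(i-1);
    the second temporary register is configured to temporarily store the count value count_(i-1) output by the i-th coding position processing module, and to output it to the first coding position processing module in the next clock cycle;
    the second pipeline register is configured to register the signals in the pipeline for one clock, storing the coded characters bin of the current clock cycle, namely the i-th coded character bin_(i-1), and the i-th split value split_(i-1) of the current clock cycle, and to output them to the third-level processing unit.
  6. The arithmetic encoder according to claim 5, further comprising: a second data selector;
    the second data selector is configured to process, under the control of a settable signal number, i channels of coding positions in parallel.
  7. The arithmetic encoder according to claim 5 or 6, wherein determining the offset offset_(i-1) of the current coded character according to the count_(i-1) value, and updating the i-th left-shift bit value shift_(i-1) and the i-th bit position value count_(i-1), comprises:
    checking the count_(i-1) value; if count_(i-1)<0, the offset offset_(i-1) of the current coded character is 0, and the values of shift_(i-1) and count_(i-1) remain unchanged; if count_(i-1)≥0, the offset offset_(i-1) of the current coded character equals the value of shift_(i-1) minus the value of count_(i-1), shift_(i-1) is set to the value of count_(i-1), and count_(i-1) is updated to its value minus 8.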
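  8. The arithmetic encoder according to claim 1, wherein the third-level processing unit comprises: N serially connected coding interval limit processing modules, namely an i-th coding interval limit processing module, i=1,2,3…N, as well as a third temporary register; wherein,
    the i-th coding interval limit processing module is configured to receive the i-th coded character bin_(i-1), the i-th left-shift bit value shift_(i-1), the i-th offset offset_(i-1) and the i-th bit position flag information count_flag_(i-1) of the current clock cycle, together with the coding interval lower limit value lowvaule_(i-2) of the previous coded character; and to determine the i-th coding interval lower limit value lowvaule_(i-1) and the i-th output code stream data_(i-1) according to the current i-th coded character bin_(i-1) and the i-th bit position flag information count_flag_(i-1);
    the third temporary register is configured to temporarily store count_flag_(i-1) output by the i-th coding interval limit processing module, and to output it to the first coding interval limit processing module in the next clock cycle.
  9. The arithmetic encoder according to claim 8, further comprising: a third data selector;
    the third data selector is configured to process, under the control of a settable signal number, i channels of coding interval limits in parallel.
  10. The arithmetic encoder according to claim 8 or 9, wherein determining the i-th coding interval lower limit value lowvaule_(i-1) and the i-th output code stream data_(i-1) according to the current i-th coded character bin_(i-1) and the i-th bit position flag information count_flag_(i-1) comprises:
    updating the coding interval lower limit value lowvaule_(i-1) of the current coded character according to the current coded character bin_(i-1): if bin_(i-1)=0, lowvaule_(i-1) of the current coded character does not change; if bin_(i-1)=1, lowvaule_(i-1) of the current coded character is updated to the sum of the lowvaule value of the previous coded character and split_(i-1) of the current coded character;
    determining the output code stream and updating the coding interval lower limit value lowvaule_(i-1) of the current coded character according to the bit position flag information count_flag_(i-1) of the current coded character: if count_(i-1)<0, lowvaule_(i-1) of the current character is updated to the lowvalue value shifted left by shift_(i-1) bits; if count_(i-1)≥0, the output code stream data_(i-1) equals lowvalue_(i-1) shifted right by (24-offset_(i-1)) bits, and lowvalue_(i-1) of the current coded character is updated to lowvalue_(i-1) shifted left by (offset_(i-1)+shift_(i-1)) bits.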
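  11. The arithmetic encoder according to claim 1, wherein the output unit is a first-in first-out queue buffer.
  12. The arithmetic encoder according to claim 1, wherein the arithmetic encoder is a VP8 binary arithmetic encoder.
  13. A method for implementing arithmetic coding, which, within one clock cycle, comprises:
    an arithmetic encoder processing N coded characters in parallel, to obtain the number of bits by which the current coding interval size needs to be shifted left (the left-shift bit value) and the interval size of the current coding interval;
    the arithmetic encoder processing the N left-shift bit values in parallel, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character;
    the arithmetic encoder processing, in parallel, the N coded characters, the N left-shift bit values, the N offsets of the current coded characters and the N pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters;
    the arithmetic encoder converting the N output code streams input in parallel, in order, into a single serially output code stream.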
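  14. The method according to claim 13, wherein processing N coded characters in parallel to obtain the interval size of the current coding interval and the number of bits by which the current coding interval size needs to be shifted left comprises:
    receiving the i-th coded character bin_(i-1) to be processed in the current clock cycle, the coding probability prob_(i-1) of the i-th coded character, and the coding interval size range_(i-2) of the previous coded character;
    calculating the i-th split value split_(i-1) from the received coding probability prob_(i-1) of the i-th coded character and the coding interval size range_(i-2) of the previous coded character;
    calculating the current coding interval size from the calculated i-th split value split_(i-1) and the i-th coded character bin_(i-1);
    obtaining from the current coding interval size, by table lookup, the number of bits shift_(i-1) by which the current coding interval size needs to be shifted left and the left-shifted coding interval size value range_after_shift; and using range_after_shift as the range value range_(i-1) output by the i-th coding interval processing module;
    where i=1,2,3…N.
  15. The method according to claim 14, further comprising:
    processing number channels of coded characters in parallel according to a settable signal number, where number=1,2,3…N.
  16. The method according to claim 14 or 15, wherein the N-th split value split_(N-1) is calculated by splitting the unsigned 8-bit multiplication into four unsigned 4-bit multiplications, three shift operations and three unsigned 4-bit additions, where each unsigned 4-bit multiplication is implemented by a lookup table.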
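  17. The method according to claim 13, wherein obtaining the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character comprises:
    receiving the i-th left-shift bit value shift_(i-1) of the current clock cycle and the bit position count_(i-2) within a byte of the bit output after the previous coding;
    calculating the bit position count_(i-1) after coding the current character according to count_(i-1)=count_(i-2)+shift_(i-1);
    determining the i-th offset offset_(i-1) of the current coded character according to the count_(i-1) value, and updating the i-th left-shift bit value shift_(i-1) and the i-th bit position value count_(i-1);
    where i=1,2,3…N.
  18. The method according to claim 17, further comprising:
    processing number channels of coding positions in parallel according to a settable signal number, where number=1,2,3…N.
  19. The method according to claim 17 or 18, wherein determining the i-th offset offset_(i-1) of the current coded character according to the count_(i-1) value, and updating the i-th left-shift bit value shift_(i-1) and the i-th bit position value count_(i-1), comprises:
    checking the count_(i-1) value; if count_(i-1)<0, the offset offset_(i-1) of the current coded character is 0, and the values of shift_(i-1) and count_(i-1) remain unchanged; if count_(i-1)≥0, the offset offset_(i-1) of the current coded character equals the value of shift_(i-1) minus the value of count_(i-1), shift_(i-1) is set to the value of count_(i-1), and count_(i-1) is updated to its value minus 8.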
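  20. The method according to claim 13, wherein obtaining the interval lower limit value of the current coding interval and the output code stream of the coded characters comprises:
    receiving the i-th coded character bin_(i-1), the i-th left-shift bit value shift_(i-1), the i-th offset offset_(i-1) and the i-th bit position flag information count_flag_(i-1) of the current clock cycle, together with the coding interval lower limit value lowvaule_(i-2) of the previous coded character;
    determining the i-th coding interval lower limit value lowvaule_(i-1) and the i-th output code stream data_(i-1) according to the current i-th coded character bin_(i-1) and the i-th bit position flag information count_flag_(i-1);
    where i=1,2,3…N.
  21. The method according to claim 20, further comprising:
    processing number channels of coding interval limits in parallel according to a settable signal number.
  22. The method according to claim 20 or 21, wherein determining the i-th coding interval lower limit value lowvaule_(i-1) and the i-th output code stream data_(i-1) comprises:
    updating the coding interval lower limit value lowvaule_(i-1) of the current coded character according to the current coded character bin_(i-1): if bin_(i-1)=0, lowvaule_(i-1) of the current coded character remains unchanged; if bin_(i-1)=1, lowvaule_(i-1) of the current coded character is updated to the sum of the lowvaule value of the previous coded character and split_(i-1) of the current coded character;
    determining the output code stream and updating the coding interval lower limit value lowvaule_(i-1) of the current coded character according to the bit position flag information count_flag_(i-1) of the current coded character: if count_(i-1)<0, no code stream is output, and lowvaule_(i-1) of the current coded character is updated to the lowvalue value shifted left by shift_(i-1) bits; if count_(i-1)≥0, the output code stream data_(i-1) equals lowvalue_(i-1) shifted right by (24-offset_(i-1)) bits, and lowvalue_(i-1) of the current coded character is updated to lowvalue_(i-1) shifted left by (offset_(i-1)+shift_(i-1)) bits.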
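  23. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the method for implementing arithmetic coding according to any one of claims 13 to 22.
  24. A device for implementing arithmetic coding, comprising a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of implementing arithmetic coding according to any one of claims 13 to 22.
  25. A method for implementing arithmetic coding, which, within one clock cycle, comprises:
    an arithmetic encoder, according to a settable signal number, processing number channels of coded characters in parallel, to obtain the number of bits by which the current coding interval size needs to be shifted left (the left-shift bit value) and the interval size of the current coding interval, where number=1,2,3…N;
    the arithmetic encoder processing the number left-shift bit values in parallel, to obtain the bit position within a byte of the bit output after the current coding, the flag information of the bit position, and the offset of the current coded character;
    the arithmetic encoder processing, in parallel, the number coded characters, the number left-shift bit values, the number offsets of the current coded characters and the number pieces of bit position flag information, to obtain the interval lower limit value of the current coding interval and the output code stream of the coded characters;
    the arithmetic encoder converting the number output code streams input in parallel, in order, into a single serially output code stream.
  26. An image encoding method, comprising:
    preprocessing an image to be processed to obtain multiple image blocks;
    converting each obtained image block to obtain the corresponding coded characters and coding probabilities;
    inputting the coded characters and coding probabilities corresponding to the image blocks into an encoder for encoding;
    wherein the encoder includes the arithmetic encoder according to any one of claims 1 to 12.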
PCT/CN2021/071024 2020-01-17 2021-01-11 Arithmetic encoder, method for implementing arithmetic coding, and image encoding method WO2021143634A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010051282.3A CN113141508B (zh) 2020-01-17 2020-01-17 Arithmetic encoder, method for implementing arithmetic coding, and image encoding method
CN202010051282.3 2020-01-17

Publications (1)

Publication Number Publication Date
WO2021143634A1 true WO2021143634A1 (zh) 2021-07-22

Family

ID=76808227

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/071024 WO2021143634A1 (zh) 2020-01-17 2021-01-11 Arithmetic encoder, method for implementing arithmetic coding, and image encoding method

Country Status (2)

Country Link
CN (1) CN113141508B (zh)
WO (1) WO2021143634A1 (zh)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7777654B2 (en) * 2007-10-16 2010-08-17 Industrial Technology Research Institute System and method for context-based adaptive binary arithematic encoding and decoding
US8542727B2 (en) * 2007-12-31 2013-09-24 Intel Corporation Systems and apparatuses for performing CABAC parallel encoding and decoding
CN107277553B (zh) * 2017-07-10 2020-10-27 中国科学技术大学 Binary arithmetic encoder

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
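CN1703089A (zh) * 2005-06-09 2005-11-30 清华大学 Binary arithmetic coding method for digital signals
JP2015115665A (ja) * 2013-12-09 2015-06-22 日本電信電話株式会社 Binary arithmetic coding device, binary arithmetic coding method, and binary arithmetic coding program
CN104918049A (zh) * 2015-06-03 2015-09-16 复旦大学 Binary arithmetic coding module suitable for the HEVC standard
CN105791828A (zh) * 2015-12-31 2016-07-20 杭州士兰微电子股份有限公司 Binary arithmetic encoder and encoding method thereof
CN109587483A (zh) * 2015-12-31 2019-04-05 杭州士兰微电子股份有限公司 Bitstream extraction module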

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116738471A (zh) * 2023-08-10 2023-09-12 陕西昕晟链云信息科技有限公司 Blockchain-based decentralized data analysis method
CN116738471B (zh) * 2023-08-10 2023-10-20 陕西昕晟链云信息科技有限公司 Blockchain-based decentralized data analysis method

Also Published As

Publication number Publication date
CN113141508A (zh) 2021-07-20
CN113141508B (zh) 2024-03-26

Similar Documents

Publication Publication Date Title
US10051285B2 (en) Data compression using spatial decorrelation
US9413387B2 (en) Data compression using entropy encoding
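CN109600618B (zh) Video compression method, decompression method, apparatus, terminal, and medium
CN105634499B (zh) Data conversion method based on a new short floating-point data type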
US9026568B2 (en) Data compression for direct memory access transfers
US11249721B2 (en) Multiplication circuit, system on chip, and electronic device
WO2021109696A1 (zh) Data compression and decompression, and processing method and apparatus based on data compression and decompression
US9806741B1 (en) Character conversion
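JP2004507858A (ja) Hardware implementation of a compression algorithm
WO2021143634A1 (zh) Arithmetic encoder, method for implementing arithmetic coding, and image coding method
WO2021027487A1 (zh) Encoding method and related device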
EP2055007A2 (en) Data encoder
EP2787738A1 (en) Tile-based compression and decompression for graphic applications
CN103200407A (zh) Adaptive entropy encoder
US20230342419A1 (en) Matrix calculation apparatus, method, system, circuit, and device, and chip
US8817875B2 (en) Methods and systems to encode and decode sequences of images
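CN102545910B (zh) JPEG Huffman decoding circuit and decoding method thereof
CN201054155Y (zh) Huffman decoding device suitable for JPEG bitstreams
CN116827358B (zh) 5G LDPC encoding implementation method and apparatus
CN201966895U (zh) JPEG Huffman decoding circuit
CN117093510B (zh) Efficient cache-line indexing method common to big-endian and little-endian
CN110213582B (zh) High-precision quantization acceleration method for ultra-high-resolution image analysis
WO2019191904A1 (zh) Data processing method and apparatus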
Shi et al. Design of Tile-based ARGB Image Lossless Compressor and Decompressor
CN103458247B (zh) Hardware implementation device for high-speed splicing of variable-length codes

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21741484

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21741484

Country of ref document: EP

Kind code of ref document: A1